w2c, Missouri S&T
Projects

XML join based on content and structure for XML data integration

XML is a standard for representing and interchanging data on the Internet because it can represent data from a wide variety of sources, which makes it a likely language for integrating data from multiple sources. However, correlating XML data sources has to cope with additional complexity arising from the structure of XML documents, which cannot be ignored: different data sources may describe similar content using different tag names and structures. Integrating such sources therefore poses several challenges.

In this project, we study existing approaches to measuring the similarity between two XML documents, which are based on either content or structure. We propose an approach that considers both the data structure and the content. Our approach detects the similarity of two sub-trees, clustered semantically from two XML documents, taking advantage of XML keys for sub-tree matching.
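
As a rough illustration of the idea of combining content and structure, the following Python sketch pairs sub-trees from two documents by a shared key value and then scores each pair. It is not the project's algorithm. The sample documents, the use of `isbn` as the key, the Jaccard measure, the equal weighting, and all function names are assumptions made for this example, and sub-trees are simply selected by tag name rather than clustered semantically.

```python
import xml.etree.ElementTree as ET


def leaf_texts(elem):
    """Lower-cased word tokens from all leaf nodes of a sub-tree (content)."""
    tokens = set()
    for node in elem.iter():
        if len(node) == 0 and node.text and node.text.strip():
            tokens.update(node.text.lower().split())
    return tokens


def tag_paths(elem, prefix=""):
    """Tag paths below the sub-tree root, excluding the root tag (structure)."""
    paths = set()
    for child in elem:
        path = f"{prefix}/{child.tag}"
        paths.add(path)
        paths |= tag_paths(child, path)
    return paths


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def subtree_similarity(s1, s2, w_content=0.5):
    """Weighted mix of content and structure similarity (weight is assumed)."""
    content = jaccard(leaf_texts(s1), leaf_texts(s2))
    structure = jaccard(tag_paths(s1), tag_paths(s2))
    return w_content * content + (1 - w_content) * structure


def match_by_key(doc1, doc2, tag1, key1, tag2, key2):
    """Pair sub-trees whose key values agree, then score each pair."""
    index = {s.findtext(key2): s for s in doc2.iter(tag2)}
    results = []
    for s in doc1.iter(tag1):
        k = s.findtext(key1)
        other = index.get(k)
        if k is not None and other is not None:
            results.append((k, subtree_similarity(s, other)))
    return results


# Hypothetical sources: similar content, different tag names and structure.
DOC_A = ET.fromstring(
    "<library>"
    "<book><isbn>111</isbn><title>XML Data Integration</title><author>Lee</author></book>"
    "<book><isbn>222</isbn><title>Graph Theory</title><author>Kim</author></book>"
    "</library>"
)
DOC_B = ET.fromstring(
    "<catalog>"
    "<item><isbn>111</isbn><name>XML Data Integration</name>"
    "<writer>Lee</writer><price>30</price></item>"
    "</catalog>"
)

results = match_by_key(DOC_A, DOC_B, "book", "isbn", "item", "isbn")
for key, score in results:
    print(key, round(score, 3))
```

In this example the matched pair shares almost all of its content but only one tag path, so a structure-only comparison would rate the pair as dissimilar while a content-only comparison would ignore the differing layouts. Matching on the key first also avoids comparing every sub-tree of one document with every sub-tree of the other.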

Researchers

Dr. Sanjay Madria

Waraporn Viyanon