XML join based on content and structure for XML data integration
XML is a standard for representing and interchanging data on the Internet because it can represent data from a wide variety of sources, which makes it a likely language for integrating data from multiple sources. However, correlating XML data sources must cope with additional complexity arising from the structure of XML documents, which cannot be ignored. In particular, different data sources may describe similar content using different tag names and structures, so comparing content alone or structure alone is not sufficient.
In this project, we study existing approaches to measuring the similarity between two XML documents, which are based on either content or structure. We propose an approach that considers both the structure and the content of the data. Our approach detects the similarity of two sub-trees, clustered semantically from two XML documents, and takes advantage of XML keys for sub-tree matching.
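The idea of matching sub-trees by key and then scoring them on both content and structure can be illustrated with a minimal Python sketch. This is not the method developed in the project: the Jaccard measures, the equal weighting, the function names (`tokens`, `paths`, `similarity`, `key_join`), and the sample documents are all assumptions made for illustration, and the semantic clustering step is not modeled at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample sources: same kind of data, different tag names.
DOC_A = ("<library>"
         "<book><isbn>111</isbn><title>XML Joins</title><author>Lee</author></book>"
         "<book><isbn>222</isbn><title>Data Integration</title><author>Kim</author></book>"
         "</library>")
DOC_B = ("<catalog>"
         "<publication><isbn>111</isbn><name>XML Joins</name><writer>Lee</writer></publication>"
         "<publication><isbn>333</isbn><name>Other</name><writer>Park</writer></publication>"
         "</catalog>")

def tokens(elem):
    """Content view: lower-cased word tokens from all text in the sub-tree."""
    words = set()
    for text in elem.itertext():
        words.update(text.lower().split())
    return words

def paths(elem, prefix=""):
    """Structure view: tag paths of all descendants, relative to the sub-tree root."""
    result = set()
    for child in elem:
        here = prefix + "/" + child.tag
        result.add(here)
        result |= paths(child, here)
    return result

def jaccard(a, b):
    """Jaccard overlap of two sets (assumed measure; 1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def similarity(s1, s2, w_content=0.5):
    """Weighted mix of content and structure similarity (weight is an assumption)."""
    content = jaccard(tokens(s1), tokens(s2))
    structure = jaccard(paths(s1), paths(s2))
    return w_content * content + (1 - w_content) * structure

def key_join(doc1, doc2, sub1, key1, sub2, key2):
    """Pair sub-trees whose key values are equal, then score each pair."""
    index = {}
    for s in doc2.iter(sub2):
        k = s.findtext(key2)
        if k is not None:
            index[k] = s
    pairs = []
    for s in doc1.iter(sub1):
        k = s.findtext(key1)
        if k is not None and k in index:
            pairs.append((k, similarity(s, index[k])))
    return pairs

if __name__ == "__main__":
    a, b = ET.fromstring(DOC_A), ET.fromstring(DOC_B)
    for key, score in key_join(a, b, "book", "isbn", "publication", "isbn"):
        print(key, round(score, 3))
```

In this toy data, the key restricts the comparison to the one pair of sub-trees that share an `isbn` value. Their text is identical while most tag names differ, so the content score is high and the structure score is low, which is the kind of disagreement that motivates combining the two measures.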
Dr. Sanjay Madria