w2c, Missouri S&T
Projects

Cooperative Query Processing with Semistructured Data

The growth of the Internet and other data repositories outside traditional databases has forced a rethinking of several tenets of database research. In traditional databases, all data is expected to conform to a certain structure, which can be determined beforehand and to which there are no exceptions. By contrast, data in web pages is loosely organized. Semistructured data is a new paradigm in database research which studies collections of heterogeneous, irregularly structured data.

Querying semistructured data poses some particular challenges. Since semistructured data is accessed by many kinds of users, including non-expert users, the chances of formulating a query incorrectly are higher than in traditional databases. The lack of a completely regular structure also increases the likelihood of making a mistake when writing a query. Finally, query languages for semistructured data still share the notion of an exact answer with more traditional query languages. Thus, they are inflexible in that they require exact matches between the query specification and the data in the database, and are unable to point out closely related information to the user. Therefore, systems that manage semistructured data could benefit from cooperative query processing techniques.

Here, we propose new methods to achieve cooperative query answering (CQA) in the context of semistructured data. The goal is to make the database system behave in a way that maximizes information exchange by devising strategies in which the system does not merely respond to queries, but tries to collaborate with the user. Instead of answering a query only literally, the system tries to provide related data which may help the user obtain the information she needs.
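As a rough illustration of the difference between an exact answer and a cooperative one, the following minimal sketch queries a small collection of irregular records. It is not the project's method: the sample records, the function names `exact_query` and `cooperative_query`, and the use of label spelling similarity (via Python's standard `difflib`) as the notion of "related" are all assumptions made for this example.

```python
from difflib import get_close_matches

# Hypothetical semistructured collection: the records do not share one schema.
people = [
    {"name": "Ada", "email": "ada@example.org", "address": {"city": "Rolla"}},
    {"name": "Bo", "e-mail": "bo@example.org"},
    {"name": "Cy", "phone": "555-0100"},
]


def exact_query(records, label):
    """Classical behaviour: return values only where the label matches exactly."""
    return [r[label] for r in records if label in r]


def cooperative_query(records, label, cutoff=0.6):
    """Return exact matches plus values stored under similarly spelled labels.

    'Related' is approximated here by string similarity between labels,
    which is an assumption of this sketch, not a technique from the project.
    """
    exact = exact_query(records, label)
    related = {}
    for r in records:
        if label in r:
            continue  # this record already contributed an exact answer
        # Look for the single closest label in this record, if any is close enough.
        for near in get_close_matches(label, list(r.keys()), n=1, cutoff=cutoff):
            related.setdefault(near, []).append(r[near])
    return {"exact": exact, "related": related}


if __name__ == "__main__":
    print(exact_query(people, "email"))
    print(cooperative_query(people, "email"))
```

With these sample records, the exact query silently ignores the record that stores its address under `e-mail`, while the cooperative version reports it separately as related information, leaving the user to decide whether it is relevant.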

The overall goals of this project are:

1. To develop a general framework for CQA in a semistructured data environment that expands the traditional notion of an answer.

2. To develop particular techniques to capture more knowledge about semistructured data. An extended answer would address questions such as: What is to be considered part of an answer in the case of partial knowledge? What is to be considered close to a given object if we need to enlarge an answer?

3. To develop an implementation of the framework and the techniques.

Researchers

Dr. Sanjay Madria

Dr. A. Badia, University of Louisville, KY