Query-Related Data Extraction of Hidden Web Documents (2004)
| Content Provider | CiteSeerX |
|---|---|
| Author | Hedley, Y. L.; Younas, M.; James, A. |
| Description | The larger amount of information on the Web is stored in document databases and is not indexed by general-purpose search engines (e.g., Google and Yahoo). Such information is dynamically generated by querying databases, which are referred to as Hidden Web databases. Documents returned in response to a user query are typically presented using template-generated Web pages. This paper proposes a novel approach that identifies Web page templates by analysing the textual contents and the adjacent tag structures of a document in order to extract query-related data. Preliminary results demonstrate that our approach effectively detects templates and retrieves data with high recall and precision. |
| File Format | |
| Language | English |
| Publisher Date | 2004-01-01 |
| Publisher Institution | In Proceedings of the 27th Annual International ACM SIGIR Conference |
| Access Restriction | Open |
| Subject Keyword | Hidden Web Database; Retrieves Data; Novel Approach; High Recall; Query-related Data; Querying Database; Hidden Web Document; Query-related Data Extraction; Document Database; Preliminary Result; User Query; General-purpose Search Engine; Adjacent Tag Structure; Template-generated Web Page; Textual Content; Web Page Template |
| Content Type | Text |
| Resource Type | Article |