Web Page Segmentation and Informative Content Extraction for Effective Information Retrieval
| Content Provider | Semantic Scholar |
|---|---|
| Author | Win, Chaw Su; Thwin, Mie Mie Su |
| Copyright Year | 2014 |
| Abstract | Web pages typically contain a large amount of non-informative content, such as advertisements, search and filtering panels, headers, footers, navigation links, and copyright notices. Such irrelevant information can seriously harm Web Mining, so the need for Informative Content Extraction from web pages is evident. Web Informative Content Extraction requires two steps: Web Page Segmentation and Informative Content Extraction. DOM-based Segmentation Approaches often fail to provide satisfactory results, and Vision-based Segmentation Approaches also have drawbacks. This paper therefore proposes the Effective Visual Block Extractor (EVBE) Algorithm to overcome the problems of DOM-based approaches and to reduce the drawbacks of previous work in Web Page Segmentation. It also proposes the Effective Informative Content Extractor (EIFCE) Algorithm to reduce the drawbacks of previous work in Web Informative Content Extraction, and the Effective Informative Content Title Extractor (EICTE) Algorithm to extract the title of a web page's informative content. The effective extraction results of the proposed algorithms, with higher Precision and Recall, can help improve the performance of Web Mining tasks. |
| Starting Page | 35 |
| Ending Page | 45 |
| Page Count | 11 |
| File Format | PDF HTM / HTML |
| Volume Number | 2 |
| Alternate Webpage(s) | http://ijccer.org/index.php/ojs/article/download/76/41 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |