| Content Provider | IEEE Xplore Digital Library |
|---|---|
| Author | Arpteg, A. |
| Copyright Year | 2005 |
| Description | Author affiliation: Dept. of Technol., Kalmar Univ., Sweden (Arpteg, A.) |
| Abstract | A major problem with information extraction (IE) systems is that it is difficult to handle a large number of domains. This problem can be handled in various ways. One way is to try to create a system that is as general as possible and can extract information from a large number of domains. Another way is to create a system that can be trained to extract information. To make such a solution useful, users should not be required to spend a lot of time on training or to have expert knowledge. If it were possible for average users to train IE systems to extract information, then systems could be created for a large number of domains and for a large number of users. This is called a user-driven approach. A system called ISSIE has been developed to evaluate this user-driven approach and to take advantage of semistructured information to extract relevant pieces. The system is built using the JADE agent framework, together with other tools for working with ontologies and knowledge bases. The goal of the system is to be trainable by example to handle a new domain. That means that a user shall be able to train the system by simply surfing the Web using a traditional browser and giving only small hints about what information is to be extracted. The system monitors the traffic and behavior of the user and then tries to automate the extraction process. By monitoring the traffic from the Web browser, and trying to understand the communication, the system is able to handle extraction from advanced Web sites. The focus in this system is to be able to extract various types of lists, e.g. a list of products at a retailer's Web site. There are currently four types of lists: singletons, simple lists, complex lists, and multipage lists. The focus in this paper is the management of multipage lists. A multipage list is divided into several pages, and the system must be able to navigate to and extract from these pages. A set of experiments has been conducted to evaluate the approach and the management of multipage lists. The results showed that multipage list extraction works well but that there are some problems. However, in general, the approach is promising and shows that a user-driven approach for multipage list extraction is viable. |
| Sponsorship | IEEE Boston Sect |
| Starting Page | 431 |
| Ending Page | 437 |
| File Size | 1744116 |
| Page Count | 7 |
| File Format | |
| ISBN | 078039013X |
| DOI | 10.1109/KIMAS.2005.1427120 |
| Language | English |
| Publisher | Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher Date | 2005-04-18 |
| Publisher Place | USA |
| Access Restriction | Subscribed |
| Rights Holder | Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subject Keyword | Navigation; Natural languages; Web and internet services; Ontologies; Data structures; Information retrieval; Data mining; Web sites; Monitoring |
| Content Type | Text |
| Resource Type | Article |
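
The abstract describes a multipage list as one list divided into several pages that the system must navigate to and extract from. The Python sketch below illustrates only that navigate-and-extract loop; it is not ISSIE's method, which is built on the JADE agent framework and learns from the user's browsing rather than from fixed rules. The assumptions here are ours: list items are `<li>` elements, the following page is marked with an `<a rel="next">` link, and the names `ListPageParser`, `extract_multipage_list`, `fetch`, and `PAGES` are hypothetical.

```python
from html.parser import HTMLParser


class ListPageParser(HTMLParser):
    """Collects the text of <li> items and the href of an <a rel="next"> link."""

    def __init__(self):
        super().__init__()
        self.items = []        # text of each list item found on the page
        self.next_href = None  # link to the following page, if any
        self._in_item = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            self._in_item = True
            self._buffer = []
        elif tag == "a" and attrs.get("rel") == "next":
            # Assumed convention: the next page is marked with rel="next".
            self.next_href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "li" and self._in_item:
            self.items.append("".join(self._buffer).strip())
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self._buffer.append(data)


def extract_multipage_list(fetch, start_url, max_pages=50):
    """Follow next-page links from start_url and return all items in order.

    `fetch` is any callable mapping a URL to an HTML string (hypothetical).
    Visited URLs are tracked so a page that links back to itself cannot
    cause an endless loop.
    """
    items, url, seen = [], start_url, set()
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = ListPageParser()
        parser.feed(fetch(url))
        parser.close()
        items.extend(parser.items)
        url = parser.next_href
    return items


# In-memory stand-in for a retailer's two-page product list (made-up data).
PAGES = {
    "/products?page=1": (
        "<ul><li>Kettle</li><li>Toaster</li></ul>"
        '<a rel="next" href="/products?page=2">Next</a>'
    ),
    "/products?page=2": "<ul><li>Blender</li></ul>",
}

products = extract_multipage_list(PAGES.__getitem__, "/products?page=1")
print(products)
```

Passing `fetch` as a parameter keeps the sketch runnable without network access; a real fetcher could be substituted without changing the loop.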
National Digital Library of India (NDLI) is a virtual repository of learning resources that offers not only search and browse facilities but also a host of services for the learner community. It is sponsored and mentored by the Ministry of Education, Government of India, through its National Mission on Education through Information and Communication Technology (NMEICT). Filtered and federated searching is employed to facilitate focused searching so that learners can find the right resource with the least effort and in minimum time. NDLI provides user group-specific services such as Examination Preparatory for School and College students and job aspirants. Services for Researchers and general learners are also provided. NDLI is designed to hold content in any language and provides interface support for the 10 most widely used Indian languages. It is built to provide support for all academic levels including researchers and life-long learners, all disciplines, all popular forms of access devices and differently-abled learners. It is designed to enable people to learn and prepare from best practices from all over the world and to help researchers perform inter-linked exploration across multiple sources. It is developed, operated and maintained at the Indian Institute of Technology Kharagpur.
NDLI is a conglomeration of freely available or institutionally contributed or donated or publisher managed contents. Almost all these contents are hosted and accessed from respective sources. The responsibility for authenticity, relevance, completeness, accuracy, reliability and suitability of these contents rests with the respective organization and NDLI has no responsibility or liability for these. Every effort is made to keep the NDLI portal up and running smoothly unless there are some unavoidable technical issues.
The Ministry of Education, through its National Mission on Education through Information and Communication Technology (NMEICT), has sponsored and funded the National Digital Library of India (NDLI) project.
| Sl. | Authority | Responsibilities | Communication Details |
|---|---|---|---|
| 1 | Ministry of Education (GoI), Department of Higher Education | Sanctioning Authority | https://www.education.gov.in/ict-initiatives |
| 2 | Indian Institute of Technology Kharagpur | Host Institute of the Project: The host institute of the project is responsible for providing infrastructure support and hosting the project | https://www.iitkgp.ac.in |
| 3 | National Digital Library of India Office, Indian Institute of Technology Kharagpur | The administrative and infrastructural headquarters of the project | Dr. B. Sutradhar bsutra@ndl.gov.in |
| 4 | Project PI / Joint PI | Principal Investigator and Joint Principal Investigators of the project | Dr. B. Sutradhar bsutra@ndl.gov.in; Prof. Saswat Chakrabarti will be added soon |
| 5 | Website/Portal (Helpdesk) | Queries regarding NDLI and its services | support@ndl.gov.in |
| 6 | Contents and Copyright Issues | Queries related to content curation and copyright issues | content@ndl.gov.in |
| 7 | National Digital Library of India Club (NDLI Club) | Queries related to NDLI Club formation, support, user awareness program, seminar/symposium, collaboration, social media, promotion, and outreach | clubsupport@ndl.gov.in |
| 8 | Digital Preservation Centre (DPC) | Assistance with digitizing and archiving copyright-free printed books | dpc@ndl.gov.in |
| 9 | IDR Setup or Support | Queries related to establishment and support of Institutional Digital Repository (IDR) and IDR workshops | idr@ndl.gov.in |