Extraction of Structure and Content from the Edgar Database: A Template-Based Approach
| Content Provider | Semantic Scholar |
|---|---|
| Author | Cong, Yu; Kogan, Alexander; Vasarhelyi, Miklos A. |
| Copyright Year | 2007 |
| Abstract | This paper presents a template‐based approach to extract data from the EDGAR database. A set of heuristic‐based templates is used to configure the trainable system in order to have one type of EDGAR filing processed in a single configuration. Such configurability is highly desirable as it adds expandability and flexibility to this system. The template‐based approach also enables the system to extract both structural information and content from the filings in the EDGAR database. The ability to extract structural information from a section or a complete filing makes it possible to collect data from real‐world documents for users of financial data in both academia and industry. We use the income statement section of 10‐K filings to illustrate the system and the utilization of the template‐based approach. |
| Starting Page | 69 |
| Ending Page | 86 |
| Page Count | 18 |
| File Format | PDF HTM / HTML |
| DOI | 10.2308/jeta.2007.4.1.69 |
| Volume Number | 4 |
| Alternate Webpage(s) | http://accounting.rutgers.edu/MiklosVasarhelyi/Resume%20Articles/MAJOR%20REFEREED%20ARTICLES/M42.%20extraction%20of%20structure%20and%20content.pdf |
| Alternate Webpage(s) | https://doi.org/10.2308/jeta.2007.4.1.69 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
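
The abstract describes templates that configure an extractor to recover both structure and content from a filing section. The following is a minimal Python sketch of that general idea, not the authors' system: the template contents, the `parse_amount` and `extract_section` helpers, the column-splitting rule, and the sample text are all assumptions made up for illustration.

```python
import re

# Hypothetical template for one section type (an income statement).
# Each field name maps to heuristic label patterns; these patterns are
# illustrative assumptions, not the templates used in the paper.
INCOME_STATEMENT_TEMPLATE = {
    "revenue": [r"^(total\s+)?(net\s+)?(revenues?|sales)\b"],
    "cost_of_revenue": [r"^cost\s+of\s+(revenues?|sales|goods\s+sold)\b"],
    "net_income": [r"^net\s+(income|earnings)\b"],
}


def parse_amount(cell):
    """Convert a cell such as '$ 1,200' or '(700)' to a float, else None."""
    text = cell.replace("$", "").replace(",", "").strip()
    negative = text.startswith("(") and text.endswith(")")
    text = text.strip("()").strip()
    try:
        value = float(text)
    except ValueError:
        return None
    return -value if negative else value


def extract_section(section_text, template):
    """Return one record per non-blank line: label, indent, field, amounts."""
    rows = []
    for line in section_text.splitlines():
        if not line.strip():
            continue
        # Indentation is kept as a simple proxy for structural nesting.
        indent = len(line) - len(line.lstrip())
        # Assume columns are separated by runs of spaces or dot leaders.
        cells = [c for c in re.split(r"\s{2,}|\.{2,}", line.strip()) if c.strip()]
        label = cells[0].strip()
        amounts = [a for a in map(parse_amount, cells[1:]) if a is not None]
        # First template field whose pattern matches the label, if any.
        field = next(
            (
                name
                for name, patterns in template.items()
                if any(re.search(p, label, re.IGNORECASE) for p in patterns)
            ),
            None,
        )
        rows.append(
            {"label": label, "indent": indent, "field": field, "amounts": amounts}
        )
    return rows


# Made-up sample text, not taken from any real filing.
SAMPLE = """Net sales            $ 1,200    $ 1,000
Cost of sales              (700)      (650)
  Gross profit              500        350
Net income           $   120    $    (15)
"""

rows = extract_section(SAMPLE, INCOME_STATEMENT_TEMPLATE)
for row in rows:
    print(row)
```

Keeping the patterns in a data structure rather than in the parsing code is what makes such a design configurable: supporting another section or filing type would, under these assumptions, mean supplying a different template rather than changing `extract_section`.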