Generating grammars for SGML tagged texts lacking DTD
| Content Provider | Semantic Scholar |
|---|---|
| Author | Ahonen, Helena; Mannila, Heikki; Nikunen, Erja |
| Copyright Year | 1994 |
| Abstract | We describe a technique for forming a context-free grammar for a document that has some kind of tagging (structural or typographical) but for which no concise description of the structure is available. The technique is based on ideas from machine learning. It first forms a set of finite-state automata that describe the document completely. These automata are then modified by considering certain context conditions; the modifications correspond to generalizing the underlying languages. Finally, the automata are converted into regular expressions, which are used to construct the grammar. An alternative representation, characteristic k-grams, is also introduced. Additionally, the paper describes some interactive operations necessary for generating a grammar for a large and complicated document. |
| Starting Page | 1 |
| Ending Page | 13 |
| Page Count | 13 |
| File Format | PDF HTM / HTML |
| DOI | 10.1016/S0895-7177(97)00100-3 |
| Volume Number | 26 |
| Alternate Webpage(s) | http://www.cs.helsinki.fi/u/hahonen/ahonen_podp94.ps |
| Alternate Webpage(s) | https://doi.org/10.1016/S0895-7177%2897%2900100-3 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
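
The first step named in the abstract, building automata that describe the tagged document exactly, and the k-gram representation can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain prefix-tree automaton per element type and simple contiguous k-grams. It omits the context-based generalization and the conversion to regular expressions. The function names and the sample tag names (`headword`, `pos`, `sense`) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Transitions = Dict[Tuple[int, str], int]


def build_prefix_tree(sequences: List[List[str]]) -> Tuple[Transitions, Set[int]]:
    """Build an automaton that accepts exactly the given tag sequences."""
    transitions: Transitions = {}
    accepting: Set[int] = set()
    next_state = 1  # state 0 is the start state
    for seq in sequences:
        state = 0
        for tag in seq:
            key = (state, tag)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting.add(state)
    return transitions, accepting


def accepts(transitions: Transitions, accepting: Set[int], seq: List[str]) -> bool:
    """Run the automaton on seq and report whether it ends in an accepting state."""
    state = 0
    for tag in seq:
        key = (state, tag)
        if key not in transitions:
            return False
        state = transitions[key]
    return state in accepting


def k_grams(sequences: List[List[str]], k: int) -> Set[Tuple[str, ...]]:
    """Collect every contiguous run of k tags seen in the sequences."""
    grams: Set[Tuple[str, ...]] = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            grams.add(tuple(seq[i:i + k]))
    return grams


# Hypothetical child-tag sequences observed under one element type.
samples = [
    ["headword", "sense"],
    ["headword", "sense", "sense"],
    ["headword", "pos", "sense"],
]
transitions, accepting = build_prefix_tree(samples)
```

Before any generalization, such an automaton rejects everything it has not seen: here `["headword", "sense", "sense", "sense"]` is not accepted, which is why the abstract's later step of generalizing the underlying languages is needed to obtain a usable grammar.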