Making Indian language legacy documents accessible via web (2002).
| Content Provider | CiteSeerX |
|---|---|
| Author | Kashyap, Abhishek; Kohli, Sanjeev; Chaudhury, Santanu; Joshi, S. D. |
| Abstract | Reliable optical character recognition is not available for the scripts of Indian languages. Thus, the only way to make legacy documents in Indian languages available on the web is to scan them. This work addresses the need for a better representation and an efficient storage technique for Indian language documents, and for their near-perfect regeneration at the browser. We work with the segments (corresponding to text, images, or white space) extracted from the original document page. To compress the segments separately, we use a shape-adaptive wavelet-based coding scheme, run-length encoding, and arithmetic bit-plane coding. An XML representation scheme is used to represent the document page, and the data is stored on a server. A plug-in has been implemented that decodes the encoded data coming from the server and displays the document page in the web browser, thereby making the document pages web accessible. |
| File Format | |
| Publisher Date | 2002-01-01 |
| Access Restriction | Open |
| Content Type | Text |
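
Of the compression techniques named in the abstract, run-length encoding is the simplest to illustrate. The following is a minimal Python sketch of the general technique applied to one row of a bilevel (black/white) segment. It is not the authors' implementation: the function names `rle_encode` and `rle_decode` and the sample row are hypothetical, and the record does not say how runs are actually stored or which segment types use this coder.

```python
from itertools import groupby


def rle_encode(row):
    """Collapse a sequence of pixel values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(row)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    row = []
    for value, count in pairs:
        row.extend([value] * count)
    return row


# Hypothetical scanline: 0 = white, 1 = black.
sample_row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
encoded = rle_encode(sample_row)

# Decoding must reproduce the input exactly (the scheme is lossless).
assert rle_decode(encoded) == sample_row
```

Run-length encoding pays off when rows contain long runs of identical pixels, which is typical of white space and of text scanned at high resolution; on noisy rows the list of pairs can be longer than the input.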