Content Provider | ACM Digital Library |
---|---|
Author | Olagunju, Amos O. |
Abstract | The question of how to index documents is a central problem in document retrieval. The indexing problem can be stated as follows. There exists a large document collection, together with a population of retrieval-system customers, each of whom wants information that he thinks might be supplied by documents in the collection. How should the documents in the collection be identified ("indexed," "cataloged," etc.) so that the collection can be searched to the maximal collective benefit of the customers? The problem under investigation is that of developing a set of formal statistical rules for selecting the keywords of a document, the words likely to be useful as index terms for that document. A number of simple weighting techniques have been suggested for selecting the keywords of a document: (i) frequency of occurrence in a document, (ii) frequency/document length, (iii) frequency/frequency in document collection, and (iv) frequency/(document length x frequency in collection). These have been examined in detail by Sparck Jones, [Sp73]. The major result of her experiments is that no single technique is best, except that (i) is consistently outperformed by the others. Her experiments also show that automatic indexing sometimes, but not always, outperforms manual controlled indexing. This has led to more sophisticated procedures for selecting keywords. The first such technique was developed by Salton, [Sa75], and is known as the discrimination value model. The technique measures the effectiveness of a term by examining what happens if that term is removed from the index. The assumption is made that if all the documents seem more similar to one another after a term has been removed from the index, then that term has a descriptive power whose magnitude is represented by the change in total similarity.
Salton has found significant retrieval improvement by using the discrimination value model to select the index terms for certain collections of documents. A second, more sophisticated technique has been developed by Harter, [Ha75]. The technique is based upon the distribution characteristics of terms throughout the document collection. Harter's technique rests on the hypothesis that authors choose terms, other than those directly related to the subject under discussion, randomly from a fixed vocabulary when composing a document. If this is in fact the case, then the distribution characteristics of the non-descriptive terms should be described by a Poisson distribution. It has been further hypothesized that the descriptive terms are chosen by authors randomly in relation to a particular topic. If this is the case, the distribution of these terms within documents dealing with the topic in question should also be describable by the Poisson function f(k) = EXP(-L + k*LN(L))/k!, which gives the probability, f(k), that a document contains k occurrences of a particular term, L being the mean number of occurrences of the term in each document of the collection, where the term is randomly distributed. This gives rise to the 2-Poisson model, [Bo75], which states that the distribution of a term within a document collection should be describable by two Poisson distributions, one describing the usage of the term as a "background" term and the other its usage as a keyword. Thus the overall model is a combination of two Poisson functions and takes the form f(k) = p*EXP(-L1 + k*LN(L1))/k! + (1-p)*EXP(-L2 + k*LN(L2))/k!, where L1 and L2 represent the mean number of occurrences of the term in each of the two classes and p is the proportion of documents in which the term is a keyword.
Bookstein and Swanson, [Bo74], found that the 2-Poisson model did not successfully describe the distributions of all keywords, since the complete validity of the model rests on the rather naive assumption that there are exactly two ways in which a term is used. Harter, [Ha75], suggests (L1-L2)/SQRT(L1+L2) as an effective measure of the usefulness of an index term. In his probabilistic approach to keyword selection, Harter [Ha75] used the less efficient moment estimators for estimating the parameters of mixtures of discrete distributions. Harter emphasized that the method of maximum likelihood provides iterative rather than exact solutions for a mixture of two distributions, and that the solutions are, in general, very slow to converge. Contending that the method of moment estimators would have been acceptable back in the 1930s, when computers were unavailable to statisticians, Olagunju, [Ol80], has investigated the properties of the 2-Poisson model. In this presentation we show how a combination of the method of moments and the method of maximum likelihood can be used for estimating the parameters of the 2-Poisson distribution. The likelihood function for the 2-Poisson model is given by L = PRODUCT [f(Xi | p, L1, L2), i=1 to N], and its logarithm by Log[L] = SUM [Ni*Log(p*EXP(-L1 + i*LN(L1))/i! + (1-p)*EXP(-L2 + i*LN(L2))/i!), i=0 to ∞], where Ni is the number of documents containing i occurrences of the term. The log-likelihood Log[L] is used to estimate the parameters p, L1 and L2, since it is easier to maximize than the likelihood itself. In fact, by Taylor's series expansion, the point where the likelihood is a maximum is a solution of a system of three equations. The logarithm of the likelihood function for the Degenerate 2-Poisson model is given by Log[L] = N0*Log[p*EXP(-L1) + (1-p)] + SUM [Ni*Log(p*EXP(-L1 + i*LN(L1))/i!), i=1 to ∞].
In Olagunju's thesis, [Ol80], the 2-Poisson model and the Degenerate 2-Poisson model are examined in detail as models of keyword distribution, and formulae expressing the parameters of the models in terms of empirical frequency statistics are derived. Finally, a measure, consistent with the 2-Poisson and the Degenerate 2-Poisson models, intended to identify keywords is proposed. |
File Format | |
ISBN | 0897912187 |
DOI | 10.1145/322917.323048 |
Language | English |
Publisher | Association for Computing Machinery (ACM) |
Publisher Date | 1987-02-01 |
Publisher Place | New York |
Access Restriction | Subscribed |
Content Type | Text |
Resource Type | Article |
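The four simple weighting techniques listed in the abstract can be sketched as follows; this is an illustrative example, not code from the paper, and all function and variable names are invented here:

```python
from collections import Counter

def keyword_weights(doc_tokens, collection_freq):
    """Compute the four simple term weights from the abstract:
    (i)   raw frequency in the document,
    (ii)  frequency / document length,
    (iii) frequency / frequency in the collection,
    (iv)  frequency / (document length * collection frequency)."""
    n = len(doc_tokens)
    tf = Counter(doc_tokens)
    weights = {}
    for term, f in tf.items():
        # Fall back to the in-document frequency if the term is unseen
        # elsewhere in the collection (an assumption of this sketch).
        cf = collection_freq.get(term, f)
        weights[term] = {
            "tf": f,                          # (i)
            "tf_over_len": f / n,             # (ii)
            "tf_over_cf": f / cf,             # (iii)
            "tf_over_len_cf": f / (n * cf),   # (iv)
        }
    return weights

# Tiny hypothetical example: a rare term vs. a common function word.
doc = ["poisson", "model", "poisson", "keyword", "the", "the", "the"]
cf = {"poisson": 5, "model": 40, "keyword": 8, "the": 10000}
w = keyword_weights(doc, cf)
```

Under weight (iii) the collection-rare term "poisson" scores far above "the", even though "the" is more frequent in the document, which is the effect Sparck Jones's comparison examines.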
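The 2-Poisson mixture f(k) = p*EXP(-L1 + k*LN(L1))/k! + (1-p)*EXP(-L2 + k*LN(L2))/k! described in the abstract translates directly into code; a minimal sketch, with names of our own choosing:

```python
import math

def poisson_pmf(k, lam):
    # Probability that a term occurs exactly k times in a document,
    # given mean occurrence rate lam: exp(-lam) * lam**k / k!
    return math.exp(-lam + k * math.log(lam)) / math.factorial(k)

def two_poisson_pmf(k, p, l1, l2):
    # Mixture of two Poissons: with probability p the term is used as a
    # keyword (mean rate l1), otherwise as a background term (mean rate l2).
    return p * poisson_pmf(k, l1) + (1 - p) * poisson_pmf(k, l2)
```

Since each component is a proper Poisson distribution, the mixture probabilities over k = 0, 1, 2, ... sum to 1 for any 0 <= p <= 1.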
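The abstract notes that maximum-likelihood estimation for the mixture is iterative. One standard iterative scheme is expectation-maximization; the sketch below uses EM with fixed starting values, which is only an illustration — the thesis's combined moment/ML estimators are not given in this abstract. Harter's term-usefulness measure (L1-L2)/SQRT(L1+L2) is included as well.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def fit_two_poisson(counts, p=0.5, l1=2.0, l2=0.5, iters=200):
    """Estimate (p, L1, L2) of a 2-Poisson mixture by EM.
    counts[i] is the number of occurrences of the term in document i.
    Starting values are arbitrary assumptions of this sketch."""
    for _ in range(iters):
        # E-step: posterior probability each count came from the keyword class.
        resp = []
        for k in counts:
            a = p * poisson_pmf(k, l1)
            b = (1 - p) * poisson_pmf(k, l2)
            resp.append(a / (a + b))
        # M-step: re-estimate the mixing proportion and the two class means.
        s = sum(resp)
        p = s / len(counts)
        l1 = sum(r * k for r, k in zip(resp, counts)) / s
        l2 = sum((1 - r) * k for r, k in zip(resp, counts)) / (len(counts) - s)
    return p, l1, l2

def harter_z(l1, l2):
    # Harter's measure of the usefulness of an index term.
    return (l1 - l2) / math.sqrt(l1 + l2)

# Hypothetical per-document counts: mostly background usage, a few
# documents where the term is clearly a keyword.
p_hat, l1_hat, l2_hat = fit_two_poisson([0, 0, 0, 0, 1, 0, 0, 0, 5, 6, 4, 5])
```

A large value of harter_z (the two class means well separated relative to their spread) marks the term as a good index-term candidate.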
National Digital Library of India (NDLI) is a virtual repository of learning resources which is not just a repository with search/browse facilities but provides a host of services for the learner community. It is sponsored and mentored by Ministry of Education, Government of India, through its National Mission on Education through Information and Communication Technology (NMEICT). Filtered and federated searching is employed to facilitate focused searching so that learners can find the right resource with least effort and in minimum time. NDLI provides user group-specific services such as Examination Preparatory for School and College students and job aspirants. Services for Researchers and general learners are also provided. NDLI is designed to hold content of any language and provides interface support for 10 most widely used Indian languages. It is built to provide support for all academic levels including researchers and life-long learners, all disciplines, all popular forms of access devices and differently-abled learners. It is designed to enable people to learn and prepare from best practices from all over the world and to facilitate researchers to perform inter-linked exploration from multiple sources. It is developed, operated and maintained from Indian Institute of Technology Kharagpur.
NDLI is a conglomeration of freely available or institutionally contributed or donated or publisher managed contents. Almost all these contents are hosted and accessed from respective sources. The responsibility for authenticity, relevance, completeness, accuracy, reliability and suitability of these contents rests with the respective organization and NDLI has no responsibility or liability for these. Every effort is made to keep the NDLI portal up and running smoothly unless there are some unavoidable technical issues.
Ministry of Education, through its National Mission on Education through Information and Communication Technology (NMEICT), has sponsored and funded the National Digital Library of India (NDLI) project.
Sl. | Authority | Responsibilities | Communication Details |
---|---|---|---|
1 | Ministry of Education (GoI), Department of Higher Education | Sanctioning Authority | https://www.education.gov.in/ict-initiatives |
2 | Indian Institute of Technology Kharagpur | Host Institute of the Project: The host institute of the project is responsible for providing infrastructure support and hosting the project | https://www.iitkgp.ac.in |
3 | National Digital Library of India Office, Indian Institute of Technology Kharagpur | The administrative and infrastructural headquarters of the project | Dr. B. Sutradhar bsutra@ndl.gov.in |
4 | Project PI / Joint PI | Principal Investigator and Joint Principal Investigators of the project | Dr. B. Sutradhar bsutra@ndl.gov.in; Prof. Saswat Chakrabarti (will be added soon) |
5 | Website/Portal (Helpdesk) | Queries regarding NDLI and its services | support@ndl.gov.in |
6 | Contents and Copyright Issues | Queries related to content curation and copyright issues | content@ndl.gov.in |
7 | National Digital Library of India Club (NDLI Club) | Queries related to NDLI Club formation, support, user awareness program, seminar/symposium, collaboration, social media, promotion, and outreach | clubsupport@ndl.gov.in |
8 | Digital Preservation Centre (DPC) | Assistance with digitizing and archiving copyright-free printed books | dpc@ndl.gov.in |
9 | IDR Setup or Support | Queries related to establishment and support of Institutional Digital Repository (IDR) and IDR workshops | idr@ndl.gov.in |