adjective in 100 occurrences in a text, the probability of
having an adjective as a dependent is 0.75. The zeroes and
ones in Table I are constant for all words in the glossary.
These values are not listed in the sets of probability
values for the entries of the glossary; however, they are
known to the system. For instance, the set of probability
values for a transitive verb will contain P1, P2, and P3.
The probability 1 of governing a noun as object will not
be listed in the data.
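
The split between listed values and system-known constants can be sketched as a two-level lookup. This is only an illustration, not the system's actual data layout: the names SYSTEM_CONSTANTS, GLOSSARY, and probability are hypothetical, the numeric values are invented, and the relations that P1, P2, and P3 stand for are left as opaque labels.

```python
# Hypothetical layout; every name and number below is illustrative only.

# Constant values (the zeroes and ones of Table I) are stored once per
# word class, not repeated in each glossary entry.
SYSTEM_CONSTANTS = {
    "transitive_verb": {"governs_noun_as_object": 1.0},
}

# Each glossary entry lists only its non-constant probability values.
GLOSSARY = {
    "example_verb": {
        "word_class": "transitive_verb",
        "listed": {"P1": 0.40, "P2": 0.25, "P3": 0.10},  # invented numbers
    },
}


def probability(word, relation):
    """Return the listed value if present, else the class-level constant.

    Returns None when the relation is known in neither place.
    """
    entry = GLOSSARY[word]
    if relation in entry["listed"]:
        return entry["listed"][relation]
    return SYSTEM_CONSTANTS[entry["word_class"]].get(relation)
```

With this arrangement, probability("example_verb", "governs_noun_as_object") is answered from the class-level constants even though the entry itself never lists that value.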
The second type of co-occurrence data accompanying
every word in the glossary is a list of possible dependents.
The list is specified in terms of word numbers and semantic
classes (to be described later). It contains the words that
actually appear in the processed physics text as dependents
of the word with which the list is associated. Since the
lists of dependents are compiled on the basis of word
co-occurrence in the text, legitimate word combinations are
guaranteed. In the list of dependents for a verb, those
words which can only be the subject are marked "S" and
those which can only be the direct object are marked "O".
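
A list of dependents of this kind might be represented as follows. This is a minimal sketch under stated assumptions: the names Dependent, DEPENDENTS, and allowed_roles are hypothetical, the word numbers and class number are invented, the direct-object mark is written as the letter O, and treating an unmarked entry as unrestricted is an assumption, not something the text states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dependent:
    kind: str                   # "word" (word number) or "class" (semantic class)
    ident: int                  # the word number or semantic class number
    mark: Optional[str] = None  # "S" = subject only, "O" = direct object only


# Hypothetical list of dependents for the verb with word number 101.
DEPENDENTS = {
    101: [
        Dependent("word", 17, "S"),
        Dependent("word", 42, "O"),
        Dependent("class", 5),  # unmarked entry
    ],
}


def allowed_roles(governor, kind, ident):
    """Return the roles a candidate dependent may fill for this governor.

    An empty set means the pair was never observed in the processed text.
    Unmarked entries are assumed here to allow either role.
    """
    for dep in DEPENDENTS.get(governor, []):
        if dep.kind == kind and dep.ident == ident:
            return {dep.mark} if dep.mark else {"S", "O"}
    return set()
```

Here allowed_roles(101, "word", 17) yields only the subject role, while a word absent from the list yields the empty set, reflecting that only combinations attested in the text are admitted.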
The co-occurrence data can be regarded as either
syntactic or semantic. They are distinguished here from
both the dependency rules and part of speech designation,
and from the semantic classes that have been established.
At present, seventy-four semantic classes have been set up.
Some of these are formed distributionally (i.e., on the