Harper's earlier sentence generation program differed
from other versions in its use of data on lexical co-
occurrence and word behavior, both obtained from machine
analysis of written text. These data are incorporated
with some modifications in a new program designed to pro-
duce strings of sentences that possess the properties of
coherence and development found in "real" discourse. (The
actual goal is the production of isolated paragraphs, not
an extended discourse.) In essence the program is designed
(i) to generate an initial sentence; (ii) to "inspect"
the result in order to determine strategies for producing
the following sentence; (iii) to build a second sentence,
making use of one of these strategies, and employing, in
addition, such criteria of cohesion as lexical class
recurrence, substitution, anaphora, and synonymy; (iv) to
continue the process for a prescribed number of sentences,
observing both the general strategic principles and the
lexical context. Analysis of the output will lead to
modification of the input materials, and the cycle will be
repeated.
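
Steps (i) through (iv) can be pictured as a simple loop. The Python
fragment below is only a minimal sketch under stated assumptions: the
toy vocabulary (SYNONYMS, VERBS), the three strategy names, and the
functions generate_initial, inspect, build_next, generate_paragraph and
render are all hypothetical and are not taken from the program described
here. The sketch also omits the co-occurrence data and uses simplified
stand-ins for only three of the cohesion criteria.

```python
import random

# Hypothetical toy vocabulary (illustration only, not the paper's data).
SYNONYMS = {
    "scientist": "researcher",
    "researcher": "scientist",
    "method": "procedure",
    "procedure": "method",
}
VERBS = ["studies", "describes", "improves"]
STRATEGIES = ("recurrence", "anaphora", "synonymy")


def generate_initial(rng):
    """Step (i): generate an initial subject-verb-object sentence."""
    subject, obj = rng.sample(sorted(SYNONYMS), 2)
    return {"subject": "the " + subject, "verb": rng.choice(VERBS),
            "object": obj, "link": None}


def inspect(sentence):
    """Step (ii): pick a topic and list the strategies open for it."""
    topic = sentence["object"]
    options = ["recurrence", "anaphora"]
    if topic in SYNONYMS:  # synonymy needs a known synonym
        options.append("synonymy")
    return topic, options


def build_next(topic, strategy, rng):
    """Step (iii): build a sentence tied to the topic by one strategy."""
    if strategy == "recurrence":
        subject = "the " + topic            # repeat the same word
    elif strategy == "anaphora":
        subject = "the latter"              # refer back without naming
    else:
        subject = "the " + SYNONYMS[topic]  # substitute a synonym
    # Avoid reusing the topic or its synonym as the new object.
    excluded = {topic, SYNONYMS.get(topic)}
    candidates = [n for n in sorted(SYNONYMS) if n not in excluded]
    return {"subject": subject, "verb": rng.choice(VERBS),
            "object": rng.choice(candidates), "link": strategy}


def generate_paragraph(n_sentences, seed=0):
    """Step (iv): repeat inspect-and-build for a prescribed length."""
    rng = random.Random(seed)
    sentences = [generate_initial(rng)]
    while len(sentences) < n_sentences:
        topic, options = inspect(sentences[-1])
        sentences.append(build_next(topic, rng.choice(options), rng))
    return sentences


def render(sentence):
    """Turn a sentence record into a printable string."""
    text = f"{sentence['subject']} {sentence['verb']} the {sentence['object']}."
    return text[0].upper() + text[1:]
```

Calling print(" ".join(render(s) for s in generate_paragraph(4)))
prints a string of four linked sentences. In this sketch each sentence
takes the object of its predecessor as its topic, which is one simple
way of making the "inspection" step concrete; the actual program may
choose its strategies quite differently.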
This paper describes the implementation of these
ideas, and discusses the theoretical implications of the
paragraph generator. First we give a description of the
language materials on which the generator operates. The
next section deals with a program which converts the
language data into tables with associative links to minimize