Detailed explanation of sentence reordering.


Reordering a coherent text that has been shuffled is a relatively easy task for humans. A successful algorithm for such reordering would provide considerable insight into automatic multi-document summarisation and discourse generation. This research addresses the development of an algorithm that attempts to reorder sentences automatically, answering the following question: how good is a computer at reordering sentences compared to humans?

1. INTRODUCTION

When one shuffles the sentences of an existing coherent text, a human is often able to easily reorder those sentences into proper discourse. The question is whether a machine can do the same. Work by [odAea04] shows that having humans reorder texts can lead to ambiguities: often more than one reordering leads to proper discourse.

The ability to successfully reorder sentences has several applications, among which discourse generation. A robust sentence-reordering algorithm narrows the problem of generating proper discourse to simply forming (proper) sentences. Moreover, the ability to (re)construct proper discourse from a set of sentences has high potential in the field of (multi-)document summarisation: extracted sentences can be reformed into proper discourse. This research focuses on the reordering of sentences by means of a sorting algorithm: several modules provide hints on the order of individual sentences, which combined provide a partial ordering. This ordering guides the topological sorting of the sentences, which was defined [Bla04] as:

To arrange items when some pairs of items have no comparison, that is, according to a partial order.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission.

3rd Twente Student Conference on IT, Enschede June, 2005

Copyright 2005, University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science

Previous research [HMG95] has shown that it is possible to determine the temporal structure of sentences, and several algorithms exist for automatic anaphor resolution. This research attempts to apply these (and other) techniques to the automatic reordering of sentences by developing an algorithm that reorders the sentences of a shuffled text. The resulting algorithm is compared with human reordering before being evaluated.
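The idea of combining order hints into a partial ordering and sorting topologically can be illustrated with a minimal Python sketch. It is not the algorithm developed in this research. It assumes that the modules deliver their hints as pairs of sentence indices (i, j), meaning "sentence i should precede sentence j", and it uses Kahn's algorithm as one common way to sort topologically. The function name `topological_order`, the hint format, and the English example sentences are hypothetical.

```python
from collections import defaultdict, deque


def topological_order(sentences, hints):
    """Return the sentences in an order that respects every hint.

    `hints` is an assumed format: a list of index pairs (i, j) meaning
    "sentence i should precede sentence j". How the hints are obtained
    (temporal structure, anaphor resolution, ...) is not modelled here.
    """
    successors = defaultdict(set)
    indegree = {i: 0 for i in range(len(sentences))}
    for before, after in hints:
        if after not in successors[before]:  # ignore duplicate hints
            successors[before].add(after)
            indegree[after] += 1

    # Kahn's algorithm: repeatedly place a sentence that has no
    # unplaced predecessor. Sentences are handled in the order in
    # which they become available, which is an arbitrary tie-break.
    ready = deque(i for i in sorted(indegree) if indegree[i] == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in sorted(successors[current]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(sentences):
        raise ValueError("hints are contradictory (they contain a cycle)")
    return [sentences[i] for i in order]


# Hypothetical shuffled text with two hand-written hints.
shuffled = [
    "He bought a ticket.",
    "John went to the station.",
    "Then he boarded the train.",
]
hints = [(1, 0), (0, 2)]  # "John ..." before "He ...", "He ..." before "Then ..."
print(topological_order(shuffled, hints))
```

With these two hints only one ordering is possible, starting with "John went to the station." and ending with "Then he boarded the train.". With fewer hints several orderings would satisfy the partial order, which mirrors the observation that more than one reordering can lead to proper discourse.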

2. RESEARCH QUESTION

The research question is defined as: how good is a computer at reordering sentences compared to humans? To answer this question, various (existing) techniques are used, which leads to a sub-question: what techniques can be used to determine sentence order? If the prototype developed in this research demonstrates a performance comparable to what may be expected from a human, it will be compared to humans, answering a third question: how does automatically reordered discourse compare to manually reordered discourse?

3. APPROACH

The answer to the question above depends entirely on the quality of the software used to reorder sentences. The first part of this research was thus dedicated to finding an existing solution to the problem of reordering sentences. If such a method had been found, it would have been applied and evaluated. After some thorough searching on the web (footnote 2), however, it quickly appeared that such an algorithm did not exist yet. The next step was to design an algorithm, which was to be tested and compared with human reordering. Dutch was chosen as the language of the sentences to be ordered, since this is the native (and thus the most intuitive) language of the persons performing this research.

4. RELATED WORK

In [Keh94] the temporal relations between events described by successive utterances are discussed. Two main phenomena are addressed in this paper: the referential properties of tense and the role of temporal constraints imposed

Footnote 2: Using Google, the ACM Digital Library, and various knowledge databases.
