Putting the t where it belongs: Solving a confusion problem in Dutch
A common Dutch writing error is to confuse a word ending in -d with a neighboring word form ending in -dt. In this paper we describe the development of a machine-learning-based disambiguator that determines which of the two endings is appropriate on the basis of the word's local context. We develop alternative disambiguators, ranging from a single monolithic classifier to multiple confusable experts, each of which disambiguates one confusable pair. The disambiguation accuracy of the best disambiguators exceeds 99%; when we apply them to an external test set of collected errors, our detection strategy correctly identifies up to 79% of the errors.
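
To make the idea of a confusable expert concrete, the following is a minimal sketch of a disambiguator for one -d/-dt pair that votes on the basis of the words in a small window around the target. It is an illustration under stated assumptions, not the classifier developed in the paper: the class name PairExpert, the helper functions, the window width of two, the count-based voting scheme, and the toy training sentences are all hypothetical.

```python
from collections import Counter, defaultdict


def context_features(tokens, i, width=2):
    """Return (offset, word) pairs for the neighbours of tokens[i]."""
    feats = []
    for offset in range(-width, width + 1):
        if offset == 0:
            continue  # skip the target word itself
        j = i + offset
        # "_" pads positions that fall outside the sentence
        word = tokens[j].lower() if 0 <= j < len(tokens) else "_"
        feats.append((offset, word))
    return feats


class PairExpert:
    """Hypothetical count-based expert for one confusable pair."""

    def __init__(self, pair):
        self.pair = pair                      # e.g. ("word", "wordt")
        self.counts = defaultdict(Counter)    # feature -> counts per form
        self.prior = Counter()                # overall counts per form

    def train(self, sentences):
        for tokens in sentences:
            for i, tok in enumerate(tokens):
                form = tok.lower()
                if form in self.pair:
                    self.prior[form] += 1
                    for feat in context_features(tokens, i):
                        self.counts[feat][form] += 1

    def predict(self, tokens, i):
        """Pick the form whose training contexts best match this one."""
        scores = {form: 0.0 for form in self.pair}
        for feat in context_features(tokens, i):
            votes = self.counts.get(feat)
            if votes:
                total = sum(votes.values())
                for form in self.pair:
                    scores[form] += votes[form] / total
        # ties are broken by the more frequent form in training
        return max(self.pair, key=lambda f: (scores[f], self.prior[f]))


def detect_errors(expert, tokens):
    """Flag positions where the written form differs from the prediction."""
    flagged = []
    for i, tok in enumerate(tokens):
        if tok.lower() in expert.pair:
            suggestion = expert.predict(tokens, i)
            if suggestion != tok.lower():
                flagged.append((i, tok, suggestion))
    return flagged


# Toy training data (assumed to be error-free text)
training = [
    ["ik", "word", "moe"],
    ["hij", "wordt", "moe"],
    ["zij", "wordt", "ziek"],
    ["ik", "word", "ziek"],
    ["het", "wordt", "laat"],
]
expert = PairExpert(("word", "wordt"))
expert.train(training)

print(detect_errors(expert, ["hij", "word", "boos"]))
print(detect_errors(expert, ["ik", "word", "boos"]))
```

In this sketch a monolithic variant would correspond to a single model trained on all -d/-dt words at once, whereas the expert variant trains one PairExpert per confusable pair and dispatches each target word to its own expert. Detection then amounts to flagging a word whenever the predicted ending disagrees with the written one.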