CombiTagger: A system for developing combined taggers

Verena Henrich*, Timo Reuter, Hrafn Loftsson

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Citations (Scopus)

Abstract

The main task of part-of-speech (PoS) tagging is to assign the appropriate morphosyntactic category to each word in a sentence. Combining several PoS taggers usually yields higher tagging accuracy than any single tagger achieves on its own. We present a new language- and tagset-independent system, CombiTagger, which automatically combines the output of several taggers. The system, which is open source, provides algorithms for simple and weighted voting, and it is extensible so that other combination algorithms can be added easily. We demonstrate the functionality of CombiTagger by using it to develop and evaluate combined taggers for Icelandic. The most accurate individual tagger obtains an accuracy of 91.83%. CombiTagger achieves 93.09% to 93.41% accuracy by combining the output of five or six taggers using simple and weighted voting.
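
To make the two voting schemes named in the abstract concrete, the following is a minimal Python sketch of per-token simple and weighted voting. It is not CombiTagger's code or API: the function names `combine_tags` and `combine_sequences`, the example tags, the choice of weights (for instance, each tagger's accuracy on held-out data) and the tie-breaking rule (prefer the tagger listed first) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import List, Optional, Sequence


def combine_tags(proposals: Sequence[str],
                 weights: Optional[Sequence[float]] = None) -> str:
    """Pick one tag for a single token from the tags proposed by several taggers.

    With weights=None every tagger gets one vote (simple voting).
    Otherwise each tagger's vote counts with its weight (weighted voting).
    """
    if not proposals:
        raise ValueError("need at least one proposed tag")
    if weights is None:
        weights = [1.0] * len(proposals)
    if len(weights) != len(proposals):
        raise ValueError("need exactly one weight per tagger")

    # Sum the votes each candidate tag receives.
    scores = defaultdict(float)
    for tag, weight in zip(proposals, weights):
        scores[tag] += weight

    best = max(scores.values())
    # Assumed tie-break: among top-scoring tags, take the one proposed
    # by the tagger that appears first in the list.
    for tag in proposals:
        if scores[tag] == best:
            return tag
    raise AssertionError("unreachable: some tag always holds the best score")


def combine_sequences(outputs: Sequence[Sequence[str]],
                      weights: Optional[Sequence[float]] = None) -> List[str]:
    """Combine whole tag sequences; outputs[i] is tagger i's tags for one sentence."""
    if len({len(seq) for seq in outputs}) > 1:
        raise ValueError("all taggers must tag the same number of tokens")
    return [combine_tags(column, weights) for column in zip(*outputs)]


if __name__ == "__main__":
    # Three hypothetical taggers tagging a three-token sentence.
    outputs = [
        ["DET", "NOUN", "VERB"],
        ["DET", "VERB", "VERB"],
        ["DET", "VERB", "NOUN"],
    ]
    print(combine_sequences(outputs))                   # simple voting
    print(combine_sequences(outputs, [0.9, 0.3, 0.3]))  # weighted voting
```

In this sketch, simple voting follows the majority for each token, while weighted voting lets a single highly weighted tagger override two weaker ones that agree with each other. Real systems may use finer-grained weights, such as per-tag precision, which this sketch does not attempt to model.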

Original language: English
Title of host publication: Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Pages: 254-259
Number of pages: 6
Publication status: Published - 2009
Event: 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22 - Sanibel Island, FL, United States
Duration: 19 Mar 2009 to 21 Mar 2009

Publication series

Name: Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22

Conference

Conference: 22nd International Florida Artificial Intelligence Research Society Conference, FLAIRS-22
Country/Territory: United States
City: Sanibel Island, FL
Period: 19/03/09 to 21/03/09
