Grammar and Corpora 2016
How to cite this chapter

Lang, Christian, Schneider, Roman and Suchowolec, Karolina: Extracting Specialized Terminology from Linguistic Corpora, in: Fuß, Eric et al. (Eds.): Grammar and Corpora 2016, Heidelberg: Heidelberg University Publishing, 2018. https://doi.org/10.17885/heiup.361.c4715

Licenses

This work is licensed under a Creative Commons License 4.0 (CC BY-SA 4.0).

Identifiers (Book)
ISBN 978-3-946054-84-9 (PDF)
ISBN 978-3-946054-82-5 (Softcover)
ISBN 978-3-946054-83-2 (Hardcover)

Published 16.05.2018.


Christian Lang, Roman Schneider, Karolina Suchowolec

Extracting Specialized Terminology from Linguistic Corpora

Abstract In this paper, we present our approach to automatically extracting German terminology in the domain of grammar, using texts from the online information system grammis as our corpus. We analyze existing repositories of German grammatical terminology and develop part-of-speech patterns for our extraction, thereby showing the importance of unigrams in this domain. We contrast the results of the automatic extraction with a manually extracted standard. By comparing the performance of well-known statistical measures, we show that measures based on corpus comparison outperform alternative methods.

Keywords Grammatical terminology, terminological structures, automatic term extraction, grammatical information system
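
To illustrate what a measure "based on corpus comparison" can look like, the following minimal sketch scores unigram candidates by how much more frequent they are in a domain corpus than in a general reference corpus. The abstract does not name the measures that were compared, so the smoothed relative-frequency ratio used here (often called "weirdness"), the function name weirdness, and the toy sentences are all assumptions made for illustration, not the authors' method or data.

```python
from collections import Counter


def weirdness(domain_tokens, reference_tokens):
    """Score each domain unigram by its smoothed relative-frequency ratio.

    Hypothetical helper: the ratio compares how often a word occurs in the
    domain corpus with how often it occurs in a general reference corpus.
    """
    domain_counts = Counter(domain_tokens)
    reference_counts = Counter(reference_tokens)
    domain_total = len(domain_tokens)
    reference_total = len(reference_tokens)
    vocab_size = len(set(domain_counts) | set(reference_counts))

    scores = {}
    for word, count in domain_counts.items():
        # Add-one smoothing keeps the ratio finite for words that never
        # occur in the reference corpus.
        domain_rel = (count + 1) / (domain_total + vocab_size)
        reference_rel = (reference_counts[word] + 1) / (reference_total + vocab_size)
        scores[word] = domain_rel / reference_rel
    return scores


# Toy data invented for illustration only (not taken from grammis).
domain = "das Verb regiert den Kasus und das Verb kongruiert mit dem Subjekt".split()
reference = "das Wetter ist schön und das Haus ist groß und der Hund schläft".split()

scores = weirdness(domain, reference)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In this toy setting, a domain-specific word such as "Verb" receives a higher score than a function word such as "das", which is about equally frequent in both corpora. A real extraction would first restrict the candidates with part-of-speech patterns and would use a far larger reference corpus.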