Grammar and Corpora 2016
How to cite this chapter

Tuggener, Don and Businger, Martin: Needles in Haystacks: Semi-Automatic Identification of Regional Grammatical Variation in Standard German, in: Fuß, Eric et al. (Eds.): Grammar and Corpora 2016, Heidelberg: Heidelberg University Publishing, 2018. https://doi.org/10.17885/heiup.361.c4709

Licenses

This work is licensed under a Creative Commons License 4.0 (CC BY-SA 4.0).

Identifiers (Book)
ISBN 978-3-946054-84-9 (PDF)
ISBN 978-3-946054-82-5 (Softcover)
ISBN 978-3-946054-83-2 (Hardcover)

Published 16.05.2018.


Don Tuggener, Martin Businger

Needles in Haystacks: Semi-Automatic Identification of Regional Grammatical Variation in Standard German

Abstract This paper lays out a semi-automatic approach to identifying regional variation in the grammar of Standard German. The approach takes as input manually defined templates of grammatical constructions, which are automatically instantiated over a corpus collected from regional newspapers. The instantiations are then ranked automatically by a metric that quantifies how specific each one is to a region. For each region, the resulting ranked list is scanned manually by linguists to identify the instantiations that denote grammatical variants of Standard German. This approach enabled us to discover variants that have not been documented so far. With respect to research on variation within standard languages more generally, we aim to contribute to research strategies that rely on empirical evidence rather than on intuition or bias.1

Keywords Association measures, corpus-driven approaches, diatopic variation, grammatical variation, standard language
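
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the metric that is used, so pointwise mutual information (PMI) between a region and an instantiation stands in here purely as an assumption. The function name, the `min_count` threshold, and the placeholder labels `R1`, `R2`, `inst_a`, and `inst_b` are hypothetical and do not come from the chapter.

```python
import math
from collections import Counter


def rank_by_region_specificity(observations, region, min_count=2):
    """Rank template instantiations by how specific they are to one region.

    observations: iterable of (region, instantiation) pairs, one pair per
    occurrence in the corpus.
    PMI is an assumed stand-in for the chapter's (unnamed) metric.
    """
    pair_counts = Counter(observations)
    region_counts = Counter()
    inst_counts = Counter()
    for (reg, inst), n in pair_counts.items():
        region_counts[reg] += n
        inst_counts[inst] += n
    total = sum(pair_counts.values())

    scored = []
    for (reg, inst), n in pair_counts.items():
        # Skip other regions and instantiations too rare to score reliably.
        if reg != region or n < min_count:
            continue
        # PMI = log2( P(inst, region) / (P(inst) * P(region)) )
        pmi = math.log2(n * total / (inst_counts[inst] * region_counts[reg]))
        scored.append((inst, pmi))
    # Most region-specific instantiations first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy counts with placeholder labels (not data from the chapter).
observations = (
    [("R1", "inst_a")] * 4
    + [("R1", "inst_b")] * 2
    + [("R2", "inst_b")] * 6
)
ranked = rank_by_region_specificity(observations, "R1")
for inst, score in ranked:
    print(f"{inst}\t{score:.2f}")
```

In this toy input, `inst_a` occurs only in `R1` and therefore ranks above `inst_b`, which occurs mostly in `R2`. A list of this kind is what a linguist would then scan manually, since a high score alone does not show that an instantiation is a grammatical variant.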