How to Cite

Tuggener, Don and Businger, Martin: Needles in Haystacks: Semi-Automatic Identification of Regional Grammatical Variation in Standard German, in Fuß, Eric et al. (Eds.): Grammar and Corpora 2016, Heidelberg: Heidelberg University Publishing, 2018, pp. 313–335.

License (Chapter)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Identifiers (Book)

ISBN 978-3-946054-84-9 (PDF)
ISBN 978-3-946054-82-5 (Softcover)
ISBN 978-3-946054-83-2 (Hardcover)




Don Tuggener, Martin Businger

Needles in Haystacks: Semi-Automatic Identification of Regional Grammatical Variation in Standard German

Abstract This paper lays out a semi-automatic approach to identifying regional variation in the grammar of Standard German. Our approach takes as input manually defined templates of grammatical constructions that are automatically instantiated over a corpus collected from regional newspapers. These instantiations are automatically ranked by a metric that quantifies how specific an instantiation is for a region. Ranked lists of instantiations are compiled that contain instantiations specific to a region and are scanned manually by linguists to identify those that denote grammatical variants of Standard German. This approach enabled us to discover variants that so far have not been documented. With respect to research on variation within standard languages as seen from a more general perspective, we aim to contribute towards research strategies that clearly rely on empiricism rather than on intuition or bias.1

Keywords Association measures, corpus-driven approaches, diatopic variation, grammatical variation, standard language