Coping with Unruly Language: Non-Standard Usage in a Corpus

Alexandr Rosen

doi:10.17885/heiup.361.509

Suggested citation

Rosen, Alexandr: Coping with Unruly Language: Non-Standard Usage in a Corpus, in Fuß, Eric et al. (eds.): Grammar and Corpora 2016, Heidelberg: Heidelberg University Publishing, 2018, pp. 271–287. https://doi.org/10.17885/heiup.361.c4707

License (chapter)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.

Identifier (book)

https://doi.org/10.17885/heiup.361.509

ISBN 978-3-946054-84-9 (PDF)

ISBN 978-3-946054-82-5 (Softcover)

ISBN 978-3-946054-83-2 (Hardcover)

Published

16 May 2018

Abstract

A language as used in real situations may differ substantially from its standard form. Before the entire range of NLP methods and tools can be applied to non-canonical variants of a language, appropriate categories for the analysis of deviant forms and constructions are needed, together with texts annotated with these categories. A discussion of non-standard language is followed by two case studies. The first study proposes a taxonomy of morphosyntactic categories as an attempt to analyze non-standard forms in non-native learners’ Czech. The second study focuses on the role of a rule-based grammar and lexicon as tools for the detection and diagnostics of non-standard words and constructions in the process of building and using a parsebank.

Keywords

Non-standard language, Czech, learner corpus, parsebank, treebank, constraint-based grammar, valency, HPSG

Heidelberg University Publishing
