Grammar and Corpora 2016
How to cite this chapter

Bilińska, Joanna, Kwiecień, Monika and Derwojedowa, Magdalena: Microcorpus of Nineteenth-Century Polish, in: Fuß, Eric et al. (Eds.): Grammar and Corpora 2016, Heidelberg: Heidelberg University Publishing, 2018. https://doi.org/10.17885/heiup.361.c4712

Licenses

This work is licensed under a Creative Commons License 4.0 (CC BY-SA 4.0).

Identifiers (Book)
ISBN 978-3-946054-84-9 (PDF)
ISBN 978-3-946054-82-5 (Softcover)
ISBN 978-3-946054-83-2 (Hardcover)

Published 16.05.2018.


Joanna Bilińska, Monika Kwiecień, Magdalena Derwojedowa

Microcorpus of Nineteenth-Century Polish

Abstract This paper describes a 1M-word corpus of Polish texts from the period 1830–1918. The corpus was compiled to provide diversified linguistic data for morphological analysis; however, several tests showed that it can also serve as a versatile resource for identifying various linguistic phenomena and tracing their dynamics in inflection, spelling, and even syntax. To provide stylistic variety, it is divided into five equal subcorpora: scientific texts for the general public, news, feuilletons, fiction, and drama. For morphological analysis, an analyzer built for contemporary texts was adapted so that it can process word forms that differ from contemporary inflection and spelling. The paper also discusses several experiments carried out with the corpus.

Keywords Morphological analysis, spelling, 19th century Polish, corpus