Grammar and Corpora 2016
Recommended citation (chapter)

Bilińska, Joanna, Kwiecień, Monika and Derwojedowa, Magdalena: Microcorpus of Nineteenth-Century Polish, in: Fuß, Eric et al. (eds.): Grammar and Corpora 2016, Heidelberg: Heidelberg University Publishing, 2018.

This work is licensed under the Creative Commons License 4.0 (CC BY-SA 4.0).

Identifiers (book)
ISBN 978-3-946054-84-9 (PDF)
ISBN 978-3-946054-82-5 (Softcover)
ISBN 978-3-946054-83-2 (Hardcover)

Published on 16 May 2018.

Joanna Bilińska, Monika Kwiecień, Magdalena Derwojedowa

Microcorpus of Nineteenth-Century Polish

Abstract This paper describes a 1M-word corpus of Polish texts from the period 1830–1918. The corpus was compiled to provide diversified linguistic data for morphological analysis; however, several tests showed that it can also serve as a versatile resource for identifying various linguistic phenomena and tracing their dynamics in inflection, spelling, and even syntax. To provide stylistic variety, it is divided into five equal subcorpora: scientific texts for the general public, news, feuilletons, fiction, and drama. For morphological analysis, an analyzer designed for contemporary texts was adapted so that it can process word forms that differ from contemporary inflection and spelling. The paper also discusses several experiments conducted with the corpus.

Keywords Morphological analysis, spelling, 19th century Polish, corpus