Significance Filters for N-gram Viewer

Velislava Todorova; Maria Chinkina

doi:10.17885/heiup.345.474

Suggested citation

Todorova, Velislava and Chinkina, Maria: Significance Filters for N-gram Viewer, in Bubenhofer, Noah and Kupietz, Marc (eds.): Visualisierung sprachlicher Daten: Visual Linguistics – Praxis – Tools, Heidelberg: Heidelberg University Publishing, 2018, pp. 301–314. https://doi.org/10.17885/heiup.345.c4407

License (chapter)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.

Identifier (book)

https://doi.org/10.17885/heiup.345.474

ISBN 978-3-946054-75-7 (PDF)

ISBN 978-3-946054-77-1 (Hardcover)

ISBN 978-3-947732-15-9 (Softcover)

Published

12 April 2018

Abstract This paper presents a visualization tool for analyzing trends in language use over time. Given a dated and tokenized corpus, it calculates the frequencies of selected n-grams and plots them as data points on a line chart, with time on the x-axis and relative frequency on the y-axis. The graph can optionally be smoothed to make the general trend more salient. The user can specify an n-gram as a sequence of tokens, lemmas, and/or POS tags, provided the corpus contains these annotations. Along with the original text, the tool also accesses the corpus metadata, such as dates and authors’ names, which allows the use of n-grams by different authors in different time periods to be compared in context. The latest version of the tool introduces a filtering mechanism that indicates the time periods in which the observed values within one or more datasets differ significantly. We used Fisher’s exact test of independence because it gives reliable results even for sparse data.
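The three computations named in the abstract (per-period relative frequency, smoothing, and Fisher's exact test) can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the corpus layout (a list of `(year, tokens)` pairs), the function names, the centered moving average used for smoothing, and the choice of a two-sided test are all assumptions made for illustration, and the pure-Python test is only practical for small counts.

```python
from collections import Counter
from math import comb


def relative_frequencies(corpus, ngram):
    """Map each year to (hits, total n-grams of the same length).

    `corpus` is a hypothetical list of (year, tokens) pairs.
    """
    n = len(ngram)
    hits, totals = Counter(), Counter()
    for year, tokens in corpus:
        windows = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        hits[year] += sum(1 for w in windows if w == tuple(ngram))
        totals[year] += len(windows)
    return {year: (hits[year], totals[year]) for year in sorted(totals)}


def moving_average(values, radius=1):
    """Smooth a series with a centered moving average.

    The window shrinks at both ends of the series (an assumed edge policy).
    """
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - radius):i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )
```

To compare two datasets in one time period under these assumptions, the table rows would be the datasets and the columns the n-gram hits versus all other n-grams, for example `fisher_exact_two_sided(hits_a, total_a - hits_a, hits_b, total_b - hits_b)`. A period would then be flagged as significantly different when the p-value falls below a chosen threshold such as 0.05.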

Heidelberg University Publishing
