Vol 6 (2021): Anja Behnke – Josefina Budzisch: Selkup Language Corpus

This paper documents the project “Syntactic Description of the Southern and Central Selkup Dialects: A Corpus-Based Investigation”, which was carried out between 2015 and 2018 at the University of Hamburg and funded by the German Research Foundation (DFG). The main goal of the project was the creation of a digital language corpus of Selkup. In addition to the originally planned texts from the Central and Southern Selkup dialects, a number of Northern Selkup texts were added in the course of the project. The corpus therefore reflects the great dialectal diversity of Selkup.

The paper is structured as follows: Section 2 describes the project objectives and the tasks carried out during the project. Section 3 gives a short overview of Selkup, with remarks on its areal distribution and linguistic status. Section 4 introduces metadata about the corpus, including information about archiving and the conventions used throughout the corpus. Section 5 deals with the structure of the corpus and gives a detailed analysis of the transcription and annotation of the data. Section 6 lists research based on the corpus, section 7 lists the text sources for the corpus, and section 8 gives the references. The appendix lists the characters used as well as the labels for glosses and categories.

Published: 2021-11-29

