Creator/Principal investigator(s)
Uppsala University, Department of Modern Languages
Lennart Lönngren - University of Tromsø, Department of Language and Linguistics
Description
The Uppsala Corpus (Upsal'skij korpus russkix tekstov) consists of some 600 Russian texts with a total of one million running words (word tokens), equally divided between informative and literary prose. The informative texts date from 1985 to 1989, while the literary texts, whose vocabulary does not date as quickly, cover a longer period, 1960-88. The corpus does not include poetry or drama.
Within the given framework, considerable effort has been made to ensure as representative and varied a corpus as possible. The informative texts are drawn from 25 different subject areas: economics, foreign affairs / foreign policy, ideology / domestic policy, party matters, Soviet society, social issues, defence, education, law, history, culture, linguistics, medicine / health care, psychology, environment / ecology, agriculture, engineering, information technology, space research, energy, biology, geology / geography, physics, chemistry and sport. Areas judged to be more important are represented by a larger volume of texts.
The literary half of the corpus comprises work by the
Subject area
Language and linguistics
(CESSDA Topic Classification)
Languages and Literature
(The Swedish standard of fields of research 2011)
Responsible department/unit
University of Tromsø, Department of Language and Linguistics
Identifiers
SND-ID: EXT 0071
Resource type
Lönngren, Lennart (ed.), 1993. Частотный словарь современногo русского языка. [Chastotnyj slovar' sovremennogo russkogo jazyka.] (A Frequency Dictionary of Modern Russian. With a Summary in English.) Acta Universitatis Upsaliensis, Studia Slavica Upsaliensia 32. 188 pp. Uppsala
ISBN:
91-554-3134-8
Data format / data structure
Text
Time period(s) investigated
1960-1988
1985-1989