Swedish Treebank

Creator/Principal investigator(s)

Joakim Nivre - Uppsala University

Beáta Megyesi - Uppsala University

Bengt Dahlqvist - Uppsala University

Anna Sågvall Hein - Uppsala University, Department of Linguistics and Philology

Johan Hall - Uppsala University

Jens Nilsson - Växjö University

Sofia Gustafson-Capková - Stockholm University

Description

The Swedish Treebank is a syntactically annotated corpus. The annotation includes word and sentence boundaries, morphological information (word classes etc.), and syntactic information (phrases and grammatical functions, dependency structure). The Swedish Treebank is based on two previous corpora, Talbanken and SUC, which have been harmonised. The Swedish Treebank contains approx. 350,000 tokens.

Contributor(s)

Filip Salomonsson - SolarWinds, Pingdom

Identifiers

SND-ID: EXT 0368

Language

Swedish

Geographic spread

Geographic location: Sweden

Data contains personal data

No

Subject area

Language resources

Resource type

Corpus

Contact for questions about the data

Swedish Treebank

Data format / data structure

Text

Published: 2020-10-14