Swedish Treebank

Creator/Principal investigator(s):

Joakim Nivre - Uppsala University

Beáta Megyesi - Uppsala University

Bengt Dahlqvist - Uppsala University

Anna Sågvall Hein - Uppsala University, Department of Linguistics and Philology

Johan Hall - Uppsala University

Jens Nilsson - Växjö University

Sofia Gustafson-Capková - Stockholm University

Description:

The Swedish Treebank is a syntactically annotated corpus. The annotation includes word and sentence boundaries, morphological information (word classes etc.), and syntactic information (phrases and grammatical functions, dependency structure). The Swedish Treebank is based on two previous corpora, Talbanken and SUC, which have been harmonised. The Swedish Treebank contains approx. 350,000 tokens.

Responsible department/unit:

Uppsala University, Department of Linguistics and Philology

Contributor(s):

Filip Salomonsson - SolarWinds, Pingdom

Identifiers:

SND-ID: EXT 0368

Language:

Swedish

Geographic spread:

Geographic location: Sweden

Subject area:

Language resources

Resource type:

Corpus

Contact for questions about the data:

Eva Pettersson

Data format / data structure:

Text

Published: 2020-10-14