Gold standard for English-Swedish Europarl data (GES)

SND-ID: EXT 0283

Creator/Principal investigator(s)

Lars Ahrenberg - Linköping University, Department of Computer and Information Science

Maria Holmqvist - Linköping University, Department of Computer and Information Science

Description

Reference corpus for word linking, divided into training data and test data. The sentences come from the English and Swedish parts of Europarl.
Research principal, contributors, and funding

Research principal

Linköping University

Responsible department/unit

Department of Computer and Information Science

Language resources

Resource type

Corpus

Foreseen use

NLP application
Publications

Maria Holmqvist and Lars Ahrenberg (2011). A Gold Standard for English-Swedish Word Alignment. In Proceedings of the 18th Nordic Conference on Computational Linguistics, Riga, Latvia, May 11-13, 2011.

Dataset
Gold standard for English-Swedish Europarl data (GES)

Description

The data were created from the English-Swedish part of the Europarl corpus. For each sentence pair in the selected subset, token correspondences are stated as pairs of integer token identifiers.
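
As a rough illustration of what token correspondences stated as pairs of token identifiers can look like in practice, the sketch below reads a set of links and maps them back to words. The on-disk layout is not specified in this record, so everything here is an assumption: the whitespace-separated "i-j" notation, the 0-based numbering, the function names parse_links and linked_words, and the example sentence pair, which is invented and not taken from the corpus.

```python
from typing import List, Tuple


def parse_links(line: str) -> List[Tuple[int, int]]:
    """Parse whitespace-separated 'i-j' items into (english_id, swedish_id) pairs.

    Assumption: links are written as 'i-j' with 0-based token identifiers.
    The actual GES file layout may differ.
    """
    links = []
    for item in line.split():
        src, tgt = item.split("-")
        links.append((int(src), int(tgt)))
    return links


def linked_words(english: List[str], swedish: List[str],
                 links: List[Tuple[int, int]]) -> List[Tuple[str, str]]:
    """Resolve each identifier pair to the corresponding word pair."""
    return [(english[i], swedish[j]) for i, j in links]


# Invented sentence pair, for illustration only.
english = ["I", "declare", "the", "session", "resumed"]
swedish = ["Jag", "förklarar", "sessionen", "återupptagen"]

# Two English tokens ("the", "session") link to one Swedish token ("sessionen").
links = parse_links("0-0 1-1 2-2 3-2 4-3")

for en_word, sv_word in linked_words(english, swedish, links):
    print(en_word, "<->", sv_word)
```

Keeping the links as plain integer pairs, separate from the token lists, lets a single token take part in several links, as happens here where the Swedish definite noun covers both an English article and its noun.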

Data format / data structure

Numeric

Text

License

Creative Commons Attribution 4.0 International (CC BY 4.0)
