Texts from the Swedish Work Environment Authority

SND-ID: EXT 0332

This study is part of the collection Parallel Texts from Public Agencies


Creator/Principal investigator(s)

Simon Dahlberg - Institute for Language and Folklore, Language Council of Sweden

Institute for Language and Folklore, Language Council of Sweden

Description

Parallel texts downloaded from the websites of the Swedish Work Environment Authority.

Language

English

Swedish

Protection and ethical review
Method and time period

Sampling procedure

Swedish texts with translations into one or more languages.
Language resources

Resource type

Corpus

Foreseen use

NLP application

Text corpus

  • Linguality

    Multilingual
  • Language

    • (swe)

      Texts: 21

    • (eng)

      Texts: 19

    • (bul)

      Texts: 2

    • (ces)

      Texts: 2

    • (deu)

      Texts: 3

    • (est)

      Texts: 3

    • (fin)

      Texts: 1

    • (hun)

      Texts: 1

    • (lav)

      Texts: 3

    • (lit)

      Texts: 3

    • (pol)

      Texts: 4

    • (ron)

      Texts: 3

    • (spa)

      Texts: 2

    • (zho)

      Texts: 2

    • (rus)

      Texts: 3

    • (ara)

      Texts: 1

    • (tur)

      Texts: 2

    • (tha)

      Texts: 1

    • (hin)

      Texts: 1

    More..
  • Modality

    Written Language
  • Size

    Words: 166367 (swe)

    Texts: 21 (swe)

    Words: 432133 (TOT)

    Texts: 78

  • Original source

    Arbetsmiljöverket
    www.av.se
Geographic coverage

Geographic spread

Geographic location: Sweden

Topic and keywords

Research area

LABOUR AND EMPLOYMENT (CESSDA Topic Classification)
Social Sciences, Languages and Literature (The Swedish standard of fields of research 2011)

Publications
Dataset
Parallel texts from the Swedish Work Environment Authority

Description

Parallel texts downloaded from the websites of the Swedish Work Environment Authority. The available .txt files were produced by running the PDF files through the pdftotext command in an Ubuntu shell.
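
The conversion step described above can be sketched as follows. This is a minimal illustration, not the script actually used for the dataset: the record does not say which pdftotext options were passed, so the sketch calls the command with no flags, and the function names and directory layout are hypothetical. It assumes the pdftotext binary is installed and on the PATH.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path: Path, txt_path: Path) -> list[str]:
    """Return the argument list for one pdftotext call.

    Assumption: no extra flags; the dataset record does not state any.
    """
    return ["pdftotext", str(pdf_path), str(txt_path)]


def convert_directory(pdf_dir: Path, txt_dir: Path) -> list[Path]:
    """Convert every *.pdf in pdf_dir to a same-named .txt file in txt_dir."""
    txt_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        txt_path = txt_dir / (pdf_path.stem + ".txt")
        # check=True raises CalledProcessError if pdftotext exits with an error.
        subprocess.run(build_command(pdf_path, txt_path), check=True)
        written.append(txt_path)
    return written
```

Calling `convert_directory(Path("pdf"), Path("txt"))` would write one text file per PDF, where `pdf` and `txt` are placeholder directory names. Text extracted this way usually keeps page headers, footers and line breaks from the PDF layout, so some cleaning may be needed before aligning the parallel texts.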

Data format / data structure

Text

Creator/Principal investigator(s)

Simon Dahlberg - Institute for Language and Folklore, Language Council of Sweden

Institute for Language and Folklore, Language Council of Sweden

Data collection

  • Mode of collection: Self-administered writings and/or diaries: web-based
  • Time period(s) for data collection: 2017-01-01–2017-01-31

License

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Published: 2020-03-30
Last updated: 2020-05-15