Choosing a file format


When you choose a file format, select one or more formats that make it possible to reuse the project data, now and in the future. As a general rule, file formats that are suitable for long-term preservation and accessibility: 

  • are commonly used
  • can be read by multiple software programs
  • are well-documented, meaning that it’s possible to find a technical specification that explains how information is stored in the format
  • are open
  • are non-proprietary

A format that is common in your field of research may be proprietary or run the risk of becoming obsolete. In that case, you may want to make the data accessible in the format that is standard in your field, and also attach a copy of the data in a format better suited for long-term preservation and accessibility.
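As a rough illustration of attaching a preservation-friendly copy, the sketch below writes records that have already been loaded from a field-specific format to CSV text. The function name `write_csv_copy`, the field names, and the sample values are hypothetical and not part of SND's guidance. Reading the original format is left out because it depends on the software used.

```python
import csv
import io


def write_csv_copy(records, fieldnames, stream):
    """Write a list of dict records to `stream` as CSV with a header row."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


# Hypothetical sample records standing in for data exported from another format.
sample_records = [
    {"respondent_id": 1, "age": 34, "municipality": "Göteborg"},
    {"respondent_id": 2, "age": 51, "municipality": "Umeå"},
]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_csv_copy(sample_records, ["respondent_id", "age", "municipality"], buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, the file could be opened with `open(path, "w", encoding="utf-8", newline="")` and passed as `stream`, so that the copy is stored as UTF-8 text.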

You can find more advice on file management and file formats on our Data Management pages. If you want to go deeper, you can find more comprehensive best practice guides for a number of file types and file formats in our Guides, or by clicking the headings in the tables below.

You can also find more information on digital file formats on the Library of Congress website: https://www.loc.gov/preservation/digital/formats/

File formats for data accessibility

To help you figure out whether a certain file format complies with the criteria above, SND has compiled a list of file formats that are suitable for making research data accessible.

Recommended formats: Formats that SND considers to have the highest probability of remaining accessible and readable in the future.

Accepted formats: Formats that are commonly used and that SND accepts for making data accessible. These formats have good prospects of remaining readable over the longer term, but the files may have to be converted into other formats for long-term preservation.

Read more about file formats for different types of data in SND's guides to best practice. Click the links below to open each guide. 

Note that this isn’t a complete list of file formats and that it will be updated in line with technical developments. The preferred format in your research field may also be one that is not included in the lists below. If that is the case, please contact SND or your local research data support unit for further discussion and assistance. Note also that the file formats below may be converted into another format for archiving.
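One way to check project files against the lists is to look up each file extension. The sketch below is only an illustration: `FORMAT_LISTS` is a hypothetical, partial excerpt of the data types listed on this page, and the function name `data_types_for` is an assumption. It does not distinguish recommended from accepted formats.

```python
from pathlib import PurePath

# Partial, hand-copied excerpt of the lists on this page (hypothetical helper data).
FORMAT_LISTS = {
    "Text documents": {".docx", ".odt", ".pdf", ".txt", ".md", ".doc", ".rtf"},
    "Spreadsheets": {".csv", ".tsv", ".ods", ".xlsx", ".xls"},
    "Image (bitmap/raster)": {".png", ".tif", ".tiff", ".jpg", ".jpeg"},
}


def data_types_for(filename):
    """Return the data types whose list mentions the file's extension."""
    extension = PurePath(filename).suffix.lower()
    return sorted(name for name, exts in FORMAT_LISTS.items() if extension in exts)
```

For example, `data_types_for("survey.CSV")` returns `["Spreadsheets"]`. An empty list means the extension is not in the excerpt, which is the situation where contacting SND or a local research data support unit is suggested.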

Recommended formats

Accepted formats

Text documents

  • MS Word (.docx)
  • OpenDocument Text (.odt) 
  • PDF/A (.pdf) 
  • Unicode (.txt) 
  • Markdown (.md) 
  • MS Word (.doc)
  • PDF (.pdf)
  • Rich Text Format (.rtf)

Markup language

  • HTML (.html)
  • JSON (.json)
  • XML (.xml)
  • SGML (.sgml)
  • Markdown (.md)

Spreadsheets

  • CSV (.csv) 
  • TSV (.tsv) 
  • OpenDocument Spreadsheet (.ods) 
  • MS Excel (.xlsx) 
  • MS Excel (.xls)
  • OOXML (.docx, .docm)

Databases

  • CSV (.csv)
  • SIARD (.siard)
  • SQL (.sql)
  • dBase III or IV (.dbf)
  • FileMaker Pro (.fp7, .fmp12)
  • MS Access (.mdb, .accdb)
  • OpenDocument Base (.odb)

Statistical data

  • CSV (.csv) 
  • TSV (.tsv) 
  • OpenDocument Spreadsheet (.ods) 
  • MS Excel (.xlsx) 
  • MS Excel (.xls) 
  • R (.rdata, .rda) 
  • SPSS SAV (.sav) 
  • SPSS Portable (.por)
  • STATA (.dta) 
  • SAS (.sas) 
  • SAS transport (.xpt) 

Image (bitmap/raster)

  • PNG (.png) 
  • TIFF (.tif, .tiff) 
  • Adobe Digital Negative format (.dng) 
  • DICOM (.dcm) 
  • JPEG (.jpg, .jpeg) 
  • JPEG 2000 (.jp2, .jpx) 
  • Adobe Photoshop document file (.psd) 
  • Raw image data (various formats) 

Image (vector)

  • SVG (.svg)
  • Adobe Illustrator (.ai)
  • AutoCAD Drawing Interchange Format (.dxf)
  • EPS (.eps)
  • PDF/A (.pdf)
  • PDF (.pdf)
  • WebCGM 2.1 (.cgm)

Video

  • Lossless AVI (.avi)
  • Matroska (.mkv)
  • MPEG-1 (.mpg, .mpeg, …)
  • MPEG-2 (.mpg, .mpeg, …)
  • MPEG-4 H.264 (.mp4)
  • MPEG-4 Part 14/MP4 (.mp4)
  • Audio Video Interleave (.avi) 
  • MXF (.mxf) 
  • Motion JPEG 2000 (.mj2, .mjp2)
  • QuickTime (.mov) 

Audio

  • AIFF (.aif, .aiff)
  • Broadcast Wave Format (.bwf)
  • FLAC (.flac)
  • Matroska (.mka)
  • MPEG-1 (.mpg, .mpeg, …)
  • MPEG-2 (.mpg, .mpeg, …)
  • Waveform Audio (.wav)
  • AAC (.aac)
  • Audio Video Interleave (.avi)
  • MP3 (.mp3)
  • MPEG-4 Part 14/MP4 (.mp4)
  • Ogg Vorbis (.ogg)
  • Opus (.opus)
  • Speex (.speex)

Geographical information (GIS)

  • Digital Elevation Model (DEM) Format (.dem) 
  • Geographic Markup Language (.gml) 
  • GeoTIFF (.tif, .tiff) 
  • CSV (.csv) 
  • GeoJSON (.geojson) 
  • OGC GeoPackage (.gpkg) 
  • NetCDF (.nc) 
  • ArcInfo Interchange (.e00) 
  • ESRI GRID (.adf, .asc, .grd) 
  • ESRI Shapefile (.shp) 
  • LAS (LASer) File Format (.las)
  • MapInfo (.tab, .dat) 
  • MapInfo Interchange Format (.mif, .mid) 
  • Keyhole Markup Language (.kml) 

RDF (Resource Description Framework)

  • RDF/XML (.rdf) 
  • JSON-LD (.jsonld) 
  • Turtle (.ttl) 
  • N-Triples (.nt) 

3D data

  • COLLADA (.dae) 
  • CSV (.csv) 
  • Universal 3D (.u3d) 
  • Wavefront OBJ file (.obj) 
  • X3D (.x3d) 
  • NetCDF (.nc) 
  • HDF (.hdf)
  • 3DS (.3ds)
  • AutoCAD Drawing Interchange Format (.dxf)
  • Autodesk 3D asset exchange format (.fbx)
  • Stanford polygon file format (.ply)
  • STL 2.0 (.stl)
  • Virtual Reality Modelling Language (.vrml, .wrl, .vrl)