Data formats & organisation
This module will show the many forms data can take and how you can organise them. We will explain folder and file naming systems, showing you how to organise your research and demonstrate good record keeping practice. Data management planning templates are available for download here or directly from our library data management page here.
Section 1: Different formats research data may take
Section 2: Folder structures
Section 3: File naming conventions
Different formats research data may take
Images: Brett Jordan, Markus Spiske and Maksym Kaharlytskyi on Unsplash
Your research data, when properly organised and managed, can document how your conclusions were reached, validate your findings and help troubleshoot issues in your work. Well organised research data will be interoperable for reuse in future projects.
Research data can take many forms, including:
- Artefacts, specimens, physical samples
- Content analysis
- Datasets
- Documents and spreadsheets
- Experimental data
- Films, audio or video tapes/files, newspapers
- Focus group recordings, interview notes
- Lab or field notebooks or diaries
- Models, algorithms, scripts
- Photographs or images
- Questionnaires, transcripts, surveys
- Sensor readings
- Spatial or Geospatial: data related to a location
Folder structures
A standard folder structure will assist with organising, locating and sharing your research data. To create a folder naming convention:
- Assign folders real titles that can be understood by external parties
- Create new folders and sub folders for new areas of work
- Prefix folder names with numbers, e.g. 01, 02 etc., to order folders by the steps in your workflow. Use a leading zero before single-digit numbers to maintain numerical order.
We recommend documenting your folder structures in a “README” file saved in each folder set. The README file will explain your folder system and outline the contents of your folders.
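The folder guidance above could be sketched in Python as follows. This is only an illustration under stated assumptions: the step names in STEPS, the helper names and the README.txt file name are hypothetical and are not part of any required standard.

```python
from pathlib import Path

# Hypothetical workflow steps; replace with titles that suit your own project.
STEPS = ["raw_data", "processed_data", "analysis", "outputs"]


def numbered_folder_names(steps, width=2):
    """Prefix each step with a zero-padded number so folders sort in workflow order."""
    return [f"{i:0{width}d}_{name}" for i, name in enumerate(steps, start=1)]


def create_structure(root, steps):
    """Create the numbered folders under root and write a README listing them."""
    root = Path(root)
    names = numbered_folder_names(steps)
    for name in names:
        (root / name).mkdir(parents=True, exist_ok=True)
    # The README wording is a placeholder (an assumption); describe your own contents.
    lines = ["Folder structure", ""]
    lines += [f"{name}/ : describe the contents of this folder here" for name in names]
    (root / "README.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return names
```

Calling create_structure("my_project", STEPS) would create folders such as 01_raw_data and 02_processed_data inside my_project, together with a README.txt that you can then fill in.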
File naming conventions
A File Naming Convention (FNC) or protocol allows you to systematically name your files, describing not only what each file contains but also how the project files relate to each other.
Best practice is to set up and standardise your naming conventions and file structures early in your research project. This will help you to build your frameworks and organise your research data to match the sections of your dissertation.
Prior to starting your project, it is very important that you check for standards or existing file naming conventions in your field. This is particularly important when working with external parties where FNCs are already in use, as any changes will need to be agreed upon before commencing the project.
If you are creating your own file naming conventions, be sure to document how they work and how they were developed. Once decided, record your file naming structure in a README file located in the folders where your documents are stored. The hardest part of file naming conventions is compliance. For best results, start to embed the systems early in your project to make them part of your research practice.
When creating your file naming convention, make sure that your system is:
- machine readable
- human readable
- ordered
Machine readable: A machine-readable format makes it easy to search your files. You can refine your file lists based on how you have named them, and you can extract the information you need from the file name alone.
Human readable: File names should be easy for other people to read and understand, so that your colleagues, collaborators or supervisors can quickly decipher what is in each file. To make files readable, give them meaningful and consistent titles that identify which project the file belongs to, which part of the project it applies to and when it was created.
Ordered: The creation date is the best place to begin organising your files. We strongly recommend formatting dates for file management in YYYYMMDD order, e.g. 20220815, in line with the international standard ISO 8601. If you are using additional numbering systems, use a leading zero (e.g. 01, 02) so that files stay in numerical order.
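To see why the date order and the leading zero matter, the short Python sketch below compares how file names sort alphabetically under different choices. The file names (notes, run) and dates are made up for illustration only.

```python
from datetime import date


def date_prefix(d):
    """Format a date as YYYYMMDD, the ISO 8601 basic format."""
    return d.strftime("%Y%m%d")


# Hypothetical creation dates, deliberately out of order.
dates = [date(2022, 8, 15), date(2021, 12, 1), date(2022, 1, 20)]

# YYYYMMDD prefixes: alphabetical order is also chronological order.
iso_sorted = sorted(f"{date_prefix(d)}_notes.txt" for d in dates)

# DDMMYYYY prefixes: alphabetical order no longer follows the calendar.
dmy_sorted = sorted(f"{d.strftime('%d%m%Y')}_notes.txt" for d in dates)

# Additional numbering with and without a leading zero.
unpadded_sorted = sorted(f"run_{n}.csv" for n in (1, 2, 10))
padded_sorted = sorted(f"run_{n:02d}.csv" for n in (1, 2, 10))

print(iso_sorted)
print(dmy_sorted)
print(unpadded_sorted)
print(padded_sorted)
```

With the YYYYMMDD prefix the December 2021 file sorts first and the August 2022 file last. With the day first, the August 2022 file lands before the January 2022 one. In the same way, run_10.csv sorts ahead of run_2.csv unless the numbers are zero-padded.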
An example file name convention
YYYYMMDD_project_element_version.doc
Applied FNC:
20220818_researchinterviews_candidate-1-transcript_01.doc
- YYYYMMDD = Date in standardised, agreed format
- Overarching project = examples include location, project section, chapter or article
- File element = data analysis for a chapter, a specific interviewee transcript, datapoint in the field
- Document version = 01, 02, 03 etc.; keep this simple
File name sections or chunks are separated by an underscore or hyphen.
This example uses underscores between key sections and hyphens to connect the elements.
You can clearly see what the file is about, the version and when the file was created. The convention will allow you to add additional files that can be easily searched based on the file name.
TIP! DO NOT use punctuation (other than the underscores and hyphens that separate your chunks), upper case or special characters, as these may alter the file order.
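If you generate file names in a script, the first convention could be applied with a small helper like the one sketched below. The function names clean_chunk and build_file_name are hypothetical, and the cleaning rule (lower case, hyphens in place of anything that is not an unaccented letter or a digit) is one possible reading of the tip above, not a fixed standard.

```python
import re
from datetime import date


def clean_chunk(text):
    """Lower-case a chunk and replace runs of other characters with a single hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_file_name(created, project, element, version, extension):
    """Join date, project, element and a zero-padded version with underscores."""
    chunks = [
        created.strftime("%Y%m%d"),  # date as YYYYMMDD
        clean_chunk(project),
        clean_chunk(element),
        f"{version:02d}",            # 1 becomes 01
    ]
    return "_".join(chunks) + "." + extension.lower().lstrip(".")


name = build_file_name(
    date(2022, 8, 18), "researchinterviews", "Candidate 1 Transcript", 1, "doc"
)
print(name)
```

This reproduces the applied example, 20220818_researchinterviews_candidate-1-transcript_01.doc, while tidying the mixed case and spaces in the element chunk.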
Second example of a file name convention
YYYYMMDD_SiteA_SensorB.csv
Applied FNC:
20210621_Southport_Humidity.csv
- YYYYMMDD = Date in standardised, agreed format
- SiteA = Location
- SensorB = Sensor
File name sections or chunks are separated by an underscore.
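Because every chunk sits in a fixed position, a script can recover the date, site and sensor from the file name alone. The sketch below is one way to do this in Python. Apart from 20210621_Southport_Humidity.csv, the file names in the list are invented for illustration, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Pattern for YYYYMMDD_Site_Sensor.csv, with chunks separated by underscores.
PATTERN = re.compile(r"^(?P<date>\d{8})_(?P<site>[^_]+)_(?P<sensor>[^_]+)\.csv$")


def parse_file_name(name):
    """Return the date, site and sensor from a file name, or None if it does not fit."""
    match = PATTERN.match(name)
    if match is None:
        return None
    try:
        recorded = datetime.strptime(match["date"], "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that are not a real calendar date
    return {"date": recorded, "site": match["site"], "sensor": match["sensor"]}


def files_for_sensor(names, sensor):
    """Keep only the file names whose sensor chunk matches."""
    selected = []
    for name in names:
        info = parse_file_name(name)
        if info is not None and info["sensor"] == sensor:
            selected.append(name)
    return selected


file_names = [
    "20210621_Southport_Humidity.csv",
    "20210622_Southport_Temperature.csv",  # invented example
    "20210622_Northport_Humidity.csv",     # invented example
    "notes.txt",                           # does not follow the convention
]
humidity_files = files_for_sensor(file_names, "Humidity")
print(humidity_files)
```

The same approach works for filtering by site or date range, which is the practical benefit of a machine-readable name.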
This section has been adapted from Jenny Bryan’s Naming Things presentation for the Reproducible Science Workshop.
Links and references
File and folder naming conventions
The following links contain excellent additional information and tips for naming and organising your files.
University of Edinburgh Records management guide
CESSDA Data Management Expert Guide
Reproducible Science - how to name files, particularly the content around slide 27