Research Data Management

Make informed choices for research data. RDM, policy, practical guidelines, software and tools at VU Amsterdam. FAIR data, archiving, storage, publication

Data Analysis

Although data analysis is an ongoing process throughout the research project, this page focuses on the analysis of the data after its collection. To ensure that research is empirical and verifiable, it is crucial that researchers keep records (data documentation) of every step taken during the data analysis.

Data analysis converts raw or processed data into information that is useful for understanding. Many steps may be required to extract useful information from raw data. Processing and analysing data may require computing power that is not readily available, or specific storage and protection options. If multiple parties are involved in the analysis, data sharing may also be necessary.

Data analysis often requires the use of specialised software. The software offered and licensed by the university currently includes Stata, SPSS, and ATLAS.ti. Some of the software is available for download at download.vu.nl. For open software, see below.

In some cases researchers write their own scripts to analyse the data. At the VU, most scripts are written in R, Python and SQL.
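
As a minimal sketch of how a self-written script could keep the step-by-step records this page calls for, the Python example below appends one entry to a log for each analysis step. The function names (`fingerprint`, `record_step`, `run_analysis`), the log fields and the example values are all hypothetical and chosen only for illustration. This is one possible approach, not a method prescribed by the VU.

```python
import hashlib
import json
from datetime import datetime, timezone
from statistics import mean


def fingerprint(values):
    """Return a SHA-256 hash of the data, so a later reader can verify the input."""
    payload = json.dumps(values, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_step(log, name, params, data):
    """Append one analysis step (what, how, when, on which data) to the log."""
    log.append({
        "step": name,
        "params": params,
        "input_sha256": fingerprint(data),
        "n_values": len(data),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })


def run_analysis(raw, max_value):
    """Drop values above max_value, then compute the mean, logging both steps."""
    log = []
    record_step(log, "filter_outliers", {"max_value": max_value}, raw)
    cleaned = [v for v in raw if v <= max_value]
    record_step(log, "compute_mean", {}, cleaned)
    return mean(cleaned), log


# Hypothetical example data; in practice this would be read from your dataset.
result, analysis_log = run_analysis([4.0, 5.0, 6.0, 250.0], max_value=100.0)
print(result)
print(json.dumps(analysis_log, indent=2))
```

Saving such a log next to the results (for example as a JSON file) gives a record of which steps were run, with which parameters, and on which version of the data.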

If you want to read up on data analysis, check which journal articles and books the VU library has available on the subject.

Open Software

Using open software increases the Accessibility, Interoperability and Reusability of your data. For that reason, we recommend that you use open software as much as possible for your data analysis. This could be software, code or scripts that you have written yourself; where possible, please make this software public, so that your analysis is reproducible. Examples of open software are R and Python, which can be used instead of proprietary, commercial software such as SPSS and MATLAB.

Researchers often write their software themselves. There are also organisations that specialise in writing research software, such as the eScience Center. The eScience Center makes the software it builds freely available online. Its software is assigned a DOI and stored in Zenodo as well as on GitHub.

If you use software for analysing personal or otherwise sensitive data, you need a processing agreement with the developer if the software does not run locally. You can contact your Privacy Champion if you are not sure whether you need one, or for help with setting up a processing agreement.

There are several ways to start using open software:

  • For Python: install Anaconda and launch Jupyter Notebook from the Navigator.
  • For R: install Anaconda and launch RStudio from the Navigator.
  • Use the Software Carpentry lessons to learn the basics of programming in Python and R, and version control with Git.
  • Read the recommendations for FAIR Software.
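
Once Python is installed, one small step towards a reproducible analysis is to record the software environment alongside your results. The sketch below uses only the Python standard library. The function name `describe_environment` and the package names passed to it are hypothetical examples; substitute the packages your own analysis actually uses.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Collect interpreter, OS and package versions for the given package names."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "packages": versions,
    }


# Example package names only; list the packages your analysis depends on.
env = describe_environment(["numpy", "pandas"])
print(json.dumps(env, indent=2))
```

Publishing this information together with your code helps others rerun the analysis with the same software versions.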

The VU has several research groups that offer their code online. You can find them here:

  • The Systems Bioinformatics research group, on GitHub
  • The Computational Lexicology & Terminology Lab, on GitHub
  • The course Python for Text Analysis, on GitHub
  • VU RDM Tech IT group, on GitHub
  • A list of RDM tools, on GitHub