Although data analysis is an ongoing process throughout the research project, this page focuses on the analysis of data after it has been collected. To ensure that research is empirical and verifiable, it is crucial that researchers keep records (data documentation) of every step taken during the data analysis.
Data analysis converts raw or processed data into information that is useful for understanding. Many steps may be required to gain useful information from raw data. Cleaning and analysing data may require computing power that is not readily available, or specific storage and protection options. If multiple parties are involved in the analysis, data sharing may also be necessary.
Data analysis often requires the use of specialised software. The software offered and licensed by the university currently includes Stata, SPSS, and ATLAS.ti. Some of the software is available for download at download.vu.nl. For open software, see below.
In some cases researchers write their own scripts to analyse the data. At the VU, most scripts are written in R, Python and SQL.
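As an illustration of how a self-written script can also document its own steps, here is a minimal Python sketch. The function names (`log_step`, `analyse`), the log format, and the example scores are hypothetical and chosen only for this illustration; they are not a VU standard or a prescribed method.

```python
import json
import statistics
from datetime import datetime, timezone


def log_step(log, description, **details):
    """Append one analysis step, with a UTC timestamp, to the log."""
    log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "step": description,
        "details": details,
    })


def analyse(raw_scores):
    """Clean and summarise a list of scores, documenting each step."""
    log = []
    log_step(log, "loaded raw data", n_records=len(raw_scores))

    # Cleaning: drop missing values (None) and record how many were removed.
    cleaned = [s for s in raw_scores if s is not None]
    log_step(log, "removed missing values",
             n_removed=len(raw_scores) - len(cleaned))

    # Analysis: compute simple descriptive statistics.
    summary = {
        "mean": statistics.mean(cleaned),
        "median": statistics.median(cleaned),
    }
    log_step(log, "computed descriptive statistics", **summary)
    return summary, log


if __name__ == "__main__":
    # Made-up example data; None stands for a missing observation.
    summary, log = analyse([4, 7, None, 9, 6])
    print(json.dumps(log, indent=2))
```

Saving such a log next to the results is one simple way to keep the analysis steps traceable; a lab notebook or a version-controlled README can serve the same purpose.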
If you want to read up on data analysis you should check out what journal articles and books the VU library has available on the subject:
In many fields it is common to hold interviews or focus group sessions, or to make other observations, that are recorded on video or audio. If you have such recordings and need them transcribed, there are several ways to do this. One option is to transcribe them by hand, although this is very time-consuming.
Another option is to pay a transcription service to make the transcription or to use specialised software. The VU has drawn up processing agreements with one transcription service, Transcript Online, and one transcription software service, Amberscript.
You can find more information on what these transcription options do, how they work, how much they cost, and how they can be used:
- here (in Dutch)
- here (in English)
Using open software increases the Accessibility, Interoperability and Reusability of your data. For that reason, we recommend that you use open software as much as possible for your data analysis. This could be software, code or scripts that you have written yourself. Where possible, please make this software public, so that your analysis is reproducible. Examples of open software are R and Python, which can be used instead of proprietary, commercial software such as SPSS and MATLAB.
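To make an analysis easier to reproduce, it can help to record which software version produced the results. The following is a minimal Python sketch under that assumption; the function name `describe_run` and the fields it records are hypothetical choices for illustration, not a required format.

```python
import hashlib
import platform
import sys


def describe_run(script_text):
    """Return basic facts about the software used for an analysis run."""
    return {
        # Version of the Python interpreter that ran the analysis.
        "python_version": platform.python_version(),
        # Operating system identifier as reported by Python.
        "platform": sys.platform,
        # Checksum of the script text, so others can verify they have
        # exactly the same code.
        "script_sha256": hashlib.sha256(
            script_text.encode("utf-8")
        ).hexdigest(),
    }


if __name__ == "__main__":
    # A one-line stand-in for the contents of an analysis script.
    print(describe_run("print('hello')"))
```

Publishing this kind of record together with the script gives others a starting point for rerunning the analysis in a comparable environment.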
Researchers often write their software themselves. There are also organisations that specialise in writing research software, such as the eScience Center. The eScience Center offers the software they built for free use online. Their software is tagged with a DOI and stored in Zenodo as well as GitHub.
If you use software to analyse personal or otherwise sensitive data, and the software does not run locally, you need a processing agreement with the developer. Contact your Privacy Champion if you are not sure whether you need one, or for help setting up a processing agreement.
There are several ways in which to start using open software:
The VU has several research groups that offer their code online. You can find them here: