Twenty years ago, the idea of supplementary material for a journal article was practically unheard of. Today, it seems that few papers don’t have some supplementary material associated with them. Why this big change?
Practically anything can go into supplementary material:
- Extra analyses that are too large to go into the article
- Extensive methods, when the journal has a word limit
- The super interesting graphics that your analytical tools throw out, when you can only use one or two in the paper
- Most of all, the raw data that forms the basis of your work, together with the script that you used to analyse it
This is a really positive movement in science and it should receive all of our support, as it is part of the greater scientific project (see part 2). Being proud of your work is being proud of the entire product, including the data and the analysis. If you felt they were good enough for peer review, then you should be happy to make them all freely available. Worried that you might have made some mistakes? We all make mistakes, and you shouldn’t think that you will have an entirely fault-free career. Whenever and wherever possible, conduct your professional due diligence and be open and honest with your research. Science is a career in which we should openly acknowledge errors and mistakes that we have made in the past. There’s no problem with this. Indeed, the problems only start when you refuse to acknowledge past errors and mistakes, making you prone to motivated reasoning and confirmation bias.
There are lots of data repositories out there. Some specialise in particular kinds of data, like genetic data in GenBank, or sound data in The Macaulay Library at the Cornell Lab of Ornithology. Others are general repositories where you can post anything, including the scripts that you used to analyse the data. By doing this you ensure that anyone who wants to use your work in the future has full access to it. Your data deposit will be given a DOI. You can cite this DOI in your publication so that readers know where to find the data.
Once, I happily allowed the publishers to store my data. These days, I simply don’t trust them to curate and keep my data for posterity. Given the way that most publishers of academic journals behave, they have no reason to make sure that my data is safe and secure. Therefore, I would rather go to an independent, not-for-profit curator that specialises in data. One example is Zenodo, which is built and operated by CERN and OpenAIRE. This site has the advantage that it is integrated with GitHub (handy if you are already a user, and worth checking out if you aren’t). You can choose to make your data open immediately, or embargo it until your paper is published. It’s a great platform, and so much more secure than using a publisher. Another example is the Open Science Framework (OSF), which aims to be a one-stop shop for all of your storage needs. Whichever you choose, you will need to make sure that your data is well organised and comes with sufficient metadata for someone else to understand it (see Roche et al., 2015).
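As a rough illustration of what "sufficient metadata" might look like in practice, here is a minimal Python sketch that writes a small data dictionary next to a CSV file and refuses to do so if any column has been left undescribed. The column names, the descriptions and the `_metadata.json` naming are hypothetical, made up for this example; your repository or field may well expect a different format.

```python
import csv
import json
from pathlib import Path

# Hypothetical column descriptions for an imaginary data file.
COLUMN_INFO = {
    "site_id": {"description": "Identifier of the sampling site", "unit": None},
    "body_mass_g": {"description": "Body mass of the individual", "unit": "g"},
}


def undocumented_columns(csv_path: Path, column_info: dict) -> list[str]:
    """Return the CSV header names that have no entry in the data dictionary."""
    with csv_path.open(newline="") as handle:
        header = next(csv.reader(handle))
    return [name for name in header if name not in column_info]


def write_data_dictionary(csv_path: Path, column_info: dict) -> Path:
    """Write a JSON data dictionary beside the CSV file and return its path."""
    missing = undocumented_columns(csv_path, column_info)
    if missing:
        raise ValueError(f"Columns without metadata: {missing}")
    out_path = csv_path.with_name(csv_path.stem + "_metadata.json")
    out_path.write_text(
        json.dumps({"file": csv_path.name, "columns": column_info}, indent=2)
    )
    return out_path
```

Calling `write_data_dictionary(Path("measurements.csv"), COLUMN_INFO)` on a file with those two columns would produce `measurements_metadata.json`, which can be deposited alongside the data.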
Once, we did statistical analysis with a GUI and a point-and-click approach. Forget to click a checkbox and the results were not repeatable. As analyses have become increasingly sophisticated, with more and differing pieces of software, it is becoming increasingly important to record the code that produces your results together with your data. This is the difference between your results (and, therefore, your study) being repeatable or not.
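One way to tie code, data and software environment together is to save a small manifest alongside your results. The Python sketch below is only an illustration under my own assumptions (the function names and the choice of JSON are not a standard): it records a checksum for each file you list, plus the Python version and platform that ran the analysis, so that a later reader can confirm they are working with the same files.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(files: list[Path]) -> dict:
    """Describe the given files and the environment used for the analysis."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "files": {str(p): sha256_of(p) for p in files},
    }


def write_manifest(files: list[Path], out_path: Path) -> dict:
    """Write the manifest as JSON to out_path and return it."""
    manifest = build_manifest(files)
    out_path.write_text(json.dumps(manifest, indent=2))
    return manifest
```

For example, `write_manifest([Path("data.csv"), Path("analysis.py")], Path("manifest.json"))` would list both the raw data and the script. The same idea carries over to R or any other language you analyse your data in.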
Talk to your advisor. Your lab or your institution may have a repository that they prefer.
Let’s face it, you probably put a lot of effort into proposing your thesis hypotheses. Why not deposit these too? Remember that the deposit does not have to be accessible to the world, but it would improve the integrity of your study if you can show that your original hypotheses are the same as the ones that you submit for publication. Some journals are now asking whether you have stored hypotheses ahead of conducting the experiments. They don’t stop you from submitting if you haven’t. But why not? It’s a great step forward in terms of transparency and provides more credibility when you eventually write that manuscript with the same aims. And it shouldn’t stop you from publishing another study that happened along the way, even though you hadn’t planned it. I suspect that in the future funders will require us to deposit our successful funding proposals in an online repository before they release funds. Hopefully, they’ll also be willing to support these facilities financially.