Make your datasets and code citable with a DOI

The result of the Zenodo+GitHub service: a GitHub project can be assigned its own DOI. Image taken from

I recently learned about two interesting possibilities for researchers who want to publish their source code and their datasets in a more academic fashion: both can be assigned a DOI. This may benefit their recognition, outreach, impact, and citability.

Assigning a digital identifier makes the resource unambiguous and keeps the reference valid when the resource moves to a new location. It is especially useful when a dataset or a piece of code has no publication “behind it”.
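To make the "location changes" point concrete: a DOI is cited through the doi.org resolver, which redirects to wherever the resource currently lives, so the citation itself never has to change. Below is a minimal sketch in Python of turning a DOI into its resolver link. The function name `doi_to_url` and the example DOI are my own placeholders, and the validation pattern is only a rough approximation of what modern DOIs look like, not an official rule.

```python
import re
from urllib.parse import quote

# Rough shape of a modern DOI: "10.", a numeric registrant prefix, "/", a suffix.
# This is an approximation for illustration, not an official validation rule.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly paste in front of the bare DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return the doi.org resolver URL for a DOI (hypothetical helper)."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the "/" separators.
    return "https://doi.org/" + quote(cleaned, safe="/")


if __name__ == "__main__":
    # Placeholder DOI, used only to show the format.
    print(doi_to_url("doi:10.5281/zenodo.1234567"))
```

The link produced this way is the one to put in a reference list: if the hosting location changes, the party that minted the DOI updates the target, and the link in the citation stays the same.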

1. For the code, GitHub and Zenodo offer this service. Read this article. The process seems pretty straightforward.

2. For the datasets, visit DataCite. I learned about it from the blog The Thesis Whisperer, which has a nice article on the topic. DataCite (wiki article) is a global non-profit consortium represented by members and regional offices. In the Netherlands, the TU Delft library is in charge of minting DOIs. I have already used their service to assign DOIs to my conference papers that didn’t get one, and it was quick and efficient and left nothing to be desired.

Edit: in the meantime I’ve learned that the DOI points to a specific release of the code, not to the software project itself. Hence, from what I’ve understood, if you release an update of the code you will need to create a new DOI. It therefore seems that a DOI is not the most suitable way to identify a software project that you intend to keep updating.