How to publish research software
This guide collects the advice we give to researchers on how to publish research software. It is not exhaustive. Please get in touch if you think something is missing or should be changed.
What do we mean by "publishing software"?
For many people, "publishing software" just means making it publicly available, e.g. by putting it somewhere online. In our opinion, publishing software is a process, just like publishing a paper. It is the whole process of making your software available so that it can be downloaded and, more importantly, used by others. It is more than just uploading your software to, e.g. a Git repository, and then citing the URL. Publishing your software involves thinking about how your software will be downloaded, installed and used by others, and how a community of users and developers can grow around your software such that it can be sustained and managed over time.
Should I always publish my software?
Before publishing your software, you should first ask the question of whether or not it should be published. Why do you want to publish your software? Ideally, you want to publish because you think your software could be helpful for someone else. Thus, just as you publish a paper because you want someone to read it and build upon your work, you should publish software because you want it to be used and built on by others. This means documenting and testing your software. It also means taking on the responsibility of being the "corresponding author" and helping those people use your software, possibly by replying to any issues they raise while using it, fixing any bugs, or helping them adapt your software to their problem.
It is not a good idea to publish your software and then neglect or abandon it. Abandoned research software falls rapidly out of date. Over time it suffers bitrot such that it can no longer be compiled, or may have bugs or be insecure. Abandoned software adds to the noise when people search for solutions to their problem. It risks wasting their time if your software can't be compiled, can't be used for the user's use-case, or if there is an unfixed bug that leads to incorrect results.
In short, only publish your software if you really want it to be used by others, and you have the time or a plan to support it post-publication.
Publishing sounds like a lot of work, and I don't have much time. What can I do?
You can just put a copy of your software online somewhere, e.g. in a Git repository or uploaded to a website. This would make your software publicly available, but with no guarantees or ongoing support. You would do this if, for example, you wanted to make an archive of your software available so that it can be used to reproduce the results in a paper you've written. In this case, upload the snapshot of the software that was used to generate the results or figures to, e.g. figshare, and then cite the DOI in your paper. Going down this route tells people that your software is being made available only to reproduce that work, and that there is no support available to help someone adapt your software to their work. Going down this route would make your software findable and accessible, but it would limit your software's interoperability and ability to be re-used by others.
Another route is to donate your software to a larger project. You would need to do some work to make your software follow the coding conventions and requirements of that project. But, if the larger project accepts your contribution, then they will (hopefully) take on the maintenance and community management work for you. As a bonus, you will then have a track record as a contributor to a larger project, which always looks good on a funding application or CV.
I do want others to use my software. How should I prepare for publication?
If you do want to publish your software, then decide how much time and effort you are willing to put into supporting it and managing the community post-publication. How large an audience for the software is there? What are its competitors? How large will the community be, and how much will it grow over the years? How will you advertise your software and grow your community? How will you find people to help you, or perhaps even fund you?
The answers to these questions will let you decide how much effort you should put into the different parts of publishing the software, i.e. the larger your planned audience and the longer you want your software to "live", then the more effort you will need to put in to help your software grow.
In the beginning, for a small project, you only need to add a few things to prepare for publication. Your software should:
- Have a well-defined purpose that is documented in an easy-to-find location, e.g. a README file.
- Be available somewhere public, where it is easy for others to ask questions or contribute modifications or bugfixes, e.g. GitLab or GitHub.
- Have at least one example demonstrating how it is used.
- Have clear instructions on how to compile and/or install the software.
- Have some method that can be used to verify that the software is working (at a minimum the expected output from the example, but better, a set of tests that verify the software).
- Have a license that sets out the terms by which others can download, share or contribute to the software. More information about how to license your software is in How to License Research Software.
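As a rough illustration, the checklist above can be turned into a small script that reports which items a project directory appears to be missing. This is only a sketch: the file and directory names it looks for (`README.md`, `LICENSE`, `examples`, `tests` and so on) are assumptions about common conventions, not requirements, and the presence of a file says nothing about its quality.

```python
from pathlib import Path

# Hypothetical file and directory names; adjust to your project's conventions.
CHECKS = {
    "statement of purpose": ["README", "README.md", "README.rst"],
    "license": ["LICENSE", "LICENSE.md", "LICENSE.txt", "COPYING"],
    "example": ["examples", "example"],
    "tests or expected output": ["tests", "test"],
}


def missing_items(project_dir):
    """Return the checklist items for which none of the candidate paths exist."""
    root = Path(project_dir)
    return [
        item
        for item, candidates in CHECKS.items()
        if not any((root / name).exists() for name in candidates)
    ]


if __name__ == "__main__":
    # Report on the current directory when run as a script.
    for item in missing_items("."):
        print(f"Missing: {item}")
```

Running the script from the top level of your project prints one line per missing item, and prints nothing if every item was found.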
If you want your software community to grow, you should add the following, which will make it easier for new users to get your software working on their computer, and to learn how to adapt it to solve their problems:
- Compiled binaries or curated packages for commonly-used operating systems (e.g. Linux, Windows, macOS), ideally uploaded to package repositories (e.g. PyPI or CRAN). These help new users install the software in seconds, meaning you won't lose potentially valuable new members of the community because they can't get your code working.
- A tutorial or more detailed set of documentation that explains how the different features in the software work, and that shows how the software can be used to solve a variety of different problems.
- A set of unit tests that give confidence that the software is validated and that bugs will be picked up and rectified quickly. This builds trust in your software.
- Community guidelines with, perhaps, a code of conduct or contribution guides, so that your community is welcoming and mutually supportive, and everyone feels able to contribute issues, bug reports, bugfixes, etc.
Finally, while the software was originally written by you, and you feel like it is "yours", as your community grows, you should have a document or succession plan in place so that control or "ownership" of the software can gradually move into that community. This could be by defining roles, such as a "release manager" or "documentation manager", or thinking about how a management group could be assembled that could decide, e.g. on what features should be developed, or who has edit rights or who can accept pull requests on the main repository. You need this exit plan because, eventually, you won't be able to work on the software any more (unless you do want to be maintaining this project for free until you retire!).
How do I publish my software?
To publish your software, you first need to identify all of the contributors and make sure that they receive credit. Look at the contributions made to your software, e.g. via the Git commit history, or via issues raised that helped grow your software. Add the names of all of the contributors to a file, such as `AUTHORS` or `CONTRIBUTORS`.
Next, you need to create a release. A release is a named or numbered version of your software that uniquely identifies all of the parts of the software (i.e. all of the source code, documentation, tests etc.). There are many different ways you could name or number your version. Popular version naming schemes are:
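One way to get a first draft of that list from the commit history is to parse the output of `git shortlog -sne`, which prints one line per author with a commit count, name and email address. The sketch below assumes Git is installed and that it is run inside a repository; the function names are illustrative only. It will not find people who contributed through issues or discussions, and the same person may appear more than once if they committed under several email addresses, so treat the result as a starting point to check by hand.

```python
import re
import subprocess

# One line of `git shortlog -sne` output: commit count, a tab, then "Name <email>".
SHORTLOG_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s+<([^>]*)>\s*$")


def parse_shortlog(text):
    """Parse `git shortlog -sne` output into (name, email, commits) tuples."""
    contributors = []
    for line in text.splitlines():
        match = SHORTLOG_LINE.match(line)
        if match:
            commits, name, email = match.groups()
            contributors.append((name, email, int(commits)))
    return contributors


def git_contributors(repo_dir="."):
    """Run git in repo_dir and return its contributors, most commits first."""
    result = subprocess.run(
        ["git", "shortlog", "-sne", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_shortlog(result.stdout)


def format_authors(contributors):
    """Render one "Name <email>" line per contributor."""
    return "\n".join(f"{name} <{email}>" for name, email, _ in contributors) + "\n"
```

For example, `print(format_authors(git_contributors()))` prints a draft contributor list that you can review before saving it to your contributors file.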
- Semantic Versioning (SemVer). Uses a major.minor.patch number, e.g. 1.0.0, 1.1.0, 1.2.1 etc.
- Sequence-based identifiers. Uses a single number that increases on every release, e.g. Version 1, Version 2, Version 3 etc.
- Calendar Versioning (CalVer). Uses the date of the release, e.g. 2020.1 would be the first release in 2020, while 2020.2 would be the second.
- Named Versions. Uses a different name for each version, e.g. macOS Big Sur, Windows XP etc.
- Git Version. Simply uses the (abbreviated) commit hash that Git assigns to the released commit, e.g. 47e007e.
There are also modifications of the above that indicate the nature of the release, e.g. adding "alpha" to refer to an early version that is perhaps unstable, or using "pre-releases" or "release candidates", e.g. RC1, RC2, to indicate versions of your software that are leading up to a formal release.
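To make the SemVer scheme concrete, here is a minimal sketch of parsing a major.minor.patch version and working out the next release number. The pattern is deliberately simplified (the full SemVer specification also allows build metadata), and the function names are our own, not part of any standard tool.

```python
import re

# MAJOR.MINOR.PATCH with an optional pre-release label such as "-rc.1".
# This is a simplified pattern; the full SemVer grammar also allows build metadata.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def parse_semver(version):
    """Split a version string into (major, minor, patch, prerelease)."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch, prerelease = match.groups()
    return int(major), int(minor), int(patch), prerelease


def bump(version, part):
    """Return the release number that follows a final (non pre-release) version."""
    major, minor, patch, prerelease = parse_semver(version)
    if prerelease is not None:
        raise ValueError("bump() only handles final releases in this sketch")
    if part == "major":
        return f"{major + 1}.0.0"  # incompatible changes reset minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"  # new features reset patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # bug fixes only
    raise ValueError(f"unknown part: {part!r}")
```

With this sketch, `bump("1.0.0", "patch")` gives `"1.0.1"`, `bump("1.1.0", "minor")` gives `"1.2.0"`, and `bump("1.2.1", "major")` gives `"2.0.0"`.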
Next, you should decide on how this release of the software should be cited. This is important to help you get credit when other people use your software to produce their research outputs. Citation could be as simple as just the names of you and the other contributors, the software name and version name or number, and the URL of the software (e.g. the Git repository or software website). This information should be easy to find, e.g. place it near the top of your README file or software website.
You can make citation easier by writing a CITATION.cff file for your software. This is a plain-text file in the standard Citation File Format that makes it easy for others to cite your software. If you add this file to your GitHub repository, then you will get a "Cite this repository" link that generates citations in APA and BibTeX format.
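A CITATION.cff file is short enough to write by hand, but as an illustration of what goes into one, the sketch below renders a minimal file as text. It assumes version 1.2.0 of the Citation File Format, in which `cff-version`, `message`, `title` and `authors` are the required keys; the function name and all of the example values are hypothetical.

```python
import json


def render_citation_cff(title, version, date_released, repository, authors):
    """Render a minimal CITATION.cff (Citation File Format 1.2.0) as text.

    authors is a list of (given_names, family_names) pairs.
    """
    quote = json.dumps  # JSON strings are valid YAML double-quoted scalars
    lines = [
        "cff-version: 1.2.0",
        f"message: {quote('If you use this software, please cite it as below.')}",
        f"title: {quote(title)}",
        f"version: {quote(version)}",
        f"date-released: {quote(date_released)}",  # expected as YYYY-MM-DD
        f"repository-code: {quote(repository)}",
        "authors:",
    ]
    for given, family in authors:
        lines.append(f"  - given-names: {quote(given)}")
        lines.append(f"    family-names: {quote(family)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values only; replace them with your own project's details.
    print(render_citation_cff(
        title="Example Tool",
        version="1.0.0",
        date_released="2020-01-01",
        repository="https://example.org/example-tool",
        authors=[("Ada", "Lovelace")],
    ))
```

Save the output as `CITATION.cff` in the top-level directory of your repository, and update the version and release date each time you make a new release.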
Now that you have everything in place, create a release, e.g. by using the create a release workflow in GitHub. This will create a downloadable package of your software, as well as creating a Git tag for that release.
How can I get a DOI for my software release?
It is very easy to get a digital object identifier (DOI) for software. There are two main routes that we recommend:
- Upload the package containing your software release to figshare and create a DOI there.
- If you are using a Git repository, then connect this to Zenodo and have a DOI generated automatically for every release. A full guide on how to automate this using a GitHub webhook is available here.
How can I get a peer-reviewed citable reference for my software?
It can be useful to have your software peer-reviewed. This adds a community mark of quality that can help others better trust your code, and creates a peer-reviewed citable reference that can proxy for a "research output" for organisations that still don't recognise software outputs alone.
Software journals, such as the Journal of Open Source Software and the Journal of Open Research Software, exist to make it easy for software developers to publish software papers, and for the research community to effectively peer review those papers. Software papers are designed to be easy to write if you have already gone to the above effort of creating a well-defined software release. The software paper is a simple addition that describes the release. It gives the high-level functionality of the software, a statement of need that explains who the audience is and why the software is needed, a short description of the software's features, and references to where it is being used.
Software papers are supposed to be small (perhaps just 1000 words) and should only take a few hours to write if you have already created a well-defined software release.
- How to write and submit a paper to the Journal of Open Source Software
- How to write and submit a paper to the Journal of Open Research Software
How can I manage multiple releases of my software?
It is a good idea to use a Git tag or a branch for each release of your software. With a branch, you can continue development on old versions by adding new code onto the end of that branch (e.g. if you wanted to back-port bug fixes from the newest code to an old version). Always remember to create a new release whenever you make a new version public. This could be by incrementing the patch number if you are using the SemVer versioning scheme (e.g. adding a bugfix to 1.0.0 and then releasing would give version 1.0.1). Also remember to update your CITATION.cff file (or citation instructions) and CONTRIBUTORS or AUTHORS files to include all of the contributors to your new version. It is important that all contributors receive credit for their work.
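When back-porting a fix to an older series, the new release number depends on which patch releases already exist in that series, not on the newest version overall. The sketch below works this out from a list of tag names. It assumes that releases are tagged as `vMAJOR.MINOR.PATCH` (e.g. `v1.0.1`), which is a common convention but not a rule, and the function name is our own.

```python
def next_patch_release(tags, series):
    """Return the next patch version in a MAJOR.MINOR series, e.g. "1.0".

    tags are release tag names such as "v1.0.1"; tags that are not
    plain vMAJOR.MINOR.PATCH (e.g. release candidates) are ignored.
    """
    major, minor = (int(n) for n in series.split("."))
    patches = []
    for tag in tags:
        parts = tag.lstrip("v").split(".")
        if len(parts) == 3 and all(p.isdigit() for p in parts):
            if (int(parts[0]), int(parts[1])) == (major, minor):
                patches.append(int(parts[2]))
    # Start the series at patch 0 if it has no releases yet.
    next_patch = max(patches) + 1 if patches else 0
    return f"{major}.{minor}.{next_patch}"
```

For example, given the tags `v1.0.0`, `v1.0.1`, `v1.1.0` and `v2.0.0`, a bug fix back-ported to the 1.0 series would be released as 1.0.2, even though 2.0.0 already exists.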
If you have connected the Zenodo webhook to your GitHub repository then you will get a new DOI created automatically for you for each release. You can then choose to use the new DOI to cite that specific release, or to use the umbrella DOI generated by Zenodo for all releases.
Do I need to write a new software paper for each new release?
In general, you don't need to write a new software paper for your new release. The original software paper should continue to be sufficient as the peer-reviewed citable reference. We only recommend writing a new software paper if there is a significant change in functionality of the software, e.g. moving from version 1.X to 2.X, or a major new release that represents a significant new amount of scholarly work.