How to license research artifacts?

  • Other researchers interested in your work don’t have to start from scratch and spend weeks, months, or even years to find what you already found. Science would hardly progress if we had to reinvent the wheel over and over again.
  • Newcomers can easily start exploring your research field. By way of comparison, in my research field, software engineering, there is a sub-field called mining software repositories (MSR). Doing MSR research, say, 10 years ago was much more challenging than it is today. In fact, it has become so easy to mine repositories that even students with little to no programming experience can start mining something in a matter of hours.
  • Reproduction becomes easier if we have the research artifacts in our hands. Yes, we can try our best to recreate the very same scenario, but we may end up missing some variables here and there (which may not even have been described in the research paper). As noted in some other studies, small changes in the experiment can have non-trivial impacts on the findings.
  • We can find (and fix) research bugs more easily and quickly. Only with the original research artifacts is it possible to find possible bugs in the data or analysis. If we face a bug when trying to recreate the research artifact based on what was written in the paper, should we attribute that bug to ourselves or to the original research?
  • Source code files: there is a plethora of open source licenses available out there. The most common ones are the MIT license, the BSD license, the Apache License 2.0, and the GPL family of licenses.
  • Data files: examples of data files include spreadsheets, .csv files, .rda files (for R programming), etc. Licenses for data files include OFL 1.0, PDDL, CDLA-Permissive-1.0, ODC-BY, and some of the Creative Commons family of licenses.
  • Document files: for ordinary documents, it is more common to use one of the Creative Commons family of licenses. Since some licenses under the Creative Commons umbrella have some issues with the GPL, the Free Software Foundation recommends the use of the “GFDL” (the GNU Free Documentation License). The GFDL is widely used in GNU manuals. However, use the GFDL with care, since it is not permitted as the only acceptable license in some circumstances (e.g., if the creative work is not software-related, like ordinary paintings).
  • Image files: similar to document files, as far as I can tell, Creative Commons is also, by far, the most used license family for image files. Indeed, image-hosting websites such as Flickr have their own section dedicated to materials under Creative Commons.
  • Font files: perhaps the most used font license is the OFL (the same license mentioned for data files, but in its newer version, 1.1).
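
As a rough illustration of the list above, here is a minimal Python sketch that suggests candidate licenses for a file based on its extension. It is only a sketch under stated assumptions: the function name `suggest_licenses` and the extension table are hypothetical, and the license names are written as SPDX identifiers. Where the list above names only a license family (BSD, GPL, Creative Commons, GFDL), the specific variant or version chosen here (BSD-3-Clause, GPL-3.0-or-later, CC-BY-4.0, GFDL-1.3-or-later) is my assumption, not a recommendation.

```python
from pathlib import Path

# Candidate licenses per artifact category, written as SPDX identifiers.
# Categories and license families follow the list above; the specific
# variants (BSD-3-Clause, GPL-3.0-or-later, CC-BY-4.0, GFDL-1.3-or-later)
# are assumptions made for this sketch.
CANDIDATE_LICENSES = {
    "source code": ["MIT", "BSD-3-Clause", "Apache-2.0", "GPL-3.0-or-later"],
    "data": ["OFL-1.0", "PDDL-1.0", "CDLA-Permissive-1.0", "ODC-By-1.0", "CC-BY-4.0"],
    "document": ["CC-BY-4.0", "GFDL-1.3-or-later"],
    "image": ["CC-BY-4.0"],
    "font": ["OFL-1.1"],
}

# Hypothetical extension-to-category table; extend it for your own project.
EXTENSION_CATEGORY = {
    ".py": "source code", ".r": "source code", ".java": "source code",
    ".csv": "data", ".rda": "data", ".xlsx": "data",
    ".md": "document", ".pdf": "document", ".tex": "document",
    ".png": "image", ".jpg": "image", ".svg": "image",
    ".ttf": "font", ".otf": "font",
}


def suggest_licenses(filename):
    """Return (category, candidate licenses), or (None, []) for unknown files."""
    category = EXTENSION_CATEGORY.get(Path(filename).suffix.lower())
    if category is None:
        return None, []
    return category, CANDIDATE_LICENSES[category]


if __name__ == "__main__":
    # Example file names are made up for illustration.
    for name in ["analysis.R", "results.csv", "paper.pdf", "figure1.png"]:
        category, licenses = suggest_licenses(name)
        print(f"{name}: {category} -> {', '.join(licenses)}")
```

A lookup table like this only narrows down the options. The final choice still depends on what you want others to be able to do with each artifact.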




Assistant professor @ Federal University of Pará

Gustavo Pinto
