First Principles in Research Data Management

2 The FAIR Principles and Research Data Management

Minglu Wang and Dany Savard

Learning Outcomes

By the end of this chapter you should be able to:

  1. Explain the history of the FAIR principles.
  2. Understand some of the key meanings, requirements, and underlying mechanisms of the FAIR principles.
  3. Be familiar with the tools and frameworks available to help improve the FAIRness of data.
  4. Understand how FAIR principles are included and referenced in research policies and data availability policies.
  5. Evaluate how research data repositories support FAIR principles.
  6. Find communities or initiatives that are using the FAIR principles within the Research Data Management ecosystem.


The FAIR principles (Findable, Accessible, Interoperable, Reusable) are guiding principles that aim to encourage data stewards to improve the ways in which research data can be found and reused by computational systems in today’s growing, complex data ecosystem. In this chapter, we’ll explore the scope of the principles and the tools you can use to evaluate and enhance the FAIRness of a dataset. We’ll also discuss the impact of the principles and explore how they have been endorsed.

Brief History of FAIR Principles

Why Do We Need Guiding Principles for Research Data?

Research Data Management (RDM) requirements were first proposed by national research funders in European countries because of the rise of data-intensive science. Requirements around Data Management Plans (DMPs), data citation, and data availability have since become important for the responsible conduct of research and have introduced new conditions for researchers seeking to publish or receive public funding (Hrynaszkiewicz et al., 2020). Since then, data stewards have helped researchers meet RDM requirements by advocating for data preservation, providing training on how to prepare data, and developing infrastructure to safely store data. While advancements in information technology infrastructure have made computational analysis of large amounts of data possible, the corresponding rise in the number of data repositories and standards created to disseminate data in different disciplines and sectors has encouraged silos and prevented data from being brought together for meaningful research. As a result, the need for broader principles that can enable responsible data sharing has become increasingly important for different members of the wider research data community.

Origins of the FAIR Guiding Principles

In 2014, the foundational principles for interoperable research data were first discussed at an unconference in the Netherlands called “Jointly Designing a Data FAIRport” (Data FAIRport, 2014). The next year, a draft of the guide was expanded by a FAIR data publishing group from FORCE11 and published for public comment and endorsement (FORCE11, 2014a). In 2016, Barend Mons and a group of contributors authored an article in Scientific Data describing the need to establish the FAIR guiding principles for digital assets (Wilkinson et al., 2016). These principles are designed to help humans and machines overcome barriers to discovering, accessing, reusing, and citing research data.

Since their original publication, a version of the FAIR principles has been maintained by GO FAIR. Over time, these principles have influenced researchers wishing to prepare their data for sharing, data repositories wishing to evaluate and improve their infrastructure, and others wishing to assess and enhance their policies to support a FAIR data ecosystem.

What are FAIR Guiding Principles?

FAIR Guiding Principles

The main purpose of the principles is to ensure that machines and humans can easily discover, access, interoperate, and properly reuse the vast amount of information available for scientific discovery. The principles are meant to be high-level and domain independent, meaning they are broad in scope and can be applied to different types of data across multiple disciplines. By refraining from assigning technical specifications, the FAIR guiding principles allow for different implementations of the data management norms and characteristics they propose.

The following overview of the FAIR principles is modified from the full list of principles and subpoints maintained by GO FAIR (n.d.).



Findable

Humans and computers should be able to easily find metadata and data.

Machine-readable metadata are essential for automatic discovery of datasets and services.

F1. (Meta)data are assigned a globally unique and persistent identifier (PID).

F2. Data are described with rich metadata (defined by R1 below).

F3. Metadata clearly and explicitly include the identifier of the data they describe.

F4. (Meta)data are registered or indexed in a searchable resource.
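As a rough illustration of F1 to F4, the sketch below builds a small machine-readable metadata record and checks that it carries an identifier and a minimum set of descriptive fields. The field names loosely follow the schema.org Dataset vocabulary. The DOI, the descriptive values, and the `REQUIRED_FIELDS` list are assumptions made up for this example; the FAIR principles themselves do not prescribe any particular fields or format.

```python
import json

# Hypothetical metadata record; the DOI and all values are placeholders.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "identifier": "https://doi.org/10.1234/example-dataset",  # F1, F3: PID stated in the metadata
    "name": "Example survey of urban tree cover",
    "description": "Plot-level tree cover measurements, 2015 to 2020.",
    "keywords": ["urban forestry", "tree cover"],
    "creator": [{"@type": "Person", "name": "A. Researcher"}],
}

# Assumed minimum set of fields for this sketch (F2: rich metadata).
REQUIRED_FIELDS = ("identifier", "name", "description", "keywords", "creator")


def missing_fields(metadata: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# Serializing to JSON gives a form that indexing services can parse (F4).
serialized = json.dumps(record, indent=2)
print(missing_fields(record))
```

A record that passes this check is only a starting point: a repository or search index still has to harvest it before the dataset becomes findable in practice.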



Accessible

Once the user finds the data, they need to know how to access them and may require details around authentication and authorization.

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol.

        A1.1 The protocol is open, free, and universally implementable.

        A1.2 The protocol allows for an authentication and authorisation procedure, where necessary.

A2. Metadata are accessible, even when the data are no longer available.
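In practice, principle A1 is often met through HTTP(S), which is open, free, and widely implemented. The sketch below assumes a DOI registered with an agency that supports content negotiation (DataCite and Crossref both do), so that a client can ask the DOI resolver for citation metadata rather than a landing page. The DOI is a placeholder and `build_metadata_request` is a helper invented for this example. The code only constructs the request and makes no network call.

```python
from urllib.request import Request

DOI_RESOLVER = "https://doi.org/"
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str, media_type: str = CSL_JSON) -> Request:
    """Build an HTTPS request asking the DOI resolver for metadata."""
    return Request(DOI_RESOLVER + doi, headers={"Accept": media_type})


request = build_metadata_request("10.1234/example-dataset")  # placeholder DOI
print(request.full_url)
print(request.get_header("Accept"))
# Passing the request to urllib.request.urlopen() would perform the retrieval;
# the call is omitted here because the DOI above does not exist.
```

Where access is restricted (A1.2), the same protocol can carry credentials, for example through an authorization header, without changing how the identifier is resolved.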



Interoperable

The data usually need to be integrated with other data and need to interoperate with applications or workflows for analysis, storage, and processing.

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.

I2. (Meta)data use vocabularies that follow FAIR principles.

I3. (Meta)data include qualified references to other (meta)data.
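Principle I3 can be pictured as follows. A qualified reference states not only that two resources are linked but also how they are linked. The sketch below uses two relation types from the DataCite Metadata Schema (`IsDerivedFrom` and `IsSupplementTo`) as an example of a shared vocabulary. The identifiers are placeholders, and the `Relation` type and `related_by` helper are invented for this illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    subject: str        # identifier of the resource being described
    relation_type: str  # how the subject relates to the target
    target: str         # identifier of the related resource


# Placeholder DOIs; relation types taken from the DataCite Metadata Schema.
relations = [
    Relation("10.1234/cleaned-data", "IsDerivedFrom", "10.1234/raw-data"),
    Relation("10.1234/cleaned-data", "IsSupplementTo", "10.1234/article"),
]


def related_by(items: list[Relation], relation_type: str) -> list[str]:
    """Return the targets linked through the given relation type."""
    return [r.target for r in items if r.relation_type == relation_type]


print(related_by(relations, "IsDerivedFrom"))
```

Because the relation type comes from a controlled list rather than free text, a machine can tell a source dataset apart from an associated publication without human interpretation.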



Reusable

The ultimate goal of FAIR is to optimize the reuse of data, so metadata and data should be well-described so that they can be replicated and/or combined in different settings.

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes.

        R1.1. (Meta)data are released with a clear and accessible data usage license.

        R1.2. (Meta)data are associated with detailed provenance.

        R1.3. (Meta)data meet domain-relevant community standards.

In chapter 10, “Supporting Reproducible Research with Active Data Curation,” you’ll learn how to make data interoperable and reusable via active data curation.

Key Mechanisms of FAIR Guiding Principles: Metadata, Persistent Identifiers, and Licenses

Using appropriate metadata (information about data) is central to the FAIR principles. Similar to traditional research material (such as books and articles with bibliographic information), research data must be described in a structured way with controlled vocabularies that can be read by humans and machines so that data can be discovered and reused. As such, metadata are an integral part of research data outputs because they give the user important information about a dataset’s supporting documentation, identifiers, licenses, and other relevant elements. While metadata describing original research data should be rich and specific enough to allow humans and machines to understand the context and limitations of a dataset, they should also be offered by way of standardized descriptions so that the research data are more interpretable across different domains. To achieve this balance, researchers from various disciplines have endorsed well-developed metadata standards, such as those listed by the Research Data Alliance (RDA).

The other major mechanisms to guarantee findability and reusability of data are PIDs and licenses defining how data can be used. A publicly registered PID provides each dataset and its metadata with a unique and stable means of identification that can track any changes or movements online. Researchers sharing data on their own websites normally won’t be able to assign such an identifier and are encouraged to instead deposit their data with a dedicated data repository to access support around the use of PIDs, such as Digital Object Identifiers (DOIs).
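To make the idea of a PID more concrete, the sketch below normalizes a DOI written in different ways into a single resolvable HTTPS form. The regular expression is a commonly used pattern that matches most modern DOIs but not every DOI ever issued, so it should be read as an assumption of this example rather than a definition of DOI syntax. The DOI shown is a placeholder, and both function names are invented for the illustration.

```python
import re

# Pattern that matches most modern DOIs; some older DOIs will not match it.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Za-z0-9]+$")
PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def normalize_doi(value: str) -> str:
    """Strip a resolver or 'doi:' prefix and return a bare, lowercase DOI."""
    text = value.strip()
    for prefix in PREFIXES:
        if text.lower().startswith(prefix):
            text = text[len(prefix):]
            break
    if not DOI_PATTERN.match(text):
        raise ValueError(f"Not a recognizable DOI: {value!r}")
    return text.lower()  # DOI names are case-insensitive


def doi_url(value: str) -> str:
    """Return the resolvable HTTPS form of a DOI."""
    return "https://doi.org/" + normalize_doi(value)


print(doi_url("doi:10.1234/Example-Dataset"))
```

Storing one normalized form makes it easier to detect that two citations point to the same dataset, which is part of what makes a PID useful for tracking reuse.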

Many researchers have concerns about data misuse and are reluctant to share data broadly (Wiley & Burnette, 2019, p. 5). Data users, on the other hand, are often not able to confidently reuse and reshare secondary data derived from an original research dataset due to a lack of clarity around data reuse permissions. To counter this issue, standard data licenses, such as Creative Commons licenses or Open Data Commons licenses, or custom data use agreements can encourage data reuse while protecting data creators’ rights to credit and attribution. By providing information about how data that have been assigned a given license can legally and ethically be used, licensing helps define the terms of a relationship between data creators, publishers, and users for a particular dataset. You’ll learn more about licensing data in chapter 12, “Planning for Open Science Workflows.”
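One way to reduce ambiguity around reuse permissions is to record the license in the metadata as a standard identifier rather than as free text. The sketch below assumes a metadata record with a `license` field holding an identifier from the SPDX License List. The two-entry lookup table, the `License` class, and the `describe_license` helper are all made up for this example and are not a substitute for reading the license itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    spdx_id: str               # short identifier from the SPDX License List
    url: str                   # canonical location of the license text
    requires_attribution: bool


# A small, hand-maintained lookup for this sketch only.
KNOWN_LICENSES = {
    "CC-BY-4.0": License("CC-BY-4.0", "https://creativecommons.org/licenses/by/4.0/", True),
    "CC0-1.0": License("CC0-1.0", "https://creativecommons.org/publicdomain/zero/1.0/", False),
}


def describe_license(metadata: dict) -> str:
    """Summarize the reuse terms recorded in a metadata record."""
    spdx_id = metadata.get("license")
    if spdx_id is None:
        return "No license recorded: reuse terms are unclear."
    known = KNOWN_LICENSES.get(spdx_id)
    if known is None:
        return f"License {spdx_id} is not in the lookup: check its terms manually."
    credit = "required" if known.requires_attribution else "not required"
    return f"{known.spdx_id} ({known.url}); attribution {credit}."


print(describe_license({"title": "Example dataset", "license": "CC-BY-4.0"}))
```

The first branch mirrors the situation described above: when no license is recorded, a prospective user cannot tell whether reuse is permitted.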

FAIR Data and Openness

Efforts to make data FAIR don’t necessarily lead to data being shared openly without restrictions. For example, data objects could have PIDs and FAIR metadata but not be open or reusable because of the way they’ve been licensed. The FAIR Principles Working Detailed Document offers four levels of FAIRness for data objects within a repository to describe different potential degrees of access to data:

  1. Each data object has a PID and offers FAIR metadata.
  2. Each data object has user-defined metadata to give rich provenance information.
  3. Data elements within data objects are FAIR but are not open access and have defined restrictions around reuse.
  4. Data objects and data elements are FAIR and public with well-defined licenses (FORCE11, 2014b).

The FAIR guiding principles allow data stewards to participate in important data publishing decisions and also provide space for other principles to be invoked. For example, the CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics) Principles for Indigenous Data Governance published by the Global Indigenous Data Alliance in 2019 recognize the importance of Indigenous data sovereignty and of centring Indigenous Peoples’ rights and interests in any dealings with Indigenous data. In many ways, the CARE and FAIR principles complement one another and guide researchers toward taking into account the varied participants and purposes associated with research data. Indigenous data sovereignty is further discussed in chapter 3.

How to Make Your Data FAIR: Tools and Guidance

FAIR Guiding Principles and Data Management Plans

Data Management Plans (DMPs) are required by certain funding opportunities according to the Canadian Tri-Agency Research Data Management Policy (Government of Canada, 2021). In DMPs, researchers describe methodologies and strategies that reflect the FAIR guiding principles. For example, researchers should effectively document data in early phases of a project so that high-quality and complete metadata can be generated for dissemination. Also, researchers should negotiate data sharing licenses with collaborators and obtain permissions to share data from research participants early in the data collection stage if they wish to deposit and preserve data in repositories that meet the FAIR guiding principles.

FAIRness Evaluation and Improvement Tools for Researchers

A variety of tools have been developed to help researchers understand the FAIR principles and how to implement certain practices that align with the principles. These tools range from simple checklists to customized resources designed around researchers’ practices. Below is a list of FAIR assessment tools with different features for various user groups that are either currently available or under development. We recommend using these tools as you prepare to make your data FAIR.

  1. How FAIR Are Your Data? Checklist (Jones & Grootveld, 2017)

Developed by a data services network in Europe, this is a simple one-page checklist based on the FAIR guiding principles with small modifications that make the concepts and terminologies more accessible for researchers. This checklist is a good introductory tool for researchers who are new to the field of RDM.

  2. FAIR Data Self Assessment Tool (Australian Research Data Commons, 2022)

The FAIR data self-assessment tool was developed by the Australian Research Data Commons. By answering questions corresponding to the FAIR guiding principles, researchers can visualize the FAIRness of their practices for each principle and see overall FAIRness across the four principles. They can also compare their current ways of handling data with best practices, thus identifying potential areas of improvement.

  3. FAIR Aware Tool (Data Archiving and Networked Services, 2021)

Developed by the Netherlands’ Data Archiving and Networked Services, the FAIR Aware tool provides a more detailed assessment to help researchers understand and implement the FAIR principles. Although this tool asks researchers to identify their domain of research, role(s), and organization(s), the actual content of the assessment is the same for all users. Researchers are presented with 10 awareness questions concerning each of the FAIR guiding principles and then asked to rate their willingness to comply with recommended practices. Once answers are submitted, an overview report of the researcher’s FAIR awareness levels is provided along with tips and resources on how to improve.

  4. F-UJI Automated FAIR Data Assessment Tool (Devaraju & Huber, 2020)

The F-UJI (FAIRsFAIR Research Data Object Assessment Service) is designed to assess the FAIRness of research data objects based on comprehensive and detailed FAIRsFAIR Data Object Assessment Metrics (Devaraju et al., 2020).
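To give a feel for how questionnaire-style tools summarize results by principle, the sketch below scores a set of yes/no answers against a small checklist. This is a toy example only: the checklist items are paraphrased assumptions, and the scoring does not reproduce the questions, weights, or metrics of any of the tools listed above.

```python
# Hypothetical yes/no checklist, grouped by principle. The questions are
# paraphrased for this sketch and do not reproduce any published tool.
CHECKLIST = {
    "Findable": ["has_pid", "has_rich_metadata", "indexed_in_searchable_resource"],
    "Accessible": ["retrievable_by_pid", "metadata_persist_without_data"],
    "Interoperable": ["uses_shared_vocabulary", "links_to_related_data"],
    "Reusable": ["has_license", "has_provenance", "meets_community_standard"],
}


def score(answers: dict[str, bool]) -> dict[str, float]:
    """Return the fraction of checklist items satisfied for each principle."""
    return {
        principle: sum(answers.get(item, False) for item in items) / len(items)
        for principle, items in CHECKLIST.items()
    }


# Items not listed in the answers are treated as "no".
answers = {"has_pid": True, "has_rich_metadata": True, "has_license": True,
           "retrievable_by_pid": True}
for principle, fraction in score(answers).items():
    print(f"{principle}: {fraction:.0%}")
```

A per-principle breakdown like this is mainly useful for spotting where to focus next, for instance a dataset that is findable but has no recorded provenance.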

Other Guidance on How to Make Data FAIR

Besides FAIRness assessment tools, international and national research data services have developed general and discipline-specific guidelines on making data FAIR. Examples include the following:

  • OpenAIRE (an organization supporting the open science development in Europe) created the Guides for Researchers: How to make your data FAIR (OpenAIRE, n.d.)
  • How to FAIR (Danish National Forum for Research Data Management, n.d.) developed through interviews with a broad group of researchers and librarians
  • Top 10 FAIR Data & Software Things (Library Carpentry, n.d.) offers brief stand-alone guides on different topics and disciplines that can be used by members of research communities (e.g., astronomy, imaging, and music)
  • Sustainable and FAIR Data Sharing in the Humanities (ALLEA Working Group E-Humanities, European Federation of Academies of Sciences and Humanities, 2020) provides practical guidance for researchers looking to make digital humanities data FAIR.

In Canada, researchers at the University of Ottawa Heart Institute and the Ottawa Hospital Research Institute have developed a series of data handling courses, including one called FAIR Principles (Centre for Journalology, n.d.). Not much additional guidance on the FAIR principles is available within the Canadian context. Librarians or researchers interested in this area could consult How to Be FAIR with Your Data: A Teaching and Training Handbook for Higher Education Institutions (Engelhardt et al., 2022) for examples of FAIR-related training options at various higher-education institutions in Europe.

Policy Impacts of the FAIR Principles

The FAIR principles have been used by government agencies, academic institutions, research funders, scholarly societies, publishers, and a variety of other actors to underscore the cultural, economic, and social significance of research data stewardship. As a result, these principles have become foundational for organizational bodies looking to influence researchers and how they manage and share data. Some examples of policy impacts include the European Commission citing FAIR as directly influencing the development of the European Open Science Cloud (Hill, 2019, p. 284) and the U.S. National Institutes of Health citing the application of the FAIR data principles in their Data Management and Sharing Policy (Office of The Director, National Institutes of Health, 2020).

In Canada, a key government recommendation in the Roadmap for Open Science (Government of Canada, 2020) is the implementation of the FAIR principles by federal departments and agencies. This plan aims to ensure that, by January 2025, the interoperability of scientific and research data and metadata standards is in place for data products tied to government agencies and departments. In terms of research funding, the Tri-Agency Research Data Management Policy (Government of Canada, 2021) states that Canada’s three federal research funding agencies, namely the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC), and the Social Sciences and Humanities Research Council of Canada (SSHRC), support FAIR guidance and expect researchers to share their data in accordance with FAIR principles and disciplinary standards where allowed by ethical, cultural, legal, and commercial requirements. In addition, Canadian academic publishers, such as Canadian Science Publishing (n.d.), have mirrored other journal publishers’ efforts by describing the FAIR principles as framing the contents of their data availability policy. Complying with such policies can mean employing the above-mentioned researcher tools to ensure data are as FAIR aligned as they can be before being released. However, in addition to data preparation, these requirements are also meant to influence a researcher’s thinking around the selection of a research data repository and how their choice will support FAIR alignment beyond the initial publication of their data.

FAIR Principles and Repositories

The FAIR principles represent an opportunity to recognize the current and potential value of data repositories. Wilkinson et al. (2016) underscore this idea in their work by discussing the benefits and limitations of data repositories and arguing these should evolve to respond to the discovery and reuse needs of researchers (pp. 2–4). Researchers should determine if a data repository meets their unique disciplinary RDM needs and allows them to comply with relevant ethical and legal requirements, and they should also consider whether their choice offers features that mirror FAIR guidance.

Research data repositories are special-purpose data containers designed to store research data and associated files and metadata to provide stable and long-term access to data outputs (Boyd, 2021, pp. 25–26). Repositories are critical pieces of digital infrastructure set up to encourage the discoverability of research data and help researchers publish and disseminate data. Which repository a researcher chooses will often depend on factors such as disciplinary norms, publisher or funder requirements, or data sharing guidelines. Additionally, a researcher may choose a repository based on such elements as the ease and convenience of the data deposit process, the types of files the repository accepts, the amount of data curation support they will receive, or the metadata schemas and controlled vocabularies a repository uses to describe the research data objects it stores. Consideration of these elements should lead researchers to select a discipline-specific repository, a community-specific repository, or a generalist repository. Researchers can then explore whether their chosen repository puts the FAIR principles into practice by evaluating whether it offers some specific functions.

In their paper on the improvement of interoperability between types of repositories, Hahnel and Valen (2020) note that, to effectively function in alignment with the FAIR principles, a repository should do the following:

  • assign PIDs (DOIs, ORCIDs, and GRIDs) to its data products and related materials
  • offer its data alongside documented application program interfaces (APIs)
  • support robust options for data curation and subscribe to web accessibility guidelines
  • offer well-defined licenses that support data reuse
  • describe its path to sustainability by documenting preservation and disaster recovery workflows (pp. 195–197).

This guidance around optimal repository features mirrors similar recommendations made by OpenAIRE and by the FAIRsharing initiative (Cannon et al., 2021). Some of these elements are also represented in the TRUST Principles for digital repositories released by Lin et al. (2020).

To assess how some major Canadian and international data repositories have documented their commitment to FAIR principles, review the following examples:

Additionally, you can locate appropriate repositories by consulting the re3data directory, which is a multidisciplinary tool that lists more than 2,800 entries for data repositories that can be searched by specific criteria, such as API type and metadata standard. Another strong option is the FAIRsharing directory, which is endorsed by the Research Data Alliance and provides a multidisciplinary platform where researchers can look up entries for repositories, data standards, and data policies. Both tools are excellent options for finding disciplinary-aligned repositories.

Some larger commercial, community, or publisher-endorsed repositories may offer more flexible and specialized features that align with FAIR guidance. However, when selecting a repository, one should consider whether their choice allows them to adhere to disciplinary norms, access the support needed to meet ethical or legal requirements, and help fulfill responsibilities toward communities that have expectations around access to their data. A choice of repository based on alignment with FAIR principles should always be balanced with these equally important requirements.

Getting Involved

For those interested in supporting the implementation of FAIR principles on a large scale, the GO FAIR initiative brings together individuals, institutions, and organizations to collaborate on policy development, skills development, and technical standards/technology development. This is primarily achieved via GO FAIR Implementation Networks that bring partners together to support the creation of unique deliverables. To learn more about implementation networks or about how to join them, see the GO FAIR Implementation resource listed under Additional Readings and Resources.


The FAIR principles have helped clarify how some goals of the RDM movement may be achieved. Along with other guiding principles, they have been endorsed by funders, publishers, and varied research communities, and they have helped connect and align efforts around supporting data access and reuse. Researchers should monitor the evolution of the FAIR principles in terms of their influence on national and international research data ecosystems and how they impact data reuse in their own disciplines.


Reflective Questions

  1. Use the FAIR Aware tool to conduct a self-evaluation of knowledge and skills for making data FAIR.
  2. Use the FAIR principles as a framework to evaluate the FAIRness of the following sample datasets and identify suggestions to improve the FAIRness of these datasets:
    1. Don Valley Historical Mapping Project:
    2. Soil and Plant Phytoliths from the Acacia-Commiphora Mosaics at Olduvai Gorge (Tanzania):
    3. CLOUD: Canadian Longterm Outdoor UAV Dataset:



Key Takeaways

  • FAIR guiding principles are high-level goals to guide the continuous optimization of research data, metadata, and data publishing environments for easier data access and reuse across domains through implementation of PIDs, rich and standard metadata, and data licenses.
  • Researchers can follow guidance and use tools to learn about FAIR principles, evaluate their current RDM practices, and plan for strategies to FAIRify their research data and publishing activities.
  • The FAIR principles have influenced government policies, research funding policies, and publisher policies regarding data availability.
  • Researchers can align their data management and sharing activities with the FAIR principles by ensuring they select data repositories that offer features that support FAIR compliance.
  • Research data repository registries are important tools for identifying repositories that offer FAIR-aligned features as well as other features related to disciplinary norms or legal/ethical/community-based obligations.

Additional Readings and Resources

FAIR and CARE Principles

The Global Indigenous Data Alliance. (2019). CARE principles for Indigenous data governance.

GO FAIR. (n.d.). FAIR principles.

Research Data Alliance. Metadata standards catalogue.

The FAIR Principles and Repositories

re3data directory.

FAIRsharing directory.

Getting Involved

GO FAIR Implementation.

Reference List

ALLEA Working Group E-Humanities. (2020). Sustainable and FAIR data sharing in the humanities: Recommendations of the ALLEA working group e-humanities.

Australian Research Data Commons. (2022). FAIR data self assessment tool.

Boyd, C. (2021). Understanding research data repositories as infrastructures. Proceedings of the Association for Information Science and Technology, 58(1), 25–35.

Canadian Science Publishing. (n.d.). Principles and policy on data availability.

Cannon, M., Graf, C., McNeice, K., Chan, W. M., Callaghan, S., Carnevale, I., Cranston, I., Edmunds, S. C., Everitt, N., Ganley, E., Hrynaszkiewicz, I., Khodiyar, V. K., Leary, A., Lemberger, T., MacCallum, C. J., Murray, H., Sharples, K., Soares E Silva, M., Wright, G., … (Moderator) Sansone, S-A. (2021). Repository features to help researchers: An invitation to a dialogue. Zenodo.

Centre for Journalology Training. (n.d.). FAIR principles.

Danish National Forum for Research Data Management. (n.d.). How to FAIR.

Data Archiving and Networked Services. (2021). FAIR Aware.

Data FAIRport. (2014). Data FAIRport conference: Jointly designing a data FAIRport.

Devaraju, A., & Huber, R. (2020). F-UJI – An automated FAIR data assessment tool (v1.0.0). Zenodo.

Devaraju, A., Huber, R., Mokrane, M., Herterich, P., Cepinskas, L., de Vries, J., L’Hours, H., Davidson, J., & Angus, W. (2020). FAIRsFAIR data object assessment metrics (0.5). Zenodo.

Engelhardt, C., Biernacka, K., Coffey, A., Cornet, R., Danciu, A., Demchenko, Y., Downes, S., Erdmann, C., Garbuglia, F., Germer, K., Helbig, K., Hellström, M., Hettne, K., Hibbert, D., Jetten, M., Karimova, Y., Kryger Hansen, K., Kuusniemi, M. E., Letizia, V., McCutcheon, V., … Zhou, B. (2022). D7.4 How to be FAIR with your data. A teaching and training handbook for higher education institutions. Zenodo.

FORCE11. (2014a, September 1). FAIR data publishing group. Archived groups.

FORCE11. (2014b, September 10). Guiding principles for findable, accessible, interoperable and re-usable data publishing version b1.0.

Government of Canada. (2020, February). Roadmap for open science.

Government of Canada. (2021). Tri-Agency research data management policy.

Hahnel, M., & Valen, D. (2020). How to (easily) extend the FAIRness of existing repositories. Data Intelligence, 2(1–2), 192–198.

Hill, T. (2019). Turning FAIR into reality: Review. Learned Publishing, 32(3), 283–286.

Hrynaszkiewicz, I., Simons, N., Hussain, A., Grant, R., & Goudie, S. (2020). Developing a research data policy framework for all journals and publishers. Data Science Journal, 19(1), 1–15.

Jones, S., & Grootveld, M. (2017). How FAIR are your data? Zenodo.

Library Carpentry. (n.d.). Top 10 FAIR data & software things. Zenodo.

Lin, D., Crabtree, J., Dillo, I., Downs, R. R., Edmunds, R., Giaretta, D., De Giusti, M., L’Hours, H., Hugo, W., Jenkyns, R., Khodiyar, V., Martone, M. E., Mokrane, M., Navele, V., Petters, J., Sierman, B., Sokolova, D. V., Stockhause, M., & Westbrook, J. (2020). The TRUST principles for digital repositories. Scientific Data, 7, 144.

Office of The Director, National Institutes of Health. (2020). Final NIH policy for data management and sharing.

OpenAIRE. (n.d.). How to make your data FAIR.

Wiley, C. A., & Burnette, M. H. (2019). Assessing data management support needs of bioengineering and biomedical research faculty. Journal of eScience Librarianship, 8(1), 1–19.

Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016). The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3(1), 160018.


About the authors

Minglu Wang is a Research Data Management (RDM) Librarian at York University. She has published book chapters, conference/working papers, and research articles closely related to academic libraries and RDM services. Minglu Wang is an active member of the Association of College & Research Libraries (ACRL), a division of the American Library Association (ALA), and she contributed to multiple years of the Association’s publications of Top Trends articles and Environmental Scan white papers. She is a member of the Research Intelligence Expert Group, a part of The Digital Research Alliance of Canada (The Alliance) RDM Team, and has participated in the design and report writing of the RDM Capacity Survey of Canadian Institutions. ORCID: 0000-0002-0021-5605


Dany Savard is the Associate Librarian for Collections and Research Services at the University of Toronto Mississauga Library. He has contributed to research articles on the topics of research data discovery and data repositories. He is a member of the Alliance Network of Experts’ Discovery and Metadata Expert Group and is the current Chair of its Canadian Data Repositories Landscape Working Group. He holds an MLIS from Western University and a Master of Arts in Public Policy and Administration from Toronto Metropolitan University. ORCID: 0000-0001-7472-7390




Research Data Management in the Canadian Context Copyright © 2023 by Minglu Wang and Dany Savard is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.
