MSU Libraries

Research guides.


How to Find Data & Statistics: Finding Data


Introduction to Finding Data

Start by defining your topic.

Be specific about your topic so that you can narrow your search, but be flexible enough to tailor your needs to existing sources.

Identify the Unit of Analysis

This is what you should be able to define:

#1 - Who or What?

Social Unit: This is the population that you want to study. It can be...

  • People: for example, individuals, couples, households
  • Organizations and Institutions: for example, companies, political parties, nation states
  • Commodities and Things: for example, crops, automobiles, arrests

#2 - When?

Time: This is the period of time you want to study. Things to think about...

  • Point in time: a "snapshot" or one-time study
  • Time series: study changes over time
  • Current information: keep in mind that there is usually a time lag before data is published. The most current information available may be a couple of years old.
  • Historical information

#3 - Where?

Space: Geography or place. There are two main types of geographic classifications...

  • Political boundaries: for example, nation, state, county, school district, etc.
  • Statistical/census geography: for example, metropolitan statistical areas, tracts, block groups, etc.

Remember to define your topic with enough flexibility to adapt to available data! Data is not available for every conceivable topic. Some data is hidden (behind a paywall, for example), uncollected, or otherwise unavailable. Be prepared to try alternative data.

Search Strategies

Search Strategy #1: Search in a Data Archive

Look within a data archive that collects within the general subject area that you are searching for.

  • Inter-University Consortium for Political and Social Research The world's largest social science data archive. It is one of the best places to start looking for a data set. MSU maintains a membership that allows students and faculty to download data sets and use other online tools after registering with an @msu.edu email address.
  • Data Repositories (Open Access Directory) Open data repositories from multiple academic disciplines.
  • re3data.org: Registry of Research Data Repositories A global registry of research data repositories from different academic disciplines.
  • Data Citation Index A multidisciplinary index to data repositories, data studies, and data sets. Part of Web of Knowledge.

Search Strategy #2: Identify Potential Producers

Ask yourself: Who might collect and publish this type of data?

Then visit the organization’s website and see if you're right! Or, search for them as an author in the library catalog.

These are some of the main types of data producers:

Government Agencies The government collects data to aid in policy decisions and is the largest producer of data overall. For example, the U.S. Census Bureau, Federal Election Commission, Federal Highway Administration, and many other agencies collect and publish data. To better understand the structure of government agencies, read the U.S. Government Manual and browse FedStats. Government data is free and publicly available, but may require access through library resources or special requests.

Non-Government Organizations Many independent non-commercial and nonprofit organizations collect and publish data that supports their social platform. For example, the International Monetary Fund, United Nations, World Health Organization, and many others collect and publish data. For more information about NGOs, visit Duke Libraries NGO Research Guide. Data from NGOs may be free or fee-based. The library subscribes to many NGO data resources, so be sure to check the library’s e-resources pages or catalog.

Academic Institutions Academic research projects funded by public and private foundations create a wealth of data. For example, the Michigan State of the State Survey, Panel Study of Income Dynamics, American National Election Studies, and many other research projects collect and publish data. Much of this type of data is free and publicly available, but may require access through library resources. Access to smaller original research projects may be dependent upon contacting individual researchers.

Private Sector Commercial firms collect and publish data as a paid service to clients or to sell broadly. Examples include marketing firms, pollsters, trade organizations, and business information providers. This information is almost always fee-based and may not be available for public release. The library does subscribe to some commercial data services, particularly through the business library.

Search Strategy #3: Turn to the literature

Search for research studies based on secondary analysis of publicly available data sets.

Unfortunately, citation of research data is often incomplete. Sometimes the best you will get is the title of the data set used, but check to see if the data or a related publication is cited and follow it up. Don't make the same mistake when you publish: cite your data.

Data Archive Bibliographies

  • ICPSR Bibliography of Data-Related Literature "A continuously-updated database of thousands of citations of works using data held in the ICPSR archive. The works include journal articles, books, book chapters, government and agency reports, working papers, dissertations, conference papers, meeting presentations, unpublished manuscripts, magazine and newspaper articles, and audiovisual materials."

Library Indexes

  • Database List Search the literature from your field.  Try related disciplines as well.
  • Web of Science This multidisciplinary index to the scholarly literature provides links from articles to records in the Data Citation Index.

Library Catalog

  • Use the MSU Library Catalog as part of your literature review to find books on your topic that may cite relevant data providers or for books of statistical tables to identify sources of data. The library also has some data sets on CD-ROM. Try adding keywords such as “data” or “statistics” to your search. To expand your search to include other libraries, look in WorldCat (request outside materials through interlibrary loan).

Books on Research Methods

  • Unobtrusive measures / Eugene Webb
  • Secondary data analysis (Pocket guides to social work research methods) / Thomas P. Vartanian
  • Secondary data analysis: An introduction for psychologists / edited by Kali H. Trzesniewski, M. Brent Donnellan, and Richard E. Lucas
  • Secondary data sources for public health: A practical guide / Sarah Boslaugh
  • Using secondary data in educational and social research / Emma Smith
  • Research strategies for secondary data : a perspective for criminology and criminal justice / Marc Riedel
  • Secondary analysis of survey data / K. Jill Kiecolt, Laura E. Nathan (Note: Published in 1985, outdated but some issues are still relevant)
  • Reworking qualitative data / Janet Heaton

Search Strategy #4: Statistics lead to Data

Search for statistics and follow them to the source.

Try the search strategies for statistics detailed on the "Finding Statistics" tab of this guide.  Where does the statistic you find come from?  Can you track it down to the source survey or other data set?

Search Strategy #5: Ask for help

Knowing when to call in reinforcements is important.

Recap: Access to Data Sets

Depending on which search strategy you used, you may have already found the dataset file download link directly on a website.  Or, you may have just a reference/citation to a dataset or producer.  Here are some common ways to find the dataset files themselves.

  • Government agencies and universities will often post dataset files directly on their websites.
  • Check to see if the dataset has been archived in ICPSR or another topical data archive.
  • The library has many datasets on CD-ROM, especially in the Government Documents collection.  Search the library catalog for the study title.
  • Contact the data producer directly.
  • Ask a Librarian for assistance.

Evaluate Data

Once you’ve chosen a data set that you believe will work, evaluate it carefully. Is it appropriate? Does it come from an authoritative source? Does it fit your needs? Does it cover your Where, When, and Who or What requirements? Are you willing to compromise your requirements or manipulate the data to fit your needs? Always read the documentation and codebook to ensure that the analysis you are planning really measures what you want it to.
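
The coverage questions above (Where, When, Who or What) can be roughed out as a simple comparison between what a project needs and what a dataset's documentation says it covers. The sketch below is only an illustration: the function name `coverage_gaps` and all of the example years and places are hypothetical, and in practice the "available" values would come from reading the codebook.

```python
def coverage_gaps(required, available):
    """Return the required values that the dataset does not cover, sorted."""
    return sorted(set(required) - set(available))


# Hypothetical requirements: annual, county-level data for 2015-2020.
required_years = range(2015, 2021)
required_places = ["Eaton County", "Ingham County"]

# Hypothetical coverage, as it might be described in a codebook.
dataset_years = [2014, 2016, 2018, 2020]
dataset_places = ["Clinton County", "Eaton County", "Ingham County"]

missing_years = coverage_gaps(required_years, dataset_years)
missing_places = coverage_gaps(required_places, dataset_places)

print("Missing years:", missing_years)    # [2015, 2017, 2019]
print("Missing places:", missing_places)  # []
```

Here the dataset covers the needed places but only every other year, which is the kind of gap that might prompt a compromise on requirements or a search for alternative data.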

Analyze Data

The Center for Statistical Training and Consulting (CSTAT) is the primary unit on campus that assists with data analysis.  CSTAT is a professional service and research unit that aims to support research and provide training and consulting in statistics for faculty, staff and graduate students.

For additional on-campus resources, please consult the Data Analysis page.

This is a short list of helpful tutorials that are useful for learning more about the technicalities of secondary data manipulation and codebooks.

  • Getting Started: Data Analysis A guide from Princeton University Data and Statistical Services. Talks about planning, research questions, and preparing data for analysis.
  • How to use a Codebook Also from Princeton University Data and Statistical Services. A general introduction to codebooks and preparing data for analysis.
  • ICPSR Data Use Tutorial A guide from the Inter-University Consortium for Political and Social Research that is specific to using their data archive resources.
  • Introduction to Data Handling This guide from the University of Chicago includes a section on reading and using a codebook.
  • Introduction to Data Tutorial The Harvard-MIT Data Center has created this short guide to understanding numeric data files.

Be sure to provide a proper citation for any data that you use.  The How to Cite Data research guide will help you determine how to format your citation.

  • Last Updated: Oct 28, 2022 9:56 AM
  • URL: https://libguides.lib.msu.edu/datastats

Data Literacy for Researchers

Narrowing the search for data, finding datasets in repositories.


To determine what data you need, you must first define the area of interest for your research question. Here are some questions to consider:

  • What is the subject or topic of your research?
  • How could your research question be measured? (e.g. by recorded observations, a survey instrument, demographic counts)
  • What are the geographic constraints or units?
  • What are the time constraints (a range of years; monthly, quarterly or annually)?
  • Do you need cross-sectional data (observations of many different subjects at a given time) or longitudinal data (tracks the same type of information on the same subjects at multiple points in time)?
  • Do you need quantitative data (measurements, counts, rankings) or qualitative data (texts, surveys, opinions, words), or both?

Questions adapted from Partlo, Kristin. 2009. "The Pedagogical Data Reference Interview." IASSIST Quarterly 33, (4): 6-10. Available at: https://iassistdata.org/sites/default/files/iqvol334_341partlo.pdf. Accessed via Staff and Faculty Work. Library. Carleton Digital Commons https://digitalcommons.carleton.edu/libr_staff_faculty/5
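
To make the cross-sectional versus longitudinal distinction in the questions above concrete, here is a minimal sketch in Python. The records, field names, and both helper functions are hypothetical and exist only for illustration; real datasets will be structured according to their own codebooks.

```python
# Each record is one observation of one subject in one year (hypothetical values).
observations = [
    {"subject": "A", "year": 2019, "income": 41000},
    {"subject": "A", "year": 2020, "income": 43000},
    {"subject": "B", "year": 2019, "income": 52000},
    {"subject": "B", "year": 2020, "income": 50500},
]


def cross_section(records, year):
    """Many different subjects observed at one point in time."""
    return [r for r in records if r["year"] == year]


def longitudinal(records, subject):
    """The same subject followed across time, in chronological order."""
    rows = [r for r in records if r["subject"] == subject]
    return sorted(rows, key=lambda r: r["year"])


snapshot_2020 = cross_section(observations, 2020)
history_a = longitudinal(observations, "A")

print([r["subject"] for r in snapshot_2020])  # ['A', 'B']
print([r["year"] for r in history_a])         # [2019, 2020]
```

A dataset that supports only the first kind of selection (one year, many subjects) cannot answer questions about change within the same subjects over time.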

  • Conduct a literature review to identify existing data sets that are used by studies relevant to your research question, as well as gaps in these datasets.
  • Following a literature review, consider how your research question addresses existing gaps and whether generating new data sets will help you answer that question.

Data repositories contain published datasets that are typically associated with publications or ongoing research projects. Data repositories are used to store and preserve data so that researchers can access and analyze it. 

There are two types of repositories we will discuss: scholarly and public.

  • Scholarly: Scholarly data repositories are managed by organizations or scientific societies and often have stricter guidelines on the format and level of detail in submissions. They are generally well maintained and contain data sets from well-controlled studies, with detailed descriptions and metadata. Access to these repositories may be restricted.

  • Public: While some data repositories are accessible only to select user groups, many are publicly accessible to anyone with an internet connection. Public data repositories can have a lot of interesting and useful data, but make sure to carefully evaluate the source and scope of the data, as well as the inclusion and preservation practices of the repository itself.

When searching for data in scholarly repositories, be sure to check to see if it is associated with a research publication. Make sure there is enough information in the record to be sure you can reuse the data correctly.

Here are some scholarly repositories:

  • All UCLA Library Data & Statistics resources

And here is a list of public repositories:

  • Github Open Data
  • Google Dataset Search

When collecting data, there are two main considerations.

First, research data sometimes has to be purchased and/or used under strict terms of agreement, or following specific privacy protocols. When purchasing data sets, or downloading protected data, be sure the data is stored in a safe and secure environment. It's important to respect copyright permissions and understand what constitutes fair use.

Second, carefully looking into the context and content of the data can help you understand any potential biases or limitations to prevent misuse. Data that is published is often associated with specific experimental strategies. While the strategies and limitations are often discussed when data are published with research papers, this information is not always available with the data set. When selecting a data set for use:

  • Understand the methodological limitations to the data collection
  • Look for a data dictionary, or some sort of readme file that contains clear information about the variables and observations found in the data set
  • Try to find information on the context in which the data was collected, as this can inform limitations on how data should be used
  • Copyright and Fair Use: Charts and Tools Stanford University
  • Fair Use Infographic The Association of Research Libraries (ARL)
  • Ask the Copyright Genie if the work is covered by copyright
  • A Guide to Works You Can Use Freely University of Montana library guide
  • Copyright Basics University of Cincinnati library research guide

If your research question cannot be answered with existing datasets, it may be necessary to create your own. Creating data can be done through practices such as observation, surveying, simulation, and experimentation, as well as through methods that extract data from existing bodies of information such as web-scraping or text & data mining (TDM).

Data collection looks different for different disciplines. Here we include some generalized resources to assist with creating datasets:

When collecting original data to answer research questions, there are a few key things to think about in order to be sure the findings are accurate and can be used to draw conclusions relating to your research question.

Take detailed notes about the methods you are using to collect data. If data collection does not go as planned, make sure you make note of which aspects of the methods were changed.

If working with human-subjects data, or protected data, be sure to check with your local Institutional Review Board (IRB) office to see what kinds of protections need to be put in place for storing and reporting your data.

It is important to collect samples that properly represent your subject of study. If you expect to see certain results, compare your experimental sample with a sample with known results to see if the result is aligned with your expectations.

A sample with known results is called a control sample. A positive control is a sample in which you expect to see the effect you think you will observe as a result of your experiment. A negative control is a sample in which you expect not to see that effect.

If you plan to collect data, be sure to outline how you will organize and analyze the data beforehand. If the work is to be published, have a plan to share the data. See more about data management below.

  • Getting Started with IRB The UCLA Institutional Review Board provides guidelines, training, and approval for working with protected data
  • Explaining Experimental Controls Video and transcript explaining how control samples are used in scientific experimentation.
  • Data Management Plans

Data management plans (DMPs) are formal plans that describe the data you expect to acquire or generate through your research, along with how you plan to manage it, analyze it, and share it. Here are some resources focused on data management planning:

  • Examples of Data Management Plans (DMP) and Data Curation Profiles (DCP)

Data Management Plans (DMPs) are crucial to reproducible research practice because they provide a framework for how research data are managed and stored. This helps prevent data rot, the loss and/or corruption of data stored on individual computer hard drives. It also makes it easier to find data after research projects are completed and improves transparency by allowing collaborators or like-minded researchers to download and verify data analysis. DMPs often prompt researchers to think about the necessary privacy and compliance considerations, especially if the data involves human subjects research or some other type of protected data. For these reasons, DMPs are often required as a part of research funding proposals.
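
One common way to notice the kind of silent corruption described above is to record a checksum for each data file and recheck it later. The guide does not prescribe this technique; the sketch below is an assumption about how such a check might look, using only Python's standard library. The function names and the file `survey.csv` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files that are missing or whose checksum no longer matches."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )


# Demonstration on a throwaway folder with a hypothetical data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "survey.csv"
    data_file.write_text("id,score\n1,10\n")
    manifest = build_manifest(tmp)

    unchanged = changed_files(tmp, manifest)
    data_file.write_text("id,score\n1,99\n")  # simulate silent corruption
    corrupted = changed_files(tmp, manifest)

print(unchanged)  # []
print(corrupted)  # ['survey.csv']
```

In a real project the manifest would be saved alongside the data (for example as a text file) so that it can be rechecked after files are copied or archived.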

  • Last Updated: Dec 5, 2023 11:58 AM
  • URL: https://guides.library.ucla.edu/data-research

Defining Research Data

One definition of research data is: "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (OMB Circular 110).

Research data covers a broad range of types of information (see examples below), and digital data can be structured and stored in a variety of file formats.

Note that properly managing data (and records) does not necessarily equate to sharing or publishing that data.

Examples of Research Data

Some examples of research data:

  • Documents (text, Word), spreadsheets
  • Laboratory notebooks, field notebooks, diaries
  • Questionnaires, transcripts, codebooks
  • Audiotapes, videotapes
  • Photographs, films
  • Protein or genetic sequences
  • Test responses
  • Slides, artifacts, specimens, samples
  • Collection of digital objects acquired and generated during the process of research
  • Database contents (video, audio, text, images)
  • Models, algorithms, scripts
  • Contents of an application (input, output, logfiles for analysis software, simulation software, schemas)
  • Methodologies and workflows
  • Standard operating procedures and protocols

Exclusions from Sharing

In addition to the other records to manage (below), some kinds of data may not be sharable due to the nature of the records themselves, or to ethical and privacy concerns. As defined by the OMB, this refers to:

  • preliminary analyses,
  • drafts of scientific papers,
  • plans for future research,
  • peer reviews, or
  • communications with colleagues

Research data also do not include:

  • Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law; and
  • Personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study.

Some types of data, particularly software, may require a special license to share. In those cases, contact the Office of Technology Transfer to review considerations for software generated in your research.

Other Records to Manage

Although they might not be addressed in an NSF data management plan, the following research records may also be important to manage during and beyond the life of a project.

  • Correspondence (electronic mail and paper-based correspondence)
  • Project files
  • Grant applications
  • Ethics applications
  • Technical reports
  • Research reports
  • Signed consent forms

Adapted from Defining Research Data by the University of Oregon Libraries.


How to Find Data: Tips for Finding Data



How to start looking for data

1. Define the type of data you need

Consider what or who is being measured, where the data is collected, when, and how often.

  • What/Who: What is the unit of analysis relevant to your topic?
  • Where: Is the data specific to a particular geography (e.g., global, by state, regional)?
  • When: Is there a relevant time period for which the data is collected?
  • Frequency: How often is the data collected (e.g., annually or semi-annually)?

2. Determine who collects the type of data you are looking for

Think of who has a stake in collecting this data. Also consider who the audience of the data might be. This will help you determine where the data is likely published and how accessible the data is.

An Example:

  • I am interested in finding employment rates for colleges by state
  • The government has a stake in collecting those numbers
  • So I could look in a compendium like DataPlanet (under topic: Education or Labor and Employment), or I could go directly to organizations that I suspect collect relevant data, like the Bureau of Labor Statistics

3. Start searching for data

Again, keep in mind who collects the data and what this means for where it is located.

  • Data that is collected by organizations and agencies that publish reports will often be found in compendia or directly through that organization's website.
  • Data that is collected by individuals and researchers is sometimes available in data repositories.

Strategies for Finding Data

Browsing Data Compendia

This is a good strategy if you are not sure what types of variables exist or what data would be relevant for your project.

  • Select a data compendium
  • Determine the subject area or data type that your topic or variable falls under
  • Read the descriptions of the resources to determine a promising place to look

Searching by Topic

This guide provides several links to data sources by topic. These links are by no means exhaustive, but they can be a good place to start and can help you get a sense of the major collectors of data in your topic area.

  • Visit the Topic page of the Data guide
  • Find one or more topics that fit your research area
  • Start exploring links

Targeted Searching

This can be a good strategy if you have a sense of who is a major source of the sort of data you are seeking. 

  • Identify the home website of a relevant organization (e.g., the Centers for Disease Control is a major source of health data)
  • You might also want to look for any links called "reports" or "publications". These pages typically do not have datasets, but they might have published data that will help you identify other sources of data, such as relevant surveys or studies.
  • Use the site or domain search in the advanced search to limit results to the website (e.g., www.cdc.gov)
  • For example: "opioid use" (data OR statistics) site:www.cdc.gov

A sample search in Google Advanced Search for "opioid use" (data OR statistics) on the CDC's website
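
The site-limited search above can also be assembled programmatically, which may help when repeating the same search across several organizations. This is a minimal sketch: the function name `site_search_url` is hypothetical, and the Google search URL pattern is an assumption about the public search page rather than a documented API.

```python
from urllib.parse import urlencode


def site_search_url(phrase, site, extra_terms=("data", "statistics")):
    """Build a query string and search URL limited to one website."""
    alternatives = " OR ".join(extra_terms)
    query = f'"{phrase}" ({alternatives}) site:{site}'
    # Assumed URL pattern for the public Google search page.
    url = "https://www.google.com/search?" + urlencode({"q": query})
    return query, url


query, url = site_search_url("opioid use", "www.cdc.gov")
print(query)  # "opioid use" (data OR statistics) site:www.cdc.gov
print(url)
```

Swapping in a different `site` value, such as another agency's domain, reuses the same search terms against that organization's website.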

Literature Mining

Another source of data is the datasets used by scholars in their research. By searching through existing literature, you can identify and often obtain a dataset. You might also browse a data repository to see if someone has archived the data from their research.

To Search for Pre-Existing Literature

  • Select an appropriate subject-based article database, or go to the Finding Journal Articles Guide to learn about selecting a database
  • Search in one of the databases and sift through the results for an article that uses a dataset
  • Alternatively, search through a data repository for relevant data

This guide-page is adapted from the Search Strategies Page from the Data, Datasets, and Statistical Resources by the Research/IT Desk at Carleton College's Gould Library.

  • Last Updated: Mar 26, 2024 11:28 AM
  • URL: https://davidson.libguides.com/data


Getting Started Finding Data

General compendia, indexes, repositories, portals and data archives, finding specialized sources.

  • Literature review on topic

Who might collect this data?

Polling/public opinion, research guides and other materials, requesting or purchasing data that has already been identified, getting help, introduction.

Some things to consider when looking for data:

Do you need "summary statistics" (e.g. simple numbers or figures) or more detailed or micro-level "data"?  Raw, consumable data? If so, in what form? What technical skills and tools do you have?

Time frame and frequency (e.g. annual, monthly)

Time series data? Do you need data collected at regular intervals over time?

Do you need only current or also historical or retrospective data?  Can you get it from the same source? (Be careful if you are trying to mix data from different sources.)

In what indicators/variables are you interested?  Which ones are critical vs. ideal?

What is your unit of analysis (e.g. individuals, companies, etc.)?

Geographic unit (e.g. country, city, etc.)?

What is the source, scope and methodology of the data you are using?  Definitions?

Who might care about the data that you want? (See Who Might Collect This Data)

Did you do a literature review? (See Literature Review on Topic)

  • Data Repositories and Portals
  • ProQuest Statistical Abstract of the United States (Harvard Login) Summary statistics on the social, political and economic organization of the United States. Compiled from publications and records of various government and private agencies, it is designed to serve both as a convenient reference volume and as a guide to other statistical sources.
  • Historical Statistics of the United States (Harvard Login) Presents the U.S. in statistics from Colonial times to the present. Included are statistics on U.S. population, including characteristics, vital statistics, internal and international migration. Statistics on work and welfare, economic structure and performance, economic sectors, and governance and international relations. Tables may be downloaded for use in spreadsheets and other applications. This electronic database is also in a five volume hard copy set.
  • Proquest Statistical Insight (Harvard Login) Proquest Statistical Insight is a bibliographic database that indexes and abstracts the statistical content of selected United States government publications, state government publications, business and association publications, and intergovernmental publications. The abstracts may also contain a link to the full text of the table and/or a link to the agency's web site where the full text of the publication may be viewed and downloaded.


When you find an entry, check the source data. See e.g.  Live Births, Deaths, Marriages, And Divorces: 1960 To 2014

  • Harvard Dataverse The Harvard-MIT Data Center is the principal repository of quantitative social science data at Harvard University and the Massachusetts Institute of Technology. The majority of its holdings are available to Harvard and MIT affiliates directly via its web site through its search engine. Graduate students and faculty with a Harvard or MIT Library card can check out paper code books from libraries at either institution, under Harvard's and MIT's reciprocal borrowing agreement. In addition, the Data Center has negotiated a special agreement for undergraduates and summer graduate students, who are not covered by the standard agreement.
  • ICPSR One of the world’s oldest and largest social science data archives; based at the University of Michigan. Categories of datasets include census enumerations; community and urban studies; economic behavior and attitudes; education, government structures, policies and capabilities; social indicators; and much more. Includes links to related studies. You can search on/compare variables.
  • Registry of Research Data Repositories
  • Internet Crossroads in Social Science Data
  • Databases for Statistical and Data-Related Research
  • Find a database (Harvard) Look for your subject or topic and keyword and add data or statistics.
  • Harvard Libraries' Research guides
  • Qualitative Data Repository
  • Find a database (Harvard)
  • Hollis+ Statistical data sets is a resource type in results filter.
  • Harvard Google Scholar
  • Dissertations and Theses Full Text (ProQuest) Dissertations and Theses Full Text indexes dissertations and masters' theses from most North American graduate schools as well as some European universities. Provides full text for most indexed dissertations from 1990-present.
  • Academic Search Premier (Harvard Login) Academic Search Premier (ASP) is a multi-disciplinary database that includes citations and abstracts from over 4,700 scholarly publications (journals, magazines and newspapers). Full text is available for more than 3,600 of the publications and is searchable.

You can look for articles on your topic to find articles that use data originally generated for that article or to see what data the author used in writing the article.  Some databases also have subject and resource-type filters related to statistics, data, indicators, etc. to identify documents consisting of or related to data sets.

  • Elgaronline (Harvard Login)
  • Harvard Law School Library Find a Database
  • Harvard Law School Library Find Articles and Books
  • Research Guides at Harvard

It is often useful to consider if there are particular entities that might be most likely to care about and collect this data, such as governmental bodies, organizations, business/trade groups, commercial entities, etc.

  • Encyclopedia of Associations The Encyclopedia of Associations is a comprehensive source of detailed information on over 135,000 nonprofit membership organizations worldwide. It corresponds to the printed Encyclopedia of Associations family of publications as follows: National Organizations of the U.S., which covers more than 22,200 American associations of national scope; International Organizations, which covers some 22,300 multi-national, bi-national, and non-U.S. national associations; and Regional, State, and Local Organizations, which covers more than 115,000 U.S. associations with interstate, state, intrastate, city, or local scope or membership. The Encyclopedia of Associations database provides addresses and descriptions of professional societies, trade associations, labor unions, cultural and religious organizations, fan clubs, and other groups of all types.
  • Leadership Directories (Harvard Login)
  • Principal Federal Statistical Agencies
  • Find data and statistics from the government

You can try identifying a particular governmental body, organization, etc., or start with a general web search using keywords related to your topic (narrowly or broadly) and adding words like data, statistics, index, indicators, etc.  Adding site:.gov, site:.org or site:.edu to a Google search limits results to websites on those domains.  Certain topics might lend themselves to private research entities.
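The search approach described above can be sketched as a small helper that combines a topic with data-related words and a domain restriction. This is only an illustration: the function name `build_queries` and the default word and domain lists are assumptions chosen for the example, and the only search syntax relied on is the `site:` operator mentioned above.

```python
def build_queries(topic,
                  data_words=("data", "statistics", "indicators"),
                  domains=(".gov", ".org", ".edu")):
    """Return search strings of the form '<topic> <data word> site:<domain>'.

    `build_queries` is a hypothetical helper, not part of any search tool.
    """
    queries = []
    for domain in domains:
        for word in data_words:
            # One query per (domain, data word) pair.
            queries.append(f"{topic} {word} site:{domain}")
    return queries


if __name__ == "__main__":
    for query in build_queries("eviction rates"):
        print(query)
```

Each resulting string can be pasted into a search box. Narrowing `domains` to a single entry is a simple way to focus on, for example, government sources only.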

  • HKS Library Think Tank Search

Databases/Websites

  • Gallup Analytics (Harvard Login) Fully searchable records of Daily U.S. Data (economic, well-being, and political data collected daily since 2008 from 1,000+ interviews); World Poll Data (economic, social, and well-being data collected annually since 2005 in 160+ countries, 1.5 million+ interviews worldwide); and Gallup Brain (historical Gallup trends on thousands of topics from the U.S. and world dating back to the 1930s).
  • Roper Center for Public Opinion (Harvard Login) Provides access to summary-level (aggregate) and micro-level (raw) public opinion data. While the data collection focuses strongly on United States public opinion, it also includes growing collections of (micro-level) European, Latin American (Latin American Databank) and Japanese (JPOLL) polls. The data archive (micro-level data) is searchable by keyword, date, and survey organization. The iPOLL database (summary-level data) is searchable by keyword, subject, survey organization, and survey sponsor; it provides question- and response-level data. The Roper Center resources require users to set up individual accounts in order to gain access to the data.

Restricted Access: HarvardKey or Harvard ID and PIN required

  • Pew Research Center

News/secondary sources

  • Newspaper and news collections

You may sometimes find polls and surveys mentioned in newspaper articles and other secondary sources. You might try looking for words about your topic and adding the words "poll" or "survey" to your search.

  • Public Opinion Data Resources

Harvard guides

  • SJD Guide: Find and Use Data
  • Beginner's Guide to Locating and Using Numeric Data
  • Data Mining/Bulk Downloading Legal and Government Information (in progress)
  • Finding Data/Data Support: Getting Started BC Libraries
  • Inter-Governmental Data Resources
  • Harvard Research Guides

Many subject guides have sections about finding data/statistics, so you might also try looking at our research guides for your subject/discipline words and data or statistics or look for a guide on your topic generally.

  • Empirical Research

Books and Databases

finding research data

  • Sage Research Methods Online

If you need assistance acquiring or purchasing data, you may start with your faculty member's librarian liaison, submit a purchase request or contact Michelle Pearse , Senior Research and Data Librarian.  The library will work with you and your faculty member to determine how we might be able to obtain the data that is needed.

  • Last Updated: Sep 12, 2023 10:46 AM
  • URL: https://guides.library.harvard.edu/HLSRAfindingdata

Finding Research Data

Introduction.

  • Qualitative data sources
  • Quantitative data sources

What is research data?

Qualitative data

Qualitative data is observational and descriptive, relating to experiences and emotions. It can come in the form of interview transcripts, focus groups, case studies, diaries, and questionnaires and surveys that include open questions and free-text responses.

Use the tab at the top to explore sources of qualitative data.

To find out more about qualitative data and research methods, see Sage Research Methods .

Quantitative data

Quantitative data is numerical, objective and measurable, and can be statistically analysed. It can come in the form of statistics, polls, market research, and questionnaires and surveys with closed questions and multiple choice or yes/no responses.

Use the tab at the top to explore sources of quantitative data.

To find out more about quantitative data and research methods, see Sage Research Methods .

How to find research data

You have access to a wide range of data sources.  The library provides access to data archives and specialist resources such as the UK Data Service, Mass Observation and the OECD iLibrary, available in the  A-Z databases list , where you can filter by your subject discipline.

There are also an increasing number of free, open access sources of data online, such as government websites, digital repositories of academic institutions (including the University of Bristol data.bris ), institutional organisations, research councils and subject focused collections. In recent years, researchers have been encouraged to publish their research data online, so more datasets are becoming available. However, not all researchers do this and it is less common in social sciences subjects.

A selection of open access data sources is available in the subject guides for sociology and for politics and international studies, under the 'Finding resources/subject databases & open access and research data' tabs. Or you can use the tabs at the top for qualitative or quantitative data sources.

Using Google

Using a web search (such as Google) for data on a specific topic might not be the best way to start your search.  You might get results for research outputs, such as journal articles, that refer to the datasets but do not include access. For example, if you are looking for interview transcripts, journal articles will discuss the methodology and generally include the interview questions in an appendix, but not the interviewee answers.  We recommend you use data repositories, archives and dataset search tools instead.

Not all data is available.  It might not have been published or may require a subscription that the library doesn't have. Some data are sensitive, confidential, or too detailed, and access may be safeguarded or controlled.  This might require you to register with the data website and accept conditions or state why you want access.  Or access might be restricted to experienced researchers at PhD or post-doctoral level.

Browse what's available

If you need to find data for an essay, assignment or dissertation, you might have a very specific topic in mind.  It is best to check what data is available before you progress too far.  Search relevant data sources for your topic to check you can access what you need. You might need to browse what data are available on broader topics and if necessary adapt your focus.  It is best to find out early on what's available, rather than assume you can access one type of data on a very defined topic, only to realise late on that it's not available.

We have a broad range of primary sources and archival material that can be used to collect data for your own research.  This includes material such as government documents, parliamentary papers, newspapers, magazines, video, speeches, letters, social and political cartoons. Primary source and archival material are available in the A-Z databases list where you can filter by your subject discipline. 

Sage Research Methods

Vast database of full-text handbooks, videos and case studies of qualitative and quantitative research methods. Also includes a research project planner, with step by step help on each stage of the research process, for all types of dissertation or research project.

Help with finding data

If you need help with finding data for sociology or politics and international studies, please  email me  at  [email protected] .

  • Next: Qualitative data sources >>
  • Last Updated: Apr 3, 2024 3:17 PM
  • URL: https://bristol.libguides.com/research-data

  • PLoS Comput Biol
  • v.18(2); 2022 Feb

Ten simple rules for improving research data discovery

Nicole Contaxis

1 NYU Health Sciences Library, NYU Langone Health, New York, New York, United States of America

Jason Clark

2 Montana State University Library, Montana State University, Bozeman, Montana, United States of America

Anthony Dellureficio

3 Medical library, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America

Sara Gonzales

4 Galter Health Sciences Library and Learning Center, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America

Sara Mannheimer

Peter R. Oxley

5 Samuel J. Wood Library and C.V. Starr Biomedical Information Center, Weill-Cornell Medicine, New York, New York, United States of America

Melissa A. Ratajeski

6 Health Sciences Library System, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America

Alisa Surkis

Amy M. Yarnell

7 Health Sciences and Human Services Library, University of Maryland—Baltimore, Baltimore, Maryland, United States of America

Michelle Yee

Kristi Holmes

8 Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America

Introduction

Sharing and reusing research data present both opportunities and challenges for the individual researcher, their organizations, and the entire research community, but the promise of data sharing can only be actualized if the right data can be found. While grants and publications increasingly include data sharing requirements, locating the right data to answer a research question can still be challenging. As more data is shared more frequently, the data discovery problem becomes more apparent. There is simply more data to look through, and that data is distributed across a growing number of repositories, article supplements, websites, and other locations with different metadata, data standards, and search functionality.

These 10 rules can be thought of as a mirror to “Eleven Quick Tips for Finding Research Data,” providing key guidance on how to make your research data more findable in the complicated systems that share and provide access to research [ 1 ]. As opposed to helping locate data for reuse, this article is meant to help you make your data and your research more discoverable. The rules below walk you through the process of making your data more discoverable, including key steps to take when publishing an article (Rules 8 and 9).

As members of the Data Discovery Collaboration (DDC) [ 2 ], our work focuses on the issue of data discovery. We are a community of librarians and information professionals and are invested in helping to improve data findability and open data infrastructure more broadly. In service of these goals, we created these rules to provide guidance on how to improve the discovery of your research data by making key decisions early in the process (Rules 1 and 2), leveraging scholarly infrastructure already in place (Rules 3, 4, 5, 6, and 7), taking important steps at the point of publication (Rules 8 and 9), and tapping into the growing community of data professionals (Rule 10).

Rule 1: Decide what level of access you can provide

Data discovery is not data access. You can make your data discoverable without making it openly available. Data discovery is the process of learning that a dataset exists, whereas data access is the ability to download or view that data. Both of these concepts are part of the FAIR Data Principles ( F indable, A ccessible, I nteroperable, R eusable) [ 3 ], guidelines for data sharing that have been adopted by the National Institutes of Health (NIH) and many others. Discovery, the “F” (Findable) in FAIR, and access, the “A” (Accessible), are both integral concepts in improving data sharing.

That said, depending on data privacy issues, as well as security and ethical considerations, there may be restrictions placed on data access, such as an application to the data owner or Institutional Review Board (IRB) or Ethics Committee approval. Deciding the level of access that you can provide to data will impact how you describe your data and what tools you use to make your data discoverable. As such, deciding the level of access you can provide to a dataset is the first step in making your data discoverable.

The distinction between data discovery and data access is reflected in the tools used to explore research data, such as data catalogs and repositories. A data catalog is a tool that holds metadata that explains the who, what, when, where, and why of a dataset [ 4 – 7 ]. Metadata is searchable information, formally defined as “…structured information that describes, explains, locates or otherwise makes it easier to retrieve, use or manage that resource” [ 8 ]. A data repository is a tool that holds that metadata along with the data itself [ 9 ]. For example, the UK Data Service’s Data Catalogue [ 10 ] describes data and where it is stored, while ReShare, a repository [ 11 ], stores data and employs metadata to help users locate the right dataset within their own storage.

Consider optimizing the discovery of restricted data through a data catalog, particularly since there may be limited options for repositories that can accommodate restricted data. Data catalogs require only a description of the data and access procedures, and thus can make the existence of the data and the processes for potentially gaining access to it transparent, without violating restrictions on accessing the data itself. Your first step in improving the discovery of your research data is to know what access restrictions you may be subject to and how much information can be shared with the public.

Rule 2: Comply with ethical standards

As you decide whether you can provide access to your data and if you can make your data discoverable for others to apply for access, you will need to comply with a range of ethical regulations. Ethical considerations for data should come into play across all stages of a project—not only in the generation of data, but also in sharing, discovery, and reuse [ 12 ]. Regulations provide some direction when considering ethical practices for data discovery, but ethics should be considered as an active discussion over the course of a project in order to accomplish the work in accordance with your values as well as in compliance with regulations in your country and in your field. Your values and these regulations should be used to inform how data is used, stored, and shared during the life of the project. Explicitly defining acceptable procedures and practices with your data and basing it in your ethical values can ensure good and ethical data management across all stages of a project [ 13 ].

Relevant regulations vary internationally. The revised Common Rule [ 14 ], which went into effect in the United States in 2019, introduces guidelines for data reuse, including the idea of broad consent—that participants can consent to “future storage, maintenance, or research uses” of their data [ 15 ]. In the European Union, 2018’s General Data Protection Regulation (GDPR) provides some guidance regarding discovery and use of existing data; however, the GDPR’s ramifications for academic research are still not fully clear [ 16 , 17 ]. Understanding regulations relevant to your location and field as well as the guidelines of local regulatory bodies, like your IRB or Ethics Committee, can help guide decisions on whether to make your data discoverable and how. Below are some considerations to support ethical practice for data discovery, based on key elements of the research and clinical ethics, as outlined in the United States’ Belmont Report [ 18 ]:

  • Respect for persons (i.e., human dignity). Review your consent agreements and consider how you might make your data discoverable while respecting participant consent.
  • Beneficence (i.e., minimizing harm and maximizing benefit). What are the potential benefits of making your data discoverable as weighed against the potential harms? If your data is discoverable, will it negatively affect human subjects, endangered species, protected lands, or others?
  • Justice (i.e., fair treatment and privacy). When working with human subjects data, carefully consider the risk of reidentification when setting access procedures or choosing tools to enhance discovery. Consider direct and indirect identifiers and any related datasets that could lead to reidentification. Also consider intellectual property and ownership—is the data yours to make discoverable?

If your data results from work with vulnerable populations, be particularly careful. If applicable, consider referring to the CARE Principles for Indigenous Data Governance ( C ollective benefit, A uthority to control, R esponsibility, E thics). These principles, meant to complement the FAIR principles, provide key guidance on ways to work with indigenous individuals and communities and data that describes them [ 19 ]. Remember, supporting discovery for data does not mean that they need to be openly accessible; setting access procedures can allow for ethical data discovery and sharing [ 6 , 20 ]. Finally, it will be beneficial to consider your own values and ethical codes when pursuing newer practices like data publishing, discovery, and reuse, which may introduce new concerns not yet represented in regulations and guidelines [ 21 – 23 ].

Rule 3: Deposit your data somewhere trusted

In some cases, deciding where to describe or store your data to make it discoverable is an easy decision. Sometimes a funder will require use of a specific repository, or, in other cases, a repository will be the clear choice for a specific type of data. For example, the Gene Expression Omnibus (GEO) database and the database of Genotypes and Phenotypes (dbGaP) are clear choices for depositing gene expression data and genome-wide association study (GWAS) data, respectively, for high visibility. Both GEO and dbGaP are well known and widely used NIH resources for preserving and making these specific types of data available. Other researchers looking for gene expression data will know to go to GEO, making it discoverable to the relevant research community. For other types of data, the decision may be less clear as there may not be a repository with a similar degree of buy-in, longevity, and support.

When the choice is not as clear, how do you identify an appropriate home? For discovery purposes, it is generally best to look for a discipline-specific repository because fellow researchers with similar research aims are more likely to look in those repositories for data to reuse. A number of journals and publishers that require data sharing have evaluated repositories based on their stability and practices and provided a list of recommended repositories [ 24 – 26 ], offering a good place to start. For those working in the United States, the NIH maintains a list of NIH-supported repositories [ 27 ], and that support provides a degree of confidence in their sustainability. Included in that list are some repositories that enable restricted access, so if your data cannot be freely shared, an NIH-supported repository may still be an option.

Other discipline-specific repositories can be identified using the Registry of Research Data Repositories [ 28 ]. Keep in mind that repositories vary widely in their degree of buy-in from research communities and in the sustainability and support provided by the institutions that fund them. When choosing a repository, consider whether the repository is sustainable. Will it still exist in 5 years? Are there signs that it may lose the resources to accept new data and thereby become overlooked, with nonoptimized records? While there is no crystal ball to predict which repositories will be around and active in the future, considering the length of existence, the user population, and funding sources may provide a sense of whether or not a repository is likely to stick around.

If there are no suitable discipline-specific repositories, the publisher and NIH lists also include generalist repositories [ 24 – 27 ] that accept data from any discipline. These repositories vary in their size limitations, costs, and other policies [ 29 ]. In cases where datasets would be of interest to people in a broad array of disciplines, a generalist repository may be the best choice for data discoverability.

Another option is to upload your data to an institutional repository. These locally maintained generalist repositories may be a particularly good option for sustainability since an institution has a vested interest in preserving its data assets; however, make sure that the deposited data is findable through search engines (e.g., Google Dataset Search) or dataset aggregators (e.g., Mendeley Data). In fact, regardless of the type of repository you choose, checking to see that it is included in dataset aggregators and search engines should be an important part of your decision. These bolster the discovery of data stored across the many repositories.

In the interest of open science, it is strongly recommended to deposit your data to a repository to support data access, preservation, and reuse. A repository will help ensure a sound storage and preservation strategy for your data [ 13 ], and as stated earlier, if you need to limit access to your dataset, there are some repositories that can provide restricted access options. However, sometimes this is not possible. In these situations, a “landing page” record in a data catalog or repository is recommended if that option is available to you. While not every institution has a data catalog or repository with this function, if yours does, this is an excellent option for making data discoverable while you, or perhaps a third party, maintain control over the data.

Rule 4: Use persistent identifiers

Persistent identifiers (PIDs) are meant to provide a more stable, long-lasting way of uniquely identifying digital objects, people, and institutions and are vital for locating and citing scholarly materials [ 30 ]. Throughout the process of making your data discoverable, PIDs will play a crucial role. PIDs provide significant benefits: They can be machine readable, citable, and bound with metadata about the object, person, organization, or concept to which they point. Registry organizations maintain PIDs and provide search interfaces and APIs [ 31 , 32 ] for querying their registries. This means that by using PIDs and being registered through these organizations, you provide another avenue by which people may search for and discover your data. Assigning and using a PID can also make it possible to track the use and reuse of data over time, providing a clearer view of downstream use and dissemination.

To better explain PIDs and their benefits, we will highlight 2 important examples of PIDs: Digital Object Identifiers (DOIs) [ 33 ] to distinguish digital objects such as journal articles, and ORCID iDs [ 34 ] to distinguish between yourself and other researchers. ORCID iDs are identifiers for researchers and can help support scholarly identity management. These identifiers disambiguate you from other researchers with similar names, collapse variations of your own name to a single referent, and can maintain information about scholarly affiliations, education, funding, publications, and other scholarly works. ORCID iDs have been incorporated into publishing and funding workflows, resulting in a rich network of connections that can enhance the discovery of your data and other research products [ 35 ].

DOIs are used to distinguish digital research objects, including publications, data, software, and supporting information (see Rule 8 ). When a DOI is attached to a digital object, it is possible to accurately assert relationships between a given research object and other entities (e.g., authors with ORCID iDs) through links. Most publishers and repositories have incorporated DOIs, making these linkages, and thus enhanced discovery, possible. Furthermore, many repositories assign data a DOI upon deposit, making it easy to use and benefit from DOIs.
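Because PIDs are machine readable, they can be checked and turned into links programmatically. The sketch below validates the format and check digit of an ORCID iD (the final character is an ISO 7064 MOD 11-2 checksum over the first 15 digits) and builds a resolver link for a DOI via https://doi.org/. The function names are hypothetical, and the DOI used in the usage lines is a made-up placeholder, not a real dataset.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the 0000-0000-0000-000X layout and the final check character."""
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


def doi_to_url(doi: str) -> str:
    """Turn a bare DOI into a resolvable link (hypothetical helper)."""
    return "https://doi.org/" + doi.strip()


if __name__ == "__main__":
    # Sample iD from ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
    # Placeholder DOI, for illustration only.
    print(doi_to_url("10.1234/example-dataset"))
```

A check like this only confirms that an identifier is well formed; it does not confirm that the iD or DOI is registered or points at the intended person or object.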

Rule 5: Create thoughtful and rich metadata

As emphasized in the FAIR Data Principles, data should be findable (the “F” in FAIR) both by humans and machines [ 3 ]. This means that the data needs to be described using metadata in a way that humans can easily read as well as in a structured way that machines can parse and link to other resources. PIDs can greatly aid in this process, but to increase discoverability, you will also need to describe your data with rich metadata. As previous 10 Simple Rules papers have noted, making material findable and understandable for humans and for machines is key to ensuring that the materials themselves are as useful as possible [ 30 , 36 ]. As such, adding metadata is a good step forward in making things findable, but adding metadata that uses a structured metadata schema is crucial [ 37 ].

A metadata schema is a system that defines the data elements needed to describe a particular object, such as a certain type of research data. A metadata schema tailored to your discipline provides a set of metadata elements designed to provide a description of your dataset that is sufficient to make it discoverable and understandable. For example, the Content Standard for Digital Geospatial Metadata (CSDGM) is a metadata standard for geographic data, maintained by the Federal Geographic Data Committee [ 38 ], while the Data Documentation Initiative (DDI) standard is used to describe social science data [ 39 ]. Whether depositing a dataset into a repository or describing it in a data catalog, there will most likely be required data elements to complete. This rule is focused on how best to address several key metadata elements that are common to nearly all repositories and data catalogs: title, contributors, description, and funding.

When individuals are searching for datasets to reuse, they most likely will use the title as the first criterion to determine if it meets their needs. Be descriptive with your title and avoid defaulting to the filename as the title. Consider including as many of the following as are relevant to your dataset: What (type of data), Where (was the data collected), When (timeframe of the data), Who (subject of the research), and the Scale (approximate size) of the dataset.

Next, include the name and ORCID iD of all persons who contributed to the dataset (see Rule 4 ). The list of people who contributed to the dataset may be different from those with authorship on a related publication. Contributor roles to consider include data curation, methodology, resources, and software [ 40 ].

A dataset description is usually the lengthiest portion of a metadata record and the one that requires the most thought. This description should not be a restatement of a publication’s abstract nor should it address any results, analysis, or conclusions, although it may briefly describe the original research question with contextual details. For example, a description might start: “This dataset was generated from a survey that studies…” or “This dataset includes sequencing data from…” It can include information about the number and types of files included, important variables, and other documentation available (e.g., survey instruments, README files, information on research software). If the repository or catalog you are using does not have specific metadata elements for important experimental elements such as species/strain, cell type, study population, equipment, or other details, work them into the description text. This text will generally be searchable and therefore aid in the dataset’s discoverability. Most importantly, while writing the description, think about how you yourself would go about trying to find a dataset like yours. What terms would you use to search for the dataset and what would be most important to know in order to determine if you could reuse it?

Finally, provide specifics about funding that supported the work that produced the dataset (e.g., the funding organization, program name, and grant number) and provide links to project information in tools like NIH RePORTER [ 41 ]. Including this information will credit the funding agency for supporting the creation of the dataset, and it will also allow others searching for research connected to a grant to find your dataset more easily.
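The four elements discussed in this rule (title, contributors, description, and funding) can be pictured as a simple record with a few self-checks before deposit. The sketch below is illustrative only: the class names, field names, and the checks in `review` are assumptions made for this example and do not correspond to any particular repository's schema or required fields.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Contributor:
    name: str
    orcid: str = ""   # see Rule 4
    role: str = ""    # e.g., "data curation", "methodology"


@dataclass
class DatasetRecord:
    title: str
    description: str
    contributors: list = field(default_factory=list)
    funder: str = ""
    grant_number: str = ""


def review(record: DatasetRecord) -> list:
    """Return a list of plain-language warnings (hypothetical checks)."""
    problems = []
    title = record.title.strip()
    # A title ending in a common file extension is probably just a filename.
    if re.search(r"\.(csv|xlsx?|zip|txt|json)$", title, re.IGNORECASE):
        problems.append("title looks like a filename")
    if len(title.split()) < 4:
        problems.append("title is very short")
    if not record.contributors:
        problems.append("no contributors listed")
    for person in record.contributors:
        if not person.orcid:
            problems.append(f"contributor {person.name} has no ORCID iD")
    if not record.funder or not record.grant_number:
        problems.append("funding details incomplete")
    return problems
```

Calling `review` on a draft record returns an empty list when none of these checks fire. The thresholds (such as four words for a title) are arbitrary and would need adjusting to a given repository's guidance.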

Rule 6: Choose your keywords carefully

While a repository or data catalog may not require you to submit keywords, you can further enhance your dataset record’s discoverability by providing descriptive keywords and using controlled vocabularies or ontologies. Picking the right keywords is a vital part of the description of your data and is key to improving the discoverability of your data and other research outputs [ 30 ].

Controlled vocabularies and ontologies are resources maintained online by professional organizations and associations that standardize terms commonly used in various professional contexts, a common example of this being ICD-10 codes. These standardized terms are also machine readable, meaning that related keywords can be connected and enhance search results [ 36 ]. Medical Subject Headings (MeSH) [ 42 ], a hierarchically organized list of medical terms developed for PubMed, is one such example and is popular in medical data repositories. Some repositories offer ways to search for your keywords and related terms directly in these vocabularies through API integrations.

While using controlled vocabularies and ontologies is the best practice, data repositories do not always support their use. In which case, choosing keywords can increase the discoverability of your dataset by providing (1) the ability to describe your dataset with additional topical terms that might not be in the title or description fields; and (2) the opportunity to add synonyms, as well as broader or more granular terms that someone interested in the dataset might use in a search. Additionally, repositories and catalogs frequently allow users to browse all datasets with a specific keyword, meaning that adding keywords will also make your data discoverable through browsing functions.

An example of the benefits of tagging data records with synonyms can be seen in the generalist repository Zenodo, in which a search of “coronavirus” returns around 1,270 results at the time of writing whereas a search for the synonym “covid-19” returns around 58,700 results. In either set of records, a user can click on one of the hyperlinked keywords to do a targeted search for that keyword term. When adding keywords, whether or not they are part of controlled vocabularies, think of the words you would use to locate a similar dataset, then check whether these words are present in the description. If not, add them as keywords. Such terms might be related to geographic coverage, software used, instrumentation used, common abbreviations, subject of study or subject domain, to name only a few possibilities. A few minutes’ effort in identifying and applying synonym, variant, or more granular keyword terms can make all the difference to a searcher for data records and to your data’s overall discoverability and thus impact.
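The check described above (list the words you would search for, then see which ones the description lacks) can be sketched in a few lines of Python. This is only an illustration: the function name `missing_keywords`, the sample description, and the candidate terms are all hypothetical, and a real workflow would also consider the title and any controlled vocabulary your repository supports.

```python
import re

def missing_keywords(description, candidate_terms):
    """Return the candidate terms that do not appear in the description."""
    text = description.lower()
    missing = []
    for term in candidate_terms:
        # Match the term as a whole word or phrase, ignoring case.
        pattern = r"(?<!\w)" + re.escape(term.lower()) + r"(?!\w)"
        if not re.search(pattern, text):
            missing.append(term)
    return missing

# Hypothetical description and search terms, for illustration only.
description = "Survey responses on coronavirus vaccine attitudes in Montana, 2021."
candidates = ["coronavirus", "COVID-19", "SARS-CoV-2", "Montana", "survey"]

to_add = missing_keywords(description, candidates)
print(to_add)  # terms worth adding as keywords
```

Terms that come back from the function are the synonyms and variants a searcher might use but your description does not yet contain.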

Rule 7: Create links to related resources

Data, along with publications, software, and other research products, are part of the larger research ecosystem. The research ecosystem includes research objects (e.g., articles, datasets, associated files, data management plans, code), the people who create these objects, funders and organizations that provide support for those objects, etc., and even information on how these concepts are linked. Being intentional about how you link your dataset to other components in this research ecosystem is an essential step toward improving the discoverability of your data. For example, use of a Data Availability Statement (see Rule 9) links your data to your publication, yet there are more things to link than just your publications and your datasets.

Common opportunities and spaces for creating linkages across this network include links from labs or personal websites, links on social media posts back to research objects, a news post that announces your research findings and links to the release of the dataset, a link to a lay summary describing those findings for the public, a link from a code repository, and linking your research objects within an established metadata record. When creating these links, it is a good idea to follow best practices for other types of research objects. For example, research groups like the FORCE11 Software Citation Working Group have created guidelines on citing software [ 43 ] that would help you provide persistent links between your data and your software.

With discovery as your goal, this last example is the best place to start. After placing your dataset in a repository, enhancing the metadata record for your dataset by linking it to the software used for data collection using the software citation principles can provide additional context to your dataset that simple metadata cannot provide. To get the most out of these linkages, you will also want to use PIDs, as described in Rule 4.
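As a rough sketch of what such a linked metadata record might look like, the Python snippet below builds a small dictionary whose field names are modeled loosely on the related-identifier properties of the DataCite Metadata Schema (`relatedIdentifier`, `relatedIdentifierType`, `relationType`). The DOIs are placeholders, the helper name is hypothetical, and the choice of relation types is an assumption; your repository's submission form or API may use different fields.

```python
import json

# Placeholder identifiers for illustration only; replace with real PIDs.
DATASET_DOI = "10.1234/example-dataset"
ARTICLE_DOI = "10.1234/example-article"
SOFTWARE_DOI = "10.1234/example-software"

def related_identifier(identifier, id_type, relation):
    """Build one link from the dataset to another research object."""
    return {
        "relatedIdentifier": identifier,
        "relatedIdentifierType": id_type,
        "relationType": relation,
    }

record = {
    "doi": DATASET_DOI,
    "relatedIdentifiers": [
        # The dataset supplements the published article.
        related_identifier(ARTICLE_DOI, "DOI", "IsSupplementTo"),
        # Assumption: this relation fits software used to produce the data.
        related_identifier(SOFTWARE_DOI, "DOI", "IsCompiledBy"),
    ],
}

print(json.dumps(record, indent=2))
```

Because each link carries a persistent identifier and an explicit relation, indexing services can connect the dataset to the article and the software without relying on free-text descriptions.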

Rule 8: Make supporting information discoverable too

These next 2 rules discuss ways to improve data discovery while in the publication process. When writing a manuscript and publishing, you will frequently create PDFs, CSV spreadsheets, or additional figures as supporting information. There are key differences between a data file submitted as supporting information for a journal article and a data file that has been preserved and shared in a repository. Supporting information added to journal articles is far less discoverable than files stored in repositories. Supporting information is linked from within the journal article and can be hosted by the journals themselves. While it may be tempting to “set it and forget it” and solely deposit supporting information with a journal, these files may be overlooked (e.g., poorly located on the page or lacking a description), may be hidden behind journal paywalls, or may become inaccessible if their links are not maintained [ 44 – 46 ].

Moreover, the naming of supporting information may be garbled, truncated, or replaced with a random sequence of characters, impacting long-term discoverability and usability. Think about how frequently supporting information are mysteriously named “cc9-12-e0123-s001.pdf” or “2020_1234_MOESM2_ESM.xls” and attached to an article in PubMed Central. If multiple such files are listed, it is impossible to know which file contains the desired data. Also, consider what could be missing. Due to costs associated with the storage and maintenance of supporting information, some journals impose limits on the size of deposited files [ 47 ], which means a submitter may not be able to deposit the full complement of files needed to support subsequent use.

Additionally, journals rarely commit to preserving supporting information in perpetuity. These and other factors necessitate finding an alternate hosting solution for data before submitting a manuscript to a journal. As mentioned in Rule 3, an important step in improving data discovery is depositing data somewhere trustworthy and persistent. Depositing supporting information in other resources, like an institutional repository, helps ensure that access to these files will be maintained over time. Data files that are stored in repositories are indexed and maintained as separate entities. If best practices are followed (e.g., descriptive metadata and unique identifiers are applied), these files will appear in search engines on their own and are therefore opportunities to increase discoverability of a study and your visibility as a researcher.

The NIH Policy for Data Management and Sharing, set to take effect in 2023, encourages researchers to deposit high-value data files in related subject or data type–specific repositories, with generalist and institutional repositories and PubMed Central (for files under 2GB) as secondary options [ 48 ]. However, for all of the reasons noted above, the authors suggest utilizing repositories and linking them rather than hosting files solely within PubMed Central.

Rule 9: Include an accurate Data Availability Statement with your publications

A Data Availability Statement (DAS) within a journal article provides detailed information on where the data backing the claims of the research are located, whether that data is available for review, and if it is not, why it is not available. Several journals have provided recommendations on the specific style, format, and content of a DAS [ 49 , 50 ]. Writing a DAS is an important part of publishing that improves the visibility of your research data.

Writing a true and persistent DAS can provide a host of benefits, including increasing the discoverability of your data. First, it allows you to link directly to your data from a publication. Second, if a dataset cannot be deposited in an indexed repository for ethical, regulatory, or legal reasons, a DAS increases the visibility of this data that may be otherwise undiscoverable. Third, a DAS allows you to leverage filters in literature databases, making your publication more visible to researchers looking for secondary datasets. For example, in PubMed Central, it is possible to search only for articles that include a DAS so including a DAS increases the odds that a researcher looking for data to reuse will locate your research. Furthermore, there is evidence to suggest that the inclusion of a DAS leads to a citation advantage for authors [ 51 ].

Analysis has demonstrated that researchers often fail to write an accurate and useful DAS, but creating one does not need to be a difficult process [ 52 ]. In the case where data has been shared through a repository, there are only 2 key elements to a DAS. These components are a description of where the data is stored (e.g., a repository) as well as information to identify the dataset within that storage facility (e.g., accession information, a DOI, or a permalink).
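To make the two elements concrete, here is a minimal Python sketch that assembles a DAS from a storage location and an identifier. The function name, the sentence wording, and the placeholder DOI are all assumptions for illustration; journals often prescribe their own DAS wording, so treat this as a template to adapt rather than a required format.

```python
def data_availability_statement(repository, identifier):
    """Compose a simple DAS from the two key elements.

    repository: where the data is stored (e.g., a repository name).
    identifier: how to find the dataset there (accession number, DOI, or permalink).
    """
    if not repository.strip() or not identifier.strip():
        raise ValueError("Both a storage location and an identifier are required.")
    return (
        f"The data supporting this article are available in {repository} "
        f"at {identifier}."
    )

# Placeholder DOI for illustration only.
das = data_availability_statement(
    "Zenodo", "https://doi.org/10.1234/example-dataset"
)
print(das)
```

Refusing to build the statement when either element is missing mirrors the point above: a DAS that names a repository without an identifier, or the reverse, leaves the reader unable to locate the data.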

However, for datasets that are restricted due to subject privacy or other regulatory issues, a different type of DAS is necessary. In this case, it is customary to include a short explanation of why the dataset is restricted along with information on who to contact in order to apply for access to the confidential data. This approach can present a barrier to those interested in using a dataset due to the difficulty of locating the owner of the data and managing the access procedures. It can also present issues for data owners who are then required to perform additional work to share the data years after publication. For this reason, this method should only be employed if a dataset cannot be deposited into a repository. In the case where there is no other option, you should ask yourself the following critical questions:

  • Is there a process for preparing this data for another researcher when it is requested?
  • Is there sufficient documentation to allow others to understand the dataset?
  • Is accurate and current contact information provided?
  • Has there been a regular review of stored data?
  • Is there a person assigned to act as a data steward and perform any necessary tasks when data is requested?

Sometimes a hybrid approach is best. A subset of a restricted dataset (e.g., a deidentified dataset) can be made available through a repository. This hybrid approach entails including access information for the accessible subset of the restricted dataset, information on why the larger dataset is restricted, and contact information to get further information on the larger dataset. Of course, before submitting a DAS, proofread the accession numbers and test any links to help ensure that the DAS is true, specific, and persistent.
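One way to support that final proofreading step is to pull every link and DOI-like string out of the draft DAS so each can be checked by hand. The Python sketch below does only that: it does not contact any server or confirm that a link resolves. The regular expressions are simplified assumptions (real DOIs and URLs can be more varied), and the sample statement uses a placeholder DOI.

```python
import re

# Simplified patterns; assumptions for illustration, not a full specification.
URL_PATTERN = re.compile(r"https?://[^\s\"<>]+")
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")

def _strip_trailing_punctuation(value):
    """Drop sentence punctuation that is not part of the identifier."""
    return value.rstrip(".,;)")

def identifiers_to_check(das_text):
    """List the URLs and DOI-like strings found in a draft DAS."""
    urls = [_strip_trailing_punctuation(u) for u in URL_PATTERN.findall(das_text)]
    dois = [_strip_trailing_punctuation(d) for d in DOI_PATTERN.findall(das_text)]
    return {"urls": urls, "dois": dois}

# Hypothetical hybrid DAS with a placeholder DOI.
draft = (
    "Deidentified data are available at https://doi.org/10.1234/example-subset. "
    "The full dataset is restricted; contact the data steward for access."
)
found = identifiers_to_check(draft)
print(found)
```

An empty result for a DAS that is supposed to point to a repository is itself a useful warning that the identifier was left out.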

Rule 10: Talk to your institutional librarian

You are not in this alone. You have subject matter experts in the field of discoverability close at hand, ready to work with you to support practices in data sharing in order to enhance access, support discoverability, and respect privacy.

The NIH and National Science Foundation (NSF) data management policies recommend institutional librarians as sources of expertise on issues related to data discoverability [ 48 , 53 , 54 ]. Librarians work with metadata, repositories, ontologies, and data management plans from both sides—searching and curating—and are highly familiar with institutional, national, and international requirements and policies for data handling. Your institutional library can therefore provide guidance in best practices, teach you how to use available tools and templates, as well as help you select appropriate repositories and data catalogs to maximize the discoverability of your data. Check your library for consultation services or classes in data management or for research guides and other online resources available to you.

Working through data discoverability is like preheating your oven: It is best done at the beginning of the process to prevent unpleasant delays at the end. Being proactive will help confirm that your data management is aligned with any regulatory requirements at the beginning of your project, and may reveal workflows that will allow efficient metadata collection and sharing along the way.

Acknowledgments

The authors would like to thank the Data Discovery Collaboration.

Funding Statement

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Funding includes the National Institutes of Health's National Center for Advancing Translational Science ( https://ncats.nih.gov ) to the NYU Langone Health’s Clinical and Translational Science Institute (UL1TR001445) to NC, MY, and AS and to the Northwestern University Clinical and Translational Sciences Institute (UL1TR001422) to SG and KH. Additional support was provided by the National Institutes of Health’s Network of the National Library of Medicine ( https://nnlm.gov ) to the National Evaluation Center (U24LM013751) to SH and KH. The Institute of Museum and Library Services ( https://www.imls.gov ) also provided funding to Montana State University Library (LG-89-18-0225-18) to SM and JC.

Research Data: Finding, Managing, Sharing

  • Discovery & Access
  • Data Management Plans
  • Organization
  • Documentation
  • Metadata/Description
  • Sharing & Publication
  • Preservation

This guide, authored by U-M Library's Research Data Services , is intended for all researchers wanting to know more about how to obtain, manage, share, or preserve data.  Use the tabs above to gain an understanding of and to find resources about data management and stewardship.

Research data are the elements used to support or validate your academic work or analysis. While this includes numeric data, many other types of research objects such as code, formulae, images, sound, artifacts, and text can be considered research data.  

Research Data Services is a network of services provided by the library to assist you during all phases of the research data lifecycle. For questions about research data or to schedule a consultation, please get in touch with your subject librarian or email us.

Research Data Services

The Library can assist you in the following ways:

Data Management Planning: helping plan for managing, sharing and curating data and developing Data Management Plans (DMPs) that meet funder requirements. 

Discovery & Access: assisting in discovering, accessing, and acquiring different types of research materials, including data. 

Data Organization & Management: helping researchers to understand, develop and apply strategies for organizing and managing their data.  

Metadata & Documentation: locating standards for documentation that capture the details of generating, processing and analyzing data so it can be discovered, understood and reused.  

Data Sharing & Publication: helping disseminate research data for discovery, access and reuse in ways that enable researchers to receive credit for their work.  

Preservation: taking action to sustain the accessibility and scholarly value of data over time.  

Data Lifecycle

[Figure: data lifecycle model, from USGS]

Managing data effectively requires a consideration for how the data will grow and evolve over the course of your research. Data Lifecycle Models, such as the one shown here, provide a high level view of the stages that a data set may pass through and how these stages are connected. At each stage, researchers should consider how the data will be described, managed and secured.

The stages in your own data lifecycle will vary according to the type of research you are conducting, the data you are working with and your particular goals and needs for the data. The library can help you in identifying the stages of your data lifecycle and in considering what actions should be taken at each stage to ensure that the data are accessible, understandable and fit for re-use. 

Click here for more information on Data Lifecycles.  

*Graphic above from USGS:  https://www2.usgs.gov/datamanagement/why-dm/lifecycleoverview.php

Getting Help

For additional information or to ask questions, contact your subject librarian!

  • Last Updated: Aug 14, 2023 11:21 AM
  • URL: https://guides.lib.umich.edu/data

Finding research data

This page explains how you can search for existing research data.

You might want to use existing data for your research to do a meta-analysis, to re-evaluate findings, to perform new analyses on existing data, or to validate a model.

Use your network

Publishing datasets is not yet as common as publishing articles, and many researchers still keep their datasets to themselves. You might then want to ask people in your network if they have datasets you could use.

Search the literature

Many articles you read are likely based on data. If you find an article that is closely related to your research topic, see if it includes a reference to a dataset, or if it contains supplementary data files. If not, you could contact the author and request access to the data.

Search repositories

Data repositories contain datasets created and published by researchers or by organisations. The easiest way to look for repositories is to use the directory Re3data. This directory lists both multidisciplinary repositories and disciplinary repositories. Multidisciplinary repositories contain datasets from various fields, and disciplinary repositories contain datasets from specific fields.

Although Re3data covers many data repositories, it does not cover them all. You should always search for repositories within your discipline. You can also find disciplinary repositories with open data in this list of data repositories.

Search indexes of datasets

Another quick way to look for data is to search indexes of datasets. Unlike repositories, indexes do not contain actual datasets but list their metadata, such as title, date, and creator.

  • DataSearch lists datasets from repositories, together with figures, tables and other data from published articles. You can preview the data files, so you can easily see if a dataset is useful to you.
  • DataCite’s Metadata Search lists datasets that have been given a DOI, a unique identifier for datasets and articles. The database covers datasets from several repositories.
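Indexes like these can often be queried programmatically as well as through a search page. As a hedged sketch, the Python snippet below builds (but does not send) a search URL for DataCite's public REST API. The endpoint and the parameter names `query`, `resource-type-id`, and `page[size]` reflect my understanding of that API and should be checked against DataCite's current documentation; the helper name and search phrase are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base endpoint of the DataCite REST API; verify before relying on it.
DATACITE_DOIS_ENDPOINT = "https://api.datacite.org/dois"

def datacite_dataset_search_url(search_terms, max_results=5):
    """Build a URL that searches DataCite metadata for datasets."""
    params = {
        "query": search_terms,           # free-text search over the metadata
        "resource-type-id": "dataset",   # restrict results to datasets
        "page[size]": max_results,       # number of records per page
    }
    return DATACITE_DOIS_ENDPOINT + "?" + urlencode(params)

url = datacite_dataset_search_url("coral reef")
print(url)
```

Opening the printed URL in a browser, or fetching it with an HTTP client, should return metadata records rather than the data files themselves, which matches the distinction between an index and a repository described above.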

Data Module #1: What is Research Data?

Defining research data.

  • Qualitative vs. Quantitative
  • Types of Research Data
  • Data and Statistics
  • Let's Review...

Module created by Aaron Albertson, Beth Hillemann, & Ron Joslin.

Creative Commons License

Many people think of data-driven research as something that primarily happens in the sciences. It is often thought of as involving a spreadsheet filled with numbers. Both of these beliefs are incorrect. Research data are collected and used in scholarship across all academic disciplines and, while they can consist of numbers in a spreadsheet, they also take many different formats, including videos, images, artifacts, and diaries. Whether a psychologist collecting survey data to better understand human behavior, an artist using data to generate images and sounds, or an anthropologist using audio files to document observations about different cultures, scholarly research across all academic fields is increasingly data-driven.

In our Data Literacy Modules, we will demonstrate the ways in which research data are gathered and used across various academic disciplines by discussing it in a very broad sense. We define research data as: any information collected, stored, and processed to produce and validate original research results. Data might be used to prove or disprove a theory, bolster claims made in research, or to further the knowledge around a specific topic or problem.

Other Definitions of Research Data

There are many different definitions of research data available. Here are just a few examples of other definitions. We share these examples to illustrate there is not universal consensus on a definition, although many similarities are apparent.

  • U.S. Office of Management & Budget

“research data, unlike other types of information, is collected, observed, or created, for purposes of analysis to produce original research results”  

  • University of Edinburgh

"...recorded factual material commonly accepted in the scientific community as necessary to validate research findings..."  

  • National Endowment for the Humanities

"...materials generated or collected during the course of conducting research..."

Research Data Formats

Research data takes many different forms. Data may be intangible, as in measured numerical values found in a spreadsheet, or an object, as in physical research materials such as samples of rocks, plants, or insects. Here are some examples of the formats that data can take:

  • Last Updated: Feb 2, 2024 1:41 PM
  • URL: https://libguides.macalester.edu/data1

10 Great Places to Find Free Datasets for Your Next Project

Wondering where to find free and open datasets for your next data project? Look no further…

If you’re looking for a job in data analytics, you’ll need a portfolio to demonstrate your expertise. Of course, if you’re new to data analytics, you probably don’t have much expertise! Not to worry. The fact you might not have worked on a paid project yet doesn’t mean you can’t whip up a compelling portfolio using some practice datasets.

Fortunately, the Internet is awash with these, most of which are completely free to download (thanks to the open data initiative). In this post, we’ll highlight a few first-rate repositories where you can find data on everything from business to finance, planetary science and crime.

Prepare to geek out, and here we go:

1. Google Dataset Search

  • Type of data: Miscellaneous
  • Data compiled by: Google
  • Access: Free to search, but does include some fee-based search results
  • Sample dataset: Global price of coffee, 1990-present

It seems we turn to Google for everything these days, and data is no exception. Launched in 2018, Google Dataset Search is like Google’s standard search engine, but strictly for data.

While it’s not the best tool if you prefer to browse, if you have a particular topic or keyword in mind, it won’t disappoint. Google Dataset Search aggregates data from external sources, providing a clear summary of what’s available, a description of the data, who it’s provided by, and when it was last updated. It’s an excellent place to start.

2. Kaggle

  • Type of data: Miscellaneous
  • Data compiled by: Kaggle
  • Access: Free, but registration required
  • Sample dataset: Daily temperature of major cities

Like Google Dataset Search, Kaggle offers aggregated datasets, but it’s a community hub rather than a search engine. Kaggle launched in 2010 with a number of machine learning competitions, which subsequently solved problems for the likes of NASA and Ford.

It’s since evolved into a renowned open data platform, offering cloud-based collaboration for data scientists, as well as educational tools for teaching artificial intelligence and data analysis techniques …plus, of course, tonnes of great datasets covering almost any topic you can imagine.

3. Data.Gov

  • Type of data: Government
  • Data compiled by: US Federal Government
  • Access: Free, no registration required
  • Sample dataset: Lobster Report for Transshipment and Sales

In 2015, the US Government made all its data publicly available. With over 200,000 datasets covering everything from climate change to crime, you can lose yourself in the database for hours.

For a government website, it has some surprisingly user-friendly search functions, including the ability to drill down by geographical area, organization type, and file format. Search results are also clearly labeled at federal, state, county, and city levels.

If you’re interested in more general data about the US population, you can also check out the US Census Bureau, offering a rich selection of data about US citizens, their geography, education, and population growth.

4. Datahub.io

  • Type of data: Mostly business and finance
  • Data compiled by: Datahub
  • Access: Mostly free, no registration required
  • Sample dataset: Average mass of glaciers since 1945

The goal of many data analysts is to help drive savvy business decisions. As such, using economic or business datasets for your portfolio project might be worth considering.

While Datahub covers a variety of topics from climate change to entertainment, it mainly focuses on areas like stock market data, property prices, inflation, and logistics. Because many of the data on the portal are updated monthly (or even daily) you’ll always have something fresh to work with, as well as data that covers broad timescales.

5. UCI Machine Learning Repository

  • Type of data: Machine learning
  • Data compiled by: University of California Irvine
  • Access: Free, no registration required
  • Sample dataset: Behavior of urban traffic in Sao Paulo, Brazil

Generalized repositories are great if you’re happy to browse. But if you’re seeking something more niche, why not specialize? Enter the UCI Machine Learning Repository.

The repository was launched thirty years ago by the University of California Irvine, but don’t let the 90s vibe mislead you: the UCI repository has a strong reputation among students, teachers, and researchers as the go-to place for machine learning data.

Datasets are clearly categorized by task (i.e. classification, regression, or clustering), attribute (i.e. categorical, numerical), data type, and area of expertise. This makes it easy to find something that’s suitable, whatever machine learning project you’re working on.

5. Earth Data

  • Type of data: Earth science
  • Data compiled by: NASA
  • Access: Free, no registration required
  • Sample dataset: Environmental conditions during fall moose hunting season in Alaska, 2000-2016

If you think space is awesome (let’s face it, space is awesome!) look no further than Earth Data. Publicly available since 1994, this repository provides access to all of NASA’s satellite observation data for our little blue planet.

As you can imagine, there’s plenty to peruse, from weather and climate measurements to atmospheric observations, ocean temperatures, vegetation mapping, and more. If Earth-based data isn’t your thing, NASA’s Planetary Data System takes things a step further with data from interplanetary missions, such as the Cassini probe (which orbited Saturn from 2004 to 2017). Who knows, you might even make a scientific discovery…

6. CERN Open Data Portal

  • Type of data: Particle Physics
  • Data compiled by: CERN
  • Access: Free, no registration required
  • Sample dataset: Higgs candidate collision events from 2011 and 2012

Want to demonstrate your ability to work with highly complex datasets? Head to the CERN Open Data Portal. It offers access to over two petabytes of information, including datasets from the Large Hadron Collider particle accelerator. Frankly, these data aren’t for the faint of heart but if you’re interested in particle physics, they’re worth checking out.

While even the names of these datasets are pretty complex, each entry has a helpful breakdown of what’s included, as well as related datasets, and how to go about analyzing them. In many cases, they even provide sample code to get you started (thanks, CERN!)

7. Global Health Observatory Data Repository

  • Type of data: Health
  • Data compiled by: UN World Health Organization
  • Access: Free, no registration required
  • Sample dataset: Polio immunization coverage estimates by region

The Global Health Observatory data repository is the UN WHO’s gateway to health-related statistics from across the globe. If you’re looking to break into the healthcare industry (a key focus for many data scientists, especially in the area of machine learning), these datasets are a good option for your portfolio.

Covering everything from malaria to HIV/AIDS, antimicrobial resistance, and vaccination rates, the portal even has a nice little feature that lets you preview data tables before downloading them. Not strictly necessary, but definitely nice to have!

8. BFI film industry statistics

  • Type of data: Entertainment and film
  • Data compiled by: British Film Institute
  • Access: Free, no registration required
  • Sample dataset: Weekend box office figures from 2001-present

If you’re looking for some data that are a bit more digestible, the next few should be right up your street. First off: the British Film Institute industry statistics. Throughout the year, the BFI accrues and releases data on everything from UK box office figures, to audience demographics, home entertainment, movie production costs, and more.

The best part, though, is their annual statistical yearbook. This breaks down the year’s data with some excellent statistical analysis and visual reports—great if you’re new to data analytics and want to check your work against the real thing.

9. NYC Taxi Trip Data

  • Type of data: Transport
  • Data compiled by: New York City Taxi and Limousine Commission
  • Access: Free, no registration required
  • Sample dataset: Take your pick!

This is a weirdly fascinating one…since 2009, the NYC Taxi and Limousine Commission has been accruing transport data from across New York City. Find datasets covering pick-up/drop-off times and locations, trip distances, fares, rate and payment types, passenger counts, and more.

It’s pretty interesting to compare the differences in figures from 2009 to the present day, especially within such a small geographic area. The site also provides some additional tools, including user guides, taxicab zone maps, data dictionaries (for explaining the spreadsheet labels), and annual industry reports. All very intuitive and quite a helpful guide if you’re new to data analytics.

10. FBI Crime Data Explorer

  • Type of data: Crime and drugs
  • Data compiled by: Federal Bureau of Investigation
  • Access: Free, no registration required
  • Sample dataset: Homicide offense counts in Point Pleasant, 2008-2018

If you’re fascinated by crime, the FBI Crime Data Explorer is the one for you. It provides a broad collection of crime statistics from a variety of state organizations (universities and local law enforcement) and government (on a local, regional, and state-level). Pull data on hate crimes, officer assaults, homicides, and more.

Like the last couple of entries on our list, it also includes some helpful user guides to support data navigation. Each dataset also has some pretty nice visual breakdowns and analysis, so you can see if it has the features you’re looking for before downloading it.

If you’re anything like us, you’ll lose hours simply browsing these vast repositories. From the quirky to the unashamedly geeky, there’s no better evidence of data’s ubiquity in our lives.

So what do you do once you’ve found your dataset and analyzed it? If you want to feature your analysis as a project in your portfolio, there are certain steps you’ll need to follow; you can learn how to build your data analytics portfolio in this guide.

If you’re completely new to data analytics, why not try out a free, 5-day introductory short course? You’ll get a hands-on introduction to the field, complete with access to a workable dataset. And, if you’d like to learn more about what it takes to forge a career in data, check out the following:

  • Am I a Good Fit for a Career as a Data Analyst?
  • The Best Online Data Analytics Courses
  • The 7 Top Data Analysis Software Tools

Research Data – Types Methods and Examples

Research Data

Research data refers to any information or evidence gathered through systematic investigation or experimentation to support or refute a hypothesis or answer a research question.

It includes both primary and secondary data, and can be in various formats such as numerical, textual, audiovisual, or visual. Research data plays a critical role in scientific inquiry and is often subject to rigorous analysis, interpretation, and dissemination to advance knowledge and inform decision-making.

Types of Research Data

There are generally four types of research data:

Quantitative Data

This type of data involves the collection and analysis of numerical data. It is often gathered through surveys, experiments, or other types of structured data collection methods. Quantitative data can be analyzed using statistical techniques to identify patterns or relationships in the data.
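As a small illustration of "identifying relationships" with a statistical technique, the Python sketch below computes a Pearson correlation coefficient by hand. The two lists are made-up values chosen only to show the calculation, not real measurements, and the function name is hypothetical; in practice a statistics package would usually do this for you.

```python
from math import sqrt
from statistics import fmean

def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two lists of the same length (at least 2 values).")
    mean_x, mean_y = fmean(xs), fmean(ys)
    # Sum of products of deviations, and sums of squared deviations.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)

# Made-up example values: hours studied and quiz scores.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]

r = pearson_correlation(hours, scores)
print(round(r, 3))
```

A value near 1 or -1 suggests a strong linear relationship between the two variables, while a value near 0 suggests little or none.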

Qualitative Data

This type of data is non-numerical and often involves the collection and analysis of words, images, or sounds. It is often gathered through methods such as interviews, focus groups, or observation. Qualitative data can be analyzed using techniques such as content analysis, thematic analysis, or discourse analysis.

Primary Data

This type of data is collected by the researcher directly from the source. It can include data gathered through surveys, experiments, interviews, or observation. Primary data is often used to answer specific research questions or to test hypotheses.

Secondary Data

This type of data is collected by someone other than the researcher. It can include data from sources such as government reports, academic journals, or industry publications. Secondary data is often used to supplement or support primary data or to provide context for a research project.

Research Data Formats

There are several formats in which research data can be collected and stored. Some common formats include:

  • Text: This format includes any type of written data, such as interview transcripts, survey responses, or open-ended questionnaire answers.
  • Numeric: This format includes any data that can be expressed as numerical values, such as measurements or counts.
  • Audio: This format includes any recorded data in an audio form, such as interviews or focus group discussions.
  • Video: This format includes any recorded data in a video form, such as observations of behavior or experimental procedures.
  • Images: This format includes any visual data, such as photographs, drawings, or scans of documents.
  • Mixed media: This format includes any combination of the above formats, such as a survey response that includes both text and numeric data, or an observation study that includes both video and audio recordings.
  • Sensor Data: This format includes data collected from various sensors or devices, such as GPS, accelerometers, or heart rate monitors.
  • Social Media Data: This format includes data collected from social media platforms, such as tweets, posts, or comments.
  • Geographic Information System (GIS) Data: This format includes data with a spatial component, such as maps or satellite imagery.
  • Machine-Readable Data: This format includes data that can be read and processed by machines, such as data in XML or JSON format.
  • Metadata: This format includes data that describes other data, such as information about the source, format, or content of a dataset.
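
As a rough illustration of the machine-readable and metadata formats listed above, the following Python sketch parses a small JSON record that carries both the data and a description of it. The record, its field names, and its values are invented for this example and do not come from any real dataset.

```python
import json

# Hypothetical dataset: the structure and field names are assumptions
# made for illustration only.
raw = """
{
  "metadata": {"source": "example survey", "format": "JSON", "collected": "2023"},
  "records": [
    {"respondent": 1, "age": 34, "comment": "Satisfied"},
    {"respondent": 2, "age": 51, "comment": "Needs improvement"}
  ]
}
"""

dataset = json.loads(raw)        # parse the machine-readable text
metadata = dataset["metadata"]   # data that describes the data
ages = [record["age"] for record in dataset["records"]]  # numeric values
comments = [record["comment"] for record in dataset["records"]]  # text values

print(metadata["source"], len(dataset["records"]), ages)
```

A single record here mixes numeric and text values, which is the kind of combination the mixed media entry describes.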

Data Collection Methods

Some common research data collection methods include:

  • Surveys: Surveys involve asking participants to answer a series of questions about a particular topic. Surveys can be conducted online, over the phone, or in person.
  • Interviews: Interviews involve asking participants a series of open-ended questions in order to gather detailed information about their experiences or perspectives. Interviews can be conducted in person, over the phone, or via video conferencing.
  • Focus groups: Focus groups involve bringing together a small group of participants to discuss a particular topic or issue in depth. The group is typically led by a moderator who asks questions and encourages discussion among the participants.
  • Observations: Observations involve watching and recording behaviors or events as they naturally occur. Observations can be conducted in person or through the use of video or audio recordings.
  • Experiments: Experiments involve manipulating one or more variables in order to measure the effect on an outcome of interest. Experiments can be conducted in a laboratory or in the field.
  • Case studies: Case studies involve conducting an in-depth analysis of a particular individual, group, or organization. Case studies typically involve gathering data from multiple sources, including interviews, observations, and document analysis.
  • Secondary data analysis: Secondary data analysis involves analyzing existing data that was collected for another purpose. Examples of secondary data sources include government records, academic research studies, and market research reports.

Analysis Methods

Some common research data analysis methods include:

  • Descriptive statistics: Descriptive statistics involve summarizing and describing the main features of a dataset, such as the mean, median, and standard deviation. Descriptive statistics are often used to provide an initial overview of the data.
  • Inferential statistics: Inferential statistics involve using statistical techniques to draw conclusions about a population based on a sample of data. Inferential statistics are often used to test hypotheses and determine the statistical significance of relationships between variables.
  • Content analysis: Content analysis involves analyzing the content of text, audio, or video data to identify patterns, themes, or other meaningful features. Content analysis is often used in qualitative research to analyze open-ended survey responses, interviews, or other types of text data.
  • Discourse analysis: Discourse analysis involves analyzing the language used in text, audio, or video data to understand how meaning is constructed and communicated. Discourse analysis is often used in qualitative research to analyze interviews, focus group discussions, or other types of text data.
  • Grounded theory: Grounded theory involves developing a theory or model based on an analysis of qualitative data. Grounded theory is often used in exploratory research to generate new insights and hypotheses.
  • Network analysis: Network analysis involves analyzing the relationships between entities, such as individuals or organizations, in a network. Network analysis is often used in social network analysis to understand the structure and dynamics of social networks.
  • Structural equation modeling: Structural equation modeling involves using statistical techniques to test complex models that include multiple variables and relationships. Structural equation modeling is often used in social science research to test theories about the relationships between variables.
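
The descriptive statistics entry in the list above can be sketched in a few lines of Python using the standard library's `statistics` module. The scores are made-up values, and the sample standard deviation is one reasonable choice here, not the only one.

```python
import statistics

# Invented sample of eight measurements, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "mean": statistics.mean(scores),      # arithmetic average
    "median": statistics.median(scores),  # middle value of the sorted data
    "stdev": statistics.stdev(scores),    # sample standard deviation (n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

If the values represent a whole population and not a sample, `statistics.pstdev` would be the closer match.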

Purpose of Research Data

Research data serves several important purposes, including:

  • Supporting scientific discoveries: Research data provides the basis for scientific discoveries and innovations. Researchers use data to test hypotheses, develop new theories, and advance scientific knowledge in their field.
  • Validating research findings: Research data provides the evidence necessary to validate research findings. By analyzing and interpreting data, researchers can determine the statistical significance of relationships between variables and draw conclusions about the research question.
  • Informing policy decisions: Research data can be used to inform policy decisions by providing evidence about the effectiveness of different policies or interventions. Policymakers can use data to make informed decisions about how to allocate resources and address social or economic challenges.
  • Promoting transparency and accountability: Research data promotes transparency and accountability by allowing other researchers to verify and replicate research findings. Data sharing also promotes transparency by allowing others to examine the methods used to collect and analyze data.
  • Supporting education and training: Research data can be used to support education and training by providing examples of research methods, data analysis techniques, and research findings. Students and researchers can use data to learn new research skills and to develop their own research projects.

Applications of Research Data

Research data has numerous applications across various fields, including social sciences, natural sciences, engineering, and health sciences. The applications of research data can be broadly classified into the following categories:

  • Academic research: Research data is widely used in academic research to test hypotheses, develop new theories, and advance scientific knowledge. Researchers use data to explore complex relationships between variables, identify patterns, and make predictions.
  • Business and industry: Research data is used in business and industry to make informed decisions about product development, marketing, and customer engagement. Data analysis techniques such as market research, customer analytics, and financial analysis are widely used to gain insights and inform strategic decision-making.
  • Healthcare: Research data is used in healthcare to improve patient outcomes, develop new treatments, and identify health risks. Researchers use data to analyze health trends, track disease outbreaks, and develop evidence-based treatment protocols.
  • Education: Research data is used in education to improve teaching and learning outcomes. Data analysis techniques such as assessments, surveys, and evaluations are used to measure student progress, evaluate program effectiveness, and inform policy decisions.
  • Government and public policy: Research data is used in government and public policy to inform decision-making and policy development. Data analysis techniques such as demographic analysis, cost-benefit analysis, and impact evaluation are widely used to evaluate policy effectiveness, identify social or economic challenges, and develop evidence-based policy solutions.
  • Environmental management: Research data is used in environmental management to monitor environmental conditions, track changes, and identify emerging threats. Data analysis techniques such as spatial analysis, remote sensing, and modeling are used to map environmental features, monitor ecosystem health, and inform policy decisions.

Advantages of Research Data

Research data has numerous advantages, including:

  • Empirical evidence: Research data provides empirical evidence that can be used to support or refute theories, test hypotheses, and inform decision-making. This evidence-based approach helps to ensure that decisions are based on objective, measurable data rather than subjective opinions or assumptions.
  • Accuracy and reliability: Research data is typically collected using rigorous scientific methods and protocols, which helps to ensure its accuracy and reliability. Data can be validated and verified using statistical methods, which further enhances its credibility.
  • Replicability: Research data can be replicated and validated by other researchers, which helps to promote transparency and accountability in research. By making data available for others to analyze and interpret, researchers can ensure that their findings are robust and reliable.
  • Insights and discoveries: Research data can provide insights into complex relationships between variables, identify patterns and trends, and reveal new discoveries. These insights can lead to the development of new theories, treatments, and interventions that can improve outcomes in various fields.
  • Informed decision-making: Research data can inform decision-making in a range of fields, including healthcare, business, education, and public policy. Data analysis techniques can be used to identify trends, evaluate the effectiveness of interventions, and inform policy decisions.
  • Efficiency and cost-effectiveness: Research data can help to improve efficiency and cost-effectiveness by identifying areas where resources can be directed most effectively. By using data to identify the most promising approaches or interventions, researchers can optimize the use of resources and improve outcomes.

Limitations of Research Data

Research data has several limitations that researchers should be aware of, including:

  • Bias and subjectivity: Research data can be influenced by biases and subjectivity, which can affect the accuracy and reliability of the data. Researchers must take steps to minimize bias and subjectivity in data collection and analysis.
  • Incomplete data: Research data can be incomplete or missing, which can affect the validity of the findings. Researchers must check that data is complete and representative so that their findings are reliable.
  • Limited scope: Research data may be limited in scope, which can limit the generalizability of the findings. Researchers must carefully consider the scope of their research and ensure that their findings are applicable to the broader population.
  • Data quality: Research data can be affected by issues such as measurement error, data entry errors, and missing data, which can affect the quality of the data. Researchers must ensure that data is collected and analyzed using rigorous methods to minimize these issues.
  • Ethical concerns: Research data can raise ethical concerns, particularly when it involves human subjects. Researchers must ensure that their research complies with ethical standards and protects the rights and privacy of human subjects.
  • Data security: Research data must be protected to prevent unauthorized access or use. Researchers must ensure that data is stored and transmitted securely to protect the confidentiality and integrity of the data.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Research Data: Finding research data

What is research data?

Research data is information that has been collected or created to validate research findings. It can be numerical, descriptive or visual and raw or analysed.

Some examples of research data are laboratory notebooks, data files, questionnaires and interview transcripts, photographs and slides.

Why use research data?

  • You can save time and effort by reusing data
  • You can reuse data in new ways to carry out further research

Remember to check the licence terms before you reuse data and always cite your source.

Research Data at Strathclyde

  • University of Strathclyde's Knowledgebase - browse or search for data sets in University of Strathclyde publications.
  • Strathprints - browse or search for data sets in open access publications from the University of Strathclyde.

Databases - Research data

You can filter database search results to show datasets.

For example, Web of Science Core Collection has links from search results to articles with data sets. Choose to Filter your search results by Associated Data.

Research data repositories

Research data repositories store and manage research data. The simplest way to find a repository is to use a directory/registry service.

  • Registry of Research Data Repositories (re3data) - 'a global registry of research data repositories that covers research data repositories from different academic disciplines.'
  • The Open Access Directory - 'This is a list of repositories and databases for open data.'
  • DataCite - find and use research data. Provides DOIs.
  • Figshare - makes available research outputs including datasets.
  • Zenodo - 'an open dissemination research data repository' provided by CERN.

Government sources - Research data

  • Office for National Statistics - official statistics and datasets.
  • UK government data - data from UK government departments, agencies, public bodies and local authorities.

Health data

  • Public Health Scotland - Scottish health service data. Previously Information Services Division (ISD).
  • Image Data Resource (IDR) - contains multidimensional life sciences image data and 'seeks to store, integrate and serve image datasets from published scientific studies.'

General search engines - Research data

  • Dataset Search - Google search tool to find datasets.

Citing Data Sets

  • Ball, A. & Duke, M. (2015). ‘How to Cite Datasets and Link to Publications’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides
  • EndNote, reference management software, has a datasets reference type.
  • Last Updated: Sep 27, 2023 5:18 PM
  • URL: https://guides.lib.strath.ac.uk/research_data

Effective March 14, 2024, HHS Virtual Library customers will have a streamlined login, with a common URL for all HHS customers using a PIV card. IHS and other HHS customers without a PIV card will continue to log in using temporary credentials provided by the NIH Library. Further guidance is available here.

Resources for Finding and Sharing Research Data

This one-hour introductory class provides researchers with an overview of online resources for locating research datasets, data repositories, and data publications for data sharing and re-use. Participants will learn search strategies for locating datasets through federated data search portals and generalist data repositories, including directories for locating discipline-specific and institutional data repositories. An overview of key issues to consider when re-using datasets or when locating a data repository for sharing and preservation purposes will be discussed.

Related online tutorials

  • How to Use the Cochrane Library
  • NLM Resources for Nurses: Drug Info
  • NLM Resources for Nurses: Patient Education
  • NLM Resources For Nurses: Research
  • Using PubMed in Evidence-Based Practice

Other Related Content

  • Data Services
  • Electronic Lab Notebooks
  • Find Your Librarian
  • Finding Datasets, Data Repositories, and Data Standards
  • Statistical Support

  • Finding Data

Consultation, full service (HLS, Baker), and referrals for locating sources of research data (e.g. Library subscriptions, government sponsor, repository).

Eligibility information is outlined below based on providers with offerings that are available to the entire Harvard community or a specific unit/appointment. 

University-wide

Harvard College Library, Services for Academic Programs

Harvard College Library (SAP) offers consultations and referrals to data sources and data repositories with a particular focus on the social sciences and business/economic, public opinion, and demographic data. Consultations include locating data, recommending search strategies, and providing instruction sessions on finding data.

All Harvard community; focus on FAS undergraduates, graduate students, and faculty

Service Provider

Harvard College Library, Services for Academic Programs (SAP)

Service Fee

Service Website

https://library.harvard.edu/collections/data-and-government-information-collections

Contact Information

Hugh Truslow and Diane Sredi [email protected]

Fine Arts Library

Fine Arts Library provides consultations, instruction, and referrals for finding image data sources and image databases.

  • All Faculty
  • All Graduate Students
  • School Undergraduates
  • All Affiliates
  • Departments

Jessica Evans Brady:  [email protected]

Harvard Map Collection

Harvard Map Collection has a network of vendors that specialize in maps and data from around the world.

All Harvard Affiliates

[email protected]

Unit/Appointment-specific

Baker Library (Harvard Business School)

Baker Library provides full data finding services for HBS faculty (fee) and free consultations to School Faculty and Students. This includes data that is freely available, available at Harvard, needs to be purchased, or needs to be created from scratch.

  • School Faculty
  • School Graduate Students

Baker Library

Yes for HBS Faculty

https://www.library.hbs.edu/Services/Finding-Data

Alex Caracuzzo:  [email protected]

Countway Library (Longwood Medical Area)

  • School Affiliates

Countway Library

https://countway.harvard.edu/services/publishing-data-services/data-services

Julie Goldman:  [email protected]

Gutman Library (Harvard Graduate School of Education)

Gutman Library

http://asklib.gse.harvard.edu/q.php

Harvard Kennedy School

  • Graduate School Fellows

Harvard Kennedy School Library

https://guides.library.harvard.edu/hks/data_resources

[email protected]

Harvard Law School

  • School Faculty and School Graduate Students (all services)
  • All Harvard Affiliates (consultation)

Harvard Law School Library

http://asklib.law.harvard.edu

Michelle Pearse:  [email protected]

  • Research Administration & Compliance
  • Research Computing
  • Archiving Faculty Research Data and Archiving Data
  • Buying and Licensing Data
  • Copyright and Intellectual Property
  • DASH Open-Access Repository
  • Data Cleaning
  • Data Curation
  • Data Handling
  • Data Retrieval
  • Data Sharing and Publishing
  • Data Visualization
  • Dataset Creation
  • Electronic Lab Notebooks (ELNs)
  • Geospatial Data
  • Harvard Dataverse Repository
  • Longwood Health Informationist
  • Metadata Creation
  • Qualitative Data
  • Research Data Management Lifecycle
  • Research Design
  • Software & Platforms
  • Text Analysis
  • Training, Workshops & Capacity Building
  • Active Research
  • Dissemination & Preservation

Trusted by more than 23,000 companies

Trending statistics

Get facts and insights on topics that matter

Mar 28, 2024 | Mobile Internet & Apps

Estimated value of digital companies at SPAC merger 2024

On March 26, 2024, Truth Social entered the public market via the SPAC merger of Trump Media & Technology Group with the Digital World Acquisition Corp, at a valuation of eight billion U.S. dollars. The U.S.-based mortgage company Better.com, which went public via SPAC merger with the Aurora Acquisition Corp. in August 2023, enjoyed the implied valuation of almost seven billion U.S. dollars, despite having seen its profits deteriorate in the previous year.

SPAC mergers in the alt-tech scene

SPAC mergers have become an increasingly common way for non-mainstream social media companies and alt-tech platforms to enter the public market. The number of Truth Social app downloads in the United States barely reached 100 thousand in February 2024. Despite this, Truth Social had a grandiose market debut, peaking at almost 80 U.S. dollars per share. Launched in February 2022, Truth Social is a microblogging platform that proposes to uphold freedom of expression by giving voice to the alt-right political figures in the American public debate. Similarly, online video platform Rumble, which monetizes content from creators such as Andrew Tate and Russell Brand, chose to go public via a SPAC merger with CF Acquisition Corp VI on September 15, 2022. The quarterly revenue generated by Rumble was almost 18 million U.S. dollars in the third quarter of 2023, down by 30 percent compared to the previous period. In comparison, the hours of uploaded video content on Rumble keep climbing to increasingly high output results, a sign that content creators have been using the platform more than in previous years.

Latest social media IPOs

March 2024 marked the resurgence of social media IPOs. On March 21, 2024, social forum Reddit began officially trading on the New York exchange. The platform, which was launched in 2005, first expressed its wish to go public in December 2021. Reddit was valued at 6.5 billion U.S. dollars in its pre-IPO stand, but managed to surpass a 10 billion valuation after its market debut. Within the ranking of estimated valuation of selected social media platforms at their entrance into the public market, Reddit positions itself in the mid to low tier of the scale. Former president Donald Trump’s Truth Social, which went public via a SPAC merger in the same month, enjoyed a valuation of approximately eight billion U.S. dollars. The largest social media IPO recorded in the last 15 years was Facebook, which went public at a valuation of 104 billion U.S. dollars. Among the companies that have expressed the will to go public, communication and social networking platform Discord Inc. stands out thanks to its estimated 2022 valuation at 10 billion U.S. dollars.

Apr 1, 2024 | National Security

Geopolitical risk index 1985-2024

Since the monthly counting of the Geopolitical Risk Index (GPR) started in 1985, the index peaked in October 2001, immediately after the 9/11 terrorist attack on the World Trade Center and Pentagon in the United States. The attack is perceived to be the deadliest terrorist attack in the 20th and 21st century, and ultimately caused the start of the so-called war on terror, with U.S. invasions of Afghanistan (2001) and Iraq (2003) following in the aftermath.

The GPR was also high in March 2022 following Russia's invasion of Ukraine at the end of February that year. The attack on an independent state meant that the relations between Russia and the West reached a new low after the collapse of the Soviet Union, and several sanctions were imposed on Russia.

Apart from the 9/11 attacks in 2001, the index reached its highest level in January 1991. This was a result of the ongoing Gulf War following Iraq's invasion of Kuwait, but also Soviet troops storming the Lithuanian capital in order to stop the country's secession from the Soviet Union. Additionally, a massacre of Tutsi in Rwanda highlighted the growing tensions in the East African country, which ultimately resulted in the genocide in 1994.

Apr 2, 2024 | Apps

Monthly global downloads of Temu shopping app 2022-2024

The popularity of ecommerce platform Temu has been surging since its debut in the fall of 2022. In March 2024, the app was downloaded over 41 million times all over the world, making it more popular than Amazon’s marketplace app.

Temu, which is owned by the Chinese online retailer PDD Holdings, has successfully replicated the meteoric growth of its sister app Pinduoduo in overseas markets through effective marketing campaigns. Focusing on providing low-cost products with free and fast shipping, Temu has emerged as a wallet-saving alternative amidst rising inflation. The newcomer has also followed the playbook of Pinduoduo, such as gamification features and personalized purchase recommendations, to make shopping on mobile more fun.

These strategies work. In the first five months of 2023, Temu generated over 1.5 billion U.S. dollars in gross merchandise volume. It has caught the eye of inflation-weary shoppers in the West, particularly young people in the United States and Mexico. In April 2023, Temu achieved its first milestone of over 100 million active users in the United States.  

Mar 26, 2024 | Environmental Technology & Greentech

Global commercial carbon sequestration projects 2024, by major country

The United States had the highest number of projects in the commercial carbon capture and storage (CCS) facilities pipeline worldwide as of March 2024. There were 149 commercial CCS projects in the U.S. in 2024, 16 of which were operational. Canada and the United Kingdom followed, with 49 and 48 CCS projects in the pipeline, respectively.

Mar 27, 2024 | Elections

U.S. top presidential candidates for 2024 election January 2024, by age

As of March 2024, approximately 26 percent of Americans between the ages of 18 and 34 reported that they would vote independently in the 2024 presidential elections if Joe Biden was the Democratic candidate and Donald Trump the Republican candidate. Of those, 21 percent said they would vote for Robert F. Kennedy Jr.

Mar 26, 2024 | Demographics

Ranking of the 22 richest people in China as of 2024

As of January 2024, Zhong Shanshan topped the list of the richest people in China with a net worth of 63 billion U.S. dollars. Huang Zheng, founder of Pinduoduo, and Ma Huateng, founder of the IT giant Tencent, came in second and third respectively, while Ma Yun, founder of the IT giant Alibaba, fell back to the tenth place.

Ultra-high net worth individuals (UHNWI) in China

Net worth refers to the amount of value by which an individual’s assets exceed their liabilities. It is usually cited to demonstrate the economic position of a person. Following China’s extensive economic development over the past two decades, the number of wealthy people had been rapidly growing as well. According to Hurun Research Institute, Greater China was the region with the largest number of billionaires worldwide as of 2024, with a total number of 814 billionaires. As of January 2022, the number of millionaires had amounted to approximately 20,400 people in Beijing alone. Unsurprisingly, the majority of high-net worth individuals lives in one of the four first-tier cities Beijing, Shanghai, Guangzhou, and Shenzhen.

Chinese billionaire's sources of wealth

Chinese millionaires have accumulated their wealth primarily as private entrepreneurs. Most of the people listed among the 20 wealthiest Chinese in 2024 had owned their own companies. Zhong Shanshan, who topped the list of richest people in China in 2024, has made his fortune as founder of the beverage company Nongfu Spring.

Mar 27, 2024 | B2C E-Commerce

India: top 10 online stores

Ajio.com is leading the Indian e-commerce market, with e-commerce net sales of over 2.3 billion U.S. dollars generated in India, followed by jiomart.com with over 1.6 billion in sales. Third place is taken by reliancedigital.in with revenues of nearly 1.5 billion USD. For an extended ranking as well as rankings in specific product categories, please visit ecommerceDB.com. The eCommerceDB provides detailed information for over 30,000 online stores in more than 50 countries, including detailed revenue analytics, competitor analysis, market development, marketing budget, and interesting KPIs, such as traffic, shipping providers, payment options, social media activity and many more.

Mar 27, 2024 | Stocks

Euronext: market capitalization of largest companies 2024

As of March 2024, the largest company by market capitalization listed on the Euronext stock exchange was the French luxury goods company LVMH, which consists of Louis Vuitton, Moët and Hennessy. At that time, the company's market capitalization was around 410 billion euros; in second place was the Dutch company ASML Holdings with nearly 360 billion euros.

Mar 28, 2024 | Electricity

Household electricity prices in Africa 2023, by country

Cape Verde recorded the highest electricity price for households in Africa. As of June 2023, one kilowatt hour cost around 0.31 U.S. dollars in the country. Rwanda and Mali followed, with households paying 0.23 and 0.22 U.S. dollars per kilowatt hour, respectively. Burkina Faso, Gabon, and Kenya also recorded expensive prices for electricity on the continent. On the other hand, Libya, Ethiopia, and Sudan registered the lowest prices for electric energy in Africa.

Mar 27, 2024 | Key Economic Indicators

Inflation rate Australia Q1 2021-Q4 2023

The inflation rate in Australia was at 4.1 percent as of the fourth quarter of 2023. This was a decline of 3.7 percentage points from the high of 7.8 percent in the fourth quarter of 2022. 
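
The arithmetic behind that comparison can be checked with a short Python sketch. The two rates are taken from the paragraph above; the helper function names are hypothetical, and the relative change is shown only to contrast it with the percentage-point difference.

```python
def percentage_point_change(old_rate: float, new_rate: float) -> float:
    """Difference between two rates that are already expressed in percent."""
    return new_rate - old_rate


def relative_change_percent(old_rate: float, new_rate: float) -> float:
    """Change as a percentage of the old rate."""
    return (new_rate - old_rate) / old_rate * 100


q4_2022 = 7.8  # inflation rate in percent, fourth quarter of 2022
q4_2023 = 4.1  # inflation rate in percent, fourth quarter of 2023

pp = percentage_point_change(q4_2022, q4_2023)
rel = relative_change_percent(q4_2022, q4_2023)

print(f"{pp:.1f} percentage points, {rel:.1f}% relative change")
```

The percentage-point figure matches the 3.7-point decline quoted in the text; the relative change is a different measure and should not be confused with it.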

Popular topics

Starting point of your research.

  • Agriculture
  • Pharmaceuticals
  • Advertising
  • Video Games
  • Virtual Reality
  • European Union
  • United Kingdom

Market Insights

Discover data on your market

Gain access to valuable and comparable market data for more than 190 countries, territories, and regions with our Market Insights. Get deep insights into important figures, e.g., revenue metrics, key performance indicators, and much more.

All Market Insights topics at a glance

Statista accounts

Our complete solutions, basic account, get to know us.

  • Access to basic statistics
  • Download as PDF & PNG

Starter Account

The ideal entry-level account for individual users.

  • Full access to all statistics
  • Stand-alone license

Professional Account

Our company solution.

  • All functions of the Starter Account
  • Access to dossiers, forecasts, studies

*All products require an annual contract. Prices do not include sales tax.

Global stories vividly visualized

Consumer insights, understand what drives consumers.

The Consumer Insights helps marketers, planners and product managers to understand consumer behavior and their interaction with brands. Explore consumption and media usage on a global basis.

Our Service

Save time & money with statista, trusted content.

With an increasing number of Statista-cited media articles, Statista has established itself as a reliable partner for the largest media companies of the world.

Industry expertise

Over 500 researchers and specialists gather and double-check every statistic we publish. Experts provide country and industry-based forecasts.

Flatrate access

With our solutions you find data that matters within minutes – ready to go in your favorite format.

Mon - Fri, 9am - 6pm (EST)

Mon - Fri, 9am - 5pm (SGT)

Mon - Fri, 10:00am - 6:00pm (JST)

Mon - Fri, 9:30am - 5pm (GMT)

Find research data

Search results

The researcher journey through a gender lens.

Jayabalasingham, B. (Creator), Kuiper-Hoyng, L. (Creator), Zhang, J. (Creator), Roberge, G. (Creator) & Collins, T. (Creator), Mendeley Data, Mar 6 2020

DOI : 10.17632/ww6g4t2r32.1 , https://data.mendeley.com/datasets/ww6g4t2r32/1

An Open Access Corpus of Scientific, Technical, and Medical Content

Daniel, R. (Creator), Groth, P. (Creator), Scerri, A. (Creator), Harper, C. A. (Creator), Vandenbussche, P. (Creator) & Cox, J. (Creator), Github, 2015

https://github.com/elsevierlabs/OA-STM-Corpus

Elsevier OA CC-BY Corpus

Kershaw, D. (Creator) & Koeling, R. (Creator), Mendeley Data, Aug 4 2020

DOI : 10.17632/zm33cdndxs.2

Project: Fostering Transparent and Responsible Conduct of Research: What can Journals do?

Malički, M. (Creator), Mendeley Data, Aug 1 2019

DOI : 10.17632/53cskwwpdn.4

Optimised Machine Learning Methods Predict Discourse Segment Type in Biological Research Articles

Cox, J. (Creator) & Harper, C. A. (Creator), Mendeley Data, Mar 23 2018

DOI : 10.17632/tds3k5kyvg.1

Evaluating Open Information Extraction on Scientific and Medical Text

Groth, P. (Creator), Lauruhn, M. (Creator), Scerri, A. (Creator) & Daniel, R. (Creator), Mendeley Data, Feb 16 2018

DOI : 10.17632/6m5dyx4b58.2

Elsevier's data and code for the bioCADDIE 2016 Dataset Retrieval Challenge

Cotroneo, P. (Creator), Mendeley Data, Jun 5 2017

DOI : 10.17632/zd9dxpyybg.1

Data for: Build it and they will come: The convening power of the SOLEIL Synchrotron facility

Plume, A. M. (Creator), Mendeley Data, 2020

DOI : 10.17632/td4k3rjmm7.1

Media and Cell Line Dictionaries

Cox, J. (Creator), Mendeley Data, 2020

DOI : 10.17632/3nnsyxdsvd.1

Data for: Fractional Authorship & Publication Productivity

Herbert, R. (Creator), Mendeley Data, 2019

DOI : 10.17632/3rvsv3zvvy.1

ChemTables Sample: dataset for table classification in chemical patents

Thorne, C. (Creator), Akhondi, S. (Creator), Druckenbrodt, C. (Creator), Zhai, Z. (Creator), Verspoor, K. (Creator), Cohn, T. (Creator), Nguyen, D. Q. (Creator) & Eustratiadis, P. (Creator), Mendeley Data, Nov 4 2020

DOI : 10.17632/g7tjh7tbrj.1

Project: Preprint Observatory

Malički, M. (Creator), Mendeley Data, 2020

DOI : 10.17632/zrtfry5fsd.4 , https://data.mendeley.com/datasets/zrtfry5fsd/4

Automatic identification of relevant chemical compounds from patents. The training corpus.

Akhondi, S. (Creator), Mendeley Data, 2019

DOI : 10.17632/6hykykmn65.1

3D reconstruction data used to perform selection task

Zudilova-Seinstra, E. (Creator), Mendeley Data, Feb 13 2020

DOI : 10.17632/j6sdxytt46.10 , https://data.mendeley.com/datasets/j6sdxytt46/10

Data for report "Artificial Intelligence: How knowledge is created, transferred, and used"

Hellwig, J. (Creator), Huggett, S. (Creator) & Siebert, M. (Creator), Mendeley Data, Sep 15 2019

DOI : 10.17632/7ydfs62gd6.2 , https://data.mendeley.com/datasets/7ydfs62gd6/2

Citation Contexts for Nobel Prize Winning Papers

Cox, J. (Creator), Kohler, C. (Creator) & Groth, P. (Creator), Mendeley Data, Apr 13 2018

DOI : 10.17632/g75gcpp49k.1

Core ECL Machine Learning Library

Suryanarayana, A. (Creator), Github, Aug 7 2019

https://github.com/suryanarayanan21/ML_Core

Image Integrity Database

Seadle, M. (Creator), Elsevier, 2019

https://headt.eu/Image-Integrity-Database/

ChEMU dataset for information extraction from chemical patents

Verspoor, K. (Creator), Nguyen, D. Q. (Creator), Akhondi, S. (Creator), Druckenbrodt, C. (Creator), Thorne, C. (Creator), Hoessel, R. (Creator), He, J. (Creator) & Zhai, Z. (Creator), Mendeley Data, 2020

DOI : 10.17632/wy6745bjfj.2

Analysis of research data for 11 Institutions - Data Monitor

Zudilova-Seinstra, E. (Creator), Zigoni, A. (Creator) & Haak, W. (Creator), Mendeley Data, 2020

DOI : 10.17632/k5p45z33kb.3

Data for: Accept me, accept me not: What do journal acceptance rates really mean?

Herbert, R. (Creator), Mendeley Data, 2020

DOI : 10.17632/rpb526yhyx.1

Open Access, Research Data Management (RDM) and Open Science in Poland 2019

Boudova, L. (Creator) & Moreira, R. (Creator), Mendeley Data, 2019

DOI : 10.17632/nyzygp93zh.1

Trust and peer review

Deakin, G. (Creator), Mulligan, A. (Creator), Brown, T. (Creator) & Jesper-Mir, E. (Creator), Mendeley Data, 2019

DOI : 10.17632/wkd3jmm7mf.2

Data for aggregate statistics in "Hundreds of extreme self-citing scientists revealed in new database"

Baas, J. (Creator), Ioannidis, J. (Creator), Klavans, R. (Creator) & Boyack, K. (Creator), Mendeley Data, 2019

DOI : 10.17632/gw684hwcyb.1

ChEMU-Ref dataset for Modeling Anaphora Resolution in the Chemical Domain.

Fang, B. (Creator), Druckenbrodt, C. (Creator), Yeow, C. (Creator), Novakovic, S. (Creator), Hoessel, R. (Creator), Akhondi, S. (Creator), He, J. (Creator), Mistica, M. (Creator), Baldwin, T. (Creator) & Verspoor, K. (Creator), Mendeley Data, 2021

DOI : 10.17632/r28xxr6p92.1

  • Open access
  • Published: 28 March 2024

Using the Consolidated Framework for Implementation Research to integrate innovation recipients’ perspectives into the implementation of a digital version of the spinal cord injury health maintenance tool: a qualitative analysis

  • John A Bourke 1,2,3,
  • K. Anne Sinnott Jerram 1,2,
  • Mohit Arora 1,2,
  • Ashley Craig 1,2 &
  • James W Middleton 1,2,4,5

BMC Health Services Research, volume 24, Article number: 390 (2024)

Despite advances in managing secondary health complications after spinal cord injury (SCI), challenges remain in developing targeted community health strategies. In response, the SCI Health Maintenance Tool (SCI-HMT) was developed between 2018 and 2023 in NSW, Australia to support people with SCI and their general practitioners (GPs) to promote better community self-management. Successful implementation of innovations such as the SCI-HMT is determined by a range of contextual factors, including the perspectives of the innovation recipients whom the innovation is intended to benefit, who are rarely included in the implementation process. During the digitizing of the booklet version of the SCI-HMT into a website and App, we used the Consolidated Framework for Implementation Research (CFIR) as a tool to guide collection and analysis of qualitative data from a range of innovation recipients to promote equity and to inform actionable findings designed to improve the implementation of the SCI-HMT.

Data from twenty-three innovation recipients in the development phase of the SCI-HMT were coded to the five CFIR domains to inform a semi-structured interview guide. This interview guide was used to prospectively explore the barriers and facilitators to planned implementation of the digital SCI-HMT with six health professionals and four people with SCI. A team including researchers and innovation recipients then interpreted these data to produce a reflective statement matched to each domain. Each reflective statement prefaced an actionable finding, defined as alterations that can be made to a program to improve its adoption into practice.

Five reflective statements were created, each synthesizing all participant data and linked to an actionable finding to improve the implementation plan. Using the CFIR to guide our research emphasized how partnership is the key theme connecting all implementation facilitators, for example ensuring that the tone, scope, content and presentation of the SCI-HMT balanced the needs of innovation recipients alongside the provision of evidence-based clinical information.

Conclusions

Understanding recipient perspectives is an essential contextual factor to consider when developing implementation strategies for healthcare innovations. The revised CFIR provided an effective, systematic method to understand, integrate and value recipient perspectives in the development of an implementation strategy for the SCI-HMT.

Trial registration

Injury to the spinal cord can occur through traumatic causes (e.g., falls or motor vehicle accidents) or from non-traumatic disease or disorder (e.g., tumours or infections) [ 1 ]. The onset of a spinal cord injury (SCI) is often sudden, yet the consequences are lifelong. The impact of a SCI is devastating, with effects on sensory and motor function, bladder and bowel function, sexual function, level of independence, community participation and quality of life [ 2 ]. In order to maintain good health, wellbeing and productivity in society, people with SCI must develop self-management skills and behaviours to manage their newly acquired chronic health condition [ 3 ]. Given the increasing emphasis on primary health care and community management of chronic health conditions, like SCI, there is a growing responsibility on all parties to promote good health practices and minimize the risks of common health complications in their communities.

To address this need, the Spinal Cord Injury Health Maintenance Tool (SCI-HMT) was co-designed between 2018 and 2023 with people living with SCI and their General Practitioners (GPs) in NSW, Australia [ 4 ]. The aim of the SCI-HMT is to support self-management of the most common and arguably avoidable potentially life-threatening complications associated with SCI, such as mental health crises, autonomic dysreflexia, kidney infections and pressure injuries. The SCI-HMT provides comprehensible information with resources about the six highest priority health areas related to SCI (as indicated by people with SCI and GPs) and was developed over two phases. Phase 1 focused on developing a booklet version and Phase 2 focused on digitizing this content into a website and smartphone app [ 4 , 5 ].

Enabling the successful implementation of evidence-based innovations such as the SCI-HMT is inevitably influenced by contextual factors: the dynamic and diverse array of forces within real-world settings working for or against implementation efforts [ 6 ]. Contextual factors often include background environmental elements in which an intervention is situated, for example (but not limited to) demographics, clinical environments, organisational culture, legislation, and cultural norms [ 7 ]. Understanding the wider context is necessary to identify and potentially mitigate various challenges to the successful implementation of those innovations. Such work is the focus of determinant frameworks, which categorise or class groups of contextual determinants that are thought to predict or demonstrate an effect on implementation effectiveness, in order to better understand factors that might influence implementation outcomes [ 8 ].

One of the most highly cited determinant frameworks is the Consolidated Framework for Implementation Research (CFIR) [ 9 ], which is often posited as an ideal framework for pre-implementation preparation. Originally published in 2009, the CFIR has recently been subject to an update by its original authors, which included a literature review, survey of users, and the creation of an outcome addendum [ 10 , 11 ]. A key contribution from this revision was the need for a greater focus on the place of innovation recipients, defined as the constituency that the innovation is designed to benefit; for example, patients receiving treatment or students receiving a learning activity. Traditionally, innovation recipients are rarely positioned as key decision-makers or innovation implementers [ 8 ], and as a consequence, have not often been included in the application of research using frameworks, such as the CFIR [ 11 ].

Such power imbalances within the intersection of healthcare and research, particularly between those receiving and delivering such services and those designing such services, have been widely reported [ 12 , 13 ]. There are concerted efforts within health service development, health research and health research funding, to rectify this power imbalance [ 14 , 15 ]. Importantly, such efforts to promote increased equitable population impact are now being explicitly discussed within the implementation science literature. For example, Damschroder et al. [ 11 ] have recently argued for researchers to use the CFIR to collect data from innovation recipients, and that, ultimately, “equitable population impact is only possible when recipients are integrally involved in implementation and all key constituencies share power and make decisions together” (p. 7). Indeed, increased equity between key constituencies and partnering with innovation recipients promotes the likelihood of sustainable adoption of an innovation [ 4 , 12 , 14 ].

There is a paucity of work using the updated CFIR to include and understand innovation recipients’ perspectives. To address this gap, this paper reports on a process of using the CFIR to guide the collection of qualitative data from a range of innovation recipients within a wider co-design mixed methods study examining the development and implementation of SCI-HMT. The innovation recipients in our research are people living with SCI and GPs. Guided by the CFIR domains (shown in the supplementary material), we used reflexive thematic analysis [ 16 ] to summarize data into reflective summaries, which served to inform actionable findings designed to improve implementation of the SCI-HMT.

The procedure for this research is multi-stepped and is summarized in Fig. 1. First, we mapped retrospective qualitative data collected during the development of the SCI-HMT [ 4 ] against the five domains of the CFIR in order to create a semi-structured interview guide (Step 1). Then, we used this interview guide to collect prospective data from health professionals and people with SCI during the development of the digital version of the SCI-HMT (Step 2) to identify implementation barriers and facilitators. This enabled us to interpret a reflective summary statement for each CFIR domain. Lastly, we developed an actionable finding for each domain summary. The first (RESP/18/212) and second phase (2019/ETH13961) of the project received ethical approval from The Northern Sydney Local Health District Human Research Ethics Committee. The reporting of this study was conducted in line with the Consolidated Criteria for Reporting Qualitative Research (COREQ) guidelines [ 17 ]. All methods were performed in accordance with the relevant guidelines and regulations.

Figure 1. Procedure of synthesising datasets to inform reflective statements and actionable findings. Notes: (a) two health professionals had a SCI (one being JAB); (b) two co-design researchers had a SCI (one being JAB).

Step one: retrospective data collection and analysis

We began by retrospectively analyzing the data set (interview and focus group transcripts) from the previously reported qualitative study from the development phase of the SCI-HMT [ 4 ]. This analysis was undertaken by two team members (KASJ and MA). KASJ has a background in co-design research. Transcript data were uploaded into NVivo software (Version 12: QSR International Pty Ltd) and a directed content analysis approach [ 18 ] was applied to categorize the data a priori according to the original 2009 CFIR domains (intervention characteristics, outer setting, inner setting, characteristics of individuals, and process of implementation) described by Damschroder et al. [ 9 ]. These categorized data were summarized and informed the specific questions of a semi-structured interview guide. The final output of step one was an interview guide with context-specific questions arranged according to the CFIR domains (see supplementary file 1). The interview guide was tested with two people with SCI and one health professional.

Step two: prospective data collection and analysis

In the second step, semi-structured interviews were conducted by KASJ (with MA as observer) with consenting healthcare professionals who had previously contributed to the development of the SCI-HMT. Healthcare professionals included GPs, Nurse Consultants, Specialist Physiotherapists, along with Health Researchers (one being JAB). In addition, a focus group was conducted with consenting individuals with SCI who had contributed to the SCI-HMT design and development phase. The interview schedule designed in step one above guided data collection in all interviews and the focus group.

The focus group and interviews were conducted online, audio recorded, transcribed verbatim and uploaded to NVivo software (Version 12: QSR International Pty Ltd). All data were subject to reflexive, inductive and deductive thematic analysis [ 16 , 19 ] to better understand participants’ perspectives regarding the potential implementation of the SCI-HMT. First, one team member (KASJ) read transcripts and began a deductive analysis whereby data were organized into a CFIR domain-specific dataset. Second, KASJ and JAB analyzed this domain-specific dataset to inductively interpret a reflective statement which served to summarise all participant responses to each domain. The final output of step two was a reflective summary statement for each CFIR domain.

Step three: data synthesis

In the third step we aimed to co-create an actionable finding (defined as a tangible alteration that can be made to a program, in this case the SCI-HMT [ 20 ]) based on each domain-specific reflective statement. To achieve this, three co-design researchers (KASJ and JAB with one person with SCI from Step 2 (deidentified)) focused on operationalising each reflective statement into a recommended modification for the digital version of the SCI-HMT. This was an iterative process guided by the specific CFIR domain and construct definitions, which we deemed salient and relevant to each reflective statement (see Table 2 for example). Data synthesis involved line by line analysis, group discussion, and repeated refinement of actionable findings. A draft synthesis was shared with SCI-HMT developers (JWM and MA) and refinement continued until consensus was reached. The final output of step three was an actionable finding related to each reflective statement for each CFIR domain.

The characteristics of both the retrospective and prospective study participants are shown in Table  1 . The retrospective data included data from a total of 23 people: 19 people with SCI and four GPs. Of the 19 people with SCI, 12 participated in semi-structured interviews, seven participated in the first focus group, and four returned to the second focus group. In step 2, four people with SCI participated in a focus group and six healthcare professionals participated in one-on-one semi-structured interviews. Two of the healthcare professionals (a GP and a registrar) had lived experience of SCI, as did one researcher (JAB). All interviews and focus groups were conducted either online or in-person and ranged in length between 60 and 120 min.

In our overall synthesis, we actively interpreted five reflective statements based on the updated CFIR domain and construct definitions by Damschroder et al. [ 11 ]. Table  2 provides a summary of how we linked the updated CFIR domain and construct definitions to the reflective statements. We demonstrate this process of co-creation below, including illustrative quotes from participants. Importantly, we guide readers to the actionable findings related to each reflective statement in Table  2 . Each actionable statement represents an alteration that can be made to a program to improve its adoption into practice.

Participants acknowledged that self-management is a major undertaking and very demanding; as one person with SCI said, “we need to be informed without being terrified and overwhelmed”. Participants felt the SCI-HMT could indeed be adapted, tailored, refined, or reinvented to meet local needs. For example, another person with SCI remarked:

“Education needs to be from the get-go but in bite sized pieces from all quarters when readiness is most apparent… at all time points, [not just as a] newbie tool or for people with [long-term impairment]” (person with SCI_02).

Therefore, the SCI-HMT had to balance complexity of content while still being accessible and engaging, and required input from both experts in the field and those with lived experience of SCI. For example, a clinical nurse specialist suggested:

“it’s essential [the SCI-HMT] is written by experts in the field as well as with collaboration with people who have had a, you know, the lived experience of SCI” (healthcare professional_03).

Furthermore, the points of contact with healthcare for a person with SCI can be challenging to navigate and the SCI-HMT has the potential to facilitate a smoother engagement process and improve communication between people with SCI and healthcare services. As a GP suggested:

“we need a tool like this to link to that pathway model in primary health care, [the SCI-HMT] it’s a great tool, something that everyone can read and everyone’s reading the same thing” (healthcare professional_05).

Participants highlighted that the ability of the SCI-HMT to facilitate effective communication was very much dependent on the delivery format. The idea of digitizing the SCI-HMT garnered equal support from people with SCI and health care professionals, with one participant with SCI deeming it to be “essential” (person with SCI_01) and a health professional suggesting a “digitalized version will be an advantage for most people” (healthcare professional_02).

Outer setting

There was strong interest expressed by both people with SCI and healthcare professionals in using the SCI-HMT. The fundamental premise was that knowledge is power and the SCI-HMT would have strong utility in post-acute rehabilitation services, as well as primary care. As a person with SCI said,

“we need to leave the [spinal unit] to return to the community with sufficient knowledge, and to know the value of that knowledge and then need to ensure primary healthcare provider[s] are best informed” (person with SCI_04).

The value of the SCI-HMT in facilitating clear and effective communication and shared decision-making between healthcare professionals and people with SCI was also highlighted, as shown by the remarks of an acute nurse specialist:

“I think this tool is really helpful for the consumer and the GP to work together to prioritize particular tests that a patient might need and what the regularity of that is” (healthcare professional_03).

Engaging with SCI peer support networks to promote the SCI-HMT was considered crucial, as one person with SCI emphasized when asked how the SCI-HMT might be best executed in the community: “…peers, peers and peers” (person with SCI_01). Furthermore, the layering of content made possible in the digitalized version will help address approachability in terms of readiness for change, as another person with SCI said:

“[putting content into a digital format] is essential and required and there is a need to put summarized content in an App with links to further web-based information… it’s not likely to be accessed otherwise” (person with SCI_02).

Inner setting

Participants acknowledged that self-management of health and well-being is substantial and demanding. It was suggested that the scope, tone, and complexity of the SCI-HMT, while necessary, could potentially be resisted by people with SCI if they felt overwhelmed, as one person with SCI described:

“a manual that is really long and wordy, like, it’s [a] health metric… they maybe lack the health literacy to, to consume the content then yes, it would impede their readiness for [self-management]” (person with SCI_02).

Having support from their GPs was considered essential, and the SCI-HMT could enable GPs, who are under time pressure, to provide more effective health advice to their patients, as one GP said:

“We GP’s are time poor, if you realize then when you’re time poor you look quickly to say oh this is a patient tool - how can I best use this?” (healthcare professional_05).

Furthermore, health professional skills may be best used with the synthesis of self-reported symptoms, behaviors, or observations. A particular strength of a digitized version would be its ability to facilitate more streamlined communication between a person with SCI and their primary healthcare providers developing healthcare plans, as an acute nurse specialist reflected, “I think that a digitalized version is essential with links to primary healthcare plans” (healthcare professional_03).

Efficient communication with thorough assessment is essential to ensure serious health issues are not missed; the findings reinforce that the SCI-HMT is an educational tool, not a replacement for healthcare services. As a clinical nurse specialist commented, “remember, things will go wrong– people end up very sick and in acute care” (healthcare professional_02).

The SCI-HMT has the potential to provide a pathway to a ‘hope for better than now’, a hope to ‘remain well’ and a hope to ‘be happy’, as the informant with SCI (04) declared, “self-management is a long game, if you’re keeping well, you’ve got that possibility of a good life… of happiness”. Participants with SCI felt the tool needed to be genuine and

“acknowledge the huge amount of adjustment required, recognizing that dealing with SCI issues is required to survive and live a good life” (person with SCI_04).

However, there is a risk that an individual is completely overwhelmed by the scale of the SCI-HMT content and the requirement for lifelong vigilance. Careful attention and planning went into layering the information accordingly to support self-management as a ‘long game’, which one person with SCI reflected on in the following:

“the first 2–3 year [period] is probably the toughest to get your head around the learning stuff, because you’ve got to a stage where you’re levelling out, and you’ve kind of made these promises to yourself and then you realize that there’s no quick fix” (person with SCI_01).

It was decided that this could be achieved by providing concrete examples and anecdotes from people with SCI illustrating that a meaningful, healthy life is possible, and that good health is the bedrock of a good life with SCI.

There was universal agreement that the SCI-HMT is aspirational and that it has the potential to improve knowledge and understanding for people with SCI, their families, community workers/carers and primary healthcare professionals, as a GP remarked:

“[different groups] could just read it and realize, ‘Ahh, OK that’s what that means… when you’re doing catheters. That’s what you mean when you’re talking about bladder and bowel function or skin care’” (healthcare professional_04).

Despite the SCI-HMT providing an abundance of information and resources to support self-management, participants identified four gaps: (i) the priority issue of sexuality, including pleasure and identity, as one person with SCI remarked:

“sexuality is one of the biggest issues that people with SCI often might not speak about that often cause you know it’s awkward for them. So yeah, I think that’s a that’s a serious issue” (person with SCI_03).

(ii) consideration of the taboo nature of bladder and bowel topics for indigenous people, (iii) urgent need to ensure links for SCI-HMT care plans are compatible with patient management systems, and (iv) exercise and leisure as a standalone topic taking account of effects of physical activity, including impact on mental health and wellbeing but more especially for fun.

To ensure the longevity of the SCI-HMT, a partnership between people with SCI, SCI community groups and both primary and tertiary health services must be maintained. This partnership is needed to liaise with the relevant professional bodies, care agencies, funders, policy makers and tertiary care settings so that ongoing education and promotion of the SCI-HMT continue. One example is the delivery of ongoing training for healthcare professionals, both to increase the knowledge base of primary healthcare providers in relation to SCI and to promote use of the tools and resources through health communities. As a community nurse specialist suggested:

“improving knowledge in the health community… would require digital links to clinical/health management platforms” (healthcare professional_02).

In a similar vein, a GP suggested:

“our common GP body would have continuing education requirements… especially if it’s online, in particular for the rural, rural doctors who you know, might find it hard to get into the city” (healthcare professional_04).

The successful implementation of evidence-based innovations into practice is dependent on a wide array of dynamic and active contextual factors, including the perspectives of the recipients who are destined to use such innovations. Indeed, the recently updated CFIR has called for innovation recipient perspectives to be a priority when considering contextual factors [ 10 , 11 ]. Understanding and including the perspectives of those the innovation is being designed to benefit can promote increased equity and validation of recipient populations, and potentially increase the adoption and sustainability of innovations.

In this paper, we have presented research using the recently updated CFIR to guide the collection of innovation recipients’ perspectives (including people with SCI and GPs working in the community) regarding the potential implementation barriers and facilitators of the digital version of the SCI-HMT. Collected data were synthesized to inform actionable findings: tangible ways in which the SCI-HMT could be modified according to the domains of the CFIR (e.g., see Keith et al. [ 20 ]). It is important to note that we conducted this research using the original domains of the CFIR [ 9 ] prior to Damschroder et al. publishing the updated CFIR [ 11 ]. However, in our analysis we were able to align our findings to the revised CFIR domains and constructs; as Damschroder et al. [ 11 ] suggest, constructs can “be mapped back to the original CFIR to ensure longitudinal consistency” (p. 13).

One of the most poignant findings from our analyses was the need to ensure the content of the SCI-HMT balanced scientific evidence and clinical expertise with lived experience knowledge. This balance of clinical and experiential knowledge demonstrated genuine regard for lived experience knowledge, and created a more accessible, engaging, useable platform. For example, in the innovation and individual domains, the need to include lived experience quotes was immediately apparent once the perspective of people with SCI was included. It was highlighted that while the SCI-HMT will prove useful to many parties at various stages along the continuum of care following onset of SCI, there will be individuals who are overwhelmed by the scale of the content. That said, the layering of information facilitated by the digitalized version is intended to provide an ease of navigation through the SCI-HMT and enable a far greater sense of control over personal health and wellbeing. Further, despite concerns regarding e-literacy, the digitalized version of the SCI-HMT is seen as imperative for accessibility given the wide geographic diversity and recent COVID pandemic [ 21 ]. While there will be people who are challenged by the technology, the universally acceptable use of the internet is seen as less of a barrier than printed material.

The concept of partnership was also apparent within the data analysis focusing on the outer and inner setting domains. In the outer setting domain, our findings emphasized the importance of engaging with SCI community groups, as well as primary and tertiary care providers to maximize uptake at all points in time from the phase of subacute rehabilitation onwards. While the SCI-HMT is intended for use across the continuum of care from post-acute rehabilitation onwards, it may be that certain modules are more relevant at different times, and could serve as key resources during the hand over between acute care, inpatient rehabilitation and community reintegration.

Likewise, findings regarding the inner setting highlighted the necessity of a productive partnership between GPs and individuals with SCI to address the substantial demands of long-term self-management of health and well-being following SCI. Indeed, support is crucial, especially when self-management is the focus. This is particularly so in individuals living with complex disability following survival after illness or injury [ 22 ], where health literacy has been found to be a primary determinant of successful health and wellbeing outcomes [ 23 ]. For people with SCI, this tool potentially holds the most appeal when an individual is ready and has strong partnerships and supportive communication. This can enable potential red flags to be recognized earlier allowing timely intervention to avert health crises, promoting individual well-being, and reducing unnecessary demands on health services.

While the SCI-HMT is an educational tool and not meant to replace health services, findings suggest the current structure would lead nicely to having the conversation with a range of likely support people, including SCI peers, friends and family, GPs, community nurses, carers or on-line support services. The findings within the process domain underscored the importance of ongoing partnership between innovation implementers and a broad array of innovation recipients (e.g., individuals with SCI, healthcare professionals, family, funding agencies and policy-makers). This emphasis on partnership also addresses recent discussions regarding equity and the CFIR. For example, Damschroder et al. [ 11 ] suggest that innovation recipients are too often not included in the CFIR process, as the CFIR is primarily seen as a tool intended “to collect data from individuals who have power and/or influence over implementation outcomes” (p. 5).

Finally, we feel that our inclusion of innovation recipients’ perspectives presented in this article begins to address the notion of equity in implementation, whereby the inclusion of recipient perspectives in research using the CFIR both validates, and increases, the likelihood of sustainable adoption of evidence-based innovations, such as the SCI-HMT. We have used the CFIR in a pragmatic way with an emphasis on meaningful engagement between the innovation recipients and the research team, heeding the call from Damschroder et al. [ 11 ], who recently argued for researchers to use the CFIR to collect data from innovation recipients. Adopting this approach enabled us to give voice to innovation recipient perspectives and subsequently ensure that the tone, scope, content and presentation of the SCI-HMT balanced the needs of innovation recipients alongside the provision of evidence-based clinical information.

Our research is not without limitations. While our study was successful in identifying a number of potential barriers and facilitators to the implementation of the SCI-HMT, we did not test any implementation strategies to impact determinants, mechanisms, or outcomes. This will be the focus of future research on this project, which will investigate the impact of implementation strategies on outcomes. Focus will be given to the context-mechanism configurations which give rise to particular outcomes for different groups in certain circumstances [ 7 , 24 ]. A second potential concern is the relatively small sample size of participants, which may not allow for saturation and generalizability of the findings. However, both the significant impact of secondary health complications for people with SCI and the desire for a health maintenance tool have been established in Australia [ 2 , 4 ]. The aim of our study reported in this article was to achieve context-specific knowledge of a small sample that shares a particular mutual experience and represents a perspective, rather than a population [ 25 , 26 ]. We feel our findings can stimulate discussion and debate regarding participant-informed approaches to implementation of the SCI-HMT, which can then be subject to larger-sample studies to determine their generalisability, that is, their external validity. Notably, future research could examine the interaction between certain demographic differences (e.g., gender) of people with SCI and potential barriers and facilitators to the implementation of the SCI-HMT. Future research could also include the perspectives of other allied health professionals working in the community, such as occupational therapists. Lastly, while our research gave significant priority to recipient viewpoints, research in this space would benefit from ensuring innovation recipients are engaged as genuine partners throughout the entire research process from conceptualization to implementation.

Employing the CFIR provided an effective, systematic method for identifying recipient perspectives regarding the implementation of a digital health maintenance tool for people living with SCI. Findings emphasized the need to balance clinical and lived experience perspectives when designing an implementation strategy and facilitating strong partnerships with necessary stakeholders to maximise the uptake of SCI-HMT into practice. Ongoing testing will monitor the uptake and implementation of this innovation, specifically focusing on how the SCI-HMT works for different users, in different contexts, at different stages and times of the rehabilitation journey.

Data availability

The datasets supporting the conclusions of this article are available upon request and with permission gained from the project Steering Committee.

Abbreviations

SCI: spinal cord injury

SCI-HMT: Spinal Cord Injury Health Maintenance Tool

CFIR: Consolidated Framework for Implementation Research

Kirshblum S, Vernon WL. Spinal Cord Medicine, Third Edition. New York: Springer Publishing Company; 2018.

Middleton JW, Arora M, Kifley A, Clark J, Borg SJ, Tran Y, et al. Australian arm of the International spinal cord Injury (Aus-InSCI) Community Survey: 2. Understanding the lived experience in people with spinal cord injury. Spinal Cord. 2022;60(12):1069–79.

Craig A, Nicholson Perry K, Guest R, Tran Y, Middleton J. Adjustment following chronic spinal cord injury: determining factors that contribute to social participation. Br J Health Psychol. 2015;20(4):807–23.

Middleton JW, Arora M, Jerram KAS, Bourke J, McCormick M, O’Leary D, et al. Co-design of the Spinal Cord Injury Health Maintenance Tool to support Self-Management: a mixed-methods Approach. Top Spinal Cord Injury Rehabilitation. 2024;30(1):59–73.

Middleton JW, Arora M, McCormick M, O’Leary D. Health maintenance Tool: how to stay healthy and well with a spinal cord injury. A tool for consumers by consumers. 1st ed. Sydney, NSW Australia: Royal Rehab and The University of Sydney; 2020.

Nilsen P, Bernhardsson S. Context matters in implementation science: a scoping review of determinant frameworks that describe contextual determinants for implementation outcomes. BMC Health Serv Res. 2019;19(1):189.

Jagosh J. Realist synthesis for Public Health: building an Ontologically Deep understanding of how Programs Work, for whom, and in which contexts. Annu Rev Public Health. 2019;40(1):361–72.

Nilsen P. Making sense of implementation theories, models and frameworks. Implement Sci. 2015;10(1):53.

Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4(1):50.

Damschroder LJ, Reardon CM, Opra Widerquist MA, Lowery JC. Conceptualizing outcomes for use with the Consolidated Framework for Implementation Research (CFIR): the CFIR outcomes Addendum. Implement Sci. 2022;17(1):7.

Damschroder LJ, Reardon CM, Widerquist MAO, Lowery JC. The updated Consolidated Framework for Implementation Research based on user feedback. Implement Sci. 2022;17(1):75.

Plamondon K, Ndumbe-Eyoh S, Shahram S. 2.2 Equity, Power, and Transformative Research Coproduction. Research Co-Production in Healthcare. 2022. p. 34–53.

Verville L, Cancelliere C, Connell G, Lee J, Munce S, Mior S, et al. Exploring clinicians’ experiences and perceptions of end-user roles in knowledge development: a qualitative study. BMC Health Serv Res. 2021;21(1):926.

Gainforth HL, Hoekstra F, McKay R, McBride CB, Sweet SN, Martin Ginis KA, et al. Integrated Knowledge Translation Guiding principles for conducting and Disseminating Spinal Cord Injury Research in Partnership. Arch Phys Med Rehabil. 2021;102(4):656–63.

Langley J, Knowles SE, Ward V. Conducting a Research Coproduction Project. Research Co-Production in Healthcare. 2022. p. 112–28.

Braun V, Clarke V. One size fits all? What counts as quality practice in (reflexive) thematic analysis? Qualitative Research in Psychology. 2020:1–25.

Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups. Int J Qual Health Care. 2007;19(6):349–57.

Bengtsson M. How to plan and perform a qualitative study using content analysis. NursingPlus Open. 2016;2:8–14.

Braun V, Clarke V. Using thematic analysis in psychology. Qualitative Res Psychol. 2006;3(2):77–101.

Keith RE, Crosson JC, O’Malley AS, Cromp D, Taylor EF. Using the Consolidated Framework for Implementation Research (CFIR) to produce actionable findings: a rapid-cycle evaluation approach to improving implementation. Implement Science: IS. 2017;12(1):15.

Choukou M-A, Sanchez-Ramirez DC, Pol M, Uddin M, Monnin C, Syed-Abdul S. COVID-19 infodemic and digital health literacy in vulnerable populations: a scoping review. Digit Health. 2022;8:20552076221076927.

Daniels N. Just Health: Meeting Health needs fairly. Cambridge University Press; 2007. p. 397.

Parker SM, Stocks N, Nutbeam D, Thomas L, Denney-Wilson E, Zwar N, et al. Preventing chronic disease in patients with low health literacy using eHealth and teamwork in primary healthcare: protocol for a cluster randomised controlled trial. BMJ Open. 2018;8(6):e023239–e.

Salter KL, Kothari A. Using realist evaluation to open the black box of knowledge translation: a state-of-the-art review. Implement Sci. 2014;9(1):115.

Sebele-Mpofu FY. The Sampling Conundrum in qualitative research: can Saturation help alleviate the controversy and alleged subjectivity in Sampling? Int’l J Soc Sci Stud. 2021;9:11.

Malterud K, Siersma VD, Guassora AD. Sample size in qualitative interview studies: guided by Information Power. Qual Health Res. 2015;26(13):1753–60.

Acknowledgements

The authors of this study would like to thank all the consumers with SCI and healthcare professionals for their invaluable contribution to this project. Their participation and insights have been instrumental in shaping the development of the SCI-HMT. The team also acknowledges the support and guidance provided by the members of the Project Steering Committee, as well as the partner organisations, including NSW Agency for Clinical Innovation and icare NSW. The authors would also like to acknowledge the informant group with lived experience, whose perspectives have enriched our understanding and informed the development of the SCI-HMT.

The SCI Wellness project was a collaborative project between John Walsh Centre for Rehabilitation Research at The University of Sydney and Royal Rehab. Both organizations provided in-kind support to the project. Additionally, the University of Sydney and Royal Rehab received research funding from Insurance and Care NSW (icare NSW) to undertake the SCI Wellness Project. icare NSW do not take direct responsibility for any of the following: study design, data collection, drafting of the manuscript, or decision to publish.

Author information

Authors and affiliations.

John Walsh Centre for Rehabilitation Research, Northern Sydney Local Health District, St Leonards, NSW, Australia

John A Bourke, K. Anne Sinnott Jerram, Mohit Arora, Ashley Craig & James W Middleton

The Kolling Institute, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia

Burwood Academy Trust, Burwood Hospital, Christchurch, New Zealand

John A Bourke

Royal Rehab, Ryde, NSW, Australia

James W Middleton

State Spinal Cord Injury Service, NSW Agency for Clinical Innovation, St Leonards, NSW, Australia

Contributions

Project conceptualization: KASJ, MA, JWM; project methodology: JWM, MA, KASJ, JAB; data collection: KASJ and MA; data analysis: KASJ, JAB, MA, JWM; writing—original draft preparation: JAB; writing—review and editing: JAB, KASJ, JWM, MA, AC; funding acquisition: JWM, MA. All authors contributed to the revision of the paper and approved the final submitted version.

Corresponding author

Correspondence to John A Bourke.

Ethics declarations

Ethics approval and consent to participate.

The first (RESP/18/212) and second phase (2019/ETH13961) of the project received ethical approval from The Northern Sydney Local Health District Human Research Ethics Committee. All participants provided informed, written consent. All data were to be retained for 7 years (23rd May 2030).

Consent for publication

Not applicable.

Competing interests

Part of the salaries of MA (Dec 2018 to Dec 2023), KASJ (July 2021 to Dec 2023) and JAB (Jan 2022 to Aug 2022) was paid from the grant monies. The other authors declare no conflicts of interest.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Bourke, J.A., Jerram, K.A.S., Arora, M. et al. Using the consolidated Framework for Implementation Research to integrate innovation recipients’ perspectives into the implementation of a digital version of the spinal cord injury health maintenance tool: a qualitative analysis. BMC Health Serv Res 24, 390 (2024). https://doi.org/10.1186/s12913-024-10847-x

Received: 14 August 2023

Accepted: 11 March 2024

Published: 28 March 2024

DOI: https://doi.org/10.1186/s12913-024-10847-x

  • Spinal Cord injury
  • Self-management
  • Innovation recipients
  • Secondary health conditions
  • Primary health care
  • Evidence-based innovations
  • Actionable findings
  • Consolidated Framework for implementation research

BMC Health Services Research

ISSN: 1472-6963

finding research data

Work Trend Index

Research and data on the trends reshaping the world of work

What Can Copilot’s Earliest Users Teach Us About Generative AI at Work?

A first look at the impact on productivity, creativity, and time.

About Work Trend Index

31,000 people. 31 countries. Trillions of productivity signals.

The Work Trend Index conducts global, industry-spanning surveys as well as observational studies to offer unique insights on the trends reshaping work for every employee and leader.

Annual Report · May 9, 2023

Will AI Fix Work?

The pace of work is outpacing our ability to keep up. AI is poised to create a whole new way of working.

Special Report · April 20, 2023

The New Performance Equation in the Age of AI

New research shows that employee engagement matters to the bottom line—especially amid economic uncertainty

Special Report · September 22, 2022

Hybrid Work Is Just Work. Are We Doing It Wrong?

In choppy economic waters, new data points to three urgent pivots for leaders to help employees and organizations thrive

Annual Report · March 16, 2022

Great Expectations: Making Hybrid Work Work

From when to go to the office to why work in the first place, employees have a new “worth it” equation. And there’s no going back.

Special Report · January 12, 2022

Technology Can Help Unlock a New Future for Frontline Workers

New data shows that now is the time to empower the frontline with the right digital tools

Special Report · September 9, 2021

To Thrive in Hybrid Work, Build a Culture of Trust and Flexibility

Microsoft employee survey data shows the importance of embracing different work styles—and the power of simple conversations

Special Report · April 20, 2021

Research Proves Your Brain Needs Breaks

New options help you carve out downtime between meetings

Special Report · March 30, 2021

In Hybrid Work, Managers Keep Teams Connected

Researchers found that feelings of connection among Microsoft’s teams diminished during the pandemic. They also discovered the remedy.

Annual Report · March 22, 2021

The Next Great Disruption Is Hybrid Work—Are We Ready?

Exclusive research and expert insights into a year of work like no other reveal urgent trends leaders should consider as hybrid work unfolds.

Special Report · September 22, 2020

A Checkup on Employee Wellbeing

Explore how the pandemic is impacting wellbeing at work around the world.

Special Report · July 8, 2020

The Knowns and Unknowns of the Future of Work

Learn how a sudden shift to remote work may have lasting effects around the world.

Special Report · April 9, 2020

Remote Work Trend Report: Meetings

See how global meeting habits changed during the world’s largest work-from-home mandate.

The WorkLab Newsletter: Science-based insights on the future of work, direct to your inbox

Discover more from WorkLab

Additional research on the future of work

Privacy Approach

Microsoft takes privacy seriously. We remove all personal and organization-identifying information, such as company name, from the data before analyzing it and creating reports. We never use customer content—such as information within an email, chat, document, or meeting—to produce reports. Our goal is to discover and share broad workplace trends that are anonymized by aggregating the data broadly from those trillions of signals that make up the Microsoft Graph.

medRxiv

The use and impact of surveillance-based technology initiatives in inpatient and acute mental health settings: A systematic review

  • ORCID record for Jessica L. Griffiths
  • ORCID record for Katherine R. K. Saunders
  • ORCID record for Una Foye
  • ORCID record for Anna Greenburgh
  • ORCID record for Antonio Rojas-Garcia
  • ORCID record for Brynmor Lloyd-Evans
  • ORCID record for Sonia Johnson
  • ORCID record for Alan Simpson

Background: The use of surveillance technologies is becoming increasingly common in inpatient mental health settings, commonly justified as efforts to improve safety and cost-effectiveness. However, the use of these technologies has been questioned in light of limited research conducted and the sensitivities, ethical concerns and potential harms of surveillance. This systematic review aims to: 1) map how surveillance technologies have been employed in inpatient mental health settings, 2) identify any best practice guidance, 3) explore how they are experienced by patients, staff and carers, and 4) examine evidence regarding their impact.

Methods: We searched five academic databases (Embase, MEDLINE, PsycInfo, PubMed and Scopus), one grey literature database (HMIC) and two pre-print servers (medRxiv and PsyArXiv) to identify relevant papers published up to 18/09/2023. We also conducted backwards and forwards citation tracking and contacted experts to identify relevant literature. Quality was assessed using the Mixed Methods Appraisal Tool. Data were synthesised using a narrative approach.

Results: A total of 27 studies were identified as meeting the inclusion criteria. Included studies reported on CCTV/video monitoring (n = 13), Vision-Based Patient Monitoring and Management (VBPMM) (n = 6), Body Worn Cameras (BWCs) (n = 4), GPS electronic monitoring (n = 2) and wearable sensors (n = 2). Twelve papers (44.4%) were rated as low quality, five (18.5%) medium quality, and ten (37.0%) high quality. Five studies (18.5%) declared a conflict of interest. We identified minimal best practice guidance. Qualitative findings indicate that patient, staff and carer perceptions and experiences of surveillance technologies are mixed and complex. Quantitative findings regarding the impact of surveillance on outcomes such as self-harm, violence, aggression, care quality and cost-effectiveness were inconsistent or weak.

Discussion: There is currently insufficient evidence to suggest that surveillance technologies in inpatient mental health settings are achieving the outcomes they are employed to achieve, such as improving safety and reducing costs. The studies were generally of low methodological quality, lacked lived experience involvement, and a substantial proportion (18.5%) declared conflicts of interest. Further independent coproduced research is needed to more comprehensively evaluate the impact of surveillance technologies in inpatient settings, including harms and benefits. If surveillance technologies are to be implemented, it will be important to engage all key stakeholders in the development of policies, procedures and best practice guidance to regulate their use, with a particular emphasis on prioritising the perspectives of patients.

Competing Interest Statement

AS and UF have undertaken and published research on BWCs. We have received no financial support from BWC or any other surveillance technology companies. All other authors declare no competing interests.

Clinical Protocols

https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=463993

Funding Statement

This study is funded by the National Institute for Health and Care Research (NIHR) Policy Research Programme (grant no. PR-PRU-0916-22003). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. ARG was supported by the Ramon y Cajal programme (RYC2022-038556-I), funded by the Spanish Ministry of Science, Innovation and Universities.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Data Availability

The template data extraction form is available in Supplementary 1. MMAT quality appraisal ratings for each included study are available in Supplementary 2. All data used is publicly available in the published papers included in this review.

April 2, 2024

For the Record - April 2, 2024: Diversity in Clinical Trials Bill, PAVE Findings

In this Issue:

  • WA State Diversity in Clinical Trials Bill
  • PAVE 2023 annual report findings and recommendations
  • New consent resources
  • Changes to genomic data sharing consent requirements worksheet

In 2023 the WA State legislature passed 2SHB 1745, also known as the Diversity in Clinical Trials bill. This bill aims to improve participation in clinical trials from underrepresented communities (i.e., those more likely to be historically marginalized and less likely to be included in research) so that their data informs and contributes towards better health outcomes in these populations.

One aspect of this bill directs any state entity or hospital (including the UW) to adopt an institutional policy that encourages the identification and recruitment of underrepresented demographic groups into clinical trials. The policy must include requirements to:

  • Have investigators collaborate with community-based organizations and use methods recognized by FDA to identify and recruit underrepresented populations
  • Provide information to trial participants in languages other than English
  • Provide translation services or bilingual staff for trial screening
  • Provide culturally specific recruitment materials
  • Provide electronic consent

HSD, along with other collaborative partners (e.g., UW Medicine Office of Healthcare Equity, ITHS, UW School of Medicine), is in the beginning stages of developing the policy, guidance and resources needed to comply with the new law. Please note that:

  • Stakeholder input will be sought throughout the process.
  • Researchers will have time to prepare before these new requirements go into effect.
  • Regular monthly updates will be provided via this newsletter and other venues.

No Serious Non-Compliance Found

HSD’s Post Approval Verification and Education (PAVE) program annual report summarizes outcomes of PAVE evaluations conducted during the prior calendar year. It includes human subjects research studies under UW or External IRB review where a routine or IRB-directed evaluation was conducted. In 2023, the program reviewed 17 studies, which included 16 routine evaluations and 1 directed evaluation. Of the 92 observations made throughout the 17 PAVE visits, there were zero instances of serious non-compliance found. This is great news and demonstrates that UW researchers are conducting safe and compliant research.

There were findings of minor noncompliance. Some of the most common issues observed (and ways to avoid them) include:

  • Suggestion: In some instances, it may be appropriate to request an IRB exception to this requirement. For example, it may be necessary to temporarily keep subject names with data collection forms to facilitate in-person study visits. HSD allows researchers to request exceptions to individual data security requirements if necessary for the study and if the exceptions do not significantly increase risk to subjects. Exceptions can be requested under Section 9.6.b of the IRB Protocol Form.
  • Suggestion: Consider using only the IRB-stamped approved consent form: its watermark shows the IRB approval date, so you can confirm you have the current form. Alternatively, put version control in the footer and update it with each consent revision. Also consider establishing a process to inform all study staff of any consent form changes and of where to find the most current IRB-approved consent form. The current IRB-approved version of your consent document can be downloaded directly from Zipline under the Documents tab in the “Final” column.

New Webpage Available

HSD’s Consent Materials

As of October 1st, 2023, HSD stopped accepting new consent forms using our retired Standard Consent Template. Instead, researchers are expected to design a participant-focused consent process and form using our Designing the Consent Process guidance, optional templates, and example consent forms, which are designed to assist researchers with meeting our requirements. You can use one of our templates or build your own consent form using the Designing the Consent Process guidance.

Materials Posted to the New Webpage

HSD has posted a new webpage that provides links to training and tools for building a participant-focused consent.

Interactive tutorial and checklist from the federal Office for Human Research Protections (OHRP)

OHRP developed a 90-minute tutorial to assist researchers with designing a participant-focused consent process and form. HSD’s consent materials were based on OHRP guidance and are in line with the tutorial. The accompanying checklist is a useful summary of the most important points in the tutorial.

Consenttools.org

ConsentTools is supported by the Bioethics Research Center at Washington University School of Medicine in St. Louis, MO. It provides resources for optimizing key information, assessing consent comprehension, and obtaining consent using a legally authorized representative. HSD highly recommends reviewing the guidance on optimizing key information and the 5-minute tutorial on incorporating plain language into the consent process and form.

For more information about our expectations for a participant-focused consent, review the eNews from October 3rd.

Contact [email protected] with any questions.

Effective Tuesday, April 2, 2024

HSD is retiring the WORKSHEET Consent Requirements and Expectations for Genomic Data Sharing. This was a complex PDF worksheet that described the consent requirements for genomic data that must be shared through NIH repositories when research is subject to the NIH Genomic Data Sharing (GDS) policy. Instead, HSD has developed a simplified version of the consent requirements that aligns with recently updated NIH FAQs. These requirements have been incorporated into one worksheet describing all the GDS certification criteria that must be met for genomic data to be shared under the policy.

Where to find information about GDS consent requirements:

  • WORKSHEET Genomic Data Sharing Certification
  • GUIDANCE Genomic Data Sharing
  • Example consent language about sharing data through NIH repositories

NIH exceptions to GDS certification requirements:

  • When it is anticipated that GDS certification requirements cannot be met: investigators should state this in their data management and sharing plan and indicate what data, if any, can be shared and how. In some instances, the NIH funding institute or center may need to determine whether to grant an exception to the data submission expectation under the GDS policy.
  • When genomic data from specimens created or collected after the effective date of the GDS policy (January 25, 2015) lack consent for research use and data sharing: if there are compelling scientific reasons that necessitate the use of the genomic data, investigators should provide a justification in the funding request for their use. The NIH funding institute or center will review the justification and decide whether to make an exception to the consent requirement.
  • When the research is funded or supported by NHGRI and consent expectations cannot be met: NHGRI will grant exceptions to the consent expectation on a case-by-case basis. Information about how to request an exception can be found in the NHGRI GDS Policy FAQs.

Questions? Contact us at [email protected].


© 2024 University of Washington | Seattle, WA

Americans’ use of ChatGPT is ticking up, but few trust its election information

It’s been more than a year since ChatGPT’s public debut set the tech world abuzz. And Americans’ use of the chatbot is ticking up: 23% of U.S. adults say they have ever used it, according to a Pew Research Center survey conducted in February, up from 18% in July 2023.

The February survey also asked Americans about several ways they might use ChatGPT, including for workplace tasks, for learning and for fun. While growing shares of Americans are using the chatbot for these purposes, the public is more wary than not of what the chatbot might tell them about the 2024 U.S. presidential election. About four-in-ten adults have not too much or no trust in the election information that comes from ChatGPT. By comparison, just 2% have a great deal or quite a bit of trust.

Pew Research Center conducted this study to understand Americans’ use of ChatGPT and their attitudes about the chatbot. For this analysis, we surveyed 10,133 U.S. adults from Feb. 7 to Feb. 11, 2024.

Everyone who took part in the survey is a member of the Center’s American Trends Panel (ATP), an online survey panel that is recruited through national, random sampling of residential addresses. This way, nearly all U.S. adults have a chance of selection. The survey is weighted to be representative of the U.S. adult population by gender, race, ethnicity, partisan affiliation, education and other categories. Read more about the ATP’s methodology.

Here are the questions used for this analysis, along with responses, and the survey methodology.

Below we’ll look more closely at:

  • Which U.S. adults have used ChatGPT
  • How Americans are using it
  • How much Americans trust ChatGPT’s election information

Who has used ChatGPT?

A line chart showing that ChatGPT use has ticked up since July, particularly among younger adults.

Most Americans still haven’t used the chatbot, despite the uptick since our July 2023 survey on this topic. But some groups remain far more likely to have used it than others.

Differences by age

Adults under 30 stand out: 43% of these young adults have used ChatGPT, up 10 percentage points since last summer. Use of the chatbot is also up slightly among those ages 30 to 49 and 50 to 64. Still, these groups remain less likely than their younger peers to have used the technology. Just 6% of Americans 65 and up have used ChatGPT.

Differences by education

Highly educated adults are most likely to have used ChatGPT: 37% of those with a postgraduate or other advanced degree have done so, up 8 points since July 2023. This group is more likely to have used ChatGPT than those with a bachelor’s degree only (29%), some college experience (23%) or a high school diploma or less (12%).

How have Americans used ChatGPT?

Since March 2023, we’ve also tracked three potential reasons Americans might use ChatGPT: for work, to learn something new or for entertainment.

Line charts showing that the share of employed Americans who have used ChatGPT for work has risen by double digits in the past year.

The share of employed Americans who have used ChatGPT on the job increased from 8% in March 2023 to 20% in February 2024, including an 8-point increase since July.
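The difference between a percentage-point change and a relative change in the figures above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are made up here, and the July 2023 value is inferred from the reported 8-point increase rather than quoted in the text, so rounding in the published figures could shift it slightly.

```python
# Shares of employed U.S. adults who have used ChatGPT for work,
# in percent, as reported in the paragraph above.
march_2023 = 8
february_2024 = 20
gain_since_july = 8  # percentage points, as reported

# Percentage-point change: a simple difference between two shares.
total_gain_points = february_2024 - march_2023

# Assumption: the July 2023 share is derived here, not quoted in the text.
implied_july_2023 = february_2024 - gain_since_july

# Relative change: how many times larger the share is now.
relative_growth = february_2024 / march_2023

print(f"Gain since March 2023: {total_gain_points} percentage points")
print(f"Implied July 2023 share: {implied_july_2023}%")
print(f"February 2024 share is {relative_growth:.1f}x the March 2023 share")
```

Reporting the change in percentage points keeps it distinct from the relative increase, which would read much larger for the same data.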

Turning to U.S. adults overall, about one-in-five have used ChatGPT to learn something new (17%) or for entertainment (17%). These shares have increased from about one-in-ten in March 2023.

Line charts showing that about a third of employed Americans under 30 have now used ChatGPT for work.

Use of ChatGPT for work, learning or entertainment has largely risen across age groups over the past year. Still, there are striking differences between these groups (those 18 to 29, 30 to 49, and 50 and older).

For example, about three-in-ten employed adults under 30 (31%) say they have used it for tasks at work – up 19 points from a year ago, with much of that increase happening since July. These younger workers are more likely than their older peers to have used ChatGPT in this way.

Adults under 30 also stand out in using the chatbot for learning. And when it comes to entertainment, those under 50 are more likely than older adults to use ChatGPT for this purpose.

A third of employed Americans with a postgraduate degree have used ChatGPT for work, compared with smaller shares of workers who have a bachelor’s degree only (25%), some college (19%) or a high school diploma or less (8%).

Those shares have each roughly tripled since March 2023 for workers with a postgraduate degree, bachelor’s degree or some college. Among workers with a high school diploma or less, use is statistically unchanged from a year ago.

Using ChatGPT for other purposes also varies by education level, though the patterns are slightly different. For example, a quarter each of postgraduate and bachelor’s degree-holders have used ChatGPT for learning, compared with 16% of those with some college experience and 11% of those with a high school diploma or less education. Each of these shares is up from a year ago.

ChatGPT and the 2024 presidential election

With more people using ChatGPT, we also wanted to understand whether Americans trust the information they get from it, particularly in the context of U.S. politics.

A horizontal stacked bar chart showing that about 4 in 10 Americans don’t trust information about the election that comes from ChatGPT.

About four-in-ten Americans (38%) don’t trust the information that comes from ChatGPT about the 2024 U.S. presidential election – that is, they say they have not too much trust (18%) or no trust at all (20%).

A mere 2% have a great deal or quite a bit of trust, while 10% have some trust.

Another 15% aren’t sure, while 34% have not heard of ChatGPT.
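As a quick consistency check, the response shares quoted above can be tallied in Python. The dictionary keys are shorthand labels chosen for this sketch, not the survey's exact question wording, and the note about the total is a guess at the cause rather than something the text states.

```python
# Trust in ChatGPT's election information, in percent of U.S. adults,
# as quoted in the three paragraphs above.
shares = {
    "not too much trust": 18,
    "no trust at all": 20,
    "great deal or quite a bit of trust": 2,
    "some trust": 10,
    "not sure": 15,
    "have not heard of ChatGPT": 34,
}

# The "about four-in-ten" distrust figure combines the two lowest-trust answers.
distrust = shares["not too much trust"] + shares["no trust at all"]

# Sum of all quoted categories.
total = sum(shares.values())

print(f"Distrust (not too much + none): {distrust}%")
print(f"All quoted categories: {total}%")
```

The quoted categories sum to slightly less than 100%, which is common in published survey tables and is probably due to rounding or unanswered questions.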

Distrust far outweighs trust regardless of political party. About four-in-ten Republicans and Democrats alike (including those who lean toward each party) have not too much or no trust at all in ChatGPT’s election information.

Notably, however, very few Americans have actually used the chatbot to find information about the presidential election: Just 2% of adults say they have done so, including 2% of Democrats and Democratic-leaning independents and 1% of Republicans and GOP leaners.

These survey findings come amid growing national attention on chatbots and misinformation. Several tech companies have recently pledged to prevent the misuse of artificial intelligence – including chatbots – in this year’s election. But recent reports suggest chatbots themselves may provide misleading answers to election-related questions.

About Pew Research Center

Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts.

COMMENTS

  1. Eleven quick tips for finding research data

    When this vast amount and variety of data is made available, finding relevant data to meet a research need is increasingly a challenge. In the past, when data were relatively sparse, researchers discovered existing data by searching literature, attending conferences, and asking colleagues. In today's data-rich environment, with accompanying ...

  2. How to Find Data & Statistics: Finding Data

    Search Strategy #1: Search in a Data Archive. Look within a data archive that collects within the general subject area that you are searching for. Inter-University Consortium for Political and Social Research. The world's largest social science data archive. It is one of the best places to start looking for a data set.

  3. Research Guides: Data Literacy for Researchers: Finding Data

    Finding Datasets in Repositories. Data repositories contain published datasets that are typically associated with publications or ongoing research projects. Data repositories are used to store and preserve data so that researchers can access and analyze it. There are two types of repositories we will discuss: scholarly and public.

  4. Defining Research Data

    Defining Research Data. One definition of research data is: "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings." ( OMB Circular 110 ). Research data covers a broad range of types of information (see examples below), and digital data can be structured and stored in a variety of ...

  5. Research Guides: How to Find Data: Tips for Finding Data

    2. Determine who collects the type of data you are looking for. Think of who has a stake in collecting this data. Also consider who the audience of the data might be. This will help you determine where the data is likely published and how accessible the data is. An Example: I am interested in finding employment rates for colleges by state. The ...

  6. Research Guides: Getting Started Finding Data: Home

    ICPSR. One of the world's oldest and largest social science data archives; based at the University of Michigan. Categories of datasets include census enumerations; community and urban studies; economic behavior and attitudes; education, government structures, policies and capabilities; social indicators; and much more.

  7. Introduction

    Research data is the information collected, observed or generated to validate research findings. It can be either qualitative, quantitative or mixed methods, combining both. Qualitative data. Qualitative data is observational and descriptive, relating to experiences and emotions and can come in the form of interview transcripts, focus groups ...

  8. Find Research Data

    There are many ways to find research data. You can search directly in data repositories, check out the literature to see what data other researchers are citing or try a google search. Depending on your topic one approach might work best or you might need to try all three.

  9. Ten simple rules for improving research data discovery

    These 10 rules can be thought of as a mirror to "Eleven Quick Tips for Finding Research Data," providing key guidance on how to make your research data more findable in the complicated systems that share and provide access to research . As opposed to helping locate data for reuse, this article is meant to help you make your data and your ...

  10. Overview

    Research data are the elements used to support or validate your academic work or analysis. While this includes numeric data, many other types of research objects such as code, formulae, images, sound, artifacts, and text can be considered research data. Research Data Services is a network of services provided by the library to assist you during ...

  11. Finding research data

    Finding research data This page explains how you can search for existing research data. You might want to use existing data for your research to do a meta-analysis, to re-evaluate findings, to perform new analyses on existing data, or to validate a model.

  12. ResearchGate

    Access 160+ million publications and connect with 25+ million researchers. Join for free and gain visibility by uploading your research.

  13. Data Module #1: What is Research Data?

    In this module, we will provide you with a basic definition and understanding of what research data are. We'll also explore how data fits into the scholarly research process. Many people think of data-driven research as something that primarily happens in the sciences. It is often thought of as involving a spreadsheet filled with numbers.

  14. Research Findings

    Qualitative Findings. Qualitative research is an exploratory research method used to understand the complexities of human behavior and experiences. Qualitative findings are non-numerical and descriptive data that describe the meaning and interpretation of the data collected. Examples of qualitative findings include quotes from participants ...

  15. 10 Great Places To Find Open, Free Datasets [2024 Guide]

    1. Google Dataset Search. Type of data: Miscellaneous. Data compiled by: Google. Access: Free to search, but does include some fee-based search results. Sample dataset: Global price of coffee, 1990-present. It seems we turn to Google for everything these days, and data is no exception.

  16. Research Data

    Research data refers to any information or evidence gathered through systematic investigation or experimentation to support or refute a hypothesis or answer a research question. It includes both primary and secondary data, and can be in various formats such as numerical, textual, audiovisual, or visual. Research data plays a critical role in ...

  17. Finding research data

    The simplest way to find a repository is to use a directory/registry service. The Registry of Research Data Repositories (re3data) is 'a global registry of research data repositories that covers research data repositories from different academic disciplines.' The Open Access Directory: 'This is a list of repositories and databases for open data.'

  18. Dataset Search

    Dataset Search. Try coronavirus covid-19 or water quality site:canada.ca. Learn more about Dataset Search.

  19. Resources for Finding and Sharing Research Data

    This one-hour introductory class provides researchers with an overview of online resources for locating research datasets, data repositories, and data publications for data sharing and re-use. Participants will learn search strategies for locating datasets through federated data search portals and generalist data repositories, including ...

  20. Finding Data

    Finding Data. Consultation, full service (HLS, Baker), and referrals for locating sources of research data (e.g. Library subscriptions, government sponsor, repository). Eligibility information is outlined below based on providers with offerings that are available to the entire Harvard community or a specific unit/appointment.

  21. Statista

    [email protected]. Tel. +1 212 419-5774. Mon - Fri, 9am - 6pm (EST) Find statistics, consumer survey results and industry studies from over 22,500 sources on over 60,000 topics on the internet's ...

  22. Find research data

    Find research data Search by expertise, name or affiliation. Show filters; Advanced search; Search in all content Filters for Datasets Search concepts ... Supplementary data tables for "A standardized citation metrics author database annotated for scientific field" (PLoS Biology 2019) Ioannidis, J. ...

  23. Data Innovations

    These activities include identifying data needs and data gaps, creating new research databases, and disseminating the data to support research and analysis to inform public and private policy. The AHRQ Data Innovations initiative has resulted in the creation of databases for public dissemination. Some databases can be accessed directly through ...

  24. AHRQ Challenge on Integrating Healthcare System Data With Systematic

    The Agency for Healthcare Research and Quality (AHRQ), U.S. Department of Health and Human Services (HHS), is announcing a challenge competition to explore the resources and infrastructure needed to integrate real-world evidence from healthcare systems into systematic review findings.

  25. Using the consolidated Framework for Implementation Research to

    Procedure. The procedure for this research is multi-stepped and is summarized in Fig. 1. First, we mapped retrospective qualitative data collected during the development of the SCI-HMT against the five domains of the CFIR in order to create a semi-structured interview guide (Step 1). Then, we used this interview guide to collect prospective data from health professionals and people with SCI ...

  26. Work Trend Index: Microsoft's latest research on the ways we work

    Research and data on the trends reshaping the world of work. Special Report · November 15, 2023. What Can Copilot's Earliest Users Teach Us About Generative AI at Work? A first look at the impact on productivity, creativity, and time. Read the latest report. About Work Trend Index.

  27. Assessing the Impact of COVID-19 on Rural Hospitals

    The purpose of this research is to examine the impact of the COVID-19 pandemic on rural hospital financial performance. ... Data and Methods. This study used hospital-level data from 2017 to 2022. Data were obtained from the Medicare Hospital Cost Reports, the AHA Annual Survey, the Area Health Resource File, the Center for Disease Control and ...

  28. The use and impact of surveillance-based technology initiatives in

    Yes I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes The template data extraction form is available in Supplementary 1. MMAT quality appraisal ratings for each included study are available in Supplementary 2.
