Clinical Data Management - Science topic

Data management in clinical research: An overview

  • B. Krishnankutty, S. Bellary, +1 author L. Moodahadu
  • Published in the Indian Journal of Pharmacology, 1 March 2012
  • DOI: 10.4103/0253-7613.93842



  • Open access
  • Published: 10 November 2022

Rethinking clinical study data: why we should respect analysis results as data

  • Joana M. Barros (ORCID: orcid.org/0000-0002-2952-5420),
  • Lukas A. Widmer (ORCID: orcid.org/0000-0003-1471-3493),
  • Mark Baillie (ORCID: orcid.org/0000-0002-5618-0667) &
  • Simon Wandel (ORCID: orcid.org/0000-0002-1442-597X)

Scientific Data volume 9, Article number: 686 (2022)


Subjects: Medical research · Research data · Research management

The development and approval of new treatments generates large volumes of results, such as summaries of efficacy and safety. However, it is commonly overlooked that analyzing clinical study data also produces data in the form of results. For example, descriptive statistics and model predictions are data. Although integrating and putting findings into context is a cornerstone of scientific work, analysis results are often neglected as a data source. Results end up stored as “data products” such as PDF documents that are not machine readable or amenable to future analyses. We propose a solution to “calculate once, use many times” by combining analysis results standards with a common data model. This analysis results data model re-frames the target of analyses from static representations of the results (e.g., tables and figures) to a data model with applications in various contexts, including knowledge discovery. Further, we provide a working proof of concept detailing how to approach standardization and construct a schema to store and query analysis results.


Introduction

The process of analyzing data also produces data in the form of results. In other words, project outcomes themselves are a data source for future research: aggregated summaries, descriptive statistics, model estimates, predictions, and evaluation measurements may be reused for secondary purposes. For example, the development and approval of new treatments generates large volumes of results, such as summaries of efficacy and safety from supporting clinical trials through the development phases. Integrating these findings forms the evidence base for efficacy and safety review for new treatments under consideration.

Although integrating and putting scientific findings into context is a cornerstone of scientific work, project results are often neglected or indeed not handled as data (i.e., the machine-readable numerical outcome from an analysis). Analysis results are typically shared as part of presentations, reports, or publications addressing a greater objective. The results of data analysis end up stored as data products, namely presentation-suitable formats such as PDF, PowerPoint, or HTML documents populated with text, tables, and figures showcasing the results of a single analysis or an assembly of analyses. Contrary to data, which can be stored in data frames or databases, data products are not designed to be machine-readable or amenable to future data analyses. An example comparing a data product with data is given in Fig. 3. In this example, we illustrate how a descriptive analysis of individual patient data - in this case the survival probability by treatment over time - becomes a new machine-readable data source for subsequent analyses. In other words, the results from one analysis become a data source for new analyses.

This is the case for clinical trial reporting, where the data analysis summaries from a study are rendered to rich text format (RTF) files that are then compiled into appendices following the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) E3 guideline 1, where each appendix is a table, listing, or figure summarizing a drug's efficacy and safety evaluation. The analysis results stored in these appendices - which can span thousands of pages - are not readily reusable: extracting information from PDF files is notoriously difficult, and even if machine-readable formats (RTFs) are available, some manual work is often required since important (meta-)information is contained in footnotes for which no standard formats exist. There have been recent attempts to modernise the reporting of clinical trials, including the use of electronic notebooks and web-based frameworks. However, while literate programming documents such as R Markdown allow documenting code and results together and R Shiny enables dynamic data exploration, the rendered data products suffer the same fate as other presentation-suitable formats. In other words, modern data products also do not handle data analysis results as data. Although there is agreement on which information should be shared as part of a data package, and although sharing data can accelerate new discoveries, there is no proposed solution to facilitate the sharing and reuse of analysis results 2.

A focus on results presentation over storage considerations sets up a barrier impeding the assimilation of scientific knowledge and the understanding of what was intended and what was implemented. As a repercussion, the scientific process cycle is broken, leaving researchers who want to reuse prior results with three options:

  1. Re-run the analysis if the code and original source data are accessible.

  2. Re-do the analysis if only the original source data is accessible.

  3. Manually or (pseudo-)automatically extract information from the data products (e.g., tables, figures, published notebooks).

The first option would appear to be the best one and is, for instance, being implemented in eLife's executable research articles 3. However, being able to rerun the analysis does not guarantee reproducibility and can be computationally expensive when covering many studies, large data, or sophisticated models. Analyses can depend on technical factors such as the software products used, their versions, and (hardware and software) dependencies, all of which affect the outcome. Even tailored statistical environments such as R 4 produce a wide range of output formats and rely on extensions such as broom 5 for reformatting and standardizing the outputs of data analysis.
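As a concrete illustration of this reformatting step, the sketch below shows how a fitted model becomes a tidy, machine-readable data frame via broom; the model and data set are generic R examples chosen for illustration, not code from the paper.

```r
# Minimal sketch: turning statistical output into data with broom.
# The model and data set (mtcars) are illustrative only.
library(broom)

fit <- glm(am ~ wt + hp, data = mtcars, family = binomial())

# Coefficient-level results as a data frame:
# one row per term with estimate, std.error, statistic, and p.value columns.
tidy(fit)

# Model-level results (deviance, AIC, ...) as a one-row data frame.
glance(fit)
```

Stored in this form, the same estimates can populate a table, a figure, or a downstream meta-analysis without re-fitting the model.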

For the second option, there are additional complications to account for: even if we assume that the entire analysis is fully documented, common analyses are not straightforward to implement. This option assumes that the complete details required to implement the analysis are documented, for example, in a statistical analysis plan (SAP). However, data-driven and expertise-driven undocumented choices are a hidden source of deviations that make reproducing or replicating the results an elusive task 6. On top of this, the selective reporting of results limits replication of the complete set of performed data analyses (both pre-specified and ad-hoc) within a research project 7,8,9.

The last scenario is commonplace for secondary research that combines and integrates findings of single, independent studies, such as meta-analyses or systematic reviews. Following the Cochrane Handbook for Systematic Reviews of Interventions, performing a meta-analysis requires first digitizing the studies' documents, either through a laborious manual effort or by using extraction tools known to be error-prone and to require verification 10. Furthermore, the unavailability of complete results, potentially through selective reporting, requires researchers to extrapolate the missing results, which can lead to questionable reliability and risk of bias 11.

Data management is an important, but often undervalued, pillar of scientific work. Good data management supports key activities from planning and execution to analysis and reporting. The importance of data stewardship is now also recognized as an additional pillar. Good data stewardship supports activities beyond the single project: knowledge discovery, the reuse of data for secondary purposes, and other downstream tasks such as the contextualization, appraisal, and integration of knowledge. Initiatives like FAIR set up the minimal guiding principles and practices for data stewardship based on making the data Findable, Accessible, Interoperable, and Reusable 12. Likewise, the software and data mining community (e.g., IBM, ONNX, and PFA) has introduced initiatives bringing standardization to analytic applications, thus facilitating data exchange and releasing the researcher from the burden of translating the output of statistical analysis into a suitable format for the data product.

An important component of data management is the data model, which specifies the information to capture, how to store it, and standardizes how the elements relate to one another. In the clinical domain, data management is a critical element in preparing regulatory submissions and obtaining market approval. In 1999 the Clinical Data Interchange Standards Consortium (CDISC) introduced the operational data model (ODM), facilitating the collection, organization, and sharing of clinical research data and metadata 13. In addition, the ODM enabled the creation of standards (Fig. 1) such as the Study Data Tabulation Model (SDTM) and the analysis data model (ADaM) to easily derive analysis datasets for regulatory submissions. Owing to the needs at the different stages of the clinical research lifecycle, CDISC data standards reflect the key steps of the clinical data lifecycle. Although regulatory procedures were traditionally focused on document submission, there has since been a gradual desire to also assess the data used to create the documents 14. CDISC data standards address this need; however, these standards only consider data from planning and collection up to analysis data (i.e., data prepared and ready for data analysis). Therefore, the outcome of this paper can be viewed as a potential extension to the CDISC data standards, describing how not only individual patient data but also descriptive and inferential results should be stored and made available for future reuse.

Figure 1: CDISC defines a collection of standards adapted to the different stages in the clinical research process. For example, ADaM defines data sets that support efficient generation, replication, and review of analyses 36.

In this paper, we explore the concept of viewing the output of data analysis as data. By doing so, we address the problems associated with the limited reproducibility and reusability of analysis results. We demonstrate why we should respect analysis results as data and put forward a solution using an analysis results data model (ARDM), re-framing the target of analyses from the applications of the results (e.g., tables and figures) to a data model. By integrating the analysis results into a common schema with specific constraints, we would ensure analysis data quality, improve reusability, and facilitate the development of tools leveraging the re-use of analysis results. Taking meta-analyses again as an example, applying an ARDM would only require one database query instead of a long process of information extraction and verification. Tables, listings, and figures could be generated directly from the results instead of repeating the analysis. Furthermore, storing the results as independent datasets would also allow sharing information without the need for the underlying individual patient data, a useful property, given data protection regulations, for both academic and industry publications. Viewing analysis results as a data source moves us from repeating or redundantly recording results to a calculate once, use many times mindset. While we use the latter term focusing on results of statistical analyses for clinical studies, it can be seen as a special case of the more general concept of open science and open data, which aims at reducing redundancy in scientific research on a larger scale.

Implementing the ARDM in clinical research

The ARDM is adaptive and expandable: with each new analysis standard, we can adapt the schema or add new tables to it. With respect to the inspection and visualization of the results, there is also the flexibility to create a variety of outputs, independent of the analysis standard. The proof of concept for the ARDM is implemented using the R programming language and a relational SQLite database; however, these choices can be revisited as the ARDM can be implemented using a variety of languages and databases. This implementation should be viewed as a starting point rather than a complete solution. Here, we highlight the considerations we took to construct the ARDM utilizing three analysis standards (descriptive statistics, safety, and survival analysis) and leveraging the CDISC Pilot Project ADaM dataset. Further documentation is available in the code repository. An overview of the requirements to create the ARDM is shown in Fig. 2.
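The fragment below sketches this end-to-end idea under simplifying assumptions: a toy ADSL-like data set, a descriptive-statistics standard reduced to one summary call, and illustrative table and column names rather than the schema from the published repository.

```r
# Minimal "calculate once, store as data" sketch using R, dplyr, and SQLite.
# Data, table names, and column names are illustrative assumptions.
library(DBI)
library(RSQLite)
library(dplyr)

adsl <- data.frame(
  USUBJID = sprintf("SUBJ-%03d", 1:6),
  TRT01P  = rep(c("Placebo", "Low Dose", "High Dose"), each = 2),
  AGE     = c(71, 68, 74, 80, 77, 69)
)

# A toy descriptive-statistics "analysis standard": one row per treatment arm.
desc <- adsl |>
  group_by(stratum = TRT01P) |>
  summarise(variable = "AGE", n = n(), mean = mean(AGE), sd = sd(AGE),
            .groups = "drop") |>
  mutate(study_id = "CDISCPILOT01", .before = 1)

con <- dbConnect(RSQLite::SQLite(), "ardm.sqlite")
dbWriteTable(con, "results_descriptive", desc, overwrite = TRUE)
dbDisconnect(con)
```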

Figure 2: In clinical development, the analysis results data model enables a source of truth for results applied in various applications. Currently, the examples on the right require running analyses independently, even when using the same results.

Prior to ingesting clinical data, the algorithm first creates empty tables with specifications on the column names and data types. These tables are grouped into metadata, intermediate data, and results. The metadata tables record additional information such as variable types (e.g., categorical and continuous) and measurement units (e.g., age is given in years). As part of the metadata tables, the algorithm also creates an analysis standards table requiring information on the analysis standard name, function calls, and its parameters. The intermediate data tables aggregate information at the subject level and are useful to avoid repeated data transformations (e.g., repeated aggregations), thus reducing potential errors and computational execution time during the analysis. The results tables specify the analysis results information that will be stored. Note that the creation of the metadata, intermediate data, and results tables requires upfront planning to identify which information should be recorded. Although it is possible to create tables ad hoc, a fundamental part of the ARDM is to generalize and remove redundancies rather than creating a multitude of fit-for-purpose solutions. Hence, creating a successful ARDM requires understanding the clinical development pipeline to effectively plan the analysis, taking into account the downstream applications of the results (e.g., the analysis standard or the data products). As the information stored in the results tables is dictated by the data model, it is possible to inspect the results by querying the database and creating visualizations. In the public repository 15, we showcase how to query the database and create different products from the results. Furthermore, the modular nature of the ARDM separates the results rendering from the downstream outputs; hence, updates to the data products do not affect the results.
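A minimal sketch of this up-front table creation is shown below; the table and column definitions are illustrative assumptions rather than the exact schema from the authors' repository.

```r
# Sketch: creating metadata, analysis-standard, and results tables up front.
# Names and columns are illustrative, not the published schema.
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), "ardm.sqlite")

dbExecute(con, "
  CREATE TABLE IF NOT EXISTS metadata_variables (
    dataset  TEXT NOT NULL,
    variable TEXT NOT NULL,
    type     TEXT NOT NULL,  -- e.g., 'categorical' or 'continuous'
    unit     TEXT            -- e.g., 'years' for age
  )")

dbExecute(con, "
  CREATE TABLE IF NOT EXISTS analysis_standards (
    standard_name TEXT NOT NULL,  -- e.g., 'descriptive', 'safety', 'survival'
    function_call TEXT NOT NULL,
    parameters    TEXT
  )")

dbExecute(con, "
  CREATE TABLE IF NOT EXISTS results_survival (
    study_id  TEXT NOT NULL,
    stratum   TEXT NOT NULL,  -- treatment group
    time      REAL NOT NULL,
    estimate  REAL NOT NULL,  -- survival probability
    conf_low  REAL,
    conf_high REAL,
    n_risk    INTEGER,
    n_event   INTEGER
  )")

dbDisconnect(con)
```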

Applications

Analysis standards are a fundamental part of the ARDM to guarantee coherent and suitable outputs. They ensure that the results are comparable, which is not always the case otherwise. Similarly, where conventions exist (e.g., safety analysis), we can use an ARDM to provide structure to the results storage, thus facilitating access and reusability. In short, it provides a knowledge source of validated analysis results, i.e., a single source of truth. This enables the separation between the analysis and the data products, streamlining the creation of tables or figures for publications, or other products as outlined in Fig. 2.

Tracking, searching, and retrieving outputs is facilitated by having an ARDM as it enables query-based searches. For example, we can search for primary endpoint p-values and point estimates, or for adverse event incidence, for any given trial present in the database. With automation, we can also select cohorts through query-based searches and apply the analysis standards to automate the creation of results using the selected data. This also facilitates decision-making and enhancements. For example, one can access complete trial results beyond the primary endpoint and extrapolate to cohorts that require special considerations, such as pediatric patients. In addition, a single source of truth for results encourages the adoption of more sophisticated approaches to gather new inferences, for example, using knowledge graphs and network analysis.
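The sketch below illustrates such query-based searches against a results database; the table and column names are assumptions made for illustration.

```r
# Sketch: query-based retrieval of stored analysis results.
# Table and column names are illustrative assumptions.
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), "ardm.sqlite")

# Primary-endpoint estimates and p-values for one study ...
primary <- dbGetQuery(con, "
  SELECT study_id, endpoint, estimate, conf_low, conf_high, p_value
  FROM   results_inference
  WHERE  study_id = 'CDISCPILOT01' AND endpoint_type = 'primary'")

# ... or the incidence of a given adverse event across all stored studies.
ae <- dbGetQuery(con, "
  SELECT study_id, stratum, adverse_event, n_subjects, n_events
  FROM   results_safety
  WHERE  adverse_event = 'APPLICATION SITE PRURITUS'")

dbDisconnect(con)
```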

Case study: updating a Kaplan–Meier plot

The Kaplan-Meier plot is a common way to visualize the results from a survival or time-to-event analysis. The purpose of the Kaplan-Meier non-parametric method is to estimate the survival probability from observed survival times 16. Note that some patients might not experience the event (e.g., death, relapse); hence, censoring is used to differentiate between the cases and to allow for valid inferences. As a result of the analysis, survival curves are created for the given strata. For the CDISC pilot study, which was conducted in patients with mild to moderate Alzheimer's disease, a time-to-event safety endpoint, the time to dermatologic events, is available. Such time-to-event safety endpoints are not uncommon in practice since they allow understanding potential differences between the treatment groups in the time to onset of the first event. Since the pilot study involved three treatment groups – placebo, low dose, and high dose – it may be a good starting point to plot all groups first. Figure 3A shows a Kaplan-Meier plot with three strata corresponding to the treatments in the CDISC pilot study.
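As a sketch of the survival analysis standard producing results data rather than only a figure, the code below fits Kaplan-Meier curves and tidies the estimates into a data frame; the ADTTE-style column names follow ADaM conventions, but the values are invented for illustration.

```r
# Sketch: Kaplan-Meier estimation whose output is results data, not just a plot.
# The ADTTE-like data are invented; AVAL = time, CNSR = 1 means censored.
library(survival)
library(broom)

adtte <- data.frame(
  AVAL = c(30, 45, 60, 12, 80, 95, 20, 50, 70),
  CNSR = c(0, 1, 0, 0, 1, 1, 0, 0, 1),
  TRTP = rep(c("Placebo", "Low Dose", "High Dose"), each = 3)
)

fit <- survfit(Surv(AVAL, CNSR == 0) ~ TRTP, data = adtte)

# One row per time point and stratum: time, n.risk, n.event, estimate,
# conf.low, conf.high, strata - ready to be written to a results table.
km_results <- tidy(fit)
head(km_results)
```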

Figure 3: The Kaplan-Meier plot corresponds to a data product from a survival analysis (A). In contrast, the data from the analysis are stored in a machine-readable format (B), allowing for updates to the Kaplan-Meier plot and for use in downstream analyses.

In the showcased scenario we assume to have access to the clinical data; however, this might not be the case. Data protection is an important aspect of any research area. While data protection regulations have provided a way to share data and in return improve the reproducibility of experiments, in clinical research, sharing sensitive subject-specific data is impractical or simply not possible for legal reasons. Another option is to share only aggregated data or the analysis results. While this option can still bring privacy issues, for example due to the presence of outliers, results are already widely shared in publications through visualizations like the ones shown in Figs. 3 and 4. For Kaplan-Meier plots, this has led to numerous approaches 17,18,19,20 for extracting or retrieving the underlying results data, since these are often required, e.g., in health technology assessments or when incorporating historical information into current studies (e.g., Roychoudhury and Neuenschwander (2020) 21). In contrast to current practice, having an ARDM in place gives many options on what data to share to support results reusability in a variety of contexts. For example, even regulatory agencies can benefit from the ARDM since outputs such as tables, graphics, and listings can be easily generated from the results without the need to repeat or reproduce analyses. From our experience, it is common to initially share results with a limited audience (e.g., within a team), where details like aesthetics are given little importance. At a later stage, however, researchers need the results to update the visualization for a wider audience, or to use the data for future research. In the Kaplan-Meier plot example, this requires reverse-engineering by using tools to digitize the plot and create machine-readable results.

Figure 4: Employing an analysis results data model enables re-use at the results level rather than requiring source data. In this example, treatment arms can be removed (A), or additional summary statistics, such as the median survival time (B) or a risk table (C), can provide more context without repeating the underlying analysis.

A results visualization can appear in a variety of documents, from presentation slides to an initial report or a final publication; however, it is most likely not accompanied by the results used to create it. This hinders the reuse of the information (i.e., results) in the plot. A frequently encountered situation is illustrated in Fig. 4A, where one stratum is removed and the plot only shows two survival curves, for placebo and the high dose. This is not atypical in drug development, since after a general study overview the focus is often on one dose only. While this update may seem trivial, from our experience this task can require considerable time and effort due to the unavailability of the results. Without an analysis results data model, or a known location where the results from the survival analysis can be found, one must first locate the clinical data to perform the same analysis again. Second, one must search for and find the analysis code and the instructions to create the Kaplan-Meier plot, and possibly repeat the analysis entirely. Third, it is advisable to confirm whether the new plot matches the one we want to update; this is especially important if the analysis had to be redone, as data transformations might have happened (e.g., different censoring than originally planned). Finally, one can filter the strata and create the plot in Fig. 4A.
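With an ARDM in place, the update reduces to a query and a re-draw; the sketch below assumes the survival estimates were stored in a results table as outlined earlier, with illustrative table and column names.

```r
# Sketch: updating the Kaplan-Meier figure from stored results alone.
# No access to patient-level data and no re-analysis is needed.
library(DBI)
library(RSQLite)
library(ggplot2)

con <- dbConnect(RSQLite::SQLite(), "ardm.sqlite")
km <- dbGetQuery(con, "
  SELECT stratum, time, estimate
  FROM   results_survival
  WHERE  study_id = 'CDISCPILOT01'
    AND  stratum IN ('Placebo', 'High Dose')  -- drop the low-dose arm")
dbDisconnect(con)

ggplot(km, aes(x = time, y = estimate, colour = stratum)) +
  geom_step() +
  labs(x = "Time to first dermatologic event (days)",
       y = "Survival probability", colour = "Treatment")
```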

The analysis results data model

To create an analysis results data model, the first step requires thinking of the results of the analysis as data itself. Through this abstraction, we can begin organizing the data in a common model linking (e.g., clinical) datasets with the analysis results. Before we further introduce the ARDM, it is necessary to clarify what an analysis and analysis results entail. An analysis is formally defined as a “detailed examination of the elements or structure of something” 22. In practice, it is a collection of steps to inspect and understand data, explore a hypothesis, and generate results, inferences, and possibly predictions. Analyses are fluid and can change depending on the conclusions drawn after each one of the steps. Nonetheless, routine analyses promote conventions that we can use as a foundation to create analysis standards. For example, looking at the table of contents of a Clinical Study Report (CSR) we can see a collection of routine results summaries. Diving deeper into these sections, we can see the same or similar analysis results between CSRs of independent clinical studies, largely due to conventions 1. For example, it is standard for a clinical trial to report the demographics and baseline characteristics of the study population, and a summary of adverse events. These data summaries may also be a collection of separate data analyses grouped together in tables or figures (i.e., descriptive statistics of various baseline measurements, or the incidence rates of common adverse drug reactions, by assigned treatment). Also, the same statistics, such as the number of patients assigned to a treatment arm, may be repeated throughout the CSR. Complex inferential statistics may also be repeated in various tables and figures. For example, key outcomes may be grouped together in a standalone summary of a drug's benefit-risk profile. Therefore, without upfront planning, the same statistics may be implemented many times in separate code.

Figure 5: The analysis standard follows a grammar to define the steps in the analysis. Similarly, Wilkinson's 23 grammar of graphics (GoG) concisely defines the components required to produce a graphic.

The analysis results are the outcome of the analysis and are typically rendered into tables, figures, and listings to facilitate the presentation to stakeholders. Some examples of applications that can reuse the same results are presented in Fig. 2 (right). Before the rendering, the results are stored in intermediate formats such as data frames or datasets. We can use this to our advantage and capture the results for later use in research by defining which elements to store and the respective constraints. This supports planning the analyses and the potential applications for the results, minimizing ill-considered applications. An analysis results data model can be used to formalize the result elements to store and their constraints, with the additional benefit of making the relationships between the results explicit. For example, we can store intermediate results, generated after the initial analysis steps, and use them to achieve the final analysis results. Besides improving the reusability of results and the reproducibility of the analysis, establishing relationships enables retracing the analysis steps and promotes transparency.

Data standards are useful to integrate and represent data correctly by specifying formats, units, and fields, among others. Due to the many requirements in clinical development, guidelines detailing how to implement a data standard are also frequent and essential to ensure the standard is correctly implemented and to describe the fundamental principles that apply to all data. An analysis standard would thus define the inputs and outputs of the analysis as well as the steps necessary to achieve those outputs. While an analysis convention follows a general set of context-dependent analysis steps, a standard ensures the analysis steps are inclusive (i.e., independent of context), consistent, and uniform, where each step is specified through a grammar 23,24,25 or the querying syntax used in database systems. In Fig. 5, we compare the concepts behind an analysis standard with Wilkinson's grammar of graphics (GoG) data flow. Both follow an immutable order, ensuring that previous steps must be fulfilled to achieve the end result. For example, any data transformation needs to occur before we apply a formula (e.g., compute the descriptive statistics); otherwise, the result of the analysis becomes dubious. The collection of steps forms a grammar; however, each step also offers choices. For example, apply formula can refer to a linear model or a Cox model. Wilkinson refers to this characteristic as the system's richness by means of “paths” constructed by choosing different designs, scales, statistical methods, geometries, coordinate systems, and aesthetics. In the context of the ARDM, analysis standards support pre-planning, compelling the researcher to iterate over the potential analysis routes and the underlying question the analysis should address. In general, it is good practice to write down the details of an analysis, for example using a SAP, with sufficient granularity that the analysis could be reproduced independently if only the source data were available. Thus, the analysis standards would translate the intent expressed in the SAP into clear and well-defined steps.
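A minimal sketch of what such a standard could look like in code is given below: the standard is an ordered set of steps (transform the data, apply a formula, collect the results) that must run in a fixed order. The step names and implementation are ours, for illustration only.

```r
# Sketch: an analysis standard as an ordered grammar of steps.
# Step names mirror the text; the implementation is illustrative.
descriptive_standard <- list(
  transform_data  = function(data, by, var) split(data[[var]], data[[by]]),
  apply_formula   = function(groups) lapply(groups, function(x)
                      c(n = length(x), mean = mean(x), sd = sd(x))),
  collect_results = function(stats) do.call(rbind, lapply(names(stats),
                      function(g) data.frame(group = g, t(stats[[g]]))))
)

run_standard <- function(standard, data, ...) {
  # The steps must be executed in this fixed order.
  out <- standard$transform_data(data, ...)
  out <- standard$apply_formula(out)
  standard$collect_results(out)
}

run_standard(descriptive_standard, mtcars, by = "cyl", var = "mpg")
```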

Analysis standards bring immediate benefits to analysis data quality 26,27 as they enable the validation of software and methods. With software validity, we refer to whether a piece of software does what is expected of it and whether it clearly states how the output was reached. The validation of methods addresses whether an adequate statistical methodology was chosen. Due to its nature, this quality aspect is tightly related to other components of the clinical development process such as the SAP. In clinical development, standard operating procedures already cover many of these steps. However, they critically do not handle analysis results as a data source. Combining a data model with analysis standards would benefit clinical practice in four aspects:

  1. Guaranteeing data quality and consistency across a clinical program, essentially creating a single source of truth designed to handle different levels of project abstraction, for example, from a single data analysis to a complete study or a collection of studies.

  2. Reusability, by providing standardization across therapeutic areas and instigating the development of tools using the results instead of requiring individual patient data (e.g., interactive apps).

  3. Simplicity, as the analysis standard would encourage upfront planning and identify the necessary inputs, steps, and outputs to keep (e.g., reducing the complexity of forest plots and benefit-risk summary graphs).

  4. Efficiency, by avoiding the manual and recurrent repetition of the analysis, and leveraging modularization and standardization of inferential statistics.

Analysis results datasets have previously been put forward as a solution to improve the uptake of graphics within Novartis, under the banner of graph-ready datasets 28. Experienced study team leads have often implemented them for efficiency gains, especially around analysis outputs that reuse existing summary statistics, for example, to support benefit-risk graphs where outcomes may come from different domains. Our experience has also revealed an element of institutional inertia. Standardizing analyses and results requires upfront planning, which is often seen as added effort. However, teams that have gone through the steps of setting up a data model and a lightweight analysis process have found efficiency and quality gains in reusing and maintaining code, as well as in verifying and validating results. Regarding inferential results, instead of using results documents or repeating an analysis, we can simply access a common database where these are stored. An ARDM also simplifies modifications to the analysis (and consequently the results). With current practice, these changes might impact one function, program, or script in the best case, or multiple programs or scripts in the worst case. Using an ARDM only requires changes to one program, as these changes can automatically propagate to any downstream analyses. The validation is also simplified as we transition from comparing data products (e.g., RTF files and plots) to comparing datasets directly. Additionally, this brings clarity and transparency, and is suitable for automation.

Six guiding principles

To create the ARDM, we follow a collection of principles addressing the obstacles commonly faced during the clinical research process but also present in other areas. These principles are highlighted in Table 1 and broadly put forward improvements to quality, accessibility, efficiency, and reproducibility. On top of providing a data management solution, the ARDM compels us to take a holistic view of the clinical research process, from the initial data capture to the potential end applications. With this view, we have a clearer picture of where deficiencies occur and of their impact on the process.

The “searchable” principle refers to the easy retrieval of information by guaranteeing storage in a known, consistent, and technically sound way. As we previously highlighted, it is common to have vast collections of results with very limited searchability, for example, figures spread across a collection of PDF documents. A practical solution is to have a data model to store the information consistently. In turn, this supports using a database that is by default more searchable than the PDF documents. With “searchable” in place, one can apply the “interoperable”, “nonredundant”, and “reusable and extensible” principles. In practice, this includes the use of consistent field names to store data in the database (e.g., the column “mean” has the mean value stored as a numeric value). The resulting coherent database is system-agnostic and can be queried through a variety of tools such as APIs. Thus, the data storing process supports straightforward querying, which in turn can be used to avoid storing redundant results. Overall, this facilitates the use of the stored (results) data for primary analysis (i.e., submission to regulators) and secondary purposes (e.g., meta-analysis), but also allows for extensions of the data model provided the current model constraints are respected. The “separation of concerns” principle refers to having the analysis (i.e., analysis code) separated from the source data, the results (e.g., from a survival analysis, as shown in Fig. 3B), and the data products (e.g., the Kaplan-Meier plot in Fig. 3A). Finally, the “community-driven” principle ensures that the ARDM can be used pervasively, for instance, such that locations for tracking and finding results are not just multiplied across organizations but are community-developed and ideally lead to a single, widely accepted resource that can be searched, as pioneered by the EMBL GWAS Catalog.
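As a small illustration of the “nonredundant” principle, the sketch below queries the database before writing so that a result is stored only once; the table and column names are assumptions made for illustration.

```r
# Sketch: query before insert so the same result is never stored twice.
# Table and column names are illustrative assumptions.
library(DBI)
library(RSQLite)

store_result <- function(con, result) {
  n_existing <- dbGetQuery(con,
    "SELECT COUNT(*) AS n FROM results_descriptive
     WHERE study_id = ? AND stratum = ? AND variable = ?",
    params = list(result$study_id, result$stratum, result$variable))$n
  if (n_existing == 0) {
    dbAppendTable(con, "results_descriptive", result)
  }
  invisible(n_existing == 0)
}
```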

In many industries where sub-optimal but quick solutions are preferred, technical debt is a growing problem. While some amount of technical debt is inevitable, understanding our processes can point us to where to make progressive updates and improvements. For example, upfront planning using analysis standards would reduce this debt by default, as the starting points are previously verified and validated analyses (i.e., analysis standards). In an effort to continue reducing the debt, the ARDM's separation-of-concerns principle streamlines changes and updates to processes, since the analysis, results, and products are separate entities. Standardizing how to store results enables the use of different programming languages to perform analyses with traditionally non-comparable output formats (e.g., SAS and R). Furthermore, we believe the ARDM should grow organically and be community-driven, supporting consensus building and cross-organization access.

The ARDM provides a solution to handle analysis results as data by creating a single source of truth. To guarantee the accuracy of the source, it leverages analysis standards (i.e., validated analyses) with known outputs, which are then organized in a database following the proposed data model. The use of analysis standards supports the pre-planning of analyses, compelling the researcher to iterate on the best approach for analyzing the data, and potentially to decide to use pre-existing and appropriate analysis standards. Considering the ARDM from the biomedical data lifecycle view (e.g., through the lens of the Harvard Medical School's Biomedical Data Lifecycle), the ARDM touches the documentation & metadata, analysis-ready datasets, data repositories, data sharing, and reproducibility stages. However, we take the point of view of a clinical researcher (both data consumer and producer) who sees the recurring problem of having to extract results data from published work. Therefore, in the context of the clinical trial lifecycle 2, extending CDISC with the ARDM would touch on all of the biomedical data lifecycle phases, as the ARDM relies on details present in supporting documents like the statistical analysis plan and data specifications.

The concept of creating standards through a common data model is recognised as good data management and stewardship practice. A few examples include the Observational Medical Outcomes Partnership data model, designed to standardize the structure and content of observational data 29, and the Large-scale Evidence Generation and Evaluation across a Network of Databases (LEGEND) research initiative to generate and store evidence from observational data 30. The data model created by the Sentinel initiative, led by the Food and Drug Administration (FDA), is tailored to organize medical billing information and electronic health records from a network of health care organizations. Similarly, the National Patient-Centered Clinical Research Network also established a standard to organize the data collected from its network of partners. Finally, expanding the search to translational medicine, the Informatics for Integrating Biology and the Bedside (i2b2) initiative introduced a standard to organize electronic medical records and clinical research data 31.

Alongside data models, standard processes have been established to generate analysis results, such as the requirement to document analyses in SAPs 32, including all data transformations from the source data to analysis-ready data sets. However, analyses can be complex and dependent on technical factors, such as the statistical software used, as well as undocumented analysis choices throughout the pipeline, from source data to result. Even less complex routine analyses are error-prone and might not be clearly reproducible. Altogether, this process is time- and resource-consuming. A proposed solution is to perform the analysis automatically. With this in mind and targeting clinical development, Brix et al. 33 introduced ODM Data Analysis, a tool to automatically validate, monitor, and generate descriptive statistics from clinical data stored in the CDISC Operational Data Model format. The FDA's Sentinel Initiative is also capable of generating descriptive summaries and performing specific analyses leveraging the proprietary Sentinel Routine Querying System.

Following this direction, the natural progression would be to create a standard suited for storing analysis results. Such an idea is implemented in the genome-wide association studies (GWAS) catalogue, where curators assess the GWAS literature, extract data, and store it following a standard that includes the summary statistics. Taking a step in this direction, CDISC began the 360 initiative to support the implementation of standards as linked metadata in an attempt to improve efficiency, consistency, and reusability across clinical research. Nonetheless, the irreproducibility of research results remains an obstacle in clinical research and has prompted calls for global data standardization to enable semantic interoperability and adherence to the FAIR principles 34. In our view, analysis standards and the ARDM are an important contribution to this initiative.

An important aspect which we did not explicitly discuss is the quality of the (raw/source) data, which will ultimately serve as the source of any analyses for which results datasets are created through the ARDM. While the ARDM can be seen as a concept naturally tied to the CDISC philosophy, which is most prominently used in drug development studies conducted in a highly regulated environment with rigorous data quality standards, its applicability goes far beyond. For example, analyses conducted on open health data could also benefit from the ARDM, which would help to simplify traceability, exchangeability, and reproducibility of analysis results. However, when working with these kinds of data, understanding the quality of the underlying raw data is of paramount importance, in particular since the ARDM will make analysis results more easily accessible and reusable also to an audience who may only have a limited understanding of how to assess the quality of the underlying raw data. In this wider context, it may be beneficial to use data quality evaluation approaches that were developed for a non-technical audience or for an audience without subject-matter (domain) expertise 35. This will allow the audience to interpret the results taking the quality of the underlying raw data into account.

Utilizing the proposed ARDM has a set of requirements. Firstly, the provided clinical data must follow a consistent standard (i.e., CDISC ADaM). Our solution involves automatically populating a database; hence, there are expectations regarding the structure of the data. Similarly, data standards are necessary to enable analysis standards. If the analysis input expectations are not met, the analysis is unsuccessful and no results are produced or stored. Further, when a data standard is updated, it is necessary to also update the analysis standards and the ARDM accordingly. Another limitation is the necessity of analysis standards: without quality analysis standards, the quality of the source of truth is not guaranteed. Creating analysis standards requires a good understanding of the analysis to correctly define the underlying grammar and identify relevant decision options for the user (e.g., filter data before modeling). The third limitation concerns the applications. At the moment, the ARDM stores and organizes results in a way suitable for reuse in known applications (e.g., creating plots and tables, and requesting individual result values). As future applications are unknown, the data model might not store all the information needed. However, given the ARDM's modular approach, it is only necessary to update the result information to be kept rather than updating the entire workflow. Another limitation refers to the supported data modalities. The proposed ARDM is implemented on tabular clinical trial data. However, it is possible to adapt the ARDM and its design choices (e.g., type of database) to support diverse data. For example, the summary statistics present in the genome-wide association studies (GWAS) catalog could be stored following an ARDM.

The current option to share and access clinical trial results is ClinicalTrials.gov. Nonetheless, this is a repository and does not permit querying results, as these are not stored as data (i.e., a machine-readable data frame). The ARDM is an attempt to bring forward the problem of reproducibility and the lack of a single source of truth for analysis results. With it, we call for a paradigm shift where the target for the data analysis becomes the data model. Nonetheless, we understand the ARDM's limitations and view it as one solution to a complex problem. We believe the best way to understand how the ARDM should evolve, or to shape it into a better solution, is to hear the opinions of the community. Hence, our underlying objective is to get the community's attention, discover similar initiatives, and converge on how to move forward in establishing analysis results as a data source to support future reusability and knowledge discovery.

Data availability

The CDISC Pilot Project ADaM ADSL, ADTTE, and ADAE datasets were used to support the implementation of the analysis results data model. This data can be found at the PHUSE scripts repository ( https://github.com/phuse-org/phuse-scripts/blob/fa55614d7d178a193cc9b6e74256ea2d8dcf5d80/data/adam/TDF_ADaM_v1.0.zip ) and at the repository supporting this manuscript 15 .

Code availability

The implementation of the analysis results data model is available on GitHub 15. This repository exemplifies how to construct the data model and the respective schema, and shows how to query the underlying database. Furthermore, we provide three output examples to visualize the results.

References

1. European Medicines Agency. ICH Topic E 3 - Structure and Content of Clinical Study Reports. https://www.ema.europa.eu/en/documents/scientific-guideline/ich-e-3-structure-content-clinical-study-reports-step-5_en.pdf (1996).
2. Committee on Strategies for Responsible Sharing of Clinical Trial Data, Board on Health Sciences Policy & Institute of Medicine. Sharing Clinical Trial Data (National Academies Press, Washington, D.C., 2015).
3. Maciocci, G., Aufreiter, M. & Bentley, N. Introducing eLife's first computationally reproducible article. https://elifesciences.org/labs/ad58f08d/introducing-elife-s-first-computationally-reproducible-article (2019).
4. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/ (2021).
5. Robinson, D., Hayes, A. & Couch, S. broom: Convert Statistical Objects into Tidy Tibbles. https://CRAN.R-project.org/package=broom. R package version 0.7.6 (2021).
6. Siebert, M. et al. Data-sharing and re-analysis for main studies assessed by the European Medicines Agency - a cross-sectional study on European public assessment reports. BMC Medicine 20, 1–14 (2022).
7. Gelman, A. & Loken, E. The garden of forking paths: Why multiple comparisons can be a problem, even when there is no "fishing expedition" or "p-hacking" and the research hypothesis was posited ahead of time. Dep. Stat. Columbia Univ. 348 (2013).
8. Wicherts, J. M. et al. Degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking. Front. Psychol. 1832 (2016).
9. Devezer, B., Navarro, D. J., Vandekerckhove, J. & Ozge Buzbas, E. The case for formal methodology in scientific reform. Royal Soc. Open Science 8, 200805 (2020).
10. Higgins, J. P. et al. Cochrane Handbook for Systematic Reviews of Interventions (John Wiley & Sons, 2019).
11. Tendal, B. et al. Disagreements in meta-analyses using outcomes measured on continuous or rating scales: observer agreement study. BMJ 339 (2009).
12. Wilkinson, M. D. et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 1–9 (2016).
13. Huser, V., Sastry, C., Breymaier, M., Idriss, A. & Cimino, J. J. Standardizing data exchange for clinical research protocols and case report forms: An assessment of the suitability of the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM). J. Biomedical Informatics 57, 88–99 (2015).
14. European Medicines Agency. European Medicines Regulatory Network Data Standardisation Strategy. https://www.ema.europa.eu/en/documents/other/european-medicines-regulatory-network-data-standardisation-strategy_en.pdf (2021).
15. Barros, J. M., Widmer, L. A. & Baillie, M. Analysis Results Data Model. Zenodo https://doi.org/10.5281/zenodo.7163032 (2022).
16. Kaplan, E. L. & Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53, 457–481 (1958).
17. Guyot, P., Ades, A., Ouwens, M. J. & Welton, N. J. Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves. BMC Medical Research Methodology 12, 1–13 (2012).
18. Liu, Z., Rich, B. & Hanley, J. A. Recovering the raw data behind a non-parametric survival curve. Syst. Reviews 3, 1–10 (2014).
19. Liu, N., Zhou, Y. & Lee, J. J. IPDfromKM: reconstruct individual patient data from published Kaplan-Meier survival curves. BMC Med. Res. Methodol. 21, 1–22 (2021).
20. Rogula, B., Lozano-Ortega, G. & Johnston, K. M. A method for reconstructing individual patient data from Kaplan-Meier survival curves that incorporate marked censoring times. MDM Policy & Pract. 7 (2022).
21. Roychoudhury, S. & Neuenschwander, B. Bayesian leveraging of historical control data for a clinical trial with time-to-event endpoint. Stat. Medicine 39, 984–995 (2020).
22. Cambridge University Press. Analysis. In Cambridge Academic Content Dictionary. https://dictionary.cambridge.org/dictionary/english/analysis (Cambridge University Press, 2021).
23. Wilkinson, L. The grammar of graphics. In Handbook of Computational Statistics, 375–414 (Springer, 2012).
24. Wickham, H. Tidy data. J. Stat. Softw. 59, 1–23 (2014).
25. Lee, S., Cook, D. & Lawrence, M. Plyranges: A grammar of genomic data transformation. Genome Biology 20, 1–10 (2019).
26. PhUSE Standard Analysis and Code Sharing Working Group. Best Practices for Quality Control and Validation. https://phuse.s3.eu-central-1.amazonaws.com/Deliverables/Standard+Analyses+and+Code+Sharing/Best+Practices+for+Quality+Control+%26+Validation.pdf (2020).
27. European Medicines Agency. ICH Topic E 6 - Guideline for Good Clinical Practice (R2). https://www.ema.europa.eu/en/documents/scientific-guideline/ich-e-6-r2-guideline-good-clinical-practice-step-5_en.pdf (2015).
28. Vandemeulebroecke, M. et al. How can we make better graphs? An initiative to increase the graphical expertise and productivity of quantitative scientists. Pharm. Stat. 18, 106–114 (2019).
29. Observational Medical Outcomes Partnership. OMOP Common Data Model. https://ohdsi.github.io/CommonDataModel/ (2021).
30. Schuemie, M. J. et al. Principles of large-scale evidence generation and evaluation across a network of databases (LEGEND). J. Am. Med. Informatics Assoc. 27, 1331–1337 (2020).
31. Murphy, S. N. et al. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J. Am. Med. Informatics Assoc. 17, 124–130 (2010).
32. Gamble, C. et al. Guidelines for the content of statistical analysis plans in clinical trials. JAMA 318, 2337–2343 (2017).
33. Brix, T. J. et al. ODM data analysis - a tool for the automatic validation, monitoring and generation of generic descriptive statistics of patient data. PLoS ONE 13, e0199242 (2018).
34. Jauregui, B. et al. The turning point for clinical research: Global data standardization. J. Appl. Clin. Trials (2019).
35. Nikiforova, A. Analysis of open health data quality using data object-driven approach to data quality evaluation: insights from a Latvian context. In IADIS International Conference e-Health, 119–126 (2019).
36. Van Reusel, P. CDISC 360: What's in It for Me? www.cdisc.org/sites/default/files/2021-10/CDISC_360_2021_EU_Interchange.pdf (2021).


Acknowledgements

We thank Carlotta Caroli, Nicholas Kelley, and Shahram Ebadollahi for their role in establishing and stewarding the AI4Life residency program. We also want to acknowledge Janice Branson for her valuable comments and support in this journey. Finally, J.M.B. would like to thank Idorsia Pharmaceuticals for the support during the final submission.

Author information

Joana M. Barros: this work took place and was submitted while the author was at Novartis.

Authors and Affiliations

Analytics, Novartis Pharma AG, Basel, Switzerland

Joana M. Barros, Lukas A. Widmer, Mark Baillie & Simon Wandel

Department of Biometry, Idorsia Pharmaceuticals, Allschwil, Switzerland

Joana M. Barros


Contributions

All authors conceived of and contributed to the design of the approach. J.M.B., L.A.W. and S.W. supervised the project. J.M.B. developed the data model and analysis standards. M.B. and L.A.W. reviewed the methodology. All authors read, edited, and approved the manuscript.

Corresponding authors

Correspondence to Joana M. Barros or Mark Baillie .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Cite this article.

Barros, J.M., Widmer, L.A., Baillie, M. et al. Rethinking clinical study data: why we should respect analysis results as data. Sci Data 9 , 686 (2022). https://doi.org/10.1038/s41597-022-01789-2

Download citation

Received : 25 April 2022

Accepted : 18 October 2022

Published : 10 November 2022

DOI : https://doi.org/10.1038/s41597-022-01789-2





Research Data Management Resources


Data Privacy and Protection of Human Subjects


One of the laws that cover human subject data, particularly medical data, is the Health Insurance Portability and Accountability Act (HIPAA), which has requirements for the privacy and security of personally identifiable information.

  • UMass Chan HIPAA Compliance Web Page Information about HIPAA regulations, including de-identification certification and authorization to disclose PHI for research forms.
  • Health Information Privacy Department of Health and Human Services resources on the Health Insurance Portability and Accountability Act (HIPAA).

Institutional Review Board

All UMass Chan Morningside Graduate School of Biomedical Sciences Basic Science students and postdocs must take CITI Training .


  • UMass Chan Institutional Review Board (IRB) The IRB provides approval, guidance, and standards for human subjects research, and maintains a record of all human subject research conducted at UMass Chan.
  • Office for Human Research Protections Policy and advocacy for subjects involved in Department of Health and Human Services research.

Data De-identification

The Safe Harbor method for de-identification requires that all of the following personal identifiers be removed for the data to qualify as de-identified:  

  • Geographic subdivisions smaller than a state (except for 3-digit zip codes where the population is greater than 20,000) 
  • Dates other than year (except birth years that reveal an age of 90 or older, which must be aggregated so as to reveal only that the individual is age 90 or over)
  • Names of relatives and employers
  • Telephone and fax numbers
  • E-mail addresses
  • Social security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate/license numbers
  • Vehicle or other device serial numbers
  • Internet protocol (IP) addresses
  • Finger or voice prints
  • Photographic images
  • and any other unique identifying number, characteristic, or code
  • HHS Deidentification Guidance Describes protected health information and provides guidance on the Expert Determination and Safe Harbor de-identification methods. From the US Department of Health and Human Services.
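For tabular data, part of the Safe Harbor list above can be approximated programmatically, as in the minimal pandas sketch below. The column names are hypothetical, and such a script is illustrative only; it does not replace expert review, NLM-Scrubber, or institutional de-identification procedures.

```python
import pandas as pd

# Illustrative Safe Harbor-style cleanup for a hypothetical patient table.
direct_identifiers = ["name", "phone", "email", "ssn", "mrn", "ip_address"]  # assumed column names

def deidentify(df: pd.DataFrame) -> pd.DataFrame:
    out = df.drop(columns=[c for c in direct_identifiers if c in df.columns])
    if "birth_date" in out:
        age = pd.Timestamp.today().year - pd.to_datetime(out["birth_date"]).dt.year
        out["age"] = age.clip(upper=90)                # ages 90 and over are aggregated
        out = out.drop(columns=["birth_date"])         # retain year-level information only
    if "zip" in out:
        out["zip3"] = out["zip"].astype(str).str[:3]   # 3-digit ZIP (population checks still required)
        out = out.drop(columns=["zip"])
    return out
```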

Resources from the UK Data Service:

  • How to anonymize quantitative data
  • How to anonymize qualitative data

De-identification tool:

  • NLM-Scrubber Freely available clinical text deidentification tool which uses an automated Safe Harbor process to deidentify the data. Review to ensure deidentification is complete is still required.
  • Research Compliance at UMass Chan Within the Office of Research, this group oversees and maintains UMass Chan standards for research.
  • Office of Research Integrity Federal agency that oversees Public Health Services research integrity.

Actions that undermine data integrity include data fabrication, falsification, and misattribution. For example, many journals, such as the Journal of Cell Biology, have strict editorial policies regarding images and image manipulation that, if not followed, result in the rejection and/or retraction of papers.

  • Retraction Watch Database User guide for the Retraction Watch Database where you can filter the retracted articles by reason including concerns with data.

Other research ethics standards which impact research data at UMass Chan

  • Morningside Graduate School of Biomedical Sciences Research Ethics Certification requirements for Morningside Graduate School of Biomedical Sciences students.
  • UMass Chan Institutional Animal Care and Use Committee (IACUC) UMass Chan intranet site of the department responsible for the ethical treatment of laboratory animals.

U.S. Department of Health and Human Services, The Office of Research Integrity (ORI)

The ethical aspects of data are many. Examples include defining ownership of data, obtaining consent to collect and share data, protecting the identity of human subjects and their personal identifying information, and the licensing of data.  Below are several ethics cases from Responsible Conduct of Research Casebook: Data Acquisition and Management a publication from the Office of Research Integrity at the U.S. Department of Health and Human Services.

There are generally four matters of data acquisition and management that need to be addressed at the outset of a study: (1) collection, (2) storage; (3) ownership, and 4) sharing. These cases and role play present common scenarios that occur at various stages of data acquisition and management. Issues discussed include acquiring sensitive data, sharing data with colleagues, and managing data collection processes.

Case One:  A researcher wants to sequence the genomes of children with cancer, eventually making them publicly available online, but encounters issues with adequate data protection and parental consent.

Case Two:  After working with her advisor to develop a sophisticated database, the postdoc wants access to the database in order to submit a grant proposal but runs into trouble when seeking the advisor’s permission.

Case Three:  A post-doc has a novel idea after observing a procedure during residency, but he needs access to a large amount of clinical data, including medical record numbers, so that he can eventually recruit individuals to participate in his research.

Role Play:  An assistant professor places her data on the NIH’s database of genotypes and phenotypes (dbGaP) only to find that a leading researcher has published a paper using the data shared in the NIH database before the one-year embargo period was up.

From the U.S. Office of Research Integrity's   RCR Casebook Stories about Researchers Worth Sharing edited by James M. DuBois


Small data challenges for intelligent prognostics and health management: a review

  • Open access
  • Published: 23 July 2024
  • Volume 57, article number 214 (2024)


  • Chuanjiang Li 1 ,
  • Shaobo Li 1 ,
  • Yixiong Feng 1 ,
  • Konstantinos Gryllias 2 ,
  • Fengshou Gu 3 &
  • Michael Pecht 4  


Prognostics and health management (PHM) is critical for enhancing equipment reliability and reducing maintenance costs, and research on intelligent PHM has made significant progress driven by big data and deep learning techniques in recent years. However, complex working conditions and high-cost data collection inherent in real-world scenarios pose small-data challenges for the application of these methods. Given the urgent need for data-efficient PHM techniques in academia and industry, this paper aims to explore the fundamental concepts, ongoing research, and future trajectories of small data challenges in the PHM domain. This survey first elucidates the definition, causes, and impacts of small data on PHM tasks, and then analyzes the current mainstream approaches to solving small data problems, including data augmentation, transfer learning, and few-shot learning techniques, each of which has its advantages and disadvantages. In addition, this survey summarizes benchmark datasets and experimental paradigms to facilitate fair evaluations of diverse methodologies under small data conditions. Finally, some promising directions are pointed out to inspire future research.


1 Introduction

Prognostics and health management (PHM), an increasingly important framework for realizing condition awareness and intelligent maintenance of mechanical equipment by analyzing collected monitoring data, is being applied in a growing spectrum of industries, such as aerospace (Randall 2021 ), transportation (Li et al. 2023a ), and wind turbines (Han et al. 2023 ). According to a survey conducted by the National Science Foundation (NSF) (Gray et al. 2012 ), PHM technologies have created economic benefits of $855 million over the past decade. It is the fact that PHM has such great application potential that it continues to attract sustained attention and research from different academic communities, including but not limited to reliability analysis, mechanical engineering, and computer science.

Functionally, PHM covers the entire monitoring lifecycle of a piece of equipment, fulfilling roles across four key dimensions: anomaly detection (AD), fault diagnosis (FD), remaining useful life (RUL) prediction, and maintenance execution (ME) (Zio 2022). First, AD aims to discern rare events that deviate significantly from standard patterns, and the crux lies in accurately differentiating a handful of anomalous data from an extensive volume of normal data (Li et al. 2022a). The focus of FD is to classify diverse faults, and the difficulty is to extract effective fault features under complex working conditions. RUL prediction focuses on estimating the time remaining before a component or system fails, and its main challenge is to construct comprehensive health indicators capable of characterizing trends in health degradation. Finally, ME optimizes maintenance decisions based on diagnostic and prognostic results (Lee and Mitici 2023).

Methodologically, the techniques employed to execute the PHM tasks of AD, FD, and RUL prediction can be classified into physics model-based, data-driven, and hybrid methods (Lei et al. 2018). Physics model-based methods utilize mathematical models to describe failure mechanisms and signal relationships; representative techniques include state observers (Choi et al. 2020), parameter estimation (Schmid et al. 2020), and some signal processing approaches (Gangsar and Tiwari 2020). In contrast, data-driven methods involve manual or adaptive extraction of features from sensor signals, including statistical methods (Wang et al. 2022), machine learning (ML) (Huang et al. 2021) and deep learning (DL) (Fink et al. 2020). Hybrid approaches (Zhou et al. 2023a) combine elements from both physics model-based and data-driven techniques. Among these methods, DL-based techniques have gained widespread interest in PHM tasks, spanning from AD to ME, which is attributed to their pronounced advantages over conventional techniques in automatic feature extraction and pattern recognition.

Figure 1 depicts the intelligent PHM cycle based on DL models (Omri et al. 2020); the steps include data collection and processing, model construction, feature extraction, task execution, and model deployment. It is evident that monitoring data forms the foundation of this cycle, and its volume and quality wield decisive influence on the eventual performance of DL models in industrial contexts. However, gathering substantial datasets consisting of diverse anomaly and fault patterns with precise labels under different working conditions is time-consuming, dangerous, and costly, leading to small data problems that challenge models’ performance in PHM tasks. A recent investigation conducted by Dimensional Research underscores this quandary, revealing that 96% of companies have encountered small data issues in implementing industrial ML and DL projects (D. Research 2019).

Fig. 1 The intelligent PHM cycle based on DL models (Omri et al. 2020)

To address the small data issues in intelligent PHM, organizations have started to shift their focus from big data to small data to enhance the efficiency and robustness of Artificial Intelligence (AI) models, which is strongly evidenced by the rapid growth of academic publications over recent years. To provide a comprehensive overview, we applied the preferred reporting items for systematic reviews and meta-analyses (PRISMA) (Huang et al. 2024 ; Kumar et al. 2023 ) method for paper investigation and selection. As shown in Fig.  2 , the PRISMA technique includes three steps: Defining the scope, databases, and keywords, screening search results, and identifying articles for analysis. At first, the search scope was limited to articles published in IEEE Xplore, Elsevier, and Web of Science databases from 2018 to 2023, and the keywords consisted of topic terms such as “small/limited/imbalanced/incomplete data”, technical terms such as “data augmentation”, “deep learning”, “transfer learning”, “few-shot learning”, “meta-learning”, and application-specific terminologies such as “intelligent PHM”, “anomaly detection”, “fault diagnosis”, “RUL prediction” etc. The second stage is to search the literature in the databases by looking for articles whose title, abstract and keywords contain the predefined keywords, resulting in 139, 1232, and 281 papers from IEEE Xplore, Elsevier, and Web of Science, respectively. In order to eliminate duplicate literature and select the most relevant literature on small data problems in PHM, the first 100 non-duplicate studies from each database were sorted (producing a sum of 300 papers) according to the inclusion and exclusion criteria as listed in Table  1 . Finally, we further refined the obtained results with thorough review and evaluation, and a total of 201 representative papers were chosen for analysis presented in this survey.

Fig. 2 The procedure for paper investigation and selection using the PRISMA method

Despite the growing number of studies, the statistics highlight that there are few review articles on the topic of small data challenges. The first related review is the report entitled “Small data’s big AI potential”, which was released by the Center for Security and Emerging Technology (CSET) at Georgetown University in September 2021 (Chahal et al. 2021 ), and it emphasized the benefits of small data and introduced some typical approaches. Then, Adadi ( 2021 ) reviewed and discussed four categories of data-efficient algorithms for tackling data-hungry problems in ML. More recently, a study (Cao et al. 2023 ) theoretically analyzed learning on small data, followed an agnostic active sampling theory and reviewed the aspects of generalization, optimization and challenges. Since 2021, scholars in the PHM community have been focusing on the small data problem in intelligent FD and have conducted some review studies. Pan et al. ( 2022 ) reviewed the applications of generative adversarial network (GAN)-based methods, Zhang et al . ( 2022a ) outlined solutions from the perspective of data processing, feature extraction, and fault classification, and Li et al. ( 2022b ) organized a comprehensive survey on transfer learning (TL) covering theoretical foundations, practical applications, and prevailing challenges.

It is worth noting that existing studies provide valuable guidance, but they have yet to delve into the foundational concepts of small data and exhibit certain limitations in their analysis. For instance, some reviews studied small data problems from a macro perspective without considering the application characteristics of PHM tasks (Chahal et al. 2021; Adadi 2021; Cao et al. 2023). Others concentrated solely on particular methodologies used to address small data challenges in FD tasks (Pan et al. 2022; Zhang et al. 2022a; Li et al. 2022b) and lack systematic research on solutions for AD and RUL prediction tasks, which seriously limits the development and industrial application of intelligent PHM. Therefore, an in-depth exploration of the small data challenges in the PHM domain is necessary to provide guidance for the successful application of intelligent models in industry.

This review is a direct response to the contemporary demand for addressing the small data challenges in PHM, and it aims to clarify the following three key questions: (1) What is small data in PHM? (2) Why solve the small data challenges? and (3) How to address small data challenges effectively? These fundamental issues distinguish our work from existing surveys and demonstrate the major contributions:

Small data challenges for intelligent PHM are studied for the first time, and the definition, causes, and impacts are analyzed in detail.

An overview of various state-of-the-art methods for solving small data problems is presented, and the specific issues and remaining challenges for each PHM task are discussed.

The commonly used benchmark datasets and experimental settings are summarized to provide a reference for developing and evaluating data-efficient models in PHM.

Finally, promising directions are indicated to facilitate future research on small data.

Consequently, this paper is organized according to the hierarchical architecture shown in Fig.  3 . Section  2 discusses the definition of small data in the PHM domain and analyzes the corresponding causes and impacts. Section  3 provides a comprehensive overview of representative approaches—including data augmentation (DA) methods (Sect.  3.1 ), transfer learning (TL) methods (Sect.  3.2 ), and few-shot learning (FSL) methods (Sect.  3.3 ). The problems in PHM applications are discussed in Sect.  4 . Section  5 summarizes the datasets and experimental settings for model evaluation. Finally, potential research directions are given in Sect.  6 and the conclusions are drawn in Sect.  7 . In addition, the abbreviations of notations used in this paper are summarized in Table  2 .

Fig. 3 The hierarchical architecture of this review

2 Analysis of small data challenges in PHM

The excellent performance of DL models in executing PHM tasks is intricately tied to the premise of abundant and high-quality labeled data. However, this assumption is unlikely to be satisfied in industry, as small data is often the real situation, which exhibits distinct data distributions and may lead to difficulties in model learning. Therefore, this section first analyzes the definition, causes, and impacts of small data in PHM.

2.1 What is “small data”?

Before answering the question of what small data is, let us first review the related term, “big data”, which has garnered distinct interpretations among scholars since its birth in 2012. Ward and Barker (2013) regarded big data as a phrase that “describes the storage and analysis of large or complex datasets using a series of techniques”. Another perspective, as presented in Suthaharan (2014), focused on the data’s cardinality, continuity, and complexity. Among the various definitions, the one that has been widely accepted is characterized by the “5 V” attributes: volume, variety, value, velocity, and veracity (Jin et al. 2015).

After long-term research, some experts have discovered the fact that big data is not ubiquitous, and the paradigm of small data has emerged as a novel area worthy of thorough investigation in the field of AI (Vapnik 2013; Berman 2013; Baeza-Yates 2024; Kavis 2015). Vapnik (2013) stands among the pioneers in this pursuit, having defined small data as a scenario where “the ratio of the number of training samples to the Vapnik–Chervonenkis (VC) dimensions of a learning machine is less than 20.” Berman (2013) considered small data as being used to solve discrete questions based on limited and structured data that come from one institution. Another study defines small data as “data in a volume and format that makes it accessible, informative and actionable.” (Baeza-Yates 2024). In an industrial context, Kavis (2015) described small data as “The small set of specific attributes produced by the Internet of Things, these are typically a small set of sensor data such as temperature, wind speed, vibration and status”.

Considering the distinctive attributes of equipment signals within industries, a new definition for small data in PHM is given here: Small data refers to datasets consisting of equipment or system status information collected from sensors that are characterized by a limited quantity or quality of samples. Taking the FD task as an example, the corresponding mathematical expression is: Given a dataset \(D = \left\{ {F_{I} (x_{i}^{I} ,y_{i}^{I} )_{i = 1}^{{n_{I} }} } \right\}_{I = 1}^{N}\) , \(\left( {x_{i}^{I} ,y_{i}^{I} } \right)\) are the samples and labels (if any) of the I th fault \(F_{I}\) . \(N\) represents the number of fault classes in \(D,\) and each fault set has a sample size of \(n_{I}\) . Notably, the term “small” carries two connotations: (i) On a quantitative scale, “small” signifies a limited dataset volume, a limited sample size \(n_{I}\) , or a minimal total number of fault types \(N\) ; and (ii) From a qualitative perspective, “small” indicates a scarcity of valuable information within \(D\) due to a substantial proportion of anomalous, missing, unlabeled, or noisy-labeled data in \(\left( {x_{i}^{I} ,y_{i}^{I} } \right)\) . There is no fixed threshold to define “small” concerning both quantity and quality, which is an open question depending on the specific PHM task to be performed, the equipment analyzed, the chosen methodology, and the desired performance. To further understand the meaning of small data, a comprehensive comparison is conducted with big data in Table  3 .

2.2 Causes of small data problems in PHM

Rapid advances in sensors and industrial Internet technology have simplified the process of collecting monitoring data from equipment. However, only large companies currently have the ability to acquire data on a large scale. Since most of the collected data are normal samples with limited abnormal or faulty data, they cannot provide enough information for model training. As illustrated in Fig. 4, four main causes of small data challenges in PHM are analyzed.

Fig. 4 Four main causes of small data challenges in PHM

2.2.1 Heavy investment

When deploying an intelligent PHM system, Return on Investment (ROI) is the top concern of companies. The substantial investment comes from two main aspects, as shown in the first quadrant of Fig. 4: (i) factories need to digitally upgrade existing old equipment to collect monitoring data, and (ii) data labeling and processing requires manual operation and domain expertise. Although the costs of sensors and labeling outsourcing are relatively low today, installing sensors across numerous machines and processing terabytes of data is still beyond the reach of most manufacturers.

2.2.2 Data accessibility restrictions

Illustrated in the second quadrant, this factor is underscored by the following: (i) The sensitivity, security, or privacy of the data often leads to strict access controls; an example is data collection for military equipment. (ii) For data transfers and data sharing, individuals, corporations, and nations need to comply with laws and supervisory ordinances, especially after the release of the General Data Protection Regulation (Zarsky 2016).

2.2.3 Complex working conditions

The contents depicted in the third quadrant of Fig. 4 include: (i) Data distribution within PHM inherently displays significant variability across diverse production tasks, machines and operating conditions (Zhang et al. 2023), making it impossible to collect data under all potential conditions. (ii) Acquiring data within specialized service environments, such as high radiation, carries inherent risks. (iii) The progression of equipment from a healthy state to eventual failure takes a long time, so run-to-failure records accumulate slowly.

2.2.4 Multi-factor coupling

As equipment becomes more intricately integrated, correlation and coupling effects have undergone continuous augmentation. As shown in the fourth quadrant of Fig.  4 : Couplings exist between (i) multiple-components, (ii) multiple-systems, and (iii) diverse processes. Such interactions are commonly characterized by nonlinearity, temporal variability, and uncertain attributes, further increasing the complexity of data acquisition.

2.3 Impacts of small data on PHM tasks

The availability of labeled and high-quality data remains limited, producing some impacts on performing PHM tasks, particularly involving both data and models (Wang et al. 2020a ). As shown on the left side of Fig.  5 , the effects at the data level primarily include incomplete data and unbalanced distribution, which subsequently leads to poor generalization at the model-level. This section analyzes the impacts with corresponding evaluation metrics based on the example of FD task.

Fig. 5 Impacts of small data problems in the PHM domain, with current mainstream approaches

2.3.1 Incomplete data

Data integrity refers to the “breadth, depth, and scope of information contained in the data” (Chen et al. 2023a). However, the obtained small dataset often exhibits a low density of supervised information owing to restricted fault categories or sample size. Further, missing values, missing labels, or outliers in the incomplete data exacerbate the scarcity of valuable information. Data incompleteness in PHM can be measured by the following metrics:

\(I_{D} = n_{D} / N_{D} \quad (1)\)

\(I_{C_{i}} = n_{C_{i}} / N_{C_{i}} \quad (2)\)

where \(I_{D}\) represents the incompleteness of the dataset \(D\), and \(n_{D}\) and \(N_{D}\) are the number of incomplete samples and the total number of samples in \(D\), respectively. Similarly, this metric can also assess the incompleteness of samples within a certain class \(C_{i}\) in line with Eq. (2), where \(n_{C_{i}}\) and \(N_{C_{i}}\) are defined analogously for that class. When either \(I_{D}\) or \(I_{C_{i}}\) approaches 0, it indicates a relatively complete dataset or class. Conversely, a higher value represents a severe degree of data incompleteness, leading to a substantial loss of information within the data.

2.3.2 Imbalanced data distribution

The second impact is an imbalanced data distribution. The fault classes containing higher or lower numbers of samples are called the majority and minority classes, respectively. Depending on whether the imbalance exists between different classes or within a single class, the phenomenon of inter-class imbalance or intra-class imbalance arises accordingly. Considering a dataset with two distinct fault types, each comprising two subclasses, the degrees of inter-class \(IR\) and intra-class \(IR_{C_{i}}\) imbalance can be quantified as (Ren et al. 2023):

\(IR = N_{\text{maj}} / N_{\min} \quad (3)\)

\(IR_{C_{i}} = n_{\text{maj}} / n_{\min} \quad (4)\)

where \(N_{\text{maj}}\) and \(N_{\min}\) represent the sample counts of the majority and minority classes within the dataset, and \(n_{\text{maj}}\) and \(n_{\min}\) signify the respective sample sizes of the two subclasses within class \(C_{i}\). The above values span the interval [1, ∞) to describe the extent of the imbalance. A value of 1 for \(IR\) or \(IR_{C_{i}}\) indicates a balanced inter-class or intra-class case, whereas a value of 50 is typically regarded as a highly imbalanced task by domain experts (Triguero et al. 2015).
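As a concrete illustration (not taken from the cited works), the following minimal Python sketch computes the incompleteness and inter-class imbalance metrics above for a toy fault dataset; the class sizes and missing-value rate are made up, and the intra-class ratio is computed the same way on the subclasses of a single fault class.

```python
import numpy as np

rng = np.random.default_rng(0)
labels = np.array([0] * 480 + [1] * 15 + [2] * 5)   # three fault classes, heavily skewed
is_incomplete = rng.random(labels.size) < 0.10      # flag samples with missing values or labels

I_D = is_incomplete.mean()                          # dataset-level incompleteness ratio
counts = np.bincount(labels)
IR = counts.max() / counts.min()                    # inter-class imbalance ratio
print(f"I_D = {I_D:.2f}, IR = {IR:.0f}")            # here IR = 96, i.e. a highly imbalanced task
```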

2.3.3 Poor model generalization

Technically, the principle of supervised DL is to build a model \(f\) that learns the underlying patterns from a training set \(D_{train}\) and tries to predict the labels of previously unseen test data \(D_{test}\). The empirical error \(E_{emp}\) on the training set and the expected error \(E_{exp}\) on the test set can be derived by calculating the discrepancy between the true labels \(Y\) and the predicted labels \(\hat{Y}\), respectively. The difference between these two errors, i.e., the generalization error \(G(f,D_{train},D_{test})\), is commonly used to measure the generalizability of the trained model on a test set. The generalization error is bounded by the model’s complexity \(h\) and the training data size \(P\) as follows (LeCun et al. 1998):

\(G(f,D_{train},D_{test}) \le k\left( h/P \right)^{\alpha} \quad (5)\)

where \(k\) is a constant and \(\alpha\) is a coefficient with a value range of [0.5, 1.0]. The equation above shows that the parameter \(P\) determines the model’s generalization. When \(P\) is large enough, \(G(f,D_{train},D_{test})\) for a model with a certain \(h\) will converge towards 0. However, small, incomplete, or unbalanced data often result in a larger \(G(f,D_{train},D_{test})\) and poor generalization.

3 Overview of approaches to small data challenges in PHM

This section provides a structured overview of the latest advancements in tackling small data challenges in representative PHM tasks such as AD, FD and RUL prediction. As depicted on the right-hand side of Fig. 5, three main strategies have been extracted from the current literature: DA, TL and FSL. In the upcoming subsections, we delve into the relevant theories and proposed methodologies for each category, followed by a brief summary.

3.1 Data augmentation methods

DA methods provide data-level solutions to address small data issues, and their efficacy has been verified in many studies. The basic principle is to improve the quantity or quality of the training dataset by creating copies or new synthetic samples of existing data (Gay et al. 2023 ). Depending on how the auxiliary data are generated, transform-based, sampling-based, and deep generative models-based DA methods are analyzed.

3.1.1 Transform-based DA

Transform-based methods are one of the earliest classes of DA; they increase the size of small datasets by applying geometric transformations to existing samples without changing labels. These transformations are diverse and flexible, including random cropping, vertical and horizontal flipping, and noise injection. However, most of them were initially designed for two-dimensional (2-D) images and cannot be directly applied to one-dimensional (1-D) signals of equipment (Iglesias et al. 2023).

Considering the sequential nature of monitoring data, scholars have devised transformation methods applicable to increase the size of 1-D data (Meng et al. 2019 ; Li et al. 2020a ; Zhao et al. 2020a ; Fu et al. 2020 ; Sadoughi et al. 2019 ; Gay et al. 2022 ). For example, Meng et al . ( 2019 ) proposed a DA approach for FD of rotating machinery, which equally divided the original sample and then randomly reorganized the two segments to form a new fault sample. In Li et al. ( 2020a ) and Zhao et al. ( 2020a ), various transformation techniques, such as Gaussian noise, random scaling, time stretching, and signal translation, are simultaneously applied, as illustrated in Fig.  6 . It is worth noting that all the aforementioned techniques are global transformations that are imposed on the entire signal, potentially overlooking the local fault properties. Consequently, some studies have combined local and global transforms (Zhang et al. 2020a ; Yu et al. 2020 , 2021a ) to change both segments and the entirety of the original signal to obtain more authentic samples. For instance, Yu et al . ( 2020 ) simultaneously used strategies of local and global signal amplification, noise addition, and data exchange to improve the diversity of fault samples.

Fig. 6 Illustration of the transformations applied in Li et al. (2020a) and Zhao et al. (2020a): Gaussian noise was randomly added to the raw samples, random scaling was achieved by multiplying the raw signal with a random factor, time stretching was implemented by horizontally stretching the signals along the time axis, and signal translation was done by shifting the signal forwards or backwards
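The following minimal NumPy sketch illustrates the four global transformations described above for a 1-D signal segment. The parameter ranges (noise level, scaling factor, stretch ratio, shift length) are illustrative choices rather than the settings used in the cited papers.

```python
import numpy as np

def augment_1d(signal, rng=None):
    """Global transforms for a 1-D vibration segment (illustrative parameter ranges)."""
    rng = rng or np.random.default_rng()
    n = signal.size
    noisy      = signal + rng.normal(0.0, 0.01 * signal.std(), n)      # Gaussian noise injection
    scaled     = signal * rng.uniform(0.8, 1.2)                        # random amplitude scaling
    stretched  = np.interp(np.arange(n) * rng.uniform(0.9, 1.1),       # time stretching via resampling
                           np.arange(n), signal)                       # (edge values are held constant)
    translated = np.roll(signal, rng.integers(-n // 10, n // 10))      # signal translation (circular shift)
    return {"noise": noisy, "scale": scaled, "stretch": stretched, "shift": translated}

augmented = augment_1d(np.sin(np.linspace(0, 20 * np.pi, 2048)))       # toy "vibration" signal
```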

3.1.2 Sampling-based DA

Sampling-based DA methods are usually applied to solve data imbalance problems under small data conditions. Among them, under-sampling techniques solve data imbalance by reducing the sample size of the majority class, while over-sampling methods achieve DA by expanding samples of the minority class. Over-sampling can be further classified into random over-sampling and the synthetic minority over-sampling technique (SMOTE) (Chawla et al. 2002) depending on whether or not new synthetic samples are created. As shown in Fig. 7, random over-sampling copies the data of a minority class n times to increase the data size, whereas SMOTE creates synthetic samples by calculating the k nearest neighbors of the samples from minority classes, thus enhancing both the quantity and the diversity of samples.

Fig. 7 Comparison between random over-sampling and SMOTE

To address data imbalance arising from abundant healthy samples and fewer faulty samples in monitoring data, some studies (Yang et al. 2020a ; Hu et al. 2020 ) have introduced enhanced random over-sampling methods for augmentation of small data. For example, Yang et al. ( 2020a ) enhanced random over-sampling method by introducing a variable-scale sampling strategy for unbalanced and incomplete data in the FD task, and Hu et al. ( 2020 ) used resampling method to simulate data under different working conditions to decrease domain bias. In comparison, the SMOTE technique has gained widespread utilization in PHM tasks due to its inherent advantages (Hao and Liu 2020 ; Mahmoodian et al. 2021 ). Hao and Liu ( 2020 ) combined SMOTE with Euclidean distance to achieve better over-sampling of minority class samples. To address the difficulties of selecting appropriate nearest neighbors for synthetic samples, Zhu et al. ( 2022 ) calculated the Euclidean and Mahalanobis distances of the nearest neighbors, and Wang et al. ( 2023 ) used the characteristics of neighborhood distribution to equilibrate samples. Moreover, studies of (Liu and Zhu 2020 ; Fan et al. 2020 ; Dou et al. 2023 ) further improved the adaptability of SMOTE by employing weighted distributions to shift the importance of classification boundaries more toward the challenging minority classes, demonstrated effectiveness in resolving data imbalance issues.
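To make the interpolation idea concrete, the sketch below implements a bare-bones SMOTE-style over-sampler in NumPy: each synthetic sample lies on the line segment between a minority sample and one of its k nearest minority neighbors. It omits the refinements discussed above (distance weighting, boundary emphasis), and for practical work a maintained implementation such as imbalanced-learn's SMOTE would normally be preferred.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, rng=None):
    """Minimal SMOTE-style over-sampling of a minority-class feature matrix."""
    rng = rng or np.random.default_rng()
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)  # pairwise distances
    neighbours = np.argsort(d, axis=1)[:, 1:k + 1]                      # k nearest neighbours (skip self)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))                                    # pick a minority sample
        j = rng.choice(neighbours[i])                                   # and one of its neighbours
        lam = rng.random()                                              # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.stack(synthetic)

X_minority = np.random.default_rng(0).normal(size=(20, 8))              # 20 faulty samples, 8 features
X_aug = smote_sketch(X_minority, n_new=100)                             # 100 synthetic minority samples
```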

3.1.3 Deep generative models-based DA

In addition, deep generative models have emerged as highly promising solutions to small data since 2017, autoencoder (AE) and generative adversarial network (GAN) are two prominent representatives (Moreno-Barea et al. 2020 ). AE is a special type of neural network characterized by encoding its input to the output in an unsupervised manner (Hinton and Zemel 1994 ), where the optimization goal is to learn an effective representation of the input data. The fundamental architecture of an AE, illustrated in Fig.  8 a, comprises two symmetric parts encompassing a total of five shallow layers. The first half, known as the encoder, transforms input data into a latent space, while the second half, or the decoder, deciphers this latent representation to reconstruct the data. Likewise, a GAN is composed of two fundamental components, as shown in Fig.  8 b. The first is the generator, responsible for creating fake samples based on input random noise, and the second is the discriminator for identifying the authenticity of the generated samples. These two components engage in an adversarial training process, progressively moving towards a state of Nash equilibrium.

Fig. 8 Basic architectures of (a) AE and (b) GAN

The unique advantages of GAN in generating diverse samples makes it superior to traditional over-sampling DA methods, especially in tackling data imbalance problems for PHM tasks (Behera et al. 2023 ). Various innovative models have emerged, including variational auto-encoder (VAE) (Qi et al. 2023 ), deep convolutional GAN (DCGAN) (Zheng et al. 2019 ), Wasserstein GAN (Yu et al. 2019 ), etc. These methods can be classified into two groups based on their input types. The first commonly generates data from 1-D inputs like raw signals (Zheng et al. 2019 ; Yu et al. 2019 ; Dixit and Verma 2020 ; Ma et al. 2021 ; Zhao et al. 2021a , 2020b ; Liu et al. 2022 ; Guo et al. 2020 ; Wan et al. 2021 ; Huang et al. 2020 , 2022 ; Zhang et al. 2020b ; Behera and Misra 2021 ; Wenbai et al. 2021 ; Jiang et al. 2023 ) and frequency features (Ding et al. 2019 ; Miao et al. 2021 ; Mao et al. 2019 ), which can capture the inherent temporal information in signals without complex pre-processing. For instance, Dixit and Verma ( 2020 ) proposed an improved conditional VAE to generate synthetic samples using raw vibration signals, yielding remarkable FD performance despite limited data availability. The work in Mao et al. ( 2019 ) applied the Fast Fourier Transform (FFT) to convert original signals into the frequency domain as inputs for GAN and obtained higher-quality generated samples. On the other hand, some studies (Du et al. 2019 ; Yan et al. 2022 ; Liang et al. 2020 ; Zhao and Yuan 2021 ; Zhao et al. 2022 ; Sun et al. 2021 ; Zhang et al. 2022b ; Bai et al. 2023 ) took the strengths of AEs and GANs in the image domain, aimed to generate corresponding images by utilizing 2-D time–frequency representations. For instance, Bai et al. ( 2023 ) employed an intertemporal return plot to transform time-series data to 2-D images as inputs for Wasserstein GAN, and this method reduced data imbalance and improved diagnostic accuracy of bearing faults.
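As a rough sketch of the adversarial training loop described above, the PyTorch snippet below sets up a tiny fully connected generator and discriminator for 1-D segments. The network sizes, learning rates, and random stand-in data are placeholders and do not reflect the architectures used in the cited studies (DCGAN, Wasserstein GAN, or VAE variants).

```python
import torch
import torch.nn as nn

SIG_LEN, Z_DIM = 1024, 64
G = nn.Sequential(nn.Linear(Z_DIM, 256), nn.ReLU(),
                  nn.Linear(256, SIG_LEN), nn.Tanh())                 # generator: noise -> fake segment
D = nn.Sequential(nn.Linear(SIG_LEN, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())                    # discriminator: segment -> real prob.

opt_g, opt_d = (torch.optim.Adam(m.parameters(), lr=2e-4) for m in (G, D))
bce = nn.BCELoss()
real = torch.randn(32, SIG_LEN)                                       # stand-in for a real minority-fault batch

for step in range(5):                                                 # a few illustrative training steps
    fake = G(torch.randn(32, Z_DIM))
    # discriminator step: real -> 1, fake -> 0
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # generator step: try to fool the discriminator
    loss_g = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```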

3.1.4 Epilog

Table 4 summarizes the diverse DA-based solutions to addressing small data problems in PHM, covering specific issues tackled by each technique, as well as the merits and drawbacks of each technique. It is evident that DA approaches focus on mitigating small data challenges at the data-level, including problems characterized by insufficient labelled training data, class imbalance, incomplete data, and samples contaminated with noise. To tackle these, transform-based methods primarily increase the size of the training dataset by imposing transformations onto signals, but the effectiveness depends on the quality of raw signals. As for sampling-based approaches, they excel at dealing with unbalanced problems in the PHM tasks, and SMOTE methods demonstrate proficiency in both augmenting minority class samples and diversifying their composition, but refining nearest neighbor selection and bolstering adaptability to high levels of class imbalance remain open research areas. While deep generative models-based DA provides a flexible and promising tool capable of generating samples for various equipment under different working conditions, but more in-depth research is needed on integrating the characteristics of PHM tasks, quality assessment of the generated data, and efficient training of generative models.

3.2 Transfer learning methods

Traditional DL models assumes that training and test data originate from an identical domain, however, changes in operating conditions inevitably cause divergences in data distributions. TL is a new technique that eliminates the requirement for same data distribution by transferring and reusing data or knowledge from related domains, ultimately solving small data problems in the target domain. TL is defined in terms of domains and tasks, each domain \(D\) consists of a feature space and a corresponding marginal distribution, and the task \(T\) associated with each domain contains a label space and a learning function (Yao et al. 2023 ). Within the PHM context, TL can be concisely defined as: Given a source domain \(D_{S}\) and a task \(T_{S}\) , and a target domain \(D_{T}\) with a task \(T_{T}\) . The goal of TL is to exploit the knowledge of certain equipment that is learned from \(D_{S}\) and \(T_{S}\) to enhance the learning process for \(T_{T}\) within \(D_{T}\) under the setting of \(D_{S} \ne D_{T}\) or \(T_{S} \ne T_{T}\) , and the data volume of the source domain is considered much larger than that of the target domain. There is a range of categorization criteria for TL methods in the existing literature. From the perspective of “what to transfer” during the implementation phase, TL can be categorized into three types: instance-based TL, feature-based TL, and parameter-based TL. Among these categories, the former two are affiliated with solutions operating at the data level, while the latter belong to the realm of model-level approaches. These classifications are visually represented in Fig.  9 .

Fig. 9 Descriptions of the main categories of TL

3.2.1 Instance-based TL

The premise of applying TL is that the source domain contains sufficient labeled data, whereas the target domain either lacks sufficient labeled data or predominantly consists of unlabeled data. A straightforward way is to train a model for the target domain directly on source-domain samples, but this proves impractical due to the inherent distribution disparities between the two domains. Therefore, finding and applying labeled instances in the source domain that have a data distribution similar to the target domain is the key. For this purpose, various methods have been proposed to minimize the distribution divergence, and weighting strategies are the most widely used.

Dynamic weight adjustment (DWA) is a popular strategy, and its novelty lies in reweighting the source and target domain samples based on their contributions to the learning of the target model. Take the well-known TrAdaBoost algorithm (Dai et al. 2007) as an example: it increases the weights of samples that are similar to the target domain and reduces the weights of irrelevant source instances. The effectiveness of TrAdaBoost has been validated in FD for wind turbines (Chen et al. 2021), bearings (Miao et al. 2020), and induction motors (Xiao et al. 2019). Evolving from this foundational research, scholars have also introduced multi-objective optimization (Lee et al. 2021) and DL theories (Jamil et al. 2022; Zhang et al. 2020c) into TrAdaBoost to improve model training efficiency. However, DWA requires labeled target samples; otherwise, weight adjustment methods based on kernel mapping techniques are needed to estimate the key weight parameters, such as matching the mean of source and target domain samples in the reproducing kernel Hilbert space (RKHS) (Tang et al. 2023a). For example, Chen et al. (2020) designed a white cosine similarity criterion based on kernel principal component analysis to determine the weight parameters for data in the source and target domain, boosting the diagnostic performance for gears under limited data and varying working conditions. More research can be found in Liu and Ren (2020), Xing et al. (2021), Ruan et al. (2022).
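The reweighting idea can be sketched as follows. This is a simplified, single-round illustration loosely following TrAdaBoost (Dai et al. 2007); the array names, error rates, and the omission of joint weight normalization are choices made for the example.

```python
import numpy as np

def reweight_round(w_src, w_tgt, miss_src, miss_tgt, n_rounds):
    """One simplified TrAdaBoost-style update; miss_* are boolean arrays marking
    the samples the current weak learner misclassified."""
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(len(w_src)) / n_rounds))
    eps = np.clip(np.sum(w_tgt * miss_tgt) / np.sum(w_tgt), 1e-6, 0.499)   # weighted target error
    beta_tgt = eps / (1.0 - eps)
    w_src = w_src * beta_src ** miss_src                    # shrink weights of misleading source samples
    w_tgt = w_tgt * beta_tgt ** (-miss_tgt.astype(float))   # grow weights of hard target samples
    return w_src, w_tgt                                     # joint normalization would follow in full TrAdaBoost

rng = np.random.default_rng(0)
w_s, w_t = np.ones(200) / 400, np.ones(20) / 40             # many source samples, few target samples
w_s, w_t = reweight_round(w_s, w_t, rng.random(200) < 0.3, rng.random(20) < 0.2, n_rounds=10)
```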

3.2.2 Feature-based TL

Unlike instance-based TL that finds similarities between different domains in the space of raw samples, feature-based methods perform knowledge transfer within a shared feature space between source and target domains. As demonstrated in Fig.  10 , feature-based TL is widely applied in domain-adaption and domain-generalization scenarios, where the former focuses on how to migrate knowledge from the source domain to the target domain, and domain generalization aims to develop a model that is robust across multiple source domains so that it can be generalized to any new domain. The key to feature-based TL is to reduce the disparities between the marginal and conditional distributions of different domains by some operations, such as discrepancy-based methods and feature reduction methods, which eventually enable the model to achieve excellent adaptation and generalization on target tasks (Qin et al. 2023 ).

Fig. 10 Application scenarios of feature-based TL

The main challenge for discrepancy-based methods is to accurately quantify the distributional similarity between domains, which relies on specific distance metrics. Table 5 lists the popular metrics (Borgwardt et al. 2006 ; Kullback and Leibler 1951 ; Gretton et al. 2012 ; Sun and Saenko 2016 ; Arjovsky et al. 2017 ) and the algorithms applied to PHM tasks (Yang et al. 2018 , 2019a ; Cheng et al. 2020 ; Zhao et al. 2020c ; Xia et al. 2021 ; Zhu et al. 2023a ; Li et al. 2020b , c , 2021a ; He et al. 2021 ). Maximum Mean Discrepancy (MMD) is based on the distance between instance means in the RKHS, and Wasserstein distance assesses the likeness of probability distributions by considering geometric properties, both of which are widely used. For example, Yang et al . ( 2018 ) devised a convolutional adaptation network with multicore MMD to minimize the distribution discrepancy between the feature distributions derived from both laboratory and real machines failure data. And the integration of Wasserstein distance in Cheng et al. ( 2020 ) greatly enhanced the domain adaptation capability of the proposed model. Moreover, Fan et al. ( 2023a ) proposed a domain-based discrepancy metric for domain generalization fault diagnosis under unseen conditions, which helps model balance the intra- and interdomain distances for multiple source domains. On the other hand, feature reduction approaches aim to automatically capture general representations across different domains, mainly using unsupervised methods such as clustering (Michau and Fink 2021 ; He et al. 2020a ; Mao et al. 2021 ) and AE models (Tian et al. 2020 ; Lu and Yin 2021 ; Hu et al. 2021a ; Mao et al. 2020 ). For instance, Mao et al. ( 2021 ) integrated time series clustering into TL, and used the meta-degradation information obtained from each cluster for temporal domain adaptation in bearing RUL prediction. To improve model performance for imbalanced and transferable FD, Lu and Yin ( 2021 ) designed a weakly supervised convolutional AE (CAE) model to learn representations from multi-domain data. Liao et al. ( 2020 ) presented a deep semi-supervised domain generalization network, which showed excellent generalization performance in performing rotary machinery fault diagnosis under unseen speed.
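As an illustration of a discrepancy-based penalty, the snippet below computes a single-kernel (RBF) estimate of squared MMD between source and target feature batches in PyTorch; multi-kernel variants used in the cited works sum this quantity over several bandwidths. The batch sizes, feature dimension, and bandwidth are arbitrary placeholders.

```python
import torch

def mmd_rbf(x, y, sigma=1.0):
    """Biased estimate of squared MMD with a single RBF kernel (Gretton et al. 2012)."""
    k = lambda a, b: torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

source_feat = torch.randn(64, 128)                 # features of a labelled source-domain batch
target_feat = torch.randn(64, 128) + 0.5           # features of an unlabelled, shifted target batch
alignment_penalty = mmd_rbf(source_feat, target_feat)   # added to the task loss during training
```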

3.2.3 Parameter-based TL

The third category of TL is parameter-based TL, which supposes that the source and target tasks share certain knowledge at the model level, and the knowledge is encoded in the architecture and parameters of the model pre-trained on the source domain. It is motivated by the fact that retraining a model from scratch requires substantial data and time, while it is more efficient to directly transfer pre-trained parameters and fine-tune them in the target domain. In this way, there are basically two main implementations depending on the utilization of the transferred parameters in target model training: full fine-tuning (or freezing) and partial fine-tuning (or freezing), as shown in Fig.  11 .

Fig. 11 Parameter-based TL: (a) full fine-tuning (or freezing), (b) partial fine-tuning (or freezing)

Full fine-tuning (or freezing) means that all parameters transferred from the source domain are fine-tuned with limited labelled data from the target domain, or those parameters are frozen without updating during the training of the target model. Conversely, partial fine-tuning (or freezing) is the selective fine-tuning of only specific upper layers or parameters, keeping the lower layer parameters consistent with the pre-trained model. In both cases, the classifier or predictor of the target model needs to be retrained with randomly initialized parameters to align with the number of classes or data distribution of the target task. The full fine-tuning (or freezing) approach is particularly applicable when the source and target domain samples exhibit a high degree of similarity, so that general features can be extracted from the target domain by using the pre-training parameters (Cho et al. 2020 ; He et al. 2019 , 2020b ; Zhiyi et al. 2020 ; Wu and Zhao 2020 ; Peng et al. 2021 ; Zhang et al. 2018 ; Che et al. 2020 ; Cao et al. 2018 ; Wen et al. 2020 , 2019 ). From the perspective of the size of the pre-trained model and the fine-tuning time, the full fine-tuning and full freezing strategies are suitable for small and large models, respectively. For example, He et al . (Zhiyi et al. 2020 ) proposed to achieve knowledge transfer between bearings mounted on different machines by fully fine-tuning the pre-trained parameters with few target training samples. In Wen et al. ( 2020 , 2019 ), researchers applied deep convolutional neural networks (CNN)—ResNet-50 (a 50-layer CNN) and VGG-19 (a 19-layer CNN) that was pre-trained on ImageNet as feature extractors, and train target FD models using full freezing methods. In contrast, partial fine-tuning (or freezing) strategies are more suitable for handling cases with significant domain differences (Wu et al. 2020 ; Zhang et al. 2020d ; Yang et al. 2021 ; Brusa et al. 2021 ; Li et al. 2021b ), such as transfer between complex working conditions (Wu et al. 2020 ) and multimodal data sources (Brusa et al. 2021 ). In addition, Kim and Youn ( 2019 ) introduced an innovative approach known as selective parameter freezing (SPF), where only a portion of parameters within each layer is frozen, which enables explicit selection of output-sensitive parameters from the source model, reducing the risk of overfitting the target model under limited data conditions.
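A minimal PyTorch sketch of partial freezing and fine-tuning is shown below, using an ImageNet-pretrained ResNet-50 (as in the examples above, torchvision ≥ 0.13 API) with time-frequency images as the assumed inputs; the choice of which block to unfreeze, the number of fault classes, and the learning rate are placeholders.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50(weights="IMAGENET1K_V1")        # pre-trained source model (downloads weights)
for p in model.parameters():
    p.requires_grad = False                      # freeze all transferred parameters first
for p in model.layer4.parameters():
    p.requires_grad = True                       # partial fine-tuning: unfreeze only the top block
model.fc = nn.Linear(model.fc.in_features, 10)   # new head for, e.g., 10 fault classes (trainable)

optimizer = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=1e-4)
```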

3.2.4 Epilog

The TL framework breaks the assumption of homogeneous distribution of training and test data in traditional DL and compensates for the lack of labeled data in the target domain by acquiring and transferring knowledge from a large amount of easily collected data. As summarized in Table  6 , instance-based TL can be regarded as a borrowed augmentation, wherein other datasets with similar distributions are utilized to enrich the samples in the target domain. Among the techniques, DWA strategies demonstrate superiority in solving insufficient labeled target data and imbalanced data, whereas their drawbacks of high computational cost and high dependence on similar distributions need further optimization. As a comparison, feature-based TL performs knowledge transfer by learning general fault representations and has the ability to handle domain-adaption and domain-generalization tasks with large distribution differences, such as transfers between distinct working conditions (He et al. 2020a ), transfers between diverse components (Yang et al. 2019a ), or even transfers from simulated to physical processes (Li et al. 2020b ). And weakly supervised-based feature reduction techniques are capable of adaptively discovering better feature representations, and showing great potential in open domain generalization problems. Finally, parameter-based TL saves the target model from being retrained from scratch, but the effectiveness of these parameters hinges on the size and quality of the source samples, and model pre-training on multi-source domain data can be considered (Li et al. 2023b ; Tang et al. 2021 ).

3.3 Few-shot learning methods

DA and TL methods both require that training dataset has a certain number (ranging from dozens to hundreds) of labeled samples. However, in some industrial cases, samples of specific classes (such as incipient failures or compound faults) may be exceptionally rare and inaccessible, with only a handful of samples (e.g., 5–10) per category for DL model training, resulting in poor model performance on such “few-shot” problems (Song et al. 2022 ). Inspired by the human ability to learn and reuse prior knowledge from previous tasks, which Juirgen Schmidhuber initially named meta-learning (Schmidhuber 1987 ), FSL methods have been proposed to learn a model that can be trained and quickly adapted to tasks with only a few examples. As shown in Fig.  12 , there are some differences between traditional DL, TL, and FSL methods: (1) traditional DL and TL are trained and tested on data points from a single task, while FSL methods are often learning at the task level; (2) the learning of traditional DL requires large amounts of labeled training and test data, TL requires large amounts of labeled training data in source domain, while FSL methods perform meta-training and meta-test with limited data. The organization of FSL tasks follows the “ N -way K -shot Q -query” protocol (Thrun and Pratt 2012 ), where N categories are randomly selected, and K support samples and Q query samples are randomly drawn from each category for each task. The objective of FSL is to combine previously acquired knowledge from multiple tasks during meta-training with a few support samples to predict the class of query samples during meta-test. Based on the way prior knowledge is learned, metric-, optimization-, and attribute-based FSL methods are primarily discussed.

Fig. 12 Comparison between (a) traditional DL, (b) TL, and (c) FSL methods
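The episode construction underlying the "N-way K-shot Q-query" protocol described above can be sketched as follows; the pool size, number of classes, and signal length are invented for illustration.

```python
import numpy as np

def sample_episode(X, y, n_way=5, k_shot=5, q_query=15, rng=None):
    """Build one N-way K-shot Q-query task from a labelled pool of samples."""
    rng = rng or np.random.default_rng()
    classes = rng.choice(np.unique(y), size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        idx = rng.permutation(np.where(y == c)[0])[: k_shot + q_query]
        support.append(idx[:k_shot])
        query.append(idx[k_shot:])
    return classes, np.concatenate(support), np.concatenate(query)

X = np.random.randn(1000, 2048)                       # 1000 signal segments
y = np.random.default_rng(0).integers(0, 10, 1000)    # 10 fault classes
classes, sup_idx, qry_idx = sample_episode(X, y)
```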

3.3.1 Metric-based FSL

Metric-based FSL learns prior knowledge by measuring sample similarities and consists of two components: a feature embedding module responsible for mapping samples to feature vectors, and a metric module to compute similarity (Li et al. 2021). Siamese Neural Networks are among the pioneering approaches; initially proposed by Koch et al. (2015) for one-shot image recognition, they used two parallel CNNs and the L1 distance to determine whether paired inputs are identical. Subsequently, Vinyals et al. (2016) introduced long short-term memory (LSTM) with attention mechanisms for effective assessment of multi-class similarity, Snell et al. (2017) developed Prototypical Networks to calculate the distance between prototype representations, and Relation Networks (Sung et al. 2018) utilized adaptive neural networks instead of traditional distance functions. Table 7 lists the differences between these representative approaches in terms of embedding modules and metric functions.

According to current studies, two forms of metric-based FSL methods are applied in PHM tasks. The first utilizes fixed metrics (e.g., cosine distance) for measuring similarity, while the second leverages learnable metrics, such as the neural network of Relation Networks. For example, Zhang et al . ( 2019 ) firstly introduced a wide-kernel deep CNN-based Siamese Networks for the FD of rolling bearings, which achieved excellent performance with limited data under different working conditions. Then, various FSL algorithms based on Siamese networks (Li et al. 2022c ; Zhao et al. 2023 ; Wang and Xu 2021 ), matching networks (Xu et al. 2020 ; Wu et al. 2023 ; Zhang et al. 2020e ) and prototypical networks (Lao et al. 2023 ; Jiang et al. 2022 ; Long et al. 2023 ; Zhang et al. 2022c ) have been developed for PHM tasks. Zhang et al. ( 2020e ) designed an iterative matching network combined with a selective signal reuse strategy for the few-shot FD of wind turbines. Jiang et al. ( 2022 ) developed a two-branch prototype network (TBPN) model, which integrated both time and frequency domain signals to enhance fault classification accuracy. While Relation Networks have shown superiority over fixed metric-based FSL methods when measuring samples from different domains, and they are therefore widely applied for cross-domain few-shot tasks (Lu et al. 2021 ; Wang et al. 2020b ; Luo et al. 2022 ; Yang et al. 2023a ; Tang et al. 2023b ). To illustrate, Lu et al . ( 2021 ) considered the FD of rotating machinery with limited data as a similarity metric learning problem, and they introduced Relation Networks into the TL framework as a solution. Luo et al. ( 2022 ) proposed a Triplet Relation Network method for performing cross-component few-shot FD tasks, and Tang et al. ( 2023b ) designed a novel lightweight relation network for performing cross-domain few-shot FD tasks with high efficiency. Furthermore, to address domain shift issues resulting from diverse working conditions, Feng et al. ( 2021 ) integrated similarity-based meta-learning network with domain-adversarial for cross-domain fault identification.
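To make the metric-based idea concrete, the sketch below scores query embeddings against class prototypes in the style of Prototypical Networks (Snell et al. 2017), using negative squared Euclidean distance as the similarity; the embedding dimension and episode sizes are illustrative, and the feature extractor that produces the embeddings is omitted.

```python
import torch

def proto_logits(support_emb, support_lbl, query_emb, n_way):
    """Class prototypes are support-set means in embedding space; queries are
    scored by negative squared Euclidean distance to each prototype."""
    prototypes = torch.stack([support_emb[support_lbl == c].mean(0) for c in range(n_way)])
    return -torch.cdist(query_emb, prototypes) ** 2       # higher = closer = more likely

# Toy 5-way 5-shot episode with 64-d embeddings from an arbitrary feature extractor
emb_s = torch.randn(25, 64)
lbl_s = torch.arange(5).repeat_interleave(5)
emb_q = torch.randn(15, 64)
pred = proto_logits(emb_s, lbl_s, emb_q, n_way=5).argmax(dim=1)
```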

3.3.2 Optimization-based FSL

Optimization-based FSL methods adhere to the “learning to optimize” principle to solve overfitting problems arising from small samples. Specifically, these techniques learn good global initialization parameters across various tasks, allowing the model to quickly adapt to new few-shot tasks during the meta-test (Parnami and Lee 2022 ). Taking the best-known model agnostic meta-learning (MAML) (Finn et al. 2017 ) algorithm as an example, optimization-based FSL typically follows a two-loop learning process, first learning a task-specific model (base learner) for a given task in the inner loop, and then learning a meta-learner over a distribution of tasks in the outer loop, where meta-knowledge is embedded in the model parameters and then used as initialization parameters of the model for meta-test tasks. MAML is compatible with diverse models that are trained using gradient descent, allowing models to generalize well to new few-shot tasks without overfitting.

Recent literature highlights the potential of MAML in PHM, mainly focuses on meta-classification and meta-regression. For meta-classification methods, the aim is to learn an optimized classification model based on multiple meta-training tasks that can accurately classify novel classes in meta-test with a few samples as support, typically used for AD (Chen et al. 2022 ) and FD tasks (Li et al. 2021c , 2023c ; Hu et al. 2021b ; Lin et al. 2023 ; Yu et al. 2021b ; Chen et al. 2023b ; Zhang et al. 2021 ; Ren et al. 2024 ). For example, Li et al . ( 2021c ) proposed a MAML-based meta-learning FD technique for bearings under new conditions by exploiting the prior knowledge of known working conditions. To further improve meta-learning capabilities, advanced models such as task-sequencing MAML (Hu et al. 2021b ) and meta-transfer MAML (Li et al. 2023c ) have been designed for few-shot FD tasks, and a meta-learning based domain generalization framework was proposed to alleviate both low-resource and domain shift problems (Ren et al. 2024 ). On the other hand, meta-regression methods target prediction tasks in PHM, with the goal of predicting continuous variables using limited input samples based on meta-optimized models derived from analogous regression tasks (Li et al. 2019 , 2022d ; Ding et al. 2021 , 2022a ; Mo et al. 2022 ; Ding and Jia 2021 ). Li et al . ( 2019 ) first explored the application of MAML to RUL prediction with small size data in 2019, a fully connected neural network (FCNN)-based meta-regression model was designed for predicting tool wear under varying cutting conditions. In addition, MAML has also been integrated into reinforcement learning for fault control under degraded conditions, and more insights can be found in Dai et al. ( 2022 ), Yu et al. ( 2023 ).
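The two-loop optimization described above can be sketched compactly. The snippet below applies one MAML-style inner adaptation step and a second-order outer update to a toy two-layer regressor with random support/query data; all sizes, learning rates, and the regression task itself are placeholders rather than settings from the cited works.

```python
import torch
import torch.nn.functional as F

w1 = (0.1 * torch.randn(32, 10)).requires_grad_()
b1 = torch.zeros(32, requires_grad=True)
w2 = (0.1 * torch.randn(1, 32)).requires_grad_()
b2 = torch.zeros(1, requires_grad=True)
params = [w1, b1, w2, b2]
meta_opt = torch.optim.Adam(params, lr=1e-3)
inner_lr = 0.01

def forward(x, p):                                         # functional forward pass
    return F.linear(F.relu(F.linear(x, p[0], p[1])), p[2], p[3])

for meta_step in range(3):                                 # outer (meta) loop
    meta_opt.zero_grad()
    meta_loss = 0.0
    for _ in range(4):                                     # a batch of few-shot tasks
        xs, ys = torch.randn(5, 10), torch.randn(5, 1)     # support set (K = 5)
        xq, yq = torch.randn(15, 10), torch.randn(15, 1)   # query set
        grads = torch.autograd.grad(F.mse_loss(forward(xs, params), ys),
                                    params, create_graph=True)   # inner adaptation step
        fast = [p - inner_lr * g for p, g in zip(params, grads)]
        meta_loss = meta_loss + F.mse_loss(forward(xq, fast), yq)
    meta_loss.backward()                                   # backpropagate through the inner step
    meta_opt.step()
```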

3.3.3 Attribute-based FSL

There is also a unique paradigm of FSL known as "zero-shot learning" (Yang et al. 2022), where models predict classes for which no samples were seen during meta-training. In this setup, auxiliary information is necessary to bridge the information gap of unseen classes caused by the absence of training data. The supplementary information must be valid, unique, and representative so that it can effectively differentiate the various classes, such as attribute information for images in computer vision. As shown in Fig. 13, the classes of unseen animals are inferred by transferring between-class attributes, such as semantic descriptions of the animals' shape, voice, or habitat, whose effectiveness has been validated in many zero-shot tasks (Zhou et al. 2023b).

Fig. 13 Attribute-based methods in zero-shot image identification

The idea of attribute-based FSL offers potential solutions to the zero-sample problem in PHM tasks. However, visual attributes cannot be used directly because they do not match the physical meaning of sensor signals; for this reason, scholars have worked on effective fault attributes. Given that fault-related semantic descriptions can easily be obtained from maintenance records and can be defined for specific faults in practice, semantic attributes are widely used in current research (Zhuo and Ge 2021; Feng and Zhao 2020; Xu et al. 2022; Chen et al. 2023c; Xing et al. 2022). For example, Feng and Zhao (2020) pioneered the implementation of zero-shot FD based on the transfer of fault description attributes, including failure position, fault causes, and consequences, which provide auxiliary knowledge for the target faults. Xu et al. (2022) devised a zero-shot learning framework for compound FD whose semantic descriptor can define distinct fault semantics for singular and compound faults. Fan et al. (2023b) proposed an attribute fusion transfer method for zero-shot FD with new fault modes. Despite the strides made with description-driven semantic attributes, certain limitations exist, including reliance on expert insights and inaccurate information sources. More recently, attributes without semantic information (termed non-semantic attributes) have also been explored in Lu et al. (2022) and Lv et al. (2020). Lu et al. (2022) developed a zero-shot intelligent FD system by employing statistical attributes extracted from the time and frequency domains of signals.
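The attribute-transfer idea can be sketched as follows: a model trained on seen faults predicts an attribute vector for a new signal, and the unseen fault class is inferred by matching that prediction against class-level attribute descriptions. The attribute table and the stand-in predictor below are hypothetical and only illustrate the matching step.

```python
import numpy as np

# Hypothetical class-level attribute vectors, e.g. [inner race, outer race, ball, high-freq].
class_attributes = {
    "inner_race_fault": np.array([1, 0, 0, 1]),
    "outer_race_fault": np.array([0, 1, 0, 1]),
    "ball_fault":       np.array([0, 0, 1, 0]),   # class unseen during training
}

def predict_attributes(signal):
    """Stand-in for a trained attribute regressor (e.g., a CNN mapping signals to attributes)."""
    return np.array([0.1, 0.2, 0.9, 0.2])          # dummy prediction for illustration

def zero_shot_classify(signal):
    pred = predict_attributes(signal)
    # Pick the class whose semantic attribute vector is closest to the prediction.
    return min(class_attributes, key=lambda c: np.linalg.norm(pred - class_attributes[c]))

print(zero_shot_classify(signal=None))              # -> "ball_fault"
```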

3.3.4 Epilog

FSL methods are advantageous for solving small data problems with extremely limited samples, such as when only five, one, or even zero samples per class are available in each task. As listed in Table 8, metric-based FSL methods are concise in their principles and computation and shift the focus from sample quantity to intrinsic similarity, but their reliance on labeled data for training the feature embeddings largely confines them to supervised settings. Optimization-based FSL methods, particularly those underpinned by MAML, boast broader applications, including fault classification, RUL prediction, and fault control, but they need substantial computational resources for the gradient optimization of deep networks, and balancing the optimization parameters against model training speed is the key (Hu et al. 2023). Attribute-based FSL is an emerging but promising research topic with huge potential to reduce the cost of data collection in industry, and zero-shot learning enables models to generalize to new failure modes or conditions without retraining, achieving intelligent prognostics for complex systems even with "zero" abnormal or fault samples. In industry, few-shot problems are often accompanied by domain shift caused by varying speed and load conditions, which is a more difficult setting: it challenges traditional FSL methods to learn fault features representative enough to adapt and generalize to unseen data distributions, and research in this area has only recently begun (Liu et al. 2023).

4 Discussion of problems in PHM applications

Different PHM tasks have distinct goals and characteristics, thus producing various forms of small data problems that require corresponding solutions. Therefore, based on the methods discussed in Sect. 3, this section further explores the specific issues and remaining challenges from the perspective of PHM applications. The distribution of specific issues and corresponding methods for each task is shown in Fig. 14.

Fig. 14 Pie chart of specific problems for each PHM task; the numbers represent articles published on the corresponding topic

4.1 Small data problems in AD tasks

4.1.1 Main problems and corresponding solutions

In industrial applications, the amount of abnormal data is much smaller than that of normal data, which seriously hinders the development of accurate anomaly detectors. According to the statistics in Fig. 14, current research on small data in AD tasks focuses on three core issues: class imbalance, incomplete data, and poor model generalization, each of which affects AD tasks differently. Specifically, class imbalance may bias the model towards the normal class, reducing its sensitivity to rare anomalies; incomplete data can make it difficult for the model to distinguish between normal variations and true anomalies when key features are missing; and poor model generalization may lead to false positives or false negatives, reducing the overall reliability of the anomaly detection system.

To address the class imbalance problem, existing studies demonstrate that directly increasing the number of minority-class samples through DA techniques yields positive results. In our survey of the literature, two papers (Fan et al. 2020; Rajagopalan et al. 2023) introduced optimized SMOTE algorithms, and one study applied GAN-based DA methods to AD tasks for wind turbines. These methods generated additional anomalous samples and enhanced model accuracy while minimizing false positive rates. Another prevalent challenge in AD tasks is incomplete data, stemming from faulty sensors, inaccurate measurements, or differing sampling rates. Deep generative models, with their superior learning capabilities, have been widely used to improve the information density of incomplete data (Guo et al. 2020; Yan et al. 2022). To address inadequate model generalization when confronted with limited labeled training samples, Michau and Fink (2021) proposed an unsupervised TL framework. Notably, the majority of AD methods advanced in current research are rooted in unsupervised learning models, such as AE, with wide applications covering electric motors (Rajagopalan et al. 2023), process equipment (Guo et al. 2020), and wind turbines (Liu et al. 2019).
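As a minimal illustration of the oversampling route (using plain SMOTE rather than the optimized variants of the cited papers, and assuming the imbalanced-learn package), the minority anomalous class can be synthetically expanded before training a detector:

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

# Illustrative rebalancing of an anomaly-detection training set with SMOTE.
# The feature matrix is random stand-in data; in practice X would hold features
# extracted from monitoring signals and y the normal/abnormal labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
y = np.array([0] * 950 + [1] * 50)            # 950 normal vs. 50 anomalous samples

X_res, y_res = SMOTE(k_neighbors=5, random_state=0).fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))       # e.g. {0: 950, 1: 50} -> {0: 950, 1: 950}
```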

4.1.2 Remaining challenges

AD is an integral and fundamental task in equipment health monitoring, where the difficulty lies in dealing with a complex set of data and various anomalies (Pang et al. 2021 ). Though existing research has provided valuable insights into addressing small data challenges, certain unresolved issues warrant further exploration.

4.1.2.1 Adaptability of detection models

The majority of AD algorithms are domain-dependent and designed for specific anomalies and conditions. However, industrial production constitutes a dynamic and nonlinear process, where changes in variables such as environment, speed, or load may lead to data drift and novel anomalies. For small datasets, even minor changes in the underlying patterns can have a pronounced impact on the dataset’s characteristics, thus degrading anomaly detection performance of models. To address these issues, it is imperative to improve the adaptability of detection models by using adaptive optimizers and learners, such as the online adaptive Recurrent Neural Network proposed in Fekri et al. ( 2021 ), which had the capability to learn from newly arriving data and adapt to novel patterns.

4.1.2.2 Real-time anomaly detection

Real-time operation is always a desirable property of detection models: it ensures that anomalies are detected and reported to the operator in a timely manner so that corresponding decisions can be made quickly, which is especially important for complex equipment such as UAVs (Yang et al. 2023b). The deployment of lightweight network architectures and edge computing technologies holds promise for enabling real-time detection capabilities.

4.2 Small data problems in FD tasks

4.2.1 Main problems and corresponding solutions

Accuracy is one of the most important metrics for evaluating a model's performance in classifying different types of faults, but it is strongly influenced by the size of the fault data. As shown in Fig. 14, the small data challenge in FD has received the most extensive research attention compared to AD and RUL prediction tasks. The small data problem also manifests itself in a richer variety of ways, including limited labeled training data, class imbalance, incomplete data, low data quality, and poor generalization. Specifically, limited labeled training data increases the risk of model overfitting and poses a challenge in capturing variations in fault conditions; class imbalance lowers sensitivity to rare and unseen faults; incomplete data leads to incomplete extraction of fault features; low-quality data misleads the diagnostic model into generating false positives or false negatives; and poor generalization capability limits the applicability of the model to different operating conditions and equipment.

To address the scarcity of labeled training data, two practical solutions emerge: using samples within the existing datasets and borrowing from external data sources. The former employs already acquired signals to generate new samples following the same data distribution; following this idea, five and eight surveyed papers utilized transform-based and deep generative model-based DA methods, respectively, with 1-D vibration signals as input. The latter involves three main techniques: reusing samples from other domains through sample-based TL, obtaining transferable features via feature-based TL, and utilizing attribute representations through attribute-based FSL. According to the statistics of the surveyed papers, feature-based approaches were employed 15 times for cross-domain scenarios, and attribute-based methods were chosen 7 times for predicting novel classes with zero training samples. Data imbalance is another common problem in FD, with 16 articles retrieved on this topic, most of which apply deep generative models to address inter-class imbalance. In addition to the issues discussed above, data quality problems such as incomplete data and noisy labels have also gained attention, with two and three papers based on deep generative models presented, respectively.

Second, as for the issues caused by limited data at the model level, such as overfitting, diminished accuracy, and weakened generalization, researchers have also proposed various solutions. These include 12 papers using parameter-based TL methods, 14 papers applying metric-based FSL methods, and 8 papers using MAML-based FSL approaches. Among these, parameter-based TL methods leverage the knowledge embedded in the structure and parameters of models to decrease training time, metric-based FSL alleviates the sample size requirement by learning category similarities, and MAML-based FSL achieves fast adaptation to novel FD tasks by using meta-learned knowledge. These successful applications also demonstrate the potential of integrating the TL and FSL paradigms to improve model accuracy and generalizability.

4.2.2 Remaining challenges

The data-level and model-level approaches proposed above have made significant progress in solving the small data problems in FD tasks. However, there are still some challenges that need to be addressed urgently.

4.2.2.1 Quality of small data

In our survey of 107 studies on FD tasks, most focused on solving sample size problems, while only five papers investigated data quality issues within the small data challenge. It is important for researchers to realize that a voluminous collection of irrelevant samples is far inferior to a small yet high-quality dataset for FD tasks. The poor quality of small data results from both samples and labels, including but not limited to missing data, noise and outliers in signal measurement, and errors during labeling. Consequently, there is a large research gap in factor analysis, data quality assessment, and data enhancement.

4.2.2.2 System-level FD with limited data

The majority of current algorithms for handling small data problems focus on component-level FD, as evidenced by their applications to bearings (Zhang et al. 2020a; Yu et al. 2020, 2021a) and gears (Zhao et al. 2020b). However, these methods cannot meet the diagnostic demands of intricate industrial systems composed of multiple components. Thus, developing intelligent models that perform system-level FD with limited data requires more exploration.

4.3 Small data problems in RUL prediction tasks

4.3.1 Main problems and corresponding solutions

The paradox inherent in RUL prediction lies in its aim to estimate the degradation trend of equipment from historical monitoring data, whereas run-to-failure data are difficult to obtain. This paradox has led scholars to recognize the significance of small data issues within prognostic tasks. Among the 27 reviewed papers, the problems of limited labeled training data, class imbalance, incomplete data, and poor model generalization are mainly studied. While these issues are similar to those in the FD task, they have different implications in RUL prediction due to its continuous label space (Ding et al. 2023a). Specifically, limited labeled training data makes it difficult to learn sufficiently robust representations of health indicators; class imbalance may lead to more frequent prediction of non-failure events and produce conservative estimates; missing information in incomplete data further increases prediction uncertainty; and poorer generalization capability reduces the compatibility of the model with different operating conditions or devices.

RUL prediction is a typical regression task, wherein the quantity of training data profoundly influences the feature learning and nonlinear fitting abilities of DL models. To address limited labeled training data, solutions include transform-based (Fu et al. 2020; Sadoughi et al. 2019; Gay et al. 2022) and generative model-based (Zhang et al. 2020b) DA methods, alongside instance-based (Zhang et al. 2020c; Ruan et al. 2022) and feature-based (Xia et al. 2021; Mao et al. 2021, 2020) TL methods. Among the reviewed papers, three sampling-based DA methods and one deep generative model-based DA approach have been reported to alleviate class imbalance; for instance, the Adaptive Synthetic over-sampling strategy was proposed in Liu and Zhu (2020) for tool wear prediction with imbalanced truncation data. Another major challenge of RUL prediction is incomplete time-series data, which is typically treated as an imputation problem; the GAN-based methods proposed in Huang et al. (2022) and Wenbai et al. (2021) impute missing data by automatically learning the correlations within time series. Model-level solutions based on TL and FSL have also been employed to enhance the generalization of predictive models across domains when faced with limited time-series samples; notably, MAML-based few-shot prognostics (Li et al. 2019, 2022d; Ding et al. 2021, 2022a; Mo et al. 2022; Ding and Jia 2021) have recently demonstrated substantial advancements within the PHM field. In addition, LSTM has become a popular benchmark model for RUL prediction tasks due to its proficiency in capturing long-term dependencies, and its combination with CNN has extended the capability of learning degradation patterns (Wenbai et al. 2021; Xia et al. 2021).
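A minimal sketch of such an LSTM-based RUL regressor is shown below, assuming PyTorch; the window length, number of features, and network sizes are illustrative assumptions rather than settings from the cited studies.

```python
import torch
import torch.nn as nn

# Minimal LSTM regressor for RUL prediction on sliding windows of condition-monitoring data.
class RULNet(nn.Module):
    def __init__(self, n_features=14, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, window_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # predict RUL from the last time step

model = RULNet()
windows = torch.randn(8, 30, 14)           # 8 sliding windows of 30 time steps (stand-in data)
rul_true = torch.rand(8, 1) * 100
loss = nn.functional.mse_loss(model(windows), rul_true)
loss.backward()                            # one step of a standard regression training loop
print(loss.item())
```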

4.3.2 Remaining challenges

Significant strides have been made in addressing the challenges of limited data in RUL prediction. However, it is noteworthy that many of the proposed methods rely to some extent on assumptions that might not hold under real-world conditions. To achieve more reliable forecasts, a number of major challenges must be addressed.

4.3.2.1 Interpretability of prognostic models

Although numerous prognostic models have shown impressive predictive performance, many offer poor interpretability. The inherent "black box" nature of DL models limits the desired interpretability, transparency, and causal insight into both the models and their outcomes. Consequently, within RUL prediction, interpretability is much needed to reveal the underlying degradation mechanisms hidden in the monitoring data, thereby increasing the level of "trust" in intelligent models.

4.3.2.2 Uncertainty quantification in small data conditions

Uncertainty quantification (UQ) is an important dimension of the PHM framework that can improve the quality of RUL prediction through risk assessment and management. The uncertainty involved in RUL predictions can be categorized into aleatory uncertainty and epistemic uncertainty (Kiureghian and Ditlevsen 2009): the first type often results from the noise inherent in the data, such as noise in signal measurements, while the second is attributed to deficiencies in model knowledge, including the model architecture and parameters. As discussed above, the impacts of the small data challenge at the data level (incomplete data and unbalanced distributions) and the model level (poor generalization) both further increase the uncertainty of predictive results, yet studies on UQ under small data conditions remain few. Existing research on the UQ of intelligent RUL predictions mainly applies Gaussian process regression and Bayesian neural networks. For example, Ding et al. (2023b) designed a Bayesian-approximation-enhanced probabilistic meta-learning method to reduce parameter uncertainty in few-shot prognostics. A recent study (Nemani et al. 2023) demonstrates that physics-informed ML is promising for the UQ of RUL predictions under small data conditions by combining physics-based and data-driven modeling.
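As a simple illustration of approximate Bayesian UQ (using Monte-Carlo dropout rather than the specific methods of the cited works, and assuming PyTorch), repeated stochastic forward passes yield both a predictive mean and an uncertainty estimate for each RUL prediction:

```python
import torch
import torch.nn as nn

# Monte-Carlo dropout as a simple approximation of epistemic uncertainty in RUL prediction.
net = nn.Sequential(nn.Linear(14, 64), nn.ReLU(), nn.Dropout(p=0.2), nn.Linear(64, 1))

def predict_with_uncertainty(net, x, n_samples=100):
    net.train()                           # keep dropout active at inference time
    with torch.no_grad():
        preds = torch.stack([net(x) for _ in range(n_samples)])
    return preds.mean(0), preds.std(0)    # predictive mean and an uncertainty estimate

x = torch.randn(5, 14)                    # 5 health-feature vectors (stand-in data)
mean, std = predict_with_uncertainty(net, x)
print(mean.squeeze(), std.squeeze())      # a wider std signals a less trustworthy RUL estimate
```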

5 Datasets and experimental settings

A growing number of methods have been proposed for small data problems in the PHM domain, but unified criteria for their fair and valid evaluation are lacking, one of the major reasons being the complexity and variability of the equipment and working conditions under study. To this end, we analyze and distill two key elements of model evaluation in current studies, namely datasets and small data settings, which are summarized in this section to provide guidance for the effective evaluation of existing models.

5.1 Datasets

In the past decade, the PHM community has released many diagnostic and prognostic benchmarks that cover different mechanical objects, such as bearings (Smith and Randall 2015; Lessmeier et al. 2016; Qiu et al. 2006; Nectoux et al. 2012; Wang et al. 2018a; Bechhoefer 2013), gearboxes (Shao et al. 2018; Xie et al. 2016), turbofan engines (Saxena et al. 2008), and cutting tools (Agogino and Goebel 2007). Table 9 lists several datasets that have been widely used in existing research on small data problems and outlines their signal types, failure modes, numbers of operating conditions, applications to PHM tasks, and main features.

Different datasets exhibit distinct characteristics and are therefore suitable for studying different problems. Depending on how the fault data are generated, these datasets can be broadly categorized as simulated fault datasets, real fault datasets, and hybrid datasets. Among them, the simulated fault datasets (Smith and Randall 2015; Qiu et al. 2006; Wang et al. 2018a; Shao et al. 2018; Xie et al. 2016; Saxena et al. 2008; Bronz et al. 2020; Downs and Vogel 1993) obtain fault samples through artificially induced faults or simulation software; because the experimental process involves limited and human-controlled variables, the fault characteristics and degradation modes in the data are relatively simple, and DL models often achieve excellent performance on them. A typical example is the Case Western Reserve University (CWRU) dataset (Smith and Randall 2015), a well-known simulation benchmark widely used for small data problems in AD, FD, and RUL prediction tasks. The CWRU dataset features multiple failure modes, unbalanced classes, different bearings, and various operating conditions, providing opportunities to study limited labeled training data (Ding et al. 2019), class imbalance (Mao et al. 2019), incomplete data (Yang et al. 2020a), and equipment degradation under various conditions (Kim and Youn 2019; Li et al. 2021c).

In contrast, real fault datasets (Nectoux et al. 2012; Agogino and Goebel 2007) collect failure samples from equipment undergoing natural degradation, which is often accompanied by many uncontrollable factors from the equipment itself and the external environment, resulting in more complex data distributions. These datasets are generally used to validate the robustness of small-data solutions under practical conditions (Sadoughi et al. 2019). Moreover, hybrid datasets (Lessmeier et al. 2016; Bechhoefer 2013) contain both artificially damaged and naturally damaged fault data, and they are used to validate the transfer of fault knowledge across objects, across working conditions, and from the laboratory to real environments (Wang et al. 2020b).

Further, in terms of the signal types contained in the datasets, vibration signals, sound signals, electric currents, and temperatures are the most common. These diverse signals open up avenues for developing multi-source data fusion techniques (Yan et al. 2023). In addition, some datasets include not only single faults but also compound faults, facilitating the study of compound fault diagnosis and prognosis. Moreover, as shown in Table 9, most of the datasets collect signals from individual components, whereas samples from subsystems (Shao et al. 2018) or entire systems (Downs and Vogel 1993) are required for system-level diagnostics and prognostics.

5.2 Experimental setups

When executing intelligent PHM tasks, the general experimental procedure for DL models is to first divide the dataset into training, validation, and test sets according to a certain ratio. However, how to construct "small data" scenarios that simulate limited data conditions is a design question in its own right. Two popular strategies are used in current studies, as shown in Table 10.

5.2.1 Setting a small sample size

The most direct and commonly employed setup for studying small data problems involves reducing the number of training or test samples to a few or a few dozen, which is achieved by selecting a tiny subset of the entire dataset. For example, 2.5% of the dataset was used for training in Xing et al. (2022), meaning only five fault samples of each class were provided to the model, far fewer than the hundreds or thousands of samples required by traditional DL methods. Owing to its ease of implementation and understanding, this strategy has been widely used in AD, FD, and RUL prediction tasks with limited data, and it was observed in most experiments using DA and TL methods. However, the notion of a "small sample" is relative to the total size of the dataset and lacks a unified standard, so it should be kept consistent when comparing different methods.
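A minimal sketch of this setup is shown below: K training samples are kept per fault class and the remainder are reserved for testing; the arrays are random stand-ins for a real PHM dataset.

```python
import numpy as np

# Keep only k_per_class training samples per fault class; the rest form the test set.
def make_small_training_set(X, y, k_per_class, seed=0):
    rng = np.random.default_rng(seed)
    train_idx = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        train_idx.extend(rng.choice(idx, size=k_per_class, replace=False))
    train_idx = np.array(train_idx)
    test_mask = np.ones(len(y), dtype=bool)
    test_mask[train_idx] = False
    return X[train_idx], y[train_idx], X[test_mask], y[test_mask]

X = np.random.randn(2000, 1024)            # 2000 signal segments of length 1024 (stand-in)
y = np.repeat(np.arange(10), 200)          # 10 fault classes, 200 samples each
X_tr, y_tr, X_te, y_te = make_small_training_set(X, y, k_per_class=5)
print(X_tr.shape, X_te.shape)              # (50, 1024) (1950, 1024): 5 samples per class
```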

5.2.2 Following the N-way K-shot protocol

Another strategy is to treat PHM tasks under limited data conditions as few-shot classification or regression problems. This strategy draws on the organization of FSL methods, which extend the input from the level of individual data points to the task space. As described in Sect. 3.3, each N-way K-shot task consists of N (N ≤ 20) classes, with each class containing K (K ≤ 10) support samples. Multiple N-way K-shot subtasks can then be created for the training and testing of FSL models. In the case of the CWRU dataset, for example, 10-way 1/5-shot FD tasks are frequently designed. This setting aligns better with the principles of the FSL framework and proves beneficial for detecting novel faults under unseen conditions. However, tasks need to be sampled from a sufficiently large number of categories; otherwise, the tasks become homogeneous and degrade model performance.
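The episodic setup can be sketched as follows, assuming NumPy: one N-way K-shot task (plus query samples) is drawn from a labeled pool, where the data and episode sizes are illustrative stand-ins.

```python
import numpy as np

# Sample one N-way K-shot episode (support) plus query samples from a labeled pool.
def sample_episode(X, y, n_way=10, k_shot=5, n_query=15, seed=None):
    rng = np.random.default_rng(seed)
    classes = rng.choice(np.unique(y), size=n_way, replace=False)
    support, query = [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.flatnonzero(y == c))[: k_shot + n_query]
        support += [(X[i], new_label) for i in idx[:k_shot]]
        query += [(X[i], new_label) for i in idx[k_shot:]]
    return support, query

X = np.random.randn(2000, 1024)            # stand-in signal segments
y = np.repeat(np.arange(10), 200)          # 10 fault classes
support, query = sample_episode(X, y, n_way=10, k_shot=1)   # a 10-way 1-shot FD task
print(len(support), len(query))            # 10 support samples, 150 query samples
```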

6 Future research directions

Currently, most intelligent PHM tasks still operate in the small data regime and will do so for a long time. The various methods proposed in existing research have made significant progress, but there is still a long way to go to realize data-efficient PHM. For this reason, we propose some directions for further research on small data challenges.

6.1 Data governance

Existing research on the limited data challenge focuses on the quantity of monitoring data, with relatively little attention paid to the quality of the samples. In fact, monitoring data serve as the "raw material" for implementing PHM tasks, and their quality seriously affects the performance of intelligent models as well as the accuracy of maintenance decisions. As a result, it is imperative to research the theories and methodologies of industrial data governance, which involves the quantification, assessment, and enhancement of data quality (Karkošková 2023). An in-depth exploration of these topics ensures that the collected monitoring data meet the data quality requirements set out in the ISO/IEC 25012 standard (Gualo et al. 2021), thereby minimizing the adverse effects of factors such as sensor drift, measurement errors, environmental noise, and label inaccuracies. Data governance is a key component in steering intelligent PHM from the prevalent model-centric paradigm towards a data-centric one (Zha et al. 2023).

6.2 Multimodal learning

Multimodal learning is a novel paradigm for training models with multiple different data modalities, which provides a potential means of solving small data problems in PHM. Specifically, rich forms of monitoring data exist in industry, including but not limited to surveillance videos, equipment images, and maintenance records; these data contain a wealth of intermodal and cross-modal information that can be fused by multimodal learning techniques (Xu et al. 2023) to compensate for the low information density of limited unimodal data. Meanwhile, multimodal data from different systems and equipment can help to perceive their health status more comprehensively, thus improving intelligent diagnosis and forecasting capabilities for an entire fleet of equipment (Jose et al. 2023).

6.3 Physics-informed data-driven approaches

Existing studies have demonstrated that data-driven approaches, especially those based on DL, excel at capturing underlying patterns from multivariate data but are susceptible to small dataset sizes. Physics model-based methods, in contrast, incorporate mechanisms or expert knowledge during the modeling process but have limited data processing capabilities. Considering the respective strengths and weaknesses of these two paradigms, an emerging trend is to develop hybrid frameworks that integrate domain knowledge with the implicit knowledge extracted from data (Ma et al. 2023), which offers two obvious advantages for solving small data problems. On the one hand, the introduction of physical knowledge reduces the black-box character of DL models to a certain extent and enhances the interpretability of PHM decision-making under small samples (Weikun et al. 2023). On the other hand, physical modeling takes known physical laws and principles as a priori knowledge, which can reduce the uncertainty and domain bias brought by small-sample data under complex working conditions; for example, Shi et al. (2022) validated the effectiveness of introducing multibody dynamic simulation into data augmentation for robustness enhancement.
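A generic sketch of this hybrid idea is a composite training loss that adds a physics-consistency penalty to the usual data-fitting term; the exponential degradation law, network, and weighting factor below are assumptions chosen purely for illustration, not the formulation of the cited works.

```python
import torch
import torch.nn as nn

# Composite physics-informed loss: data fit plus a penalty on deviation from an assumed
# degradation law dh/dt = -lambda * h (exponential decay, assumed for illustration).
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

t = torch.linspace(0, 1, 50).unsqueeze(1).requires_grad_(True)    # normalized operating time
with torch.no_grad():
    h_measured = torch.exp(-2.0 * t) + 0.05 * torch.randn(50, 1)  # noisy health indicator

h_pred = net(t)
data_loss = nn.functional.mse_loss(h_pred, h_measured)

# Physics residual: dh/dt + lambda * h should be close to zero under the assumed law.
dh_dt = torch.autograd.grad(h_pred.sum(), t, create_graph=True)[0]
physics_loss = ((dh_dt + 2.0 * h_pred) ** 2).mean()

loss = data_loss + 0.1 * physics_loss     # the weighting factor is a tunable hyperparameter
loss.backward()                           # gradients flow to the network parameters
```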

6.4 Weakly supervised learning

DL-based models have demonstrated much potential in numerous PHM tasks, but their performance relies heavily on supervised learning, and the need for abundant annotated data is a significant barrier to deploying these models in industry. Obtaining high-quality labeled data is time-intensive and expensive, whereas unlabeled data are more readily available in practice. This reality has spurred the exploration of techniques such as unsupervised and self-supervised learning to construct learning models autonomously from unlabeled data. Weakly supervised strategies have been successfully employed in the fields of computer vision and natural language processing, and their application potential in PHM tasks has been explored by Zhao et al. (2021b) and Ding et al. (2022b); the results illustrate that these methods excel at addressing open-set diagnostic and prognostic problems with small data.

6.5 Federated learning

Federated learning (FL) (Yang et al. 2019b) is a promising framework for developing DL models with low resources; it adheres to the unique principle of "data stays put, models move". FL allows decentralized models to be trained on the data generated by each manufacturing company separately, without aggregating the data from all manufacturers into a centralized repository, which yields two significant benefits. First, from a cost perspective, FL reduces the expenses associated with large-scale data collection, transmission, storage, and model training. Second, from a data privacy standpoint, FL directly leverages locally held data without data sharing, eliminating data owners' concerns about data sovereignty and business secrets. Moreover, the distributed training process, which exchanges only part of the model parameters, reduces the risk of malicious attacks on PHM models in industrial applications (Arunan et al. 2023). At present, representative models include federated averaging (FedAvg) (McMahan et al. 2017), federated proximal (FedProx) (Mishchenko et al. 2022), federated transfer learning (Kevin et al. 2021), and federated meta-learning (Fallah et al. 2020), which provide valuable guidance for developing reliable and responsible intelligent PHM. Due to the complexity of equipment composition and working conditions, issues such as device heterogeneity and data imbalance in FL applications to PHM require more attention and research (Berghout et al. 2022).
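The FedAvg principle can be sketched as follows, assuming PyTorch: each client trains a local copy of the model on its own data, and the server only averages the returned parameters; the tiny model and random client data are placeholders.

```python
import copy
import torch
import torch.nn as nn

# Minimal FedAvg-style sketch: clients share parameters, never raw data.
def local_update(global_model, data, target, epochs=1, lr=0.01):
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.mse_loss(model(data), target).backward()
        opt.step()
    return model.state_dict()

def fed_avg(state_dicts):
    """Average parameters from all clients (equal weighting for simplicity)."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key] for sd in state_dicts]).mean(0)
    return avg

global_model = nn.Linear(8, 1)
clients = [(torch.randn(32, 8), torch.randn(32, 1)) for _ in range(3)]   # 3 local datasets
for round_ in range(5):
    local_states = [local_update(global_model, x, y) for x, y in clients]
    global_model.load_state_dict(fed_avg(local_states))                  # server aggregation
```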

6.6 Large-scale models

Since the release of GPT-3 (Brown et al. 2020) and ChatGPT (Scheurer et al. 2023), large-scale models have become a hot topic in academia and industry, triggering a new wave of innovation. Technically, large-scale models are the evolution and extension of traditional DL models: they require large amounts of data and computing resources to train hundreds of millions of parameters, and they demonstrate remarkable abilities in data understanding, multi-task performance, logical reasoning, and domain generalization. Considering the remaining challenges of traditional DL models in performing PHM tasks with small data, developing large-scale models for the PHM domain is a promising direction. In a typical workflow, a pre-trained large-scale model is first chosen based on the target PHM task and signal type (e.g., a pre-trained BERT has been reused for RUL prediction (Zhu et al. 2023b)); it is then adapted by freezing most layers and fine-tuning only the top layers with small amounts of data, while regularization and architecture adjustment techniques may be used to alleviate overfitting during the process. The study in Li et al. (2023d) has validated that large-scale models pretrained on multi-modal data from related equipment and working conditions can generalize to cross-task and cross-domain tasks in a zero-shot manner.
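A minimal sketch of this freeze-and-fine-tune workflow is given below, assuming PyTorch; the placeholder backbone stands in for a pretrained foundation model, and the head, optimizer settings, and batch are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Adapt a large pretrained backbone by freezing it and fine-tuning only a small task head.
backbone = nn.Sequential(*[nn.Linear(256, 256) for _ in range(12)])   # placeholder "pretrained" encoder
for p in backbone.parameters():
    p.requires_grad = False                     # freeze the pretrained layers

head = nn.Linear(256, 1)                        # small task-specific head (e.g., RUL regression)
model = nn.Sequential(backbone, head)

optimizer = torch.optim.AdamW(
    filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4, weight_decay=1e-2
)                                               # weight decay acts as simple regularization
x, rul = torch.randn(16, 256), torch.rand(16, 1) * 100   # a small fine-tuning batch (stand-in)
loss = nn.functional.mse_loss(model(x), rul)
loss.backward()
optimizer.step()                                # only the head's parameters are updated
```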

7 Conclusions

Intelligent PHM is a key part of Industry 4.0 and is closely linked to big data and AI models. To address the difficulties of developing DL models with limited data, we provide the first comprehensive overview of small data challenges in PHM. The definition, causes, and impacts of small data are first systematically analyzed to answer the "what" and "why" of solving data scarcity problems. We then comprehensively summarize the proposed solutions along three technical lines to report how small data issues have been addressed in existing studies. Furthermore, the problems and remaining challenges within each specific PHM task are explored. Additionally, available benchmark datasets, experimental settings, and promising directions are discussed, offering valuable references for future research on more intelligent, data-efficient, and explainable PHM methods. Learning from small data is critical to advancing intelligent PHM, as well as contributing to the development of General Industrial AI.

Adadi A (2021) A survey on data-efficient algorithms in big data era. J Big Data 8:1–54

Agogino A, Goebel K (2007) BEST lab, UC Berkeley, Milling Data Set. NASA Ames Prognostics Data Repository, NASA Ames Research Center, Moffett Field

Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. PMLR, pp 214–223

Arunan A, Qin Y, Li X, Yuen C (2023) A federated learning-based industrial health prognostics for heterogeneous edge devices using matched feature extraction. IEEE Trans Autom Sci Eng. https://doi.org/10.1109/TASE.2023.3274648

Baeza-Yates R (2024) Gold blog BIG, small or right data: Which is the proper focus?

Bai G, Sun W, Cao C, Wang D, Sun Q, Sun L (2023) GAN-based bearing fault diagnosis method for short and imbalanced vibration signal. IEEE Sens J 24:1894–1904

Bechhoefer E (2013) Condition based maintenance fault database for testing diagnostics and prognostic algorithms. MFPT Data

Behera S, Misra R (2021) Generative adversarial networks based remaining useful life estimation for IIoT. Comput Electr Eng 92:107195

Behera S, Misra R, Sillitti A (2023) GAN-based multi-task learning approach for prognostics and health management of IIoT. IEEE Trans Autom Sci Eng. https://doi.org/10.1109/TASE.2023.3267860

Berghout T, Benbouzid M, Bentrcia T, Lim WH, Amirat Y (2022) Federated learning for condition monitoring of industrial processes: a review on fault diagnosis methods challenges, and prospects. Electronics 12:158

Berman JJ (2013) Principles of big data: preparing, sharing, and analyzing complex information. Newnes

Borgwardt KM, Gretton A, Rasch MJ, Kriegel H-P, Schölkopf B, Smola AJ (2006) Integrating structured biological data by kernel maximum mean discrepancy. Bioinformatics 22:e49–e57

Bronz M, Baskaya E, Delahaye D, Puechmore S (2020) Real-time fault detection on small fixed-wing UAVs using machine learning. In: 2020 AIAA/IEEE 39th Digital Avionics Systems Conference (DASC), IEEE, San Antonio, TX, USA, pp 1–10

Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A, Agarwal S, Herbert-Voss A, Krueger G, Henighan T, Child R, Ramesh A, Ziegler D, Wu J, Winter C, Hesse C, Chen M, Sigler E, Litwin M, Gray S, Chess B, Clark J, Berner C, McCandlish S, Radford A, Sutskever I, Amodei D (2020) Language models are few-shot learners. Advances in neural information processing systems. Curran Associates Inc, pp 1877–1901

Brusa E, Delprete C, Di Maggio LG (2021) Deep transfer learning for machine diagnosis: from sound and music recognition to bearing fault detection. Appl Sci 11:11663

Cao P, Zhang S, Tang J (2018) Preprocessing-free gear fault diagnosis using small datasets with deep convolutional neural network-based transfer learning. IEEE Access 6:26241–26253

Cao X, Bu W, Huang S, Zhang M, Tsang IW, Ong YS, Kwok JT (2023) A survey of learning on small data: generalization, optimization, and challenge

Chahal H, Toner H, Rahkovsky I (2021) Small data’s big AI potential. Center for Security and Emerging Technology

Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357

Che C, Wang H, Ni X, Fu Q (2020) Domain adaptive deep belief network for rolling bearing fault diagnosis. Comput Ind Eng 143:106427

Chen C, Shen F, Xu J, Yan R (2020) Domain adaptation-based transfer learning for gear fault diagnosis under varying working conditions. IEEE Trans Instrum Meas 70:1–10

Chen W, Qiu Y, Feng Y, Li Y, Kusiak A (2021) Diagnosis of wind turbine faults with transfer learning algorithms. Renew Energy 163:2053–2067

Chen J, Hu W, Cao D, Zhang Z, Chen Z, Blaabjerg F (2022) A meta-learning method for electric machine bearing fault diagnosis under varying working conditions with limited data. IEEE Trans Indus Inform 19:2552–2564

Chen X, Liu H, Nikitas N (2023a) Internal pump leakage detection of the hydraulic systems with highly incomplete flow data. Adv Eng Inform 56:101974

Chen J, Tang J, Li W (2023b) Industrial edge intelligence: federated-meta learning framework for few-shot fault diagnosis. IEEE Trans Netw Sci Eng. https://doi.org/10.1109/TNSE.2023.3266942

Chen X, Zhao C, Ding J (2023c) Pyramid-type zero-shot learning model with multi-granularity hierarchical attributes for industrial fault diagnosis. Reliab Eng Syst Saf 240:109591

Cheng C, Zhou B, Ma G, Wu D, Yuan Y (2020) Wasserstein distance based deep adversarial transfer learning for intelligent fault diagnosis with unlabelled or insufficient labelled data. Neurocomputing 409:35–45

Cho SH, Kim S, Choi J-H (2020) Transfer learning-based fault diagnosis under data deficiency. Appl Sci 10:7768

Choi K, Kim Y, Kim S-K, Kim K-S (2020) Current and position sensor fault diagnosis algorithm for PMSM drives based on robust state observer. IEEE Trans Industr Electron 68:5227–5236

D Research (2019) Artificial intelligence and machine learning projects are obstructed by data issues

Dai W, Yang Q, Xue GR, Yu Y (2007) Boosting for transfer learning. 2007. In: Proceedings of the 24th International Conference on Machine Learning

Dai H, Chen P, Yang H (2022) Metalearning-based fault-tolerant control for skid steering vehicles under actuator fault conditions. Sensors 22:845

Der Kiureghian A, Ditlevsen O (2009) Aleatory or epistemic? Does it matter? Struct Saf 31:105–112

Ding P, Jia M (2021) Mechatronics equipment performance degradation assessment using limited and unlabeled data. IEEE Trans Industr Inf 18:2374–2385

Ding Y, Ma L, Ma J, Wang C, Lu C (2019) A generative adversarial network-based intelligent fault diagnosis method for rotating machinery under small sample size conditions. IEEE Access 7:149736–149749

Ding P, Jia M, Zhao X (2021) Meta deep learning based rotating machinery health prognostics toward few-shot prognostics. Appl Soft Comput 104:107211

Ding P, Jia M, Ding Y, Cao Y, Zhao X (2022a) Intelligent machinery health prognostics under variable operation conditions with limited and variable-length data. Adv Eng Inform 53:101691

Ding Y, Zhuang J, Ding P, Jia M (2022b) Self-supervised pretraining via contrast learning for intelligent incipient fault detection of bearings. Reliab Eng Syst Saf 218:108126

Ding P, Zhao X, Shao H, Jia M (2023a) Machinery cross domain degradation prognostics considering compound domain shifts. Reliab Eng Syst Saf 239:109490

Ding P, Jia M, Ding Y, Cao Y, Zhuang J, Zhao X (2023b) Machinery probabilistic few-shot prognostics considering prediction uncertainty. IEEE/ASME Trans Mechatron 29:106–118

Dixit S, Verma NK (2020) Intelligent condition-based monitoring of rotary machines with few samples. IEEE Sens J 20:14337–14346

Dou J, Wei G, Song Y, Zhou D, Li M (2023) Switching triple-weight-smote in empirical feature space for imbalanced and incomplete data. IEEE Trans Autom Sci Eng 21:1–17

Downs JJ, Vogel EF (1993) A plant-wide industrial process control problem. Comput Chem Eng 17:245–255

Du Y, Zhang W, Wang J, Wu H (2019) DCGAN based data generation for process monitoring. In: IEEE, pp 410–415

Fallah A, Mokhtari A, Ozdaglar A (2020) Personalized federated learning with theoretical guarantees: a model-agnostic meta-learning approach. Adv Neural Inf Process Syst 33:3557–3568

Fan Y, Cui X, Han H, Lu H (2020) Chiller fault detection and diagnosis by knowledge transfer based on adaptive imbalanced processing. Sci Technol Built Environ 26:1082–1099

Fan Z, Xu Q, Jiang C, Ding SX (2023a) Deep mixed domain generalization network for intelligent fault diagnosis under unseen conditions. IEEE Trans Industr Electron 71:965–974

Fan L, Chen X, Chai Y, Lin W (2023b) Attribute fusion transfer for zero-shot fault diagnosis. Adv Eng Inform 58:102204

Fekri MN, Patel H, Grolinger K, Sharma V (2021) Deep learning for load forecasting with smart meter data: online adaptive recurrent neural network. Appl Energy 282:116177

Feng L, Zhao C (2020) Fault description based attribute transfer for zero-sample industrial fault diagnosis. IEEE Trans Industr Inf 17:1852–1862

Feng Y, Chen J, Yang Z, Song X, Chang Y, He S, Xu E, Zhou Z (2021) Similarity-based meta-learning network with adversarial domain adaptation for cross-domain fault identification. Knowl-Based Syst 217:106829

Fink O, Wang Q, Svensen M, Dersin P, Lee W-J, Ducoffe M (2020) Potential, challenges and future directions for deep learning in prognostics and health management applications. Eng Appl Artif Intell 92:103678

Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: 34th International Conference on Machine Learning, ICML 2017 3:1856–1868

Fu B, Yuan W, Cui X, Yu T, Zhao X, Li C (2020) Correlation analysis and augmentation of samples for a bidirectional gate recurrent unit network for the remaining useful life prediction of bearings. IEEE Sens J 21:7989–8001

Gangsar P, Tiwari R (2020) Signal based condition monitoring techniques for fault detection and diagnosis of induction motors: a state-of-the-art review. Mech Syst Signal Process 144:106908

Gay A, Voisin A, Iung B, Do P, Bonidal R, Khelassi A (2022) Data augmentation-based prognostics for predictive maintenance of industrial system. CIRP Ann 71:409–412

Gay A, Voisin A, Iung B, Do P, Bonidal R, Khelassi A (2023) A study on data augmentation optimization for data-centric health prognostics of industrial systems. IFAC-PapersOnLine 56:1270–1275

Gray DO, Rivers D, Vermont G (2012) Measuring the economic impacts of the NSF Industry/University Cooperative Research Centers Program: a feasibility study, Arlington, Virginia

Gretton A, Sejdinovic D, Strathmann H, Balakrishnan S, Pontil M, Fukumizu K, Sriperumbudur BK (2012) Optimal kernel choice for large-scale two-sample tests. Adv Neural Inform Process Syst 25

Gualo F, Rodríguez M, Verdugo J, Caballero I, Piattini M (2021) Data quality certification using ISO/IEC 25012: industrial experiences. J Syst Softw 176:110938

Guo C, Hu W, Yang F, Huang D (2020) Deep learning technique for process fault detection and diagnosis in the presence of incomplete data. Chin J Chem Eng 28:2358–2367

Han T, Xie W, Pei Z (2023) Semi-supervised adversarial discriminative learning approach for intelligent fault diagnosis of wind turbine. Inf Sci 648:119496

Hao W, Liu F (2020) Imbalanced data fault diagnosis based on an evolutionary online sequential extreme learning machine. Symmetry 12:1204

He Z, Shao H, Zhang X, Cheng J, Yang Y (2019) Improved deep transfer auto-encoder for fault diagnosis of gearbox under variable working conditions with small training samples. IEEE Access 7:115368–115377

He Y, Hu M, Feng K, Jiang Z (2020a) An intelligent fault diagnosis scheme using transferred samples for intershaft bearings under variable working conditions. IEEE Access 8:203058–203069

He Z, Shao H, Wang P, (Jing) Lin J, Cheng J, Yang Y (2020b) Deep transfer multi-wavelet auto-encoder for intelligent fault diagnosis of gearbox with few target training samples. Knowl-Based Syst 191:105313

He J, Li X, Chen Y, Chen D, Guo J, Zhou Y (2021) Deep transfer learning method based on 1d-cnn for bearing fault diagnosis. Shock Vib 2021:1–16

Hinton GE, Zemel RS (1994) Autoencoders, minimum description length, and Helmholtz free energy. Adv Neural Inf Process Syst 6:3–10

Hu T, Tang T, Lin R, Chen M, Han S, Wu J (2020) A simple data augmentation algorithm and a self-adaptive convolutional architecture for few-shot fault diagnosis under different working conditions. Measurement 156:107539

Hu C, Zhou Z, Wang B, Zheng W, He S (2021a) Tensor transfer learning for intelligence fault diagnosis of bearing with semisupervised partial label learning. J Sens 2021:1–11

Hu Y, Liu R, Li X, Chen D, Hu Q (2021b) Task-sequencing meta learning for intelligent few-shot fault diagnosis with limited data. IEEE Trans Industr Inf 18:3894–3904

Hu Z, Shen L, Wang Z, Liu T, Yuan C, Tao D (2023) Architecture, dataset and model-scale agnostic data-free meta-learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 7736–7745

Huang N, Chen Q, Cai G, Xu D, Zhang L, Zhao W (2020) Fault diagnosis of bearing in wind turbine gearbox under actual operating conditions driven by limited data with noise labels. IEEE Trans Instrum Meas 70:1–10

Huang F, Sava A, Adjallah KH, Wang Z (2021) Fuzzy model identification based on mixture distribution analysis for bearings remaining useful life estimation using small training data set. Mech Syst Signal Process 148:107173

Huang Y, Tang Y, VanZwieten J, Liu J (2022) Reliable machine prognostic health management in the presence of missing data. Concurr Comput Pract Exp 34:e5762

Huang C, Bu S, Lee HH, Chan KW, Yung WKC (2024) Prognostics and health management for induction machines: a comprehensive review. J Intell Manuf 35:937–962

Iglesias G, Talavera E, González-Prieto Á, Mozo A, Gómez-Canaval S (2023) Data Augmentation techniques in time series domain: a survey and taxonomy. Neural Comput Appl 35:10123–10145

Jamil F, Verstraeten T, Nowé A, Peeters C, Helsen J (2022) A deep boosted transfer learning method for wind turbine gearbox fault detection. Renew Energy 197:331–341

Jiang C, Chen H, Xu Q, Wang X (2022) Few-shot fault diagnosis of rotating machinery with two-branch prototypical networks. J Intell Manuf. https://doi.org/10.1007/s10845-021-01904-x

Jiang Y, Drescher B, Yuan G (2023) A GAN-based multi-sensor data augmentation technique for CNC machine tool wear prediction. IEEE Access 11:95782–95795

Jin X, Wah BW, Cheng X, Wang Y (2015) Significance and challenges of big data research. Big Data Res 2:59–64

Jose S, Nguyen KTP, Medjaher K (2023) Multimodal machine learning in prognostics and health management of manufacturing systems. Artificial intelligence for smart manufacturing: methods, applications, and challenges. Springer, pp 167–197

Karkošková S (2023) Data governance model to enhance data quality in financial institutions. Inf Syst Manag 40:90–110

Kavis M (2015) Forget big data–small data is driving the Internet of Things, https://www.Forbes.Com/Sites/Mikekavis/2015/02/25/Forget-Big-Datasmall-Data-Is-Driving-the-Internet-of-Things

Kevin I, Wang K, Zhou X, Liang W, Yan Z, She J (2021) Federated transfer learning based cross-domain prediction for smart manufacturing. IEEE Trans Industr Inf 18:4088–4096

Kim H, Youn BD (2019) A new parameter repurposing method for parameter transfer with small dataset and its application in fault diagnosis of rolling element bearings. IEEE Access 7:46917–46930

Koch G, Zemel R, Salakhutdinov R (2015) Siamese neural networks for one-shot image recognition. In: ICML Deep Learning Workshop

Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86

Kumar P, Raouf I, Kim HS (2023) Review on prognostics and health management in smart factory: from conventional to deep learning perspectives. Eng Appl Artif Intell 126:107126

Lao Z, He D, Jin Z, Liu C, Shang H, He Y (2023) Few-shot fault diagnosis of turnout switch machine based on semi-supervised weighted prototypical network. Knowl-Based Syst 274:110634

LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324

Lee YO, Jo J, Hwang J (2017) Application of deep neural network and generative adversarial network to industrial maintenance: a case study of induction motor fault detection. In: Proceedings—2017 IEEE International Conference on Big Data, Big Data 2017 2018-Janua, pp 3248–3253

Lee J, Mitici M (2023) Deep reinforcement learning for predictive aircraft maintenance using probabilistic remaining-useful-life prognostics. Reliab Eng Syst Saf 230:108908

Lee K, Han S, Pham VH, Cho S, Choi H-J, Lee J, Noh I, Lee SW (2021) Multi-objective instance weighting-based deep transfer learning network for intelligent fault diagnosis. Appl Sci 11:2370

Lei Y, Li N, Guo L, Li N, Yan T, Lin J (2018) Machinery health prognostics: a systematic review from data acquisition to RUL prediction. Mech Syst Signal Process 104:799–834

Lessmeier C, Kimotho JK, Zimmer D, Sextro W (2016) Condition monitoring of bearing damage in electromechanical drive systems by using motor current signals of electric motors: a benchmark data set for data-driven classification, 17

Li Y, Liu C, Hua J, Gao J, Maropoulos P (2019) A novel method for accurately monitoring and predicting tool wear under varying cutting conditions based on meta-learning. CIRP Ann 68:487–490

Li X, Zhang W, Ding Q, Sun JQ (2020a) Intelligent rotating machinery fault diagnosis based on deep learning using data augmentation. J Intell Manuf 31:433–452

Li W, Gu S, Zhang X, Chen T (2020b) Transfer learning for process fault diagnosis: knowledge transfer from simulation to physical processes. Comput Chem Eng 139:106904

Li X, Zhang W, Ding Q, Li X (2020c) Diagnosing rotating machines with weakly supervised data using deep transfer learning. IEEE Trans Industr Inf 16:1688–1697

Li F, Tang T, Tang B, He Q (2021a) Deep convolution domain-adversarial transfer learning for fault diagnosis of rolling bearings. Measurement 169:108339

Li Y, Jiang W, Zhang G, Shu L (2021b) Wind turbine fault diagnosis based on transfer learning and convolutional autoencoder with small-scale data. Renew Energy 171:103–115

Li C, Li S, Zhang A, He Q, Liao Z, Hu J (2021c) Meta-learning for few-shot bearing fault diagnosis under complex working conditions. Neurocomputing 439:197–211

Li X, Yang X, Ma Z, Xue JH (2021d) Deep metric learning for few-shot image classification: a selective review. arXiv preprint arXiv:2105.08149

Li Z, Sun Y, Yang L, Zhao Z, Chen X (2022a) Unsupervised machine anomaly detection using autoencoder and temporal convolutional network. IEEE Trans Instrum Meas 71:1–13

Li W, Huang R, Li J, Liao Y, Chen Z, He G, Yan R, Gryllias K (2022b) A perspective survey on deep transfer learning for fault diagnosis in industrial scenarios: theories, applications and challenges. Mech Syst Signal Process 167:108487

Li C, Li S, Zhang A, Yang L, Zio E, Pecht M, Gryllias K (2022c) A Siamese hybrid neural network framework for few-shot fault diagnosis of fixed-wing unmanned aerial vehicles. J Comput Design Eng 9:1511–1524

Li Y, Wang J, Huang Z, Gao RX (2022d) Physics-informed meta learning for machining tool wear prediction. J Manuf Syst 62:17–27

Li Y, Yang Y, Feng K, Zuo MJ, Chen Z (2023a) Automated and adaptive ridge extraction for rotating machinery fault detection. IEEE/ASME Trans Mechatron 28:2565

Li K, Lu J, Zuo H, Zhang G (2023b) Source-free multi-domain adaptation with fuzzy rule-based deep neural networks. IEEE Trans Fuzzy Syst. https://doi.org/10.1109/TFUZZ.2023.3276978

Li C, Li S, Wang H, Gu F, Ball AD (2023c) Attention-based deep meta-transfer learning for few-shot fine-grained fault diagnosis. Knowl-Based Syst 264:110345

Li Y-F, Wang H, Sun M (2023d) ChatGPT-like large-scale foundation models for prognostics and health management: a survey and roadmaps. Reliab Eng Syst Saf 243:109850

Liang P, Deng C, Wu J, Yang Z, Zhu J, Zhang Z (2020) Single and simultaneous fault diagnosis of gearbox via a semi-supervised and high-accuracy adversarial learning framework. Knowl-Based Syst 198:105895

Liao Y, Huang R, Li J, Chen Z, Li W (2020) Deep semisupervised domain generalization network for rotary machinery fault diagnosis under variable speed. IEEE Trans Instrum Meas 69:8064–8075

Lin J, Shao H, Zhou X, Cai B, Liu B (2023) Generalized MAML for few-shot cross-domain fault diagnosis of bearing driven by heterogeneous signals. Expert Syst Appl 230:120696

Liu J, Ren Y (2020) A general transfer framework based on industrial process fault diagnosis under small samples. IEEE Trans Industr Inf 3203:1–11

Liu C, Zhu L (2020) A two-stage approach for predicting the remaining useful life of tools using bidirectional long short-term memory. Measurement 164:108029


Acknowledgements

This work was supported in part by the National Key Research and Development Program of China [No. 2023YFB3308800]; in part by the National Natural Science Foundation of China [No. 52275480]; in part by the Guizhou Province Higher Education Project [No. QJH KY [2020]005]; and in part by the Guizhou University Natural Sciences Special Project (Guida Tegang Hezi (2023) No. 61).

Author information

Authors and Affiliations

State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, Guizhou, China

Chuanjiang Li, Shaobo Li & Yixiong Feng

Department of Mechanical Engineering, Flanders Make, KU Leuven, 3000, Louvain, Belgium

Konstantinos Gryllias

School of Computing and Engineering, University of Huddersfield, Huddersfield, HD1 3DH, UK

Fengshou Gu

Advanced Life Cycle Engineering, University of Maryland, College Park, MD, 20742, USA

Michael Pecht


Contributions

Chuanjiang Li: Conceptualization, Investigation, Methodology, Software, Data curation, Writing-Original draft preparation. Shaobo Li: Conceptualization, Supervision, Funding support. Yixiong Feng: Investigation, Writing-review. Konstantinos Gryllias: Methodology, Writing-review. Fengshou Gu: Methodology, Writing-review & editing. Michael Pecht: Methodology, Writing-review & editing.

Corresponding author

Correspondence to Chuanjiang Li.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Li, C., Li, S., Feng, Y. et al. Small data challenges for intelligent prognostics and health management: a review. Artif Intell Rev 57, 214 (2024). https://doi.org/10.1007/s10462-024-10820-4


Accepted: 28 May 2024

Published: 23 July 2024

DOI: https://doi.org/10.1007/s10462-024-10820-4


Keywords

  • Prognostics and health management (PHM)
  • Data augmentation
  • Few-shot learning
  • Transfer learning

Cureus. 2023 Feb; 15(2). PMC10023071

Clinical Trials and Clinical Research: A Comprehensive Review

Venkataramana Kandi

1 Clinical Microbiology, Prathima Institute of Medical Sciences, Karimnagar, IND

Sabitha Vadakedath

2 Biochemistry, Prathima Institute of Medical Sciences, Karimnagar, IND

Clinical research is an alternative terminology used to describe medical research. Clinical research involves people, and it is generally carried out to evaluate the efficacy of a therapeutic drug, a medical/surgical procedure, or a device as a part of treatment and patient management. Moreover, any research that evaluates aspects of a disease, such as its symptoms, risk factors, and pathophysiology, may be termed clinical research. However, clinical trials are those studies that assess the potential of a therapeutic drug/device in the management, control, and prevention of disease. In view of the increasing incidence of both communicable and non-communicable diseases, and especially after the effects that Coronavirus Disease-19 (COVID-19) had on public health worldwide, the emphasis on clinical research assumes extreme importance. Knowledge of clinical research will facilitate the discovery of drugs, devices, and vaccines, thereby improving preparedness during public health emergencies. Therefore, in this review, we comprehensively describe the critical elements of clinical research, including clinical trial phases, types and designs of clinical trials, trial operations, audit and management, and ethical concerns.

Introduction and background

A clinical trial is a systematic process intended to establish the safety and efficacy of a drug/device in treating/preventing/diagnosing a disease or a medical condition [1,2]. A clinical trial progresses through various phases, including phase 0 (micro-dosing studies), phase 1, phase 2, phase 3, and phase 4 [3]. Phase 0 and phase 2 are called exploratory trial phases, phase 1 is termed the non-therapeutic phase, phase 3 is known as the therapeutic confirmatory phase, and phase 4 is called the post-approval or post-marketing surveillance phase. Phase 0, also called the micro-dosing phase, was previously done in animals but is now carried out in human volunteers to understand dose tolerability (pharmacokinetics) before the drug is administered to healthy individuals as part of the phase 1 trial. The details of the clinical trial phases are shown in Table 1.

This table has been created by the authors.

MTD: maximum tolerated dose; SAD: single ascending dose; MAD: multiple ascending doses; NDA: new drug application; FDA: food and drug administration

Clinical trial phase | Type of the study | Nature of the study
Phase 0 | Exploratory | Examines very low (1/100th) concentrations (micro-dosing) of the drug for a short time. Studies the pharmacokinetics and determines the dose for phase I studies. Previously done in animals but now carried out in humans.
Phase I, Phase Ia, Phase Ib | Non-therapeutic trial | Around <50 healthy subjects are recruited. Establishes a safe dose range and the MTD. Examines the pharmacokinetic and pharmacodynamic effects. Usually single-center studies. Phase Ia: SAD and MTD; duration of one week to several months depending on the trial; includes 6-8 groups of 3-6 participants. Phase Ib: MAD, with the dose gradually narrowed down; three groups of 8 individuals each.
Phase II, Phase IIa, Phase IIb | Exploratory trial | Recruits around 5-100 patients of either sex. Examines the effective dosage and the therapeutic effects on patients. Decides the therapeutic regimen and drug-drug interactions. Usually multicentre studies. Phase IIa: decides the drug dosage, includes 20-30 patients, and takes up to weeks/months. Phase IIb: studies the dose-response relationship, drug-drug interactions, and comparison with a placebo.
Phase III | Therapeutic confirmatory trial | More than 300 patients (up to 3000) of either sex are recruited in these multicentric trials. This pre-marketing phase examines the efficacy and safety of the drug, comparing the test drug with the placebo/standard drug. Adverse drug reactions/adverse events are noted. Initiates the process of NDA with appropriate regulatory agencies like the FDA.
Phase IV | Post-approval study | After approval/post-licensure; post-marketing/surveillance studies. Patients are followed up for an exceptionally long time for potential adverse reactions and drug-drug interactions.

Clinical research designs are of two major types: non-interventional/observational and interventional/experimental studies. Non-interventional studies may have a comparator group (analytical studies like case-control and cohort studies) or be without one (descriptive studies). Experimental studies may be either randomized or non-randomized. Clinical trial designs are of several types, including parallel design, crossover design, factorial design, randomized withdrawal approach, adaptive design, superiority design, and non-inferiority design. The advantages and disadvantages of clinical trial designs are depicted in Table 2.

Trial design type | Type of the study | Nature of the study | Advantages/disadvantages
Parallel | Randomized | The most frequent design, wherein each arm of the study group is allocated a particular treatment (placebo (an inert substance)/therapeutic drug) | The placebo arm does not receive the trial drug, so it may not benefit from it
Crossover | Randomized | Each patient in this trial receives each drug, and the patients serve as their own controls | Avoids participant bias in treatment and requires a small sample size; this design is not suitable for research on acute diseases
Factorial | Non-randomized | Two or more interventions on the participants; the study can provide information on the interactions between the drugs | The study design is complex
Randomized withdrawal approach | Randomized | Evaluates the time/duration of the drug therapy | Uses a placebo to understand the efficacy of a drug in treating the disease
Matched pairs | Post-approval study | Recruits patients with the same characteristics | Less variability

There are different types of clinical trials, including those conducted for treatment, prevention, early detection/screening, and diagnosis. These studies address the activities of an investigational drug on a disease and its outcomes [4]. They assess whether the drug is able to prevent the disease/condition, the ability of a device to detect/screen the disease, and the efficacy of a medical test to diagnose the disease/condition. The pictorial representation of disease diagnosis, treatment, and prevention is depicted in Figure 1.

Figure 1: Pictorial representation of disease diagnosis, treatment, and prevention. This figure has been created by the authors.

Clinical trial designs can be adapted to ensure that the study's validity is maintained. Adaptive designs allow researchers to modify the trial while it is underway without compromising the integrity and validity of the results. Moreover, they allow flexibility in the conduct of the trial and the collection of data. Despite these advantages, adaptive designs have not been universally accepted among clinical researchers, which could be attributed to low familiarity with such designs in the research community. Adaptive designs have been applied during various phases of clinical trials and for different clinical conditions [5,6]. The adaptive designs applied during different phases are depicted in Figure 2.

Figure 2: Adaptive designs applied during different phases of clinical trials.

The Bayesian adaptive trial design gained popularity, especially during the Coronavirus Disease-19 (COVID-19) pandemic. Such designs can operate under a single master protocol, functioning as a platform trial wherein multiple treatments are tested on different patient groups suffering from a disease [7].
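To make this concrete, the following minimal sketch illustrates the kind of interim decision rule a Bayesian adaptive design might use. It assumes a hypothetical single-arm trial with a binary response endpoint, a flat Beta(1, 1) prior, an assumed historical control response rate of 30%, and an arbitrary 0.95 posterior-probability threshold; none of these numbers come from the article.

```python
from scipy.stats import beta

# Hypothetical interim results for one arm of a platform trial (illustrative values only).
responders, enrolled = 14, 30
historical_rate = 0.30          # assumed response rate under standard of care
prior_a, prior_b = 1, 1         # flat Beta(1, 1) prior on the response rate

# Posterior for the response rate: Beta(prior_a + responders, prior_b + non-responders).
posterior = beta(prior_a + responders, prior_b + enrolled - responders)

# Posterior probability that the experimental treatment beats the historical rate.
prob_better = 1 - posterior.cdf(historical_rate)
print(f"P(response rate > {historical_rate:.0%} | data) = {prob_better:.3f}")

# Example adaptive rule: stop the arm for efficacy if a pre-specified bound is crossed,
# otherwise continue enrolment to the next interim look.
decision = "stop arm for efficacy" if prob_better > 0.95 else "continue enrolment"
print(decision)
```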

In this review, we comprehensively discuss the essential elements of clinical research, including the principles of clinical research; planning clinical trials; practical aspects of clinical trial operations; essentials of clinical trial applications, monitoring, and audit; clinical trial data analysis, regulatory audits, and project management; clinical trial operations at the investigation site; the essentials of clinical trial experiments involving epidemiological and genetic studies; and ethical considerations in clinical research/trials.

A clinical trial involves the study of the effect of an investigational drug or any other intervention in a defined population/participant group. Clinical research includes a treatment group and a placebo group, and each group is evaluated for the efficacy of the intervention (improved/not improved) [8].

Clinical trials are broadly classified into controlled and uncontrolled trials. Uncontrolled trials are potentially biased, and the results of such research are not considered as reliable as those of controlled studies. Randomized controlled trials (RCTs) are considered the most effective clinical trials because bias is minimized and the results are considered reliable. There are different types of randomization, and each has clearly defined functions, as elaborated in Table 3. A minimal code sketch contrasting the first two schemes follows the table.

Randomization type | Functions
Simple randomization | The participants are assigned to a case or a control group based on coin-flip results/computer assignment
Block randomization | Equal and small groups of both cases and controls
Stratified randomization | Randomization based on the age of the participant and other covariates
Co-variate adaptive randomization/minimization | Sequential assignment of a new participant into a group based on the covariates
Randomization by body halves or paired organs (split-body trials) | One intervention is administered to one half of the body and the comparator intervention is assigned to the other half
Clustered randomization | Intervention is administered to clusters/groups by randomization to prevent contamination; either the active or the comparator intervention is administered to each group
Allocation by randomized consent (Zelen trials) | Patients are allocated to one of the two trial arms
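As an illustration of the first two rows of the table, the sketch below contrasts simple randomization with permuted-block randomization for a hypothetical two-arm trial; the arm labels, block size, and seed are arbitrary choices for the example and do not come from the article.

```python
import random

def simple_randomization(n, arms=("treatment", "control"), seed=42):
    """Assign each participant independently, like a computer-generated coin flip."""
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n)]

def block_randomization(n, arms=("treatment", "control"), block_size=4, seed=42):
    """Permuted blocks keep the arms balanced after every completed block."""
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n:
        block = list(arms) * (block_size // len(arms))  # equal allocation within each block
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n]

print(simple_randomization(8))  # balance is not guaranteed in small samples
print(block_randomization(8))   # balanced after each completed block of four participants
```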

Principles of clinical trial/research

Clinical trials or clinical research are conducted to improve the understanding of the unknown, test a hypothesis, and perform public health-related research [2,3]. This is mainly carried out by collecting data and analyzing them to derive conclusions. There are various types of clinical trials, which are broadly grouped as analytical, observational, and experimental research. Clinical research can also be classified into non-directed data capture, directed data capture, and drug trials. Clinical research can be prospective or retrospective. It may also be a case-control study or a cohort study. Clinical trials may be initiated to treat, prevent, observe, and diagnose a disease or a medical condition.

Among the various types of clinical research, observational research using a cross-sectional study design is the most frequently performed. This type of research is undertaken to analyze the presence or absence of a disease/condition, potential risk factors, and prevalence and incidence rates in a defined population. Clinical trials may be of the therapeutic or non-therapeutic type depending on the intervention. The therapeutic type of clinical trial uses a drug that may be beneficial to the patient, whereas in a non-therapeutic clinical trial, the participant does not benefit from the drug; non-therapeutic trials provide additional knowledge of the drug for future improvements. Different terminologies of clinical trials are delineated in Table 4.

Type of clinical trial | Definition
Randomized trial | Study participants are randomly assigned to a group
Open-label | Both the study subjects and the researchers are aware of the drug being tested
Blinded (single-blind) | The subject has no idea about the group (test/control) in which they are placed
Double-blind | Neither the subjects nor the investigator knows which is the test/control group
Placebo | A substance that appears like a drug but has no active moiety
Add-on | An additional drug, apart from the clinical trial drug, given to a group of study participants
Single center | A study carried out at a particular place/location/center
Multi-center | A study carried out at multiple places/locations/centers

In view of the increased cost of the drug discovery process, developing and low-income countries depend on the production of generic drugs. Generic drugs are similar in composition to the patented/branded drug. Once the patent period has expired, generic drugs with similar quality, strength, and safety to the patented drug can be manufactured [9]. According to the Food and Drug Administration (FDA), United States of America (USA), the regulatory requirements and the drug production process are almost the same for branded and generic drugs.

The bioequivalence (BE) studies review the absorption, distribution, metabolism, and excretion (ADME) of the generic drug. These studies compare the peak concentration of the drug (Cmax) at the desired location in the human body. The extent of absorption of the drug is measured using the area under the plasma concentration-time curve (AUC), and the generic drug is expected to demonstrate ADME behavior similar to that of the branded drug. BE studies may be undertaken in vitro (fasting, non-fasting, sprinkled fasting) or in vivo (clinical, bioanalytical, and statistical) [9].
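As a worked illustration of these metrics, the sketch below computes Cmax and the AUC (by the linear trapezoidal rule) for hypothetical plasma concentration-time profiles of a brand and a generic product and checks the resulting ratios against the conventional 80-125% bioequivalence acceptance window. All numbers are invented for the example; a real BE assessment would use geometric-mean ratios and confidence intervals across many subjects.

```python
# Hypothetical plasma concentration-time profiles (ng/mL at the given hours);
# the values are made up purely for illustration.
times        = [0, 0.5, 1, 2, 4, 8, 12]
conc_brand   = [0, 40, 85, 70, 45, 20, 8]
conc_generic = [0, 38, 80, 72, 47, 19, 9]

def cmax(conc):
    """Peak observed plasma concentration."""
    return max(conc)

def auc(t, conc):
    """Area under the concentration-time curve via the linear trapezoidal rule."""
    return sum((t[i + 1] - t[i]) * (conc[i + 1] + conc[i]) / 2 for i in range(len(t) - 1))

for label, value_generic, value_brand in [
    ("Cmax", cmax(conc_generic), cmax(conc_brand)),
    ("AUC", auc(times, conc_generic), auc(times, conc_brand)),
]:
    ratio = value_generic / value_brand
    verdict = "within" if 0.80 <= ratio <= 1.25 else "outside"
    print(f"{label} ratio (generic/brand) = {ratio:.2f} -> {verdict} the 80-125% window")
```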

Planning clinical trial/research

The clinical trial process involves protocol development, designing a case record/report form (CRF), and the functioning of institutional review boards (IRBs). It also includes data management and the monitoring of clinical trial site activities. The CRF is the most significant document in a clinical study: it contains the information collected by the investigator about each subject participating in the clinical study/trial. According to the International Council for Harmonisation (ICH), the CRF can be a printed, optical, or electronic document used to record the safety and efficacy of the pharmaceutical drug/product in the test subjects. This information is intended for the sponsor who initiates the clinical study [10].

The CRF is designed as per the protocol and is later thoroughly reviewed for its correctness (appropriate and structured questions) and finalized. The CRF is then printed, taking the language of the participating subjects into consideration. Once printed, it is distributed to the investigation sites, where it is filled with the details of the participating subjects by the investigator/nurse/subject/guardian of the subject/technician/consultant/monitors/pharmacist/pharmacokineticist/contract house staff. The filled CRFs are checked for completeness and transported to the sponsor [11].
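To illustrate what a CRF page and its completeness check might look like in electronic form, here is a minimal sketch; the page name, fields, and values are invented for the example and are not taken from any standard.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class VitalSignsCRF:
    """Illustrative CRF page; the field names are hypothetical."""
    subject_id: str
    visit: str
    systolic_bp: Optional[int] = None
    diastolic_bp: Optional[int] = None
    heart_rate: Optional[int] = None

    def missing_fields(self):
        """Completeness check of the kind performed before CRFs are sent to the sponsor."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

page = VitalSignsCRF(subject_id="S-001", visit="Week 4", systolic_bp=122, heart_rate=70)
print(page.missing_fields())  # ['diastolic_bp'] -> raise a query with the site
```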

Effective planning and implementation of a clinical study/trial influence its success. A clinical study largely involves the collection and distribution of the trial data, which is done by the clinical data management section. The project manager is crucial for effectively planning, organizing, and using the best processes to control and monitor the clinical study [10,11].

The clinical study is conducted by a sponsor or a clinical research organization (CRO). A sound protocol, time limits, and regulatory requirements assume significance while planning a clinical trial. What, when, how, and who are clearly defined before the initiation of a trial. Regular review of the project using bar and Gantt charts and maintenance of the timelines are increasingly significant for success with the products (study report, statistical report, database) [10,11].

The steps critical to planning a clinical trial include the idea, review of the available literature, identifying a problem, formulating the hypothesis, writing a synopsis, identifying the investigators, writing a protocol, finding a source of funding, designing a patient consent form, forming ethics boards, identifying an organization, preparing manuals for procedures, quality assurance, investigator training, and initiation of the trial by recruiting the participants [10].

The two most important points to consider before initiating a clinical trial are whether there is a need for the trial and, if so, whether the study design and methodology are strong enough for the results to be reliable and trustworthy to the public [11].

For clinical research to yield high-quality results, the study design, implementation of the study, quality assurance in data collection, and mitigation of bias and confounding factors must be robust [12]. Another important aspect of conducting a clinical trial is sound management of the various elements of clinical research, including human and financial resources. The role of the trial manager in making a clinical trial successful has previously been reported; the trial manager plays a key role in planning, coordinating, and successfully executing the trial. Desirable qualities of a trial manager include good communication and motivation, leadership, and strategic, tactical, and operational skills [13].

Practical aspects of clinical trial operations

There are different types of clinical research. Research on the development of a novel drug could be initiated as nationally funded research, industry-sponsored research, or clinical research initiated by individuals/investigators. According to 21 Code of Federal Regulations (CFR) 312.3 and ICH E-6 Good Clinical Practice (GCP) 1.54, an investigator is an individual who initiates and conducts clinical research [14]. The investigator plans, designs, conducts, monitors, manages data, compiles reports, and supervises research-related regulatory and ethical issues. To manage a successful clinical trial project, it is essential for an investigator to provide the letter of intent, write a proposal, set a timeline, develop a protocol and related documents such as the case record forms, define the budget, and identify the funding sources.

Other major steps of clinical research include the approval of IRBs, conduct and supervision of the research, data review, and analysis. Successful clinical research includes various essential elements such as a letter of intent, which is the evidence supporting the researcher's interest in conducting the drug research, a timeline, the funding source, suppliers, and participant characteristics.

According to the ICH and GCP guidelines, quality assurance must be implemented during clinical research to generate quality, accurate data. Each element of the clinical research must be carried out according to the standard operating procedures (SOPs), which are written/determined before the initiation of the study and during the preparation of the protocol [15].

The audit team (quality assurance group) is instrumental in determining the authenticity of the clinical research. The audit, according to the ICH and GCP, is carried out by an independent, external team that examines the processes of clinical research (recording the CRF, analysis of data, and interpretation of data). Quality assurance personnel are adequately trained, may become trainers if needed, should be good communicators, and must be able to handle any kind of situation. Audits can take place at the investigator sites, evaluating the CRF data, the protocol, and the personnel involved in the clinical research (source data verification, monitors) [16].

Clinical trial operations are governed by legal and regulatory requirements, are based on GCP, and require the application of science, technology, and interpersonal skills [17]. Clinical trial operations are complex, time- and resource-specific, and require extensive planning and coordination, especially for research conducted at multiple trial centers [18].

Recruiting the clinical trial participants/subjects is the most significant aspect of clinical trial operations. Previous research has noted that most clinical trials do not meet the participant numbers decided in the protocol; it is therefore important to identify the potential barriers to patient recruitment [19].

Most clinical trials demand huge costs, long timelines, and substantial resources. An analysis of the costs of randomized clinical trials from Switzerland revealed a cost of approximately 72,000 USD for a clinical trial to be completed. This study emphasized the need for increased transparency with respect to the costs associated with clinical trials and improved collaboration between collaborators and stakeholders [20].

Clinical trial applications, monitoring, and audit

Among the most significant aspects of a clinical trial is the audit. An audit is a systematic process of evaluating the clinical trial operations at the site. The audit ensures that the clinical trial is conducted according to the protocol and predefined quality system procedures, follows GCP guidelines, and meets the requirements of the regulatory authorities [21].

The auditors are supposed to be independent and work without the involvement of the sponsors, CROs, or personnel at the trial site. The auditors ensure that the trial is conducted by designated, professionally qualified, adequately trained personnel with predefined responsibilities. The auditors also verify the validity of the investigational drug, the composition and functioning of the institutional review/ethics committees, and the availability and correctness of documents such as the investigator's brochure, informed consent forms, CRFs, approval letters of the regulatory authorities, and the accreditation of the trial labs/sites [21].

Reviewing the data management systems, the data collection software, data backup, recovery, and contingency plans, alternative data recording methods, data security, personnel training in data entry, and the statistical methods used to analyze the results of the trial are other important responsibilities of the auditor [21,22].

According to the ICH-GCP Sec 1.29 guidelines, an inspection may be described as an act by the regulatory authorities to conduct an official review of the clinical trial-related documents, personnel (sponsor, investigator), and the trial site [21,22]. The inspectors summarize their observations using various forms, as listed in Table 5.

FDA: Food and Drug Administration; IND: investigational new drug; NDA: new drug application; IRB: institutional review board; CFR: code of federal regulations

Regulatory (FDA) form number | Components of the form
483 | List of objectionable conditions/processes prepared by the FDA investigator and submitted to the auditee at the end of the inspection
482 | The auditors submit their identity proofs and notice of inspection to the clinical investigators and later document their observations
1571 | This document details the fact that the clinical trial is not initiated before 30 days of submitting the IND to the FDA for approval. The form confirms that the IRB complies with 21 CFR Part 56. The form details the agreement to follow regulatory requirements and names all the individuals who monitor the conduct and progress of the study and evaluate the safety of the clinical trial
1572 | This form details the fact that the study is conducted after ethics approval and ensures that the study is carried out according to the protocol, informed consent, and IRB approval

Because protecting data integrity and the rights, safety, and well-being of the study participants is of the utmost significance while conducting a clinical trial, regular monitoring and audit of the process are crucial. The quality of the clinical trial also greatly depends on the approach of the trial personnel, which includes the sponsors and investigators [21].

The responsibility for monitoring lies in different hands, depending on the clinical trial site. When the trial is initiated by the pharmaceutical industry, the responsibility for trial monitoring rests with the company or the sponsor, and when the trial is conducted by an academic organization, the responsibility lies with the principal investigator [21].

An audit is a process conducted by an independent body to ensure the quality of the study. Basically, an audit is a quality assurance process that determines whether a study is carried out by following the SOPs, in compliance with the GCP recommended by regulatory bodies like the ICH, the FDA, and other local bodies [21].

An audit is performed to review all the available documents related to the IRB approval, the investigational drug, and patient care/case record forms. Other documents that are audited include the protocol (date, signature, treatment, compliance), informed consent form, treatment response/outcome, toxic response/adverse event recording, and the accuracy of data entry [22].

Clinical trial data analysis, regulatory audits, and project management

The essential elements of clinical trial data management systems (CDMS) include the management of the study, the site, staff, subjects, contracts, data, and documents; patient diary integration; medical coding; monitoring; adverse event reporting; supplier management; lab data; external interfaces; and randomization. The CDMS involves setting defined start and finishing times, defining study objectives, setting enrolment and termination criteria, commenting, and managing the study design [23].

Among the various key application areas of clinical trial systems, data analysis assumes increased significance. The clinical trial data collected at the site in the form of case record forms are stored in the CDMS, where double data entry helps ensure that entry errors are minimized.
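The sketch below illustrates the idea behind double data entry: the same CRF page is keyed in twice by independent operators, and the system flags any fields where the two passes disagree so they can be adjudicated against the source document. The field names and values are hypothetical.

```python
# Two independent entry passes of the same CRF page (illustrative values only).
first_pass  = {"subject_id": "S-001", "visit": "Baseline", "weight_kg": 72.5}
second_pass = {"subject_id": "S-001", "visit": "Baseline", "weight_kg": 75.2}

def compare_entries(entry_a, entry_b):
    """Return the fields where the two passes disagree, for manual adjudication."""
    return {
        key: (entry_a.get(key), entry_b.get(key))
        for key in entry_a.keys() | entry_b.keys()
        if entry_a.get(key) != entry_b.get(key)
    }

print(compare_entries(first_pass, second_pass))  # {'weight_kg': (72.5, 75.2)} -> raise a data query
```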

Clinical trial data management uses medical coding, which applies standard terminologies to the medications and adverse events/serious adverse events entered into the CDMS. The project undertaken to conduct the clinical trial must have predetermined timelines and milestones. Timelines are usually set for the preparation of the protocol, designing the CRF, planning the project, identifying the first subject, and recording the patient's data for the first visit.

Timelines are also set for the last subject to be recruited into the study, the CRF of the last subject, and the locked period after the last subject's entry. The planning of the project also covers the modes of data collection and the methods of transporting the CRFs, patient diaries, and records of severe adverse events to the central data management sites (fax, scan, courier, etc.) [24].

The preparation of SOPs and the type and timing of the quality control (QC) procedures are also included in the project planning before the start of a clinical study. Review (budget, resources, quality of process, assessment), measurement (turnaround times, training issues), and control (CRF collection and delivery, incentives, revising the process) are the three important aspects of implementing a clinical research project.

In view of the increasing complexity of conducting clinical trials, it is important to perform a clinical quality assurance (CQA) audit. The CQA audit process consists of a detailed plan for conducting audits, identifying points of improvement, generating meaningful audit results, verifying SOP and regulatory compliance, and promoting improvement in clinical trial research [25]. All the components of a CQA audit are delineated in Table 6.

CRF: case report form; CSR: clinical study report; IC: informed consent; PV: pharmacovigilance; SAE: serious adverse event

Product-specific audits program: Protocol, CRF, IC, CSR; Supplier; Clinical database; Investigator site; Clinical site visit; Study management
Supplier audits program: Supplier qualification; Sponsor data audit during the trial; Preferred vendor list after the trials
Process/System audits program: Data management; Clinical supply; Study monitoring; Computerized system
Pharmacovigilance audits program: Safety data management; Communications and regulatory reporting; Signal detection and evaluation; SAE reporting; Risk management and PV planning; Computerized system; Suppliers; Clinical safety reporting
Regulatory inspection management program: Assist with the audit response; Pre-inspection audit

Clinical trial operations at the investigator's site

The selection of an investigation site is important before starting a clinical trial. It is essential that the individuals recruited for the study meet the inclusion criteria of the trial and that both the investigator and the patients are willing to accept the protocol design and the timelines set by the regulatory authorities, including the IRBs.

Before conducting clinical research, it is important for an investigator to agree to the terms and conditions of the agreement and to maintain the confidentiality of the protocol. During the site selection visit, the sponsor evaluates the feasibility of the protocol with respect to the available resources, infrastructure, qualified and trained personnel, availability of the study subjects, and benefit to the institution and the investigator.

The standards of a clinical research trial are ensured by the Council for International Organizations of Medical Sciences (CIOMS), the National Bioethics Advisory Commission (NBAC), the United Nations Programme on Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome (HIV/AIDS) (UNAIDS), and the World Medical Association (WMA) [26].

Recommendations for conducting clinical research from the WMA support the declaration that "The health of my patient will be my first consideration." According to the International Code of Medical Ethics (ICME), no human should be physically or mentally harmed during a clinical trial, and the study should be conducted in the best interest of the person [26].

Basic principles recommended by the Helsinki declaration include conducting clinical research only after prior proof of the safety of the drug in animal and laboratory experiments. Clinical trials must be performed by scientifically and medically qualified, well-trained personnel. Also, it is important to weigh the benefit of the research against the harm to the participants before initiating drug trials.

Doctors may prescribe a drug to alleviate the suffering of the patient, save the patient from death, or gain additional knowledge of the drug only after obtaining informed consent. Under the equipoise principle, the investigators must be able to justify the treatment provided as a part of the clinical trial, since the patient in the placebo arm may be harmed by the unavailability of the therapeutic/trial drug.

Clinical trial operations greatly depend on the environmental conditions and geographical attributes of the trial site, which may influence the costs and targets defined before the project is initiated. It has been noted that one-fourth of clinical trial project proposals/applications submit critical data on the investigational drug from outside the country. It has also been noted that almost 35% of delays in clinical trials are owing to patient recruitment, with one-third of studies enrolling only 5% of the participants [27].

It has been suggested that a clinical trial feasibility assessment in a defined geographical region may be undertaken to improve the chances of success. Points to be considered under the feasibility assessment program include whether the disease under study is relevant to the population of the geographical region, the appropriateness of the study design, the patient and comparator groups, visit intervals, potential regulatory and ethical challenges, and the commitments of the study partners and CROs in the respective countries (multi-centric studies) [27].

Feasibility assessments may be undertaken at the program level (ethics, regulatory, and medical preparedness), the study level (clinical, regulatory, technical, and operational aspects), and the investigation site level (investigational drug, competency of personnel, participant recruitment and retention, quality systems, and infrastructural aspects) [27].

Clinical trials: true experiments

In accordance with the revised Schedule "Y" of the Drugs and Cosmetics Act (DCA) (2005), a drug trial may be defined as a systematic study of a novel drug component. Clinical trials aim to evaluate the pharmacodynamic and pharmacokinetic properties, including ADME, as well as the efficacy and safety of new drugs.

According to the Drugs and Cosmetics Rules (DCR), 1945, a new chemical entity (NCE) may be defined as a novel drug approved for a disease/condition, in a specified route, and at a particular dosage. It may also be a new combination of previously approved drugs.

A clinical trial may be of three types: one conducted to establish the efficacy of an NCE, a comparison study of two drugs against a medical condition, and clinical research on approved drugs for a disease/condition. In addition, bioavailability and BE studies of generic drugs, and of drugs already approved in other countries, are done to establish the efficacy of new drugs [28].

Apart from the discovery of novel drugs, clinical trials are also conducted to approve novel medical devices for public use. A medical device is any instrument, apparatus, appliance, software, or other material used for diagnostic or therapeutic purposes. Medical devices are divided into three classes: class I requires general controls; class II requires general and special controls; and class III requires general controls, special controls, and premarket approval [ 28 ].

Premarket approval applications ensure safety and effectiveness and confirm the activities from bench to animal to human clinical studies. The FDA grants an investigational device exemption (IDE) for a device that is not yet approved for a new indication/disease/condition. There are two types of IDE studies: the feasibility study (basic safety and potential effectiveness) and the pivotal study (trial endpoints, randomization, monitoring, and a statistical analysis plan) [ 28 ].

As evidenced by the available literature, research is of two types: observational and experimental. Experimental research, alternatively known as true research, involves an intervention such as a new drug, device, or method (as in educational research). Most true experiments use randomized controlled trials, which remove bias and neutralize confounding variables that may interfere with the results of the research [ 28 ].

The variables that may influence study results are independent variables, also called predictor variables (the intervention), dependent variables (the outcome), and extraneous variables (other confounding factors that could influence the outcome). True experiments have three basic elements: manipulation (of the independent variable), control (over extraneous influences), and randomization (unbiased grouping) [ 29 ].

Experiments can also be grouped as true, quasi-experimental, and non-experimental studies depending on the presence of specific characteristic features. True experiments have all three elements of study design (manipulation, control, and randomization), are prospective, and have high scientific validity. Quasi-experiments generally have two elements of design (manipulation and control), are prospective, and have moderate scientific validity. Non-experimental studies lack manipulation, control, and randomization, are generally retrospective, and have low scientific validity [ 29 ].

Clinical trials: epidemiological and human genetics study

Epidemiological studies aim to control health problems by understanding the distribution, determinants, incidence, prevalence, and health impact of disease in a defined population. Such studies address the status of infectious as well as non-communicable diseases [ 30 ].

Epidemiological studies are of two types: observational (cross-sectional studies (surveys), case-control studies, and cohort studies) and experimental (randomized controlled studies) [ 3 , 31 ]. Such research may pose ethical challenges in relation to the social and cultural milieu.

Biomedical research related to human genetics and transplantation raises heightened ethical concerns, especially after the success of the Human Genome Project (HGP) in the year 2000. The benefits of human genetic studies are innumerable and include the identification of genetic diseases, in vitro fertilization, and regenerative therapy. Nevertheless, research related to human genetics poses ethical, legal, and social issues (ELSI) that need to be appropriately addressed. Most importantly, such genetic studies use advanced technologies that should be equally available to both economically well-placed and financially deprived people [ 32 ].

Gene therapy and genetic manipulation may precipitate conflicts of interest among family members. Genetic research takes various forms, including pedigree studies (identifying carriers of abnormal genes), genetic screening (for diseases that may be inherited by children), gene therapeutics (gene replacement therapy, gene construct administration), the HGP (sequencing the whole human genome/deoxyribonucleic acid (DNA) fingerprinting), and DNA and cell-line banking/repositories [ 33 ]. Biobanks are established to collect and store human tissue samples such as umbilical tissue, cord blood, and others [ 34 ].

Genetic epidemiological studies attempt to understand the prevalence of diseases that may be transmitted within families. Classical epidemiological studies include single-case observations (one individual), case series (<10 individuals), ecological studies (a population or large group of people), and cross-sectional, case-control, cohort, and interventional studies (each involving a defined number of individuals) [ 35 ].

Genetic studies are of different types, including familial aggregation (case-parent, case-parent-grandparent), heritability (twin studies), segregation (pedigree studies), linkage studies (case-control), association and linkage disequilibrium studies, case-only cohort studies (related case-control, unrelated case-control, exposure and non-exposure groups, case group), cross-sectional studies, association cohort studies (related case-control, familial cohort), and experimental retrospective cohort studies (clinical trial, exposure and non-exposure groups) [ 35 ].

Ethics and concerns in clinical trial/research

Because clinical research involves animals and human participants, adherence to ethics and ethical practices assumes increased significance [ 36 ]. In view of the unethical experiments conducted on human subjects during the Second World War, the Nuremberg Code was introduced in 1947; it promulgated rules for permissible medical experiments on humans. The Nuremberg Code states that informed consent is mandatory for all participants in a clinical trial and that study subjects must be made aware of the nature, duration, and purpose of the study and its potential health hazards (foreseen and unforeseen). Study subjects should have the liberty to withdraw at any time during the trial and to choose a physician in a medical emergency. The other essential principles of clinical research involving human subjects suggested by the Nuremberg Code include benefit to society, justification of the study by the results of prior drug experiments on animals, avoidance of even minimal suffering to participants, assurance that participants face no risk to life, humanity first, improved medical facilities for participants, and suitably qualified investigators [ 37 ].

During the 18th World Medical Assembly, held in Helsinki, Finland, in 1964, ethical principles for doctors practicing research were proposed. The Declaration of Helsinki, as it came to be known, ensured that the interests and concerns of human participants always prevail over the interests of society. Later, in 1974, the National Research Act was passed, which ensured that research proposals are thoroughly screened by an Institutional Ethics/Review Board. On April 18, 1979, the Belmont Report was issued by the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. The Belmont Report proposed three core principles for research involving human participants: respect for persons, beneficence, and justice. The ICH subsequently laid down the GCP guidelines [ 38 ]. These guidelines are followed throughout the world in the conduct of clinical research involving human participants.

The ICH was first founded in 1991, in Brussels, under the umbrella of the USA, Japan, and European countries. The ICH conference is convened once every two years with participation from the member countries, observers from regulatory agencies such as the World Health Organization (WHO), the European Free Trade Association (EFTA), and the Canadian Health Protection Branch, and other interested stakeholders from academia and industry. The expert working groups of the ICH ensure the quality, efficacy, and safety of medicinal products (drugs/devices). In addition to the Nuremberg Code, the Belmont Report, and the ICH-GCP guidelines, the International Ethical Guidelines for Biomedical Research Involving Human Subjects were proposed in 1982 by CIOMS in association with the WHO [ 39 ]. The CIOMS guidelines protect the rights of vulnerable populations and ensure ethical practices during clinical research, especially in underdeveloped countries [ 40 ]. In India, ethical principles for biomedical research involving human subjects were introduced by the Indian Council of Medical Research (ICMR) in the year 2000 and were later amended in 2006 [ 41 ]. Since 2013, clinical trial approvals can only be granted by an IRB approved by the Drugs Controller General of India (DCGI) [ 42 ].

Current perspectives and future implications

A recent study evaluated the usefulness of adaptive clinical trial designs in predicting the success of a drug entering phase 3 and in minimizing the time and cost of drug development. The study also highlighted drawbacks of such designs, including the possibility of type 1 (false-positive) and type 2 (false-negative) errors [ 43 ].
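
As a simple illustration of why flexible designs need pre-specified adjustments, the following minimal simulation sketch (not taken from the cited study; the sample sizes, number of interim looks, and random seed are illustrative assumptions) repeatedly analyzes a two-arm trial in which the drug has no true effect and counts how often at least one unadjusted interim look appears "significant", i.e., the empirical type 1 error rate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

def trial_rejects(n_per_arm, n_looks, alpha=0.05):
    """Simulate one two-arm trial with no true treatment effect and test it
    at several unadjusted interim looks; return True if any look is
    'significant' (i.e., a false-positive finding)."""
    control = rng.normal(0.0, 1.0, n_per_arm)
    treatment = rng.normal(0.0, 1.0, n_per_arm)  # no real effect
    looks = np.linspace(n_per_arm // n_looks, n_per_arm, n_looks, dtype=int)
    for n in looks:
        _, p_value = stats.ttest_ind(treatment[:n], control[:n])
        if p_value < alpha:
            return True
    return False

n_sim = 5000
for n_looks in (1, 2, 5):
    rate = np.mean([trial_rejects(200, n_looks) for _ in range(n_sim)])
    print(f"{n_looks} unadjusted look(s) -> empirical type 1 error approx. {rate:.3f}")
```

With a single final analysis the false-positive rate stays near the nominal 5%, whereas repeated unadjusted looks inflate it; this is why adaptive designs rely on pre-planned alpha-spending or group-sequential boundaries.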

The usefulness of animal studies during the preclinical phases of drug development was evaluated in a previous study, which concluded that animal studies cannot completely guarantee the safety of an investigational drug. This is evidenced by the fact that many drugs that passed toxicity tests in animals produced adverse reactions in humans [ 44 ].

The significance of BE studies in comparing branded and generic drugs has been reported previously. Pharmacokinetic BE studies of amoxicillin comparing branded and generic formulations were carried out in a group of healthy participants; the results demonstrated that the generic drug had a lower Cmax than the branded drug [ 45 ].

To establish the BE of generic drugs, randomized crossover trials are carried out to assess Cmax and AUC. For a generic (test) drug to be considered bioequivalent to a branded (reference) drug, the test-to-reference ratio of each pharmacokinetic characteristic (AUC and/or Cmax) should be close to 1:1; in practice, regulatory agencies typically require the 90% confidence interval of the geometric mean ratio to lie within 80%-125% [ 46 ].
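
As a rough numerical illustration of this criterion, the sketch below computes a geometric mean ratio (GMR) and its 90% confidence interval from hypothetical paired AUC values; the numbers, subject count, and simplified paired analysis are assumptions for illustration only, and a real crossover BE analysis would additionally model period and sequence effects (typically via ANOVA on the log scale).

```python
import numpy as np
from scipy import stats

# Hypothetical AUC values (ng*h/mL) for 12 subjects who received both the
# test (generic) and reference (branded) formulation in a crossover study.
auc_test = np.array([102, 95, 110, 98, 105, 92, 99, 107, 101, 96, 104, 100], float)
auc_ref  = np.array([100, 98, 108, 101, 103, 95, 97, 110, 99, 94, 106, 102], float)

# Work on the log scale: the mean within-subject log-difference corresponds
# to the geometric mean ratio test/reference after exponentiation.
log_diff = np.log(auc_test) - np.log(auc_ref)
n = len(log_diff)
mean_d = log_diff.mean()
se_d = log_diff.std(ddof=1) / np.sqrt(n)

# 90% two-sided CI (equivalent to two one-sided tests at the 5% level).
t_crit = stats.t.ppf(0.95, df=n - 1)
gmr = np.exp(mean_d)
ci_low, ci_high = np.exp(mean_d - t_crit * se_d), np.exp(mean_d + t_crit * se_d)

print(f"GMR = {gmr:.3f}, 90% CI = ({ci_low:.3f}, {ci_high:.3f})")
print("Within 80%-125% acceptance range:", ci_low >= 0.80 and ci_high <= 1.25)
```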

Although generic drug development is comparatively more beneficial than that of branded drugs, the synthesis of extended-release formulations of generic drugs appears to be complex. Because extended-release formulations remain in the stomach for longer periods, they may be influenced by gastric acidity and interact with food. A recent study suggested the use of bio-relevant dissolution tests to increase the success of producing generic extended-release drug formulations [ 47 ].

Although RCTs are considered the best designs for ruling out bias, and the data/results obtained from such clinical research are the most reliable, RCTs may still be affected by misestimation of treatment outcomes/bias and by problems of co-interventions and contamination [ 48 ].

The perception of healthcare providers regarding branded drugs, and their views about generic equivalents, were recently analyzed and reported. It was noted that this perception may be attributed to the more flexible regulatory requirements for the approval of a generic drug compared with a branded drug, and also to the fact that a switch from a branded to a generic drug may precipitate adverse events in patients, as evidenced by previous reports [ 49 ].

Because vulnerable populations such as drug/alcohol addicts, mentally challenged people, children, the elderly, military personnel, ethnic minorities, people suffering from incurable diseases, students, employees, and pregnant women may not be able to make independent decisions about participating in a clinical trial, ethical concerns and legal issues may arise and must be appropriately addressed before drug trials that include such groups [ 50 ].

Conclusions

Clinical research and clinical trials are important from the public health perspective. Clinical research helps scientists, public health administrations, and the public to improve their understanding of, and preparedness for, the diseases prevalent in different geographical regions of the world. Moreover, clinical research helps mitigate health-related problems, as evidenced by the current Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) pandemic and other emerging and re-emerging microbial infections. Clinical trials are crucial to the development of drugs, devices, and vaccines. Therefore, scientists need to stay up to date with the processes and procedures of clinical research and trials, as discussed comprehensively in this review.

The authors have declared that no competing interests exist.
