A guide to critical appraisal of evidence

Fineout-Overholt, Ellen PhD, RN, FNAP, FAAN

Ellen Fineout-Overholt is the Mary Coulter Dowdy Distinguished Professor of Nursing at the University of Texas at Tyler School of Nursing, Tyler, Tex.

The author has disclosed no financial relationships related to this article.

Critical appraisal is the assessment of research studies' worth to clinical practice. Critical appraisal—the heart of evidence-based practice—involves four phases: rapid critical appraisal, evaluation, synthesis, and recommendation. This article reviews each phase and provides examples, tips, and caveats to help evidence appraisers successfully determine what is known about a clinical issue. Patient outcomes are improved when clinicians apply a body of evidence to daily practice.

How do nurses assess the quality of clinical research? This article outlines a stepwise approach to critical appraisal of research studies' worth to clinical practice: rapid critical appraisal, evaluation, synthesis, and recommendation. When critical care nurses apply a body of valid, reliable, and applicable evidence to daily practice, patient outcomes are improved.


Critical care nurses can best explain the reasoning for their clinical actions when they understand the worth of the research supporting their practices. In critical appraisal, clinicians assess the worth of research studies to clinical practice. Given that achieving improved patient outcomes is the reason patients enter the healthcare system, nurses must be confident their care techniques will reliably achieve best outcomes.

Nurses must verify that the information supporting their clinical care is valid, reliable, and applicable. Validity of research refers to the quality of the research methods used, or how well researchers conducted a study. Reliability of research means similar outcomes can be achieved when the care techniques of a study are replicated by clinicians. Applicability of research means it was conducted in a sample similar to the patients to whom the findings will be applied. These three criteria determine a study's worth in clinical practice.

Appraising the worth of research requires a standardized approach. This approach applies to both quantitative research (research that deals with counting things and comparing those counts) and qualitative research (research that describes experiences and perceptions). The word critique has a negative connotation. In the past, some clinicians were taught that studies with flaws should be discarded. Today, it is important to consider all valid and reliable research informative to what we understand as best practice. Therefore, the author developed the critical appraisal methodology that enables clinicians to determine quickly which evidence is worth keeping and which must be discarded because of poor validity, reliability, or applicability.

Evidence-based practice process

The evidence-based practice (EBP) process is a seven-step problem-solving approach that begins with data gathering (see Seven steps to EBP). During daily practice, clinicians gather data supporting inquiry into a particular clinical issue (Step 0). The description is then framed as an answerable question (Step 1) using the PICOT question format (Population of interest; Issue of interest or intervention; Comparison to the intervention; desired Outcome; and Time for the outcome to be achieved). 1 Consistently using the PICOT format helps ensure that all elements of the clinical issue are covered. Next, clinicians conduct a systematic search to gather data answering the PICOT question (Step 2). Using the PICOT framework, clinicians can systematically search multiple databases to find available studies to help determine the best practice to achieve the desired outcome for their patients. When the systematic search is completed, the work of critical appraisal begins (Step 3). The known group of valid and reliable studies that answers the PICOT question is called the body of evidence and is the foundation for the best practice implementation (Step 4). Next, clinicians evaluate integration of best evidence with clinical expertise and patient preferences and values to determine if the outcomes in the studies are realized in practice (Step 5). Because healthcare is a community of practice, it is important that experiences with evidence implementation be shared, whether the outcome is what was expected or not. This enables critical care nurses concerned with similar care issues to better understand what has been successful and what has not (Step 6).
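For teams that track clinical questions electronically, the five PICOT elements can be held in a small record so that a blank element is easy to spot before searching begins. The following is only a minimal sketch in Python: the class name, field names, question wording, and example values are illustrative assumptions, not part of the PICOT format itself.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PicotQuestion:
    """One clinical question split into its five PICOT elements."""

    population: str    # P: population of interest
    intervention: str  # I: issue of interest or intervention
    comparison: str    # C: comparison to the intervention
    outcome: str       # O: desired outcome
    time: str          # T: time for the outcome to be achieved

    def missing_elements(self) -> list[str]:
        """Return the names of any elements that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def as_question(self) -> str:
        """Render the elements as one intervention-style question (assumed wording)."""
        return (
            f"In {self.population}, how does {self.intervention} "
            f"compared with {self.comparison} affect {self.outcome} "
            f"within {self.time}?"
        )


# Hypothetical example values, not drawn from any study.
question = PicotQuestion(
    population="mechanically ventilated adults",
    intervention="music therapy",
    comparison="usual care",
    outcome="oxygen saturation",
    time="the first 48 hours of admission",
)
print(question.as_question())
print("Missing elements:", question.missing_elements())
```

Keeping each element in its own field also makes it straightforward to reuse the same terms as search keywords in Step 2.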

Critical appraisal of evidence

The first phase of critical appraisal, rapid critical appraisal, begins with determining which studies will be kept in the body of evidence. All valid, reliable, and applicable studies on the topic should be included. This is accomplished using design-specific checklists with key markers of good research. When clinicians determine a study is one they want to keep (a “keeper” study) and that it belongs in the body of evidence, they move on to phase 2, evaluation. 2

In the evaluation phase, the keeper studies are put together in a table so that they can be compared as a body of evidence, rather than individual studies. This phase of critical appraisal helps clinicians identify what is already known about a clinical issue. In the third phase, synthesis, certain data that provide a snapshot of a particular aspect of the clinical issue are pulled out of the evaluation table to showcase what is known. These snapshots of information underpin clinicians' decision-making and lead to phase 4, recommendation. A recommendation is a specific statement based on the body of evidence indicating what should be done—best practice. Critical appraisal is not complete without a specific recommendation. Each of the phases is explained in more detail below.

Phase 1: Rapid critical appraisal. Rapid critical appraisal involves using two tools that help clinicians determine if a research study is worthy of keeping in the body of evidence. The first tool, General Appraisal Overview for All Studies (GAO), covers the basics of all research studies (see Elements of the General Appraisal Overview for All Studies). Sometimes, clinicians find gaps in knowledge about certain elements of research studies (for example, sampling or statistics) and need to review some content. Conducting an internet search for resources that explain how to read a research paper, such as an instructional video or step-by-step guide, can be helpful. Finding basic definitions of research methods often helps resolve identified gaps.

To accomplish the GAO, it is best to begin with finding out why the study was conducted and how it answers the PICOT question (for example, does it provide information critical care nurses want to know from the literature). If the study purpose helps answer the PICOT question, then the type of study design is evaluated. The study design is compared with the hierarchy of evidence for the type of PICOT question. The higher the design falls within the hierarchy or levels of evidence, the more confidence nurses can have in its findings, if the study was conducted well. 3,4 Next, find out what the researchers wanted to learn from their study. These are called the research questions or hypotheses. Research questions are just what the name implies: when insufficient information from theories or the literature is available to guide an educated guess, a question is asked. Hypotheses are reasonable expectations, guided by understanding from theory and other research, that predict what will be found when the research is conducted. The research questions or hypotheses provide the purpose of the study.

Next, the sample size is evaluated. Expectations of sample size are present for every study design. As an example, consider as a rule that quantitative study designs operate best when there is a sample size large enough to establish that relationships do not exist by chance. In general, the more participants in a study, the more confidence in the findings. Qualitative designs operate best with fewer people in the sample because these designs represent a deeper dive into the understanding or experience of each person in the study. 5 It is always important to describe the sample, as clinicians need to know if the study sample resembles their patients. It is equally important to identify the major variables in the study and how they are defined because this helps clinicians best understand what the study is about.

The final step in the GAO is to consider the analyses that answer the study research questions or confirm the study hypothesis. This is another opportunity for clinicians to learn, as learning about statistics in healthcare education has traditionally focused on conducting statistical tests as opposed to interpreting statistical tests. Understanding what the statistics indicate about the study findings is an imperative of critical appraisal of quantitative evidence.

The second tool is one of a variety of rapid critical appraisal checklists that speak to validity, reliability, and applicability of specific study designs, which are available at varying locations (see Critical appraisal resources). When choosing a checklist to implement with a group of critical care nurses, it is important to verify that the checklist is complete and simple to use. Be sure to check that the checklist addresses three key questions. The first question is: Are the results of the study valid? Related subquestions should help nurses discern if certain markers of good research design are present within the study. For example, identifying that study participants were randomly assigned to study groups is an essential marker of good research for a randomized controlled trial. Checking these essential markers helps clinicians quickly review a study to check off these important requirements. Clinical judgment is required when the study lacks any of the identified quality markers. Clinicians must discern whether the absence of any of the essential markers negates the usefulness of the study findings. 6-9


The second question is: What are the study results? This is answered by reviewing whether the study found what it was expecting to and if those findings were meaningful to clinical practice. Basic knowledge of how to interpret statistics is important for understanding quantitative studies, and basic knowledge of qualitative analysis greatly facilitates understanding those results. 6-9

The third question is: Are the results applicable to my patients? Answering this question involves consideration of the feasibility of implementing the study findings into the clinicians' environment as well as any contraindication within the clinicians' patient populations. Consider issues such as organizational politics, financial feasibility, and patient preferences. 6-9

When these questions have been answered, clinicians must decide whether to keep the particular study in the body of evidence. Once the final group of keeper studies is identified, clinicians are ready to move into the next phase of critical appraisal. 6-9

Phase 2: Evaluation. The goal of evaluation is to determine how studies within the body of evidence agree or disagree by identifying common patterns of information across studies. For example, an evaluator may compare whether the same intervention is used or if the outcomes are measured in the same way across all studies. A useful tool to help clinicians accomplish this is an evaluation table. This table serves two purposes: first, it enables clinicians to extract data from the studies and place the information in one table for easy comparison with other studies; and second, it eliminates the need for further searching through piles of periodicals for the information. (See Bonus Content: Evaluation table headings.) Although the information for each of the columns may not be what clinicians consider as part of their daily work, the information is important for them to understand about the body of evidence so that they can explain the patterns of agreement or disagreement they identify across studies. Further, the in-depth understanding of the body of evidence from the evaluation table helps with discussing the relevant clinical issue to facilitate best practice. Their discussion comes from a place of knowledge and experience, which affords the most confidence. The patterns and in-depth understanding are what lead to the synthesis phase of critical appraisal.

The key to a successful evaluation table is simplicity. Entering data into the table in a simple, consistent manner offers more opportunity for comparing studies. 6-9 For example, using abbreviations rather than complete sentences in all columns except the final one allows for ease of comparison. Consistency also applies to how variables are named. An example might be the dependent variable of depression defined as “feelings of severe despondency and dejection” in one study and as “feeling sad and lonely” in another study. 10 Because these are two different definitions, they need to be treated as different dependent variables. Clinicians must use their clinical judgment to discern that these different dependent variables require different names and abbreviations and how these further their comparison across studies.


Sample and theoretical or conceptual underpinnings are important to understanding how studies compare. Similar samples and settings across studies increase agreement. Several studies with the same conceptual framework increase the likelihood of common independent variables and dependent variables. The findings of a study are dependent on the analyses conducted. That is why an analysis column is dedicated to recording the kind of analysis used (for example, the name of the statistical analyses for quantitative studies). Only statistics that help answer the clinical question belong in this column. The findings column must have a result for each of the analyses listed, reported as the actual results rather than in words. For example, if a clinician lists a t-test as a statistic in the analysis column, the findings column should report a t-value that reflects whether the groups are different as well as a probability (P-value or confidence interval) that reflects statistical significance. The explanation for these results would go in the last column that describes worth of the research to practice. This column is much more flexible and contains other information such as the level of evidence, the study's strengths and limitations, any caveats about the methodology, or other aspects of the study that would be helpful to its use in practice. The final piece of information in this column is a recommendation for how this study would be used in practice. Each of the studies in the body of evidence that addresses the clinical question is placed in one evaluation table to facilitate the ease of comparing across the studies. This comparison sets the stage for synthesis.
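Clinicians who keep their evaluation table in a spreadsheet or script may find it useful to check the rule that every analysis listed has a matching entry in the findings column. The sketch below is one possible way to do that in Python. The column names are assumptions based on the description above (the actual table headings are in the bonus content), and the study and its values are made-up placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class EvaluationRow:
    """One study's row in an evaluation table (column names are assumed)."""

    citation: str
    design: str
    sample_setting: str
    framework: str
    independent_vars: list[str]
    dependent_vars: list[str]
    analyses: list[str]
    # Maps each analysis name to the reported result, e.g. a t-value and P-value.
    findings: dict[str, str] = field(default_factory=dict)
    worth_to_practice: str = ""

    def analyses_without_findings(self) -> list[str]:
        """List every analysis that has no result recorded in the findings column."""
        return [a for a in self.analyses if not self.findings.get(a, "").strip()]


# Made-up placeholder study; none of these values come from real research.
row = EvaluationRow(
    citation="Hypothetical Study A",
    design="RCT",
    sample_setting="N = 60; adult ICU",
    framework="none stated",
    independent_vars=["MT (music therapy)"],
    dependent_vars=["SaO2", "RR"],
    analyses=["t-test", "chi-square"],
    findings={"t-test": "t = 2.10, P = .04"},
)
print("Analyses still missing a result:", row.analyses_without_findings())
```

Running the check on each row before moving to synthesis helps confirm that the findings column contains results rather than gaps.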

Phase 3: Synthesis. In the synthesis phase, clinicians pull out key information from the evaluation table to produce a snapshot of the body of evidence. A table also is used here to feature what is known and help all those viewing the synthesis table to come to the same conclusion. A hypothetical example table included here demonstrates that a music therapy intervention is effective in improving the outcome of oxygen saturation (SaO2) in six of the eight studies in the body of evidence that evaluated that outcome (see Sample synthesis table: Impact on outcomes). Simply using arrows to indicate effect offers readers a collective view of the agreement across studies that prompts action. Action may be to change practice, affirm current practice, or conduct research to strengthen the body of evidence by collaborating with nurse scientists.

When synthesizing evidence, there are at least two recommended synthesis tables: the level-of-evidence table and either the impact-on-outcomes table for quantitative questions, such as therapy, or the relevant-themes table for “meaning” questions about human experience. (See Bonus Content: Level of evidence for intervention studies: Synthesis of type.) The sample synthesis table also demonstrates that a final column labeled synthesis indicates agreement across the studies. Of the three outcomes, the most reliable for clinicians to see with music therapy is SaO2, with positive results in six out of eight studies. The second most reliable outcome would be reducing increased respiratory rate (RR). Parental engagement has the least support as a reliable outcome, with only two of five studies showing positive results. Synthesis tables make the recommendation clear to all those who are involved in caring for that patient population. Although the two synthesis tables mentioned are a great start, the evidence may require more synthesis tables to adequately explain what is known. These tables are the foundation that supports clinically meaningful recommendations.
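The tallying behind an impact-on-outcomes table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's tool: it uses only the two counts given in the hypothetical music therapy example (SaO2 positive in six of eight studies, parental engagement in two of five), omits RR because no count is given for it, and assumes the remaining studies showed no effect, which the example does not specify.

```python
from __future__ import annotations

# Symbols for the table; the choice of symbols is an assumption.
ARROWS = {"improved": "↑", "worsened": "↓", "no effect": "↔"}


def synthesize(effects_by_outcome: dict[str, list[str]]) -> list[tuple[str, int, int]]:
    """Return (outcome, studies improved, studies measuring it), best supported first."""
    rows = [
        (outcome, effects.count("improved"), len(effects))
        for outcome, effects in effects_by_outcome.items()
        if effects  # skip outcomes that no study measured
    ]
    return sorted(rows, key=lambda r: r[1] / r[2], reverse=True)


# Counts from the hypothetical example; which study showed which effect is assumed.
body_of_evidence = {
    "parental engagement": ["improved"] * 2 + ["no effect"] * 3,
    "SaO2": ["improved"] * 6 + ["no effect"] * 2,
}

for outcome, improved, measured in synthesize(body_of_evidence):
    arrows = " ".join(ARROWS[e] for e in body_of_evidence[outcome])
    print(f"{outcome}: {arrows}  ({improved} of {measured} studies improved)")
```

Sorting by the share of studies showing improvement puts the most consistently supported outcome at the top, which mirrors how the synthesis column is read.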

Phase 4: Recommendation. Recommendations are definitive statements based on what is known from the body of evidence. For example, with an intervention question, clinicians should be able to discern from the evidence if they will reliably get the desired outcome when they deliver the intervention as it was in the studies. In the sample synthesis table, the recommendation would be to implement the music therapy intervention across all settings with the population, and measure SaO2 and RR, with the expectation that both would be optimally improved with the intervention. When the synthesis demonstrates that studies consistently verify an outcome occurs as a result of an intervention, but that intervention is not currently practiced, care is not best practice. Therefore, a firm recommendation to deliver the intervention and measure the appropriate outcomes must be made, which concludes critical appraisal of the evidence.

A recommendation that is off limits is conducting more research, as this is not the focus of clinicians' critical appraisal. In the case of insufficient evidence to make a recommendation for practice change, the recommendation would be to continue current practice and monitor outcomes and processes until there are more reliable studies to be added to the body of evidence. Researchers who use the critical appraisal process may indeed identify gaps in knowledge, research methods, or analyses, for example, and then recommend studies that would fill the identified gaps. In this way, clinicians and nurse scientists work together to build relevant, efficient bodies of evidence that guide clinical practice.

Evidence into action

Critical appraisal helps clinicians understand the literature so they can implement it. Critical care nurses have a professional and ethical responsibility to make sure their care is based on a solid foundation of available evidence that is carefully appraised using the phases outlined here. Critical appraisal allows for decision-making based on evidence that demonstrates reliable outcomes. Any other approach to the literature is likely haphazard and may lead to misguided care and unreliable outcomes. 11 Evidence translated into practice should have the desired outcomes and their measurement defined from the body of evidence. It is also imperative that all critical care nurses carefully monitor care delivery outcomes to establish that best outcomes are sustained. With the EBP paradigm as the basis for decision-making and the EBP process as the basis for addressing clinical issues, critical care nurses can improve patient, provider, and system outcomes by providing best care.

Seven steps to EBP

Step 0–A spirit of inquiry to notice internal data that indicate an opportunity for positive change.

Step 1–Ask a clinical question using the PICOT question format.

Step 2–Conduct a systematic search to find out what is already known about a clinical issue.

Step 3–Conduct a critical appraisal (rapid critical appraisal, evaluation, synthesis, and recommendation).

Step 4–Implement best practices by blending external evidence with clinician expertise and patient preferences and values.

Step 5–Evaluate evidence implementation to see if study outcomes happened in practice and if the implementation went well.

Step 6–Share project results, good or bad, with others in healthcare.

Adapted from: Steps of the evidence-based practice (EBP) process leading to high-quality healthcare and best patient outcomes. © Melnyk & Fineout-Overholt, 2017. Used with permission.

Critical appraisal resources

  • The Joanna Briggs Institute http://joannabriggs.org/research/critical-appraisal-tools.html
  • Critical Appraisal Skills Programme (CASP) www.casp-uk.net/casp-tools-checklists
  • Center for Evidence-Based Medicine www.cebm.net/critical-appraisal
  • Melnyk BM, Fineout-Overholt E. Evidence-Based Practice in Nursing and Healthcare: A Guide to Best Practice. 3rd ed. Philadelphia, PA: Wolters Kluwer; 2015.

A full set of critical appraisal checklists is available in the appendices.

Bonus content!

This article includes supplementary online-exclusive material. Visit the online version of this article at www.nursingcriticalcare.com to access this content.

critical appraisal; decision-making; evaluation of research; evidence-based practice; synthesis



Critical Appraisal: Critical appraisal full list of checklists and tools


Which checklist or tool should I use?

There are hundreds of critical appraisal checklists and tools you can choose from, which can be very overwhelming. There are so many because there are many kinds of research, knowledge can be communicated in a wide range of ways, and whether something is appropriate to meet your information needs depends on your specific context. 

We have asked for recommendations from lecturers in different academic departments, to give you an idea about which checklists and tools may be the most relevant for you. Please hover over the drop-down menu at the top of the page, underneath 'Critical appraisal checklists and tools' to view the individual subject pages.

Below are lists of as many critical appraisal tools and checklists as we have been able to find. These are split into health sciences and social sciences because the two areas tend to take different approaches to evaluation, for various reasons!


Critical appraisal checklists and tools for Health Sciences

  • AACODS  Checklist for appraising grey literature
  • AMSTAR 2  critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both
  • AOTA Critically Appraised Papers  American Occupational Therapy Association 
  • Bandolier - "Evidence based thinking about healthcare"
  • BestBETS critical appraisal worksheet
  • BMJ critical appraisal checklists
  • CASP  Critical Appraisal Skills Programme includes checklists for case control studies, clinical prediction rule, cohort studies, diagnostic studies, economic evaluation, qualitative studies, RCTs and systematic reviews
  • Centre for Evidence Based Medicine (Oxford) Critical Appraisal Tools  CEBM's worksheets to assess systematic reviews, diagnostic, prognosis, and RCTs
  • Centre for Evidence Based Medicine (Oxford) CATmaker and EBM calculator  CEBM's computer assisted critical appraisal tool CATmaker 
  • CEBM critical appraisal sheets  (Centre for Evidence Based Medicine)
  • Cochrane Assessing Risk of Bias in a Randomized Trial
  • Critical appraisal: a checklist from Students for Best Evidence S4BE (student network with simple explanations of difficult concepts)
  • Critical appraisal and statistical skills (Knowledge for Healthcare)
  • Critical appraisal of clinical trials  from Testing Treatments International
  • Critical appraisal of clinical trials (Medicines Learning Portal)
  • Critical appraisal of quantitative research  
  • Critical appraisal of a quantitative paper  from Teesside University
  • Critical appraisal of a qualitative paper  from Teesside University
  • Critical appraisal tools  from the Centre for Evidence-Based Medicine
  • Critical Evaluation of Research Papers – Qualitative Studies from Teesside University
  • Critical Evaluation of Research Papers – RCTs/Experimental Studies from Teesside University
  • Evaluation tool for mixed methods study designs 
  • GRADE - The Grading of Recommendations Assessment, Development and Evaluation working group  guidelines and publications for grading the quality of evidence in healthcare research and policy
  • HCPRDU Evaluation Tool for Mixed Methods Studies  - University of Salford Health Care Practice R&D Unit 
  • HCPRDU Evaluation Tool for Qualitative Studies  - University of Salford Health Care Practice R&D Unit 
  • HCPRDU Evaluation Tool for Quantitative Studies  - University of Salford Health Care Practice R&D Unit 
  • JBI Joanna Briggs Institute critical appraisal tools  checklists for Analytical cross sectional studies, case control studies, case reports, case series, cohort studies, diagnostic test accuracy, economic evaluations, prevalence studies, qualitative research, quasi-experimental (non-randomised) studies, RCTs, systematic reviews and for text and opinion  
  • Knowledge Translation Program  - Toronto based KTP critical appraisal worksheets for systematic reviews, prognosis, diagnosis, harm and therapy
  • MMAT Mixed Methods Appraisal Tool 
  • McMaster University Evidence Based Practice Research Group quantitative and qualitative review forms
  • NHLBI (National Heart, Lung, and Blood Institute) study quality assessment tools for case control studies, case series, controlled intervention, observational cohort and cross sectional studies, before-after (pre-post) studies with no control group, systematic reviews and meta analyses
  • NICE Guidelines, The Manual Appendix H. pp9-24
  • QUADAS-2  tool for evaluating risk of bias in diagnostic accuracy studies included in systematic reviews, from the University of Bristol
  • PEDro  (Physiotherapy Evidence Database) Scale - appraisal resources including a tutorial and appraisal tool
  • RoB 2   A revised Cochrane risk-of-bias tool for randomized trials
  • ROBINS-I Risk Of Bias In Non-Randomized Studies of Interventions 
  • ROBIS  Risk of Bias in Systematic Reviews
  • ROB-ME   A tool for assessing Risk Of Bias due to Missing Evidence in a synthesis
  • SIGN  - Critical appraisal notes and checklists for case control studies, cohort studies, diagnostic studies, economic studies, RCTs, meta-analyses and systematic reviews
  • Strength of Recommendation Taxonomy  - the SORT scale for quality, quantity and consistency of evidence in individual studies or bodies of evidence
  • STROBE (Strengthening the Reporting of Observational studies in Epidemiology)  for cohort, case-control, and cross-sectional studies (combined),  cohort, case-control, cross-sectional studies and conference abstracts
  • SURE Case Controlled Studies Critical Appraisal checklist
  • SURE Case Series Studies Critical Appraisal checklist
  • SURE Cohort Studies Critical Appraisal checklist
  • SURE Cross-sectional Studies Critical Appraisal checklist
  • SURE Experimental Studies Critical Appraisal checklist
  • SURE Qualitative Studies Critical Appraisal checklist
  • SURE Systematic Review Critical Appraisal checklist

Critical appraisal checklists and tools for Social Sciences

  • AACODS   Checklist for appraising grey literature
  • CRAAP test to evaluate sources of information 
  • Critical Appraisal of an Article on an Educational Intervention  (variable study design) from the University of Glasgow
  • Educational Interventions Critical Appraisal worksheet  from BestBETs
  • PROMPT  from Open University
  • PROVEN  - tool to evaluate any source of information 

  • SIFT (The Four Moves)  to help students distinguish between truth and fake news
  • Some Guidelines for the Critical Reviewing of Conceptual Papers

Critical Appraisal of Clinical Research

Azzam Al-Jundi

1 Professor, Department of Orthodontics, King Saud bin Abdul Aziz University for Health Sciences-College of Dentistry, Riyadh, Kingdom of Saudi Arabia.

Salah Sakka

2 Associate Professor, Department of Oral and Maxillofacial Surgery, Al Farabi Dental College, Riyadh, KSA.

Evidence-based practice is the integration of individual clinical expertise with the best available external clinical evidence from systematic research and the patient’s values and expectations into the decision-making process for patient care. Being able to identify and appraise the best available evidence, in order to integrate it with your own clinical experience and your patients’ values, is a fundamental skill. The aim of this article is to provide a robust and simple process for assessing the credibility of articles and their value to your clinical practice.


Decisions related to patient values and care are made carefully by integrating the best existing evidence, clinical experience and patient preference. Critical appraisal is the process of carefully and systematically examining research to assess its reliability, value and relevance in order to guide professionals in their vital clinical decision making [ 1 ].

Critical appraisal is essential to:

  • Combat information overload;
  • Identify papers that are clinically relevant;
  • Support Continuing Professional Development (CPD).

Carrying out Critical Appraisal:

Assessing the research methods used in the study is a prime step in its critical appraisal. This is done using checklists which are specific to the study design.

Standard Common Questions:

  • What is the research question?
  • What is the study type (design)?
  • Selection issues.
  • What are the outcome factors and how are they measured?
  • What are the study factors and how are they measured?
  • What important potential confounders are considered?
  • What is the statistical method used in the study?
  • Statistical results.
  • What conclusions did the authors reach about the research question?
  • Are ethical issues considered?

The Critical Appraisal starts by double checking the following main sections:

I. Overview of the paper:

  • The publishing journal and the year
  • The article title: Does it state key trial objectives?
  • The author (s) and their institution (s)

The presence of a peer review process in journal acceptance protocols also adds robustness to the assessment criteria for research papers and hence would indicate a reduced likelihood of publication of poor quality research. Other areas to consider may include authors’ declarations of interest and potential market bias. Attention should be paid to any declared funding or the issue of a research grant, in order to check for a conflict of interest [ 2 ].

II. Abstract: Reading the abstract is a quick way of getting to know the article and its purpose, major procedures and methods, main findings, and conclusions.

  • Aim of the study: It should be well and clearly written.
  • Materials and Methods: The study design and type of groups, type of randomization process, sample size, gender, age, the procedure rendered to each group and the measuring tool(s) should be clearly stated.
  • Results: The measured variables with their statistical analysis and significance.
  • Conclusion: It must clearly answer the question of interest.

III. Introduction/Background section:

An excellent introduction will include thorough references to earlier work in the area under discussion and express the importance and limitations of what is already known [ 2 ].

  • Why is this study considered necessary? What is its purpose? Was the purpose identified before the study, or was a chance result revealed as part of ‘data searching’?
  • What has already been achieved and how does this study differ?
  • Does the scientific approach outline the advantages along with possible drawbacks associated with the intervention or observations?

IV. Methods and Materials section: Full details on how the study was actually carried out should be given, including precise information on the study design, the population, the sample size and the interventions. All measurement approaches should be clearly stated [ 3 ].

V. Results section: This section should clearly reveal what actually occurred to the subjects. The results might contain raw data and explain the statistical analysis. These can be shown in related tables, diagrams and graphs.

VI. Discussion section: This section should include a thorough comparison of what is already known about the topic of interest with what has been newly established, and state its clinical relevance. Possible limitations and the need for further studies should also be discussed.

Does it summarize the main findings of the study and relate them to any deficiencies in the study design or problems in the conduct of the study? For trials, were participants analysed in the groups to which they were originally allocated (intention-to-treat analysis)?

  • Does it address any source of potential bias?
  • Are interpretations consistent with the results?
  • How are null findings interpreted?
  • Does it mention how the findings of this study relate to previous work in the area?
  • Can they be generalized (external validity)?
  • Does it mention their clinical implications/applicability?
  • To whom are the results/outcomes/findings applicable, and will they affect clinical practice?
  • Does the conclusion answer the study question?
  • Is the conclusion convincing?
  • Does the paper indicate ethics approval?
  • Can you identify potential ethical issues?
  • Do the results apply to the population in which you are interested?
  • Will you use the results of the study?

Once you have answered the preliminary and key questions and identified the research method used, you can incorporate specific questions related to each method into your appraisal process or checklist.

1-What is the research question?

For a study to have value, it should address a significant problem within healthcare and provide new or meaningful results. A useful structure for assessing the problem addressed in the article is the Problem, Intervention, Comparison, Outcome (PICO) method [ 3 ].

P = Patient/Problem/Population: This involves identifying whether the research has a focused question. What is the chief complaint? E.g., disease status, previous ailments, current medications.

I = Intervention: An appropriately and clearly stated management strategy, e.g., a new diagnostic test, treatment, or adjunctive therapy.

C = Comparison: A suitable control or alternative, e.g., specific and limited to one alternative choice.

O = Outcomes: The desired results or patient-related consequences have to be identified, e.g., eliminating symptoms, improving function, or esthetics.

The clinical question determines which study designs are appropriate. There are five broad categories of clinical questions, as shown in [ Table/Fig-1 ].


[ Table/Fig-1 ]: Categories of clinical questions and the related study designs.

Clinical Questions | Clinical Relevance and Suggested Best Method of Investigation
Aetiology/Causation | What caused the disorder and how is this related to the development of illness? Example: randomized controlled trial, case-control study, cohort study.
Therapy | Which treatments do more good than harm compared with an alternative treatment? Example: randomized controlled trial, systematic review, meta-analysis.
Prognosis | What is the likely course of a patient’s illness? What is the balance of the risks and benefits of a treatment? Example: cohort study, longitudinal survey.
Diagnosis | How valid and reliable is a diagnostic test? What does the test tell the doctor? Example: cohort study, case-control study.
Cost-effectiveness | Which intervention is worth prescribing? Is a newer treatment X worth prescribing compared with older treatment Y? Example: economic analysis.
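As a rough illustration, the link between a PICO question, its category and the suggested designs in the table above can be written down as a small lookup. This is only a sketch: the names `PicoQuestion`, `SUGGESTED_DESIGNS` and `design_matches_question` are hypothetical, and the example question is invented for demonstration.

```python
from dataclasses import dataclass

# Suggested designs per question category, transcribed from the table above.
SUGGESTED_DESIGNS = {
    "aetiology": ["randomized controlled trial", "case-control study", "cohort study"],
    "therapy": ["randomized controlled trial", "systematic review", "meta-analysis"],
    "prognosis": ["cohort study", "longitudinal survey"],
    "diagnosis": ["cohort study", "case-control study"],
    "cost-effectiveness": ["economic analysis"],
}


@dataclass
class PicoQuestion:
    """A clinical question broken into its PICO parts (hypothetical helper)."""
    population: str
    intervention: str
    comparison: str
    outcome: str
    category: str  # one of the keys of SUGGESTED_DESIGNS


def design_matches_question(question: PicoQuestion, reported_design: str) -> bool:
    """Return True if the reported design is one the table suggests for this category."""
    suggested = SUGGESTED_DESIGNS[question.category.lower()]
    return reported_design.lower() in suggested


# Invented example question, used only to exercise the lookup.
question = PicoQuestion(
    population="adults with chronic low back pain",
    intervention="supervised exercise programme",
    comparison="usual care",
    outcome="pain intensity at 12 weeks",
    category="therapy",
)

print(design_matches_question(question, "Randomized controlled trial"))
print(design_matches_question(question, "economic analysis"))
```

A mismatch does not by itself invalidate a paper; it simply flags that the appraiser should ask why a different design was chosen.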

2- What is the study type (design)?

The study design of the research is fundamental to the usefulness of the study.

In a clinical paper the methodology employed to generate the results is fully explained. In general, all questions about the related clinical query, the study design, the subjects and the correlated measures to reduce bias and confounding should be adequately and thoroughly explored and answered.

Participants/Sample Population:

Researchers identify the target population they are interested in. A sample population is therefore taken and results from this sample are then generalized to the target population.

The sample should be representative of the target population from which it came. Knowing the baseline characteristics of the sample population is important because this allows researchers to see how closely the subjects match their own patients [ 4 ].

Sample size calculation (Power calculation): A trial should be large enough to have a high chance of detecting a worthwhile effect if it exists. Statisticians can work out before the trial begins how large the sample size should be in order to have a good chance of detecting a true difference between the intervention and control groups [ 5 ].
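To make the idea of a power calculation concrete, the sketch below uses the common normal-approximation formula for comparing two group means, n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ² per group. It is one standard textbook approach, not necessarily the one a given trial used; the function name `sample_size_per_group` and the example numbers are assumptions for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a mean difference.

    delta: smallest difference between group means worth detecting
    sd:    assumed common standard deviation of the outcome
    alpha: two-sided significance level
    power: desired probability of detecting the difference if it exists
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the power
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)  # round up: a fraction of a participant is not possible


# Hypothetical numbers: detect a 5-point difference when the SD is 10 points.
n_80 = sample_size_per_group(delta=5, sd=10, power=0.80)
n_90 = sample_size_per_group(delta=5, sd=10, power=0.90)
print(n_80, n_90)
```

When appraising a paper, the reported sample size can be compared with a figure like this, bearing in mind that exact methods and allowances for dropout usually give somewhat larger numbers.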

  • Is the sample defined (human or animal, and which type)? What population does it represent?
  • Does it mention eligibility criteria with reasons?
  • Does it mention where and how the sample was recruited, selected and assessed?
  • Does it mention where the study was carried out?
  • Is the sample size justified and correctly calculated? Is it adequate to detect statistically and clinically significant results?
  • Does it mention a suitable study design/type?
  • Is the study type appropriate to the research question?
  • Is the study adequately controlled? Does it mention the type of randomization process? Does it mention the presence of a control group or explain the lack of one?
  • Are the samples similar at baseline? Is sample attrition mentioned?
  • Studies should report the number of participants/specimens at the start of the study, together with how many completed it and the reasons for any incomplete follow-up.
  • Does it mention who was blinded? Are the assessors and participants blind to the interventions received?
  • Does it mention how the data were analysed?
  • Are any measurements taken likely to be valid?

Researchers use measuring techniques and instruments that have been shown to be valid and reliable.

Validity refers to the extent to which a test measures what it is supposed to measure, that is, the extent to which the value obtained represents the object of interest.

  • Soundness and effectiveness of the measuring instrument.
  • What does the test measure?
  • Does it measure what it is supposed to measure?
  • How well and how accurately does it measure?

Reliability: In research, the term reliability means “repeatability” or “consistency”.

Reliability refers to how consistent a test is on repeated measurements. It is especially important if assessments are made on different occasions and/or by different examiners. Studies should state the method for assessing the reliability of any measurements taken and what the intra-examiner reliability was [ 6 ].
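One widely used way to quantify agreement between two sets of categorical ratings (the same examiner on two occasions, or two examiners) is Cohen's kappa, which corrects the observed agreement for agreement expected by chance. The source does not name a specific statistic, so kappa is an assumption here, and the ratings below are invented.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of items on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rating's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1.0:
        return 1.0  # both ratings used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented data: one examiner scoring the same 10 teeth on two occasions.
first_visit = ["caries", "caries", "sound", "sound", "caries",
               "sound", "sound", "sound", "caries", "sound"]
second_visit = ["caries", "sound", "sound", "sound", "caries",
                "sound", "sound", "caries", "caries", "sound"]

kappa = cohen_kappa(first_visit, second_visit)
print(round(kappa, 3))
```

Here the raw agreement is 80%, but kappa is noticeably lower because some of that agreement would be expected by chance alone.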

3-Selection issues:

The following questions should be raised:

  • How were subjects chosen or recruited? If not random, are they representative of the population?
  • What type of blinding (masking) was used: single, double, or triple?
  • Is there a control group? How was it chosen?
  • How are patients followed up? Who are the dropouts? Why and how many are there?
  • Are the independent (predictor) and dependent (outcome) variables in the study clearly identified, defined, and measured?
  • Is there a statement about sample size issues or statistical power (especially important in negative studies)?
  • If a multicenter study, what quality assurance measures were employed to obtain consistency across sites?
  • Are there selection biases?

In a case-control study (for example, one comparing exercise habits):

  • Are the controls appropriate?
  • Were records of cases and controls reviewed blindly?
  • How were possible selection biases controlled (prevalence bias, admission rate bias, volunteer bias, recall bias, lead time bias, detection bias, etc.)?

In a cross-sectional study:

  • Was the sample selected in an appropriate manner (random, convenience, etc.)?
  • Were efforts made to ensure a good response rate or to minimize the occurrence of missing data?
  • Were reliability (reproducibility) and validity reported?

In an intervention study:

  • How were subjects recruited and assigned to groups?

In a cohort study:

  • How many reached final follow-up?
  • Are the subjects representative of the population to which the findings are applied?
  • Is there evidence of volunteer bias? Was there adequate follow-up time?
  • What was the drop-out rate?

Any shortcoming in the methodology can lead to results that do not reflect the truth. If clinical practice is changed on the basis of these results, patients could be harmed.

Researchers employ a variety of techniques to make the methodology more robust, such as matching, restriction, randomization, and blinding [ 7 ].

Bias is the term used to describe an error at any stage of the study that was not due to chance. Bias leads to results that deviate systematically from the truth. As bias cannot be measured, researchers need to rely on good research design to minimize it [ 8 ]. To minimize bias within a study, the sample population should be representative of the target population. It is also imperative to consider the sample size and to check whether the study is adequately powered, that is, large enough to have a high probability of detecting a true effect at the chosen significance level (conventionally p<0.05) [ 9 ].

4-What are the outcome factors and how are they measured?

  • Are all relevant outcomes assessed?
  • Is measurement error an important source of bias?

5-What are the study factors and how are they measured?

  • Are all the relevant study factors included in the study?
  • Have the factors been measured using appropriate tools?

Data Analysis and Results:

  • Were the tests appropriate for the data?
  • Are confidence intervals or p-values given?

  • How strong is the association between intervention and outcome?
  • How precise is the estimate of the risk?
  • Does it clearly mention the main finding(s) and does the data support them?
  • Does it mention the clinical significance of the result?
  • Is adverse event or lack of it mentioned?
  • Are all relevant outcomes assessed?
  • Was the sample size adequate to detect a clinically/socially significant result?
  • Are the results presented in a way to help in health policy decisions?
  • Is there measurement error?
  • Is measurement error an important source of bias?

Confounding Factors:

A confounder is associated with both the exposure and the outcome, forming a triangular relationship, but it is not on the causal pathway between them. It can make it appear as if there is a direct relationship between the exposure and the outcome, or it might even mask an association that would otherwise have been present [ 9 ].
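A small numerical sketch may help show how a confounder can create an apparent association. The counts below are entirely invented: coffee drinking is the exposure, lung disease the outcome and smoking the confounder, and the helper `risk_ratio` is a hypothetical name. Within each smoking stratum the exposure makes no difference, yet the crude (unstratified) comparison suggests it does.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# Invented counts: (cases_exposed, n_exposed, cases_unexposed, n_unexposed).
# Smokers are both more likely to drink coffee and more likely to be ill.
strata = {
    "smokers": (24, 80, 6, 20),
    "non-smokers": (1, 20, 4, 80),
}

# Crude analysis: add the strata together and ignore smoking altogether.
crude_counts = tuple(sum(column) for column in zip(*strata.values()))
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: compare exposed and unexposed within each stratum.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print("crude risk ratio:", crude_rr)
print("stratum-specific risk ratios:", stratum_rr)
```

Stratification is only one way of controlling for a confounder; matching, restriction, randomization and statistical adjustment serve the same purpose.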

6- What important potential confounders are considered?

  • Are potential confounders examined and controlled for?
  • Is confounding an important source of bias?

7- What is the statistical method in the study?

  • Are the statistical methods described appropriate to compare participants for primary and secondary outcomes?
  • Are statistical methods specified in sufficient detail (if I had access to the raw data, could I reproduce the analysis)?
  • Were the tests appropriate for the data?
  • Are confidence intervals or p-values given?
  • Are results presented as absolute risk reduction as well as relative risk reduction?
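The reason for asking about both absolute and relative risk reduction can be seen in a short calculation. The sketch below uses invented event counts and a hypothetical helper `risk_measures`; it computes the absolute risk reduction (ARR), the relative risk reduction (RRR) and the number needed to treat (NNT, which is 1/ARR rounded up).

```python
import math


def risk_measures(events_control, n_control, events_treated, n_treated):
    """Return ARR, RRR and NNT for a two-arm trial (assumes control risk > 0)."""
    cer = events_control / n_control  # control event rate
    eer = events_treated / n_treated  # experimental event rate
    arr = cer - eer                   # absolute risk reduction
    rrr = arr / cer                   # relative risk reduction
    # NNT is only meaningful when treatment reduces risk; the inner round()
    # guards against floating-point noise before rounding up.
    nnt = math.ceil(round(1 / arr, 9)) if arr > 0 else None
    return {"ARR": arr, "RRR": rrr, "NNT": nnt}


# Invented trials: both halve the risk, but from very different baselines.
high_risk = risk_measures(200, 1000, 100, 1000)  # 20% -> 10%
low_risk = risk_measures(20, 1000, 10, 1000)     # 2% -> 1%

print(high_risk)
print(low_risk)
```

Both hypothetical trials could truthfully report a 50% relative risk reduction, yet ten times as many patients need to be treated in the low-risk population to prevent one event.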

Interpretation of p-value:

The p-value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis (no true difference) were correct. By convention, a p-value of less than 1 in 20 (p<0.05) is regarded as statistically significant.

  • When the p-value is less than the chosen significance level, which is usually 0.05, the null hypothesis is rejected and the result is considered to be statistically significant. Conversely, when the p-value is greater than 0.05, the result is not statistically significant and we fail to reject the null hypothesis; this does not prove that the null hypothesis is true.
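As a rough illustration of where a p-value comes from, the sketch below runs a two-sided z-test for the difference between two proportions using only the Python standard library. The test choice, the function name `two_proportion_p_value` and the event counts are assumptions for demonstration; a real paper may use a different, equally valid test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for H0: both groups share the same event proportion."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Under the null hypothesis the two groups share one pooled proportion.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Probability of a difference at least this extreme in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Invented data: 30/100 events with control versus 15/100 with treatment.
p_different = two_proportion_p_value(30, 100, 15, 100)
# Identical proportions give no evidence against the null hypothesis.
p_same = two_proportion_p_value(20, 100, 20, 100)

print(p_different < 0.05, p_same)
```

The normal approximation used here is reasonable for samples of this size; with small counts an exact test would usually be preferred.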

Confidence interval:

Repeating the same trial many times would not yield exactly the same result every time; the estimates would vary within a certain range. A 95% confidence interval is constructed so that, if the trial were repeated many times, about 95% of the intervals calculated would contain the true size of the effect. A narrow interval indicates a precise estimate.
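A simulation can make the repeated-trial reading of a confidence interval concrete. The sketch below draws many samples from a population whose true mean is known, builds a 95% interval around each sample mean, and counts how often the interval contains the true value. All parameters are arbitrary assumptions, and the standard deviation is treated as known to keep the example short.

```python
import random
from statistics import NormalDist


def coverage_of_95_ci(true_mean=10.0, sd=2.0, n=30, trials=2000, seed=0):
    """Fraction of simulated 95% confidence intervals that contain true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.975)
    half_width = z * sd / n ** 0.5  # known-SD interval, for simplicity
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials


coverage = coverage_of_95_ci()
print(coverage)
```

The printed proportion should sit close to 0.95, which is the sense in which the interval is a "95%" interval: the procedure captures the truth in about 95% of repetitions.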

8- Statistical results:

  • Do statistical tests answer the research question?
  • Were multiple statistical tests performed and comparisons made (data searching)?

Correct statistical analysis of results is crucial to the reliability of the conclusions drawn from the research paper. Depending on the study design and sample selection method employed, descriptive or inferential statistical analysis may be carried out on the results of the study.

It is important to identify if this is appropriate for the study [ 9 ].

  • Was the sample size adequate to detect a clinically/socially significant result?
  • Are the results presented in a way to help in health policy decisions?

Clinical significance:

Statistical significance as shown by the p-value is not the same as clinical significance. Statistical significance judges whether treatment effects are explicable as chance findings, whereas clinical significance assesses whether treatment effects are worthwhile in real life. Small improvements that are statistically significant might not result in any meaningful improvement clinically. The following questions should always be kept in mind:

  • If the results are statistically significant, do they also have clinical significance?
  • If the results are not statistically significant, was the sample size sufficiently large to detect a meaningful difference or effect?
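One way to work through these two questions is to compare the confidence interval of the treatment effect with a minimal clinically important difference (MCID). The sketch below is an assumption-laden illustration: the function `interpret_effect`, the MCID of 5 points and the intervals are all hypothetical, and the effect is taken to be a mean difference where larger values mean more benefit.

```python
def interpret_effect(ci_low, ci_high, mcid):
    """Classify a mean difference (higher = more benefit) against the MCID.

    Returns (statistically_significant, clinical_reading).
    """
    # Significant at the matching alpha if the interval excludes zero.
    statistically_significant = ci_low > 0 or ci_high < 0
    if ci_low >= mcid:
        clinical = "clinically important"
    elif ci_high < mcid:
        clinical = "unlikely to be clinically important"
    else:
        clinical = "clinical importance uncertain"
    return statistically_significant, clinical


# Hypothetical 95% confidence intervals on a pain scale, with an MCID of 5.
large_trial = interpret_effect(0.5, 1.5, mcid=5)    # precise but tiny effect
small_trial = interpret_effect(-2.0, 8.0, mcid=5)   # imprecise estimate
strong_trial = interpret_effect(6.0, 10.0, mcid=5)  # clearly worthwhile

print(large_trial)
print(small_trial)
print(strong_trial)
```

The second case corresponds to a non-significant result from a study too small to rule out a meaningful effect, which is why sample size matters when interpreting negative findings.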

9- What conclusions did the authors reach about the study question?

The recommendations stated in the conclusions should be supported by the results and stay within the scope of the study. The authors should also discuss the limitations of the study, their effects on the outcomes, and suggestions for future studies [ 10 ].

  • Are the questions posed in the study adequately addressed?
  • Are the conclusions justified by the data?
  • Do the authors extrapolate beyond the data?
  • Are shortcomings of the study addressed and constructive suggestions given for future research?

Bibliography/References:

  • Do the citations follow one of the Council of Biology Editors’ (CBE) standard formats?

10- Are ethical issues considered?

If a study involves human subjects, human tissues, or animals, was approval from appropriate institutional or governmental entities obtained? [ 10 , 11 ].

Critical appraisal of RCTs: Factors to look for:

  • Allocation (randomization, stratification, confounders).
  • Follow up of participants (intention to treat).
  • Data collection (bias).
  • Sample size (power calculation).
  • Presentation of results (clear, precise).
  • Applicability to local population.
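The "intention to treat" item in the list above can be illustrated by analysing the same invented trial in two ways. In the sketch below every participant is a tuple of (assigned arm, completed protocol, recovered); the data, the helper names and the choice to count dropouts as not recovered are all assumptions made for the example.

```python
# Invented trial: 50 participants per arm; 10 in the treatment arm dropped out
# and are counted here as not recovered (an assumption, not a rule).
participants = (
    [("treatment", True, True)] * 30
    + [("treatment", True, False)] * 10
    + [("treatment", False, False)] * 10
    + [("control", True, True)] * 25
    + [("control", True, False)] * 25
)


def recovery_rate(rows):
    """Proportion of rows in which the participant recovered."""
    return sum(recovered for _, _, recovered in rows) / len(rows)


def in_arm(rows, arm):
    """All rows for participants randomized to the given arm."""
    return [row for row in rows if row[0] == arm]


arms = ("treatment", "control")

# Intention to treat: everyone is analysed in the arm they were randomized to.
itt = {arm: recovery_rate(in_arm(participants, arm)) for arm in arms}

# Per protocol: only participants who completed the protocol are analysed.
per_protocol = {
    arm: recovery_rate([row for row in in_arm(participants, arm) if row[1]])
    for arm in arms
}

print("intention to treat:", itt)
print("per protocol:", per_protocol)
```

Dropping the non-completers makes the treatment look better than it does under intention to treat, which is why the appraiser should check which analysis a trial reports.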

[ Table/Fig-2 ] summarizes the guidelines for Consolidated Standards of Reporting Trials CONSORT [ 12 ].


[ Table/Fig-2 ]: Summary of the CONSORT guidelines.

Section | Items
Title and abstract | Identification as a RCT in the title; structured summary (trial design, methods, results, and conclusions)
Introduction | Scientific background
Methods | Description of trial design and important changes to methods; eligibility criteria for participants; the interventions for each group; completely defined and assessed primary and secondary outcome measures; how sample size was determined; method used to generate the random allocation sequence; mechanism used to implement the random allocation sequence; blinding details; statistical methods used
Results | Numbers of participants, losses and exclusions after randomization; results for each group and the estimated effect size and its precision (such as 95% confidence interval); results of any other subgroup analyses performed
Discussion | Trial limitations
Other information | Registration number

Critical appraisal of systematic reviews: Systematic reviews provide an overview of all primary studies on a topic and try to obtain an overall picture of the results.

In a systematic review, all the primary studies identified are critically appraised and only the best ones are selected. A meta-analysis (i.e., a statistical analysis) of the results from selected studies may be included. Factors to look for:

  • Literature search (did it include published and unpublished materials as well as non-English language studies? Was personal contact with experts sought?).
  • Quality-control of studies included (type of study; scoring system used to rate studies; analysis performed by at least two experts).
  • Homogeneity of studies.
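Homogeneity of studies is often summarised with Cochran's Q and the I² statistic alongside an inverse-variance pooled estimate. The sketch below shows a simple fixed-effect version; the source does not prescribe this method, the function name `fixed_effect_meta` is hypothetical, and the effect sizes and standard errors are invented.

```python
def fixed_effect_meta(effects, std_errors):
    """Inverse-variance pooled effect, Cochran's Q and I-squared (percent)."""
    weights = [1 / se ** 2 for se in std_errors]  # precise studies weigh more
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q measures how far individual studies scatter around the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of the variation beyond what chance alone would give.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Invented studies with equal precision (standard error 0.05 each).
similar = fixed_effect_meta([0.10, 0.12, 0.08], [0.05, 0.05, 0.05])
conflicting = fixed_effect_meta([0.10, 0.60, -0.30], [0.05, 0.05, 0.05])

print("similar studies:     pooled=%.3f Q=%.2f I2=%.1f%%" % similar)
print("conflicting studies: pooled=%.3f Q=%.2f I2=%.1f%%" % conflicting)
```

A high I² suggests the studies may be too dissimilar for a single pooled figure to be meaningful, and a reviewer would then expect a random-effects model or an exploration of the differences.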

[ Table/Fig-3 ] summarizes the guidelines for Preferred Reporting Items for Systematic reviews and Meta-Analyses PRISMA [ 13 ].


[ Table/Fig-3 ]: Summary of PRISMA guidelines.

Section | Items
Title | Identification of the report as a systematic review, meta-analysis, or both.
Abstract | Structured summary: background; objectives; eligibility criteria; results; limitations; conclusions; systematic review registration number.
Introduction | Description of the rationale for the review; provision of a defined statement of the questions being addressed with regard to participants, interventions, comparisons, outcomes, and study design (PICOS).
Methods | Specification of study eligibility criteria; description of all information sources; presentation of full electronic search strategy; statement of the process for selecting studies; description of the method of data extraction from reports and methods used for assessing risk of bias of individual studies, in addition to methods of handling data and combining results of studies.
Results | Provision of full details of: study selection; study characteristics (e.g., study size, PICOS, follow-up period); risk of bias within studies; results of each meta-analysis done, including confidence intervals and measures of consistency; results of additional analyses (e.g., sensitivity or subgroup analyses, meta-regression).
Discussion | Summary of the main findings including the strength of evidence for each main outcome; discussion of limitations at study and outcome level; provision of a general interpretation of the results in the context of other evidence.
Funding | Source and role of funders.

Critical appraisal is a fundamental skill in modern practice for assessing the value of clinical research and providing an indication of its relevance to the profession. It is a skill set developed throughout a professional career and, through integration with clinical experience and patient preference, it permits the practice of evidence-based medicine and dentistry. By following a systematic approach, such evidence can be considered and applied to clinical practice.

Critical appraisal tools

The tools listed below will help identify the many ways that error and bias can distort research results.

These tools generally have a core set of questions around 'risk of bias'. Some tools also include other questions to address precision and external validity i.e. generalisability.

The recommended Cochrane risk of bias tools define internal validity as "risk of bias" and consider that to be the key concept when assessing if a study is valid.

Among the collection is a set of checklists that SURE has developed; please note that these have not been externally validated.

The Cochrane Collaboration advises against the use of scales yielding a summary score.

Systematic literature reviews of primary research studies

  • ROBIS: Tools to assess risk of bias in systematic reviews [Recommended]
  • AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or nonrandomised studies of healthcare interventions, or both
  • Critical appraisal skills programme (CASP) Systematic Review Checklist
  • JBI Checklist for Systematic Reviews
  • SURE Systematic Review Critical Appraisal Checklist

Randomised controlled trials

  • Cochrane Risk of Bias tool (2.0) [Recommended]
  • Critical Appraisal Skills Programme (CASP) Randomised Controlled Trial Checklist
  • JBI Checklist for Randomized Controlled Trials
  • NHLBI Quality Assessment of Controlled Intervention Studies
  • SIGN Randomised Controlled Trials Checklist
  • SURE Experimental Studies Critical Appraisal Checklist

Non-randomised controlled trials

  • Cochrane ROBINS-I tool [Recommended]
  • JBI checklist for Quasi-Experimental Studies (non-randomized experimental studies)

Observational studies

  • Critical Appraisal Skills Programme (CASP) Cohort Study Checklist
  • JBI Checklist for Cohort Studies
  • NHLBI Quality Assessment Tool for Observational Cohort and Cross-Sectional Studies
  • SIGN Cohort Studies Checklist
  • SURE Cohort Studies Critical Appraisal Checklist

Case controlled

  • Critical Appraisal Skills Programme (CASP) Case Control Study Checklist
  • JBI Checklist for Case Control Studies
  • NHLBI Quality Assessment of Case-Control Studies
  • SIGN Case Control Studies Checklist
  • SURE Case Control Studies Critical Appraisal Checklist

Cross-sectional

  • JBI Checklist for Analytical Cross-sectional Studies
  • NHLBI Quality Assessment Tool for Observational Cohort and Cross-sectional Studies
  • SURE Cross-sectional Studies Critical Appraisal Checklist

Case series

  • JBI Checklist for Case Series Studies
  • NHLBI Quality Assessment Tool for Case Series Studies
  • SURE Case Series Studies Critical Appraisal Checklist

Qualitative views and opinions studies

  • Critical Appraisal Skills Programme (CASP) Qualitative Studies Checklist
  • JBI Checklist for Qualitative Research
  • SURE Qualitative Studies Critical Appraisal Checklist

Diagnostic accuracy studies

  • QUADAS-2 [Recommended]
  • Critical Appraisal Skills Programme (CASP) Diagnostic Study Checklist
  • JBI Checklist for Diagnostic Test Accuracy Studies

Economic evaluation studies

  • Critical Appraisal Skills Programme (CASP) Economic Evaluation Checklist
  • JBI Checklist for Economic Evaluations
  • NICE Guidelines, The Manual Appendix H. pp. 9-24

Assessing an outcome from a body of evidence

  • Grading of Recommendations Assessment, Development and Evaluation (GRADE)

Establishing study type

  • NICE public health guidance, Appendix E,  Algorithm for classifying quantitative (experimental and observational) study designs
  • Evidence-Based Answers to Clinical Questions for Busy Clinicians, pp. 27-28 . The Centre for Clinical Effectiveness, Monash Institute of Health Services Research: Melbourne, Australia, 2011
  • AHRQ study design algorithms

Critical Appraisal of Quantitative Research


  Rocco Cavaleri
  Sameer Bhole
  Amit Arora  


Critical appraisal skills are important for anyone wishing to make informed decisions or improve the quality of healthcare delivery. A good critical appraisal provides information regarding the believability and usefulness of a particular study. However, the appraisal process is often overlooked, and critically appraising quantitative research can be daunting for both researchers and clinicians. This chapter introduces the concept of critical appraisal and highlights its importance in evidence-based practice. Readers are then introduced to the most common quantitative study designs and key questions to ask when appraising each type of study. These studies include systematic reviews, experimental studies (randomized controlled trials and non-randomized controlled trials), and observational studies (cohort, case-control, and cross-sectional studies). This chapter also provides the tools most commonly used to appraise the methodological and reporting quality of quantitative studies. Overall, this chapter serves as a step-by-step guide to appraising quantitative research in healthcare settings.


Author information

Authors and Affiliations

School of Science and Health, Western Sydney University, Campbelltown, NSW, Australia

Rocco Cavaleri

Sydney Dental School, Faculty of Medicine and Health, The University of Sydney, Surry Hills, NSW, Australia

Sameer Bhole

School of Science and Health, Western Sydney University, Sydney, NSW, Australia

Discipline of Paediatrics and Child Health, Sydney Medical School, Sydney, NSW, Australia

Oral Health Services, Sydney Local Health District and Sydney Dental Hospital, NSW Health, Sydney, NSW, Australia

COHORTE Research Group, Ingham Institute of Applied Medical Research, Liverpool, NSW, Australia

Oral Health Services, Sydney Local Health District and Sydney Dental Hospital, NSW Health, Surry Hills, NSW, Australia

Corresponding author

Correspondence to Rocco Cavaleri .

Editor information

Editors and affiliations.

School of Science and Health, Western Sydney University, Penrith, NSW, Australia

Pranee Liamputtong


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this entry

Cite this entry.

Cavaleri, R., Bhole, S., Arora, A. (2019). Critical Appraisal of Quantitative Research. In: Liamputtong, P. (eds) Handbook of Research Methods in Health Social Sciences. Springer, Singapore. https://doi.org/10.1007/978-981-10-5251-4_120

DOI : https://doi.org/10.1007/978-981-10-5251-4_120

Published : 13 January 2019

Publisher Name : Springer, Singapore

Print ISBN : 978-981-10-5250-7

Online ISBN : 978-981-10-5251-4

Critical Appraisal for Health Students

Appraisal of a Quantitative paper: Top tips


Critical appraisal of a quantitative paper (RCT)

This guide, aimed at health students, provides basic level support for appraising quantitative research papers. It's designed for students who have already attended lectures on critical appraisal. One framework for appraising quantitative research (based on reliability, internal and external validity) is provided and there is an opportunity to practise the technique on a sample article.

Please note that this framework is for appraising one particular type of quantitative research, a randomised controlled trial (RCT), which is defined as:

a trial in which participants are randomly assigned to one of two or more groups: the experimental group or groups receive the intervention or interventions being tested; the comparison group (control group) receive usual care or no treatment or a placebo. The groups are then followed up to see if there are any differences between the results. This helps in assessing the effectiveness of the intervention. (CASP, 2020)

Support materials

  • Framework for reading quantitative papers (RCTs)
  • Critical appraisal of a quantitative paper PowerPoint

To practise following this framework for critically appraising a quantitative article, please look at the following article:

Marrero, D.G.  et al.  (2016) 'Comparison of commercial and self-initiated weight loss programs in people with prediabetes: a randomized control trial',  AJPH Research , 106(5), pp. 949-956.

Critical Appraisal of a quantitative paper (RCT): practical example

  • Internal Validity
  • External Validity
  • Reliability Measurement Tool

How to use this practical example 

Using the framework, you can have a go at appraising a quantitative paper - we are going to look at the following article:

Marrero, D.G. et al. (2016) 'Comparison of commercial and self-initiated weight loss programs in people with prediabetes: a randomized control trial', AJPH Research, 106(5), pp. 949-956.

Step 1. Take a quick look at the article.

Step 2. Read the internal validity questions below and look for the answers in the article.

Step 3. Repeat with the questions on external validity and reliability.

Questioning the internal validity:

  • Randomisation: How were participants allocated to each group? Did a randomisation process take place?
  • Comparability of groups: How similar were the groups (eg age, sex, ethnicity)? Is this made clear?
  • Blinding (none, single, double or triple): Who was not aware of which group a patient was in (eg nobody; only patient; patient and clinician; patient, clinician and researcher)? Was it feasible for more blinding to have taken place?
  • Equal treatment of groups: Were both groups treated in the same way?
  • Attrition: What percentage of participants dropped out? Did this adversely affect one group? Has this been evaluated?
  • Overall internal validity: Does the research measure what it is supposed to be measuring?

Questioning the external validity:

  • Attrition: Was everyone accounted for at the end of the study? Was any attempt made to contact drop-outs?
  • Sampling approach: How was the sample selected? Was it based on probability or non-probability? What was the approach (eg simple random, convenience)? Was this an appropriate approach?
  • Sample size (power calculation): How many participants? Was a sample size calculation performed? Did the study pass?
  • Exclusion/inclusion criteria: Were the criteria set out clearly? Were they based on recognised diagnostic criteria?
  • Overall external validity: Can the results be applied to the wider population?

Questioning the reliability (measurement tool):

  • Internal consistency reliability (Cronbach's alpha): Has a Cronbach's alpha score of 0.7 or above been included?
  • Test re-test reliability correlation: Was the test repeated more than once? Were the same results received? Has a correlation coefficient been reported? Is it above 0.7?
  • Validity of measurement tool: Is it an established tool? If not, what has been done to check if it is reliable (pilot study, expert panel, literature review)?
  • Criterion validity (test against other tools): Has a criterion validity comparison been carried out? Was the score above 0.7?
  • Overall reliability: How consistent are the measurements?

Overall validity and reliability:

  • Overall, how valid and reliable is the paper?
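The reliability questions above ask whether a Cronbach's alpha of 0.7 or above has been reported. As a rough illustration of what that statistic summarises, the sketch below computes alpha from a small table of item scores. The function name `cronbach_alpha` and the questionnaire data are hypothetical and are not taken from the guide or the Marrero et al. article; the calculation uses the standard formula with sample variances.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")

    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five respondents answering a three-item scale (1 to 5).
example_scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(example_scores)
print(f"Cronbach's alpha: {alpha:.2f}")
print("Meets the 0.7 threshold" if alpha >= 0.7 else "Below the 0.7 threshold")
```

When appraising a paper, the point is not to recompute alpha but to check that the authors report it for the measurement tool they used and that it reaches the threshold.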

Critical Appraisal of Quantitative Research

  • In book: Handbook of Research Methods in Health Social Sciences (pp.1-23)
  • Publisher: Springer

Rocco Cavaleri at Western Sydney University

  • Western Sydney University

Sameer Bhole at Oral Health Services and Sydney Dental Hospital Sydney Local Health District

  • Oral Health Services and Sydney Dental Hospital Sydney Local Health District

Amit Arora at The University of Sydney

  • The University of Sydney

  • Research article
  • Open access
  • Published: 16 September 2004

A systematic review of the content of critical appraisal tools

  Persis Katrak
  Andrea E Bialocerkowski
  Nicola Massy-Westropp
  VS Saravana Kumar
  Karen A Grimmer  

BMC Medical Research Methodology volume 4, Article number: 22 (2004)

Consumers of research (researchers, administrators, educators and clinicians) frequently use standard critical appraisal tools to evaluate the quality of published research reports. However, there is no consensus regarding the most appropriate critical appraisal tool for allied health research. We summarized the content, intent, construction and psychometric properties of published, currently available critical appraisal tools to identify common elements and their relevance to allied health research.

A systematic review was undertaken of 121 published critical appraisal tools sourced from 108 papers located on electronic databases and the Internet. The tools were classified according to the study design for which they were intended. Their items were then classified into one of 12 criteria based on their intent. Commonly occurring items were identified. The empirical basis for construction of the tool, the method by which overall quality of the study was established, the psychometric properties of the critical appraisal tools and whether guidelines were provided for their use were also recorded.

Eighty-seven percent of critical appraisal tools were specific to a research design, with most tools having been developed for experimental studies. There was considerable variability in items contained in the critical appraisal tools. Twelve percent of available tools were developed using specified empirical research. Forty-nine percent of the critical appraisal tools summarized the quality appraisal into a numeric summary score. Few critical appraisal tools had documented evidence of validity of their items, or reliability of use. Guidelines regarding administration of the tools were provided in 43% of cases.


There was considerable variability in intent, components, construction and psychometric properties of published critical appraisal tools for research reports. There is no "gold standard" critical appraisal tool for any study design, nor is there any widely accepted generic tool that can be applied equally well across study types. No tool was specific to allied health research requirements. Thus interpretation of critical appraisal of research reports currently needs to be considered in light of the properties and intent of the critical appraisal tool chosen for the task.

Consumers of research (clinicians, researchers, educators, administrators) frequently use standard critical appraisal tools to evaluate the quality and utility of published research reports [ 1 ]. Critical appraisal tools provide analytical evaluations of the quality of the study, in particular the methods applied to minimise biases in a research project [ 2 ]. As these factors potentially influence study results, and the way that the study findings are interpreted, this information is vital for consumers of research to ascertain whether the results of the study can be believed, and transferred appropriately into other environments, such as policy, further research studies, education or clinical practice. Hence, choosing an appropriate critical appraisal tool is an important component of evidence-based practice.

Although the importance of critical appraisal tools has been acknowledged [ 1 , 3 – 5 ], there appears to be no consensus regarding the 'gold standard' tool for any medical evidence. In addition, consumers of research are faced with a large number of critical appraisal tools from which to choose. This is evidenced by the recent report by the Agency for Healthcare Research and Quality in which 93 critical appraisal tools for quantitative studies were identified [ 6 ]. Such choice may pose problems for research consumers, as dissimilar findings may well result when different critical appraisal tools are used to evaluate the same research report [ 6 ].

Critical appraisal tools can be broadly classified into those that are research design-specific and those that are generic. Design-specific tools contain items that address methodological issues that are unique to the research design [ 5 , 7 ]. This, however, precludes comparison of the quality of different study designs [ 8 ]. To overcome this limitation, generic critical appraisal tools have been developed to enhance the ability of research consumers to synthesise evidence from a range of quantitative and/or qualitative study designs (for instance [ 9 ]). There is no evidence that generic critical appraisal tools and design-specific tools provide a comparative evaluation of research designs.

Moreover, there appears to be little consensus regarding the most appropriate items that should be contained within any critical appraisal tool. This paper is concerned primarily with critical appraisal tools that address the unique properties of allied health care and research [ 10 ]. This approach was taken because of the unique nature of allied health contacts with patients, and because evidence-based practice is an emerging area in allied health [ 10 ]. The availability of so many critical appraisal tools (for instance [ 6 ]) may well prove daunting for allied health practitioners who are learning to critically appraise research in their area of interest. For the purposes of this evaluation, allied health is defined as encompassing "...all occasions of service to non admitted patients where services are provided at units/clinics providing treatment/counseling to patients. These include units primarily concerned with physiotherapy, speech therapy, family planning, dietary advice, optometry, occupational therapy..." [ 11 ].

The unique nature of allied health practice needs to be considered in allied health research. Allied health research thus differs from most medical research, with respect to:

• the paradigm underpinning comprehensive and clinically-reasoned descriptions of diagnosis (including validity and reliability). An example of this is in research into low back pain, where instead of diagnosis being made on location and chronicity of pain (as is common) [ 12 ], it would be made on the spinal structure and the nature of the dysfunction underpinning the symptoms, which is arrived at by a staged and replicable clinical reasoning process [ 10 , 13 ].

• the frequent use of multiple interventions within the one contact with the patient (an occasion of service), each of which requires appropriate description in terms of relationship to the diagnosis, nature, intensity, frequency, type of instruction provided to the patient, and the order in which the interventions were applied [ 13 ]

• the timeframe and frequency of contact with the patient (as many allied health disciplines treat patients in episodes of care that contain multiple occasions of service, and which can span many weeks, or even years in the case of chronic problems [ 14 ])

• measures of outcome, including appropriate methods and timeframes of measuring change in impairment, function, disability and handicap that address the needs of different stakeholders (patients, therapists, funders etc) [ 10 , 12 , 13 ].

Search strategy

In supplementary data [see additional file 1 ].

Data organization and extraction

Two independent researchers (PK, NMW) participated in all aspects of this review, and they compared and discussed their findings with respect to inclusion of critical appraisal tools, their intent, components, data extraction and item classification, construction and psychometric properties. Disagreements were resolved by discussion with a third member of the team (KG).

Data extraction consisted of a four-stage process. First, duplicate critical appraisal tools were identified and removed prior to analysis. Second, the remaining critical appraisal tools were classified according to the study design for which they were intended to be used [ 1 , 2 ]. Third, the scientific manner in which the tools had been constructed was classified according to whether an empirical research approach had been used and, if so, which type of research had been undertaken. Finally, the items contained in each critical appraisal tool were extracted and classified into one of eleven groups, which were based on the criteria described by Clarke and Oxman [ 4 ] as:

• Study aims and justification;

• Methodology used, which encompassed method of identification of relevant studies and adherence to study protocol;

• Sample selection, which ranged from inclusion and exclusion criteria, to homogeneity of groups;

• Method of randomization and allocation blinding;

• Attrition: response and drop out rates;

• Blinding of the clinician, assessor, patient and statistician as well as the method of blinding;

• Outcome measure characteristics;

• Intervention or exposure details;

• Method of data analyses;

• Potential sources of bias; and

• Issues of external validity, which ranged from application of evidence to other settings to the relationship between benefits, cost and harm.

An additional group, "miscellaneous", was used to describe items that could not be classified into any of the groups listed above.

Data synthesis

Data were synthesized using MS Excel spreadsheets and in narrative format, by describing the number of critical appraisal tools per study design and the type of items they contained. Descriptions were made of the method by which the overall quality of the study was determined, evidence regarding the psychometric properties of the tools (validity and reliability) and whether guidelines were provided for use of the critical appraisal tool.

One hundred and ninety-three research reports that potentially provided a description of a critical appraisal tool (or process) were identified from the search strategy. Fifty-six of these papers were unavailable for review due to outdated Internet links, or inability to source the relevant journal through Australian university and Government library databases. Of the 127 papers retrieved, 19 were excluded from this review, as they did not provide a description of the critical appraisal tool used, or were published in languages other than English. As a result, 108 papers were reviewed, which yielded 121 different critical appraisal tools [ 1 – 5 , 7 , 9 , 15 – 102 , 116 ].

Empirical basis for tool construction

We identified 14 instruments (12% of all tools) which were reported as having been constructed using a specified empirical approach [ 20 , 29 , 30 , 32 , 35 , 40 , 49 , 51 , 70 – 72 , 79 , 103 , 116 ]. The empirical research reflected descriptive and/or qualitative approaches, these being critical review of existing tools [ 40 , 72 ], Delphi techniques to identify then refine data items [ 32 , 51 , 71 ], questionnaires and other forms of written surveys to identify and refine data items [ 70 , 79 , 103 ], facilitated structured consensus meetings [ 20 , 29 , 30 , 35 , 40 , 49 , 70 , 72 , 79 , 116 ], and pilot validation testing [ 20 , 40 , 72 , 103 , 116 ]. In all the studies which reported developing critical appraisal tools using a consensus approach, a range of stakeholder input was sought, reflecting researchers and clinicians in a range of health disciplines, students, educators and consumers. There were a further 31 papers which cited other studies as the source of the tool used in the review, but which provided no information on why individual items had been chosen, or whether (or how) they had been modified. Moreover, for 21 of these tools, the cited sources of the critical appraisal tool did not report the empirical basis on which the tool had been constructed.

Critical appraisal tools per study design

Seventy-eight percent (N = 94) of the critical appraisal tools were developed for use on primary research [ 1 – 5 , 7 , 9 , 18 , 19 , 25 – 27 , 34 , 37 – 41 ], while the remainder (N = 26) were for secondary research (systematic reviews and meta-analyses) [ 2 – 5 , 15 – 36 , 116 ]. Eighty-seven percent (N = 104) of all critical appraisal tools were design-specific [ 2 – 5 , 7 , 9 , 15 – 90 ], with over one third (N = 45) developed for experimental studies (randomized controlled trials, clinical trials) [ 2 – 4 , 25 – 27 , 34 , 37 – 73 ]. Sixteen critical appraisal tools were generic. Of these, six were developed for use on both experimental and observational studies [ 9 , 91 – 95 ], whereas 11 were purported to be useful for any qualitative and quantitative research design [ 1 , 18 , 41 , 96 – 102 , 116 ] (see Figure 1 , Table 1 ).

Figure 1. Number of critical appraisal tools per study design [ 1 , 2 ]

Critical appraisal items

One thousand, four hundred and seventy five items were extracted from these critical appraisal tools. After grouping like items together, 173 different item types were identified, with the most frequently reported items being focused towards assessing the external validity of the study (N = 35) and method of data analyses (N = 28) (Table 2 ). The most frequently reported items across all critical appraisal tools were:

• Eligibility criteria (inclusion/exclusion criteria) (N = 63)

• Appropriate statistical analyses (N = 47)

• Random allocation of subjects (N = 43)

• Consideration of outcome measures used (N = 43)

• Sample size justification/power calculations (N = 39)

• Study design reported (N = 36)

• Assessor blinding (N = 36)

Design-specific critical appraisal tools

Systematic reviews

Eighty-seven different items were extracted from the 26 critical appraisal tools, which were designed to evaluate the quality of systematic reviews. These critical appraisal tools frequently contained items regarding data analyses and issues of external validity (Tables 2 and 3 ).

Items assessing data analyses focused on the methods used to summarize the results, assessment of sensitivity of results and whether heterogeneity was considered, whereas the nature of reporting of the main results, interpretation of them and their generalizability were frequently used to assess the external validity of the study findings. Moreover, systematic review critical appraisal tools tended to contain items such as identification of relevant studies, search strategy used, number of studies included and protocol adherence, that would not be relevant for other study designs. Blinding and randomisation procedures were rarely included in these critical appraisal tools.

Experimental studies

One hundred and twenty thirteen different items were extracted from the 45 experimental critical appraisal tools. These items most frequently assessed aspects of data analyses and blinding (Tables 1 and 2 ). Data analyses items were focused on whether appropriate statistical analysis was performed, whether a sample size justification or power calculation was provided and whether side effects of the intervention were recorded and analysed. Blinding was focused on whether the participant, clinician and assessor were blinded to the intervention.
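To make the "sample size justification or power calculation" item more concrete, here is a minimal sketch of one common calculation: the number of participants per group needed to compare two group means, using the normal approximation. The function name `sample_size_per_group`, the default significance level and power, and the example effect size are illustrative assumptions and do not come from any of the appraisal tools reviewed.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Participants per group for a two-sided comparison of two means.

    Normal-approximation formula: n = 2 * ((z_alpha + z_beta) / d) ** 2,
    where d is the standardised difference between groups (assumed > 0).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = standard_normal.inv_cdf(power)           # desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical trial expecting a medium standardised effect of 0.5.
n_per_group = sample_size_per_group(0.5)
print(f"Participants needed per group: {n_per_group}")  # 63 under these assumptions
```

An appraiser would compare a figure like this with the number of participants actually analysed, after attrition, to judge whether the study was adequately powered.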

Diagnostic studies

Forty-seven different items were extracted from the seven diagnostic critical appraisal tools. These items frequently addressed issues involving data analyses, external validity of results and sample selection that were specific to diagnostic studies (whether the diagnostic criteria were defined, definition of the "gold" standard, the calculation of sensitivity and specificity) (Tables 1 and 2 ).
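The calculation of sensitivity and specificity mentioned above can be sketched from a two-by-two table of index test results against the reference ("gold") standard. The function name `diagnostic_accuracy` and the counts below are hypothetical and serve only to illustrate the two definitions.

```python
def diagnostic_accuracy(true_pos, false_pos, false_neg, true_neg):
    """Return (sensitivity, specificity) from a 2x2 diagnostic table.

    Sensitivity: proportion of people with the condition who test positive.
    Specificity: proportion of people without the condition who test negative.
    """
    diseased = true_pos + false_neg
    healthy = true_neg + false_pos
    if diseased == 0 or healthy == 0:
        raise ValueError("need at least one person with and one without the condition")
    return true_pos / diseased, true_neg / healthy


# Hypothetical study: 100 people with the condition, 100 without.
sensitivity, specificity = diagnostic_accuracy(
    true_pos=90, false_pos=20, false_neg=10, true_neg=80
)
print(f"Sensitivity: {sensitivity:.2f}")  # 0.90
print(f"Specificity: {specificity:.2f}")  # 0.80
```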

Observational studies

Seventy-four different items were extracted from the 19 critical appraisal tools for observational studies. These items primarily focused on aspects of data analyses (see Tables 1 and 2 ), such as whether confounders were considered in the analysis, whether a sample size justification or power calculation was provided and whether appropriate statistical analyses were performed.

Qualitative studies

Thirty-six different items were extracted from the seven qualitative study critical appraisal tools. The majority of these items assessed issues regarding external validity, methods of data analyses and the aims and justification of the study (Tables 1 and 2 ). Specifically, items focused on whether the study question was clearly stated, whether data analyses were clearly described and appropriate, and application of the study findings to the clinical setting. Qualitative critical appraisal tools did not contain items regarding sample selection, randomization, blinding, intervention or bias, perhaps because these issues are not relevant to the qualitative paradigm.

Generic critical appraisal tools

Experimental and observational studies

Forty-two different items were extracted from the six critical appraisal tools that could be used to evaluate experimental and observational studies. These tools most frequently contained items that addressed aspects of sample selection (such as inclusion/exclusion criteria of participants, homogeneity of participants at baseline) and data analyses (such as whether appropriate statistical analyses were performed, whether a justification of the sample size or power calculation was provided).

All study designs

Seventy-eight different items were contained in the ten critical appraisal tools that could be used for all study designs (quantitative and qualitative). The majority of these items focused on whether appropriate data analyses were undertaken (such as whether confounders were considered in the analysis, whether a sample size justification or power calculation was provided and whether appropriate statistical analyses were performed) and external validity issues (generalization of results to the population, value of the research findings) (see Tables 1 and 2 ).

Allied health critical appraisal tools

We found no critical appraisal instrument specific to allied health research, despite finding at least seven critical appraisal instruments associated with allied health topics (mostly physiotherapy management of orthopedic conditions) [ 37 , 39 , 52 , 58 , 59 , 65 ]. One critical appraisal development group proposed two instruments [ 9 ], specific to quantitative and qualitative research respectively. The core elements of allied health research quality (specific diagnosis criteria, intervention descriptions, nature of patient contact and appropriate outcome measures) were not addressed in any one tool sourced for this evaluation. We identified 152 different ways of considering quality reporting of outcome measures in the 121 critical appraisal tools, and 81 ways of considering description of interventions. Very few tools which were not specifically targeted to diagnostic studies (less than 10% of the remaining tools) addressed diagnostic criteria. The critical appraisal instrument that seemed most related to allied health research quality [ 39 ] sought comprehensive evaluation of elements of intervention and outcome; however, this instrument was relevant only to physiotherapeutic orthopedic experimental research.

Overall study quality

Forty-nine percent (N = 58) of critical appraisal tools summarised the results of the quality appraisal into a single numeric summary score [ 5 , 7 , 15 – 25 , 37 – 59 , 74 – 77 , 80 – 83 , 87 , 91 – 93 , 96 , 97 ] (Figure 2 ). This was achieved by one of two methods:

• An equal weighting system, where one point was allocated to each item fulfilled; or

• A weighted system, where fulfilled items were allocated various points depending on their perceived importance.

Figure 2. Number of critical appraisal tools with, and without, summary quality scores
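The two scoring approaches can be sketched as follows. The item names, the weights and the two appraised studies are hypothetical and are not drawn from any tool in the review; the sketch only shows how the choice of scoring system can change the way the same appraisal results are summarised.

```python
def equal_weight_score(fulfilled):
    """One point for each appraisal item the study fulfils."""
    return sum(1 for met in fulfilled.values() if met)


def weighted_score(fulfilled, weights):
    """Points for each fulfilled item depend on its perceived importance."""
    return sum(weights[item] for item, met in fulfilled.items() if met)


# Hypothetical weights reflecting one tool developer's view of importance.
weights = {
    "random_allocation": 3,
    "assessor_blinding": 2,
    "power_calculation": 1,
    "eligibility_criteria": 1,
}

# Two hypothetical studies, each fulfilling two of the four items.
study_a = {
    "random_allocation": True,
    "assessor_blinding": True,
    "power_calculation": False,
    "eligibility_criteria": False,
}
study_b = {
    "random_allocation": False,
    "assessor_blinding": False,
    "power_calculation": True,
    "eligibility_criteria": True,
}

for name, study in (("Study A", study_a), ("Study B", study_b)):
    print(name, equal_weight_score(study), weighted_score(study, weights))
```

Under equal weighting the two studies tie at 2 points each, whereas under these assumed weights Study A scores 5 and Study B scores 2, so the ranking depends entirely on a weighting choice.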

However, there was no justification provided for any of the scoring systems used. In the remaining critical appraisal tools (N = 62), a single numerical summary score was not provided [ 1 – 4 , 9 , 25 – 36 , 60 – 73 , 78 , 79 , 84 – 90 , 94 , 95 , 98 – 102 ]. This left the research consumer to summarize the results of the appraisal in a narrative manner, without the assistance of a standard approach.

Psychometric properties of critical appraisal tools

Few critical appraisal tools had documented evidence of their validity and reliability. Face validity was established in nine critical appraisal tools, seven of which were developed for use on experimental studies [ 38 , 40 , 45 , 49 , 51 , 63 , 70 ] and two for systematic reviews [ 32 , 103 ]. Intra-rater reliability was established for only one critical appraisal tool as part of its empirical development process [ 40 ], whereas inter-rater reliability was reported for two systematic review tools [ 20 , 36 ] (for one of these as part of the developmental process [ 20 ]) and seven experimental critical appraisal tools [ 38 , 40 , 45 , 51 , 55 , 56 , 63 ] (for two of these as part of the developmental process [ 40 , 51 ]).
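Inter-rater reliability is often summarised with a chance-corrected agreement statistic. The sketch below uses Cohen's kappa for two raters as one common choice; the review does not state which statistic the cited tools reported, so this is an assumption for illustration. The function name `cohens_kappa` and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical example: two appraisers judging whether four studies
# fulfil the "assessor blinding" item of an appraisal tool.
appraiser_1 = ["yes", "yes", "no", "no"]
appraiser_2 = ["yes", "yes", "no", "yes"]
print(f"Cohen's kappa: {cohens_kappa(appraiser_1, appraiser_2):.2f}")  # 0.50
```

The same function applied to one appraiser's ratings on two separate occasions would give a simple measure of intra-rater reliability.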

Critical appraisal tool guidelines

Forty-three percent (N = 52) of critical appraisal tools had guidelines that informed the user of the interpretation of each item contained within them (Table 2 ). These guidelines were most frequently in the form of a handbook or published paper (N = 31) [ 2 , 4 , 9 , 15 , 20 , 25 , 28 , 29 , 31 , 36 , 37 , 41 , 50 , 64 – 67 , 69 , 80 , 84 – 87 , 89 , 90 , 95 , 100 , 116 ], whereas in 14 critical appraisal tools explanations accompanied each item [ 16 , 26 , 27 , 40 , 49 , 51 , 57 , 59 , 79 , 83 , 91 , 102 ].

Our search strategy identified a large number of published critical appraisal tools that are currently available to critically appraise research reports. There was a distinct lack of information on tool development processes in most cases. Many of the tools were reported to be modifications of other published tools, or reflected specialty concerns in specific clinical or research areas, without attempts to justify inclusion criteria. Fewer than 10 of these tools were relevant to evaluation of the quality of allied health research, and none of these were based on an empirical research approach. We are concerned that although our search was systematic and extensive [ 104 , 105 ], our broad key words and our lack of ready access to 29% of potentially useful papers (N = 56) may have prevented us from identifying all published critical appraisal tools. However, consumers of research seeking critical appraisal instruments are not likely to seek instruments from outdated Internet links and unobtainable journals; we therefore believe that we identified the most readily available instruments. Despite the limitations on sourcing all possible tools, this paper presents a useful synthesis of the readily available critical appraisal tools.

The majority of the critical appraisal tools were developed for a specific research design (87%), with most designed for use on experimental studies (38% of all critical appraisal tools sourced). This finding is not surprising as, according to the medical model, experimental studies sit at or near the top of the hierarchy of evidence [ 2 , 8 ]. In recent years, allied health researchers have strived to apply the medical model of research to their own discipline by conducting experimental research, often by using the randomized controlled trial design [ 106 ]. This trend may be the reason for the development of experimental critical appraisal tools reported in allied health-specific research topics [ 37 , 39 , 52 , 58 , 59 , 65 ].

We also found a considerable number of critical appraisal tools for systematic reviews (N = 26), which reflects the trend to synthesize research evidence to make it relevant for clinicians [ 105 , 107 ]. Systematic review critical appraisal tools contained unique items (such as identification of relevant studies, search strategy used, number of studies included, protocol adherence) compared with tools used for primary studies, a reflection of the secondary nature of data synthesis and analysis.

In contrast, we identified very few qualitative study critical appraisal tools, despite the presence of many journal-specific guidelines that outline important methodological aspects required in a manuscript submitted for publication [ 108 – 110 ]. This finding may reflect the more traditional, quantitative focus of allied health research [ 111 ]. Alternatively, qualitative researchers may view the robustness of their research findings in different terms compared with quantitative researchers [ 112 , 113 ]. Hence the use of critical appraisal tools may be less appropriate for the qualitative paradigm. This requires further consideration.

Of the small number of generic critical appraisal tools, we found few that could be usefully applied (to any health research, and specifically to the allied health literature), because of the generalist nature of their items, variable interpretation (and applicability) of items across research designs, and/or lack of summary scores. Whilst these types of tools potentially facilitate the synthesis of evidence across allied health research designs for clinicians, their lack of specificity in asking the 'hard' questions about research quality related to research design also potentially precludes their adoption for allied health evidence-based practice. At present, the gold standard study design when synthesizing evidence is the randomized controlled trial [ 4 ], which underpins our finding that experimental critical appraisal tools predominated in the allied health literature [ 37 , 39 , 52 , 58 , 59 , 65 ]. However, as more systematic literature reviews are undertaken on allied health topics, it may become more accepted that evidence in the form of other research design types requires acknowledgement, evaluation and synthesis. This may result in the development of more appropriate and clinically useful allied health critical appraisal tools.

A major finding of our study was the volume and variation in available critical appraisal tools. We found no gold standard critical appraisal tool for any type of study design. Therefore, consumers of research are faced with frustrating decisions when attempting to select the most appropriate tool for their needs. Variable quality evaluations may be produced when different critical appraisal tools are used on the same literature [ 6 ]. Thus, interpretation of critical analysis must be carefully considered in light of the critical appraisal tool used.

The variability in the content of critical appraisal tools could be accounted for by the lack of any empirical basis for tool construction, the lack of established validity of item construction, and the lack of a gold standard against which to compare new critical appraisal tools. As such, consumers of research cannot be certain that the content of published critical appraisal tools reflects the most important aspects of the quality of the studies that they assess [ 114 ]. Moreover, there was little evidence of intra- or inter-rater reliability of the critical appraisal tools. Coupled with the lack of protocols for use, this may mean that critical appraisers could interpret instrument items in different ways over repeated occasions of use. This may produce variable results [123].
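To make the idea of inter-rater reliability concrete, the following is a minimal sketch of Cohen's kappa, one common chance-corrected agreement statistic for two appraisers. It is offered as an illustration only: the source does not say which reliability statistic, if any, the reviewed tools reported, and the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two appraisers.

    Each argument is a list of categorical ratings (for example "yes"/"no"
    answers to one checklist item), one entry per appraised study.
    This is an illustrative helper, not part of any published tool.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of studies")

    n = len(rater_a)
    # Proportion of studies on which the two appraisers gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each appraiser's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both appraisers used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four studies on one checklist item.
appraiser_1 = ["yes", "yes", "no", "no"]
appraiser_2 = ["yes", "no", "no", "no"]
kappa = cohen_kappa(appraiser_1, appraiser_2)
```

A kappa near 1 suggests the two appraisers interpret the item in the same way, while a value near 0 suggests agreement no better than chance, which is the kind of variability the paragraph above warns about.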

Based on the findings of this evaluation, we recommend that consumers of research should carefully select critical appraisal tools for their needs. The selected tools should have published evidence of the empirical basis for their construction, validity of items and reliability of interpretation, as well as guidelines for use, so that the tools can be applied and interpreted in a standardized manner. Our findings highlight the need for consensus to be reached regarding the important and core items for critical appraisal tools that will produce a more standardized environment for critical appraisal of research evidence. As a consequence, allied health research will specifically benefit from having critical appraisal tools that reflect best practice research approaches which embed specific research requirements of allied health disciplines.

National Health and Medical Research Council: How to Review the Evidence: Systematic Identification and Review of the Scientific Literature. Canberra. 2000

National Health and Medical Research Council: How to Use the Evidence: Assessment and Application of Scientific Evidence. Canberra. 2000

Joanna Briggs Institute. [ http://www.joannabriggs.edu.au ]

Clarke M, Oxman AD: Cochrane Reviewer's Handbook 4.2.0. 2003, Oxford: The Cochrane Collaboration

Crombie IK: The Pocket Guide to Critical Appraisal: A Handbook for Health Care Professionals. 1996, London: BMJ Publishing Group

Agency for Healthcare Research and Quality: Systems to Rate the Strength of Scientific Evidence. Evidence Report/Technology Assessment No. 47, Publication No. 02-E016. Rockville. 2002

Elwood JM: Critical Appraisal of Epidemiological Studies and Clinical Trials. 1998, Oxford: Oxford University Press, 2

Sackett DL, Richardson WS, Rosenberg W, Haynes RB: Evidence Based Medicine. How to Practice and Teach EBM. 2000, London: Churchill Livingstone

Critical literature reviews. [ http://www.cotfcanada.org/cotf_critical.htm ]

Bialocerkowski AE, Grimmer KA, Milanese SF, Kumar S: Application of current research evidence to clinical physiotherapy practice. J Allied Health Res Dec.

The National Health Data Dictionary – Version 10. http://www.aihw.gov.au/publications/hwi/nhdd12/nhdd12-v1.pdf and http://www.aihw.gov.au/publications/hwi/nhdd12/nhdd12-v2.pdf

Grimmer K, Bowman P, Roper J: Episodes of allied health outpatient care: an investigation of service delivery in acute public hospital settings. Disability and Rehabilitation. 2000, 22 (1/2): 80-87.

Grimmer K, Milanese S, Bialocerkowski A: Clinical guidelines for low back pain: A physiotherapy perspective. Physiotherapy Canada. 2003, 55 (4): 1-9.

Grimmer KA, Milanese S, Bialocerkowski AE, Kumar S: Producing and implementing evidence in clinical practice: the therapies' dilemma. Physiotherapy. 2004,

Greenhalgh T: How to read a paper: papers that summarize other papers (systematic reviews and meta-analysis). BMJ. 1997, 315: 672-675.

Auperin A, Pignon J, Poynard T: Review article: critical review of meta-analysis of randomised clinical trials in hepatogastroenterology. Alimentary Pharmacol Therapeutics. 1997, 11: 215-225. 10.1046/j.1365-2036.1997.131302000.x.

Barnes DE, Bero LA: Why review articles on the health effects of passive smoking reach different conclusions. J Am Med Assoc. 1998, 279: 1566-1570. 10.1001/jama.279.19.1566.

Beck CT: Use of meta-analysis as a teaching strategy in nursing research courses. J Nurs Educat. 1997, 36: 87-90.

Carruthers SG, Larochelle P, Haynes RB, Petrasovits A, Schiffrin EL: Report of the Canadian Hypertension Society Consensus Conference: 1. Introduction. Can Med Assoc J. 1993, 149: 289-293.

Oxman AD, Guyatt GH, Singer J, Goldsmith CH, Hutchinson BG, Milner RA, Streiner DL: Agreement among reviewers of review articles. J Clin Epidemiol. 1991, 44: 91-98. 10.1016/0895-4356(91)90205-N.

Sacks HS, Reitman D, Pagano D, Kupelnick B: Meta-analysis: an update. Mount Sinai Journal of Medicine. 1996, 63: 216-224.

Smith AF: An analysis of review articles published in four anaesthesia journals. Can J Anaesth. 1997, 44: 405-409.

L'Abbe KA, Detsky AS, O'Rourke K: Meta-analysis in clinical research. Ann Intern Med. 1987, 107: 224-233.

Mulrow CD, Antonio S: The medical review article: state of the science. Ann Intern Med. 1987, 106: 485-488.

Continuing Professional Development: A Manual for SIGN Guideline Developers. [ http://www.sign.ac.uk ]

Learning and Development Public Health Resources Unit. [ http://www.phru.nhs.uk/ ]

FOCUS Critical Appraisal Tool. [ http://www.focusproject.org.uk ]

Cook DJ, Sackett DL, Spitzer WO: Methodologic guidelines for systematic reviews of randomized control trials in health care from the Potsdam Consultation on meta-analysis. J Clin Epidemiol. 1995, 48: 167-171. 10.1016/0895-4356(94)00172-M.

Cranney A, Tugwell P, Shea B, Wells G: Implications of OMERACT outcomes in arthritis and osteoporosis for Cochrane metaanalysis. J Rheumatol. 1997, 24: 1206-1207.

Guyatt GH, Sackett DL, Sinclair JC, Hayward R, Cook DJ, Cook RJ: Users' guides to the medical literature. IX. A method for grading health care recommendations. J Am Med Assoc. 1995, 274: 1800-1804. 10.1001/jama.274.22.1800.

Gyorkos TW, Tannenbaum TN, Abrahamowicz M, Oxman AD, Scott EAF, Milson ME, Rasooli Iris, Frank JW, Riben PD, Mathias RG: An approach to the development of practice guidelines for community health interventions. Can J Public Health. 1994, 85: S8-13.

Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF: Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of reporting of meta-analyses. Lancet. 1999, 354: 1896-1900. 10.1016/S0140-6736(99)04149-5.

Oxman AD, Cook DJ, Guyatt GH: Users' guides to the medical literature. VI. How to use an overview. Evidence-Based Medicine Working Group. J Am Med Assoc. 1994, 272: 1367-1371. 10.1001/jama.272.17.1367.

Pogue J, Yusuf S: Overcoming the limitations of current meta-analysis of randomised controlled trials. Lancet. 1998, 351: 47-52. 10.1016/S0140-6736(97)08461-4.

Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, Moher D, Becker BJ, Sipe TA, Thacker SB: Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis of observational studies in epidemiology (MOOSE) group. J Am Med Assoc. 2000, 283: 2008-2012. 10.1001/jama.283.15.2008.

Irwig L, Tosteson AN, Gatsonis C, Lau J, Colditz G, Chalmers TC, Mosteller F: Guidelines for meta-analyses evaluating diagnostic tests. Ann Intern Med. 1994, 120: 667-676.

Moseley AM, Herbert RD, Sherrington C, Maher CG: Evidence for physiotherapy practice: A survey of the Physiotherapy Evidence Database. Physiotherapy Evidence Database (PEDro). Australian Journal of Physiotherapy. 2002, 48: 43-50.

Cho MK, Bero LA: Instruments for assessing the quality of drug studies published in the medical literature. J Am Med Assoc. 1994, 272: 101-104. 10.1001/jama.272.2.101.

De Vet HCW, De Bie RA, Van der Heijden GJ, Verhagen AP, Sijpkes P, Knipschild PG: Systematic reviews on the basis of methodological criteria. Physiotherapy. 1997, 83: 284-289.

Downs SH, Black N: The feasibility of creating a checklist for the assessment of the methodological quality both of randomised and non-randomised studies of health care interventions. J Epidemiol Community Health. 1998, 52: 377-384.

Evans M, Pollock AV: A score system for evaluating random control clinical trials of prophylaxis of abdominal surgical wound infection. Br J Surg. 1985, 72: 256-260.

Fahey T, Hyde C, Milne R, Thorogood M: The type and quality of randomized controlled trials (RCTs) published in UK public health journals. J Public Health Med. 1995, 17: 469-474.

Gotzsche PC: Methodology and overt and hidden bias in reports of 196 double-blind trials of nonsteroidal antiinflammatory drugs in rheumatoid arthritis. Control Clin Trials. 1989, 10: 31-56. 10.1016/0197-2456(89)90017-2.

Imperiale TF, McCullough AJ: Do corticosteroids reduce mortality from alcoholic hepatitis? A meta-analysis of the randomized trials. Ann Int Med. 1990, 113: 299-307.

Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJ, Gavaghan DJ, McQuay HJ: Assessing the quality of reports of randomized clinical trials: is blinding necessary?. Control Clin Trials. 1996, 17: 1-12. 10.1016/0197-2456(95)00134-4.

Khan KS, Daya S, Collins JA, Walter SD: Empirical evidence of bias in infertility research: overestimation of treatment effect in crossover trials using pregnancy as the outcome measure. Fertil Steril. 1996, 65: 939-945.

Kleijnen J, Knipschild P, ter Riet G: Clinical trials of homoeopathy. BMJ. 1991, 302: 316-323.

Liberati A, Himel HN, Chalmers TC: A quality assessment of randomized control trials of primary treatment of breast cancer. J Clin Oncol. 1986, 4: 942-951.

Moher D, Schulz KF, Altman DG, for the CONSORT Group: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. J Am Med Assoc. 2001, 285: 1987-1991. 10.1001/jama.285.15.1987.

Reisch JS, Tyson JE, Mize SG: Aid to the evaluation of therapeutic studies. Pediatrics. 1989, 84: 815-827.

Sindhu F, Carpenter L, Seers K: Development of a tool to rate the quality assessment of randomized controlled trials using a Delphi technique. J Advanced Nurs. 1997, 25: 1262-1268. 10.1046/j.1365-2648.1997.19970251262.x.

Van der Heijden GJ, Van der Windt DA, Kleijnen J, Koes BW, Bouter LM: Steroid injections for shoulder disorders: a systematic review of randomized clinical trials. Br J Gen Pract. 1996, 46: 309-316.

Van Tulder MW, Koes BW, Bouter LM: Conservative treatment of acute and chronic nonspecific low back pain. A systematic review of randomized controlled trials of the most common interventions. Spine. 1997, 22: 2128-2156. 10.1097/00007632-199709150-00012.

Garbutt JC, West SL, Carey TS, Lohr KN, Crews FT: Pharmacotherapy for Alcohol Dependence. Evidence Report/Technology Assessment No. 3, AHCPR Publication No. 99-E004. Rockville. 1999

Oremus M, Wolfson C, Perrault A, Demers L, Momoli F, Moride Y: Interrater reliability of the modified Jadad quality scale for systematic reviews of Alzheimer's disease drug trials. Dement Geriatr Cognit Disord. 2001, 12: 232-236. 10.1159/000051263.

Clark O, Castro AA, Filho JV, Djulbegovic B: Interrater agreement of Jadad's scale. Annual Cochrane Colloquium Abstracts. October 2001, Lyon, [ http://www.biomedcentral.com/abstracts/COCHRANE/1/op031 ]

Jonas W, Anderson RL, Crawford CC, Lyons JS: A systematic review of the quality of homeopathic clinical trials. BMC Alternative Medicine. 2001, 1: 12-10.1186/1472-6882-1-12.

Van Tulder M, Malmivaara A, Esmail R, Koes B: Exercise therapy for low back pain: a systematic review within the framework of the Cochrane Collaboration back review group. Spine. 2000, 25: 2784-2796. 10.1097/00007632-200011010-00011.

Van Tulder MW, Ostelo R, Vlaeyen JWS, Linton SJ, Morley SJ, Assendelft WJJ: Behavioral treatment for chronic low back pain: a systematic review within the framework of the cochrane back. Spine. 2000, 25: 2688-2699. 10.1097/00007632-200010150-00024.

Aronson N, Seidenfeld J, Samson DJ, Aronson N, Albertson PC, Bayoumi AM, Bennett C, Brown A, Garber ABA, Gere M, Hasselblad V, Wilt T, Ziegler MPHK, Pharm D: Relative Effectiveness and Cost Effectiveness of Methods of Androgen Suppression in the Treatment of Advanced Prostate Cancer. Evidence Report/Technology Assessment No. 4, AHCPR Publication No.99-E0012. Rockville. 1999

Chalmers TC, Smith H, Blackburn B, Silverman B, Schroeder B, Reitman D, Ambroz A: A method for assessing the quality of a randomized control trial. Control Clin Trials. 1981, 2: 31-49. 10.1016/0197-2456(81)90056-8.

der Simonian R, Charette LJ, McPeek B, Mosteller F: Reporting on methods in clinical trials. New Eng J Med. 1982, 306: 1332-1337.

Detsky AS, Naylor CD, O'Rourke K, McGeer AJ, L'Abbe KA: Incorporating variations in the quality of individual randomized trials into meta-analysis. J Clin Epidemiol. 1992, 45: 255-265. 10.1016/0895-4356(92)90085-2.

Goudas L, Carr DB, Bloch R, Balk E, Ioannidis JPA, Terrin MN: Management of Cancer Pain. Evidence Report/Technology Assessment No. 35 (Contract 290-97-0019 to the New England Medical Center), AHCPR Publication No. 99-E004. Rockville. 2000

Guyatt GH, Sackett DL, Cook DJ: Users' guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-Based Medicine Working Group. J Am Med Assoc. 1993, 270: 2598-2601. 10.1001/jama.270.21.2598.

Khan KS, Ter Riet G, Glanville J, Sowden AJ, Kleijnen J: Undertaking Systematic Reviews of Research on Effectiveness: Centre of Reviews and Dissemination's Guidance for Carrying Out or Commissioning Reviews: York. 2000

McNamara R, Bass EB, Marlene R, Miller J: Management of New Onset Atrial Fibrillation. Evidence Report/Technology Assessment No.12, AHRQ Publication No. 01-E026. Rockville. 2001

Prendiville W, Elbourne D, Chalmers I: The effects of routine oxytocic administration in the management of the third stage of labour: an overview of the evidence from controlled trials. Br J Obstet Gynaecol. 1988, 95: 3-16.

Schulz KF, Chalmers I, Hayes RJ, Altman DG: Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. J Am Med Assoc. 1995, 273: 408-412. 10.1001/jama.273.5.408.

The Standards of Reporting Trials Group: A proposal for structured reporting of randomized controlled trials. J Am Med Assoc. 1994, 272: 1926-1931. 10.1001/jama.272.24.1926.

Verhagen AP, de Vet HC, de Bie RA, Kessels AGH, Boers M, Bouter LM, Knipschild PG: The Delphi list: a criteria list for quality assessment of randomized clinical trials for conducting systematic reviews developed by Delphi consensus. J Clin Epidemiol. 1998, 51: 1235-1241. 10.1016/S0895-4356(98)00131-0.

Zaza S, Wright-De Aguero LK, Briss PA, Truman BI, Hopkins DP, Hennessy MH, Sosin DM, Anderson L, Carande-Kullis VG, Teutsch SM, Pappaioanou M: Data collection instrument and procedure for systematic reviews in the guide to community preventive services. Task force on community preventive services. Am J Prevent Med. 2000, 18: 44-74. 10.1016/S0749-3797(99)00122-1.

Haynes BB, Wilczynski N, McKibbon A, Walker CJ, Sinclair J: Developing optimal search strategies for detecting clinically sound studies in MEDLINE. J Am Informatics Assoc. 1994, 1: 447-458.

Greenhalgh T: How to read a paper: papers that report diagnostic or screening tests. BMJ. 1997, 315: 540-543.

Arroll B, Schechter MT, Sheps SB: The assessment of diagnostic tests: a comparison of medical literature in 1982 and 1985. J Gen Int Med. 1988, 3: 443-447.

Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, Bossuyt PM: Empirical evidence of design-related bias in studies of diagnostic tests. J Am Med Assoc. 1999, 282: 1061-1066. 10.1001/jama.282.11.1061.

Sheps SB, Schechter MT: The assessment of diagnostic tests. A survey of current medical research. J Am Med Assoc. 1984, 252: 2418-2422. 10.1001/jama.252.17.2418.

McCrory DC, Matchar DB, Bastian L, Dutta S, Hasselblad V, Hickey J, Myers MSE, Nanda K: Evaluation of Cervical Cytology. Evidence Report/Technology Assessment No. 5, AHCPR Publication No.99-E010. Rockville. 1999

Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D, DeVet HCW: Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Clin Chem. 2003, 49: 1-6. 10.1373/49.1.1.

Greenhalgh T: How to Read a Paper: Assessing the methodological quality of published papers. BMJ. 1997, 315: 305-308.

Angelillo I, Villari P: Residential exposure to electromagnetic fields and childhood leukaemia: a meta-analysis. Bull World Health Org. 1999, 77: 906-915.

Ariens G, Mechelen W, Bongers P, Bouter L, Van der Wal G: Physical risk factors for neck pain. Scand J Work Environ Health. 2000, 26: 7-19.

Hoogendoorn WE, van Poppel MN, Bongers PM, Koes BW, Bouter LM: Physical load during work and leisure time as risk factors for back pain. Scand J Work Environ Health. 1999, 25: 387-403.

Laupacis A, Wells G, Richardson WS, Tugwell P: Users' guides to the medical literature. V. How to use an article about prognosis. Evidence-Based Medicine Working Group. J Am Med Assoc. 1994, 272: 234-237. 10.1001/jama.272.3.234.

Levine M, Walter S, Lee H, Haines T, Holbrook A, Moyer V: Users' guides to the medical literature. IV. How to use an article about harm. Evidence-Based Medicine Working Group. J Am Med Assoc. 1994, 271: 1615-1619. 10.1001/jama.271.20.1615.

Carey TS, Boden SD: A critical guide to case series reports. Spine. 2003, 28: 1631-1634. 10.1097/00007632-200308010-00001.

Greenhalgh T, Taylor R: How to read a paper: papers that go beyond numbers (qualitative research). BMJ. 1997, 315: 740-743.

Hoddinott P, Pill R: A review of recently published qualitative research in general practice. More methodological questions than answers?. Fam Pract. 1997, 14: 313-319. 10.1093/fampra/14.4.313.

Mays N, Pope C: Quality research in health care: Assessing quality in qualitative research. BMJ. 2000, 320: 50-52. 10.1136/bmj.320.7226.50.

Mays N, Pope C: Rigour and qualitative research. BMJ. 1995, 311: 109-112.

Colditz GA, Miller JN, Mosteller F: How study design affects outcomes in comparisons of therapy. I: Medical. Stats Med. 1989, 8: 441-454.

Turlik MA, Kushner D: Levels of evidence of articles in podiatric medical journals. J Am Pod Med Assoc. 2000, 90: 300-302.

Borghouts JAJ, Koes BW, Bouter LM: The clinical course and prognostic factors of non-specific neck pain: a systematic review. Pain. 1998, 77: 1-13. 10.1016/S0304-3959(98)00058-X.

Spitzer WO, Lawrence V, Dales R, Hill G, Archer MC, Clark P, Abenhaim L, Hardy J, Sampalis J, Pinfold SP, Morgan PP: Links between passive smoking and disease: a best-evidence synthesis. A report of the working group on passive smoking. Clin Invest Med. 1990, 13: 17-46.

Sutton AJ, Abrams KR, Jones DR, Sheldon TA, Song F: Systematic reviews of trials and other studies. Health Tech Assess. 1998, 2: 1-276.

Chestnut RM, Carney N, Maynard H, Patterson P, Mann NC, Helfand M: Rehabilitation for Traumatic Brain Injury. Evidence Report/Technology Assessment No. 2, Agency for Health Care Research and Quality Publication No. 99-E006. Rockville. 1999

Lohr KN, Carey TS: Assessing best evidence: issues in grading the quality of studies for systematic reviews. Joint Commission J Qual Improvement. 1999, 25: 470-479.

Greer N, Mosser G, Logan G, Halaas GW: A practical approach to evidence grading. Joint Commission J Qual Improvement. 2000, 26: 700-712.

Harris RP, Helfand M, Woolf SH, Lohr KN, Mulrow CD, Teutsch SM, Atkins D: Current methods of the U.S. Preventive Services Task Force: a review of the process. Am J Prevent Med. 2001, 20: 21-35. 10.1016/S0749-3797(01)00261-6.

Anonymous: How to read clinical journals: IV. To determine etiology or causation. Can Med Assoc J. 1981, 124: 985-990.

Whitten PS, Mair FS, Haycox A, May CR, Williams TL, Hellmich S: Systematic review of cost effectiveness studies of telemedicine interventions. BMJ. 2002, 324: 1434-1437. 10.1136/bmj.324.7351.1434.

Forrest JL, Miller SA: Evidence-based decision making in action: Part 2-evaluating and applying the clinical evidence. J Contemp Dental Pract. 2002, 4: 42-52.

Oxman AD, Guyatt GH: Validation of an index of the quality of review articles. J Clin Epidemiol. 1991, 44: 1271-1278. 10.1016/0895-4356(91)90160-B.

Jones T, Evans D: Conducting a systematic review. Aust Crit Care. 2000, 13: 66-71.

Papadopoulos M, Rheeder P: How to do a systematic literature review. South African J Physiother. 2000, 56: 3-6.

Selker LG: Clinical research in Allied Health. J Allied Health. 1994, 23: 201-228.

Stevens KR: Systematic reviews: the heart of evidence-based practice. AACN Clin Issues. 2001, 12: 529-538.

Devers KJ, Frankel RM: Getting qualitative research published. Ed Health. 2001, 14: 109-117. 10.1080/13576280010021888.

Canadian Journal of Public Health: Review guidelines for qualitative research papers submitted for consideration to the Canadian Journal of Public Health. Can J Pub Health. 2000, 91: I2-

Malterud K: Shared understanding of the qualitative research process: guidelines for the medical researcher. Fam Pract. 1993, 10: 201-206.

Higgs J, Titchen A: Research and knowledge. Physiotherapy. 1998, 84: 72-80.

Maggs-Rapport F: Best research practice: in pursuit of methodological rigour. J Advan Nurs. 2001, 35: 373-383. 10.1046/j.1365-2648.2001.01853.x.

Cutcliffe JR, McKenna HP: Establishing the credibility of qualitative research findings: the plot thickens. J Advan Nurs. 1999, 30: 374-380. 10.1046/j.1365-2648.1999.01090.x.

Andresen EM: Criteria for assessing the tools of disability outcomes research. Arch Phys Med Rehab. 2000, 81: S15-S20. 10.1053/apmr.2000.20619.

Beattie P: Measurement of health outcomes in the clinical setting: applications to physiotherapy. Physiother Theory Pract. 2001, 17: 173-185. 10.1080/095939801317077632.

Charnock DF, (Ed): The DISCERN Handbook: Quality criteria for consumer health information on treatment choices. 1998, Radcliffe Medical Press

Author information

Authors and Affiliations

Centre for Allied Health Evidence: A Collaborating Centre of the Joanna Briggs Institute, City East Campus, University of South Australia, North Terrace, Adelaide, 5000, Australia

Persis Katrak, Nicola Massy-Westropp, VS Saravana Kumar & Karen A Grimmer

School of Physiotherapy, The University of Melbourne, Melbourne, 3010, Australia

Andrea E Bialocerkowski

Corresponding author

Correspondence to Karen A Grimmer.

Additional information

Competing interests.

No competing interests.

Authors' contributions

PK: Sourced critical appraisal tools; categorized the content and psychometric properties of critical appraisal tools.

AEB: Synthesis of findings; drafted manuscript.

NMW: Sourced critical appraisal tools.

VSK: Sourced critical appraisal tools.

KAG: Study conception and design; assisted with critiquing critical appraisal tools and categorization of the content and psychometric properties of critical appraisal tools; drafted and reviewed manuscript; addressed reviewer's comments and re-submitted the article.

CASP Checklists

Critical appraisal tools and resources

CASP has produced simple critical appraisal checklists for the key study designs. These are not meant to replace considered thought and judgement when reading a paper but are for use as a guide and aide memoire. All CASP checklists cover three main areas: validity , results and clinical relevance.

What is Critical Appraisal?

Critical Appraisal is the process of carefully and systematically examining research to judge its trustworthiness, and its value and relevance in a particular context. It is an essential skill for evidence-based medicine because it allows people to find and use research evidence reliably and efficiently.

A complete list (published & unpublished) of articles and research papers about CASP and other critical appraisal tools and approaches, covering from 1993-2012.

Critical Appraisal of Studies

Critical appraisal is the process of carefully and systematically examining research to judge its trustworthiness, and its value/relevance in a particular context by providing a framework to evaluate the research. During the critical appraisal process, researchers can:

  • Decide whether studies have been undertaken in a way that makes their findings reliable as well as valid and unbiased
  • Make sense of the results
  • Know what these results mean in the context of the decision they are making
  • Determine if the results are relevant to their patients/schoolwork/research

Burls, A. (2009). What is critical appraisal? In What Is This Series: Evidence-based medicine. Available online at  What is Critical Appraisal?

Critical appraisal is part of the process of writing high-quality reviews, such as systematic and integrative reviews, and of evaluating evidence from RCTs and other study designs. For more information on systematic reviews, check out our Systematic Review guide.

Critical Appraisal tools

Critical appraisal worksheets to help you appraise the reliability, importance and applicability of clinical evidence.

Critical appraisal is the systematic evaluation of clinical research papers in order to establish:

  • Does this study address a clearly focused question?
  • Did the study use valid methods to address this question?
  • Are the valid results of this study important?
  • Are these valid, important results applicable to my patient or population?

If the answer to any of these questions is “no”, you can save yourself the trouble of reading the rest of it.
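The stop-at-the-first-"no" logic of the four questions above can be sketched as a short screening function. This is only an illustration of the decision order: the function name and the boolean answer format are hypothetical and are not part of the published worksheets.

```python
# The four screening questions, in the order given in the text above.
SCREENING_QUESTIONS = [
    "Does this study address a clearly focused question?",
    "Did the study use valid methods to address this question?",
    "Are the valid results of this study important?",
    "Are these valid, important results applicable to my patient or population?",
]


def first_failed_question(answers):
    """Return the first question answered "no", or None if all are "yes".

    `answers` is a list of four booleans (True for "yes"), one per question,
    in the same order as SCREENING_QUESTIONS. Hypothetical helper for
    illustration only.
    """
    if len(answers) != len(SCREENING_QUESTIONS):
        raise ValueError("Expected one answer per screening question")
    for question, answer in zip(SCREENING_QUESTIONS, answers):
        if not answer:
            # A single "no" is enough to stop appraising this paper.
            return question
    return None


# Hypothetical paper: focused question, but the methods were judged not valid.
stopped_at = first_failed_question([True, False, True, True])
```

Because each later question presupposes a "yes" to the earlier ones, checking them in order and stopping early mirrors how the text suggests a reader can save effort.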

This section contains useful tools and downloads for the critical appraisal of different types of medical evidence. Example appraisal sheets are provided together with several helpful examples.

Critical Appraisal Worksheets

  • Systematic Reviews  Critical Appraisal Sheet
  • Diagnostics  Critical Appraisal Sheet
  • Prognosis  Critical Appraisal Sheet
  • Randomised Controlled Trials  (RCT) Critical Appraisal Sheet
  • Critical Appraisal of Qualitative Studies  Sheet
  • IPD Review  Sheet

Chinese - translated by Chung-Han Yang and Shih-Chieh Shao

  • Systematic Reviews  Critical Appraisal Sheet
  • Diagnostic Study  Critical Appraisal Sheet
  • Prognostic Critical Appraisal Sheet
  • RCT  Critical Appraisal Sheet
  • IPD reviews Critical Appraisal Sheet
  • Qualitative Studies Critical Appraisal Sheet 

German - translated by Johannes Pohl and Martin Sadilek

  • Systematic Review  Critical Appraisal Sheet
  • Diagnosis Critical Appraisal Sheet
  • Prognosis Critical Appraisal Sheet
  • Therapy / RCT Critical Appraisal Sheet

Lithuanian - translated by Tumas Beinortas

  • Systematic review appraisal Lithuanian (PDF)
  • Diagnostic accuracy appraisal Lithuanian  (PDF)
  • Prognostic study appraisal Lithuanian  (PDF)
  • RCT appraisal sheets Lithuanian  (PDF)

Portuguese - translated by Enderson Miranda, Rachel Riera and Luis Eduardo Fontes

  • Portuguese – Systematic Review Study Appraisal Worksheet
  • Portuguese – Diagnostic Study Appraisal Worksheet
  • Portuguese – Prognostic Study Appraisal Worksheet
  • Portuguese – RCT Study Appraisal Worksheet
  • Portuguese – Systematic Review Evaluation of Individual Participant Data Worksheet
  • Portuguese – Qualitative Studies Evaluation Worksheet

Spanish - translated by Ana Cristina Castro

  • Systematic Review  (PDF)
  • Diagnosis  (PDF)
  • Prognosis  Spanish Translation (PDF)
  • Therapy / RCT  Spanish Translation (PDF)

Persian - translated by Ahmad Sofi Mahmudi

  • Prognosis  (PDF)
  • PICO  Critical Appraisal Sheet (PDF)
  • PICO Critical Appraisal Sheet (MS-Word)
  • Educational Prescription  Critical Appraisal Sheet (PDF)

Explanations & Examples

  • Pre-test probability
  • SpPin and SnNout
  • Likelihood Ratios

Revising the JBI quantitative critical appraisal tools to improve their applicability: an overview of methods and the development process


  1 JBI, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, Australia.
  2 Queen's Collaboration for Health Care Quality, Queen's University, Kingston, ON, Canada.
  3 Czech National Centre for Evidence-Based Healthcare and Knowledge Translation (Cochrane Czech Republic; The Czech Republic [Middle European] Centre for Evidence-Based Healthcare: A JBI Centre of Excellence; Masaryk University GRADE Centre), Faculty of Medicine, Institute of Biostatistics and Analyses, Masaryk University, Brno, Czech Republic.
  4 The Nottingham Centre for Evidence-Based Healthcare: A JBI Centre of Excellence, School of Medicine, University of Nottingham, Nottingham, UK.
  5 Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia.
  • PMID: 36121230
  • DOI: 10.11124/JBIES-22-00125

JBI offers a suite of critical appraisal instruments that are freely available to systematic reviewers and researchers investigating the methodological limitations of primary research studies. The JBI instruments are designed to be study-specific and are presented as questions in a checklist. The JBI instruments have existed in a checklist-style format for approximately 20 years; however, as the field of research synthesis expands, many of the tools offered by JBI have become outdated. The JBI critical appraisal tools for quantitative studies (eg, randomized controlled trials, quasi-experimental studies) must be updated to reflect the current methodologies in this field. Cognizant of this and the recent developments in risk-of-bias science, the JBI Effectiveness Methodology Group was tasked with updating the current quantitative critical appraisal instruments. This paper details the methods and rationale that the JBI Effectiveness Methodology Group followed when updating the JBI critical appraisal instruments for quantitative study designs. We detail the key changes made to the tools and highlight how these changes reflect current methodological developments in this field.

Copyright © 2023 JBI.

  • Porritt K, Gomersall J, Lockwood C. JBI's systematic reviews: study selection and critical appraisal. Am J Nurs 2014;114(6):47–52.
  • JBI. Critical appraisal tools [internet]. Adelaide, JBI; n.d. [cited 2022 Nov 29]. Available from: https://jbi.global/critical-appraisal-tools .
  • Aromataris E, Munn Z. Aromataris E, Munn Z Chapter 1: JBI Systematic Reviews JBI Manual for Evidence Synthesis [internet]. Adelaide, JBI; 2020 [cited 2022 Nov 29]. Available from: https://synthesismanual.jbi.global .
  • Tufanaru C, Munn Z, Aromataris E, Campbell J, Hopp L Aromataris E, Munn Z. Chapter 3: Systematic reviews of effectiveness JBI Manual for Evidence Synthesis [internet]. Adelaide, JBI; 2020 [cited 2022 Nov 29]. Available from: https://synthesismanual.jbi.global .
  • Munn Z, Barker TH, Moola S, Tufanaru C, Stern C, McArthur A, et al. Methodological quality of case series studies: an introduction to the JBI critical appraisal tool. JBI Evid Synth 2020;18(10):2127–2133.

Critical Appraisal

Use this guide to find information resources about critical appraisal including checklists, books and journal articles.

Key Resources

  • This online resource explains the sections commonly used in research articles. Understanding how research articles are organised can make reading and evaluating them easier.
  • Critical appraisal checklists
  • Worksheets for appraising systematic reviews, diagnostics, prognostics and RCTs.
  • A free online resource for both healthcare staff and patients; four modules of 30–45 minutes provide an introduction to evidence based medicine, clinical trials and Cochrane Evidence.
  • This tool will guide you through a series of questions to help you to review and interpret a published health research paper.
  • The PRISMA flow diagram depicts the flow of information through the different phases of a literature review. It maps out the number of records identified, included and excluded, and the reasons for exclusions.
  • A useful resource for methods and evidence in applied social science.
  • A comprehensive database of reporting guidelines. Covers all the main study types.
  • A tool to assess the methodological quality of systematic reviews.

Journal articles

Shea BJ and others (2017) AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions or both, British Medical Journal, 358.

  • An outline of AMSTAR 2 and its use as a critical appraisal tool for systematic reviews.

  1. JBI Critical Appraisal Tools

    JBI's Evidence Synthesis Critical Appraisal Tools Assist in Assessing the Trustworthiness, ... "Revising the JBI quantitative critical appraisal tools to improve their applicability: An overview of methods and the development process" ... Munn Z, Porritt K. Qualitative research synthesis: methodological guidance for systematic reviewers ...

  2. Critical Appraisal Tools and Reporting Guidelines

    Schondelmeyer A. C., Bettencourt A. P., Xiao R., Beidas R. S., Wolk C. B., Landrigan C. P., Brady P. W., Brent C. R., Parthasarathy P., Kern-Goldberger A. S., Sergay ...

  3. A guide to critical appraisal of evidence

    Critical appraisal is the assessment of research studies' worth to clinical practice. Critical appraisal—the heart of evidence-based practice—involves four phases: rapid critical appraisal, evaluation, synthesis, and recommendation. This article reviews each phase and provides examples, tips, and caveats to help evidence appraisers ...

  4. Scientific writing: Critical Appraisal Toolkit (CAT) for assessing

    Abstract. Healthcare professionals are often expected to critically appraise research evidence in order to make recommendations for practice and policy development. Here we describe the Critical Appraisal Toolkit (CAT) currently used by the Public Health Agency of Canada. The CAT consists of: algorithms to identify the type of study design ...

  5. Guidance to best tools and practices for systematic reviews

    Methods and guidance to produce a reliable evidence synthesis. Several international consortiums of EBM experts and national health care organizations currently provide detailed guidance (Table 1). They draw criteria from the reporting and methodological standards of currently recommended appraisal tools, and regularly review and update their methods to reflect new information and ...

  6. Full article: Critical appraisal

    For example, in quantitative research a critical appraisal checklist assists a reviewer in assessing each study according to the same (pre-determined) criteria; that is, checklists help standardize the process, if not the outcome (they are navigational tools, not anchors; Booth, 2007). Also, if the checklist has been through a rigorous ...

  7. Critical appraisal full list of checklists and tools

    There are hundreds of critical appraisal checklists and tools you can choose from, which can be very overwhelming. There are so many because there are many kinds of research, knowledge can be communicated in a wide range of ways, and whether something is appropriate to meet your information needs depends on your specific context.

  8. Critical Appraisal of Clinical Research

    Critical appraisal is the process of carefully and systematically examining research to assess its reliability, value, and relevance in order to guide professionals in their clinical decision making [1]. Critical appraisal is essential to continuing professional development (CPD).

  9. PDF © Joanna Briggs Institute 2017 Critical Appraisal Checklist for

    JBI Critical Appraisal Tools All systematic reviews incorporate a process of critique or appraisal of the research evidence. The purpose of this appraisal is to assess the methodological quality of a study and to determine the extent to which a study has addressed the possibility of bias in its design, conduct and analysis. All papers

  10. Critical Appraisal: Assessing the Quality of Studies

    This contains five items from the JBI-Qualitative Assessment and Review Instrument critical appraisal tool to assess dependability and then assess three levels of credibility; these are shown in Table 6.5. These are aspects of studies that are equivalent to reliability and internal validity in quantitative research, respectively.

  11. Critical appraisal tools

    Tools to help identify the many ways that errors and bias can affect research results.

  12. Critical Appraisal of Quantitative Research

    Abstract. Critical appraisal skills are important for anyone wishing to make informed decisions or improve the quality of healthcare delivery. A good critical appraisal provides information regarding the believability and usefulness of a particular study. However, the appraisal process is often overlooked, and critically appraising quantitative ...

  13. Critical Appraisal of a quantitative paper

    This guide, aimed at health students, provides basic level support for appraising quantitative research papers. It's designed for students who have already attended lectures on critical appraisal. One framework for appraising quantitative research (based on reliability, internal and external validity) is provided and there is an opportunity to ...

  14. Critical Appraisal of Quantitative Research

    Critical appraisal describes the process of analyzing a study in a rigorous and methodical way. Often, this process involves working through a series of questions to ...

  15. A systematic review of the content of critical appraisal tools

    Consumers of research (researchers, administrators, educators and clinicians) frequently use standard critical appraisal tools to evaluate the quality of published research reports. However, there is no consensus regarding the most appropriate critical appraisal tool for allied health research. We summarized the content, intent, construction and psychometric properties of published, currently ...

  16. Critical Appraisal Tools & Resources

    Critical Appraisal is the process of carefully and systematically examining research to judge its trustworthiness, and its value and relevance in a particular context. It is an essential skill for evidence-based medicine because it allows people to find and use research evidence reliably and efficiently. Learn more about what critical appraisal ...

  17. Introduction

    Critical Appraisal of Studies. Critical appraisal is the process of carefully and systematically examining research to judge its trustworthiness, and its value/relevance in a particular context by providing a framework to evaluate the research. During the critical appraisal process, researchers can: Decide whether studies have been undertaken ...

  18. Critical Appraisal tools

    This section contains useful tools and downloads for the critical appraisal of different types of medical evidence. Example appraisal sheets are provided together with several helpful examples. Critical Appraisal Worksheets. English. Systematic Reviews Critical Appraisal Sheet; Diagnostics Critical Appraisal Sheet; Prognosis Critical Appraisal ...

  19. Optimising the value of the critical appraisal skills programme (CASP

    Our aim is to discuss the suitability and usability of the Critical Appraisal Skills Programme (CASP) qualitative checklist tool for quality appraisal in qualitative evidence synthesis in order to support and improve future appraisal exercises framed by the tool. The CASP tool is the most commonly used checklist/criteria-based tool for ...

  20. JBI releases revised Critical Appraisal Tools

    Revising the JBI quantitative critical appraisal tools to improve their applicability: an overview of methods and the development process. The revised JBI critical appraisal tool for the assessment of risk of bias for randomized controlled trials. For almost 25 years, JBI's critical appraisal tools have assisted systematic reviewers assess ...

  21. Critical Appraisal

    Cathala X and Moorley C (2018) How to appraise quantitative research, Evidence-Based Nursing, 21(4), pp. 99-101. ... An outline of AMSTAR 2 and its use as a critical appraisal tool for systematic reviews. Smith J and Noble H (2014) Bias in research, Evidence-Based Nursing, 17(4), pp. 100-101. ...

  22. Evidence Synthesis and Systematic Reviews

    Tools for Evidence Synthesis. On this page you will find tools for the different steps of the evidence synthesis process. If you are looking for even more tools ...