Critical Value Approach in Hypothesis Testing


After calculating the test statistic using the sample data, you compare it to the critical value(s) corresponding to the chosen significance level (α).


The specific formula for finding a critical value depends on the distribution of the test statistic and the parameters of the problem at hand.


S.3.1 Hypothesis Testing (Critical Value Approach)

The critical value approach involves determining "likely" or "unlikely" by determining whether or not the observed test statistic is more extreme than would be expected if the null hypothesis were true. That is, it entails comparing the observed test statistic to some cutoff value, called the "critical value." If the test statistic is more extreme than the critical value, then the null hypothesis is rejected in favor of the alternative hypothesis. If the test statistic is not as extreme as the critical value, then the null hypothesis is not rejected.

Specifically, the four steps involved in using the critical value approach to conducting any hypothesis test are:

  • Specify the null and alternative hypotheses.
  • Using the sample data and assuming the null hypothesis is true, calculate the value of the test statistic. To conduct the hypothesis test for the population mean μ, we use the t-statistic \(t^*=\frac{\bar{x}-\mu}{s/\sqrt{n}}\), which follows a t-distribution with n − 1 degrees of freedom.
  • Determine the critical value by finding the value of the known distribution of the test statistic such that the probability of making a Type I error, which is denoted \(\alpha\) (the Greek letter "alpha") and is called the "significance level of the test," is small (typically 0.01, 0.05, or 0.10).
  • Compare the test statistic to the critical value. If the test statistic is more extreme in the direction of the alternative than the critical value, reject the null hypothesis in favor of the alternative hypothesis. If the test statistic is less extreme than the critical value, do not reject the null hypothesis.
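The four steps above can be sketched with SciPy. The sample GPA values below are hypothetical, invented purely for illustration; only the significance level, sample size, and null value match the worked example that follows:

```python
import math
from scipy import stats

# Hypothetical sample of n = 15 grade point averages (illustrative, not from the source)
sample = [3.1, 2.9, 3.4, 3.0, 3.2, 2.8, 3.3, 3.1, 3.0, 3.5, 2.9, 3.2, 3.1, 3.0, 3.3]
mu0 = 3.0      # Step 1: H0: mu = 3 versus HA: mu > 3
alpha = 0.05

n = len(sample)
xbar = sum(sample) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))

# Step 2: test statistic t* = (xbar - mu0) / (s / sqrt(n))
t_star = (xbar - mu0) / (s / math.sqrt(n))

# Step 3: right-tailed critical value t_{alpha, n-1}
t_crit = stats.t.ppf(1 - alpha, df=n - 1)   # 1.7613 for alpha = 0.05, df = 14

# Step 4: compare the test statistic to the critical value
reject = t_star > t_crit
```

For these invented data the test statistic exceeds the critical value, so the null hypothesis would be rejected; with different data the comparison in step 4 could of course go the other way.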

Example S.3.1.1

In our example concerning the mean grade point average, suppose we take a random sample of n = 15 students majoring in mathematics. Since n = 15, our test statistic t* has n − 1 = 14 degrees of freedom. Also, suppose we set our significance level α at 0.05 so that we have only a 5% chance of making a Type I error.

Right-Tailed

The critical value for conducting the right-tailed test H₀: μ = 3 versus Hₐ: μ > 3 is the t-value, denoted \(t_{\alpha, n-1}\), such that the probability to the right of it is \(\alpha\). It can be shown using either statistical software or a t-table that the critical value \(t_{0.05, 14}\) is 1.7613. That is, we would reject the null hypothesis H₀: μ = 3 in favor of the alternative hypothesis Hₐ: μ > 3 if the test statistic t* is greater than 1.7613. Visually, the rejection region is shaded red in the graph.

t distribution graph for a t value of 1.76131

Left-Tailed

The critical value for conducting the left-tailed test H₀: μ = 3 versus Hₐ: μ < 3 is the t-value, denoted \(-t_{\alpha, n-1}\), such that the probability to the left of it is \(\alpha\). It can be shown using either statistical software or a t-table that the critical value \(-t_{0.05, 14}\) is -1.7613. That is, we would reject the null hypothesis H₀: μ = 3 in favor of the alternative hypothesis Hₐ: μ < 3 if the test statistic t* is less than -1.7613. Visually, the rejection region is shaded red in the graph.

t-distribution graph for a t value of -1.76131

There are two critical values for the two-tailed test H₀: μ = 3 versus Hₐ: μ ≠ 3: one for the left tail, denoted \(-t_{\alpha/2, n-1}\), and one for the right tail, denoted \(t_{\alpha/2, n-1}\). The value \(-t_{\alpha/2, n-1}\) is the t-value such that the probability to the left of it is \(\alpha/2\), and the value \(t_{\alpha/2, n-1}\) is the t-value such that the probability to the right of it is \(\alpha/2\). It can be shown using either statistical software or a t-table that the critical value \(-t_{0.025, 14}\) is -2.1448 and the critical value \(t_{0.025, 14}\) is 2.1448. That is, we would reject the null hypothesis H₀: μ = 3 in favor of the alternative hypothesis Hₐ: μ ≠ 3 if the test statistic t* is less than -2.1448 or greater than 2.1448. Visually, the rejection region is shaded red in the graph.

t distribution graph for a two tailed test of 0.05 level of significance
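All three critical values quoted above can be reproduced with the inverse CDF of the t-distribution; a short sketch, assuming SciPy is available:

```python
from scipy import stats

alpha, df = 0.05, 14

t_right = stats.t.ppf(1 - alpha, df)     # right-tailed critical value, 1.7613
t_left = stats.t.ppf(alpha, df)          # left-tailed critical value, -1.7613
t_two = stats.t.ppf(1 - alpha / 2, df)   # two-tailed critical values, +/-2.1448
```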

The Data Scientist


Understanding Critical Value vs. P-Value in Hypothesis Testing

In the realm of statistical analysis, critical values and p-values serve as essential tools for hypothesis testing and decision making. These concepts, rooted in the work of statisticians like Ronald Fisher and the Neyman-Pearson approach, play a crucial role in determining statistical significance. Understanding the distinction between critical values and p-values is vital for researchers and data analysts to interpret their findings accurately and avoid misinterpretations that can lead to false positives or false negatives.

This article aims to shed light on the key differences between critical values and p-values in hypothesis testing. It will explore the definition and calculation of critical values, including how to find critical values using tables or calculators. The discussion will also cover p-values, their interpretation, and their relationship to significance levels. Additionally, the article will address common pitfalls in result interpretation and provide guidance on when to use critical values versus p-values in various statistical scenarios, such as t-tests and confidence intervals.


What is a Critical Value?

Definition and concept

A critical value in statistics serves as a crucial cut-off point in hypothesis testing and decision making. It defines the boundary between rejecting and failing to reject the null hypothesis, playing a vital role in determining statistical significance. The critical value is intrinsically linked to the significance level (α) chosen for the test, which represents the probability of making a Type I error.

Critical values are essential for accurately representing a range of characteristics within a dataset. They help statisticians calculate the margin of error and provide insights into the validity and accuracy of their findings. In hypothesis testing, the critical value is compared to the obtained test statistic to determine whether the null hypothesis should be rejected or not.

How to calculate critical values

Calculating critical values involves several steps and depends on the type of test being conducted. The general formula for finding the critical value is:

Critical probability (p*) = 1 – (Alpha / 2)

Where Alpha = 1 – (confidence level / 100)

For example, using a confidence level of 95%, the alpha value would be:

Alpha value = 1 – (95/100) = 0.05

Then, the critical probability would be:

Critical probability (p*) = 1 – (0.05 / 2) = 0.975

The critical value can be expressed in two ways:

  • As a Z-score related to cumulative probability
  • As a critical t statistic, obtained from the critical probability together with the degrees of freedom

For larger sample sizes (typically n ≥ 30), the Z-score is used, while for smaller samples or when the population standard deviation is unknown, the t statistic is more appropriate.
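A minimal sketch of this calculation in Python. The 95% confidence level matches the worked example above; the 14 degrees of freedom used for the small-sample t case are an assumed illustration:

```python
from scipy import stats

confidence = 95                       # confidence level in percent
alpha = 1 - confidence / 100          # Alpha = 1 - (confidence level / 100) = 0.05
p_star = 1 - alpha / 2                # critical probability = 1 - (Alpha / 2) = 0.975

# Large samples (n >= 30): Z critical value from the standard normal
z_crit = stats.norm.ppf(p_star)       # about 1.96

# Small samples / unknown sigma: t critical value (df = 14 assumed here)
t_crit = stats.t.ppf(p_star, df=14)   # about 2.1448
```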

Examples in hypothesis testing

Critical values play a crucial role in various types of hypothesis tests. Here are some examples:

  • One-tailed test: For a right-tailed test with H₀: μ = 3 vs. H₁: μ > 3, the critical value would be the t-value such that the probability to the right of it is α. For instance, with α = 0.05 and 14 degrees of freedom, the critical value t₀.₀₅,₁₄ is 1.7613. The null hypothesis would be rejected if the test statistic t is greater than 1.7613.
  • Two-tailed test: For a two-tailed test with H₀: μ = 3 vs. H₁: μ ≠ 3, there are two critical values, one for each tail. Using α = 0.05 and 14 degrees of freedom, the critical values would be -2.1448 and 2.1448. The null hypothesis would be rejected if the test statistic t is less than -2.1448 or greater than 2.1448.
  • Z-test example: In a tire manufacturing plant producing 15.2 tires per hour with a variance of 2.5, new machines were tested. The critical region for a one-tailed test with α = 0.10 was z > 1.282. The calculated z-statistic of 3.51 exceeded this critical value, leading to the rejection of the null hypothesis.

Understanding critical values is essential for making informed decisions in hypothesis testing and statistical analysis. They provide a standardized approach to evaluating the significance of research findings and help researchers avoid misinterpretations that could lead to false positives or false negatives.

Understanding P-Values


Definition of p-value

In statistical hypothesis testing, a p-value is a crucial concept that helps researchers quantify the strength of evidence against the null hypothesis. The p-value is defined as the probability of obtaining test results at least as extreme as the observed results, assuming that the null hypothesis is true. This definition highlights the relationship between the p-value and the null hypothesis, which is fundamental to understanding its interpretation.

The p-value serves as an alternative to rejection points, providing the smallest level of significance at which the null hypothesis would be rejected. It is important to note that the p-value is not the probability that the null hypothesis is true or that the alternative hypothesis is false. Rather, it indicates how compatible the observed data are with a specified statistical model, typically the null hypothesis.
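As a sketch of this definition, the p-value for a one-sample t-test is the tail probability of the observed statistic under the null distribution. The values of t* and the degrees of freedom below are hypothetical, chosen only to make the calculation concrete:

```python
from scipy import stats

t_star, df = 2.5, 14   # hypothetical observed statistic and degrees of freedom

# Right-tailed p-value: P(T >= t*) under the null hypothesis
p_right = stats.t.sf(t_star, df)          # about 0.013 for these values

# Two-tailed p-value: probability of a result at least this extreme in either tail
p_two = 2 * stats.t.sf(abs(t_star), df)
```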

Interpreting p-values

Interpreting p-values correctly is essential for making sound statistical inferences. A smaller p-value suggests stronger evidence against the null hypothesis and in favor of the alternative hypothesis. Conventionally, a p-value of 0.05 or lower is considered statistically significant, leading to the rejection of the null hypothesis. However, it is crucial to understand that this threshold is arbitrary and should not be treated as a definitive cutoff point for decision-making.

When interpreting p-values, it is important to consider the following:

  • The p-value does not indicate the size or importance of the observed effect. A small p-value can be observed for an effect that is not meaningful or important, especially with large sample sizes.
  • The p-value is not the probability that the observed effects were produced by random chance alone. It is calculated under the assumption that the null hypothesis is true.
  • A p-value greater than 0.05 does not necessarily mean that the null hypothesis is true or that there is no effect. It simply indicates that the evidence against the null hypothesis is not strong enough to reject it at the chosen significance level.

Common misconceptions about p-values

Despite their widespread use, p-values are often misinterpreted in scientific research and education. Some common misconceptions include:

  • Interpreting the p-value as the probability that the null hypothesis is true or the probability that the alternative hypothesis is false. This interpretation is incorrect, as p-values do not provide direct probabilities for hypotheses.
  • Believing that a p-value less than 0.05 proves that a finding is true or that the probability of making a mistake is less than 5%. In reality, the p-value is a statement about the relation of the data to the null hypothesis, not a measure of truth or error rates.
  • Treating p-values on opposite sides of the 0.05 threshold as qualitatively different. This dichotomous thinking can lead to overemphasis on statistical significance and neglect of practical significance.
  • Using p-values to determine the size or importance of an effect. P-values do not provide information about effect sizes or clinical relevance.

To address these misconceptions, it is important to consider p-values as continuous measures of evidence rather than binary indicators of significance. Additionally, researchers should focus on reporting effect sizes, confidence intervals, and practical significance alongside p-values to provide a more comprehensive understanding of their findings.

Key Differences Between Critical Values and P-Values


Approach to hypothesis testing

Critical values and p-values represent two distinct approaches to hypothesis testing, each offering unique insights into the decision-making process. The critical value approach, rooted in traditional hypothesis testing, establishes a clear boundary for accepting or rejecting the null hypothesis. This method is closely tied to significance levels and provides a straightforward framework for statistical inference.

In contrast, p-values offer a continuous measure of evidence against the null hypothesis. This approach allows for a more nuanced evaluation of the data’s compatibility with the null hypothesis. While both methods aim to support or reject the null hypothesis, they differ in how they lead to that decision.

Decision-making process

The decision-making process for critical values and p-values follows different paths. Critical values provide a binary framework, simplifying the decision to either reject or fail to reject the null hypothesis. This approach streamlines the process by classifying results as significant or not significant based on predetermined thresholds.

For instance, in a hypothesis test with a significance level (α) of 0.05, the critical value serves as the dividing line between the rejection and non-rejection regions. If the test statistic exceeds the critical value, the null hypothesis is rejected.

P-values, on the other hand, offer a more flexible approach to decision-making. Instead of a simple yes or no answer, p-values present a range of evidence levels against the null hypothesis. This continuous scale allows researchers to interpret the strength of evidence and choose an appropriate significance level for their specific context.

Interpretation of results

The interpretation of results differs significantly between critical values and p-values. Critical values provide a clear-cut interpretation: if the test statistic falls within the rejection region defined by the critical value, the null hypothesis is rejected. This approach offers a straightforward way to communicate results, especially when a binary decision is required.

P-values, however, offer a more nuanced interpretation of results. A smaller p-value indicates stronger evidence against the null hypothesis. For example, a p-value of 0.03 suggests more compelling evidence against the null hypothesis than a p-value of 0.07. This continuous scale allows for a more detailed assessment of the data’s compatibility with the null hypothesis.

It’s important to note that while a p-value of 0.05 is often used as a threshold for statistical significance, this is an arbitrary cutoff. The interpretation of p-values should consider the context of the study and the potential for practical significance.

Both approaches have their strengths and limitations. Critical values simplify decision-making but may not accurately reflect the increasing precision of estimates as sample sizes grow. P-values provide a more comprehensive understanding of outcomes, especially when combined with effect size measures. However, they are frequently misunderstood and can be affected by sample size in large datasets, potentially leading to misleading significance.

In conclusion, while critical values and p-values are both essential tools in hypothesis testing, they offer different perspectives on statistical inference. Critical values provide a clear, binary decision framework, while p-values allow for a more nuanced evaluation of evidence against the null hypothesis. Understanding these differences is crucial for researchers to choose the most appropriate method for their specific research questions and to interpret results accurately.


When to Use Critical Values vs. P-Values

Advantages of the critical value approach

The critical value approach offers several advantages in hypothesis testing. It provides a simple, binary framework for decision-making, allowing researchers to either reject or fail to reject the null hypothesis. This method is particularly useful when a clear explanation of the significance of results is required. Critical values are especially beneficial in sectors where decision-making is influenced by predetermined thresholds, such as the commonly used 0.05 significance level.

One of the key strengths of the critical value approach is its consistency with accepted significance levels, which simplifies interpretation. This method is particularly valuable in non-parametric tests where distributional assumptions may be violated. The critical value approach involves comparing the observed test statistic to a predetermined cutoff value. If the test statistic is more extreme than the critical value, the null hypothesis is rejected in favor of the alternative hypothesis.

Benefits of p-value method

The p-value method offers a more nuanced approach to hypothesis testing. It provides a continuous scale for evaluating the strength of evidence against the null hypothesis, allowing researchers to interpret data with greater flexibility. This approach is particularly useful when conducting unique or exploratory research, as it enables scientists to choose an appropriate level of significance based on their specific context.

P-values quantify the probability of observing a test statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true. This method provides a more comprehensive understanding of outcomes, especially when combined with effect size measures. For instance, a p-value of 0.0127 indicates that it is unlikely to observe such an extreme test statistic if the null hypothesis were true, leading to its rejection.

Choosing the right approach for your study

The choice between critical values and p-values depends on various factors, including the nature of the data, study design, and research objectives. Critical values are best suited for situations requiring a simple, binary choice about the null hypothesis. They streamline the decision-making process by classifying results as significant or not significant.

On the other hand, p-values are more appropriate when evaluating the strength of evidence against the null hypothesis on a continuous scale. They offer a more subtle understanding of the data’s significance and allow for flexibility in interpretation. However, it’s crucial to note that p-values have been subject to debate and controversy, particularly in the context of analyzing complex data associated with plant and animal breeding programs.

When choosing between these approaches, consider the following:

  • If you need a clear-cut decision based on predetermined thresholds, the critical value approach may be more suitable.
  • For a more nuanced interpretation of results, especially in exploratory research, the p-value method might be preferable.
  • Consider the potential for misinterpretation and misuse associated with p-values, such as p-value hacking , which can lead to inflated significance and misleading conclusions.

Ultimately, the choice between critical values and p-values should be guided by the specific requirements of your study and the need for accurate statistical inferences to make informed decisions in your field of research.

Common Pitfalls in Interpreting Results

Overreliance on arbitrary thresholds

One of the most prevalent issues in statistical analysis is the overreliance on arbitrary thresholds, particularly the p-value of 0.05. This threshold has been widely used for decades to determine statistical significance, but its arbitrary nature has come under scrutiny. Many researchers argue that setting a single threshold for all sciences is too extreme and can lead to misleading conclusions.

The use of p-values as the sole measure of significance can result in the publication of potentially false or misleading results. It’s crucial to understand that statistical significance does not necessarily equate to practical significance or real-world importance. A study with a large sample size can produce statistically significant results even when the effect size is trivial.

To address this issue, some researchers propose selecting and justifying p-value thresholds for experiments before collecting any data. These levels would be based on factors such as the potential impact of a discovery or how surprising it would be. However, this approach also has its critics, who argue that researchers may not have the incentive to use more stringent thresholds of evidence.

Ignoring effect sizes

Another common pitfall in interpreting results is the tendency to focus solely on statistical significance while ignoring effect sizes. Effect size is a crucial measure that indicates the magnitude of the relationship between variables or the difference between groups. It provides information about the practical significance of research findings, which is often more valuable than mere statistical significance.

Unlike p-values, effect sizes are independent of sample size . This means they offer a more reliable measure of the practical importance of a result, especially when dealing with large datasets. Researchers should report effect sizes alongside p-values to provide a comprehensive understanding of their findings.

It’s important to note that the criteria for small or large effect sizes may vary depending on the research field. Therefore, it’s essential to consider the context and norms within a particular area of study when interpreting effect sizes.

Misinterpreting statistical vs. practical significance

The distinction between statistical and practical significance is often misunderstood or overlooked in research. Statistical significance, typically determined by p-values, indicates the probability that the observed results occurred by chance. However, it does not provide information about the magnitude or practical importance of the effect.

Practical significance, on the other hand, refers to the real-world relevance or importance of the research findings. A result can be statistically significant but practically insignificant, or vice versa. For instance, a study with a large sample size might find a statistically significant difference between two groups, but the actual difference may be too small to have any meaningful impact in practice.

To avoid this pitfall, researchers should focus on both statistical and practical significance when interpreting their results. This involves considering not only p-values but also effect sizes, confidence intervals, and the potential real-world implications of the findings. Additionally, it’s crucial to interpret results in the context of the specific research question and field of study.

By addressing these common pitfalls, researchers can improve the quality and relevance of their statistical analyses. This approach will lead to more meaningful interpretations of results and better-informed decision-making in various fields of study.

Critical values and p-values are key tools in statistical analysis, each offering unique benefits to researchers. These concepts help in making informed decisions about hypotheses and understanding the significance of findings. While critical values provide a clear-cut approach for decision-making, p-values offer a more nuanced evaluation of evidence against the null hypothesis. Understanding their differences and proper use is crucial to avoid common pitfalls in result interpretation.

Ultimately, the choice between critical values and p-values depends on the specific needs of a study and the context of the research. It’s essential to consider both statistical and practical significance when interpreting results, and to avoid overreliance on arbitrary thresholds. By using these tools wisely, researchers can enhance the quality and relevance of their statistical analyses, leading to more meaningful insights and better-informed decisions.

1. When should you use a critical value as opposed to a p-value in hypothesis testing?

When testing a hypothesis, the p-value approach compares the p-value directly with the significance level (α): if the p-value is less than α, reject the null hypothesis (H₀); if it is greater, do not reject H₀. The critical value approach instead compares the test statistic itself to the critical value; the two approaches are equivalent and always lead to the same decision.

2. What does it mean if the p-value is less than the significance level?

If the p-value is lower than the significance level (α), you should reject the null hypothesis. Conversely, if the p-value is equal to or greater than α, you should not reject the null hypothesis. Remember, a smaller p-value generally indicates stronger evidence against the null hypothesis.

3. What is the purpose of a critical value in statistical testing?

The critical value is a point on the distribution of the test statistic that defines the boundaries of the acceptance and rejection regions for a statistical test. It sets the threshold for what constitutes statistically significant results.

4. When should you reject the null hypothesis based on the critical value?

In the critical value approach, if the test statistic is more extreme than the critical value, reject the null hypothesis. If it is less extreme, do not reject the null hypothesis. This method helps in deciding the statistical significance of the test results.



What is a critical value?

A critical value is a point on the distribution of the test statistic under the null hypothesis that defines a set of values that call for rejecting the null hypothesis. This set is called the critical or rejection region. Usually, one-sided tests have one critical value and two-sided tests have two critical values. The critical values are determined so that the probability that the test statistic falls in the rejection region when the null hypothesis is true equals the significance level (denoted as α or alpha).


Critical values on the standard normal distribution for α = 0.05

Figure A shows that the results of a one-tailed Z-test are significant if the value of the test statistic is equal to or greater than 1.64, the critical value in this case. The shaded area, 5% (α) of the area under the curve, represents the probability of a Type I error. Figure B shows that the results of a two-tailed Z-test are significant if the absolute value of the test statistic is equal to or greater than 1.96, the critical value in this case. The two shaded areas sum to 5% (α) of the area under the curve.
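Both critical values in the figures can be reproduced with the inverse CDF of the standard normal distribution; a sketch assuming SciPy:

```python
from scipy import stats

alpha = 0.05

z_one = stats.norm.ppf(1 - alpha)       # one-tailed critical value, about 1.645 (1.64 in Figure A)
z_two = stats.norm.ppf(1 - alpha / 2)   # two-tailed critical value, about 1.96 (Figure B)
```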

Examples of calculating critical values

In hypothesis testing, there are two ways to determine whether there is enough evidence from the sample to reject H₀ or to fail to reject H₀. The most common way is to compare the p-value with a pre-specified value of α, where α is the probability of rejecting H₀ when H₀ is true. However, an equivalent approach is to compare the calculated value of the test statistic based on your data with the critical value. The following are examples of how to calculate the critical value for a 1-sample t-test and a one-way ANOVA.

Calculating a critical value for a 1-sample t-test

  • Select Calc > Probability Distributions > t .
  • Select Inverse cumulative probability .
  • In Degrees of freedom , enter 9 (the number of observations minus one).
  • In Input constant , enter 0.95 (one minus one-half alpha).

This gives you an inverse cumulative probability, which equals the critical value, of 1.83311. If the absolute value of the t-statistic is greater than this critical value, then you can reject the null hypothesis, H₀, at the 0.10 level of significance.
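For readers without Minitab, the same inverse cumulative probability can be obtained with SciPy; this is an equivalent sketch, not Minitab's own interface:

```python
from scipy import stats

# Inverse cumulative probability of the t distribution, matching
# Minitab's Calc > Probability Distributions > t with Inverse cumulative probability
df = 9    # degrees of freedom: number of observations minus one
p = 0.95  # one minus one-half alpha (alpha = 0.10)

t_crit = stats.t.ppf(p, df)   # about 1.83311
```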

Calculating a critical value for an analysis of variance (ANOVA)

  • Choose Calc > Probability Distributions > F .
  • In Numerator degrees of freedom , enter 2 (the number of factor levels minus one).
  • In Denominator degrees of freedom , enter 9 (the degrees of freedom for error).
  • In Input constant , enter 0.95 (one minus alpha).

This gives you an inverse cumulative probability (critical value) of 4.25649. If the F-statistic is greater than this critical value, then you can reject the null hypothesis, H₀, at the 0.05 level of significance.
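The ANOVA critical value can likewise be reproduced with SciPy's inverse F CDF, as a sketch equivalent to the Minitab steps above:

```python
from scipy import stats

dfn = 2   # numerator degrees of freedom: factor levels minus one
dfd = 9   # denominator degrees of freedom: error
p = 0.95  # one minus alpha (alpha = 0.05)

f_crit = stats.f.ppf(p, dfn, dfd)   # about 4.25649
```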



Hypothesis Testing | A Step-by-Step Guide with Easy Examples

Published on November 8, 2019 by Rebecca Bevans . Revised on June 22, 2023.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories.

There are 5 main steps in hypothesis testing:

  • State your research hypothesis as a null hypothesis (H₀) and alternate hypothesis (Hₐ or H₁).
  • Collect data in a way designed to test the hypothesis.
  • Perform an appropriate statistical test.
  • Decide whether to reject or fail to reject your null hypothesis.
  • Present the findings in your results and discussion section.

Though the specific details might vary, the procedure you will use when testing a hypothesis will always follow some version of these steps.

Table of contents

  • Step 1: State your null and alternate hypothesis
  • Step 2: Collect data
  • Step 3: Perform a statistical test
  • Step 4: Decide whether to reject or fail to reject your null hypothesis
  • Step 5: Present your findings
  • Other interesting articles
  • Frequently asked questions about hypothesis testing

After developing your initial research hypothesis (the prediction that you want to investigate), it is important to restate it as a null (H₀) and alternate (Hₐ) hypothesis so that you can test it mathematically.

The alternate hypothesis is usually your initial hypothesis that predicts a relationship between variables. The null hypothesis is a prediction of no relationship between the variables you are interested in.

  • H₀: Men are, on average, not taller than women. Hₐ: Men are, on average, taller than women.


For a statistical test to be valid , it is important to perform sampling and collect data in a way that is designed to test your hypothesis. If your data are not representative, then you cannot make statistical inferences about the population you are interested in.

There are a variety of statistical tests available, but they are all based on the comparison of within-group variance (how spread out the data is within a category) versus between-group variance (how different the categories are from one another).

If the between-group variance is large enough that there is little or no overlap between groups, then your statistical test will reflect that by showing a low p -value . This means it is unlikely that the differences between these groups came about by chance.

Alternatively, if there is high within-group variance and low between-group variance, then your statistical test will reflect that with a high p -value. This means it is likely that any difference you measure between groups is due to chance.

Your choice of statistical test will be based on the type of variables and the level of measurement of your collected data .

Your statistical test will produce:

  • an estimate of the difference in average height between the two groups.
  • a p -value showing how likely you are to see this difference if the null hypothesis of no difference is true.

Based on the outcome of your statistical test, you will have to decide whether to reject or fail to reject your null hypothesis.

In most cases you will use the p -value generated by your statistical test to guide your decision. And in most cases, your predetermined level of significance for rejecting the null hypothesis will be 0.05 – that is, when there is a less than 5% chance that you would see these results if the null hypothesis were true.

In some cases, researchers choose a more conservative level of significance, such as 0.01 (1%). This minimizes the risk of incorrectly rejecting the null hypothesis ( Type I error ).
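The decision rule described in this step can be sketched in a few lines of Python. The p-values below are illustrative, not computed from real data:

```python
ALPHA = 0.05  # predetermined significance level

def decide(p_value, alpha=ALPHA):
    """Return the hypothesis-test decision for a given p-value."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide(0.03))              # reject H0
print(decide(0.20))              # fail to reject H0
print(decide(0.03, alpha=0.01))  # fail to reject H0 at the stricter 1% level
```

The last call shows the effect of choosing a more conservative significance level: the same p-value no longer leads to rejection.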


The results of hypothesis testing will be presented in the results and discussion sections of your research paper , dissertation or thesis .

In the results section you should give a brief summary of the data and a summary of the results of your statistical test (for example, the estimated difference between group means and associated p -value). In the discussion , you can discuss whether your initial hypothesis was supported by your results or not.

In the formal language of hypothesis testing, we talk about rejecting or failing to reject the null hypothesis. You will probably be asked to do this in your statistics assignments.

However, when presenting research results in academic papers we rarely talk this way. Instead, we go back to our alternate hypothesis (in this case, the hypothesis that men are on average taller than women) and state whether the result of our test did or did not support the alternate hypothesis.

If your null hypothesis was rejected, this result is interpreted as “supported the alternate hypothesis.”

These are superficial differences; you can see that they mean the same thing.

You might notice that we don’t say that we reject or fail to reject the alternate hypothesis . This is because hypothesis testing is not designed to prove or disprove anything. It is only designed to test whether a pattern we measure could have arisen spuriously, or by chance.

If we reject the null hypothesis based on our research (i.e., we find that it is unlikely that the pattern arose by chance), then we can say our test lends support to our hypothesis . But if the pattern does not pass our decision rule, meaning that it could have arisen by chance, then we say the test is inconsistent with our hypothesis .


Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Cite this Scribbr article


Bevans, R. (2023, June 22). Hypothesis Testing | A Step-by-Step Guide with Easy Examples. Scribbr. Retrieved September 27, 2024, from https://www.scribbr.com/statistics/hypothesis-testing/



NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan-.


Hypothesis Testing, P Values, Confidence Intervals, and Significance

Jacob Shreffler ; Martin R. Huecker .


Last Update: March 13, 2023 .

  • Definition/Introduction

Medical providers often rely on evidence-based medicine to guide decision-making in practice. Often a research hypothesis is tested with results provided, typically with p values, confidence intervals, or both. Additionally, statistical or research significance is estimated or determined by the investigators. Unfortunately, healthcare providers may have different comfort levels in interpreting these findings, which may affect the adequate application of the data.

  • Issues of Concern

Without a foundational understanding of hypothesis testing, p values, confidence intervals, and the difference between statistical and clinical significance, healthcare providers may be unable to make clinical decisions without relying purely on the level of significance deemed appropriate by the research investigators. Therefore, an overview of these concepts is provided to allow medical professionals to use their expertise to determine if results are reported sufficiently and if the study outcomes are clinically appropriate to be applied in healthcare practice.

Hypothesis Testing

Investigators conducting studies need research questions and hypotheses to guide analyses. Starting with broad research questions (RQs), investigators then identify a gap in current clinical practice or research. Any research problem or statement is grounded in a better understanding of relationships between two or more variables. For this article, we will use the following research question example:

Research Question: Is Drug 23 an effective treatment for Disease A?

Research questions do not directly imply specific guesses or predictions; we must formulate research hypotheses. A hypothesis is a predetermined declaration regarding the research question in which the investigator(s) makes a precise, educated guess about a study outcome. This is sometimes called the alternative hypothesis and ultimately allows the researcher to take a stance based on experience or insight from medical literature. An example of a hypothesis is below.

Research Hypothesis: Drug 23 will significantly reduce symptoms associated with Disease A compared to Drug 22.

The null hypothesis states that there is no statistical difference between groups based on the stated research hypothesis.

Researchers should be aware of journal recommendations when considering how to report p values, and manuscripts should remain internally consistent.

Regarding p values: as the number of individuals enrolled in a study (the sample size) increases, the likelihood of finding a statistically significant effect increases. With very large sample sizes, the p-value can be very low even for small, clinically unimportant differences in the reduction of symptoms for Disease A between Drug 23 and Drug 22. The null hypothesis is deemed true until a study presents significant data to support rejecting it. Based on the results, the investigators will either reject the null hypothesis (if they found significant differences or associations) or fail to reject the null hypothesis (if they could not provide sufficient evidence of significant differences or associations).

To test a hypothesis, researchers obtain data on a representative sample to determine whether to reject or fail to reject a null hypothesis. In most research studies, it is not feasible to obtain data for an entire population. Using a sampling procedure allows for statistical inference, though this involves a certain possibility of error. [1]  When determining whether to reject or fail to reject the null hypothesis, mistakes can be made: Type I and Type II errors. Though it is impossible to ensure that these errors have not occurred, researchers should limit the possibilities of these faults. [2]

Significance

Significance is a term to describe the substantive importance of medical research. Statistical significance is the likelihood of results due to chance. [3]  Healthcare providers should always delineate statistical significance from clinical significance, a common error when reviewing biomedical research. [4]  When conceptualizing findings reported as either significant or not significant, healthcare providers should not simply accept researchers' results or conclusions without considering the clinical significance. Healthcare professionals should consider the clinical importance of findings and understand both p values and confidence intervals so they do not have to rely on the researchers to determine the level of significance. [5]  One criterion often used to determine statistical significance is the utilization of p values.

P values are used in research to determine whether the sample estimate is significantly different from a hypothesized value. The p-value is the probability that the observed effect within the study would have occurred by chance if, in reality, there was no true effect. Conventionally, data yielding p<0.05 or p<0.01 are considered statistically significant. While some have debated whether the 0.05 level should be lowered, it is still universally practiced. [6]  Hypothesis testing does not, by itself, allow us to determine the size of the effect.

Examples of findings reported with p values are below:

Statement: Drug 23 reduced patients' symptoms compared to Drug 22. Patients who received Drug 23 (n=100) were 2.1 times less likely than patients who received Drug 22 (n = 100) to experience symptoms of Disease A, p<0.05.

Statement: Individuals who were prescribed Drug 23 experienced fewer symptoms (M = 1.3, SD = 0.7) compared to individuals who were prescribed Drug 22 (M = 5.3, SD = 1.9). This finding was statistically significant, p = 0.02.

For either statement, if the threshold had been set at 0.05, the null hypothesis (that there was no relationship) should be rejected, and we should conclude significant differences. Noticeably, as can be seen in the two statements above, some researchers will report findings with < or > and others will provide an exact p-value (0.000001) but never zero [6] . When examining research, readers should understand how p values are reported. The best practice is to report all p values for all variables within a study design, rather than only providing p values for variables with significant findings. [7]  The inclusion of all p values provides evidence for study validity and limits suspicion for selective reporting/data mining.  

While researchers have historically used p values, experts who find p values problematic encourage the use of confidence intervals. [8]  P values alone do not allow us to understand the size or the extent of the differences or associations. [3]  In March 2016, the American Statistical Association (ASA) released a statement on p values, noting that scientific decision-making and conclusions should not be based on a fixed p-value threshold (e.g., 0.05). They recommend focusing on the significance of results in the context of study design, quality of measurements, and validity of data. Ultimately, the ASA statement noted that in isolation, a p-value does not provide strong evidence. [9]

When conceptualizing clinical work, healthcare professionals should consider p values with a concurrent appraisal of study design validity. For example, a p-value from a double-blinded randomized clinical trial (designed to minimize bias) should be weighted higher than one from a retrospective observational study. [7]  The p-value debate has smoldered since the 1950s, [10]  and replacement with confidence intervals has been suggested since the 1980s. [11]

Confidence Intervals

A confidence interval (CI) provides a range of values that, with a given level of confidence (e.g., 95%), contains the true value of the statistical parameter in the targeted population. [12]  Most research uses a 95% CI, but investigators can set any level (e.g., 90% CI, 99% CI). [13]  A CI provides a range with the lower bound and upper bound limits of a difference or association that would be plausible for a population. [14]  Therefore, a CI of 95% indicates that if a study were to be carried out 100 times, the resulting ranges would contain the true value in 95 of them. [15]  Confidence intervals provide more evidence regarding the precision of an estimate compared to p values. [6]

In consideration of the similar research example provided above, one could make the following statement with 95% CI:

Statement: Individuals who were prescribed Drug 23 had no symptoms after three days, which was significantly faster than those prescribed Drug 22; the mean difference between the two groups in days to recovery was 4.2 days (95% CI: 1.9 – 7.8).

It is important to note that the width of the CI is affected by the standard error and the sample size; reducing a study sample number will result in less precision of the CI (increase the width). [14]  A larger width indicates a smaller sample size or a larger variability. [16]  A researcher would want to increase the precision of the CI. For example, a 95% CI of 1.43 – 1.47 is much more precise than the one provided in the example above. In research and clinical practice, CIs provide valuable information on whether the interval includes or excludes any clinically significant values. [14]

Null values are sometimes used for differences with CI (zero for differential comparisons and 1 for ratios). However, CIs provide more information than that. [15]  Consider this example: A hospital implements a new protocol that reduced wait time for patients in the emergency department by an average of 25 minutes (95% CI: -2.5 – 41 minutes). Because the range crosses zero, implementing this protocol in different populations could result in longer wait times; however, the range is much higher on the positive side. Thus, while the p-value used to detect statistical significance for this may result in "not significant" findings, individuals should examine this range, consider the study design, and weigh whether or not it is still worth piloting in their workplace.
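The wait-time example can be checked mechanically: a difference is not statistically significant at the 5% level when its 95% CI contains the null value of zero. A minimal sketch, with the interval bounds taken from the example above:

```python
# 95% CI for the reduction in wait time, in minutes (from the example above).
ci_low, ci_high = -2.5, 41.0

# The null value for a difference is zero; an interval containing it
# corresponds to a non-significant result at the 5% level.
contains_null = ci_low < 0 < ci_high
print(contains_null)  # True: not significant, despite the large upper bound
```

As the text notes, the decision should still weigh that most of the interval lies on the positive side.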

Similarly to p-values, 95% CIs cannot control for researchers' errors (e.g., study bias or improper data analysis). [14]  In consideration of whether to report p-values or CIs, researchers should examine journal preferences. When in doubt, reporting both may be beneficial. [13]  An example is below:

Reporting both: Individuals who were prescribed Drug 23 had no symptoms after three days, which was significantly faster than those prescribed Drug 22, p = 0.009. The mean difference between the two groups in days to recovery was 4.2 days (95% CI: 1.9 – 7.8).

  • Clinical Significance

Recall that clinical significance and statistical significance are two different concepts. Healthcare providers should remember that a study with statistically significant differences and large sample size may be of no interest to clinicians, whereas a study with smaller sample size and statistically non-significant results could impact clinical practice. [14]  Additionally, as previously mentioned, a non-significant finding may reflect the study design itself rather than relationships between variables.

Healthcare providers using evidence-based medicine to inform practice should use clinical judgment to determine the practical importance of studies through careful evaluation of the design, sample size, power, likelihood of type I and type II errors, data analysis, and reporting of statistical findings (p values, 95% CI or both). [4]  Interestingly, some experts have called for "statistically significant" or "not significant" to be excluded from work as statistical significance never has and will never be equivalent to clinical significance. [17]

The decision on what is clinically significant can be challenging, depending on the providers' experience and especially the severity of the disease. Providers should use their knowledge and experiences to determine the meaningfulness of study results and make inferences based not only on significant or insignificant results by researchers but through their understanding of study limitations and practical implications.

  • Nursing, Allied Health, and Interprofessional Team Interventions

All physicians, nurses, pharmacists, and other healthcare professionals should strive to understand the concepts in this chapter. These individuals should maintain the ability to review and incorporate new literature for evidence-based and safe care. 


Disclosure: Jacob Shreffler declares no relevant financial relationships with ineligible companies.

Disclosure: Martin Huecker declares no relevant financial relationships with ineligible companies.

This book is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ), which permits others to distribute the work, provided that the article is not altered or used commercially. You are not required to obtain permission to distribute this article, provided that you credit the author and journal.

  • Cite this Page Shreffler J, Huecker MR. Hypothesis Testing, P Values, Confidence Intervals, and Significance. [Updated 2023 Mar 13]. In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan-.


Critical Value

From class: Data, Inference, and Decisions

A critical value is a point on the scale of the test statistic that separates the regions where the null hypothesis is rejected from those where it is not. In hypothesis testing, critical values help determine the threshold at which you can conclude that an observed effect is statistically significant. This concept is crucial in estimating parameters and making inferences about regression models, as it aids in assessing how far the sample statistic must be from the hypothesized value to reject the null hypothesis.

Congrats on reading the definition of Critical Value. Now let's actually learn it.

5 Must Know Facts For Your Next Test

  • Critical values depend on the significance level (alpha), which is commonly set at 0.05 for many tests, dictating how strict you are about rejecting the null hypothesis.
  • In a two-tailed test, there are two critical values, one for each tail of the distribution, while in a one-tailed test, there is only one critical value.
  • The critical value is derived from a statistical distribution such as the Z-distribution or t-distribution, depending on whether the population standard deviation is known and the sample size.
  • When performing regression analysis, critical values are used to determine whether individual regression coefficients are significantly different from zero.
  • If a test statistic exceeds the critical value, you reject the null hypothesis, suggesting that your findings are statistically significant.

Review Questions

  • Changing the significance level directly impacts critical values. A lower significance level (e.g., from 0.05 to 0.01) results in higher critical values, making it harder to reject the null hypothesis. Conversely, a higher significance level lowers critical values, increasing the likelihood of rejection. This adjustment can lead to more conservative or liberal conclusions about statistical significance based on how stringent you want to be with your evidence.
  • Critical values play a vital role in constructing confidence intervals. For instance, a critical value corresponding to a chosen confidence level (like 1.96 for 95% confidence) helps establish the margin of error around a sample mean. When you apply these critical values to your estimates, it informs you how far you can be from your sample statistic while still being confident that the true population parameter lies within that interval.
  • Understanding critical values allows for more precise interpretation of regression results by indicating whether coefficients are statistically significant. By comparing calculated t-values against their respective critical values derived from t-distributions, you can discern if changes in independent variables have meaningful impacts on the dependent variable. This insight not only helps in determining which predictors are significant but also guides decision-making based on model outputs and confidence in predictions.
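The confidence-interval point above can be made concrete. Below is a minimal sketch using the z critical value 1.96 for 95% confidence; the data are invented, and for a sample this small a t critical value would normally replace 1.96 (it is kept here only to mirror the text):

```python
import math

def mean_ci_95(xs):
    """95% CI for a mean: sample mean +/- 1.96 standard errors."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))  # sample sd
    margin = 1.96 * s / math.sqrt(n)  # critical value times standard error
    return mean - margin, mean + margin

low, high = mean_ci_95([12.0, 14.0, 13.5, 12.5, 13.0, 14.5, 13.0, 12.0])
print(round(low, 2), round(high, 2))  # ≈ 12.44 13.69
```

The critical value sets the margin of error: a larger critical value (e.g., 2.576 for 99% confidence) widens the interval.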

Related terms

Z-Score : A Z-score measures how many standard deviations an element is from the mean, often used in the context of standard normal distribution.

P-Value : A P-value indicates the probability of obtaining results at least as extreme as those observed, under the assumption that the null hypothesis is true.

Confidence Interval : A confidence interval is a range of values derived from sample statistics that is likely to contain the value of an unknown population parameter.

" Critical Value " also found in:

Subjects ( 26 ).

  • AP Statistics
  • Advanced quantitative methods
  • Business Forecasting
  • College Introductory Statistics
  • Elementary Differential Topology
  • Engineering Probability
  • Extremal Combinatorics
  • Geometric Measure Theory
  • Introduction to Econometrics
  • Introduction to Mathematical Economics
  • Introduction to Probabilistic Methods in Mathematics and the Sciences
  • Introduction to Probability
  • Introductory Probability and Statistics for Business
  • Linear Modeling: Theory and Applications
  • Market Research: Tools and Techniques for Data Collection and Analysis
  • Mathematical Modeling
  • Mathematical Probability Theory
  • Morse Theory
  • Preparatory Statistics
  • Probabilistic & Statistical Decision-Making for Management
  • Probability and Mathematical Statistics in Data Science
  • Probability and Statistics
  • Statistical Inference
  • Statistical Methods for Data Science
  • Theoretical Statistics

© 2024 Fiveable Inc. All rights reserved.

Ap® and sat® are trademarks registered by the college board, which is not affiliated with, and does not endorse this website..

Critical Value

Critical value is a cut-off value that is used to mark the start of a region where the test statistic, obtained in hypothesis testing, is unlikely to fall in. In hypothesis testing, the critical value is compared with the obtained test statistic to determine whether the null hypothesis has to be rejected or not.

Graphically, the critical value splits the graph into the acceptance region and the rejection region for hypothesis testing. It helps to check the statistical significance of a test statistic. In this article, we will learn more about the critical value, its formula, types, and how to calculate its value.


What is Critical Value?

A critical value can be calculated for different types of hypothesis tests. The critical value of a particular test can be interpreted from the distribution of the test statistic and the significance level. A one-tailed hypothesis test will have one critical value while a two-tailed test will have two critical values.

Critical Value Definition

Critical value can be defined as a value that is compared to a test statistic in hypothesis testing to determine whether the null hypothesis is to be rejected or not. If the value of the test statistic is less extreme than the critical value, then the null hypothesis cannot be rejected. However, if the test statistic is more extreme than the critical value, the null hypothesis is rejected and the alternative hypothesis is accepted. In other words, the critical value divides the distribution graph into the acceptance and the rejection region. If the value of the test statistic falls in the rejection region, then the null hypothesis is rejected otherwise it cannot be rejected.

Critical Value Formula

Depending upon the type of distribution the test statistic belongs to, there are different formulas to compute the critical value. The confidence interval or the significance level can be used to determine a critical value. Given below are the different critical value formulas.

Critical Value Confidence Interval

The critical value for a one-tailed or two-tailed test can be computed using the confidence interval . Suppose a confidence level of 95% has been specified for conducting a hypothesis test. The critical value can be determined as follows:

  • Step 1: Subtract the confidence level from 100%. 100% - 95% = 5%.
  • Step 2: Convert this value to a decimal to get \(\alpha\). Thus, \(\alpha\) = 0.05.
  • Step 3: If it is a one-tailed test, the alpha level is the value from step 2. If it is a two-tailed test, divide the alpha level by 2.
  • Step 4: Depending on the type of test conducted the critical value can be looked up from the corresponding distribution table using the alpha value.

The process used in step 4 will be elaborated in the upcoming sections.
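Steps 1–3 can be sketched as a small helper function (hypothetical, not part of any standard library):

```python
def alpha_for(confidence_percent, two_tailed=False):
    """Convert a confidence level (e.g., 95) into the alpha used for lookup."""
    alpha = (100 - confidence_percent) / 100  # steps 1-2: 95% -> 0.05
    return alpha / 2 if two_tailed else alpha  # step 3: split across two tails

print(alpha_for(95))                   # 0.05 for a one-tailed test
print(alpha_for(95, two_tailed=True))  # 0.025 per tail for a two-tailed test
```

The resulting alpha is then looked up in the appropriate distribution table (step 4).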

T Critical Value

A t-test is used when the population standard deviation is not known and the sample size is less than 30. A t-test is conducted when the data follow a Student t distribution . The t critical value can be calculated as follows:

  • Determine the alpha level.
  • Subtract 1 from the sample size. This gives the degrees of freedom (df).
  • If the hypothesis test is one-tailed then use the one-tailed t distribution table. Otherwise, use the two-tailed t distribution table for a two-tailed test.
  • Match the corresponding df value (left side) and the alpha value (top row) of the table. Find the intersection of this row and column to give the t critical value.

Test Statistic for one sample t test: t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\). \(\overline{x}\) is the sample mean, \(\mu\) is the population mean, s is the sample standard deviation and n is the size of the sample.

Test Statistic for two samples t test: \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}\).
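The one-sample t statistic defined above can be computed directly; the data below are invented for illustration:

```python
import math

def one_sample_t(xs, mu):
    """t = (x-bar - mu) / (s / sqrt(n)) for a one-sample t test."""
    n = len(xs)
    xbar = sum(xs) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))  # sample sd
    return (xbar - mu) / (s / math.sqrt(n))

t = one_sample_t([5.1, 4.9, 5.3, 5.0, 5.2], mu=4.5)
print(round(t, 2))  # ≈ 8.49, with n - 1 = 4 degrees of freedom
```

This value would then be compared against the t critical value found from the table for df = 4 and the chosen alpha level.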

Decision Criteria:

  • Reject the null hypothesis if test statistic > t critical value (right-tailed hypothesis test).
  • Reject the null hypothesis if test statistic < t critical value (left-tailed hypothesis test).
  • Reject the null hypothesis if the test statistic does not lie in the acceptance region (two-tailed hypothesis test).


This decision criterion is used for all tests. Only the test statistic and critical value change.

Z Critical Value

A z test is conducted on a normal distribution when the population standard deviation is known and the sample size is greater than or equal to 30. The z critical value can be calculated as follows:

  • Find the alpha level.
  • For a one-tailed test, subtract the alpha level from 0.5. For a two-tailed test, subtract half the alpha level (\(\alpha\)/2) from 0.5. This gives the area between 0 and the critical value.
  • Look up the area from the z distribution table to obtain the z critical value. For a left-tailed test, a negative sign needs to be added to the critical value at the end of the calculation.

Test statistic for one sample z test: z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\). \(\sigma\) is the population standard deviation.

Test statistic for two samples z test: z = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}}\).
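The one-sample z statistic can likewise be computed directly (the numbers are made up):

```python
import math

def one_sample_z(xbar, mu, sigma, n):
    """z = (x-bar - mu) / (sigma / sqrt(n)) for a one-sample z test."""
    return (xbar - mu) / (sigma / math.sqrt(n))

# Sample mean 103 against a hypothesized mean of 100, known sigma = 15, n = 36.
z = one_sample_z(xbar=103.0, mu=100.0, sigma=15.0, n=36)
print(z)  # 1.2: compare this against the z critical value for the chosen alpha
```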

F Critical Value

The F test is largely used to compare the variances of two samples. The test statistic so obtained is also used in regression analysis. The F critical value is found as follows:

  • Subtract 1 from the size of the first sample. This gives the first degrees of freedom, say x.
  • Similarly, subtract 1 from the second sample size to get the second degrees of freedom, say y.
  • Using the F distribution table, the intersection of the column for x and the row for y gives the F critical value.

Test Statistic for large samples: f = \(\frac{\sigma_{1}^{2}}{\sigma_{2}^{2}}\), where \(\sigma_{1}^{2}\) is the variance of the first population and \(\sigma_{2}^{2}\) is the variance of the second population.

Test Statistic for small samples: f = \(\frac{s_{1}^{2}}{s_{2}^{2}}\), where \(s_{1}^{2}\) is the variance of the first sample and \(s_{2}^{2}\) is the variance of the second sample.
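The F-table lookup described above can be reproduced with scipy's F quantile function. A sketch with made-up sample sizes and variances:

```python
from scipy.stats import f

alpha = 0.05
n1, n2 = 25, 31                # hypothetical sample sizes
df1, df2 = n1 - 1, n2 - 1      # x = 24 (numerator), y = 30 (denominator)

# Upper-tail F critical value (what the alpha-level F table gives)
f_crit = f.ppf(1 - alpha, dfn=df1, dfd=df2)

# Test statistic from the two sample variances (hypothetical values)
s1_sq, s2_sq = 12.0, 5.0
f_stat = s1_sq / s2_sq

reject = f_stat > f_crit
```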

Chi-Square Critical Value

The chi-square test is used to check if the sample data matches the population data. It can also be used to compare two variables to see if they are related. The chi-square critical value is given as follows:

  • Identify the alpha level.
  • Determine the degrees of freedom (df): for a goodness-of-fit test, subtract 1 from the number of categories; for a test of variance, subtract 1 from the sample size.
  • Using the chi-square distribution table, the intersection of the row of the df and the column of the alpha value yields the chi-square critical value.

Test statistic for the chi-square test: \(\chi ^{2} = \sum \frac{(O_{i}-E_{i})^{2}}{E_{i}}\), where \(O_{i}\) is the observed frequency and \(E_{i}\) is the expected frequency.
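A sketch of the chi-square recipe in Python (scipy assumed; the observed and expected counts are invented, and the degrees of freedom follow the goodness-of-fit convention of one less than the number of categories):

```python
from scipy.stats import chi2

alpha = 0.05

# Hypothetical goodness-of-fit data: observed vs expected counts in 5 categories
observed = [18, 22, 27, 15, 18]
expected = [20, 20, 20, 20, 20]

# Chi-square statistic: sum of (O_i - E_i)^2 / E_i
chi_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Right-tailed critical value with df = number of categories - 1
df = len(observed) - 1
chi_crit = chi2.ppf(1 - alpha, df)

reject = chi_stat > chi_crit
```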

Critical Value Calculation

Suppose a right-tailed z test is being conducted. The critical value needs to be calculated for a 0.0079 alpha level. Then the steps are as follows:

  • Subtract the alpha level from 0.5: 0.5 - 0.0079 = 0.4921.
  • Using the z distribution table, find the area closest to 0.4921. The closest area is 0.4922; as this value lies at the intersection of 2.4 and 0.02, the z critical value = 2.42.
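The table lookup can be cross-checked with the exact normal quantile; the exact value is about 2.41, which the coarser table rounds to 2.42:

```python
from scipy.stats import norm

alpha = 0.0079

# Right-tailed z critical value: cumulative area 1 - alpha
# (equivalently, area 0.5 - alpha = 0.4921 between the mean and z)
z_crit = norm.ppf(1 - alpha)
```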



Important Notes on Critical Value

  • A critical value is a cut-off point that is compared with the test statistic to check whether the null hypothesis can be rejected.
  • It is the point that divides the distribution graph into the acceptance and the rejection region.
  • There are 4 types of critical values: z, f, chi-square, and t.

Examples on Critical Value

Example 1: Find the critical value for a left-tailed z test where \(\alpha\) = 0.012.

Solution: First subtract \(\alpha\) from 0.5: 0.5 - 0.012 = 0.488.

Using the z distribution table, z = 2.26.

However, as this is a left-tailed z test, z = -2.26.

Answer: Critical value = -2.26

Example 2: Find the critical value for a two-tailed f test conducted on the following samples at \(\alpha\) = 0.025.

Variance = 110, Sample size = 41

Variance = 70, Sample size = 21

Solution: \(n_{1}\) = 41, \(n_{2}\) = 21,

\(n_{1}\) - 1= 40, \(n_{2}\) - 1 = 20,

Sample 1 df = 40, Sample 2 df = 20

Using the F distribution table for \(\alpha\) = 0.025, the value at the intersection of the 40th column and 20th row is

F(40, 20) = 2.287

Answer: Critical Value = 2.287

Example 3: Suppose a one-tailed t-test is being conducted on data with a sample size of 8 at \(\alpha\) = 0.05. Then find the critical value.

Solution: n = 8

df = 8 - 1 = 7

Using the one-tailed t distribution table, t(7, 0.05) = 1.895.

Answer: Critical Value = 1.895
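The three worked examples can be checked against scipy's quantile functions:

```python
from scipy.stats import f, norm, t

# Example 1: left-tailed z test at alpha = 0.012 (negative, left tail)
z_crit = norm.ppf(0.012)

# Example 2: upper-tail F critical value at alpha = 0.025, df1 = 40, df2 = 20
f_crit = f.ppf(1 - 0.025, dfn=40, dfd=20)

# Example 3: one-tailed t test, n = 8 so df = 7, alpha = 0.05
t_crit = t.ppf(1 - 0.05, df=7)
```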




FAQs on Critical Value

What is the Critical Value in Statistics?

Critical value in statistics is a cut-off value that is compared with a test statistic in hypothesis testing to check whether the null hypothesis should be rejected or not.

What are the Different Types of Critical Value?

There are 4 types of critical values, depending on the type of distribution they are obtained from. These distributions are given as follows:

  • Normal distribution (z critical value).
  • Student t distribution (t).
  • Chi-squared distribution (chi-squared).
  • F distribution (f).

What is the Critical Value Formula for an F test?

To find the critical value for an f test the steps are as follows:

  • Determine the degrees of freedom for both samples by subtracting 1 from each sample size.
  • Find the corresponding value from a one-tailed or two-tailed f distribution at the given alpha level.
  • This will give the critical value.

What is the T Critical Value?

The t critical value is obtained when the population follows a t distribution. The steps to find the t critical value are as follows:

  • Subtract 1 from the sample size to get the df.
  • Use the t distribution table for the alpha value to get the required critical value.

How to Find the Critical Value Using a Confidence Interval for a Two-Tailed Z Test?

The steps to find the critical value using a confidence interval are as follows:

  • Subtract the confidence level from 100% and convert the result into a decimal value to get the alpha level.
  • Subtract half of this value (α/2) from 1.
  • Find the z value for the corresponding area using the normal distribution table to get the critical value.

Can a Critical Value be Negative?

If a left-tailed test is being conducted then the critical value will be negative. This is because the critical value will be to the left of the mean thus, making it negative.

How to Reject Null Hypothesis Based on Critical Value?

The rejection criteria for the null hypothesis are as follows:

  • Right-tailed test: Test statistic > critical value.
  • Left-tailed test: Test statistic < critical value.
  • Two-tailed test: Reject if the test statistic does not lie in the acceptance region.

Critical Value Calculator

Table of contents

Welcome to the critical value calculator! Here you can quickly determine the critical value(s) for two-tailed tests, as well as for one-tailed tests. It works for most common distributions in statistical testing: the standard normal distribution N(0,1) (that is when you have a Z-score), t-Student, chi-square, and F-distribution .

What is a critical value? And what is the critical value formula? Scroll down – we provide you with the critical value definition and explain how to calculate critical values in order to use them to construct rejection regions (also known as critical regions).

How to use critical value calculator

The critical value calculator is your go-to tool for swiftly determining critical values in statistical tests, be it one-tailed or two-tailed. To effectively use the calculator, follow these steps:

In the first field, input the distribution of your test statistic under the null hypothesis: is it a standard normal N (0,1), t-Student, chi-squared, or Snedecor's F? If you are not sure, check the sections below devoted to those distributions, and try to localize the test you need to perform.

In the field What type of test? choose the alternative hypothesis : two-tailed, right-tailed, or left-tailed.

If needed, specify the degrees of freedom of the test statistic's distribution. If you need more clarification, check the description of the test you are performing. You can learn more about the meaning of this quantity in statistics from the degrees of freedom calculator .

Set the significance level, \(\alpha\). By default, we pre-set it to the most common value, 0.05, but you can adjust it to your needs.

The critical value calculator will display your critical value(s) and the rejection region(s).

For example, let's envision a scenario where you are conducting a one-tailed hypothesis test using a t-Student distribution with 15 degrees of freedom. You have opted for a right-tailed test and set a significance level (α) of 0.05. The results indicate that the critical value is 1.7531, and the critical region is (1.7531, ∞). This implies that if your test statistic exceeds 1.7531, you will reject the null hypothesis at the 0.05 significance level.

👩‍🏫 Want to learn more about critical values? Keep reading!

What is a critical value?

In hypothesis testing, critical values are one of the two approaches which allow you to decide whether to retain or reject the null hypothesis. The other approach is to calculate the p-value (for example, using the p-value calculator ).

The critical value approach consists of checking if the value of the test statistic generated by your sample belongs to the so-called rejection region , or critical region , which is the region where the test statistic is highly improbable to lie . A critical value is a cut-off value (or two cut-off values in the case of a two-tailed test) that constitutes the boundary of the rejection region(s). In other words, critical values divide the scale of your test statistic into the rejection region and the non-rejection region.

Once you have found the rejection region, check if the value of the test statistic generated by your sample belongs to it :

  • If so, it means that you can reject the null hypothesis and accept the alternative hypothesis; and
  • If not, then there is not enough evidence to reject H 0 .

But how to calculate critical values? First of all, you need to set a significance level, \(\alpha\), which quantifies the probability of rejecting the null hypothesis when it is actually correct. The choice of \(\alpha\) is arbitrary; in practice, we most often use a value of 0.05 or 0.01. Critical values also depend on the alternative hypothesis you choose for your test, elucidated in the next section.

Critical value definition

To determine critical values, you need to know the distribution of your test statistic under the assumption that the null hypothesis holds. Critical values are then points with the property that the probability of your test statistic assuming values at least as extreme as those critical values is equal to the significance level \(\alpha\). Wow, quite a definition, isn't it? Don't worry, we'll explain what it all means.

First, let us point out it is the alternative hypothesis that determines what "extreme" means. In particular, if the test is one-sided, then there will be just one critical value; if it is two-sided, then there will be two of them: one to the left and the other to the right of the median value of the distribution.

Critical values can be conveniently depicted as the points with the property that the area under the density curve of the test statistic from those points to the tails is equal to \(\alpha\):

Left-tailed test: the area under the density curve from the critical value to the left is equal to \(\alpha\);

Right-tailed test: the area under the density curve from the critical value to the right is equal to \(\alpha\); and

Two-tailed test: the area under the density curve from the left critical value to the left is equal to \(\alpha/2\), and the area under the curve from the right critical value to the right is equal to \(\alpha/2\) as well; thus, the total area equals \(\alpha\).

Critical values for symmetric distribution

As you can see, finding the critical values for a two-tailed test with significance \(\alpha\) boils down to finding both one-tailed critical values with a significance level of \(\alpha/2\).

How to calculate critical values?

The formulae for the critical values involve the quantile function, \(Q\), which is the inverse of the cumulative distribution function (\(\mathrm{cdf}\)) for the test statistic distribution (calculated under the assumption that \(H_0\) holds!): \(Q = \mathrm{cdf}^{-1}\).

Once we have agreed upon the value of \(\alpha\), the critical value formulae are the following:

  • Left-tailed test: \(Q(\alpha)\)
  • Right-tailed test: \(Q(1-\alpha)\)
  • Two-tailed test: \(Q(\alpha/2)\) and \(Q(1-\alpha/2)\)

In the case of a distribution symmetric about 0, the critical values for the two-tailed test are symmetric as well: \(\pm Q(1-\alpha/2)\).

Unfortunately, the probability distributions that are the most widespread in hypothesis testing have somewhat complicated \(\mathrm{cdf}\) formulae. To find critical values by hand, you would need to use specialized software or statistical tables. In these cases, the best option is, of course, our critical value calculator! 😁
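In code, the "specialized software" amounts to one call to the quantile function. A sketch, assuming scipy, that mirrors the per-distribution formulae in the sections below:

```python
from scipy import stats

def critical_values(dist, alpha, tail):
    """Critical value(s) via the quantile function Q = cdf^-1.

    dist: a frozen scipy distribution, e.g. stats.norm(), stats.t(df=9),
          stats.chi2(df=4), or stats.f(dfn=5, dfd=10).
    tail: 'left', 'right', or 'two'.
    """
    Q = dist.ppf                  # quantile function (inverse cdf)
    if tail == "left":
        return Q(alpha)
    if tail == "right":
        return Q(1 - alpha)
    return Q(alpha / 2), Q(1 - alpha / 2)   # two-tailed: alpha/2 per tail

# Right-tailed t test with 15 degrees of freedom at alpha = 0.05
t_crit = critical_values(stats.t(df=15), 0.05, "right")

# Two-tailed Z test at alpha = 0.05
z_lo, z_hi = critical_values(stats.norm(), 0.05, "two")
```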

Z critical values

Use the Z (standard normal) option if your test statistic follows (at least approximately) the standard normal distribution N(0,1) .

In the formulae below, \(u\) denotes the quantile function of the standard normal distribution N(0,1):

Left-tailed Z critical value: \(u(\alpha)\)

Right-tailed Z critical value: \(u(1-\alpha)\)

Two-tailed Z critical values: \(\pm u(1-\alpha/2)\)

Check out the Z-test calculator to learn more about the most common Z-test, used on the population mean. There are also Z-tests for the difference between two population means and for the difference between two proportions.

t critical values

Use the t-Student option if your test statistic follows the t-Student distribution . This distribution is similar to N(0,1) , but its tails are fatter – the exact shape depends on the number of degrees of freedom . If this number is large (>30), which generically happens for large samples, then the t-Student distribution is practically indistinguishable from N(0,1). Check our t-statistic calculator to compute the related test statistic.

t-Student distribution densities

In the formulae below, \(Q_{\text{t},d}\) is the quantile function of the t-Student distribution with \(d\) degrees of freedom:

Left-tailed t critical value: \(Q_{\text{t},d}(\alpha)\)

Right-tailed t critical value: \(Q_{\text{t},d}(1-\alpha)\)

Two-tailed t critical values: \(\pm Q_{\text{t},d}(1-\alpha/2)\)

Visit the t-test calculator to learn more about various t-tests: the one for a population mean with an unknown population standard deviation , those for the difference between the means of two populations (with either equal or unequal population standard deviations), as well as about the t-test for paired samples .

chi-square critical values (χ²)

Use the χ² (chi-square) option when performing a test in which the test statistic follows the χ²-distribution .

You need to determine the number of degrees of freedom of the χ²-distribution of your test statistic – below, we list them for the most commonly used χ²-tests.

Here we give the formulae for chi-square critical values; \(Q_{\chi^2,d}\) is the quantile function of the χ²-distribution with \(d\) degrees of freedom:

Left-tailed χ² critical value: \(Q_{\chi^2,d}(\alpha)\)

Right-tailed χ² critical value: \(Q_{\chi^2,d}(1-\alpha)\)

Two-tailed χ² critical values: \(Q_{\chi^2,d}(\alpha/2)\) and \(Q_{\chi^2,d}(1-\alpha/2)\)

Several different tests lead to a χ²-score:

Goodness-of-fit test : does the empirical distribution agree with the expected distribution?

This test is right-tailed. Its test statistic follows the χ²-distribution with \(k-1\) degrees of freedom, where \(k\) is the number of classes into which the sample is divided.

Independence test : is there a statistically significant relationship between two variables?

This test is also right-tailed, and its test statistic is computed from the contingency table. There are \((r-1)(c-1)\) degrees of freedom, where \(r\) is the number of rows and \(c\) is the number of columns in the contingency table.

Test for the variance of normally distributed data : does this variance have some pre-determined value?

This test can be one- or two-tailed! Its test statistic has the χ²-distribution with \(n-1\) degrees of freedom, where \(n\) is the sample size.

F critical values

Finally, choose F (Fisher-Snedecor) if your test statistic follows the F-distribution . This distribution has a pair of degrees of freedom .

Let us see how those degrees of freedom arise. Assume that you have two independent random variables, \(X\) and \(Y\), that follow χ²-distributions with \(d_1\) and \(d_2\) degrees of freedom, respectively. If you now consider the ratio \((X/d_1):(Y/d_2)\), it turns out it follows the F-distribution with \((d_1, d_2)\) degrees of freedom. That's the reason why we call \(d_1\) and \(d_2\) the numerator and denominator degrees of freedom, respectively.

In the formulae below, \(Q_{\text{F},d_1,d_2}\) stands for the quantile function of the F-distribution with \((d_1, d_2)\) degrees of freedom:

Left-tailed F critical value: \(Q_{\text{F},d_1,d_2}(\alpha)\)

Right-tailed F critical value: \(Q_{\text{F},d_1,d_2}(1-\alpha)\)

Two-tailed F critical values: \(Q_{\text{F},d_1,d_2}(\alpha/2)\) and \(Q_{\text{F},d_1,d_2}(1-\alpha/2)\)

Here we list the most important tests that produce F-scores: each of them is right-tailed .

ANOVA: tests the equality of means in three or more groups that come from normally distributed populations with equal variances. There are \((k-1, n-k)\) degrees of freedom, where \(k\) is the number of groups, and \(n\) is the total sample size (across every group).

Overall significance in regression analysis. The test statistic has \((k-1, n-k)\) degrees of freedom, where \(n\) is the sample size, and \(k\) is the number of variables (including the intercept).

Comparison of two nested regression models. The test statistic follows the F-distribution with \((k_2-k_1, n-k_2)\) degrees of freedom, where \(k_1\) and \(k_2\) are the number of variables in the smaller and bigger models, respectively, and \(n\) is the sample size.

The equality of variances in two normally distributed populations. There are \((n-1, m-1)\) degrees of freedom, where \(n\) and \(m\) are the respective sample sizes.

Behind the scenes of the critical value calculator

I'm Anna, the mastermind behind the critical value calculator and a PhD in mathematics from Jagiellonian University .

The idea for creating the tool originated from my experiences in teaching and research. Recognizing the need for a tool that simplifies the critical value determination process across various statistical distributions, I built a user-friendly calculator accessible to both students and professionals. After publishing the tool, I soon found myself using the calculator in my research and as a teaching aid.

Trust in this calculator is paramount to me. Each tool undergoes a rigorous review process , with peer-reviewed insights from experts and meticulous proofreading by native speakers. This commitment to accuracy and reliability ensures that users can be confident in the content. Please check the Editorial Policies page for more details on our standards.

What is a Z critical value?

A Z critical value is the value that defines the critical region in hypothesis testing when the test statistic follows the standard normal distribution . If the value of the test statistic falls into the critical region, you should reject the null hypothesis and accept the alternative hypothesis.

How do I calculate Z critical value?

To find a Z critical value for a given significance level α:

Check if you perform a one- or two-tailed test.

For a one-tailed test:

Left-tailed: the critical value is the α-th quantile of the standard normal distribution N(0,1).

Right-tailed: the critical value is the (1-α)-th quantile.

For a two-tailed test: the critical values equal ± the (1-α/2)-th quantile of N(0,1).

No quantile tables? Use CDF tables! (The quantile function is the inverse of the CDF.)

Verify your answer with an online critical value calculator.

Is a t critical value the same as Z critical value?

In theory, no . In practice, very often, yes . The t-Student distribution is similar to the standard normal distribution, but it is not the same . However, if the number of degrees of freedom (which is, roughly speaking, the size of your sample) is large enough (>30), then the two distributions are practically indistinguishable , and so the t critical value has practically the same value as the Z critical value.

What is the Z critical value for 95% confidence?

The Z critical value for a 95% confidence interval is:

  • 1.96 for a two-tailed test;
  • 1.64 for a right-tailed test; and
  • -1.64 for a left-tailed test.
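These values follow directly from the normal quantile function (the one-tailed value is 1.6449 before rounding to two decimals):

```python
from scipy.stats import norm

alpha = 1 - 0.95            # 95% confidence

two_tailed = norm.ppf(1 - alpha / 2)    # 1.96
right_tailed = norm.ppf(1 - alpha)      # 1.64 after rounding
left_tailed = norm.ppf(alpha)           # -1.64 after rounding
```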


by Dipali Chaudhari

What is Critical Value? | Explained with Types & Examples


In statistics, a critical value is a value that separates the region of rejection from the region of non-rejection in a statistical hypothesis test, based on a given level of significance (alpha).

It is a boundary or threshold that determines whether a statistical test will reject the null hypothesis or fail to reject it. The critical value is determined by the distribution of the test statistic and the level of significance chosen for the test. We can easily find a critical value by using a table or statistical software.

The critical value depends on the significance level, the sample size, and the type of test being performed. Critical values play a crucial role in hypothesis testing and help determine the validity of statistical inferences.

In this article, we will discuss the definition and meaning of critical value, the critical value approach, and the types of critical value, along with examples.

Definition of Critical Values

A critical value is a threshold value used in hypothesis testing that separates the acceptance and rejection regions based on a given level of significance. It is used to determine whether a test statistic is significant enough to reject the null hypothesis.

It is based on the level of significance chosen for the test and is determined by the distribution of the test statistic. The critical value separates the acceptance and rejection regions, and if the test statistic falls in the rejection region, the null hypothesis is rejected.

Critical values play a crucial role in hypothesis testing as they help to determine the validity of statistical inferences. The critical value is based on the level of significance (alpha) chosen for the test and is determined by the distribution of the test statistic.

It is used to define the region of rejection, which consists of the extreme sample statistics that are unlikely to occur if the null hypothesis is true. Critical values can be obtained from tables or calculated using statistical software and are essential in determining the validity of statistical inferences.

Critical Value Approach | Steps of Hypothesis Testing

The approach of critical value involves several steps in statistical hypothesis testing :

  • Formulate the null hypothesis and alternative hypothesis.
  • Choose the level of significance (alpha) for the test.
  • Determine the appropriate test statistic to use for the hypothesis test.
  • Determine the test statistic distribution under the null hypothesis.
  • Calculate the test statistic value using the sample data.
  • Determine the critical value from the distribution of the test statistic based on the level of significance.
  • Compare the critical value with the test statistic value.
  • If the test statistic falls in the rejection region (for a right-tailed test, when it is greater than or equal to the critical value), reject the null hypothesis in favor of the alternative hypothesis; if not, fail to reject the null hypothesis.
  • Calculate the p-value to determine the strength of evidence against the null hypothesis, if desired.
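The steps above can be condensed into a short script. A sketch for a right-tailed one-sample z test, assuming scipy is available and using hypothetical numbers:

```python
from math import sqrt

from scipy.stats import norm

# Steps 1-2: H0: mu = 50 vs H1: mu > 50 (right-tailed), alpha = 0.05
mu0, alpha = 50.0, 0.05

# Steps 3-4: with sigma known, the z statistic is standard normal under H0
sigma = 4.0
xbar, n = 51.3, 36          # hypothetical sample mean and size

# Step 5: test statistic
z_stat = (xbar - mu0) / (sigma / sqrt(n))

# Step 6: critical value at the chosen significance level
z_crit = norm.ppf(1 - alpha)

# Steps 7-8: compare and decide
reject = z_stat >= z_crit

# Step 9 (optional): p-value = P(Z >= z_stat) under H0
p_value = norm.sf(z_stat)
```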

The critical value approach is widely used in hypothesis testing to determine the validity of statistical inferences. It involves determining a threshold value that separates the acceptance and rejection regions based on a given level of significance, which helps to determine whether the test statistic is significant enough to reject the null hypothesis.

The approach of critical value is used in various statistical tests such as t-tests, F-tests, and chi-square tests. The critical value approach is widely used in statistical hypothesis testing as it provides a clear and objective method to determine the validity of statistical inferences.

Different Types of Critical Value

There are four different kinds of critical values, depending on the statistical test that is run on the data. A list of critical value types is given below:

  • F-critical value
  • T-critical value
  • Z-critical value
  • Chi-square critical value

F-Critical Value

When testing a hypothesis involving the F-distribution, we use the F-critical value. It is denoted by Fα, df1, df2, where α is the level of significance and df1 and df2 denote the degrees of freedom for the numerator and denominator, respectively; Fα, df1, df2 is the F critical value that corresponds to an upper tail area of α.

The F-critical value is used to determine whether to reject or fail to reject the null hypothesis in a hypothesis test involving variances. If the calculated F-statistic is greater than or equal to the F critical value, the null hypothesis is rejected, indicating that there is a significant difference in variances between the groups being compared.

T-Critical Value

When testing a hypothesis involving the t-distribution, we use the t-critical value. For a two-tailed test it is denoted by tα/2, where α is the level of significance and tα/2 is the t-critical value that corresponds to an upper tail area of α/2, with n-1 degrees of freedom.

A t critical value calculator is the best way to find the t value of your required input to avoid table searches and possible mistakes.

Z-Critical Value

When a hypothesis test involves the standard normal distribution, we use the Z critical value. For a two-tailed test it is denoted by Zα/2, where α is the level of significance and Zα/2 is the z critical value that corresponds to an upper tail area of α/2.

Chi-Square Critical Value

The Chi-Square critical value is a value used in statistical hypothesis testing to determine the significance of the Chi-Square statistic. It is based on the level of significance (alpha) chosen for the test and the degrees of freedom associated with the Chi-Square distribution.

Critical Value Example | Formula

Find the critical value for a two-tailed f test conducted on the following samples at an α = 0.05

Variance = 120, Sample size = 61

Variance = 80, Sample size = 41

Sample df1 = n1 – 1 = 60

Sample df2 = n2 – 1 = 40

For α = 0.05, using the F distribution table, the value at the intersection of the 60th column and 40th row is

F (60, 40) = 1.637

Critical Value = 1.637
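The table lookup can be verified with scipy's F quantile function (the exact quantile is about 1.64):

```python
from scipy.stats import f

# Upper-tail F critical value at alpha = 0.05 with df1 = 60, df2 = 40
f_crit = f.ppf(1 - 0.05, dfn=60, dfd=40)
```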

FAQs: Questions and Answers

Q: How is a critical value calculated?
– The critical value is calculated based on the distribution of the test statistic and the desired significance level. In most cases, the critical value is determined from tables or statistical software. For example, in a t-test with a sample size of 10 and a significance level of 0.05, the critical value can be looked up from a t-distribution table with 9 degrees of freedom.

Q: Can a critical value be negative?
– Yes, critical values can be negative, depending on the distribution of the test statistic.

Q: What happens if the test statistic exceeds the critical value?
– If the test statistic exceeds the critical value, the null hypothesis is rejected, indicating that there is evidence to support the alternative hypothesis. This means that the observed difference between two groups or variables is unlikely to have occurred by chance.

In this article, we have discussed the definition and meaning of critical value, the critical value approach, and the types of critical value, and explained the topic with the help of examples. After studying this article, anyone can discuss this topic with confidence.


Dipali Chaudhari

I have completed a master's in Electrical Power Systems. I work and write technical tutorials on PLC, MATLAB programming, and Electrical topics on the DipsLab.com portal.

Sharing my knowledge on this blog makes me happy. And sometimes I delve into Python programming.



Data Science Society

Statistical Hypothesis Testing: How to Calculate Critical Values


Testing statistical hypotheses is one of the most important parts of data analysis. It lets researchers and analysts draw conclusions about an entire population from a small sample. Critical values are useful here because they help determine whether the results are statistically significant.

The goal of this article is to explain what a critical value calculator is, why it is important in statistical hypothesis testing, and how to use one.

What is statistical hypothesis testing?


Statistical hypothesis testing is a methodical way to draw conclusions about a whole population from a small sample. Observed data is compared to what a null hypothesis predicts to see whether any difference is due to a real effect, chance, or simply error. Hypotheses are put to the test in economics, social studies, and science in order to come to reasonable conclusions.

What are critical values?

Critical values are limits or boundaries used during hypothesis testing to judge how significant a test statistic is. The critical value is compared to a test statistic that measures the difference between the observed data and the hypothesized value. A critical value calculator is used to evaluate whether there is sufficient evidence in the observed results to reject the null hypothesis.

How to calculate critical values

Step 1: Identify the test statistic

Before you can figure out the critical values, you need to choose the right test statistic for your hypothesis test. The test statistic is a number that measures how far the data depart from the value assumed under the null hypothesis. Which one to use depends on the data and the hypothesis being tested.

Examples of these statistics are the Z-score, T-statistics, F-statistics, and Chi-squared statistics. Here’s a brief overview of when each test statistic is typically used:

Z-score: Used when the data follow a normal distribution and the population mean and standard deviation are known.

T-statistic: Used to test hypotheses when the sample size is small or the population standard deviation is unknown.

F-statistic: In ANOVA tests, F-statistics are used to detect differences between the variances of different groups or treatments.

Chi-squared statistic: Used for tests on categorical data, such as the goodness-of-fit test or the test for independence in a contingency table.

Once you’ve found the best statistic for a hypothesis test, move on to the next step.

Step 2: Determine the degrees of freedom

Degrees of freedom (df) are one of the important quantities used to figure out critical values. Degrees of freedom refer to the number of independent pieces of information in your dataset. The number of degrees of freedom changes based on the test statistic that is used.

For example, to find the critical numbers for a T-statistic, one is usually taken away from n to get an idea of the degrees of freedom. An F-statistic in ANOVA, on the other hand, has two sets of degrees of freedom: one for the numerator (which is the difference between groups) and one for the denominator (which is the difference within groups).

Because of this, you need to figure out the right number of degrees of freedom for your analysis and not use the wrong numbers because they lead to wrong results. If you need to find the right degree of freedom values for your test statistic, look at the appropriate statistical tables or sources.

Step 3: Find the critical value in a critical value table

A critical value table is an essential tool in hypothesis testing. For each combination of degrees of freedom and significance level, the table lists the corresponding test statistic value. This critical value marks the boundary of the rejection region for the null hypothesis.

For example, for a two-tailed Z-test at a significance level of 0.05 (α = 0.05), you look up the value that cuts off α/2 = 0.025 in each tail of the standard normal table, which is approximately ±1.96 (no degrees of freedom are needed for the Z distribution). For the t-distribution, the t-table gives the critical value for α/2 at your degrees of freedom.
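As a sketch of the same lookup in code (using Python's standard-library NormalDist rather than a printed table), the two-tailed critical value for α = 0.05 can be computed from the inverse CDF:

```python
from statistics import NormalDist

alpha = 0.05
# Two-tailed test: the critical value cuts off alpha/2 in each tail
# of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z_crit, 2))  # 1.96
```

For a t-distribution the standard library has no equivalent, so a t-table (or a package such as SciPy) is still needed there.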

Step 4: Compare the test statistic to the critical value

Finally, compare the computed test statistic with the critical value taken from the table. If the test statistic is more extreme than the critical value (it falls in the tail of the distribution beyond the critical value), reject the null hypothesis: the observed difference is unlikely to be due to chance alone. Otherwise, if the test statistic does not fall in the rejection region, you cannot reject the null hypothesis: the observed data do not provide sufficient evidence against the hypothesized value.

Researchers and analysts performing statistical hypothesis tests need to know what critical values are and how to find them, because critical values are a standard way of judging whether test results are significant. By checking whether the test statistic is at least as extreme as the critical value, researchers can tell whether their data contradict the null hypothesis.

Always use the correct critical value tables, and remember that degrees of freedom play a large part in keeping a statistical analysis accurate and thorough. Statistical software can also reduce mistakes and simplify the computational side of this process.

Hypothesis testing is built on critical values that support conclusions, decisions, and scientific progress. Calculating critical values is a skill everyone who works with statistics should have.


Chapter 9: Hypothesis Testing

Critical Region, Critical Values, and Significance Level

Hypothesis testing requires the sample statistics—such as the proportion, mean, or standard deviation—to be converted into a value or score known as the test statistic.

Assuming that the null hypothesis is true, the test statistic for each sample statistic is calculated using the following equations.

As samples assume a particular distribution, a given test statistic value would fall into a specific area under the curve with some probability.

Such an area, which includes all the values of a test statistic that indicates that the null hypothesis must be rejected, is termed the rejection region or critical region.

The value that separates a critical region from the rest is termed the critical value. Critical values are the z, t, or chi-square values calculated at the desired significance level.

The probability that the test statistic will fall in the critical region when the null hypothesis is actually true is called the significance level.

In the example of testing the proportion of healthy and scabbed apples, if the sample proportion is 0.9, the hypothesis can be tested as follows.

The critical region, critical value, and significance level are interdependent concepts crucial in hypothesis testing.

In hypothesis testing, a sample statistic is converted to a test statistic using z , t , or chi-square distribution. A critical region is an area under the curve in  probability distributions demarcated by the critical value. When the test statistic falls in this region, it suggests that the null hypothesis must be rejected. As this region contains all those values of the test statistic (calculated using the sample data) that suggest rejecting the null hypothesis, it is also known as the rejection region or region of rejection. The critical region may fall at the right, left, or both tails of the distribution based on the direction indicated in the alternative hypothesis and the calculated critical value.

A critical value is calculated using the z , t, or chi-square distribution table at a specific significance level. It is a fixed value for the given sample size and the significance level. The critical value creates a demarcation between all those values that suggest rejection of the null hypothesis and all those other values that indicate the opposite. A critical value is  based on a pre-decided significance level.

A significance level or level of significance or statistical significance is defined as the probability that the calculated test statistic will fall in the critical region. In other words, it is a statistical measure that indicates that the evidence for rejecting a true null hypothesis is strong enough. The significance level is indicated by α, and it is commonly 0.05 or 0.01.


Understanding Hypothesis Testing

Hypothesis testing involves formulating assumptions about population parameters based on sample statistics and rigorously evaluating these assumptions against empirical evidence. This article sheds light on the significance of hypothesis testing and the critical steps involved in the process.

What is Hypothesis Testing?

A hypothesis is an assumption or idea, specifically a statistical claim about an unknown population parameter. For example, a judge assumes a person is innocent and verifies this by reviewing evidence and hearing testimony before reaching a verdict.

Hypothesis testing is a statistical method for making decisions about a population parameter from experimental data. It evaluates two mutually exclusive statements about the population to determine which statement is better supported by the sample data.

To test the validity of the claim or assumption about the population parameter:

  • A sample is drawn from the population and analyzed.
  • The results of the analysis are used to decide whether the claim is true or not.
Example: You claim that the average height in a class is 30, or that a particular boy is taller than a girl. These are assumptions we are making, and we need a statistical, mathematical way to test whether they hold.

Defining Hypotheses

  • Null hypothesis (H0): the default statement of no effect or no difference; it typically asserts that the population parameter μ equals a specified value.
  • Alternative hypothesis (H1): the contradictory claim we are testing for, asserting that μ differs from (or is greater/less than) that value.

Key Terms of Hypothesis Testing

  • Significance level (α): the probability of rejecting the null hypothesis when it is actually true (the Type I error rate); commonly 0.05 or 0.01.

  • P-value: The p-value, or calculated probability, is the probability of obtaining results as extreme as (or more extreme than) the observed ones when the null hypothesis (H0) is true. If the p-value is less than the chosen significance level, you reject the null hypothesis, i.e., the sample supports the alternative hypothesis.
  • Test Statistic: The test statistic is a numerical value calculated from sample data during a hypothesis test, used to determine whether to reject the null hypothesis. It is compared to a critical value or p-value to make decisions about the statistical significance of the observed results.
  • Critical value : The critical value in statistics is a threshold or cutoff point used to determine whether to reject the null hypothesis in a hypothesis test.
  • Degrees of freedom: Degrees of freedom reflect the number of independent pieces of information available for estimating a parameter. They are related to the sample size and determine the shape of the relevant sampling distribution (e.g., the t-distribution).

Why do we use Hypothesis Testing?

Hypothesis testing is an important procedure in statistics. It evaluates two mutually exclusive statements about a population to determine which is better supported by the sample data. When we say that findings are statistically significant, it is hypothesis testing that justifies the claim.

One-Tailed and Two-Tailed Test

A one-tailed test focuses on one direction, either greater than or less than a specified value. We use a one-tailed test when there is a clear directional expectation based on prior knowledge or theory. The critical region is located on only one side of the distribution curve. If the sample statistic falls into this critical region, the null hypothesis is rejected in favor of the alternative hypothesis.

One-Tailed Test

There are two types of one-tailed test:

  • Left-tailed test: H0: μ ≥ 50, H1: μ < 50 (testing whether the parameter is less than the specified value).
  • Right-tailed test: H0: μ ≤ 50, H1: μ > 50 (testing whether the parameter is greater than the specified value).

Two-Tailed Test

A two-tailed test considers both directions, greater than and less than a specified value. We use a two-tailed test when there is no specific directional expectation and we want to detect any significant difference. For example: H0: μ = 50, H1: μ ≠ 50.


What are Type 1 and Type 2 errors in Hypothesis Testing?

In hypothesis testing, Type I and Type II errors are two possible errors that researchers can make when drawing conclusions about a population based on a sample of data. These errors are associated with the decisions made regarding the null hypothesis and the alternative hypothesis.

  • Type I error (α): rejecting the null hypothesis when it is actually true; its probability equals the significance level α.
  • Type II error (β): failing to reject the null hypothesis when it is actually false.


Decision                      Null Hypothesis is True          Null Hypothesis is False
Fail to reject H0 (accept)    Correct Decision                 Type II Error (False Negative)
Reject H0 (accept H1)         Type I Error (False Positive)    Correct Decision

How does Hypothesis Testing work?

Step 1 – Define the null and alternative hypotheses

  • Null hypothesis (H0): the statement assumed true by default (e.g., no effect or no difference).
  • Alternative hypothesis (H1): the contradictory statement we want to establish.

We first identify the problem about which we want to make an assumption, keeping in mind that the two hypotheses must contradict each other; here we assume normally distributed data.

Step 2 – Choose significance level

The significance level (α) is the probability of committing a Type I error, i.e., of rejecting a true null hypothesis. Common choices are 0.05 and 0.01.

Step 3 – Collect and Analyze data.

Gather relevant data through observation or experimentation. Analyze the data using appropriate statistical methods to obtain a test statistic.

Step 4 – Calculate the test statistic

In this step the data are evaluated and a score is computed based on their characteristics. The choice of test statistic depends on the type of hypothesis test being conducted.

There are various hypothesis tests, each appropriate for a particular goal; the statistic could come from a Z-test, Chi-square test, t-test, and so on.

  • Z-test: used when the population standard deviation is known; the Z-statistic is commonly used for large samples.
  • t-test: used when the population standard deviation is unknown and the sample size is small.
  • Chi-square test: used for categorical data, e.g., goodness of fit or testing independence in contingency tables.
  • F-test: often used in analysis of variance (ANOVA) to compare variances or test the equality of means across multiple groups.

Since our dataset here is small, the t-test is the more appropriate choice for testing our hypothesis.

T-statistic is a measure of the difference between the means of two groups relative to the variability within each group. It is calculated as the difference between the sample means divided by the standard error of the difference. It is also known as the t-value or t-score.

Step 5 – Comparing Test Statistic:

In this stage, we decide whether to reject the null hypothesis. There are two equivalent ways to make this decision.

Method A: Using Critical Values

Comparing the test statistic with the tabulated critical value, we have:

  • If the test statistic is more extreme than the critical value (e.g., |t| > critical value for a two-tailed test): reject the null hypothesis.
  • Otherwise: fail to reject the null hypothesis.

Note: Critical values are predetermined threshold values used to make a decision in hypothesis testing. To determine critical values, we typically refer to a statistical distribution table, such as the normal distribution or t-distribution table, based on the test statistic used.

Method B: Using P-values

We can also come to a conclusion using the p-value:

  • If p ≤ α: reject the null hypothesis.
  • If p > α: fail to reject the null hypothesis.

Note: The p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed in the sample, assuming the null hypothesis is true. To determine the p-value, we typically refer to a statistical distribution table, such as the normal distribution or t-distribution table, based on the test statistic used.

Step 6 – Interpret the results

Finally, we draw a conclusion about the experiment using Method A or Method B.

Calculating test statistic

To validate our hypothesis about a population parameter we use statistical functions. For normally distributed data, we combine the test statistic (e.g., the z-score), the p-value, and the significance level (α) as evidence for or against the hypothesis.

1. Z-statistics:

Used when the population standard deviation (σ) is known.

z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}

  • x̄ is the sample mean,
  • μ represents the population (hypothesized) mean,
  • σ is the population standard deviation,
  • and n is the size of the sample.

2. T-Statistics

The t-test is used when n < 30 and the population standard deviation is unknown. The t-statistic is given by:

t=\frac{\bar{x}-\mu}{s/\sqrt{n}}

  • t = t-score,
  • x̄ = sample mean
  • μ = population mean,
  • s = standard deviation of the sample,
  • n = sample size

3. Chi-Square Test

The chi-square test for independence is used for categorical (non-normally distributed) data. The statistic is computed as:

\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}}

  • i, j are the row and column indices respectively,
  • O_{ij} is the observed frequency in cell (i, j),
  • E_{ij} is the expected frequency in cell (i, j) under the null hypothesis of independence.
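As a minimal sketch of this computation (the 2×2 counts below are made up for illustration), the chi-square statistic for independence can be computed directly from a contingency table:

```python
# Hypothetical 2x2 contingency table of observed counts O_ij
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / total  # expected count E_ij
        chi2 += (o - e) ** 2 / e

print(round(chi2, 2))  # 16.67
```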

Real life Examples of Hypothesis Testing

Let’s examine hypothesis testing using two real life situations,

Case A: Does a New Drug Affect Blood Pressure?

Imagine a pharmaceutical company has developed a new drug that they believe can effectively lower blood pressure in patients with hypertension. Before bringing the drug to market, they need to conduct a study to assess its impact on blood pressure.

  • Before Treatment: 120, 122, 118, 130, 125, 128, 115, 121, 123, 119
  • After Treatment: 115, 120, 112, 128, 122, 125, 110, 117, 119, 114

Step 1 : Define the Hypothesis

  • Null Hypothesis (H0): The new drug has no effect on blood pressure.
  • Alternate Hypothesis (H1): The new drug has an effect on blood pressure.

Step 2: Define the Significance level

Let’s set the significance level at 0.05, meaning we will reject the null hypothesis if the evidence suggests less than a 5% chance that the observed results are due to random variation alone.

Step 3 : Compute the test statistic

Using a paired t-test, analyze the data to obtain a test statistic and a p-value.

The test statistic (e.g., T-statistic) is calculated based on the differences between blood pressure measurements before and after treatment.

t = m/(s/√n)

  • d_i = X_after,i − X_before,i is the difference for each pair,
  • m = mean of the differences d_i,
  • s = standard deviation of the differences,
  • n = sample size.

Here m = −3.9, s ≈ 1.37, and n = 10, which gives a paired-t statistic of t ≈ −9.

Step 4: Find the p-value

With the calculated t-statistic of −9 and df = 9 degrees of freedom, you can find the p-value using statistical software or a t-distribution table.

thus, p-value ≈ 8.54 × 10⁻⁶

Step 5: Result

  • If the p-value is less than or equal to 0.05, the researchers reject the null hypothesis.
  • If the p-value is greater than 0.05, they fail to reject the null hypothesis.

Conclusion: Since the p-value (≈ 8.54 × 10⁻⁶) is less than the significance level (0.05), the researchers reject the null hypothesis. There is statistically significant evidence that the average blood pressure before and after treatment with the new drug is different.

Python Implementation of Case A

Let’s create hypothesis testing with python, where we are testing whether a new drug affects blood pressure. For this example, we will use a paired T-test. We’ll use the scipy.stats library for the T-test.

SciPy is a scientific computing library for Python, widely used for mathematical and statistical computations.

We will implement our first real life problem via python,
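The code block itself is missing from this page; here is a minimal sketch using only the Python standard library (scipy.stats.ttest_rel(after, before) would return the same t-statistic together with the p-value):

```python
import math
import statistics

before = [120, 122, 118, 130, 125, 128, 115, 121, 123, 119]
after = [115, 120, 112, 128, 122, 125, 110, 117, 119, 114]

# Paired differences d_i = X_after,i - X_before,i
d = [a - b for a, b in zip(after, before)]
m = statistics.mean(d)        # mean of the differences
s = statistics.stdev(d)       # sample standard deviation of the differences
n = len(d)

t_stat = m / (s / math.sqrt(n))
print(round(m, 1), round(t_stat, 1))  # -3.9 -9.0
```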

In the above example, given the T-statistic of approximately -9 and an extremely small p-value, the results indicate a strong case to reject the null hypothesis at a significance level of 0.05. 

  • The results suggest that the new drug, treatment, or intervention has a significant effect on lowering blood pressure.
  • The negative T-statistic indicates that the mean blood pressure after treatment is significantly lower than the assumed population mean before treatment.

Case B : Cholesterol level in a population

Data: A sample of 25 individuals is taken, and their cholesterol levels are measured.

Cholesterol Levels (mg/dL): 205, 198, 210, 190, 215, 205, 200, 192, 198, 205, 198, 202, 208, 200, 205, 198, 205, 210, 192, 205, 198, 205, 210, 192, 205.

Population mean (μ0) = 200 mg/dL

Population Standard Deviation (σ): 5 mg/dL(given for this problem)

Step 1: Define the Hypothesis

  • Null Hypothesis (H 0 ): The average cholesterol level in a population is 200 mg/dL.
  • Alternate Hypothesis (H 1 ): The average cholesterol level in a population is different from 200 mg/dL.

As the direction of deviation is not given , we assume a two-tailed test, and based on a normal distribution table, the critical values for a significance level of 0.05 (two-tailed) can be calculated through the z-table and are approximately -1.96 and 1.96.

Steps 2–3: With α = 0.05 and sample mean x̄ = 202.04 mg/dL, the test statistic is

z = (202.04 − 200) / (5/√25) = 2.04

Step 4: Result

Since the absolute value of the test statistic (2.04) is greater than the critical value (1.96), we reject the null hypothesis. And conclude that, there is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL

Python Implementation of Case B
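The implementation is likewise missing here; a minimal standard-library sketch of this one-sample Z-test might look like:

```python
import math

data = [205, 198, 210, 190, 215, 205, 200, 192, 198, 205,
        198, 202, 208, 200, 205, 198, 205, 210, 192, 205,
        198, 205, 210, 192, 205]
mu0 = 200      # hypothesized population mean (mg/dL)
sigma = 5      # known population standard deviation (mg/dL)
n = len(data)

x_bar = sum(data) / n
z = (x_bar - mu0) / (sigma / math.sqrt(n))

critical = 1.96  # two-tailed critical value at alpha = 0.05
print(round(x_bar, 2), round(z, 2), abs(z) > critical)  # 202.04 2.04 True
```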

Limitations of Hypothesis Testing

  • Although useful, hypothesis testing does not offer a comprehensive understanding of the topic being studied; it concentrates on specific hypotheses and statistical significance without fully reflecting the complexity or whole context of the phenomenon.
  • The accuracy of hypothesis testing results is contingent on the quality of available data and the appropriateness of statistical methods used. Inaccurate data or poorly formulated hypotheses can lead to incorrect conclusions.
  • Relying solely on hypothesis testing may cause analysts to overlook significant patterns or relationships in the data that are not captured by the specific hypotheses being tested. This limitation underscores the importance of complementing hypothesis testing with other analytical approaches.

Hypothesis testing stands as a cornerstone in statistical analysis, enabling data scientists to navigate uncertainties and draw credible inferences from sample data. By systematically defining null and alternative hypotheses, choosing significance levels, and leveraging statistical tests, researchers can assess the validity of their assumptions. The article also elucidates the critical distinction between Type I and Type II errors, providing a comprehensive understanding of the nuanced decision-making process inherent in hypothesis testing. The real-life example of testing a new drug’s effect on blood pressure using a paired T-test showcases the practical application of these principles, underscoring the importance of statistical rigor in data-driven decision-making.

Frequently Asked Questions (FAQs)

1. What are the 3 types of hypothesis tests?

There are three types of hypothesis tests: right-tailed, left-tailed, and two-tailed. Right-tailed tests assess if a parameter is greater, left-tailed if lesser. Two-tailed tests check for non-directional differences, greater or lesser.

2. What are the 4 components of hypothesis testing?

Null Hypothesis (H0): no effect or difference exists. Alternative Hypothesis (H1): an effect or difference exists. Significance Level (α): the risk of rejecting the null hypothesis when it is true (a Type I error). Test Statistic: a numerical value representing the observed evidence against the null hypothesis.

3. What is hypothesis testing in ML?

A statistical method used to evaluate the performance and validity of machine learning models. It tests specific hypotheses about model behavior, such as whether features influence predictions or whether a model generalizes well to unseen data.

4. What is the difference between Pytest and Hypothesis in Python?

Pytest is a general-purpose testing framework for Python code, while Hypothesis is a property-based testing library for Python that focuses on generating test cases from specified properties of the code.


Hypothesis Testing for Means & Proportions


Hypothesis Testing: Upper-, Lower-, and Two-Tailed Tests


The procedure for hypothesis testing is based on the ideas described above. Specifically, we set up competing hypotheses, select a random sample from the population of interest and compute summary statistics. We then determine whether the sample data supports the null or alternative hypotheses. The procedure can be broken down into the following five steps.  

  • Step 1. Set up hypotheses and select the level of significance α.

H0: Null hypothesis (no change, no difference);

H1: Research hypothesis (investigator's belief); α = 0.05

 

Upper-tailed, Lower-tailed, Two-tailed Tests

The research or alternative hypothesis can take one of three forms. An investigator might believe that the parameter has increased, decreased or changed. For example, an investigator might hypothesize:  

H1: μ > μ0, where μ0 is the comparator or null value (e.g., μ0 = 191 in our example about weight in men in 2006) and an increase is hypothesized; this type of test is called an upper-tailed test;
H1: μ < μ0, where a decrease is hypothesized; this is called a lower-tailed test; or
H1: μ ≠ μ0, where a difference is hypothesized; this is called a two-tailed test.

The exact form of the research hypothesis depends on the investigator's belief about the parameter of interest and whether it has possibly increased, decreased or is different from the null value. The research hypothesis is set up by the investigator before any data are collected.

 

  • Step 2. Select the appropriate test statistic.  

The test statistic is a single number that summarizes the sample information.   An example of a test statistic is the Z statistic computed as follows:
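The formula image did not survive here; for a one-sample test of a population mean, the Z statistic is typically computed as

```latex
Z = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}
```

where X̄ is the sample mean, μ0 the null value, s the sample standard deviation, and n the sample size.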

When the sample size is small, we will use t statistics (just as we did when constructing confidence intervals for small samples). As we present each scenario, alternative test statistics are provided along with conditions for their appropriate use.

  • Step 3.  Set up decision rule.  

The decision rule is a statement that tells under what circumstances to reject the null hypothesis. The decision rule is based on specific values of the test statistic (e.g., reject H 0 if Z > 1.645). The decision rule for a specific test depends on 3 factors: the research or alternative hypothesis, the test statistic and the level of significance. Each is discussed below.

  • The decision rule depends on whether an upper-tailed, lower-tailed, or two-tailed test is proposed. In an upper-tailed test the decision rule has investigators reject H 0 if the test statistic is larger than the critical value. In a lower-tailed test the decision rule has investigators reject H 0 if the test statistic is smaller than the critical value.  In a two-tailed test the decision rule has investigators reject H 0 if the test statistic is extreme, either larger than an upper critical value or smaller than a lower critical value.
  • The exact form of the test statistic is also important in determining the decision rule. If the test statistic follows the standard normal distribution (Z), then the decision rule will be based on the standard normal distribution. If the test statistic follows the t distribution, then the decision rule will be based on the t distribution. The appropriate critical value will be selected from the t distribution again depending on the specific alternative hypothesis and the level of significance.  
  • The third factor is the level of significance. The level of significance which is selected in Step 1 (e.g., α =0.05) dictates the critical value.   For example, in an upper tailed Z test, if α =0.05 then the critical value is Z=1.645.  

The following figures illustrate the rejection regions defined by the decision rule for upper-, lower- and two-tailed Z tests with α=0.05. Notice that the rejection regions are in the upper, lower and both tails of the curves, respectively. The decision rules are written below each figure.

Rejection Region for Upper-Tailed Z Test (H1: μ > μ0) with α = 0.05

The decision rule is: Reject H0 if Z > 1.645.

 

 

α        Z
0.10     1.282
0.05     1.645
0.025    1.960
0.010    2.326
0.005    2.576
0.001    3.090
0.0001   3.719

[Figure: standard normal distribution with the rejection region in the lower tail below −1.645, α = 0.05]

Rejection Region for Lower-Tailed Z Test (H1: μ < μ0) with α = 0.05

The decision rule is: Reject H0 if Z < -1.645.

α        Z
0.10     -1.282
0.05     -1.645
0.025    -1.960
0.010    -2.326
0.005    -2.576
0.001    -3.090
0.0001   -3.719

[Figure: standard normal distribution with rejection regions in both tails, α = 0.05]

Rejection Region for Two-Tailed Z Test (H1: μ ≠ μ0) with α = 0.05

The decision rule is: Reject H0 if Z < -1.960 or if Z > 1.960.

α        Z
0.20     1.282
0.10     1.645
0.05     1.960
0.010    2.576
0.001    3.291
0.0001   3.819

The complete table of critical values of Z for upper, lower and two-tailed tests can be found in the table of Z values to the right in "Other Resources."

Critical values of t for upper, lower and two-tailed tests can be found in the table of t values in "Other Resources."

  • Step 4. Compute the test statistic.  

Here we compute the test statistic by substituting the observed sample data into the test statistic identified in Step 2.

  • Step 5. Conclusion.  

The final conclusion is made by comparing the test statistic (which is a summary of the information observed in the sample) to the decision rule. The final conclusion will be either to reject the null hypothesis (because the sample data are very unlikely if the null hypothesis is true) or not to reject the null hypothesis (because the sample data are not very unlikely).  

If the null hypothesis is rejected, then an exact significance level is computed to describe the likelihood of observing the sample data assuming that the null hypothesis is true. The exact level of significance is called the p-value and it will be less than the chosen level of significance if we reject H 0 .

Statistical computing packages provide exact p-values as part of their standard output for hypothesis tests. In fact, when using a statistical computing package, the steps outlined above can be abbreviated. The hypotheses (Step 1) should always be set up in advance of any analysis, and the significance criterion should also be determined (e.g., α = 0.05). Statistical computing packages will produce the test statistic (usually reporting it as t) and a p-value. The investigator can then determine statistical significance using the following rule: if p < α, then reject H0.

 

 

  • Step 1. Set up hypotheses and determine level of significance

H0: μ = 191    H1: μ > 191    α = 0.05

The research hypothesis is that weights have increased, and therefore an upper tailed test is used.

  • Step 2. Select the appropriate test statistic.

Because the sample size is large (n > 30), the appropriate test statistic is Z = (x̄ − μ0)/(s/√n).

  • Step 3. Set up decision rule.  

In this example, we are performing an upper-tailed test (H1: μ > 191), with a Z test statistic and selected α = 0.05. Reject H0 if Z > 1.645.

We now substitute the sample data into the formula for the test statistic identified in Step 2.  

We reject H0 because 2.38 > 1.645. We have statistically significant evidence, at α = 0.05, that the mean weight of men in 2006 is more than 191 pounds.

Because we rejected the null hypothesis, we now approximate the p-value, which is the likelihood of observing the sample data if the null hypothesis is true. An alternative definition of the p-value is the smallest level of significance at which we can still reject H0. In this example, we observed Z = 2.38, and for α = 0.05 the critical value was 1.645; because 2.38 exceeds 1.645, we rejected H0 and reported a statistically significant increase in mean weight at the 5% level of significance.

Using the table of critical values for upper-tailed tests, we can approximate the p-value. If we select α = 0.025, the critical value is 1.960, and we still reject H0 because 2.38 > 1.960. If we select α = 0.010, the critical value is 2.326, and we still reject H0 because 2.38 > 2.326. However, if we select α = 0.005, the critical value is 2.576, and we cannot reject H0 because 2.38 < 2.576. Therefore, the smallest α at which we still reject H0 is 0.010; this is the approximate p-value. A statistical computing package would produce a more precise p-value, between 0.005 and 0.010. Here we would report p < 0.010.
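The more precise p-value mentioned above can be obtained from the standard normal CDF; a small sketch using Python's standard library:

```python
from statistics import NormalDist

z = 2.38
# Upper-tailed test: the p-value is the area above the observed Z.
p_value = 1 - NormalDist().cdf(z)
print(0.005 < p_value < 0.010)  # True
```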

In all tests of hypothesis, there are two types of errors that can be committed. The first is called a Type I error and refers to the situation where we incorrectly reject H 0 when in fact it is true. This is also called a false positive result (as we incorrectly conclude that the research hypothesis is true when in fact it is not). When we run a test of hypothesis and decide to reject H 0 (e.g., because the test statistic exceeds the critical value in an upper tailed test) then either we make a correct decision because the research hypothesis is true or we commit a Type I error. The different conclusions are summarized in the table below. Note that we will never know whether the null hypothesis is really true or false (i.e., we will never know which row of the following table reflects reality).

Table - Conclusions in Test of Hypothesis

                 Do Not Reject H0      Reject H0
H0 is True       Correct Decision      Type I Error
H0 is False      Type II Error         Correct Decision

In the first step of the hypothesis test, we select a level of significance, α, and α= P(Type I error). Because we purposely select a small value for α, we control the probability of committing a Type I error. For example, if we select α=0.05, and our test tells us to reject H 0 , then there is a 5% probability that we commit a Type I error. Most investigators are very comfortable with this and are confident when rejecting H 0 that the research hypothesis is true (as it is the more likely scenario when we reject H 0 ).

When we run a test of hypothesis and decide not to reject H 0 (e.g., because the test statistic is below the critical value in an upper tailed test) then either we make a correct decision because the null hypothesis is true or we commit a Type II error. Beta (β) represents the probability of a Type II error and is defined as follows: β=P(Type II error) = P(Do not Reject H 0 | H 0 is false). Unfortunately, we cannot choose β to be small (e.g., 0.05) to control the probability of committing a Type II error because β depends on several factors including the sample size, α, and the research hypothesis. When we do not reject H 0 , it may be very likely that we are committing a Type II error (i.e., failing to reject H 0 when in fact it is false). Therefore, when tests are run and the null hypothesis is not rejected we often make a weak concluding statement allowing for the possibility that we might be committing a Type II error. If we do not reject H 0 , we conclude that we do not have significant evidence to show that H 1 is true. We do not conclude that H 0 is true.
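Because β cannot be set directly, it is usually estimated for a specific alternative. The sketch below estimates β by simulation for a hypothetical upper-tailed z-test; every number in it (μ0 = 100, a true mean of 101.5, σ = 5, n = 30, α = 0.05) is an illustrative assumption, not a value from the text:

```python
import numpy as np
from scipy.stats import norm

mu0, true_mu, sigma, n, alpha = 100, 101.5, 5, 30, 0.05  # hypothetical scenario
z_crit = norm.ppf(1 - alpha)                             # 1.645 for alpha = 0.05

rng = np.random.default_rng(0)
reps = 20_000
se = sigma / np.sqrt(n)
xbars = rng.normal(true_mu, se, reps)   # simulated sample means when H 0 is false
z = (xbars - mu0) / se
beta = np.mean(z <= z_crit)             # P(do not reject H 0 | H 0 is false)
print(round(beta, 2))                   # roughly 0.5 for this particular scenario
```

Note how β here is large even though α was fixed at 0.05, which is exactly why conclusions from a non-significant test are worded weakly.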


 The most common reason for a Type II error is a small sample size.


Content ©2017. All Rights Reserved. Date last modified: November 6, 2017. Wayne W. LaMorte, MD, PhD, MPH

P-Value in Statistical Hypothesis Tests: What is it?

P value definition.

A p value is used in hypothesis testing to help you decide whether to reject the null hypothesis . The p value is the evidence against a null hypothesis . The smaller the p-value, the stronger the evidence that you should reject the null hypothesis.

P values are expressed as decimals, although it may be easier to understand what they are if you convert them to a percentage . For example, a p value of 0.0254 is 2.54%. This means that if the null hypothesis were true, there would only be a 2.54% chance of getting results at least as extreme as yours. That’s pretty tiny. On the other hand, a large p-value of .9 (90%) means that results like yours would turn up 90% of the time under the null hypothesis, so nothing surprising is going on. Therefore, the smaller the p-value, the more important (“ significant “) your results.

When you run a hypothesis test , you compare the p value from your test to the alpha level you selected when you ran the test. Alpha levels can also be written as percentages.


P Value vs Alpha level

Alpha levels are controlled by the researcher and are related to confidence levels . You get an alpha level by subtracting your confidence level from 100%. For example, if you want to be 98 percent confident in your research, the alpha level would be 2% (100% – 98%). When you run the hypothesis test, the test will give you a value for p. Compare that value to your chosen alpha level. For example, let’s say you chose an alpha level of 5% (0.05). If the results from the test give you:

  • A small p (≤ 0.05), reject the null hypothesis . This is strong evidence that the null hypothesis is invalid.
  • A large p (> 0.05) means the evidence against the null hypothesis is weak, so you do not reject the null.

P Values and Critical Values


What if I Don’t Have an Alpha Level?

In an ideal world, you’ll have an alpha level. But if you do not, you can still use the following rough guidelines in deciding whether to support or reject the null hypothesis:

  • If p > .10 → “not significant”
  • If p ≤ .10 → “marginally significant”
  • If p ≤ .05 → “significant”
  • If p ≤ .01 → “highly significant.”

How to Calculate a P Value on the TI 83

Example question: The average wait time to see an E.R. doctor is said to be 150 minutes. You think the wait time is actually less. You take a random sample of 30 people and find their average wait is 148 minutes with a standard deviation of 5 minutes. Assume the distribution is normal. Find the p value for this test.

  • Press STAT then arrow over to TESTS.
  • Press ENTER for Z-Test .
  • Arrow over to Stats. Press ENTER.
  • Arrow down to μ0 and type 150. This is our null hypothesis mean.
  • Arrow down to σ. Type in your std dev: 5.
  • Arrow down to xbar. Type in your sample mean : 148.
  • Arrow down to n. Type in your sample size : 30.
  • Arrow to <μ0 for a left tail test . Press ENTER.
  • Arrow down to Calculate. Press ENTER. P is given as .014, or about 1.4%.

The probability that you would get a sample mean of 148 minutes is tiny, so you should reject the null hypothesis.
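The same left-tailed test can be reproduced without the calculator. This sketch uses standard SciPy calls with the numbers from the example question:

```python
import math
from scipy.stats import norm

mu0, xbar, sigma, n = 150, 148, 5, 30       # claimed mean, sample mean, sd, size
z = (xbar - mu0) / (sigma / math.sqrt(n))   # test statistic, about -2.19
p_value = norm.cdf(z)                       # left-tail area, about 0.014
print(round(z, 2), round(p_value, 3))
```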

Note : If you don’t want to run a test, you could also use the TI 83 NormCDF function to get the area (which is the same thing as the probability value).



Statistics By Jim

Making statistics intuitive

One-Tailed and Two-Tailed Hypothesis Tests Explained

By Jim Frost

Choosing whether to perform a one-tailed or a two-tailed hypothesis test is one of the methodology decisions you might need to make for your statistical analysis. This choice can have critical implications for the types of effects it can detect, the statistical power of the test, and potential errors.

In this post, you’ll learn about the differences between one-tailed and two-tailed hypothesis tests and their advantages and disadvantages. I include examples of both types of statistical tests. In my next post, I cover the decision between one and two-tailed tests in more detail.

What Are Tails in a Hypothesis Test?

First, we need to cover some background material to understand the tails in a test. Typically, hypothesis tests take all of the sample data and convert it to a single value, which is known as a test statistic. You’re probably already familiar with some test statistics. For example, t-tests calculate t-values . F-tests, such as ANOVA, generate F-values . The chi-square test of independence and some distribution tests produce chi-square values. All of these values are test statistics. For more information, read my post about Test Statistics .

These test statistics follow a sampling distribution. Probability distribution plots display the probabilities of obtaining test statistic values when the null hypothesis is correct. On a probability distribution plot, the portion of the shaded area under the curve represents the probability that a value will fall within that range.

The graph below displays a sampling distribution for t-values. The two shaded regions cover the two tails of the distribution.

Plot that displays critical regions in the two tails of the distribution.

Keep in mind that this t-distribution assumes that the null hypothesis is correct for the population. Consequently, the peak (most likely value) of the distribution occurs at t=0, which represents the null hypothesis in a t-test. Typically, the null hypothesis states that there is no effect. As t-values move further away from zero, they represent larger effect sizes. When the null hypothesis is true for the population, obtaining samples that exhibit a large apparent effect becomes less likely, which is why the probabilities taper off for t-values further from zero.

Related posts : How t-Tests Work and Understanding Probability Distributions

Critical Regions in a Hypothesis Test

In hypothesis tests, critical regions are ranges of the distributions where the values represent statistically significant results. Analysts define the size and location of the critical regions by specifying both the significance level (alpha) and whether the test is one-tailed or two-tailed.

Consider the following two facts:

  • The significance level is the probability of rejecting a null hypothesis that is correct.
  • The sampling distribution for a test statistic assumes that the null hypothesis is correct.

Consequently, to represent the critical regions on the distribution for a test statistic, you merely shade the appropriate percentage of the distribution. For the common significance level of 0.05, you shade 5% of the distribution.

Related posts : Significance Levels and P-values and T-Distribution Table of Critical Values

Two-Tailed Hypothesis Tests

Two-tailed hypothesis tests are also known as nondirectional and two-sided tests because you can test for effects in both directions. When you perform a two-tailed test, you split the significance level percentage between both tails of the distribution. In the example below, I use an alpha of 5% and the distribution has two shaded regions of 2.5% (2 * 2.5% = 5%).
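As an illustration of how those cutoffs are found, the sketch below computes the two critical values for a t-distribution; the degrees of freedom (df = 20) are an arbitrary choice for the example, not a value from the text:

```python
from scipy.stats import t

alpha, df = 0.05, 20              # df chosen only for illustration
lower = t.ppf(alpha / 2, df)      # cuts off 2.5% in the left tail
upper = t.ppf(1 - alpha / 2, df)  # cuts off 2.5% in the right tail
print(round(lower, 3), round(upper, 3))   # -2.086 and 2.086, symmetric about zero
```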

When a test statistic falls in either critical region, your sample data are sufficiently incompatible with the null hypothesis that you can reject it for the population.

In a two-tailed test, the generic null and alternative hypotheses are the following:

  • Null : The effect equals zero.
  • Alternative :  The effect does not equal zero.

The specifics of the hypotheses depend on the type of test you perform because you might be assessing means, proportions, or rates.

Example of a two-tailed 1-sample t-test

Suppose we perform a two-sided 1-sample t-test where we compare the mean strength (4.1) of parts from a supplier to a target value (5). We use a two-tailed test because we care whether the mean is greater than or less than the target value.

To interpret the results, simply compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into one of the critical regions, but which one? Just look at the estimated effect. In the output below, the t-value is negative, so we know that the test statistic fell in the critical region in the left tail of the distribution, indicating the mean is less than the target value. Now we know this difference is statistically significant.

Statistical output from a two-tailed 1-sample t-test.

We can conclude that the population mean for part strength is less than the target value. However, the test had the capacity to detect a positive difference as well. You can also assess the confidence interval. With a two-tailed hypothesis test, you’ll obtain a two-sided confidence interval. The confidence interval tells us that the population mean is likely to fall between 3.372 and 4.828. This range excludes the target value (5), which is another indicator of significance.
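To make the mechanics concrete, here is a sketch of a two-tailed 1-sample t-test using SciPy's `ttest_1samp`. The strength data are made up for illustration; they are not the dataset behind the output described above, so the numbers differ:

```python
from scipy import stats

# Hypothetical strength measurements (illustrative, not the post's actual data)
strength = [4.2, 3.9, 4.5, 3.8, 4.0, 4.3, 4.1, 3.7, 4.4, 4.1]
target = 5

res = stats.ttest_1samp(strength, popmean=target)   # two-sided by default

# Two-sided 95% confidence interval for the mean
n = len(strength)
mean = sum(strength) / n
half_width = stats.t.ppf(0.975, n - 1) * stats.sem(strength)
ci = (mean - half_width, mean + half_width)

print(round(res.statistic, 2), res.pvalue < 0.05)   # negative t, significant
print(round(ci[0], 2), round(ci[1], 2))             # interval excludes the target
```

The negative t-value and a confidence interval that excludes the target value tell the same story as the output described in the text.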

Advantages of two-tailed hypothesis tests

You can detect both positive and negative effects. Two-tailed tests are standard in scientific research where discovering any type of effect is usually of interest to researchers.

One-Tailed Hypothesis Tests

One-tailed hypothesis tests are also known as directional and one-sided tests because you can test for effects in only one direction. When you perform a one-tailed test, the entire significance level percentage goes into the extreme end of one tail of the distribution.

In the examples below, I use an alpha of 5%. Each distribution has one shaded region of 5%. When you perform a one-tailed test, you must determine whether the critical region is in the left tail or the right tail. The test can detect an effect only in the direction that has the critical region. It has absolutely no capacity to detect an effect in the other direction.

In a one-tailed test, you have two options for the null and alternative hypotheses, which corresponds to where you place the critical region.

You can choose either of the following sets of generic hypotheses:

  • Null : The effect is less than or equal to zero.
  • Alternative : The effect is greater than zero.

Plot that displays a single critical region for a one-tailed test.

  • Null : The effect is greater than or equal to zero.
  • Alternative : The effect is less than zero.

Plot that displays a single critical region in the left tail for a one-tailed test.

Again, the specifics of the hypotheses depend on the type of test you perform.

Notice how for both possible null hypotheses the tests can’t distinguish between zero and an effect in a particular direction. For example, in the example directly above, the null combines “the effect is greater than or equal to zero” into a single category. That test can’t differentiate between zero and greater than zero.

Example of a one-tailed 1-sample t-test

Suppose we perform a one-tailed 1-sample t-test. We’ll use a similar scenario as before where we compare the mean strength of parts from a supplier (102) to a target value (100). Imagine that we are considering a new parts supplier. We will use them only if the mean strength of their parts is greater than our target value. There is no need for us to differentiate between whether their parts are equally strong or less strong than the target value—either way we’d just stick with our current supplier.

Consequently, we’ll choose the alternative hypothesis that states the mean difference is greater than zero (Population mean – Target value > 0). The null hypothesis states that the difference between the population mean and target value is less than or equal to zero.

Statistical output for a one-tailed 1-sample t-test.

To interpret the results, compare the p-value to your significance level. If the p-value is less than the significance level, you know that the test statistic fell into the critical region. For this study, the statistically significant result supports the notion that the population mean is greater than the target value of 100.

Confidence intervals for a one-tailed test are similarly one-sided. You’ll obtain either an upper bound or a lower bound. In this case, we get a lower bound, which indicates that the population mean is likely to be greater than or equal to 100.631. There is no upper limit to this range.

A lower bound matches our goal of determining whether the new parts are stronger than our target value. The fact that the lower bound (100.631) is higher than the target value (100) indicates that these results are statistically significant.
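A sketch of the one-tailed version, again with made-up data rather than the dataset behind the post's output (SciPy's `alternative='greater'` argument requires SciPy 1.6 or later):

```python
from scipy import stats

# Hypothetical part strengths from the candidate supplier (illustrative only)
strength = [103, 101, 104, 100, 102, 103, 101, 104, 102, 100]
target = 100

# alternative='greater' puts the entire alpha in the right tail
res = stats.ttest_1samp(strength, popmean=target, alternative='greater')

# One-sided 95% lower confidence bound: mean - t(0.95, df) * standard error
n = len(strength)
mean = sum(strength) / n
lower_bound = mean - stats.t.ppf(0.95, n - 1) * stats.sem(strength)

print(res.pvalue < 0.05, round(lower_bound, 2))   # significant; bound above 100
```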

This test is unable to detect a negative difference even when the sample mean represents a very negative effect.

Advantages and disadvantages of one-tailed hypothesis tests

One-tailed tests have more statistical power to detect an effect in one direction than a two-tailed test with the same design and significance level. One-tailed tests occur most frequently for studies where one of the following is true:

  • Effects can exist in only one direction.
  • Effects can exist in both directions but the researchers only care about an effect in one direction. There is no drawback to failing to detect an effect in the other direction. (Not recommended.)

The disadvantage of one-tailed tests is that they have no statistical power to detect an effect in the other direction.

As part of your pre-study planning process, determine whether you’ll use the one- or two-tailed version of a hypothesis test. To learn more about this planning process, read 5 Steps for Conducting Scientific Studies with Statistical Analyses .

This post explains the differences between one-tailed and two-tailed statistical hypothesis tests. How these forms of hypothesis tests function is clear and based on mathematics. However, there is some debate about when you can use one-tailed tests. My next post explores this decision in much more depth and explains the different schools of thought and my opinion on the matter— When Can I Use One-Tailed Hypothesis Tests .

If you’re learning about hypothesis testing and like the approach I use in my blog, check out my Hypothesis Testing book! You can find it at Amazon and other retailers.

Cover image of my Hypothesis Testing: An Intuitive Guide ebook.


Reader Interactions


August 23, 2024 at 1:28 pm

Thanks so much. This is very helpful.


June 26, 2022 at 12:14 pm

Hi, Can help me with figuring out the null and alternative hypothesis of the following statement? Some claimed that the real average expenditure on beverage by general people is at least $10.


February 19, 2022 at 6:02 am

thank you for the thorough explanation, I’m still struggling to wrap my mind around the t-table and the relation between the alpha values for one or two tail probability and the confidence levels on the bottom (I’m understanding it so wrongly that for me it should be the opposite, like one tail 0.05 should correspond to 95% CI and two tailed 0.025 should correspond to 95% because then you get the 2.5% on each side). In my mind if I picture the one tail diagram with an alpha of 0.05 I see the remaining 95% inside the diagram, but for a one tail I only see 90% CI paired with a 5% alpha… where did the other 5% go? I tried to understand when you said we should just double the alpha for a one tail probability in order to find the CI but I still can’t picture it. I have been trying to understand this. Like if you only have one tail and there is 0.05, shouldn’t the rest be on the other side? Why is it then 90%? I know I’m missing a point and I can’t figure it out and it’s so frustrating…


February 23, 2022 at 10:01 pm

The alpha is the total shaded area. So, if the alpha = 0.05, you know that 5% of the distribution is shaded. The number of tails tells you how to divide the shaded areas. Is it all in one region (1-tailed) or do you split the shaded regions in two (2-tailed)?

So, for a one-tailed test with an alpha of 0.05, the 5% shading is all in one tail. If alpha = 0.10, then it’s 10% on one side. If it’s two-tailed, then you need to split that 10% into two–5% in both tails. Hence, the 5% in a one-tailed test is the same as a two-tailed test with an alpha of 0.10 because that test has the same 5% on one side (but there’s another 5% in the other tail).
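That equivalence is easy to verify numerically; the sketch below uses the standard normal distribution for illustration:

```python
from scipy.stats import norm

one_tailed = norm.ppf(1 - 0.05)      # 5% all in one tail
two_tailed = norm.ppf(1 - 0.10 / 2)  # 10% split as 5% per tail
print(round(one_tailed, 3), round(two_tailed, 3))   # both 1.645
```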

It’s similar for CIs. However, for CIs, you shade the middle rather than the extremities. I write about that in one my articles about hypothesis testing and confidence intervals .

I’m not sure if I’m answering your question or not.


February 17, 2022 at 1:46 pm

I ran a post hoc Dunnett’s test alpha=0.05 after a significant Anova test in Proc Mixed using SAS. I want to determine if the means for treatment (t1, t2, t3) is significantly less than the means for control (p=pathogen). The code for the dunnett’s test is – LSmeans trt / diff=controll (‘P’) adjust=dunnett CL plot=control; I think the lower bound one tailed test is the correct test to run but I’m not 100% sure. I’m finding conflicting information online. In the output table for the dunnett’s test the mean difference between the control and the treatments is t1=9.8, t2=64.2, and t3=56.5. The control mean estimate is 90.5. The adjusted p-value by treatment is t1(p=0.5734), t2 (p=.0154) and t3(p=.0245). The adjusted lower bound confidence limit in order from t1-t3 is -38.8, 13.4, and 7.9. The adjusted upper bound for all tests is infinity. The graphical output for the dunnett’s test in SAS is difficult to understand for those of us who are beginner SAS users. All treatments appear as a vertical line below the horizontal line for control at 90.5 with t2 and t3 in the shaded area. For treatment 1 the shaded area is above the line for control. Looking at just the output table I would say that t2 and t3 are significantly lower than the control. I guess I would like to know if my interpretation of the outputs is correct that treatments 2 and 3 are statistically significantly lower than the control? Should I have used an upper bound one tailed test instead?


November 10, 2021 at 1:00 am

Thanks Jim. Please help me understand how a two tailed testing can be used to minimize errors in research


July 1, 2021 at 9:19 am

Hi Jim, Thanks for posting such a thorough and well-written explanation. It was extremely useful to clear up some doubts.


May 7, 2021 at 4:27 pm

Hi Jim, I followed your instructions for the Excel add-in. Thank you. I am very new to statistics and sort of enjoy it as I enter week number two in my class. I am to select if three scenarios call for a one or two-tailed test is required and why. The problem is stated:

30% of mole biopsies are unnecessary. Last month at his clinic, 210 out of 634 had benign biopsy results. Is there enough evidence to reject the dermatologist’s claim?

Part two, the wording changes to “more than of 30% of biopsies,” and part three, the wording changes to “less than 30% of biopsies…”

I am not asking for the problem to be solved for me, but I cannot seem to find the direction needed. I know the elements I am dealing with are =30%, greater than 30%, and less than 30%, plus 210 and 634. I just don’t know what to do with the information. I can’t seem to find an example of a similar problem to work with.

May 9, 2021 at 9:22 pm

As I detail in this post, a two-tailed test tells you whether an effect exists in either direction. Or, is it different from the null value in either direction. For the first example, the wording suggests you’d need a two-tailed test to determine whether the population proportion is ≠ 30%. Whenever you just need to know ≠, it suggests a two-tailed test because you’re covering both directions.

For part two, because it’s in one direction (greater than), you need a one-tailed test. Same for part three but it’s less than. Look in this blog post to see how you’d construct the null and alternative hypotheses for these cases. Note that you’re working with a proportion rather than the mean, but the principles are the same! Just plug your scenario and the concept of proportion into the wording I use for the hypotheses.

I hope that helps!


April 11, 2021 at 9:30 am

Hello Jim, great website! I am using a statistics program (SPSS) that does NOT compute one-tailed t-tests. I am trying to compare two independent groups and have justifiable reasons why I only care about one direction. Can I do the following? Use SPSS for two-tailed tests to calculate the t & p values. Then report the p-value as p/2 when it is in the predicted direction (e.g., SPSS says p = .04, so I report p = .02), and report the p-value as 1 – (p/2) when it is in the opposite direction (e.g., SPSS says p = .04, so I report p = .98)? If that is incorrect, what do you suggest (hopefully besides changing statistics programs)? Also, if I want to report confidence intervals, I realize that I would only have an upper or lower bound, but can I use the CI’s from SPSS to compute that? Thank you very much!

April 11, 2021 at 5:42 pm

Yes, for p-values, that’s absolutely correct for both cases.

For confidence intervals, if you take one endpoint of a two-sided CI, it becomes a one-sided bound with half the significance level (and therefore a higher confidence level).

Consequently, to obtain a one-sided bound with your desired confidence level, you need to take your desired significance level (e.g., 0.05) and double it. Then subtract it from 1. So, if you’re using a significance level of 0.05, double that to 0.10 and then subtract from 1 (1 – 0.10 = 0.90). 90% is the confidence level you want to use for the two-sided interval. After obtaining the two-sided CI, use one of the endpoints depending on the direction of your hypothesis (i.e., upper or lower bound). That produces the one-sided bound with the confidence level that you want. For our example, we calculated a 95% one-sided bound.
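The p-value conversion confirmed in this exchange is simple arithmetic. The helper below (a hypothetical function written just for illustration) encodes the rule:

```python
def one_sided_p(two_sided_p, predicted_direction):
    """Convert a two-sided p-value to a one-sided p-value.

    predicted_direction: True if the observed effect lies in the
    hypothesized direction, False otherwise.
    """
    return two_sided_p / 2 if predicted_direction else 1 - two_sided_p / 2

print(one_sided_p(0.04, True))    # 0.02
print(one_sided_p(0.04, False))   # 0.98
```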


March 3, 2021 at 8:27 am

Hi Jim. I used the one-tailed(right) statistical test to determine an anomaly in the below problem statement: On a daily basis, I calculate the (mapped_%) in a common field between two tables.

The way I used the t-test is: On any particular day, I calculate the sample_mean, S.D and sample_count (n=30) for the last 30 days including the current day. My null hypothesis, H0 (pop. mean)=95 and H1>95 (alternate hypothesis). So, I calculate the t-stat based on the sample_mean, pop.mean, sample S.D and n. I then choose the t-crit value for 0.05 from my t-distribution table for dof(n-1). On the current day if my abs(t-stat)>t-crit, then I reject the null hypothesis and I say the mapped_pct on that day has passed the t-test.

I get some weird results here, where if my mapped_pct is as low as 6%-8% in all the past 30 days, the t-test still gets a “pass” result. Could you help on this? If my hypothesis needs to be changed.

I would basically look for the mapped_pct >95, if it worked on a static trigger. How can I use the t-test effectively in this problem statement?


December 18, 2020 at 8:23 pm

Hello Dr. Jim, I am wondering if there is evidence in one of your books or other source you could provide, which supports that it is OK not to divide alpha level by 2 in one-tailed hypotheses. I need the source for supporting evidence in a Portfolio exercise and couldn’t find one.

I am grateful for your reply and for your statistics knowledge sharing!


November 27, 2020 at 10:31 pm

If I did a one directional F test ANOVA(one tail ) and wanted to calculate a confidence interval for each individual groups (3) mean . Would I use a one tailed or two tailed t , within my confidence interval .

November 29, 2020 at 2:36 am

Hi Bashiru,

F-tests for ANOVA will always be one-tailed for the reasons I discuss in this post. To learn more about, read my post about F-tests in ANOVA .

For the differences between my groups, I would not use t-tests because the family-wise error rate quickly grows out of hand. To learn more about how to compare group means while controlling the familywise error rate, read my post about using post hoc tests with ANOVA . Typically, these are two-sided intervals but you’d be able to use one-sided.


November 26, 2020 at 10:51 am

Hi Jim, I had a question about the formulation of the hypotheses. When you want to test if a beta = 1 or a beta = 0. What will be the null hypotheses? I’m having trouble with finding out. Because in most cases beta = 0 is the null hypotheses but in this case you want to test if beta = 0. so i’m having my doubts can it in this case be the alternative hypotheses or is it still the null hypotheses?

Kind regards, Noa

November 27, 2020 at 1:21 am

Typically, the null hypothesis represents no effect or no relationship. As an analyst, you’re hoping that your data have enough evidence to reject the null and favor the alternative.

Assuming you’re referring to beta as in regression coefficients, zero represents no relationship. Consequently, beta = 0 is the null hypothesis.

You might hope that beta = 1, but you don’t usually include that in your alternative hypothesis. The alternative hypothesis usually just states that the coefficient does not equal zero. In other words, there is an effect, but it doesn’t state what that effect is.

There are some exceptions to the above but I’m writing about the standard case.


November 22, 2020 at 8:46 am

Your articles are a help to intro to econometrics students. Keep up the good work! More power to you!


November 6, 2020 at 11:25 pm

Hello Jim. Can you help me with these please?

Write the null and alternative hypothesis using a 1-tailed and 2-tailed test for each problem. (In paragraph and symbols)

A teacher wants to know if there is a significant difference in the performance in MAT C313 between her morning and afternoon classes.

It is known that in our university canteen, the average waiting time for a customer to receive and pay for his/her order is 20 minutes. Additional personnel has been added and now the management wants to know if the average waiting time had been reduced.

November 8, 2020 at 12:29 am

I cover how to write the hypotheses for the different types of tests in this post. So, you just need to figure which type of test you need to use. In your case, you want to determine whether the mean waiting time is less than the target value of 20 minutes. That’s a 1-sample t-test because you’re comparing a mean to a target value (20 minutes). You specifically want to determine whether the mean is less than the target value. So, that’s a one-tailed test. And, you’re looking for a mean that is “less than” the target.

So, go to the one-tailed section in the post and look for the hypotheses for the effect being less than. That’s the one with the critical region on the left side of the curve.

Now, you need include your own information. In your case, you’re comparing the sample estimate to a population mean of 20. The 20 minutes is your null hypothesis value. Use the symbol mu μ to represent the population mean.

You put all that together and you get the following:

Null: μ ≥ 20 Alternative: μ < 20. Feel free to use H 0 to denote the null hypothesis and H 1 or H A to denote the alternative hypothesis if that’s what you’ve been using in class.


October 17, 2020 at 12:11 pm

I was just wondering if you could please help with clarifying what the hypothesises would be for say income for gamblers and, age of gamblers. I am struggling to find which means would be compared.

October 17, 2020 at 7:05 pm

Those are both continuous variables, so you’d use either correlation or regression for them. For both of those analyses, the hypotheses are the following:

Null : The correlation or regression coefficient equals zero (i.e., there is no relationship between the variables) Alternative : The coefficient does not equal zero (i.e., there is a relationship between the variables.)

When the p-value is less than your significance level, you reject the null and conclude that a relationship exists.


October 17, 2020 at 3:05 am

I was ask to choose and justify the reason between a one tailed and two tailed test for dummy variables, how do I do that and what does it mean?

October 17, 2020 at 7:11 pm

I don’t have enough information to answer your question. A dummy variable is also known as an indicator variable, which is a binary variable that indicates the presence or absence of a condition or characteristic. If you’re using this variable in a hypothesis test, I’d presume that you’re using a proportions test, which is based on the binomial distribution for binary data.

Choosing between a one-tailed or two-tailed test depends on subject area issues and, possibly, your research objectives. Typically, use a two-tailed test unless you have a very good reason to use a one-tailed test. To understand when you might use a one-tailed test, read my post about when to use a one-tailed hypothesis test .


October 16, 2020 at 2:07 pm

In your one-tailed example, Minitab describes the hypotheses as “Test of mu = 100 vs > 100”. Any idea why Minitab says the null is “=” rather than “= or less than”? No ASCII character for it?

October 16, 2020 at 4:20 pm

I’m not entirely sure even though I used to work there! I know we had some discussions about how to represent that hypothesis but I don’t recall the exact reasoning. I suspect that it has to do with the conclusions that you can draw. Let’s focus on the failing to reject the null hypothesis. If the test statistic falls in that region (i.e., it is not significant), you fail to reject the null. In this case, all you know is that you have insufficient evidence to say it is different than 100. I’m pretty sure that’s why they use the equal sign because it might as well be one.

Mathematically, I think using ≤ is more accurate, which you can really see when you look at the distribution plots. That’s why I phrase the hypotheses using ≤ or ≥ as needed. However, in terms of the interpretation, the “less than” portion doesn’t really add anything of importance. You can conclude that its equal to 100 or greater than 100, but not less than 100.

October 15, 2020 at 5:46 am

Thank you so much for your timely feedback. It helps a lot

October 14, 2020 at 10:47 am

How can I use a one-tailed test at 5% alpha on this problem?

A manufacturer of cellular phone batteries claims that when fully charged, the mean life of his product lasts for 26 hours with a standard deviation of 5 hours. Mr X, a regular distributor, randomly picked and tested 35 of the batteries. His test showed that the average life of his sample is 25.5 hours. Is there a significant difference between the average life of all the manufacturer’s batteries and the average battery life of his sample?

October 14, 2020 at 8:22 pm

I don’t think you’d want to use a one-tailed test. The goal is to determine whether the sample is significantly different from the manufacturer’s population average. You’re not asking whether it’s significantly greater than or less than, which would be a one-tailed test. As phrased, you want a two-tailed test because it can detect a difference in either direction.

It sounds like you need to use a 1-sample t-test to test the mean. During this test, enter 26 as the test mean. The procedure will tell you if the sample mean of 25.5 hours is significantly different from that test mean. Similarly, you’d need a one variance test to determine whether the sample standard deviation is significantly different from the test value of 5 hours.

For both of these tests, compare the p-value to your alpha of 0.05. If the p-value is less than this value, your results are statistically significant.
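A sketch of that 1-sample t-test from the summary figures in the question (n = 35, sample mean 25.5, test mean 26; the stated 5 hours is used as the sample standard deviation, which is an assumption since only summaries are given):

```python
import math
from scipy import stats

n, xbar, mu0, s = 35, 25.5, 26.0, 5.0

# One-sample t-statistic: t = (x-bar - mu0) / (s / sqrt(n))
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Two-tailed p-value from the t-distribution with n - 1 df
p_two_tailed = 2 * stats.t.sf(abs(t_stat), n - 1)
```

The p-value comes out well above 0.05, so the sample mean of 25.5 hours is not significantly different from the claimed 26 hours at the 5% level.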

September 22, 2020 at 4:16 am

Hi Jim, I didn’t get a clear idea of when to use a two-tailed test versus a one-tailed test. Will you please explain?

September 22, 2020 at 10:05 pm

I have a complete article dedicated to that: When Can I Use One-Tailed Tests .

Basically, start with the assumption that you’ll use a two-tailed test but then consider scenarios where a one-tailed test can be appropriate. I talk about all of that in the article.

If you have questions after reading that, please don’t hesitate to ask!

July 31, 2020 at 12:33 pm

Thank you so so much for this webpage.

I have two scenarios that I need some clarification on. I will really appreciate it if you can take a look:

So I have several materials, and I know when they are tested after production. My hypothesis is that the earlier they are tested after production, the higher the mean value I should expect; the later they are tested, the lower the mean value. Since this is more of a “greater or lesser” situation, I should use a one-tailed test. Is that the correct approach?

On the other hand, I have several mixes of materials for which I don’t know when they were tested after production. I only know the mean values of the test, and I only want to know whether one mean value is truly higher or lower than the other; I guess I want to know if they are significantly different. Should I use a two-tailed test for this? If they are not significantly different, I can judge based on the mean values alone. And if they are significantly different, then I will need to do another type of analysis. Also, when I get my p-value for the two-tailed test, should I compare it to 0.025 or 0.05 if my significance level is 0.05?

Thank you so much again.

July 31, 2020 at 11:19 pm

For your first, if you absolutely know that the mean must be lower the later the material is tested, that it cannot be higher, that would be a situation where you can use a one-tailed test. However, if that’s not a certainty, you’re just guessing, use a two-tail test. If you’re measuring different items at the different times, use the independent 2-sample t-test. However, if you’re measuring the same items at two time points, use the paired t-test. If it’s appropriate, using the paired t-test will give you more statistical power because it accounts for the variability between items. For more information, see my post about when it’s ok to use a one-tailed test .

For the mix of materials, use a two-tailed test because the effect truly can go either direction.

Always compare the p-value to your full significance level regardless of whether it’s a one or two-tailed test. Don’t divide the significance level in half.

June 17, 2020 at 2:56 pm

Is it possible to reach opposite conclusions if we use the critical value method versus the p-value method? Secondly, if we perform a one-tailed test and use the p-value method to evaluate H0, do we need to convert the two-tailed significance value into a one-tailed one? Can that be done just by dividing it by 2?

June 18, 2020 at 5:17 pm

The p-value method and critical value method will always agree as long as you’re not changing anything about the methodology.

If you’re using statistical software, you don’t need to make any adjustments. The software will do that for you.

However, if you’re calculating it by hand, you’ll need to take your significance level and then look up your test statistic in the table for a one-tailed test. For example, you’ll want to look up 5% for a one-tailed test rather than a two-tailed test. That’s not as simple as dividing by two. In this article, I show examples of one-tailed and two-tailed tests with the same degrees of freedom. The t critical value for the two-tailed test is +/- 2.086, while for the one-sided test it is 1.725. It is true that the probability associated with those critical values doubles for the one-tailed test (2.5% -> 5%), but the critical value itself is not halved (2.086 -> 1.725). Study the first several graphs in this article to see why that is true.
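That table lookup can be sketched with scipy’s inverse t-distribution; 20 degrees of freedom is assumed here because it reproduces the 2.086 and 1.725 values quoted above:

```python
from scipy import stats

alpha, df = 0.05, 20

# Two-tailed: split alpha across both tails (alpha/2 in each),
# then use +/- this critical value.
two_tailed = stats.t.ppf(1 - alpha / 2, df)  # ~2.086

# One-tailed (right tail): all of alpha in one tail.
one_tailed = stats.t.ppf(1 - alpha, df)      # ~1.725
```

Note that the one-tailed critical value is smaller in magnitude, not half of the two-tailed one, exactly as described above.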

For the p-value, you can take a two-tailed p-value and divide by 2 to determine the one-sided p-value. However, if you’re using statistical software, it does that for you.
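A minimal sketch of that p-value conversion, with the caveat (an assumption worth stating explicitly) that halving is only valid when the observed effect lies in the hypothesized direction:

```python
def one_sided_p(p_two_tailed, effect_in_hypothesized_direction):
    """Convert a two-tailed p-value to a one-sided p-value."""
    if effect_in_hypothesized_direction:
        return p_two_tailed / 2
    # Effect points the wrong way: the one-sided p-value is large.
    return 1 - p_two_tailed / 2

p = one_sided_p(0.04, True)  # 0.02
```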

June 11, 2020 at 3:46 pm

Hello Jim, if you have the time I’d be grateful if you could shed some clarity on this scenario:

“A researcher believes that aromatherapy can relieve stress but wants to determine whether it can also enhance focus. To test this, the researcher selected a random sample of students to take an exam in which the average score in the general population is 77. Prior to the exam, these students studied individually in a small library room where a lavender scent was present. If students in this group scored significantly above the average score in the general population [is this a one-tailed or two-tailed hypothesis?], then this was taken as evidence that the lavender scent enhanced focus.”

Thank you for your time if you do decide to respond.

June 11, 2020 at 4:00 pm

It’s unclear from the information provided whether the researchers used a one-tailed or two-tailed test. It could be either. A two-tailed test can detect effects in both directions, so it could definitely detect an average group score above the population score. However, you could also detect that effect using a one-tailed test if it was set up correctly. So, there’s not enough information in what you provided to know for sure.

However, that’s irrelevant to answering the question. The tricky part, as I see it, is that you’re not entirely sure about why the scores are higher. Are they higher because the lavender scent increased concentration or are they higher because the subjects have lower stress from the lavender? Or, maybe it’s not even related to the scent but some other characteristic of the room or testing conditions in which they took the test. You just know the scores are higher but not necessarily why they’re higher.

I’d say that, no, it’s not necessarily evidence that the lavender scent enhanced focus. There are competing explanations for why the scores are higher. Also, it would be best to do this as an experiment with a control group and a treatment group where subjects are randomly assigned to either group. That process helps establish causality rather than just correlation and helps rule out competing explanations for why the scores are higher.

By the way, I spend a lot of time on these issues in my Introduction to Statistics ebook .

June 9, 2020 at 1:47 pm

If a left-tailed test has an alpha of 0.05, how do you find the critical value in the table?

April 19, 2020 at 10:35 am

Hi Jim, my question is about the results in the table in your example of the one-sample t (two-tailed) test above. What about the p-value? The p-value listed is 0.018. I’m assuming that is compared to an alpha of 0.025, correct?

In regression analysis, when I get a test statistic of -2.099 for the predictive variable and a p-value of 0.039, am I comparing the p-value to an alpha of 0.025 or 0.05? Now, if I run a bootstrap analysis for the coefficients, the results say the sig (2-tail) is 0.098. What are the critical values and alpha in this case? I’m trying to reconcile what I’m seeing in both tables.

Thanks for your help.

April 20, 2020 at 3:24 am

Hi Marvalisa,

For one-tailed tests, you don’t need to divide alpha in half. If you can tell your software to perform a one-tailed test, it’ll do all the calculations necessary so you don’t need to adjust anything. So, if you’re using an alpha of 0.05 for a one-tailed test and your p-value is 0.04, it is significant. The procedures adjust the p-values automatically and it all works out. So, whether you’re using a one-tailed or two-tailed test, you always compare the p-value to the alpha with no need to adjust anything. The procedure does that for you!

The exception would be if, for some reason, your software doesn’t allow you to specify that you want a one-tailed test instead of a two-tailed test. Then, you divide the p-value from the two-tailed test in half to get the p-value for the one-tailed test. You’d still compare it to your original alpha.

For regression, the same thing applies. If you want to use a one-tailed test for a coefficient, just divide the p-value in half if you can’t tell the software that you want a one-tailed test. The default is two-tailed. If your software has the option of one-tailed tests for any procedure, including regression, it’ll adjust the p-value for you. So, in the normal course of things, you won’t need to adjust anything.

March 26, 2020 at 12:00 pm

Hey Jim, for a one-tailed hypothesis test with a 0.05 significance level, should I use a 95% confidence interval or a 90% confidence interval? Thanks

March 26, 2020 at 5:05 pm

You should use a one-sided 95% confidence interval. One-sided CIs have either an upper or a lower bound but remain unbounded on the other side.
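A sketch of such a one-sided 95% CI for a mean, computed as a lower bound only (the sample data here are invented purely for illustration):

```python
import math
from scipy import stats

data = [4.9, 5.2, 5.1, 4.8, 5.3, 5.0, 5.2, 4.9, 5.1, 5.0]
n = len(data)
mean = sum(data) / n
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# One-sided 95% lower confidence bound: put all 5% in one tail,
# so use t.ppf(0.95, n - 1) rather than t.ppf(0.975, n - 1).
lower = mean - stats.t.ppf(0.95, n - 1) * s / math.sqrt(n)

# The interval is (lower, +infinity): bounded below, unbounded above.
```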

March 16, 2020 at 4:30 pm

This is not applicable to the subject, but... when performing tests of equivalence, we look at the confidence interval of the difference between two groups, and we perform two one-sided t-tests for equivalence.
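That two one-sided tests (TOST) idea can be sketched as follows. The equivalence margin `delta`, the sample data, and the simple pooled-style degrees of freedom are all illustrative assumptions (Welch degrees of freedom are also common in practice):

```python
import numpy as np
from scipy import stats

def tost_equivalence(a, b, delta):
    """TOST sketch: is the mean difference (a - b) within +/- delta?"""
    diff = np.mean(a) - np.mean(b)
    se = np.sqrt(np.var(a, ddof=1) / len(a) + np.var(b, ddof=1) / len(b))
    df = len(a) + len(b) - 2
    # Test 1 -- H0: diff <= -delta  vs  H1: diff > -delta
    p_lower = stats.t.sf((diff + delta) / se, df)
    # Test 2 -- H0: diff >= +delta  vs  H1: diff < +delta
    p_upper = stats.t.cdf((diff - delta) / se, df)
    # Equivalence requires rejecting BOTH one-sided nulls,
    # so the overall p-value is the larger of the two.
    return max(p_lower, p_upper)

a = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9])
b = np.array([10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 10.1, 10.2, 10.0])
p = tost_equivalence(a, b, delta=0.5)  # small p => groups equivalent
```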

March 15, 2020 at 7:51 am

Thanks for this illustrative blogpost. I had a question on one of your points though.

By definition of H1 and H0, a two-sided alternative hypothesis is that there is a difference in means between the test and control, not that anything is ‘better’ or ‘worse’.

Just because we observed a negative result in your example does not mean we can conclude it’s necessarily worse, but instead just ‘different’.

Therefore, while it enables us to spot the fact that there may be differences between test and control, we cannot make claims about directional effects. So I struggle to see why they actually need to be used instead of one-sided tests.

What’s your take on this?

March 16, 2020 at 3:02 am

Hi Dominic,

If you’ll notice, I carefully avoid stating better or worse because in a general sense you’re right. However, given the context of a specific experiment, you can conclude whether a negative value is better or worse. As always in statistics, you have to use your subject-area knowledge to help interpret the results. In some cases, a negative value is a bad result. In other cases, it’s not. Use your subject-area knowledge!

I’m not sure why you think you can’t make claims about directional effects. Of course you can!

As for why you shouldn’t use one-tailed tests for most cases, read my post When Can I Use One-Tailed Tests . That should answer your questions.

May 10, 2019 at 12:36 pm

Your website is absolutely amazing Jim, you seem like the nicest guy for doing this and I like how there’s no ulterior motive, (I wasn’t automatically signed up for emails or anything when leaving this comment). I study economics and found econometrics really difficult at first, but your website explains it so clearly its been a big asset to my studies, keep up the good work!

May 10, 2019 at 2:12 pm

Thank you so much, Jack. Your kind words mean a lot!

April 26, 2019 at 5:05 am

Hi Jim, I really need your help now, please.

Are one-tailed and two-tailed hypothesis tests the same, or is one twice or half the other, or are they unrelated?

April 26, 2019 at 11:41 am

Hi Anthony,

I describe how the hypotheses are different in this post. You’ll find your answers.

February 8, 2019 at 8:00 am

Thank you for your blog Jim, I have a Statistics exam soon and your articles let me understand a lot!

February 8, 2019 at 10:52 am

You’re very welcome! I’m happy to hear that it’s been helpful. Best of luck on your exam!

January 12, 2019 at 7:06 am

Hi Jim, when you say the target value is 5, do you mean the population mean is 5 and we are trying to validate it with the help of the sample mean 4.1 using hypothesis tests? If so, how can we treat a population parameter as 5 when it is almost impossible to measure a population parameter? Please clarify.

January 12, 2019 at 6:57 pm

When you set a target for a one-sample test, it’s based on a value that is important to you. It’s not a population parameter or anything like that. The example in this post uses a case where we need parts that are stronger on average than a value of 5. We derive the value of 5 by using our subject area knowledge about what is required for a situation. Given our product knowledge for the hypothetical example, we know it should be 5 or higher. So, we use that in the hypothesis test and determine whether the population mean is greater than that target value.

When you perform a one-sample test, a target value is optional. If you don’t supply a target value, you simply obtain a confidence interval for the range of values that the parameter is likely to fall within. But, sometimes there is meaningful number that you want to test for specifically.
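A sketch of that one-sample test against a target of 5, testing whether the population mean is greater than the target. The strength measurements are invented for illustration, and the `alternative` keyword assumes scipy 1.6 or later:

```python
from scipy import stats

# Hypothetical strength measurements for the parts
strengths = [5.4, 5.1, 5.6, 5.3, 4.9, 5.5, 5.2, 5.4, 5.0, 5.3]

# H0: mu <= 5  vs  H1: mu > 5 (right-tailed one-sample t-test)
result = stats.ttest_1samp(strengths, popmean=5, alternative='greater')

# A small p-value is evidence that the population mean exceeds 5.
```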

I hope that clarifies the rationale behind the target value!

November 15, 2018 at 8:08 am

I understand that in Psychology a one-tailed hypothesis is preferred. Is that so?

November 15, 2018 at 11:30 am

No, there’s no overall preference for one-tailed hypothesis tests in statistics. That would be a study-by-study decision based on the types of possible effects. For more information about this decision, read my post: When Can I Use One-Tailed Tests?

November 6, 2018 at 1:14 am

I’m grateful to you for the explanations on One tail and Two tail hypothesis test. This opens my knowledge horizon beyond what an average statistics textbook can offer. Please include more examples in future posts. Thanks

November 5, 2018 at 10:20 am

Thank you. I will search it as well.

Stan Alekman

November 4, 2018 at 8:48 pm

Jim, what is the difference between the central and non-central t-distributions with respect to hypothesis testing?

November 5, 2018 at 10:12 am

Hi Stan, this is something I will need to look into. I know central t-distribution is the common Student t-distribution, but I don’t have experience using non-central t-distributions. There might well be a blog post in that–after I learn more!

November 4, 2018 at 7:42 pm

this is awesome.
