Hypothesis testing for a correlation that is zero or negative

I would like to test for a correlation that is zero or negative using the following hypothesis test: $H_0: \rho>0$ (null hypothesis: the correlation is positive), $H_A: \rho\le0$ (alternative hypothesis: the correlation is zero or negative).

Since this differs from the usual correlation hypothesis test, $(H_0: \rho=0,\ H_A: \rho\ne0)$, I am having difficulty figuring out the formulas.

How would I perform the calculations for this hypothesis test? (Of course the calculations for $r,\ r^2,\ SS_x,\ SS_y,\ n,$ etc. are all known.) Thank you in advance for your time!

  • hypothesis-testing
  • correlation

Nick Stauner's user avatar

  • 2 $\begingroup$ The difficulty is having the equality in the alternative. How is one to find the distribution of the test statistic under the null? (Can you explain more about the underlying situation?) $\endgroup$ –  Glen_b Commented Feb 10, 2014 at 21:26
  • $\begingroup$ Would it be possible to invert it and calculate and report the p-value for the Type II error? H0: p<=0 (Null hypothesis: the correlation is zero or negative) HA: p>0 (Alternative hypothesis: the correlation is positive) $\endgroup$ –  user39947 Commented Feb 10, 2014 at 21:34
  • $\begingroup$ The situation is to demonstrate that a certain action (treatment) is not providing any benefit (not a positive correlation). For the moment we don't care to distiguish if is has no effect or a negative effect. $\endgroup$ –  user39947 Commented Feb 10, 2014 at 21:39
  • 1 $\begingroup$ I don't understand what you mean by the phrase "report the p-value for the Type II error". The usual approach to the problem you have would be to test against the alternative that it does have a positive effect; failure to reject doesn't prove 'no positive effect', but it does show an absence of evidence for it. $\endgroup$ –  Glen_b Commented Feb 10, 2014 at 21:56
  • 3 $\begingroup$ It might be better to simply look at a confidence interval for the effect (even two-sided), and to discuss the conclusions in the light of that interval. You may get less tangled up in the hypothesis-testing logic. An alternative would be to begin reviewing the way hypothesis tests actually work (I don't mean the hand-wavy explanations, I mean the mathematics). $\endgroup$ –  Glen_b Commented Feb 10, 2014 at 22:13

3 Answers

Strange that no direct answer to the original question has been given (even though @Nick Stauner and @Glen_b nicely elaborated on possibly superior alternatives). The Wikipedia article discusses various methods, including the following, which is probably the most direct answer.

A one-sided hypothesis test on a correlation can be performed via t as a test statistic. Here,

$t = r\sqrt{\frac{n-2}{1-r^2}}$

with the critical value found via $t_{\alpha,n-2}$ (in the more common two-sided case, $\alpha$ is simply replaced by $\alpha/2$). So, to reject in favor of a negative correlation, reject if the $t$ resulting from plugging your $n$ and $r$ into the above formula is smaller than the (negative) lower-tail critical value determined by your $n$ and desired $\alpha$. (Admittedly, even this does not precisely answer the question in the sense that a correlation of exactly 0 is filed on the wrong side.)
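As a sketch of this procedure (using SciPy, with an illustrative $r$ and $n$, since the question's actual values aren't given), a one-sided test that rejects for sufficiently negative $t$ might look like:

```python
import math
from scipy import stats

def corr_t_test_one_sided(r, n, alpha=0.05):
    """One-sided test against the alternative that the correlation is negative.

    Rejects when the t statistic falls below the lower-tail
    critical value with n - 2 degrees of freedom.
    """
    t = r * math.sqrt((n - 2) / (1 - r**2))
    t_crit = stats.t.ppf(alpha, n - 2)   # negative lower-tail critical value
    p_value = stats.t.cdf(t, n - 2)      # P(T <= t) under rho = 0
    return t, p_value, t < t_crit

# illustrative values: r = -0.45 observed from n = 30 data points
t, p, reject = corr_t_test_one_sided(-0.45, 30)
```

Here `stats.t.ppf(alpha, n - 2)` returns the negative lower-tail critical value, so the rejection rule matches the one described above.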

Alternatively, a permutation test can be performed (see the wiki article).
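A minimal sketch of such a permutation test (the data and shuffle count here are illustrative assumptions, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

def perm_test_corr(x, y, n_perm=2000):
    """One-sided permutation test for a negative correlation.

    Shuffling y breaks any x-y association, so the shuffled
    correlations approximate the null distribution of r; the
    p-value is the share of them at least as negative as observed.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r_obs = np.corrcoef(x, y)[0, 1]
    r_perm = np.array([np.corrcoef(x, rng.permutation(y))[0, 1]
                       for _ in range(n_perm)])
    p = (np.sum(r_perm <= r_obs) + 1) / (n_perm + 1)  # add-one correction
    return r_obs, p

# illustrative data with a clearly negative relationship
x = np.arange(50, dtype=float)
y = -x + rng.normal(0, 5, size=50)
r_obs, p = perm_test_corr(x, y)
```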


  • 1 $\begingroup$ I was just about to edit my answer to include this following @whuber's comment :) IIRC, the fact that the zero was "filed on the wrong side" threw me off initially...However, I see the OP wasn't all that adamant about designing the hypotheses that way . $\endgroup$ –  Nick Stauner Commented Jul 21, 2014 at 18:36

You might achieve what you're really after (if it's not exactly what you've asked, which is interesting in its own right; +1 and welcome to CV!) rather simply by fitting a confidence interval (CI) around the correlation (I see @Glen_b suggested this in a comment too). If your correlation is significantly negative, a 95% CI would exclude positive values (and zero) with 95% confidence, which is usually enough for many statistical applications (e.g., in the social sciences, from whence I come brandishing a PhD). See also: When are confidence intervals useful?

I don't know if it's legit to just keep increasing (or decreasing) your confidence levels until your upper bound exceeds zero, but I'm curious enough myself that I'll offer this idea, risk a little rep, and eagerly await any critical comments the community might have for us. I.e., I don't see why you couldn't just take the confidence level at which your correlation estimate's CI touches zero as your estimate of $1-p$ for a test of whether your estimate is on the proper side of zero, but also below the other, more extreme bound ...which means I still haven't answered your question exactly. Still, even if your estimate is above zero, you could calculate the level of confidence with which you can say future samples from the same distribution would exhibit correlations that are also above zero and below the upper bound of your CI ...

This idea is due in part to my general preference for CIs over significance tests, which itself is due partly to a recent book (Cumming, 2012) I haven't actually read, to be honest—I've heard some pretty credible praise from those who have though—enough to recommend it myself, whether that's wise or otherwise. Speaking of "credible", if you like the CI idea, you might also consider calculating credible intervals —the Bayesian approach to estimating the probability given the fixed data of a random population parameter value being within the interval, as opposed to the CI's probability of the random data given a fixed population parameter...but I'm no Bayesian (yet), so I can't speak to that, or even be certain that I've described the credible interval interpretation with precise accuracy. You may prefer to see these questions:

  • Possible dupe of ^: What does a confidence interval (vs. a credible interval) actually express?
  • Interpreting a confidence interval.
  • Confidence intervals when using Bayes' theorem
  • What's the difference between a confidence interval and a credible interval?
  • Should I report credible intervals instead of confidence intervals?
  • Are there any examples where Bayesian credible intervals are obviously inferior to frequentist confidence intervals

As you can see, there's a lot of confusion about these matters, and many ways of explaining them.

Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis . New York: Routledge.


  • 1 $\begingroup$ These are good remarks, but they seem never to address the question itself, which comes down to making a one-tailed versus a two-tailed test. $\endgroup$ –  whuber ♦ Commented Jul 21, 2014 at 15:10
  • 1 $\begingroup$ Yeah, it wasn't an answer regarding how to perform a one-sided NHST; it's about how to use a CI to (potentially) conclude something similar but upper-bounded as well. I didn't know the direct answer off the top of my head, and given the enthusiasm for confidence intervals that I expressed here, I hadn't felt the need to look up a direct answer. I suppose it was easy enough though, and @jona just filled it in for us. $\endgroup$ –  Nick Stauner Commented Jul 21, 2014 at 18:38

The simplest way to do so (for Pearson correlation) is to use Fisher's z-transformation.

Let r be the correlation in question.

Let n be the sample size used to acquire the correlation. Here, tanh is the hyperbolic tangent, and atanh, or $\tanh^{-1}$, is the inverse hyperbolic tangent.

Let $z = \operatorname{atanh}(r)$; then $z$ is approximately normally distributed with variance $\frac{1}{n-3}$.

Using this, you can construct a confidence interval

$C.I.(\rho) = \tanh\left(\tanh^{-1}(\rho) \pm q \cdot \frac{1}{\sqrt{n-3}}\right)$, where $q$ is the standard normal quantile corresponding to the level of confidence you want (e.g., 1.96 for 95% confidence).

If zero is in the confidence interval, then you fail to reject the null hypothesis that the correlation is zero. Also note that you cannot use this for correlations of $\pm 1$: if the correlation is exactly one for data that is truly continuous, you only need 3 data points to determine that.

For a one-sided test, simply use the z-score you'd use for a one-sided p-value (e.g., 1.645 rather than 1.96 for 95% confidence), then transform back and see whether your correlation lies within the resulting range.
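Putting the pieces together, a hedged sketch of the Fisher-z interval (the example values are illustrative):

```python
import math
from scipy import stats

def fisher_ci(r, n, confidence=0.95, one_sided=False):
    """Fisher z confidence interval for a correlation.

    atanh(r) is approximately normal with variance 1/(n-3);
    build the interval on the z scale and map back with tanh.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    if one_sided:
        q = stats.norm.ppf(confidence)               # e.g. 1.645 for 95%
    else:
        q = stats.norm.ppf(1 - (1 - confidence) / 2)  # e.g. 1.96 for 95%
    return math.tanh(z - q * se), math.tanh(z + q * se)

# illustrative values: r = 0.5 observed from n = 30 data points
lo, hi = fisher_ci(0.5, 30)
```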

Edit: You can use a 1-sided test using the same values. Also, I changed sample values $r$ to theoretical values $\rho$, since that's a more appropriate use of confidence intervals.

Source: http://en.wikipedia.org/wiki/Fisher_transformation


  • 3 $\begingroup$ Although this is a good (standard) approach, you describe a two-tailed test rather than the one-tailed test requested in the question. $\endgroup$ –  whuber ♦ Commented Jul 21, 2014 at 19:58
  • $\begingroup$ I believe you can use the z-score and compute the p-value from $-\infty$, but yes, you are correct. $\endgroup$ –  Max Candocia Commented Jul 21, 2014 at 21:52
  • 1 $\begingroup$ Re the edit: unsophisticated readers may get the wrong idea. I think it's important to be clear and precise about how the z-score is related to the desired confidence. (For a given level of confidence, the z-scores for the one-sided interval are based on different quantiles than the z-scores for the two-sided interval.) An explicit account of that will reveal the difference between the two-sided test and the one-sided test sought by the O.P. $\endgroup$ –  whuber ♦ Commented Jul 21, 2014 at 22:03




Statistics By Jim

Making statistics intuitive

Null Hypothesis: Definition, Rejecting & Examples

By Jim Frost

What is a Null Hypothesis?

The null hypothesis in statistics states that there is no difference between groups or no relationship between variables. It is one of two mutually exclusive hypotheses about a population in a hypothesis test.


  • Null Hypothesis H0: No effect exists in the population.
  • Alternative Hypothesis HA: The effect exists in the population.

In every study or experiment, researchers assess an effect or relationship. This effect can be the effectiveness of a new drug, building material, or other intervention that has benefits. There is a benefit or connection that the researchers hope to identify. Unfortunately, no effect may exist. In statistics, we call this lack of an effect the null hypothesis. Researchers assume that this notion of no effect is correct until they have enough evidence to suggest otherwise, similar to how a trial presumes innocence.

In this context, the analysts don’t necessarily believe the null hypothesis is correct. In fact, they typically want to reject it because that leads to more exciting finds about an effect or relationship. The new vaccine works!

You can think of it as the default theory that requires sufficiently strong evidence to reject. Like a prosecutor, researchers must collect sufficient evidence to overturn the presumption of no effect. Investigators must work hard to set up a study and a data collection system to obtain evidence that can reject the null hypothesis.

Related post : What is an Effect in Statistics?

Null Hypothesis Examples

Null hypotheses start as research questions that the investigator rephrases as a statement indicating there is no effect or relationship.

  • Does the vaccine prevent infections? Null: The vaccine does not affect the infection rate.
  • Does the new additive increase product strength? Null: The additive does not affect mean product strength.
  • Does the exercise intervention increase bone mineral density? Null: The intervention does not affect bone mineral density.
  • As screen time increases, does test performance decrease? Null: There is no relationship between screen time and test performance.

After reading these examples, you might think they’re a bit boring and pointless. However, the key is to remember that the null hypothesis defines the condition that the researchers need to discredit before suggesting an effect exists.

Let’s see how you reject the null hypothesis and get to those more exciting findings!

When to Reject the Null Hypothesis

So, you want to reject the null hypothesis, but how and when can you do that? To start, you’ll need to perform a statistical test on your data. The following is an overview of performing a study that uses a hypothesis test.

The first step is to devise a research question and the appropriate null hypothesis. After that, the investigators need to formulate an experimental design and data collection procedures that will allow them to gather data that can answer the research question. Then they collect the data. For more information about designing a scientific study that uses statistics, read my post 5 Steps for Conducting Studies with Statistics .

After data collection is complete, statistics and hypothesis testing enter the picture. Hypothesis testing takes your sample data and evaluates how consistent they are with the null hypothesis. The p-value is a crucial part of the statistical results because it quantifies how strongly the sample data contradict the null hypothesis.

When the sample data provide sufficient evidence, you can reject the null hypothesis. In a hypothesis test, this process involves comparing the p-value to your significance level .

Rejecting the Null Hypothesis

Reject the null hypothesis when the p-value is less than or equal to your significance level. Your sample data favor the alternative hypothesis, which suggests that the effect exists in the population. For a mnemonic device, remember—when the p-value is low, the null must go!

When you can reject the null hypothesis, your results are statistically significant. Learn more about Statistical Significance: Definition & Meaning .

Failing to Reject the Null Hypothesis

Conversely, when the p-value is greater than your significance level, you fail to reject the null hypothesis. The sample data provide insufficient evidence to conclude that the effect exists in the population. When the p-value is high, the null must fly!

Note that failing to reject the null is not the same as proving it. For more information about the difference, read my post about Failing to Reject the Null .

That’s a very general look at the process. But I hope you can see how the path to more exciting findings depends on being able to rule out the less exciting null hypothesis that states there’s nothing to see here!

Let’s move on to learning how to write the null hypothesis for different types of effects, relationships, and tests.

Related posts : How Hypothesis Tests Work and Interpreting P-values

How to Write a Null Hypothesis

The null hypothesis varies by the type of statistic and hypothesis test. Remember that inferential statistics use samples to draw conclusions about populations. Consequently, when you write a null hypothesis, it must make a claim about the relevant population parameter . Further, that claim usually indicates that the effect does not exist in the population. Below are typical examples of writing a null hypothesis for various parameters and hypothesis tests.

Related posts : Descriptive vs. Inferential Statistics and Populations, Parameters, and Samples in Inferential Statistics

Group Means

T-tests and ANOVA assess the differences between group means. For these tests, the null hypothesis states that there is no difference between group means in the population. In other words, the experimental conditions that define the groups do not affect the mean outcome. Mu (µ) is the population parameter for the mean, and you’ll need to include it in the statement for this type of study.

For example, an experiment compares the mean bone density changes for a new osteoporosis medication. The control group does not receive the medicine, while the treatment group does. The null states that the mean bone density changes for the control and treatment groups are equal.

  • Null Hypothesis H0: Group means are equal in the population: µ1 = µ2, or µ1 – µ2 = 0.
  • Alternative Hypothesis HA: Group means are not equal in the population: µ1 ≠ µ2, or µ1 – µ2 ≠ 0.
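As an illustrative sketch of such a test (the bone-density data here are simulated, not from a real study; SciPy's `ttest_ind` performs the two-sample t-test of these hypotheses):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# hypothetical bone-density changes (arbitrary units): control vs. treatment
control = rng.normal(loc=0.0, scale=1.0, size=100)
treatment = rng.normal(loc=0.8, scale=1.0, size=100)

# H0: the group means are equal (their difference is 0)
t_stat, p_value = stats.ttest_ind(treatment, control)
reject_h0 = p_value <= 0.05
```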

Group Proportions

Proportions tests assess the differences between group proportions. For these tests, the null hypothesis states that there is no difference between group proportions. Again, the experimental conditions did not affect the proportion of events in the groups. P is the population proportion parameter that you’ll need to include.

For example, a vaccine experiment compares the infection rate in the treatment group to the control group. The treatment group receives the vaccine, while the control group does not. The null states that the infection rates for the control and treatment groups are equal.

  • Null Hypothesis H0: Group proportions are equal in the population: p1 = p2.
  • Alternative Hypothesis HA: Group proportions are not equal in the population: p1 ≠ p2.
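A sketch of the corresponding two-proportion z test, using the pooled proportion (the trial counts below are made up for illustration):

```python
import math
from scipy import stats

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * stats.norm.sf(abs(z))  # combined area in both tails
    return z, p_value

# hypothetical vaccine trial: 12 of 1,000 vaccinated infected vs. 40 of 1,000 controls
z, p = two_prop_z_test(12, 1000, 40, 1000)
```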

Correlation and Regression Coefficients

Some studies assess the relationship between two continuous variables rather than differences between groups.

In these studies, analysts often use either correlation or regression analysis . For these tests, the null states that there is no relationship between the variables. Specifically, it says that the correlation or regression coefficient is zero. As one variable increases, there is no tendency for the other variable to increase or decrease. Rho (ρ) is the population correlation parameter and beta (β) is the regression coefficient parameter.

For example, a study assesses the relationship between screen time and test performance. The null states that there is no correlation between this pair of variables. As screen time increases, test performance does not tend to increase or decrease.

  • Null Hypothesis H0: The correlation in the population is zero: ρ = 0.
  • Alternative Hypothesis HA: The correlation in the population is not zero: ρ ≠ 0.
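For instance (with simulated data standing in for a real study), `scipy.stats.pearsonr` returns both r and the two-sided p-value for this null:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# hypothetical study: does test performance fall as screen time rises?
screen_time = rng.uniform(0, 8, size=60)
performance = 80 - 3 * screen_time + rng.normal(0, 5, size=60)

r, p_value = stats.pearsonr(screen_time, performance)
# H0: rho = 0 is rejected when the p-value is at or below alpha
reject_h0 = p_value <= 0.05
```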

For all these cases, the analysts define the hypotheses before the study. After collecting the data, they perform a hypothesis test to determine whether they can reject the null hypothesis.

The preceding examples are all for two-tailed hypothesis tests. To learn about one-tailed tests and how to write a null hypothesis for them, read my post One-Tailed vs. Two-Tailed Tests .

Related post : Understanding Correlation



Reader Interactions


January 11, 2024 at 2:57 pm

Thanks for the reply.

January 10, 2024 at 1:23 pm

Hi Jim, In your comment you state that equivalence test null and alternate hypotheses are reversed. For hypothesis tests of data fits to a probability distribution, the null hypothesis is that the probability distribution fits the data. Is this correct?


January 10, 2024 at 2:15 pm

Those are two separate things, equivalence testing and normality tests. But, yes, you're correct for both.

Hypotheses are switched for equivalence testing. You need to “work” (i.e., collect a large sample of good quality data) to be able to reject the null that the groups are different to be able to conclude they’re the same.

With typical hypothesis tests, if you have low quality data and a low sample size, you’ll fail to reject the null that they’re the same, concluding they’re equivalent. But that’s more a statement about the low quality and small sample size than anything to do with the groups being equal.

So, equivalence testing makes you work to obtain a finding that the groups are the same (at least within some amount you define as a trivial difference).

For normality testing, and other distribution tests, the null states that the data follow the distribution (normal or whatever). If you reject the null, you have sufficient evidence to conclude that your sample data don’t follow the probability distribution. That’s a rare case where you hope to fail to reject the null. And it suffers from the problem I describe above where you might fail to reject the null simply because you have a small sample size. In that case, you’d conclude the data follow the probability distribution but it’s more that you don’t have enough data for the test to register the deviation. In this scenario, if you had a larger sample size, you’d reject the null and conclude it doesn’t follow that distribution.

I don’t know of any equivalence testing type approach for distribution fit tests where you’d need to work to show the data follow a distribution, although I haven’t looked for one either!


February 20, 2022 at 9:26 pm

Is a null hypothesis regularly (always) stated in the negative? “there is no” or “does not”

February 23, 2022 at 9:21 pm

Typically, the null hypothesis includes an equal sign. The null hypothesis states that the population parameter equals a particular value. That value is usually one that represents no effect. In the case of a one-sided hypothesis test, the null still contains an equal sign but it’s “greater than or equal to” or “less than or equal to.” If you wanted to translate the null hypothesis from its native mathematical expression, you could use the expression “there is no effect.” But the mathematical form more specifically states what it’s testing.

It’s the alternative hypothesis that typically contains does not equal.

There are some exceptions. For example, in an equivalence test where the researchers want to show that two things are equal, the null hypothesis states that they’re not equal.

In short, the null hypothesis states the condition that the researchers hope to reject. They need to work hard to set up an experiment and data collection that’ll gather enough evidence to be able to reject the null condition.


February 15, 2022 at 9:32 am

Dear sir I always read your notes on Research methods.. Kindly tell is there any available Book on all these..wonderfull Urgent


12.3 Testing the Significance of the Correlation Coefficient (Optional)

The correlation coefficient, r , tells us about the strength and direction of the linear relationship between x and y . However, the reliability of the linear model also depends on how many observed data points are in the sample. We need to look at both the correlation coefficient r and the sample size n , together.

We perform a hypothesis test of the significance of the correlation coefficient to decide whether the linear relationship in the sample data is strong enough to use to model the relationship in the population.

The sample data are used to compute r , the correlation coefficient for the sample. If we had data for the entire population, we could find the population correlation coefficient. But, because we have only sample data, we cannot calculate the population correlation coefficient. The sample correlation coefficient, r , is our estimate of the unknown population correlation coefficient.

  • The symbol for the population correlation coefficient is ρ , the Greek letter rho.
  • ρ = population correlation coefficient (unknown).
  • r = sample correlation coefficient (known; calculated from sample data).

The hypothesis test lets us decide whether the value of the population correlation coefficient ρ is close to zero or significantly different from zero . We decide this based on the sample correlation coefficient r and the sample size n .

If the test concludes the correlation coefficient is significantly different from zero, we say the correlation coefficient is significant .

  • Conclusion: There is sufficient evidence to conclude there is a significant linear relationship between x and y because the correlation coefficient is significantly different from zero.
  • What the conclusion means: There is a significant linear relationship between x and y . We can use the regression line to model the linear relationship between x and y in the population.

If the test concludes the correlation coefficient is not significantly different from zero (it is close to zero), we say the correlation coefficient is not significant .

  • Conclusion: There is insufficient evidence to conclude there is a significant linear relationship between x and y because the correlation coefficient is not significantly different from zero.
  • What the conclusion means: There is not a significant linear relationship between x and y . Therefore, we cannot use the regression line to model a linear relationship between x and y in the population.
  • If r is significant and the scatter plot shows a linear trend, the line can be used to predict the value of y for values of x that are within the domain of observed x values.
  • If r is not significant or if the scatter plot does not show a linear trend, the line should not be used for prediction.
  • If r is significant and the scatter plot shows a linear trend, the line may not be appropriate or reliable for prediction outside the domain of observed x values in the data.

Performing the Hypothesis Test

  • Null hypothesis: H0: ρ = 0.
  • Alternate hypothesis: Ha: ρ ≠ 0.

What the Hypothesis Means in Words:

  • Null hypothesis H0: The population correlation coefficient is not significantly different from zero. There is not a significant linear relationship (correlation) between x and y in the population.
  • Alternate hypothesis Ha: The population correlation coefficient is significantly different from zero. There is a significant linear relationship (correlation) between x and y in the population.

Drawing a Conclusion: There are two methods to make a conclusion. The two methods are equivalent and give the same result.

  • Method 1: Use the p -value.
  • Method 2: Use a table of critical values.

In this chapter, we will always use a significance level of 5 percent, α = 0.05.

Using the p -value method, you could choose any appropriate significance level you want; you are not limited to using α = 0.05. But, the table of critical values provided in this textbook assumes we are using a significance level of 5 percent, α = 0.05. If we wanted to use a significance level different from 5 percent with the critical value method, we would need different tables of critical values that are not provided in this textbook.

METHOD 1: Using a p -value to Make a Decision

Using the TI-83, 83+, 84, 84+ calculator

To calculate the p -value using LinRegTTEST :

  • Complete the same steps as the LinRegTTest performed previously in this chapter, making sure on the line prompt for β or ρ, ≠ 0 is highlighted.
  • When looking at the output screen, the p-value is on the line that reads p =.
  • If the p-value is less than or equal to the significance level: Decision: Reject the null hypothesis.
  • If the p-value is greater than the significance level: Decision: Do not reject the null hypothesis.

You will use technology to calculate the p -value, but it is useful to know that the p -value is calculated using a t distribution with n – 2 degrees of freedom and that the p -value is the combined area in both tails.

An alternative way to calculate the p -value ( p ) given by LinRegTTest is the command 2*tcdf(abs(t),10^99, n–2) in 2nd DISTR.

  • Consider the third exam/final exam example .
  • The line of best fit is ŷ = –173.51 + 4.83 x , with r = 0.6631, and there are n = 11 data points.
  • Can the regression line be used for prediction? Given a third exam score ( x value), can we use the line to predict the final exam score (predicted y value)?
  • H0: ρ = 0
  • Ha: ρ ≠ 0
  • The p -value is 0.026 (from LinRegTTest on a calculator or from computer software).
  • The p -value, 0.026, is less than the significance level of α = 0.05.
  • Decision: Reject the null hypothesis H 0 .
  • Conclusion: There is sufficient evidence to conclude there is a significant linear relationship between the third exam score ( x ) and the final exam score ( y ) because the correlation coefficient is significantly different from zero.

Because r is significant and the scatter plot shows a linear trend, the regression line can be used to predict final exam scores.
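The same p-value can be reproduced outside the calculator; this sketch mirrors the 2*tcdf command using SciPy:

```python
import math
from scipy import stats

# third exam/final exam example: r = 0.6631, n = 11
r, n = 0.6631, 11
t = r * math.sqrt((n - 2) / (1 - r**2))
# two-tailed p-value: combined area in both tails of a t distribution
# with n - 2 = 9 degrees of freedom (same as 2*tcdf(abs(t), 10^99, n-2))
p_value = 2 * stats.t.sf(abs(t), n - 2)
# p_value comes out near 0.026, matching LinRegTTest
```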

METHOD 2: Using a Table of Critical Values to Make a Decision

The 95 Percent Critical Values of the Sample Correlation Coefficient Table ( Table 12.9 ) can be used to give you a good idea of whether the computed value of r is significant. Use it to find the critical values using the degrees of freedom, df = n – 2. The table has already been calculated with α = 0.05. The table tells you the positive critical value, but you should also make that number negative to have two critical values. If r is not between the positive and negative critical values, then the correlation coefficient is significant. If r is significant, then you may use the line for prediction. If r is not significant (between the critical values), you should not use the line to make predictions.
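The table's critical values come directly from the t distribution; as a sketch, this reproduces them (for example, 0.632 for df = 8 and 0.602 for df = 9):

```python
import math
from scipy import stats

def r_critical(n, alpha=0.05):
    """Critical value of the sample correlation coefficient.

    r is significant exactly when |t| = |r| * sqrt((n-2)/(1-r^2))
    exceeds the two-tailed t critical value with n - 2 degrees of
    freedom; solving that inequality for r gives the cutoff below.
    """
    df = n - 2
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    return t_crit / math.sqrt(t_crit**2 + df)

# e.g., r_critical(10) is about 0.632 and r_critical(11) is about 0.602,
# matching the table values for df = 8 and df = 9
```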

Example 12.6

Suppose you computed r = 0.801 using n = 10 data points. The degrees of freedom would be 8 ( df = n – 2 = 10 – 2 = 8). Using Table 12.9 with df = 8, we find that the critical value is 0.632. This means the critical values are really ±0.632. Since r = 0.801 and 0.801 > 0.632, r is significant and the line may be used for prediction. If you view this example on a number line, it will help you to see that r is not between the two critical values.

Try It 12.6

For a given line of best fit, you computed that r = 0.6501 using n = 12 data points, and the critical value found on the table is 0.576. Can the line be used for prediction? Why or why not?

Example 12.7

Suppose you computed r = –0.624 with 14 data points, where df = 14 – 2 = 12. The critical values are –0.532 and 0.532. Since –0.624 < –0.532, r is significant and the line can be used for prediction.

Try It 12.7

For a given line of best fit, you compute that r = 0.5204 using n = 9 data points, and the critical values are ±0.666. Can the line be used for prediction? Why or why not?

Example 12.8

Suppose you computed r = 0.776 and n = 6, with df = 6 – 2 = 4. The critical values are –0.811 and 0.811. Since 0.776 is between the two critical values, r is not significant. The line should not be used for prediction.

Try It 12.8

For a given line of best fit, you compute that r = –0.7204 using n = 8 data points, and the critical value is 0.707. Can the line be used for prediction? Why or why not?

Third Exam vs. Final Exam Example: Critical Value Method

Consider the third exam/final exam example. The line of best fit is: ŷ = –173.51 + 4.83 x , with r = 0.6631, and there are n = 11 data points. Can the regression line be used for prediction? Given a third exam score ( x value), can we use the line to predict the final exam score (predicted y value)?

  • Use the 95 Percent Critical Values table for r with df = n – 2 = 11 – 2 = 9.
  • Using the table with df = 9, we find that the critical value listed is 0.602. Therefore, the critical values are ±0.602.
  • Since 0.6631 > 0.602, r is significant.

Example 12.9

Suppose you computed the following correlation coefficients. Using the table at the end of the chapter, determine whether r is significant and whether the line of best fit associated with each correlation coefficient can be used to predict a y value. If it helps, draw a number line.

  • r = –0.567 and the sample size, n , is 19. To solve this problem, first find the degrees of freedom: df = n – 2 = 17. Then, using the table, the critical values are ±0.456. –0.567 < –0.456, or you may say that –0.567 is not between the two critical values. r is significant and may be used for predictions.
  • r = 0.708 and the sample size, n , is 9. df = n – 2 = 7. The critical values are ±0.666. 0.708 > 0.666. r is significant and may be used for predictions.
  • r = 0.134 and the sample size, n , is 14. df = 14 – 2 = 12. The critical values are ±0.532. 0.134 is between –0.532 and 0.532. r is not significant and may not be used for predictions.
  • r = 0 and the sample size, n , is 5. It doesn’t matter what the degrees of freedom are because r = 0 will always be between the two critical values, so r is not significant and may not be used for predictions.

Try It 12.9

For a given line of best fit, you compute that r = 0 using n = 100 data points. Can the line be used for prediction? Why or why not?

Assumptions in Testing the Significance of the Correlation Coefficient

Testing the significance of the correlation coefficient requires that certain assumptions about the data be satisfied. The premise of this test is that the data are a sample of observed points taken from a larger population. We have not examined the entire population because it is not possible or feasible to do so. We are examining the sample to draw a conclusion about whether the linear relationship that we see between x and y in the sample data provides strong enough evidence that we can conclude there is a linear relationship between x and y in the population.

The regression line equation that we calculate from the sample data gives the best-fit line for our particular sample. We want to use this best-fit line for the sample as an estimate of the best-fit line for the population. Examining the scatter plot and testing the significance of the correlation coefficient helps us determine whether it is appropriate to do this.

  • There is a linear relationship in the population that models the sample data. Our regression line from the sample is our best estimate of this line in the population.
  • The y values for any particular x value are normally distributed about the line. This implies there are more y values scattered closer to the line than are scattered farther away. Assumption 1 implies that these normal distributions are centered on the line; the means of these normal distributions of y values lie on the line.
  • Normal distributions of all the y values have the same shape and spread about the line.
  • The residual errors are mutually independent (no pattern).
  • The data are produced from a well-designed, random sample or randomized experiment.


Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/12-3-testing-the-significance-of-the-correlation-coefficient-optional

© Apr 16, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Module 12: Linear Regression and Correlation

Hypothesis Test for Correlation

Learning Outcomes

  • Conduct a linear regression t-test using p-values and critical values and interpret the conclusion in context

The correlation coefficient,  r , tells us about the strength and direction of the linear relationship between x and y . However, the reliability of the linear model also depends on how many observed data points are in the sample. We need to look at both the value of the correlation coefficient r and the sample size n , together.

We perform a hypothesis test of the “ significance of the correlation coefficient ” to decide whether the linear relationship in the sample data is strong enough to use to model the relationship in the population.

The sample data are used to compute  r , the correlation coefficient for the sample. If we had data for the entire population, we could find the population correlation coefficient. But because we only have sample data, we cannot calculate the population correlation coefficient. The sample correlation coefficient, r , is our estimate of the unknown population correlation coefficient.

  • The symbol for the population correlation coefficient is ρ , the Greek letter “rho.”
  • ρ = population correlation coefficient (unknown)
  • r = sample correlation coefficient (known; calculated from sample data)

The hypothesis test lets us decide whether the value of the population correlation coefficient  ρ is “close to zero” or “significantly different from zero.” We decide this based on the sample correlation coefficient r and the sample size n .

If the test concludes that the correlation coefficient is significantly different from zero, we say that the correlation coefficient is “significant.”

  • Conclusion: There is sufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is significantly different from zero.
  • What the conclusion means: There is a significant linear relationship between x and y . We can use the regression line to model the linear relationship between x and y in the population.

If the test concludes that the correlation coefficient is not significantly different from zero (it is close to zero), we say that the correlation coefficient is “not significant.”

  • Conclusion: “There is insufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is not significantly different from zero.”
  • What the conclusion means: There is not a significant linear relationship between x and y . Therefore, we CANNOT use the regression line to model a linear relationship between x and y in the population.
  • If r is significant and the scatter plot shows a linear trend, the line can be used to predict the value of y for values of x that are within the domain of observed x values.
  • If r is not significant OR if the scatter plot does not show a linear trend, the line should not be used for prediction.
  • If r is significant and if the scatter plot shows a linear trend, the line may NOT be appropriate or reliable for prediction OUTSIDE the domain of observed x values in the data.

Performing the Hypothesis Test

  • Null Hypothesis: H 0 : ρ = 0
  • Alternate Hypothesis: H a : ρ ≠ 0

What the Hypotheses Mean in Words

  • Null Hypothesis H 0 : The population correlation coefficient IS NOT significantly different from zero. There IS NOT a significant linear relationship (correlation) between x and y in the population.
  • Alternate Hypothesis H a : The population correlation coefficient IS significantly DIFFERENT FROM zero. There IS A SIGNIFICANT LINEAR RELATIONSHIP (correlation) between x and y in the population.

Drawing a Conclusion

There are two methods of making the decision. The two methods are equivalent and give the same result.

  • Method 1: Using the p -value
  • Method 2: Using a table of critical values

In this chapter of this textbook, we will always use a significance level of 5%, α = 0.05.

Using the  p -value method, you could choose any appropriate significance level you want; you are not limited to using α = 0.05. But the table of critical values provided in this textbook assumes that we are using a significance level of 5%, α = 0.05. (If we wanted to use a different significance level than 5% with the critical value method, we would need different tables of critical values that are not provided in this textbook).

Method 1: Using a p -value to make a decision

Using the TI-83, 83+, 84, 84+ Calculator

To calculate the  p -value using LinRegTTEST:

  • On the LinRegTTEST input screen, on the line prompt for β or ρ , highlight “≠ 0”
  • The output screen shows the p-value on the line that reads “p =”.
  • (Most computer statistical software can calculate the  p -value).

If the p -value is less than the significance level ( α = 0.05)

  • Decision: Reject the null hypothesis.
  • Conclusion: “There is sufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is significantly different from zero.”

If the p -value is NOT less than the significance level ( α = 0.05)

  • Decision: DO NOT REJECT the null hypothesis.
  • Conclusion: “There is insufficient evidence to conclude that there is a significant linear relationship between x and y because the correlation coefficient is NOT significantly different from zero.”

Calculation Notes:

  • You will use technology to calculate the p -value. The following describes the calculations to compute the test statistics and the p -value:
  • The p -value is calculated using a t -distribution with n – 2 degrees of freedom.
  • The formula for the test statistic is [latex]\displaystyle{t}=\dfrac{{{r}\sqrt{{{n}-{2}}}}}{\sqrt{{{1}-{r}^{{2}}}}}[/latex]. The value of the test statistic, t , is shown in the computer or calculator output along with the p -value. The test statistic t has the same sign as the correlation coefficient r .
  • The p -value is the combined area in both tails.

Recall: ORDER OF OPERATIONS

  • parentheses: [latex]( \ )[/latex]
  • exponents: [latex]x^2[/latex]
  • multiplication or division: [latex]\times \ \mathrm{or} \ \div[/latex]
  • addition or subtraction: [latex]+ \ \mathrm{or} \ -[/latex]

1st find the numerator:

Step 1: Find [latex]n-2[/latex], and then take the square root.

Step 2: Multiply the value in Step 1 by [latex]r[/latex].

2nd find the denominator: 

Step 3: Find the square of [latex]r[/latex], which is [latex]r[/latex] multiplied by [latex]r[/latex].

Step 4: Subtract this value from 1, [latex]1 -r^2[/latex].

Step 5: Find the square root of Step 4.

3rd take the numerator and divide by the denominator.
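The five steps can be traced in Python; the values r = 0.801 and n = 10 below are borrowed from the worked example that appears later in this section.

```python
import math

r, n = 0.801, 10

step1 = math.sqrt(n - 2)        # Step 1: find n - 2, then take the square root
numerator = r * step1           # Step 2: multiply the Step 1 value by r
step3 = r * r                   # Step 3: the square of r
step4 = 1 - step3               # Step 4: subtract it from 1
denominator = math.sqrt(step4)  # Step 5: square root of Step 4
t = numerator / denominator     # 3rd: numerator divided by denominator
print(round(t, 3))              # ≈ 3.784
```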

An alternative way to calculate the  p -value (p) given by LinRegTTest is the command 2*tcdf(abs(t),10^99, n-2) in 2nd DISTR.

THIRD-EXAM vs FINAL-EXAM EXAMPLE:  p- value method

  • Consider the  third exam/final exam example (example 2).
  • The line of best fit is: [latex]\hat{y}[/latex] = -173.51 + 4.83 x  with  r  = 0.6631 and there are  n  = 11 data points.
  • Can the regression line be used for prediction?  Given a third exam score ( x  value), can we use the line to predict the final exam score (predicted  y  value)?
  • H 0 :  ρ  = 0
  • H a :  ρ  ≠ 0
  • The  p -value is 0.026 (from LinRegTTest on your calculator or from computer software).
  • The  p -value, 0.026, is less than the significance level of  α  = 0.05.
  • Decision: Reject the Null Hypothesis  H 0
  • Conclusion: There is sufficient evidence to conclude that there is a significant linear relationship between the third exam score ( x ) and the final exam score ( y ) because the correlation coefficient is significantly different from zero.

Because  r  is significant and the scatter plot shows a linear trend, the regression line can be used to predict final exam scores.

Method 2: Using a table of Critical Values to make a decision

The 95% Critical Values of the Sample Correlation Coefficient Table can be used to give you a good idea of whether the computed value of r is significant or not. Compare r to the appropriate critical value in the table. If r is not between the positive and negative critical values, then the correlation coefficient is significant. If r is significant, then you may want to use the line for prediction.
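A minimal helper for this method, sketched in Python. The dictionary holds only the 95% critical values quoted in this section's examples, keyed by df = n – 2; for any other df you would consult the full table.

```python
# 95% critical values from this section's examples, keyed by degrees of freedom.
CRITICAL = {4: 0.811, 6: 0.707, 7: 0.666, 8: 0.632, 10: 0.576, 12: 0.532}

def significant(r, n):
    """True if r falls outside the interval ±(critical value) for df = n - 2."""
    return abs(r) > CRITICAL[n - 2]

print(significant(0.801, 10))    # True:  0.801 > 0.632
print(significant(-0.624, 14))   # True:  -0.624 < -0.532
print(significant(0.776, 6))     # False: 0.776 is between -0.811 and 0.811
print(significant(-0.7204, 8))   # True:  |-0.7204| > 0.707
```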

Suppose you computed  r = 0.801 using n = 10 data points. df = n – 2 = 10 – 2 = 8. The critical values associated with df = 8 are -0.632 and + 0.632. If r < negative critical value or r > positive critical value, then r is significant. Since r = 0.801 and 0.801 > 0.632, r is significant and the line may be used for prediction. If you view this example on a number line, it will help you.

Horizontal number line with values of -1, -0.632, 0, 0.632, 0.801, and 1. A dashed line above values -0.632, 0, and 0.632 indicates not significant values.

r is not significant between -0.632 and +0.632. r = 0.801 > +0.632. Therefore, r is significant.

For a given line of best fit, you computed that  r = 0.6501 using n = 12 data points and the critical value is 0.576. Can the line be used for prediction? Why or why not?

If the scatter plot looks linear, then yes, the line can be used for prediction, because r > the positive critical value.

Suppose you computed r = –0.624 with 14 data points. df = 14 – 2 = 12. The critical values are –0.532 and 0.532. Since –0.624 < –0.532, r is significant and the line can be used for prediction.

Horizontal number line with values of -0.624, -0.532, and 0.532.

r = –0.624 < –0.532. Therefore, r is significant.

For a given line of best fit, you compute that  r = 0.5204 using n = 9 data points, and the critical value is 0.666. Can the line be used for prediction? Why or why not?

No, the line cannot be used for prediction, because  r < the positive critical value.

Suppose you computed  r = 0.776 and n = 6. df = 6 – 2 = 4. The critical values are –0.811 and 0.811. Since –0.811 < 0.776 < 0.811, r is not significant, and the line should not be used for prediction.

Horizontal number line with values of –0.811, 0.776, and 0.811.

–0.811 <  r = 0.776 < 0.811. Therefore, r is not significant.

For a given line of best fit, you compute that r = –0.7204 using n = 8 data points, and the critical value is 0.707. Can the line be used for prediction? Why or why not?

Yes, the line can be used for prediction, because  r < the negative critical value.

THIRD-EXAM vs FINAL-EXAM EXAMPLE: critical value method

Consider the  third exam/final exam example  again. The line of best fit is: [latex]\hat{y}[/latex] = –173.51+4.83 x  with  r  = 0.6631 and there are  n  = 11 data points. Can the regression line be used for prediction?  Given a third-exam score ( x  value), can we use the line to predict the final exam score (predicted  y  value)?

  • Use the “95% Critical Value” table for  r  with  df  =  n  – 2 = 11 – 2 = 9.
  • The critical values are –0.602 and +0.602
  • Since 0.6631 > 0.602,  r  is significant.

Suppose you computed the following correlation coefficients. Using the table at the end of the chapter, determine if  r is significant and the line of best fit associated with each r can be used to predict a y value. If it helps, draw a number line.

  • r = –0.567 and the sample size, n , is 19. The df = n – 2 = 17. The critical values are ±0.456. Since –0.567 < –0.456, r is significant.
  • r = 0.708 and the sample size, n , is nine. The df = n – 2 = 7. The critical value is 0.666. 0.708 > 0.666 so r is significant.
  • r = 0.134 and the sample size, n , is 14. The df = 14 – 2 = 12. The critical value is 0.532. 0.134 is between –0.532 and 0.532 so r is not significant.
  • r = 0 and the sample size, n , is five. No matter what the dfs are, r = 0 is between the two critical values so r is not significant.

For a given line of best fit, you compute that  r = 0 using n = 100 data points. Can the line be used for prediction? Why or why not?

No, the line cannot be used for prediction no matter what the sample size is.

Assumptions in Testing the Significance of the Correlation Coefficient

Testing the significance of the correlation coefficient requires that certain assumptions about the data are satisfied. The premise of this test is that the data are a sample of observed points taken from a larger population. We have not examined the entire population because it is not possible or feasible to do so. We are examining the sample to draw a conclusion about whether the linear relationship that we see between  x and y in the sample data provides strong enough evidence so that we can conclude that there is a linear relationship between x and y in the population.

The regression line equation that we calculate from the sample data gives the best-fit line for our particular sample. We want to use this best-fit line for the sample as an estimate of the best-fit line for the population. Examining the scatterplot and testing the significance of the correlation coefficient helps us determine if it is appropriate to do this.

The assumptions underlying the test of significance are:

  • There is a linear relationship in the population that models the average value of y for varying values of x . In other words, the expected value of y for each particular value lies on a straight line in the population. (We do not know the equation for the line for the population. Our regression line from the sample is our best estimate of this line in the population).
  • The y values for any particular x value are normally distributed about the line. This implies that there are more y values scattered closer to the line than are scattered farther away. Assumption (1) implies that these normal distributions are centered on the line: the means of these normal distributions of y values lie on the line.
  • The standard deviations of the population y values about the line are equal for each value of x . In other words, each of these normal distributions of y  values has the same shape and spread about the line.
  • The residual errors are mutually independent (no pattern).
  • The data are produced from a well-designed, random sample or randomized experiment.

The left graph shows three sets of points. Each set falls in a vertical line. The points in each set are normally distributed along the line — they are densely packed in the middle and more spread out at the top and bottom. A downward sloping regression line passes through the mean of each set. The right graph shows the same regression line plotted. A vertical normal curve is shown for each line.

The  y values for each x value are normally distributed about the line with the same standard deviation. For each x value, the mean of the y values lies on the regression line. More y values lie near the line than are scattered further away from the line.

  • Provided by : Lumen Learning. License : CC BY: Attribution
  • Testing the Significance of the Correlation Coefficient. Provided by : OpenStax. Located at : https://openstax.org/books/introductory-statistics/pages/12-4-testing-the-significance-of-the-correlation-coefficient . License : CC BY: Attribution . License Terms : Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
  • Introductory Statistics. Authored by : Barbara Illowsky, Susan Dean. Provided by : OpenStax. Located at : https://openstax.org/books/introductory-statistics/pages/1-introduction . License : CC BY: Attribution . License Terms : Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction



Chapter 13: Inferential Statistics

Understanding Null Hypothesis Testing

Learning Objectives

  • Explain the purpose of null hypothesis testing, including the role of sampling error.
  • Describe the basic logic of null hypothesis testing.
  • Describe the role of relationship strength and sample size in determining statistical significance and make reasonable judgments about statistical significance based on these two factors.

The Purpose of Null Hypothesis Testing

As we have seen, psychological research typically involves measuring one or more variables for a sample and computing descriptive statistics for that sample. In general, however, the researcher’s goal is not to draw conclusions about that sample but to draw conclusions about the population that the sample was selected from. Thus researchers must use sample statistics to draw conclusions about the corresponding values in the population. These corresponding values in the population are called  parameters . Imagine, for example, that a researcher measures the number of depressive symptoms exhibited by each of 50 clinically depressed adults and computes the mean number of symptoms. The researcher probably wants to use this sample statistic (the mean number of symptoms for the sample) to draw conclusions about the corresponding population parameter (the mean number of symptoms for clinically depressed adults).

Unfortunately, sample statistics are not perfect estimates of their corresponding population parameters. This is because there is a certain amount of random variability in any statistic from sample to sample. The mean number of depressive symptoms might be 8.73 in one sample of clinically depressed adults, 6.45 in a second sample, and 9.44 in a third—even though these samples are selected randomly from the same population. Similarly, the correlation (Pearson’s  r ) between two variables might be +.24 in one sample, −.04 in a second sample, and +.15 in a third—again, even though these samples are selected randomly from the same population. This random variability in a statistic from sample to sample is called  sampling error . (Note that the term error  here refers to random variability and does not imply that anyone has made a mistake. No one “commits a sampling error.”)
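Sampling error is easy to see in a quick simulation. In the sketch below (illustrative only; the population, sample size, and seed are made up), three samples are drawn from a population in which x and y are completely unrelated, yet each sample produces a different, generally nonzero Pearson's r.

```python
import random
import statistics

random.seed(1)   # any seed works; fixed here for reproducibility

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from the definition."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rs = []
for _ in range(3):
    xs = [random.gauss(0, 1) for _ in range(30)]
    ys = [random.gauss(0, 1) for _ in range(30)]   # drawn independently of xs
    rs.append(round(pearson_r(xs, ys), 3))
print(rs)   # three different r values, even though the population correlation is 0
```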

One implication of this is that when there is a statistical relationship in a sample, it is not always clear that there is a statistical relationship in the population. A small difference between two group means in a sample might indicate that there is a small difference between the two group means in the population. But it could also be that there is no difference between the means in the population and that the difference in the sample is just a matter of sampling error. Similarly, a Pearson’s  r  value of −.29 in a sample might mean that there is a negative relationship in the population. But it could also be that there is no relationship in the population and that the relationship in the sample is just a matter of sampling error.

In fact, any statistical relationship in a sample can be interpreted in two ways:

  • There is a relationship in the population, and the relationship in the sample reflects this.
  • There is no relationship in the population, and the relationship in the sample reflects only sampling error.

The purpose of null hypothesis testing is simply to help researchers decide between these two interpretations.

The Logic of Null Hypothesis Testing

Null hypothesis testing  is a formal approach to deciding between two interpretations of a statistical relationship in a sample. One interpretation is called the   null hypothesis  (often symbolized  H 0  and read as “H-naught”). This is the idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error. Informally, the null hypothesis is that the sample relationship “occurred by chance.” The other interpretation is called the  alternative hypothesis  (often symbolized as  H 1 ). This is the idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

Again, every statistical relationship in a sample can be interpreted in either of these two ways: It might have occurred by chance, or it might reflect a relationship in the population. So researchers need a way to decide between them. Although there are many specific null hypothesis testing techniques, they are all based on the same general logic. The steps are as follows:

  • Assume for the moment that the null hypothesis is true. There is no relationship between the variables in the population.
  • Determine how likely the sample relationship would be if the null hypothesis were true.
  • If the sample relationship would be extremely unlikely, then reject the null hypothesis  in favour of the alternative hypothesis. If it would not be extremely unlikely, then  retain the null hypothesis .
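The three steps above can be sketched as a simulation. The numbers are hypothetical, chosen to echo the Mehl et al. example discussed next: two groups of 50 drawn from the same population (the null hypothesis), and the question of how often a mean difference at least as large as an observed d = 0.06 turns up by chance alone.

```python
import random

random.seed(42)
n_per_group, observed_d = 50, 0.06   # hypothetical observed difference (in SD units)
sims = 2000

count = 0
for _ in range(sims):
    # Step 1: assume the null hypothesis -- both groups come from one population.
    a = [random.gauss(0, 1) for _ in range(n_per_group)]
    b = [random.gauss(0, 1) for _ in range(n_per_group)]
    # Population SD is 1, so the mean difference is already in SD units.
    diff = abs(sum(a) / n_per_group - sum(b) / n_per_group)
    # Step 2: count samples at least as extreme as the observed difference.
    if diff >= observed_d:
        count += 1

p = count / sims
print(round(p, 2))   # around 0.76
# Step 3: a difference of 0.06 is NOT unlikely under the null, so retain it.
```

With p far above .05, the decision is to retain the null hypothesis, which is exactly the conclusion Mehl and colleagues reached.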

Following this logic, we can begin to understand why Mehl and his colleagues concluded that there is no difference in talkativeness between women and men in the population. In essence, they asked the following question: “If there were no difference in the population, how likely is it that we would find a small difference of  d  = 0.06 in our sample?” Their answer to this question was that this sample relationship would be fairly likely if the null hypothesis were true. Therefore, they retained the null hypothesis—concluding that there is no evidence of a sex difference in the population. We can also see why Kanner and his colleagues concluded that there is a correlation between hassles and symptoms in the population. They asked, “If the null hypothesis were true, how likely is it that we would find a strong correlation of +.60 in our sample?” Their answer to this question was that this sample relationship would be fairly unlikely if the null hypothesis were true. Therefore, they rejected the null hypothesis in favour of the alternative hypothesis—concluding that there is a positive correlation between these variables in the population.

A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the  p value . A low  p  value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A high  p  value means that the sample result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis. But how low must the  p  value be before the sample result is considered unlikely enough to reject the null hypothesis? In null hypothesis testing, this criterion is called  α (alpha)  and is almost always set to .05. If there is less than a 5% chance of a result as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected. When this happens, the result is said to be  statistically significant . If there is greater than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null hypothesis is retained. This does not necessarily mean that the researcher accepts the null hypothesis as true—only that there is not currently enough evidence to conclude that it is true. Researchers often use the expression “fail to reject the null hypothesis” rather than “retain the null hypothesis,” but they never use the expression “accept the null hypothesis.”

The Misunderstood  p  Value

The  p  value is one of the most misunderstood quantities in psychological research (Cohen, 1994) [1] . Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks!

The most common misinterpretation is that the  p  value is the probability that the null hypothesis is true—that the sample result occurred by chance. For example, a misguided researcher might say that because the  p  value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect . The  p  value is really the probability of a result at least as extreme as the sample result  if  the null hypothesis  were  true. So a  p  value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time.

You can avoid this misunderstanding by remembering that the  p  value is not the probability that any particular  hypothesis  is true or false. Instead, it is the probability of obtaining the  sample result  if the null hypothesis were true.

Role of Sample Size and Relationship Strength

Recall that null hypothesis testing involves answering the question, “If the null hypothesis were true, what is the probability of a sample result as extreme as this one?” In other words, “What is the  p  value?” It can be helpful to see that the answer to this question depends on just two considerations: the strength of the relationship and the size of the sample. Specifically, the stronger the sample relationship and the larger the sample, the less likely the result would be if the null hypothesis were true. That is, the lower the  p  value. This should make sense. Imagine a study in which a sample of 500 women is compared with a sample of 500 men in terms of some psychological characteristic, and Cohen’s  d  is a strong 0.50. If there were really no sex difference in the population, then a result this strong based on such a large sample should seem highly unlikely. Now imagine a similar study in which a sample of three women is compared with a sample of three men, and Cohen’s  d  is a weak 0.10. If there were no sex difference in the population, then a relationship this weak based on such a small sample should seem likely. And this is precisely why the null hypothesis would be rejected in the first example and retained in the second.

Of course, sometimes the result can be weak and the sample large, or the result can be strong and the sample small. In these cases, the two considerations trade off against each other so that a weak result can be statistically significant if the sample is large enough and a strong relationship can be statistically significant even if the sample is small. Table 13.1 shows roughly how relationship strength and sample size combine to determine whether a sample result is statistically significant. The columns of the table represent the three levels of relationship strength: weak, medium, and strong. The rows represent four sample sizes that can be considered small, medium, large, and extra large in the context of psychological research. Thus each cell in the table represents a combination of relationship strength and sample size. If a cell contains the word  Yes , then this combination would be statistically significant for both Cohen’s  d  and Pearson’s  r . If it contains the word  No , then it would not be statistically significant for either. There is one cell where the decision for  d  and  r  would be different and another where it might be different depending on some additional considerations, which are discussed in Section 13.2 “Some Basic Null Hypothesis Tests”

Table 13.1 How Relationship Strength and Sample Size Combine to Determine Whether a Result Is Statistically Significant
Sample Size           | Weak relationship | Medium-strength relationship | Strong relationship
Small (N = 20)        | No                | No                           | d = Maybe, r = Yes
Medium (N = 50)       | No                | Yes                          | Yes
Large (N = 100)       | d = Yes, r = No   | Yes                          | Yes
Extra large (N = 500) | Yes               | Yes                          | Yes

Although Table 13.1 provides only a rough guideline, it shows very clearly that weak relationships based on medium or small samples are never statistically significant and that strong relationships based on medium or larger samples are always statistically significant. If you keep this lesson in mind, you will often know whether a result is statistically significant based on the descriptive statistics alone. It is extremely useful to be able to develop this kind of intuitive judgment. One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses. For example, if your sample relationship is strong and your sample is medium, then you would expect to reject the null hypothesis. If for some reason your formal null hypothesis test indicates otherwise, then you need to double-check your computations and interpretations. A second reason is that the ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.
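This trade-off between relationship strength and sample size can be made concrete with the usual test statistic for a correlation coefficient, t = r·√((n − 2)/(1 − r²)). The sketch below uses illustrative values (not data from the chapter) to show how the same weak correlation of r = .10 flips from non-significant to significant as the sample grows; the cutoff |t| > 2 is a rough stand-in for the two-tailed .05 critical value, which strictly depends on the degrees of freedom.

```python
import math

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0 (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# The same weak correlation (r = .10) in a small vs. an extra-large sample.
# |t| > 2 is used here as a rough two-tailed .05 cutoff for illustration.
for n in (20, 500):
    t = t_statistic(0.10, n)
    verdict = "significant" if abs(t) > 2 else "not significant"
    print(f"n = {n}: t = {t:.2f} ({verdict})")
```

With n = 20 the statistic is far below the cutoff, while with n = 500 it just clears it, mirroring the "No"/"Yes" pattern in the weak-relationship column of Table 13.1.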

Statistical Significance Versus Practical Significance

Table 13.1 illustrates another extremely important point. A statistically significant result is not necessarily a strong one. Even a very weak result can be statistically significant if it is based on a large enough sample. This is closely related to Janet Shibley Hyde’s argument about sex differences (Hyde, 2007) [2] . The differences between women and men in mathematical problem solving and leadership ability are statistically significant. But the word  significant  can cause people to interpret these differences as strong and important—perhaps even important enough to influence the college courses they take or even who they vote for. As we have seen, however, these statistically significant differences are actually quite weak—perhaps even “trivial.”

This is why it is important to distinguish between the  statistical  significance of a result and the  practical  significance of that result.  Practical significance refers to the importance or usefulness of the result in some real-world context. Many sex differences are statistically significant—and may even be interesting for purely scientific reasons—but they are not practically significant. In clinical practice, this same concept is often referred to as “clinical significance.” For example, a study on a new treatment for social phobia might show that it produces a statistically significant positive effect. Yet this effect still might not be strong enough to justify the time, effort, and other costs of putting it into practice—especially if easier and cheaper treatments that work almost as well already exist. Although statistically significant, this result would be said to lack practical or clinical significance.

Key Takeaways

  • Null hypothesis testing is a formal approach to deciding whether a statistical relationship in a sample reflects a real relationship in the population or is just due to chance.
  • The logic of null hypothesis testing involves assuming that the null hypothesis is true, finding how likely the sample result would be if this assumption were correct, and then making a decision. If the sample result would be unlikely if the null hypothesis were true, then it is rejected in favour of the alternative hypothesis. If it would not be unlikely, then the null hypothesis is retained.
  • The probability of obtaining the sample result if the null hypothesis were true (the  p  value) is based on two considerations: relationship strength and sample size. Reasonable judgments about whether a sample relationship is statistically significant can often be made by quickly considering these two factors.
  • Statistical significance is not the same as relationship strength or importance. Even weak relationships can be statistically significant if the sample size is large enough. It is important to consider relationship strength and the practical significance of a result in addition to its statistical significance.
Exercises

  • Discussion: Imagine a study showing that people who eat more broccoli tend to be happier. Explain for someone who knows nothing about statistics why the researchers would conduct a null hypothesis test.
  • Practice: Use Table 13.1 to decide whether each of the following results is statistically significant.
      • The correlation between two variables is  r  = −.78 based on a sample size of 137.
      • The mean score on a psychological characteristic for women is 25 ( SD  = 5) and the mean score for men is 24 ( SD  = 5). There were 12 women and 10 men in this study.
      • In a memory experiment, the mean number of items recalled by the 40 participants in Condition A was 0.50 standard deviations greater than the mean number recalled by the 40 participants in Condition B.
      • In another memory experiment, the mean scores for participants in Condition A and Condition B came out exactly the same!
      • A student finds a correlation of  r  = .04 between the number of units the students in his research methods class are taking and the students’ level of stress.

Long Descriptions

“Null Hypothesis” long description: A comic depicting a man and a woman talking in the foreground. In the background is a child working at a desk. The man says to the woman, “I can’t believe schools are still teaching kids about the null hypothesis. I remember reading a big study that conclusively disproved it years ago.” [Return to “Null Hypothesis”]

“Conditional Risk” long description: A comic depicting two hikers beside a tree during a thunderstorm. A bolt of lightning goes “crack” in the dark sky as thunder booms. One of the hikers says, “Whoa! We should get inside!” The other hiker says, “It’s okay! Lightning only kills about 45 Americans a year, so the chances of dying are only one in 7,000,000. Let’s go on!” The comic’s caption says, “The annual death rate among people who know that statistic is one in six.” [Return to “Conditional Risk”]

Media Attributions

  • Null Hypothesis by XKCD, CC BY-NC (Attribution NonCommercial)
  • Conditional Risk by XKCD, CC BY-NC (Attribution NonCommercial)

Notes

  1. Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003. ↵
  2. Hyde, J. S. (2007). New directions in the study of gender similarities and differences. Current Directions in Psychological Science, 16, 259–263. ↵

Glossary

Parameters: Values in a population that correspond to variables measured in a study.

Sampling error: The random variability in a statistic from sample to sample.

Null hypothesis testing: A formal approach to deciding between two interpretations of a statistical relationship in a sample.

Null hypothesis: The idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error.

Alternative hypothesis: The idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

Reject the null hypothesis: When the relationship found in the sample would be extremely unlikely, the idea that the relationship occurred “by chance” is rejected.

Retain the null hypothesis: When the relationship found in the sample is likely to have occurred by chance, the null hypothesis is not rejected.

p value: The probability that, if the null hypothesis were true, the result found in the sample would occur.

α (alpha): How low the p value must be before the sample result is considered unlikely in null hypothesis testing.

Statistically significant: When there is less than a 5% chance of a result as extreme as the sample result occurring and the null hypothesis is rejected.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Zero Correlation: Definition, Examples + How to Determine It


Correlation is a fundamental concept in statistics and data analysis, helping to understand the relationship between two variables. While strong positive or negative correlations are often highlighted, zero correlation is equally important. 

It means there is no linear relationship between the variables. In other words, changes in one variable do not predict changes in the other.

In this blog, we will explore the concept of zero correlation, providing a clear definition, illustrative examples, and methods to determine it.

What is a Zero Correlation?

Zero correlation is a statistical term that describes a situation where there is no linear relationship between two variables. When two variables have zero correlation, changes in one variable do not predict changes in the other. The correlation coefficient, which measures the degree and direction of the relationship between variables, is exactly zero in this case.

Understanding this correlation is important in statistical analysis because it helps identify variables that do not have a predictive relationship with each other, which is crucial when building statistical models or interpreting data patterns.

Why is Zero Correlation important?

Zero correlation is an important concept in statistics and data analysis for several reasons:

It Identifies Independence

It helps identify variables that have no linear relationship with each other. If two variables have zero correlation, changes in one variable do not provide any information about changes in the other. This is crucial for understanding the structure of the data and the relationships (or lack thereof) between variables.

It Improves Statistical Models

In regression analysis and other statistical models, including variables that have zero correlation with the dependent variable can add noise and reduce the model’s predictive power. By identifying and excluding such variables, models can be simplified and made more efficient, leading to better performance and interpretability.

It Helps Avoid Misinterpretation

Understanding zero correlation prevents misinterpretation of data. For example, a researcher might mistakenly infer a relationship between two variables based on intuition or initial observations. Calculating the correlation coefficient and finding it to be zero clarifies that no linear relationship exists, avoiding false conclusions.

It Highlights Non-linear Relationships

It highlights the possibility of non-linear relationships. If two variables have zero correlation, it doesn’t necessarily mean they are unrelated; they might have a complex, non-linear relationship. Recognizing this can prompt further investigation using other methods, such as non-linear regression or data transformations.

It Guides Experimental Design

In experimental design, knowing which variables have zero correlation can guide the selection of variables to include or control for. This helps in designing more robust experiments where the influence of irrelevant variables is minimized, leading to clearer, more reliable results.

It Explains Variable Behavior

It provides insights into the behavior of variables in a dataset. In financial analysis, understanding which assets have zero correlation with each other can help in portfolio diversification, as combining such assets can reduce overall risk.

It Supports Hypothesis Testing

In hypothesis testing, zero correlation often serves as the null hypothesis. For example, in testing whether two variables are related, the null hypothesis might state that the correlation between them is zero. Establishing whether this is true or false helps in validating or refuting hypotheses.

What are the Examples of Zero Correlation?

Examples of zero correlation, where changes in one variable do not correspond with changes in another variable, can be found across various fields:

Field of Research

Example: Number of Scientific Publications and Favorite Ice Cream Flavor

A study investigates the relationship between the number of scientific publications a researcher has and their favorite ice cream flavor.

There is no logical connection between the number of scientific papers a researcher publishes and their preference for a particular ice cream flavor. As a result, these two variables are expected to exhibit zero correlation.

Field of Education

Example: Students’ Shoe Size and Their Grades in Mathematics

An educational study examines whether there is any relationship between students’ shoe sizes and their grades in mathematics.

Shoe size is a physical characteristic that has no bearing on a student’s academic performance in mathematics. Therefore, the correlation between shoe size and math grades is likely to be zero.

Field of Healthcare

Example: Blood Type and Incidence of the Common Cold

A healthcare study looks into whether there is a relationship between a person’s blood type and the number of times they catch the common cold in a year.

Blood type is not associated with the frequency of contracting the common cold, which is influenced by various other factors such as exposure to viruses and immune system strength. Hence, the correlation between blood type and the incidence of the common cold is expected to be zero.

How to Identify Zero Correlation?

Here, we’ll explore how to identify zero correlation through visual inspection, statistical calculation, hypothesis testing, and contextual analysis.

1. Visual Inspection Using Scatter Plots

Scatter plots are an effective tool for visually assessing the relationship between two variables.

Create a Scatter Plot:

  • Place one variable on the x-axis and the other on the y-axis.
  • Look for any discernible trend or pattern in the data points.

Identifying Correlation:

  • If the points are scattered randomly with no clear trend (neither upward nor downward), it suggests zero correlation.
  • A random scatter implies that no line (whether straight or curved) can fit the data points well.
  • Students’ Shoe Sizes vs. Math Grades: If you plot shoe sizes against math grades and see a random scatter of points with no trend, this indicates zero correlation.

2. Calculate the Correlation Coefficient

The Pearson correlation coefficient (r) is the most common measure of linear correlation.

r = Σ(xᵢ − x̄)(yᵢ − ȳ) / √( Σ(xᵢ − x̄)² · Σ(yᵢ − ȳ)² )

  • Gather paired data points for the two variables.
  • Find the mean (average) of each variable.
  • Calculate how far each data point is from the mean.
  • Multiply the deviations for each pair and sum the products.
  • Use the formula to find the correlation coefficient.
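The steps above can be sketched directly in Python. The data here are made up for illustration and chosen so that the products of deviations cancel, giving r exactly zero:

```python
xs = [1, 2, 3, 4]
ys = [1, 2, 2, 1]  # rises then falls symmetrically -> no linear trend

n = len(xs)
mean_x = sum(xs) / n                       # step 2: the means
mean_y = sum(ys) / n
dx = [x - mean_x for x in xs]              # step 3: deviations from the mean
dy = [y - mean_y for y in ys]
num = sum(a * b for a, b in zip(dx, dy))   # step 4: sum of products
den = (sum(a * a for a in dx) * sum(b * b for b in dy)) ** 0.5
r = num / den                              # step 5: the formula
print(r)  # → 0.0
```

The same code applied to data with a clear upward trend would produce an r close to +1.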

Interpreting Correlation:

Value Close to 0: If r is close to 0, it indicates little to no linear relationship between the variables.

  • Shoe Sizes and Math Grades: If the calculated r is approximately 0, it confirms zero correlation.

3. Perform Hypothesis Testing

Statistical hypothesis testing can determine whether an observed correlation coefficient is significantly different from zero.

  • Null Hypothesis: Assume that the correlation coefficient is zero.
  • Alternative Hypothesis: Assume that the correlation coefficient is not zero.
  • Compute Test Statistic: Use a t-test for the correlation coefficient.
  • Determine p-value: Compare the p-value to a significance level (e.g., 0.05).

Zero Correlation:

  • If the p-value is greater than the significance level, do not reject the null hypothesis, suggesting that the correlation is not significantly different from zero.
  • Blood Type and Common Cold Incidence: Testing the correlation between blood type and the incidence of the common cold, if the p-value is high, it indicates that any observed correlation is not statistically significant, supporting zero correlation.
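The steps above can be sketched as follows. This is a minimal illustration, not a production routine: the p-value uses a normal approximation to the t distribution, which is reasonable for samples of a few dozen or more; a real analysis would use the t distribution with n − 2 degrees of freedom (for example, `scipy.stats.pearsonr` does this for you).

```python
import math
from statistics import NormalDist

def corr_test(r, n):
    """t statistic for H0: rho = 0, with an approximate two-sided p-value.

    Uses a normal approximation to the t distribution (adequate for
    moderately large n); exact tests use t with n - 2 degrees of freedom.
    """
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# A small observed correlation in a sample of 100: the p-value is well
# above .05, so we fail to reject the null hypothesis of zero correlation.
t, p = corr_test(0.05, 100)
print(f"t = {t:.2f}, p = {p:.2f}")
```

By contrast, `corr_test(0.60, 100)` gives a p-value near zero, so the null hypothesis of zero correlation would be rejected.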

4. Apply Contextual Analysis

Understanding the context and theoretical background of the variables is essential for interpreting correlation results.

  • Examine Variables: Consider the nature and expected relationships between the variables.
  • Apply Domain Knowledge: Use knowledge from the field to hypothesize whether a relationship is expected.

Zero Correlation: 

  • If theory and prior research suggest no logical relationship between the variables, this supports a finding of zero correlation.
  • Blood Type and Common Cold Incidence: Knowing that blood type does not affect susceptibility to the common cold supports the interpretation of zero correlation if found.

Negative vs Positive Correlation vs Zero Correlation

Correlation is a statistical measure that describes the strength and direction of the relationship between two variables. Here’s a detailed explanation of negative, positive, and zero correlation:

Positive Correlation

  • Definition: A positive correlation occurs when two variables move in the same direction. As one variable increases, the other variable also increases, and as one decreases, the other also decreases.
  • Example: The relationship between height and weight. Generally, as a person’s height increases, their weight also tends to increase.
  • Graphical Representation: In a scatter plot, points tend to cluster around a line that slopes upwards from left to right.

Negative Correlation

  • Definition: A negative correlation occurs when two variables move in opposite directions. As one variable increases, the other variable decreases, and vice versa.
  • Example: The relationship between the amount of time spent studying and the number of errors made on a test. Generally, as the time spent studying increases, the number of errors decreases.
  • Graphical Representation: In a scatter plot, points tend to cluster around a line that slopes downwards from left to right.

Zero Correlation

  • Definition: It indicates that there is no relationship between the two variables. Changes in one variable do not predict changes in the other variable.
  • Example: The relationship between a person’s shoe size and their intelligence quotient (IQ). There is no logical connection between these two variables.
  • Graphical Representation: In a scatter plot, points are distributed randomly with no discernible pattern or slope.

How Can QuestionPro Help in Correlation Analysis?

QuestionPro, a robust survey platform, offers comprehensive tools to facilitate correlation analysis effectively. Here’s how QuestionPro can help you in conducting correlation analysis:

Effortless Data Collection

QuestionPro simplifies the data collection process through its user-friendly survey creation tools. You can design and distribute surveys to gather quantitative data on various variables of interest. The platform supports various question types, allowing you to capture detailed and relevant data efficiently.

Automated Data Analysis

Once the data is collected, QuestionPro provides built-in analytics tools for correlation analysis. You can easily calculate correlations, which measure the strength and direction of the linear relationship between two variables. The linear correlation coefficient ranges from -1 to 1, where:

  • 1 indicates a perfect positive correlation.
  • -1 indicates a perfect negative correlation.
  • 0 indicates no correlation.

Visual Representation

QuestionPro offers visualization tools to help you interpret the results of your correlation analysis. Scatter plots and correlation matrices can be generated to provide a clear graphical representation of the relationships between variables. This visual aid is crucial for quickly identifying trends and patterns.

Identifying Patterns and Trends

Using QuestionPro’s correlation analysis, researchers can determine whether the observed correlation between variables is positive, negative, or zero:

  • Positive Correlation: Both variables move in the same direction. For example, increased advertising spending may correlate with increased sales.
  • Negative Correlation: The variables tend to move in opposite directions. For example, increased screen time might correlate with decreased academic performance.
  • Zero Correlation: No relationship exists between the variables. For example, the number of years in school might not correlate with the number of letters in a person’s name.

Practical Applications

Correlation analysis in QuestionPro can be used for various practical applications, such as:

  • Market Research: Measure the effectiveness of marketing campaigns by correlating advertising spending with sales performance.
  • Healthcare: Assess the relationship between medication usage and patient outcomes, such as blood pressure levels.
  • Education: Determine the impact of study habits on academic performance by correlating hours studied with grades.

Zero correlation between two variables signifies the absence of a linear relationship, indicating that changes in one variable do not correspond with changes in another. By computing correlation coefficients and visualizing data through scatter plots, researchers can accurately determine whether variables are positively correlated, negatively correlated, or show zero correlation.

Using QuestionPro for correlation analysis in your surveys provides a powerful way to uncover meaningful relationships between variables. By exploring QuestionPro’s intuitive interface, advanced analytical tools, and comprehensive reporting features, you can efficiently conduct correlation analysis and derive valuable insights from your data. Contact QuestionPro today for further information!


13.1 Understanding Null Hypothesis Testing

Learning Objectives

  • Explain the purpose of null hypothesis testing, including the role of sampling error.
  • Describe the basic logic of null hypothesis testing.
  • Describe the role of relationship strength and sample size in determining statistical significance and make reasonable judgments about statistical significance based on these two factors.

The Purpose of Null Hypothesis Testing

As we have seen, psychological research typically involves measuring one or more variables in a sample and computing descriptive statistics for that sample. In general, however, the researcher’s goal is not to draw conclusions about that sample but to draw conclusions about the population that the sample was selected from. Thus researchers must use sample statistics to draw conclusions about the corresponding values in the population. These corresponding values in the population are called  parameters . Imagine, for example, that a researcher measures the number of depressive symptoms exhibited by each of 50 adults with clinical depression and computes the mean number of symptoms. The researcher probably wants to use this sample statistic (the mean number of symptoms for the sample) to draw conclusions about the corresponding population parameter (the mean number of symptoms for adults with clinical depression).

Unfortunately, sample statistics are not perfect estimates of their corresponding population parameters. This is because there is a certain amount of random variability in any statistic from sample to sample. The mean number of depressive symptoms might be 8.73 in one sample of adults with clinical depression, 6.45 in a second sample, and 9.44 in a third—even though these samples are selected randomly from the same population. Similarly, the correlation (Pearson’s  r ) between two variables might be +.24 in one sample, −.04 in a second sample, and +.15 in a third—again, even though these samples are selected randomly from the same population. This random variability in a statistic from sample to sample is called  sampling error . (Note that the term error  here refers to random variability and does not imply that anyone has made a mistake. No one “commits a sampling error.”)

One implication of this is that when there is a statistical relationship in a sample, it is not always clear that there is a statistical relationship in the population. A small difference between two group means in a sample might indicate that there is a small difference between the two group means in the population. But it could also be that there is no difference between the means in the population and that the difference in the sample is just a matter of sampling error. Similarly, a Pearson’s  r  value of −.29 in a sample might mean that there is a negative relationship in the population. But it could also be that there is no relationship in the population and that the relationship in the sample is just a matter of sampling error.

In fact, any statistical relationship in a sample can be interpreted in two ways:

  • There is a relationship in the population, and the relationship in the sample reflects this.
  • There is no relationship in the population, and the relationship in the sample reflects only sampling error.

The purpose of null hypothesis testing is simply to help researchers decide between these two interpretations.

The Logic of Null Hypothesis Testing

Null hypothesis testing  is a formal approach to deciding between two interpretations of a statistical relationship in a sample. One interpretation is called the  null hypothesis  (often symbolized  H 0  and read as “H-naught”). This is the idea that there is no relationship in the population and that the relationship in the sample reflects only sampling error. Informally, the null hypothesis is that the sample relationship “occurred by chance.” The other interpretation is called the  alternative hypothesis  (often symbolized as  H 1 ). This is the idea that there is a relationship in the population and that the relationship in the sample reflects this relationship in the population.

Again, every statistical relationship in a sample can be interpreted in either of these two ways: It might have occurred by chance, or it might reflect a relationship in the population. So researchers need a way to decide between them. Although there are many specific null hypothesis testing techniques, they are all based on the same general logic. The steps are as follows:

  • Assume for the moment that the null hypothesis is true. There is no relationship between the variables in the population.
  • Determine how likely the sample relationship would be if the null hypothesis were true.
  • If the sample relationship would be extremely unlikely, then reject the null hypothesis  in favor of the alternative hypothesis. If it would not be extremely unlikely, then  retain the null hypothesis .

Following this logic, we can begin to understand why Mehl and his colleagues concluded that there is no difference in talkativeness between women and men in the population. In essence, they asked the following question: “If there were no difference in the population, how likely is it that we would find a small difference of  d  = 0.06 in our sample?” Their answer to this question was that this sample relationship would be fairly likely if the null hypothesis were true. Therefore, they retained the null hypothesis—concluding that there is no evidence of a sex difference in the population. We can also see why Kanner and his colleagues concluded that there is a correlation between hassles and symptoms in the population. They asked, “If the null hypothesis were true, how likely is it that we would find a strong correlation of +.60 in our sample?” Their answer to this question was that this sample relationship would be fairly unlikely if the null hypothesis were true. Therefore, they rejected the null hypothesis in favor of the alternative hypothesis—concluding that there is a positive correlation between these variables in the population.

A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the  p value . A low  p  value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A p  value that is not low means that the sample result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis. But how low must the  p  value be before the sample result is considered unlikely enough to reject the null hypothesis? In null hypothesis testing, this criterion is called  α (alpha)  and is almost always set to .05. If there is a 5% chance or less of a result as extreme as the sample result if the null hypothesis were true, then the null hypothesis is rejected. When this happens, the result is said to be  statistically significant . If there is greater than a 5% chance of a result as extreme as the sample result when the null hypothesis is true, then the null hypothesis is retained. This does not necessarily mean that the researcher accepts the null hypothesis as true—only that there is not currently enough evidence to reject it. Researchers often use the expression “fail to reject the null hypothesis” rather than “retain the null hypothesis,” but they never use the expression “accept the null hypothesis.”

The Misunderstood  p  Value

The  p  value is one of the most misunderstood quantities in psychological research (Cohen, 1994) [1] . Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks!

The most common misinterpretation is that the  p  value is the probability that the null hypothesis is true—that the sample result occurred by chance. For example, a misguided researcher might say that because the  p  value is .02, there is only a 2% chance that the result is due to chance and a 98% chance that it reflects a real relationship in the population. But this is incorrect . The  p  value is really the probability of a result at least as extreme as the sample result  if  the null hypothesis  were  true. So a  p  value of .02 means that if the null hypothesis were true, a sample result this extreme would occur only 2% of the time.

You can avoid this misunderstanding by remembering that the  p  value is not the probability that any particular  hypothesis  is true or false. Instead, it is the probability of obtaining the  sample result  if the null hypothesis were true.

[Image: “Null Hypothesis” xkcd comic, retrieved from http://imgs.xkcd.com/comics/null_hypothesis.png (CC-BY-NC 2.5)]

Role of Sample Size and Relationship Strength

Recall that null hypothesis testing involves answering the question, “If the null hypothesis were true, what is the probability of a sample result as extreme as this one?” In other words, “What is the  p  value?” It can be helpful to see that the answer to this question depends on just two considerations: the strength of the relationship and the size of the sample. Specifically, the stronger the sample relationship and the larger the sample, the less likely the result would be if the null hypothesis were true. That is, the lower the  p  value. This should make sense. Imagine a study in which a sample of 500 women is compared with a sample of 500 men in terms of some psychological characteristic, and Cohen’s  d  is a strong 0.50. If there were really no sex difference in the population, then a result this strong based on such a large sample should seem highly unlikely. Now imagine a similar study in which a sample of three women is compared with a sample of three men, and Cohen’s  d  is a weak 0.10. If there were no sex difference in the population, then a relationship this weak based on such a small sample should seem likely. And this is precisely why the null hypothesis would be rejected in the first example and retained in the second.
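The two imagined studies can be checked numerically. The sketch below uses only the standard library and a normal approximation to the two-sample test (rough for a sample of three per group, but good enough to show the contrast): under the null hypothesis, an observed Cohen's d has a standard error of roughly sqrt(2/n) with n participants per group.

```python
from math import sqrt, erfc

def approx_p(d: float, n_per_group: int) -> float:
    """Approximate two-sided p for an observed standardized mean difference d,
    under H0: no difference in the population (normal approximation)."""
    z = d * sqrt(n_per_group / 2)   # observed d in null standard-error units
    return erfc(z / sqrt(2))        # P(|Z| >= z) for a standard normal Z

print(approx_p(0.50, 500))  # strong effect, large sample: p is tiny
print(approx_p(0.10, 3))    # weak effect, tiny sample: p is about 0.90
```

A strong relationship in a large sample is essentially impossible under the null hypothesis, while a weak relationship in a tiny sample is entirely unremarkable.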

Of course, sometimes the result can be weak and the sample large, or the result can be strong and the sample small. In these cases, the two considerations trade off against each other so that a weak result can be statistically significant if the sample is large enough and a strong relationship can be statistically significant even if the sample is small. Table 13.1 shows roughly how relationship strength and sample size combine to determine whether a sample result is statistically significant. The columns of the table represent the three levels of relationship strength: weak, medium, and strong. The rows represent four sample sizes that can be considered small, medium, large, and extra large in the context of psychological research. Thus each cell in the table represents a combination of relationship strength and sample size. If a cell contains the word  Yes , then this combination would be statistically significant for both Cohen’s  d  and Pearson’s  r . If it contains the word  No , then it would not be statistically significant for either. There is one cell where the decision for  d  and  r  would be different and another where it might be different depending on some additional considerations, which are discussed in Section 13.2 “Some Basic Null Hypothesis Tests”

Sample Size           | Weak             | Medium | Strong
Small (N = 20)        | No               | No     | d = Maybe, r = Yes
Medium (N = 50)       | No               | Yes    | Yes
Large (N = 100)       | d = Yes, r = No  | Yes    | Yes
Extra large (N = 500) | Yes              | Yes    | Yes

Although Table 13.1 provides only a rough guideline, it shows very clearly that weak relationships based on medium or small samples are never statistically significant and that strong relationships based on medium or larger samples are always statistically significant. If you keep this lesson in mind, you will often know whether a result is statistically significant based on the descriptive statistics alone. It is extremely useful to be able to develop this kind of intuitive judgment. One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses. For example, if your sample relationship is strong and your sample is medium, then you would expect to reject the null hypothesis. If for some reason your formal null hypothesis test indicates otherwise, then you need to double-check your computations and interpretations. A second reason is that the ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.
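You can check the Pearson's r cells of Table 13.1 yourself. The sketch below hard-codes the standard two-tailed .05 critical t values for df = N − 2 and uses the identity that a sample r is significant exactly when |r| exceeds t_crit / sqrt(t_crit² + N − 2); the values .10, .30, and .50 stand in for weak, medium, and strong relationships.

```python
from math import sqrt

# Two-tailed .05 critical t values for df = N - 2 (standard table values).
T_CRIT = {20: 2.101, 50: 2.011, 100: 1.984, 500: 1.965}

def r_critical(n: int) -> float:
    """Smallest |r| that is statistically significant at alpha = .05 with sample size n."""
    t = T_CRIT[n]
    return t / sqrt(t * t + n - 2)

for n in (20, 50, 100, 500):
    cells = ["Yes" if r > r_critical(n) else "No" for r in (0.10, 0.30, 0.50)]
    print(f"N = {n:3d}  (critical r = {r_critical(n):.3f})  weak/medium/strong: {cells}")
```

The output reproduces the r entries of the table, including r = No for a weak relationship at N = 100 and r = Yes for a strong relationship even at N = 20.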

Statistical Significance Versus Practical Significance

Table 13.1 illustrates another extremely important point. A statistically significant result is not necessarily a strong one. Even a very weak result can be statistically significant if it is based on a large enough sample. This is closely related to Janet Shibley Hyde’s argument about sex differences (Hyde, 2007) [2] . The differences between women and men in mathematical problem solving and leadership ability are statistically significant. But the word  significant  can cause people to interpret these differences as strong and important—perhaps even important enough to influence the college courses they take or even who they vote for. As we have seen, however, these statistically significant differences are actually quite weak—perhaps even “trivial.”

This is why it is important to distinguish between the  statistical  significance of a result and the  practical  significance of that result.  Practical significance refers to the importance or usefulness of the result in some real-world context. Many sex differences are statistically significant—and may even be interesting for purely scientific reasons—but they are not practically significant. In clinical practice, this same concept is often referred to as “clinical significance.” For example, a study on a new treatment for social phobia might show that it produces a statistically significant positive effect. Yet this effect still might not be strong enough to justify the time, effort, and other costs of putting it into practice—especially if easier and cheaper treatments that work almost as well already exist. Although statistically significant, this result would be said to lack practical or clinical significance.

[Image: “Conditional Risk” xkcd comic, retrieved from http://imgs.xkcd.com/comics/conditional_risk.png (CC-BY-NC 2.5)]

Key Takeaways

  • Null hypothesis testing is a formal approach to deciding whether a statistical relationship in a sample reflects a real relationship in the population or is just due to chance.
  • The logic of null hypothesis testing involves assuming that the null hypothesis is true, finding how likely the sample result would be if this assumption were correct, and then making a decision. If the sample result would be unlikely if the null hypothesis were true, then it is rejected in favor of the alternative hypothesis. If it would not be unlikely, then the null hypothesis is retained.
  • The probability of obtaining the sample result if the null hypothesis were true (the  p  value) is based on two considerations: relationship strength and sample size. Reasonable judgments about whether a sample relationship is statistically significant can often be made by quickly considering these two factors.
  • Statistical significance is not the same as relationship strength or importance. Even weak relationships can be statistically significant if the sample size is large enough. It is important to consider relationship strength and the practical significance of a result in addition to its statistical significance.
Exercises

  • Discussion: Imagine a study showing that people who eat more broccoli tend to be happier. Explain for someone who knows nothing about statistics why the researchers would conduct a null hypothesis test.
  • Practice: Use Table 13.1 to decide whether each of the following results is statistically significant.
      ◦ The correlation between two variables is r = −.78 based on a sample size of 137.
      ◦ The mean score on a psychological characteristic for women is 25 (SD = 5) and the mean score for men is 24 (SD = 5). There were 12 women and 10 men in this study.
      ◦ In a memory experiment, the mean number of items recalled by the 40 participants in Condition A was 0.50 standard deviations greater than the mean number recalled by the 40 participants in Condition B.
      ◦ In another memory experiment, the mean scores for participants in Condition A and Condition B came out exactly the same!
      ◦ A student finds a correlation of r = .04 between the number of units the students in his research methods class are taking and the students’ level of stress.
  • Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003. ↵
  • Hyde, J. S. (2007). New directions in the study of gender similarities and differences. Current Directions in Psychological Science, 16, 259–263. ↵

Creative Commons License


Null & Alternative Hypotheses | Definitions, Templates & Examples

Published on May 6, 2022 by Shaun Turney . Revised on June 22, 2023.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test :

  • Null hypothesis ( H 0 ): There’s no effect in the population .
  • Alternative hypothesis ( H a or H 1 ) : There’s an effect in the population.


The null and alternative hypotheses offer competing answers to your research question . When the research question asks “Does the independent variable affect the dependent variable?”:

  • The null hypothesis ( H 0 ) answers “No, there’s no effect in the population.”
  • The alternative hypothesis ( H a ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample . Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample. It’s critical for your research to write strong hypotheses .

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.


The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population ( p ≤ α), then we can reject the null hypothesis . Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept . Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

You can never know with complete certainty whether there is an effect in the population. Some percentage of the time, your inference about the population will be incorrect. When you incorrectly reject the null hypothesis, it’s called a type I error . When you incorrectly fail to reject it, it’s a type II error.
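The Type I error rate can be demonstrated directly by simulation. In the sketch below (an invented one-sample z-test with known σ = 1), the null hypothesis is true by construction, so every rejection is a Type I error; in the long run the rejection rate sits near α = .05.

```python
import random
from math import sqrt, erfc

def one_experiment(rng: random.Random, n: int = 30) -> float:
    """One simulated study in which H0 is true (the population mean really is 0).
    Returns the two-sided p value of a z-test with known sigma = 1."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    z = (sum(xs) / n) * sqrt(n)        # sample mean divided by its standard error
    return erfc(abs(z) / sqrt(2))

rng = random.Random(42)
p_values = [one_experiment(rng) for _ in range(10_000)]
rate = sum(p <= 0.05 for p in p_values) / len(p_values)
print(rate)   # close to .05 -- and every one of these rejections is a Type I error
```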

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

Research question: Does tooth flossing affect the number of cavities?
Null hypothesis (H0): Tooth flossing has no effect on the number of cavities.
Test-specific (t test): The mean number of cavities per person does not differ between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 = µ2.

Research question: Does the amount of text highlighted in the textbook affect exam scores?
Null hypothesis (H0): The amount of text highlighted in the textbook has no effect on exam scores.
Test-specific (linear regression): There is no relationship between the amount of text highlighted and exam scores in the population; β = 0.

Research question: Does daily meditation decrease the incidence of depression?
Null hypothesis (H0): Daily meditation does not decrease the incidence of depression.*
Test-specific (two-proportions test): The proportion of people with depression in the daily-meditation group (p1) is greater than or equal to the no-meditation group (p2) in the population; p1 ≥ p2.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .
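The directional null hypothesis in the meditation example (p1 ≥ p2) can be evaluated with a one-sided two-proportion z-test. The counts below (18 of 100 depressed in the meditation group, 30 of 100 in the no-meditation group) are invented purely for illustration; only the hypotheses come from the example above.

```python
from math import sqrt, erfc

def two_prop_p_less(x1: int, n1: int, x2: int, n2: int) -> float:
    """One-sided p for H0: p1 >= p2 vs Ha: p1 < p2 (pooled z-test).
    A small p is strong evidence that the first proportion is lower."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 0.5 * erfc(-z / sqrt(2))    # lower-tail probability P(Z <= z)

p = two_prop_p_less(18, 100, 30, 100)
print(round(p, 3))  # about 0.023: reject H0 and conclude p1 < p2 at alpha = .05
```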

The alternative hypothesis ( H a ) is the other answer to your research question . It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.

Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a relationship.” When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Research question: Does tooth flossing affect the number of cavities?
Alternative hypothesis (Ha): Tooth flossing has an effect on the number of cavities.
Test-specific (t test): The mean number of cavities per person differs between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 ≠ µ2.

Research question: Does the amount of text highlighted in a textbook affect exam scores?
Alternative hypothesis (Ha): The amount of text highlighted in the textbook has an effect on exam scores.
Test-specific (linear regression): There is a relationship between the amount of text highlighted and exam scores in the population; β ≠ 0.

Research question: Does daily meditation decrease the incidence of depression?
Alternative hypothesis (Ha): Daily meditation decreases the incidence of depression.
Test-specific (two-proportions test): The proportion of people with depression in the daily-meditation group (p1) is less than the no-meditation group (p2) in the population; p1 < p2.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question.
  • They both make claims about the population.
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

                               | Null hypothesis (H0)                        | Alternative hypothesis (Ha)
Claim                          | There is no effect in the population.       | There is an effect in the population.
Mathematical symbol            | Equality (=, ≥, or ≤)                       | Inequality (≠, <, or >)
If the test is significant     | Rejected                                    | Supported
If the test is not significant | Failed to reject                            | Not supported

To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

General template sentences

The only thing you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does independent variable affect dependent variable ?

  • Null hypothesis ( H 0 ): Independent variable does not affect dependent variable.
  • Alternative hypothesis ( H a ): Independent variable affects dependent variable.

Test-specific template sentences

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Statistical test: t test (two groups)
Null hypothesis (H0): The mean dependent variable does not differ between group 1 (µ1) and group 2 (µ2) in the population; µ1 = µ2.
Alternative hypothesis (Ha): The mean dependent variable differs between group 1 (µ1) and group 2 (µ2) in the population; µ1 ≠ µ2.

Statistical test: ANOVA (three groups)
Null hypothesis (H0): The mean dependent variable does not differ between group 1 (µ1), group 2 (µ2), and group 3 (µ3) in the population; µ1 = µ2 = µ3.
Alternative hypothesis (Ha): The mean dependent variables of group 1 (µ1), group 2 (µ2), and group 3 (µ3) are not all equal in the population.

Statistical test: Pearson correlation
Null hypothesis (H0): There is no correlation between independent variable and dependent variable in the population; ρ = 0.
Alternative hypothesis (Ha): There is a correlation between independent variable and dependent variable in the population; ρ ≠ 0.

Statistical test: Simple linear regression
Null hypothesis (H0): There is no relationship between independent variable and dependent variable in the population; β = 0.
Alternative hypothesis (Ha): There is a relationship between independent variable and dependent variable in the population; β ≠ 0.

Statistical test: Two-proportions test
Null hypothesis (H0): The dependent variable expressed as a proportion does not differ between group 1 (p1) and group 2 (p2) in the population; p1 = p2.
Alternative hypothesis (Ha): The dependent variable expressed as a proportion differs between group 1 (p1) and group 2 (p2) in the population; p1 ≠ p2.

Note: The template sentences above assume that you’re performing two-tailed tests (their alternative hypotheses use ≠). Two-tailed tests are appropriate for most studies.

If you want to know more about statistics , methodology , or research bias , make sure to check out some of our other articles with explanations and examples.

  • Normal distribution
  • Descriptive statistics
  • Measures of central tendency
  • Correlation coefficient

Methodology

  • Cluster sampling
  • Stratified sampling
  • Types of interviews
  • Cohort study
  • Thematic analysis

Research bias

  • Implicit bias
  • Cognitive bias
  • Survivorship bias
  • Availability heuristic
  • Nonresponse bias
  • Regression to the mean

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“ x affects y because …”).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses . In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.

Cite this Scribbr article

If you want to cite this source, you can copy and paste the citation or click the “Cite this Scribbr article” button to automatically add the citation to our free Citation Generator.

Turney, S. (2023, June 22). Null & Alternative Hypotheses | Definitions, Templates & Examples. Scribbr. Retrieved August 23, 2024, from https://www.scribbr.com/statistics/null-and-alternative-hypotheses/



Null Hypothesis – Simple Introduction

A null hypothesis is a precise statement about a population that we try to reject with sample data. We don't usually believe our null hypothesis (or H 0 ) to be true. However, we need some exact statement as a starting point for statistical significance testing.

The Null Hypothesis is the Starting Point for Statistical Significance Testing

Null Hypothesis Examples

Often -but not always- the null hypothesis states there is no association or difference between variables or subpopulations. Like so, some typical null hypotheses are:

  • the correlation between frustration and aggression is zero ( correlation analysis );
  • the average income for men is similar to that for women ( independent samples t-test );
  • Nationality is (perfectly) unrelated to music preference ( chi-square independence test );
  • the average population income was equal over 2012 through 2016 ( repeated measures ANOVA );
  • Dutch, German, French and British people have identical average body weights ( ANOVA ).

“Null” Does Not Mean “Zero”

A common misunderstanding is that “null” implies “zero”. This is often but not always the case. For example, a null hypothesis may also state that the correlation between frustration and aggression is 0.5. No zero involved here and -although somewhat unusual- perfectly valid. The “null” in “null hypothesis” derives from “nullify”: the null hypothesis is the statement that we're trying to refute, regardless of whether it does (not) specify a zero effect.
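A non-zero null hypothesis like “the correlation is 0.5” can be tested too. The sketch below uses the Fisher z transform (a standard approach, though not named in this tutorial), under which atanh(r) is approximately normal with standard error 1/sqrt(n − 3); the sample values r = 0.30 and n = 100 are invented for illustration.

```python
from math import atanh, sqrt, erfc

def p_for_null_rho(r: float, n: int, rho0: float) -> float:
    """Two-sided p for H0: the population correlation equals rho0
    (Fisher z approximation)."""
    z = (atanh(r) - atanh(rho0)) * sqrt(n - 3)
    return erfc(abs(z) / sqrt(2))

print(round(p_for_null_rho(0.30, 100, 0.5), 3))  # about 0.018: this non-zero H0 is rejected
```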

Null Hypothesis Testing -How Does It Work?

I want to know if happiness is related to wealth among Dutch people. One approach to find this out is to formulate a null hypothesis. Since “related to” is not precise, we choose the opposite statement as our null hypothesis: the correlation between wealth and happiness is zero among all Dutch people. We'll now try to refute this hypothesis in order to demonstrate that happiness and wealth are related all right. Now, we can't reasonably ask all 17,142,066 Dutch people how happy they generally feel.

Null Hypothesis - Population Counter

So we'll ask a sample (say, 100 people) about their wealth and their happiness. The correlation between happiness and wealth turns out to be 0.25 in our sample. Now we've one problem: sample outcomes tend to differ somewhat from population outcomes. So if the correlation really is zero in our population, we may find a non zero correlation in our sample. To illustrate this important point, take a look at the scatterplot below. It visualizes a zero correlation between happiness and wealth for an entire population of N = 200.

Null Hypothesis - Population Scatterplot

Now we draw a random sample of N = 20 from this population (the red dots in our previous scatterplot). Even though our population correlation is zero, we found a staggering 0.82 correlation in our sample . The figure below illustrates this by omitting all non sampled units from our previous scatterplot.

Null Hypothesis - Sample Scatterplot

This raises the question how we can ever say anything about our population if we only have a tiny sample from it. The basic answer: we can rarely say anything with 100% certainty. However, we can say a lot with 99%, 95% or 90% certainty.

Probability

So how does that work? Well, basically, some sample outcomes are highly unlikely given our null hypothesis . Like so, the figure below shows the probabilities for different sample correlations (N = 100) if the population correlation really is zero.

Null Hypothesis - Sampling Distribution for Correlation

A computer will readily compute these probabilities. However, doing so requires a sample size (100 in our case) and a presumed population correlation ρ (0 in our case). So that's why we need a null hypothesis . If we look at this sampling distribution carefully, we see that sample correlations around 0 are most likely: there's a 0.68 probability of finding a correlation between -0.1 and 0.1. What does that mean? Well, remember that probabilities can be seen as relative frequencies. So imagine we'd draw 1,000 samples instead of the one we have. This would result in 1,000 correlation coefficients and some 680 of those -a relative frequency of 0.68- would be in the range -0.1 to 0.1. Likewise, there's a 0.95 (or 95%) probability of finding a sample correlation between -0.2 and 0.2.
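The 0.68 and 0.95 figures quoted above are easy to reproduce by brute force: draw many samples of N = 100 from a population in which the true correlation is zero and tabulate the sample correlations. A standard-library sketch (the simulation settings are choices of this sketch, not from the tutorial):

```python
import random

def sample_r(rng: random.Random, n: int = 100) -> float:
    """Pearson r of one sample of n independent (x, y) pairs (true rho = 0)."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
rs = [sample_r(rng) for _ in range(2_000)]
within_01 = sum(abs(r) < 0.1 for r in rs) / len(rs)
within_02 = sum(abs(r) < 0.2 for r in rs) / len(rs)
print(within_01, within_02)   # roughly 0.68 and 0.95, matching the sampling distribution
```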

We found a sample correlation of 0.25. How likely is that if the population correlation is zero? The answer is known as the p-value (short for probability value): A p-value is the probability of finding some sample outcome or a more extreme one if the null hypothesis is true. Given our 0.25 correlation, “more extreme” usually means larger than 0.25 or smaller than -0.25. We can't tell from our graph but the underlying table tells us that p ≈ 0.012 . If the null hypothesis is true, there's a 1.2% probability of finding our sample correlation.
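The p ≈ 0.012 above can be verified without the underlying table. One standard route (an approximation, and not necessarily what the tutorial's software used) is the Fisher z transform: when the population correlation is zero, atanh(r) is approximately normal with standard error 1/sqrt(n − 3).

```python
from math import atanh, sqrt, erfc

def fisher_p(r: float, n: int) -> float:
    """Two-sided p for H0: rho = 0, via the Fisher z approximation."""
    z = atanh(r) * sqrt(n - 3)
    return erfc(abs(z) / sqrt(2))

print(round(fisher_p(0.25, 100), 3))  # 0.012, matching the value quoted in the text
```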

Conclusion?

If our population correlation really is zero, then the probability of finding a sample correlation as extreme as 0.25 in a sample of N = 100 is only 0.012: very unlikely. A reasonable conclusion is that our population correlation wasn't zero after all. Conclusion: we reject the null hypothesis . Given our sample outcome, we no longer believe that happiness and wealth are unrelated. However, we still can't state this with certainty.

Null Hypothesis - Limitations

Thus far, we only concluded that the population correlation is probably not zero . That's the only conclusion from our null hypothesis approach and it's not really that interesting. What we really want to know is the population correlation. Our sample correlation of 0.25 seems a reasonable estimate. We call such a single number a point estimate . Now, a new sample may come up with a different correlation. An interesting question is how much our sample correlations would fluctuate over samples if we'd draw many of them. The figure below shows precisely that, assuming our sample size of N = 100 and our (point) estimate of 0.25 for the population correlation.

Null Hypothesis - Sampling Distribution Under Alternative Hypothesis

Confidence Intervals

Our sample outcome suggests that some 95% of many samples should come up with a correlation between 0.06 and 0.43. This range is known as a confidence interval . Although not precisely correct, it's most easily thought of as the bandwidth that's likely to enclose the population correlation . One thing to note is that the confidence interval is quite wide. It almost contains a zero correlation, exactly the null hypothesis we rejected earlier. Another thing to note is that our sampling distribution and confidence interval are slightly asymmetrical. They are symmetrical for most other statistics (such as means or beta coefficients ) but not correlations.
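The (0.06, 0.43) interval can be reconstructed with the Fisher z transform: build a symmetric 95% interval on the transformed scale and map it back with tanh, which is also exactly why the resulting interval is slightly asymmetric around r = 0.25. A sketch:

```python
from math import atanh, tanh, sqrt

def r_confidence_interval(r: float, n: int, z_crit: float = 1.96) -> tuple:
    """95% CI for a correlation via the Fisher z transform."""
    center = atanh(r)
    half = z_crit / sqrt(n - 3)           # symmetric on the transformed scale
    return tanh(center - half), tanh(center + half)

lo, hi = r_confidence_interval(0.25, 100)
print(round(lo, 2), round(hi, 2))   # 0.06 0.43, as in the text
```

Note that the interval stretches slightly farther below 0.25 than above it once mapped back, matching the asymmetry described above.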

  • Agresti, A. & Franklin, C. (2014). Statistics: The Art & Science of Learning from Data. Essex: Pearson Education Limited.
  • Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics. Newbury Park, CA: Sage.
  • Howell, D. C. (2002). Statistical Methods for Psychology (5th ed.). Pacific Grove, CA: Duxbury.
  • Van den Brink, W. P. & Koele, P. (2002). Statistiek, deel 3 [Statistics, part 3]. Amsterdam: Boom.

Tell us what you think!


By John Xie on February 28th, 2023

“stop using the term ‘statistically significant’ entirely and moving to a world beyond ‘p < 0.05’”

“…, no p-value can reveal the plausibility, presence, truth, or importance of an association or effect.

Therefore, a label of statistical significance does not mean or imply that an association or effect is highly probable, real, true, or important. Nor does a label of statistical nonsignificance lead to the association or effect being improbable, absent, false, or unimportant.

Yet the dichotomization into ‘significant’ and ‘not significant’ is taken as an imprimatur of authority on these characteristics.” “To be clear, the problem is not that of having only two labels. Results should not be trichotomized, or indeed categorized into any number of groups, based on arbitrary p-value thresholds.

Similarly, we need to stop using confidence intervals as another means of dichotomizing (based, on whether a null value falls within the interval). And, to preclude a reappearance of this problem elsewhere, we must not begin arbitrarily categorizing other statistical measures (such as Bayes factors).”

Quotation from: Ronald L. Wasserstein, Allen L. Schirm & Nicole A. Lazar, Moving to a World Beyond “p<0.05”, The American Statistician(2019), Vol. 73, No. S1, 1-19: Editorial.


By Ruben Geert van den Berg on February 28th, 2023

Yes, partly agreed.

However, most students are still forced to apply null hypothesis testing so why not try to explain to them how it works?

An associated problem is that "significant" has a normal language meaning. Most people seem to confuse "statistically significant" with "real-world significant", which is unfortunate.

By the way, this same point applies to other terms such as "normally distributed". A normal distribution for dice rolls is not a normal but a uniform distribution ;-)

Keep up the good work!


What is The Null Hypothesis & When Do You Reject The Null Hypothesis

By Julia Simkus, BA (Hons) Psychology, Princeton University; edited by Saul McLeod, PhD, and Olivia Guy-Evans, MSc.

A null hypothesis is a statistical concept suggesting no significant difference or relationship between measured variables. It’s the default assumption unless empirical evidence proves otherwise.

The null hypothesis states no relationship exists between the two variables being studied (i.e., one variable does not affect the other).

The null hypothesis is the statement that a researcher or an investigator wants to disprove.

Testing the null hypothesis can tell you whether your results are due to the effects of manipulating the independent variable or due to random chance.

How to Write a Null Hypothesis

Null hypotheses (H0) start as research questions that the investigator rephrases as statements indicating no effect or relationship between the independent and dependent variables.

It is a default position that your research aims to challenge or confirm.

For example, if studying the impact of exercise on weight loss, your null hypothesis might be:

There is no significant difference in weight loss between individuals who exercise daily and those who do not.

Examples of Null Hypotheses

Research Question | Null Hypothesis
Do teenagers use cell phones more than adults? | Teenagers and adults use cell phones the same amount.
Do tomato plants exhibit a higher rate of growth when planted in compost rather than in soil? | Tomato plants show no difference in growth rates when planted in compost rather than soil.
Does daily meditation decrease the incidence of depression? | Daily meditation does not decrease the incidence of depression.
Does daily exercise increase test performance? | There is no relationship between daily exercise time and test performance.
Does the new vaccine prevent infections? | The vaccine does not affect the infection rate.
Does flossing your teeth affect the number of cavities? | Flossing your teeth has no effect on the number of cavities.

When Do We Reject The Null Hypothesis? 

We reject the null hypothesis when the data provide strong enough evidence to conclude that it is likely incorrect. This often occurs when the p-value (probability of observing the data given the null hypothesis is true) is below a predetermined significance level.

If the collected data does not meet the expectation of the null hypothesis, a researcher can conclude that the data lacks sufficient evidence to back up the null hypothesis, and thus the null hypothesis is rejected. 

Rejecting the null hypothesis means that a relationship does exist between a set of variables and the effect is statistically significant (p < 0.05).

If the data collected from the random sample is not statistically significant, then the null hypothesis is retained (we fail to reject it), and the researchers can conclude that the sample provides no evidence of a relationship between the variables.

You need to perform a statistical test on your data in order to evaluate how consistent it is with the null hypothesis. A p-value is one statistical measurement used to validate a hypothesis against observed data.

Calculating the p-value is a critical part of null-hypothesis significance testing because it quantifies how strongly the sample data contradicts the null hypothesis.

The level of statistical significance is often expressed as a  p  -value between 0 and 1. The smaller the p-value, the stronger the evidence that you should reject the null hypothesis.

Usually, a researcher uses a significance level of 0.05 or 0.01 (corresponding to a confidence level of 95% or 99%) as a general guideline to decide whether to reject or keep the null.

When your p-value is less than or equal to your significance level, you reject the null hypothesis.

In other words, smaller p-values are taken as stronger evidence against the null hypothesis. Conversely, when the p-value is greater than your significance level, you fail to reject the null hypothesis.
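The decision rule described above can be sketched in a few lines of code. This is only an illustration of the comparison between a p-value and a significance level; the p-value itself is assumed to come from whatever test statistic fits the data (t-test, z-test, chi-square, and so on).

```python
# Minimal sketch of the null-hypothesis decision rule.
# The p-value is assumed to have been computed already by
# an appropriate statistical test.

def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value to the significance level alpha."""
    if p_value <= alpha:
        return "reject H0"          # evidence against the null
    return "fail to reject H0"      # insufficient evidence; H0 is retained

print(decide(0.003))   # reject H0
print(decide(0.27))    # fail to reject H0
```

Note that the second outcome is reported as "fail to reject", never "accept", matching the discussion below.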

In this case, the sample data provides insufficient evidence to conclude that the effect exists in the population.

Because you can never know with complete certainty whether there is an effect in the population, your inferences about a population will sometimes be incorrect.

When you incorrectly reject the null hypothesis, it’s called a type I error. When you incorrectly fail to reject it, it’s called a type II error.

Why Do We Never Accept The Null Hypothesis?

The reason we do not say “accept the null” is that we always assume the null hypothesis is true and then conduct a study to see if there is evidence against it. Even if we find no evidence against it, the null hypothesis is not thereby accepted.

A lack of evidence only means that you haven’t proven that something exists. It does not prove that something doesn’t exist. 

It is risky to conclude that the null hypothesis is true merely because we did not find evidence to reject it. It is always possible that researchers elsewhere have disproved the null hypothesis, so we cannot accept it as true, but instead, we state that we failed to reject the null. 

One can either reject the null hypothesis, or fail to reject it, but can never accept it.

Why Do We Use The Null Hypothesis?

We can never prove with 100% certainty that a hypothesis is true; we can only collect evidence that supports a theory. However, testing a hypothesis can set the stage for rejecting or retaining it within a certain confidence level.

The null hypothesis is useful because it can tell us whether the results of our study are due to random chance or the manipulation of a variable (with a certain level of confidence).

A null hypothesis is rejected if the measured data would be significantly unlikely to have occurred under it, and retained if the observed outcome is consistent with the position held by the null hypothesis.

Rejecting the null hypothesis sets the stage for further experimentation to see if a relationship between two variables exists. 

Hypothesis testing is a critical part of the scientific method as it helps decide whether the results of a research study support a particular theory about a given population. Hypothesis testing is a systematic way of backing up researchers’ predictions with statistical analysis.

It helps provide sufficient statistical evidence that either favors or rejects a certain hypothesis about the population parameter. 

Purpose of a Null Hypothesis 

  • The primary purpose of the null hypothesis is to provide a default claim that the data can potentially disprove.
  • Whether rejected or retained, the null hypothesis can help further progress a theory in many scientific cases.
  • A null hypothesis can be used to ascertain how consistent the outcomes of multiple studies are.

Do you always need both a Null Hypothesis and an Alternative Hypothesis?

The null (H0) and alternative (Ha or H1) hypotheses are two competing claims that describe the effect of the independent variable on the dependent variable. They are mutually exclusive, which means that only one of the two hypotheses can be true. 

While the null hypothesis states that there is no effect in the population, an alternative hypothesis states that there is an effect or relationship between the variables in the population.

The goal of hypothesis testing is to make inferences about a population based on a sample. In order to undertake hypothesis testing, you must express your research hypothesis as a null and alternative hypothesis. Both hypotheses are required to cover every possible outcome of the study. 

What is the difference between a null hypothesis and an alternative hypothesis?

The alternative hypothesis is the complement to the null hypothesis. The null hypothesis states that there is no effect or no relationship between variables, while the alternative hypothesis claims that there is an effect or relationship in the population.

It is the claim that you expect or hope will be true. The null hypothesis and the alternative hypothesis are always mutually exclusive, meaning that only one can be true at a time.

What are some problems with the null hypothesis?

One major problem with the null hypothesis is that researchers often treat a failure to reject it as a failure of the experiment. However, either outcome of a hypothesis test is a valid, informative result. Even if the null is not refuted, the researchers will still learn something new.

Why can a null hypothesis not be accepted?

We can either reject or fail to reject a null hypothesis, but never accept it. If your test fails to detect an effect, this is not proof that the effect doesn’t exist. It just means that your sample did not have enough evidence to conclude that it exists.

We can’t accept a null hypothesis because a lack of evidence does not prove that an effect does not exist. Instead, we fail to reject it.

Failing to reject the null indicates that the sample did not provide sufficient evidence to conclude that an effect exists.

If the p-value is greater than the significance level, then you fail to reject the null hypothesis.

Is a null hypothesis directional or non-directional?

A hypothesis test can contain either a directional alternative hypothesis or a non-directional alternative hypothesis. A directional hypothesis is one that contains the less-than ("<") or greater-than (">") sign.

A non-directional hypothesis contains the not-equal sign ("≠"). However, a null hypothesis is neither directional nor non-directional.

A null hypothesis is a prediction that there will be no change, relationship, or difference between two variables.

The directional hypothesis or nondirectional hypothesis would then be considered alternative hypotheses to the null hypothesis.


10.1 - Setting the Hypotheses: Examples

A significance test examines whether the null hypothesis provides a plausible explanation of the data. The null hypothesis itself does not involve the data. It is a statement about a parameter (a numerical characteristic of the population). These population values might be proportions or means or differences between means or proportions or correlations or odds ratios or any other numerical summary of the population. The alternative hypothesis is typically the research hypothesis of interest. Here are some examples.

Example 10.2: Hypotheses with One Sample of One Categorical Variable

About 10% of the human population is left-handed. Suppose a researcher at Penn State speculates that students in the College of Arts and Architecture are more likely to be left-handed than people found in the general population. We only have one sample since we will be comparing a population proportion based on a sample value to a known population value.

  • Research Question : Are artists more likely to be left-handed than people found in the general population?
  • Response Variable : Classification of the student as either right-handed or left-handed

State Null and Alternative Hypotheses

  • Null Hypothesis : Students in the College of Arts and Architecture are no more likely to be left-handed than people in the general population (population percent of left-handed students in the College of Arts and Architecture = 10% or p = .10).
  • Alternative Hypothesis : Students in the College of Arts and Architecture are more likely to be left-handed than people in the general population (population percent of left-handed students in the College of Arts and Architecture > 10% or p > .10). This is a one-sided alternative hypothesis.
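The one-sided test in Example 10.2 is commonly carried out with the normal approximation to the sample proportion. A minimal sketch follows; the survey counts (15 left-handers out of 100 students) are invented purely for illustration.

```python
# Sketch of the one-sided, one-sample proportion test in Example 10.2,
# using the normal (z) approximation. Sample counts are hypothetical.
from math import sqrt
from statistics import NormalDist

p0 = 0.10          # hypothesized population proportion (H0: p = .10)
x, n = 15, 100     # invented sample: left-handers / students surveyed

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic under H0
p_value = 1 - NormalDist().cdf(z)            # one-sided: HA is p > .10

print(round(z, 3), round(p_value, 3))
```

Because the alternative is one-sided (p > .10), only the upper tail of the normal distribution contributes to the p-value.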

Example 10.3: Hypotheses with One Sample of One Measurement Variable

A generic brand of the anti-histamine Diphenhydramine markets a capsule with a 50 milligram dose. The manufacturer is worried that the machine that fills the capsules has come out of calibration and is no longer creating capsules with the appropriate dosage.

  • Research Question : Does the data suggest that the population mean dosage of this brand is different than 50 mg?
  • Response Variable : dosage of the active ingredient found by a chemical assay.
  • Null Hypothesis : On the average, the dosage sold under this brand is 50 mg (population mean dosage = 50 mg).
  • Alternative Hypothesis : On the average, the dosage sold under this brand is not 50 mg (population mean dosage ≠ 50 mg). This is a two-sided alternative hypothesis.
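A two-sided test like the one in Example 10.3 is typically a one-sample t-test. The sketch below uses invented dosages, and the critical value 2.262 (t distribution, df = 9, alpha = .05, two-sided) is taken from a standard t table.

```python
# Sketch of the two-sided, one-sample t-test in Example 10.3.
# The dosages are invented; the critical value comes from a t table.
from math import sqrt
from statistics import mean, stdev

doses = [49.2, 50.8, 50.1, 49.5, 51.3, 50.6, 48.9, 50.2, 49.8, 50.4]
mu0 = 50.0                                  # H0: population mean = 50 mg

n = len(doses)
t = (mean(doses) - mu0) / (stdev(doses) / sqrt(n))   # test statistic
t_crit = 2.262           # two-sided critical value, df = n - 1 = 9

print(round(t, 3), abs(t) > t_crit)         # reject H0 only if True
```

Because the alternative is two-sided (mean ≠ 50 mg), the absolute value of t is compared against the critical value, so either a too-high or a too-low mean dosage can trigger rejection.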

Example 10.4: Hypotheses with Two Samples of One Categorical Variable

Many people are starting to prefer vegetarian meals on a regular basis. Specifically, a researcher believes that females are more likely than males to eat vegetarian meals on a regular basis.

  • Research Question : Does the data suggest that females are more likely than males to eat vegetarian meals on a regular basis?
  • Response Variable : Classification of whether or not a person eats vegetarian meals on a regular basis
  • Explanatory (Grouping) Variable: Sex
  • Null Hypothesis : There is no sex effect regarding those who eat vegetarian meals on a regular basis (population percent of females who eat vegetarian meals on a regular basis = population percent of males who eat vegetarian meals on a regular basis or p females = p males ).
  • Alternative Hypothesis : Females are more likely than males to eat vegetarian meals on a regular basis (population percent of females who eat vegetarian meals on a regular basis > population percent of males who eat vegetarian meals on a regular basis or p females > p males ). This is a one-sided alternative hypothesis.
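A two-sample comparison of proportions like Example 10.4 is commonly done with the pooled two-proportion z-test. The survey counts below (60 of 200 females, 40 of 200 males) are invented for illustration.

```python
# Sketch of the one-sided two-proportion z-test for Example 10.4,
# pooling the proportions under H0. Survey counts are hypothetical.
from math import sqrt
from statistics import NormalDist

x_f, n_f = 60, 200   # invented: females eating vegetarian regularly
x_m, n_m = 40, 200   # invented: males eating vegetarian regularly

p_f, p_m = x_f / n_f, x_m / n_m
p_pool = (x_f + x_m) / (n_f + n_m)          # pooled estimate under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n_f + 1 / n_m))
z = (p_f - p_m) / se
p_value = 1 - NormalDist().cdf(z)           # one-sided: p_f > p_m

print(round(z, 3), round(p_value, 4))
```

Pooling is appropriate here because the null hypothesis asserts that the two population proportions are equal, so a single combined estimate is used in the standard error.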

Example 10.5: Hypotheses with Two Samples of One Measurement Variable

Obesity is a major health problem today. Research is starting to show that people may be able to lose more weight on a low carbohydrate diet than on a low fat diet.

  • Research Question : Does the data suggest that, on the average, people are able to lose more weight on a low carbohydrate diet than on a low fat diet?
  • Response Variable : Weight loss (pounds)
  • Explanatory (Grouping) Variable : Type of diet
  • Null Hypothesis : There is no difference in the mean amount of weight loss when comparing a low carbohydrate diet with a low fat diet (population mean weight loss on a low carbohydrate diet = population mean weight loss on a low fat diet).
  • Alternative Hypothesis : The mean weight loss should be greater for those on a low carbohydrate diet when compared with those on a low fat diet (population mean weight loss on a low carbohydrate diet > population mean weight loss on a low fat diet). This is a one-sided alternative hypothesis.

Example 10.6: Hypotheses about the relationship between Two Categorical Variables

  • Research Question : Do the odds of having a stroke increase if you inhale secondhand smoke? In a case-control study, non-smoking stroke patients and controls of the same age and occupation are asked if someone in their household smokes.
  • Variables : There are two different categorical variables (Stroke patient vs control and whether the subject lives in the same household as a smoker). Living with a smoker (or not) is the natural explanatory variable and having a stroke (or not) is the natural response variable in this situation.
  • Null Hypothesis : There is no relationship between whether or not a person has a stroke and whether or not a person lives with a smoker (odds ratio between stroke and second-hand smoke situation is = 1).
  • Alternative Hypothesis : There is a relationship between whether or not a person has a stroke and whether or not a person lives with a smoker (odds ratio between stroke and second-hand smoke situation is > 1). This is a one-tailed alternative.

This research question might also be addressed as in Example 10.4 by making the hypotheses about comparing the proportion of stroke patients that live with smokers to the proportion of controls that live with smokers.

Example 10.7: Hypotheses about the relationship between Two Measurement Variables

  • Research Question : A financial analyst believes there might be a positive association between the change in a stock's price and the amount of the stock purchased by non-management employees the previous day (stock trading by management being under "insider-trading" regulatory restrictions).
  • Variables : Daily price change information (the response variable) and previous day stock purchases by non-management employees (explanatory variable). These are two different measurement variables.
  • Null Hypothesis : The correlation between the daily stock price change (\$) and the daily stock purchases by non-management employees (\$) = 0.
  • Alternative Hypothesis : The correlation between the daily stock price change (\$) and the daily stock purchases by non-management employees (\$) > 0. This is a one-sided alternative hypothesis.

Example 10.8: Hypotheses about comparing the relationship between Two Measurement Variables in Two Samples

  • Research Question : Is there a linear relationship between the amount of the bill (\$) at a restaurant and the tip (\$) that was left? Is the strength of this association different for family restaurants than for fine dining restaurants?
  • Variables : There are two different measurement variables. The size of the tip would depend on the size of the bill so the amount of the bill would be the explanatory variable and the size of the tip would be the response variable.
  • Null Hypothesis : The correlation between the amount of the bill (\$) at a restaurant and the tip (\$) that was left is the same at family restaurants as it is at fine dining restaurants.
  • Alternative Hypothesis : The correlation between the amount of the bill (\$) at a restaurant and the tip (\$) that was left is different at family restaurants than it is at fine dining restaurants. This is a two-sided alternative hypothesis.

  • Open access
  • Published: 24 August 2024

An output-null signature of inertial load in motor cortex

  • Eric A. Kirk 1 ,
  • Keenan T. Hope 1 ,
  • Samuel J. Sober 2 &
  • Britton A. Sauerbrei   ORCID: orcid.org/0000-0003-3386-3243 1  

Nature Communications volume  15 , Article number:  7309 ( 2024 ) Cite this article

  • Motor cortex
  • Motor neuron

Coordinated movement requires the nervous system to continuously compensate for changes in mechanical load across different conditions. For voluntary movements like reaching, the motor cortex is a critical hub that generates commands to move the limbs and counteract loads. How does cortex contribute to load compensation when rhythmic movements are sequenced by a spinal pattern generator? Here, we address this question by manipulating the mass of the forelimb in unrestrained mice during locomotion. While load produces changes in motor output that are robust to inactivation of motor cortex, it also induces a profound shift in cortical dynamics. This shift is minimally affected by cerebellar perturbation and significantly larger than the load response in the spinal motoneuron population. This latent representation may enable motor cortex to generate appropriate commands when a voluntary movement must be integrated with an ongoing, spinally-generated rhythm.

Introduction

The ability to perform the same movement repeatedly in a changing environment is a hallmark of skilled motor control. Inertial load is a key environmental variable which changes with the distribution of mass across the body and must be countered with appropriately-scaled motor commands. For example, raising a coffee cup to the lips when the cup is empty and full requires different patterns of muscle activity. Similarly, the motor output generated during walking in bare feet must be adjusted when heavy boots and a backpack are worn. Such adjustments pose a demanding challenge for neural control, which is distributed across multiple interacting systems, including the motor cortex, cerebellum, brainstem, spinal cord, and muscle receptors (Fig.  1a ).

figure 1

a Block diagram illustrating key circuits involved in adaptation to mechanical loads. b Experimental rig. Mice were trained to trot on a motorized treadmill at 20 cm/s. Behavior was captured with four synchronized cameras, and electromyograms (EMG) were recorded in the biceps brachii and triceps brachii muscles. c Kinematics and EMG during locomotion without a load. Upper: 3D pose estimates, with swing onset indicated in green rectangles. Lower: upward position (magenta) and forward velocity (green) of fingertip, and raw and smoothed biceps and triceps EMG (gray). d Kinematics and EMG during locomotion with a 0.5 g load attached to the wrist. e Smoothed, step-aligned biceps EMG, triceps EMG, and forward finger velocity over one experimental session. f Median biceps activity, triceps activity, and forward finger velocity for all sessions, load-on vs. load-off ( n  = 34 sessions, n  = 7 mice). The range for the number of sessions per animal was 3–6. Lines indicate bootstrapped 95% confidence intervals. g Time course of EMG amplitude changes. Upper: biceps and triceps amplitudes across a single session. Points correspond to individual steps, and lines indicate a loess estimate of the trend. Lower: loess estimates for all sessions. To enable comparison across sessions, each curve was z-scored and stretched to unit duration. Figure  1 b–d adapted from Tyler, E., & Kravitz, L. (2020). walking mouse. Zenodo. https://doi.org/10.5281/zenodo.3925915 . https://creativecommons.org/licenses/by/4.0/ .

In the context of voluntary movement, studies in nonhuman primates have demonstrated that the motor cortex drives the generation of forces to move the upper limb and to compensate for loads 1 , 2 , and cortical neurons are modulated by force magnitude and direction 3 , 4 , 5 , 6 . Furthermore, several observations suggest that load-related responses in the motor cortex might be driven, in part, by ascending cerebellar input. Cooling of the cerebellar dentate nucleus attenuates long-latency motor cortical responses to impulse torques during voluntary movement 7 , 8 , though cortical activity during holding against a load is minimally affected by this manipulation 9 , and disruption of the cerebellar outflow with high-frequency electrical stimulation can partially suppress cortical activity in an isometric wrist task 10 . On the other hand, firing rates in cerebellar Purkinje cells closely resemble kinematics, but are altered only slightly with changes in force required to move the hand against viscous or elastic loads 11 . In mice, the forelimb is approximately three orders of magnitude less massive than in primates, and inertial forces may be relatively less important in control. The mouse behavioral repertoire and manual dexterity are also more limited. Nonetheless, mice can be trained to perform precise reaching and grasping tasks. The forelimb regions of the sensorimotor cortex are not necessary for pulling a lever against a load 12 , but are required for adaptation to a force applied to a pulled lever over dozens of trials 13 , and for the initiation and execution of reach-to-grasp movements 12 , 14 , 15 , 16 .

The complexity and heterogeneity of the motor cortical population pose a significant challenge to understanding its role in control 17 , 18 . For example, a neuron’s response to load during reaching cannot be accurately predicted from its load sensitivity during posture 6 , and directional tuning can change substantially between movement preparation and execution 19 and throughout a reach 17 . A powerful emerging approach to this complexity focuses less on the information represented by individual neurons and more on the coordinated dynamics across the cortical population, how these dynamics are related to features of the task, and how they are generated by interactions across brain areas and with the sensory periphery 15 , 20 , 21 , 22 , 23 . This approach has helped explain several perplexing features of cortical activity, such as the observation that large changes in firing rate can occur during motor preparation without evoking movement. As a movement is planned, cortical activity changes in directions, termed output-null dimensions, along which the net effect of cortical output on muscle activity is constant 24 , 25 . These changes enable the cortical population state to be set to the appropriate initial condition from which activity can evolve during movement execution. More broadly, a growing body of work 26 has shown that output-null dynamics are a key mechanism for preparing movements 24 , 25 , correcting errors online 27 , and learning 28 , 29 , 30 without producing aberrant muscle activity.

Given the central role of motor cortical dynamics in voluntary limb movements, how might these dynamics contribute to load compensation in rhythmic movements which are coordinated by an intrinsic spinal network? In mammalian overground locomotion, a spinal central pattern generator (CPG) governs the basic pattern of flexor-extensor and left-right limb alternation, can operate independently of the brain and sensory feedback 31 , 32 , 33 , and is controlled by networks in the midbrain and brainstem that determine locomotor initiation and speed 34 , 35 , 36 . The motor cortex is not necessary for locomotion over a flat surface, but is required when precision demands are increased during steps over obstacles or across a horizontal ladder 12 , 37 , 38 , 39 , 40 . Some adjustments for mechanical load are implemented by subcortical structures: in walking premammillary cats, for instance, loading of an ankle extensor tendon increases the activation of the corresponding muscle during stance, and can suppress the CPG when large forces are applied 41 . Nonetheless, the rhythmic, step-entrained activity of some cortical cells, including pyramidal tract neurons projecting to the spinal cord and brainstem, can be modulated by speed and by loading of the limb 42 , 43 , suggesting that descending cortical signals may be important for the regulation of force during locomotion.

The present study aims to address three central questions. First, does the motor cortex drive compensation for changes in inertial load imposed on the limbs during locomotion, as it does in voluntary movement, or is this compensation instead implemented by subcortical structures? Second, how are such loads represented in motor cortical population activity, and does the representation depend on cerebellar input? Finally, how are cortical dynamics related to the output of the nervous system at the level of the spinal motoneuron population? We address these questions in unrestrained, chronically instrumented mice performing an adaptive locomotion task in which they must adjust motor output to compensate for a weight on the wrist. Our approach combines three-dimensional kinematic pose estimation, recordings from forelimb muscles, the motor cortex, spinally-innervated motor units, optogenetic perturbations, and computational approaches for modeling neural population data. We find that, although inactivation of the motor cortex does not attenuate load compensation, the dominant component of cortical population activity is a tonic shift imposed by the load, and is robust to optogenetic perturbation of the cerebellum. Furthermore, the geometric properties of cortical population activity in the task contrast strongly with those of the spinal motoneuron population. While cortical activity is significantly modulated by load, cerebellar perturbation, and animal speed, with cortical trajectories that maintain relatively low tangling across experimental conditions, consistent with noise-robust dynamics, the spinal motoneuron population is instead dominated by condition-invariant signals related to flexor-extensor alternation, and also exhibits higher trajectory tangling. 
We conclude that load-related dynamics in the motor cortex do not directly drive motor compensation during locomotion, but instead constitute a latent representation of changes to the limb mechanics, which may modulate cortical commands during voluntary gait modification or alter the gain of spinal reflexes to correct for unexpected perturbations.

Adaptation of locomotor output to changes in inertial load

Unrestrained mice were trained to trot at ~20 cm/s on a motorized treadmill as their movements were captured with four synchronized high-speed cameras (Fig.  1b ). Three-dimensional limb kinematics were measured from video using an automatic pose estimation pipeline 44 , 45 , enabling extraction of fingertip position and velocity (Fig.  1c , lower: magenta and green traces) and segmentation of the session into swing and stance epochs (Fig.  1c , lower: green boxes). Electromyograms (EMG) were recorded from forelimb flexor (biceps brachii) and extensor (triceps brachii) muscles, rectified, and smoothed (Fig.  1c , lower: gray traces). At the beginning of each session, animals ran freely for 5–20 minutes. We then imposed an inertial load on one forelimb by attaching a 0.5 g weight to the wrist, increasing the moment of inertia of the radius-ulna about the elbow, and the animals ran for a second epoch of 5–10 min. This load, which increased the total mass of the forelimb by ~50%, induced a compensatory increase in elbow flexor muscle activity during swing and a corresponding suppression of extensor activity during stance (Fig.  1d ). The compensation was consistent across step cycles (Fig.  1e ) and sessions (Fig.  1f ; signed rank test, p  = 8.4e-6 for biceps, p  = 7.1e-3 for triceps). Finger velocity was, on average, slightly higher in the loaded condition (Fig.  1f ; p  = 5.7e-4), consistent with modest overcompensation for the load. Furthermore, in contrast with adaptation to a split-belt treadmill, which unfolds over many successive steps and requires the cerebellum 46 , 47 , this adaptation appeared to occur almost instantaneously after the load was applied (Fig.  1g ).

Load compensation is robust to perturbation of the motor cortex and cerebellum

Adjustment of motor output in different tasks requires distinct contributions from motor cortex 43 , 48 , cerebellum 49 , and cerebellar inputs to cortex 8 . To determine whether the observed compensation for inertial load requires the motor cortex and cerebellum, we used an optogenetic approach to transiently inactivate each brain area during the task. Motor cortical perturbation experiments were performed in VGAT-ChR2-EYFP mice, which express the light-gated ion channel ChR2 selectively in inhibitory interneurons, enabling robust suppression of cortical output following illumination of the brain surface with blue light 15 , 50 . An optical fiber was implanted over the forelimb motor cortex (Fig.  2a , left), and animals performed treadmill locomotion with and without a 0.5 g weight on the contralateral forelimb as laser stimulation was delivered intermittently to suppress motor cortical activity (473 nm, 40 Hz, 1 s stimulus duration, randomized 1–6 s delay between stimuli). While the load induced an increase in elbow flexor muscle activity during swing and a decrease in extensor activity during stance, cortical perturbation did not attenuate this compensation (Fig.  2a , center and right). We next tested the effects of cerebellar perturbation on motor output by implanting a fiber over the forelimb area of the pars intermedia ipsilateral to the loaded forelimb in L7Cre-2 x Ai32 mice (Fig.  2b , left), which express ChR2 selectively in Purkinje cells and allow suppression of cerebellar output during laser stimulation 51 . Cerebellar perturbation did not erase the adaptation of motor output to the load; on the contrary, it produced a modest enhancement of flexor muscle activity and attenuation of extensor activity (Fig.  2b , center and right and Supplementary Fig.  1c, d ). To quantify the effects of load, optogenetic perturbation, and speed on motor output, we fit a linear model for each experimental session and examined the distribution of the resulting coefficients (Fig.  2c and Supplementary Fig.  1c, d ; see Methods). The load had a significant positive effect on elbow flexor EMG and a negative effect on extensor EMG (signed-rank test, q  < 0.05), and step frequency had positive effects on both flexor and extensor EMG. The interaction terms between load and both optogenetic perturbations were centered at zero, indicating that these perturbations failed to erase the adaptation of motor output to changes in load. Overall, these results show that load compensation in the task does not require normal motor cortical or cerebellar output.
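The session-wise model described above amounts to an ordinary least-squares regression with an interaction term. The exact specification is in the paper's Methods; the sketch below uses simulated data and our own variable names, with the interaction coefficient near zero when the perturbation spares the compensation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps = 200

# Simulated per-step predictors: load (0/1), optogenetic perturbation (0/1),
# and step frequency (Hz)
load = rng.integers(0, 2, n_steps)
stim = rng.integers(0, 2, n_steps)
freq = rng.uniform(2.0, 6.0, n_steps)

# Simulated flexor EMG: a positive load effect, a positive frequency effect,
# and no load x stim interaction (perturbation does not alter the load response)
emg = 1.0 + 0.8 * load + 0.2 * freq + rng.normal(0.0, 0.1, n_steps)

# Design matrix: intercept, load, stim, load x stim interaction, step frequency
X = np.column_stack([np.ones(n_steps), load, stim, load * stim, freq])
beta, *_ = np.linalg.lstsq(X, emg, rcond=None)
print(beta)  # the interaction coefficient (beta[3]) sits near zero
```

Collecting such coefficients across sessions and testing them against zero yields the kind of summary shown in Fig. 2c.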

figure 2

a Effect of load and motor cortical perturbation on biceps and triceps EMG in a single session. Each contour corresponds to the average step-locked EMG in one of four load and optogenetic perturbation conditions. The angle represents the phase of the step cycle, and the radius represents the EMG magnitude at the corresponding step phase. Bold = load, Color = optogenetic perturbation. EMG was normalized in each session based on the distribution of within-step peaks (see Methods). b Effect of load and cerebellar perturbation on biceps and triceps EMG in a single session. c Regression coefficients estimating the effects of load, optogenetic perturbation, the interaction between load and optogenetic perturbation, and step frequency on biceps and triceps EMG (motor cortical perturbation in VGAT-ChR2-EYFP animals: n  = 4 mice, n  = 18 sessions; cerebellar perturbation in L7Cre-2 x Ai32 animals: n  = 3 mice, n  = 16 sessions). Each point corresponds to a single experimental session; lines denote 95% confidence intervals. Figure 2a, b adapted from Thompson, E. (2020). Mouse brain above & side. Zenodo. https://doi.org/10.5281/zenodo.3925987 . https://creativecommons.org/licenses/by/4.0/ .

Load and cerebellar perturbation modulate motor cortical activity

The finding that muscle activity was unaffected by cortical inactivation could be explained simply by a lack of cortical responsiveness to load. Alternatively, the motor cortex might track changes in limb mass by shifting its state along output-null dimensions, that is, dimensions that do not directly influence muscle activity. We note that because cortical inactivation not only failed to erase load-related changes in EMG but also had no detectable effect on locomotor performance, any cortical activity in the task should be confined to the null space. Furthermore, in the latter case, load-related activity in the cortex might be driven by input from the cerebellum, or by input from other sources. To distinguish between these explanations, we chronically implanted high-density silicon probes in the motor cortex of L7Cre-2 x Ai32 mice, along with an optical fiber over the contralateral forelimb area of the cerebellar pars intermedia (Fig.  3a ). Mice then performed the adaptive locomotion task as we recorded limb kinematics and cortical spiking ( n  = 710 neurons, n  = 2 mice) while intermittently perturbing the cerebellum by stimulating Purkinje cells (473 nm, 40 Hz, 1 s stimulus duration, randomized 1–6 s delay between stimuli). Most neurons were synchronized with the locomotor rhythm ( n  = 618/710, 87.0%, q < 0.05, Rayleigh test with Benjamini–Hochberg correction for multiple comparisons), consistent with studies in cats 37 , 43 and primates 52 , 53 . The effects of load and cerebellar perturbation were highly diverse across neurons. Firing rates for some cells were modulated by load (neurons 1–3, Fig.  3b ), by Purkinje cell stimulation (neuron 5), or by both (neuron 6), while effects were relatively modest for others (neuron 4). 
While the range of patterns observed in both animals was qualitatively similar, with many neurons exhibiting responses to load and cerebellar perturbation, the former were more numerous in one mouse, and the latter in the other (Supplementary Fig.  2 ). Overall, 47.7% of neurons exhibited changes related to load, 24.1% to Purkinje cell stimulation, and 10.6% to both, while interaction between load and Purkinje cell stimulation occurred in only 2.7% of cells (multi-way ANOVA for each neuron, q < 0.05). Among the load-sensitive neurons, 46.0% had higher firing rates in the load-on condition; among the neurons sensitive to Purkinje cell stimulation, 71.9% likewise showed higher firing rates (Fig.  3c, d ).
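The Rayleigh test used above to assess entrainment to the locomotor rhythm has a compact form. A minimal reimplementation on synthetic spike phases follows (the p-value uses a standard first-order approximation, not necessarily the paper's exact routine):

```python
import numpy as np

def rayleigh_test(phases):
    """Test spike phases (radians) for non-uniformity around the step cycle.

    Returns the mean resultant length R and an approximate p-value for the
    null hypothesis that phases are uniformly distributed.
    """
    n = len(phases)
    R = np.abs(np.mean(np.exp(1j * np.asarray(phases))))
    z = n * R**2
    # First-order approximation to the Rayleigh p-value
    p = np.exp(-z) * (1 + (2 * z - z**2) / (4 * n))
    return R, max(min(p, 1.0), 0.0)

rng = np.random.default_rng(1)
locked = rng.vonmises(np.pi, 4.0, 500)    # phase-locked "neuron"
uniform = rng.uniform(0, 2 * np.pi, 500)  # unmodulated "neuron"
print(rayleigh_test(locked))   # small p: entrained to the rhythm
print(rayleigh_test(uniform))  # large p: no detectable entrainment
```

Applying this per neuron and correcting the resulting p-values (here, Benjamini–Hochberg) gives the entrainment counts reported in the text.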

figure 3

a Left: mice ( n  = 2, L7Cre-2 x Ai32) were chronically implanted with silicon probes in the motor cortex and an optical fiber over the contralateral cerebellar cortex for stimulation of Purkinje cells. Right: raw data showing seven neurons recorded from the motor cortex during locomotion. b Firing rates and spike rasters for six motor cortical neurons recorded across sessions and mice showing different responses to load and cerebellar perturbation in the adaptive locomotion task. Bold = Load, Color = Purkinje cell stimulation. c Effect of load and cerebellar perturbation on motor cortical neurons ( n  = 710). Each row corresponds to a single neuron, and displays the difference in z-scored firing rate between load on and off conditions (left) and between Purkinje cell stimulation on and off (right). Neurons are grouped based on the detection of an effect of load (black bar), stimulation (light blue bar), both (dark blue bar), or neither (remaining neurons). d Mean firing rates for all neurons in load on and off conditions (left) and Purkinje cell stimulation on and off (right). Color code reflects the detection of effects of load, stimulation, or both, as in ( c ). Bars indicate 95% confidence intervals for the mean. Figure 3a adapted from Tyler, E., & Kravitz, L. (2020). walking mouse. Zenodo. https://doi.org/10.5281/zenodo.3925915 . https://creativecommons.org/licenses/by/4.0/ . Figure 3a adapted from Thompson, E. (2020). Mouse brain above & side. Zenodo. https://doi.org/10.5281/zenodo.3925987 . https://creativecommons.org/licenses/by/4.0/ .

Cortical population dynamics in adaptive locomotion

Because the effects of load and cerebellar perturbation were heterogeneous across the sample of cortical cells, we next aimed to identify the coordinated, low-dimensional dynamics across the population. To extract a low-dimensional representation of cortical population dynamics in interpretable, task-relevant coordinates, we used demixed principal component analysis (dPCA; see Methods), which decomposes neural activity into dimensions related to specific experimental parameters while capturing most of the variance in the original firing rates 54 . Because animals frequently modulated their speed on the treadmill by starting and stopping locomotion, potentially influencing cortical firing rates, we included speed as a pseudo-experimental variable, in addition to load and cerebellar perturbation. For each cortical neuron, the average step-aligned firing rate was measured in twenty conditions: load on/off (two levels) × Purkinje cell stimulation on/off (two levels) × animal speed (five levels). Next, we used dPCA to find a decoder matrix that mapped the firing rates for all neurons onto a 20-dimensional latent variable space. This model explained 92.9% of the total firing rate variance and yielded scores parameterized by step phase for each dimension and condition (Fig.  4a–c ), as well as an encoder matrix that reconstructs the measured firing rates from these scores. We observed condition-invariant signals that were modulated by step phase, but did not differ strongly across experimental parameters (Fig.  4a , X-Y axes; Fig.  4b , first and second columns, first row). The first condition-invariant dimension was roughly sinusoidal, with a period of one stride and a peak near the swing-stance transition, while the second was qualitatively similar except for a phase shift, with a peak in mid-swing. Taken together, the condition-invariant components accounted for 28.8% of the explained variance in cortical firing rates. 
Animal speed had a moderate effect on cortical dynamics (18.9% of the variance), but this was distributed broadly across multiple dimensions (Fig.  4b , first column, second row; Fig.  4c ). The largest speed component consisted of tonic shifts in activity, with little dependence on the step phase. Dynamics in this dimension and the top two condition-invariant dimensions, therefore, yielded stacked elliptical trajectories that translated continuously with movement speed (Fig.  4a , right), reminiscent of motor cortical dynamics in primates performing a rhythmic cycling task 55 .
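Full dPCA fits regularized decoder and encoder matrices; the marginalization idea at its core can be sketched more simply. The code below builds a synthetic condition-averaged tensor with the same 2 × 2 × 5 condition structure described above, injects a tonic load-dependent shift, and shows how marginalizing over the other parameters isolates the variance attributable to load (all names are ours; this is an illustration of the decomposition, not the paper's pipeline):

```python
import numpy as np

rng = np.random.default_rng(2)
n_neurons, n_load, n_stim, n_speed, n_phase = 50, 2, 2, 5, 40

# Condition-averaged firing rates: neurons x load x stim x speed x step phase
X = rng.normal(size=(n_neurons, n_load, n_stim, n_speed, n_phase))
# Inject a tonic, load-dependent shift along one random neural direction
w = rng.normal(size=n_neurons)
w /= np.linalg.norm(w)
X[:, 1] += 20.0 * w[:, None, None, None]

Xc = X - X.mean(axis=(1, 2, 3, 4), keepdims=True)   # center each neuron

# Marginalize: condition-invariant (phase-only) part, then the load part,
# i.e., the mean over stim and speed with the phase-only part removed
cond_inv = Xc.mean(axis=(1, 2, 3), keepdims=True)
load_marg = Xc.mean(axis=(2, 3), keepdims=True) - cond_inv

# Fraction of total variance captured by the load marginalization
frac_load = (load_marg**2).sum() * (n_stim * n_speed) / (Xc**2).sum()
print(f"fraction of variance in the load marginalization: {frac_load:.2f}")
```

dPCA then finds separate decoder axes that preferentially capture each marginalization while still reconstructing the full firing rates; this sketch shows only the variance bookkeeping behind the percentages quoted in the text.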

figure 4

a Step-aligned neural trajectories obtained by demixed principal component analysis (dPCA). The x and y axes in each panel represent population activity in the first two condition-invariant (CI) dimensions across twenty conditions. The z axes represent the first load dimension (left), Purkinje cell stimulation dimension (center), and speed dimension (right). b Step-aligned neural activity across CI, load, Purkinje cell stimulation, speed, and interaction dimensions. c Inner product between the principal axes (left, upper triangular), probability that the inner product is larger than the observed value for randomly-oriented vectors (left, lower triangular), and fraction of variance explained by each dimension (right). d Projection of firing rates aligned to Purkinje cell stimulation onto load dimensions (upper) and Purkinje cell stimulation dimensions (lower).

The largest single component of cortical activity, however, was related almost purely to inertial load (Fig.  4a , left; Fig.  4b , third column, first row), accounting for 22.6% of the explained variance in firing rate. This component depended only weakly on step phase, and consisted of a tonic shift in activity between the load-on and load-off conditions, consistent with patterns observed in individual neurons (cf. cells 1, 3, and 6 in Fig.  3b ). Because inactivation of motor cortex had no effect on muscle activity (Fig.  2 ), we conclude that cortical activity in all measured conditions and dimensions, including the load-related dimensions, was confined to the output-null subspace. Because prior work has shown cortical compensation for load during voluntary upper limb movements can be influenced by ascending cerebellar drive 7 , 8 , we next tested whether the load signals observed during locomotion were cerebellum-dependent by examining several consequences of cerebellar perturbation. First, the effect of Purkinje cell stimulation was concentrated primarily in a single dimension (11.8% of the variance; Fig.  4c ) and, like the load effect, consisted of a tonic shift in activity (Fig.  4a , center; Fig.  4b , fourth column, first row). Second, the principal axes with the largest effects of load and cerebellar perturbation were not closely aligned (inner product −0.45; Fig.  4c , upper triangular matrix), though we failed to reject the null hypothesis that their relative orientations were random ( p  = 0.31, exact test based on beta distribution; Fig.  4c , lower triangular matrix). Third, the interaction between load and cerebellar perturbation was small, accounting for only 1.0% of firing rate variance (Fig.  4b , second column, second row; Fig.  4c ). Fourth, activity in the top load-related dimension was not partitioned by cerebellar perturbation; instead, trajectories were tightly grouped within each load condition (Fig.  4a , left; Fig.  4b , third column, first row). Fifth, projection of firing rates aligned to the onset of cerebellar perturbation onto the top load dimension revealed a minimal response (Fig.  4d , upper), while projection onto the first Purkinje cell stimulation dimension produced a large signal that was sustained throughout the stimulus train (Fig.  4d , lower). Finally, an analysis of population activity using principal component analysis, which provides an orthonormal basis, revealed that load and Purkinje cell stimulation induced nearly orthogonal shifts in neural trajectories (Supplementary Fig.  3a–d ). Taken together, these observations support the hypothesis that the cortical representation of load in the adaptive locomotion task is independent of cerebellar input.

Effects of load and cerebellar perturbation on spinal motoneuron dynamics

Intuitively, it might be expected that a load representation in the cortical null space would be small relative to the changes in muscle activity required to move the limb against a load. Ideally, this intuition would be tested against recordings at cellular resolution from spinal motoneurons, which forward the ultimate results of central computations to muscles that actuate the body. In healthy motor units, muscle fiber action potentials are tightly locked to action potentials in the corresponding motoneuron, and motor unit potentials recorded in the muscle enable the measurement of motoneuron spike trains. Thus, we implanted flexible fine wire electrodes and high-density electrode arrays in the forelimb muscles, enabling us to record motor output at the resolution of individual spinally-innervated motor units in the adaptive locomotion task (Fig.  5a ; n  = 108 motor units, n  = 27 sessions, n  = 6 L7Cre-2 x Ai32 mice). Of the six animals, three were implanted with traditional fine wire EMG electrodes 12 , 56 , 57 , and three were implanted with Myomatrix arrays 58 designed to record forelimb muscles in the mouse. This approach allowed a comparison of motor unit yield between the two electrode designs. As detailed in the Methods, mice were each implanted with either four twisted fine wire EMG electrodes or with a single Myomatrix array with four “threads”. In two mice, all four Myomatrix threads were implanted in target muscles, and in the third mouse, only two of the four threads were successfully implanted. Animals implanted with fine wire EMGs performed a total of 14 recording sessions yielding 35 motor units (mean 2.5, min-max 1–5, per session; n  = 3 mice). Animals implanted with Myomatrix devices performed 13 recording sessions, yielding 73 motor units (mean 5.6, min-max 2–10, per session; n  = 3 mice). 
These results indicate a significant increase in motor unit yield from Myomatrix electrode arrays compared to traditional fine wire methods ( p  < 0.01, two-sample Kolmogorov–Smirnov test).
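The yield comparison above can be reproduced with a standard two-sample Kolmogorov–Smirnov test. The per-session counts below are hypothetical, constructed only to match the reported totals, means, and ranges (35 units over 14 fine wire sessions; 73 units over 13 Myomatrix sessions), not taken from the paper's data:

```python
import numpy as np
from scipy import stats

# Hypothetical per-session motor unit yields (mean 2.5, range 1-5 over 14
# sessions; mean ~5.6, range 2-10 over 13 sessions, as reported in the text)
fine_wire = np.array([1, 2, 2, 2, 3, 3, 2, 3, 4, 5, 1, 2, 3, 2])
myomatrix = np.array([2, 4, 5, 6, 7, 5, 6, 4, 8, 10, 5, 6, 5])

# Two-sample KS test compares the empirical distributions of session yields
stat, p = stats.ks_2samp(fine_wire, myomatrix)
print(stat, p)  # well-separated yield distributions give a small p-value
```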

figure 5

a Left: mice ( n  = 6, L7Cre-2 x Ai32) were chronically implanted with fine wire electrodes or Myomatrix electrode arrays in the biceps brachii and triceps brachii muscles and an optical fiber over the ipsilateral cerebellar cortex for stimulation of Purkinje cells. Right: raw motor unit data from the right and left triceps during locomotion recorded from an implanted Myomatrix array. b Firing rates and spike rasters for six spinally-innervated motor units in the adaptive locomotion task. Units 1 and 2 were recorded from the biceps ipsilateral to the load, and units 3–6 from the ipsilateral triceps. Bold = Load, Color = optogenetic perturbation. c Effect of load and cerebellar perturbation on motor units ( n  = 108). Each row corresponds to a single motor unit, and displays the difference in z-scored firing rate between load on and off conditions (left) and between Purkinje cell stimulation on and off (right). Units are grouped based on the detection of an effect of load (black bar), stimulation (light blue bar), both (dark blue bar), or neither (remaining units). d Mean firing rates for all motor units in load on and off conditions (left) and Purkinje cell stimulation on and off (right). Color code reflects the detection of effects of load, stimulation, or both, as in ( c ). Bars indicate 95% confidence intervals for the mean. Figure 5a adapted from Tyler, E., & Kravitz, L. (2020). walking mouse. Zenodo. https://doi.org/10.5281/zenodo.3925915 . https://creativecommons.org/licenses/by/4.0/ . Figure 5a adapted from Thompson, E. (2020). Mouse brain above & side. Zenodo. https://doi.org/10.5281/zenodo.3925987 . https://creativecommons.org/licenses/by/4.0/ .

Inertial load and cerebellar perturbation were applied as in the cortical recording experiments. Motor units were more strongly entrained to the locomotor rhythm ( n  = 108/108, q < 0.05, Rayleigh test) than cortical units, with flexor motor units activated during swing, and extensor motor units during stance (Fig.  5b ). The firing rates of 50.9% of motor units were significantly modulated by load ( n  = 55/108; q < 0.05, multi-way ANOVA; Fig.  5c, d ), 28.7% by Purkinje cell stimulation ( n  = 31/108), 20.4% by both load and stimulation ( n  = 22/108), and 7.4% by the interaction between load and stimulation ( n  = 8/108). Among the load-sensitive motor units, 38.2% had firing rate increases, while increases occurred in 71.0% of the units sensitive to Purkinje cell stimulation. To identify coordinated activity patterns at the motor unit population level, we performed dPCA as for the cortical population, and projected firing rates onto twenty dPCA decoder dimensions, which explained 94.2% of the total firing rate variance. The dominant patterns revealed by dPCA consisted of robust, condition-invariant oscillations (Fig.  6a , X-Y axes; Fig.  6b , first two columns, first row), and overall, the condition-invariant signals accounted for 70.6% of the explained firing rate variance (Fig.  6c ). The first two condition-invariant dimensions showed approximately sinusoidal oscillations with a period of one stride. Inertial load and Purkinje cell stimulation had modest effects, accounting for 7.4% and 4.6% of the variance, respectively (Fig.  6a , left and center; Fig.  6b , third and fourth columns, first row; Fig.  6c ). In contrast with cortical activity patterns, the first load and Purkinje cell stimulation dimensions for the motor unit population exhibited a clear dependence on step phase, with maximal separation between conditions in mid-stance. 
Animal speed accounted for 12.8% of the firing rate variance, with continuous, tonic shifts in the first speed dimension (Fig.  6b , first column, second row). While these patterns yielded stacked, elliptical trajectories in the first two condition-invariant dimensions and the first speed dimension (Fig.  6a , right), roughly resembling the corresponding cortical dynamics (cf. Fig.  4a , right), these spinal trajectories were less clearly separated across speed conditions than their cortical counterparts. The projection of motor unit firing rates aligned to Purkinje cell stimulation onto the dPCA axes revealed no effect on load dimensions, but a small, tonic modulation in the first two stimulation dimensions (Fig.  6d ), though these signals were small in comparison to the corresponding cortical signal (cf. Fig.  4d ). Finally, an analysis of population activity using principal component analysis, which provides an orthonormal basis, revealed that load and Purkinje cell stimulation induced smaller shifts in spinal trajectories than in cortical trajectories (Supplementary Fig.  4a–d ).

figure 6

a Step-aligned neural trajectories obtained by demixed principal component analysis (dPCA). The x and y axes in each panel represent population activity in the first two condition-invariant (CI) dimensions across twenty conditions. The z axes represent the first load dimension (left), Purkinje cell stimulation dimension (center), and speed dimension (right). b Step-aligned neural activity across CI, load, Purkinje cell stimulation, speed, and interaction dimensions. c Inner product between the principal axes (left, upper triangular), probability that the inner product is larger than the observed value for randomly-oriented vectors (left, lower triangular), and fraction of variance explained by each dimension (right). d Projection of firing rates aligned to Purkinje cell stimulation onto load dimensions (upper) and Purkinje cell stimulation dimensions (lower).

Distinct dynamics in cortical and spinal motoneuron populations

Although neural dynamics in the cortical and spinal populations had several qualitative similarities, including the shape of trajectories in the leading condition-invariant and speed dimensions, several key differences were apparent. First, the condition-invariant dimensions had similar time-varying trajectories (Figs.  4 b, 6b , first and second columns, first row), but the amount of firing rate variance explained was 2.4-fold larger in the spinal motoneuron population (70.6% and 28.8% in the spinal and cortical populations, respectively). In this sense, the spinal population response primarily reflected the locomotor rhythm, while load, speed, and Purkinje cell stimulation imposed smaller modulations on this rhythm. In the motor cortex, however, the dominant signal was related to load, and components for both Purkinje cell stimulation and speed were also prominent. This greater dominance of condition-invariant activity in spinal motor output compared with cortex in locomoting mice contrasts with findings in primates reaching to multiple targets 59 and walking over obstacles 53 , in which the condition-invariant component of cortical activity was much larger. Second, load and Purkinje cell stimulation effects in the motor cortex consisted primarily of tonic shifts, whereas the corresponding effects on spinal motoneurons were modulated by step phase. Third, neural trajectories were more clearly separated at different speeds for the cortical than for the spinal population.

To determine how the geometry of neural trajectories changed across conditions, we next asked how a change in a given experimental variable (e.g., from load-off to load-on) reshaped the trajectories by turning, moving, or stretching them. In particular, for each neural population (motor cortex and spinal motoneurons) and variable (load, Purkinje cell stimulation, and speed), we used Procrustes analysis to identify the rotation, translation, and rescaling required to map trajectories in baseline conditions (load-off, stimulation-off, and lowest speed) to the corresponding trajectories in the complementary conditions (load-on, stimulation-on, and highest speed; see Methods). This produced a concise description of how each experimental manipulation altered neural trajectory geometry. Inertial load and Purkinje cell stimulation induced large vertical translation in the cortical trajectories (Fig.  7a , left and center), but produced largely rotational effects for spinal trajectories (Fig.  7b , left and center; Fig.  7c , left and center), resulting from the modulation by step phase in the latter case. For speed, both populations displayed a combination of rotation and translation, along with a slight increase in scale (Fig.  7a–c , right).
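The decomposition of a trajectory change into rotation, translation, and rescaling can be sketched with an orthogonal Procrustes fit. The example below is synthetic: a baseline elliptical trajectory is rotated, scaled, and translated to mimic a condition change, and the transformation is then recovered (the paper's exact procedure is in its Methods; `scipy.linalg.orthogonal_procrustes` recovers the rotation, and translation and scale follow in closed form):

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(3)

# Baseline trajectory: an ellipse in a 3-D state space (one point per phase bin)
phase = np.linspace(0, 2 * np.pi, 40, endpoint=False)
base = np.column_stack([np.cos(phase), 0.5 * np.sin(phase), np.zeros(40)])

# "Condition-on" trajectory: baseline rotated, scaled, and translated, plus noise
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
target = 1.2 * base @ R_true.T + np.array([0.0, 0.0, 2.0]) \
         + 0.01 * rng.normal(size=base.shape)

# Recover translation (mean shift), then rotation and least-squares scale
t_hat = target.mean(0) - base.mean(0)
A, B = base - base.mean(0), target - target.mean(0)
R_hat, _ = orthogonal_procrustes(A, B)  # orthogonal map minimizing ||A R - B||
s_hat = np.trace(R_hat.T @ A.T @ B) / np.trace(A.T @ A)
print(t_hat, s_hat)  # translation near (0, 0, 2), scale near 1.2
```

Summarizing the recovered rotation angles, translation vectors, and scale factors per population and variable gives the kind of comparison shown in Fig. 7c.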

figure 7

a Neural trajectories and Procrustes transformations for the motor cortex population. In each panel, the vector field indicates the direction and magnitude of the Procrustes transformation required to map the trajectories from one set of conditions onto the trajectories for another set. Left: map from load-off trajectories to load-on trajectories. Center: map from stimulation-off trajectories to stimulation-on trajectories. Right: map from slowest trajectories to fastest trajectories. b Neural trajectories and Procrustes transformations for the spinal motoneuron population. Conventions as in ( a ). c Comparison of Procrustes transformations for the motor unit and cortical populations. d Comparison of trajectory tangling in motor units and motor cortex. Left: scatterplot of tangling values for motor units and cortex across all conditions and time bins. Right: histogram of differences in tangling between motor units and cortex.

We also observed that, while cortical trajectories were clearly separated across experimental conditions, spinal trajectories had greater overlap across conditions and time points. To quantify this finding, we computed a trajectory tangling index 60 (see Methods), which measures the extent to which nearby neural states have distinct derivatives. We found that trajectories in the motor cortex consistently exhibited lower tangling in comparison with the spinal motoneuron population (Fig.  7d ). Highly tangled trajectories imply dynamics that are driven by external input, while low tangling may suggest more autonomous dynamics that are robust to noise. However, the motor cortex maintains relatively low tangling despite the presence of strong signals about the state of the limbs and throughout experimental manipulation of cerebellar inputs. Thus, low tangling might constitute a mark of noise robustness even in systems that depend strongly on inputs.
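The tangling index described above has a simple closed form. The sketch below is our reimplementation of the published metric on toy 2-D trajectories (the epsilon convention of scaling with total variance is an assumption; the paper's parameters are in its Methods). A smooth circle stays untangled, while a figure-eight that crosses itself with different velocities tangles at the crossing:

```python
import numpy as np

def tangling(X, dt=1.0, eps=None):
    """Trajectory tangling Q(t) for states X of shape (T, D).

    For each time t, Q(t) is the maximum over t' of
    ||x'(t) - x'(t')||^2 / (||x(t) - x(t')||^2 + eps),
    where x' is the temporal derivative. High Q means nearby states move in
    very different directions, a signature of externally driven dynamics.
    """
    dX = np.gradient(X, dt, axis=0)
    if eps is None:
        eps = 0.1 * X.var(axis=0).sum()  # small constant scaled to total variance
    d_state = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    d_deriv = ((dX[:, None, :] - dX[None, :, :]) ** 2).sum(-1)
    return (d_deriv / (d_state + eps)).max(axis=1)

t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])        # non-crossing loop
figure8 = np.column_stack([np.sin(t), np.sin(2 * t)])   # self-crossing loop
print(tangling(circle).max(), tangling(figure8).max())
```

Computing this index per time bin and condition for the cortical and motor unit populations yields the pairwise comparison in Fig. 7d.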

In this study, we have identified a robust signature of limb inertial load in the mouse motor cortex during adaptive locomotion, which comprised the largest single component of cortical activity in the task. Because muscle activity during load compensation was unchanged by cortical inactivation, we conclude this load-related signal is not a motor command underlying the compensation, but is instead a latent, output-null representation (Fig.  8 ). Our finding that activity along the load dimension is minimally influenced by cerebellar perturbation further suggests it is not driven primarily by cerebellar projections through the ventrolateral thalamus, but likely reflects sensory signals ascending from the dorsal column nuclei via somatosensory cortex 61 , 62 . Extensive work in cats and primates has shown that responses in the motor cortex can be driven by proprioceptive and cutaneous inputs 63 , 64 , 65 , 66 , 67 , though the latter appears to be attenuated during active movements, including locomotion 68 . In mice, signals broadly distributed across the cerebral cortex can reflect spontaneous movements, and these signals are likely generated, in part, from sensory feedback 69 . In the locomoting mouse, however, it is possible that the load-related shift in the cortex may not reflect purely sensory information; given its relatively weak dependence on the locomotor phase, this shift could be a more abstract contextual signal from other cortical regions or neuromodulatory inputs, reflecting a reconfiguration of network dynamics 70 .

figure 8

a EMG experiment to measure the adaptation of muscle activity to a change in inertial load. b Optogenetic inactivation experiment to test the necessity of motor cortex for load-related changes in muscle activity. c Neural recording experiment to measure the effects of load changes on cortical dynamics. d Optogenetic perturbation experiment to measure the effect of cerebellar output on cortical activity. e Motor unit recording experiment to measure the effect of load changes on spinal motoneuron activity.

We suggest two potential interpretations of our central finding. First, an output-null representation of load may support the generation of appropriately-scaled commands when a voluntary modification of gait that requires motor cortex must be integrated with the spinally-generated locomotor program. Studies in cats and mice have shown that unimpeded locomotion on a flat surface is relatively automatic, and can be largely handled by the spinal CPG, postural reflexes, and an input specifying speed. Our data indicate that compensation for an increase in inertial load is also handled by subcortical centers, and appears to be relatively automatic. However, when voluntary adjustments are required, as when an animal must traverse a barrier or precisely place its feet on the rungs of a ladder, the motor cortex becomes essential, and generates commands to alter gait 37 , 38 , 39 , 40 . Such voluntary commands must also take into account changes to the limb mechanics: a leap over a hurdle in bare feet will require different commands than a leap in hiking boots, or while carrying a heavy backpack. Thus, we reason, the motor cortex should continually represent the parameters and state of the plant, including the mass distribution across the limbs, even when it is operating in an output-null regime, in order to subsequently generate appropriate commands when discrete, voluntary adjustments such as a step over an obstacle are required. The output-null shift in response to load we identify here is a candidate for such a cortical representation. A second possible interpretation is that the motor cortex may adjust the gain of spinal reflexes to regulate joint impedance 71 and calibrate the motor response to unexpected perturbations, as has been found for rhythmic, voluntary upper limb movements 72 and on a longer time scale during split-belt locomotor adaptation in humans 73 .

How might subcortical centers use sensory feedback to adjust muscle activity in our task? Prior studies in cats suggest positive force feedback in the homonymous muscle mediated by Ib afferents enhances extensor activation during stance as the load is increased 41 , 74 . It is possible we observe an analogous response for the flexors: the wrist weight may increase flexor tendon strain early in swing, inducing an increase in flexor activation and, subsequently, a suppression in the extensors. We expect future studies in mice will exploit the growing arsenal of tools for precise optogenetic manipulations to suppress specific classes of afferents 75 (e.g., tendon afferents and primary and secondary spindle afferents) to narrow down the circuits involved in load-compensatory responses. Viewed more broadly, our task exemplifies the general problem of adjusting periodic force profiles to counter mechanical constraints imposed by the environment. This problem is solved by neural control in humans gripping food during chewing 76 , in lungfish swimming in media of different viscosities 77 , in stick insects and cockroaches walking with variable load and substrate friction 78 , 79 , 80 , 81 , and in cats performing treadmill locomotion as the gravitational load is altered 82 . Despite their diversity, these adaptive motor programs share a strong reliance on sensory feedback, phase-dependent gating, a relatively automatic character, and implementation at lower levels of the motor hierarchy.

The motor cortical dynamics we observed share several key similarities with those reported in primates performing a voluntary cycling task 55 , 60 , 83 . Neural trajectories in the primary and dorsal premotor cortex during cycling are periodic and elliptical in the dominant dimensions, and translate continuously along an axis approximately orthogonal to the plane of rotation with changing speed. These dynamics are consistent with a cortical rhythm generator that determines movement speed and phase while driving smaller, more complex, muscle-like output commands that control movement via corticospinal projections. In locomotion, by contrast, the rhythm is generated by an intrinsic spinal circuit, and oscillatory activity in the cortical condition-invariant dimensions likely reflects sensory feedback or an efference copy from the CPG. Thus, although the condition-invariant activity in mouse spinal motoneurons qualitatively resembles the cortical dynamics, it is unlikely they are driven by cortical commands. Indeed, we observed that the inactivation of the motor cortex had little effect on either the rhythmic flexor-extensor alternation or on the additional forelimb EMG changes imposed by load.

Another feature of primate cortical dynamics during cycling is the maintenance of significantly lower trajectory tangling in comparison with muscle activity. That is, nearby neural states have similar derivatives, so cortical trajectories tend to avoid crossing one another across different time points and conditions (see Methods). Because higher tangling is a signature of external forcing, low tangling is therefore consistent with strong internal dynamics in the primate cortical network during the task. Similarly, tangling is low in the spinal CPG for scratching in the turtle, which transforms a tonic stimulus into rhythmic output through local network interactions 84 . In locomoting mice, we also observe lower levels of tangling in the motor cortex in comparison to the spinal motoneuron population, which must be driven by external inputs. This difference, however, is smaller than in the primate cycling task, consistent with a spinal rather than cortical locus of pattern generation, and with a greater role for inputs in driving cortical dynamics. In addition, cycling studies used both forward and backward rotations, which tended to increase tangling in muscle trajectories, while we tested locomotion in the forward direction only. Our alignment of neural activity to mouse step cycles might also attenuate tangling differences, as the duration of a step is relatively short (approximately 250 ms). Recent modeling and experimental work 84 has demonstrated rotational dynamics in the spinal CPG that can control the frequency and vigor of rhythmic movements while maintaining low trajectory tangling. In light of this finding, we expect tangling may arise primarily at the final stage of the motor system, in the spinal motoneuron population, and that load compensation might be achieved by increasing the rotation amplitude in the CPG network. 
Future studies could test this hypothesis in rodents by recording from the spinal interneuron population in freely moving animals, though this will require technically challenging experiments.

Our findings highlight a dissociation between the dominant patterns of motor cortical activity in a given task and the necessity of these patterns for generating motor output. Because many distinct descending and spinal pathways ultimately converge onto the same motoneurons, the problem of inferring the effects of cortical dynamics on muscle activity from simultaneous measurements of both is necessarily ill-posed. Furthermore, changes in cortical activity with experimental conditions or behavioral epochs may effectively cancel out at the motoneuronal level, enabling cortical computations to occur without influencing movement 24 , 25 . An emerging body of evidence suggests that the contribution of the motor cortex to forelimb movements can depend strongly on behavioral tasks and context. In the mouse, silencing the motor cortex has negligible effects on normal locomotion 12 , 40 , moderately impairs skilled gait modification 40 , and severely disrupts precise reach-to-grasp movements 14 , 15 , 16 . Correlations between cortical neurons and the mapping between neural and muscle activity can change substantially between tasks 12 , 85 , though work in the cat suggests this mapping is preserved between voluntary gait modification and reaching 86 . In rats, lesions to the motor cortex impair learning of an interval timing task, but do not affect performance if delivered after the task has been learned 87 , and the necessity of the motor cortex for a task can depend on the preceding training regimen 88 . Meanwhile, studies of neural population dynamics in reaching primates have emphasized the significance of cortical dimensions that are decoupled from movement and contribute to internal computations during motor preparation 24 , 25 , initiation 59 , 89 , and learning 28 , 29 .
Our results build upon these findings by identifying a robust, latent representation of limb mechanics in motor cortical population activity during the adaptation of a rhythmic movement governed by a spinal CPG.

Experimental animals and behavioral task

All experiments and procedures were approved by the Institutional Animal Care and Use Committee at Case Western Reserve University and were conducted in accordance with NIH guidelines. At the time of surgical implantation, all mice were 16–23 weeks old and weighed 24–33 g. Mice with higher body mass were selected for experiments, as they were better able to carry the implant payload on the head. A total of 12 adult mice were used for experiments, including four (male) VGAT-ChR2-EYFP line-8 strain mice (Jackson Laboratory; JAX stock #014548) and eight (six male and two female) L7Cre-2 x Ai32 strain mice (Jackson Laboratory; JAX stock #004146 and #024109). Hemizygous VGAT-ChR2-EYFP mice obtained from JAX were bred with C57Bl/6J mice to obtain experimental animals. Homozygous L7Cre-2 and homozygous Ai32 mice (both obtained from JAX) were bred to obtain experimental animals. Animals were healthy, individually housed under a 12-h light-dark cycle at 65–75 °F and 40–60% humidity, and had no prior treatment, drug, or altered-diet exposure. After surgery, animals were cared for and studied for up to 3 months.

General surgical procedures

All mice were implanted with optical fibers for optogenetic perturbation, and with either (1) fine-wire electrodes in forelimb muscles for electromyographic (EMG) recording, (2) Myomatrix arrays 58 for high-resolution recording from motor units, or (3) silicon probes in motor cortex for neural ensemble recording. The initial surgical procedures preceding the implantation of EMG or neural electrodes were similar across surgeries. Anesthesia was induced with isoflurane (1–5%, Kent Scientific), eye lubricant was applied, fur on top of the head and posterior neck was shaved, and the mouse was positioned in a stereotaxic apparatus (model 1900, KOPF Instruments) on top of a heating pad.

Under sterile technique, the top of the head was cleansed with alternating swabs of 70% ethanol and iodine surgical scrub, lidocaine (10 mg/kg) was injected under the skin on the top of the skull, the skin was removed, the periosteum on top of the skull was removed, and a custom-designed 3D-printed head post was attached with UV-cured dental cement (3M RelyX Unicem 2). Then, optical fibers and chronic recording electrodes were surgically implanted (see below). Post-surgery, the minimum recovery period was 48 h, during which Meloxicam (5 mg/kg) was administered once per day for pain management, and the investigators monitored animal behavior, body mass, and food and water intake on a daily basis. The recovery period was extended an additional 24–48 h for some animals as necessary.

Adaptive locomotion task

After at least 2 days of recovery from surgery, mice were placed on a custom-built motor-driven treadmill (46 cm long by 8 cm wide) that was controlled at fixed speeds between 10–30 cm/s (Fig.  1b ). The treadmill apparatus was enclosed in transparent acrylic, and belt speed was monitored by a rotary encoder. Locomotion was motivated through negative reinforcement with airpuffs triggered by an infrared break beam at the back of the treadmill belt. Mice were acclimated to the apparatus for up to three sessions, until they ran continuously without prompting. For the condition of unrestrained, load-adaptive locomotion, one investigator briefly scruffed the mouse while another positioned a small weight (0.5 g) on the wrist; at the conclusion of the load-on condition, the wrist weight was removed. The wrist weight was fabricated by gluing a steel ball bearing to a small zip-tie. For each animal, recording sessions were performed up to twice a day. Per session, mice ran 5–20 min in the load-off condition and 5–10 min in the load-on condition. Sessions started with the mouse running in the load-off condition, followed by load-on; in a subset of sessions ( n  = 8), a final load-off condition was performed. Each session concluded once the mouse had completed at least 5 min of continuous locomotion per condition, or was ended early due to mouse stress or upon reaching the 30-min mark.

Videography

Four synchronized high-speed cameras (Blackfly, model BFS-U3-16S2C-CS, Teledyne FLIR; Vari-Focal IP/CCTV lens, model 12VM412ASIR, Tamron) were positioned around the treadmill, with two cameras recording from each side of the treadmill belt, acquiring approximately sagittal views of the locomoting mouse. Under infrared illumination of the field, each camera was positioned to record the complete length of the treadmill belt at a frame rate of 150 Hz and a region of interest of 1440 × 210 pixels, and was triggered by an external pulse generator using custom LabVIEW code (National Instruments). Images were acquired with the SpinView GUI (Spinnaker SDK software, Teledyne FLIR).

Pose estimation during locomotion

For tracking mouse pose (i.e., anatomical landmarks) across cameras during locomotion, DeepLabCut 44 was used. The position of 22 landmarks was tracked, including the nose, eye, fingertip, wrist, elbow, shoulder, toe, foot, ankle, knee, hip, and tail on each side of the body. Separate tracking models were developed for EMG and cortical electrodes due to differences in animal appearance between the implant types. In total, 1850 and 2002 labeled frames were used for training the EMG and cortical implant models, respectively. Next, Anipose 45 was used to triangulate the 3D pose from the 2D estimates in the four cameras. Briefly, the four cameras were calibrated using simultaneously acquired images of a ChArUco board, and the 3D pose was estimated by minimizing an objective that enforced small reprojection errors, temporal smoothness, and soft constraints on the length of rigid body segments.

The pose estimates obtained from Anipose were then transformed into a natural coordinate frame: (1) forward on treadmill, (2) right on treadmill, and (3) upward against gravity. Next, the forward coordinate was unrolled by adding the cumulative displacement of the treadmill computed from the rotary encoder. This resulted in a treadmill belt-centered coordinate frame, as though the mouse was progressing along an infinitely-long track: (1) forward on treadmill, relative to the unrolled position of the back of the belt at the start of the experiment, (2) right on treadmill, and (3) upward against gravity. Sessions were then segmented into swing and stance epochs by detecting threshold crossings of the forward finger velocity and upward finger position. For a step to be included, finger velocity during swing was required to be greater than 60% of the forward and 40% of the upward median values per session. Also, for the inclusion of each step cycle, the swing phase duration was required to be 60–400 ms. The identified swing and stance time points were used for the alignment of electrophysiological recordings. For each mouse and session, the quality of the pose estimates was assessed using Anipose quality metrics, visual inspection of trajectories, and comparison of the estimated pose with the raw videos.
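The swing/stance segmentation logic described above can be sketched as follows, simplified to threshold crossings of forward finger velocity alone (the full criterion also uses upward finger position and separate forward/upward median thresholds; the 60% threshold, sampling rate, and function name below are illustrative assumptions):

```python
import numpy as np

def segment_steps(fwd_vel, fs=150.0, dur_range=(0.06, 0.4)):
    """Segment locomotion into swing epochs via threshold crossings.

    Simplified sketch: swing onset/offset are upward/downward crossings
    of forward finger velocity through a threshold set at 60% of the
    session median of positive velocities (an illustrative choice).
    Steps with swing durations outside 60-400 ms are discarded.
    """
    thresh = 0.6 * np.median(fwd_vel[fwd_vel > 0])
    above = fwd_vel > thresh
    onsets = np.flatnonzero(~above[:-1] & above[1:]) + 1
    offsets = np.flatnonzero(above[:-1] & ~above[1:]) + 1
    steps = []
    for on in onsets:
        nxt = offsets[offsets > on]
        if nxt.size and dur_range[0] <= (nxt[0] - on) / fs <= dur_range[1]:
            steps.append((int(on), int(nxt[0])))
    return steps
```

The returned sample indices would then be used to align the electrophysiological recordings to each step cycle.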

Optogenetic perturbations

Optical fibers (catalog number FT200UMT, core diameter 200 μm, ThorLabs) were glued inside ceramic ferrules (catalog number CFLC230-10, ThorLabs) and positioned onto the skull over a thin layer of transparent dental cement (Optibond, Kerr), which enabled optical access to the brain 51 , 90 . To enable transient perturbation of intermediate deep cerebellar nuclei, the optical fibers were placed bilaterally above the pars intermedia of cerebellar lobule V (bregma -6.75 mm, lateral 1.7 mm) of L7Cre-2 x Ai32 mice to stimulate Purkinje cells 91 , 92 . Higher laser power levels (>8 mW) have been shown to suppress Purkinje cell firing 93 , likely due to depolarization block; we therefore ensured that the minimum effective power was used in these mice. In separate experiments, to transiently inactivate the motor cortex, optical fibers were placed bilaterally above the forelimb area of the motor cortex (bregma +0.5 mm, lateral 1.7 mm) of VGAT-ChR2-EYFP mice to stimulate inhibitory interneurons 12 , 15 , 94 . For both implant types, all stimulation during experiments was delivered unilaterally within each session.

Optogenetic perturbation with a 473 nm wavelength laser was delivered with sinusoidal waves at 40 Hz (Opto Engine LLC). The laser was triggered by an external pulse generator controlled with custom LabVIEW software. For each mouse and genotype, laser power levels used during locomotion were calibrated to achieve a similar functional impact, with the range of laser power based on prior investigations 12 , 15 , 50 , 91 , 92 . In L7Cre-2 x Ai32 mice, optogenetic perturbation of the cerebellum at power levels greater than ~2 mW arrested mouse locomotion, and the forelimb musculature was unable to support the mouse during stance. In VGAT-ChR2-EYFP mice, by contrast, higher power levels were well tolerated during locomotion. Therefore, the power for each genotype and animal was first calibrated based on measuring EMG responses to stimulation in the home cage.

Home cage sessions (Supplementary Fig.  1a ), in which the animals spent most of their time standing or quietly exploring the cage, involved stepwise power-level adjustments of optogenetic perturbation and measurement of EMG. In L7Cre-2 x Ai32 mice, Purkinje cell stimulation (0.125–4 mW) induced suppression of forelimb flexor and extensor EMG, followed by a power-dependent rebound response after the termination of the stimulus. Therefore, laser power for behavioral sessions was adjusted within this range of effect, to a level that produced minimal rebound and did not halt locomotion (0.25–2 mW). Similar home cage calibration sessions were separately performed in VGAT-ChR2-EYFP mice, with stepwise power-level adjustments to confirm quiescent muscle activity during the stimulation of cortical inhibitory interneurons (1–12 mW). To maximize the effect of motor cortical perturbation during behavioral experiments, higher power levels were used (8–12 mW) and were not observed to halt locomotion. For home cage sessions, the stimulus duration was 0.25, 0.5, or 1 s, and interstimulus intervals were randomized between 3–10 s. During locomotion, the stimulus duration was 1 s and interstimulus intervals were randomized between 1–6 s.

Electromyogram recordings

Electromyogram (EMG) recordings of gross muscle activity from the elbow flexors and extensors were made using fine-wire 12 , 56 , 57 electrodes, and recordings from single motor units were performed with both fine-wire electrodes and high-density Myomatrix arrays 58 , 95 , 96 . For each mouse, we implanted a total of four muscle locations, targeting an elbow flexor and extensor muscle on each side. Fine-wire electrodes were made with four pairs of wires in a bipolar EMG configuration, following an established protocol 56 . Each bipolar fine-wire electrode comprised two 0.001-inch diameter, seven-stranded braided steel wires (catalog number: 793200, A-M Systems) that were crimped into a 27 gauge needle, twisted, and knotted together. For recording contacts, ~0.5–1 mm of insulation was removed per wire between the knot and needle, closer to the knot, and staggered with an inter-contact distance of ~2 mm. The open ends of the wire on the other side of the knot were soldered onto a 32-pin connector (Omnetics Nano, A79025, 36 pins, 4 guideposts), along with a gold pin cap for attachment to the ground (McMaster-Carr). Myomatrix electrodes (model number RF-4×8-BVS-5) 58 were used only to record EMG with single motor unit resolution; these electrodes had gold contacts that were plated with the conductive polymer PEDOT to reduce the impedance to the measured range of 3–23 kOhm. Fine-wire electrodes were grounded with a gold pin soldered to a stainless steel wire placed through the skull and into the brain via a craniotomy made with a dental drill ~4 mm rostral to the forelimb area of the motor cortex. The dura was left intact, Kwik-Sil (World Precision Instruments) was applied, and the pin was secured to the skull with dental cement. Myomatrix electrodes were grounded onto the skull and secured with dental cement.

For surgical implantation, the fur on the posterior neck, posterior shoulders, and both forelimbs above the elbow joint was removed using depilatory cream prior to positioning within the stereotaxic apparatus. Electrodes were implanted only after the head post, optical fibers, and ground were secured. For each forelimb, lidocaine was injected under the skin, and a 2–3 cm incision of the skin was made between the elbow and shoulder joint, along the midline axis of the lateral head of the triceps brachii muscle; the incision was subsequently kept moist with saline. Each electrode was led under the skin from the posterior neck to be separately implanted in the long head of the biceps brachii or triceps brachii muscles. For targeting the biceps brachii muscle, the forelimb was abducted, the elbow extended, and the paw supinated, whereas for targeting the triceps brachii muscle, the elbow was flexed and the paw pronated. The skin was adjusted using forceps to provide an opening over the targeted muscle, and electrodes were inserted into the muscle belly from proximal to distal. The fine-wire electrodes were inserted with the attached crimped needle; after insertion, the needle and excess distal wire were cut and a distal knot was made. For Myomatrix electrodes, a suture knot was tied onto the distal polyimide hole of each thread, which, following the suture needle, was carefully pulled into the targeted muscle belly. One Myomatrix thread was inserted per muscle. For both the fine-wire and Myomatrix electrode implants, the incised skin was then flushed with saline and sutured. The connector was then secured to the head post with dental cement, and the skin inferior to the head post was hermetically sealed with skin adhesive (3M Vetbond).

Despite targeting muscle long heads during implantations, we did not systematically differentiate EMG between the long and short heads of the biceps brachii muscle, and the EMG during locomotor swing likely comprised synergist contributions from other elbow flexor muscles, including the brachialis and coracobrachialis. Likewise, we did not differentiate EMG between the heads of the triceps brachii muscle, and it remains possible that EMG during stance had a minor synergist contribution from the dorso-epitrochlearis brachii and anconeus muscles 97 .

We implanted EMG electrodes in forelimb muscles bilaterally because, over the course of experiments, the signal-to-noise ratio would degrade and in some instances electrodes were damaged; these sessions were excluded. The forelimb with the better EMG signal-to-noise ratio and minimal crosstalk from other muscles was therefore used for experiments, determining on which side the wrist weight and optogenetic perturbations were applied. In VGAT-ChR2-EYFP mice, optogenetic silencing of the forelimb area of the motor cortex was linked to contralateral forelimb EMG and contralateral load. In L7Cre-2 x Ai32 mice, optogenetic silencing of deep cerebellar nuclei through the activation of Purkinje cells was linked to ipsilateral forelimb EMG, ipsilateral load, and contralateral cortical neuron recordings. Three (one female) L7Cre-2 x Ai32 and four VGAT-ChR2-EYFP mice were implanted with fine-wire electrodes, and three (one female) L7Cre-2 x Ai32 mice were implanted with Myomatrix electrodes. Recordings were amplified, bandpass filtered (0.01–10 kHz) using a differential amplifier, digitized (Intan RHD2216, 16-bit, 16-channel bipolar input recording headstage), and acquired at 30 kHz (Open Ephys acquisition board and software). At the conclusion of experiments on each mouse, the targeted muscles were verified post-euthanasia by dissection.

For subsequent analysis of step-aligned muscle activity, the gross EMG was high-pass filtered (200–250 Hz cutoff), rectified, and convolved with a Gaussian kernel (σ = 10 ms). To normalize the smoothed EMG signal, we first detected all peak events exceeding the 90th percentile of the full-time series. Then, the smoothed signal was divided by the median amplitude of these peaks.
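The preprocessing pipeline above can be sketched as follows. This is a minimal sketch: the filter design (second-order Butterworth, applied zero-phase) and the function name are assumptions, as the text specifies only the cutoff, rectification, smoothing kernel, and normalization rule.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import butter, filtfilt, find_peaks

def preprocess_emg(raw, fs=30000.0, hp_cutoff=250.0, sigma_ms=10.0):
    """High-pass filter, rectify, smooth, and normalize gross EMG.

    Normalization divides by the median amplitude of peaks exceeding
    the 90th percentile of the smoothed signal, as described.
    """
    b, a = butter(2, hp_cutoff / (fs / 2), btype="high")  # assumed design
    rectified = np.abs(filtfilt(b, a, raw))
    smoothed = gaussian_filter1d(rectified, sigma_ms * 1e-3 * fs)
    peaks, _ = find_peaks(smoothed, height=np.percentile(smoothed, 90))
    return smoothed / np.median(smoothed[peaks])
```

After this normalization, the median large-burst amplitude is 1, making EMG envelopes comparable across muscles and sessions.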

Motor unit spike sorting

On many EMG recordings from fine-wire electrodes, single motor units were identified (e.g., the triceps unit in Fig.  1c, d ). For these fine-wire recordings, the EMG was high-pass filtered on each channel (cutoff set between 200 and 1000 Hz, second-order Butterworth). Motor unit spike times were identified by voltage threshold and waveform template matching (Spike2 software, version 7, Cambridge Electronic Design). In the fine-wire electrodes implanted in the biceps brachii muscle, single motor units were sometimes recorded during the stance phase, possibly because the volume of the elbow flexor muscles is small relative to the extensors and the cut end of the electrode was closer to the distal aspect of the lateral triceps brachii.

For Myomatrix electrodes, each thread comprised four bipolar recording channels implanted into the same muscle, enabling correlated voltage and waveform analysis across channels. The EMG was high-pass filtered (400–500 Hz cutoff, Parks-McClellan method), and motor unit waveforms and spike times were extracted using an existing method 98 . Then, clusters were manually cut using peak-to-trough features from all channels on each thread, and unit quality was assessed by inspection of waveforms, autocorrelations, cross-correlations between units recorded on the same thread, and raw signals with unit spike times superimposed. Overall, we recorded 54 ipsilateral extensor units, 27 contralateral extensor units, 17 ipsilateral flexor units, and 10 contralateral flexor units.

Motor cortical recordings

Extracellular recordings in the forelimb area of the motor cortex 15 , 99 were made using chronically implanted high-density silicon probes (64 channel, 4-shank, 6 mm length E1 probe, Cambridge NeuroTech) secured to a manual micromanipulator (CN-01 V1, Cambridge NeuroTech). Probes were plated with the conductive polymer PEDOT to reduce the impedance to the measured range of 30–50 kOhm, and the tips were sharpened to ease insertion through the dura. The electrode was grounded with a gold pin soldered to a stainless steel wire placed through the skull and into the visual cortex. Surgical implantation of the probe occurred after the head post, optical fibers, and ground were secured to the skull. A craniotomy (dimensions ~1 × 2 mm) was performed with a dental drill to access the forelimb area of the motor cortex on the left side (bregma +0.5 mm, lateral 1.7 mm), care was taken to leave the dura intact, and cold saline was applied continuously to reduce swelling. The probe tip was inserted to a starting depth between 400–540 µm, silicone gel was applied (catalog number 3-4680, Dowsil, Dow), and the apparatus, including the amplifier, was secured to the head post, skull, and enclosed custom chamber using dental cement.

Two L7Cre-2 x Ai32 mice were implanted, and recordings were amplified, bandpass filtered (0.01–10 kHz) using a differential amplifier, digitized (mini-amp-64, Cambridge NeuroTech), and acquired at 30 kHz (Open Ephys GUI). Each session, the electrophysiological signal-to-noise ratio and spiking density across channels were assessed. To record from new neurons when signal quality degraded, the probe was advanced 62.5–125 µm every 1–3 days by adjusting the micromanipulator, until the lowest recording channel reached white matter (~1–1.2 mm from the surface).

Motor cortex spike sorting

Single units in the motor cortex were identified using Kilosort 2.5 100 , 101 , 102 ( https://github.com/MouseLand/Kilosort ), and manually curated with the Phy GUI ( https://github.com/cortex-lab/phy ). Only well-isolated neurons were accepted based on spike waveforms, the presence of an absolute refractory period greater than 1 ms, the stability of spike amplitude over the session, and isolation of the cluster in feature space. Spike time cross-correlation was used to remove duplicated neurons.
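As an illustration of one curation criterion, the refractory-period check can be quantified as the fraction of inter-spike intervals shorter than the refractory period. The function below is our own sketch, not part of Kilosort or Phy:

```python
import numpy as np

def refractory_violations(spike_times_s, refractory_ms=1.0):
    """Fraction of inter-spike intervals shorter than the refractory
    period (spike times in seconds). A simple sketch of one unit-quality
    metric; the 1-ms threshold follows the curation criterion above.
    """
    isi = np.diff(np.sort(np.asarray(spike_times_s))) * 1000.0  # ms
    return float(np.mean(isi < refractory_ms)) if isi.size else 0.0
```

A well-isolated unit would be expected to have a violation fraction near zero.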

Quantification and statistical analysis

EMG analysis

To assess changes in behavior over individual experimental sessions, we first interpolated the smoothed biceps and triceps EMG and forward finger velocity between the start of swing and end of stance on each step cycle, and visualized the resulting curves as heatmaps (Fig.  1e ). For optogenetic perturbation experiments, we averaged the step-aligned curves within each condition (load on/off, optogenetic perturbation on/off), and visualized the means using polar plots (Fig.  2a, b ). Next, to obtain a compact representation of motor output on each step, we averaged the biceps (flexor) EMG during the swing epoch, the triceps (extensor) EMG during the stance epoch, and fingertip velocity over the entire step. Medians and bootstrapped confidence intervals for load-off and load-on conditions were visualized as scatterplots (Fig.  1f ), and a difference between conditions (where each paired observation is a load-off and load-on median in one session) was assessed with a two-sided sign rank test. The trend in step-averaged EMG across each session was modeled using loess smoothing 103 (second-order, smoothing parameter α = 0.9; Fig.  1g ). To estimate the effects of load, optogenetic perturbation, and speed on EMG and velocity, we fit one linear model for each session using ordinary least squares, where each observation corresponded to a single step. The dependent variables were biceps EMG, triceps EMG, and forward velocity, and the independent variables were step frequency (i.e., the inverse of the duration of each step), load, optogenetic perturbation, and the interaction between load and optogenetic perturbation. All variables were Z-scored to facilitate the comparison of effect sizes across variables and sessions. Coefficients and 95% confidence intervals were visualized using scatterplots and histograms (Fig.  2c and Supplementary Fig.  1c ), and the sign of the coefficients was assessed with a sign rank test with Benjamini–Hochberg correction (q < 0.05; Supplementary Fig.  1d ). For coefficients related to optogenetic perturbation and its interaction with load, this test was applied separately to sessions using VGAT-ChR2-EYFP and L7Cre-2 x Ai32 mice.
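The per-session regression described above can be sketched as follows, fitting ordinary least squares on Z-scored variables with one observation per step (function and variable names are illustrative):

```python
import numpy as np

def zscore(x):
    """Center and scale to unit (population) standard deviation."""
    return (x - x.mean()) / x.std()

def fit_session_model(step_freq, load, stim, y):
    """One linear model per session: dependent variable y (e.g.,
    swing-averaged biceps EMG) regressed on step frequency, load,
    optogenetic perturbation, and the load x perturbation interaction,
    all Z-scored, via ordinary least squares.
    """
    cols = [step_freq, load, stim, load * stim]
    X = np.column_stack([np.ones(len(y))] +
                        [zscore(c) for c in cols])  # intercept first
    beta, *_ = np.linalg.lstsq(X, zscore(y), rcond=None)
    return beta  # [intercept, freq, load, stim, load*stim]
```

Because all variables are standardized, the returned coefficients are directly comparable across sessions and dependent variables.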

Analysis of cortical neurons and spinal motoneurons

For each motor cortical neuron and spinally-innervated motor unit, firing rates over the full experimental session were computed using Gaussian smoothing (σ = 25 ms). Using the step cycle segmentation from kinematic data (described above), smoothed firing rate curves were extracted for each step using linear interpolation between the start of swing and end of stance, then averaged within each experimental condition to create peri-event time histograms (Figs.  3b ; 5b ). The effects of load and Purkinje cell stimulation as a function of step phase were visualized by subtracting the Z-scored firing rates in the load-off, stim-off condition from the Z-scored firing rates in the load-on, stim-off (Fig.  3c ), and load-off, stim-on conditions (Fig.  5c ), respectively. Step-averaged firing rates were computed for each step by dividing the number of spikes by the step duration. Means and 95% confidence intervals for step-averaged rates were visualized with scatterplots (Figs.  3d ; 5d ) and analyzed with a multi-way ANOVA for each neuron. A Benjamini–Hochberg correction for multiple comparisons across neurons was applied. Scatterplots used log-log axes to transform a heavily skewed firing-rate distribution across the sample of neurons into a more symmetric one (Supplementary Fig.  5 ).
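The step-aligned firing rate computation can be sketched as follows, with Gaussian smoothing of binned spikes and linear interpolation onto a common phase axis (the 1 kHz rate-sampling grid and function name are assumptions):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def step_aligned_rate(spike_times, steps, fs=1000.0,
                      sigma_ms=25.0, n_phase=100):
    """Average step-aligned firing rate for one unit.

    spike_times: spike times in seconds; steps: list of
    (swing_start, stance_end) times in seconds. The rate is computed on
    an assumed 1 kHz grid, Gaussian-smoothed (sigma = 25 ms), linearly
    interpolated onto n_phase points per step, then averaged.
    """
    spike_times = np.asarray(spike_times)
    t_end = max(spike_times.max(), max(e for _, e in steps)) + 1.0
    binned = np.zeros(int(t_end * fs))
    np.add.at(binned, (spike_times * fs).astype(int), 1.0)
    rate = gaussian_filter1d(binned * fs, sigma_ms * 1e-3 * fs)  # spikes/s
    phase = np.linspace(0.0, 1.0, n_phase)
    grid = np.arange(len(rate)) / fs
    curves = [np.interp(start + phase * (end - start), grid, rate)
              for start, end in steps]
    return np.mean(curves, axis=0)
```

Averaging the per-step interpolated curves within a condition yields the peri-event time histograms described above.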

Demixed principal component analysis

To identify the coordinated, low-dimensional dynamics in the motor cortical and spinal motoneuron populations, we used demixed principal component analysis (dPCA) 54 , which decomposes measured firing rates into latent variables related to experimental parameters of interest, using a published Matlab package ( https://github.com/machenslab/dPCA ). Briefly, the average step-aligned firing rate for each unit ( n  = 710 for cortical neurons, n  = 108 for spinal motoneurons) was measured in twenty different conditions in a factorial design: load on/off (two levels) × Purkinje cell stimulation on/off (two levels) × animal speed (five levels). Speed was included as it was relatively variable from stride to stride, exhibited nonstationarity in some sessions, and influenced firing rates (Supplementary Fig.  6 ). Firing rate was sampled at 100 evenly-spaced points across the step cycle, from the start of swing to the end of stance. For the speed factor, the forward speed of the animal’s nose at swing onset was partitioned into five bins with ~50% overlap using an equal count algorithm 103 . This imposed the following marginalizations over parameters: (1) load, (2) speed, (3) Purkinje cell stimulation, (4) condition-invariant, (5) load/speed interaction, (6) load/Purkinje cell stimulation interaction, and (7) speed/Purkinje cell stimulation interaction. Next, we estimated the decoder and encoder matrices with twenty components and regularization parameter λ = 1e-5, and projected firing rates onto the decoder columns to obtain scores parameterized by step phase (Figs.  4a, b,   6a, b ). The alignment between pairs of principal axes was assessed by computing the inner product (Fig.  4c , upper triangular; Fig.  6c , upper triangular), and by applying an exact test against the null hypothesis that the relative orientation of the axes is random with an alternative hypothesis that the axes are orthogonal. 
Under the null hypothesis, (x+1)/2 follows a beta distribution with α = β = (d−1)/2, where x is the inner product between axes and d = 20 is the dimension of the latent variable space. The probability that the inner product x is within r of zero (i.e., that the axes are nearly orthogonal) under the null hypothesis is given by P(|x| < r) = B((1+r)/2, (d−1)/2, (d−1)/2) − B((1−r)/2, (d−1)/2, (d−1)/2), where B is the beta cumulative distribution function. Thus, setting r as the absolute value of the measured inner product between two principal axes, we can calculate the probabilities shown in Figs.  4 c, 6c (lower triangular).
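This probability can be computed directly from the beta cumulative distribution function, for example:

```python
from scipy.stats import beta as beta_dist

def orthogonality_probability(inner_product, d=20):
    """P(|x| < r) for the inner product x of two random axes in d
    dimensions, with r = |observed inner product|. Under the null,
    (x + 1)/2 ~ Beta(a, a) with a = (d - 1)/2, so
    P(|x| < r) = B((1+r)/2; a, a) - B((1-r)/2; a, a).
    """
    r = abs(inner_product)
    a = (d - 1) / 2.0
    return beta_dist.cdf((1 + r) / 2, a, a) - beta_dist.cdf((1 - r) / 2, a, a)
```

A small value indicates that the measured inner product is closer to zero than expected for randomly oriented axes.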

Comparison of cortical neuron and spinal motoneuron trajectories

For each neural population (motor cortex and spinal motoneuron) and experimental parameter (load, Purkinje cell stimulation, and speed), we extracted neural trajectories in the leading component corresponding to that parameter and in the first two condition-invariant dimensions across all twenty conditions. We then used Procrustes analysis within each neural population and parameter to find the optimal transformations from trajectories in one set of conditions to those in another set. These mappings could include translation, rotation, and isotropic rescaling, but not reflection. For the load and Purkinje cell stimulation parameters, trajectories in load-off and stimulation-off conditions were mapped to the corresponding trajectories in load-on and stimulation-on conditions, respectively. For the speed parameter, trajectories in the lowest speed condition were mapped to trajectories in the highest speed condition. The resulting maps were visualized on a regular 3D grid by mapping each grid point to a second point in the direction of its image under the Procrustes transformation, with a scaling of 0.2 for the motor cortex and 0.4 for spinal motoneurons (Fig.  7a, b ).
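A Procrustes fit restricted to translation, proper rotation, and isotropic scaling (i.e., excluding reflections) can be sketched as below; the implementation details are ours, not taken from a published package:

```python
import numpy as np

def procrustes_no_reflection(A, B):
    """Fit b ~ s * a @ R + t over paired rows of A and B (n_points, dim),
    where R is a proper rotation (det(R) = +1, no reflection), s an
    isotropic scale, and t a translation. Sketch implementation.
    """
    muA, muB = A.mean(axis=0), B.mean(axis=0)
    A0, B0 = A - muA, B - muB
    U, S, Vt = np.linalg.svd(A0.T @ B0)
    D = np.ones(len(S))
    D[-1] = np.sign(np.linalg.det(U @ Vt))  # flip last axis if needed
    R = (U * D) @ Vt                        # closest proper rotation
    s = (S * D).sum() / (A0 ** 2).sum()     # optimal isotropic scale
    t = muB - s * muA @ R
    return s, R, t
```

Applying the fitted map to a regular grid of points, as described above, visualizes how trajectories in one condition are carried onto another.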

To further characterize the neural trajectories, we employed the concept of trajectory tangling. The intuition is that in an autonomous, noiseless system x′(t) = f(x(t)), the future trajectory is fully determined by the initial condition x(0). When inputs come into play, x′(t) = f(x(t)) + u(t), the same initial condition x(0) could diverge into distinct trajectories, depending on u(t). In this case, multiple observations of the system across trials or conditions will lead to intersecting, or “tangled,” trajectories at x(0). Thus, an observation of high trajectory tangling is taken to be evidence that inputs u(t) are driving the trajectories (under the assumption that the system is well-behaved, i.e., not chaotic). We note, however, that the converse does not hold: the periodically forced system x′(t) = (−sin(t), cos(t)) has no intrinsic dynamics, but generates minimally tangled trajectories. The analysis of trajectory tangling was performed as in previous studies 60 . Briefly, neural trajectories in the full 20-dimensional latent variable space identified by dPCA were numerically differentiated along the time axis. Next, for each time point t* and condition c*, the following quantity was computed: max{t,c} ||Z′(t*,c*) − Z′(t,c)||² / (||Z(t*,c*) − Z(t,c)||² + ε), where Z(t,c) is the neural state at time t in condition c, and Z′(t,c) its derivative. The value of ε was set at 10% of the mean of the sum of squares of Z(t,c), concatenated across all conditions. This normalization was performed separately for the cortical and spinal populations.
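The tangling metric above can be sketched as follows for states and their derivatives concatenated across time points and conditions (the flattened (samples × dimensions) array layout is an assumption):

```python
import numpy as np

def tangling(Z, dZ, eps_frac=0.1):
    """Trajectory tangling: for each sample i,
    Q[i] = max_j ||dZ[i] - dZ[j]||^2 / (||Z[i] - Z[j]||^2 + eps),
    with eps set to eps_frac (10%) of the mean squared norm of the
    states. Z, dZ: (n_samples, n_dims) states and time derivatives,
    concatenated across all conditions.
    """
    eps = eps_frac * np.mean(np.sum(Z ** 2, axis=1))
    Q = np.empty(len(Z))
    for i in range(len(Z)):
        num = np.sum((dZ[i] - dZ) ** 2, axis=1)
        den = np.sum((Z[i] - Z) ** 2, axis=1) + eps
        Q[i] = np.max(num / den)
    return Q
```

High values of Q indicate nearby states with dissimilar derivatives, the signature of external forcing discussed above.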

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Data availability

All data generated in this study have been deposited in the Dryad repository. The data are available under CC0 licensing at https://doi.org/10.5061/dryad.s7h44j1g4 .

Code availability

All code supporting the findings in this study is packaged with the data in the Dryad repository at https://doi.org/10.5061/dryad.s7h44j1g4 .

Porter, R. & Lemon, R. Corticospinal Function and Voluntary Movement (Oxford Univ. Press, 1993).

Scott, S. H. Optimal feedback control and the neural basis of volitional motor control. Nat. Rev. Neurosci. 5 , 532–546 (2004).


Evarts, E. V. Relation of pyramidal tract activity to force exerted during voluntary movement. J. Neurophysiol. 31 , 14–27 (1968).

Kalaska, J. F., Cohen, D. A., Hyde, M. L. & Prud’homme, M. A comparison of movement direction-related versus load direction-related activity in primate motor cortex, using a two-dimensional reaching task. J. Neurosci. 9 , 2080–2102 (1989).


Sergio, L. E. & Kalaska, J. F. Systematic changes in motor cortex cell activity with arm posture during directional isometric force generation. J. Neurophysiol. 89 , 212–228 (2003).


Kurtzer, I., Herter, T. M. & Scott, S. H. Random change in cortical load representation suggests distinct control of posture and movement. Nat. Neurosci. 8 , 498–504 (2005).

Conrad, B., Matsunami, K., Meyer-Lohmann, J., Wiesendanger, M. & Brooks, V. B. Cortical load compensation during voluntary elbow movements. Brain Res. 71 , 507–514 (1974).

Meyer-Lohmann, J., Conrad, B., Matsunami, K. & Brooks, V. B. Effects of dentate cooling on precentral unit activity following torque pulse injections into elbow movements. Brain Res. 94 , 237–251 (1975).

Hore, J. & Flament, D. Changes in motor cortex neural discharge associated with the development of cerebellar limb ataxia. J. Neurophysiol. 60 , 1285–1302 (1988).

Nashef, A., Cohen, O., Harel, R., Israel, Z. & Prut, Y. Reversible block of cerebellar outflow reveals cortical circuitry for motor coordination. Cell Rep. 27 , 2608–2619.e4 (2019).

Pasalar, S., Roitman, A. V., Durfee, W. K. & Ebner, T. J. Force field effects on cerebellar Purkinje cell discharge with implications for internal models. Nat. Neurosci. 9 , 1404–1411 (2006).

Miri, A. et al. Behaviorally selective engagement of short-latency effector pathways by motor cortex. Neuron 95 , 683–696.e11 (2017).

Mathis, M. W., Mathis, A. & Uchida, N. Somatosensory cortex plays an essential role in forelimb motor adaptation in mice. Neuron 93 , 1493–1503.e6 (2017).

Guo, J.-Z. et al. Cortex commands the performance of skilled movement. Elife 4 , e10774 (2015).


Sauerbrei, B. A. et al. Cortical pattern generation during dexterous movement is input-driven. Nature 577 , 386–391 (2020).

Galiñanes, G. L., Bonardi, C. & Huber, D. Directional reaching for water as a cortex-dependent behavioral framework for mice. Cell Rep. 22 , 2767–2783 (2018).

Churchland, M. M. & Shenoy, K. V. Temporal complexity and heterogeneity of single-neuron activity in premotor and motor cortex. J. Neurophysiol. 97 , 4235–4257 (2007).

Scott, S. H. Inconvenient truths about neural processing in primary motor cortex. J. Physiol. 586 , 1217–1224 (2008).

Crammond, D. J. & Kalaska, J. F. Prior information in motor and premotor cortex: activity during the delay period and effect on pre-movement activity. J. Neurophysiol. 84 , 986–1005 (2000).

Churchland, M. M. et al. Neural population dynamics during reaching. Nature 487 , 51–56 (2012).


Shenoy, K. V., Sahani, M. & Churchland, M. M. Cortical control of arm movements: a dynamical systems perspective. Annu. Rev. Neurosci. 36 , 337–359 (2013).

Vyas, S., Golub, M. D., Sussillo, D. & Shenoy, K. V. Computation through neural population dynamics. Annu. Rev. Neurosci. 43 , 249–275 (2020).

Kalidindi, H. T. et al. Rotational dynamics in motor cortex are consistent with a feedback controller. Elife 10 , e67256 (2021).

Kaufman, M. T., Churchland, M. M., Ryu, S. I. & Shenoy, K. V. Cortical activity in the null space: permitting preparation without movement. Nat. Neurosci. 17 , 440–448 (2014).

Elsayed, G. F., Lara, A. H., Kaufman, M. T., Churchland, M. M. & Cunningham, J. P. Reorganization between preparatory and movement population responses in motor cortex. Nat. Commun. 7 , 13239 (2016).

Churchland, M. M. & Shenoy, K. V. Preparatory activity and the expansive null-space. Nat. Rev. Neurosci. 25 , 213–236 (2024).

Stavisky, S. D., Kao, J. C., Ryu, S. I. & Shenoy, K. V. Motor cortical visuomotor feedback activity is initially isolated from downstream targets in output-null neural state space dimensions. Neuron 95 , 195–208.e9 (2017).

Vyas, S. et al. Neural population dynamics underlying motor learning transfer. Neuron 97 , 1177–1186.e3 (2018).

Sun, X. et al. Cortical preparatory activity indexes learned motor memories. Nature 602 , 274–279 (2022).

Vyas, S., O’Shea, D. J., Ryu, S. I. & Shenoy, K. V. Causal role of motor preparation during error-driven learning. Neuron 106 , 329–339.e4 (2020).

Brown, T. G. On the nature of the fundamental activity of the nervous centres; together with an analysis of the conditioning of rhythmic activity in progression, and a theory of the evolution of function in the nervous system. J. Physiol. 48 , 18–46 (1914).

Grillner, S. & Zangger, P. On the central generation of locomotion in the low spinal cat. Exp. Brain Res. 34 , 241–261 (1979).

Kjaerulff, O. & Kiehn, O. Distribution of networks generating and coordinating locomotor activity in the neonatal rat spinal cord in vitro: a lesion study. J. Neurosci. 16 , 5777–5794 (1996).

Shik, M. L. & Orlovsky, G. N. Neurophysiology of locomotor automatism. Physiol. Rev. 56 , 465–501 (1976).

Capelli, P., Pivetta, C., Soledad Esposito, M. & Arber, S. Locomotor speed control circuits in the caudal brainstem. Nature 551 , 373–377 (2017).


Caggiano, V. et al. Midbrain circuits that set locomotor speed and gait selection. Nature 553 , 455–460 (2018).

Drew, T. Motor cortical activity during voluntary gait modifications in the cat. I. Cells related to the forelimbs. J. Neurophysiol. 70 , 179–199 (1993).

Beloozerova, I. N. & Sirota, M. G. The role of the motor cortex in the control of accuracy of locomotor movements in the cat. J. Physiol. 461 , 1–25 (1993).

Drew, T., Jiang, W., Kably, B. & Lavoie, S. Role of the motor cortex in the control of visually triggered gait modifications. Can. J. Physiol. Pharmacol. 74 , 426–442 (1996).


Warren, R. A. et al. A rapid whisker-based decision underlying skilled locomotion in mice. Elife 10 , e63596 (2021).

Duysens, J. & Pearson, K. G. Inhibition of flexor burst generation by loading ankle extensor muscles in walking cats. Brain Res. 187 , 321–332 (1980).

Armstrong, D. M. & Drew, T. Discharges of pyramidal tract and other motor cortical neurones during locomotion in the cat. J. Physiol. 346 , 471–495 (1984).

Beloozerova, I. N. & Sirota, M. G. The role of the motor cortex in the control of vigour of locomotor movements in the cat. J. Physiol. 461 , 27–46 (1993).

Mathis, A. et al. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nat. Neurosci. 21 , 1281–1289 (2018).

Karashchuk, P. et al. Anipose: a toolkit for robust markerless 3D pose estimation. Cell Rep. 36 , 109730 (2021).

Morton, S. M. & Bastian, A. J. Cerebellar contributions to locomotor adaptations during splitbelt treadmill walking. J. Neurosci. 26 , 9107–9116 (2006).

Darmohray, D. M., Jacobs, J. R., Marques, H. G. & Carey, M. R. Spatial and temporal locomotor learning in mouse cerebellum. Neuron 102 , 217–231.e4 (2019).

Lawrence, D. G. & Kuypers, H. G. The functional organization of the motor system in the monkey. I. The effects of bilateral pyramidal lesions. Brain 91 , 1–14 (1968).

Ilg, W., Giese, M. A., Gizewski, E. R., Schoch, B. & Timmann, D. The influence of focal cerebellar lesions on the control and adaptation of gait. Brain 131 , 2913–2927 (2008).

Guo, Z. V. et al. Flow of cortical activity underlying a tactile decision in mice. Neuron 81 , 179–194 (2014).

Gao, Z. et al. A cortico-cerebellar loop for motor planning. Nature 563 , 113–116 (2018).

Foster, J. D. et al. A freely-moving monkey treadmill model. J. Neural Eng. 11 , 046020 (2014).


Xing, D., Truccolo, W. & Borton, D. A. Emergence of distinct neural subspaces in motor cortical dynamics during volitional adjustments of ongoing locomotion. J. Neurosci. 42 , 9142–9157 (2022).

Kobak, D. et al. Demixed principal component analysis of neural population data. Elife 5 , e10989 (2016).

Saxena, S., Russo, A. A., Cunningham, J. & Churchland, M. M. Motor cortex activity across movement speeds is predicted by network-level strategies for generating muscle activity. Elife 11 , e67620 (2022).

Pearson, K. G., Acharya, H. & Fouad, K. A new electrode configuration for recording electromyographic activity in behaving mice. J. Neurosci. Methods 148 , 36–42 (2005).

Akay, T., Acharya, H. J., Fouad, K. & Pearson, K. G. Behavioral and electromyographic characterization of mice lacking EphA4 receptors. J. Neurophysiol. 96 , 642–651 (2006).

Chung, B. et al. Myomatrix arrays for high-definition muscle recording. eLife https://doi.org/10.7554/elife.88551.1 (2023).

Kaufman, M. T. et al. The largest response component in the motor cortex reflects movement timing but not movement type. eNeuro 3 , ENEURO.0085-16.2016 (2016).

Russo, A. A. et al. Motor cortex embeds muscle-like commands in an untangled population response. Neuron 97 , 953–966.e8 (2018).

Alonso, I. et al. Peripersonal encoding of forelimb proprioception in the mouse somatosensory cortex. Nat. Commun. 14 , 1866 (2023).

Conner, J. M. et al. Modulation of tactile feedback for the execution of dexterous movement. Science 374 , 316–323 (2021).

Adrian, E. D. & Moruzzi, G. Impulses in the pyramidal tract. J. Physiol. 97 , 153–199 (1939).

Brooks, V. B., Rudomin, P. & Slayman, C. L. Peripheral receptive fields of neurons in the cat’s cerebral cortex. J. Neurophysiol. 24 , 302–325 (1961).


Armstrong, D. M. & Drew, T. Topographical localization in the motor cortex of the cat for somatic afferent responses and evoked movements. J. Physiol. 350 , 33–54 (1984).

Lemon, R. N. & Porter, R. Afferent input to movement-related precentral neurones in conscious monkeys. Proc. R. Soc. Lond. B Biol. Sci. 194 , 313–339 (1976).

Picard, N. & Smith, A. M. Primary motor cortical responses to perturbations of prehension in the monkey. J. Neurophysiol. 68 , 1882–1894 (1992).

Armstrong, D. M. & Drew, T. Locomotor-related neuronal discharges in cat motor cortex compared with peripheral receptive fields and evoked movements. J. Physiol. 346 , 497–517 (1984).

Musall, S., Kaufman, M. T., Juavinett, A. L., Gluf, S. & Churchland, A. K. Single-trial neural dynamics are dominated by richly varied movements. Nat. Neurosci. 22 , 1677–1686 (2019).

Harris-Warrick, R. M. & Marder, E. Modulation of neural networks for behavior. Annu. Rev. Neurosci. 14 , 39–57 (1991).

Hogan, N. Adaptive control of mechanical impedance by coactivation of antagonist muscles. IEEE Trans. Autom. Contr. 29 , 681–690 (1984).

Dufresne, J. R., Soechting, J. F. & Terzuolo, C. A. Modulation of the myotatic reflex gain in man during intentional movements. Brain Res. 193 , 67–84 (1980).

Refy, O. et al. Dynamic spinal reflex adaptation during locomotor adaptation. J. Neurophysiol. 130 , 1008–1014 (2023).

Gossard, J. P., Brownstone, R. M., Barajon, I. & Hultborn, H. Transmission in a locomotor-related group Ib pathway from hindlimb extensor muscles in the cat. Exp. Brain Res. 98 , 213–228 (1994).

Santuz, A. & Zampieri, N. Making sense of proprioception. Trends Genet. 40 , 20–23 (2024).

Türker, K. S., Brodin, P. & Miles, T. S. Reflex responses of motor units in human masseter muscle to mechanical stimulation of a tooth. Exp. Brain Res. 100 , 307–315 (1994).

Horner, A. M. & Jayne, B. C. The effects of viscosity on the axial motor pattern and kinematics of the African lungfish ( Protopterus annectens ) during lateral undulatory swimming. J. Exp. Biol. 211 , 1612–1622 (2008).

Pearson, K. G. Central programming and reflex control of walking in the cockroach. J. Exp. Biol. 56 , 173–193 (1972).

Zill, S., Moran, D. & Varela, F. G. The exoskeleton and insect proprioception: II. Reflex effects of tibial campaniform sensilla in the American cockroach, Periplaneta. Am. J. Exp. Biol. 94 , 43–55 (1981).

Epstein, S. & Graham, D. Behaviour and motor output of stick insects walking on a slippery surface: I. Forward walking. J. Exp. Biol. 105 , 215–229 (1983).

Dean, J. Control of leg protraction in the stick insect: a targeted movement showing compensation for externally applied forces. J. Comp. Physiol. A 155 , 771–781 (1984).

Orlovsky, G. N. The effect of different descending systems on flexor and extensor activity during locomotion. Brain Res. 40 , 359–371 (1972).

Russo, A. A. et al. Neural trajectories in the supplementary motor area and motor cortex exhibit distinct geometries, compatible with different classes of computation. Neuron 107 , 745–758.e6 (2020).

Lindén, H., Petersen, P. C., Vestergaard, M. & Berg, R. W. Movement is governed by rotational neural dynamics in spinal motor networks. Nature 610 , 526–531 (2022).

Warriner, C. L., Fageiry, S., Saxena, S., Costa, R. M. & Miri, A. Motor cortical influence relies on task-specific activity covariation. Cell Rep. 40 , 111427 (2022).

Yakovenko, S. & Drew, T. Similar motor cortical control mechanisms for precise limb control during reaching and locomotion. J. Neurosci. 35 , 14476–14490 (2015).

Kawai, R. et al. Motor cortex is required for learning but not for executing a motor skill. Neuron 86 , 800–812 (2015).

Mizes, K. G. C., Lindsey, J., Escola, G. S. & Ölveczky, B. P. Motor cortex is required for flexible but not automatic motor sequences. Preprint at bioRxiv https://doi.org/10.1101/2023.09.05.556348 (2023).

Afshar, A. et al. Single-trial neural correlates of arm movement preparation. Neuron 71 , 555–564 (2011).

Li, N., Daie, K., Svoboda, K. & Druckmann, S. Robust neuronal dynamics in premotor cortex during motor planning. Nature 532 , 459–464 (2016).

Nguyen-Vu, T. D. B. et al. Cerebellar Purkinje cell activity drives motor learning. Nat. Neurosci. 16 , 1734–1736 (2013).

Tsutsumi, S. et al. Purkinje cell activity determines the timing of sensory-evoked motor initiation. Cell Rep. 33 , 108537 (2020).

Silva, N. T., Ramírez-Buriticá, J., Pritchett, D. L. & Carey, M. R. Climbing fibers provide essential instructive signals for associative learning. Nat. Neurosci. 27 , 940–951 (2024).

Guo, Z. V. et al. Maintenance of persistent activity in a frontal thalamocortical loop. Nature 545 , 181–186 (2017).

Zia, M., Chung, B., Sober, S. & Bakir, M. S. Flexible multielectrode arrays with 2-D and 3-D contacts for in vivo electromyography recording. IEEE Trans. Compon. Packag. Manuf. Technol. 10 , 197–202 (2020).


Lu, J. et al. High-performance flexible microelectrode array with PEDOT:PSS coated 3D micro-cones for electromyographic recording. In 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) 5111–5114 (IEEE, 2022).

Mathewson, M. A., Chapman, M. A., Hentzen, E. R., Fridén, J. & Lieber, R. L. Anatomical, architectural, and biochemical diversity of the murine forelimb muscles. J. Anat. 221 , 443–451 (2012).

Marshall, N. J. et al. Flexible neural control of motor units. Nat. Neurosci. 25 , 1492–1504 (2022).

Guo, J.-Z. et al. Disrupting cortico-cerebellar communication impairs dexterity. Elife 10 , e65906 (2021).

Pachitariu, M., Steinmetz, N. A., Kadir, S. N., Carandini, M. & Harris, K. D. Fast and accurate spike sorting of high-channel count probes with KiloSort. Adv. Neural Inf. Process. Syst. 29 (2016).

Steinmetz, N. A. et al. Neuropixels 2.0: a miniaturized high-density probe for stable, long-term brain recordings. Science 372 , eabf4588 (2021).

Pachitariu, M., Sridhar, S., Pennington, J. & Stringer, C. Spike sorting with Kilosort4. Nat. Methods 21 , 914–921 (2024).

Cleveland, W. S. Visualizing Data (Hobart Press, 1993).


Acknowledgements

We thank Andrew Pruszynski and Jonathan Michaels for the discussions and Alex Sohn for the design of the motorized treadmill. This work made use of the High-Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University. Line drawings of the mouse brain and body were obtained from SciDraw.io under a CC BY 4.0 license. The drawings were created by Ethan Tyler, Lex Kravitz, and Emmett Thompson, were edited by the authors to display the design of specific experiments, and are archived at https://zenodo.org/records/3925915 and https://zenodo.org/records/3925987 . Research reported in this publication was supported by Case Western Reserve University, by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under award numbers R01NS129576 (PI: B.A.S.), R01NS109237 (PI: S.J.S.), R01NS084844 (PI: S.J.S.), R01EB022872 (PI: S.J.S.) and U24NS126936 (PI: S.J.S.), and by the McKnight Foundation (PI: S.J.S.), Kavli Foundation (PI: S.J.S.), Azrieli Foundation (PI: S.J.S.), and the Simons-Emory International Consortium on Motor Control (PI: S.J.S.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. E.A.K. was supported by a Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada.

Author information

Authors and affiliations.

Department of Neurosciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA

Eric A. Kirk, Keenan T. Hope & Britton A. Sauerbrei

Department of Biology, Emory University, Atlanta, GA, USA

Samuel J. Sober


Contributions

E.A.K. and B.A.S. designed the study. E.A.K. performed the experiments. K.T.H. assisted with behavioral procedures and designed the cortical recording chamber. E.A.K. and B.A.S. analyzed the data, interpreted the results, and wrote the paper. S.J.S. contributed the Myomatrix electrode arrays, compared the performance of fine-wire electrodes and Myomatrix arrays, and wrote the relevant text in the Results. E.A.K., S.J.S., and B.A.S. edited and revised the manuscript. B.A.S. supervised the project.

Corresponding author

Correspondence to Britton A. Sauerbrei .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature Communications thanks Andrew Spence and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article.

Kirk, E.A., Hope, K.T., Sober, S.J. et al. An output-null signature of inertial load in motor cortex. Nat Commun 15 , 7309 (2024). https://doi.org/10.1038/s41467-024-51750-7


Received : 07 December 2023

Accepted : 15 August 2024

Published : 24 August 2024

DOI : https://doi.org/10.1038/s41467-024-51750-7

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

By submitting a comment you agree to abide by our Terms and Community Guidelines . If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

null hypothesis of zero correlation

IMAGES

  1. Hypothesis Testing for Zero Correlation

    null hypothesis of zero correlation

  2. Chapter 7 Calculation of Pearson Coefficient of Correlation

    null hypothesis of zero correlation

  3. 2.3 Hypothesis testing for zero correlation (FURTHER STATISTICS 2

    null hypothesis of zero correlation

  4. Correlation Examples in Real Life

    null hypothesis of zero correlation

  5. Hypothesis Testing for Zero Correlation

    null hypothesis of zero correlation

  6. t test null hypothesis example

    null hypothesis of zero correlation

COMMENTS

  1. 11.2: Correlation Hypothesis Test

    The hypothesis test lets us decide whether the value of the population correlation coefficient is "close to zero" or "significantly different from zero". We decide this based on the sample correlation coefficient and the sample size .

  2. 5.3

    In this case, because we rejected the null hypothesis we can conclude that the correlation is not equal to zero. Furthermore, because the actual sample correlation is greater than zero and our p-value is so small, we can conclude that there is a positive association between the two variables.

  3. 12.1.2: Hypothesis Test for a Correlation

    Explore the hypothesis testing process for correlation with a focus on statistical significance and data analysis in this educational resource.

  4. Hypothesis testing for a correlation that is zero or negative

    If zero is in the confidence interval, then you would fail to reject the null hypothesis that the correlation is zero. Also, note that you cannot use this for correlations of ±1 ± 1 because if they are one for data that is truly continuous, then you only need 3 data points to determine that.

  5. Pearson Correlation Coefficient (r)

    The Pearson correlation coefficient (r) is the most common way of measuring a linear correlation. It is a number between -1 and 1 that measures the strength and direction of the relationship between two variables.

  6. 1.9

    1.9 - Hypothesis Test for the Population Correlation Coefficient There is one more point we haven't stressed yet in our discussion about the correlation coefficient r and the coefficient of determination R 2 — namely, the two measures summarize the strength of a linear relationship in samples only.

  7. Null Hypothesis: Definition, Rejecting & Examples

    The null hypothesis in statistics states that there is no difference between groups or no relationship between variables. It is one of two mutually exclusive hypotheses about a population in a hypothesis test. When your sample contains sufficient evidence, you can reject the null and conclude that the effect is statistically significant.

  8. PDF Lecture 2: Hypothesis testing and correlation

    To determine whether two conditions differ with respect to the mean, we use a statistical approach known as hypothesis testing. In hypothesis testing, we pose a null hypothesis and ask: if the null hypothesis is true, how likely is the observed pattern of results? This likelihood is known as the p-value, and indicates the statistical significance of the observed pattern of results. If the p ...

  9. 12.3 Testing the Significance of the Correlation Coefficient (Optional

    The hypothesis test lets us decide whether the value of the population correlation coefficient ρ is close to zero or significantly different from zero. We decide this based on the sample correlation coefficient r and the sample size n.

  10. Hypothesis Test for Correlation

    The hypothesis test lets us decide whether the value of the population correlation coefficient ρ is "close to zero" or "significantly different from zero.". We decide this based on the sample correlation coefficient r and the sample size n. If the test concludes that the correlation coefficient is significantly different from zero, we ...

  11. 9.4.1

    9.4.1 - Hypothesis Testing for the Population Correlation In this section, we present the test for the population correlation using a test statistic based on the sample correlation.

  12. Understanding Null Hypothesis Testing

    A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the p value. A low p value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A high p value means that the sample ...

  13. Zero Correlation: Definition, Examples + How to Determine It

    Correlation is a fundamental concept in statistics and data analysis, helping to understand the relationship between two variables. While strong positive or negative correlations are often highlighted, zero correlation is equally important.

  14. 13.1 Understanding Null Hypothesis Testing

    A crucial step in null hypothesis testing is finding the likelihood of the sample result if the null hypothesis were true. This probability is called the p value. A low p value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A p value that is not low means that ...

  15. Null & Alternative Hypotheses

    The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test: Null hypothesis

  16. Null hypothesis

    Basic definitions. The null hypothesis and the alternative hypothesis are types of conjectures used in statistical tests to make statistical inferences, which are formal methods of reaching conclusions and separating scientific claims from statistical noise. The statement being tested in a test of statistical significance is called the null ...

  17. Null Hypothesis

    A null hypothesis is a precise statement about a population that we try to reject with sample data. We don't usually believe our null hypothesis (or H 0) to be true. However, we need some exact statement as a starting point for statistical significance testing.

  18. Understanding the Null Hypothesis for Linear Regression

    This tutorial provides a simple explanation of the null and alternative hypothesis used in linear regression, including examples.

  19. 9.1: Null and Alternative Hypotheses

    Learn how to formulate and test null and alternative hypotheses in statistics with examples and exercises from this LibreTexts course.

  20. What Is The Null Hypothesis & When To Reject It

    A null hypothesis is a statistical concept suggesting that there's no significant difference or relationship between measured variables. It's the default assumption unless empirical evidence proves otherwise.

  21. 12.5: Testing the Significance of the Correlation Coefficient

    The hypothesis test lets us decide whether the value of the population correlation coefficient ρ is "close to zero" or "significantly different from zero". We decide this based on the sample correlation coefficient r and the sample size n.

  22. 10.1

    10.1 - Setting the Hypotheses: Examples. A significance test examines whether the null hypothesis provides a plausible explanation of the data. The null hypothesis itself does not involve the data. It is a statement about a parameter (a numerical characteristic of the population). These population values might be proportions or means or ...

  23. 4 Examples of No Correlation Between Variables

    This tutorial provides several examples of variables having no correlation in statistics, including several scatterplots.

  24. An output-null signature of inertial load in motor cortex

    Under the null hypothesis, (x-1)/2 follows a beta distribution with α = β = (d-1)/2, where x is the inner product between axes and d = 20 is the dimension of the latent variable space.