
# Statistics and Probability

## Furthermore

Official Guidance, clarification and syllabus links:

Technology should be used to calculate r. However, hand calculations of r may enhance understanding.

Critical values of r will be given where appropriate.

Students should be aware that Pearson’s product moment correlation coefficient (r) is only meaningful for linear relationships.

Students should be able to describe a correlation as positive, zero or negative, and as strong, weak or no correlation.

Students should be able to make the distinction between correlation and causation and know that correlation does not imply causation.

Technology should be used to find the equation of the regression line.

Students should be aware of the dangers of extrapolation, and that a y on x line cannot always be used to make a reliable prediction of x from a given value of y.

Linear correlation of bivariate data refers to the strength and direction of a linear relationship between two variables. Two variables are linearly correlated when the data points tend to lie close to a straight line: the correlation is positive if one variable tends to increase as the other increases, and negative if one tends to decrease as the other increases. Pearson’s product-moment correlation coefficient, denoted as $$r$$, quantifies the strength and direction of this linear relationship. The value of $$r$$ lies between -1 and 1 inclusive, with -1 indicating a perfect negative linear relationship, 1 indicating a perfect positive linear relationship, and 0 indicating no linear relationship (a non-linear relationship may still exist).
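To illustrate why $$r$$ is only meaningful for linear relationships, the following minimal Python sketch computes $$r$$ from its standard definition for data that follow an exact quadratic rule. The function name `pearson_r` and the data values are assumptions made up for this illustration; in an exam a GDC would be used instead.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed directly from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: y is determined exactly by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print(f"r for y = x^2 on symmetric x values: {r_curve:.3f}")
```

Here $$y$$ is completely determined by $$x$$, yet $$r$$ comes out as zero because the positive and negative contributions cancel. A value of $$r$$ near zero therefore rules out a linear pattern only, which is one reason to look at the scatter diagram as well.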

Scatter diagrams, also known as scatter plots, are graphical representations of bivariate data. Each point on the scatter diagram represents a pair of values for the two variables. The pattern of points can give an indication of the type and strength of the relationship between the variables.

The equation of the regression line of $$y$$ on $$x$$, also known as the least squares regression line, gives the linear model that minimises the sum of the squared vertical distances between the data points and the line. This line can be used to predict $$y$$ from values of $$x$$ that lie within the range of the data.

Key formulae [In the exam a GDC would be used to calculate r and to find the equation of the regression line].

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$

Where:
$$r$$ = Pearson’s product-moment correlation coefficient
$$x_i$$ and $$y_i$$ = Individual data points
$$\bar{x}$$ and $$\bar{y}$$ = Means of $$x$$ and $$y$$ respectively
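
As a sketch of how a hand calculation might go, take the five data points (1, 2), (2, 4), (3, 5), (4, 4), (5, 5). These values are invented for this illustration and do not come from any syllabus material. The means are $$\bar{x} = 3$$ and $$\bar{y} = 4$$, so the deviations from the means are -2, -1, 0, 1, 2 for $$x$$ and -2, 0, 1, 0, 1 for $$y$$. Then:

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = (-2)(-2) + (-1)(0) + (0)(1) + (1)(0) + (2)(1) = 6$$

$$\sum (x_i - \bar{x})^2 = 4 + 1 + 0 + 1 + 4 = 10, \qquad \sum (y_i - \bar{y})^2 = 4 + 0 + 1 + 0 + 1 = 6$$

$$r = \frac{6}{\sqrt{10 \times 6}} = \frac{6}{\sqrt{60}} \approx 0.775$$

For this small made-up data set the result suggests a fairly strong positive linear correlation.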

Equation of the regression line of $$y$$ on $$x$$:

$$y = a + bx$$

Where:
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
and
$$a = \bar{y} - b\bar{x}$$
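
The formulae for $$a$$ and $$b$$ can be checked with a short Python sketch. The helper name `regression_y_on_x` and the data are hypothetical, chosen only for illustration. The sketch also fits an $$x$$ on $$y$$ line by swapping the roles of the variables (a standard technique, though not one of the formulae listed above) to show why a $$y$$ on $$x$$ line cannot always be used to predict $$x$$ from $$y$$.

```python
def regression_y_on_x(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # gradient
    a = y_bar - b * x_bar    # intercept: the line passes through the mean point
    return a, b


# Hypothetical data, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = regression_y_on_x(xs, ys)
y_pred = a + b * 3.5  # x = 3.5 lies inside the data range (interpolation)

# Predicting x when y = 5, in two different ways:
c, d = regression_y_on_x(ys, xs)   # x on y line, x = c + d*y
x_from_x_on_y = c + d * 5
x_from_inverting = (5 - a) / b     # rearranging the y on x line instead

print(f"y on x line: y = {a:.2f} + {b:.2f}x")
print(f"predicted y at x = 3.5: {y_pred:.2f}")
print(f"x when y = 5 (x on y line): {x_from_x_on_y:.2f}")
print(f"x when y = 5 (rearranged y on x line): {x_from_inverting:.2f}")
```

The two estimates of $$x$$ differ because each line minimises a different set of distances (vertical for $$y$$ on $$x$$, horizontal for $$x$$ on $$y$$). The lines coincide only when the correlation is perfect.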

If you use the TI-Nspire calculator you can find instructions for finding the regression line on the GDC Essentials page. The equation of that line can then be used for prediction purposes.

This Bicen Maths video clip shows everything you need to memorise on Regression and Correlation for A Level Statistics.

This video on Bivariate Statistics is from Revision Village and is aimed at students taking the IB Maths AA SL/HL level course.

