By Suha Tayyab
BIOSTATISTICS- Correlation and Regression
Correlation is a statistical measure that expresses the extent to which two variables are linearly related.
- Variables move together
- x and y can be interchanged
- The relationship is summarized by a single value, the correlation coefficient
The correlation coefficient, r, takes values between -1 and +1. It quantifies the direction and strength of the linear relationship between two quantitative variables.
- The direction is determined by the sign of r. If r is positive, the scatter plot shows an upward trend. If r is negative, the scatter plot shows a downward trend.
- The strength of the linear relationship is determined by the absolute value of r:
<0.3: weak
0.3-0.7: moderate
>0.7: strong
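The rules above can be sketched in a few lines of Python. This is only an illustration, not part of the Excel workflow: the function names `pearson_r` and `describe` are made up for this example, and the `hours` and `score` values are invented sample data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

def describe(r):
    """Return (direction, strength) using the cut-offs listed above."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    size = abs(r)
    if size < 0.3:
        strength = "weak"
    elif size <= 0.7:
        strength = "moderate"
    else:
        strength = "strong"
    return direction, strength

# Invented sample data (assumption, for illustration only).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
print(round(r, 3), describe(r))
```

For this sample data r is roughly 0.77, which falls in the strong, positive range. Calling `pearson_r(score, hours)` gives the same value, which is what "x and y can be interchanged" means in practice.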
Regression is a statistical process that estimates the relationship between a dependent variable and an independent variable. It emphasizes how one variable affects the other.
- Tells us the functional relationship between variables
- x and y can’t be interchanged
- Data is represented by a line
R² measures how well the regression line predicts the actual values. It has values between 0 and +1, and values closer to 1 indicate a better fit.
Once the equation of the regression line has been calculated, x values can be substituted into it to predict y values. Alternatively, a y value can be read off the graph: find the x value of interest, move up to the regression line, and read the corresponding value on the y-axis.
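As a rough sketch of these ideas, the Python below fits a least-squares line, computes R², and substitutes an x value to predict y. The function names `fit_line` and `r_squared` are made up for this example, and the `hours` and `score` values are invented sample data, not data from this article.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(x, y, slope, intercept):
    """Share of the variation in y that the fitted line accounts for."""
    mean_y = sum(y) / len(y)
    # Residual sum of squares: actual y minus the y predicted by the line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Total sum of squares: actual y minus the mean of y.
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Invented sample data (assumption, for illustration only).
hours = [1, 2, 3, 4, 5]   # independent variable, x
score = [2, 4, 5, 4, 5]   # dependent variable, y

slope, intercept = fit_line(hours, score)
r2 = r_squared(hours, score, slope, intercept)

# Substitute an x value into the line equation to predict y.
predicted = intercept + slope * 6

# Swapping the roles of x and y gives a different line.
slope_swapped, intercept_swapped = fit_line(score, hours)

print(slope, intercept, r2, predicted)
print(slope_swapped, intercept_swapped)
```

With this sample data the fitted line is approximately y = 2.2 + 0.6x and R² is about 0.6. Fitting with the variables swapped gives a slope of about 1 and an intercept of about -1, which is why x and y can’t be interchanged in regression.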
In practice, large volumes of data are collected from experiments or research, and applications such as Microsoft Excel and SPSS are used to analyse the data and apply these functions.
We will work with sample data in Excel.
Before we begin, we will talk about what Microsoft Excel is.
Microsoft Excel is spreadsheet software used to organize data in rows and columns. It can perform various mathematical and statistical functions on the data.
Calculating correlation
Calculating Regression