Scatter plots are used when you have two variables and want to look for a relationship between them. Each observation is plotted as a dot on a diagram where each axis represents one variable, and a best-fit line is drawn through the dots. The closer the dots are to the line, the stronger the correlation between the variables.
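
As a minimal sketch of how a best-fit line can be obtained, the Python snippet below fits a straight line by ordinary least squares. The text above does not say which fitting method is used, so least squares is an assumption here, and the function name fit_line and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, made-up measurements used only for illustration
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x_values, y_values)
print(f"best-fit line: y = {slope:.2f} * x + {intercept:.2f}")
```

The dots would be drawn from the (x, y) pairs and the line from the fitted slope and intercept, using whatever plotting tool is at hand.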

Scatter Plots Types of Correlation

The types of correlation in scatter plots are defined below.

  • Positive correlation: higher values of one measurement are associated with higher values of the other.
  • Negative correlation: higher values of one measurement are associated with lower values of the other.

Note: correlation does not mean there is a cause-and-effect relationship.

  • Sometimes the correlation is a coincidence.
  • Sometimes the correlation is due to a third (lurking) variable that has not been included in your analysis.

Scatter Plots Correlation Coefficient

  • The Pearson correlation coefficient, r, tells us the strength and direction of the linear relationship.
  • The coefficient of determination, R², is the square of the Pearson coefficient. It tells us how much of the variation in the response variable Y can be explained by the independent variable X.
  • r can be negative or positive, from -1 to +1, but R² is never negative; it ranges from 0 to 1.

As a rule of thumb, if r > 0.65 or r < -0.65, the relationship is considered strong.
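
The following Python sketch shows one way these quantities could be computed by hand: it calculates r from its standard definition, squares it to get R², and applies the 0.65 cutoff mentioned above. The function names pearson_r and describe and the sample data are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe(r, threshold=0.65):
    """Classify direction and strength of r using the 0.65 cutoff."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    strength = "strong" if abs(r) > threshold else "not strong"
    return direction, strength


# Hypothetical, made-up data: y falls as x rises
x_values = [1, 2, 3, 4, 5]
y_values = [10, 8, 7, 4, 1]

r = pearson_r(x_values, y_values)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}, {describe(r)}")
```

For this made-up data r is close to -1, so the relationship is classified as negative and strong, and R² indicates that most of the variation in y is accounted for by x.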

[Figure: Scatter Plots 01]

When calculating the correlation between variables, always plot the data. There can be a strong nonlinear relationship between the variables even when r = 0, as seen below.

[Figure: Scatter Plots 02]
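
To make this point concrete, the Python sketch below builds a hypothetical data set in which y depends entirely on x (y = x²) and shows that the Pearson coefficient is still zero, because the relationship is not linear. The helper pearson_r and the data are illustrative assumptions, not taken from the figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a perfect parabola, symmetric around x = 0
x_values = [-3, -2, -1, 0, 1, 2, 3]
y_values = [x ** 2 for x in x_values]

r = pearson_r(x_values, y_values)
print(f"r = {r:.3f}")  # the falling and rising halves cancel out
```

The negative trend on the left half of the parabola cancels the positive trend on the right half, so r alone would suggest no relationship even though a plot of the dots makes the pattern obvious.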

Scatter Plots for Problem Solving

Scatter plots are useful in several problem-solving scenarios:

  • To determine whether two variables are related, as part of a root cause analysis.
  • After brainstorming with a fishbone diagram, to determine whether a particular cause and effect are related.
  • To determine whether two effects that appear to be related both occur with the same cause.

