Regression Analysis

Simple linear regression is a statistical method that predicts the value of a dependent variable from the value of a single independent variable. It assesses the linear relationship between two continuous variables and provides insights into the relationship’s direction, magnitude, and statistical significance.

For instance, you can use simple linear regression to predict the sales of a product based on the advertising spend (i.e., your dependent variable would be “sales” and your independent variable would be “advertising spend”). You could also determine how much of the variation in sales can be explained by advertising spend. Similarly, you could use linear regression to predict the weight of a person based on their height (i.e., your dependent variable would be “weight” and your independent variable would be “height”). You could also determine how much of the variation in weight can be attributed to the person’s height.
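To make the advertising example concrete, here is a minimal sketch of how the line of best fit is computed with ordinary least squares. The spend and sales figures below are invented for illustration, not taken from real data:

```python
# Ordinary least squares for simple linear regression:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x).
# The advertising-spend and sales figures are invented for illustration.

def fit_simple_ols(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)                       # sum of squares of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # cross products
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

ad_spend = [10, 20, 30, 40, 50]   # hypothetical spend, e.g. thousands of dollars
sales    = [25, 41, 62, 79, 103]  # hypothetical units sold

b1, b0 = fit_simple_ols(ad_spend, sales)
print(f"sales ~ {b0:.2f} + {b1:.2f} * ad_spend")
```

The fitted slope (about 1.94 here) is the estimated change in sales for a one-unit increase in advertising spend, and the intercept is the predicted sales at zero spend.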

Note that simple linear regression is also known as bivariate linear regression. The dependent variable is also referred to as the outcome, target, or criterion variable, while the independent variable is also called the predictor, explanatory, or regressor variable.

Assumptions

Seven assumptions need to be considered to run a linear regression analysis. The first two assumptions relate to your choice of study design and the measurements you chose to make, while the other five assumptions relate to how your data fits the linear regression model. These assumptions are:

• Assumption #1: You have one continuous dependent variable. Continuous variables can take on infinitely many values within a given range. Temperature, time, height, weight, distance, age, blood pressure, speed, electricity consumption, and sound level are typical examples. For instance, the temperature in a room can be any value within the limits of the thermometer, such as 22.5°C or 22.51°C, and time can be measured to any level of precision (seconds, milliseconds, or smaller units). Likewise, height and weight can take fractional values (such as 1.75 meters), as can distance, age, blood pressure measured in millimeters of mercury (mmHg), speed in kilometers or miles per hour, electricity consumption in kilowatt-hours, and sound level in decibels.
• Assumption #2: You have one continuous independent variable. See the bullet above for examples of continuous variables.
• Assumption #3: There is a linear relationship between the dependent and independent variables. For instance, if you want to investigate the relationship between the amount of rainfall and crop yield, you need to determine whether the relationship between the two variables is linear. There are several methods for testing linearity; one of them is to visually examine a scatterplot of the dependent variable plotted against the independent variable. If the points approximately follow a straight line, this indicates a linear relationship; if the pattern appears curved, it suggests a non-linear relationship.
• Assumption #4: The errors, or residuals, are independent. Independence means that knowing one residual should not provide useful information about any other residual. One way to check for independence of residuals is the Durbin-Watson statistic. There are many ways this assumption can be violated. For instance, if you collect data over time, observations close together in time are often more similar than observations further apart, which produces correlated residuals. Similarly, if the way the dependent variable is measured changes over time, such as adopting new technology or switching the measurement instrument, the residuals can become correlated.
• Assumption #5: There should be no significant outliers. Outliers, leverage points, and influential cases are all types of unusual points that can significantly affect the regression equation and statistical inferences. For instance, in a dataset of student grades where most students scored between 70% and 90%, a single score of 20% can pull the line of best fit toward it and lead to inaccurate predictions.
• Assumption #6: Your data needs to show homoscedasticity, meaning that the variance of the errors is similar across all values of the independent variable. For example, in a dataset of employee salaries against years of experience, homoscedasticity would be present if the spread of residuals around the regression line is roughly the same at all experience levels. If there is heteroscedasticity, so that the residuals are not evenly spread but instead form a funnel, fan, or some other shape, the standard errors and significance tests become unreliable, which reduces the accuracy of predictions.
• Assumption #7: You need to check that the residuals (errors) of the regression line are approximately normally distributed. SPSS Statistics produces two graphical measures that can be used to assess normality: a histogram of the residuals (with a superimposed normal curve) and a normal P-P plot. For instance, in a dataset of car prices against mileage, normality of residuals would imply that most residuals cluster around zero, with the rest spread out fairly evenly on both sides.
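Most of these checks are read off SPSS Statistics output, but the Durbin-Watson statistic mentioned under Assumption #4 is simple enough to compute directly. A minimal sketch, using invented residuals:

```python
# Durbin-Watson statistic for residual autocorrelation:
# d = sum((e_t - e_{t-1})^2) / sum(e_t^2).
# Values near 2 suggest uncorrelated residuals; values near 0 or 4
# suggest positive or negative autocorrelation, respectively.
# The residuals below are invented for illustration.

def durbin_watson(residuals):
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

residuals = [0.5, -0.3, 0.2, -0.4, 0.1, -0.1]
d = durbin_watson(residuals)
print(f"Durbin-Watson d = {d:.3f}")
```

The statistic always lies between 0 and 4, so a value close to 2 (as here) is consistent with independent residuals.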

Interpreting Results

After you run a linear regression and check that your dataset meets its assumptions, SPSS Statistics generates several tables that contain the information needed to report the regression results.

The output from a simple linear regression helps you achieve three main objectives: determine how much of the variation in the dependent variable is explained by the independent variable, predict values of the dependent variable for new values of the independent variable, and calculate how much the dependent variable changes for a one-unit change in the independent variable.

When interpreting and reporting the results of a linear regression, it is recommended that you work through three stages:

1. Determine whether the linear regression model is a good fit for the data.
2. Interpret the coefficients of the regression model.
3. Use SPSS Statistics to make predictions of the dependent variable based on values of the independent variable.

To determine whether the linear regression model is a good fit for the data, you can use statistics such as the percentage of variance explained, the statistical significance of the overall model, and the precision of predictions from the regression model. These statistics can be found in the Model Summary and ANOVA tables.
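As a rough illustration of what the Model Summary and ANOVA tables report, the R² value and overall F statistic can be computed directly from the data. The data points below are invented, and this sketch is not a substitute for the SPSS output:

```python
# R-squared and the overall F statistic that appear in the Model Summary
# and ANOVA tables, computed from scratch for simple linear regression.
# F = MSR / MSE with 1 and n-2 degrees of freedom.
# The data points are invented for illustration.

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

fitted = [intercept + slope * xi for xi in x]
sst = sum((yi - mean_y) ** 2 for yi in y)               # total sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
ssr = sst - sse                                         # regression sum of squares

r_squared = ssr / sst
f_stat = (ssr / 1) / (sse / (n - 2))
print(f"R^2 = {r_squared:.4f}, F(1, {n - 2}) = {f_stat:.2f}")
```

A high R² means the independent variable explains most of the variation in the dependent variable, and a large F statistic (with a small p-value) indicates the overall model is statistically significant.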

Once you have interpreted the overall model fit, you can interpret and report the regression model coefficients. These coefficients can help you understand whether there is a linear relationship between the two variables. Furthermore, you can use the regression equation to calculate predicted values of the dependent variable for given values of the independent variable.

Finally, you can use SPSS Statistics, or the regression equation itself, to predict values of the dependent variable for new values of the independent variable.
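A minimal sketch of this final step, assuming a fitted equation with a hypothetical intercept of 3.8 and slope of 1.94:

```python
# Using a fitted regression equation y-hat = b0 + b1 * x to predict the
# dependent variable for new values of the independent variable.
# The coefficients (intercept 3.8, slope 1.94) are hypothetical.

def predict(intercept, slope, new_x):
    return [intercept + slope * x for x in new_x]

preds = predict(3.8, 1.94, [15, 25, 60])
print(preds)
```

Note that predictions are most trustworthy within the range of the independent variable used to fit the model; extrapolating far beyond it is risky.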

