About Linear Regression Calculator

Calculate the regression equation, step-by-step working, graph, uncertainty analysis, and predictions for simple and multiple linear regression.

About this calculator

What this does

Fits a linear regression model to paired data using ordinary least squares, returning the regression equation, slope, intercept, R-squared, correlation coefficient, and residuals. Supports simple regression with one predictor and multiple regression with two or more predictors, and computes confidence intervals and prediction intervals for any given x value.

Who it is for

Students, researchers, data analysts, and anyone who needs to model the relationship between variables or make predictions from data. Useful for coursework, data exploration, trend analysis, and quick regression checks without firing up a full statistics package.

How it works

For simple regression, the calculator computes Sxx (the sum of squared x deviations) and Sxy (the sum of cross-products of the x and y deviations), then derives the slope b = Sxy/Sxx and intercept a = ȳ - b·x̄. For multiple regression it uses matrix algebra to solve the normal equations (XᵀX)β = Xᵀy. R-squared is the ratio of the explained sum of squares to the total sum of squares, which for a least-squares fit with an intercept equals 1 - SS_res / SS_total. Intervals use the t-distribution with the residual degrees of freedom (n - 2 for simple regression) and the appropriate standard error.
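To make the multiple-regression step concrete, here is a minimal pure-Python sketch that builds the normal equations (XᵀX)β = Xᵀy and solves them. The function name fit_multiple, the Gauss-Jordan solver, and the sample data are illustrative assumptions; the calculator's actual matrix routine is not documented here.

```python
def fit_multiple(rows, ys):
    """Return [intercept, b1, b2, ...] for y = b0 + b1*x1 + b2*x2 + ...

    rows: one list of predictor values per observation.
    Hypothetical helper; solves the normal equations by Gauss-Jordan elimination.
    """
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0] + [float(v) for v in r] for r in rows]
    n, p = len(X), len(X[0])
    if n != len(ys) or n < p:
        raise ValueError("need at least as many observations as coefficients")

    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
        + [sum(X[k][i] * ys[k] for k in range(n))]
        for i in range(p)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; X^T X is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]

    return [A[i][p] / A[i][i] for i in range(p)]


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2.
rows = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
ys = [3, 4, 6, 8, 12]
coefs = fit_multiple(rows, ys)
print([round(c, 6) for c in coefs])  # approximately [1.0, 2.0, 3.0]
```

Solving the normal equations directly is fine for small, well-conditioned problems; numerical libraries usually prefer QR or SVD factorizations because XᵀX can be poorly conditioned.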

Limitations

The calculator checks basic assumptions but cannot verify independence, homoscedasticity, or normality from raw data alone. Outliers can heavily influence the regression line, and extrapolating beyond the observed x range produces unreliable predictions. Multiple regression with many predictors relative to sample size risks overfitting.

Formula

Slope (Simple Regression)

The slope b = Sxy / Sxx, where Sxx = Σ(xᵢ - x̄)² and Sxy = Σ(xᵢ - x̄)(yᵢ - ȳ). It measures the expected change in y for a one-unit change in x.

Intercept

The intercept a = ȳ - b·x̄, where ȳ and x̄ are the sample means. It represents the predicted y value when x equals zero, which is only meaningful when x = 0 lies within or near the observed range of the data.
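The slope and intercept formulas can be sketched in a few lines of plain Python. The function name fit_simple and the five sample points are assumptions chosen for illustration, not the calculator's own code.

```python
def fit_simple(xs, ys):
    """Return (slope, intercept) for y = a + b*x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxx: sum of squared x deviations; Sxy: sum of cross-products.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = sxy / sxx
    a = y_bar - b * x_bar
    return b, a


# Made-up sample data: Sxx = 10, Sxy = 6, so b = 0.6 and a = 4 - 0.6*3 = 2.2.
slope, intercept = fit_simple([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y = {intercept:.2f} + {slope:.2f}x")  # y = 2.20 + 0.60x
```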

Coefficient of Determination (R²)

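R² = 1 - SS_res / SS_total, where SS_res = Σ(yᵢ - ŷᵢ)² is the sum of squared residuals and SS_total = Σ(yᵢ - ȳ)² is the total sum of squares. For a least-squares fit that includes an intercept, it ranges from 0 to 1, with higher values indicating that the model explains more of the variation in y.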
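As a short illustration of the R² formula, the sketch below computes it from observed and fitted values. The function name r_squared is hypothetical, and the coefficients 2.2 and 0.6 are the least-squares estimates for this made-up sample.

```python
def r_squared(ys, fitted):
    """Return 1 - SS_res / SS_total for observed ys and fitted values."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    if ss_total == 0:
        raise ValueError("all y values are identical; R-squared is undefined")
    return 1 - ss_res / ss_total


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
fitted = [2.2 + 0.6 * x for x in xs]  # least-squares line for this sample
r2 = r_squared(ys, fitted)
print(f"{r2:.3f}")  # 0.600 (SS_res = 2.4, SS_total = 6)
```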

Prediction Interval

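A 100(1-α)% prediction interval for a new observation at x₀ is ŷ ± t_(α/2, n-2) × SE × √(1 + 1/n + (x₀ - x̄)²/Sxx), where n is the number of observations, ŷ = a + b·x₀ is the fitted value, and SE = √(SS_res / (n - 2)) is the residual standard error. The confidence interval for the mean response at x₀ uses the same expression without the leading 1 under the square root.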
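The interval formula can be sketched as follows. Python's standard library has no t-distribution quantile function, so this example assumes the critical value is looked up in a t table and passed in as t_crit. The function name and the sample data are illustrative.

```python
import math


def prediction_interval(xs, ys, x0, t_crit):
    """Return (low, high) for a new observation at x0 in simple regression.

    t_crit is the two-sided critical value t_(alpha/2, n-2), supplied by the
    caller (an assumption of this sketch; it is not computed here).
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points so that n - 2 > 0")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(ss_res / (n - 2))  # residual standard error
    y_hat = a + b * x0
    margin = t_crit * se * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - margin, y_hat + margin


# 95% interval with n = 5, so df = 3 and t_(0.025, 3) = 3.182 from a t table.
low, high = prediction_interval([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], x0=3, t_crit=3.182)
print(f"{low:.3f}, {high:.3f}")  # 0.882, 7.118
```

With only five points the interval is wide; it narrows as n grows and widens as x₀ moves away from x̄.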

How it works

Step 1

Enter paired data

Add x and y values for simple regression, or y and predictor columns for multiple regression.

Step 2

Review the regression fit

Read the equation, slope, intercept, R², correlation, and residual outputs.

Step 3

Check intervals and predictions

Use the prediction input and uncertainty options to compare fitted values, confidence intervals, and prediction intervals.

Reference ranges

R² Interpretation

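As a rough rule of thumb, R² > 0.7 indicates a strong linear relationship, R² between 0.3 and 0.7 a moderate one, and R² < 0.3 a weak fit, meaning x explains little of the variation in y. What counts as a good R² varies by field, so treat these cutoffs as a guide rather than a standard.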

Correlation Strength

As a rule of thumb, |r| > 0.8 is a very strong linear correlation, |r| between 0.5 and 0.8 is moderate, |r| between 0.3 and 0.5 is weak, and |r| < 0.3 is negligible.
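For reference, the correlation coefficient can be computed from the same deviation sums used for the slope: r = Sxy / √(Sxx·Syy). The sketch below uses a hypothetical pearson_r helper and made-up data. In simple regression, r² equals R².

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y has zero variance")
    return sxy / math.sqrt(sxx * syy)


r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")  # r = 0.7746, r^2 = 0.6000
```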

Residual Patterns

Well-behaved residuals should be randomly scattered around zero with no obvious pattern (funnel, curve, or clusters). Patterns suggest a nonlinear relationship or heteroscedasticity.
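A quick way to inspect residuals is to list them next to the x values and look for runs of the same sign or a growing spread. This minimal sketch assumes the fitted line y = 2.2 + 0.6x, which is the least-squares fit for the made-up sample shown.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # least-squares intercept and slope for this sample

# Residual = observed y minus fitted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:+.2f}")

# For a least-squares fit with an intercept, residuals sum to (about) zero.
print(f"sum = {abs(sum(residuals)):.6f}")
```

A zero sum is guaranteed by the fitting method, so it says nothing about fit quality on its own. The pattern of the residuals across x is what matters.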

Sample Size Guidelines

For simple regression, aim for at least 10 to 20 data points to get reliable intervals. For multiple regression, aim for at least 10 to 15 observations per predictor variable.
