Question 1

What is linear regression?

Accepted Answer

Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data. The most common method is ordinary least squares, which finds the line that minimizes the sum of squared vertical distances between data points and the fitted line. It is widely used for prediction, trend analysis, and understanding variable relationships.

Question 2

How do you calculate linear regression?

Accepted Answer

Simple linear regression finds the best-fit line y = a + bx by minimizing the sum of squared residuals. The slope is b = Sxy / Sxx and the intercept is a = ȳ - b·x̄, where Sxx = Σ(xᵢ - x̄)² is the sum of squared deviations of x and Sxy = Σ(xᵢ - x̄)(yᵢ - ȳ) is the sum of cross products between x and y. Multiple regression extends this using matrix algebra to solve for all coefficients simultaneously.
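
As a minimal sketch of the slope and intercept formulas above, the following plain-Python function computes Sxx, Sxy, b, and a directly. The function name and the sample data are made up for illustration.

```python
def simple_linear_regression(x, y):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length, with at least 2 points")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sxx: sum of squared deviations of x; Sxy: sum of cross products.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


# Illustrative data (not from any real dataset).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = simple_linear_regression(x, y)
print(f"y = {a:.2f} + {b:.2f}x")
```

For this sample, x̄ = 3, ȳ = 4, Sxx = 10, and Sxy = 6, so the slope is 0.6 and the intercept is 2.2.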

Question 3

What does R² mean?

Accepted Answer

R² (coefficient of determination) measures the proportion of variance in the dependent variable that is predictable from the independent variables. An R² of 0.85 means the model explains 85% of the variance in y. However, a high R² does not prove causation, and adding a predictor to a least-squares model never lowers R², even if the predictor is irrelevant. This is why adjusted R², which penalizes the number of predictors, is preferred when comparing models of different sizes.
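
A short sketch of how R² and adjusted R² can be computed from observed and fitted values. It assumes the usual definitions R² = 1 - SSres/SStot and adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where p is the number of predictors. The function names and data are illustrative only.

```python
def r_squared(y, y_hat):
    """Proportion of the variance in y explained by the fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)               # total
    if ss_tot == 0:
        raise ValueError("y is constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalize R^2 for using p predictors on n observations."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Illustrative data: fitted values come from the line y = 2.2 + 0.6x
# evaluated at x = 1..5.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(y, y_hat)
adj = adjusted_r_squared(r2, n=len(y), p=1)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}")
```

Here SSres = 2.4 and SStot = 6, so R² = 0.6, while the adjusted value is lower (about 0.467) because only five observations support the fit.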

Question 4

What is the difference between confidence interval and prediction interval?

Accepted Answer

A confidence interval estimates the mean response at a given x value: it tells you where the average y falls across all observations with that x. A prediction interval estimates where a single new observation will fall. At the same confidence level, a prediction interval is always wider because it includes both the uncertainty in the regression line and the inherent random scatter of individual data points.
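
To make the difference concrete, here is a sketch for simple linear regression using the standard textbook half-widths: t·s·sqrt(1/n + (x0 - x̄)²/Sxx) for the confidence interval and t·s·sqrt(1 + 1/n + (x0 - x̄)²/Sxx) for the prediction interval, where s is the residual standard error. The function name and data are illustrative, and the critical value t is passed in by the caller (here 3.182, the two-sided 95% value for 3 degrees of freedom from a t-table) because the Python standard library has no t-distribution.

```python
import math


def interval_half_widths(x, y, x0, t_crit):
    """Return (y0, ci_half, pi_half) for a simple regression at x0.

    t_crit is the t critical value for n - 2 degrees of freedom,
    supplied by the caller (an assumption of this sketch).
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 points to estimate residual variance")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # Residual standard error with n - 2 degrees of freedom.
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_res / (n - 2))
    # Uncertainty of the fitted line at x0.
    line_term = 1 / n + (x0 - x_bar) ** 2 / s_xx
    ci_half = t_crit * s * math.sqrt(line_term)
    # The extra "1 +" adds the scatter of a single new observation.
    pi_half = t_crit * s * math.sqrt(1 + line_term)
    return a + b * x0, ci_half, pi_half


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y0, ci_half, pi_half = interval_half_widths(x, y, x0=3, t_crit=3.182)
print(f"mean response: {y0:.2f} +/- {ci_half:.2f}")
print(f"new observation: {y0:.2f} +/- {pi_half:.2f}")
```

The only difference between the two half-widths is the extra 1 under the square root, which is why the prediction interval cannot be narrower than the confidence interval at the same x0 and confidence level.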

Question 5

What assumptions does linear regression make?

Accepted Answer

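Linear regression assumes: linearity (the true relationship is linear), independence of observations, homoscedasticity (constant variance of residuals), and approximate normality of residuals. Results are also sensitive to severe outliers, and multiple regression additionally assumes no severe multicollinearity among predictors. The consequences of a violation depend on which assumption fails: a relationship that is not actually linear biases the coefficients, while non-constant variance or dependent observations mainly distort the standard errors and therefore produce misleading confidence intervals.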

Question 6

What is the difference between simple and multiple linear regression?

Accepted Answer

Simple linear regression has one independent variable and fits a line (y = a + bx). Multiple linear regression has two or more independent variables and fits a hyperplane (y = b0 + b1x1 + b2x2 + ...). Multiple regression can capture more complex relationships but requires more data and careful attention to multicollinearity and model selection.
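
As a rough sketch of the multiple-regression case, the function below builds the normal equations (XᵀX)b = Xᵀy and solves them by Gaussian elimination in plain Python. The function name and data are illustrative, and the sample y values were generated exactly from y = 1 + 2·x1 + 3·x2, so the fit should recover those coefficients. Production code would normally rely on a numerical library instead of hand-written elimination.

```python
def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk.

    rows is a list of predictor tuples, one per observation.
    """
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept b0
    k = len(X[0])
    # Augmented matrix [X'X | X'y] of the normal equations.
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Forward elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; X'X is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (A[i][k] - tail) / A[i][i]
    return b


# Illustrative data generated from y = 1 + 2*x1 + 3*x2 (no noise).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [9, 8, 19, 18, 29]
b0, b1, b2 = fit_multiple_regression(rows, y)
print(f"y = {b0:.2f} + {b1:.2f}*x1 + {b2:.2f}*x2")
```

The singularity check is where multicollinearity shows up in practice: if one predictor is an exact linear combination of the others, the normal equations have no unique solution.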

Question 7

Can I use this calculator for polynomial or nonlinear regression?

Accepted Answer

This calculator is designed for linear regression only, but you can fit polynomial models by entering transformed variables (e.g., x²) as additional predictors in multiple regression mode. True nonlinear regression requires iterative methods not included here.
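
The transformed-variable idea can be sketched as follows. This is not the calculator's own code, only an illustration with hypothetical function names: a quadratic model y = b0 + b1·x + b2·x² is still linear in its coefficients, so adding x² as a second predictor column lets an ordinary least-squares solver fit it. The sample data were generated exactly from y = 1 + 2x + 3x².

```python
def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    k = len(X[0])
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix columns are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (A[i][k] - tail) / A[i][i]
    return b


def fit_quadratic(x, y):
    """Fit y = b0 + b1*x + b2*x^2 by treating x^2 as a second predictor."""
    X = [[1.0, xi, xi ** 2] for xi in x]  # columns: intercept, x, x squared
    return least_squares(X, y)


# Illustrative data generated from y = 1 + 2x + 3x^2 (no noise).
x = [-2, -1, 0, 1, 2]
y = [9, 2, 1, 6, 17]
b0, b1, b2 = fit_quadratic(x, y)
print(f"y = {b0:.2f} + {b1:.2f}x + {b2:.2f}x^2")
```

In the calculator, the equivalent step is entering the x² values as a second predictor column by hand. Models that are nonlinear in their coefficients, such as y = a·exp(b·x), cannot be handled this way without a transformation or an iterative solver.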