Linear Regression

What is Linear Regression?

Linear regression is a fundamental supervised machine learning algorithm for predicting continuous values. It models the relationship between a dependent variable (the target) and one or more independent variables (the features) as a linear function: a straight line when there is one feature, and a plane or hyperplane when there are several.

Simple Linear Regression (One Variable)

The equation of a simple linear regression line is:

y = mx + b

Where:

  • y: Predicted output
  • x: Input feature
  • m: Slope (weight)
  • b: Intercept (bias)
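
As a quick illustration of the equation above, here is a minimal sketch that evaluates y = mx + b for one input. The function name `predict_simple` and the slope and intercept values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def predict_simple(x, m, b):
    """Return the predicted output y = m * x + b for a single input x."""
    return m * x + b


# Hypothetical slope and intercept, not fitted to any data.
slope = 2.0
intercept = 1.0

y_hat = predict_simple(3.0, slope, intercept)
print(y_hat)  # 2.0 * 3.0 + 1.0 = 7.0
```

Changing the slope changes how quickly the prediction grows with x, while changing the intercept shifts the whole line up or down.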

Multiple Linear Regression (Multiple Variables)

When there are multiple features:

y = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b

Here:

  • x_1, x_2, ..., x_n: Input features
  • w_1, w_2, ..., w_n: Weights (slopes)
  • b: Intercept
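
With several features, the weighted sum above is a dot product between the feature vector and the weight vector. The sketch below assumes NumPy and uses a hypothetical helper `predict_multi` with made-up weights; it is meant only to show the shape of the computation.

```python
import numpy as np


def predict_multi(X, w, b):
    """Return X @ w + b, one prediction per row (sample) of X."""
    return X @ w + b


# Hypothetical data: two samples with three features each.
X_demo = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
w_demo = np.array([0.5, -1.0, 2.0])  # one weight per feature
b_demo = 0.5

y_demo = predict_multi(X_demo, w_demo, b_demo)
# First sample:  0.5*1 - 1.0*2 + 2.0*3 + 0.5 = 5.0
# Second sample: 0.5*4 - 1.0*5 + 2.0*6 + 0.5 = 9.5
print(y_demo)
```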

How Linear Regression Works

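When the model is trained iteratively, the procedure is as follows. (Ordinary least squares also has a closed-form solution, which is what scikit-learn's LinearRegression computes.)

  1. Initialize the weights and bias (for example, to zeros or small random values).
  2. Predict outputs using the current weights and bias.
  3. Calculate the loss (error) using a loss function such as Mean Squared Error.
  4. Update the weights and bias using an optimization method such as Gradient Descent.
  5. Repeat steps 2 to 4 until the loss stops decreasing meaningfully, that is, until the model converges.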
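
The five steps above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not a production implementation: the function name `fit_gradient_descent`, the learning rate, the iteration cap, the stopping tolerance, and the toy data (generated from y = 2x + 1 with no noise) are all hypothetical choices.

```python
import numpy as np


def fit_gradient_descent(X, y, learning_rate=0.05, max_iterations=10_000,
                         tolerance=1e-12):
    """Fit y ≈ X @ w + b by batch gradient descent on the MSE loss."""
    n_samples, n_features = X.shape

    # Step 1: initialize weights and bias (zeros here, as an assumption).
    w = np.zeros(n_features)
    b = 0.0
    previous_loss = np.inf

    for _ in range(max_iterations):
        # Step 2: predict with the current parameters.
        y_pred = X @ w + b

        # Step 3: compute the MSE loss.
        error = y_pred - y
        loss = np.mean(error ** 2)

        # Step 5: stop once the loss has effectively stopped changing.
        if abs(previous_loss - loss) < tolerance:
            break
        previous_loss = loss

        # Step 4: move the parameters against the gradient of the MSE.
        grad_w = (2.0 / n_samples) * (X.T @ error)
        grad_b = (2.0 / n_samples) * np.sum(error)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

    return w, b


# Toy data generated from y = 2x + 1, so the fit should land near those values.
X_toy = np.array([[1.0], [2.0], [3.0], [4.0]])
y_toy = np.array([3.0, 5.0, 7.0, 9.0])

w_fit, b_fit = fit_gradient_descent(X_toy, y_toy)
print("weights:", w_fit, "bias:", b_fit)
```

The learning rate matters: if it is too large the updates overshoot and the loss grows, and if it is too small convergence takes many more iterations.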

Loss Function

Mean Squared Error (MSE):

MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

Where:

  • n: Number of samples
  • y_i: Actual value
  • \hat{y}_i: Predicted value
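
MSE is the average of the squared differences between the actual and predicted values. A minimal sketch of that calculation, using a hypothetical helper `mse_manual` and made-up numbers:

```python
import numpy as np


def mse_manual(y_true, y_pred):
    """Return the mean of the squared differences between y_true and y_pred."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)


# Hypothetical actual and predicted values.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]

# Squared errors are 1, 0 and 4, so the mean is 5 / 3.
mse_value = mse_manual(actual, predicted)
print(mse_value)
```

Because the errors are squared, one large miss contributes far more to the loss than several small ones.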

Evaluation Metrics

  • MSE (Mean Squared Error)
  • RMSE (Root Mean Squared Error)
  • MAE (Mean Absolute Error)
  • R² Score (Coefficient of Determination)
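
To make the four metrics concrete, the sketch below computes them by hand with NumPy. The helper `regression_metrics` and the sample values are hypothetical; in practice, scikit-learn's `sklearn.metrics` module provides equivalents such as `mean_squared_error`, `mean_absolute_error`, and `r2_score`.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for the given actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mse = np.mean(residuals ** 2)
    rmse = np.sqrt(mse)                # same units as the target
    mae = np.mean(np.abs(residuals))   # less sensitive to large errors than MSE

    # R² compares the model's squared error with that of always predicting
    # the mean of y_true. It is undefined when all y_true values are equal.
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical actual and predicted values.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For MSE, RMSE, and MAE, lower is better. For R², values closer to 1 indicate a better fit, and the score can be negative when the model does worse than predicting the mean.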

Python Code Example using scikit-learn

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import pandas as pd

# Example dataset
data = pd.DataFrame({
    'experience': [1, 2, 3, 4, 5],
    'salary': [30000, 35000, 50000, 55000, 60000]
})

X = data[['experience']]  # Feature
y = data['salary']        # Target

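# Train-test split. With only 5 rows, test_size=0.2 would leave a single test
# sample, for which R² is undefined, so 2 rows are held out instead.
# random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)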

# Model training
model = LinearRegression()
model.fit(X_train, y_train)

# Prediction
y_pred = model.predict(X_test)

# Evaluation
print("MSE:", mean_squared_error(y_test, y_pred))
print("R² Score:", r2_score(y_test, y_pred))

# Coefficients
print("Slope:", model.coef_)
print("Intercept:", model.intercept_)