Linear regression is one of the simplest machine learning algorithms used to model the relationship between a dependent variable and one or more independent variables. In this example, we’ll use TensorFlow to create a linear regression model to predict outcomes based on input data.
This tutorial is perfect for beginners and demonstrates TensorFlow’s flexibility in building machine-learning models.
What Is Linear Regression?
In its simplest form, with a single input, linear regression models the relationship between two variables by fitting a straight line to the data. It follows the equation:
y=mx+b
Where:
- y: Dependent variable (output)
- x: Independent variable (input)
- m: Slope of the line
- b: Intercept of the line
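To make the equation concrete, here is a minimal sketch in plain Python (no TensorFlow) that recovers m and b from a handful of points using the closed-form least-squares formulas. The function name `fit_line` and the sample points are illustrative assumptions, not part of the tutorial's code.

```python
def fit_line(xs, ys):
    """Return (m, b) of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: the fitted line passes through (mean_x, mean_y)
    b = mean_y - m * mean_x
    return m, b

# Hypothetical points that lie exactly on y = 3x + 2
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3 * x + 2 for x in xs]
m, b = fit_line(xs, ys)
print(m, b)
```

The rest of this tutorial reaches the same kind of answer iteratively with gradient descent, which is the approach that carries over to models without a closed-form solution.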
TensorFlow Implementation of Linear Regression
Step 1: Install TensorFlow
Ensure you have TensorFlow installed. Use the following command if it’s not already installed:
pip install tensorflow
Step 2: Import Libraries
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
Step 3: Generate Synthetic Data
We will generate random data to simulate the relationship y=3x+2.
# Generate random data
np.random.seed(42)
X = np.random.rand(100).astype(np.float32)
y = 3 * X + 2 + np.random.normal(0, 0.1, 100).astype(np.float32) # Add noise to data
Step 4: Define the Model
We will create a simple linear regression model using TensorFlow.
# Define trainable variables
m = tf.Variable(0.0)
b = tf.Variable(0.0)
# Define the linear regression function
def linear_model(X):
    return m * X + b
Step 5: Define the Loss Function
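The loss function measures how far the predictions are from the actual values. Here we use the mean squared error (MSE): the average of the squared differences between each actual value and its prediction.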
# Define the Mean Squared Error loss function
def loss_fn(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))
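As a quick hand check of what the mean squared error computes, the sketch below works through three made-up values in plain Python. The helper name `mse` and the numbers are assumptions chosen for illustration; it mirrors the arithmetic of the loss function rather than replacing it.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)

y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]
# Errors are -0.5, 0.0 and 1.0, so the squared errors are 0.25, 0.0 and 1.0
print(mse(y_true, y_pred))  # (0.25 + 0.0 + 1.0) / 3
```

Squaring makes every error positive and penalizes large misses more heavily than small ones.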
Step 6: Choose an Optimizer
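We’ll use TensorFlow’s Stochastic Gradient Descent (SGD) optimizer to minimize the loss. Because every training step in this tutorial uses all 100 data points at once, the updates amount to ordinary full-batch gradient descent.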
optimizer = tf.optimizers.SGD(learning_rate=0.1)
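For reference, the update this optimizer applies to our model can be written out explicitly. The symbols n (number of samples) and \eta (the learning rate, 0.1 above) are notation introduced here for illustration.

```latex
\begin{aligned}
L(m, b) &= \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2 \\
\frac{\partial L}{\partial m} &= -\frac{2}{n} \sum_{i=1}^{n} x_i \bigl(y_i - (m x_i + b)\bigr) \\
\frac{\partial L}{\partial b} &= -\frac{2}{n} \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr) \\
m &\leftarrow m - \eta \, \frac{\partial L}{\partial m}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
\end{aligned}
```

TensorFlow computes these partial derivatives automatically, so you never have to code them by hand.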
Step 7: Train the Model
We’ll iterate through multiple epochs to train the model and update the values of m and b.
# Training loop
epochs = 200
for epoch in range(epochs):
    with tf.GradientTape() as tape:
        predictions = linear_model(X)
        loss = loss_fn(y, predictions)
    gradients = tape.gradient(loss, [m, b])
    optimizer.apply_gradients(zip(gradients, [m, b]))
    if (epoch + 1) % 20 == 0:  # Print progress every 20 epochs
        print(f"Epoch {epoch + 1}: Loss = {loss.numpy():.4f}")
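To see what the gradient tape and optimizer are doing in each epoch, here is a rough plain-Python sketch of the same loop with the MSE gradients written by hand. It is not TensorFlow's implementation. The function name `train`, the noise-free data, and the choice of 1000 epochs are assumptions made so the example converges tightly.

```python
def train(xs, ys, learning_rate=0.1, epochs=1000):
    """Fit y = m*x + b by full-batch gradient descent on the MSE."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals y_i - (m*x_i + b) under the current parameters
        residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
        # Analytic gradients of the mean squared error
        grad_m = -2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = -2.0 / n * sum(residuals)
        # Step against the gradient, scaled by the learning rate
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

# Hypothetical noise-free data on y = 3x + 2, with x in [0, 1]
xs = [i / 10 for i in range(11)]
ys = [3 * x + 2 for x in xs]
m, b = train(xs, ys)
print(f"m = {m:.4f}, b = {b:.4f}")
```

With inputs confined to [0, 1], the loss surface is much flatter in one direction than the other, so convergence at this learning rate is slow. That is why the sketch runs more epochs than the 200 used in the tutorial.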
Step 8: Evaluate the Model
After training, you can evaluate the performance and visualize the results.
# Plot the data and the fitted line
plt.scatter(X, y, label='Data')
plt.plot(X, linear_model(X), color='red', label='Fitted Line')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
print(f"Trained Slope (m): {m.numpy():.4f}")
print(f"Trained Intercept (b): {b.numpy():.4f}")
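Besides the plot, a single number can summarize the quality of the fit. One common choice, not used in the original tutorial, is the coefficient of determination (R²). The sketch below computes it in plain Python; the function name `r_squared` and the sample values are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [2.0, 5.0, 8.0, 11.0]
perfect = [2.0, 5.0, 8.0, 11.0]    # predictions that match exactly
baseline = [6.5, 6.5, 6.5, 6.5]    # always predict the mean of y_true
print(r_squared(y_true, perfect))   # 1.0
print(r_squared(y_true, baseline))  # 0.0
```

With the arrays from this tutorial, something like `r_squared(y.tolist(), linear_model(X).numpy().tolist())` should work. A value near 1 indicates that the line explains most of the variation in y.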
Complete Code Example
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Generate random data
np.random.seed(42)
X = np.random.rand(100).astype(np.float32)
y = 3 * X + 2 + np.random.normal(0, 0.1, 100).astype(np.float32)
# Define trainable variables
m = tf.Variable(0.0)
b = tf.Variable(0.0)
# Define the linear regression function
def linear_model(X):
    return m * X + b
# Define the Mean Squared Error loss function
def loss_fn(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))
# Define optimizer
optimizer = tf.optimizers.SGD(learning_rate=0.1)
# Training loop
epochs = 200
for epoch in range(epochs):
    with tf.GradientTape() as tape:
        predictions = linear_model(X)
        loss = loss_fn(y, predictions)
    gradients = tape.gradient(loss, [m, b])
    optimizer.apply_gradients(zip(gradients, [m, b]))
    if (epoch + 1) % 20 == 0:
        print(f"Epoch {epoch + 1}: Loss = {loss.numpy():.4f}")
# Plot the data and the fitted line
plt.scatter(X, y, label='Data')
plt.plot(X, linear_model(X), color='red', label='Fitted Line')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
print(f"Trained Slope (m): {m.numpy():.4f}")
print(f"Trained Intercept (b): {b.numpy():.4f}")
Expected Output
- A scatter plot of the data with a fitted red line.
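- The trained values of m and b, which should come out close to 3 and 2, respectively. The match is only approximate because of the added noise and because 200 gradient steps may not fully converge.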
Key Takeaways
- TensorFlow's tf.GradientTape computes gradients automatically, so a model like linear regression can be implemented in a few lines.
- Gradient descent optimizes the model parameters by repeatedly nudging m and b in the direction that reduces the loss.
- Real-world datasets often require additional preprocessing steps, such as normalization.
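As an illustration of the normalization mentioned in the last takeaway, here is a minimal sketch of standardization (z-scoring) in plain Python. The function name `standardize` and the raw values are hypothetical. In practice you would compute the mean and standard deviation on the training data only and reuse them for new inputs.

```python
import math

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values], mean, std

# Hypothetical raw feature on a much larger scale than [0, 1]
raw = [1200.0, 1500.0, 1800.0, 2100.0]
scaled, mean, std = standardize(raw)
print(scaled)
```

Features on a large scale like this can make gradient descent diverge at a learning rate of 0.1, and rescaling them first is one common remedy.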