import pandas as pd
from sklearn import linear_model
import numpy as np
# Generate a synthetic dataset
np.random.seed(0) # For reproducibility
weights = np.random.randint(1500, 3000, 100) # Random weights between 1500kg and 3000kg
volumes = np.random.randint(1000, 2000, 100) # Random volumes between 1000cm³ and 2000cm³
co2_emissions = weights * 0.007 + volumes * 0.004 + np.random.normal(0, 10, 100) # CO2 emissions with some added noise
# Create a DataFrame
df = pd.DataFrame({
'Weight': weights,
'Volume': volumes,
'CO2': co2_emissions
})
# Linear Regression Model
X = df[['Weight', 'Volume']]
y = df['CO2']
regr = linear_model.LinearRegression()
regr.fit(X, y)
# Predict the CO2 emission of a car where the weight is 2300kg, and the volume is 1300cm³
# Pass a DataFrame with the training column names so scikit-learn does not warn about missing feature names
predictedCO2 = regr.predict(pd.DataFrame([[2300, 1300]], columns=['Weight', 'Volume']))
predictedCO2
array([20.8871586])
A coefficient is a factor that describes how the predicted value changes with one of the input variables. In this case, we can ask for the coefficient of weight against CO2, and of volume against CO2. The values we get tell us what would happen to the prediction if we increased or decreased one of the independent variables while holding the other fixed.
print(regr.coef_)
[0.00530465 0.00546989]
The result array holds the coefficient values of weight and volume, in the same order as the columns of X.
Weight: 0.00530465 Volume: 0.00546989
These values tell us that if the weight increases by 1 kg, the predicted CO2 emission increases by 0.00530465 g.
And if the engine size (Volume) increases by 1 cm³, the predicted CO2 emission increases by 0.00546989 g.
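To make the order of the coefficients explicit, they can be paired with the column names. The following is a minimal sketch that rebuilds the same synthetic dataset and model as above; the variable name `coefficients` is only illustrative. Because the data were generated with 0.007 for weight and 0.004 for volume plus noise, the fitted values are estimates and are not expected to match those numbers exactly.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Rebuild the same synthetic dataset as above (same seed, same formula)
np.random.seed(0)
weights = np.random.randint(1500, 3000, 100)
volumes = np.random.randint(1000, 2000, 100)
co2_emissions = weights * 0.007 + volumes * 0.004 + np.random.normal(0, 10, 100)
df = pd.DataFrame({'Weight': weights, 'Volume': volumes, 'CO2': co2_emissions})

X = df[['Weight', 'Volume']]
y = df['CO2']
regr = linear_model.LinearRegression()
regr.fit(X, y)

# Pair each coefficient with its column name so the order is unambiguous
coefficients = pd.Series(regr.coef_, index=X.columns, name='coefficient')
print(coefficients)

# The intercept is the predicted CO2 when both inputs are zero
print('Intercept:', regr.intercept_)
```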
That interpretation seems fair, but let's test it!
We have already predicted that if a car with a 1300 cm³ engine weighs 2300 kg, the CO2 emission will be approximately 20.89 g.
What if we increase the weight by 1000 kg (from 2300 to 3300)? What will the CO2 emission be?
Ans: 20.8871586 + (1000 * 0.00530465) = 26.1918086
# Pass a DataFrame with the training column names so scikit-learn does not warn about missing feature names
predictedCO2 = regr.predict(pd.DataFrame([[3300, 1300]], columns=['Weight', 'Volume']))
print(predictedCO2)
[26.19180874]
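The hand calculation and the model's prediction agree because a linear model predicts with intercept + coefficient × weight + coefficient × volume, so adding 1000 kg changes the prediction by exactly 1000 times the weight coefficient. The sketch below checks this numerically on the same synthetic data; the names `base`, `heavier`, and `manual_base` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Rebuild the same synthetic dataset and model as above
np.random.seed(0)
weights = np.random.randint(1500, 3000, 100)
volumes = np.random.randint(1000, 2000, 100)
co2_emissions = weights * 0.007 + volumes * 0.004 + np.random.normal(0, 10, 100)
df = pd.DataFrame({'Weight': weights, 'Volume': volumes, 'CO2': co2_emissions})

X = df[['Weight', 'Volume']]
y = df['CO2']
regr = linear_model.LinearRegression()
regr.fit(X, y)

# Two cars that differ only in weight (2300 kg vs 3300 kg)
base = pd.DataFrame([[2300, 1300]], columns=['Weight', 'Volume'])
heavier = pd.DataFrame([[3300, 1300]], columns=['Weight', 'Volume'])

pred_base = regr.predict(base)[0]
pred_heavier = regr.predict(heavier)[0]

# Reproduce the first prediction by hand: intercept + sum(coefficient * value)
manual_base = regr.intercept_ + regr.coef_ @ np.array([2300, 1300])

# The difference between the predictions should equal 1000 * weight coefficient
weight_coef = regr.coef_[0]
print('Difference in predictions:', pred_heavier - pred_base)
print('1000 * weight coefficient:', 1000 * weight_coef)
print('Manual vs model prediction:', manual_base, pred_base)
```

Any small mismatch between a hand calculation and the printed prediction comes from rounding the coefficient to eight decimal places.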