Lesson 2
Fitting Complex Curves with SciPy
Introduction to Complex Curves

Good job so far! Welcome to the next lesson of this course.

In the previous lesson, you were introduced to the concept of curve fitting using linear models. Now, it's time to expand upon this knowledge and delve into fitting more complex curves, starting with quadratic models. Quadratic models are essential because they help describe data patterns that are not linear, enabling you to model real-world situations more accurately, such as the trajectory of a projectile or economic growth trends.

Defining a Quadratic Model

A quadratic model is represented by the equation:

y = ax^2 + bx + c

Here:

  • a, b, and c are constants that determine the shape and position of the parabola.
  • x is the independent variable.

Let's define a quadratic model in Python. We'll create a function to describe our model:

Python
def quadratic_model(x, a, b, c):
    return a * x**2 + b * x + c

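This function takes four parameters, x, a, b, and c, and returns the value of y given by the quadratic equation. It serves as our mathematical model for curve fitting. The order matters: SciPy's curve_fit, which we use later in this lesson, expects the independent variable as the first argument, followed by the parameters to fit.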
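
Before fitting anything, it can help to confirm that the model behaves as expected on a few known inputs. The short check below is only an illustration: the coefficients 2, -3, and 5 and the name x_check are chosen here for the example. Because the function uses only arithmetic operators, it works on a whole NumPy array at once.

```python
import numpy as np

def quadratic_model(x, a, b, c):
    return a * x**2 + b * x + c

# Evaluate y = 2x^2 - 3x + 5 at x = 0, 1, 2 (illustrative values)
x_check = np.array([0.0, 1.0, 2.0])
y_check = quadratic_model(x_check, 2, -3, 5)

print(y_check)  # the values are 5, 4, and 7
```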

Creating Synthetic Data with Noise

To practice curve fitting, we need data. Synthetic data, which is generated to mimic real-world characteristics, is often used for this purpose. We'll also add some noise to simulate natural variability.

We'll use NumPy to generate data:

Python
import numpy as np

# Create a range of x values
x_data = np.linspace(-5, 5, num=40)

# Generate y values based on a known quadratic equation with added noise
y_data = 2 * x_data**2 - 3 * x_data + 5 + np.random.normal(size=x_data.size)

Here:

  • x_data is generated using np.linspace() to create 40 evenly spaced values between -5 and 5.
  • y_data is calculated using our quadratic formula with added Gaussian noise (np.random.normal(), which by default draws from a distribution with mean 0 and standard deviation 1) to simulate imperfections in measurement or data recording.
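
Because the noise is random, every run of the code above produces slightly different data. If you want repeatable results, one option is to use a seeded random generator. The sketch below is a variation on the lesson's code, not a replacement for it; the seed value 42 and the names rng and noise are arbitrary choices made for this example.

```python
import numpy as np

# A seeded generator returns the same "random" numbers on every run
rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

x_data = np.linspace(-5, 5, num=40)

# Gaussian noise with mean 0 and standard deviation 1
noise = rng.normal(loc=0.0, scale=1.0, size=x_data.size)
y_data = 2 * x_data**2 - 3 * x_data + 5 + noise

print(x_data.shape, y_data.shape)
```

Raising or lowering scale makes the data noisier or cleaner, which is a simple way to see how noise affects the fit.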

Applying Curve Fitting with SciPy

With our model and synthetic data ready, we can fit the quadratic model to the data using the curve_fit function from SciPy.

Python
from scipy.optimize import curve_fit

# Fit the model to the data
params, covariance = curve_fit(quadratic_model, x_data, y_data)

In this block:

  • curve_fit adjusts the parameters a, b, and c to minimize the sum of squared differences between the predicted and actual y values.
  • params contains the optimal parameters, while covariance is their estimated covariance matrix; its diagonal holds the variance of each parameter.
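
To make the two return values more concrete, the sketch below prints each fitted parameter together with a one-standard-deviation error taken from the square root of the covariance diagonal. It repeats the model and data setup so that it runs on its own; the seeded generator and the name std_errors are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import curve_fit

def quadratic_model(x, a, b, c):
    return a * x**2 + b * x + c

# Synthetic data as in the lesson, with a fixed seed for repeatability
rng = np.random.default_rng(seed=0)
x_data = np.linspace(-5, 5, num=40)
y_data = 2 * x_data**2 - 3 * x_data + 5 + rng.normal(size=x_data.size)

params, covariance = curve_fit(quadratic_model, x_data, y_data)

# The diagonal of the covariance matrix holds the parameter variances
std_errors = np.sqrt(np.diag(covariance))

for name, value, error in zip("abc", params, std_errors):
    print(f"{name} = {value:.3f} +/- {error:.3f}")
```

The fitted values should land close to the true coefficients 2, -3, and 5 used to generate the data, with small errors relative to the values themselves.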

Visualizing the Fitted Curve

Visualizing the data and fitted curve helps you understand the effectiveness of your model. We'll use Matplotlib for this purpose:

Python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Define the quadratic model
def quadratic_model(x, a, b, c):
    return a * x**2 + b * x + c

# Create synthetic data
x_data = np.linspace(-5, 5, num=40)
y_data = 2 * x_data**2 - 3 * x_data + 5 + np.random.normal(size=x_data.size)

# Fit the model to the data
params, covariance = curve_fit(quadratic_model, x_data, y_data)

# Plot the data and the fitted curve
plt.scatter(x_data, y_data, label='Data')
plt.plot(x_data, quadratic_model(x_data, *params), label='Fitted curve', color='red')
plt.legend()
plt.show()

Explanation:

  • plt.scatter() is used to plot the original noisy data points.
  • plt.plot() draws the fitted quadratic curve using the parameters found by curve_fit.
  • The legend distinguishes between the original data and the fitted curve.
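
A plot gives a visual impression of the fit, and a single number can complement it. The sketch below is an optional addition to the lesson's code: it computes the root-mean-square error (RMSE) of the residuals, that is, the typical distance between the data points and the fitted curve. The seed and the names residuals and rmse are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import curve_fit

def quadratic_model(x, a, b, c):
    return a * x**2 + b * x + c

# Synthetic data as in the lesson, with a fixed seed for repeatability
rng = np.random.default_rng(seed=0)
x_data = np.linspace(-5, 5, num=40)
y_data = 2 * x_data**2 - 3 * x_data + 5 + rng.normal(size=x_data.size)

params, covariance = curve_fit(quadratic_model, x_data, y_data)

# Residuals: observed values minus the values predicted by the fitted model
residuals = y_data - quadratic_model(x_data, *params)
rmse = np.sqrt(np.mean(residuals**2))

print(f"RMSE: {rmse:.3f}")
```

Since the noise here has a standard deviation of 1, an RMSE of roughly 1 suggests the model has captured the underlying curve and what remains is mostly noise.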

Extrapolating Data Using the Fitted Function

Once the curve fitting is complete, the quadratic model can be used to make predictions, including extrapolating data beyond the range of our original dataset. This is useful in scenarios where you need to predict future values or fill in gaps in your data.

For example, if we want to predict y values for x values outside the range -5 to 5, we can evaluate the model with our fitted parameters:

Python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Define the quadratic model
def quadratic_model(x, a, b, c):
    return a * x**2 + b * x + c

# Create synthetic data
x_data = np.linspace(-5, 5, num=40)
y_data = 2 * x_data**2 - 3 * x_data + 5 + np.random.normal(size=x_data.size)

# Fit the model to the data
params, covariance = curve_fit(quadratic_model, x_data, y_data)

# Define new x values for extrapolation
x_extrapolate = np.linspace(-10, 10, num=100)

# Use the fitted parameters to calculate extrapolated y values
y_extrapolated = quadratic_model(x_extrapolate, *params)

# Visualize the original data, fitted curve, and extrapolated data
plt.scatter(x_data, y_data, label='Data')
plt.plot(x_data, quadratic_model(x_data, *params), label='Fitted curve', color='red')
plt.plot(x_extrapolate, y_extrapolated, label='Extrapolated curve', color='green', linestyle='--')
plt.legend()
plt.show()

In this example:

  • We define a new range x_extrapolate from -10 to 10.
  • We compute y_extrapolated using the quadratic_model function with the optimal parameters obtained from curve_fit.
  • The plot includes the original data, fitted curve, and extrapolated curve to visualize how the model behaves outside the originally fitted range.
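
Because our data is synthetic, we know the true equation and can check how far the extrapolated predictions drift from it. The sketch below compares predictions at a few x values outside the fitted range with the noise-free values. The chosen points, the seed, and the names x_new, y_pred, and y_true are assumptions made for this example; with real data the true curve is not available for this kind of check.

```python
import numpy as np
from scipy.optimize import curve_fit

def quadratic_model(x, a, b, c):
    return a * x**2 + b * x + c

# Synthetic data as in the lesson, with a fixed seed for repeatability
rng = np.random.default_rng(seed=0)
x_data = np.linspace(-5, 5, num=40)
y_data = 2 * x_data**2 - 3 * x_data + 5 + rng.normal(size=x_data.size)

params, covariance = curve_fit(quadratic_model, x_data, y_data)

# Points outside the fitted range of -5 to 5 (illustrative choices)
x_new = np.array([-10.0, 8.0, 10.0])
y_pred = quadratic_model(x_new, *params)

# Noise-free values from the equation used to generate the data
y_true = 2 * x_new**2 - 3 * x_new + 5

for x, pred, true in zip(x_new, y_pred, y_true):
    print(f"x = {x:6.1f}  predicted = {pred:8.2f}  true = {true:8.2f}")
```

Small errors in the fitted coefficients are multiplied by x and x^2, so the gap between predicted and true values tends to grow the farther you move from the original data.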

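The resulting plot shows the original data points, the fitted curve in red, and the extrapolated curve as a dashed green line.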

Exploring Other Non-Linear Functions

The quadratic function is just one example of a non-linear model used in curve fitting. In practice, various other non-linear functions are employed to match real-world data:

  • Cubic Functions: y = ax^3 + bx^2 + cx + d, useful for curves with inflection points.
  • Exponential Functions: y = a \cdot e^{bx}, suitable for growth or decay processes.
  • Logarithmic Functions: y = a \cdot \log(x) + b, modeling rapid initial growth that levels off.
  • Trigonometric Functions: y = a \cdot \sin(bx + c), used for periodic processes.
  • Higher-Degree Polynomial Functions: additional terms let the curve follow data with more fluctuations.

Choosing the right function depends on the specific characteristics you want to capture in your data.
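
The same workflow carries over to these other models: define a function whose first argument is x, then pass it to curve_fit. As a rough sketch, here is an exponential fit on synthetic data. Everything specific in it is an assumption made for this example: the name exponential_model, the true values a = 1.5 and b = 0.8, the noise level, and the starting guess p0. Non-linear models like this one often benefit from a reasonable starting guess, which curve_fit accepts through its p0 argument.

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential_model(x, a, b):
    return a * np.exp(b * x)

# Synthetic exponential data with noise (all values are illustrative)
rng = np.random.default_rng(seed=1)
x_exp = np.linspace(0, 4, num=40)
y_exp = 1.5 * np.exp(0.8 * x_exp) + rng.normal(scale=0.5, size=x_exp.size)

# p0 is the initial guess for (a, b)
exp_params, exp_covariance = curve_fit(
    exponential_model, x_exp, y_exp, p0=(1.0, 0.5)
)

print("Fitted a, b:", exp_params)
```

The fitted values should come out close to 1.5 and 0.8. If you leave out p0, curve_fit starts every parameter at 1, which may or may not be close enough for a given model and dataset.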

Summary and Next Steps

Congratulations on reaching this milestone in your learning journey! You now understand how to build and fit complex curves using SciPy, specifically quadratic models. You practiced defining a model, generating synthetic data, fitting the model to data, and visualizing the results. These skills form the foundation for more advanced data modeling tasks.

As you finish this lesson and the course, remember to reflect on the concepts you've learned and take this knowledge into the practice exercises where you'll gain applied experience. Thank you for your dedication, and well done on reaching the end of the course!
