Welcome! Today's focus is on t-SNE parameter tuning using Scikit-learn. This lesson covers the critical t-SNE parameters, the practice of tuning them, and the impact tuning has on data visualization outcomes.
Before delving into parameter tuning, let's quickly set up the dataset:
Here's a basic setup:
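The original setup code is not shown here, so the following is a minimal sketch assuming the scikit-learn `digits` dataset, a common choice for t-SNE demos:

```python
from sklearn.datasets import load_digits

# Load the digits dataset: 1,797 samples of 8x8 grayscale images,
# flattened into 64 features each
digits = load_digits()
X, y = digits.data, digits.target
print(X.shape)  # (1797, 64)
```

Any dataset with a moderate number of samples and features works here; `digits` is simply small enough that t-SNE runs quickly.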
We will now delve into the key parameters in Scikit-learn's t-SNE. The first is perplexity, which can be loosely interpreted as the number of effective nearest neighbors. It strikes a balance between preserving the local and global structure of the data.
The next parameter is early_exaggeration. It governs how tight natural clusters are in the embedded space; higher values tend to make clusters denser.
The final parameter, learning_rate, modulates the step size of the gradient updates during the optimization process.
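The three parameters above map directly onto keyword arguments of `sklearn.manifold.TSNE`. Here is a small sketch instantiating t-SNE with explicit values (the data is synthetic, chosen only so the example runs quickly):

```python
import numpy as np
from sklearn.manifold import TSNE

# Small synthetic dataset so the example runs fast (hypothetical data)
rng = np.random.RandomState(42)
X = rng.rand(100, 10)

# Instantiate t-SNE, setting the three key parameters explicitly
tsne = TSNE(
    n_components=2,
    perplexity=30.0,          # roughly the number of effective nearest neighbors
    early_exaggeration=12.0,  # how tightly clusters form in the embedding
    learning_rate=200.0,      # gradient step size during optimization
    random_state=42,
)
X_embedded = tsne.fit_transform(X)
print(X_embedded.shape)  # (100, 2)
```

Note that perplexity must be smaller than the number of samples, which is why a value of 30 pairs safely with 100 points.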
Now that our dataset is prepared, we have the freedom to adjust the t-SNE parameters and observe the visual impact.
We can compare the results of t-SNE with default parameters and with custom ones:
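The comparison code is not shown here, so the sketch below assumes the `digits` dataset and picks one arbitrary custom setting for illustration; the specific custom values are not from the original lesson:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

digits = load_digits()
X, y = digits.data[:500], digits.target[:500]  # subsample for speed

# Default parameters vs. one custom configuration (illustrative values)
tsne_default = TSNE(n_components=2, random_state=42)
tsne_custom = TSNE(n_components=2, perplexity=5, early_exaggeration=25,
                   learning_rate=500, random_state=42)

emb_default = tsne_default.fit_transform(X)
emb_custom = tsne_custom.fit_transform(X)

# Plot the two embeddings side by side, colored by digit label
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, emb, title in [(axes[0], emb_default, "Default parameters"),
                       (axes[1], emb_custom, "Custom parameters")]:
    ax.scatter(emb[:, 0], emb[:, 1], c=y, cmap="tab10", s=10)
    ax.set_title(title)
fig.savefig("tsne_comparison.png")
```

A low perplexity like 5 emphasizes very local structure, so the custom plot typically shows more fragmented clusters than the default.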
Note: your plot may look different due to version differences in the libraries.
In conclusion, mastering Scikit-learn's t-SNE involves understanding and effectively adjusting its tunable parameters. In this lesson, we walked through parameter tuning in Scikit-learn's t-SNE, examined the impact of its key parameters, and experimented with different parameter settings. Now, gear up for hands-on practice to couple theory with application. It's time to practice and excel in t-SNE!
