Regression
-
Confusion Matrix
Previously we saw a logistic regression model that can predict grape variety from various measurements. The question arose of what kind of mistakes it makes, if any. We can figure that out using a confusion matrix. Let’s look at the program again, but this time we’ll generate a confusion matrix that shows…
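To make the idea concrete, here is a minimal sketch in plain Python that counts (actual, predicted) pairs into a grid. It is not the program from the post: the function name and the "A"/"B" labels are made up for illustration, and the layout (rows for actual classes, columns for predicted classes) is an assumption.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Count outcomes: rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels, invented for this sketch (not taken from grapes.csv).
labels = ["A", "B"]
actual = ["A", "A", "B", "B", "B"]
predicted = ["A", "B", "B", "B", "A"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

The diagonal holds the correct predictions; everything off the diagonal is a mistake, and its position tells us which class was confused with which.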
-
Normalizing Data
In the last post we looked at logistic regression and I commented that we’re using data measured on different scales. For example, in our grapes.csv data, weight is measured in grams while diameter is measured in millimetres. This raises the question of whether this is valid. Can machine learning models cope with data series measured…
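One common way to put columns on a comparable scale is standardization: subtract the column mean and divide by its standard deviation. The sketch below uses only the standard library. The numbers are invented (they are not from grapes.csv), and using the population standard deviation is an assumption chosen here for simplicity.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to have mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Made-up measurements on very different scales.
weights_g = [4.0, 6.0, 8.0]
diameters_mm = [10.0, 20.0, 30.0]

scaled_weights = standardize(weights_g)
scaled_diameters = standardize(diameters_mm)
print(scaled_weights)
print(scaled_diameters)
```

After scaling, both columns are expressed in "standard deviations from the mean", so the original units no longer matter.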
-
Logistic Regression
Logistic regression is used when the output is categorical instead of continuous. Typically we’ll have two possible outcomes or classes/categories, and we’re trying to figure out which one of the two our samples belong to, based on some predictor variables. However, the Scikit-learn logistic regression model can handle more than two classes. As an example…
-
Polynomial Regression
You may know that a polynomial can be used to fit curves. A polynomial equation looks like this, for example: The equation contains powers of x; the first term, 3, is a constant and may be thought of as multiplying x to the power of zero. Then we have further coefficients of x, x squared,…
-
Multiple Linear Regression
We’ve seen examples of using linear regression to fit a straight line to data points, but we can also use linear regression to fit a flat surface (a plane) to multi-dimensional data. We’re still trying to predict or approximate the value of one particular variable, but we use multiple variables to make the prediction. An…
-
Train/Test Splitting
Previously we saw a simple example of linear regression using scikit-learn. In that example we trained our model on all of our data, then examined how closely the “predictions” made by the model fit the actual data. However, what we’d really like to know is how good our model really is at making predictions about…
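The idea of holding some data back can be sketched in a few lines of plain Python. This is not scikit-learn's own splitting function: `split_rows`, its parameters, and the default test fraction are hypothetical names and values chosen for this illustration.

```python
import random


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows, then hold back a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(8))  # stand-in for eight data samples
train, test = split_rows(rows)
print("train:", train)
print("test:", test)
```

The model would be fitted on `train` only, and `test` is then used to see how well it predicts samples it has never seen.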
-
R Squared: What Is It?
If we fit a line to some data, one way to measure the “goodness of fit” is to use a measure known as R squared. However, this isn’t the full story, so it’s important to use other techniques as well. For example, if your model diverges from the data at one end, and that’s the…
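For reference, R squared is usually defined as one minus the ratio of the residual sum of squares to the total sum of squares. Here is a small sketch of that calculation in plain Python; the function name and the sample numbers are invented for illustration.

```python
def r_squared(actual, predicted):
    """Return 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up values: the predictions are slightly off at both ends.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]

score = r_squared(actual, predicted)
print(score)
```

A score of 1 means the predictions match the data exactly, while a score of 0 means the model does no better than always predicting the mean.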
-
Linear Regression with Scikit-Learn
Regression basically means fitting lines or curves to data. We can also fit surfaces to higher-dimensional data. By doing this, we end up with a simplified model of our data. This can be useful for making predictions about future data, or for discerning the mathematical laws that govern how the data was generated. In this post we’ll…