Classifying Irises with a Neural Network

Artificial Neural Networks (ANNs) are loosely inspired by the structure of neurons in the human brain, although they don’t resemble them very closely.

Each neuron multiplies its inputs by a set of “weights”. The weighted inputs are then summed and passed through a non-linear function to produce the neuron’s output.
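To make that concrete, here is a minimal NumPy sketch of a single neuron. The function names, the input and weight values, and the bias term are illustrative assumptions rather than anything taken from the program in this post.

```python
import numpy as np

def relu(z):
    # Non-linear activation: negative values become 0, others pass through
    return np.maximum(z, 0.0)

def neuron(inputs, weights, bias=0.0):
    # Weighted sum of the inputs (plus an assumed bias term), then the activation
    return relu(np.dot(inputs, weights) + bias)

# Hypothetical values: four inputs, like the four iris measurements
inputs = np.array([0.5, -1.0, 2.0, 0.1])
weights = np.array([0.4, 0.3, -0.2, 1.0])

output = neuron(inputs, weights, bias=0.5)
print(output)  # weighted sum is -0.4, plus bias 0.5, so roughly 0.1
```

Without the bias the weighted sum here is negative, so the same neuron would output 0.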

Neurons are then connected together, with the output of one neuron becoming the input of another, typically in layers.
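As a rough sketch of what "connected in layers" means, the snippet below pushes one input through two small layers using plain NumPy. The layer sizes (4 inputs, 5 hidden neurons, 3 outputs), the random weights and the helper names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def identity(z):
    return z

def dense(x, W, b, activation):
    # Every neuron in the layer sees every input, so a whole layer
    # is one matrix multiplication followed by the activation
    return activation(x @ W + b)

# Hypothetical sizes: 4 inputs -> 5 hidden neurons -> 3 outputs
W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)

x = np.array([0.5, -1.0, 2.0, 0.1])
hidden = dense(x, W1, b1, relu)            # outputs of the first layer...
output = dense(hidden, W2, b2, identity)   # ...become inputs of the second

print(hidden.shape, output.shape)
```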

Let’s use an ANN to classify irises. We’ll start with a complete, working program; in subsequent posts we’ll break each part down further, to develop an understanding of how it all works.

First, some important notes on the program below.

  • You will need to install the keras and tensorflow packages using pip. While tensorflow is not used explicitly in the program, keras uses it by default. It’s used to perform matrix multiplications efficiently.
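  • It’s very important to scale your data. Features should typically lie in a small, similar range, for example between 0 and 1 or, as with the StandardScaler used here, centred on 0 with a standard deviation of 1. With much larger values, training can become unstable and overflows can occur in the network.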
  • Our desired output is the target value (0, 1 or 2) that represents the iris species. The categorical_crossentropy loss used here expects this as one-hot vectors, so 0 becomes 100, 1 becomes 010 and 2 becomes 001.

    Each vector consists of 0’s and exactly one 1, whose position indicates the value. We can think of the mapping like this:
    0 => 100
    1 => 010
    2 => 001

    To format the data like this, we use to_categorical.
  • We use a Sequential model to organise the layers of the network.
  • A layer in which every neuron is connected to every output of the previous layer is referred to as a Dense layer. We’ve added only two layers here: a hidden layer of 500 neurons and an output layer of 3 neurons, one per species.
  • Each Dense layer has an activation function. This is the non-linear function mentioned above. ReLU is a simple function that doesn’t change values greater than 0, but maps values less than zero to zero.
  • The model needs some kind of loss function to determine how far away its output is from the desired output. Here we use categorical_crossentropy.
  • We also need some technique or optimizer for systematically adjusting the weights, which is how the ANN learns. Here we’re using adam.
  • We then present the dataset repeatedly to the network. Each run-through of the training data is called an epoch. Here we have 50 epochs.
  • We’re using the numpy argmax function to translate back from one-hot vector form to an actual number.
  • The second layer has an activation called softmax. This is typically used only on output layers. After running through softmax, the values output from the final layer all lie between 0 and 1 and add up to one. This enables us to view the output of the network as a series of probabilities, representing in this case predictions of the iris species. We can simply take the highest probability as representing the predicted species.
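Before the full program, here is a small NumPy-only sketch of a few of the ideas in the notes above: scaling, one-hot encoding, softmax and argmax. The helper functions and sample values are hypothetical stand-ins written for illustration; the program itself uses StandardScaler, to_categorical and Keras's built-in softmax instead.

```python
import numpy as np

def standardize(X):
    # Rescale each column to zero mean and unit standard deviation
    return (X - X.mean(axis=0)) / X.std(axis=0)

def one_hot(labels, num_classes):
    # Row i has a single 1 in column labels[i]
    return np.eye(num_classes)[labels]

def softmax(z):
    # Subtracting the max is a common trick to avoid overflow in exp
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical measurements: 3 samples, 2 features on very different scales
X = np.array([[5.1, 200.0], [6.3, 150.0], [4.8, 310.0]])
X_scaled = standardize(X)

labels = np.array([0, 1, 2])
encoded = one_hot(labels, 3)
print(encoded)                        # rows: 100, 010, 001

raw_outputs = np.array([[2.0, 1.0, 0.1]])   # hypothetical final-layer values
probs = softmax(raw_outputs)
print(probs, probs.sum())             # the probabilities add up to 1

# argmax turns one-hot vectors or probabilities back into class numbers
print(np.argmax(encoded, axis=1))     # [0 1 2]
print(np.argmax(probs, axis=1))       # [0]
```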
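
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import accuracy_score
import numpy as np

# Load the iris data: four measurements per flower, target is 0, 1 or 2
iris = load_iris(as_frame=True)
X = iris['data']
y = iris['target']

# Keep 70% of the data for training and 30% for testing
(X_train, X_test, y_train, y_test) = train_test_split(X, y, shuffle=True, train_size=0.7)

# Scale the features; fit the scaler on the training data only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Convert the targets to one-hot vectors
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Hidden layer of 500 ReLU neurons, then a 3-neuron softmax output layer
model = Sequential()
model.add(Dense(500, input_dim=4, activation='relu'))
model.add(Dense(3, activation='softmax'))

# metrics is passed as a list; newer Keras versions reject a bare string
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Present the training data to the network 50 times
model.fit(X_train, y_train, epochs=50)

# Predicted probabilities for each species, one row per test sample
y_predicted = model.predict(X_test)

# summary() prints the table itself and returns None, hence the "None" in the output
print(model.summary())

# Translate one-hot vectors and probabilities back to class numbers, then compare
print("Score:", accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_predicted, axis=1)))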
Epoch 1/50
4/4 [==============================] - 1s 3ms/step - loss: 1.0384 - accuracy: 0.5714
Epoch 2/50
4/4 [==============================] - 0s 1ms/step - loss: 0.8743 - accuracy: 0.8095
Epoch 3/50
4/4 [==============================] - 0s 2ms/step - loss: 0.7469 - accuracy: 0.8286
Epoch 4/50
4/4 [==============================] - 0s 2ms/step - loss: 0.6506 - accuracy: 0.8286
Epoch 5/50
4/4 [==============================] - 0s 2ms/step - loss: 0.5803 - accuracy: 0.8286
Epoch 6/50
4/4 [==============================] - 0s 2ms/step - loss: 0.5247 - accuracy: 0.8381
Epoch 7/50
4/4 [==============================] - 0s 2ms/step - loss: 0.4832 - accuracy: 0.8381
Epoch 8/50
4/4 [==============================] - 0s 2ms/step - loss: 0.4493 - accuracy: 0.8381
Epoch 9/50
4/4 [==============================] - 0s 2ms/step - loss: 0.4221 - accuracy: 0.8381
Epoch 10/50
4/4 [==============================] - 0s 2ms/step - loss: 0.3981 - accuracy: 0.8381
Epoch 11/50
4/4 [==============================] - 0s 1ms/step - loss: 0.3780 - accuracy: 0.8381
Epoch 12/50
4/4 [==============================] - 0s 2ms/step - loss: 0.3615 - accuracy: 0.8476
Epoch 13/50
4/4 [==============================] - 0s 2ms/step - loss: 0.3460 - accuracy: 0.8571
Epoch 14/50
4/4 [==============================] - 0s 2ms/step - loss: 0.3318 - accuracy: 0.8571
Epoch 15/50
4/4 [==============================] - 0s 2ms/step - loss: 0.3170 - accuracy: 0.8476
Epoch 16/50
4/4 [==============================] - 0s 2ms/step - loss: 0.3060 - accuracy: 0.8667
Epoch 17/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2941 - accuracy: 0.8952
Epoch 18/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2843 - accuracy: 0.9048
Epoch 19/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2743 - accuracy: 0.9048
Epoch 20/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2633 - accuracy: 0.9048
Epoch 21/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2533 - accuracy: 0.9333
Epoch 22/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2434 - accuracy: 0.9429
Epoch 23/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2348 - accuracy: 0.9429
Epoch 24/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2255 - accuracy: 0.9429
Epoch 25/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2177 - accuracy: 0.9524
Epoch 26/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2100 - accuracy: 0.9524
Epoch 27/50
4/4 [==============================] - 0s 2ms/step - loss: 0.2027 - accuracy: 0.9619
Epoch 28/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1955 - accuracy: 0.9619
Epoch 29/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1873 - accuracy: 0.9619
Epoch 30/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1803 - accuracy: 0.9619
Epoch 31/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1734 - accuracy: 0.9619
Epoch 32/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1675 - accuracy: 0.9619
Epoch 33/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1612 - accuracy: 0.9619
Epoch 34/50
4/4 [==============================] - 0s 1ms/step - loss: 0.1563 - accuracy: 0.9619
Epoch 35/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1510 - accuracy: 0.9714
Epoch 36/50
4/4 [==============================] - 0s 1ms/step - loss: 0.1469 - accuracy: 0.9714
Epoch 37/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1420 - accuracy: 0.9714
Epoch 38/50
4/4 [==============================] - 0s 1ms/step - loss: 0.1375 - accuracy: 0.9714
Epoch 39/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1346 - accuracy: 0.9714
Epoch 40/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1302 - accuracy: 0.9714
Epoch 41/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1266 - accuracy: 0.9714
Epoch 42/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1227 - accuracy: 0.9714
Epoch 43/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1192 - accuracy: 0.9714
Epoch 44/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1162 - accuracy: 0.9714
Epoch 45/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1133 - accuracy: 0.9714
Epoch 46/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1104 - accuracy: 0.9714
Epoch 47/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1085 - accuracy: 0.9714
Epoch 48/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1063 - accuracy: 0.9714
Epoch 49/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1033 - accuracy: 0.9714
Epoch 50/50
4/4 [==============================] - 0s 2ms/step - loss: 0.1013 - accuracy: 0.9714
2/2 [==============================] - 0s 2ms/step
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 500)               2500      
                                                                 
 dense_1 (Dense)             (None, 3)                 1503      
                                                                 
=================================================================
Total params: 4,003
Trainable params: 4,003
Non-trainable params: 0
_________________________________________________________________
None
Score: 0.9555555555555556

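When the ANN has finished training, in this run it classifies the irises in the test data segment with about 95.6% accuracy. The train/test split and the initial weights are random, so the exact figure will vary from run to run.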
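The "Param #" column in the model summary above can be checked by hand. Assuming each Dense layer has one weight per input per neuron plus one bias per neuron (the Keras default), a small sketch with a hypothetical helper reproduces the numbers:

```python
def dense_param_count(n_inputs, n_neurons):
    # One weight per (input, neuron) pair, plus one bias per neuron
    return n_inputs * n_neurons + n_neurons

hidden_params = dense_param_count(4, 500)    # 4 features -> 500 neurons
output_params = dense_param_count(500, 3)    # 500 neurons -> 3 species
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 2500 1503 4003
```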

The loss that we see in the output here is the value of the categorical cross-entropy loss function. It measures how far the output of the network is from the desired output, and is used to train the network. As the network is progressively trained and its output improves, we hope to see the loss decreasing, as it does here from about 1.04 in the first epoch to about 0.10 in the last.
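As a rough illustration of what this loss measures, the sketch below computes categorical cross-entropy for a single sample in NumPy. The function is a simplified stand-in, not Keras's implementation, and the probabilities and the clipping value eps are made-up choices for the example.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice
    y_pred = np.clip(y_pred, eps, 1.0)
    # Per-sample loss: minus the log of the probability given to the true class
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])           # the true species is class 1
confident = np.array([[0.05, 0.90, 0.05]])     # hypothetical good prediction
unsure = np.array([[0.40, 0.30, 0.30]])        # hypothetical poor prediction

loss_confident = categorical_crossentropy(y_true, confident)[0]
loss_unsure = categorical_crossentropy(y_true, unsure)[0]
print(loss_confident, loss_unsure)   # roughly 0.105 and 1.204
```

The more probability the network puts on the correct species, the closer the loss gets to zero. The figure reported per epoch during training is an average of such per-sample values.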

ANNs are a supervised machine learning technique. In other words, we present the ANN with the input data and the desired output at the same time. Hopefully the ANN will learn to generalise the pattern that links input to output.
