Creating Numpy Arrays

Numpy is an important Python package, especially useful for machine learning and artificial intelligence work. “Numpy” is short for “numerical Python”. The package lets you efficiently create and process very large arrays of numbers.

Before you start, you’ll likely need to install Numpy by running pip install numpy from the command line.

It’s best to do this from an activated Python virtual environment, to ensure you install it for the correct version of Python.

Now we can create Numpy arrays.

The simplest way to do this is to pass a list of numbers (or another sequence, such as a tuple) to the np.array function.

import numpy as np

values = np.array([1, 5, 9, 13])

print(values)
[ 1  5  9 13]
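
As a small sketch of which inputs work, the example below builds the same kind of array from a list and from a tuple. It also shows one option for a generator, np.fromiter, which needs the dtype stated up front. The variable names are just for illustration.

```python
import numpy as np

# A list and a tuple are both sequences, so np.array accepts either
from_list = np.array([1, 5, 9, 13])
from_tuple = np.array((1, 5, 9, 13))

# A generator is not a sequence; np.fromiter reads it item by item
squares = np.fromiter((n * n for n in range(4)), dtype=int)

print(from_list)
print(from_tuple)
print(squares)
```

Passing a generator straight to np.array does not give a one-dimensional array of numbers, which is why np.fromiter is used for that case here.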

Specifying the Data Type

You can specify the type of the numbers in the array with the dtype argument, which is useful if you’re particularly concerned about memory usage.

import numpy as np

values = np.array([1, 5, 9, 13], dtype='int16')
print(values)

values = np.array([1.2, 5.0, 9, 13.12], dtype='float32')
print(values)
[ 1  5  9 13]
[ 1.2   5.    9.   13.12]
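
To see the effect on memory, you can compare the itemsize and nbytes attributes of two arrays holding the same values. This is only a sketch; int64 is chosen here as the comparison type, and the variable names are illustrative.

```python
import numpy as np

small = np.array([1, 5, 9, 13], dtype='int16')
large = np.array([1, 5, 9, 13], dtype='int64')

# itemsize is the number of bytes per element;
# nbytes is the total size of the array's data
print(small.dtype, small.itemsize, small.nbytes)
print(large.dtype, large.itemsize, large.nbytes)
```

The int16 array uses 2 bytes per value (8 in total), while the int64 array uses 8 bytes per value (32 in total). The trade-off is range: an int16 can only hold values from -32768 to 32767.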

Other Ways to Create a Numpy Array

Numpy has many other functions for creating arrays. You can create an array of zeros, ones, random numbers, and so on.

import numpy as np

values = np.zeros(5)
print(values)

values = np.ones(4)
print(values)

# Ten random integers: 0 or 1 or 2
values = np.random.randint(0, 3, size=10)
print(values)

# Four random floats, from 0 up to (but not including) 1
values = np.random.rand(4)
print(values)

# Four random floats drawn from a normal distribution 
# with mean 0, variance 1
values = np.random.randn(4)
print(values)
[0. 0. 0. 0. 0.]
[1. 1. 1. 1.]
[0 1 0 0 2 1 2 2 2 0]
[0.31559324 0.27881056 0.55191776 0.86837319]
[-1.87270017 -1.30553685 -0.0393753  -1.04951064]
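
The random values above change on every run. If you need repeatable results, one option is Numpy’s Generator interface, created with np.random.default_rng and a fixed seed. The sketch below is an alternative to the functions used above, not a replacement for them, and the seed value 42 is an arbitrary choice.

```python
import numpy as np

# Any fixed integer seed gives the same values on every run
rng = np.random.default_rng(42)

ints = rng.integers(0, 3, size=10)   # ten random integers: 0 or 1 or 2
floats = rng.random(4)               # four random floats, from 0 up to 1
normals = rng.standard_normal(4)     # four draws with mean 0, variance 1

print(ints)
print(floats)
print(normals)
```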

Multidimensional Numpy Arrays

You can easily create multi-dimensional arrays in Numpy. The syntax for doing this varies slightly depending on the method you’re using.

import numpy as np

# 2x3 array of zeros
values = np.zeros(shape=(2, 3))
print(values)

print() # Blank line

# 2x4 array of floats from a normal
# distribution
values = np.random.randn(2, 4)
print(values)

print() # Blank line

# 4x4 array of random integers from 0-9
values = np.random.randint(0, 10, size=(4,4))
print(values)
[[0. 0. 0.]
 [0. 0. 0.]]

[[-0.17166863 -1.29735528  1.11337938  0.28045624]
 [-0.89533174  1.53091197  1.32427074 -1.11789083]]

[[1 6 7 3]
 [3 7 4 9]
 [8 1 6 9]
 [8 5 5 7]]
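
The difference in syntax is in how each function receives the dimensions. A quick way to check that you got the array you intended is to look at its shape and ndim attributes, as in this sketch (variable names are illustrative):

```python
import numpy as np

# np.zeros takes the shape as a single tuple
zeros = np.zeros((2, 3))

# np.random.randn takes each dimension as a separate argument
normals = np.random.randn(2, 4)

# np.random.randint takes the shape through the size keyword
ints = np.random.randint(0, 10, size=(4, 4))

for name, arr in [("zeros", zeros), ("normals", normals), ("ints", ints)]:
    print(name, arr.shape, arr.ndim)
```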

Linspace and Arange

The linspace function lets you generate evenly-spaced samples.

The arange function lets you generate values based on a step size.

  • linspace is most useful when you want to specify how many values you need in your array
  • arange is most useful when you care primarily about the step size.

Here are two ways to generate the sequence 0,2,4,6,8,10.

import numpy as np
values = np.linspace(0, 10, 6)
print(values)

values = np.arange(0, 11, 2)
print(values)
[ 0.  2.  4.  6.  8. 10.]
[ 0  2  4  6  8 10]

Notice that if you’re thinking about step size, linspace isn’t very intuitive. Going from 0 to 10 in steps of 2 divides the range into five equal intervals, so you might think you need to pass the arguments (0, 10, 5) to the linspace function.

These arguments specify the start, end and number of samples.

But in fact you need (0, 10, 6), because both the start and end points are included in the result. Five intervals have six boundary points, so six values are needed, not five.

linspace generates floating point values.
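
If you want to check the step size linspace actually used, or leave out the end point, the retstep and endpoint parameters can help. This is a brief sketch with illustrative variable names:

```python
import numpy as np

# retstep=True also returns the spacing between samples
values, step = np.linspace(0, 10, 6, retstep=True)
print(values, step)

# endpoint=False leaves out the end value, so five samples
# are enough to get a step of 2
half_open = np.linspace(0, 10, 5, endpoint=False)
print(half_open)
```

The first call reports a step of 2.0, and the second produces 0, 2, 4, 6, 8 as floating point values.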

With arange, you might think that to generate this sequence you need (0, 10, 2), which are the start, end and step size. But in fact the end point is not included in the values generated, so the end of the range must be greater than 10 (and no greater than 12, or 12 would be included too). That’s why the example uses 11.

Here, arange generates integer values. But unlike the range function, you can use floating point arguments with arange, and then you’ll get floating point values generated.
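
Here is a short sketch of that difference, using a step of 0.25 as an arbitrary example value:

```python
import numpy as np

ints = np.arange(0, 11, 2)        # integer arguments give integers
floats = np.arange(0, 1, 0.25)    # a float step gives floats

print(ints.dtype, ints)
print(floats.dtype, floats)
```

The second array holds 0, 0.25, 0.5 and 0.75; as before, the end value 1 is left out. For steps such as 0.1 that can’t be represented exactly in floating point, the Numpy documentation suggests linspace is often the better choice, since rounding can change how many values arange returns.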
