Intro
NumPy is the most popular Python library for mathematical computing. It offers a wide range of mathematical tools, including multi-dimensional arrays and matrices, mathematical functions, random number generators, and a lot more.
One of the fundamental tools in NumPy is the ndarray, an N-dimensional array. Today, we're going to create ndarrays filled with values in a given range, using the np.arange() function.
Parameters and Return
numpy.arange([start, ]stop, [step, ]dtype=None)
Returns evenly spaced values within a given interval where:
- start is a number (integer or real) from which the array starts. It is optional and is 0 by default.
- stop is a number (integer or real) at which the array ends; it is not included in the array.
- step is a number that sets the spacing between consecutive values in the array. It is optional and is 1 by default.
- dtype is the type of the output array's elements. It is None by default, in which case NumPy infers the type from the other arguments.
The function returns an ndarray of evenly spaced values. For floating-point arguments, the array's length is ceil((stop - start)/step).
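The length rule can be checked directly. The following is a minimal sketch; the values 0.0, 1.0 and 0.25 are arbitrary choices for illustration, picked because they are exactly representable as floats:

```python
import math

import numpy as np

# Arbitrary illustrative values (an assumption, not from the signature above)
start, stop, step = 0.0, 1.0, 0.25

values = np.arange(start, stop, step)
expected_length = math.ceil((stop - start) / step)

print(values.tolist())               # [0.0, 0.25, 0.5, 0.75]
print(len(values), expected_length)  # 4 4
```

With steps that are not exactly representable in binary, such as 0.1, rounding can shift the length by one, so it is safer not to rely on the last element in that case.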
np.arange() by Example
Importing NumPy
To start working with NumPy, we need to import it, as it's an external library:
import numpy as np
If it isn't installed, you can easily install it via pip:
$ pip install numpy
All-Argument np.arange()
Let's see how arange() works with all of its arguments. For instance, say we want a sequence that starts at 0, stops at 10, has a step size of 2, and contains integers.
In a Python environment, or REPL, let's generate a sequence in a range:
>>> result_array = np.arange(start=0, stop=10, step=2, dtype=int)
The result is an ndarray containing the generated elements:
>>> result_array
array([0, 2, 4, 6, 8])
It's worth noting that the stop element isn't included, while the start element is. Hence we have a 0 but not a 10, even though the next element in the sequence would be 10.
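If the stop value should be part of the result, one common workaround (not an option that np.arange() itself offers) is to move the bound just past it. A minimal sketch, assuming integer arguments and a positive step:

```python
import numpy as np

start, stop, step = 0, 10, 2

# For integers and a positive step, using stop + 1 as the bound makes
# stop itself eligible to appear. With floats, rounding makes this
# unreliable.
inclusive = np.arange(start, stop + 1, step)

print(inclusive.tolist())  # [0, 2, 4, 6, 8, 10]
```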
Note: As usual, you can provide the arguments either positionally, without naming them, or as named arguments:
array = np.arange(start=0, stop=10, step=2, dtype=int)
# These two statements are the same
array = np.arange(0, 10, 2, int)
For the sake of brevity, the latter is often used. The positional arguments must follow the order start, stop, step and dtype.
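As a quick sanity check, the two call styles can be compared with np.array_equal(). This is only a small sketch of the equivalence described above:

```python
import numpy as np

by_keyword = np.arange(start=0, stop=10, step=2, dtype=int)
by_position = np.arange(0, 10, 2, int)

# Both calls should produce the same elements in the same order
same = np.array_equal(by_keyword, by_position)
print(same)  # True
```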
np.arange() with stop
If only one argument is provided, it is treated as the stop value. The output contains all numbers up to but not including stop, with a default start of 0 and a default step of 1:
>>> result_array = np.arange(5)
>>> result_array
array([0, 1, 2, 3, 4])
np.arange() with start and stop
With two arguments, they are treated as start and stop, with a default step of 1, so you can easily create a specific range without thinking about the step size:
>>> result_array = np.arange(5, 10)
>>> result_array
array([5, 6, 7, 8, 9])
You can also use floating-point numbers here instead of integers. For example, we can start at 5.5:
>>> result_array = np.arange(5.5, 11.75)
The resulting array will be:
>>> result_array
array([ 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5])
np.arange() with start, stop and step
The default dtype is None, in which case NumPy infers the type from the other arguments. With integer arguments the result holds integers, so an integer-based range is easy to create with just a start, stop and step. For instance, let's generate a sequence of all the even numbers between 6 (inclusive) and 22 (exclusive):
>>> result_array = np.arange(6, 22, 2)
The result contains all even numbers from 6 up to but not including 22:
>>> result_array
array([ 6, 8, 10, 12, 14, 16, 18, 20])
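To see how the element type is chosen when dtype is left out, the resulting arrays can be inspected. This is a minimal sketch; the exact integer width (32 or 64 bits) depends on the platform and NumPy version, so the check only tests for "some integer type":

```python
import numpy as np

from_ints = np.arange(6, 22, 2)      # all arguments are integers
from_floats = np.arange(6.0, 22, 2)  # a single float argument is enough

print(np.issubdtype(from_ints.dtype, np.integer))  # True
print(from_floats.dtype)                           # float64
```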
np.arange() for Reversed Ranges
We can also pass a negative step to np.arange() to get a descending array of numbers.
The start will be the larger number we want to start counting from, the stop will be the lower one, and the step will be a negative number:
>>> result_array = np.arange(start=30, stop=14, step=-3)
The result will be an array of descending numbers with a step of -3:
>>> result_array
array([30, 27, 24, 21, 18, 15])
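An alternative way to get the same values is to build the ascending range and reverse it with slicing. The sketch below compares the two; the ascending bounds 15 and 31 are chosen by hand to match the example above:

```python
import numpy as np

descending = np.arange(30, 14, -3)

# Ascending counterpart, reversed with a [::-1] slice
ascending_reversed = np.arange(15, 31, 3)[::-1]

print(descending.tolist())                             # [30, 27, 24, 21, 18, 15]
print(np.array_equal(descending, ascending_reversed))  # True
```

The negative step is usually simpler, since reversing requires working out the matching ascending bounds first.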
Creating Empty ndarrays with np.arange()
We can also create an empty array as follows:
>>> result_array = np.arange(0)
The result will be an empty array:
>>> result_array
array([], dtype=int32)
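This happens because 0 is the stop value we've set, and the start value is also 0 by default. So, the counting stops before it starts. The dtype shown in the output depends on the platform; on many systems it is int64 rather than int32.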
Another case where the result will be an empty array is when the start value is higher than the stop value while the step is positive. For example:
>>> result_array = np.arange(start=30, stop=10, step=1)
The result will also be an empty array.
>>> result_array
array([], dtype=int32)
This can also happen the other way around: we can start with a small number, stop at a larger number, and use a negative step. The output will be an empty array too:
>>> result_array = np.arange(start=10, stop=30, step=-1)
The result is again an empty ndarray:
>>> result_array
array([], dtype=int32)
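The three empty cases above can be checked in one go through the size attribute, which avoids depending on the platform-specific dtype in the printed output. A minimal sketch; the dictionary labels are just descriptive names:

```python
import numpy as np

candidates = {
    "stop equals start": np.arange(0),
    "positive step, start > stop": np.arange(30, 10, 1),
    "negative step, start < stop": np.arange(10, 30, -1),
}

for label, array in candidates.items():
    # size is the total number of elements; 0 means the array is empty
    print(f"{label}: size={array.size}")
```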
Supported Data Types for np.arange()
The dtype argument, which defaults to None, can be any valid NumPy data type.
Note: NumPy data types aren't to be confused with standard Python data types, though some Python types can be used as shorthand.
You can use the shorthand version for some of the more common data types, or the full name, prefixed with np.:
np.arange(..., dtype=int)
np.arange(..., dtype=np.int32)
np.arange(..., dtype=np.int64)
For some other data types, such as np.csingle, you have to use the full name, prefixed with np.:
>>> result_array = np.arange(start=10, stop=30, step=1, dtype=np.csingle)
>>> result_array
array([10.+0.j, 11.+0.j, 12.+0.j, 13.+0.j, 14.+0.j, 15.+0.j, 16.+0.j,
17.+0.j, 18.+0.j, 19.+0.j, 20.+0.j, 21.+0.j, 22.+0.j, 23.+0.j,
24.+0.j, 25.+0.j, 26.+0.j, 27.+0.j, 28.+0.j, 29.+0.j],
dtype=complex64)
A common shorthand data type is float:
>>> result_array = np.arange(start=10, stop=30, step=1, dtype=float)
>>> result_array
array([10., 11., 12., 13., 14., 15., 16., 17., 18., 19., 20., 21., 22.,
23., 24., 25., 26., 27., 28., 29.])
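To see which NumPy type each shorthand or full name maps to, the dtype attribute of the result can be inspected. This is a small sketch; the type that plain int maps to depends on the platform, so it is printed without an expected value:

```python
import numpy as np

requested_types = (int, float, np.int32, np.int64, np.csingle)

# Map each requested type to the dtype NumPy actually used
resulting = {
    requested: np.arange(3, dtype=requested).dtype
    for requested in requested_types
}

for requested, actual in resulting.items():
    print(requested, "->", actual)
```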
For a list of all supported NumPy data types, take a look at the official documentation.
np.arange() vs np.linspace()
np.linspace() is similar to np.arange() in that it returns evenly spaced arrays. However, there are a couple of differences.
With np.linspace(), you specify the number of samples in a certain range instead of specifying the step. In addition, you can include the endpoint in the returned array. Another difference is that np.linspace() can generate several sequences in a single call, since its start and stop may be array-like, instead of only one.
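One way to read the "several sequences" point is that np.linspace() accepts array-like start and stop values (supported since NumPy 1.16). The sketch below is an illustration under that reading; the bounds are arbitrary:

```python
import numpy as np

# Two sequences in one call: 0 -> 1 and 10 -> 20, three samples each.
# Each column of the result is one sequence.
samples = np.linspace([0, 10], [1, 20], num=3)

print(samples.shape)     # (3, 2)
print(samples.tolist())  # [[0.0, 10.0], [0.5, 15.0], [1.0, 20.0]]
```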
This is a simple example of np.linspace() with the endpoint included and 5 samples:
>>> result_array = np.linspace(0, 20, num=5, endpoint=True)
>>> result_array
array([ 0., 5., 10., 15., 20.])
Here, both the number of samples and the step size are 5, but that's coincidental:
>>> result_array = np.linspace(0, 20, num=2, endpoint=True)
>>> result_array
array([ 0., 20.])
Here, we make two points between 0 and 20, so they're naturally 20 units apart. You can also set endpoint to False, and np.linspace() will behave more like np.arange() in that it doesn't include the final element:
>>> result_array = np.linspace(0, 20, num=5, endpoint=False)
>>> result_array
array([ 0., 4., 8., 12., 16.])
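The relationship between the two functions can be made explicit: with endpoint=False, the spacing that np.linspace() computes can be passed to np.arange() as the step. A minimal sketch, using the retstep=True option to get that spacing back:

```python
import numpy as np

# retstep=True makes linspace return (samples, spacing)
by_count, spacing = np.linspace(0, 20, num=5, endpoint=False, retstep=True)
by_step = np.arange(0, 20, 4)

print(spacing)                            # 4.0
print(np.array_equal(by_count, by_step))  # True
```

Note that np.linspace() returns floats here while np.arange() returns integers; np.array_equal() compares values only.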
np.arange() vs built-in range()
Python's built-in range() function and np.arange() share a lot of similarities but have slight differences. In the following sections, we're going to highlight some of the similarities and differences between them.
Parameters and Returns
The main similarities are that they both have a start, stop, and step. Additionally, they are both start-inclusive and stop-exclusive, with a default step of 1.
However:
- np.arange()
  - Can handle multiple data types, including floats and complex numbers
  - Returns an ndarray
  - The array is fully created in memory
- range()
  - Can handle only integers
  - Returns a range object
  - Generates numbers on demand
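The differences listed above can be observed directly. This is a minimal sketch; the variable names lazy and eager are just illustrative labels:

```python
import numpy as np

lazy = range(0, 10, 2)
eager = np.arange(0, 10, 2)

print(type(lazy).__name__, type(eager).__name__)  # range ndarray
print(list(lazy) == eager.tolist())               # True

# range() rejects non-integer arguments, while np.arange() accepts them
try:
    range(0, 1, 0.5)
    range_accepts_floats = True
except TypeError:
    range_accepts_floats = False

print(range_accepts_floats)           # False
print(np.arange(0, 1, 0.5).tolist())  # [0.0, 0.5]
```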
Efficiency and Speed
There are some speed and efficiency differences between np.arange() and the built-in range() function. The range() function generates the numbers on demand and doesn't create them in memory upfront.
This helps speed the process up if you know you'll break somewhere in that range. For example:
some_number = 5000  # hypothetical value at which the loop exits early
for i in range(100000000):
    if i == some_number:
        break
This will consume less memory since not all numbers are created in advance. An ndarray, by contrast, is slower to construct initially, since every element is created upfront.
However, if you need the whole range of numbers in memory anyway, np.arange() is significantly faster than range() for work that operates on the full set of numbers once the array has been constructed.
If we only iterate through the numbers, though, the time it takes to create the array makes np.arange() slower, due to the higher upfront cost:
$ python -m timeit "for i in range(100000): pass"
200 loops, best of 5: 1.13 msec per loop
$ python -m timeit "import numpy as np" "for i in np.arange(100000): pass"
100 loops, best of 5: 3.83 msec per loop
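The other side of the trade-off, operating on all the numbers at once, can be sketched with the standard timeit module. The example below sums the same numbers both ways; it is an illustration rather than a rigorous benchmark, and the timings will vary by machine, so no expected values are given for them:

```python
import timeit

import numpy as np

n = 100_000
numbers = np.arange(n, dtype=np.int64)  # explicit 64-bit ints to avoid overflow

# Both approaches should agree on the total
loop_total = sum(range(n))
vector_total = int(numbers.sum())
print(loop_total == vector_total)  # True

# Time 20 repetitions of each approach on the already-built data
loop_time = timeit.timeit(lambda: sum(range(n)), number=20)
vector_time = timeit.timeit(lambda: numbers.sum(), number=20)
print(f"sum(range(n)): {loop_time:.4f}s, numbers.sum(): {vector_time:.4f}s")
```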
Conclusion
This guide aims to help you understand how the np.arange() function works and how to generate sequences of numbers.
Here's a quick recap of what we just covered.
- np.arange() has 4 parameters:
  - start is a number (integer or real) from which the array starts. It is optional and is 0 by default.
  - stop is a number (integer or real) at which the array ends; it is not included in the array.
  - step is a number that sets the spacing between consecutive values in the array. It is optional and is 1 by default.
  - dtype is the type of the output array's elements. It is None by default.
- You can use multiple dtypes with arange, including ints, floats, and complex numbers.
- You can generate reversed ranges by having the larger number as the start, the smaller number as the stop, and the step as a negative number.
- np.linspace() is similar to np.arange() in generating a range of numbers, but it can include the endpoint and takes a number of samples instead of a step size; the step is computed from the number of samples.
- np.arange() is more efficient than range() when you need the whole array created. However, range() is better if you know you'll break somewhere when looping.