python - NumPy array initialization (fill with identical values)

Tags : python, arrays, numpy

Top 5 Answers for python - NumPy array initialization (fill with identical values)

votes : 94

NumPy 1.8 introduced np.full(), which is a more direct method than empty() followed by fill() for creating an array filled with a certain value:

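>>> np.full((3, 5), 7)
array([[ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.]])

>>> np.full((3, 5), 7, dtype=int)
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

(The float output of the first call is what NumPy 1.8 printed. Since NumPy 1.12 the default dtype is taken from the fill value, so the same call now returns integers.)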

This is arguably the canonical way to create an array filled with a given value, because it states explicitly what is being done (and, since it performs one very specific task, it can in principle be implemented very efficiently).
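As a rough check of the claim that np.full() replaces empty() followed by fill(), the sketch below builds the same array both ways and compares the results. The shape, the value, and the variable names are illustrative assumptions, and np.full_like() is included only as a related convenience, not as something this answer recommends.

```python
import numpy as np

shape, value = (3, 5), 7  # illustrative shape and fill value

# Two-step version: allocate uninitialized memory, then overwrite it.
two_step = np.empty(shape, dtype=int)
two_step.fill(value)

# One-step version: np.full allocates and fills in a single call.
one_step = np.full(shape, value, dtype=int)

# np.full_like copies shape and dtype from an existing array.
like = np.full_like(one_step, value)

print(np.array_equal(two_step, one_step))
```

Passing dtype explicitly, as done here, keeps the result type independent of the NumPy version in use.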

votes : 89

Updated for NumPy 1.7.0 (hat tip to @Rolf Bartstra).

a=np.empty(n); a.fill(5) is fastest.

In descending speed order:

%timeit a=np.empty(10000); a.fill(5)
100000 loops, best of 3: 5.85 us per loop

%timeit a=np.empty(10000); a[:]=5
100000 loops, best of 3: 7.15 us per loop

%timeit a=np.ones(10000)*5
10000 loops, best of 3: 22.9 us per loop

%timeit a=np.repeat(5,(10000))
10000 loops, best of 3: 81.7 us per loop

%timeit a=np.tile(5,[10000])
10000 loops, best of 3: 82.9 us per loop

votes : 73

I believe fill is the fastest way to do this.

a = np.empty(10)
a.fill(7)

You should also avoid iterating the way you do in your example. A simple a[:] = v accomplishes what your loop does, using NumPy broadcasting.
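The question's loop is not reproduced on this page, so the sketch below assumes it assigned the value one element at a time. It contrasts that pattern with slice assignment and fill(); the names n, v, looped, sliced, filled, and grid are hypothetical.

```python
import numpy as np

n, v = 10, 7.0  # hypothetical size and fill value

# Element-by-element loop (the pattern to avoid).
looped = np.empty(n)
for i in range(n):
    looped[i] = v

# Slice assignment: the scalar is broadcast to every element.
sliced = np.empty(n)
sliced[:] = v

# In-place fill, the variant this answer expects to be fastest.
filled = np.empty(n)
filled.fill(v)

# Broadcasting also works per axis: every row receives the same four values.
grid = np.empty((3, 4))
grid[:] = [1.0, 2.0, 3.0, 4.0]

print(np.array_equal(looped, sliced), np.array_equal(sliced, filled))
```

All three one-dimensional arrays end up identical; the difference is that the loop runs in the Python interpreter while the other two run inside NumPy.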

votes : 64

I had np.array(n * [value]) in mind, but apparently that is slower than all other suggestions for large enough n. The best in terms of readability and speed is

np.full(n, 3.14) 

Here is a full comparison made with perfplot (a pet project of mine).

[Plot: runtime of each method against len(a), both axes logarithmic]

The two empty alternatives are still the fastest (with NumPy 1.12.1). full catches up for large arrays.


Code to generate the plot:

import numpy as np
import perfplot


def empty_fill(n):
    a = np.empty(n)
    a.fill(3.14)
    return a


def empty_colon(n):
    a = np.empty(n)
    a[:] = 3.14
    return a


def ones_times(n):
    return 3.14 * np.ones(n)


def repeat(n):
    return np.repeat(3.14, n)


def tile(n):
    return np.tile(3.14, [n])


def full(n):
    return np.full(n, 3.14)


def list_to_array(n):
    return np.array(n * [3.14])


# Each kernel receives the array length n and returns the filled array.
perfplot.show(
    setup=lambda n: n,
    kernels=[empty_fill, empty_colon, ones_times, repeat, tile, full, list_to_array],
    n_range=[2 ** k for k in range(27)],
    xlabel="len(a)",
    logx=True,
    logy=True,
)

votes : 50

Apparently, not only the absolute speeds but also the speed order (as reported by user1579844) is machine-dependent; here's what I found:

a=np.empty(10000); a.fill(5) is fastest.

In descending speed order:

timeit a=np.empty(10000); a.fill(5)
# 100000 loops, best of 3: 10.2 us per loop

timeit a=np.empty(10000); a[:]=5
# 100000 loops, best of 3: 16.9 us per loop

timeit a=np.ones(10000)*5
# 100000 loops, best of 3: 32.2 us per loop

timeit a=np.tile(5,[10000])
# 10000 loops, best of 3: 90.9 us per loop

timeit a=np.repeat(5,(10000))
# 10000 loops, best of 3: 98.3 us per loop

timeit a=np.array([5]*int(1e4))
# 1000 loops, best of 3: 1.69 ms per loop (slowest BY FAR!)

So, try and find out, and use what's fastest on your platform.
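One way to "try and find out" outside IPython is a small harness built on the standard-library timeit module. This is only a sketch: the function names, the benchmark() helper, the array length N, and the repeat counts are all assumptions chosen for illustration, and the numbers it prints will differ from machine to machine.

```python
import timeit

import numpy as np

N = 10_000  # array length, matching the timings above


def empty_fill():
    a = np.empty(N)
    a.fill(5)
    return a


def empty_slice():
    a = np.empty(N)
    a[:] = 5
    return a


def ones_times():
    return np.ones(N) * 5


def full():
    return np.full(N, 5.0)


candidates = [empty_fill, empty_slice, ones_times, full]


def benchmark(funcs, number=1000, repeat=5):
    """Return {name: best seconds per call} for each zero-argument function."""
    results = {}
    for func in funcs:
        # Take the minimum over several runs to reduce noise from other processes.
        best = min(timeit.repeat(func, number=number, repeat=repeat))
        results[func.__name__] = best / number
    return results


if __name__ == "__main__":
    timings = benchmark(candidates)
    for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
        print(f"{name:12s} {seconds * 1e6:8.2f} us per call")
```

The output lists the candidates from fastest to slowest on the machine where the script runs, which is the comparison this answer suggests making before settling on one method.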
