
    for x in nparr:
        if x == 0:
            ...  # do something

as that takes considerably longer than doing this:

    for x in nparr.tolist():
        if x == 0:
            ...  # do something

This is because a for loop iterating over a numpy array does not yield a sequence of Python constants but a sequence of numpy scalars, so every `if` ends up comparing a numpy scalar to a constant, which is slower than comparing two native Python values. Converting the array to a list before the for loop yields a sequence of Python constants instead.
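To make the difference in element types concrete, here is a minimal sketch. The array contents and the names `direct_types` and `list_types` are just illustrative assumptions, not part of the original code:

```python
import numpy as np

# Small example array; the values themselves do not matter here.
nparr = np.array([0.0, 1.0, 2.0], dtype=np.float32)

# Iterating over the array directly yields NumPy scalar objects.
direct_types = {type(x) for x in nparr}

# Iterating over nparr.tolist() yields built-in Python floats.
list_types = {type(x) for x in nparr.tolist()}

print(direct_types)  # {<class 'numpy.float32'>}
print(list_types)    # {<class 'float'>}
```

So the `x == 0` inside the loop goes through NumPy's comparison machinery in the first case and through a plain Python float comparison in the second.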
Here is some profiling I've done using cProfile to check different ways to do an 'if' on a numpy array element:

    import cProfile
    import numpy as np

    runs = 1000000

    print('Comparing numpy to numpy')
    x = np.array(1.0, np.float32)
    y = np.array(1.0, np.float32)
    cProfile.run('''
    for _ in range(runs):
        if x == y:
            pass
    ''')
    print()

    print('Comparing numpy to constant')
    x = np.array(1.0, np.float32)
    cProfile.run('''
    for _ in range(runs):
        if x == 1.0:
            pass
    ''')
    print()

    print('Comparing constant to constant')
    x = 1.0
    cProfile.run('''
    for _ in range(runs):
        if x == 1.0:
            pass
    ''')
    print()

    print('Comparing numpy.tolist() to constant')
    x = np.array(1.0, np.float32)
    cProfile.run('''
    for _ in range(runs):
        if x.tolist() == 1.0:
            pass
    ''')
    print()

    print('Comparing numpy to numpy.array(constant)')
    x = np.array(1.0, np.float32)
    cProfile.run('''
    for _ in range(runs):
        if x == np.array(1.0, np.float32):
            pass
    ''')
    print()

    print('Comparing numpy.tolist() to numpy.tolist()')
    x = np.array(1.0, np.float32)
    y = np.array(1.0, np.float32)
    cProfile.run('''
    for _ in range(runs):
        if x.tolist() == y.tolist():
            pass
    ''')
    print()

Here are the results in order of speed:
| Comparison | Time for 1,000,000 iterations |
|---|---|
| Comparing constant to constant | 0.088 seconds |
| Comparing numpy.tolist() to constant | 0.288 seconds |
| Comparing numpy.tolist() to numpy.tolist() | 0.508 seconds |
| Comparing numpy to numpy | 0.684 seconds |
| Comparing numpy to constant | 1.192 seconds |
| Comparing numpy to numpy.array(constant) | 1.203 seconds |
In every case measured here, it was faster to first convert the numpy values into Python constants via .tolist() than to do anything with them as numpy objects.
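If you want to check this on the full loop rather than on a single element, a rough sketch with `timeit` could look like the following. The function names `count_zeros_direct` and `count_zeros_tolist`, the array size, and the repeat count are my own assumptions for illustration, and the timings will vary by machine and NumPy version:

```python
import timeit

import numpy as np

# Hypothetical test data: 100,000 float32 values, every other one set to 1.0.
nparr = np.zeros(100_000, dtype=np.float32)
nparr[::2] = 1.0


def count_zeros_direct(arr):
    """Loop over the array itself; each x is a NumPy scalar."""
    count = 0
    for x in arr:
        if x == 0:
            count += 1
    return count


def count_zeros_tolist(arr):
    """Convert to a list first; each x is a plain Python float."""
    count = 0
    for x in arr.tolist():
        if x == 0:
            count += 1
    return count


# Both variants must agree before their timings are worth comparing.
assert count_zeros_direct(nparr) == count_zeros_tolist(nparr)

t_direct = timeit.timeit(lambda: count_zeros_direct(nparr), number=10)
t_tolist = timeit.timeit(lambda: count_zeros_tolist(nparr), number=10)

print(f"direct iteration: {t_direct:.3f} s")
print(f"via .tolist():    {t_tolist:.3f} s")
```

`timeit` is used here instead of cProfile only because it adds less overhead per call; it is not what produced the numbers in the table above.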