```python
for x in nparr:
    if x == 0:
        pass  # something something
```
as that takes a lot more time than doing this:
```python
for x in nparr.tolist():
    if x == 0:
        pass  # something something
```
This is because a for loop iterating over a numpy array does not produce a sequence of native Python numbers but a sequence of numpy scalars (such as numpy.float32), so every `x == 0` ends up comparing a numpy scalar to a Python constant, which is slower than comparing two plain Python numbers. Converting the array into a list with .tolist() before the for loop produces a sequence of native Python numbers instead.
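To see the difference in element types for yourself, here is a minimal sketch. The sample array `nparr` and its values are made up for illustration; only the types matter.

```python
import numpy as np

# Hypothetical sample data, chosen only to illustrate the element types.
nparr = np.array([0.0, 1.5, 0.0], dtype=np.float32)

# Iterating over the array directly yields numpy scalars.
direct_types = {type(x) for x in nparr}

# Iterating over the result of .tolist() yields built-in Python floats.
list_types = {type(x) for x in nparr.tolist()}

print(direct_types)
print(list_types)
```

The first set should contain only `numpy.float32` and the second only the built-in `float`.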
Here is some profiling I've done using cProfile to check different ways to do an 'if' on a numpy array element:
```python
import cProfile
import numpy as np

runs = 1000000

print('Comparing numpy to numpy')
x = np.array(1.0, np.float32)
y = np.array(1.0, np.float32)
cProfile.run('''
for _ in range(runs):
    if x == y:
        pass
''')
print()

print('Comparing numpy to constant')
x = np.array(1.0, np.float32)
cProfile.run('''
for _ in range(runs):
    if x == 1.0:
        pass
''')
print()

print('Comparing constant to constant')
x = 1.0
cProfile.run('''
for _ in range(runs):
    if x == 1.0:
        pass
''')
print()

print('Comparing numpy.tolist() to constant')
x = np.array(1.0, np.float32)
cProfile.run('''
for _ in range(runs):
    if x.tolist() == 1.0:
        pass
''')
print()

print('Comparing numpy to numpy.array(constant)')
x = np.array(1.0, np.float32)
cProfile.run('''
for _ in range(runs):
    if x == np.array(1.0, np.float32):
        pass
''')
print()

print('Comparing numpy.tolist() to numpy.tolist()')
x = np.array(1.0, np.float32)
y = np.array(1.0, np.float32)
cProfile.run('''
for _ in range(runs):
    if x.tolist() == y.tolist():
        pass
''')
print()
```
Here are the results in order of speed:
| Comparison | Time for 1,000,000 runs |
|---|---|
| Comparing constant to constant | 0.088 seconds |
| Comparing numpy.tolist() to constant | 0.288 seconds |
| Comparing numpy.tolist() to numpy.tolist() | 0.508 seconds |
| Comparing numpy to numpy | 0.684 seconds |
| Comparing numpy to constant | 1.192 seconds |
| Comparing numpy to numpy.array(constant) | 1.203 seconds |
It turns out that, in every case measured here, it is faster to first convert your numpy values into native Python constants via .tolist() than to compare them as numpy objects, even though the .tolist() call itself has a cost.
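If you want to check this on a whole-array loop like the one at the top of this post, a rough sketch using `timeit` could look like the following. The function names, the array size, and the repeat count are all my own arbitrary choices for illustration, and the timings you get will depend on your machine and numpy version.

```python
import timeit
import numpy as np

# Hypothetical test data: 100,000 float32 values that are either 0.0 or 1.0.
rng = np.random.default_rng(0)
nparr = rng.integers(0, 2, size=100_000).astype(np.float32)

def count_zeros_direct(arr):
    """Loop over the array itself, so each x is a numpy scalar."""
    count = 0
    for x in arr:
        if x == 0:
            count += 1
    return count

def count_zeros_tolist(arr):
    """Convert to a list first, so each x is a native Python float."""
    count = 0
    for x in arr.tolist():
        if x == 0:
            count += 1
    return count

# Time a few repetitions of each variant and print the totals.
for fn in (count_zeros_direct, count_zeros_tolist):
    seconds = timeit.timeit(lambda: fn(nparr), number=5)
    print(f"{fn.__name__}: {seconds:.3f} seconds for 5 runs")
```

Both functions return the same count; only the type of the loop variable differs, which is the thing being timed.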