You can achieve greater performance by using vectorization
When working with pandas, it is tempting to use a traditional for loop
to manipulate the data, but this can be very inefficient because Python
itself is not a fast language.
Alternatively, you may keep operations within the domain of pandas
by using its index manipulation, its accessor API,
or the apply function.
This is faster than plain Python, since much of the work is done by
fast C implementations. However, we can get even better performance
by structuring our data manipulation to take advantage of vectorized
operations.
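
As a rough illustration of the three approaches described above, the sketch below computes the same result with a for loop, with apply, and with a vectorized expression. The DataFrame and its `price` and `quantity` columns are made up for this example and are not taken from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: column names and sizes are assumptions.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "price": rng.uniform(1.0, 100.0, size=1_000),
    "quantity": rng.integers(1, 10, size=1_000),
})

# 1. Plain Python loop: one interpreter-level iteration per row.
loop_values = []
for _, row in df.iterrows():
    loop_values.append(row["price"] * row["quantity"])
totals_loop = pd.Series(loop_values, index=df.index)

# 2. apply with axis=1: stays inside pandas, but the lambda is
#    still called once per row.
totals_apply = df.apply(lambda row: row["price"] * row["quantity"], axis=1)

# 3. Vectorized: a single multiplication over whole columns.
totals_vec = df["price"] * df["quantity"]
```

All three produce the same values. Timing each variant, for example with `%timeit` in a notebook, typically shows the vectorized version well ahead, though the exact numbers depend on the machine and the size of the data.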
- A notebook on some ways to do complex operations using vectorization