lang.py.module.pandas.vectorization

You can achieve greater performance by using vectorization

Overview

When working with pandas it is tempting to use a traditional for loop to manipulate the data, but this can be very inefficient because Python itself is not a fast language

Alternatively, you may keep operations within the domain of pandas by using its index manipulation, accessor API, or the apply function
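As a rough sketch of these two styles, the snippet below computes the same per-row total first with an explicit for loop and then with apply. The DataFrame and its column names (`price`, `quantity`) are made up for illustration and do not come from the linked notebook.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({"price": [2.0, 3.5, 1.25], "quantity": [3, 2, 8]})

# Traditional for loop: every row is handled by the Python interpreter.
loop_totals = []
for _, row in df.iterrows():
    loop_totals.append(row["price"] * row["quantity"])

# apply: stays inside the pandas API, but with axis=1 it still calls
# the lambda once per row.
apply_totals = df.apply(lambda row: row["price"] * row["quantity"], axis=1)
```

Both forms produce the same values; the difference is only in how the per-row work is dispatched.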

However, although this is usually faster than a plain Python loop because much of the work runs in pandas' compiled C implementations (apply itself still calls a Python function once per row or element), we can get even better performance by structuring our data manipulation to take advantage of vectorized operations, which process whole columns at once
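A minimal sketch of what a vectorized formulation might look like, assuming a small DataFrame with hypothetical `price` and `quantity` columns. The half-price rule is invented purely to show how `np.where` can stand in for a per-row if/else.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({"price": [2.0, 3.5, 1.25], "quantity": [3, 2, 8]})

# Vectorized arithmetic: one expression operates on whole columns at once.
df["total"] = df["price"] * df["quantity"]

# Vectorized conditional: np.where replaces a per-row if/else.
# Halving totals of 7.0 or more is an invented rule for this sketch.
df["discounted"] = np.where(df["total"] >= 7.0, df["total"] * 0.5, df["total"])
```

On a frame this small the speed difference is negligible. The benefit tends to show up as the row count grows, and timing the variants on your own data (for example with `timeit`) is the reliable way to confirm it.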

Sources

  • A notebook on some ways to do complex operations using vectorization