-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strange performance hit when converting seq[seq[string]]] to List[List[str]]] to pd.Dataframe #243
Comments
Marshalling seqs is pretty much a deep copy, so with big data it can get expensive. Invoking To avoid marshalling you can either use |
Sorry but I fear I'm not understanding your answer completely. I am aware that with nim_myparser the marshalling is happening from python string to nim string on input and from nim seq[seq[string]] to python List[List[str]] on output. |
Oh it seems I misread your benchmark. Can you verify that |
I've been trying to simplify the problem. Now it's about passing a (literal) unicode string and reverse it nimmodule.nim import std/[unicode]
import nimpy
proc inputMarshal(data: string): string {.exportpy.} =
return data.reversed
proc inputCast(data: PyObject): string {.exportpy.} =
return data.to(string).reversed nimmodule.nims --gc: "arc"
--d: "danger"
--app: "lib"
--passC: "-flto"
--passL: "-flto"
benchmark.py import random
import timeit
from inspect import getsource
def strgen(length: int) -> str:
get_char = chr
# Update this to include code point ranges to be sampled
include_ranges = [
(0x0021, 0x0021),
(0x0023, 0x0026),
(0x0028, 0x007E),
(0x00A1, 0x00AC),
(0x00AE, 0x00FF),
(0x0100, 0x017F),
(0x0180, 0x024F),
(0x2C60, 0x2C7F),
(0x16A0, 0x16F0),
(0x0370, 0x0377),
(0x037A, 0x037E),
(0x0384, 0x038A),
(0x038C, 0x038C),
]
alphabet = [
get_char(code_point) for current_range in include_ranges
for code_point in range(current_range[0], current_range[1] + 1)
]
return ''.join(random.choice(alphabet) for i in range(length))
def reverse_string(data: str) -> str:
return data[::-1]
rndstr: str = strgen(5000)
reps: int = 100000
pyres = timeit.timeit(f"reverse_string('{rndstr}')",
setup=f"{getsource(reverse_string)}\ngc.enable()",
number=reps)
nimres_inputMarshal = timeit.timeit(f"nimmodule.inputMarshal('{rndstr}')",
setup="import nimmodule; gc.enable()",
number=reps)
nimres_inputCast = timeit.timeit(f"nimmodule.inputCast('{rndstr}')",
setup="import nimmodule; gc.enable()",
number=reps)
print(f"""
py: {pyres}
nim (inputMarshal): {nimres_inputMarshal}
nim (inputCast): {nimres_inputCast}
""".strip()) output
Also according to my benchmarks automatic input (un)marshalling or calling PyObject.to are equivalent in benchmarks, also seems that nim's unicode.reversed is much slower than python's [::-1] proc inputMarshal(data: string) {.exportpy.} =
discard
proc inputCast(data: PyObject) {.exportpy.} =
discard data.to(string)
|
just read about python string representation and that there's no guarantee that it's ABI won't change. Also I see that python string to nim string will always face copy Line 30 in 44181d1
I guess unless I stay on the utf-8 land on the python side, I can't optimize python to nim interop. Is this going to perform copy? python nimmodule.reverseBuf("ABCDE".encode()) nim proc reverseBuf(obj: PyObject): string {.exportpy.} =
var pybuf: RawPyBuffer
getBuffer(obj, pybuf, PyBUF_SIMPLE)
result = $cast[cstring](pybuf.buf)
return result.reversed |
new results without unicode operations, just passing data and using identity functions import nimpy
import nimpy/raw_buffers
proc inputMarshal(data: string): string {.exportpy.} =
return data
proc inputCast(data: PyObject): string {.exportpy.} =
return data.to(string)
proc inputcstring(data: PyObject): string {.exportpy.} =
var pybuf: RawPyBuffer
getBuffer(data, pybuf, PyBUF_SIMPLE)
result = $cast[cstring](pybuf.buf)
pybuf.release() --gc: "arc"
--d: "danger"
--app: "lib"
--passC: "-flto"
--passL: "-flto" import string
from inspect import getsource
import timeit
import random
import nimporter
import nimmodule
def identitystr(data: str) -> str:
return data
strlen: int = 5000
rndstr: str = ''.join(random.choice(
string.ascii_uppercase + string.digits) for _ in range(strlen))
reps: int = 1000000
assert identitystr(rndstr) == nimmodule.inputMarshal(rndstr)
assert identitystr(rndstr) == nimmodule.inputCast(rndstr)
assert identitystr(rndstr) == nimmodule.inputcstring(rndstr.encode())
pyres = timeit.timeit(f"identitystr('{rndstr}')",
setup=f"{getsource(identitystr)}\ngc.enable()",
number=reps)
nimres_inputMarshal = timeit.timeit(f"nimmodule.inputMarshal('{rndstr}')",
setup="import nimmodule; gc.enable()",
number=reps)
nimres_inputCast = timeit.timeit(f"nimmodule.inputCast('{rndstr}')",
setup="import nimmodule; gc.enable()",
number=reps)
nimres_inputcstring = timeit.timeit(f"nimmodule.inputcstring('{rndstr}'.encode())",
setup="import nimmodule; gc.enable()",
number=reps)
print(f"""
py: {pyres:.5f}
nim (inputMarshal): {nimres_inputMarshal:.5f} x{nimres_inputMarshal/pyres:.2f}
nim (inputCast): {nimres_inputCast:.5f} x{nimres_inputCast/pyres:.2f}
nim (nimres_inputcstring): {nimres_inputcstring:-5f} x{nimres_inputcstring/pyres:.2f}
""".strip()) results
|
Well there's still a lot more stuff going on in nim. proc inputcstring(data: PyObject): string {.exportpy.} =
var pybuf: RawPyBuffer
getBuffer(data, pybuf, PyBUF_SIMPLE) # There's no copying here alright
result = $cast[cstring](pybuf.buf) # But here you're allocating a nim string and copying cstring into it
pybuf.release()
# And then result is marshalled back to python, by allocating a python string and copying data into it, and result is deallocated Some nuance to keep in mind. If nim string being marshalled to python string is not a valid utf8, it will fallback to python bytes. But it will cost extra performance as utf8 conversion is tried first. Your recent findings look obviuos to me, as there's more marshalling than useful work. I don't follow how they are related to your initial benchmark. |
I have two equivalent functions
I compile the python module with
the final result is correct for both, but running
%timeit -n 10
on quite big data gives surprising timings.Seems like
List[List[str]]]
conversion topd.DataFrame
is zero-cost for python version but is very expensive for nim versionsame timing happens if I do the
pd.DataFrame
conversion viapyIpmport("pandas")
in the nim/nimpy side at the end of the function (and changing the return type)Any idea what's happening here? I'm trying to replace python function with nim functions to speeup things, but I need pandas on the python side, and this behavior is confusing to me.
The text was updated successfully, but these errors were encountered: