How can I use Ballista to call a (potentially async) function to map rows (or I guess record batches) and return a new transformed dataframe?
I understand there are ways to create UDFs in DataFusion, but I don't see how to register or use them in the Ballista context. I also don't know whether UDFs can be async. I tried calling `execute_stream_partitioned`, mapping over the resulting streams, and wrapping them in `PartitionStream`s to use in a new `StreamingTable` registered with Ballista, but it won't compile (mainly, I think, because `SendableRecordBatchStream` is not `Sync` and `PartitionStream` requires it).
Is there an example of this anywhere?
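To make the `Sync` problem concrete, here is a minimal std-only sketch of one common way around it: hold the one-shot stream in a `Mutex<Option<...>>` and take it out on first use. `Mutex<T>` is `Sync` whenever `T` is `Send`, so the wrapper meets a `Send + Sync` bound even though the stream itself does not. Everything below is a hypothetical stand-in: `BatchStream` plays the role of `SendableRecordBatchStream`, the `Partition` trait plays the role of `PartitionStream`, and `OneShotPartition`, `map_batches`, and `double` are names made up for illustration. It uses no DataFusion or Ballista APIs.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for SendableRecordBatchStream: Send, but not Sync.
type BatchStream = Box<dyn Iterator<Item = Vec<i64>> + Send>;

// Hypothetical stand-in for a trait that, like PartitionStream, requires Send + Sync.
trait Partition: Send + Sync {
    fn execute(&self) -> BatchStream;
}

// Wraps a one-shot stream. Mutex<T> is Sync whenever T is Send,
// so this struct is Sync even though BatchStream is not.
struct OneShotPartition {
    stream: Mutex<Option<BatchStream>>,
}

impl OneShotPartition {
    fn new(stream: BatchStream) -> Self {
        Self { stream: Mutex::new(Some(stream)) }
    }
}

impl Partition for OneShotPartition {
    fn execute(&self) -> BatchStream {
        // Take the stream out of the slot; later calls get an empty stream.
        let taken = self.stream.lock().unwrap().take();
        match taken {
            Some(stream) => stream,
            None => Box::new(std::iter::empty::<Vec<i64>>()),
        }
    }
}

// Lazily apply `f` to every value of every batch.
fn map_batches(input: BatchStream, f: fn(i64) -> i64) -> BatchStream {
    Box::new(input.map(move |batch| batch.into_iter().map(f).collect::<Vec<i64>>()))
}

// Example transformation (hypothetical).
fn double(x: i64) -> i64 {
    x * 2
}

fn run_example() -> Vec<Vec<i64>> {
    let source: BatchStream = Box::new(vec![vec![1, 2], vec![3]].into_iter());
    let partition = OneShotPartition::new(map_batches(source, double));
    partition.execute().collect()
}

fn main() {
    println!("{:?}", run_example()); // prints [[2, 4], [6]]
}
```

This only addresses the `Sync` bound. The wrapped stream can be consumed once, and the sketch says nothing about whether Ballista can serialize or distribute a table built this way.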