Rework chdb-node api #23

Open
wants to merge 2 commits into
base: main

mgrenonville

This commit adds the new LocalChdb API, built on connect / chdb_conn, to offer stateful queries backed by a long-lived ClickHouse engine instance bound to the connection.

It also includes a new N-API wrapper around local_result_v2 that avoids unnecessary copies and gives direct access to the byte array (useful if you want to use a binary format with Node.js as a pass-through process).

Thanks to Napi::ObjectWrap, Node.js correctly calls free when needed (i.e., when the result is no longer referenced and is therefore garbage collected).
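
To illustrate why direct access to the byte array matters for pass-through use, here is a minimal sketch that uses only Node's built-in `Buffer`. The `nativeMemory` block and both helper functions are hypothetical stand-ins; they are not part of this PR or of chdb-node, and the real wrapper would expose memory owned by local_result_v2.

```javascript
// Hypothetical stand-in for memory owned by a native result.
const nativeMemory = new ArrayBuffer(16);
new Uint8Array(nativeMemory).set([0x01, 0x02, 0x03, 0x04]);

// Copying path: decoding to a JS string allocates new memory and may
// mangle bytes that are not valid UTF-8 when the format is binary.
function asStringCopy(memory, length) {
  return Buffer.from(memory, 0, length).toString("utf8");
}

// Zero-copy path: Buffer.from(arrayBuffer, byteOffset, length) returns a
// view that shares memory with the ArrayBuffer instead of duplicating it.
function asBufferView(memory, length) {
  return Buffer.from(memory, 0, length);
}

const view = asBufferView(nativeMemory, 4);

// A write through the view is visible in the underlying memory,
// which shows that no copy was made.
view[0] = 0xff;
const sharesMemory = new Uint8Array(nativeMemory)[0] === 0xff;
console.log(sharesMemory);
```

A view like this can be handed to a socket or file stream as-is, which is the pass-through scenario described above.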

@CLAassistant

CLAassistant commented Dec 19, 2024

CLA assistant check
All committers have signed the CLA.

@auxten
Member

auxten commented Dec 23, 2024

Wow, thank you so much @mgrenonville.
Let me release the production version of v2.2 first. After that, let me review it.

@mgrenonville
Author

Thank you!
What do you think about implementing a feature similar to Python(df) in chdb for Node.js?
I've seen that N-API allows C++ to access objects in the JavaScript runtime.

@ceckoslab

@auxten @mgrenonville

I noticed that chDB v3.0.0 was released 2 weeks ago: https://github.com/chdb-io/chdb/releases/tag/v3.0.0

Do we need to bump something in this MR in order to use chDB v3.0.0?

Auxten, do we also know whether this MR will address the issues reported in #18? I recall that those issues were caused by the way sessions were handled in older versions of chDB.

@auxten
Member

auxten commented Jan 22, 2025

In chdb v3.0, there can be only one session per process.
See the Session Behavior part of the release notes:

Session Behavior

- Sessions maintain query state throughout their lifecycle
- Only one active session allowed at a time
- Creating a new session automatically closes any existing session
- Sessions can be temporary or persistent:
  - Without specified path: Creates auto-cleaned temporary directory
  - With specified path: Maintains persistent database state
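
The rules quoted above can be modeled in a few lines of plain JavaScript. This is an illustrative model only: `ModelSession`, `openSession`, and `activeSession` are hypothetical names, and nothing here calls chdb or reflects how libchdb implements sessions internally.

```javascript
// Illustrative model only; no chdb code is involved.
let activeSession = null; // at most one active session per process

class ModelSession {
  constructor(path) {
    // Without a path the session is temporary (auto-cleaned on close);
    // with a path it is persistent.
    this.temporary = path === undefined;
    this.path = path ?? null;
    this.closed = false;
  }

  close() {
    this.closed = true;
    if (activeSession === this) activeSession = null;
  }
}

function openSession(path) {
  // Creating a new session automatically closes any existing one.
  if (activeSession) activeSession.close();
  activeSession = new ModelSession(path);
  return activeSession;
}

const first = openSession();          // temporary session
const second = openSession("./data"); // persistent session; closes `first`
console.log(first.closed, second.closed, second.temporary); // true false false
```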

@ceckoslab

Hello @auxten, I would like to ask again the part of my question that wasn't covered by your comment:

I noticed that chDB v3.0.0 was released 2 weeks ago: https://github.com/chdb-io/chdb/releases/tag/v3.0.0

Do we need to bump something in this MR in order to use chDB v3.0.0?

@auxten
Member

auxten commented Jan 27, 2025

The old stateless query function cannot coexist with the new stateful one in the same process.
I need to conduct some detailed reviews and tests to confirm this. Frankly, the libchdb 3.0 API has undergone significant changes, and it can never be tested too thoroughly.

Member

@auxten left a comment

Thanks for the PR. There are some issues to be addressed:
We need to re-implement the Query and QuerySession classes on top of the Conn-based API. The old implementation of query and the new one cannot coexist in one process, because the new one needs to keep some global state, while the old one tends to clean up everything when it finishes.

It's a little bit complex for a first-time contributor. If you don't mind, I will take over this PR, @mgrenonville.

index.js Outdated
@@ -30,10 +87,17 @@ class Session {
    return chdbNode.QuerySession(query, format, this.path);
  }

  queryBuffer(query, format = "CSV") {
    if (!query) return "";
    return chdbNode.QuerySessionBuffer(query, format, this.path);
Member

QuerySessionBuffer does not seem to be defined.

index.js Outdated
  if (!query) {
    return "";
  }
  return chdbNode.QueryBuffer(query, format);
Member

chdbNode.QueryBuffer does not seem to exist.

Author

👍 You're right, I haven't cleaned up this code :(

index.js Outdated
  return chdbNode.QueryBuffer(query, format);
}

class LocalChDB {
Member

Given the libchdb function called here, Connect is a better name than LocalChDB.

@@ -121,6 +98,34 @@ Napi::String QueryWrapper(const Napi::CallbackInfo &info) {
  return Napi::String::New(env, result);
}

// QuerySession function will save the session to the path
char *QuerySession(const char *query, const char *format, const char *path,
Member

The old implementation of queryStable and the new one cannot coexist in one process, because the new one needs to keep some global state, while the old one tends to clean up everything when it finishes.
We should use the connection-based queryConn here instead.
Query should also use the connection-based queryConn.
See also: https://github.com/chdb-io/chdb/blob/d26d2ce84190e9f9e2dbbde024ac09e9739198eb/chdb/__init__.py#L75
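
One way to picture this request is a thin JavaScript facade in which both legacy entry points route through a single shared connection. The sketch below is only that: `connectStub`, `getConn`, `sharedConn`, and the `":memory:"` default path are assumptions made for illustration, and the stub records queries instead of calling libchdb.

```javascript
// Hypothetical stub standing in for the native connection-based binding.
// It only records queries so the sketch can run without libchdb.
function connectStub(path) {
  const log = [];
  return {
    path,
    log,
    query(sql, format) {
      log.push(sql);
      return `[${format}] ${sql}`;
    },
  };
}

let sharedConn = null; // single process-wide connection

function getConn(path = ":memory:") {
  // Reuse the connection while the path matches; otherwise replace it,
  // since only one connection is assumed to exist per process.
  if (!sharedConn || sharedConn.path !== path) {
    sharedConn = connectStub(path);
  }
  return sharedConn;
}

// Legacy-style stateless entry point, now routed through the connection.
function query(sql, format = "CSV") {
  if (!sql) return "";
  return getConn().query(sql, format);
}

// Legacy-style session, routed through the same mechanism.
class Session {
  constructor(path) {
    this.path = path;
  }

  query(sql, format = "CSV") {
    if (!sql) return "";
    return getConn(this.path).query(sql, format);
  }
}
```

Note that in this sketch, mixing `query()` and `Session` calls swaps the shared connection back and forth between paths; a real implementation would have to decide how to handle that case.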

@mgrenonville
Copy link
Author

mgrenonville commented Jan 30, 2025

We need to re-implement the Query and QuerySession classes on top of the Conn-based API. The old implementation of query and the new one cannot coexist in one process, because the new one needs to keep some global state, while the old one tends to clean up everything when it finishes.

I think I understand what needs to be done; I will dig into the chdb Python wrapper to understand how. Is this basically only for backward compatibility? What do you think of removing Query and QuerySession, since they are a subset of what can be done with the Conn-based API, and bumping chdb-node to a new major version?

I've run some experiments with chdb v3.0.0 and, with the connect API, it continues to work 🎉

4 participants