toml parsing is too slow #39
I also tried parsing the same TOML file with toml-j0.4, and it took only 45 ms.
This looks like it's mostly compile() being slow, and I think the best fix is to rewrite it. @BinaryMuse do you mind if I rewrite compile.js? I'd probably also add more testing.
I opened a PR here to begin to address this: #42. Some interesting findings so far.
ok, it looks like the grammar is slow. This is a benchmark on my laptop of parsing a realistic cargo.toml file from the Rust Cargo project:
I modified toml.pegjs to remove all actions, so it's just doing parsing, and it's still worse than toml-j0.4 (which also uses pegjs).
So I'm now going to try to figure out what's slow in the grammar.
pegjs doesn't generate very efficient parsers. In particular, every rule is a function call, so rules like:
will call the S function in a loop to satisfy the *. Rules like this are faster:
pegjs still matches There doesn't seem to be a way right now to get pegjs to use a So... rewriting the grammar slightly to reduce function calls will probably yield a fairly substantial speed increase, and then I'll look at other problems.
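To illustrate the function-call overhead described above, here is a minimal hand-written sketch. It is not actual pegjs output, and it assumes for illustration that S matches a single space or tab. `ruleS`, `matchWsPerRule`, and `matchWsInline` are hypothetical names: the first loop calls a separate rule function for every character, while the second scans the same characters in place.

```javascript
// Hypothetical stand-in for a generated rule function.
// Assumption: S matches exactly one space or tab.
function ruleS(input, pos) {
  const ch = input.charCodeAt(pos); // NaN past the end, so the match fails
  return ch === 32 || ch === 9 ? pos + 1 : -1;
}

// `S*` as a loop over a rule function:
// one call per matched character, plus one final failing call.
function matchWsPerRule(input, pos) {
  let next;
  while ((next = ruleS(input, pos)) !== -1) {
    pos = next;
  }
  return pos;
}

// The same match done in place, with no per-character function call.
function matchWsInline(input, pos) {
  while (pos < input.length) {
    const ch = input.charCodeAt(pos);
    if (ch !== 32 && ch !== 9) break;
    pos++;
  }
  return pos;
}
```

Both functions return the position just past the whitespace run, so they are interchangeable. How much the second form saves in practice depends on the JavaScript engine and how aggressively it inlines small functions.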
I've gotten part of the way to toml-j0.4 performance for the cargo.toml case just by refactoring the grammar. But there's a large performance gap that's due to toml-node calling line() and column() all the time. I'm not sure what to do about that yet. Options:
Option 1 seems fairly easy; I think I'll try that first.
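As a general illustration of why frequent position lookups can be expensive, and not necessarily one of the options listed above: a parser can record only the raw character offset while parsing and convert it to a line and column on demand, for example when reporting an error. This sketch assumes positions are needed rarely. `lineColumnAt` is a hypothetical helper, not part of pegjs or toml-node.

```javascript
// Convert a 0-based character offset into a 1-based line and column.
// Hypothetical helper: it scans the input from the start, so it is meant
// to be called rarely (e.g. on error), not once per parsed node.
function lineColumnAt(input, offset) {
  let line = 1;
  let lineStart = 0;
  for (let i = 0; i < offset && i < input.length; i++) {
    if (input.charCodeAt(i) === 10) { // '\n'
      line++;
      lineStart = i + 1;
    }
  }
  return { line: line, column: offset - lineStart + 1 };
}
```

Each call costs time proportional to the offset, which is why calling something like this for every node adds up quickly on large inputs.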
#44 and #45 combined make toml-node about half the speed of toml-j0.4 for parsing my cargo.toml test case. Also removing the pegjs There are more things to improve; toml parsing should be about as fast as yaml parsing, and potentially faster.
I compared toml-node and js-yaml by loading a TOML file and a YAML file that each contain a large array config.
demo.toml:
[[arr]]
name = "abcdefg"
demo.yaml:
arr:
The same element is repeated 1000 times in each file.
Parsing each file with its own library (toml-node for the TOML file, js-yaml for the YAML file) gives:
toml costs 662 ms
yaml costs 33 ms
TOML parsing is far too slow.
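For anyone who wants to reproduce the TOML side of this measurement, here is a minimal sketch. `buildToml` and `timeParse` are hypothetical helper names, and the timing covers a single call, so treat the numbers as rough.

```javascript
// Build a TOML document that repeats the same [[arr]] element `count` times.
function buildToml(count) {
  return '[[arr]]\nname = "abcdefg"\n'.repeat(count);
}

// Time one call of `parse(text)` and return the elapsed milliseconds.
// `parse` is any function that takes the document as a string.
function timeParse(parse, text) {
  const start = process.hrtime.bigint();
  parse(text);
  const end = process.hrtime.bigint();
  return Number(end - start) / 1e6;
}
```

Assuming the `toml` package (toml-node) is installed, `timeParse(require('toml').parse, buildToml(1000))` should give a figure comparable to the one reported above, though results will vary by machine and Node version.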