Commit
Improve performance by increasing the buffer size to 16K
Reading a 5M file takes about 13 seconds. For reference: Python takes about 1.2s for the same file, Go 0.4s, and C++ about 0.25s. So there's a lot to gain!

One reason is that when reading the file into memory it only allocates 1000 bytes at a time. A simple measurement with the current 1000-byte increment, followed by measurements for other values:

    size    1k.toml  10k.toml  100k.toml  1M.toml  5M.toml  10M.toml
    1000    0.00s    0.00s     0.01s      0.52s    12.97s   74.58s
    2K      0.00s    0.01s     0.01s      0.36s     9.30s   53.68s
    5K      0.00s    0.00s     0.00s      0.27s     7.02s   44.81s
    10K     0.00s    0.00s     0.01s      0.25s     6.27s   43.05s
    15K     0.00s    0.00s     0.01s      0.23s     5.96s   39.19s
    20K     0.00s    0.00s     0.01s      0.24s     5.66s   39.41s
    25K     0.00s    0.00s     0.00s      0.21s     5.77s   38.10s
    50K     0.00s    0.00s     0.00s      0.21s     5.38s   33.91s

I set it to 16K as the performance benefits drop off after that, at least on my system, and it's still quite little memory. I can also use another value if you prefer; 5K or even 2K already make a difference.

(The rest of the time is mostly spent in the strcmp()s in check_key(), by the way: it loops over all the keys for every key it finds, which is why larger files get so drastically slower.)
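For illustration, the pattern described above (growing the read buffer by a fixed increment) might look roughly like the sketch below. This is not the project's actual code: `read_all` and its `chunk` parameter are hypothetical names, and the real function may be structured differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not the project's real function): read everything
 * from fp into a NUL-terminated heap buffer, growing the buffer by `chunk`
 * bytes per step. Returns NULL on allocation or read failure. */
static char *read_all(FILE *fp, size_t chunk, size_t *len_out)
{
    assert(chunk > 0);

    char *buf = NULL;
    size_t cap = 0; /* bytes allocated */
    size_t len = 0; /* bytes read so far */

    for (;;) {
        /* Make room for one more chunk plus the trailing NUL. */
        if (len + chunk + 1 > cap) {
            size_t newcap = len + chunk + 1;
            char *tmp = realloc(buf, newcap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = newcap;
        }

        size_t n = fread(buf + len, 1, chunk, fp);
        len += n;
        if (n < chunk)
            break; /* short read: end of file or error */
    }

    if (ferror(fp)) {
        free(buf);
        return NULL;
    }

    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

With `chunk` set to 1000, a 5M file needs roughly 5000 `realloc()` calls; with `16 * 1024` it needs roughly 300. A call such as `read_all(fp, 16 * 1024, &len)` returns a buffer the caller must `free()`. Doubling the capacity instead of adding a fixed increment would be another option, but that is a different technique from the one this commit uses.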