Improve performance by increasing the buffer size to 16K
Reading a 5M file takes about 13 seconds. For reference: Python takes
about 1.2s for the same file, Go 0.4s, and C++ about 0.25s. So there's a
lot to gain!

One reason is that when reading the file into memory it only grows the
buffer by 1000 bytes at a time; a simple measurement with that increment:

	1k.toml     0.00s
	10k.toml    0.00s
	100k.toml   0.01s
	1M.toml     0.52s
	5M.toml    12.97s
	10M.toml   74.58s

And some measurements for other values:

	2K      1k.toml      0.00s
		10k.toml     0.01s
		100k.toml    0.01s
		1M.toml      0.36s
		5M.toml      9.30s
		10M.toml    53.68s
	5K      1k.toml      0.00s
		10k.toml     0.00s
		100k.toml    0.00s
		1M.toml      0.27s
		5M.toml      7.02s
		10M.toml    44.81s
	10K     1k.toml      0.00s
		10k.toml     0.00s
		100k.toml    0.01s
		1M.toml      0.25s
		5M.toml      6.27s
		10M.toml    43.05s
	15K     1k.toml      0.00s
		10k.toml     0.00s
		100k.toml    0.01s
		1M.toml      0.23s
		5M.toml      5.96s
		10M.toml    39.19s
	20K     1k.toml      0.00s
		10k.toml     0.00s
		100k.toml    0.01s
		1M.toml      0.24s
		5M.toml      5.66s
		10M.toml    39.41s
	25K     1k.toml      0.00s
		10k.toml     0.00s
		100k.toml    0.00s
		1M.toml      0.21s
		5M.toml      5.77s
		10M.toml    38.10s
	50K     1k.toml      0.00s
		10k.toml     0.00s
		100k.toml    0.00s
		1M.toml      0.21s
		5M.toml      5.38s
		10M.toml    33.91s

I set it to 16K as the performance benefits drop off after that, at
least on my system, and it's still very little memory. I can also use
another value if you prefer; 5K or even 2K already makes a difference.
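
To illustrate why the increment matters, here is a minimal sketch that counts how many bytes get copied while a buffer grows by a fixed step. It assumes each expansion copies the whole existing buffer into a new allocation (the worst case for a realloc-style helper such as expand()); the function name bytes_copied is hypothetical and not part of toml.c.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of toml.c: returns the total number of
 * bytes copied while a buffer grows from 0 to at least file_size bytes in
 * fixed increments of `step` (step must be non-zero).
 * Assumption: every expansion copies the full old buffer.
 */
unsigned long long bytes_copied(unsigned long long file_size,
                                unsigned long long step) {
    unsigned long long copied = 0;
    unsigned long long bufsz = 0;

    while (bufsz < file_size) {
        copied += bufsz; /* old contents are moved to the new block */
        bufsz += step;   /* fixed-size growth, as in the parse loop */
    }
    return copied;
}
```

Under that assumption the copied total grows with the square of the file size and shrinks roughly in proportion to the step, so a 16x larger step means about 16x less copying. The measured speedup is smaller than that, which fits with the copying being only one part of the total run time.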

(The rest of the time is mostly spent in the strcmp() calls in check_key(),
by the way, which loops over all the existing keys for every key it finds;
that is why larger files get so drastically slower.)
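
The cost of such a linear key check can be sketched as follows. This is a stand-in written for illustration, not the code of check_key(): key_exists and count_comparisons are hypothetical names, and the keys are synthetic ("key0", "key1", ...). It counts the strcmp() calls needed to insert n distinct keys when every insert first scans all keys stored so far.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical linear duplicate check; counts every strcmp() call in *ncmp. */
static int key_exists(char **keys, size_t nkeys, const char *key,
                      unsigned long long *ncmp) {
    for (size_t i = 0; i < nkeys; i++) {
        (*ncmp)++;
        if (strcmp(keys[i], key) == 0)
            return 1;
    }
    return 0;
}

/* Inserts n distinct synthetic keys; returns the number of strcmp() calls. */
unsigned long long count_comparisons(size_t n) {
    unsigned long long ncmp = 0;
    size_t nkeys = 0;
    char **keys = malloc(n * sizeof *keys);
    if (!keys)
        return 0;

    for (size_t i = 0; i < n; i++) {
        char name[32];
        snprintf(name, sizeof name, "key%zu", i);
        if (!key_exists(keys, nkeys, name, &ncmp)) {
            char *copy = malloc(strlen(name) + 1);
            if (!copy)
                break;
            strcpy(copy, name);
            keys[nkeys++] = copy;
        }
    }

    for (size_t i = 0; i < nkeys; i++)
        free(keys[i]);
    free(keys);
    return ncmp;
}
```

For n distinct keys this makes n*(n-1)/2 comparisons, so doubling the number of keys roughly quadruples the work. A hash table or a sorted index would avoid that, though whether that fits this code base is a separate question.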
arp242 committed Oct 8, 2023
1 parent 5221b3d commit 1c3c2db
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion toml.c
@@ -1486,7 +1486,7 @@ toml_table_t *toml_parse_file(FILE *fp, char *errbuf, int errbufsz) {
   while (!feof(fp)) {
 
     if (off == bufsz) {
-      int xsz = bufsz + 1000;
+      int xsz = bufsz + 1024*16;
       char *x = expand(buf, bufsz, xsz);
       if (!x) {
         snprintf(errbuf, errbufsz, "out of memory");