Skip to content

UncleJim vs. PCollections

Glen K. Peterson edited this page Mar 21, 2016 · 19 revisions

PCollections competes only with UncleJim's first bullet point: immutable collections, so I'm only going to talk about that in this document.

Clojure's vector (list) and hashMap/hashSet have O(log32 n) performance, which theoretically scales better than the O(log2 n) binary tree structures it looks like PCollections uses.

Graph of Log base 32 (red) vs. Log base 2 (blue)

This graph shows how many operations each lookup requires (on the y-axis) for a given number of items in the collection (on the x-axis). The red line is the fast log32, the blue is the slower log2. Daniel Spiewak explains all the ramifications of this better than I ever could: https://www.youtube.com/watch?v=pNhBQJN44YQ

#Benchmarks Benchmarks below from a quad-core i7-4790K CPU @ 4.00GHz with hyper-threading disabled comparing (currently) just the immutable vector (non-linked list) in UncleJim vs. PCollections. I hope to test other collections later. I used powers of 10 for the sizes of the test vectors because both collections are based on different powers of 2, so this seemed a good combination of fair and familiar to see how they scale.

Building a vector in UncleJim is faster than building one in PCollections:

10 items: 1.4x faster
100: 2.8x faster
1000: 4.4x faster
10000: 5.9x faster
100000: 7.7x faster

The numbers show that UncleJim's Vector from Clojure is faster and it scales better than PCollections Vector. The performance gap increases by a logarithmic factor of a base near 4, which is roughly what was predicted in the Big O analysis above.

Data

I built the vector from Strings using the ordinal(int n) function which takes a little time to produce, but I don't think this makes much difference.

For all cases, building the collection took about 3x longer than accessing it, so the build time seemed the most crucial number for the comparison.

java -jar target/benchmarks.jar -wi 5 -i 7
Benchmark                              Mode  Cnt        Score       Error  Units
MyBenchmark.buildImList10             thrpt   70  3690204.388 ±  7137.675  ops/s
MyBenchmark.buildImList100            thrpt   70   340778.855 ±  3838.734  ops/s
MyBenchmark.buildImList1000           thrpt   70    30860.163 ±   245.794  ops/s
MyBenchmark.buildImList10000          thrpt   70     2869.233 ±     8.490  ops/s
MyBenchmark.buildImList100000         thrpt   70      271.891 ±     0.642  ops/s
MyBenchmark.buildTreePVector10        thrpt   70  2580167.387 ± 12669.661  ops/s
MyBenchmark.buildTreePVector100       thrpt   70   120573.558 ±   316.766  ops/s
MyBenchmark.buildTreePVector1000      thrpt   70     6939.288 ±    12.626  ops/s
MyBenchmark.buildTreePVector10000     thrpt   70      488.121 ±     1.626  ops/s
MyBenchmark.buildTreePVector100000    thrpt   70       35.501 ±     0.070  ops/s
MyBenchmark.AccessImList10            thrpt   70  3584269.478 ± 17094.483  ops/s
MyBenchmark.AccessImList100           thrpt   70   313618.991 ±  3114.999  ops/s
MyBenchmark.AccessImList1000          thrpt   70    28609.939 ±   201.496  ops/s
MyBenchmark.AccessImList10000         thrpt   70     2478.691 ±     9.350  ops/s
MyBenchmark.AccessImList100000        thrpt   70      194.369 ±     1.355  ops/s
MyBenchmark.AccessTreePVector10       thrpt   70  2309907.514 ± 13656.590  ops/s
MyBenchmark.AccessTreePVector100      thrpt   70   109941.282 ±   431.491  ops/s
MyBenchmark.AccessTreePVector1000     thrpt   70     5646.890 ±    18.767  ops/s
MyBenchmark.AccessTreePVector10000    thrpt   70      365.519 ±     0.736  ops/s
MyBenchmark.AccessTreePVector100000   thrpt   70       25.515 ±     0.025  ops/s
MyBenchmark.IterateImList10           thrpt   70  3633040.412 ±  9738.948  ops/s
MyBenchmark.IterateImList100          thrpt   70   328461.195 ±  2468.800  ops/s
MyBenchmark.IterateImList1000         thrpt   70    30329.867 ±   188.357  ops/s
MyBenchmark.IterateImList10000        thrpt   70     2634.341 ±     8.527  ops/s
MyBenchmark.IterateImList100000       thrpt   70      222.354 ±     1.295  ops/s
MyBenchmark.IterateTreePVector10      thrpt   70  1951653.575 ±  6705.411  ops/s
MyBenchmark.IterateTreePVector100     thrpt   70   105422.158 ±   438.722  ops/s
MyBenchmark.IterateTreePVector1000    thrpt   70     6130.033 ±     9.909  ops/s
MyBenchmark.IterateTreePVector10000   thrpt   70      400.318 ±     1.514  ops/s
MyBenchmark.IterateTreePVector100000  thrpt   70       29.464 ±     0.047  ops/s

##Tests:

public ImList<String> buildImList(ImList<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.append(ordinal(i));
    }
    return empty;
}

public PVector<String> buildPVector(PVector<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.plus(ordinal(i));
    }
    return empty;
}

public String iterateList(List<String> list) {
    String last = null;
    for (String item : list) {
        last = item;
    }
    return last;
}

public String accessList(List<String> list) {
    String last = null;
    for (int i = list.size() - 1; i >= 0; i--) {
        last = list.get(i);
    }
    return last;
}

@Benchmark public void buildImList10() {
    ImList<String> vec = buildImList(PersistentVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

@Benchmark public void buildTreePVector10() {
    PVector<String> vec = buildPVector(TreePVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

@Benchmark public void IterateImList10() {
    ImList<String> vec = buildImList(PersistentVector.empty(), 10);
    String last = iterateList(vec);
    assert("9th".equals(last));
}

@Benchmark public void AccessImList10() {
    ImList<String> vec = buildImList(PersistentVector.empty(), 10);
    String last = accessList(vec);
    assert("9th".equals(last));
}

// And so on using lots of cut and pasted code for 100, 1000, 10000, etc.

#Further Thoughts The Clojure collections also walk the sibling nodes in the internal data trees of these structures to provide iterators, which is pretty cool performance-wise. PCollections starts from the top of the tree doing an index lookup for each item, then increments the index and goes back to the top to look up the next (at least for the list implementation). It takes so much time to build a collection, that this kind of hid these differences. Obviously there is work to be done in this performance testing.

Clojure's (and Java's) sorted/tree map/set implementations are O(log2 n), so PCollections is likely about as fast for those two collections. If someone does performance testing to verify these theories, please let me know so I can link to it here. I plan to come back to this later...