Skip to content

UncleJim vs. PCollections

Glen K. Peterson edited this page Mar 21, 2016 · 19 revisions

PCollections competes only with UncleJim's first bullet point: immutable collections, so I'm only going to talk about that in this document.

Clojure's vector (list) and hashMap/hashSet have O(log32 n) performance, which scales better than the O(log2 n) binary tree structures PCollections uses for its vector.

Graph of Log base 32 (red) vs. Log base 2 (blue)

This graph shows how many operations each lookup requires (on the y-axis) for a given number of items in the collection (on the x-axis). The red line is the fast log32, the blue is the slower log2. Daniel Spiewak explains all the ramifications of this better than I ever could: https://www.youtube.com/watch?v=pNhBQJN44YQ

#Benchmarks Benchmarks below from a quad-core i7-4790K CPU @ 4.00GHz with hyper-threading disabled. So far this compares just the immutable vector (non-linked list) in UncleJim vs. PCollections. Both vector implementations are based on different powers of 2. I used powers of 10 for the vector sizes because I thought they would be unlikely to correspond to any particularly good or bad performance points in either Vector implementation.

Building a vector in UncleJim is faster than building one in PCollections:

10 items: 1.4x faster
100: 2.8x faster
1000: 4.4x faster
10000: 5.9x faster
100000: 7.7x faster

The numbers show that UncleJim's Vector from Clojure is faster and it scales better than PCollections Vector. The performance gap increases by a logarithmic factor of a base near 4, which is roughly what was predicted in the Big O analysis above. I tried running some tests on accessing each item by index and by iterator, but because of the way I ran them, all I could tell is that building the collection took about 3x longer than accessing the data in it once it was built. This proportion of build time to access time was about equal across the board.

Data

I built the vector from Strings using the ordinal(int n) function which takes a little time to produce, but I don't think this makes much difference.

java -jar target/benchmarks.jar -wi 5 -i 7
Benchmark                              Mode  Cnt        Score       Error  Units
MyBenchmark.buildImList10             thrpt   70  3690204.388 ±  7137.675  ops/s
MyBenchmark.buildImList100            thrpt   70   340778.855 ±  3838.734  ops/s
MyBenchmark.buildImList1000           thrpt   70    30860.163 ±   245.794  ops/s
MyBenchmark.buildImList10000          thrpt   70     2869.233 ±     8.490  ops/s
MyBenchmark.buildImList100000         thrpt   70      271.891 ±     0.642  ops/s
MyBenchmark.buildTreePVector10        thrpt   70  2580167.387 ± 12669.661  ops/s
MyBenchmark.buildTreePVector100       thrpt   70   120573.558 ±   316.766  ops/s
MyBenchmark.buildTreePVector1000      thrpt   70     6939.288 ±    12.626  ops/s
MyBenchmark.buildTreePVector10000     thrpt   70      488.121 ±     1.626  ops/s
MyBenchmark.buildTreePVector100000    thrpt   70       35.501 ±     0.070  ops/s

##Tests:

public ImList<String> buildImList(ImList<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.append(ordinal(i));
    }
    return empty;
}

public PVector<String> buildPVector(PVector<String> empty, int maxIdx) {
    for (int i = 0; i < maxIdx; i++) {
        empty = empty.plus(ordinal(i));
    }
    return empty;
}

@Benchmark public void buildImList10() {
    ImList<String> vec = buildImList(PersistentVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

@Benchmark public void buildTreePVector10() {
    PVector<String> vec = buildPVector(TreePVector.empty(), 10);
    String last = vec.get(9);
    assert("9th".equals(last));
}

// And so on using lots of cut and pasted code for 100, 1000, 10000, etc.

#Further Thoughts The Clojure collections also walk the sibling nodes in the internal data trees of these structures to provide iterators, which I thought might be an advantage. PCollections starts from the top of the tree doing an index lookup for each item, then increments the index and goes back to the top to look up the next (at least for the list implementation).

Clojure's (and Java's) sorted/tree map/set implementations are O(log2 n), so PCollections is likely about as fast for those two collections. If someone does performance testing to verify these theories, please let me know so I can link to it here. I plan to come back to this with more tests and data later...