-
Notifications
You must be signed in to change notification settings - Fork 25
UncleJim vs. PCollections
PCollections competes only with UncleJim's first bullet point: immutable collections, so I'm only going to talk about that in this document.
Clojure's vector (list) and hashMap/hashSet have O(log32 n) performance, which theoretically scales better than the O(log2 n) binary tree structures it looks like PCollections uses.
This graph shows how many operations each lookup requires (on the y-axis) for a given number of items in the collection (on the x-axis). The red line is the fast log32, the blue is the slower log2. Daniel Spiewak explains all the ramifications of this better than I ever could: https://www.youtube.com/watch?v=pNhBQJN44YQ
#Benchmarks Benchmarks below from a quad-core i7-4790K CPU @ 4.00GHz with hyper-threading disabled comparing (currently) just the immutable vector (non-linked list) in UncleJim vs. PCollections. I hope to test other collections later. I used powers of 10 for the sizes of the test vectors because both collections are based on different powers of 2, so this seemed a good combination of fair and familiar to see how they scale.
10 items: 1.4x faster
100: 2.8x faster
1000: 4.4x faster
10000: 5.9x faster
100000: 7.7x faster
The numbers show that UncleJim's Vector from Clojure is faster and it scales better than PCollections Vector. The performance gap increases by a logarithmic factor of a base near 4, which is roughly what was predicted in the Big O analysis above.
I built the vector from Strings using the ordinal(int n)
function which takes a little time to produce, but I don't think this makes much difference.
For all cases, building the collection took about 3x longer than accessing it, so the build time seemed the most crucial number for the comparison.
java -jar target/benchmarks.jar -wi 5 -i 7
Benchmark Mode Cnt Score Error Units
MyBenchmark.buildImList10 thrpt 70 3690204.388 ± 7137.675 ops/s
MyBenchmark.buildImList100 thrpt 70 340778.855 ± 3838.734 ops/s
MyBenchmark.buildImList1000 thrpt 70 30860.163 ± 245.794 ops/s
MyBenchmark.buildImList10000 thrpt 70 2869.233 ± 8.490 ops/s
MyBenchmark.buildImList100000 thrpt 70 271.891 ± 0.642 ops/s
MyBenchmark.buildTreePVector10 thrpt 70 2580167.387 ± 12669.661 ops/s
MyBenchmark.buildTreePVector100 thrpt 70 120573.558 ± 316.766 ops/s
MyBenchmark.buildTreePVector1000 thrpt 70 6939.288 ± 12.626 ops/s
MyBenchmark.buildTreePVector10000 thrpt 70 488.121 ± 1.626 ops/s
MyBenchmark.buildTreePVector100000 thrpt 70 35.501 ± 0.070 ops/s
MyBenchmark.AccessImList10 thrpt 70 3584269.478 ± 17094.483 ops/s
MyBenchmark.AccessImList100 thrpt 70 313618.991 ± 3114.999 ops/s
MyBenchmark.AccessImList1000 thrpt 70 28609.939 ± 201.496 ops/s
MyBenchmark.AccessImList10000 thrpt 70 2478.691 ± 9.350 ops/s
MyBenchmark.AccessImList100000 thrpt 70 194.369 ± 1.355 ops/s
MyBenchmark.AccessTreePVector10 thrpt 70 2309907.514 ± 13656.590 ops/s
MyBenchmark.AccessTreePVector100 thrpt 70 109941.282 ± 431.491 ops/s
MyBenchmark.AccessTreePVector1000 thrpt 70 5646.890 ± 18.767 ops/s
MyBenchmark.AccessTreePVector10000 thrpt 70 365.519 ± 0.736 ops/s
MyBenchmark.AccessTreePVector100000 thrpt 70 25.515 ± 0.025 ops/s
MyBenchmark.IterateImList10 thrpt 70 3633040.412 ± 9738.948 ops/s
MyBenchmark.IterateImList100 thrpt 70 328461.195 ± 2468.800 ops/s
MyBenchmark.IterateImList1000 thrpt 70 30329.867 ± 188.357 ops/s
MyBenchmark.IterateImList10000 thrpt 70 2634.341 ± 8.527 ops/s
MyBenchmark.IterateImList100000 thrpt 70 222.354 ± 1.295 ops/s
MyBenchmark.IterateTreePVector10 thrpt 70 1951653.575 ± 6705.411 ops/s
MyBenchmark.IterateTreePVector100 thrpt 70 105422.158 ± 438.722 ops/s
MyBenchmark.IterateTreePVector1000 thrpt 70 6130.033 ± 9.909 ops/s
MyBenchmark.IterateTreePVector10000 thrpt 70 400.318 ± 1.514 ops/s
MyBenchmark.IterateTreePVector100000 thrpt 70 29.464 ± 0.047 ops/s
##Tests:
public ImList<String> buildImList(ImList<String> empty, int maxIdx) {
for (int i = 0; i < maxIdx; i++) {
empty = empty.append(ordinal(i));
}
return empty;
}
public PVector<String> buildPVector(PVector<String> empty, int maxIdx) {
for (int i = 0; i < maxIdx; i++) {
empty = empty.plus(ordinal(i));
}
return empty;
}
public String iterateList(List<String> list) {
String last = null;
for (String item : list) {
last = item;
}
return last;
}
public String accessList(List<String> list) {
String last = null;
for (int i = list.size() - 1; i >= 0; i--) {
last = list.get(i);
}
return last;
}
@Benchmark public void buildImList10() {
ImList<String> vec = buildImList(PersistentVector.empty(), 10);
String last = vec.get(9);
assert("9th".equals(last));
}
@Benchmark public void buildTreePVector10() {
PVector<String> vec = buildPVector(TreePVector.empty(), 10);
String last = vec.get(9);
assert("9th".equals(last));
}
@Benchmark public void IterateImList10() {
ImList<String> vec = buildImList(PersistentVector.empty(), 10);
String last = iterateList(vec);
assert("9th".equals(last));
}
@Benchmark public void AccessImList10() {
ImList<String> vec = buildImList(PersistentVector.empty(), 10);
String last = accessList(vec);
assert("9th".equals(last));
}
// And so on using lots of cut and pasted code for 100, 1000, 10000, etc.
#Further Thoughts The Clojure collections also walk the sibling nodes in the internal data trees of these structures to provide iterators, which is pretty cool performance-wise. PCollections starts from the top of the tree doing an index lookup for each item, then increments the index and goes back to the top to look up the next (at least for the list implementation). It takes so much time to build a collection, that this kind of hid these differences. Obviously there is work to be done in this performance testing.
Clojure's (and Java's) sorted/tree map/set implementations are O(log2 n), so PCollections is likely about as fast for those two collections. If someone does performance testing to verify these theories, please let me know so I can link to it here. I plan to come back to this later...