-
Notifications
You must be signed in to change notification settings - Fork 25
UncleJim vs. PCollections
PCollections competes only with UncleJim's first bullet point: immutable collections. Clojure's vector (list) and hashMap/hashSet have O(log32 n) performance, which theoretically scales better than the O(log2 n) binary tree structures it looks like PCollections uses.
This graph shows how many operations each lookup requires (y) for a given number of items in the collection (x). The red line is the fast log32, the blue is the slower log2. Daniel Spiewak explains all the ramifications of this better than I ever could: https://www.youtube.com/watch?v=pNhBQJN44YQ
The Clojure collections also walk the sibling nodes in the internal data trees of these structures to provide iterators, which is pretty cool performance-wise. PCollections starts from the top of the tree doing an index lookup for each item, then increments the index and goes back to the top to look up the next (at least for the list implementation).
Clojure's (and Java's) sorted/tree map/set implementations are O(log2 n), so PCollections could theoretically be as fast or faster for those two collections. If someone does performance testing to verify these theories, please let me know so I can link to it here.
UncleJim has additional benefits listed in the bullets at the top of this document.