Skip to content
This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

Predicate Dispatch #19

Open
mstade opened this issue Aug 8, 2014 · 5 comments
Open

Predicate Dispatch #19

mstade opened this issue Aug 8, 2014 · 5 comments

Comments

@mstade
Copy link
Owner

mstade commented Aug 8, 2014

In discussions on how to route URLs, the suggestion came up to have generic predicate dispatch functionality; the URL patterns would be the predicates in this case.

It's not immediately clear to me how this would work, but supposedly it means the predicate dispatch function just operates on a generic decision tree, and the building of that tree must be configurable since the order of priority, precedence and such really do depend on the pattern language. Switching on JS types is very different from switching on URI templates, for instance.

@sammyt -- please do chime in.

@sammyt
Copy link
Collaborator

sammyt commented Aug 8, 2014

The decision tree that rhumb uses is pretty specific to URL routing at the moment (and probably also to its brand of routing) but I don't see why a intermediate tree structure couldn't be designed that was valuable for generic pattern matching. Rhumb could then export that tree from its route parser.

@mstade
Copy link
Owner Author

mstade commented Aug 10, 2014

I'm currently reading through the papers linked to from match, starting with this one. They are good and not too difficult to digest (Maranget's material is pretty dense though) and in general seems to agree with the approach we took in resourceful and what you've done in rhumb. There is a generic tree that can be built, and creating a router like rhumb then comes down to making primitives that return patterns of a specific format. Something like the following:

router = matcher()
router.def(route('/foo/bar'))
router.def(route('/foo/{bar}-baz'))

// Where route creates structures like [ 'foo', 'bar' ] and [ 'foo', [ any('bar'), '-baz' ] ] respectively

The papers say, not too surprisingly to us I suppose, that the most important bit to get right is the construction of the DAG, since if it's efficient (and optimized for most specific paths first) then dispatch is only marginally more expensive than single dispatch. This is of course very much like what we did with fixed, part, variable, and wildcard segments. Presumably, a super generic matcher lets you even configure this stack, but I doubt there's a whole lot of value in that. (Could be, I suppose, since if you have a specific context (such as URL routing) then you could probably cut a few corners.)

Anyway, back to reading.

@mstade
Copy link
Owner Author

mstade commented Aug 10, 2014

The core.match code includes a lot of commentary and is a pretty straight forward read. I still don't quite understand all the mechanics, but I think a REPL and some testing might aid in getting there.

@mstade
Copy link
Owner Author

mstade commented Aug 10, 2014

Also, unless I'm missing something vital, it looks as though core.match requires the params vector and condition vectors to be of the same arity. It's not clear to me whether a position in a condition could indicate never or ignore in order to allow conditions of different arities.

I think supporting different arities is vital – how could it otherwise be extended? As well, arity becomes a fast path eliminator; if each level of a tree corresponds to arity then we could immediately cull those paths that don't match, without having to go through any conditions. This could possibly cause issues with variadic functions, not sure about that.

@mstade
Copy link
Owner Author

mstade commented Aug 10, 2014

Increasingly of the opinion that the matcher should actually be configured with a priority stack. It could look something like this:

var dispatch = matcher(
  [ fixed()
  , variable()
  , any()
  ]
)

dispatch.def(template('/foo/{bar}/baz-{wobble}/*'), handler) // template(...) -> [ fixed('foo'), variable('bar'), [ fixed('baz-'), variable('wobble') ], any() ]

module.exports = function route(url) {
  var path = parse(url) // parse('/foo/moo/baz-booze/something') -> [ 'foo', 'moo', 'baz-booze', 'something' ]
  apply(dispatch, path)
}

Thus the matching functions builds a lookup tree from the calls to def, using the priority defined in the matcher function. Once a priority has been set, it couldn't change, so it's an up front thing. I suppose that could be changed (the full tree would need regenerating then, with possible ambiguities showing up all over the place) but I can't quite see the need for it. Being able to extend (i.e. add to, not change) the matcher after the fact using def is massively useful though.

The router in this example encapsulates the matcher, and its responsibility is just to parse URLs into something the matcher can work with. I think this kind of API is pretty straight forward, should make it fairly easy to build efficient trees and yet be flexible enough to cover lots of cases. Using nested arrays filled with functions makes this interesting, because it gives us the ability to sort the matching in such a way that higher priority functions are tested first, with the implication being that higher priority means more specific and lower priority means more general. This is more or less what the Maranget paper is all about, sorting things in such a way that things fail fast.

A thought I'm having is that just doing var dispatch = matcher() should return a default priority stack, something which makes sense for javascript as a language. Perhaps something like:

[ strictEquality
, valueEquality
, hasValues
, is
, any
]

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants