Questions #1

Open
annevk opened this issue Mar 10, 2020 · 54 comments

@annevk

annevk commented Mar 10, 2020

Heya, I was wondering if there's a more complete example available that shows how scripts might be labeled, how a protected object is determined, and why Reflect.get and such are not accessible to such labeled scripts (or would behave differently). Thanks!

@bholley

bholley commented Mar 10, 2020

Also curious what happens with e.g. property defines and sets. If an interposed script sets window.foo, what happens?

@pes10k
Collaborator

pes10k commented Mar 11, 2020

hi! @annevk did you see this? privacycg/proposals#3 (comment) Does it answer any of your questions?

I'm proposing that the protected objects be anything that's not provided by the language. So, to a first approximation, anything that appears in a WHATWG or W3C spec. And I don't anticipate any reason to prevent scripts from getting at Reflect.get (part of the goal is to not require any rewriting of existing scripts, some of which are going to use Reflect). It's just that the Reflect.get or similar call made in the labeled / membraned script would also be intercepted by the membrane proxy (I think just as with any other Proxy).

@pes10k
Collaborator

pes10k commented Mar 11, 2020

@bholley If I understand your question correctly, I think nothing much special. In the case of setting window.foo, the membrane would see a call to set with window as the obj, foo as the prop, and whatever is being assigned as the value (using MDN's names), and the proxy could choose to stick that value on window, or stick something else on window instead, or not modify window at all, etc.
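
A minimal sketch of the three outcomes described above, using a plain ES Proxy as a stand-in for the membrane. The names `sharedDocument`, `makeSetMembrane`, and the policy values are hypothetical and only illustrate the idea; the proposal's own registration API is not modeled here.

```javascript
// Hypothetical stand-in for a shared platform object such as `document`.
const sharedDocument = { title: "original" };

// Per-property policy (an assumption for this sketch): "allow", "rewrite", or "drop".
function makeSetMembrane(target, policy) {
  return new Proxy(target, {
    set(obj, prop, value) {
      const rule = policy[prop];
      if (rule === "allow") {
        obj[prop] = value;          // stick the value on the object
      } else if (rule === "rewrite") {
        obj[prop] = "[redacted]";   // stick something else on it instead
      }
      // Any other rule: leave the object unmodified.
      return true;                  // report success so strict-mode callers do not throw
    },
  });
}

const wrapped = makeSetMembrane(sharedDocument, { title: "allow", cookie: "rewrite" });
wrapped.title = "changed";   // allowed through
wrapped.cookie = "id=123";   // replaced by the policy
wrapped.foo = 42;            // silently dropped
```

The handler, not the labeled script, decides which of the three paths a given set takes.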

@pes10k
Collaborator

pes10k commented Mar 11, 2020

@bholley or rather, that wouldn't be the case for window (this proposal is not trying to solve the problem of hiding script state from each other) but would be the case for any other shared object (say, document instead).

So, for window, nothing would happen. For some other shared state / properties / structures, what I said above applies :)

Apologies, I wasn't sure if you were asking about window specifically, or property sets / gets, or the intersection of the two.

@bholley

bholley commented Mar 11, 2020

Do I understand correctly that window is exempted because it's the global object? How do you then prevent the interposed script from modifying global state belonging to non-interposed scripts, and hijacking them to bypass the membrane?

We've done a lot of JS membrane security stuff in Firefox. In my experience, they only work when they're complete and bidirectional - so every object needs to be clearly on one side or the other, and scripts can never directly access objects on the other side of the membrane. Since it's pretty hard to interpose between a script and its standard prototypes, that generally means you need a separate global object to delineate the membrane.

That gets you to a membrane that's sound. In order for the membrane to be useful from a security perspective, the access control also needs to be based on an allowlist rather than a denylist. That is to say, rather than enumerating the bad patterns you're trying to avoid, you want to enumerate the specific things you want to allow, and then audit that surface to convince yourself that nothing sensitive is reachable.

Putting that together, I could imagine a setup wherein we import a script into an isolated global, and then use a proxy membrane to provide just enough virtual access to the primary global to allow it to accomplish its goals. That strikes me as something that could work, but fairly different from the proposal here.
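
As a rough illustration of the allowlist idea, the sketch below exposes only explicitly named properties of an object and hides everything else. `realObject`, `ALLOWED`, and `allowlistView` are hypothetical names invented for this example, and a real membrane would also need to wrap any objects handed back through the view.

```javascript
// Hypothetical allowlist: the only properties the interposed script may see.
const ALLOWED = new Set(["title", "language"]);

// Stand-in for an object in the primary global.
const realObject = { title: "Example", language: "en-US", userAgent: "sensitive" };

function allowlistView(target, allowed) {
  return new Proxy(target, {
    get(obj, prop) {
      // Anything not explicitly allowed reads as undefined.
      return allowed.has(prop) ? obj[prop] : undefined;
    },
    has(obj, prop) {
      return allowed.has(prop) && prop in obj;
    },
    ownKeys(obj) {
      return Reflect.ownKeys(obj).filter((key) => allowed.has(key));
    },
    getOwnPropertyDescriptor(obj, prop) {
      return allowed.has(prop) ? Reflect.getOwnPropertyDescriptor(obj, prop) : undefined;
    },
  });
}

const view = allowlistView(realObject, ALLOWED);
const visibleKeys = Object.keys(view);   // only allowlisted names are enumerable
```

The exposed surface is exactly the contents of `ALLOWED`, which is what makes it auditable.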

@pes10k
Collaborator

pes10k commented Mar 11, 2020

Do I understand correctly that window is exempted because it's the global object? How do you then prevent the interposed script from modifying global state belonging to non-interposed scripts, and hijacking them to bypass the membrane?

Preventing scripts from touching each other's state is out of scope for this proposal. There are already solutions for that (closures, module scripts, etc), and if a script gives away its capabilities, there is nothing we can do about that. This proposal is only targeting the problem of state that has to be global for the web platform to work.

that generally means you need a separate global object

I appreciate what you're saying, but I think you're generalizing from constraints related to different goals. More specifically, anything that requires rewriting / changing the code being wrapped won't solve the problem.

That gets you to a membrane that's sound. In order for the membrane to be useful from a security perspective, the access control also needs to be based on an allowlist rather than a denylist.

The primitives provided here allow you to go either way. The example in privacycg/proposals#3 (comment) shows how this could be used in an "allow list" manner (e.g. don't allow the script to touch global stuff except for in the narrow set of ways I can predict)

@bholley

bholley commented Mar 11, 2020

Preventing scripts from touching each others state is out of the scope for this proposal.

The proposal offers a security enhancement based on a membrane, and I'm suggesting that practical usage of the API will result in membrane bypasses, rendering the security enhancement ineffective. That seems in-scope to me.

I think you're generalizing from constraints related to different goals.

Could be? Our goals were "construct a membrane that can't be bypassed and use it to enforce security policy".

More specifically, anything that requires rewriting / changing the code being wrapped won't solve the problem.

"Interposed script runs unmodified" was also a constraint for us, if that's what you're asking - I agree that's an important constraint to build something broadly useful here. But I believe that it's hard to build a sound membrane without a separate global, and also that a separate global does not preclude running script unmodified.

The primitives provided here allow you to go either way.

Yep. I'm just suggesting that you frame the examples in terms of allowlists, because it's more plausible that such a setup could be effective, and also makes explicit which part of the object graph is intended to be exposed to the interposed script.

@pes10k
Collaborator

pes10k commented Mar 11, 2020

The proposal offers a security enhancement based on a membrane, and I'm suggesting that practical usage of the API will result in membrane bypasses, rendering the security enhancement ineffective. That seems in-scope to me.

Could be? Our goals were "construct a membrane that can't be bypassed and use it to enforce security policy".

I'm happy to be wrong here; I'm only asking if there is something about the fundamentals of the design that makes you think so?

Again, the only things we're trying to constrain through membranes are document / state / things that need to be shared for the platform to work. I would be grateful if you could show how those promises in this proposal could be bypassed.

Put differently, are you saying there is something fundamentally wrong with the design (which to my eyes seems correct, from an ocap style approach), or are you saying it's conceptually fine, but in practice it's difficult to use these to impose useful constraints / policies (or some third thing)?

Again, I'm not arguing; I'm just trying to understand your feedback, beyond "moz tried something kinda similar and it didn't work" :)

a separate global does not preclude running script unmodified.

I don't follow this claim. Certainly it constrains scripts that are currently written that expect access to these globals, no? Again, if you could share more details it would be helpful to understand / re-evaluate the proposal.

@bholley

bholley commented Mar 11, 2020

are you saying there is something fundamental with the design

Yes. I believe that an effective JS membrane requires every object to be on one side of the membrane or the other. From what I can tell from this proposal, the global object is shared between both sides. Moreover, it would be difficult for the engine to avoid direct references to certain built-ins like Object.prototype, so I'm assuming those would need to be shared as well.

I would be grateful if you could show how those promises in this proposal could be bypassed.

So, suppose an interposed script does something like this:

membraneBypasses = [];
({}).__proto__.valueOf = function() { membraneBypasses.push(this); };

That allows the interposed script to capture an unmediated reference to every object that is implicitly converted to a primitive. You can also do similar things by overwriting or shadowing properties on the window that are accessed by the trusted script.
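
The capture described above can be reproduced in any single realm. The sketch below wraps the same pattern in a function that restores the original `valueOf` afterwards; the `sensitive` object and the `demoPrototypeCapture` helper are hypothetical, and no membrane is modeled, only the shared-prototype leak.

```javascript
function demoPrototypeCapture() {
  const original = Object.prototype.valueOf;
  const membraneBypasses = [];
  try {
    // "Interposed" code: override valueOf on the shared Object.prototype.
    ({}).__proto__.valueOf = function () { membraneBypasses.push(this); };

    // "Trusted" code: an innocent-looking implicit conversion to a primitive.
    const sensitive = { token: "abc123" };   // hypothetical sensitive object
    const label = "value: " + sensitive;     // invokes the overridden valueOf

    return { captured: membraneBypasses, sensitive, label };
  } finally {
    // Restore the built-in so the demo does not leave the realm polluted.
    Object.prototype.valueOf = original;
  }
}

const result = demoPrototypeCapture();
// result.captured[0] is the very same object as result.sensitive.
```

The trusted code never passes `sensitive` anywhere explicitly; the reference escapes purely through the shared prototype.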

I don't follow this claim. Certainly it constrains scripts that are currently written that
expect access to these globals, no?

The idea sketched in #1 (comment) is to use a separate global, but provide mediated access to the global properties that the script expects. So you start from nothing in a fresh global, and export just enough functionality into it that the script doesn't break.

@bholley

bholley commented Mar 12, 2020

I think a key source of ambiguity in the current proposal is the precise set of checkpoints where the engine is expected to invoke the membrane handler. The current text says it happens whenever labeled code attempts to cross a label boundary - but as the example above illustrates, there are lots of tricky ways for script to reach objects (e.g. via the prototype chain of an object literal, or as the |this| argument to a method override).

We could add guards to every property access, every function argument, every return value, etc - but at a pretty severe performance and complexity cost. And by that point, we're just doing exhaustive VM-level access checking, and don't really need the membrane.

@pes10k
Collaborator

pes10k commented Mar 12, 2020

I see, I think this might be because of a misunderstanding (or ambiguity?) of the proposal. For "wrapped" scripts, the objects are always on the trusted script's side of the membrane.

By construction, whenever labeled code is given access to a protected object, it is given
only a new, “membrane” proxy object; untrusted code cannot gain direct access to objects
on the other side of the security boundary.

So in your membraneBypasses example, this would still be a reference to the proxy. The only things the proxy can hand back to the labeled script are JS defined types (string, number, etc) or membrane references. So for the wrapped stuff, you'd be either hitting HTMLElement.prototype.valueOf (or similar) and so triggering the membrane logic again, or if you bottomed out at Object.prototype.valueOf, the membrane logic would ensure that this was wrapped too.

I think this addresses your concerns? If not, I'm happy to clarify more (or see the error of my ways) :)

The idea sketched in #1 (comment) is to use a separate global, but provide mediated access to the global properties that the script expects. So you start from nothing in a fresh global, and export just enough functionality into it that the script doesn't break.

How do you imagine this capability introduction mechanism would work for existing code / sites? E.g. there is some existing <script src=X> tag on a site now. How would you do capability introduction w/o something like the proposal (and w/o then allowing the thing that got the introduced capability to share it w/everyone else)?

@bholley

bholley commented Mar 12, 2020

the objects are always on the trusted script's side of the membrane

Does this mean that object literals ({}) minted by the script are on the trusted side of the membrane? Or something else?

So in your membraneBypasses example, this would still be a reference to the proxy.

I don't follow. Could you walk me through the steps you expect the engine to perform when executing the example snippet?

How do you imagine this capability introduction mechanism work for existing code / sites?

To be clear, the goal of the mechanism I sketched is to allow sites to self-modify in order to run unmodified third-party scripts in a safe way. Does that match your goals?

@jackfrankland

Does this mean that object literals ({}) minted by the script are on the trusted side of the membrane? Or something else?

I'd like to try and answer this, more so to verify that I'm understanding the proposal correctly.

Access to the prototype of object literals and other built-ins like Object, under the current scope of the proposal, would not be wrapped by the membrane.

In your example here:

membraneBypasses = [];
({}).__proto__.valueOf = function() { membraneBypasses.push(this); };

the this could be something that is protected, for example window.localStorage, or an object that is an instance of HTMLElement. If so, when your script attempts to access/call properties/methods on it, the membrane proxy will control what is returned.

I believe that an effective JS membrane requires every object to be on one side of the membrane or the other.

In my view, this proposal is an effective way to prevent unwanted access to the protected objects. I think having a completely separate global, that presumably allows for managed synchronous calls between membranes, as opposed to alternative tech already available like Web Workers and iframes, is maybe serving a different goal.

@bholley

bholley commented Mar 16, 2020

Access to the prototype of object literals and other built-ins like Object,
under the current scope of the proposal, would not be wrapped by the membrane.

If that's the case, there are lots of potential ways for untrusted code to masquerade as trusted code. Suppose you do (new Function(...))() - it's the same Function constructor used by trusted code, so how does the engine determine whether the resulting script is trusted or untrusted? You could try outlawing the Function constructor, or inheriting the incumbent script, or introspecting the callstack, or various other things, but it gets complex and brittle very quickly.

If so, when your script attempts to access/call properties/methods on it, the membrane
proxy will control what is returned.

This sounds like exhaustive access checking, rather than a membrane. The point of a membrane is to enable a capability-based security model, wherein script can never obtain a reference to a restricted object. This lets you avoid all the tricky security considerations in the common cases and push them to the boundary. If every property access and function call needs to do a security check, the complexity and overhead become unrealistic to implement.

@jackfrankland

I think you raise a good point about the Function constructor, and I assume the eval function too. I would hope that somehow the resultant script would inherit the labelling of the script that created it, but perhaps there should be some consideration to protect some built-ins, in an effort to eliminate these possibilities?

Following on from this, if the script ends up being allowed access to document.write, or to inject script into HTML, I'm not sure if there's an expectation for the engine to know which script was responsible and inherit its labelling. While DOM access is in scope for being protected, I agree that it may require an exhaustive set of rules to sufficiently protect against a malicious actor from finding loopholes.

I share your concerns for sure, but I can still see how this proposal makes it possible to achieve its goals, especially with what has been said about this being a way to provide the building blocks, and tooling and sharing of common proxies could help alleviate the set up. I wonder if it would be good to get a sense of how many loopholes there could be?

If every property access and function call needs to do a security check, the complexity and overhead become unrealistic to implement.

I'm not sure the expectation would be to have an overly-exhaustive blacklist, over a simpler whitelist, but the proposal would allow for that granularity.

@bholley

bholley commented Mar 16, 2020

I would hope that somehow the resultant script would inherit the labelling of the script that created it

It gets tricky when script can be passed around as strings, and unintentionally converted to script by the target script (via obvious paths like eval/Function(), or less-obvious paths like attribute sets for inline event handlers).

but perhaps there should be some consideration to protect some built-ins, in an effort to eliminate these possibilities?

I think the most realistic way to do this is to give the interposed script its own copy of the built-ins (i.e. a separate global).

I agree that it may require an exhaustive set of rules to sufficiently protect against a
malicious actor from finding loopholes.

Yeah. Sadly, there are a lot of loopholes in the web platform, many of which depend on programming patterns that may be used inconsistently across sites.

I share your concerns for sure, but I can still see how this proposal makes it possible
to achieve its goals

In practice, the bar for introducing significant complexity to the platform is pretty high - so I think the security story needs to be pretty rock-solid and easy to implement in order to get traction.

I'm not sure the expectation would be to have an overly-exhaustive blacklist,
over a simpler whitelist, but the proposal would allow for that granularity.

I think we're talking about different things. I'm not talking about the complexity of the scripted membrane handler, I'm talking about the complexity within the engine to determine when to invoke the handler. In a true membrane setup, all the restricted objects are actually proxies, and so naturally invoke their handler at the right time. But the scenario you propose requires the engine to recognize that an object is restricted at the time of access, which is much more complex to implement.

@pes10k
Collaborator

pes10k commented Mar 16, 2020

Trying to summarize / generalize over the above comments, please nudge/forgive if I miss something.

Membrane and JS-provided structures (Object, Function, etc)

The membrane layer is not intended to protect against JS-type prototype pollution. However, code on the trusted-side of the membrane would be guaranteed a non-modified version of JS-provided prototypes.

To be clear, the goal of the mechanism I sketched is to allow sites to self-modify in order to run unmodified third-party scripts in a safe way. Does that match your goals?

I don't believe so. The goal is to allow for protections to be enforced on sites, as they exist today, from the user-agent, extension or site-side levels. Requiring modifications to existing sites / site-self-modification will prevent uptake a la CSP, and prevent the user-agent / extension use cases.

I think the most realistic way to do this is to give the interposed script its own copy of the built-ins (i.e. a separate global).

As above, this breaks the primary use case, being able to layer protections on to existing applications, as is. It also would rule out allowing protections to be applied through extensions / etc.

Re: Script labeling / lineage

I think this would compose well with this proposal (e.g. it gives more information to the decision layer) but that the proposal does not require it to be useful. And since no shipping version of any JS/HTML engine currently tracks script provenance fully / correctly (can go into details here if useful) I don't want to gate this proposal on script-lineage tracking.

Put differently, I think this proposal would be useful as is, and that tracking lineage would make it more so. This isn't a punt (Brave is working on lineage tracking in a number of ways, I know Moz is too) but just a desire to keep the two issues distinct, and then voltron them together later.

re document.write

Yep, this, or the many other ways scripts can inject other code, are places where script-lineage would make the proposal even more useful. But even just having cases for handling unexpected code executing in the membrane would be very useful here. If the membrane knows what's expected to execute (i.e. an allow-list approach) it can provide useful protections in bottom / unknown cases too (e.g. document.write).

@bholley more broadly, do the following clarifications / additions address some or all of your concerns?

  1. The membrane definition enforcement code gets unmodified JS globals, but does not interpose on what's shared between JS units operating in the page.
  2. Pages never receive a reference to the guarded data structures (e.g. Navigator.prototype); a membrane can only yield a membrane or a JS-defined "primitive".
  3. An implementation of the proposal would require some additional runtime assistance to inform the membrane code of what the less-trusted code thinks it's accessing, when it's accessing the membrane-proxy-like object.

@bholley

bholley commented Mar 16, 2020

The membrane layer is not intended to protect against JS-type prototype pollution.
However, code on the trusted-side of the membrane would be guaranteed a
non-modified version of JS-provided prototypes.

How would that be implemented? Are there two separate copies of the prototypes?

The goal is to allow for protections to be enforced on sites, as they exist today, from the
user-agent, extension or site-side levels.

Thanks for the clarification. I have some concerns about this, but let's leave that aside for now.

As above, this breaks the primary use case

I don't think it does. My suggestion is to run the third-party script in a separate global and then virtualize access to the intended capabilities (using a series of proxies to construct "fake" versions of the window built-ins on that global). That would allow scripts to run unmodified.

I think this proposal would be useful as is, and that tracking lineage would make it more so.

Per above, I'm not sold on adding a complex new security primitive to the web platform if it is vulnerable to straightforward bypass. But I'm still unclear about how this proposal actually works, so let's focus on that first.

@bholley more broadly, do the following clarifications / additions address some or all of
your concerns

I think more precision would be helpful here. Could you detail the steps you expect the engine to perform in the code snippet at #1 (comment) ?

@pes10k
Collaborator

pes10k commented Mar 16, 2020

How would that be implemented? Are there two separate copies of the prototypes?

yes, to a first approximation, think of the membrane script as operating in a SES style realm.

That would allow scripts to run unmodified

But it would require modifications to the host application, to do the capability introduction, which breaks the use case.

Could you detail the steps you expect the engine to perform in the code snippet at #1 (comment) ?

In the snippet you're pointing at, if it was happening in a "wrapped" script, no membrane code would fire.

@bholley

bholley commented Mar 16, 2020

yes, to a first approximation, think of the membrane script as operating in a SES style realm.

By "the membrane script" are you referring to the handler passed to registerMembraneProxy, or are you referring to the untrusted third-party script? The former seems orthogonal to the original concern, and the latter is roughly what I'm proposing.

But it would require modifications to the host application, to do the capability introduction

The UA can detect the script load, create a separate realm, set up the proxies, inject the script, and run it.

In the snippet you're pointing at, if it was happening in a "wrapped" script,
no membrane code would fire.

Ok. So if script can get direct, unmediated references to a sensitive object, the engine needs to check every object access to determine if access is permitted. Is that right?

@pes10k
Collaborator

pes10k commented Mar 17, 2020

By "the membrane script" are you referring to the handler passed to registerMembraneProxy

I'm referring to the script that defines the membrane policy, so both the code that defines the handler passed to registerMembraneProxy, and any surrounding code.

I'm not sure why that's orthogonal though; that's what prevents the wrapped script from using JS-defined prototypes to sneak around the membrane.

The UA can detect the script load, create a separate realm, set up the proxies, inject the script, and run it.

I'd need to read a broader proposal to understand the particulars, but the reason we didn't go with this broader approach is that, to end up with the same privacy properties (i.e. avoiding the give away "*" capability problem) you need to do the same dual-hoisting proxying / membrane approach, but just with more foot guns, since the policy definer would need to set up the proxies themselves.

So if script can get direct, unmediated references to a sensitive object…

There is no way for a script to get an unmediated reference to a sensitive object. Calls to wrapped structures can only yield new proxy references. There is no way for the wrapped structure to yield an unwrapped / unmediated reference. Calls / accesses against that just-returned proxy have the same property too, and so on.
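
The "proxies all the way down" property can be sketched with ordinary ES Proxies. This is a simplified, one-directional illustration under stated assumptions: `createMembrane` and `wrap` are hypothetical names, arguments passed in from the wrapped side are not unwrapped, and Proxy invariants for non-configurable, non-writable properties are ignored.

```javascript
function createMembrane() {
  const proxyFor = new WeakMap();      // real object -> its proxy (preserves identity)
  const knownProxies = new WeakSet();  // proxies we minted, to avoid double wrapping

  function wrap(value) {
    const isObjectLike =
      (typeof value === "object" && value !== null) || typeof value === "function";
    if (!isObjectLike) return value;              // primitives pass through unchanged
    if (knownProxies.has(value)) return value;    // already a membrane reference
    if (proxyFor.has(value)) return proxyFor.get(value);

    const proxy = new Proxy(value, {
      get(target, prop) {
        return wrap(Reflect.get(target, prop));           // property reads yield proxies
      },
      apply(target, thisArg, args) {
        return wrap(Reflect.apply(target, thisArg, args)); // so do return values
      },
    });
    proxyFor.set(value, proxy);
    knownProxies.add(proxy);
    return proxy;
  }
  return wrap;
}

// Usage with a hypothetical object graph.
const wrap = createMembrane();
const real = { child: { name: "inner" }, getChild() { return this.child; } };
const view = wrap(real);
// view.child and view.getChild() are both the same proxy, never real.child itself.
```

Every object reachable through `view` comes back as a proxy, and only primitives such as `view.child.name` cross unwrapped.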

I worry that I'm either not answering your question, or that I'm missing some subtlety in it.

@pes10k
Collaborator

pes10k commented Mar 17, 2020

If it seems like we're 100% talking past each other, I'm also happy to jump on a call or similar.

@bholley

bholley commented Mar 17, 2020

both the code that defines the handler passed to registerMembraneProxy, and any
surrounding code.

Ok - so all the trusted script? Which means that the trusted code uses separate JS prototypes from all the untrusted code. Is that right?

If so, that sounds like "running in a separate global/realm" to me!

There is no way for a script to get an unmediated reference to a sensitive object.

So, the point of that snippet was to demonstrate how untrusted script might steal references to trusted objects. When you said "no membrane code would fire", I interpreted that to mean that unmediated references would end up in the membraneBypasses array when trusted script accidentally triggers the malicious valueOf via an implicit conversion. But perhaps what you meant was that, because the two scripts use different sets of prototypes, the malicious valueOf method would never be invoked by trusted script?

@pes10k
Collaborator

pes10k commented Mar 17, 2020

If so, that sounds like "running in a separate global/realm" to me!

The difference I thought we were discussing (though I may have misunderstood) was that in this proposal, the membrane code has its own "realm" (would not need to be a realm as defined in that proposal, but something that prevented prototype and state leak), while all other code executes as normal. I thought you were proposing that each wrapped script executes in its own realm (e.g. a realm per wrapped script), which is a big diff!

But if we're saying the same thing, then hurray, I'm glad we scraped it all down :)

But perhaps what you meant was that, because the two scripts use different sets of prototypes, the malicious valueOf method would never be invoked by trusted script?

Yes, all page script (existing script) has one set of JS-defined prototypes (as works currently). The trusted script has its own JS-defined prototypes. For shared structures (e.g. Navigator.prototype) the trusted script would mediate access.
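
To see why separate prototypes defuse the valueOf trick, here is a rough sketch using Node.js's built-in `vm` module, which gives a context its own copies of the JS built-ins. This is only an analogy for the realm-like separation discussed here: Node documents that `vm` is not a security mechanism, and the `sandbox` and `sensitive` names are hypothetical.

```javascript
const vm = require("vm");

// A new context has its own Object, Function, Array, and so on.
const sandbox = vm.createContext({ captured: [] });

// "Page" script: pollutes the Object.prototype of its own realm, then triggers it.
vm.runInContext(`
  ({}).__proto__.valueOf = function () { captured.push(this); };
  ({}) + "";
`, sandbox);

// "Trusted" script in the main realm: its Object.prototype is a different object,
// so this implicit conversion never reaches the overridden valueOf.
const sensitive = { token: "abc123" };   // hypothetical sensitive object
const text = "value: " + sensitive;

const capturedCount = sandbox.captured.length;          // only the sandbox's own object
const leaked = sandbox.captured.includes(sensitive);    // the main-realm object is absent
```

The override fires for objects created inside the context, but the main realm's conversion still uses the untouched built-in.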

@bholley

bholley commented Mar 17, 2020

So from the above, it sounds like you intend everything with unmediated access (trusted code) to run with a separate set of JS standard prototypes from everything else (untrusted code). That clears things up quite a bit, but I should point out that the proposal text doesn't say anything to this effect, and in fact implies the opposite (by naming prototype protection as out-of-scope).

Another point of ambiguity, I think, is the amount of code that is expected to sit on the trusted side of the boundary. I thought the intention was for most first-party application logic to be trusted, but I think you may be suggesting that only the membrane handler and directly-supporting code is trusted, and routine first-party business logic is untrusted?

I am more or less certain that mediating all first-party access through scripted proxies would ruin performance on most JS-driven pages (especially ones with VDOMs). So I don't think that's a realistic option.

The key observation that I thought this proposal was making (and which I agree with), is that pages often have two categories of script: first-party script, which is trusted and performance-sensitive, and third-party script, which is less-trusted and often not performance-sensitive. So if you can isolate the latter from the former, you can apply fine-grained access checks on the un-trusted stuff without impacting the performance of the trusted stuff. I think that is a reasonable thing to explore, but effective isolation requires various measures (like separate JS built-ins) which, for practical implementation purposes, require a separate realm/global.

@pes10k
Collaborator

pes10k commented Mar 17, 2020

prototype protection as out-of-scope

Thank you for this. What I intended by this in the spec is to say the API is not trying to prevent scripts from poisoning each other's prototypes; the only reason for the realm-like requirement is to prevent scripts from sneaking into the membrane logic. So the goal of the proposal isn't to protect JS built-in prototypes, but some protection (on either side of the membrane boundary) is needed for the proposal to work. I can update the spec to make this clearer.

the amount of code that is expected to sit on the trusted side

Only the membrane code is trusted, but it can extend that trust to 1p script arbitrarily by just not wrapping 1p script (the membrane logic chooses what to wrap). So, the proposal can easily accommodate what you're describing; just label 3p scripts that you want to wrap, don't wrap 1p scripts.

If you think this is too much of a perf foot gun, we could also consider including a convenience API to say "for X code unit, no longer consider it behind the membrane", to provide a quick way to opt-out trusted 1p script, but still keep untrusted inline code protected.

@pes10k
Collaborator

pes10k commented Mar 17, 2020

The hesitance to make a "1p" vs "3p" distinction is that there are lots of places where 3p code can look like 1p code, which no runtime tracks currently, because it would require complex tainting (Brave has an offline system that does this robustly, and I know FF is exploring one, but… far off). What we could do is have a way of cleaving on a slightly different axis; labeling script that's in the initial page, and script that's not.

Then you can cleanly distinguish between script the page author knows about (or injected by the CSM or whatever), and code that's executing because of script injection or otherwise. WDYT?

@bholley

bholley commented Mar 17, 2020

the only reason for the realm-like requirement is to prevent scripts from sneaking into
the membrane logic.

It's needed to protect more than just the membrane logic - it's also needed to protect anything else that operates on unmediated references, because otherwise those references can leak back to the untrusted script and puncture the membrane. That's what my "membraneBypasses" snippet up-thread is intended to demonstrate.

Only the membrane code is trusted, but it can extend that trust to 1p script arbitrarily
by just not wrapping 1p script (the membrane logic chooses what to wrap).

And if it chooses to not interpose on a given script, that non-interposed script sits on the trusted side of the boundary, and thus uses the trusted set of prototypes, right?

My central point here is that untrusted scripts need separate prototypes from trusted scripts, and practically speaking, that means the two need to run in separate realms. And since engines are already built around the assumption that the script running in the window global realm is trusted, the simplest way to implement this scheme is for untrusted scripts to be loaded into a separate non-Window realm (possibly automatically by the UA).

What we could do is have a way of cleaving on a slightly different axis

For now, I don't have a strong opinion about how labeling is done - just about how useful isolation is achieved once certain scripts have been labeled "untrusted".

@jackfrankland

Sorry for interjecting if it doesn't help your discussion here.

@bholley
For me, I'm struggling to see how the sharing of prototypes would allow an untrusted script to have unmediated access to one of the protected object's methods and properties. The membraneBypasses snippet you created does not in my view sufficiently demonstrate this, but perhaps I'm missing something.

@pes10k

What we could do is have a way of cleaving on a slightly different axis; labeling script that's in the initial page, and script that's not.

Just want to point out that there can be many good reasons to not bundle all scripts with the initial document response. The proposal states "Trusted code must execute before any other code units on the page executes, otherwise page execution is halted". My interpretation was that trusted code was therefore reduced to the membrane creating scripts only.

@bholley

bholley commented Mar 17, 2020

The membraneBypasses snippet you created does not in my view sufficiently demonstrate this

Happy to discuss that further, but perhaps in a separate issue to keep this thread manageable? Would need more detail on exactly what your concern is.

@deian

deian commented Mar 18, 2020

I'm a bit late to the party, but I want to echo @bholley's comments. I think doing this without separate prototype chains/global object is really hard to get right. I think there is a bunch of code in C++ land that will use the context global -- literals are one case of this, you need to handle redefinition of undefined, etc. Separate globals addresses a bunch of this from the start. And fixing this without piggybacking on JS contexts/compartments will be super slow: think about monkey patching Array.prototype (presumably across all the different sandboxes).

I also want to point out that plug-n-play is probably equally hard (whether you have a separate global or just try to wrap): you need to somehow handle mapping objects (and their prototypes, but probably lazily) across the sandbox boundary. Just wrapping will probably break a lot of code that uses instanceof or monkeypatches prototypes. This is all doable (e.g., we did it at intrinsic for Node.js), but I think the just-wrap-it-and-it-works approach is likely not going to work (of course client-side JS may be very different from Node, so am happy to be wrong). I don't want to be discouraging, because I think this is an awesome thing to do.

Last thing I want to point out: when you're writing policy code it's really important to think about how you handle objects that may be poisoned. This is much harder without separate globals, but even with separate globals, it's easy to write vulnerable code. It's fine to have a low-level interface where this is done (e.g., in something in the spirit of Defensive JS), but I think having declarative interfaces is crucial to making this usable.

@pes10k
Collaborator

pes10k commented Mar 18, 2020

@bholley @deian @jackfrankland

Thanks y'all for the feedback. Was just discussing with Brendan today and I think the "optimization" / "ease of use" change I suggested before (only wrapping the web-defined shared state, leaving the JS-provided shared state alone) won't work, largely for the reasons @bholley and @deian just mentioned.

Unfortunately, though, I think the idea of separate execution compartments / realms / etc. just won't work for the use case, either because the host app will need to write capability introduction code (and so would require non-trivial rewrites), or because a general "layer on" solution for capability introduction would require effectively just re-writing the current proposal to avoid a "give away *" capability footgun.

So, I'm going to make some changes to the proposal as follows:

  1. code wrapped / labeled for protection will need to have the entire environment wrapped by the membrane (so everything hanging off window, including JS builtins)
  2. emphasize that the proposal is intended to compose well with, but not depend on, some way of tracing code unit lineage
  3. emphasize that the proposal is meant to look like JS Proxies, but involves more runtime machinery (e.g. to fix cases like the instanceof @deian mentioned above)
  4. emphasize that the goal here is to get the primitives right; writing a declarative layer on top of it seems like a nice idea, but wouldn't want to define the capabilities in that way because there are some policies that won't be stateless, and so are easier to express with script logic (e.g. something like privacy budget)

On my read of the above, and after spending some time yesterday re-reading through the OCAP / membrane / etc literature, I think the above should be what's needed (everything strictly lives on one side of the membrane or the other, but also maintaining the requirement of working with existing apps), and will ultimately scrape down to "can we get it fast enough". 🤞

@deian @bholley @jackfrankland would be grateful for your thoughts on whether the above seems correct to you, or if there are things we might still be missing at this point

@bholley

bholley commented Mar 19, 2020

Thanks for the update!

I'm pretty concerned that the runtime changes required to properly wrap the entire environment (particularly 1 and 3) would be too invasive to be realistic to implement in a production engine. You're of course welcome to continue exploring it, but I just want to set expectations from our side.

I still don't really grok the argument that a separate global requires more intervention than a same-global membrane. In either case, somebody (either the page or the UA) is going to need to inject a membrane proxy handler to define which parts of the object graph the untrusted script is permitted to access. That can either take the form of an access control list (this proposal) or as a virtualized facsimile (on a separate clean global). But in either case, somebody's going to need to write some handler code - either something specific to a given untrusted script, or something general enough to apply to lots of untrusted scripts while maintaining useful security invariants.

We could perhaps defer the rest of this conversation until the scheduled meeting, if that is indeed still happening.

@jackfrankland

@pes10k
I think the changes sound good. I was exploring what was being said about exploiting prototypes, and can think of at least one pertinent scenario that would be possible with the previous proposal: overriding String.prototype.split would give access to cookie data if parsing logic was done on document.cookie. With your change to wrap the entire environment, I can see it preventing such exploits.
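For anyone following along, here is a rough sketch of that scenario. It is only an illustration: `fakeCookie` stands in for document.cookie so it runs outside a browser, and `parseCookies` is a hypothetical first-party helper, not code from the proposal.

```javascript
// Stand-in for document.cookie so the sketch runs outside a browser.
const fakeCookie = 'session=abc123; theme=dark';

// Hypothetical "trusted" first-party helper, written without defensive measures.
function parseCookies(cookieString) {
  const out = {};
  for (const pair of cookieString.split('; ')) {
    const [name, value] = pair.split('=');
    out[name] = value;
  }
  return out;
}

// "Untrusted" script: patches a shared built-in and records every receiver.
const leaked = [];
const originalSplit = String.prototype.split;
String.prototype.split = function (...args) {
  leaked.push(String(this));
  return originalSplit.apply(this, args);
};

// The trusted code runs later and unknowingly hands over the cookie string.
const cookies = parseCookies(fakeCookie);

// Restore the built-in so the sketch has no lasting side effects.
String.prototype.split = originalSplit;

console.log(cookies.session); // "abc123" (parsing still works)
console.log(leaked[0]);       // "session=abc123; theme=dark"
```

The untrusted script never touches document.cookie itself; it only needs a shared prototype and a trusted caller.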

Please take my opinion on this with a pinch of salt though, as I have absolutely no knowledge of how easy this proposal would be to implement. My perspective is as a developer of third party content, who liaises with infosec for publishers and retailers, all of which have varying degrees of hoops to jump through when it comes to trusting our scripts. Having a mechanism to ensure our scripts are complying with the contract would be very valuable to both us as a third party, and the first party (we would be able to update our scripts dynamically without necessarily requiring audits each time - this is my plea for traction with this proposal despite difficulties to implement 😄). I envisage a scenario where we provide the first party with the correct restrictive membrane proxy handler for our origin.

@bholley
I believe the negative of having a separate global would be the assumed requirement for the scripts to be rewritten. If the script was responsible for adding to the DOM for example, it would do this with window.document...appendChild. If it was in a separate global, it would have to access the DOM via something other than its own global. Although I would be happy to make changes to conform to this, I can imagine there could be a pretty large pushback on something like this. Please let me know though if I've completely misunderstood you.

If a wrapper is applied instead, proxy handler traps would still remain optional. If traps aren't defined, the accessor will receive the reference by default.

@deian

deian commented Mar 19, 2020

@jackfrankland
I think what @bholley and I are saying is that the challenge for getting this to work without any changes to the third-party script is the same -- it kind of comes down to what your wrapper is doing. The advantage of the separate global is that you can do this securely. Without the separate global, you'd have to get all the edge cases right (and this is whack-a-mole, if experience from previous JS sandboxing efforts is any indication).

@jackfrankland

jackfrankland commented Mar 19, 2020

Sure, I agree, and I like the idea of being able to embed a script into a separate global that has mediated, synchronous access to the main global - rather than the current solution of third-party iframes or Web Workers that are limited by async messages.

The key thing with the proxy solution in my opinion, is that the first party (or the user-agent or extension, on its behalf), could keep the existing third party services running as is, while taking steps to ensure that they weren't invading the user's privacy. A trap could be created to not allow access to storage APIs for example, which could effectively eliminate any unwanted tracking.
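A minimal sketch of such a trap, using a plain ES Proxy rather than the proposal's own API. `fakeWindow` is a stand-in object so the example runs outside a browser, and the `BLOCKED` list is an assumption about which properties a policy might hide.

```javascript
// Stand-in for the real window so the sketch runs outside a browser.
const fakeWindow = {
  localStorage: { getItem: (key) => `value-for-${key}` },
  sessionStorage: { getItem: (key) => `value-for-${key}` },
  location: { href: 'https://example.test/' },
};

// Hypothetical policy: property names the wrapped script may not see.
const BLOCKED = new Set(['localStorage', 'sessionStorage', 'indexedDB']);

const guardedWindow = new Proxy(fakeWindow, {
  get(target, prop, receiver) {
    // Blocked storage APIs look as if they do not exist.
    if (BLOCKED.has(prop)) return undefined;
    return Reflect.get(target, prop, receiver);
  },
  has(target, prop) {
    // Keep `'localStorage' in window` consistent with the get trap.
    return !BLOCKED.has(prop) && Reflect.has(target, prop);
  },
});

console.log(guardedWindow.location.href);     // "https://example.test/"
console.log(guardedWindow.localStorage);      // undefined
console.log('localStorage' in guardedWindow); // false
```

On its own this only hides the named properties; other routes to storage (document.cookie, for example) would need their own rules.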

@pes10k
Collaborator

pes10k commented Mar 19, 2020

My sense is that to prevent the "give away *" capability problem, you need membranes, if these policies are going to be written by anyone other than deeply trained security folks. And once you need membranes, the current approach is easier than a formal capability introduction system (w/ its own membrane implementation) per script, even if they generalize to the same thing.

BUT! I'd be happy to be wrong about that. And maybe the best way forward is something in the middle (maybe a nice API for per script membranes)? I think that'll scrape down to the current proposal, but at this point we're much more wedded to the goal and being able to ship something correct that works w/o rewriting any application code, than we are to the current suggestion.

Brave's current plan is to update the proposal as per #1 (comment). Brave won't begin implementation of anything for another month or two at the earliest, so eager to knock out the knots before then.

@bholley

We could perhaps defer the rest of this conversation until the scheduled meeting, if that is indeed still happening.

I think that meeting would still be valuable, even if it ends up just being the folks on this thread and maybe one or two other folks Brave side (maybe Brendan or @jumde or whoever gets stuck doing the implementing).

If that sounds like something folks here would be interested in (a call, or f2f or whatever makes the most sense), I think the best thing to do would be to say so at privacycg/meetings#2, so that if there are other privacycg stragglers / looky-loos who are interested, they can jump in too, and find a time then.

What do y'all think?

@bholley

bholley commented Mar 20, 2020

I believe the negative of having a separate global would be the assumed requirement for
the scripts to be rewritten.

That is not intended to be an assumed requirement.

If the script was responsible for adding to the DOM for example, it would do this with
window.document...appendChild. If it was in a separate global, it would have to access
the DOM via something other than its own global.

Right, it would do so via a virtualized DOM implemented with proxies (or just plain objects).

So before the script was injected into that separate global, the UA (or the page) would inject various fake objects into that global that would act (to the extent desired) like a real DOM. But since it's not actually the real DOM, access to the real DOM is entirely governed by the implementation of that proxy layer, and the extent to which the various operations forward, fail silently, throw, etc.
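As a rough sketch of what "forward, fail silently, throw" could look like in handler code: everything below is hypothetical (the `virtualize` helper, the rule names, and the `realDocument` stand-in are assumptions for illustration, not an existing library or the real DOM).

```javascript
// Stand-in for the real document so the sketch runs outside a browser.
const realDocument = {
  title: 'Real page',
  cookie: 'session=abc123',
  nodes: [],
  appendChild(node) { this.nodes.push(node); return node; },
  write(html) { this.nodes.push(html); },
};

// Hypothetical policy: one rule per property the fake document knows about.
const policy = new Map([
  ['title', 'forward'],
  ['appendChild', 'forward'],
  ['cookie', 'silent'],
  ['write', 'throw'],
]);

function virtualize(real, rules) {
  // The proxy target is an empty object, so nothing reaches `real` by default.
  return new Proxy({}, {
    get(_shadow, prop) {
      const rule = rules.get(prop);
      if (rule === 'forward') {
        const value = real[prop];
        // Bind methods so they run against the real object.
        return typeof value === 'function' ? value.bind(real) : value;
      }
      if (rule === 'throw') {
        throw new Error(`${String(prop)} is not available here`);
      }
      return undefined; // 'silent' and unknown properties fail silently
    },
    set() {
      return true; // writes are accepted but never forwarded
    },
  });
}

const virtualDocument = virtualize(realDocument, policy);
virtualDocument.appendChild('ad-slot');

console.log(virtualDocument.title);  // "Real page"
console.log(virtualDocument.cookie); // undefined
console.log(realDocument.nodes);     // [ 'ad-slot' ]
```

A real shim would also have to wrap return values and arguments, otherwise a forwarded call can hand back an unmediated reference.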

A key point here is that the engine does not acquire any additional enforcement responsibilities beyond a correct implementation of the existing spec for realms and proxies, which makes it a much easier sell to implement.

And once you need membranes, the current approach is easier than a formal
capability introduction system (w/ its own membrane implementation) per script

I think in practice the actual membrane logic would be handled by some general library with a pluggable security policy that the end-users would fill in. I think some amount of pre-written helper code would be needed with either approach.

There are obviously ergonomic benefits of having the system do more for you, but the implementation cost of that is quite steep in this case.

@pes10k
Collaborator

pes10k commented Mar 20, 2020

A key point here is that the engine does not acquire any additional enforcement responsibilities beyond a correct implementation of the existing spec for realms and proxies, which makes it a much easier sell to implement.

It seems like both implementations are quickly scraping down to the same thing. But re the above, you still need machinery to keep stuff like instanceof, === and things like that working correctly. Otherwise, the complexity has just shifted to keeping the virtual dom in sync with the real dom, etc, no?

In other words, the more we talk about this, the more I think we're just using different vocab to talk about the same thing (e.g. there isn't much sunlight between "a different global for each code unit, with machinery to make it look like a shared global when desired" and "a single global, with machinery (membranes) to decide when to arbitrarily lie about the state of that global, etc").

@bholley

bholley commented Mar 23, 2020

It seems like both implementations are quickly scraping down to the same thing.

I'm not yet convinced. The approach I'm describing, at a first pass, can be implemented entirely in userspace (given an implementation of explicit Realms, which aren't quite a thing yet). The other approach requires substantial intervention on behalf of the engine.

you still need machinery to keep stuff like instanceof, === and things like that working correctly.

I believe all of this can be managed by the proxy handlers, but it's certainly possible that I've missed something. Can you describe some cases that require engine intervention in more detail?
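For readers new to membranes, a minimal sketch of how handlers alone can keep === and instanceof consistent on the wrapped side. It assumes a one-directional membrane and a hypothetical `makeMembrane` helper; the identity cache is the usual WeakMap technique. It uses a plain constructor function because a class's `prototype` property is non-writable and non-configurable, which would need a shadow target to wrap.

```javascript
// Hypothetical helper: returns a wrap() function with its own identity cache.
function makeMembrane() {
  const wrappers = new WeakMap(); // real object -> its single proxy

  function wrap(value) {
    // Primitives (and null) cross the boundary unchanged.
    if (Object(value) !== value) return value;
    // Reuse the existing proxy so === holds on the wrapped side.
    if (wrappers.has(value)) return wrappers.get(value);

    const proxy = new Proxy(value, {
      get(target, prop) {
        return wrap(Reflect.get(target, prop));
      },
      getPrototypeOf(target) {
        // Report the wrapped prototype so instanceof walks wrapped objects.
        return wrap(Reflect.getPrototypeOf(target));
      },
    });
    wrappers.set(value, proxy);
    return proxy;
  }

  return wrap;
}

const wrap = makeMembrane();

function Widget() { this.child = {}; }
const widget = new Widget();

const wrappedWidget = wrap(widget);
const WrappedWidget = wrap(Widget);

console.log(wrappedWidget === wrap(widget));             // true
console.log(wrappedWidget.child === wrappedWidget.child); // true
console.log(wrappedWidget instanceof WrappedWidget);      // true
console.log(wrappedWidget.child === widget.child);        // false
```

Mixing wrapped and unwrapped values (for example `wrappedWidget instanceof Widget`) is where identity stops lining up, which is why a membrane has to keep every reference on one side or the other.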

@pes10k
Collaborator

pes10k commented Mar 23, 2020

@bholley maybe it'd just be easiest if you could point me to a write up of your approach. That would be very useful in advance of Thursday's call. Because from my reading of the above, at the very least there needs to be a virtual DOM per script, per-proxy isolation / mediation per script, realms implementations, UA modification to add the shim vantage point, and maybe more. So, having the skeleton of your proposal written up in one place to refer to might be useful to make the conversation and call more productive.

"substantial intervention on behalf of the engine" is not a limitation on our part, we're happy to intervene as much as possible to get to the right point if it can be done in a performant way.

@bholley

bholley commented Mar 23, 2020

@bholley maybe it'd just be easiest if you could point me to a write up of your approach.
That would be very useful in advance of Thursday's call.

To summarize the approach:
(1) Create a fresh realm.
(2) Inject a shim layer to mimic the behavior of Window built-ins to the extent that you want to support them. This is probably best accomplished by Proxies implementing membrane semantics.
(3) Evaluate the untrusted script(s) in the scope of that realm, rather than the window.

Steps 1-3 can be performed directly by the site, or as a UA intervention to supplant the default behavior of certain <script> tags on unmodified pages.
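Steps 1-3 can be sketched in a few lines. Explicit Realms aren't available yet, so this uses Node's `vm` module as a stand-in for "a fresh realm"; that substitution, the `realDocument` stand-in, and the one-property shim are all assumptions for illustration. Node documents `vm` as not being a security mechanism, so treat this as a shape, not a sandbox.

```javascript
const vm = require('vm'); // Node.js stand-in for a fresh realm

// Stand-in for the real document so the sketch runs outside a browser.
const realDocument = { title: 'Real page', cookie: 'session=abc123' };

// (2) A shim that mimics `document` only as far as we choose to support it.
const shimDocument = new Proxy({}, {
  get(_target, prop) {
    return prop === 'title' ? realDocument.title : undefined;
  },
});

// (1) Create the fresh realm, with the shim installed on its global.
const realm = vm.createContext({ document: shimDocument });

// (3) Evaluate the untrusted script in that realm rather than the main global.
const untrustedSource = '[document.title, typeof document.cookie]';
const result = vm.runInContext(untrustedSource, realm);

console.log(result[0]); // "Real page"
console.log(result[1]); // "undefined"
```

The shim here is deliberately tiny and still leaks (its prototype chain is from the outer realm, for instance), which is the "tricky details are all in (2)" part.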

The tricky details are all in (2) of course, and are well beyond anything that I might reasonably write up at the present level of engagement. I'm merely suggesting that you explore accomplishing your goals in user-space if it's practical to do so, because it's a much lighter lift than getting something standardized.

Why might (2) not be practical to build? I can think of two broad reasons. The first would be if it weren't possible to build a high-fidelity (and sufficiently-configurable) membrane layer for the DOM out of ES proxies. This could be the case (Gecko's implementation uses proxies at the C++ level, which is arguably cheating), but projects like [1] seem to suggest it is possible. The second would be if it weren't possible to construct a security policy for that membrane that was permissive enough to be useful but restrictive enough to provide useful security properties. That could very well be the case, but if so, it's equally a problem for the registerMembraneProxy proposal.

"substantial intervention on behalf of the engine" is not a limitation on our part

Sure - but to the extent that you're interested in getting other vendors on board, complexity of implementation is an important aspect to consider.

[1] https://github.com/ajvincent/es-membrane

@pes10k
Collaborator

pes10k commented Mar 25, 2020

@bholley

Could you join the call tomorrow to discuss this in more detail? I think that would be helpful.

Just to round out the conversation here until then though:

  1. We don't want to hinge this on realms, since there are still many unknowns there, including timeline, final shape of the proposal, etc.
  2. Regarding things like === and instanceof, I've been reading through this [1]. It seems like both it and the implementation you pointed to [2] can solve identity problems, so that's great for both approaches.
  3. I'm still thinking this largely scrapes down to very similar systems, that either system could be implemented from the other's approach, and that both would likely use a great deal of the same machinery under the hood.

All this would be great to discuss on the call if you can join!

1: https://github.com/tvcutsem/harmony-reflect/blob/master/examples/membrane.js
2: https://github.com/ajvincent/es-membrane

@pes10k
Collaborator

pes10k commented Mar 25, 2020

that call is being discussed here privacycg/meetings#2

@bholley

bholley commented Mar 25, 2020

Yep I'll be there. See you then!

@pes10k
Collaborator

pes10k commented Apr 6, 2020

Apologies on the delay in following up here, but wanted to follow up with a couple notes based on the group call and some follow up discussions.

  1. I'm convinced we need to wrap all global structures, not just web-related ones, to maintain the desired privacy / security capabilities.

  2. I'm even more certain now that the two systems are fully equivalent, both in capability (no pun intended) and likely in performance. The motivating case for this (which I should have said explicitly up top, but needed to have dragged out of me by @deian) is something like mootools, where sometimes scripts need to be able to modify prototype chains, etc. At this point, you need all the same indirection points, either through the replacement-global-that-is-really-a-membrane in the original proposal, or through each script's separate prototype chains in @bholley's realms-focused alternative; everything is de-JIT-ed either way. The only exceptions would be cases where you wanted to have a performance improvement for non-modified prototype chains, which would be trivial to do in existing v8 machinery either way.

  3. You'd need to implement both in more or less the same way anyway, to maintain the same properties. In v8 terms, each wrapped script would be in its own context, with its global replaced by an indirection point (the membrane in the original proposal, some indirection point in a realm-style alternative, where you decide whether to give access to true or modified prototypes, etc).

I'm going to update the proposal text now. I don't expect we'll start prototyping work on our end for another 1-2 months.

@bholley

bholley commented Apr 6, 2020

something like mootools, where sometimes scripts need to be able to modify prototype chains

The key question here is whether you want these modifications to be visible to all scripts, or just the sandboxed scripts that introduce them (ditto for |var| declarations that end up on the global). The hypothesis behind the separate-realm approach is that it's mostly the latter, because it's unlikely that a first-party script would depend on global state pollution introduced by an untrusted third-party script. There are certainly counter-examples, but the security policy is much more likely to be the limiting factor for compatibility.
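To make the "visible to just the sandboxed scripts" case concrete, here is a small sketch. It again uses Node's `vm` contexts as a stand-in for a separate realm (an assumption; explicit Realms don't exist yet), and the `first` method is a made-up example of mootools-style prototype extension.

```javascript
const vm = require('vm'); // Node.js stand-in for a separate realm

const sandboxRealm = vm.createContext({});

// A sandboxed script extends a built-in prototype, mootools-style.
vm.runInContext(
  'Array.prototype.first = function () { return this[0]; };',
  sandboxRealm
);

// Scripts in the same realm see the extension...
const insideSees = vm.runInContext('typeof [].first', sandboxRealm);

// ...but the main realm's Array.prototype is a different object.
const outsideSees = typeof [].first;

console.log(insideSees);  // "function"
console.log(outsideSees); // "undefined"
```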

You'd need to implement both in more or less the same way anyway, to maintain the
same properties.

To be clear, the separate-realm approach would not seek to make prototype/global modifications visible, enabling the simple implementation described in #1 (comment)

@pes10k
Collaborator

pes10k commented Apr 6, 2020

The hypothesis behind the separate-realm approach is that it's mostly the latter, because it's unlikely that a first-party script would depend on global state pollution introduced by an untrusted third-party script

I think this is unlikely to be true, given the popularity of libraries like mootools (still used on > 1m sites, for example https://publicwww.com/websites/mootools/). Any approach that meets the goal of not requiring rewriting existing code needs to work in such cases. At least, for our goals with the project :)

@bholley

bholley commented Apr 6, 2020

I think this is unlikely to be true, given the popularity of libraries like mootools

Say more? I'd think mootools would generally end up on the same side of the trust boundary as the author code that depends on it, because it's a known-quantity and because such libraries are almost always loaded from the first-party domain.

@pes10k
Collaborator

pes10k commented Apr 6, 2020

I'm sure that will happen sometimes (maybe even most times?), but

  1. we want a solution that'll work for all cases, not just most
  2. I'm certain there are cases where people are including mootools from (say) Google's API CDN, and people would like to allow that to do mootools-like things, but not arbitrary things, and
  3. we still need the capability for the other decision points (browser, extensions) that want to draw different trust boundaries than a site. E.g. if a site includes mootools and then their own ad library from arbitrary parties (or even rolls them up into a single file), a browser or extension might want to allow the ad library to do normal things except hit storage, something like that

@jackfrankland

Many publishers include their scripts from different domains as standard. Another prevalent case might be loading polyfills, where subsequent scripts rely on their existence.

To be clear, the separate-realm approach would not seek to make prototype/global modifications visible

@bholley By global modifications do you also mean window.foo?

@bholley

bholley commented Apr 6, 2020

@bholley By global modifications do you also mean window.foo?

I primarily mean |var foo|, which ends up on the global. In the separate-global implementation, you'd have |this !== window|, so the behavior of |window.foo| would be up to the handler of the proxy impersonating |window|.
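A small sketch of that distinction, with Node's `vm` contexts standing in for the separate global (an assumption, as is the `fakeWindow` proxy and its record-but-don't-forward handler):

```javascript
const vm = require('vm'); // Node.js stand-in for a separate global

// Hypothetical handler for the proxy impersonating `window`: it records
// writes instead of forwarding them to any real window.
const writes = [];
const fakeWindow = new Proxy({}, {
  set(_target, prop, value) {
    writes.push([prop, value]);
    return true; // accept the write without forwarding it
  },
});

const separateGlobal = vm.createContext({ window: fakeWindow });

// `var foo` lands on the sandbox's own global; `window.bar` hits the handler.
vm.runInContext('var foo = 1; window.bar = 2;', separateGlobal);
const thisIsNotWindow = vm.runInContext('this !== window', separateGlobal);

console.log(separateGlobal.foo);    // 1
console.log(typeof globalThis.foo); // "undefined"
console.log(writes);                // [ [ 'bar', 2 ] ]
console.log(thisIsNotWindow);       // true
```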

Many publishers include their scripts from different domains as standard.

Sure, but generally from a domain under their control (or one from a widely-trusted entity).

Broadly speaking, there are two pretty disjoint sets of scripts: scripts that are under the first-party author's control and are part of the primary experience, and those that are not (i.e. tracking/ad/analytic scripts). I don't think sandboxing the former is high-value or practical. The latter might be, depending on the implementation strategy.

Anyway, I think I've made my position clear at this point, and don't have the cycles to discuss more in the near term. I'd be interested to hear an update once Brave has taken a crack at implementing.

@pes10k
Collaborator

pes10k commented Apr 6, 2020

Yep, sounds good, thanks for your feedback @bholley :)

@jackfrankland

Apologies for dragging this out, but I just want to make a final point for the benefit of anyone else who might be following this thread. There are third party scripts that are part of the primary experience of first party sites. These scripts are often "Trusted" to be on the first party, but only to a certain extent, meaning that the first party would expect/want them to not do certain things with the access they have to window; currently there is no mechanism to ensure this.

Most JS APIs/SDKs are written with the assumption that the globalThis is the window. Good examples of third party APIs that offer a primary experience, but would break under a separate global:
https://www.youtube.com/iframe_api
https://maps.googleapis.com/maps/api/js


5 participants