Escaped HTML tags are "un-escaped" when rendering HTML #52
Comments
I tracked it down to the changes made in 84c495d (#45), where the But it seems like (part of) the whole point of #45 was to not leave HTML entities as they are so as to fix #44... @caseyjhol Care to comment? 🙂
Ah I could've sworn I tested and accounted for this scenario, but clearly I missed the mark. I think the better approach then might be to limit the scope to
@caseyjhol Thanks for the quick response! You're right, you did add an example of a very similar situation:
But as it turns out, the entities are in fact "un-escaped" during rendering of the test MJML file as described in this issue - the reason the test passes anyway is that the space between
# no exception:
htmlcompare.assert_same_html("< script >", "< script >")
# exception:
htmlcompare.assert_same_html("<script >", "<script >")
I haven't checked HTMLCompare's code, but I think this is probably because putting a space between
But that is exactly the case that people don't need to guard against with sanitization/escaping. So the fix for this issue should add another test case (or extend this one) with an escaped HTML tag like
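As a rough illustration of why the space matters, here is a minimal sketch using Python's standard-library html.parser as a stand-in. It is not HTMLCompare's actual code, and the names TagCollector and start_tags are made up for this example. An HTML parser only opens a tag when a letter directly follows the "<", so "< script >" is treated as plain text while "<script >" is a real start tag.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records the name of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags the stdlib parser recognises in `markup`."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# "<" followed by a space is text, so no tag is reported.
print(start_tags("< script >"))
# "<" directly followed by a letter opens a tag.
print(start_tags("<script >"))
```

So a test that only uses the spaced-out variant cannot tell an escaped tag from an un-escaped one, which is why a test case without the space is needed.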
@sh-at-cs Thank you for reporting this issue, including the detailed analysis. I think this is a serious issue which we need to fix. I'll try to spend some time on this later - either on reviewing a solution or trying to fix this myself. When we merged #45 somehow the security implications completely escaped my attention.
Unfortunately all I could do yesterday was to add the test case in a new branch fix-escaped-html-tags. How should we fix this issue? I see two approaches (just brainstorming here):
Would that solve the issue if we also remove the
Going to try to dig into this today a bit.
I pushed some additional code in the branch fix-escaped-html-tags. Basically the idea is to do the unescape only for contents of
I also added CSS parsing using tinycss2. My idea is that this would blow up if the contents were invalid (e.g. other HTML code) but I did not test that. Not sure if it is worth the additional dependency because arbitrary CSS can influence the displayed contents...
Update: It seems like tinycss2 happily parses even completely invalid HTML but I think css_inline would remove that. Should we just assume that the
Maybe you can also check the security advisory draft I created for this issue. Feel free to suggest additions and please check if you agree with the severity classification.
Consider:
In the resulting HTML output, the formerly HTML-escaped < (&lt;) and > (&gt;) are "un-escaped", so the rendered HTML actually contains Pretty unsafe: <script>.
Why does this happen? This reverses the user's safety measures and can be dangerous.
The MJML reference implementation doesn't do this and correctly keeps such escape sequences untouched: https://mjml.io/try-it-live/fvvhZhdu9V
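To make the risk concrete, here is a minimal sketch using only Python's standard-library html module. It does not reproduce the library's rendering code: sanitize and unescaping_render are hypothetical names, and unescaping_render simply assumes a renderer that un-escapes entities in text content, which is the behaviour described in this issue.

```python
import html


def sanitize(user_text):
    """What a careful caller does before embedding user text in markup."""
    return html.escape(user_text)


def unescaping_render(fragment):
    """Hypothetical stand-in for a renderer that un-escapes entities."""
    return html.unescape(fragment)


user_text = "<script>alert(1)</script>"

escaped = sanitize(user_text)
print(escaped)  # &lt;script&gt;alert(1)&lt;/script&gt;

# The un-escaping step hands back the original, executable markup.
rendered = unescaping_render(escaped)
print(rendered == user_text)  # True
```

A renderer that leaves the entities untouched, as the MJML reference implementation does, would output the escaped string unchanged.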