Protecting Against HSTS Abuse

Mar 16, 2018

by Brent Fulgham

HTTP Strict Transport Security (HSTS) is a security standard that provides a mechanism for web sites to declare themselves accessible only via secure connections, and to tell web browsers where to go to get that secure version. Web browsers that honor the HSTS standard also prevent users from ignoring server certificate errors.

Apple uses HSTS on iCloud.com, for example, so that any time a visitor attempts to navigate to the insecure address “http://www.icloud.com” either by typing that web address, or clicking on a link, they will be automatically redirected to “https://www.icloud.com”. HSTS will also cause the web browser to go to the secure site in the future, even if the insecure address is used. This is a great feature that prevents a simple error from placing users in a dangerous state, such as performing financial transactions over an unauthenticated connection.

What could be wrong with that?

Well, the HSTS standard describes that web browsers should remember when redirected to a secure location, and to automatically make that conversion on behalf of the user if they attempt an insecure connection in the future. This creates information that can be stored on the user’s device and referenced later. And this can be used to create a “super cookie” that can be read by cross-site trackers.

HSTS as a Persistent Cross-Site Identifier (aka “Super Cookie”)

An attacker seeking to track site visitors can take advantage of the user’s HSTS cache to store one bit of information on that user’s device. For example, “load this domain with HTTPS” could represent a 1, while no entry in the HSTS cache would represent a 0. By registering some large number of domains (e.g., 32 or more), and forcing resource loads from a controlled subset of those domains, they can create a large enough vector of bits to uniquely represent each site visitor.

The HSTS authors recognized this possibility in Section 14.9 of their spec:

…it is possible for those who control one or more HSTS Hosts to encode information into domain names they control and cause such UAs to cache this information as a matter of course in the process of noting the HSTS Host. This information can be retrieved by other hosts through cleverly constructed and loaded web resources, causing the UA to send queries to (variations of) the encoded domain names.

On the initial website visit:

A random number is assigned to the visitor, for example 8396804.
This can be represented as a binary value (e.g., 100000000010000000000100)
The tracker script then makes subresource requests to a tracker-controlled domain over https, one request per active bit in the tracking identifier.
- https://bit02.example.com
- https://bit13.example.com
- https://bit23.example.com
- … and so on.
The server responds to each HTTPS request with an HSTS response header, which caches the tracking value in the web browser.
Now we are guaranteed to load the HTTPS version of bit02.example.com, bit13.example.com, and bit23.example.com, even if the load is attempted over HTTP.

On subsequent website visits:

The tracker script loads 32 invisible pixels over HTTP that represent the bits in the binary number.
Since some of those bits (bit02.example.com, bit13.example.com, and bit23.example.com in our example) were loaded with HSTS, they will automatically be redirected to HTTPS.
The tracking server transmits one image when they are requested over HTTP, and a different image when requested over HTTPS.
The tracking script recognizes the different images, turns those into zero (HTTP) and one (HTTPS) bits in the number, and voila — your unique binary value is recreated and you are tracked!

Attempts to mitigate this attack are challenging because of the difficulty in balancing security and privacy goals. Improperly mitigating the attack also runs the risk of weakening important security protections.

Challenges

Periodically, the privacy risks of HSTS are discussed in the media as a theoretical tracking vector (e.g., [1], [2], and [3]). Absent evidence of actual malicious abuse of the HSTS protocol, browser implementors erred on the side of caution and honored all HSTS instructions provided by sites.

Recently we became aware that this theoretical attack was beginning to be deployed against Safari users. We therefore developed a balanced solution that protects secure web traffic while mitigating tracking.

Apple’s Solution

The HSTS exploit consists of two phases: the initial tracking identifier creation phase, and the subsequent read operation. We decided to apply mitigations to both sides of the attack.

Mitigation 1: Limit HSTS State to the Hostname, or the Top Level Domain + 1

We observed tracking sites constructing long URL’s encoding the digits in various levels of the domain name.

For example:

https://a.a.a.a.a.a.a.a.a.a.a.a.a.example.com
https://a.a.a.a.a.a.a.a.a.a.a.a.example.com
https://a.a.a.a.a.a.a.a.a.a.a.example.com
https://a.a.a.a.a.a.a.a.a.a.example.com
https://a.a.a.a.a.a.a.a.a.example.com
https://a.a.a.a.a.a.a.a.example.com
https://a.a.a.a.a.a.a.example.com
…etc...

We also observed tracking sites using large number of sibling domain names, for example:

https://bit00.example.com
https://bit01.example.com
https://bit02.example.com
...etc...
https://bit64.example.com

Telemetry showed that attackers would set HSTS across a wide range of sub-domains at once. Because using HSTS in this way does not benefit legitimate use cases, but does facilitate tracking, we revised our network stack to only permit HSTS state to be set for the loaded hostname (e.g., “https://a.a.a.a.a.a.a.a.a.a.a.a.a.example.com”), or the Top Level Domain + 1 (TLD+1) (e.g., “https://example.com”).

This prevents trackers from efficiently setting HSTS across large numbers of different bits; instead, they must individually visit each domain representing an active bit in the tracking identifier. While content providers and advertisers may judge that the latency introduced by a single redirect through one origin to set many bits is imperceptible to a user, requiring redirects to 32 or more domains to set the bits of the identifier would be perceptible to the user and thus unacceptable to them and content providers. WebKit also caps the number of redirects that can be chained together, which places an upper bound on the number of bits that can be set, even if the latency was judged to be acceptable.

This resolves the setting side of the super cookie equation.

Mitigation 2: Ignore HSTS State for Subresource Requests to Blocked Domains

We modified WebKit so that when an insecure third-party subresource load from a domain for which we block cookies (such as an invisible tracking pixel) had been upgraded to an authenticated connection because of dynamic HSTS, we ignore the HSTS upgrade request and just use the original URL. This causes HSTS super cookies to become a bit string consisting only of zeroes.

Conclusion

Telemetry gathered during internal regression testing, our public seeds, and the final public software release indicates that the two mitigations described above successfully prevented the creation and reading of HSTS super cookies while not regressing the security goals of first party content. We believe them to be consistent with best practices, and to maintain the important security protections provided by HSTS. We have shared the details of Mitigation 1 with the authors of RFC 6797, and are working to incorporate the behavior as part of the standard.

However, the internet is a wide space full of unique and amazing uses of Web Technology. If you feel that you have a legitimate case where these new rules are not working as intended, we would like to know about it. Please send feedback and questions to web-evangelist@apple.com or @webkit on Twitter, and file any bugs that you run into on WebKit’s bug tracker.