Surfin' Safari

User Agent String Changes On WebKit Trunk

Posted by Peter Kasting on Thursday, March 31st, 2011 at 1:25 pm

Several changes to the User Agent (UA) string have recently landed on WebKit trunk. These changes are designed to add UA string detail, remove redundancy, and increase compatibility with Internet Explorer, and they coincide with similar changes in Firefox 4.

Here are a few sample pre-change UA strings:

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/534.27+ (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/534.27+ (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27

Here are the equivalent post-change UA strings:

Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/534.27+ (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) AppleWebKit/534.27+ (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27

In detail, the differences are as follows:

  1. On Windows, the initial “Windows;” platform identifier has been removed. This string is also present in the subsequent OS version identifier, and removing it is more compatible with Internet Explorer, whose UA string doesn’t have this initial token.
  2. The “U” SSL encryption strength token has been removed. This token dates from more than a decade ago, when U.S. export laws limited the encryption strength that could be built into software shipped to various other countries; the valid values are “U” (for “USA” 128-bit encryption support), “I” (for “International” 40-bit encryption support), and “N” (for “None”, no encryption support). These days, it’s unusual to ship without 128-bit SSL support everywhere; ports can add “I” or “N” if necessary.
  3. On 64-bit versions of Windows, tokens have been added after the OS version. 32-bit builds running on 64-bit Windows add “WOW64”. (“WOW64” stands for “Windows 32-bit On Windows 64-bit” and is the name Microsoft gives its 32-bit compatibility subsystem.) 64-bit native builds use “Win64; x64” for x64-based processors and “Win64; IA64” for Itanium systems. These tokens match what Internet Explorer uses and are useful for sites that need to provide download links for native executables.
  4. The locale has been removed. Web authors who want to know what languages a browser supports should use the HTTP Accept-Language header instead, which can supply multiple locales.
  5. Windows CE builds of Qt-based ports should report the OS version slightly more accurately (e.g. “Windows CE 5.1” instead of “Windows CE 5.x” or “Windows 5.1”).
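
For web authors adapting to items 3 and 4, here is a minimal Python sketch of how a site might read the new Windows architecture tokens and take languages from Accept-Language rather than from the UA string. The function names (`platform_tokens`, `windows_architecture`, `preferred_languages`) and the return labels are made up for illustration and are not part of WebKit. The sketch also assumes the platform details sit in the first parenthesised group of the UA string, as in the samples above.

```python
import re


def platform_tokens(user_agent):
    """Return the semicolon-separated tokens of the first parenthesised group."""
    match = re.search(r"\(([^)]*)\)", user_agent)
    if match is None:
        return []
    return [t.strip() for t in match.group(1).split(";") if t.strip()]


def windows_architecture(user_agent):
    """Classify a Windows UA string by the tokens described in item 3.

    Returns None for non-Windows UA strings. The labels are illustrative.
    """
    tokens = platform_tokens(user_agent)
    if not any(t.startswith("Windows NT") for t in tokens):
        return None
    if "WOW64" in tokens:
        return "wow64"   # 32-bit build running on 64-bit Windows
    if "Win64" in tokens and "x64" in tokens:
        return "x64"     # native 64-bit build, x64 processor
    if "Win64" in tokens and "IA64" in tokens:
        return "ia64"    # native 64-bit build, Itanium
    return "unspecified"  # no 64-bit token present (also true of old-style strings)


def preferred_languages(accept_language):
    """Return language tags from an Accept-Language value, most preferred first."""
    entries = []
    for position, part in enumerate(accept_language.split(",")):
        pieces = [p.strip() for p in part.split(";")]
        tag = pieces[0]
        if not tag:
            continue
        quality = 1.0  # a tag without a q parameter defaults to 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position.
            entries.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(entries)]


ua = ("Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/534.27+ "
      "(KHTML, like Gecko) Version/5.0.4 Safari/533.20.27")
print(windows_architecture(ua))
print(preferred_languages("en-US,en;q=0.8,de;q=0.9"))
```

Because the tokens are matched by value rather than by position, the same code keeps working if a port adds an "I" or "N" encryption token.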

As various ports ship these changes, you might notice web compatibility problems. If so, please point webmasters to this post, and/or file bugs in the bug tracker.
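
Sites that sniff the UA string by token position are the ones most likely to break. As a rough sketch (not an official recommendation), the Python below reads the platform section by token value, so pre-change and post-change strings produce comparable results. The name `parse_platform` and the locale heuristic are assumptions made for this example.

```python
import re

# Encryption-strength tokens described in item 2.
ENCRYPTION_TOKENS = {"U", "I", "N"}
# Heuristic for locale tokens such as "en-US" or "en-us" (an assumption).
LOCALE_PATTERN = re.compile(r"[A-Za-z]{2,3}(?:-[A-Za-z]{2})?")


def parse_platform(user_agent):
    """Split the first parenthesised group into platform, encryption, locale."""
    match = re.search(r"\(([^)]*)\)", user_agent)
    raw = match.group(1) if match else ""
    tokens = [t.strip() for t in raw.split(";") if t.strip()]

    result = {"platform": [], "encryption": None, "locale": None}
    for token in tokens:
        if token in ENCRYPTION_TOKENS:
            result["encryption"] = token   # absent in post-change strings
        elif token == "Windows":
            continue                       # redundant pre-change identifier
        elif LOCALE_PATTERN.fullmatch(token):
            result["locale"] = token       # absent in post-change strings
        else:
            result["platform"].append(token)
    return result


old_mac = ("Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) "
           "AppleWebKit/534.27+ (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27")
new_mac = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) "
           "AppleWebKit/534.27+ (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27")

# Both formats yield the same platform tokens.
print(parse_platform(old_mac)["platform"] == parse_platform(new_mac)["platform"])
```

Code written this way treats the encryption token and locale as optional, so it does not depend on the pre-change token order.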

10 Responses to “User Agent String Changes On WebKit Trunk”

  1. CyberSkull Says:

    This should make writing a UA parser/regex simpler.

  2. diamondsw Says:

    Faking “KHTML” and “like Gecko” should be behind us by now – WebKit is fully established, between the massive use in mobile, Safari, and Chrome. When I saw the UA was being simplified, I was hoping this would go away:
    “(KHTML, like Gecko) Version/5.0.4”

    Just tell me the WebKit build and the host application version:
    WebKit/534.27 Safari/533.20.27

    Meanwhile, I’m very surprised to see that WebKit isn’t matching Firefox in the presentation of the host OS. “Intel Mac OS X 10_6_7” seems awfully verbose compared to “Intel Mac OS X 10.6”. Why underscores, and why report the point release of the OS? We’re not reporting the service pack or hotfix level of Windows.

  3. JAB Creations Says:

    While I’m glad the user agent has gotten a little attention these changes still fall short of what needs to be done.

    Here is an example with only the necessary information…
    Safari/5.0.4; AppleWebKit/534.27; Windows NT 6.0; WOW64;

  4. squareman Says:

    I second JAB. Why are we still playing this silly Mozilla/5.0 game? Why?

  5. Vasil Dinkov Says:

    @JAB Creations and @squareman
    It’s obvious it’s for compatibility with old content.

  6. Alican Çubukçuoğlu Says:

    @Vasil Dinkov
    Actually, it’s obvious that it’s NOT for compatibility with old content. This update already breaks compatibility with some old content.

  7. Peter Kasting Says:

    There are a ridiculous number of web pages that rely on all those pieces, many of which are unmaintained. We will never be able to remove them.

  8. JAB Creations Says:

    Even if something is no longer being maintained it still has to be hosted which implies someone is occasionally paying a web host to keep the content up.

    Also broken code should break, period. Working on a browser to “correctly” display broken code that can’t be correctly displayed in any context wastes everyone’s time. It wastes the time of those building the browsers and it reduces the need for competent web designers and web developers who write standards compliant code.

    In regards to what CyberSkull said: years ago, yes, I would have used regular expressions to detect browsers based on user agents; now I rely on DOM object detection. However, why are the strings “KHTML” and “Gecko” still in the user agent? Why not just add the strings “MSIE” and “Opera” while we’re at it? If people are relying on those strings then we should STILL remove them. Why? If their pages break, they either have to learn how to code correctly or, if they can’t or won’t, those job positions should become open to those who can do such jobs correctly.

  9. DrPizza Says:

    Why the asymmetry between Macintosh and Windows?

    The Windows string has the pattern:
    (; ; )
    The Mac OS X string has the pattern:
    (; )
    Why not:
    (Mac OS X 10.6.7; Intel; x64) to match?

    And man, do we still really need “KHTML, like Gecko”? WebKit’s compatible with neither!

  10. DrPizza Says:

    Oh goddammit, the stupid thing ate my angle brackets.

    Windows is:
    ([full OS version]; [subsystem]; [optional architecture])
    Mac is:
    ([pointless static Macintosh text]; [architecture plus operating system with version number weirdly using underscores instead of dots])