Surfin' Safari

WebKit gets Native getElementsByClassName

Posted by David Smith on Friday, December 21st, 2007 at 11:43 pm

getElementsByClassName is one of the more common functions requested by JavaScript programmers (and added by JavaScript libraries); it works along the same lines as getElementsByTagName and getElementById in looking up elements of a web page by their properties. In fact, it’s so common that in the new in-progress HTML5 specification it’s been added to the official DOM API. Last week WebKit joined upcoming versions of Firefox and Opera in supporting this new feature.

The advantages of a native implementation are clear:

  • No additional JavaScript library files required
  • Clearly specified and consistent behavior
  • Blindingly fast

How fast? Let’s have a look at WebKit’s shiny new implementation. For testing purposes I wrote a simple benchmark allowing comparison between three different methods for getting elements by their class names. The first is the new native getElementsByClassName, and the last two are both from prototype.js; one uses XPath, and the other is a straight JavaScript/DOM implementation.
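To give a sense of what a straight JavaScript/DOM implementation has to do, here is a minimal sketch. It is not Prototype's code or the code used in the benchmark; the function names `hasAllClasses` and `getElementsByClassNameFallback` are made up for illustration. It assumes the argument is a whitespace-separated list of class names that must all be present, and it returns a plain array rather than the live collection a native method returns.

```javascript
// Hypothetical helper (not from Prototype or WebKit): true when the
// className string contains every class in `wanted` as a whole token.
function hasAllClasses(className, wanted) {
  var text = typeof className === "string" ? className : "";
  var have = " " + text.replace(/\s+/g, " ") + " ";
  for (var i = 0; i < wanted.length; i++) {
    if (have.indexOf(" " + wanted[i] + " ") === -1) {
      return false;
    }
  }
  return true;
}

// Hypothetical fallback: walks every descendant element of `root` in
// document order and collects those carrying all requested classes.
function getElementsByClassNameFallback(root, classNames) {
  var wanted = classNames.split(/\s+/).filter(Boolean);
  var matches = [];
  if (wanted.length === 0) {
    return matches; // nothing to match against
  }
  (function walk(node) {
    var kids = node.childNodes || [];
    for (var i = 0; i < kids.length; i++) {
      var child = kids[i];
      if (child.nodeType !== 1) {
        continue; // skip text nodes, comments, and so on
      }
      if (hasAllClasses(child.className, wanted)) {
        matches.push(child);
      }
      walk(child);
    }
  })(root);
  return matches;
}
```

Every element under the root is visited and its class string is scanned in script, which is the kind of work a native implementation can do far more cheaply.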

[Figure: graph of getElementsByClassName benchmark results]

The results speak for themselves. Web applications that do a lot of class lookups should see noticeable speed improvements when run with any of the native implementations, and existing JavaScript libraries can fill in for older or less advanced browsers. John Resig has run a different benchmark of the same functionality in Firefox 3 and observed a similar native vs. JavaScript/DOM speedup ratio.
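One way a page might use the native method where it exists and a library function elsewhere is simple feature detection. The sketch below is only an illustration: `makeClassLookup` is a hypothetical name, and the fallback is whatever script implementation the page already ships, passed in as a function taking a root node and a class string.

```javascript
// Hypothetical wrapper: prefer the native method when the root node has
// one, otherwise defer to a caller-supplied script implementation.
function makeClassLookup(fallback) {
  return function (root, classNames) {
    if (root && typeof root.getElementsByClassName === "function") {
      return root.getElementsByClassName(classNames);
    }
    return fallback(root, classNames);
  };
}
```

A page could create the lookup once, for example `var byClass = makeClassLookup(libraryFunction);`, and then call `byClass(document, "box")` without caring which path is taken. The two paths may return different kinds of collections (a live native collection versus a plain array), so callers should rely only on indexing and `length`.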

Testing Methodology:
Each test was run three times and the results were averaged, with the page reloaded between runs. The browser used was Safari 3 with WebKit r28911. The hardware was a 2 GHz Apple MacBook with 3 GB of RAM running Mac OS X 10.5.1. The XPath and JS/DOM functions are from Prototype 1.5.1.
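For readers who want to try a similar measurement, the run-and-average step could be sketched roughly as follows. This is not the code of the benchmark page: `timeRun`, `average`, and `benchmark` are hypothetical names, and the sketch repeats runs inside a single page load instead of reloading between runs, so caches may be warmer than in the methodology described above.

```javascript
// Hypothetical timing helpers; names and structure are illustrative only.

// Calls fn `iterations` times and returns the elapsed wall-clock milliseconds.
function timeRun(fn, iterations) {
  var start = Date.now();
  for (var i = 0; i < iterations; i++) {
    fn();
  }
  return Date.now() - start;
}

// Arithmetic mean of an array of numbers (NaN for an empty array).
function average(values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return values.length ? sum / values.length : NaN;
}

// Performs `runs` timed runs and reports each time plus the mean.
function benchmark(fn, iterations, runs) {
  var times = [];
  for (var r = 0; r < runs; r++) {
    times.push(timeRun(fn, iterations));
  }
  return { times: times, mean: average(times) };
}
```

In a browser, a call such as `benchmark(function () { document.getElementsByClassName("box"); }, 1000, 3)` would time the native lookup; the class name and iteration count here are arbitrary choices.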

17 Responses to “WebKit gets Native getElementsByClassName”

  1. ain Says:

    I’ve just tested it and it works superbly. Here’s the test case.

  2. anup Says:

    It is definitely good to have such a method native to WebKit. I had a quick look at the JS you used for the XPath test and it is something like this:

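    // Builds an XPath query that matches any descendant element whose
    // class attribute contains className as a whole, space-delimited word.
    document.getXPathElementsByClassName = function(className, parentElement) {
      var q = ".//*[contains(concat(' ', @class, ' '), ' " + className + " ')]";
      return document._getElementsByXPath(q, parentElement);
    };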

    In short, using double slash ( // ) in the xpath is a major performance killer. Admittedly the page you are testing is not that big, but using more specific XPath will be a lot better, as the double slash means *every* node from the context down will be visited.

    It would be good if you could change the test perhaps. I would still expect the native implementation to be far better, but I’d expect the xpath to be a bit quicker than it currently is.

  3. richardyork Says:

    Are there plans to support the Selectors API in Webkit? Personally, I find that much more useful.

  4. Mark Rowe Says:

    @anup: The XPath query is taken directly from Prototype’s implementation of getElementsByClassName. It seems that every node being visited is the point of the query, and that returning every node matching the criteria would be rather difficult without visiting them.

  5. anup Says:

    @Mark: sorry, what I meant is that the XPath will visit non-div nodes as well. In cases where double slashes are hard to avoid, it would be better to use something like .//div rather than .//*, but as you said, it has been taken directly from Prototype’s implementation, so your comparison is against a real-world scenario, which is fair enough. (My point was more that, when comparing class access via the three methods, the XPath query could be made a lot more efficient, but I still agree that fundamentally the native method will be far quicker regardless.)

  6. Mark Rowe Says:

    @anup: That would be making it more efficient by removing the generality. getElementsByClassName returns more than just div elements, though that happens to be what was used in the test case. Restricting the XPath variant to a subset of all possible nodes would make it faster, but it would no longer be testing the same thing.

  7. gcarothers Says:

    I’m rather worried by the XPath performance. There shouldn’t be any HUGE reason for this performance gap. The fastest selector in XPath should likely be:
    //*[@class = $classname]
    This should take the time to compile the XPath, plus the time to walk the whole tree. The ONLY performance difference between that and getElementsByClassName should be the time it takes to compile the XPath. I hardly imagine that should take 4000+ ms. The fact that one is 30 times faster bodes poorly for Safari’s XPath implementation. Perhaps some more time could be spent on optimizing the far more useful and general XPath evaluation? If it were faster, there would be far less need for these one-off functions, and their implementations could be as simple as generating the optimal XPath.

  8. David Smith Says:

    @richardyork I’m going to give it a shot, at least. We’ll see if I have enough spare time to finish it :)

  9. Mark Rowe Says:

    @gcarothers: In my tests the difference is closer to 15x, which as you mention is still a very large gap. Some quick testing in the Firefox 3 beta shows a similar performance gap compared to the native getElementsByClassName. I would strongly encourage you to file a performance bug report at http://bugs.webkit.org/ about tightening up this aspect of XPath performance.

  10. Maciej Stachowiak Says:

    WebKit’s XPath implementation (like, I believe, most of the browser-hosted ones) does not compile down to native code or anything like that. Therefore, XPath matching is likely to involve more branches than something specific to class. In addition, we have caches for fetching by id, class and tag name, which XPath would not easily be able to use without a lot of special-casing for certain XPath expressions.

  11. Dimitri Bouniol Says:

    Does getElementsByClassName return elements that contain the class specified?
    For example, if I were to do document.getElementsByClassName("box"), would it return elements whose class is "box red" as well as the ones having only "box"?

  12. David Smith Says:

    @Dimitri

    Yes, it does; you can also specify multiple classes to filter by (for example, document.getElementsByClassName('box red');).

  13. randallfarmer Says:

    This may be unlikely to happen, but I’d love anyElement.getElementsByClassName (not just document.getElementsByClassName). It seems to be in the HTML5 spec, but it’s not clear whether it’s in WebKit.

    One example of where I’d use it: I’ve got an application with forms for dozens of similar records. Right now I manually search all elements of the active form for a particular classname (for example, “answergroup” or “errlabel”). But it would be lovely if that could be a native lookup (at least in some browsers).

  14. Mark Rowe Says:

    @randallfarmer: It is already supported.

  15. randallfarmer Says:

    @Mark Rowe: That is lovely.

  16. Jimmi Says:

    If you’re going down this route, why not implement getElementsBySelector?

    See here… http://simonwillison.net/2003/Mar/25/getElementsBySelector/

  17. Mark Rowe Says:

    @Jimmi: The Selectors API is similar to what you suggest but has the advantage of currently going through the standardisation process. Nightly builds of WebKit already support the querySelector and querySelectorAll methods that it specifies.