NWMatcher - A Quick Look
As usual, as soon as I decide it's time to go to sleep, somebody :P has to post a stunning benchmark about a selector matcher ... did I shut down? No, I had to read the entire source code of the library, and this is my quick summary, but consider that I am tired ...
NWMatcher, A Selector Monster
First of all, I did not know or remember this project, and I must congratulate Diego Perini for different reasons. The first one is that Diego used almost every best, unobtrusive technique to make it as reliable as possible, which is good; the second one is that Diego introduced something new to achieve his goal: creating a Ferrari of a matcher via pre-compiled functions.
About NWMatcher
It is a lightweight stand-alone library (so far, but already namespaced) whose aim is to standardize some common DOM-related tasks. The initial aim was to create a faster way to understand if a node matches a specific selector, something truly useful for event delegation.
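For instance, with event delegation a single listener on the document can filter events through the matcher; a minimal sketch, assuming the NW.Dom.match(element, selector) entry point and leaving older Internet Explorer aside:

document.addEventListener("click", function (e) {
  var node = e.target;
  // one listener for the whole document: react only when
  // the clicked node matches the delegated selector
  if (NW.Dom.match(node, "td.action")) {
    node.style.backgroundColor = "yellow";
  }
}, false);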
About The Technique
To make things as fast as possible, Diego used a pre-compilation strategy which is absolutely brilliant. Without XPath transformation, which is in any case not fully compatible with CSS, and taking care of every single browser "gotcha", Diego creates once, and never again, the function able to perform the selector-related task. The result is that, the first call with a new selector apart, the match method performs at light speed!
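NWMatcher's real compiler is obviously far more involved, but the core idea can be sketched in a few lines: build the matching function for a given selector once, cache it by its source string, and reuse it on every subsequent call. The toy grammar below ("tag.class" only) is mine, not Diego's:

var compiledMatchers = {}; // selector string -> compiled function
function getMatcher(selector) {
  if (!compiledMatchers[selector]) {
    // compile once, never again
    var parts = selector.split("."),
        tag = parts[0].toUpperCase(),
        cls = parts[1];
    compiledMatchers[selector] = new Function("e",
      "return e.nodeName=='" + tag + "'" +
      (cls ? " && (' '+e.className+' ').indexOf(' " + cls + " ')>-1" : "") +
      ";");
  }
  return compiledMatchers[selector];
}
// the first call with a new selector pays the compilation price,
// every other call is a plain, light-speed function call
var isActiveItem = getMatcher("li.active");
isActiveItem(document.createElement("li")); // false, no "active" class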
This is the notable good part of NWMatcher, and now there's the bad one (not that bad, in any case).
Few Inconsistencies
Except for a weird procedure to understand if Array slice can be performed over nodes, something I have always handled in a few lines:
var slice = Array.prototype.slice;
try { slice.call(document.childNodes) } catch (e) {
  // fallback: copy the NodeList manually into a real array for this scope
  slice = function (list) {
    for (var a = [], i = 0; i < list.length; i++) a[i] = list[i];
    return a;
  };
}
Diego uses a callback to distinguish native methods from fakes. This is generally good, and this function is isNative, but since this kind of check is there to defend against greedy developers with a global-pollution mania, I cannot understand why at some point there is a root.addEventListener ... and no check that addEventListener is the native one, something that could make the entire caching system useless or able to generate errors. OK, that would be silly, I mean injecting a fake method like that, impossible to emulate in Internet Explorer, but who knows what a library could do with such an event ...
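For completeness, the common way this kind of check is performed, and surely a sketch rather than NWMatcher's exact implementation, is to look for the "[native code]" marker in the serialized function:

function looksNative(object, method) {
  var fn = object && object[method];
  // native methods serialize with a "[native code]" body
  return !!fn && /\[native code\]/.test(String(fn));
}
document.fakeMethod = function () {};
looksNative(document, "getElementById"); // true in regular browsers
looksNative(document, "fakeMethod");     // false, user-defined function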
Another inconsistency is about being unobtrusive, a goal reached 99.9% of the way ... what is that, a public property attached directly to the used context? We need to be aware that the document will probably have a snapshot, plus an isCaching property; not a drama, but I think those are the exceptions that confirm the rule.
Another thing I'll never get is the usage of new in front of functions whose aim is to return a non-primitive value.
function A() {
  return {};
}
The above function can be called with or without new and the result will always be the same: the returned object. This simply means that if we use new we are consuming CPU and RAM for no reason. So why should a performance-oriented library not take care of this simple fact?
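A quick check, using the A function above, makes the point concrete:

// with or without new, the caller receives the returned literal
var withoutNew = A();
var withNew = new A();
withoutNew instanceof A; // false
withNew instanceof A;    // false: the implicit instance is created, then thrown away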
// example
function Snapshot() {
  // here this exists
  // and it has a prototype attached
  // and it has been internally initialized
  // to be used as a Snapshot instance
  // we return an instanceof Object?
  // well done, so the garbage collector
  // has to consider this instance
  // which is a ghost instance, never used
  // but it has been created
  return {
    // whatever ...
  };
}
// let every comment inside the function
// make sense simply by adding new
var snap = new Snapshot();
// nonsense, imho, since this
// would have produced exactly the same result
// without any instance generation
var snap = Snapshot();
That's it. This stuff is really simple to get in C, C++, or those languages where we have to declare types, names, etc.: a new in front of a function is not transparent, it is something to think about, because a new is a new, we are asking for an instance, and a function that always returns another value, rather than the generated instance, shouldn't be called via new.
Diego, I am sorry, I am using this irrelevant gotcha simply because I have written about these things I don't know how many times ... please remove every new Snapshot from your code, or just use the instance, attaching properties to it:
Snapshot = function () {
  this.isExpired = false;
  this.ChildCount = [];
  this.TwinsCount = [];
  this.ChildIndex = [];
  this.TwinsIndex = [];
  this.Results = [];
  this.Roots = [];
};
Makes sense? For performance reasons, I still suggest to simply remove every new, leaving just Snapshot ... OK, that is probably more than I was planning to say ...
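In other words, the factory style I am suggesting could look like this; a hypothetical rewrite, not Diego's code:

function Snapshot() {
  // plain factory: no new, no throw-away instance for the garbage collector
  return {
    isExpired: false,
    ChildCount: [], TwinsCount: [],
    ChildIndex: [], TwinsIndex: [],
    Results: [], Roots: []
  };
}
var snap = Snapshot(); // same result, one wasted allocation less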
Mutation Events Side Effects
For compatible browsers, NWMatcher uses mutation events such as DOMAttrModified, DOMNodeInserted, and DOMNodeRemoved. The cache is partially arbitrary: activated by default, deactivated via setCache(false). Mutation events are used to manage the cache of results. Why mutation events? Let's say it is so common, especially in the jQuery world as an example, to search for the same bloody thing hundreds of times inside a single function ...
$("body").doStuff($whatever);
$("body").doOtherStuff($whatever);
$("body").doSomethingElse();
The above way of coding is present in I don't know how many examples, articles, and books, and it is as elegant as it is illogical. If we need a collection, I know we all like the $("thingy") style, but is it that difficult to write code such as:
var $body = $("body"); // that's it, wrapped once, SPEED!
$body.doStuff($whatever);
$body.doOtherStuff($whatever);
$body.doSomethingElse();
I am sure it is difficult, and Diego knows this stuff better than me; indeed, he is using a result cache to avoid repetitive, expensive searches over a potentially massive structure, as a DOM can be, in order to guarantee the best performance for returned results. So far, so good ... and mutation events, attached to the root of the DOM, the document, are able to clear the cache ... how? Via a callback. Simple? Sure, it is simple ... but there is a little detail we should consider.
Nowadays, Ajax-based Web applications are constantly mutated: each Ajax call's aim is to change something, show something, or remove something. With these perpetual changes, whatever callback is attached to a mutation event will be fired dozens of times, but Diego, who is not the last JavaScript hacker around, found a way to partially avoid this problem: he removes the mutation listener, so the first event fired avoids the same operations for the other N times. This is good, but there is still an event to manage internally, and a callback to call, and the bad part is that for each select those events have to be reassigned, or checked, again (the isCached attached property).
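The whole dance boils down to something like the following, a simplified sketch of the general approach rather than NWMatcher's actual code, with querySelectorAll standing in for the real search:

var cachedResults = {}, // selector -> cached nodes
    isCaching = false;
function expireCache() {
  cachedResults = {};
  // detach after the first mutation, so N consecutive DOM changes
  // do not trigger N useless callbacks (DOMAttrModified handled the same way)
  document.removeEventListener("DOMNodeInserted", expireCache, false);
  document.removeEventListener("DOMNodeRemoved", expireCache, false);
  isCaching = false;
}
function select(selector) {
  if (!isCaching) {
    // (re)attach the listeners on every select that finds the cache cold
    document.addEventListener("DOMNodeInserted", expireCache, false);
    document.addEventListener("DOMNodeRemoved", expireCache, false);
    isCaching = true;
  }
  if (!cachedResults[selector]) {
    cachedResults[selector] = document.querySelectorAll(selector);
  }
  return cachedResults[selector];
}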
Talking about performance, we need to consider the real-world scenario, rather than a static benchmark over the same selection, or a couple of selections, where nodes are, as I said, constantly changed via innerHTML, innerText, appendChild, removeChild, insertBefore, insertAfter: all operations probably present in every single function of our JavaScripted page.
Now, is it truly worth it to cache results? If caching were the key, why are browser vendors not caching results when we perform a getWhateverByWhatever or querySelectorAll?
Is our library truly so slow that a couple of searches performed consecutively are such a big problem that we need to increase library complexity, side effects, and browser problems/behaviors, for something that, on a daily basis, will be useful I guess 5% of the time due to constant DOM manipulation?