A Meaningful Client Side Alternative To node require()

TL;DR

Here you have the solution to all your problems ... no? So take a break, and read through :)

Current Status

Using a generic "module loader" for JavaScript files is becoming a common technique, and full of wins, not only on the node.js server side. There are different client side alternatives, not entirely based over the module concept, and apparently only one concrete proposal, called AMD, able to work in both server and client following a build/parsing procedure.
However, as somebody pointed out already ...

AMD is Not the Answer

This post does not even tell everything bad about AMD problems, and I am getting there, but it's surely a good starting point.
Tobie article is another resource that inspired somehow what I am going to propose here, so before discriminating this or that technique, I hope you'll find time to understand the whole problem ... found it? Let's go on then :)

The Beauty Of node.js require Function

I don't think I should even spend many words here ...

// module: compress

// in the global scope, it's just a module
// so no pollution to the global/window object
// no need to put everything in yet another closure
var fs = require("fs");

// just as example
this.compress = function (file, fn) {
fs
.createReadStream(file)
.pipe(
// check this out!
require("zlib").createGzip()
)
.pipe(
fs.createWriteStream(file + ".gz")
).on("close", function () {
var content = fs.readFileSync(file + ".gz");
fs.unlinkSync(file + ".gz");
fn(content);
})
;
};

So basically, with a synchronous require(module) call we can decide whenever we want where that module should be included in our logic. This means we don't need to think in advance to all needed dependencies for our current module, as well as this doesn't require to split in many nested functions our asynchronous logic.
Look at above example ... in node world the require("fs") is performed basically in every single module ... the FileSystem is that important, hell yeah!
We dont' care much about requiring it at the very beginning of above stupid module 'cause surely that operation will cost nothing! Each module is cached indeed so there's actually zero impact if we use a require hundreds of time inline ... really, it's that fast, but the very first one might cost a bit.
This is why require("zlib") is loaded only once, and only when it's needed ... so that memory, disk, cpu, and everything else, can live in peace before the very first call of that exported method while after that, that call will cost again nothing.

Is a mandatory nested function call through AMD logic as fast as above example? I tell you, NO, but this is not the real problem, is it?

AMD Does Not Truly Scale

... as simple as that. If you need a build process behind something that could fail in any case, I would say you are doing it wrong. I am talking about current era, where internet connection could be there or not ... I am talking about mobile and I give you a very basic example.
I have created my freaking cool website or webapp that loads asynchronously everything and I am on the road with my supa-dupa smart phone.
When "suddenly, le wild spot without internet coverage appears"!
Now I press that lovely button in the lovely application that loads asynchronously the lovely next part of my action and guess what happens ... are you there? Nothing!
That instant without my mobile network will screw up my flow, will break my action, will let me wait "forever" and in most common cases, it will block me to use properly the app.
Of course, if that action required server side interaction, that action won't work in any case ... but how come I don't even see a notification? How come pieces of my application are not showing up telling me that there is actually a problem, rather than letting me wait without any notification, 'couse probably even notification logic was created as AMD module?

And what if connection is simply bad? The whole application or website is responding decently but that part, oh gosh how much I was planning to use that app part, is taking ages to load! How would you feel in front of such program?

AMD Resolves Lazy/Circular Load Synchronously

That's correct ... gotcha! Have you ever read what you should do in order to resolve modules that load asynchronously other modules inside themselves?

//Inside b.js:
define(["require", "a"],
function(require, a) {
//"a" in this case will be null if a also asked for b,
//a circular dependency.
return function(title) {
// me: Excuse me ... WUT?
return require("a").doSomething();
}
}
);

So here the thing, if "by accident" you have a circular dependency you should use require() as node.js does ... oh well ...

On Circular Dependencies / Cycles

This topic is one of the biggest paradox about programming ... we try to decouple everything, specially when it comes to write modules, so that not a single part of the app should be aware of the surrounding environment ... isn't it? Then someone said that circular dependencies are bad ... but how come?
Here an example, a truly stupid one:
Hi, I am a human, the human module, and I am completely sufficient, but I need to go in a place that would take too much by my own ... so I need the module car.

Hi, I am a car, the module car, and I am fabulous by my own, but I need the module human to be able to go somewhere.

A partner, better than a car, could also explain the fact we actually think in circular references all the time ... am I wrong?
The AMD take in this case is that "we should change the require logic when this happens and we should be aware in advance in order to solve this" ... yeah, nice, so now in AMD we have two ways to require and return what we export ... and once again, in my humble opinion, this does not scale ... at all!

Double Effort, Double Code!

With AMD we don't only need to change our AMD code/style/logic when things are not known in advance, as showed before ... we also need to write code for modules that is not compatible with node.js, resulting in such piece of redundant code that I believe nobody truly want to write more than once in a programmer life.
Take an excellent library as lodash is, and check what it has to do in order to be compatible with AMD too ...

// expose Lo-Dash
// some AMD build optimizers, like r.js, check for specific condition patterns like the following:
if (typeof define == 'function' && typeof define.amd == 'object' && define.amd) {
// Expose Lo-Dash to the global object even when an AMD loader is present in
// case Lo-Dash was injected by a third-party script and not intended to be
// loaded as a module. The global assignment can be reverted in the Lo-Dash
// module via its `noConflict()` method.
window._ = lodash;

// define as an anonymous module so, through path mapping, it can be
// referenced as the "underscore" module
define(function() {
return lodash;
});
}
// check for `exports` after `define` in case a build optimizer adds an `exports` object
else if (freeExports) {
// in Node.js or RingoJS v0.8.0+
if (typeof module == 'object' && module && module.exports == freeExports) {
(module.exports = lodash)._ = lodash;
}
// in Narwhal or RingoJS v0.7.0-
else {
freeExports._ = lodash;
}
}
else {
// in a browser or Rhino
window._ = lodash;
}

Now ... ask yourself, is this what you want to write for every single module you gonna create that might work in both server and client side?

Combined AMD Modules

Here another possibility, super cool ... we can combine all modules together into a single file: YEAH! But if this what you do for your project, don't you wonder what is the point then to use all those "asynchronous ready" callbacks if these will be executed in a synchronous way in production? Was that different syntax truly needed? And what about JS engines parsing time? Is processing the whole project at once in a single file a good thing for both Desktop and Mobile?
Why are you developing asynchronously with all those nested callbacks if you provide a synchronous build? Is the code size affected? Does all this make any sense?

AMD, The Good Part... Maybe

OK, there must be some part of this logic that conquered many developers out there .. and I must admit AMD "solved with nonchalance" the fact JavaScript, in the client side, has always had problems with the global scope pollution.
The fact AMD forces us to write the module inside a function that receives already arguments with other needed modules, is a win ... but wait a second, wasn't that boring before that nobody until now wrote a single bloody closure to avoid global scope pollution?
I think AMD is a side effect, with all possible noble and good purpose, of a general misunderstanding of how is JS code sharing across libraries.
Let's remember we never even thought about modules until we started clashing with all possible each other polluting namespaces, global variables, Object.prototype, and any sort of crap, thinking we are the only script ever running in a web page ... isn't it?
So kudos for AMD, at least there is a function, but where the hack is the "use strict" directive suggested for every single bloody AMD module in any example you can find in the documentation? where is the global pollution problem solver, if developers are not educated or warned about the problem itself?

node.js require ain't gold neither

When network, roundtrips and latency come into the game, the node.js require() solution does not fit, scale, work, neither.
If you understand how Loading from node_modules Folders logic works, and you have an extra look into All Together diagram, you will realize that all those checks, performed through an HTTP connection, won't ever make sense on the client side.
Are we stuck then? Or there's some tiny little brick in the middle that is not used, common, public, or discussed yet?

A node require() Like, For Client Side

Eventually, here where I was heading since the beginning: my require_client proposal ... gosh it took long!
Rewind ... How about having the best from all techniques described "here and there" in order to:
  • avoid big files parsing, everything is parsed once and on demand
  • provide an easy to use builder for production ready single file code
  • use one single syntax/model/style to define modules for both node or client side
  • solve cycles exactly as node does
  • forget 10 lines of checks to export a bloody module
  • organize namespaces in folders
  • obtain excellent performance
  • make a single file distributable
... and finally, compare results against all other techniques?
Here, the AMD loader, versus the inline and DOM script injection loader, versus the dev version of my proposal, and finally the production/compiled version of my proposal ... how about that? You can use any console, profiler, dev tool/thing you want, to compare results, it's about a 150Kb total amount of scripts with most of them loaded ASAP, and one loaded on "load" event.
You can measure jquery, underscore, backbonejs, and the ironically added as last script head.js script there within their loading/parsing/ready time.

Reading Results

If you think nowadays the DOMContentLoaded event is all you need to startup faster your web page/app, you are probably wrong. DOMContentLoaded event means almost nothing for real UX, because a user that has a DOM ready but can't use, because modules and logic are not loaded yet, or see, because CSS and images have not been resolved yet, the page/app, is simply a "user waiting" rather than interacting and nothing else.
Accordingly, if you consider the code flow and the time everything is ready to be used, the compiled require() method is the best option you have: it's freaking fast!

"use strict" included

The best part I could think about, is the "use strict"; directive by default automatically prepended to any module that is going to be parsed.
This is a huge advantage for client side code because while we are able to create as many var as we want in the module scope, the engine parser and compiler will instantly raise an error the line and column we forgot a variable declaration. All other safer things are in place and working but .. you know, maybe you don't want this?
That's why the require_client compiler makes the strict configuration property easy to spot, configure, and change ... as long as you know why you are doing that, everything is fine.

How Does It Work

The compiler includes a 360 bytes once minzipped function that when is not optimized simply works through XHR.
This function could be the very only global function you need, since every module is evaluated sandboxed and with all node.js module behaviors.
You can export a function itself, you can export a module, you can require inside a module, you can do 90% of what you could do in a node.js environment.
You don't need to take care of global variables definition, those won't affect other modules.
What you should do, is to remember that this is the client so the path you choose in the development version, is the root, as any node_modules folder would be.
If you clone the repository, you can test via copy and paste the resulting build/require.js in whatever browser console traps such require_client ~/folder_you_put_staff/require_client/js or require_client ~/folder_you_put_staff/require_client/cycles.
require("a") in the first case and require("main") in the second.
In order to obtain a similar portable function you should create a folder with all scripts and point to that folder via require_client so that a script with all inclusions will be created.

A Basic Example

So here what's the require_client script is able to produce.
Let's imagine this is our project folder structure:

project/
css/
js/
require.dev.js
index.html

The index.html file can simply have a single script in its header that includes require.dev.js and the bootstrap module through require("main");, as example.
So, let's imagine we have module a, b, and main inside the js folder, OK?
require_client project/js project/require.js
This call will produce the require.js file such as:

/*! (C) Andrea Giammarchi */
var require=function(c,d,e){function l(n,m){return m?n:g[n]}function b(o){var m=a[o]={},n={id:o,parent:c,filename:l(o,h),web:h};n[k]=m;d("global","module",k,(e.strict?"'use strict';":"")+l(o)).call(m,c,n,m);j.call(m=n[k],i)||(m[i]=h);return m}function f(m){return j.call(a,m)?a[m]:a[m]=b(m)}var k="exports",i="loaded",h=!0,a={},j=a.hasOwnProperty,g={
"a": "console.log(\"a starting\");exports.done=false;var b=require(\"b\");console.log(\"in a, b.done = \"+b.done);exports.done=true;console.log(\"a done\");",
"main": "console.log(\"main starting\");var a=require(\"a\");var b=require(\"b\");console.log(\"in main, a.done=\"+a.done+\", b.done=\"+b.done);",
"b": "console.log(\"b starting\");exports.done=false;var a=require(\"a\");console.log(\"in b, a.done = \"+a.done);exports.done=true;console.log(\"b done\");"
};f.config=e;f.main=c;return f}(this,Function,{
strict:true
});
Now the index.html could simply include reuiqre.js rather than the dev version ;)
Above program is the same showed in node.js API documentation about cycles. If you copy and paste this code in any console and you write after require("main"); you'll see the expected result.
As summary, require_client is able to minify and place inline all your scripts, creating modules names based on files and folders hierarchy.
All modules will be evaluated with a global object, already available, as well as module, exports, and the latter used as the module context.
The simple object based cache system will ensure these modules are evaluated once and never again.

What's YAGNI

Few things, that could change, are not in on purpose. As example, the module.parent is always the global object, since in fact, it's in the global scope, through Function compilation, that the module will be parsed the very first time. Not sure we need a complicated mechanism to chain calls and also this mechanism is error prone.
If you have 2 scripts requiring the same module, first come, first serve. The second one should not affect runtime the already parsed module changing suddenly the module.parent property ... you know what I mean?
The path resolution is a no-go, rather than trying to fix all possible messes with paths and OS, put your files in a single JS folder and consider that one your virtual node_modules one for clients.
If you have folder links inside the JS folder it's OK, but if you have recursive links you are screwed. Please keep it simple while I think how to avoid such problem within the require_client logic, thanks.

What Is Not There Yet

If your project is more than a megabyte minzipped, you might want to be able to split in different chunks the code so that the second, last, injected, require, won't disturb the first one. This is not in place yet since this free time project was born for a small/medium web app I am working on that will be out pretty soon ... an app that surely does not need such amount of code as many other web app should NOT ... but you know, I've been talking about scaling so much that a note about the fact this solution won't scale so nicely with massive projects is a must say.

UpdateJust landed a version that does not cause conflicts with require itself. The first defined require will be the one used everywhere so it's now possible to name a project and include it in the main page so ... parallel projects are now available ;)

If you would like to reuse node modules that work in the client side too, you needto copy them inside the path folder.
The configuration object is the one you can find at the end of the require.js file ... there are two defaults there, but you can always change them via require.config.path = "different"; as well as you can drop the require.config.strict = false; directive so that modules will be evaluated without "use strict"; directive.
Anything else? Dunno ... you might come up with some hint, question, suggestion. And thanks for reading :), I know it was a long one!

Last Thoughts

If AMD and RequireJS comes with a compiler able to make everything already available somehow, think how much pointless become the optimization once you can have already available all dependencies without needing to write JS code in a different way, regardless it's for node.js or the client web normal code.
There are NOT really many excuse to keep polluting the global scope with variables, we have so may alternatives today that keep doing it would result as evil as any worst technique you can embrace in JS world.

Comments

Popular posts from this blog

8 Things you should not be afraid of as a Developer

News

Why REST is so important