Scope and Hoisting in Node Modules

I’ve been building a project in Node.js, basically a simple SDK for a set of product APIs I work with. One thing it does is validate the data you send it before making an API call, and in the module that does the validation I created a global variable to store error messages while it iterated through the data object. I was pretty sure that might cause problems, and I had two questions…

  • Could that global variable cause collisions under high-volume usage (vs. my handful of test cases)?
  • Could I scope that variable so it wasn’t in the global scope, but was available throughout my iteration?

I didn’t know the answer to either for certain, though I had my suspicions, which is why the questions came up in the first place. I decided to create and run some test cases.

My first test was whether I could conditionally declare the variable within a function so that it would not be redeclared on recursive runs of the function. I was already passing a boolean to identify whether the instance of the function was the primary call or an internal recursion, to ensure some code didn’t run during recursion. Creating a bare-bones version of that model was pretty simple.
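The original listing isn’t reproduced here, so what follows is only a minimal sketch of what such a test might look like. The names walk, isPrimary and scoped are placeholders I’m assuming for illustration, not the names from the SDK.

```javascript
// Hypothetical reconstruction of the bare-bones test; all names are illustrative.
function walk(isPrimary) {
  if (isPrimary) {
    var scoped; // intended to be declared only on the primary call
  }
  console.log('Scoped is ' + scoped);

  if (isPrimary) {
    scoped = 3;
    walk(false); // recursive call, flagged as non-primary
  } else {
    scoped = scoped + 1; // would turn 3 into 4 if the variable were shared
  }
  return scoped;
}

console.log('returned value = ' + walk(true));
```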

If the scope existed throughout the function and all its recursions, the output would have been:

Scoped is undefined
Scoped is 3
returned value = 4

Instead, it was:

Scoped is undefined
Scoped is undefined
returned value = 3

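I didn’t have to read very far in a reference doc about the var statement to find out that all var declarations are hoisted to the top of their enclosing function before any of its code runs. That means even if the var statement is part of a conditional statement, the declaration is processed before the condition is ever evaluated. In short, the if statement controlling whether or not it gets declared is useless for controlling scope. And since every call to a function, recursive or not, gets its own fresh set of local variables, the recursive call never sees the value the primary call assigned.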
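A small sketch can make the hoisting behavior visible. It contrasts var with the block-scoped let, which is not part of my original test and is included here only for comparison. The function and variable names are made up for the example.

```javascript
function withVar() {
  if (false) {
    var scopedVar = 1; // never executed, but the declaration is still hoisted
  }
  return scopedVar; // declared but never assigned, so this is undefined
}

function withLet() {
  if (false) {
    let scopedLet = 1; // block-scoped: exists only inside these braces
  }
  try {
    return scopedLet; // no such binding exists out here
  } catch (err) {
    return err.name;
  }
}

console.log(withVar()); // undefined
console.log(withLet()); // ReferenceError
```

The var version never throws because the declaration exists for the whole function no matter which branch runs, which is exactly why the conditional couldn’t control scope.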

So declaring the variable within the function wouldn’t work.

That left me two options…

  • Test to see if I could create collisions.
  • Try declaring the variable outside the function, but not within the global scope.

Once again, the second option was going to be very simple to execute as a test case, although it would end up adding another argument to every invocation of this function and I’d need to make sure that every invocation contained it when I modified my code. Still, that seemed like it would be less time consuming than creating and implementing a test case for collisions.
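Again, the original listing isn’t shown, so this is a sketch under assumptions: collect, isPrimary and results are hypothetical names, and the array is created without a name right in the function call.

```javascript
// Hypothetical sketch: the array is passed along as an extra argument.
function collect(isPrimary, results) {
  results.push(isPrimary);
  if (isPrimary) {
    collect(false, results); // hand the same array to the recursive call
  }
  return results;
}

// The array literal has no name; it exists only as the argument.
console.log(collect(true, []));
```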

The expected output on that would be: [ true, false ] and it was. I then extended the code a smidge.
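The extension might have looked something like the following, which adds a second, independent call with its own fresh array. As before, collect and its parameters are assumed names.

```javascript
// Hypothetical sketch: same function as before, now called twice.
function collect(isPrimary, results) {
  results.push(isPrimary);
  if (isPrimary) {
    collect(false, results); // hand the same array to the recursive call
  }
  return results;
}

console.log(collect(true, []));  // primary call with its own new array
console.log(collect(false, [])); // separate call with a different new array
```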

There the expected output would be

[ true, false ]
[ false ]

It was. But this wasn’t the most readable code. I don’t just think about who will inherit my codebase. I think about 6 months from now when the SDK I’m working on needs to be modified for a new release or I’m trying to help a user debug their implementation of it, and I’m looking at something I wrote and feeling like I’m reading some lunatic’s legacy code.

Assigning a name to the array at declaration time would help make the code more readable, but would it create a global hook?

So I tried:

That creates the output: [ true, 'is this in scope?', false, 'is this in scope?' ].

So declaring the array when we first called the function created a global variable. If we used that variable name in a subsequent function call without redeclaring it as an empty array:

The contents from the subsequent call are added to the same array, both function calls’ return values are simply references to the global array variable, and thus we get this output.

[ true, 'in scope?', false, 'in scope?', false, 'in scope?' ]
[ true, 'in scope?', false, 'in scope?', false, 'in scope?' ]

But here’s the magic of JavaScript variables. Much like a domain name, a variable name is a pointer to a unique object ID. So if we define the variable as an empty array in the arguments each time, a new array object is created each time and the variable is storing a different object ID. So if we try:

We get the following output:

[ true, 'in scope?', false, 'in scope?' ]
[ false, 'in scope?' ]

That’s nicer for readability, but it’s still liable to fall victim to collisions, because any other code can reach the array through its global name. Declaring the array without a name is like giving it a secret name that other bits of code won’t know about unless it’s passed to them as an argument (or they do some serious gymnastics to discover it).

So declaring the array without a name in the arguments seemed like a safe and usable option; I’d just need to comment on it explicitly so it didn’t seem insane. And it would be the quickest fix. But testing scope as a module wasn’t too hard, and I wanted to do that too.

The following code tests quick and long execution with a module. The quick execution (an immediate return) is commented out while the long execution (a setInterval for 500 ms) is active.
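The original module isn’t reproduced here, so this is a single-file sketch under several assumptions. A factory function (createSharedModule, a made-up name) stands in for the module file, its results variable plays the role of the module-level variable, and a delay argument switches between quick and long execution instead of commenting code in and out. The sketch also uses setTimeout for the delay, since a one-shot timer is enough to show the effect.

```javascript
// Stand-in for a module file: `results` acts like a module-level variable.
function createSharedModule(delayMs) {
  var results; // one variable shared by every call to runfoo

  return {
    runfoo: function (n, done) {
      results = [n];
      if (delayMs === 0) {
        done(results); // quick execution: report immediately
        return;
      }
      // long execution: report later, reading whatever `results` holds by then
      setTimeout(function () {
        done(results);
      }, delayMs);
    }
  };
}

// Call runfoo nine times in a row, as a burst of requests would.
function runNine(mod, report) {
  for (var i = 1; i <= 9; i++) {
    mod.runfoo(i, report);
  }
}

// Long execution is active here; pass 0 instead of 500 for quick execution.
runNine(createSharedModule(500), function (r) {
  console.log(r);
});
```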

In the quick execution mode, everything happens fast enough that the output is:

[ 1 ]
[ 2 ]
[ 3 ]
[ 4 ]
[ 5 ]
[ 6 ]
[ 7 ]
[ 8 ]
[ 9 ]

While in the long execution, we see 100% collision.

[ 9 ]
[ 9 ]
[ 9 ]
[ 9 ]
[ 9 ]
[ 9 ]
[ 9 ]
[ 9 ]
[ 9 ]

If we instead code our module so that runfoo calls a function and declares the array without a name in that function call, rather than using one array globally throughout the module, we do have to do a little more labor to ensure we pass the array around, but we don’t have collisions.
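A sketch of that version, with the same caveats as before: createSafeModule and record are hypothetical names, a factory function stands in for the module file, and setTimeout provides the delay.

```javascript
// Stand-in for a module file with no shared array at module level.
function createSafeModule(delayMs) {
  // Helper that receives the array as an argument instead of reading shared state.
  function record(results, n, done) {
    results.push(n);
    if (delayMs === 0) {
      done(results); // quick execution
      return;
    }
    setTimeout(function () {
      done(results); // long execution: still this call's own array
    }, delayMs);
  }

  return {
    runfoo: function (n, done) {
      record([], n, done); // a new, unnamed array for every call
    }
  };
}

// Call runfoo nine times in a row, as a burst of requests would.
function runNine(mod, report) {
  for (var i = 1; i <= 9; i++) {
    mod.runfoo(i, report);
  }
}

runNine(createSafeModule(500), function (r) {
  console.log(r);
});
```

Each timer callback closes over the array that was created for its own call, so a later call has nothing to overwrite.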

We get the 1-9 output in both short and long executions.

While I suspected the results would turn out like this (and some people more expert than me would simply know it would), I wasn’t quite sure. Sometimes I need to test my assumptions before I build them into my code. And for these assumptions, I decided to share my tests as a blog post. Do with it as you will. And if you’d like me to publish more posts when I test my assumptions, please like or share this on your favorite social media to encourage me.
