Intent: Why Types Are Important

Apr 20, 2016

A common complaint from the Javascript religious and newcomers alike is that Javascript is tremendously difficult to debug when things go sideways. When a null or undefined reference is passed, the stack trace can be illuminating or it can be completely obscure. Couple this with the growing popularity of anonymous functions assigned to variables in lieu of named functions and you have a recipe for tremendous difficulty.

In modern compiled languages like C# and Java or F# and Scala, there is an enforced, static type system in place which ensures values which would cause a malfunction are disallowed from function calls. This does not guarantee program correctness, but it does help eliminate strange errors.

Of course, the really important thing which comes from static typing is less about the compile-time checking than it is about the integration of typed thinking into the entire development process. While developers working in statically typed languages are thinking about the logic for their programs, they are also considering the data types they are using to arrive at their results.

Revealing Intent

When a programming language provides type annotations, it means the programmer can declare what they intend the program to do up front. Most statically typed languages have an editor (or two) which provides insight into what the annotations say elsewhere in the codebase, possibly quite far from where the programmer is currently looking.

What these editors are really providing is a look into the intent of the work which was done before. I refer to this kind of behavior as revealing intent. By revealing intent to the programmer, they can make choices which simplify the work of understanding other objects, functions and behaviors.

Javascript, for better or worse, does not allow for revealing intent other than through variable names. This means each variable name must carry information about its type, and callers can either opt to conform to the expected types or misbehave and intentionally break the function being called. I am a noted fan of dynamic languages and like my functions free-flowing, but sometimes I long for a good, strong contract.

Let’s pick a simple function written in Javascript and see what our baseline is.

    function addJS (a, b) {
        return a + b;
    }

Our function could add two numbers, but it could also be called upon to do other things the name does not specifically call out. addJS could concatenate strings, coerce numbers to strings or act upon functions, objects and many other data types. Clearly this is not what we intended.
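To make that concrete, here is a small sketch of what the same function returns for a handful of inputs. The results object and its property names are only there for illustration.

```javascript
function addJS (a, b) {
    return a + b;
}

// Each call is legal Javascript; only the first one is what the name promises.
var results = {
    numbers: addJS(1, 2),          // 3
    strings: addJS('foo', 'bar'),  // 'foobar'
    mixed: addJS(1, '2'),          // '12'
    arrays: addJS([1, 2], [3]),    // '1,23'
    objects: addJS({}, {})         // '[object Object][object Object]'
};

console.log(results);
```

None of these calls throws an error, which is exactly why the caller gets no warning that something has gone wrong.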

Microsoft designed a language called TypeScript which transpiles to Javascript and includes features from ES Next as well as a quasi-static type system. The type system in TypeScript is a step in the right direction when types are used, but there are still some problems. Let’s have a look at our function rewritten in TypeScript.

    function addTS (a: number, b: number): number{
        return a + b;
    }

Our new add function declares it takes two numbers and returns a number. This is really handy when we are exposing a function as an API to the rest of the world because other programmers can then capture this type information as they program and use it to make their calls conform to the expectation declared in the type contract…

Unless they aren’t using TypeScript in the rest of their application.

TypeScript really only solves the problem if someone has bought into the entire ecosystem and uses modules which exclusively have TypeScript annotations attached or a TypeScript Definition file. Atop all of this, anyone who wants to access these annotations will need to use an editor which supports that kind of behavior.

A Type By Any Other Name...

As functional programming continues to gain traction, patterns like function currying become more common in codebases. This means we now have functions which could return other functions (higher-order functions) which run 2 or more layers deep. This kind of behavior can be demonstrated by a small rewrite of our add function.

    function curriedAddJS (a) {
        return function (b) {
            return a + b;
        };
    }

With this setup in vanilla Javascript, we have really challenged the next programmer to try and decode our intent if they aren’t familiar with a currying pattern. Due to the lack of types and intent declaration in Javascript, this function, for as simple as it is, tells us almost nothing about intent since even the input variables are separated across different functions and the result is the product of a closure.

If we were working in Scala we would get intent and behavior bundled in, possibly, too terse a form. Nonetheless, the full intent of our behavior is described by the following line of code.

    def add(a:Int)(b:Int) = { a + b }

This function definition actually declares not only data types for the function, but how they interact. We could almost say add moves out of the range of function definition and into a type definition on its own. That is, however, a little more esoteric than we need to be.

Of course, moving back to TypeScript, we can see where data types, function definitions and intent start to fall apart. Although data types are declared and displayed, it is difficult to write a curried function in a way that is both clear and declarative of intent. Let’s have a look at our curriedAdd function in TypeScript.

    function curriedAddTS (a: number): Function {
        return function (b: number): number {
            return a + b;
        }
    }

That’s kind of like a punch to the eyeball, isn’t it?

Tying Intent and Implementation

At the end of the day, the challenge we keep coming up against is the fact that our intent is either declared, vaguely, in code and lost before execution, or it is not declared at all. Really, though, data types are not intent. This is one of the biggest problems we have.

Although programming deals in data and behavior, the problem we have introduced is we have become obsessed with the data types and we have forgotten that they are only meaningful within the context of behavior.

A couple of weeks ago, we looked at a way to add a little metadata to our function in order to communicate information to programmers who use our API in the future. I also introduced a small library called Signet, which helps to simplify the process of attaching metadata so we don’t have to litter our code with a bunch of manual property assignments.

By using the language we introduced in Typed Thinking, we can actually get a full declaration around our function and add meaning and context to our API in general. Let’s try applying our full type, declaring intent, to our vanilla JS implementation of curriedAdd.

    const curriedAdd = signet.enforce('number => number => number', curriedAddJS);

This definition helps us to fully understand what curriedAdd will do. When we get back our final function, we can query the signature from anywhere in our code, including execution time, and get a full report back on what our function will do. Let’s take a look.

    console.log(curriedAdd.signature); // number => number => number

This is a simple riff on the previous post since we already knew this was possible. Where our simple type language becomes useful is when we start working with curriedAdd. Instead of getting back an untyped function, which gives us no more information than we ever had before, we have a fully parsed AST which comes along for the ride with our function. This means we will actually get all of our other types for free all the way through our entire execution path. Let’s have a look at what happens when we call curriedAdd with a single value and check the signature.

    console.log(curriedAdd(1).signature); // number => number

This means our initial type declaration actually allows us to understand the return types without any further declaration anywhere else!
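To see how a signature can travel along with each returned function, here is a minimal, hand-rolled sketch. It is not how Signet is implemented; enforceSketch and curriedAddSketch are hypothetical names, and the sketch only handles curried signatures built from typeof names, checking inputs but not return values.

```javascript
// Hypothetical helper, not part of Signet: attaches a signature string to a
// curried function and to every function it returns.
function enforceSketch (signature, fn) {
    var types = signature.split('=>').map(function (type) {
        return type.trim();
    });

    function wrapped (value) {
        // Check the single argument against the first type in the signature
        if (typeof value !== types[0]) {
            throw new TypeError('Expected ' + types[0] + ' but got ' + typeof value);
        }

        var result = fn(value);
        var remaining = types.slice(1).join(' => ');

        // If another function comes back, pass the rest of the signature along
        return typeof result === 'function'
            ? enforceSketch(remaining, result)
            : result;
    }

    wrapped.signature = types.join(' => ');
    return wrapped;
}

function curriedAddJS (a) {
    return function (b) {
        return a + b;
    };
}

var curriedAddSketch = enforceSketch('number => number => number', curriedAddJS);

console.log(curriedAddSketch.signature);    // number => number => number
console.log(curriedAddSketch(1).signature); // number => number
console.log(curriedAddSketch(1)(2));        // 3
```

The single declaration at the top is the only place types are written down; every function further along the call chain inherits its part of it.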

Our enforced type declaration has given us a way to communicate our intent to anyone interacting with our code whether we are available to answer questions or not. This type assignment reveals intent and helps to make library APIs and code in other files easier to work with.

On top of clearly identifying intent and keeping our APIs static and safe, we still get to keep the dynamic nature of Javascript and all of the convenience that comes with it anywhere we are using local functions. This blended static/dynamic coding allows us to quickly and easily iterate on implementation details within a module or namespace while keeping the actual interface well-defined for the external user!

Summary

Languages and the way they manage types can be a really divisive topic, but with the flexibility built into Javascript, we can actually manage types on a case by case basis and surgically insert requirements without having to tie our hands elsewhere in our code.

Although TypeScript is a popular solution for trying to get a handle on the declaration of intent throughout our codebase, it starts to fray at the edges with advanced techniques, making it unsuitable for the move toward a more functional style of programming.

In the end, enforcing a type on our API endpoints with a strong, lightweight library like Signet gives us the right blend of enforced static typing where we need it and a fully dynamic environment when we want it. This seems like the only sane direction to go when working in a language as rich and flexible as Javascript. Go declare your intent and make people awesome.

Unit Testing Express Routing

Apr 13, 2016

If you are unit testing your code, and you should be, then it is likely you have encountered certain patterns which make testing more difficult. One of the patterns which pops up often is centered around Express.js routes. Although the router has a nice, simple API to code against, the actual testing of the route action code can be a little challenging if you aren’t used to using tools such as mocks, fakes and stubs.

Today we are going to explore using the Mockery library and an express router fake module to simplify the process of reaching into our modules and getting ahold of the route actions we provide to express for the sake of testing and ensuring program correctness.

The Libraries

The post today will make use of Mocha, Mockery and Express Route Fake. The first two you may have heard of, the last you likely have not. Let’s walk the list and get a handle on what each of these tools does for us at a high level.

Mocha

If you have done any TDD or unit testing in Javascript and especially in Node, you have likely already encountered Mocha. Mocha is a popular BDD tool for testing Javascript. We will have examples of what test code in Mocha looks like later.

Mockery

Mockery is a tool for manipulating packages and modules which are required within script files for Node. Although Mockery is quite useful for breaking tight coupling, it is not as well known as Mocha. Unit testing scripts which are tightly coupled through require statements is challenging since there is no clean way to inject a dependency into a dependency chain without a third party tool. Mockery is key to getting good tests around Express route actions as we will see.

Express Route Fake

Express Route Fake is a module I wrote to emulate behavior we use at Hunter to gain access to our route actions as we get tests written around our code. The core idea of Express Route Fake is to substitute the fake module in for Express in order to capture references to the actions which are assigned to different routes. We will explore this shortly.

Registering Replacements with Mockery

I am assuming you are already familiar with a testing framework, so I am not going to cover the basics of using Mocha. Instead we are going to start off looking at how to register a faked module with Mockery so we can break a dependency in Node.

Mockery actually manipulates the Node module cache and updates it with code of our choosing. This gives us the ability, at test time, to dig in and create a fake chunk of code which we control so we can ensure our tests actually send and receive the right output without doing dangerous things like interacting with a network connection or a database. We want the scope of our unit tests to be as tight as is reasonable so we can ensure the code under test actually performs the correct action every time.

This kind of guarantee around tests is called determinism. Deterministic tests are tests which, when provided a single input, always return the same output. Ideally any interactions which would break the determinism of our functionality should be stripped away and replaced with code which always does the same thing. This gives us guarantees and makes it easier to identify and test a variety of different cases. Let’s have a look at an example of breaking a dependency with Mockery and replacing the code with something else.

    beforeEach(function() {
        var mysqlFake = {
            query: function(query, params, callback) {
                callback(null, []); // Returns a null error and an empty array
            }
        };

        mockery.enable({
            warnOnReplace: false,
            warnOnUnregistered: false
        });

        mockery.registerMock('mysql', mysqlFake);
        myModule = require('../app/myModule');
    });

    afterEach(function() {
        mockery.deregisterAll();
        mockery.disable();
    });

The beforeEach block sets up a fake MySQL module with a query method which immediately calls a passed callback. Mocking MySQL does two things for us. First, it removes the asynchronous behavior which comes from interacting with a database, so the callback fires immediately and each test can run synchronously from start to finish. Second, it allows us to guarantee that the results passed back from our MySQL fake are always the same. This means we don’t have to set up and tear down an actual database instance. We can simply test the code we care about and assume the database library does what it is supposed to.

Important items to note about the code in the beforeEach and afterEach blocks: Mockery takes a certain amount of work to get working. The enable call in beforeEach starts Mockery working in memory and tells it not to post warning messages every time it does something. It is safe to assume, if you see the results you want, that Mockery is working. The code in afterEach returns the cache back to its original state, leaving everything clean for the following test.

Faking Router For Great Good

Now that we have looked a little bit at how Mockery works, we are ready to start digging into what we really care about. Let’s start testing our Express route actions. The first thing we should look at is a little bit of example Express routing code. Below is a simple route example which just receives a request and then responds with 200 and a little message. It’s not exciting code, but we can actually test it.

    'use strict';
    
    var router = require('express').Router();

    router.get('/mypath/myentity', function(req, res) {
        // Do stuff here
        res.status(200).send('Everything worked out fine').end();
    });
    
    module.exports = router;

We can see a few things here which will be really important to get the tests right. First, Router is a factory function. This means anything that is going to stand in for express will have to implement this correctly. The next thing we know is, the router object which is returned has methods on it like ‘get.’ We will want our fake object to replicate this as well. Now is probably a good time to take a look at the source code for Express Route Fake.

Express Route Fake is a small module which packs a pretty massive punch. Everything is set up for us to quickly capture and test our route actions. The very first thing we have is a cache object which will store key/value pairs so we can request whichever function we want to test easily.

After the cache object we have a simple function which captures the route method, the route path and the route action. This function is the real workhorse of the entire module. Every route call will go through this one function and be stored in our cache. It is critical to understand that all of the methods, paths and actions will be captured and related so we can test them later.

Finally we have the Router factory fake, getRouteAction and reset functions. Reset is exactly what one might expect, it resets the cache to empty so the entire process can be repeated without starting with a dirty object. getRouteAction performs two critical activities. First, it verifies the route exists and throws an error if it doesn’t. Secondly, it passes back the route action so we can test the function outside of the express framework.

A side note on the getRouteAction error behavior: it is important and useful to get clear errors from our fake in this case. Over time my team has run into confusing situations because our implementation was home-grown and did not throw useful errors. This meant we got an error stating “undefined is not a function” which did not really tell us what was wrong. By getting an error which informs you the route doesn’t exist, you can immediately verify whether the route is being created correctly and not need to troubleshoot your tests.
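The pieces described above can be sketched in a few lines. This is an illustration of the idea under my own assumptions, not the actual Express Route Fake source; the names routes, buildKey, captureRoute and expressFakeSketch are hypothetical.

```javascript
// A sketch only; the real Express Route Fake source may differ.
var routes = {};

function buildKey (method, path) {
    return method + ':' + path;
}

// Returns a function which stores a route action under its method and path
function captureRoute (method) {
    return function (path, action) {
        routes[buildKey(method, path)] = action;
    };
}

// Stands in for express.Router, which is a factory function
function Router () {
    return {
        get: captureRoute('get'),
        post: captureRoute('post'),
        put: captureRoute('put'),
        delete: captureRoute('delete')
    };
}

// Throws a descriptive error instead of handing back undefined
function getRouteAction (method, path) {
    var action = routes[buildKey(method, path)];

    if (typeof action !== 'function') {
        throw new Error('No route registered for ' + method + ' ' + path);
    }

    return action;
}

function reset () {
    routes = {};
}

var expressFakeSketch = {
    Router: Router,
    getRouteAction: getRouteAction,
    reset: reset
};

// Usage: register a route the way a routes module would, then fetch the action
var router = expressFakeSketch.Router();
router.get('/mypath/myentity', function (req, res) {
    res.status(200);
});

var action = expressFakeSketch.getRouteAction('get', '/mypath/myentity');
console.log(typeof action); // function
```

In a real module the final object would be exported with module.exports so Mockery could register it in place of express.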

Putting the Setup Together

Now that we have the tools and have taken a look at them, let’s start constructing what our tests would look like. We know Mockery is going to substitute in our fake module and we also know that Express Route Fake will capture the actions. All we really need to do is a little setup to get things rolling.

    describe('Testing Express Routes', function() {
        var myRoutes;
        var req;
        var res;

        beforeEach(function() {
            req = {};
            res = {
                resData: {
                    status: 0,
                    response: ''
                },
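                status: function(status) {
                    res.resData.status = status;
                    return res; // allows chaining: res.status(200).send(...)
                },
                send: function(response) {
                    res.resData.response = response;
                    return res; // allows chaining: res.send(...).end()
                },
                end: function() { }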
            };

            mockery.enable({
                warnOnReplace: false,
                warnOnUnregistered: false
            });

            expressFake.reset();

            mockery.registerMock('express', expressFake);

            // It is critical to require our module AFTER we inject our fake, or Node will use the
            // original module, which defeats the entire purpose of this setup.
            myRoutes = require('../routes/myRoutes');
        });

        afterEach(function() {
            mockery.deregisterAll();
            mockery.disable();
        });

    });

Our setup includes a little extra work. Since Node and Express interact with the http module through request and response objects (typically called req and res respectively), we will need to create objects we can pass through and use as well. Considering the calls we are making on res, I just included the bare minimum functionality to actually test our route action: status, send and end.

Please note we are actually requiring the module under test AFTER we perform our Mockery setup. It’s really important to do this, otherwise Node will provide the actual Express module and our fake code won’t be used.

Now that we have our code set up and ready to go, let’s take a look at what our tests look like.

        it('should set status to 200', function() {
            var routeAction = expressFake.getRouteAction('get', '/mypath/myentity');

            routeAction(req, res);

            assert(res.resData.status === 200);
        });

        it('should respond with appropriate message', function() {
            var routeAction = expressFake.getRouteAction('get', '/mypath/myentity');

            routeAction(req, res);

            assert(res.resData.response === 'Everything worked out fine');
        });

We can see that actually testing the actions, now, has become three simple lines of Javascript. What is happening in each of these tests is, our module is registering actions against routes, which are stored by our Express Route Fake module. Once the test starts, we simply ask for the function and execute it as if it were called by Express because of an HTTP request. We pass in our fake objects and capture the result of our action behavior. This allows us to see directly inside of our method and test the interesting parts, throwing away the stuff that would be, otherwise, obscured by frameworks, libraries and Node itself.

A second, important item to note is that we get extra guarantees around our route paths. If someone were to come along and change the path in our module, but not update the tests, or vice versa, we would get immediate feedback since getRouteAction would throw an error telling us the path does not exist. That’s a whole lot of security for a little bit of up-front work!

Summing Up

Today, we looked at how to use just a couple of modules to insert a fake for Express and get better tests around our code. By using Mockery and Express Route Fake, you can capture route actions and get them under test. Moreover, you can do this while only writing code that is specific to the tests you are writing.

As you write more tests, it might become useful to create a factory for creating custom request and response objects which would simplify the test code even more. Of course, with all of this abstraction it does become more challenging to see what is happening under the covers. Ultimately, this kind of tradeoff can be useful in situations like this where repeated code is more of a liability than a help.
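One possible shape for such a factory is sketched below. The function names buildResponseFake and buildRequestFake are hypothetical, and the default request properties are assumptions chosen for illustration.

```javascript
// Hypothetical factories; the names here are illustrative, not from any library.
function buildResponseFake () {
    var res = {
        resData: {
            status: 0,
            response: ''
        },
        status: function (status) {
            res.resData.status = status;
            return res; // returned so calls can be chained
        },
        send: function (response) {
            res.resData.response = response;
            return res;
        },
        end: function () {
            return res;
        }
    };

    return res;
}

function buildRequestFake (overrides) {
    var req = { params: {}, query: {}, body: {} };

    // Copy any test-specific values over the defaults
    Object.keys(overrides || {}).forEach(function (key) {
        req[key] = overrides[key];
    });

    return req;
}

var req = buildRequestFake({ params: { id: '42' } });
var res = buildResponseFake();

res.status(200).send('Everything worked out fine').end();

console.log(res.resData.status);   // 200
console.log(res.resData.response); // Everything worked out fine
console.log(req.params.id);        // 42
```

Each test can then ask for fresh objects in beforeEach instead of repeating the object literals.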

The next time you sit down to create new functionality and wire it into an Express route, consider starting off with Mockery and Express Route Fake. They will simplify the tests you need to write and limit the amount of code you have to change in order to get tests in place. Happy coding!

Typed Thinking in Javascript

Apr 6, 2016

Javascript is a dynamically typed language. I suspect this is not news to anyone reading this. The reason this is important, however, is I have heard people say Javascript is untyped. This statement is most definitely false. Javascript has and supports types, it simply does not actively expose this to the world at large.

Javascript is in good company when it comes to dynamically typed languages. Python and Ruby are also popular languages which are dynamically typed. Other venerable languages which are dynamically typed include Clojure, Elixir, Io and Racket. People coming from statically typed languages often say that Javascript’s dynamic typing is a hindrance to good programming. I disagree. Bad programming is a hindrance to good programming. I feel programmers coming from the languages listed above would probably agree.

What's the Difference?

Several popular languages today, including C#, Java and C++, are statically typed. This means the programmer declares the values they plan on using to accomplish a task when they define a method. There are distinct benefits to this kind of programming, specifically, the compiler can quickly determine whether a method call is valid. This kind of validation is useful and can prove a good tool for programmers, no doubt.

    // Somewhere in a static class...
    public int add(int a, int b) {
        return a + b;
    }

As you can see above, everything is explicitly annotated with a type definition. This kind of annotation is effectively a note to anyone who reads this code, including the compiler, et al, that this function behaves this way. Unfortunately, this convenience comes with a price. Suppose you wanted an add function for any sort of number including mixed arguments…

    public int add(int a, int b) {
        return a + b;
    }

    public double add(double a, double b) {
        return a + b;
    }

    public double add(int a, double b) {
        return (double) a + b;
    }

    public double add(double a, int b) {
        return a + (double) b;
    }

    // And it goes on and on and on...

Modern improvements on type values have helped with this problem (don’t shoot me, Java people), but it becomes obvious rather quickly that having restricted type flexibility means there is a lot more work which must be done to accomplish a seemingly simple task. We have to make a trade to get this compile-time help from the language.

Dynamic typing, on the other hand, does not have this restriction. In Javascript (or Python, Clojure, etc.) no type annotation is needed. Instead, the language checks the types of values at run time, when an operation is actually performed. Languages like Python or Clojure are less forgiving if types don’t line up correctly. If, for instance, you attempted to add a number and an array in either of these languages, an error would occur and everything would go downhill from there.

Javascript works a little harder to do the right thing; perhaps a little too hard. In a strange twist of fate, I once attempted to demonstrate that Javascript would throw an error when trying to add a string and a function. Instead I got a string containing the original string and the source code for the function. Suffice it to say, this is not what I expected.
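A quick sketch reproduces the surprise. The function greet and the variable names are made up for the example.

```javascript
function greet () {
    return 'hello';
}

// The function is converted to a string containing its source text,
// then concatenated; no error is thrown.
var result = 'value: ' + greet;

console.log(typeof result); // string
console.log(result);
```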

Nevertheless, this kind of type management is both the weakness and the strength of a dynamically typed language. Rather than having to spend time really thinking about strings, ints, doubles, bools and so on, you can spend more time thinking about the way your program works…

Until it doesn’t.

Correctness and Types in a Dynamic World

One of the most important things to consider in Javascript is intent. Although the kinds of strange things which can be accomplished by applying common actions to unexpected values can be entertaining, they are not particularly helpful when attempting to write a correct program.

Correctness in programming is when a program performs the expected action and, within the domain of acceptable values, returns the correct output. In other words, an adder would be incorrect if it always returned 9, regardless of the input; an adder which always returned a valid sum would be considered correct.

By considering correctness, we must consider input and output types. Let’s keep using our add function because it’s easy to understand. Above, when we discussed type annotations, we looked at an add function in Java. We said that the input values a and b were both integers and the output is an integer. This forces the idea of correctness upon our function which, actually, could be defined as correct in a broader sense. What if, instead of declaring all of the different types and overloading the function again and again, we made up a new type? Let’s call this type Addable. Suppose we had an Addable type in Java and could rewrite our function accordingly:

    // type Addable includes all Int, Float, Double, Long, etc. values

    public Addable add(Addable a, Addable b){
        return a + b;
    }

We can actually define a notation which will help us to understand the correct input/output values of our function. We can say add has a function signature which looks like this: Addable, Addable => Addable. In other words, our function takes two Addable values and returns a new, Addable, value. All of this is true and we could test this function via various methods to prove the specific addition behavior is correct.

This new Addable type is effectively what we get in Javascript with the type “number.” We know that any number can be added to any number, so whether one number is an integer and another is a floating point value, we know they can still be added together. This means we can actually go so far as to eliminate the type annotations altogether and simply write our function as follows:

    function add (a, b) {
        return a + b;
    }

Of course, the problem we face here is there is no annotation to tell the next programmer what types a and b should be. Moreover, Javascript is quite forgiving and will allow a programmer to pass anything in which might be usable with a “+” operator. There are solutions to each of these, though we will only look at solutions for telling the next programmer what we intended.

Ad Hoc Properties to the Rescue

Under the hood, Javascript shares some really interesting characteristics with Smalltalk. Specifically, everything in Javascript, when managed within the runtime, is an object. This means we can do all kinds of neat things with functions, like assign properties.

What this means is we can actually do something real about making our programming intentions more clear. What if we took our add function and assigned an ad hoc property to the Function object instance called “signature?” By creating a property which declared what the function should do we get two benefits. First, anyone reading the source can immediately see what we meant to do and, second, we can actually create an artifact in our code which can be called upon elsewhere to get immediate feedback on what our behavior should look like. Here’s an example:

    function add (a, b) {
        return a + b;
    }

    add.signature = 'number, number => number';

Now, looking at our code we can see what add does. It takes two numbers and returns a number. We can use this same property to our advantage elsewhere in our code. If we were planning to use add and wanted to see what the expected input and output are, we can simply output the signature. Here’s how we could do that:

    console.log(add.signature); // number, number => number

Now we know! Better yet, if add was somewhere deep in a third-party library, we wouldn’t have to dig through third-party code to understand what the contract for add might be.
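Since an ad hoc property can be overwritten by anyone, one possible refinement is a tiny helper which attaches the signature as a read-only property. This is a sketch, assuming a hypothetical helper named withSignature; it is not taken from any library.

```javascript
// Hypothetical helper: attaches a read-only signature property to a function.
function withSignature (signature, fn) {
    Object.defineProperty(fn, 'signature', {
        value: signature,
        writable: false,
        enumerable: true
    });

    return fn;
}

var add = withSignature('number, number => number', function (a, b) {
    return a + b;
});

console.log(add.signature); // number, number => number
console.log(add(1, 2));     // 3
```

The function behaves exactly as before; the only change is that the declared contract can no longer be silently replaced.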

Thinking Types

The really important idea here is, even if they aren’t expressed in code, types live within everything we do in Javascript. As we develop software, it becomes really easy to simply not think about what a function signature looks like and call it with whatever we have, hoping it does what we expect.

Programming this way is dangerous and can lead to bugs which are hard to triage and fix. Instead of using the spray and pray approach, it is helpful to understand, more fully, what you intend to do and work with the types which are intended in a function’s activity.

What this means to the dynamic programmer is, we have to be more vigilant, more cautious and more prepared while solving a problem than someone working with a statically typed, explicitly annotated language.

There are two ideas we must always keep in mind when programming: the goal of a correct program and what we must do to get there. The first idea is tied to the company goal behind whatever problem we are actually trying to solve. The second idea encompasses types and actions almost exclusively.

Summary

Regardless of the typing mechanism for the chosen language with which we solve a problem, types are part of the solution. Javascript does not express the value and function types explicitly in the source code, but the types we use are equally important to anything used in a statically typed language.

We can solve the problem of expressing our function signature through using comments or adding a property which can be read and understood by other programmers. This will help alleviate the challenges which arise from misunderstanding source code.

Finally, as we work we must always be aware of the types we are interacting with and how they lead to the solution for whichever problem we are solving at the time. Instead of throwing things at the wall and seeing what sticks, let’s work carefully and with intent to write correct, valid programs.

P.S. If you don’t want to remember all of the metadata stuff you have to do, check out signet.

Objects Are Still Shared State

Mar 30, 2016

Dear programmers coming from Classical Object Oriented programming, please stop thinking that encapsulation of variables eliminates the “globalness” of your variable. It’s a hard truth, but you had to hear it from someone; you have a problem. Consider this an intervention.

I had a conversation a couple months ago where I looked at some code a senior developer had written and asked, “why are you using a global variable?” The response I got was “it’s the exposing module pattern, so it’s local and encapsulated. It’s not global.” The variable was a cache object exposed outside of the module; and it was global anyway.

When I say global, it is not about whether the entire program, or the entire world, can access your value; it’s about how your variable gets managed and modified. Really, the problematic aspect of a global variable comes from the fact that global variables, in many popular languages, represent shared, mutable state.

Consider a world where every variable is actually immutable, i.e. once you create a variable, you can’t change the value. In this particular case, a global variable is really nothing more than a globally readable value. You can’t write to it, so you can’t impact the rest of the running program. Is that global variable actually a problem? Decidedly less so, that’s for sure.
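Javascript can approximate that world with a frozen object. This is only a sketch: `config` and `retries` are made-up names, and `Object.freeze` is shallow, so nested objects would need to be frozen separately.

```javascript
// A "global" that can be read from anywhere but never changed.
var config = Object.freeze({ retries: 3 });

// In sloppy mode this write is silently ignored; in strict mode it throws.
try {
    config.retries = 10;
} catch (error) {
    // TypeError when running in strict mode
}

console.log(config.retries); // 3
console.log(Object.isFrozen(config)); // true
```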

Mutating Object State

Let’s take a look at a very simple, though rather common, example of the way variables are often managed inside objects.

    function SneakyObj () {
        this.value = 0;
    }
    
    SneakyObj.prototype = {
        getValue: function () {
            var current = this.value;
            this.value++;
            return current;
        }
    };

There are two things wrong with this if value is actually important to the internal state of the object. First, since Javascript does not support private variables (explicitly, but we’ll come back to that), this suffers from the Indecent Exposure code smell. Essentially, anyone in the world can directly access and modify the internal state of this object. That’s bad news.

The second issue with this object is the getter actually modifies the internal value of our object and returns a representation of the previous object state. Effectively, our getter is modifying the internal state of the object and lying to us about it.
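To make both problems concrete, here is a short usage sketch. The object definition is repeated so the snippet runs on its own; the variable names are made up for illustration.

```javascript
function SneakyObj() {
    this.value = 0;
}

SneakyObj.prototype = {
    getValue: function () {
        var current = this.value;
        this.value++;
        return current;
    }
};

var sneaky = new SneakyObj();

// The "getter" changes state, so the same call gives different answers.
var first = sneaky.getValue();  // 0
var second = sneaky.getValue(); // 1

// Indecent Exposure: anyone can reach in and overwrite the state.
sneaky.value = 100;
var third = sneaky.getValue();  // 100

console.log(first, second, third);
```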

Before you proclaim “I never do that! How very dare you,” keep in mind that this pattern shows up all the time. Popular frameworks like Angular and Ember actually encourage this kind of thing through the controller pattern. This is a sneaky trap that is hard to avoid.

Although we can’t quickly resolve the code smell, let’s take a look at a remedy for the lie that is our “get” method name.

    function LessSneakyObj () {
        this.value = 0;
    }
    
    LessSneakyObj.prototype = {
        getAndUpdateValue: function () {
            var current = this.value;
            this.value++;
            return current;
        }
    };

Now we understand and declare what the method does. For some people this is enough and we need to go no further. I, on the other hand, feel this is still rather suspect and would prefer to see a cleaner, more elegant construction.

Separate The Activity

The issue I have with our updated object is, we have one method which does all the things. This is a really bad idea since it doesn’t protect the programmer from a micro-god function. (Hey, you can have micro-frameworks and micro-services.) Effectively we have fixed the naming problem, but we haven’t actually resolved the smelly code which lives within our method.

Typically I prefer a single function which will return the current state of affairs and another function, if you MUST, which modifies the internal state. This kind of separation of concerns actually helps to keep object state sane and useful. If not for the exposed internal value of the object, we would be on our way to saner code.

    function ObviousObj () {
        this.value = 0;
    }
    
    ObviousObj.prototype = {
        get: function () {
            return this.value;
        },
        
        update: function () {
            this.value++;
        }
    };

We can see this code actually separates the functionality and has the lovely side effect of making the code more readable. If I were working in a project using an MVC paradigm, I would call this good and move on. We have separated the behaviors and tried to keep everything clean, tidy and meaningful. Our view would be able to access the values it needs and we keep our state management safe from accidental update.

Turn Up The Encapsulation

From here we can start looking at working on our fine detail. Up to now, we have accepted that our internal values are exposed and available for the world to manipulate, AKA Indecent Exposure. It’s time to fix that little bit of nastiness and make our object water- and tamper-proof.

The only way to actually protect a variable from external access in Javascript is through closures. Since functions are objects and objects are built atop function constructors, we can perform a little scope management surgery and make our object really safe and secure. Let’s take a look and see what we can do to lock things down.

    function EncapsulatedObj () {
        var internalState = {
            value: 0
        };
        
        this.get = this.get.bind(null, internalState);
        this.update = this.update.bind(null, internalState);
    }
    
    EncapsulatedObj.prototype = {
        get: function (state) {
            return state.value;
        },
        
        update: function (state) {
            state.value++;
        }
    };

This code does a little fiddling around with scope by partially applying the object’s internal state to our get and update functions. This protects our variable from being accessed by the outside world, but allows our get and update methods to access our value freely. When your data must be locked away, this will get you there.
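A quick usage sketch shows what the closure buys us. The object is repeated so the snippet runs on its own, and `counter` is just an illustrative name.

```javascript
function EncapsulatedObj() {
    var internalState = {
        value: 0
    };

    // Partially apply the private state to the prototype methods.
    this.get = this.get.bind(null, internalState);
    this.update = this.update.bind(null, internalState);
}

EncapsulatedObj.prototype = {
    get: function (state) {
        return state.value;
    },

    update: function (state) {
        state.value++;
    }
};

var counter = new EncapsulatedObj();
counter.update();
counter.update();

console.log(counter.get());         // 2
console.log(counter.internalState); // undefined, the state is not exposed

// Adding a property named "value" does not touch the closed-over state.
counter.value = 100;
console.log(counter.get());         // 2
```

One caveat: the bound `get` and `update` are still ordinary properties, so a caller could replace the methods themselves. The state is protected; the interface is not.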

Our Code Goes to 11

In order to finish up this journey, it seemed only right to create a completely pure, immutable object just to see where it would lead us. If we were to really go all the way, we would need to do a little more work to ensure everything still worked as we would expect.

We know the variable “value” maintains a count for some reason, so it will be important to ensure value is always an integer. We also want to make sure the get method always gives the current count. Finally, update should do just that: update the count value. What does it mean to make an update if everything is immutable? Let’s have a look and find out.

    function isNumber (value){
        return typeof value === 'number';
    }
    
    function isInt (value){
        return isNumber(value) && Math.floor(value) === value;
    }
    
    function SafeObj (initialValue) {
        var value = SafeObj.cleanValue(initialValue);
        
        this.get = this.get.bind(null, value);
        this.update = this.update.bind(null, value);
    }

    SafeObj.cleanValue = function (value) {
        return isInt(value) ? value : 0;
    };
    
    SafeObj.prototype = {
        get: function (value) {
            return value;
        },
        
        update: function (value) {
            return new SafeObj(value + 1);
        }
    };

This is just chock full of pure functions and added behavior. With all of that added behavior, we get something magical. Instead of having an object which is mutable and, ultimately, somewhat unpredictable and hard to test, we end up with an object which has the following properties:

  • Immutable
  • Contains pure methods
  • Has a single, pure, static method
  • Is compositionally built
  • Updates through new object construction

This whole object construction could lead us down many discussions which would get into types, values, mutability, function composition and more. For now, it will suffice to say, this kind of development creates the ideal situation for developing safely and really turns our code up to 11.
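Here is a short usage sketch of what “update” means when nothing mutates. The definitions are repeated so the snippet runs on its own; `zero`, `one` and `two` are illustrative names.

```javascript
function isNumber(value) {
    return typeof value === 'number';
}

function isInt(value) {
    return isNumber(value) && Math.floor(value) === value;
}

function SafeObj(initialValue) {
    var value = SafeObj.cleanValue(initialValue);

    this.get = this.get.bind(null, value);
    this.update = this.update.bind(null, value);
}

// Anything that is not an integer falls back to 0.
SafeObj.cleanValue = function (value) {
    return isInt(value) ? value : 0;
};

SafeObj.prototype = {
    get: function (value) {
        return value;
    },

    update: function (value) {
        return new SafeObj(value + 1);
    }
};

var zero = new SafeObj(0);
var one = zero.update(); // a new object; zero is untouched
var two = one.update();

console.log(zero.get(), one.get(), two.get()); // 0 1 2
console.log(new SafeObj('seven').get());       // 0
```

Every update hands back a fresh object, so anyone still holding `zero` keeps seeing 0 no matter how many updates happen elsewhere.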

The numbers all go to 11.

Summing Up

Although we got a little spacey at the end, the important thing to take away from this whole thing is, any time an object is built and modifies its own state through method calls, the methods are actually relying on shared, mutable state.

Shared mutable state in an object really is just a micro-global and should be viewed as such. This means, any value which can be accessed and modified should be considered unsafe and untrustworthy. Untrustworthy data should never be viewed as the source of truth.

From here forward, if you start to add a variable to an object or module, ask yourself, does this really need to be global, or can I localize it? Perhaps you will find a better way to keep your code clean and easy to reason about.

Pattern Matching in Javascript

Mar 16, 2016

For more than a year I have been considering the idea of pattern matching in Javascript. I know I am not the only one trying to solve this problem because there are a handful of resources where people have put together propositions for solutions, including a Sweet.js macro called Sparkler, and Bram Stein’s blog post and associated Github repo.

Although I really, really like the idea of a macro to handle pattern matching, I fear people will throw it out immediately since pattern matching by itself is already a sadly obscure topic. This means anything that requires a macro package will probably turn the general populace off, especially since I haven’t met anyone in my area who has heard of Sweet.js except me.

Although I like Bram’s approach to handling macros with a function library, it looks like he didn’t get a chance to make much headway. That’s really unfortunate since I think he was headed in a good, although kind of simple, direction.

Before we go any further, it is important to discuss the idea of pattern matching for the uninitiated. Pattern matching is a functional means to quickly look at the shape and signature of data and make a programmatic decision based on what is there. In other words, pattern matching is like conditionals which have spent the last 10 years at the gym.

Imagine, if you will, conditional statements which looked like this:

    match(vector) {
        case [1, 2, 3]:
            return 'Low number sequence';
        case [_, _, 0]:
            return '2D vector in 3d plane';
        default:
            return 'I have no idea what you gave me';
    }

Even this isn’t really powerful enough to truly capture what pattern matching does, but I think it gives a little insight. Nonetheless, if we had a construct which would give us the ability to match on an entire set of conditions within our data structures, the face of Javascript programming would be a very different place.

Pattern matching is not just about reducing keystrokes; it redefines what it means to describe and interact with your data altogether. It would do for data structures what regular expressions have done for string manipulation.

Pattern matching is the dream.

So, after doing a lot of thinking, I think I have settled on a means to give this dream the breath of life. Unfortunately, I believe it is unlikely that the path to data Nirvana will be easy. I have a suspicion this is the same issue Bram encountered. Looking at the ~1400 lines of code that make up the Sparkler macro, pattern matching could be a tricky problem.

Function and Contract

I looked at the Sweet.js macro, Bram Stein’s early exploration and the match behavior in both Scala and Racket. I decided I needed something which felt like fluent Javascript instead of succumbing to the Racket nut which lives inside me, so I opted to avoid the hardcore Lisp syntax. Scala had a closer feel to what I wanted so I kept that in mind. Bram’s example felt close, but a little too muddy, and the Sweet.js macro just felt a little too much like operators instead of functions. What I landed on was this: () => () => any. That is to say, a function ($match) which returns a function (pattern assembly), which returns the final result of the pattern matching. Here’s an example of a simple exploration, drawing against Bram’s factorial implementation.

    function factorial(n) {
        return $match(n)(function (pattern) {
            return [
                [0, 1],
                [pattern.else(), () => n * factorial(n - 1)]
            ];
        });
    }

It’s easy to see the first call is just $match(n). This returns another function which takes a function as an argument; i.e. $match is a higher-order function which returns a higher-order function, which takes a function as an argument, which then does stuff and returns a result. Another way of looking at this is $match is a function which chains to a function which is designed to perform pattern assembly. Once the pattern is assembled, everything is executed and we get a result.
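To make the shape a little less muddy, here is a minimal sketch of a `$match` that is just enough to run the factorial example. Nothing here is a finished implementation: the `ELSE` sentinel, the strict-equality check and the rule that function results get called are all assumptions made for illustration.

```javascript
// Hypothetical sentinel returned by pattern.else().
var ELSE = {};

function $match(value) {
    // Second function: receives the pattern assembly function.
    return function (assemble) {
        var pattern = {
            else: function () {
                return ELSE;
            }
        };
        var cases = assemble(pattern);

        for (var i = 0; i < cases.length; i++) {
            var matcher = cases[i][0];
            var result = cases[i][1];

            // Assumption: plain values match by strict equality.
            if (matcher === ELSE || matcher === value) {
                // Assumption: function results are deferred work, so call them.
                return typeof result === 'function' ? result() : result;
            }
        }

        throw new Error('No pattern matched');
    };
}

function factorial(n) {
    return $match(n)(function (pattern) {
        return [
            [0, 1],
            [pattern.else(), function () { return n * factorial(n - 1); }]
        ];
    });
}

console.log(factorial(5)); // 120
```

Wrapping the recursive case in a function matters: without the deferral, assembling the cases would call factorial again before any matching happened and the recursion would never stop.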

Clear as mud?

Expanding the Concept

This small example seems pretty simple. Check for equality and if nothing works, then use the else clause. This is a little condition-block feeling, but I think people will understand the else clause better than anything else I might have put in there.

Anyway, digging in a little further, I wanted to also be able to quickly and easily match arrays, objects or other things as well. This simple equality checking was not enough, so I started expanding, moving into some sort of factory behavior to create matchers on the fly. This brought me to an example which was a little more interesting and a lot more complex.

    function vectorMatcher(vector) {
        return $match(vector)(function (pattern) {
            // Return the list of [pattern, result] pairs, as in factorial.
            return [
                [[pattern.number(), 0], 'x'],
                [[0, pattern.number()], 'y'],
                [[pattern.number(), pattern.number()], 'x,y'],
                [pattern.else(), new TypeError('invalid vector')]
            ];
        });
    }

Here I am returning a string based on the pattern of a pair (2-valued array) being treated like a vector. If any of the first three patterns match, we get a string describing the axes the vector lives on. If the vector is not a pair, or it contains something other than numbers, I want to return a type error with a message explaining the provided vector was invalid.

This bit of logic is significantly more complex than our previous factorial example and leaves us in a place where the code is descriptive, but perhaps not as readable as we would like. Moreover, I thought about what I really want to be able to say and it made more sense to, perhaps, create something of a pattern matching DSL (domain-specific language).
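One way the matcher factories might work is for `pattern.number()` to return a predicate and for arrays to be compared element by element. The sketch below assumes exactly that; the `matches` helper and the `ELSE` sentinel are hypothetical names, and this is an exploration rather than a finished library.

```javascript
var ELSE = {};

// Hypothetical helper: does a single matcher accept this value?
function matches(matcher, value) {
    if (matcher === ELSE) {
        return true;
    }
    if (typeof matcher === 'function') {
        return matcher(value);
    }
    if (Array.isArray(matcher)) {
        return Array.isArray(value) &&
            matcher.length === value.length &&
            matcher.every(function (item, index) {
                return matches(item, value[index]);
            });
    }
    return matcher === value;
}

function $match(value) {
    return function (assemble) {
        var pattern = {
            number: function () {
                return function (candidate) {
                    return typeof candidate === 'number';
                };
            },
            else: function () {
                return ELSE;
            }
        };
        var cases = assemble(pattern);

        // The first case whose matcher accepts the value wins.
        for (var i = 0; i < cases.length; i++) {
            if (matches(cases[i][0], value)) {
                return cases[i][1];
            }
        }
        throw new Error('No pattern matched');
    };
}

function vectorMatcher(vector) {
    return $match(vector)(function (pattern) {
        return [
            [[pattern.number(), 0], 'x'],
            [[0, pattern.number()], 'y'],
            [[pattern.number(), pattern.number()], 'x,y'],
            [pattern.else(), new TypeError('invalid vector')]
        ];
    });
}

console.log(vectorMatcher([3, 0])); // 'x'
console.log(vectorMatcher([1, 2])); // 'x,y'
```

Because cases are tried in order, `[0, 0]` lands on `'x'`. Note the else case returns the TypeError instead of throwing it, which mirrors the example as written.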

Matching on a DSL

If I was going to invent any kind of simple language for expressing matching behaviors, I wanted it to be less cryptic than regular expressions, but I also wanted it to be terse enough that it didn’t become some giant, sprawling string someone would have to mentally parse to really understand what was happening. What I landed on was a simple near-Javascript expression of what the values should look like when they properly match our criteria. This turns our earlier expression into the following.

    function vectorMatcher(vector) {
        return $match(vector)(function (pattern) {
            // Return the list of [pattern, result] pairs, as in factorial.
            return [
                [pattern('[<n>, 0]'), 'x'],
                [pattern('[0, <n>]'), 'y'],
                [pattern('[<n>, <n>]'), 'x,y'],
                [pattern.else(), new TypeError('invalid vector')]
            ];
        });
    }

I reduced the type descriptor for brevity, opting for an angle-bracketed character. Now we can wrap up our expression in a single pattern call and get something the matcher can quickly and easily execute to verify whether the vector matches our requirements or not. This, however, also means I am responsible for generating an AST (abstract syntax tree) for these expressions. Of course, if I am going to do that, it would be best to create one by hand so I can see what I am actually contending with.
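Before committing to a real AST, a deliberately naive sketch shows the idea of turning a pattern string into something executable. `compilePattern` is a hypothetical name, and the code assumes a non-empty, flat, bracketed list where each entry is either `<n>` or a numeric literal. A real implementation would need a proper tokenizer and parser.

```javascript
function isNumber(value) {
    return typeof value === 'number';
}

// Turn a string such as '[<n>, 0]' into a predicate over arrays.
function compilePattern(source) {
    var inner = source.trim().replace(/^\[/, '').replace(/\]$/, '');

    var checks = inner.split(',').map(function (rawToken) {
        var token = rawToken.trim();

        if (token === '<n>') {
            return isNumber;
        }

        // Anything else is assumed to be a numeric literal.
        var literal = Number(token);
        return function (value) {
            return value === literal;
        };
    });

    return function (candidate) {
        return Array.isArray(candidate) &&
            candidate.length === checks.length &&
            checks.every(function (check, index) {
                return check(candidate[index]);
            });
    };
}

var onXAxis = compilePattern('[<n>, 0]');

console.log(onXAxis([3, 0])); // true
console.log(onXAxis([3, 1])); // false
```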

Matcher Abstract Syntax Tree

The AST for my matcher language would, ultimately, link in with an underlying state machine of some sort, but I won’t dig that deep right now. Nonetheless, what I ended up with when constructing an AST is long, but relatively declarative. This means I could, theoretically, start the entire process NOT at the language level, but instead at a place which people would more readily understand. Let’s have a look at the AST replacement for our matching behavior.

    function vectorMatcher(vector) {
        return $match(vector)(function (pattern) {
            // Return the list of [pattern, result] pairs, as in factorial.
            return [
                [pattern.array([
                    pattern.number(),
                    pattern.number(0)
                ]), 'x'],
                [pattern.array([
                    pattern.number(0),
                    pattern.number()
                ]), 'y'],
                [pattern.array([
                    pattern.number(),
                    pattern.number()
                ]), 'x,y'],
                [pattern.else(), new TypeError('invalid vector')]
            ];
        });
    }

It’s long, and probably overkill for the problem presented here, but it gives us a real view into the guts of the problem and a way out of the mud. This is, also, unfortunately all I could pull together in time for this post, but I feel like we covered a tremendous amount of ground. I will continue to experiment with pattern matching and, perhaps, by next time we could even have a working object model to build our tree with. Until the next post, keep coding!
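As a small taste of where this could go, the sketch below treats each `pattern` call as a factory for a plain AST node and walks the tree with a tiny interpreter. The node shapes (`type`, `expected`, `items`) and the `evaluate` function are assumptions invented for this illustration, not a settled object model.

```javascript
// Each factory returns a plain data node; nothing is executed yet.
var pattern = {
    number: function (expected) {
        return { type: 'number', expected: expected };
    },
    array: function (items) {
        return { type: 'array', items: items };
    }
};

// Walk the tree and decide whether the value fits the described shape.
function evaluate(node, value) {
    switch (node.type) {
        case 'number':
            return typeof value === 'number' &&
                (node.expected === undefined || node.expected === value);
        case 'array':
            return Array.isArray(value) &&
                value.length === node.items.length &&
                node.items.every(function (item, index) {
                    return evaluate(item, value[index]);
                });
        default:
            throw new Error('Unknown node type: ' + node.type);
    }
}

var onXAxis = pattern.array([
    pattern.number(),
    pattern.number(0)
]);

console.log(evaluate(onXAxis, [5, 0])); // true
console.log(evaluate(onXAxis, [5, 1])); // false
```

Keeping the nodes as plain data means a string DSL and hand-written factory calls could both produce the same tree.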

  • Web Designers Rejoice: There is Still Room

    I’m taking a brief detour and talking about something other than user tolerance and action on your site. I read a couple of articles, which you’ve probably seen yourself, and felt a deep need to say something. Smashing Magazine published Does The Future Of The Internet Have Room For Web Designers? and the rebuttal, I Want To Be A Web Designer When I Grow Up, but something was missing.

  • Anticipating User Action

    Congrats, you’ve made it to the third part of my math-type exploration of anticipated user behavior on the web. Just a refresher, the last couple of posts were about user tolerance and anticipating falloff/satisficing. These posts may have been a little dense and really math-heavy, but it’s been worth it, right?

  • Anticipating User Falloff

    As we discussed last week, users have a predictable tolerance for wait times through waiting for page loading and information seeking behaviors. The value you get when you calculate expected user tolerance can be useful by itself, but it would be better if you could actually predict the rough numbers of users who will fall off early and late in the wait/seek process.

  • User Frustration Tolerance on the Web

    I have been working for quite a while to devise a method for assessing web sites and the ability to provide two things. First, I want to assess the ability for a user to perform an action they want to perform. Second I want to assess the ability for the user to complete a business goal while completing their own goals.

  • Google Geocoding with CakePHP

    Google has some pretty neat toys for developers and CakePHP is a pretty friendly framework to quickly build applications on which is well supported. That said, when I went looking for a Google geocoding component, I was a little surprised to discover that nobody had created one to do the hand-shakey business between a CakePHP application and Google.

  • Small Inconveniences Matter

    Last night I was working on integrating oAuth consumers into Noisophile. This is the first time I had done something like this so I was reading all of the material I could to get the best idea for what I was about to do. I came across a blog post about oAuth and one particular way of managing the information passed back from Twitter and the like.

  • Know Thy Customer

    I’ve been tasked with an interesting problem: encourage the Creative department to migrate away from their current project tracking tool and into Jira. For those of you unfamiliar with Jira, it is a bug tracking tool with a bunch of toys and goodies built in to help keep track of everything from hours to subversion check-in number. From a developer’s point of view, there are more neat things than you could shake a stick at. From an outsider’s perspective, it is a big, complicated and confusing system with more secrets and challenges than one could ever imagine.

  • When SEO Goes Bad

    My last post was about finding a healthy balance between client- and server-side technology. My friend sent me a link to an article about SEO and Google’s “reasonable surfer” patent. Though the information regarding Google’s methods for identifying and appropriately assessing useful links on a site was interesting, I am quite concerned about what the SEO crowd was encouraging because of this new revelation.

  • Balance is Everything

    Earlier this year I discussed progressive enhancement, and proposed that a web site should perform the core functions without any frills. Last night I had a discussion with a friend, regarding this very same topic. It came to light that it wasn’t clear where the boundaries should be drawn. Interaction needs to be a blend of server- and client-side technologies.

  • Coding Transparency: Development from Design Comps

    Since I am an engineer first and a designer second in my job, more often than not the designs you see came from someone else’s comp. Being that I am a designer second, it means that I know just enough about design to be dangerous but not enough to be really effective over the long run.

  • Usabilibloat or Websites Gone Wild

    It’s always great when you have the opportunity to build a site from the ground up. You have opportunities to design things right the first time, and set standards in place for future users, designers and developers alike. These are the good times.

  • Thinking in Pieces: Modularity and Problem Solving

    I am big on modularity. There are lots of problems on the web to fix and modularity applies to many of them. A couple of posts ago I talked about content and that it is all built on or made of objects. The benefits from working with objectified content is the ease of updating and the breadth and depth of content that can be added to the site.

  • Almost Pretty: URL Rewriting and Guessability

    Through all of the usability, navigation, design, various user-related laws and a healthy handful of information and hierarchical tricks and skills, something that continues to elude designers and developers is pretty URLs. Mind you, SEO experts would balk at the idea that companies don’t think about using pretty URLs in order to drive search engine placement. There is something else to consider in the meanwhile:

  • Content: It's All About Objects

    When I wrote my first post about object-oriented content, I was thinking in a rather small scope. I said to myself, “I need content I can place where I need it, but I can edit once and update everything at the same time.” The answer seemed painfully clear: I need objects.

  • It's a Fidelity Thing: Stakeholders and Wireframes

    This morning I read a post about wireframes and when they are appropriate. Though I agree, audience is important, it is equally important to hand the correct items to the audience at the right times. This doesn’t mean you shouldn’t create wireframes.

  • Developing for Delivery: Separating UI from Business

    With the advent of Ruby on Rails (RoR or Rails) as well as many of the PHP frameworks available, MVC has become a regular buzzword. Everyone claims they work in an MVC fashion though, much like Agile development, it comes in various flavors and strengths.

  • I Didn't Expect THAT to Happen

    How many times have you been on a website and said those very words? You click on a menu item, expecting to have content appear in much the same way everything else did. Then, BANG you get fifteen new browser windows and a host of chirping, talking and other disastrous actions.

  • Degrading Behavior: Graceful Integration

    There has been a lot of talk about graceful degradation. In the end it can become a lot of lip service. Often people talk a good talk, but when the site hits the web, let’s just say it isn’t too pretty.

  • Website Overhaul 12-Step Program

    Suppose you’ve been tasked with overhauling your company website. This has been the source of dread and panic for creative and engineering teams the world over.

  • Pretend that they're Users

    Working closely with the Creative team, as I do, I have the unique opportunity to consider user experience through the life of the project. More than many engineers, I work directly with the user. Developing wireframes, considering information architecture and user experience development all fall within my purview.

  • User Experience Means Everyone

    I’ve been working on a project for an internal client, which includes linking out to various medical search utilities. One of the sites we are using as a search provider offers pharmacy searches. The site was built on ASP.Net technology, or so I would assume as all the file extensions are ‘aspx.’ I bring this provider up because I was shocked and appalled by their disregard for the users that would be searching.

  • Predictive User Self-Selection

    Some sites, like this one, have a reasonably focused audience. It can become problematic, however, for corporate sites to sort out their users, and lead them to the path of enlightenment. In the worst situations, it may be a little like throwing stones into the dark, hoping to hit a matchstick. In the best, users will wander in and tell you precisely who they are.

  • Mapping the Course: XML Sitemaps

    I just read a short, relatively old blog post by David Naylor regarding why he believes XML sitemaps are bad. People involved with SEO probably know and recognize the name. I know I did. I have to disagree with his premise, but agree with his argument.

  • The Browser Clipping Point

    Today, at the time of this writing, Google posted a blog stating they were dropping support for old browsers. They stated:

  • Creativity Kills

    People are creative. It’s a fact of the state of humanity. People want to make things. It’s built into the human condition. But there is a difference between haphazard creation and focused, goal-oriented development.

  • Reactionary Navigation: The Sins of the Broad and Shallow

    When given a task of making search terms and frequently visited pages more accessible to users, the uninitiated fire and fall back. They leave in their wake broad, shallow sites with menus and navigation which look more like weeds than an organized system. Ultimately, these navigation schemes fail to do the one thing they were intended for: enhance findability.

  • OOC: Object Oriented Content

    Most content on the web is managed at the page level. Though I cannot say that all systems behave in one specific way, I do know that each system I’ve used behaves precisely like this. Content management systems assume that every new piece of content which is created is going to, ultimately, have a page that is dedicated to that piece of content. Ultimately all content is going to live autonomously on a page. Content, much like web pages, is not an island.

  • Party in the Front, Business in the Back

    Nothing like a nod to the reverse mullet to start a post out right. As I started making notes on a post about findability, something occurred to me. Though it should seem obvious, truly separating presentation from business logic is key in ensuring usability and ease of maintenance. Several benefits can be gained with the separation of business and presentation logic including wiring for a strong site architecture, solid, clear HTML with minimal outside code interfering and the ability to integrate a smart, smooth user experience without concern of breaking the business logic that drives it.

  • The Selection Correction

    User self selection is a mess. Let’s get that out in the open first and foremost. As soon as you ask the user questions about themselves directly, your plan has failed. User self selection, at best, is a mess of splash pages and strange buttons. The web has become a smarter place where designers and developers should be able to glean the information they need about the user without asking the user directly.

  • Ah, Simplicity

    Every time I wander the web I seem to find it more complicated than the last time I left it.  Considering this happens on a daily basis, the complexity appears to be growing monotonically.  It has been shown again and again that the attention span of people on the web is extremely short.  A good example of this is a post on Reputation Defender about the click-through rate on their search results.

  • It's Called SEO and You Should Try Some

    It’s been a while since I last posted, but this bears note. Search engine optimization, commonly called SEO, is all about getting search engines to notice you and people to come to your site. The important thing about good SEO is that it will do more than simply get eyes on your site, but it will get the RIGHT eyes on your site. People typically misunderstand the value of optimizing their site or they think that it will radically alter the layout, message or other core elements they hold dear.

  • Information and the state of the web

    I only post here occasionally and it has crossed my mind that I might almost be wise to just create a separate blog on my web server.  I have these thoughts and then I realize that I don’t have time to muck with that when I have good blog content to post, or perhaps it is simply laziness.  Either way, I only post when something strikes me.

  • Browser Wars

    It’s been a while since I have posted. I know. For those of you that are checking out this blog for the first time, welcome. For those of you who have read my posts before, welcome back. We’re not here to talk about the regularity (or lack thereof) that I post with. What we are here to talk about is supporting or not supporting browsers. So first, what inspired me to write this? Well… this:

  • Web Scripting and you

    If there is one thing that I feel can be best learned from programming for the internet it’s modularity.  Programmers preach modularity through encapsulation and design models but ultimately sometimes it’s really easy to just throw in a hacky fix and be done with the whole mess.  Welcome to the “I need this fix last week” school of code updating.  Honestly, that kind of thing happens to the best of us.

  • Occam's Razor

    I have a particular project that I work on every so often. It’s actually kind of a meta-project as I have to maintain a web-based project queue and management system, so it is a project for the sake of projects. Spiffy eh? Anyway, I haven’t had this thing break in a while which either means that I did such a nice, robust job of coding the darn thing that it is unbreakable (sure it is) or more likely, nobody has pushed this thing to the breaking point. Given enough time and enough monkeys. All of that aside, every so often, my boss comes up with new things that she would like the system to do, and I have to build them in. Fortunately, I built it in such a way that most everything just kind of “plugs in” not so much that I have an API and whatnot, but rather, I can simply build out a module and then just run an include and use it. Neat, isn’t it?

  • Inflexible XML data structures

    Happy new year! Going into the start of the new year, I have a project that has carried over from the moment I started my current job. I am working on the information architecture and interaction design of a web-based insurance tool. Something that I have run into recently is a document structure that was developed using XML containers. This, in and of itself, is not an issue. XML is a wonderful tool for dividing information up in a useful way. The problem lies in how the system is implemented. This, my friends, is where I ran into trouble with a particular detail in this project. Call it the proverbial bump in the road.

  • Accessibility and graceful degradation

    Something that I have learnt over time is how to make your site accessible for people who don’t have your perfect 20/20 vision, are working from a limited environment or just generally have old browsing capabilities. Believe it or not, people who visit my web sites still use old computers with old copies of Windows. Personally, I have made the Linux switch everywhere I can. That being said, I spend a certain amount of time surfing the web using Lynx. This is not because I don’t have a GUI in Linux. I do. And I use Firefox for my usual needs, but Lynx has a certain special place in my heart. It is in a class of browser that sees the web in much the same way that a screen reader does. For example, all of those really neat iframes that you use for dynamic content? Yeah, those come up as “iframe.” Totally unreadable. Totally unreachable. The iframe is an example of web technology that is web-inaccessible. Translate this as bad news.

  • Less is less, more is more. You do the math.

    By this I don’t mean that you should fill every pixel on the screen with text, information and blinking, distracting graphics. What I really mean is that you should give yourself more time to accomplish what you are looking to do on the web. Sure, your reaction to this is going to be “duh, of course you should spend time thinking about what you are going to do online. All good jobs take time.” I say, oh young one, are you actually spending time where it needs to be spent? I suspect you aren’t.

  • Note to self, scope is important.

    Since this was an issue just last evening, I thought I would share something that I have encountered when writing Javascript. First of all, let me state that Javascript syntax is extremely forgiving. You can do all kinds of unorthodox declarations of variables as well as use variables in all kinds of strange ways. You can take a variable, store a string in it, then a number, then an object and then back again. Dynamically typed (and weakly typed, to boot) would be the phrase. The one thing that I would like to note, as it was my big issue last evening, is the scope of your variables. So long as you are careful about defining the scope of any given variable, you are OK. If not, you could have a problem just like I did. So, let’s start with scope and how it works.
