Sanitary Data: Maybe, Either and Deref

Aug 19, 2015

Although there are times when conditional blocks are necessary for functionality, it is common that conditions simply wrap up data which may be null or undefined. These kinds of conditional blocks have become so common that they are considered idiomatic code. The underlying goal, however, is to guarantee sanitary data.

Sanitary data is data that is guaranteed to be safe for use in your function without worrying about edge cases which could arise around data that could be subtly wrong or unsafe. Sanitary data is more than just data that exists. It is data that adheres to a set of qualifications which make the program stable and safe. These qualifications include things such as truthiness and type guarantees.

Since it is considered idiomatic to bundle up data in conditional blocks, oftentimes functionality is wrapped up in the condition as well, muddying the waters between sanitary data and function execution. I would like to break that idiom with a concept introduced in Haskell, known as maybe.

In Haskell, Maybe a is either Just a or Nothing. Javascript does not have algebraic data types, so there is no Just or Nothing, and we are going to have to adjust a little. That said, there are still some guarantees we can put into our maybe that will flow forward to other variables. Let’s have a look at an implementation of maybe that will set the foundation.

function maybe(value, expectedType){
    let typeProvided = typeof expectedType === 'string',
        valueOkay = typeProvided ? typeof value === expectedType : Boolean(value);
    
    return valueOkay ? value : null;
}

// Usage looks like this:
maybe('foo'); // foo
maybe('foo', 'object'); // null
maybe(0); // null
maybe(0, 'number'); // 0
maybe(false); // null
maybe(false, 'boolean'); // false

function myFn(value){
    let sanitizedValue = maybe(value, 'string');
    sanitizedValue = sanitizedValue === null ? 'default' : value;

    // do more stuff
}

As you can see, it gives you a strong guarantee of what your data should look like. If your data is acceptable, you will just get your data back, or Just(a). If your data fails the provided conditions, you will get null, which is as close as we can come to Nothing in Javascript.

Maybe alone will not give us the kind of data integrity we want. It reduces the conditions we have to consider down to two, which is much better than the open landscape of the unknown, but it is not sufficient on its own. As you can see in the usage section above, we are still testing our ‘sanitized’ value. Our goal is to remove as many of our data conditions as possible, so we still have more work to do.

When maybe isn’t enough, we can look to another function: either. I use either so often that if I am ever without my functional library which provides either, I build it in as a utility. It’s honestly that important. Let’s take a look at what either looks like.

function either(defaultValue, value, expectedType){
    let sanitizedValue = maybe(value, expectedType);
    return sanitizedValue === null ? defaultValue : value;
}

// Usage goes a little like this:
either('bar', 'foo'); // foo
either(42, 'foo', 'number'); // 42
either('baz', 0); // baz
either(false, 0, 'boolean'); // false

function myFn(value){
    let sanitizedValue = either('default', value, 'string');
    // do more stuff
}

As you can see, either gives us the strength of maybe with the sanitary nature of a safe default. Now we have removed all of the conditions from our small function. This is excellent! Now we can guarantee our data is sanitary even in a hurricane of strange calls and bad user data, and it’s a tiny helper function that declares exactly what it does every time we use it.

The extra benefit we have with functions like maybe and either is they really do introduce the idea of function purity in a way that works with our everyday, practical application. We can write functions that should never throw type errors, never get null pointers and always give us clean, safe data.

Except…

What happens when you need a value that is buried deep in an object? What if any layer in your object might not exist? This happens a lot. Actually it happens so much that I found myself using large blocks of either statements to guarantee I didn’t break my function. This is bad. We don’t want big blocks of function calls when our entire goal was to remove large blocks of code around data to begin with.
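To make the problem concrete, here is a hedged sketch of the kind of stacked either calls I mean; the object shape and names are purely illustrative:

function getDeepSetting(configObj){
    let config = either({}, configObj, 'object'),
        display = either({}, config.display, 'object'),
        colors = either({}, display.colors, 'object');

    return either('#000000', colors.background, 'string');
}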

Enter deref. Deref is the automated application of either over and over, ensuring we always get good data all the way through our object. Let’s take a look at what deref could look like:

function deref(userObj, keyStr){
    var keyTokens = keyStr.split('.'),
        keyToken = keyTokens.shift().trim(),
        derefResult = keyToken === '' ? userObj : either({}, userObj, 'object')[keyToken];
    
    // NEVER return undefined
    derefResult = typeof derefResult === 'undefined' ? null : derefResult;
    
    return keyTokens.length === 0 ? derefResult : deref(derefResult, keyTokens.join('.'));
}

// Usage:
deref(null, 'foo.bar.baz'); // null
deref({ test: [ 'foo', 'bar' ] }, 'test.1'); // bar

function myNewFn(valueObj){
    let refOutput = deref(valueObj, 'my.object.deep.reference.valueList.3'),
        sanitizedData  = either('default', refOutput, 'string');

    // Do stuff with ultimate certainty!!
}

Now we really have a set of functions that break us out of the data conditional cycle of madness! With as few as one or two lines, we can provide strong guarantees that our data will always match the expectations of our code. By introducing three simple helper functions, we have reduced our code down to the essence of what we mean. We no longer have to spin our wheels cleaning up data or ensuring we aren’t sending bad values through functions, introducing dangerous runtime errors that could crash our applications when the user does something unexpected.

What is really happening underneath these helper functions is, we are creating sanitary data in an isolated way. Most programs, when you scratch beneath the surface, are little more than data manipulations. By placing strong guarantees on the data we are interacting with, we can stabilize our programs and reduce the amount of stress we feel about strange data circumstances and edge cases that are, largely, unimportant to producing meaningful output.

If you don’t want to litter your code with copy/paste functions from a blog post, download the JFP library and use these functions which are pre-built to make your code sane. Read the docs to see other ways you can make these functions work for you.

The next time you find yourself writing large conditional blocks to handle strange data, think about maybe, either and deref and then breathe a sigh of relief. When we introduce sanitary data into our code and think a little more functionally, the world looks a little brighter, so make your unit tests and your QA person happier and sanitize that data!

Mainstay Monday: SOLID - Interface Segregation Principle

Aug 17, 2015

This post is part of a series on the SOLID programming principles.

The Interface Segregation Principle is a close relative to the Single Responsibility Principle. The idea behind interface segregation is your API should use several small functions or methods to accomplish tasks instead of one large function. The traditional definition states you should use many client-specific interfaces instead of one general purpose interface.

Since Javascript doesn’t have support for abstract classes or interfaces, we are going to focus on the functional application of interface segregation. Let’s start off supposing your program is going to deal with a few different cases for objects which will be handed around and managed. You know that you are going to potentially receive objects, from which you want the keys; arrays, from which you want just the strings; and a JSON string the user provides through some request or input. Here’s the way a single function looks when we try to manage all of these cases:

function doIt(myObj, isArray, isUserSpecified){
    if (!isArray && !isUserSpecified) {
        return Object.keys(myObj);
    } else if (isArray){
        return myObj.filter(value => typeof value === 'string');
    } else {
        try {
            return JSON.parse(myObj);
        } catch (error) {
            return [];
        }
    }
}

Obviously this fails single responsibility, but there is more going on here. This function receives two different boolean values and changes the way it behaves based on configuration values. This is a dangerous path to walk and I strongly suggest people avoid it. The other problem that we have here is practically every executable line returns a value. This whole function is a setup for danger.

Note, I have actually seen functions like this. This kind of practice is easy to find in the wild, especially when looking at the code written by novice developers. There is also a variation on this function which handles the creation of booleans inside the function. The code, when rewritten, looks like this.

function doItAlternate(myObj){
    let isArray = Object.prototype.toString.call(myObj) === '[object Array]',
        isUserSpecified = typeof myObj === 'string';
        
    if (!isArray && !isUserSpecified) {
        return Object.keys(myObj);
    } else if (isArray){
        return myObj.filter(value => typeof value === 'string');
    } else {
        try {
            return JSON.parse(myObj);
        } catch (error) {
            return [];
        }
    }
}

I’m not sure which is worse, but we are really solving three completely different problems with this code. Let’s suppose, instead of all the booleans, we were to start breaking this function down and solving the problems independently. This is where our segregation comes into play. We have one case where we want object keys. By inspection we can see this is not related to the array problem or the user entered data problem. Let’s split that function out.

function getObjectKeys(myObj){
    return Object.keys(myObj);
}

This function clearly cuts straight to the heart of what it does. Now we can take an object and safely capture its keys. This reduces the cognitive load of understanding when each boolean should be passed and whether something will go wrong with the code if our cases are mishandled. More importantly, any place in our code where we need to call this function can do it without any knowledge that our program could ever receive arrays or user-supplied JSON. Those behaviors are completely outside the scope of this particular piece of functionality.

Let’s deal with our array logic.

function getStringValues(myArray){
    return myArray.filter(value => typeof value === 'string');
}

This is another one-liner, but it serves a very specific purpose. We no longer have this bundled in with our object or user input logic, which means we can understand precisely the role it plays. Now our code can safely assume it will always get the same information back, so we can call our array function in the correct context and reduce the overhead that accompanies a single, general-purpose function.

Finally, let’s have a look at our segregated user input function.

function parseUserObject(userObj){
    var output;
    
    try {
        output = JSON.parse(userObj);
    } catch (error) {
        output = {};
    }
    
    return output;
}

This one is the biggie of the bunch. User data is notoriously unreliable and this is the case that muddied the water the most. Originally we had a return statement in the try block and another in the catch block, which seems like a terrible idea. More importantly, this really added a lot of complexity to our original function since we not only had to know this was user data, but we also had to drop in a block to handle any of the fallout that happens when JSON.parse is called on something that won’t parse.

With this function isolated, we get the same benefits we would get with the segregation of the other parts of our function, but we also get the added bonus of being able to rewrite this function without having to dirty up a general purpose function’s scope with a bunch of variables which may never be used in any of the other behaviors. Now we can clearly define a single entry point and a single exit. This function starts to approach the kind of purity we like when we want to wrap things up in unit tests.

Let’s take a look at one last element of the interface segregation principle. We have looked at how interface segregation and single responsibility work together to clean up a function that increases cognitive load; now let’s take a look at the value of wrapping up general purpose behaviors in specific purpose functions. This is where interface segregation can really shine and simplify your programming.

Below is a set of functions I’ve created to demonstrate satisfying specific needs and reducing the exposure of general purpose functions to our code in the large.

function stringPredicate(value){
    return typeof value === 'string';
}

function shortPredicate(value){
    return value.length < 5;
}

function numberPredicate(value){
    return typeof value === 'number';
}

function evenPredicate(value){
    return value % 2 === 0;
}

function filterStrings(valueList){
    return valueList.filter(stringPredicate);
}

function filterShortStrings(valueList){
    return filterStrings(valueList).filter(shortPredicate);
}

function filterNumbers(valueList){
    return valueList.filter(numberPredicate);
}

function filterEvenNumbers(valueList){
    return filterNumbers(valueList).filter(evenPredicate);
}

Here we can see several things at work. First, we have wrapped up the filter function in a few convenience functions which give us a specific type of output. This is great for sanitizing data as well as function composition. With each of the produced functions, we can provide value: filtering strings or numbers, filtering strings of a certain length or filtering only numbers which are even.

What is even better is that we can actually use these functions in a composite way to build more complex functions or new functions that do something different. Imagine if we had to do this directly in our code every time. That would be a LOT of duplication and we would have to interact with the general purpose filter function at every call site.
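As a quick illustration of that composite use, here is a hypothetical combination built only from the functions above (the combined function is mine, not part of the original set):

function filterShortStringsAndEvenNumbers(valueList){
    return filterShortStrings(valueList).concat(filterEvenNumbers(valueList));
}

filterShortStringsAndEvenNumbers(['foo', 'a very long string', 7, 8]); // ['foo', 8]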

We've looked at two interesting cases around the concept of segregating our interfaces and providing solutions for problems which can be reused throughout our code. First we looked at how interface segregation and single responsibility principle are related, and how one strengthens the other.  Then we had a look at wrapping up broad-use functions in solution-driven structures to simplify the process of solving problems in our program.

Interface segregation is a strong principle for simplifying code and providing a clearer strategy for solving problems.  It works hand in hand with other principles to make your program cleaner, simpler and more stable, which is what we all really want, isn't it?

    

All in the Family: Filter, Map, Reduce, Recur

Aug 12, 2015

In programming it is becoming more common to see functional patterns. Filter, map and reduce are all discussed openly in programming circles that live far away from Lisp, Haskell, ML and other functional languages. It is common, in Javascript especially, to see these functions being misunderstood and misused. My goal is to uncover the relation between these functions and provide a really deep understanding of what is going on when these common functions are used correctly.

By the end of this post, I hope you will see how filter, map, reduce and recursion all come together in a harmonious way, allowing the programmer to transform data without loops and without brutish, heavy-handed data manipulations. This journey will be part technical understanding and part new-age enlightenment. Don’t worry if everything doesn’t make sense on the first pass. This is a deep dive into the world of functional programming and a departure from the imperative methods people commonly use.

Let’s start off with reduce. Reduce is a function that, as you might have guessed, reduces a list of values. As a side note, in my little corner of the world, I think of arrays as lists and I will refer to them as such throughout this post.1 A common example of reduce is adding numbers. An adder is a simple function to implement and allows for a very simple explanation. Here’s what it might look like:

function add(a, b){
    return a + b;
}

var myResult = [1, 2, 3, 4].reduce(add, 0);
console.log(myResult); // 10

I want to dig under the surface of this behavior and understand what is really going on with reduce. Let’s take a look at a way to implement reduce. The first implementation is just going to use a standard loop. Here’s what it might look like:

function loopReduce(valueList, reduction, initialValue){
    var index = 0,
        result = initialValue !== undefined ? initialValue : valueList[0];
    
    index += result !== initialValue ? 1 : 0;
    
    while(valueList[index] !== undefined){
        result = reduction(result, valueList[index]);
        index++;
    }
    
    return result;
}

It’s not elegant, but it gets the job done. We loop over everything in the list and apply a reduction. You can create an add function and a list of numbers and try it for yourself. Anyway, if you really want to do this functionally, you would pull out our good friend, recursion. Recursion is a more mathematical means for looping over a set of values, and dates back to some of the earliest prototypes for computing back in the 1930’s.

Before we go any further, I want to introduce a few short functions that will make everything a lot more functional. In our loop-based function we would have gotten little utility out of these, but moving forward these are going to be very important.

function first(valueList){
    return valueList[0];
}

function rest(valueList){
    return valueList.slice(1);
}

function isUndefined(value){
    return value === undefined;
}

In the next function we are going to use recursion to handle the looping so the body of the function only needs to be concerned with a single step in the process of reducing our list. Now that we have those out of the way, let’s crack open a functional version of reduce and see what we can find. Let’s have a look.

function recursiveReduce(valueList, reduction, initialValue){
    var _a = isUndefined(initialValue) ? first(valueList) : initialValue,
        _b = isUndefined(initialValue) ? first(rest(valueList)) : first(valueList),
        remainderList = isUndefined(initialValue) ? rest(rest(valueList)) : rest(valueList);
    
    return remainderList.length > 0 ?
           recursiveReduce(remainderList, reduction, reduction(_a, _b)) :
           reduction(_a, _b);
}

If this is your first time digging into the world of functional programming and using functions like first and rest, you might want to stop and absorb all of this for a moment. Reduce is a rather complex transformation function that requires keeping a fair amount in your head, so this is a lot to take in. Another challenge we encounter in Javascript is the lack of pattern matching, which would simplify this function significantly. Nonetheless, that’s a pretty heavy change from where we started at the beginning of the post and we still have more of this mountain to climb.

For sufficiently small lists of values, this reduction technique will work fine, but as the list gets too big, our reduce function will begin to slow down and fail. This is because recursion in Javascript is not tail-optimized, so each call goes on the stack which will eventually overflow. This overflow is the primary reason why many imperative modern languages discourage recursive algorithms.

Clojure introduces an idea that helps us to remedy this issue. It is possible to use recursion inefficiently in Clojure and fill the stack; however, by using recur in tail position at the end of your function, you get the tail optimization you are looking for. Similarly, the JFP library offers a recur function that allows for tail-optimized recursion.2 Let’s rewrite our function using the JFP recur function.

function recurReduce(recur, valueList, reduction, initialValue){
    var _a = !isUndefined(initialValue) ? initialValue : first(valueList),
        _b = isUndefined(initialValue) ? first(rest(valueList)) : first(valueList),
        remainderList = isUndefined(initialValue) ? rest(rest(valueList)) : rest(valueList);
    
    return remainderList.length > 0 ?
           recur(remainderList, reduction, reduction(_a, _b)) :
           reduction(_a, _b);
}

function reduce(valueList, reduction, initialValue){
    return j.recur(recurReduce, valueList, reduction, initialValue);
}

Phew! That was a long walk to get to a really efficient and effective reduce function. It’s elegant, declarative, and it uses tail-optimized recursion so we can really say we are operating in a functional way from the ground up. What’s even cooler is, now we can see that even recursion can be looked at from a different angle and managed as a function instead of just a concept. However, the title of this post mentions filter and map as well. Fortunately, we don’t have to take the same long walk to get to those functions. We already have nearly everything we need: looping, function application, even data copying!

Let’s start with filter. Anyone who has used filter correctly understands the power you hold when you start manipulating lists of elements and stripping elements down to the bare bones, leaving only the data you need. It generally looks a little like this:

[1, 2, 3, 4, 5, 6, 7].filter(value => value % 2 === 0); // [2, 4, 6]

If we peel back the skin and look at what is happening in filter, we can look at it as another recursion. This means that filter is a function of recursion, just like reduce. Here’s an idea of what that would look like:

function recurFilter(recur, valueList, filterFn, initialSet){
    var sanitizedSet = isUndefined(initialSet) ? [] : initialSet,
        testValue = first(valueList),
        remainderList = rest(valueList);
    
    if(filterFn(testValue)){
        sanitizedSet.push(testValue);
    }
    
    return remainderList.length ? recur(remainderList, filterFn, sanitizedSet) : sanitizedSet;
}

function filter(valueList, filterFn){
    return j.recur(recurFilter, valueList, filterFn, []);
}

This is a lot like our implementation of reduce. The problem is it’s so much like reduce that we are actually duplicating code. Duplicate code is a bad thing when you already have a function that will do what you need. Let’s rewrite our filter function as a function of reduce.

function filterer(filterFn, newList, value){
    if(filterFn(value)){
        newList.push(value);
    }
    
    return newList;
}

function filter(valueList, filterFn){
    return reduce(valueList, filterer.bind(null, filterFn), []);
}

If we think of filter as a function of reduce, then, all of a sudden, almost all of the logic goes away. Our original function was so close to reduce that they could have been separated at birth. The only thing we really need to make filter unique is a wrapper function to evaluate our predicate and capture the values that pass. Filter is a reduction function. In much the same way, map is also a reduction function. Here’s an implementation skipping all of the intermediate steps and drawing from what we know about filter:

function mapper(mapFn, newList, value){
    newList.push(mapFn(value));
    return newList;
}

function map(valueList, mapFn){
    return reduce(valueList, mapper.bind(null, mapFn), []);
}

That’s it! Map and filter are clearly little more than a wrapper around the reduce function. With the relation to reduce, we can say that filter and map are functions of reduce. We have essentially built a family tree of filter and map, which we can see are cousins descended from the same reductive ancestor. This relationship could not work the other way around since only reduce has the flexibility to transform lists in a variety of ways, all while maintaining function purity. So, now when people talk about the filter-map-reduce pattern, we know that they are really talking about the reduce-reduce-reduce pattern.
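To see the family resemblance in action, here is a small composite that uses only the reduce, filter and map we just built, plus the add function from earlier; square is an illustrative helper, not from the original post:

function square(value){
    return value * value;
}

var evenSquareSum = reduce(map(filter([1, 2, 3, 4, 5, 6], value => value % 2 === 0), square), add, 0);

console.log(evenSquareSum); // 56, and every step is a reduce under the covers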

Why is this important?

When I first started learning functional programming, I had a really hard time separating myself from the idea that filter, map and reduce were really any different from a wrapper over forEach. I thought of them as highly specialized looping functions that did a little bit of extra work under the covers to add some power to the language.

This idea that these functions are wrappers over a loop is a common misconception made by many people coming from the imperative world into functional programming. It is common to see these functions called with function arguments with side effects, treating them as nothing more than a looping structure.
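That misuse usually looks something like this contrived sketch, where map is called purely for its side effects and the list it builds is thrown away:

// An anti-pattern: map used as a loop
var total = 0;

[1, 2, 3, 4].map(function (value) {
    total += value; // side effect; the new list map returns is discarded
});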

By breaking down the barrier between the programmer and high-performance recursion, it becomes obvious that traditional loops are not really part of the game at all. Any looping that might be done is used simply to access values within a list. This data access is what’s really important. As it turns out, the order in which the elements are accessed is really unimportant as long as the output list adheres to the same order as the input list.

This break from conventional thinking, and seeing these functions which perform operations on a list as functions of reduce, helps us to understand what is really happening: we are actually transforming the data into something new! The original data is left unmodified not as some incidental behavior of the language, but because we are building new data entirely, which is the only way immutability could exist at all.

When you return to looking at the code you are writing, and use map or filter again, it will be as if you are seeing beyond the curtain into the way the cogs and wheels are really working together. You may, like me, wonder why tail-optimized recursion is not core to the language. Things will make sense in a new way, and instead of writing programs which work, by brute force, to capture and strong-arm data into a form the user can make sense of, you will allow the data to glide effortlessly through transformations and become something the user can enjoy.

Blog Post Notes

  1. As it turns out, arrays in Javascript are more closely related to vectors in languages like Clojure. This distinction can be discussed at length elsewhere, but it is important to note. Ultimately, arrays and vectors have extra memory access features which give them a performance boost when data is accessed at places other than the head or the tail. Lists must be accessed sequentially. I prefer the term list when dealing with arrays in Javascript because many of the functions we use to process arrays come directly from list processing, and do not benefit from the vector characteristics.
  2. The tail optimization that JFP uses is a generic form of the trampolining method. Each pass of the recursive function finishes and returns. Within the recur function is a while loop which reduces the recursion into a high-performing loop structure. This allows a central recur function to capture the return and either execute the next step or exit, depending on the state of the recursion.
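For the curious, here is a rough sketch of the trampolining idea described in note 2. This is illustrative only and is not JFP's actual implementation:

// The recursive function receives a 'recur' callback as its first argument.
// Calling that callback returns a small marker object instead of growing the
// stack; the while loop keeps re-invoking the function until a real value
// comes back.
function recur(fn){
    function recurMarker(){
        return { isRecursion: true, args: Array.prototype.slice.call(arguments) };
    }

    var result = recurMarker.apply(null, Array.prototype.slice.call(arguments, 1));

    while (result && result.isRecursion) {
        result = fn.apply(null, [recurMarker].concat(result.args));
    }

    return result;
}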

Mainstay Monday: SOLID - Liskov Substitution Principle

Aug 10, 2015

This post is part of a series on the SOLID programming principles.

We’ve reached the middle and, possibly, one of the more subtle principles in SOLID. Up to now we have dealt in ideas that either prescribe a way to separate and clean up your code, or provide rules for ways to maintain consistency in behavior while adding new functionality. Liskov substitution offers a means to guarantee expectations for developers are met when things change over time.

Arguably this is one of the most difficult principles to apply to functional programming, since subclassing doesn’t exist and method overriding is meaningless; however, there are still some examples that we can look at to identify similarities. Let’s take a look at a function called deref, which dereferences an object value or returns null if the reference does not exist.

function validateObject(dataObj){
    return typeof dataObj === 'object' && dataObj !== null;
}

function validateToken(token){
    return token !== '' && token !== undefined;
}

function deref(baseObj, ref){
    let refTokens = ref.split('.'),
        token = refTokens.shift(),
        result = validateToken(token) && validateObject(baseObj) ? baseObj[token] : baseObj;
    
    result = result === undefined ? null : result;
    
    return Boolean(refTokens.length) ? deref(result, refTokens.join('.')) : result;
}

This is a recursive algorithm, which we discussed a couple weeks ago. As you can see, it has two return states, either the current result or the result from the next call on the stack. We’ll assume that the key string passed in won’t be long enough to overflow the stack.

Now, suppose we wanted to take our current deref implementation and extend it to return a default value if the reference we want doesn’t exist. We could, theoretically, add something else to this implementation, but that would violate Open/Closed at the very least. Instead, let’s create a wrapper function that extends the contract.

When we extend the contract for the function, we need to make sure that we don’t break functionality for older code that is only using deref. This means the new argument must be managed in an optional way. In classical OO languages, we could use method overloading to accomplish this, and in purely functional languages, we would have pattern matching, but Javascript lives in two worlds, so we’re going to handle this our own way.

function derefWithDefault(baseObj, ref, defaultValue){
    let result = deref(baseObj, ref),
        sanitizedDefault = defaultValue === undefined ? null : defaultValue;
    
    return result === null ? sanitizedDefault : result;
}

It only took a couple extra lines of code and we’ve now created a new function that will give us some very powerful added functionality. What’s better with this implementation is, we have maintained the original code, keeping our old functionality insulated from the new behavior. This means any new code that is written can call our new pseudo-subclassed function just as it would have the old function, and get predictable behavior, and we can revisit old code in time and refactor to the new behavior with nothing more than a function name change. Code stability is the name of this game.

Now, let’s have a look at an object oriented approach. Suppose we have a pet class, and we are describing pets which can do the trick “speak.” It’s pretty safe to assume we’re really talking about parrots and dogs, but we’ll assume there are a whole large class of animals that could be pets and do the trick called “speak.” Let’s have a look at our base class:

class Pet{
    constructor(){
        this.phrase = 'Hello, world.';
    }

    speak(){
        console.log(this.phrase);
    }
}

var genericPet = new Pet();
genericPet.speak(); // Hello, world.

Obviously our base pet is some sort of program or computer. Perhaps it’s a highly-evolved open worm or a Tamagotchi. At any rate, our pet isn’t very interesting, but it’s easy to extend and that’s what we’re going to do.

Let’s make our pet a dog. Dogs can speak, so we’re okay there. Let’s add another trick, too. Dogs can roll over. Well, mine won’t because they are stubborn, but you can teach a dog to roll over, so let’s use that trick. Here’s what our dog would look like:

class Dog extends Pet{
    constructor(){
        super();
        this.phrase = 'Woof!';
    }
    
    rollOver(){
        console.log('I rolled over, where\'s my treat?');
    }
}

var myDog = new Dog();

myDog.speak(); // Woof!
myDog.rollOver(); // I rolled over, where's my treat?

If we look at this code, it’s pretty clear that anywhere something is looking for a generic Pet instance, you could pass in Dog and it would be acceptable. It is critical to understand that we intentionally did not change speak. Suppose we were to create another pet that would only speak if you gave it a cracker and it didn’t do any other tricks. This would definitely be a picky pet. Let’s go ahead and call it just that:
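To make the substitution concrete, here is a tiny consumer of my own (not from the original example) that only knows about the Pet contract:

function makeItSpeak(pet){
    pet.speak();
}

makeItSpeak(genericPet); // Hello, world.
makeItSpeak(myDog); // Woof!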

class PickyPet extends Pet{
    constructor(){
        super();
        this.phrase = "Thanks for the cracker.";
    }
    
    speak(cracker){
        if(cracker !== 'cracker'){
            throw new Error ('No cracker, no speak.');
        }
        
        super.speak();
    }
}

var myPickyPet = new PickyPet();
myPickyPet.speak('cracker'); // Thanks for the cracker.
myPickyPet.speak(); // Throws error with message "No cracker, no speak."

As it turns out this is such a well-known violation of Liskov Substitution that my code editor highlighted the new speak method and informed me that it was an invalid extension of the base class. Obviously, anything expecting a conforming instance of Pet would have a problem with our new subclass. As it turns out, Javascript doesn’t care about this violation until runtime and by then, it’s too late.

There are more subtle violations that could also happen, but it’s hard to list them all. Suppose speak did take an argument, but threw no error for any kind of type violation; this kind of code is still a violation since our new picky pet does throw an error. Other kinds of problems can be type mismatches, variations on what is returned by the method or function, removal of functionality that is critical for the parent class to work properly and so on.
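Here is one hedged sketch of what a subtler violation might look like; QuietPet is illustrative and not part of the original example:

class QuietPet extends Pet{
    // Pet's speak logs the phrase; this override only returns it.
    // Consumers written against the Pet contract now get no output at all.
    speak(){
        return this.phrase;
    }
}

var myQuietPet = new QuietPet();
myQuietPet.speak(); // Logs nothing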

Liskov Substitution is a fairly subtle principle which is meant to protect the deepest, most core parts of your program. I have a friend who claims that every other SOLID principle flows forth from Liskov and I would generally tend to agree, though that’s a discussion for another day. Ultimately, if you adhere to the Liskov Substitution principle, your code is more likely to behave well under a broad set of conditions and remain stable even as you enhance your program over time. Think about Liskov Substitution as you work and you will write better code and craft better software.

Refactoring with Boolean Algebra and De Morgan's Laws

Aug 5, 2015

Many junior and mid-level programmers working today have opted to skip a university education and, instead, have either gone through an Associate’s program or a coding bootcamp. Some programmers started their career out of college without a formal background in CS, getting degrees in physics, biology, chemistry or even liberal arts like journalism or design. Although these developers may be quite effective, there are many topics that are core to the standard Computer Science curriculum which developers without a formal CS education may not have been exposed to.

Today we start leveling the playing field. Dear reader, if you are programming, guess what: you’re doing math! Are you surprised? Don’t be. All of the logic that is key to making programs work came out of mathematics.1 We’ve all written some sort of conditional statement at one point or another or we wouldn’t be here, but could we do it better?

We can rebuild him. We have the technology.

Boolean algebra is the field of mathematics that provides us with the conditional logic we all know and use every day. In order to really understand, at a deeper level, what we are doing, it is important to really get a grasp on managing conditions using the rules uncovered by mathematicians who came before. At the end of this post, we will look at De Morgan’s laws and how they can be used to transform blocks of code with a couple of simple rules.

Before we dive into the bag of tricks, let’s get a little bit of vocabulary and syntax out of the way.

Vocab:

  • Predicate expression - An expression that evaluates to either true or false
  • Tautology - An expression that always evaluates to true
  • Apenantology - An expression that always evaluates to false2

Syntax:

  • && - Logical and
  • || - Logical or
  • ! - Logical not
  • <=> - Logical equivalence, technically means "can be replaced by"

Okay, now that we have that out of the way, let’s talk predicates. The other day I was working through some older code which had been touched by several people. In this code was a conditional statement that looked roughly like this:

if((Boolean(valueList) && (externalValue && (dataValueA || dataValueB))) || (Boolean(valueList) && !externalValue)){
    //Functionality went here.
}

This isn’t the worst conditional I have ever seen, but it’s pretty hard to read. It’s not entirely clear what all of the condition states are, or why. Something I knew for sure was that there was a common factor here. Let’s look all the way back to grade-school algebra at the distributive property. We know this is true:

5 * (3 + 7) <=> 15 + 35 <=> 50

The distributive property holds for Boolean algebra as well. Let’s look at some boolean variables we’ll call P, Q and R.3 I won’t present a formal proof for any of the claims I am going to make, mainly because that’s not what this post is about, and they are already written and available on the web with a simple search on Google. Let’s have a look at the distributive property for predicates.

P && (Q || R) <=> (P && Q) || (P && R)

// Similarly

P || (Q && R) <=> (P || Q) && (P || R);

Going back to our original problem, I could use the distributive law to pull one of the variables completely out of the predicate expression and simplify what I was looking at immediately. Let’s take a look at our new conditional.

if(Boolean(valueList) && ((externalValue && (dataValueA || dataValueB)) || !externalValue)){
    //Functionality went here.
}

That’s already a lot easier to look at and we’ve barely even started. I see another value in the mix I really want to simplify, but somehow it just doesn’t seem like it will be as easy as pulling out our valueList. Let’s do some distribution again and see where that leaves us.

if(Boolean(valueList) && ((!externalValue || externalValue) && (!externalValue || (dataValueA || dataValueB)))){
    //Functionality went here.
}

Well, that’s just about as ugly as it was when we started, except for one thing. We introduced the concept of a tautology in the vocab. We actually have one right here. Let’s take a moment and look at a symbolic representation, first.

P || !P <=> true // Always true. Try it out.
P && P <=> P // Idempotence; I feel like this speaks for itself.

This means we can do a little bit of trickery and get rid of some stuff. Let’s take a look at the tautology in our code.

(!externalValue || externalValue) //AHA! Always true!

//So this means we can reduce like so

(true && (!externalValue || (dataValueA || dataValueB)))

//We don't need to leave the true in there.

(!externalValue || (dataValueA || dataValueB))

With this refactoring we are left with a much simpler expression to evaluate. Before presenting the reduced predicate expression, let’s take a look at one other rule in Boolean Algebra, associativity. When all operators are the same, you can freely associate any set of values. Here’s what it looks like:

P && Q && R <=> (P && Q) && R <=> P && (Q && R)

//Also

P || Q || R <=> (P || Q) || R <=> P || (Q || R)

With all of that put together, we get a final reduced result which relies on just one instance of each variable. This is, as far as I can see, about as far as the train goes. Here’s our final conditional statement:

if(Boolean(valueList) && (!externalValue || dataValueA || dataValueB)){
    //Functionality went here.
}

This isn’t where the story for Boolean Algebra ends. I introduced a new word, created using some Greek roots, apenantology. This is as useful for evaluating conditional statements as our buddy, the tautology. Apenantology is the state where something always evaluates to false.4 Let’s have a look at a symbolic example.

P && !P <=> false //Always false. Try plugging in values.
!(P || !P) <=> false //Always false, unsurprisingly

Here’s where it gets interesting. Where a tautology can be removed from a predicate expression, an apenantology can either be eliminated, or it will actually define the entire expression. Here’s how it works:

//Apenantology in an or condition
P && !P || (Q && R) //This can be simplified to
false || (Q && R) //Since this is an or expression, we can remove the false
Q && R //These are the only variables that matter.

//Apenantology in an and condition
P && !P && (Q && R) //Reducing
false && (Q && R) //Because the first value is false, the expression is false
false

Let’s take our predicate expression from when we did our second distribution. Let’s replace the or with and and see what we get instead.

//Here was our original expression:
((!externalValue || externalValue) && (!externalValue || (dataValueA || dataValueB)))

//I'm going to modify it a bit.
((!externalValue && externalValue) && (!externalValue && (dataValueA || dataValueB)))
(!externalValue && externalValue) //Apenantology

//Now let's simplify.
(false && (!externalValue && (dataValueA || dataValueB))) //Whoops! false && (P && Q)
false  //Anything inside the conditional block would never run. This is dead code.

What about De Morgan’s laws?

De Morgan’s laws are named for the mathematician Augustus De Morgan. He discovered two very useful little rules that can be applied with assurance to any logical statements and guarantee equivalence and maintain truth values.

De Morgan’s laws are actually pretty simple, but they give us some tools that can help to either simplify an expression or identify an expression that would give us the negation of the original expression. When I was learning these rules I kind of liked to think of them as the either/or rules. There are two and they look like this:

!(P && Q) <=> !P || !Q
!(P || Q) <=> !P && !Q

For something that was named for a mathematician, these seem like a couple of really simple rules. As it turns out, these can be really useful when working with conditional logic to either reverse or simplify a conditional block. For an example, let’s take a look at a conditional reversal using one of De Morgan’s laws.

//This is a conditional statement we want to reverse
if(valueA || valueB){
    //Do one thing
} else {
    //Do another
}

//Reverse order
if(!(valueA || valueB)){
    //Do another
} else {
    //Do one thing
}

//Apply De Morgan's law

if(!valueA && !valueB){
    //Do another
} else {
    //Do one thing
}

Whoa. That was kind of a drink from the proverbial firehose. For someone who is seeing this material for the first time, all of this math and its relation with code might seem a little dense. I wouldn’t worry too much about getting it all at once. The important thing is to start looking at your code and identifying places where it seems like simplification should be doable. The more complicated your conditional statements are, the more likely a bug is lurking in the dark.

I would recommend you start getting your feet wet with the concept of tautologies. By simply recognizing where an idea is repeated and removing the repetition, your conditional blocks will become clearer and more precise. After you have applied tautologies comfortably to your code, try playing with the distributive and associative laws. These three ideas will clean most of the garbage out of complex, messy conditionals.

Once the foundation work is comfortably in your arsenal, come back and start playing with the ideas around identifying apenantologies and flipping conditionals around to identify the best order for your conditions to be presented in. Sometimes reordering conditions is all you need to make your code clean, clear and the best it can be. These principles of logic lay the foundation for what is crucial to make a program do all the things you ever wanted, so use them and make your code shine.

Blog Post Notes

  1. Technically the logical paradigm came from a combination of Mathematics and Philosophy. A guy by the name of Bertrand Russell worked with mathematicians early in the 20th century to create a formal language which could be used to describe mathematics work that was being done. He and several important mathematicians helped to make formal proofs work in a predictable, and readable way. Thanks, forebears! Without you, computing might not be where it is today.
  2. Apenantology is a neologism I am introducing here in order to accurately describe a situation which is diametrically opposed to a tautology. Tautology is built from the Greek roots tautos, meaning identical, and logos, meaning word. I constructed apenantology from apenanti, meaning opposite, and logos, meaning word.
  3. This naming is common for mathematics and it makes it easier to read the expressions we are tinkering with. Most logical expressions are written using P, Q, R, S and occasionally T. When more than five variables are involved, I personally start to worry about complexity.
  4. A rough proof is provided as a gist for those curious as to whether apenantologies are supportable mathematically.

  • Web Designers Rejoice: There is Still Room

    I’m taking a brief detour and talking about something other than user tolerance and action on your site. I read a couple of articles, which you’ve probably seen yourself, and felt a deep need to say something. Smashing Magazine published Does The Future Of The Internet Have Room For Web Designers? and the rebuttal, I Want To Be A Web Designer When I Grow Up, but something was missing.

  • Anticipating User Action

    Congrats, you’ve made it to the third part of my math-type exploration of anticipated user behavior on the web. Just a refresher, the last couple of posts were about user tolerance and anticipating falloff/satisficing. These posts may have been a little dense and really math-heavy, but it’s been worth it, right?

  • Anticipating User Falloff

    As we discussed last week, users have a predictable tolerance for wait times through waiting for page loading and information seeking behaviors. The value you get when you calculate expected user tolerance can be useful by itself, but it would be better if you could actually predict the rough numbers of users who will fall off early and late in the wait/seek process.

  • User Frustration Tolerance on the Web

    I have been working for quite a while to devise a method for assessing web sites and the ability to provide two things. First, I want to assess the ability for a user to perform an action they want to perform. Second I want to assess the ability for the user to complete a business goal while completing their own goals.

  • Google Geocoding with CakePHP

    Google has some pretty neat toys for developers and CakePHP is a pretty friendly framework to quickly build applications on which is well supported. That said, when I went looking for a Google geocoding component, I was a little surprised to discover that nobody had created one to do the hand-shakey business between a CakePHP application and Google.

  • Small Inconveniences Matter

    Last night I was working on integrating oAuth consumers into Noisophile. This is the first time I had done something like this so I was reading all of the material I could to get the best idea for what I was about to do. I came across a blog post about oAuth and one particular way of managing the information passed back from Twitter and the like.

  • Know Thy Customer

    I’ve been tasked with an interesting problem: encourage the Creative department to migrate away from their current project tracking tool and into Jira. For those of you unfamiliar with Jira, it is a bug tracking tool with a bunch of toys and goodies built in to help keep track of everything from hours to subversion check-in number. From a developer’s point of view, there are more neat things than you could shake a stick at. From an outsider’s perspective, it is a big, complicated and confusing system with more secrets and challenges than one could ever imagine.

  • When SEO Goes Bad

    My last post was about finding a healthy balance between client- and server-side technology. My friend sent me a link to an article about SEO and Google’s “reasonable surfer” patent. Though the information regarding Google’s methods for identifying and appropriately assessing useful links on a site was interesting, I am quite concerned about what the SEO crowd was encouraging because of this new revelation.

  • Balance is Everything

    Earlier this year I discussed progressive enhancement, and proposed that a web site should perform the core functions without any frills. Last night I had a discussion with a friend, regarding this very same topic. It came to light that it wasn’t clear where the boundaries should be drawn. Interaction needs to be a blend of server- and client-side technologies.

  • Coding Transparency: Development from Design Comps

    Since I am an engineer first and a designer second in my job, more often than not the designs you see came from someone else’s comp. Being that I am a designer second, it means that I know just enough about design to be dangerous but not enough to be really effective over the long run.

  • Usabilibloat or Websites Gone Wild

    It’s always great when you have the opportunity to build a site from the ground up. You have opportunities to design things right the first time, and set standards in place for future users, designers and developers alike. These are the good times.

  • Thinking in Pieces: Modularity and Problem Solving

    I am big on modularity. There are lots of problems on the web to fix and modularity applies to many of them. A couple of posts ago I talked about content and that it is all built on or made of objects. The benefits from working with objectified content is the ease of updating and the breadth and depth of content that can be added to the site.

  • Almost Pretty: URL Rewriting and Guessability

    Through all of the usability, navigation, design, various user-related laws and a healthy handful of information and hierarchical tricks and skills, something that continues to elude designers and developers is pretty URLs. Mind you, SEO experts would balk at the idea that companies don’t think about using pretty URLs in order to drive search engine placement. There is something else to consider in the meanwhile:

  • Content: It's All About Objects

    When I wrote my first post about object-oriented content, I was thinking in a rather small scope. I said to myself, “I need content I can place where I need it, but I can edit once and update everything at the same time.” The answer seemed painfully clear: I need objects.

  • It's a Fidelity Thing: Stakeholders and Wireframes

    This morning I read a post about wireframes and when they are appropriate. Though I agree, audience is important, it is equally important to hand the correct items to the audience at the right times. This doesn’t mean you shouldn’t create wireframes.

  • Developing for Delivery: Separating UI from Business

    With the advent of Ruby on Rails (RoR or Rails) as well as many of the PHP frameworks available, MVC has become a regular buzzword. Everyone claims they work in an MVC fashion though, much like Agile development, it comes in various flavors and strengths.

  • I Didn't Expect THAT to Happen

    How many times have you been on a website and said those very words? You click on a menu item, expecting to have content appear in much the same way everything else did. Then, BANG you get fifteen new browser windows and a host of chirping, talking and other disastrous actions.

  • Degrading Behavior: Graceful Integration

    There has been a lot of talk about graceful degradation. In the end it can become a lot of lip service. Often people talk a good talk, but when the site hits the web, let’s just say it isn’t too pretty.

  • Website Overhaul 12-Step Program

    Suppose you’ve been tasked with overhauling your company website. This has been the source of dread and panic for creative and engineering teams the world over.

  • Pretend that they're Users

    Working closely with the Creative team, as I do, I have the unique opportunity to consider user experience through the life of the project. More than many engineers, I work directly with the user. Developing wireframes, considering information architecture and user experience development all fall within my purview.

  • User Experience Means Everyone

    I’ve been working on a project for an internal client, which includes linking out to various medical search utilities. One of the sites we are using as a search provider offers pharmacy searches. The site was built on ASP.Net technology, or so I would assume as all the file extensions are ‘aspx.’ I bring this provider up because I was shocked and appalled by their disregard for the users that would be searching.

  • Predictive User Self-Selection

    Some sites, like this one, have a reasonably focused audience. It can become problematic, however, for corporate sites to sort out their users, and lead them to the path of enlightenment. In the worst situations, it may be a little like throwing stones into the dark, hoping to hit a matchstick. In the best, users will wander in and tell you precisely who they are.

  • Mapping the Course: XML Sitemaps

    I just read a short, relatively old blog post by David Naylor regarding why he believes XML sitemaps are bad. People involved with SEO probably know and recognize the name. I know I did. I have to disagree with his premise, but agree with his argument.

  • The Browser Clipping Point

    Today, at the time of this writing, Google posted a blog stating they were dropping support for old browsers. They stated:

  • Creativity Kills

    People are creative. It’s a fact of the state of humanity. People want to make things. It’s built into the human condition. But there is a difference between haphazard creation and focused, goal-oriented development.

  • Reactionary Navigation: The Sins of the Broad and Shallow

    When given a task of making search terms and frequently visited pages more accessible to users, the uninitiated fire and fall back. They leave in their wake broad, shallow sites with menus and navigation which look more like weeds than an organized system. Ultimately, these navigation schemes fail to do the one thing they were intended for: enhance findability.

  • OOC: Object Oriented Content

    Most content on the web is managed at the page level. Though I cannot say that all systems behave in one specific way, I do know that each system I’ve used behaves precisely like this. Content management systems assume that every new piece of content which is created is going to, ultimately, have a page that is dedicated to that piece of content. Ultimately all content is going to live autonomously on a page. Content, much like web pages, is not an island.

  • Party in the Front, Business in the Back

    Nothing like a nod to the reverse mullet to start a post out right. As I started making notes on a post about findability, something occurred to me. Though it should seem obvious, truly separating presentation from business logic is key in ensuring usability and ease of maintenance. Several benefits can be gained with the separation of business and presentation logic including wiring for a strong site architecture, solid, clear HTML with minimal outside code interfering and the ability to integrate a smart, smooth user experience without concern of breaking the business logic that drives it.

  • The Selection Correction

    User self selection is a mess. Let’s get that out in the open first and foremost. As soon as you ask the user questions about themselves directly, your plan has failed. User self selection, at best, is a mess of splash pages and strange buttons. The web has become a smarter place where designers and developers should be able to glean the information they need about the user without asking the user directly.

  • Ah, Simplicity

    Every time I wander the web I seem to find it more complicated than the last time I left it.  Considering this happens on a daily basis, the complexity appears to be growing monotonically.  It has been shown again and again that the attention span of people on the web is extremely short.  A good example of this is a post on Reputation Defender about the click-through rate on their search results.

  • It's Called SEO and You Should Try Some

    It’s been a while since I last posted, but this bears note. Search engine optimization, commonly called SEO, is all about getting search engines to notice you and people to come to your site. The important thing about good SEO is that it will do more than simply get eyes on your site, but it will get the RIGHT eyes on your site. People typically misunderstand the value of optimizing their site or they think that it will radically alter the layout, message or other core elements they hold dear.

  • Information and the state of the web

    I only post here occasionally and it has crossed my mind that I might almost be wise to just create a separate blog on my web server.  I have these thoughts and then I realize that I don’t have time to muck with that when I have good blog content to post, or perhaps it is simply laziness.  Either way, I only post when something strikes me.

  • Browser Wars

    It’s been a while since I have posted. I know. For those of you that are checking out this blog for the first time, welcome. For those of you who have read my posts before, welcome back. We’re not here to talk about the regularity (or lack thereof) that I post with. What we are here to talk about is supporting or not supporting browsers. So first, what inspired me to write this? Well… this:

  • Web Scripting and you

    If there is one thing that I feel can be best learned from programming for the internet it’s modularity.  Programmers preach modularity through encapsulation and design models but ultimately sometimes it’s really easy to just throw in a hacky fix and be done with the whole mess.  Welcome to the “I need this fix last week” school of code updating.  Honestly, that kind of thing happens to the best of us.

  • Occam's Razor

    I have a particular project that I work on every so often. It’s actually kind of a meta-project as I have to maintain a web-based project queue and management system, so it is a project for the sake of projects. Spiffy eh? Anyway, I haven’t had this thing break in a while which either means that I did such a nice, robust job of coding the darn thing that it is unbreakable (sure it is) or more likely, nobody has pushed this thing to the breaking point. Given enough time and enough monkeys. All of that aside, every so often, my boss comes up with new things that she would like the system to do, and I have to build them in. Fortunately, I built it in such a way that most everything just kind of “plugs in” not so much that I have an API and whatnot, but rather, I can simply build out a module and then just run an include and use it. Neat, isn’t it?

  • Inflexible XML data structures

    Happy new year! Going into the start of the new year, I have a project that has carried over from the moment I started my current job. I am working on the information architecture and interaction design of a web-based insurance tool. Something that I have run into recently is a document structure that was developed using XML containers. This, in and of itself, is not an issue. XML is a wonderful tool for dividing information up in a useful way. The problem lies in how the system is implemented. This, my friends, is where I ran into trouble with a particular detail in this project. Call it the proverbial bump in the road.

  • Accessibility and graceful degradation

    Something that I have learnt over time is how to make your site accessible for people that don’t have your perfect 20/20 vision, are working from a limited environment or just generally have old browsing capabilities. Believe it or not, people that visit my web sites still use old computers with old copies of Windows. Personally, I have made the Linux switch everywhere I can. That being said, I spend a certain amount of time surfing the web using Lynx. This is not due to the fact that I don’t have a GUI in Linux. I do. And I use firefox for my usual needs, but Lynx has a certain special place in my heart. It is in a class of browser that sees the web in much the same way that a screen reader does. For example, all of those really neat iframes that you use for dynamic content? Yeah, those come up as “iframe.” Totally unreadable. Totally unreachable. Iframe is an example of web technology that is web-inaccessible. Translate this as bad news.

  • Less is less, more is more. You do the math.

    By this I don’t mean that you should fill every pixel on the screen with text, information and blinking, distracting graphics. What I really mean is that you should give yourself more time to accomplish what you are looking to do on the web. Sure, your reaction to this is going to be “duh, of course you should spend time thinking about what you are going to do online. All good jobs take time.” I say, oh young one, are you actually spending time where it needs to be spent? I suspect you aren’t.

  • Note to self, scope is important.

    Being that this was an issue just last evening, I thought I would share something that I have encountered when writing Javascript scripts.  First of all, let me state that Javascript syntax is extremely forgiving.  You can do all kinds of  unorthodox declarations of variables as well as use variables in all kinds of strange ways.  You can take a variable, store a string in it, then a number, then an object and then back again.  Weakly typed would be the gaming phrase.  The one thing that I would like to note, as it was my big issue last evening, is scope of your variables.  So long as you are careful about defining the scope of any given variable then you are ok, if not, you could have a problem just like I did.  So, let’s start with scope and how it works.
