2014-01-18

Classes for all the things?

Recently I've been thinking a lot about how to simplify my code. Now, the key thing to note here is that simple != familiar.

Classes for example are familiar to most people. But code consisting exclusively of classes and OOP concepts isn't necessarily simple all of the time. The question arises: Is there something simpler?

Well, the first thing that comes to mind, of course, good old functions. Functions are universal, everybody understand them, even new programmers that don't yet get all the fancy concepts about OOP get plain old functions.

Keep in mind, I'm not saying that classes don't have their place in our code, but rather that maybe we don't need them quite as often as we might think.

For example: Let's say we want to write a program that needs to do some sort of calculation on some data we provide. We want to have different types of calculations for getting different information from our data. Let's call them Calculation Type A and Calculation Type B. We need to take into account data normalization before starting the calculation, and let's make room for future improvements as we might add a Calculation Type C later on that may require a different kind of normalization.

We immediately think to ourselves: Well I'm just going to make an abstract class and just inherit from that. So here we go:

class AbstractCalc(object):
    def normalize(self, data):
        #do complex normalization
        return data

    def calc(self, data):
        raise NotImplementedError

This is nice. We have an abstract class that has the normalize method implemented (which we can override in derived classes if need be) and we have a calc method that needs to be implemented by the derived classes. So let's implement those:

class ATypeCalc(AbstractCalc):
    def calc(self, data):
        normalizes_data = self.normalize(data)
        print "Doing some complex A Calculations"
        result = ...

        return result

class BTypeCalc(AbstractCalc):
    def calc(self, data):
        normalized_data = self.normalize(data)
        print "Doing some complex B Calculations"
        result = ...

        return result

Awesome. We have our concrete calculation classes and for now we use the parents normalize method but we can just as easily use our own custom one. The usage of this implementation is something like this:

if calc_type == 'a':
    calc = ATypeCalc()
elif calc_type == 'b':
    calc = BTypeCalc()

# do actual calculation
calc.calc(data)

We can just wrap this in a dispatcher function called get_calc and then we get:

def get_calc(calc_type):
    if calc_type == 'a':
        return ATypeCalc()
    elif calc_type == 'b':
        return BTypeCalc()

calc = get_calc(calc_type)
calc.calc(data)

Pretty awesome, because now to use our code we just call the get_calc function which returns an object instantiated from one of the calculation classes and we're good to go.

This is all fine and dandy but we have just 2 methods in those classes. Do we really need a class for that?

I wonder if we could get away with using just plain old functions? We just need to keep the code modular enough that adding new calculations is easy and preserve the existing API.

Let's see how that would look. First we define our normalize function, and our a/b calculation functions:

def normalize(data):
    #do complex normalization
    return data

def a_calc(normalize, data):
    normalized_data = normalize(data)
    print "Doing some complex Calculations A"
    result = ...
    return result

def b_calc(normalize, data):
    normalized_data = normalize(data)
    print "Doing some complex Calculations B"
    result = ...
    return result

Seems straight forward enough. We have a shared normalize function which we pass into the a_calc and b_calc functions, and this way if one day we need to change the normalize function for a_calc we just pass in a different function in there. Yes, we can do this in Python because functions are first class citizens so we can pass functions to other functions and have functions return functions. Pretty neat.

What about using this new implementation. We define 2 more functions, one called calc, and one dispatcher function:

def calc(fn, normalize, data):
    fn(normalize, data)

def get_calc(calc_type):
    from functools import partial

    if calc_type == 'a':
        return partial(calc, a_calc, normalize)
    elif calc_type == 'b':
        return partial(calc, b_calc, normalize)

But wait, what's this partial funny business? Remember how I said simple != familiar, so stay with me.

With partial application we can use our calc function that takes exactly 3 parameters and bind the first 2 parameters (namely the concrete a/b calc and normalize functions) and leave the 3rd parameter (the data) unbound. This effectively returns a new function that accepts only one parameter. You can, of course, partially apply any number of parameters of a function, and get back a new function that takes that much less parameters.

It's the same solution just implemented solely with functions. Using this method we left room for adding a 3rd calculation type that can use the general normalize function we have implemented or use it's custom normalization. In the same effect we can change the normalize function in both A and B type calculations.

And now we can use this API in a similar way we did the one before:

calc = get_calc(calc_type)
calc(data)

There you have it. Two different approaches for solving the same problem, with roughly the same amount of lines of code and the same API.

Notice how nowhere in the post did I mention the term FP (Functional programming). This is mostly because Python is not really a functional language despite it having many of the nice functional-like features. Also, when people hear "Functional Programming" most of the time they immediately run away.

Personally I find the functional solution cleaner, more readable, and easier to explain to new programmers. Alas, I leave it to you to decide what the best approach is for your concrete problem. Remember, there's no silver bullet, always use the best tool at your disposal for the problem at hand.

fp functional programming python