Classes for all the things?
Recently I've been thinking a lot about how to simplify my code. Now, the key thing to note here is that simple != familiar.
Classes, for example, are familiar to most people. But code consisting exclusively of classes and OOP concepts isn't necessarily simple. The question arises: is there something simpler?
Well, the first thing that comes to mind is, of course, good old functions. Functions are universal: everybody understands them, and even new programmers who don't yet get all the fancy OOP concepts get plain old functions.
Keep in mind, I'm not saying that classes don't have their place in our code, but rather that maybe we don't need them quite as often as we might think.
For example, let's say we want to write a program that needs to do some sort of calculation on some data we provide.
We want to have different types of calculations for getting different information from our data.
Let's call them Calculation Type A and Calculation Type B. We need to normalize the data before starting the calculation, and we should leave room for future improvements, since we might add a Calculation Type C later on that requires a different kind of normalization.
We immediately think to ourselves: well, I'm just going to make an abstract class and inherit from that. So here we go:
    class AbstractCalc(object):
        def normalize(self, data):
            # do complex normalization
            return data

        def calc(self, data):
            raise NotImplementedError
This is nice. We have an abstract class that has the normalize method implemented (which we can override in derived classes if need be) and a calc method that needs to be implemented by the derived classes.
So let's implement those:
    class ATypeCalc(AbstractCalc):
        def calc(self, data):
            normalized_data = self.normalize(data)
            print("Doing some complex A Calculations")
            result = ...
            return result

    class BTypeCalc(AbstractCalc):
        def calc(self, data):
            normalized_data = self.normalize(data)
            print("Doing some complex B Calculations")
            result = ...
            return result
Awesome. We have our concrete calculation classes, and for now we use the parent's normalize method, but we can just as easily use our own custom one.
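As a rough sketch of what "our own custom one" could look like, here is a hypothetical CTypeCalc that overrides normalize. The class name, the scaling rule, and the sum used as the calculation are all assumptions made up for illustration. The post only says that such a class might exist one day.

```python
class AbstractCalc(object):
    def normalize(self, data):
        # Placeholder for the shared normalization: returns data unchanged.
        return data

    def calc(self, data):
        raise NotImplementedError


class CTypeCalc(AbstractCalc):
    """Hypothetical third calculation with its own normalization."""

    def normalize(self, data):
        # Assumed rule, for illustration only: scale values by the largest one.
        largest = max(data)
        return [value / float(largest) for value in data]

    def calc(self, data):
        normalized_data = self.normalize(data)
        # Stand-in for the real calculation: sum the normalized values.
        return sum(normalized_data)


result = CTypeCalc().calc([1, 2, 4])
print(result)  # 0.25 + 0.5 + 1.0 = 1.75
```

Nothing else has to change: calc still calls self.normalize, and Python picks the overriding method.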
The usage of this implementation is something like this:
    if calc_type == 'a':
        calc = ATypeCalc()
    elif calc_type == 'b':
        calc = BTypeCalc()

    # do actual calculation
    calc.calc(data)
We can just wrap this in a dispatcher function called get_calc and then we get:
    def get_calc(calc_type):
        if calc_type == 'a':
            return ATypeCalc()
        elif calc_type == 'b':
            return BTypeCalc()

    calc = get_calc(calc_type)
    calc.calc(data)
Pretty awesome, because now to use our code we just call the get_calc function, which returns an object instantiated from one of the calculation classes, and we're good to go.
This is all fine and dandy but we have just 2 methods in those classes. Do we really need a class for that?
I wonder if we could get away with using just plain old functions? We just need to keep the code modular enough that adding new calculations is easy and preserve the existing API.
Let's see how that would look. First we define our normalize function and our a/b calculation functions:
    def normalize(data):
        # do complex normalization
        return data

    def a_calc(normalize, data):
        normalized_data = normalize(data)
        print("Doing some complex Calculations A")
        result = ...
        return result

    def b_calc(normalize, data):
        normalized_data = normalize(data)
        print("Doing some complex Calculations B")
        result = ...
        return result
Seems straightforward enough. We have a shared normalize function which we pass into the a_calc and b_calc functions. This way, if one day we need to change the normalize function for a_calc, we just pass in a different function. Yes, we can do this in Python because functions are first-class citizens, so we can pass functions to other functions and have functions return functions. Pretty neat.
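If "functions are first-class citizens" sounds abstract, this small sketch shows both halves of that claim. The names double, apply_twice, and make_adder are invented for the example and have nothing to do with the calculation code.

```python
def double(x):
    return x * 2


def apply_twice(fn, value):
    # A function received as an argument is called like any other function.
    return fn(fn(value))


def make_adder(n):
    # Builds and returns a new function that remembers n.
    def add(x):
        return x + n
    return add


add_five = make_adder(5)

quadrupled = apply_twice(double, 3)  # double(double(3)) == 12
shifted = add_five(10)               # 10 + 5 == 15
print(quadrupled, shifted)
```

Passing normalize into a_calc is the same trick as passing double into apply_twice.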
What about using this new implementation? We define two more functions: one called calc and one dispatcher function:
    from functools import partial

    def calc(fn, normalize, data):
        # return the result so the caller actually receives it
        return fn(normalize, data)

    def get_calc(calc_type):
        if calc_type == 'a':
            return partial(calc, a_calc, normalize)
        elif calc_type == 'b':
            return partial(calc, b_calc, normalize)
But wait, what's this partial funny business? Remember how I said simple != familiar? Stay with me.
With partial application we can take our calc function, which takes exactly 3 parameters, bind the first 2 parameters (namely the concrete a/b calc and normalize functions), and leave the 3rd parameter (the data) unbound. This effectively returns a new function that accepts only one parameter. You can, of course, partially apply any number of a function's parameters and get back a new function that takes that many fewer parameters.
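To see partial application on its own, here is a minimal sketch using functools.partial from the standard library. The describe function is made up for this example. It just needs three parameters so we can bind some of them.

```python
from functools import partial


def describe(prefix, separator, value):
    return prefix + separator + str(value)


# Bind the first two parameters; only `value` is left unbound.
label = partial(describe, "result", ": ")

# Bind just the first parameter; two parameters are left unbound.
with_prefix = partial(describe, "result")

one = label(42)               # same as describe("result", ": ", 42)
two = with_prefix(" = ", 42)  # same as describe("result", " = ", 42)
print(one)
print(two)
```

The bound arguments are filled in from the left, which is why calc takes the data as its last parameter.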
It's the same solution, just implemented solely with functions. Using this method we left room for adding a 3rd calculation type that can use the general normalize function we have implemented or its own custom normalization. In the same way, we can change the normalize function in both A and B type calculations.
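Here is one way adding that 3rd calculation type might look. Treat it as a sketch: c_calc, c_normalize, the 'c' and 'c_shared' keys, and the sum standing in for the real calculation are all assumptions for illustration, and calc returns its result here so we can inspect it.

```python
from functools import partial


def normalize(data):
    # Placeholder for the shared normalization: returns data unchanged.
    return data


def c_normalize(data):
    # Hypothetical custom normalization: scale values by the largest one.
    largest = max(data)
    return [value / float(largest) for value in data]


def c_calc(normalize, data):
    normalized_data = normalize(data)
    # Stand-in for the real calculation: sum the normalized values.
    return sum(normalized_data)


def calc(fn, normalize, data):
    return fn(normalize, data)


def get_calc(calc_type):
    if calc_type == 'c':
        # Type C with its own normalization.
        return partial(calc, c_calc, c_normalize)
    elif calc_type == 'c_shared':
        # Same calculation, reusing the general normalize function.
        return partial(calc, c_calc, normalize)
    raise ValueError("unknown calculation type: %r" % (calc_type,))


custom = get_calc('c')([1, 2, 4])         # 0.25 + 0.5 + 1.0 = 1.75
shared = get_calc('c_shared')([1, 2, 4])  # 1 + 2 + 4 = 7
print(custom, shared)
```

Swapping the normalization is a one-argument change in the dispatcher; c_calc itself never needs to know which one it got.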
And now we can use this API in a similar way we did the one before:
    calc = get_calc(calc_type)
    calc(data)
There you have it. Two different approaches to solving the same problem, with roughly the same number of lines of code and the same API.
Notice how nowhere in the post did I mention the term FP (Functional programming). This is mostly because Python is not really a functional language despite it having many of the nice functional-like features. Also, when people hear "Functional Programming" most of the time they immediately run away.
Personally, I find the functional solution cleaner, more readable, and easier to explain to new programmers. Still, I leave it to you to decide what the best approach is for your concrete problem. Remember, there's no silver bullet: always use the best tool at your disposal for the problem at hand.