The Curse of Over-Engineering

I have two children, so I’ve watched a lot of SpongeBob SquarePants. Indeed, I’m not ashamed to admit that I enjoyed watching SpongeBob even before I had children as an excuse. There is a scene in one of the earlier episodes that I often think of while working with software:

In this scene, Squidward is teaching an art class and SpongeBob is his artistically gifted student. At one point, SpongeBob astonishes Squidward by free-handing a perfect circle. When Squidward asks him how he did it, SpongeBob demonstrates his technique by first drawing a very detailed face, then reducing the face to basic lines, then erasing that and ending up with a circle.

I love this because it perfectly illustrates the software development process. So often we start out with an overly complex solution and spend the rest of the development cycle paring it down to the simple essentials. Many times, over-engineering is a result of trying to force an idea where it doesn’t actually fit. Ideas are our precious little babies and we are loath to abandon them. Instead, we buttress them with all sorts of extra scaffolding that introduces more problems, which then need patching, and so on. It is essential at this point to step back and ask again, “What problem am I really trying to solve?” It is very likely that there is a better solution out there that doesn’t require so much extra work to support.

While writing my last post on conditionals in JMeter, I found a classic example of this kind of over-engineering in my own script. The post was based on a script I used for a game in which 25% of players were authenticated through Facebook while the other 75% were anonymous. Authenticated players get more in-game features, so their server footprint is much heavier than that of anonymous players. Reflecting this split in my load testing was important, so I used the method described in that post and was pleased with how well it worked. There was one key difference, though: in the post I recommended this condition to determine flow:

${IS_A} <= ${PCT_A}

Simple, clean, easy to understand. But in my original script, the condition looked like this:

${PCT_AUTH} > 0 && ${IS_AUTH}%Math.floor(100/${PCT_AUTH}) == 0

It wasn’t until I was trying to explain this that I realized it was actually not a good solution at all. It works well enough for percentages that divide evenly into 100, and even 33% works all right. But try it with 34% and the shortcomings are clear. With 34%, the math works out to ${IS_AUTH}%2 == 0, so instead of getting the authenticated path just over one third of the time, I would be getting it half the time. Not good!
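The difference is easy to check outside of JMeter. Here is a minimal sketch in plain JavaScript that counts how often each condition fires. It assumes IS_AUTH is a whole number that is equally likely to be anything from 1 to 100, which this post doesn’t spell out, and the function names are made up for the illustration.

```javascript
// Assumption: IS_AUTH is a whole number drawn uniformly from 1..100,
// so counting every possible value gives the hit rate out of 100.

// The condition from my original script.
function moduloCondition(isAuth, pctAuth) {
  return pctAuth > 0 && isAuth % Math.floor(100 / pctAuth) === 0;
}

// The over/under condition recommended in the earlier post.
function overUnderCondition(isAuth, pctAuth) {
  return isAuth <= pctAuth;
}

// Count how many of the values 1..100 satisfy a condition.
function hitsPerHundred(condition, pctAuth) {
  let hits = 0;
  for (let isAuth = 1; isAuth <= 100; isAuth++) {
    if (condition(isAuth, pctAuth)) hits++;
  }
  return hits;
}

// Columns: requested percentage, modulo hits, over/under hits.
for (const pct of [25, 33, 34]) {
  console.log(pct, hitsPerHundred(moduloCondition, pct), hitsPerHundred(overUnderCondition, pct));
}
// 25 25 25
// 33 33 33
// 34 50 34
```

Under that assumption the two conditions agree at 25% and 33%, and then the modulo version jumps to 50 hits out of 100 as soon as the input reaches 34.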

So how did I come up with such a convoluted condition in the first place? My thought process went something like this:

  1. I’m dividing things into buckets, so I’ll use the modulo operator. (This is the original wrong idea. Modulo is handy for fitting a large random number into a smaller set of values, but all I needed here was an over/under calculation. The subtext is that I didn’t learn about modular arithmetic until I took my first programming class at the ripe old age of 29, so it has the sparkle of “real engineering” in my mind.)
  2. I want to express my input variable PCT_AUTH as a percentage, so I’ll have to divide it into 100 to get a value that will work in a modulo operation. So 25% becomes 4, 20% becomes 5, etc. (Note how the initial mistake leads to the first layer of complication. This should have been a clue that my first idea wasn’t correct.)
  3. The modulo operator only works with integers, so I’ll use Math.floor to change any decimal to an integer. (Yet another layer of obfuscation. This is the part that makes 34% such a failure, because 100/34 = 2.941, and floor(2.941) = 2.)
  4. Since I’m dividing by the PCT_AUTH input value, I’d better make sure it isn’t zero. (Of course, I didn’t realize this until I tried to run the script with PCT_AUTH = 0)
  5. Yay, I’m getting the desired 1/4 of all players being authenticated! I’m a genius!!!
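Steps 2 and 3 are where the precision leaks away, and a few lines of plain JavaScript show how much. This is only a sketch: divisorFor and losesNothingToFloor are hypothetical helper names, not anything from the JMeter script.

```javascript
// Steps 2 and 3: turn the percentage into a whole-number divisor.
function divisorFor(pctAuth) {
  return Math.floor(100 / pctAuth);
}

// True only when Math.floor had nothing to throw away,
// i.e. the percentage divides evenly into 100.
function losesNothingToFloor(pctAuth) {
  return divisorFor(pctAuth) * pctAuth === 100;
}

const cleanInputs = [];
for (let pct = 1; pct <= 100; pct++) {
  if (losesNothingToFloor(pct)) cleanInputs.push(pct);
}

console.log(cleanInputs.join(", ")); // 1, 2, 4, 5, 10, 20, 25, 50, 100
console.log(divisorFor(34));         // 2, so every second value passes
console.log(divisorFor(51));         // 1, so every value passes
```

Only nine of the hundred possible inputs survive the rounding untouched. Everything else is pulled toward the nearest whole divisor, which is harmless at 33 and badly wrong at 34, and any input above 50 ends up with a divisor of 1.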

Even at the time, I had noticed some input values produced odd results. But the script was working with the values I cared about most, so I didn’t really dig into what was going on. I had real testing to do, after all. It wasn’t until I pulled out the script months later and started playing around with it that I realized how broken it actually was. And even then, I was still so wedded to the modulo idea that it took me an embarrassingly long time to see the much simpler solution that I should have been using to begin with. If only I’d had someone else to test my work – it probably would have been fixed much sooner.
