Saturday, February 12, 2011

Impact of maturing distinctions (or how the need for increasingly rich information generates software engineering challenges)...

One of the things I have noticed over my years as a software engineer has been the process of capturing some particular set of "requirements" and constituting them as rules in a software system I end up building. Then, over time, the need for the system to respond to subtle increases in information distinctions exposes software adaptation blocks; i.e., software design fragility. My response has been to try to design defensively around these areas. The tension is this: not accounting for distinctions early can generate changes that hit the core system design and ripple throughout the resulting complex system, while accounting for the more refined distinctions too early can prematurely make the system too complex, increasing the risks to the system's continued useful existence.

Rather than talk in abstractions, here's a more concrete example of the process: a particular data point in a system "growing" to need a higher degree of detail in its information.

Let's use the notion of something nice and nebulous: hunger. The requirement is to include some value to indicate whether an instance of a digital organism has its hunger signal turned on. In Java, this would be represented as a boolean, as in some field of a class:

private boolean hungry;

After we've been working on the overall system for a while (and have possibly already released a version of it to the public/production/some-irreversible-external-thing), we discover that we need a bit more information. Rather than tracking hunger, we are really tracking the degree of motivation to seek food, based on a set of discrete values: HIGH, MODERATE and LOW.

Obviously, these three discrete values won't fit in our boolean. This means we have to change from boolean to something else. Thanks to our looking ahead, we can see there may be additional discrete values added in the future (imagine something like WEE_BIT_PECKISH or EXPLOSIVELY_BLOATED), so we make the value an int with named constants, or better, an enum, which is a more adaptively effective way to use the equivalent of an int. So we have enhanced the variable's ability to represent information from 1 bit (boolean) to 2 bits (three values), with the ability to expand easily in the future to as much as the equivalent of 32 bits. It would now look something like this:

public enum HungerLevel { HIGH, MODERATE, LOW }

private HungerLevel hungerLevel;

Again, the system has been "published" when the next requirement comes in. There is a need to capture hunger more specifically, as a decimal number (pretend that hunger is now being discerned by testing for the presence/absence of specific neurochemicals). So we need to move to a continuous value, or scalar.

private double hungerDegree;

And a short time later, there is a new discovery. It turns out that the human brain has two separate modules for reflecting the individual's hunger state. One indicates that the person needs to eat to elevate blood sugar. The other indicates that the person is sated, as enough fat has been detected to have been ingested. So now TWO scalars are needed. And yes, this is why you can feel very full after a huge turkey day dinner and still crave more pumpkin pie.

private double satiety; //affected by recent fat level intake
private double bloodsugar; //affected by recent insulin spike and drop-off 

Additionally, it turns out these values shift independently over time. So, we need to add a third variable, time, and capture the two scalars for each time unit.

// sN = satiety and bN = bloodsugar at time unit N
private double[][] digestiveHormonalState = new double[][] {{s1, b1}, {s2, b2}, {s3, b3}};

So, we have moved from a boolean, to a discrete value, to a scalar, to a pair of scalars, to a time-based array of pairs of scalars, as the system requirements adapted. This plays havoc with most software APIs. However, this model strongly reflects the real-world problems facing a software engineer as a system is designed, built and maintained over time.
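One defensive tactic is to hide the representation behind a query method so callers of the original boolean question keep working as the underlying type evolves. Here is a minimal sketch; the class name, constructor, and the 0.5 thresholds are all hypothetical illustrations, not anything the progression above prescribes:

```java
// Sketch: the stored representation has evolved all the way to the
// time-based array of {satiety, bloodsugar} pairs, but the original
// boolean question survives as a derived query.
public class Organism {
    // rows ordered by time unit; each row = {satiety, bloodsugar}
    private final double[][] digestiveHormonalState;

    public Organism(double[][] digestiveHormonalState) {
        this.digestiveHormonalState = digestiveHormonalState;
    }

    // Derived from the most recent time unit; the 0.5 thresholds are
    // arbitrary assumptions chosen only for illustration.
    public boolean isHungry() {
        double[] latest = digestiveHormonalState[digestiveHormonalState.length - 1];
        double satiety = latest[0];
        double bloodsugar = latest[1];
        return satiety < 0.5 && bloodsugar < 0.5;
    }
}
```

Callers written against the original boolean keep compiling against isHungry(); only the derivation behind it changes. This insulates the API surface, though it cannot by itself decide which threshold each call site really meant.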

There's no huge insight here. I just found this pattern to occur repeatedly in human meaning systems outside of software engineering, and I found the pattern interesting enough to share. I am looking for ways to capture meaning adaptations (i.e., enhanced distinctions) while minimizing the overall impact they have on the systems built upon them.


  1. I really like the example of the attribute hunger, how its meaning is better understood over time, and how this (may) lead to a need to change the datatype over time to account for these refinements.

    Here are some questions/thoughts this has raised in my mind, which I'm curious if you'll address at some point:

    1) Throughout the original code, at least in some places, there would likely be conditional statements like this:

    if (x.isHungry()) { doSomething(); }

    Such code here is clearly written to simply base its conditional operation on the boolean nature of hunger. So, even with mechanisms to adapt the datatype of hunger, that may not be enough to properly adapt the operations that now may need to be adjusted throughout the code. Is the old isHungry() able to be mapped to >= HungerLevel.MODERATE? Or to >= HungerLevel.HIGH? Or would it map differently depending on the function, where some functions might even have initially treated any level, even LOW, as isHungry() being true, while other functions might be better adapted to a higher standard?
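    A sketch of one such mapping, to make the question concrete (the enum ordering, the shim class, and the choice of MODERATE as the threshold are all assumptions for illustration; Java enum compareTo() follows declaration order, so declaring the constants ascending lets it act as a "level at least threshold" test):

```java
// Hypothetical compatibility shim: the old boolean question
// re-expressed against the newer enum representation.
public class HungerShim {
    // declared in ascending order so compareTo() reflects intensity
    public enum HungerLevel { LOW, MODERATE, HIGH }

    private final HungerLevel hungerLevel;

    public HungerShim(HungerLevel hungerLevel) {
        this.hungerLevel = hungerLevel;
    }

    // One possible mapping: "hungry" means at least MODERATE.
    public boolean isHungry() {
        return hungerLevel.compareTo(HungerLevel.MODERATE) >= 0;
    }
}
```

    But that is exactly the difficulty being raised: different call sites may each have meant a different threshold, and no single shim captures that.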

    And then when you adapt HungerLevel values to one scalar or two, you'd need to know how those old isHungry() references or old references to HungerLevel.MODERATE should be adapted, and whether this is consistent in all places.

    Then later, you'll want to adapt them not just based on how high the levels are at, but for how long they've been that high/low, etc.
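    A duration-aware query over the time-based representation might be sketched like this (the class, method, threshold semantics, and the convention that each row is {satiety, bloodsugar} per the post's array are assumptions for illustration):

```java
// Hypothetical sketch: "hungry" only if blood sugar has stayed below a
// threshold for at least minUnits consecutive recent time units.
public class DurationQuery {
    // state rows are ordered by time unit; each row = {satiety, bloodsugar}
    public static boolean hungryForAWhile(double[][] state, double threshold, int minUnits) {
        if (state.length < minUnits) {
            return false; // not enough history to decide
        }
        // examine only the most recent minUnits rows
        for (int i = state.length - minUnits; i < state.length; i++) {
            if (state[i][1] >= threshold) {
                return false; // blood sugar recovered within the window
            }
        }
        return true;
    }
}
```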

    Adapting the code to this extent seems to me beyond the power of language constructs such as inheritance, overriding, aspects, etc. None of this processing structure would have been anticipated when hunger was seen as a simple boolean.

    It requires not just adapting the hunger type itself, but somehow adapting all statements based on references to that datatype.

    I'm not sure how a language could be that flexible. And, I'm not really sure if it needs to be.

    2) How often does this kind of conceptual change occur in your programming projects such that projects are at risk unless the code can be adapted to account for these?

    My own experience is that the projects I work on rarely require this serious a change of data types. Perhaps I'm lucky, or perhaps I code for a slower-changing domain than you do. (Though mortgage products do change frequently, my interfaces that abstract over these current and future products haven't had to change much, if at all.)

    With that said, I really will find it fascinating if you are able to find solutions to allow for clean adaptations for situations such as your hunger example presents.


    Jim N.

  2. Jim N.,

    You bring up some very interesting points. In the series of posts I am now starting (I posted the first one this evening), I am going to be playing with "requirements changing" over the course of exploring a specific domain inspired by Conway's Life. Rather than give away the punch lines, I will let you read about them as I post them. And then at some point in the future, we can revisit this post and your comment and have what I think will be a very enjoyable discussion. I'm having SO much fun playing in this area; more fun feeling and thinking about code than I can remember having in a very long time.