MISRA - Why it may make your code worse

28-Jan-2015

I've been working with embedded systems since the 1980s. In that time software development has gone through many changes. In those early days everything was written in C. Often people had no version control systems in place. Testing was sparse and often an afterthought. C compilers for embedded targets were buggy and immature. Prior to ANSI C standards we didn't even have type checking of function parameters. There were no IDEs or GUI debuggers. Code review was unheard of in many industries. Developing code was really difficult.

ANSI C was formalised in 1989. ANSI C was a huge step forward. At last we had a compiler that actually checked the types of parameters. Gone was the need to remind yourself of types by using Hungarian Notation, though this archaic practice still persists.

C++ also started to see use in the late 1980s. This brought OOD and OOP to the table. It also brought massively increased complexity. Given the poor testing and coding practices, poor training and weak project management prevalent in the industry at the time, this also brought a wave of expensive project failures. C++ got a bad reputation in embedded circles. One it really doesn't deserve. I've heard battle-scarred project managers claim that C++ is unsuitable for embedded systems. But I've seen other companies use it very well.

Today we have a range of high quality languages and tools. C++ has matured and can be used to great effect in embedded systems. Target systems have also grown, so now it is not unusual to have a large Linux platform, which can run standard open source webservers, database etc. This allows the use of languages like Python. Developing systems in these higher level languages can be much faster. If time to market and development costs trump hardware costs, this is a compelling factor in the choice of technologies. If you are not running on a constrained system and you are IO bound, not CPU bound, you would be wise to consider using something other than C or C++. This applies to a lot of network connected devices and services. I'm also optimistic about Go, which looks like a well designed language that could be very good in the embedded space.

We also now have excellent versioning tools, like git. I've worked with Source Safe, CVS, Subversion, Clear Case, plus Mercurial and Bazaar. Git is head and shoulders the best in my opinion. Clear Case is so bad (and expensive) I am astonished anyone uses it. As late as 2012 I worked with a company that was using Source Safe for version control, several years after Microsoft dropped support for it. I'm pleased to say this company migrated to git, and hasn't looked back. It has also adopted TDD and Agile. So, we have better tools, and some people are adopting them.

The embedded market has always been conservative. Its software technologies traditionally trailed behind mainstream software development. There are lots of reasons for this, some of them good ones. The development of compilers for embedded systems lagged behind those for PC platforms for years. This was particularly true for C++ in the 1990s. The target markets for embedded systems used to be much smaller, so vendors had less incentive to spend money on development. Before the widespread adoption of Open Source tools, companies were tied to proprietary and often niche products. Embedded systems tended to have smaller footprints and constrained resources. It simply wasn't possible to fit big software in a small system. C and C++ also produce fast, close to the metal code. Often this is vital for embedded systems.

The Motor Industry had the requirement for high quality, high reliability software. Frankly, a lot of embedded software in the early days was neither. Something needed to be done. The Motor Industry decided that it needed [Coding Standards]. So MISRA was born.

Don't get me wrong. I think code standards are important. But I see two problems here. Firstly, people confuse code standards with code quality. Secondly, they pick the wrong standards. This can have the effect of making your code worse. It can actually prevent you from writing good code. I've worked with lots of companies across a range of industries. I've witnessed the problem of applying the wrong code standards, mistaking it for code quality and producing a poor product as a result.

I'm going to use just one example of a MISRA standard causing bad code. I've seen this exact problem in several companies. The problem is 'single exit point'. This idea comes from Structured Programming. The wikipedia article (above) describes 'Early Exit' as a common deviation. It is, however, not common enough.

Company A

The following is a real example of code that both complied with a company's MISRA standards and passed a code review.

It follows a common form, a series of nested conditionals, each with an 'else' clause. You have to have an 'else' clause in MISRA following an 'if else', even if it is empty. In case the author forgot something. Sadly, people over interpret this to mean that all 'if' statements require an 'else'. People often put in an empty 'else' clause with a statement eg. /* MISRA */. This always feels to me like writing * "This is a stupid thing to do. They made me do it." *

In the code below, the flag 'Expired' gets set in 5 different places. Here the author has decided to make full use of pointless 'else' clauses by setting the Expired variable to false (which it already is). But you, the reader, are now forced to read and check them all. Are they all correct? You are also forced to check all the braces. Do they match the indentation? Because they are using the 'else' clause and are forced to nest deeply (because of the single exit point rule) the error handling code becomes far removed from the conditional that it applies to. In extreme cases, where people write grotesquely long functions in this form, the 'else' clause can be hundreds of lines away from the conditional. Bad by design, hard to read, but commonplace. So, the code is compliant, but it is already slowing you down and potentially obscuring functionality and bugs. This is poor quality code. But hey, it's standards compliant. Code like this is depressingly common.


    /* Company A code */

bool TIMER_Expired(uint8_t TimerID)
{     
    bool Expired = false;

    /* Is the given timer ID within the array bounds? */
    if (TimerID < LENGTHOF(Timers))
    {
        /* Timer still allocated ? */
        if (Timers[TimerID].Interval != 0)
        {
            /* Expiry time passed? */
            if ((bool)((int32_t)(Timers[TimerID].ExpiryTime - ElapsedTicks) <= 0))
            {
                Expired = true;
            }
            else    
            {
                Expired = false;
            }
        }
        else
        {
            Expired = false;
        }
    }    
    else
    {        
        Expired = false;
    }

    return Expired;
}

The Solution

The solution is simple; use Guard Clauses. We've got rid of all the nesting. We've eliminated the flag. Each return result is clearly associated with the preceding test. This is simpler and easier to read. Notice that you simply reverse all the tests.


    /* better code */

bool TIMER_Expired(uint8_t TimerID)
{     
    /* Is the given timer ID within the array bounds? */
    if (TimerID >= LENGTHOF(Timers))
    {
        return false;
    }

    /* Timer still allocated ? */
    if (Timers[TimerID].Interval == 0)
    {
        return false;
    }

    /* Expiry time passed? */
    if ((int32_t)(Timers[TimerID].ExpiryTime - ElapsedTicks) > 0)
    {
        return false;
    }

    return true;
}

With a quick look at the rest of the file we find the same 'ExpiryTime - ElapsedTicks' calculation all over the place, including in a function named 'PassedTimeInTicks()'. So we can refactor again to use that :


    /* better code++ */

bool TIMER_Expired(uint8_t TimerID)
{     
    /* Is the given timer ID within the array bounds? */
    if (TimerID >= LENGTHOF(Timers))
    {
        return false;
    }

    /* Timer still allocated ? */
    if (Timers[TimerID].Interval == 0)
    {
        return false;
    }

    return PassedTimeInTicks(Timers[TimerID].ExpiryTime);
}

That looks much better. Code re-use, better factoring, simpler structure, no nesting, simple tests. It has clarity. But it fails the MISRA test. No pointless 'else' clauses. Multiple exit points. I would probably put an ASSERT() in the first conditional, but that is another discussion. I don't like the names much either, but that was house style. The bottom line is that it is much easier to read.

Company B

Another company I worked with had an interesting technique to create bad code. Instead of nesting the large number of conditional tests and revealing the full stupidity of the single exit point rule, they decided to hide it by using gotos. Thousands of them. Gotos are banned in structured programming, with some justification. There has been controversy about gotos since Dijkstra's famous 1968 paper. As Dijkstra said "The goto statement as it stands is just too primitive, it is too much an invitation to make a mess of one's program." But this company saw that as a challenge. Their code looked something like this :


    /* Company B code */

#define ERR_EXT(flag) if (!(flag)) { goto error_exit; }

bool TIMER_Expired(uint8_t TimerID)
{ 
    bool stat;

    /* Is the given timer ID within the array bounds? */
    stat = TimerID < LENGTHOF(Timers);
    ERR_EXT(stat);

    /* Timer still allocated ? */
    stat = Timers[TimerID].Interval != 0;
    ERR_EXT(stat);

    /* Expiry time passed? */
    stat = ((int32_t)(Timers[TimerID].ExpiryTime - ElapsedTicks) <= 0;
    ERR_EXT(stat);

error_exit:
    return stat;
}

Their code was riddled with this sort of thing. I'm simplifying, they had dozens of wrapper macros, some using goto, many of which fiddled with the status variable or performed logging. All in an opaque, hard to read way. Yes, they did use 'stat' as a variable name (it clashes with the POSIX libraries). But they declared that the code was compliant with their coding standards - an 'enhanced' MISRA. Breathtaking.

They insisted on lots of other pointless things. No use of C++ style comments was another. These were introduced into the C standard in C99, by which time most compilers supported them.

Another company I worked with were very MISRA, very unthinking. The argument there raged about whether a char pointer should be

char* s;

or

char *s;

See? The compiler doesn't care. I don't much. The latter is probably better, but I wouldn't lose any sleep over it. A much more important thing for them to focus on was the fact that they were using the functions likely to cause buffer overflow errors; a leading cause of security vulnerabilities in software. Company B, unsurprisingly, had the same problem. Auditing and eliminating the use of * sprintf(), strcat(), strcpy() * etc would have been useful. But they focused on coding standards, not on software quality. Their code was conformant, but poor and potentially insecure. They were very proud of their code, but it was riddled with security hazards.

Conclusion

Like Hungarian Notation, some code standards address a problem that existed in the past, but is irrelevant today. Or, ideas have moved on. Single exit points may have made sense as a contrast to unstructured assembler in the 1970s. But that isn't where we are today. Banning C++ style comments may have made sense if you were targeting an obscure processor in the 1990s, but it probably doesn't make sense now. But the standards linger, like the UK tax year starting on the old ecclesiastical calendar, or using a Babylonian number system for time. Or our crazy calendar system; September (from Latin septem, "seven") is the ninth month. LED brightness is still measured in candle power. All these are standards, none of them very sensible, all have their origins in legacy rather than reason.

Some people have code standards because it gives them a cosy warm feeling. They think it must be good code because it complies with a standard. Wrong. It might be terrible code. It may even be worse because it complies with a standard.

So please don't confuse code standards with code quality. And don't pick the wrong standards.

MISRA does have a lot of good advice in it. But some of it is bad advice. Take the good bits, add some good bits of your own, and throw the rest away. You will be able to write better code. The important thing is to focus on quality, not on standards compliance. Especially if some of those standards are wrong. And please fix those strcat()s. They do matter.