BUGGY SOURCE CODE AFFECTS BREATH TEST RESULTS

A court-ordered audit of the source code that powers a breathalyzer machine has uncovered serious bugs and technical deficiencies. The professional code reviewers contend that the software is far below industry standards for quality and that it contains programming errors. The results of this review have raised serious questions about the viability of such devices as a law enforcement tool.

Over the past several years, DWI defendants have increasingly challenged the accuracy of field breathalyzers, contending that the machines are potentially fallible and do not provide a sufficient degree of accuracy to justify using them as the sole basis on which guilt is determined.

In several cases, defendants have asked the courts to mandate source code reviews so that the software that runs the devices can be tested and evaluated for quality. Courts in Florida, Minnesota, and several other states have granted such requests, within certain parameters. In instances where the breathalyzer companies have declined to make code available for such reviews, judges have been forced to throw out cases or reduce the charges against defendants.

In an ongoing DWI case in New Jersey, where the source disclosure issue escalated to the state's Supreme Court, breathalyzer company Draeger was forced to submit its code for independent review. The software review summaries published by the expert source code auditors indicate that the underlying software that powers the Draeger breathalyzer exhibits potentially serious flaws.

Two reviews have been published. One review, which was conducted by SysTest, was commissioned by Draeger. The second review, conducted by Base One, was commissioned by the defendant. The reviews differ in scope and offer different conclusions, but they both agree that the code falls below industry-standard best practices and that it contains bugs.

Buggy software

The SysTest analysis was narrowly focused on determining whether the source code matches the product's documented behavior and whether it contains any deliberate deviations that could lead to inaccurate results. SysTest's study confirmed that the code does not contain malware, but the report noted that the code is excessively complex, poorly maintained, and includes at least one reproducible bug: a buffer overflow error that will occur under specific conditions.

The Base One test, which was broader in scope, attempted to provide a more holistic analysis of the software's accuracy. The report identifies 24 major defects and points to a wide range of troubling issues. The analysts discovered that the embedded software disables safeguard features built into the device's processor that are intended to detect and prevent the execution of invalid or corrupt instructions. The researchers contend that this circumvention can lead to unpredictable results in the event of fatal errors.

The researchers also found that the device doesn't have any built-in sensors to determine if its physical state is consistent at any given time. When the code activates a motor or valve, the report says, it simply assumes that this function has been correctly performed and does not test to make sure. Some diagnostic routines in the code will silently return arbitrary default values upon failure, leading to potentially inaccurate breathalyzer test results. The software will also silently ignore errors in some cases unless there are a large number of consecutive failures.
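The two error-handling patterns described in the report can be illustrated with a minimal Python sketch. This is not the Alcotest code, which has not been published, and it makes no claim about how the firmware is actually written. Every name here (`SensorError`, `read_silent`, `read_checked`, `TolerantMonitor`, `DEFAULT_READING`) is hypothetical and exists only to show why a silent default or a consecutive-failure threshold can hide a fault.

```python
from typing import Callable

DEFAULT_READING = 0.0  # hypothetical arbitrary fallback value


class SensorError(Exception):
    """Raised when a (hypothetical) diagnostic read fails."""


def read_silent(sensor: Callable[[], float]) -> float:
    """Anti-pattern: hide the failure behind an arbitrary default."""
    try:
        return sensor()
    except SensorError:
        return DEFAULT_READING  # caller cannot tell this from a real reading


def read_checked(sensor: Callable[[], float]) -> float:
    """Alternative: let the failure propagate so the test can be aborted."""
    return sensor()


class TolerantMonitor:
    """Anti-pattern: report a fault only after `threshold` failures in a row."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.consecutive = 0

    def record(self, ok: bool) -> bool:
        """Record one check; return True only once a fault is reported."""
        self.consecutive = 0 if ok else self.consecutive + 1
        return self.consecutive >= self.threshold


def failing_sensor() -> float:
    """Stand-in for a diagnostic routine that always fails."""
    raise SensorError("no response from sensor")
```

In this sketch, `read_silent(failing_sensor)` returns `DEFAULT_READING` as if nothing had gone wrong, while `read_checked(failing_sensor)` raises `SensorError`. A `TolerantMonitor(threshold=3)` fed the sequence fail, fail, pass, fail, fail never reports a fault, even though four of the five checks failed, because a single success resets the counter.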

Both studies identified problems with the style of the code. One stylistic issue that concerned the reviewers was the extensive use of unprotected global variables. This is considered poor form because it increases the risk that the program state will become inconsistent or that values will be inadvertently modified or overwritten. The researchers also expressed some concern that decimal precision is not maintained consistently throughout the code.
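The article does not say where or how precision is lost in the device's code, so the following Python sketch is only a generic illustration of why inconsistent decimal precision matters. The function names and the two sample readings are invented for the example and are not taken from either report. It averages the same readings twice: once truncating each value to two decimal places before averaging, and once carrying full precision until the final step.

```python
from decimal import Decimal, ROUND_DOWN
from typing import Sequence

TWO_PLACES = Decimal("0.01")


def truncate(value: Decimal) -> Decimal:
    """Drop everything past the second decimal place."""
    return value.quantize(TWO_PLACES, rounding=ROUND_DOWN)


def average_truncate_early(readings: Sequence[Decimal]) -> Decimal:
    """Truncate every reading first, then average and truncate again."""
    truncated = [truncate(r) for r in readings]
    return truncate(sum(truncated, Decimal("0")) / len(truncated))


def average_truncate_late(readings: Sequence[Decimal]) -> Decimal:
    """Average at full precision and truncate only the final result."""
    return truncate(sum(readings, Decimal("0")) / len(readings))


# Made-up sample values, chosen only to make the difference visible.
readings = [Decimal("0.075"), Decimal("0.086")]

print(average_truncate_early(readings))  # 0.07
print(average_truncate_late(readings))   # 0.08
```

With these made-up inputs, the two approaches disagree by a full hundredth. Whether anything like this occurs in the actual device is not something the article establishes; the sketch only shows why reviewers treat inconsistent precision as a concern when a result is compared against a fixed threshold.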

The final conclusions issued by the code evaluators are somewhat different. SysTest contends that, despite the clear deficiencies it documented in the code, the software is still effective and reliable when used correctly.

"It is the opinion of SysTest Labs that while the reviewed source code is not written in a manner consistent with usual software design best practices, there are no obvious defects intentionally written to produce anything other than consistent test results," the report says. "SysTest Labs Incorporated expects that the Alcotest 7110 MKIII-C's source code, as written, and when used in accordance with DSDI guidelines, will reliably produce consistent test results."

The Base One reviewers are less optimistic and suggest that using the software in a law enforcement capacity poses too great a risk. "As a matter of public safety, the Alcotest should be suspended from use until the software has been reviewed against an acceptable set of software development standards, and recoded and tested if necessary," the Base One report says. "An incorrect breath test could lead to accidents and possible loss of life, because the device might not detect a person who is under the influence, and that person would be allowed to drive. The possibility also exists that a person not under the influence could be wrongly accused and/or convicted."

Although the reports arrive at conflicting conclusions, the bugs and technical flaws documented by both demonstrate that the software's quality is highly questionable. The findings illuminate the need to conduct such audits on software that is entrusted with critical tasks. In a blog entry about the source audits, security expert Bruce Schneier characterizes the situation as "an excellent lesson in the security problems inherent in trusting proprietary software."

"This is important. As we become more and more dependent on software for evidentiary and other legal applications, we need to be able to carefully examine that software for accuracy, reliability, etc," he wrote. "Every government contract for breath alcohol detectors needs to include the requirement for public source code. 'You can't look at our code because we don't want you to' simply isn't good enough."

The implications for Draeger could be significant. According to a report by DWI.com, an agreement between Draeger and the state of New Jersey stipulates that the company will have to refund the state almost $7 million if the state supreme court declares that the machine is unreliable.

http://arstechnica.com/tech-policy/news/2009/05/buggy-breathalyzer-code-reflects-importance-of-source-review.ars