Coverage metrics gain ground in formal verification
By Richard Goering
Coverage metrics will play an increasingly important role in formal verification, according to Hana Chockler, research staff member at IBM’s Haifa Research Laboratory and general chair of the upcoming Haifa Verification Conference (HVC 2008). In this interview, Chockler describes some of her research in formal coverage and discusses HVC’s focus on both software and hardware verification.
Chockler has been at IBM since 2005. She was previously a visiting scientist at the Massachusetts Institute of Technology, and she did postdoctoral work at Worcester Polytechnic Institute and Northeastern University. Her interests span formal and informal verification methods, as well as verification of hardware and embedded software.
SCDsource: What does your research at IBM Haifa Labs involve, and what are you most interested in right now?
Chockler: I am a member of the Formal Verification Group at the IBM Haifa Research Lab. Of all the eight IBM Research Labs, Haifa is the biggest one outside the U.S. and it serves as a world center of expertise for verification technologies. Our group has a formal verification platform called Rulebase, which we continue to develop and maintain. Naturally, a large chunk of the group’s research is in the formal verification of hardware. We also do research in PSL [Property Specification Language], and we recently started exploring power verification.
My current interests revolve around two main research areas — sanity checks for formal verification, and integration between formal and informal verification techniques. Sanity checks for formal verification include vacuity and coverage. In the area of integrating formal and informal methods, I am involved in the research of cross-entropy based methods for testing. Recently, I have begun exploring ideas on how to exchange coverage information between formal and informal verification methods.
SCDsource: You’re also the general chair of HVC 2008 [Haifa Verification Conference, Oct. 27-30]. What is the primary goal of this conference, and how is it different from other verification conferences?
Chockler: This is the fourth year in which we are organizing a full-fledged conference dedicated to verification. The scope of HVC is very wide compared to other conferences in this area, and it includes both formal and informal verification techniques for both hardware and software. Our goal is to bring together researchers and developers in all verification communities and provide them with a forum for sharing their work, exchanging their ideas, and discussing the challenges and future directions of verification across the spectrum.
Academic research in the verification of systems is generally divided into two paradigms — formal verification and dynamic verification, or testing. Different algorithms and techniques are used for hardware and software systems in each paradigm. Yet, at their core, all of these techniques aim to achieve the same goal of ensuring the correct functionality of a complicated computerized system. HVC is the only conference that brings together researchers from all four fields, thereby encouraging the migration of methods and ideas between domains.
At this HVC, we are also organizing special sessions that facilitate the interaction of people from different areas of verification. We have a panel on “coverage in verification” to which we invited representatives of both formal verification and testing, of hardware and of software verification, and of industry and academia. We are expecting an extremely interesting discussion there. We also have a special session on post-silicon verification, which is a relatively new research area for formal methods. Finally, we have a full day of mini-tutorials under the subject “emerging challenges and directions in verification,” where we are going to have presentations on new challenges for verification across the spectrum. I have no doubt this will spark some very interesting interactions.
SCDsource: Why does HVC 2008 have a focus on both hardware and software verification, and an interest in “hybrid” methods?
Chockler: This is a unique quality of HVC compared to other verification conferences. We believe that hardware and software verification have a lot in common. Of course, the challenges are different, but many ideas can be carried over from hardware verification, which is an older and more developed research area, to software verification. We also believe that the future belongs to “hybrid” methods that combine formal and informal techniques in order to achieve a higher quality of verification than would be possible using each method alone. Since the scope of the conference is so wide, it is a natural place for the exposure of hybrid methods and ideas.
SCDsource: It appears much of your research has focused on coverage methods for formal verification. What coverage metrics are commonly used today in formal model checking, and how effective are they?
Chockler: The notion of coverage metrics in formal verification is inspired by coverage metrics in testing. This is actually a perfect example of how ideas are transferred from one verification area to another. In testing, coverage metrics are used as a heuristic measure for the exhaustiveness of a test suite, and usually measure the part of a system under test that is visited during invocation of the test.
In formal verification, we can visit all reachable states of the system. So what is the right coverage metric for formal verification? The metrics we suggested are primarily based on checking which areas of the system can be changed (mutated or removed) without changing the outcome of the verification procedure. I believe that this is the right direction for developing useful coverage metrics in formal verification.
In fact, Intel was first to suggest such metrics in 1999 [HKHZ99] and they reported a bug that was discovered in a priority buffer by applying coverage metrics. However, until recently, coverage metrics have not been used in practical applications, due to the significant computational overhead that they incur. Over the last year, we’ve observed a growing awareness of “suspecting the positive answer” of model checking in industrial tools, so I hope coverage metrics will be soon prove useful in real-life verification.
SCDsource: What capabilities are most needed in formal verification coverage? And how important are these capabilities to the further acceptance, and use, of formal methods?
Chockler: First of all, coverage in formal verification, similar to coverage in testing, should measure how well we checked our system. It should be used to answer the question “Have I written enough properties to verify this system?” Secondly, and perhaps no less important in facilitating acceptance of formal methods, there is a psychological aspect to coverage; it supplies some additional information to the user together with the positive answer of the model checker. This way, the users do not have to put blind trust in their model checker. They can see the covered parts and decide whether it is exhaustive enough or more properties should be checked.
Last but not least, the way coverage information is presented to the user is very important. Since systems are very large, long lists of covered (or uncovered) states are probably useless. We need to extract some meaningful information from these lists and present it in a way that aids users in writing more specifications. Ideally, we should suggest new specifications that cover areas of the system that are not covered by existing specifications. Unfortunately, nobody seems to know how to do this yet.
SCDsource: One of your papers talked about “corresponding metrics” between simulation-based and formal verification. What metrics in the formal world correspond to code coverage, functional coverage, and assertion coverage as used today with simulation?
Chockler: Coverage metrics in simulation-based verification are mostly concerned with parts of the system that are visited during invocation of a test, although there are some metrics that check the effect of local mutations. So, for example, code coverage in simulation-based verification measures the number of lines of code that are visited during a test. A line of code is not covered if it is not visited in any of the tests in our test suite.
In formal verification, we can check the effect of mutating a line of code — that is, replacing it with something else — or omitting it altogether. If the verification still passes, then this line is not important and is left uncovered by the verification procedure. In the same way, functional coverage in formal verification checks the effect of removing states or transitions in a finite state machine [FSM] that represents the satisfaction of the specification in the system [CKV06b]. In assertion coverage for formal verification, we check the effect of omitting behaviors that satisfy a given assertion on the outcome of model checking.
It is interesting to note that when we check linear time specifications, omitting a part of the system cannot falsify a previously satisfied specification. This is because removing a part of the system results in a system with fewer behaviors than the original one. Thus, if the specification was satisfied before, it is still satisfied. So how do we define coverage metrics that check the effect of omitting parts of the system? We suggested checking vacuity (meaningless satisfaction) of the system. If a system satisfies a specification in an interesting, or meaningful, way, and after we remove a part of the system it satisfies the specification vacuously, then this part is covered by the specification.