Compliance in Codebases - How to Use Clang LibASTMatchers for Compliance
The following article was created from a 2021 CppCon talk given by Jonah Jolley.
One of our clients has a software product that must comply with federal regulations, which means submitting technical documents to a governing body. In their case, however, the codebase didn’t accurately reflect what was documented. This led to an unmitigated, unsafe condition.
To solve this issue, we used Clang LibASTMatchers to check the codebase and ensure they were indeed compliant.
Why Compliance in Codebases Matters
Compliance is adhering to rules and regulations set by an authority. In the professional world, compliance generally falls into two buckets:
- Regulatory: Companies have to follow relevant laws and regulations for their industry. This is important as these external guidelines are in place for safety, whether personal safety or data safety.
- Organizational: While this typically comes from HR or security departments, it also applies to coding practices: measures set by the engineering team to ensure quality and easier collaboration.
The biggest concern is that safety can be jeopardized, whether the standard in question covers the grade of material to use or the process for testing and auditing software.
Without compliance, quality suffers. Trust with customers erodes. You also face the possibility of fines and disciplinary action.
To reduce the risk of noncompliance, it’s important for organizations to practice corrective and preventive action (CAPA). CAPA involves:
- Investigation: Investigate the situation, understand the risks, and make a correction plan with a timeline for implementation
- Correction: Fix the issue (for some industries, this would be a recall). It could also be a personnel or shift change, or a documentation update
- Root Cause Analysis: Find out the underlying cause
- Preventive Action: Understand how this happened and change processes to avoid it in the future
Tips for Compliance
Disclaimer: We’re not compliance officers. However, there are some basic steps we take in industries that require it.
- Ensure employees are up to date on, and understand, the relevant regulations and practices
- Keep documentation of the product’s design and architecture up to date
What about automation? Generally, this is a well-defined problem: codebases can use static analysis tools. But what about custom checkers that verify project-specific rules? With Clang, there is an opportunity to make this process more efficient.
An Example
For this example, we’re going to pretend we’re making a candy dispenser. We have an LED touchscreen where we can select the candy we want. Right now it’s showing its screensaver.
We have another LED that blinks differently for different statuses, and a door we can open to grab our candy.
Device Architecture
- GUI, where you can select your candy. It also shows statuses
- Driver Process, which holds all the microcontroller code and sensor information. It provides an API that the Baker Process uses to control them
- Baker Process, which takes a recipe and commands the different components inside to make it
- Monitors, also in the Baker Process, which watch sensor information based on the recipe to ensure conditions stay in a safe range
- If a monitor sees an unsafe condition, it raises an alarm
- The Alarm Manager looks up that alarm in a config file, where we define the corrective action the device should take, whether a full shutdown or pausing the flame
So we need alarms that describe unsafe conditions, monitors that ensure device conditions stay nominal, and an alarm manager that enforces corrective action. To get certified for release to the public, we need to submit documentation on the device. Part of that is a list of all the alarms, since they describe potentially unsafe situations.
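To make the three roles concrete, here is a minimal C++ sketch of how a monitor, an alarm, and the alarm manager could fit together. The names RangeMonitor, AlarmManager, Action, and the alarm string used with them are assumptions made for illustration; the real device code is not shown in this article.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical corrective actions an alarm can map to.
enum class Action { Pause, FullShutdown };

// Hypothetical alarm manager: looks up the configured action for an alarm.
class AlarmManager {
public:
    explicit AlarmManager(std::unordered_map<std::string, Action> config)
        : config_(std::move(config)) {}

    // Records the alarm and returns its configured action.
    // Assumption: unknown alarms fall back to the most conservative action.
    Action handle(const std::string& alarm) {
        raised_.push_back(alarm);
        auto it = config_.find(alarm);
        return it == config_.end() ? Action::FullShutdown : it->second;
    }

    const std::vector<std::string>& raised() const { return raised_; }

private:
    std::unordered_map<std::string, Action> config_;
    std::vector<std::string> raised_;
};

// Hypothetical monitor: raises its alarm when a reading leaves [low, high].
class RangeMonitor {
public:
    RangeMonitor(std::string alarm, double low, double high)
        : alarm_(std::move(alarm)), low_(low), high_(high) {}

    // Returns true if the reading was unsafe and the alarm was raised.
    bool check(double reading, AlarmManager& manager) const {
        if (reading >= low_ && reading <= high_) return false;
        manager.handle(alarm_);
        return true;
    }

private:
    std::string alarm_;
    double low_;
    double high_;
};
```

In this sketch the config is an in-memory map. On the device described above it would be loaded from the config file, which is also where the submitted alarm list would come from.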
How Do We Solve This?
The alarm client is instantiated with a string that comes from a literal or a config file. Eventually its raise function is called, which dispatches an event to the Alarm Manager. Somewhere else, every alarm is defined in a config file along with its severity rating and any other metadata we might need.
In the code, we can see the alarm client, which alarm it represents, and that it is raised. So the two main things we need to grab are the variable name and what it was initialized with.
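As a rough illustration of the kind of code we are inspecting, here is a sketch of an alarm client. AlarmClient and its raise function are named in the text above, but the constructor signature, the Dispatcher callback, the dispenseCandy function, and the "DoorOpen" alarm are assumptions made up for this example.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical alarm client, assumed for illustration only.
class AlarmClient {
public:
    // Stand-in for the event channel to the Alarm Manager.
    using Dispatcher = std::function<void(const std::string&)>;

    AlarmClient(std::string name, Dispatcher dispatch)
        : name_(std::move(name)), dispatch_(std::move(dispatch)) {}

    // Dispatches the alarm name as an event, if a dispatcher was provided.
    void raise() const {
        if (dispatch_) dispatch_(name_);
    }

    const std::string& name() const { return name_; }

private:
    std::string name_;
    Dispatcher dispatch_;
};

// The two facts a checker needs from this function:
//   variable name: doorAlarm
//   initializer:   "DoorOpen"
void dispenseCandy(bool doorOpen, const AlarmClient::Dispatcher& dispatch) {
    AlarmClient doorAlarm("DoorOpen", dispatch);
    if (doorOpen) doorAlarm.raise();
}
```

A checker that records doorAlarm, "DoorOpen", and the call to raise has what it needs to compare this function against the documented alarm list.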
We can’t do this manually, because it would take a person hours to maintain, it’s error-prone, and existing tools are cumbersome.
We can’t use regex either. Regex is brittle: two pieces of code can have the same underlying representation even though their text is different, so a pattern written for one spelling misses the other. It’s also hard to maintain. After I write the most “amazing,” beautiful regex ever and come back to it after an extended period of time, it looks like gibberish. Finally, C++ has more going on than just its textual representation. We need context. If a local variable with the same name appears in two different functions, we also need to capture the function name to confidently mark an alarm as defined and raised.
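The brittleness is easy to demonstrate with std::regex. The pattern below is a deliberately naive example written for this article, not one taken from the client project, and the AlarmClient declarations it is tried against are hypothetical.

```cpp
#include <regex>
#include <string>

// Returns true if a naive pattern finds an AlarmClient declaration in the line.
// The pattern only knows one spelling: AlarmClient <name>("<alarm>")
bool naiveRegexFindsAlarm(const std::string& line) {
    static const std::regex pattern(
        R"re(AlarmClient\s+(\w+)\s*\(\s*"([^"]*)"\s*\))re");
    return std::regex_search(line, pattern);
}
```

This matches `AlarmClient doorAlarm("DoorOpen");` but misses `AlarmClient doorAlarm{"DoorOpen"};` and `auto doorAlarm = AlarmClient("DoorOpen");`, even though all three declare a client for the same alarm. Each new spelling means another pattern to write and maintain.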
What About Clang Tooling?
Leveraging Clang-Tidy would be really great, since we could write a custom checker. However, this falls short because we need to post-process the information we get. Clang-Tidy would be enough if we only wanted to make sure that every AlarmClient had its raise function called, so if that is all you need in your own codebase, you could leverage it.
With LibTooling, we can control the output of the program, specialize it for what we need, and post-process that output with additional tooling. Clang LibTooling is a library that supports writing standalone tools. It allows us to run tools over single files or subsets of files while giving us full control over and access to the Clang AST. It also allows us to share code with Clang Plugins.
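To show what the post-processing step could look like, here is a minimal sketch in plain C++ that uses no Clang APIs. It assumes a LibTooling-based extractor has already emitted one record per AlarmClient. The AlarmUse record layout, the ComplianceReport type, and the compare function are all hypothetical names chosen for this example.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical record an extractor might emit for each AlarmClient it finds.
struct AlarmUse {
    std::string function;  // enclosing function, disambiguates same-named locals
    std::string variable;  // AlarmClient variable name
    std::string alarm;     // string the client was initialized with
    bool raised;           // whether raise() is called on this variable
};

struct ComplianceReport {
    std::set<std::string> undocumented;  // raised in code, missing from the docs
    std::set<std::string> notRaised;     // documented, never raised in code
};

// Compares what the code does against the documented alarm list.
ComplianceReport compare(const std::vector<AlarmUse>& uses,
                         const std::set<std::string>& documented) {
    ComplianceReport report;
    std::set<std::string> raisedInCode;
    for (const AlarmUse& use : uses) {
        if (!use.raised) continue;
        raisedInCode.insert(use.alarm);
        if (documented.count(use.alarm) == 0) {
            report.undocumented.insert(use.alarm);
        }
    }
    for (const std::string& alarm : documented) {
        if (raisedInCode.count(alarm) == 0) {
            report.notRaised.insert(alarm);
        }
    }
    return report;
}
```

Keeping the comparison separate from the extraction is the design choice that motivates LibTooling here: the extractor only reports facts about the code, and the decision about what counts as compliant lives in a small program that can change as the documentation changes.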
If your codebase needs to comply with certain regulations, reach out to our software development team. We'd be happy to brainstorm the best solution and implement Clang or another tool.