Automatic static code analysis before uploading your code

Developers and team leaders are probably familiar with static code analysis tools such as CppCheck, Klocwork and others. The main problem with such tools is that nothing enforces their use. You can upload code to your repository server with issues that could have been found before the upload, and by the time anyone notices, those issues are already on the repository server or, even worse, deployed at your customers’ endpoints.

This means that the code on the repository server must be reviewed (using a static code analysis tool) once in a while in order to find and fix these issues, which requires a cycle involving 2-3 engineers. So why not enforce analysis of the code each time it is uploaded? This way you save the engineers’ find-fix-test cycle and ensure that the reported issues are fixed before the product is released to customers.

Check out this article I uploaded to the CppCheck community - CppCheck integration to TortoiseSVN (includes a script for static code analysis automation).

Since the recent changes at SourceForge, the CppCheck data hosted there is missing. Therefore, I’m re-posting it here:

===================================================

Since we are not robots (yet), it is easy to forget to run Cppcheck before committing code to the SVN server. Organizations that use Cppcheck (or any other static code analysis tool) usually perform the code analysis once a day/week/month. The team leader then assigns a developer to fix each issue, commit the code and wait for the next code analysis. And then the cycle starts again. Sometimes the code analysis is performed after a build has already been released to QA or, even worse, to customers.

We all know that asking developers to run Cppcheck before every commit is not feasible. However, this process can be automated and made largely invisible to the developers.

Attached to this page is a script that automatically runs Cppcheck on all source files being committed. The check runs when the commit is triggered (before the commit is actually performed), with zero effort from the developers. If issues are found, the script fails the commit so the developer can fix them and commit only Cppcheck-checked code (failing the commit can be bypassed if needed). The great value of this approach is that we can fix the issues before they are committed to the SVN server!
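The attached script itself is not reproduced here, but the idea can be sketched roughly as follows. This is an illustration rather than the original script: it assumes that TortoiseSVN passes a temporary file listing the committed paths as the first argument, and that Cppcheck’s --error-exitcode option is used to signal findings. The names CPPCHECK_PATH, SUPPORTED_FILE_TYPES and ENABLE_SCRIPT, the install path and the helper functions are hypothetical and only mirror the settings described under Configuration.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical settings; adjust to your environment.
CPPCHECK_PATH = r"C:\Program Files\Cppcheck\cppcheck.exe"
SUPPORTED_FILE_TYPES = {".c", ".cpp", ".cxx", ".h", ".hpp"}
ENABLE_SCRIPT = True


def select_sources(paths):
    """Keep only paths whose extension is a supported source type."""
    return [p for p in paths if Path(p).suffix.lower() in SUPPORTED_FILE_TYPES]


def build_command(sources):
    """Build the Cppcheck command line; exit code 1 means issues were found."""
    return [CPPCHECK_PATH, "--quiet", "--error-exitcode=1", *sources]


def main(argv):
    if not ENABLE_SCRIPT or len(argv) < 2:
        return 0
    # Assumption: argv[1] is a UTF-8 text file with one committed path per line.
    listed = Path(argv[1]).read_text(encoding="utf-8").splitlines()
    sources = select_sources(line.strip() for line in listed if line.strip())
    if not sources:
        return 0
    result = subprocess.run(build_command(sources), capture_output=True, text=True)
    # Forward Cppcheck's findings on stderr so the client can display them.
    sys.stderr.write(result.stderr)
    return result.returncode


# Only act as a hook when the client supplies arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

A non-zero return value is what makes the commit fail, so a clean Cppcheck run leaves the commit flow unchanged.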

Configuration

  1. Download SVN_Pre_Commit_Hook__CppCheck_Validate, extract the zipped file and edit the script:
    • cppCheckPath - Full path to your Cppcheck.exe (not CppcheckGui.exe).
    • supportedFileTypes - Add or remove file types to check. This variable is here so the script won’t check ‘.sln’, ‘.vcxproj’ and other non-source file types.
    • enableScript - ‘1’ or ‘0’ to enable/disable running the script.
  2. Right click (somewhere on desktop) → TortoiseSVN → Settings → Hook Scripts → Add…
  3. Configure Hook Scripts:
    • Hook Type: Choose ‘Pre-Commit Hook’ (upper right corner).
    • Working Copy Path: The directory under which all of your SVN checkouts are located. Use the topmost directory (or just use ‘C:\’, for example).
    • Command Line To Execute: Full path to the attached script.
    • Make sure that both the ‘Wait for the script to finish’ and ‘Hide the script while running’ checkboxes are checked → OK → OK.

Hints

  1. Even if the commit fails because it didn’t pass the static code analysis, TortoiseSVN lets you easily recommit, disregarding the failure, by clicking the ‘Retry without hooks’ button. If the commit succeeds (meaning Cppcheck did not find any issues), it looks as if nothing special happened: developers still see the usual end-of-commit message, just like before.
  2. If you want to implement this solution in your organization/team, you can take one of two approaches:
    • Client side solution - The steps above are performed on each of your development machines. The benefit of this approach is that only the relevant teams use the solution, not every developer working against the SVN server. Besides, ignoring the Cppcheck result (in case of false positives, for example) is easy: it takes one button click in the TortoiseSVN client (‘Retry without hooks’). Of course, this approach means that Cppcheck must be installed on all of the relevant developers’ machines.
    • Server side solution - Cppcheck is installed only on the SVN server and the steps above are performed only once (server side only). Clients (developers’ machines) need to take no action, since every commit triggers the hook on the server side. The benefit is that the setup is done only once, but this solution may be too restrictive for some organizations. In addition, in order to ignore the hook (once again, for a false positive, for example), you need to create some ‘back-door’ script that allows developers to bypass it with a specific keyword in the commit message.
  3. More about SVN hook scripts - Client Hook Scripts, Server Hook Scripts.
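
For the server side approach, the ‘back-door’ could be sketched along these lines. This is only an illustration under stated assumptions: the keyword [skip-cppcheck] and both function names are hypothetical, and the sketch relies on a server-side pre-commit hook receiving the repository path and transaction name, with the pending commit message read via svnlook log.

```python
import subprocess

# Hypothetical keyword a developer adds to the commit message to skip the check.
BYPASS_KEYWORD = "[skip-cppcheck]"


def should_bypass(commit_message, keyword=BYPASS_KEYWORD):
    """Return True when the commit message carries the bypass keyword."""
    return keyword.lower() in commit_message.lower()


def read_commit_message(repos, txn):
    """Read the pending commit's log message with `svnlook log -t TXN REPOS`."""
    result = subprocess.run(
        ["svnlook", "log", "-t", txn, repos],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

The hook would call should_bypass(read_commit_message(repos, txn)) first and return success immediately when it is true, running Cppcheck only otherwise.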

All you need to do is perform the Configuration steps above once. Afterwards, you can work with SVN the same as before, except that now you get to see your failures before the code is committed to the SVN server.

Reverse Engineering COM dlls

Reverse engineering closed source binaries is always a challenge. Tools such as Process Monitor, API Monitor, IDA and others make such tasks achievable, but they still require good knowledge and experience. Whether you want to exploit or defend (depending on the hat you wear) a specific application, you need to find the right interception point(s). In most cases we are dealing with dlls, where the obvious starting point is to get the addresses of the dll’s exported functions and start investigating from there.

Lately I had to find the right place for interception in order to develop an additional security layer for one of Microsoft’s IIS components. Needless to say, the set of dlls I investigated is completely undocumented and, unfortunately, does not even have private symbols on Microsoft’s symbol server. So my plan was –

  • Find a few interesting entry points (exported functions) in some dlls
  • Debug the target process by attaching to it with Visual Studio
  • Load relevant dll exports (once again, we have no symbols)
  • Set breakpoints on my suggested entry points

After eliminating some of the dlls loaded in the process, my suspicion fell on one dll with more than 6,000 exported functions. To cut things short, I found a function, let’s call it foo.dll:func(), which suggested that I was in a good spot on the critical path. Setting a breakpoint on this function proved that I was in the right place: whenever I performed the operation I wanted to intercept, the operation hung until the breakpoint was released. Done? Nope. Since my goal was to protect and not just audit an operation, I tested what would happen if I skipped the execution of this function or just returned access denied. Doing so did not provide the results I was looking for: although I skipped execution, the original operation triggered by the user still completed successfully. This probably means that this was not the function I was looking for, though I was very close. Why? Because the breakpoint really held up the execution of the whole request, so I was in a good spot, probably on the right thread as well.

One more try with the same breakpoint showed something interesting. When execution paused at the breakpoint, I took a look at the thread’s call stack and noticed that this function was not the first frame of the call stack. Simplified to a scope of 2 frames, it looked like this:
boo.dll:newFunc()+0x500 → foo.dll:func()

So, we have a new dll in the game, boo.dll. It seems like the real operation was initiated from the newFunc exported function in boo.dll and, along the way, at offset 0x500, it reached a ‘call’ instruction to foo.dll:func().

WRONG!

Why? For three reasons –

  • Setting a breakpoint at boo.dll:newFunc() entry point did not break the execution
  • The original name of newFunc() was not related at all to the operation I was looking for
  • Whenever I performed different operations, I noticed boo.dll:newFunc() in the call stack with a different offset for each operation

So although at first glance it may look like the calls originated from boo.dll:newFunc(), they actually didn’t. Then what is going on here? COM objects were in the game.

Since this dll does not have symbols, Visual Studio’s only chance to show a more informative call stack is to load the dll’s exports, which then act as a minimal set of symbols for the dll. However, the dll contains more functions than it exports, so whenever Visual Studio encounters an address in this dll that it does not recognize, it basically looks up the nearest known symbol before that address and presents the frame as if execution originated from this symbol plus an offset to the real address. Meaning, it looks like execution originated from function X (for which we do have a symbol) while it actually originated from function Y (for which we don’t).
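
To make the effect concrete, here is a small sketch of that kind of lookup. It is a simplified model and not Visual Studio’s actual implementation: it assumes the debugger picks the nearest export at or below the address, and the addresses, the second export name and the resolve helper are made up for illustration.

```python
import bisect


def resolve(address, exports):
    """Name an address as the nearest preceding export plus an offset.

    exports is a list of (address, name) pairs sorted by address.
    """
    addresses = [addr for addr, _ in exports]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        # No export at or below this address: show the raw address.
        return hex(address)
    base, name = exports[index]
    return f"{name}+{hex(address - base)}"


# Hypothetical export table: only two functions of boo.dll are exported.
exports = [(0x1000, "boo.dll:newFunc"), (0x9000, "boo.dll:otherExport")]

# 0x1500 belongs to an unexported function, yet it is attributed to newFunc.
print(resolve(0x1500, exports))  # boo.dll:newFunc+0x500
```

Every unexported function that sits between two exports gets attributed to the lower one, which is why the same exported name keeps showing up with a different offset for each operation.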

So, apparently boo.dll is not just exporting functions; it also implements a couple of dozen COM interfaces. So how do we proceed from here?

We have 3 main challenges –

  1. Discovering all of the dll’s COM interfaces, methods and addresses (remember, GetProcAddress() will not do the trick here).
  2. Making your debugger (Visual Studio in my case) show the correct call stack, including COM function calls, and making it easy to set breakpoints on COM functions. A good call stack would be boo.dll:ComFunc+offset → foo.dll:func() (since the boo.dll:newFunc exported function was just misleading, we don’t want to see it in the call stack).
  3. Understanding the flow of an application’s API calls. If you reverse engineer a set of dlls that includes more than a few dozen APIs, finding the right place for interception becomes a challenge. We need a log of all API calls for a specific operation. This is a bit similar to API Monitor and Process Monitor, but what both are missing is fully automated logging of COM calls without the need for a manually preconfigured definitions file (and also integration with Visual Studio).

How can we achieve this? Stay tuned for the DbgGenerator tool I’ve created, which allows you to load symbols for such dlls and reverse engineer them easily, in no time. I promise good stuff.