Safety with C++Builder 12.3: Introducing Sanitizers

Ever since we started using our enhanced Clang compiler, we’ve had a consistent request: can we add runtime checking?

For many customers, this is because the old ‘classic’ compiler had a feature called CodeGuard, which was helpful for catching errors, and when migrating to a newer toolchain, wanting some semi-equivalent is reasonable.
For many other customers, who knew Clang, it was because Clang comes with multiple features for different kinds of runtime checks, called sanitizers. And they’re useful. And these customers asked for them.

Our old Clang toolchain could not support these. But the new C++ Modern toolchain, which was ready to use in 12.2 and revised for quality and performance in 12.3, out now, is something we’ve talked about as a ‘foundation for the future.’ ‘We can build things we could not before,’ we’ve said. ‘Having it will let us deliver real value to you.’

You can see where this is going.

We’re making good on that.

Last release, it was CMake and incredible compiler performance (which, by the way, we have improved by up to a further 20% this release).

This release, we have multiple new things we could not deliver before.

And one of them is runtime checking with sanitizers. They are here, in C++Builder and RAD Studio 12.3. There are two of them: the address (memory) sanitizer, and the undefined behaviour sanitizer. Let’s look at both.

Address Sanitizer

This sanitizer, known as ‘Asan’, checks for common memory access issues: out-of-bounds access, null dereference, double free, etc. It tracks both the heap and the stack.

For various technical reasons, when using Asan, you need to run your app from the command line. Asan writes to stderr which most UI apps don’t have, so run it like so:

myapp 2> asanlog.txt

The ‘2>’ redirects stderr (file descriptor 2) to asanlog.txt, creating the file if needed. When running, if the sanitizer finds an error, it will write it out and terminate the process.

Let’s try it. The following code has a subtle, easy-to-make bug in it:

void __fastcall TForm1::FormCreate(TObject *Sender)
{
    std::vector<int> numbers = {1, 2, 3, 4, 5};

    auto it = numbers.begin();
    while (it != numbers.end()) {
        std::cout << "Processing: " << *it << std::endl;
        ++it;
    }

    // This is a bug, but is it obvious? IMO it's easy to miss...
    // It dereferences an iterator that points past the end of the vector.
    std::cout << "Accidental access: " << *it << std::endl;
}

This code reads memory it should not, but it’s quite possible your app will not crash when it’s executed, so you’d never know otherwise.

Turn on the Address Sanitizer (Project Options, Building, C++ Compiler, Safety), build, run from the command line following the syntax above, and you’ll get an error output to the log file, after which the app will terminate.

It prints a lot of information, so I’ll quote selective parts of the full output.

ERROR: AddressSanitizer: heap-buffer-overflow on address 0x11e8153a73a4

Ok: an error, Asan is involved and it is a buffer overflow (going past the end of some allocated memory.) 0x11e8153a73a4 is the address being accessed when the overflow was detected. Past some more information about program state, we get to what happened:

READ of size 4 at 0x11e8153a73a4 thread T0

Thread T0, the main thread, read four bytes it should not have. Then there’s a call stack:

    #0 0x7ff766033ed4 in TForm1::FormCreate(System::TObject*) C:sanitizersUnit1.cpp:29
    #1 0x7ffc79253b69 in Vcl::Forms::TCustomForm::DoCreate() (I:srcworkdelphi.12xbin64vcl290.bpl+0x180283b69)
    #2 0x7ffc792532ac in Vcl::Forms::TCustomForm::AfterConstruction() (I:srcworkdelphi.12xbin64vcl290.bpl+0x1802832ac)
    #3 0x7ff76603fc37 in __AfterConstruction (C:sanitizersWin64xDebugasan_vcl_demo.exe+0x14000fc37)

This incorrect read of 4 bytes occurs in Unit1.cpp on line 29. Line 29 is the final std::cout line, the one that dereferences the iterator after the loop.

After this it prints more info:

0x11e8153a73a4 is located 0 bytes to the right of 20-byte region [0x11e8153a7390,0x11e8153a73a4) allocated by thread T0 here:

This tells you the valid range of the buffer (which we know is the vector’s allocation), 20 bytes, and that the access was 0 bytes ‘to the right’ of it. That is odd wording, but it means after the end, so the access immediately follows the end of the buffer. As a side note, a vector’s end() iterator does not point to the last element, but to one past the last element, and that’s the source of this bug: a misunderstanding by the developer who wrote that code.
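To make the end() point concrete, here is a minimal sketch of one way to read the last element safely. The helper name last_element is hypothetical and not part of the demo project; it only illustrates stepping back from the one-past-the-end position and handling the empty case.

```cpp
#include <cassert>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical helper: returns the last element, or std::nullopt if the
// vector is empty (in which case end() == begin() and nothing may be read).
std::optional<int> last_element(const std::vector<int>& v)
{
    if (v.empty()) {
        return std::nullopt;
    }
    // end() is one past the last element, so step back once before reading.
    auto last = std::prev(v.end());
    assert(&*last == &v.back()); // the same element that back() refers to
    return *last;
}
```

For the vector in the example above, last_element(numbers) holds 5, whereas dereferencing numbers.end() directly is the out-of-bounds read Asan reported.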

There is then a call stack (for brevity, much of it omitted here) of when the memory was allocated:

    #8 0x7ff766034142 in std::__1::vector<int, std::__1::allocator<int>>::vector[abi:v15007](std::initializer_list<int>) i:srcworkdelphi.12xincludex86_64-w64-mingw32c++v1vector:1286
    #9 0x7ff766033c4e in TForm1::FormCreate(System::TObject*) C:sanitizersUnit1.cpp:19

…and call stack entry #9 is where it enters our code, and you can see the file and line number. Line 19 is where the vector was created and initialized.

We now have what happened in general, a call stack for it, details on what happened, and the setup, so you can see where everything was allocated.

Then there is a summary and a memory dump. This shows ‘shadow bytes’ which is a separate map where Asan stores data about the memory that was allocated. Ie, what we’re seeing here is not the faulty memory itself, but information about the memory.

=>0x041f17df4e70: fa fa 00 00[04]fa fa

There’s a lot we could dig into about the various flags here, and Asan tracks memory status for the stack, freed areas, and much more. Each shadow byte printed here describes eight bytes of real memory. ‘fa’ is a flag meaning the memory is outside any allocation, and 00 means all eight bytes are valid heap memory. 04 means only the first four of those eight bytes are valid, and the square brackets mark the shadow byte that covers the faulting address. So we see two and a half eight-byte regions, or 20 bytes, or five lots of four bytes, which corresponds exactly to our vector of five ints of four bytes each.
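The arithmetic behind that shadow line can be sketched in a few lines. This is a simplified model based on the encoding described in the upstream AddressSanitizer documentation, not the runtime's actual code: both function names are hypothetical, the region is assumed to start on an 8-byte boundary, and redzones are represented only by "falls outside the shadow we computed".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shadow bytes for an 8-byte-aligned region of `size` bytes (simplified).
// 0x00   = all 8 bytes of that granule are addressable
// 1..7   = only the first k bytes of that granule are addressable
std::vector<std::uint8_t> shadow_for_region(std::size_t size)
{
    assert(size > 0);
    std::vector<std::uint8_t> shadow(size / 8, 0x00);
    if (size % 8 != 0) {
        shadow.push_back(static_cast<std::uint8_t>(size % 8));
    }
    return shadow;
}

// Checks an access of `access_size` bytes at `offset` from the region start.
// Like the real check for small accesses, this only consults the shadow byte
// of the granule containing the first accessed byte.
bool access_is_valid(const std::vector<std::uint8_t>& shadow,
                     std::size_t offset, std::size_t access_size)
{
    assert(access_size >= 1 && access_size <= 8);
    const std::size_t index = offset / 8;
    if (index >= shadow.size()) {
        return false; // past the region: a redzone ('fa') in the real tool
    }
    const std::uint8_t k = shadow[index];
    if (k == 0) {
        return true;  // whole granule addressable
    }
    return (offset % 8) + access_size - 1 < k;
}
```

With this model, shadow_for_region(20) gives 00 00 04, matching the dump. A 4-byte read at offset 16 (the fifth int) passes, while a 4-byte read at offset 20, which is what dereferencing end() does, fails the check.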

With all this information, we can easily tell what happened: we accessed memory just after a buffer, the call stack and line told us where, then where the memory was allocated, then showed us what the memory was being tracked as. Putting it all together: we read past the end of the vector. And on the guilty line, there we are, dereferencing end().

The Address Sanitizer will catch these kinds of issues for you. It’s a brilliant tool and highly useful.

Undefined Behaviour Sanitizer

There are lots of things in C++ that are undefined behaviour, where the compiler can do anything it wants because it is not valid code. Often, in order to enable optimizations, a compiler will even insert ‘traps’ (crashes!) for undefined behaviour in order to make a guarantee that a certain state exists, so it can optimize. I wrote a blog on a specific situation where a trap is emitted here. That may seem alarming but even if it didn’t, you have no guarantee what undefined behaviour is actually going to do in your code. I blogged about compiler warnings to detect these ahead of time here.

But the real acid test is when your code runs. How do you actually catch live undefined behaviour in your code?

With the undefined behaviour sanitizer.

Some common examples of undefined behaviour are dereferencing null pointers (reading or writing should trigger Asan, but the act of dereferencing to begin with is undefined behaviour and will trigger UBSan), accessing unaligned memory (eg if your pointer math is wrong), or incorrectly casting, such as:

struct Base {
    virtual ~Base() = default;
};

struct Derived : Base {
    void hello() { std::cout << "Hello!\n"; }
};

void __fastcall TForm2::FormCreate(TObject *Sender)
{
    Base b;
    Derived* d = dynamic_cast<Derived*>(&b); // UB: not actually a Derived
    d->hello();
}

(Here, the intent is to show a Base being treated as a Derived when it is not one. Strictly speaking, a failed dynamic_cast on a pointer is itself well defined and yields a null pointer; the undefined behaviour happens when hello() is called through that pointer, because there is no Derived object there to call it on. The call also makes sure the cast is not optimized out.)
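For contrast, here is a minimal sketch of the same cast written so that no undefined behaviour occurs: the result of dynamic_cast is checked before use. The helper greet_if_derived is hypothetical, and hello() returns a string here (rather than printing) only so the result is easy to inspect.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
};

struct Derived : Base {
    std::string hello() const { return "Hello!"; }
};

// Hypothetical helper: calls hello() only if `b` really is a Derived.
std::string greet_if_derived(Base& b)
{
    // dynamic_cast on a pointer yields nullptr when the object is not a
    // Derived, so the call is guarded by the null check.
    if (auto* d = dynamic_cast<Derived*>(&b)) {
        return d->hello();
    }
    return "not a Derived";
}
```

Passing a plain Base takes the fallback branch, and passing a Derived reaches hello(), so UBSan should have nothing to report for either call.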

This one can be run from inside the IDE. Keep an eye on the Events window, because it doesn’t raise exceptions, but logs what happens. You have to watch it to see it. Make a habit of keeping the Events window open in your saved desktop layout, even in the Default layout, so that during or after running an app it will be onscreen and you’ll see it scroll as something is logged.

You’ll see:

This has less information, but there are other messages around it which I’m not quoting for brevity. Here, UBSan is flagging a potential error for the vtable, which is because we have not cast correctly.

When we call the method itself, we get further undefined behaviour, and that’s flagged too — you can see there are further messages below this one.

Very useful!

Using Both

We recommend enabling only one of the new sanitizers at a time. Upstream Clang documents that some sanitizers cannot be combined with each other at all. Some online material says you can turn Asan and UBSan on at the same time; we have not tested this scenario in C++Builder.

When turning a sanitizer on or off, you should do a full rebuild.

On a related note, when using PDB debug info with the new toolchain, our MVPs have reported several other third-party tools working. It’s a great example of the sort of thing the new toolchain makes possible.

When to use a Sanitizer

For debugging, only.

These are runtime checks and so are built into your app. They can result in immediate app termination. They may affect performance. They may have unforeseen side effects. Run them in a non-production instance isolated from the net, because there are reports they can increase your attack surface. Do not ship ‘release’ builds, ie apps you want your users to use, with either of the sanitizers turned on.
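One way to enforce this at compile time is a small guard header. The following is only a sketch under assumptions: it relies on upstream Clang's __has_feature checks for sanitizers (we have not confirmed how the C++Builder toolchain reports them), and APP_RELEASE_BUILD is a hypothetical macro you would define yourself in your Release configuration.

```cpp
// Clang reports enabled sanitizers through __has_feature. Provide a fallback
// so this header still compiles on compilers without __has_feature.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(address_sanitizer) || __has_feature(undefined_behavior_sanitizer)
#define APP_SANITIZER_BUILD 1
#else
#define APP_SANITIZER_BUILD 0
#endif

// APP_RELEASE_BUILD is hypothetical: define it only in the configuration you
// ship. The build then fails if a sanitizer was left switched on.
#if APP_SANITIZER_BUILD && defined(APP_RELEASE_BUILD)
#error "A sanitizer is enabled in a release build; turn it off before shipping."
#endif

// Also available to ordinary code, for example to log it at startup.
constexpr bool kSanitizerBuild = (APP_SANITIZER_BUILD != 0);
```

Including a header like this from a commonly used unit means a misconfigured release build stops at compile time instead of reaching users.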

They are great for you to use to run your app and find errors.

With that out of the way – in a safe development/debug space, use them all the time. Turn them on in your debug builds, regularly. Run your unit tests or integration tests with them. Exercise new features with them. Do CI builds with them enabled.

They will catch problems, and you will increase your app’s robustness, safety, and quality.

About author

David Millington
