Secure Software doesn't develop itself.

The picture shows the top layer of the Linux kernel's API subsystems. Source: https://www.linux.org/attachments/kernel-jpeg.6497/

Author: René Pfeiffer

René Pfeiffer has a background in system administration, software development, physics, and teaching. He writes on this blog about all things code and code security.

Parallel Operations on Numerical Values

Everyone knows the vector container of C++’s Standard Template Library (STL). It is useful, versatile, and stores the data of all elements in a contiguous memory location. There is another, not widely known container named std::valarray for array data. It has been part of the STL for a long time (i.e. since way before C++11). The use case is to perform operations on all array elements in parallel. You can even multiply two valarray containers element by element without using loops or other special code. While it has no iterators, you can easily create a valarray container from a vector, perform calculations in parallel, and push the results into a vector again. The C++ reference has example code that shows how to do this. Creation from a vector requires access to the memory location of the vector’s data.

std::vector<double> values;
// Put some values into the vector here …
// Convert vector to valarray
std::valarray<double> val_values( values.data(), values.size() );

Now you can perform operations on all elements at once. Calculating cos() of all elements simply looks like this:

auto val_result = cos(val_values);

If you take the time to compare it to a loop over a vector where the function is called for every element, you will notice that valarray is much faster. The speedup depends on your compiler; GCC and Clang are quite fast. The apply() member function allows you to run arbitrary functions on every element. If you only need a subset of the elements, then you can create slices with the required values, as the sketch below shows.
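
Continuing the example above, here is a short sketch (my own, not taken from the C++ reference) of apply() and a slice:

#include <valarray>
#include <vector>

// apply() runs a callable on every element and returns a new valarray.
std::valarray<double> doubled = val_values.apply( [](double x) { return 2.0 * x; } );

// A slice selects every second element; operations then touch only that subset.
std::slice every_second( 0, val_values.size() / 2, 2 );
std::valarray<double> subset = val_values[every_second];
subset *= 10.0;

// Copy the results back into a vector for further processing.
std::vector<double> results( std::begin(doubled), std::end(doubled) );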

Static Tests and Code Coverage

The picture shows a warning sign indicating that a laser beam is operating in the area. Source: https://commons.wikimedia.org/wiki/File:Laser-symbol-text.svg

Testing software and measuring the code coverage is a critical ritual for most software development teams. The more code lines you cover, the better the results. Right? Well, yes and no. Testing is fine, but you should not get excited about maximising the code coverage. Measuring code coverage can turn into a game and a quest for the highest score. A bit of combinatorics shows how many code paths your tests would need to cover. Imagine that you have a piece of code containing 32 if()/else() statements. Testing all branch combinations means you will have to run through 2^32 = 4,294,967,296 different paths. Now add some loops, function calls, and additional if() statements (because 32 comparisons are quite low for a sufficiently big code base). This increases the number of paths considerably. Multiply the number by the time needed to complete a test run. This shows that tests are limited by physics and mathematics.

Static analysis is a standard tool which helps you detect bugs and problems in your code. Remember that all testing tries to determine the behaviour of your application. Mathematics has more bad news for you. Rice’s Theorem states that all non-trivial semantic properties of a program are undecidable. An undecidable problem is a decision problem that no algorithm can solve for all possible inputs. Rice published the theorem with a proof in 1951, and it is related to the halting problem. It implies that you cannot decide in general whether an application is correct. You also cannot decide whether the code executes without errors. The theorem sounds odd, because you can clearly run code and see if it shows any errors given a specific set of input data. This, however, is a special case. Rice’s theorem is a generalisation and applies to all possible input data. So your successful tests basically work with special cases that do not cause harm. Security testing checks for dangerous behaviour or signs of weaknesses. Increasing the variations of the input data can cover more cases, but Rice’s theorem still holds, no matter how much effort you put into your testing pipeline.

Let’s get back to the code coverage metric. Of course, you should test all of your code. The major goals for your code are to handle errors correctly, fail safely (i.e. without creating damage), and keep control of the code execution. You can achieve these goals with any code coverage per test above 0%. Don’t fall prey to gamification!

Mixing Secure Coding with Programming Lessons

The picture shows a fantasy battle where a witch attacks a wizard with spells. Source: https://wiki.alexissmolensk.com/index.php/File:Spellcasting.jpg

Learning about programming first and learning secure coding afterwards is a mistake. Even if you are new to a programming language or its concepts, you need to know what can go wrong. You need to know how to handle errors. You need to do some basic checks of received data, no matter what your toolchain looks like. This is part of the learning process. So instead of learning how to use code constructs or language features twice, take the shortcut and address security and the understanding of the concepts at once. An example is the methods of classes and their behaviour. If you think in instances, then you will have to deal with the occasional exception. No one would learn the methods first, ignore all error conditions, and then come back to learn about errors.

Another example is variables with numerical values. Numbers are notorious. Even the integer data types have stayed on the CWE Top 25 list since 2019. Integer overflow or underflow simply happens with the standard arithmetic operators. There is no fancy bug involved, just basic counting. You have to implement range checks. There is no way around this. Even Rust requires you to do extra bounds checks by using the checked_add() methods. Secure coding always means more code, not less. This starts with basic data types and operators (see the sketch below). You can add these logical pitfalls to exercises and examples. By using this approach, you can convey new techniques and show how the security mindset improves the code. There is also the possibility of switching between “normal” exercises and security lessons with a focus on how things go wrong. It’s not helpful to pretend that code won’t run into bugs or security weaknesses. Put examples of failure and how to deal with them right into your course from the start.
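
To illustrate, here is a minimal sketch of such a range check in C++, mimicking Rust’s checked_add() with std::optional (the function name is my own):

#include <cstdint>
#include <limits>
#include <optional>

// Returns the sum only if it fits into the type, std::nullopt otherwise.
std::optional<std::int32_t> checked_add( std::int32_t a, std::int32_t b )
{
    if ( b > 0 && a > std::numeric_limits<std::int32_t>::max() - b ) {
        return std::nullopt; // addition would overflow
    }
    if ( b < 0 && a < std::numeric_limits<std::int32_t>::min() - b ) {
        return std::nullopt; // addition would underflow
    }
    return a + b;
}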

If you don’t know where to start, then consult the secure coding guidelines and the top lists of well-known vulnerabilities, such as the SEI CERT secure coding standards and the CWE Top 25.

The Ghost of Legacy Code and its Relation to Security

The picture shows a spade and the wall of a pit dug into the earth. The wall shows the different layers created by sedimentation over time. Source: http://www.thesubversivearchaeologist.com/2014/11/back-to-basics-stratigraphy-101.html

The words legacy and old carry a negative meaning when used with code or software development. Marketing has ingrained in us the belief that everything new is good and everything old should be replaced, to ensure people spend money and time. Let me tell you that this is not the case, and that age is not always a suitable metric. Would you rather have your brain surgery done by a surgeon with 20+ years of experience or by a freshly graduated surgeon on his or her first day at the hospital?

So what is old code? In my dictionary, the label “not maintained anymore” is what defines legacy and old code. This is where the mainstream definition fails. You can have legacy code which is still maintained. There is a sound reason for using code like this: stability, and fewer errors than you would introduce by writing the code from scratch. Reimplementing code always means that you start from nothing. Basic computer science courses teach everyone to reuse code in order to avoid these situations. Basically, reusing code means that you allow code to age. Just don’t forget to maintain the parts of your application that work and experience few changes. This is the sane version of old code. There is another one.

An old codebase can also act as a showstopper for changes. If you made some poor design decisions in the past, then parts of your code will resist fresh development and features. Prototypes often exhibit this behaviour (a prototype usually never reaches the production phase unaltered). When you see this in your application, then it is time to think about refactoring. Refactoring has fewer restrictions if you can do it in your own code. Once components or your platform are part of the legacy code, then you are in for a major upgrade. Operating systems and run-time environments can push changes to your application by requiring a refactoring. Certifications can do the same. Certain environments only allow certified components. Your configuration becomes frozen once the applications or the run-time get the certification. All changes may require a re-certification. Voilà, here is your stasis, and your code ages.

Legacy code is not a burden per se. It all depends on whether the code is still subject to maintenance, patches, and security checks. Besides, older code that is still maintained usually has fewer bugs than code written from scratch.

Code, Development, Agile, and the Waterfall – Dynamics

The picture shows the waterfalls of Gullfoss under the snow in Iceland. Source: https://commons.wikimedia.org/wiki/File:Iceland_-_2017-02-22_-_Gullfoss_-_3684.jpg

Code requires a process to create it. The collection of processes, tasks, requirements, and checks is called software development. The big question is how to do it right. Frankly, a universal answer to this question does not exist. First, not all code is equal. A web server, a filesystem, a database, and a kernel module for network communication contain distinct code, with only a few functions that can be shared. When it comes to adding secure coding practices, some attendees of my courses question the use of checklists and the cleaning of suspicious data. Security looks old-fashioned, because you have to think about risks, how to address them, and how to improve the sections of your code that connect to the outside world. People like the term agile, where small teams bathe in outbursts of creativity and sprint to implement requested features. You can achieve anything you set your mind to. Tear down code, write it anew, deliver the features. This is not how secure coding works, and this is not how your software development process should look (regardless of what type of paradigm you follow).

It is easy to drift into a rant about the agile manifesto. Condensing the entire development process into 68 words, all done during three days of skiing in Colorado, is bound to create very general statements whose implementations differ wildly. This is not the point I want to make. You can shorten secure coding to 10 to 13 principles. The SEI CERT secure coding documents feature a list with the top 10 methods. It’s still incomplete, and you still have to actually integrate security into your writing-code process. So you can interpret secure coding as a manifesto, too. Neglecting the implementation has advantages. You can use secure coding with all existing and future programming languages. You can use it on all platforms, current and yet to be invented. The principles are always true. Secure coding is a model that you can use to improve how your team creates, tests, and deploys code. This also means that adopting a security stance requires you to alter your toolbox. All of us have a favourite development environment. This is the first place where you can get started with secure coding. It’s not all about having the right plugins, but it is important to see what code does while it is being developed.

The title features the words agile and waterfall. Please do yourself a favour and stop thinking about buzzwords. It doesn’t matter how your development process produces code. It matters that the code has next to no security vulnerabilities, shows no undefined behaviour, and cannot be abused by third parties. Secure code is possible with any development process, provided you follow the principles. Use the principles’ freedoms to your advantage and integrate what works best.

CrowdStrike and how not to write OS Drivers

The image shows a screenshot of a null pointer exception in Microsoft Windows. Source: Zach Vorhies

Yesterday the CrowdStrike update disabled thousands of servers and clients all across the world. The affected systems crashed when booting. A first analysis by Zach Vorhies (careful, the link goes to the right-wing social media network X) has some not very surprising news about the cause of the problem. Apparently, the system driver from CrowdStrike hit a null pointer access violation. Of course, people immediately started bashing C++, but this is too shallow. There are different layers where code is executed. The user space is usually a safe ground where you can use standard techniques of catching errors. Even a crash might be safer for user space applications than continuing and doing more harm. Once your code runs as a system driver, then you are part of the operating system and have to observe a couple of constraints. OS code can’t just exit or crash (even exceptions without the corresponding catch{} code count as a crash). So having a null pointer situation in mission-critical code is something which should never happen. This should have been caught in the testing phase. Furthermore, modern C++ has no use for raw null pointers. You must use smart pointers, and by doing that, you don’t need to handle null pointers. There is nothing more to it.

You cannot ignore certain error conditions when running within the operating system. Memory allocation, I/O errors, and everything concerning memory operations are critical. There must be checks in place, and there is no excuse for omitting these checks. A minimal sketch of this defensive style is shown below.
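
As a sketch of what such checks can look like in C++ (the function and types here are hypothetical, not CrowdStrike’s code):

#include <cstddef>
#include <memory>

struct ParsedUpdate { std::size_t entries = 0; };

// Validate the input before touching it; never trust pointer or size.
std::unique_ptr<ParsedUpdate> parse_update( const unsigned char *data, std::size_t size )
{
    if ( data == nullptr || size < 8 ) {
        return nullptr; // reject malformed input instead of dereferencing blindly
    }
    auto update = std::make_unique<ParsedUpdate>();
    // ... fill the structure, checking every offset against size ...
    return update;
}

The caller has to test the returned pointer, which keeps the failure path explicit instead of crashing inside the operating system.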

Finding 0-Days with Large Language Models exclusive-or Fuzzing

The picture shows all the different Python environments installed on a system. The graphical overview is very confusing. Source: https://xkcd.com/1987/

If all you have is a Large Language Model (LLM), then you will apply it to all of your problems. People are now trying to find 0-days with the might of LLMs. While it is no surprise that this works, there is a better way of pushing your code to the limit. Just use random data! Someone coined the term fuzzing in 1988. People have been using defective punch cards as input for even longer. With input filtering of data, you want to eliminate as much bias as possible. This is exactly why people create the input data from random data. Human testers think too much, think too little, or are too constrained. (Pseudo-)random number generators rarely have a bias. LLMs do. This means that the publication about finding 0-days by using LLMs should not be good news. Just like human Markov chains, LLMs only “look” in a specific direction when creating input data. The model is the slave of its vectors and training data. The process might use the source code as an “inspiration”, but so does a compiler with a fuzzing engine. Understanding that LLMs do not possess any cognitive capabilities is the key point here. You cannot ask an LLM what it thinks of the code in combination with certain input data. You are basically using a fancy data generator that uses more energy and is too complex for the task at hand.

Comparing LLMs with fuzzing engines does not work well. Each approach serves its own original purpose. Always remember that the input data in security tests should push your filters to the limit and create situations that you did not expect. Randomness will do this much more efficiently than a more complex algorithm. If you are fond of complexity or have too much powerful hardware at hand, there are other things you can do with it. A typical fuzzing setup is small, as the sketch below shows.
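
Here is a minimal libFuzzer-style harness; parse_token() is a stand-in for the real code under test:

#include <cstddef>
#include <cstdint>

// Stand-in for the real parser under test.
static bool parse_token( const std::uint8_t *data, std::size_t size )
{
    return size > 0 && data[0] == 'T'; // placeholder logic
}

// libFuzzer entry point: the engine calls this with mutated random data.
extern "C" int LLVMFuzzerTestOneInput( const std::uint8_t *data, std::size_t size )
{
    parse_token( data, size ); // crashes and sanitizer findings are the results
    return 0;
}

Build it with clang++ -fsanitize=fuzzer,address and let the engine generate the input data for you.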

URL Validation, Unit Tests and the Lost Constructor

I have some code that requests URLs, looks for session identifiers or tokens, extracts them, and calculates some indicators of randomness. The tool works, but I decided to add some unit tests in order to play with the Catch2 framework. Unit tests require some easy-to-check conditions, so validating HTTP/HTTPS URLs sounded like a good idea to get started. The code uses the curl library for requests, so checking URLs can be done by libcurl or before feeding the URL string to it. Therefore I added some regular expressions. RFC 3986 has a very good description of Uniform Resource Identifiers (URIs). The full regular expression is quite large and matches too many variations of URI strings. You can inspect it on the regex101 web site. Shortening the regex to match URLs beginning with “http” or “https” requires you to define what you want to match. Should there be only domain names? Are IP addresses allowed? If so, what about IPv4 and IPv6? Experimenting with the filter variations took a bit of time. The problem was that no regex was matching the pattern. Even patterns that worked fine in other programming languages did not work in the unit test code. The error was hidden in a constructor.
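
To give an idea of such a shortened pattern, here is a sketch with std::regex (not the exact regex from my code); it deliberately accepts only domain names or dotted IPv4 hosts with an optional port and path:

#include <regex>
#include <string>

// Narrow on purpose: scheme, host, optional port, optional path.
bool is_http_url( const std::string &url )
{
    static const std::regex pattern(
        R"(^https?://[A-Za-z0-9.-]+(:\d+)?(/\S*)?$)",
        std::regex::icase );
    return std::regex_match( url, pattern );
}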

Class definitions in C++ often have multiple variations of constructors. The web interface code can create a default instance where you set the target URL later by using setters. You can also create instances with parameters such as the target or the number of requests. The initialisation code sits in one member function which also initialises the libcurl data structures. So the constructors look like this:

http::http() {
    // BUG: init_data_structures() is never called here, see below.
}

http::http( unsigned int nreq ) {
    init_data_structures();
    set_max_requests( nreq );
    return;
}

The function init_data_structures() sets a flag that tells the instance whether the libcurl subsystem is working or not. The first constructor does not call the function, so the flag is always false. The missing function call is easy to miss. The reason why the line was missing is that the code started with only a default constructor. The other constructors were added later, and the default constructor was never exercised, because the tool’s code never creates instances without a URL. This brings me back to the unit tests. The Catch2 framework does not need a full program with a main() function. You can directly create instances in your test code snippets and use them (see the sketch below). That’s why the error got noticed. Unit tests are not security tests. The missing initialisation call is most probably not a security weakness, because the code does not run with the web request subsystem flag set to false. It’s still a good way to catch omissions or logic errors. So please write lots of unit tests, all of the time.
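
A sketch of such a Catch2 test; is_initialised() is a hypothetical accessor for the subsystem flag, and the header name is assumed:

#include <catch2/catch_test_macros.hpp>
#include "http.h" // the class from above (header name assumed)

TEST_CASE( "default constructor initialises the libcurl subsystem", "[http]" )
{
    http client; // no main() needed, the framework supplies it
    REQUIRE( client.is_initialised() ); // fails with the empty default constructor
}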

Floating Point Data Types and Computations

The picture shows how real numbers fit into the IEEE 754 floating point data type representation. Source: https://en.wikibooks.org/wiki/Introduction_to_Numerical_Methods/Rounding_Off_Errors

Floating point data types are available in most programming languages. C++ knows about the float, double, and long double data types. Other programming languages feature longer (256 bit) and shorter (16 bit and lower) representations. These data types are specified in the IEEE Standard for Floating-Point Arithmetic (IEEE 754), which is the reference for all implementations. Hardware also supports the storage format and the operations. Floating point data types are usually used in numerical calculations. Since the use case is to represent real numbers, accuracy is a problem. Mathematically, there are infinitely many other real numbers between any two arbitrarily chosen real numbers. Computers are notoriously bad at storing an infinite amount of data. For the purposes of programming, this means that all code using any floating point data type has to deal with error conditions and decide how to handle them. Obvious errors include division by zero. Less obvious conditions are rounding errors, special numbers (infinity, not a number, signed zeroes, subnormal numbers), and overflows.

Not all of these error conditions may pose a threat to your applications. It depends on what type of numerical calculations your code does or consumes. Comparisons have to be implemented in a thoughtful way. Tests for equality may fail because of rounding errors. Using the “real zero” can backfire. The C and C++ standard libraries supply you with a list of constants. Among them is the difference between 1.0 and the next representable value, called the epsilon value. Epsilon (ε) is often used to denote very small values. cfloat or float.h defines FLT_EPSILON (for float), DBL_EPSILON (for double), and LDBL_EPSILON (for long double). Using this value as the smallest difference possible is usually a good idea. There is another method for finding neighbouring floating point numbers. C++11 introduced functions to find the next neighbouring value. The accuracy is determined by the unit of least precision (ULP). ULPs are defined by the value of the least significant bit. Using ULPs and using epsilon values are different approaches. ULP checking requires reinterpreting the values as integers. Both methods work well away from zero. If you are near the zero value, then consider using multiples of the epsilon value as the comparison threshold.
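
A sketch of such a comparison (the factor of 4 and the cut-off at 1.0 are arbitrary choices for the example):

#include <cfloat>
#include <cmath>

// Relative comparison away from zero, absolute epsilon multiples near zero.
bool almost_equal( double a, double b )
{
    const double diff = std::fabs( a - b );
    const double largest = std::fmax( std::fabs( a ), std::fabs( b ) );
    if ( largest < 1.0 ) {
        return diff <= 4 * DBL_EPSILON; // near zero: absolute check
    }
    return diff <= largest * 4 * DBL_EPSILON; // scaled relative check
}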

There is another overlooked fact. The float data type has 32 bits of storage. This means there are only 4,294,967,296 different bit combinations, which is not a lot. Looping through all values and stress testing a numerical function can be done in minutes. There is a blog post covering this technique complete with example code.
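
A minimal sketch of such an exhaustive loop, using std::bit_cast from C++20 (std::sqrt stands in for the function under test):

#include <bit>
#include <cmath>
#include <cstdint>
#include <iostream>

int main()
{
    // Walk through every possible 32 bit pattern and feed it to the function.
    for ( std::uint64_t bits = 0; bits <= 0xFFFFFFFFULL; ++bits ) {
        const float x = std::bit_cast<float>( static_cast<std::uint32_t>( bits ) );
        volatile float y = std::sqrt( x ); // volatile keeps the call alive
        (void)y;
    }
    std::cout << "Tested all 4,294,967,296 float values\n";
    return 0;
}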

I have compiled some useful resources for this topic.

Linking against the Threading Building Blocks (oneTBB) library with g++ and clang++

A couple of weeks ago, I created a repository of example code for C and C++ courses. The examples include a source file that uses the Threading Building Blocks (oneTBB) library. Since the examples are rather small, I included no Makefile. Instead, I wrote a Bash script using clang++ and a Ninja build file using g++. Strangely, the clang++ build worked, but the g++ version complained about symbols in the linking phase. The linking errors look horrible. Here is the top of the long list of errors (added here for search engines to find the error):

/usr/bin/ld: /tmp/cc4i6mNS.o: in function `tbb::detail::d1::wait_context::add_reference(long)':
parallel_algorithms.cpp:(.text._ZN3tbb6detail2d112wait_context13add_referenceEl[_ZN3tbb6detail2d112wait_context13add_referenceEl]+0x6c): undefined reference to `tbb::detail::r1::notify_waiters(unsigned long)'
/usr/bin/ld: /tmp/cc4i6mNS.o: in function `tbb::detail::d1::execution_slot(tbb::detail::d1::execution_data const&)':
parallel_algorithms.cpp:(.text._ZN3tbb6detail2d114execution_slotERKNS1_14execution_dataE[_ZN3tbb6detail2d114execution_slotERKNS1_14execution_dataE]+0x14): undefined reference to `tbb::detail::r1::execution_slot(tbb::detail::d1::execution_data const*)'
/usr/bin/ld: /tmp/cc4i6mNS.o: in function `tbb::detail::d1::spawn(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&)':
parallel_algorithms.cpp:(.text._ZN3tbb6detail2d15spawnERNS1_4taskERNS1_18task_group_contextE[_ZN3tbb6detail2d15spawnERNS1_4taskERNS1_18task_group_contextE]+0x30): undefined reference to `tbb::detail::r1::spawn(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&)'
/usr/bin/ld: /tmp/cc4i6mNS.o: in function `tbb::detail::d1::execute_and_wait(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&)':
parallel_algorithms.cpp:(.text._ZN3tbb6detail2d116execute_and_waitERNS1_4taskERNS1_18task_group_contextERNS1_12wait_contextES5_[_ZN3tbb6detail2d116execute_and_waitERNS1_4taskERNS1_18task_group_contextERNS1_12wait_contextES5_]+0x2c): undefined reference to `tbb::detail::r1::execute_and_wait(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&)'…

There was no obvious difference between the compiler calls.

g++ -Wall -Werror -Wpedantic -std=c++20 -ltbb -g -O0 -o parallel_algorithms parallel_algorithms.cpp
clang++ -Wall -Werror -Wpedantic -std=c++20 -ltbb -march=native -o parallel_algorithms parallel_algorithms.cpp

After browsing through a lot of search results that don’t explain the problem, I tried to put the -ltbb at the end of the command. If you do this, then everything works fine with g++:

g++ -Wall -Werror -Wpedantic -std=c++20 -g -O0 -o parallel_algorithms parallel_algorithms.cpp -ltbb

🥳 oneTBB doesn’t require much, but its link option has to come after the source or object files on the command line. GNU ld resolves symbols in command-line order and only pulls in library symbols that are still unresolved at that point, so a library listed before the code that uses it can be skipped. Apparently clang++ does something different when resolving symbols. Good to know.

