Secure Software doesn't develop itself.

The picture shows the top layer of the Linux kernel's API subsystems. Source: https://www.linux.org/attachments/kernel-jpeg.6497/

Author: René Pfeiffer

René Pfeiffer has a background in system administration, software development, physics, and teaching. He writes on this blog about all things code and code security.

Go, Go Carbon, Go++, Carbon++, C++, or Go Rust?

I break my rule of not writing titles with questions marks. The exemption is because of the new programming language, Carbon. I saw the announcement a couple of days ago. The article mentioned that Google engineers are working on a new programming language called Carbon. The author of the article added the tagline „A Hopeful Successor To C++“. The immediate question to the endeavor of creating a new programming language was: Why? There are a lot of programming languages and dialects around. All the languages have their own background. It is easy to bash a specific programming language, but if you take the history of creation into account, then it gets easier to understand why specific design choices were made. It is easy to recommend different choices in retrospect. Given the existence of Go and the periodic C++ standard updates, I wonder what the design goals of Carbon are. Luckily, the article mentioned them:

  • C++ performance
  • Seamless bidirectional interoperability with C++
  • Easier learning curve for C++ developers
  • Comparable expressivity
  • Scalable migration

If you read the actual Carbon language description from the repository, then there are some additional goals:

  • Safer fundamentals, and an incremental path towards a memory-safe subset
  • Language evolution (Carbon wants to be C++’s TypeScript/Kotlin)

This doesn’t look too bad. Clearly safer fundamentals and memory-safe features are a good idea. After checking some code examples, the syntax of Carbon looks a lot like Rust. For my taste, the easier learning curve is in the beholder’s eye. Personally, I dislike Rust’s syntax. Carbon adds some grammar and differences in special characters that will probably hinder anyone with C++ experience (well, at least myself). The interoperability claims that you can use your tried build system for your projects. The Carbon project clearly states that it is presenting a prototype for exploring the desired language features. Apparently, the designers want to use C++ as the foundation and add their own vision.

C++ has evolved a lot in the past decade. The new C++ standards have implemented a lot of the missing features that had to be supported by third-party libraries. Because a programming language needs time to create a stable version. Carbon will have to catch up with C++’s head start. Judging from the project vision, it seems to be yet another use case for LLVM. Let’s see if we find out the real reason why Google wants to replace C++.

Using C++ Threads or OpenMP for parallel Processing

Having easy access to parallel processing is a pleasant feature in programming languages. The thread syscalls of operating systems have notoriously been difficult to access, especially in C. The Open Multi-Processing (OpenMP) library started in 1997 to make things easier. It helps to mark sections of parallel code and loop that can be parallelized. It works well for C, C++, and FORTRAN code. It is easy to implement. Plus, your code can be compiled with or without the OpenMP library present. The downside is that your code requires OpenMP on the target. I recently had a case where C++ code needs to be installed on different platforms (i.e. systems with different major version level). OpenMP is tied to the C/C++ standard library and the compiler. The code is compiled by Clang, so in this particular case you need different OpenMP libraries. In order to reduce the dependency on OpenMP, the code was refactored to use C++ threads.

C++11 threads are easy to use. When switching from OpenMP, you only have to convert your #pragma statements to function calls. When using member functions, you have to code around a peculiarity of std::async and std::thread. Member functions of dynamically allocated objects cannot be called directly. If you try to do this, then you will get an error message. Consider the following object:

class hash_list {
private:
kyotocabinet::HashDB HDB;
kyotocabinet::HashDB::Cursor *pos;

public:
bool walkthrough( string directory ) {

}
The class is used to access different Tokyo Cabinet databases. The function walkthrough() does heavy I/O work and updates the database file. Calling the function directly with std::async will not work. I fought with many compiler errors and was tempted to convert the member function to static, but this would have required a full rewrite of the class. Static member variables and function change access to encapsuled data. Instead, you will need a wrapper function to call the members.

#ifndef USE_OPENMP
bool wrap_walkthrough( hash_list *h, string d ) {
return( h->walkthrough(d) );
}
#endif

The function wrap_walkthrough() works fine, and it can be called with different dynamically allocated objects. The section calling the functions looks like this:

#ifndef USE_OPENMP
future<bool> f_rc_path = std::async( std::launch::async, wrap_walkthrough, path_orig, opt_path );
future<bool> f_rc_prfx = std::async( std::launch::async, wrap_walkthrough, path_prefix, prefix_path );
const bool rc_path  = f_rc_path.get();
const bool rc_prfx  = f_rc_prfx.get();
if ( ! rc_path ) {
cerr << "Walkthrough for " << opt_path << " failed!" << endl;
rc += 23;
}
if ( ! rc_prfx ) {
cerr << "Walkthrough for " << prefix_path << " failed!" << endl;
rc += 23;
}
#endif

Remember to write wrapper functions when you encounter the error message “reference to non-static member function must be called”.

Anatomy of a Buffer Overflow in Python 3.x

The bug tracking system of Python was notified of a buffer overflow in Python. Affected versions were 3.10, 3.9, 3.8, 3.7, and 3.6. The code in question is part of the PyCArg_repr() function. This function is called when Python has to evaluate parameters from the ctypes class (i.e. wehn you are using C type variables in your Python code). The overflow can be triggered by using extreme values and letting Python expand the content into a buffer:
case 'd':
sprintf(buffer, "<cparam '%c' (%f)>",
self->tag, self->value.d);
break;
The %f place-holder is interesting. Using 1.79769e+308 (maximum for double data type) or 1.18973e+4932 (maximum for long double data type) will trigger a buffer overflow. This can be detected by the runtime and lead to an error message aborting the interpreter. In any case it is a good example to always validate input data before processing it. Some applications use components from different programming languages. Whenever data is handed around between functions implemented in different run-time environments, then you have to be extra careful about the data types. Sometimes implicit conversions occur. If conversions between numerical data and strings are performed, then always check the limits on both ends.

You can do the bounds checks even with arbitrary-precision arithmetic (also called bignum, multiple-precision, or infinite-precision arithmetic). Conversions can be done in confined common data types with less precision. This means to cut off the values and lose precision. Arbitrary-precision arithmetic often can export the data to string representations or other serialisation formats. This means that you have to estimate the size of the result. Java™ offers the BigDecimal and BigInteger classes. The buffer estimate looks like this:
byte[] storedUnscaledBytes = bigDecimal.unscaledValue().toByteArray();
int storedScale = bigDecimal.scale();
This gives you the exported values and its size. The latter needs to be used when using the export with functions using size-limited buffers. Both object size and object needs to be processed together and must not be separated in any further processing step. Check your code for conversions near APIs to external libraries or other components. There might be potential for overflows or conversions errors.

Implementing basic Tests during Software Development

The recent GnuPG bugs have sparked a discussion about standard tests during software development. The case was a buffer in the code which could be overwritten by a decryption operation. Overflow bugs can be easily avoided by defensive programming, but also by standard tests during the development phase. Modern compilers have features to test for stack/heap overflows, memory leaks, undefined behaviour, and many more cases you don’t want in your code. Clang offers the Clang Static Analyzer tools. GCC 10 offers similar features in the form of its static code analysis options as well. Valgrind celebrated its 20th anniversary last year, so there are no excuses. Since every project written in C/C++ always needs a set of build options for anyway, why not add some scripts or configurations to your tests?

First of all, adding something to your tests requires that you already do systematic testing of your code. Most projects have a collection of regression tests to make sure the code behaves as expected after changes. Additionally there might be test cases stemming from the bug reports to check for errors which should have been fixed and should never return. Furthermore, some projects have stress tests, load tests, and even fuzzing tests which can be easily activated. All of this requires a testing platform and processes to define, develop, test, and deploy tests. Not having a test infrastructure is no excuse for not testing code. This is especially true for code bases like libgcrypt or other widely used libraries/tools. The lack of continuous integration (CI) pipelines is also no excuse. Ideally tests are automated, but they don’t have to be. They need to just be run with few changes to the code and the build instructions. Often code has debug flags or other parameters which influences the run-time behaviour or generated code. That’s the way to start. Once your configuration (and maybe scripts) are in place, then you can go forth and automate everything else.

Collecting test cases should be your first step. Harvest the bug tracker and the change history. Try to extract cases and data that triggered a bug. Build a library of tests, then start extending your build system by a test mode that utilises this library and performs the tests. Don’t forget the benefits of your toolchain! Use the static analyzers when testing or running code. You can even do what you usually do with your code before shipping, just make sure the analysers are in place (i.e. compiled in or supervising the code execution). Using different compiles in the pre-shipping phase is a very good idea, too. All in all this should not add an enormous time to your development cycle. You have to test your code any way, why not let computers do this?

Using the built-in features of your compiler (or your favourite run-time framework) in order to detect bugs is a basic task for developers. Don’t wait until security researchers or penetration testers will do this for you. And if they do, please don’t treat bug reports as yet another rant. If it is a real bug report, then you should fix it and blame your code. The alternative is not to accept bug reports, but doing this doesn’t help anyone.

Everyone has an Opinion – so do we

This is a new blog. Its purpose is to serve as a companion to the secure coding / secure design curriculum I am developing for years. The we in the title are partners that help to educate software developers about how to protect their code against hostile environments and malicious attackers. The blog itself is embedded into the wiki that holds a collection of secure coding/design patterns, taxonomy, and examples. So much for the background.

The blog carries opinion it its title. The reason is simple. While we have a lot of information security standards, the information technology itself is driven by opinions. Agile software development, the use of containers versus processes versus virtual machines, the programming language of the day, the/your/our operating system of choice, code platforms, frameworks, and many more aspects of the modern digital tools available to software developers is based on opinion. This doesn’t mean that there is no right answer. The problem is just that there are many of them. What works for your organisation, your team, your project is highly dependant on the context you are working in. Furthermore it depends on how your customers use your application. One size fits all might work for hats, it doesn’t work for most other things. This is why opinion is presented in this blog. All articles will have the intent to connect to the wonderful world of software development, secure coding, and secure design.

If you want to engage, then you can leave comments. However the time window for adding comments will close after a few weeks. This is purely out of administrative reasons, because neither me nor my partners have the time to watch the comments and manually approve them. Be quick, or write an email!

Enjoy the articles!

Page 3 of 3

Powered by WordPress & Theme by Anders Norén