May 2023 – Secure Design

Continuous Integration (CI) is a standard in software development. A lot of companies use it for their development process. It basically means using automation tools to test new code more frequently. Instead of continuous, you can also use the word automated, because CI can’t work manually. Modern build systems comprise scripts and descriptive configurations that invoke components of the toolchain in order to produce executable code. Applications build with different programming languages can invoke a lot of tools with individual configurations. The build system is also a part of the code development process. What does this mean for CI in terms of secure coding?

First, if you use CI methods in your development cycle, then make sure you understand the build system. When working with external consultants that audit your code, the review must be possible without the CI pipeline. In theory, this is always the case, but I have seen code collections that cannot be built easily, because of the many configuration parameters hidden in hundreds of directories. Some configuration is old and use environment variables to control how the toolchain has to translate the source. Especially cross-platform code is difficult to analyse because of the mixture of tools. Often it is only possible to study the source. This is a problem, because a code review also needs to rebuild the code with changing parameters (for example, changing compiler flags, replacing compilers, adding analyzers, etc.). If the build process doesn’t allow this, then you have a problem. This makes switching to different tools impossible, which is also necessary when you need to test new versions of your programming language or need to migrate old parts of your code to a newer standard.

Furthermore, if your code cannot be built outside your CI pipeline, then reviews are basically impossible. Just reading the source means that a lot of testing cannot be done. Ensure that your build systems do not grow into a complex creation no one wants to touch any more. The rules of secure and clean coding also apply to your CI pipeline. Create individual packages. Divide the build into modules, so that you can assemble the final application from independent building blocks. Also, refactor your build configuration. Make is simpler and remove all workarounds. Once the review gets stuck and auditors have to read your code like the newspaper, it is too late.

The trend of large language models (LLMs) continues. Many people are doing experiments and explore how these algorithms can help them when developing software. Most integrated development environments have features that help you while writing code. Access to documentation, function call parameters, static checks, and suggestions are standard tools to help you. LLMs are the new kid on the block. Some articles describe how questions (or prompts) to chat engines were used to create code samples. The output depends a lot on the prompt. Changing words or rephrasing the prompt can lead to different results. This differs from the way other tools work. Getting useful results means to play with the prompt and engage in trial-and-error cycles. Algorithms such as ChatGPT are not sentient. They cannot think. The algorithm just remixes and repeats part of its training data. Asking for code examples is probably most useful for getting templates or single functions. This use case is disappointingly close to browsing tutorials or Stackoverflow.

Designing prompts is a new skill artificially created by LLMs algorithms. This is another problem, because you now need to collect prompts for creating the most useful code. The work shifts to another domain, but you don’t actually save time unless you have a compendium of prompts. Creating useful and well-tested templates is a better use of resources. The correct of of patterns governs code creating with or without LLMs.

The security questions have not been addressed yet. There are studies that analyse how the code quality of tool- and human-generated code looks like. According to the Open Source Security and Risk Analysis (OSSRA) report from 2022, the code created by assistant features contained vulnerabilities 40% of the time. An examination of code created by Github’s Copilot shows that autogenerated code contains bugs that belong to specific software weaknesses. The code created by humans has a distinct pattern of weaknesses. A more detailed analysis can only be done by larger statistical samples, because Copilot’s training data is proprietary and not accessible. There is room for more research, but it is safe to say that LLMs also make mistakes. Output from these algorithms must be included in the quality assurance cycle, with no exceptions. Code generators cannot work magic.

If you are interested in using LLMs for code creation, make sure that you understand the implications. Developing safe and useful templates is a better way than to engineer prompts for the best code output. Furthermore, the output can change whenever the LLM changes its version or training data. Using algorithms to create code is not a novel approach. Your toolchains most probably already contain code generators. In either case, you absolutely have to understand how they work. This is not something an AI algorithm can replace. The best approach is to study the code suggested by the algorithm, transfer it into pseudo-code, and write the actual code yourself. Avoid cut & paste, otherwise your introduce more bugs to fix later.

Secure Software doesn't develop itself.

Month: May 2023

Continuous Integration is no excuse for a complex Build System

Using AI Language Models for Code Creation