Secure Software doesn't develop itself.

The picture shows the top layer of the Linux kernel's API subsystems. Source: https://www.linux.org/attachments/kernel-jpeg.6497/

Tag: Hype

Code, Development, Agile, and the Waterfall – Dynamics

The picture shows the waterfalls of Gullfoss under the snow in Iceland. Source: https://commons.wikimedia.org/wiki/File:Iceland_-_2017-02-22_-_Gullfoss_-_3684.jpgCode requires a process to create it. The collection of processes, tasks, requirements, and checks is called software development. The big question is how to do it right. Frankly, the answer to this question does not exist. First, not all code is equal. A web server, a filesystem, a database, and a kernel module for network communication contain distinct code, with only a few functions that can be shared. For adding secure coding practices, some attendees of my courses question the application of checklists and cleaning of suspicious data. Security is old-fashioned, because you have to think of risks, how to address them, and how to improve sections of your code that connect to the outside world. People like to term agile where small teams bathe in outbursts of creativity and sprint to implementing requested features. You can achieve anything you set your mind to. Tear down code, write it new, deliver the features. This is not how secure coding works, and this is not how your software development process should look like (regardless what type of paradigm you follow).

It is easy to drift into a rant about the agile manifesto. Condensing the entire development process into 68 words, all done during three days of skiing in Colorado, is bound to create very general statements whose implementation wildly differs. This is not the point I want to make. You can shorten secure coding to 10 to 13 principles. The SEI CERT secure coding documents feature a list with the top 10 methods. It’s still incomplete, and you still have to actually integrate security into your writing-code-process. So you can interpret secure coding as a manifesto, too. Neglecting the implementation has advantages. You can use secure coding with all existing and future programming languages. You can use it on all platforms, also current and yet to be invented. The principles are always true. Secure coding is a model that you can use to improve how your team creates, tests, and deploys code. This also means that adopting a security stance requires you to alter your toolbox. All of us have a favourite development environment. This is the first place where you can get started with secure coding. It’s not all about having the right plugins, but it is important to see what code does while it is being developed.

The title features the words agile and waterfall. Please do yourself a favour and stop thinking about buzzwords. It doesn’t matter how your development process produces code. It matters that the code has next to none security vulnerabilities, shows no undefined behaviour and cannot be abused by third parties. Secure code is possible with any development process provided you follow the principles. Use the principle’s freedoms to your advantage and integrate what works best.

Please stop anthropomorphising Algorithms

The picture showes a box with lots of cables and technical-looking decorations. It symbolises a machine and was created by the Midjourney image generator.If you read the news, the articles are full of hype for the new Large Language Models (LLMs). People do amazing and stupid things by chatting with these algorithms. The problem is that LLMs are just algorithms. They have been „trained“ with large amounts of data, hence the first „L“. There is no personality or cognitive process involved. Keep the following properties in mind when using these algorithms.

  • LLMs stem from the field of artificial intelligence, which is a field of computer science.
  • LLMs perform no cognitive work. The algorithms do not think, they just remix, filter, and repeat. They can’t create anything new better than a random generator and a Markov chain of sufficiently high order.
  • LLMs have no opinion.
  • LLMs feature in-built hallucinations. This means the algorithms can lie. Any of the current (and possibly future models of the same kind) will generate nonsense at some point.
  • LLMs can be manipulated by conversation (by using variations of prompts).
  • LLMs are highly biased, because their training data lacks different perspectives (given their size and cultural background of the data, especially languages used for training).
  • LLMs can’t be debugged reliably. Most fixes to problems comprise input or output filters. Essentially, these filters are allow or block lists. This is the weakest form of using filters for protection.

So there are a lot of arguments for not treating LLMs as something with a opinion, personality, or intelligence. These models can mimick language. When training, they always learn language patterns. Yes, LLMs belong to the research field of artificial intelligence, but there is no thinking going on. The label „AI“ doesn’t describe what the algorithm is doing. This is not news. Emily Bender published an article titled „Look behind the curtain: Don’t be dazzled by claims of ‘artificial intelligence’“ in the Seattle Times in May 2022. Furthermore, there is a publication from Emily Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. Its title is On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?. The publication date is March 2021. They published both articles way before the „new“ chat algorithms and LLM-powered „search engines“ hit the marketing departments. You have been warned.

So, this article is not meant to discourage anyone from exploring LLMs. You just need to be careful. Remember that algorithms are seldom a silver bullet that can solve all your problems or make them go away. This is especially true if you want to use LLMs for your software development cycle.

Update: If you really want to explore the uses of synthetic text generation, please watch the presentation “ChatGP-why: When, if ever, is synthetic text safe, appropriate, and desirable?” by Emily Bender.

Powered by WordPress & Theme by Anders Norén