The term microservice architecture is often used to describe modern software development and state-of-the-art software architecture. But what is it? The answer is simple: there is no general definition of the term. An application built with “microservices” in mind has some properties that can be helpful for your codebase. The “services” part does not necessarily mean code that answers networked API calls. You can choose to couple your code loosely and implement the parts with independent operation in mind. Think of the classic module approach: modules can use the network to communicate, but any technique that relays messages between your modules will do. This is a reminder of the Unix philosophy of code. The first two rules are quoted from the Wikipedia article (and the Bell System Technical Journal from 1978):
- Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new “features”.
- Expect the output of every program to become the input to another, as yet unknown, program.
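The two rules above can be sketched in a few lines of Python. Each function does exactly one thing, and its return value becomes the input of the next, as yet unknown, stage; the function names and the sample data are invented for illustration.

```python
# Each function does one thing; its output feeds the next stage.

def read_lines(text):
    """Do one thing: split raw text into lines."""
    return text.splitlines()

def strip_comments(lines):
    """Do one thing: drop comment lines starting with '#'."""
    return [line for line in lines if not line.lstrip().startswith("#")]

def count_words(lines):
    """Do one thing: count words across all lines."""
    return sum(len(line.split()) for line in lines)

# Compose the stages like a Unix pipeline: read | strip | count.
raw = "# config file\nhost example.org\nport 8080\n"
total = count_words(strip_comments(read_lines(raw)))
print(total)  # 4
```

None of the stages knows which program produced its input or which will consume its output, which is exactly what the second rule demands.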
This is essentially the minimal version of thinking in microservices. The rest, such as scalability, flexibility, network APIs, containers, or cloud services, is a consequence of these two rules. Do not fall into the trap of linking microservices only to web applications. Even single-binary applications can follow this approach. To give you an example: I had the opportunity to inspect the code of a voice-over-IP telephony server system. In essence, the application functioned as an IP-only telephone switchboard. The customer was looking to improve the memory management, because the application usually runs for long periods of time. The code itself was divided into modules which could be enabled or disabled at will. There were even dummy functions that reduced missing modules to a minimal level of functionality, so the whole application could still be tested. The calling convention at the architecture level specified what every module receives and returns. Basically, this is a microservice architecture inside one big binary.
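A hypothetical sketch of such a module convention, in Python rather than the original system's language: every module implements the same interface, and a dummy stands in for anything disabled or missing. The module names and the event format are invented for illustration.

```python
class Module:
    """Calling convention: every module accepts an event dict
    and returns an event dict (possibly modified)."""
    def handle(self, event):
        raise NotImplementedError

class BillingModule(Module):
    def handle(self, event):
        event["billed"] = True
        return event

class DummyModule(Module):
    """Minimal stand-in so the whole application can still be
    tested when a real module is disabled or missing."""
    def handle(self, event):
        return event  # pass the event through unchanged

def build_pipeline(enabled):
    real = {"billing": BillingModule()}
    # "routing" has no real implementation here; a dummy stands in
    # for it, just like for any module that is disabled.
    return [real[name] if name in enabled and name in real else DummyModule()
            for name in ("routing", "billing")]

event = {"caller": "1001", "callee": "1002"}
for module in build_pipeline(enabled={"billing"}):
    event = module.handle(event)
print(event["billed"])  # True
```

Because every module obeys the same contract, individual parts can be swapped, stubbed, or tested in isolation without touching the rest of the binary.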
You can walk through many lists of advantages and disadvantages, but the key to thinking “microservice” is not creating code that you are afraid to change. Sometimes prototypes work well, so there is some reluctance to touch them for fear of breaking something. Don’t hesitate! Replace, break, and repair code. This is the way.
Learning by doing means you spend a lot of time reading documentation and exploring example code that illustrates the features of your favourite development toolchain. Finding well-written example code has become substantially more difficult in recent years. Once upon a time, Google offered a search engine just for source code; it was active between 2006 and 2012. Now you are stuck with general-purpose search engines and their deteriorating quality. The flood of AI-generated content, copy-and-paste from documentation, and hyperlinks to gigantic forum discussions filled with errors and even more copy-and-paste snippets has destroyed classical Internet research. You have to select your sources carefully. So what is a good strategy? I have compiled a short checklist that helps you avoid wasting time.
Testing software and measuring the code coverage is a critical ritual for most software development teams. The more code lines you cover, the better the results. Right? Well, yes and no. Testing is fine, but you should not get excited about maximising the code coverage. Measuring code coverage can turn into a game and a quest for the highest score. A little combinatorics shows how many code paths your tests would need to cover. Imagine a piece of code containing 32 independent if()/else() statements. Testing all branch combinations means running through 2³² = 4,294,967,296 different paths. Now add some loops, function calls, and additional if() statements (because 32 comparisons is quite low for a sufficiently big code base). This increases the number of paths considerably. Multiply that number by the time needed to complete a test run, and you see that tests are limited by physics and mathematics.
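The arithmetic is worth doing once. A back-of-the-envelope sketch, assuming the 32 branches are independent and using two made-up per-run timings:

```python
# n independent if()/else() branches yield 2**n path combinations.
n_branches = 32
combinations = 2 ** n_branches
print(combinations)  # 4294967296

# Even at an optimistic one microsecond per test run, exhaustive
# coverage takes over an hour; at one millisecond, about 50 days.
hours_at_1us = combinations * 1e-6 / 3600
days_at_1ms = combinations * 1e-3 / 86400
print(round(hours_at_1us, 1))  # ~1.2 hours at 1 µs per run
print(round(days_at_1ms, 1))   # ~49.7 days at 1 ms per run
```

Real test runs take far longer than a millisecond, and real code has far more than 32 branches, which is why exhaustive path coverage is a fantasy.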
Learning programming first and learning secure coding afterwards is a mistake. Even if you are new to a programming language or its concepts, you need to know what can go wrong. You need to know how to handle errors. You need to do some basic checks of received data, no matter what your toolchain looks like. This is part of the learning process. So instead of learning how to use code constructs or language features twice, take the shortcut and address security and the concepts at once. An example is learning the methods of classes and their behaviour: if you think in instances, then you will have to deal with the occasional exception. No one would learn the methods first, ignore all error conditions, and only later get back to learning about errors.
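A minimal sketch of this idea in Python: learn the happy path and its error conditions in the same sitting. The function name and the input format are invented for illustration.

```python
def parse_port(raw):
    """Return a validated TCP port, or raise ValueError with a reason."""
    try:
        port = int(raw)  # the happy path of the language feature...
    except (TypeError, ValueError):
        # ...and its error condition, learned at the same time.
        raise ValueError(f"not a number: {raw!r}")
    if not 1 <= port <= 65535:  # basic check of the data received
        raise ValueError(f"out of range: {port}")
    return port

print(parse_port("8080"))  # 8080
try:
    parse_port("eighty-eighty")
except ValueError as err:
    print(err)  # not a number: 'eighty-eighty'
```

The validation is not an afterthought bolted on later; it is part of understanding what `int()` and comparison operators actually do.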
The words legacy and old carry a negative meaning when used with code or software development. Marketing has ingrained in us the belief that everything new is good and everything old should be replaced, to ensure people spend money and time. Let me tell you that this is not the case, and that age is not always a suitable metric. Would you rather have your brain surgery performed by a surgeon with 20+ years of experience or by a freshly graduated surgeon on his or her first day at the hospital?
Continuous Integration (CI) is a standard in software development, and a lot of companies use it in their development process. It basically means using automation tools to test new code more frequently. Instead of continuous, you could also say automated, because CI cannot be done manually. Modern build systems comprise scripts and descriptive configurations that invoke components of the toolchain in order to produce executable code. Applications built with different programming languages can invoke a lot of tools with individual configurations. The build system is also part of the code development process. What does this mean for CI in terms of secure coding?
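One consequence is that the build pipeline itself can act as a security gate: if any check fails, no executable is produced. A hypothetical sketch in Python; the two demo steps are harmless stand-ins, where a real pipeline would invoke your own static analyser and test suite instead.

```python
import subprocess
import sys

# Stand-in steps for illustration; a real pipeline would run the
# toolchain components here (static analysis, test suite, build).
STEPS = [
    [sys.executable, "-c", "print('static analysis ok')"],
    [sys.executable, "-c", "print('tests ok')"],
]

def run_pipeline(steps):
    """Run each step; stop and fail the build on the first error."""
    for step in steps:
        result = subprocess.run(step)
        if result.returncode != 0:
            return result.returncode  # propagate the failure to CI
    return 0

print(run_pipeline(STEPS))  # 0 when every step succeeds
```

The point is that the gate lives in version-controlled code, so the security checks are reviewed and changed just like the application itself.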
The trend of large language models (LLMs) continues. Many people are experimenting and exploring how these algorithms can help them when developing software. Most integrated development environments have features that help you while writing code: access to documentation, function call parameters, static checks, and suggestions are standard aids. LLMs are the new kid on the block. Some articles describe how questions (or prompts) to chat engines were used to create code samples. The output depends a lot on the prompt; changing words or rephrasing can lead to different results. This differs from the way other tools work. Getting useful results means playing with the prompt and engaging in trial-and-error cycles. Algorithms such as ChatGPT are not sentient. They cannot think; they just remix and repeat parts of their training data. Asking for code examples is probably most useful for getting templates or single functions. This use case is disappointingly close to browsing tutorials or Stack Overflow.