Why are release strategies important?
Every product or application needs a release strategy. It’s how you can double check that everything in your deployment is appropriately tested, validated and verified. Having a standardized release strategy in place allows your team to follow a protocol and reduce the number of unknowns they must face in the product life cycle. However, there are a few considerations to make this critical process run smoothly.
How should you approach testing in your release strategy?
Testing is one of the most important pieces of any release, whether it’s for an application or for infrastructure. Integration and performance tests are run frequently, and tools on the market, like Jenkins or Argo Workflows, can help you set up a robust testing ecosystem.
Traditionally, teams deploy all of their services into an environment that looks like production and replicate events there. This can lead to a lot of time spent going back and forth fixing issues. With microservices or a distributed architecture, however, some relatively new processes turn testing into a more flexible, chunked approach.
Using smaller chunks allows you to test consistently. You can push changes to production, but you can also poke around in your environment and try more things. You can run robust end-to-end tests more frequently, in smaller, quicker chunks. Ideally, this also lets you roll back much more quickly if there are issues.
Even better, if your changes sit behind feature flags, you can easily identify a performance problem and turn the change off or roll it back right away. Your change can emit a new event while you still publish the old one, and the feature flag lets you either isolate a problem or confirm that the change works correctly.
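As a minimal sketch of that idea, the Python below keeps publishing the old event and adds a new one only while a flag is on. The flag store, the flag name and the event names are all hypothetical, made up for illustration; a real system would more likely read flags from a configuration service.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """Hypothetical in-memory flag store, used here only for illustration."""
    enabled: set = field(default_factory=set)

    def is_on(self, name: str) -> bool:
        return name in self.enabled


def build_events(order_id: str, flags: FeatureFlags) -> list:
    """Always build the old event; add the new one only when the flag is on."""
    events = [{"type": "order_created_v1", "order_id": order_id}]
    if flags.is_on("publish_order_created_v2"):
        events.append({"type": "order_created_v2", "order_id": order_id, "schema": 2})
    return events


flags = FeatureFlags(enabled={"publish_order_created_v2"})
with_flag = build_events("o-1", flags)

# Turning the flag off acts as the rollback: only the old event remains.
flags.enabled.discard("publish_order_created_v2")
without_flag = build_events("o-1", flags)
```

Because the old event never stops being published, consumers that have not moved to the new event are unaffected whichever way the flag is set.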
Unfortunately, there isn’t a one-size-fits-all approach to testing, and there will always be a balance between the speed and the quality of your solution. It really depends on the organization.
How often should you do a release?
A lot of organizations run on traditional release structures that happen quarterly. Teams bundle changes into a big release and then spend days or weeks regression testing. However, this method often rushes changes into production, and time is wasted on huge tests and fixes.
A better process might be closer to continuous delivery on a micro scale - pushing changes as they come and validating along the way. Microservices and distributed architecture are a huge advantage in these situations.
If you have a monolith in front of you, you can’t change or test individual pieces when you deploy - you have to test the whole thing. With a microservice model, however, you can take a system and break it into smaller components.
What does an infrastructure release strategy look like?
Infrastructure releases should not be any different than application releases.
Take a look at your process and determine whether you’re holding your infrastructure releases to the same standards as your applications. If you’re not doing any unit tests for your infrastructure, why not? Applications and infrastructure follow essentially the same process - you take inputs and produce some output.
Though the output of your infrastructure isn’t going to be an event, it should still be the output you expect - that’s why you shouldn’t rush unit or integration tests. Integration tests are especially important for making sure you don’t have any security gaps.
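To make the "inputs in, output out" point concrete, here is a hedged sketch of a unit test for an infrastructure definition. The function names, the resource fields and the size mapping are assumptions invented for this example and are not tied to any particular tool.

```python
def render_database_config(environment: str, region: str) -> dict:
    """Hypothetical infrastructure 'function': inputs in, resource definition out."""
    sizes = {"dev": "small", "staging": "medium", "prod": "large"}
    if environment not in sizes:
        raise ValueError(f"unknown environment: {environment}")
    return {
        "name": f"orders-db-{environment}-{region}",
        "size": sizes[environment],
        "encrypted": True,
        "publicly_accessible": False,
    }


def check_database_config(config: dict) -> list:
    """Return a list of problems; an empty list means the output is as expected."""
    problems = []
    if not config["encrypted"]:
        problems.append("storage must be encrypted")
    if config["publicly_accessible"]:
        problems.append("database must not be publicly accessible")
    return problems


prod = render_database_config("prod", "us-east-1")
prod_problems = check_database_config(prod)

# A deliberately broken definition shows the security checks catching gaps.
broken = dict(prod, encrypted=False, publicly_accessible=True)
broken_problems = check_database_config(broken)
```

The same checks can run in a pipeline before anything is provisioned, so a security gap shows up as a failed test instead of a production finding.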
There are a few different ways people manage infrastructure - what are some of your opinions on them?
Unfortunately, infrastructure isn’t as straightforward as applications. It’s definitely a lot harder to create a strategy for infrastructure than for a code base, because you can’t really feature flag your database size.
Terraform is a really popular tool for managing infrastructure, and one of the things it recommends is that you separate the use cases for your architecture, environments, and regions into separate files. But that can be a little inefficient - if you have several copies of the same file that all take different inputs, you’ll have a hard time reconciling them into one output. So you’ll still have to code something expressive and write your own tests to verify it.
As humans, we describe our architecture with abstract behaviors, like “oh, this group of nodes can talk to this group of nodes,” and then write code that validates that those things are actually happening. But putting all of that into practice at the infrastructure level is difficult.
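One way to picture "this group of nodes can talk to this group of nodes" as code is the sketch below. The group names, ports and rule set are hypothetical, and the rules live in a plain Python set; in practice they would have to be read from the real network configuration, which is the difficult part.

```python
# Hypothetical rule set: (source group, destination group, port).
ALLOWED = {
    ("web", "api", 443),
    ("api", "database", 5432),
}


def can_talk(source: str, destination: str, port: int) -> bool:
    """True if the rule set allows traffic from source to destination on port."""
    return (source, destination, port) in ALLOWED


# Behaviors written the way people describe them, plus the expected answer.
EXPECTED = [
    ("web", "api", 443, True),
    ("api", "database", 5432, True),
    ("web", "database", 5432, False),  # web tier must not reach the database directly
]


def verify(expected: list) -> list:
    """Return every expectation that the rule set violates."""
    return [e for e in expected if can_talk(e[0], e[1], e[2]) != e[3]]


violations = verify(EXPECTED)
```

An empty `violations` list means the rules match the behaviors the team described; anything else names the exact expectation that failed.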
Additionally, for infrastructure, there aren’t a lot of frameworks for configuration management. If you look at application-focused frameworks, like Spring, they have a process for you to put your configurations directly into the repo and it's really easy for it to just get read in at runtime.
However, for infrastructure, there isn’t much of this, which is why so much is left open-ended. The CDK for Terraform was only released about two years ago. These frameworks are really new, and support for them is limited. HashiCorp recently came out with support for the CDK, but availability is limited. You just have to choose the tool that’s best suited to what you need to do.
Why is disaster recovery important?
A lot of people talk about disaster recovery as if it only applies to rare, large-scale disaster scenarios. However, the need to recover can come up pretty frequently, and it could be as simple as a database update that messes up some of your customer data when you pop in a new version.
In infrastructure, a recovery process for this kind of thing is essentially parallel to a rollback strategy on the application side. It’s valuable to explore automating this process in general, and doing so can make it easier to roll back and validate that feature flags work.
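A minimal sketch of an automated rollback, assuming the "database" is just a Python dict and the snapshot is a deep copy. The function names and the faulty migration are hypothetical; a real process would rely on actual backups or snapshots.

```python
import copy


def apply_with_rollback(data: dict, change, validate) -> tuple:
    """Apply change to a copy of data; hand back an untouched copy if validation fails."""
    snapshot = copy.deepcopy(data)   # stand-in for a real backup or snapshot
    candidate = copy.deepcopy(data)
    try:
        change(candidate)
        if validate(candidate):
            return candidate, "applied"
    except Exception:
        pass  # any error during the change counts as a failed release
    return snapshot, "rolled_back"


def bad_migration(db: dict) -> None:
    """Hypothetical update that wipes customer emails by mistake."""
    for customer in db["customers"]:
        customer["email"] = None


def emails_present(db: dict) -> bool:
    """Validation step: every customer must still have an email."""
    return all(c["email"] for c in db["customers"])


db = {"customers": [{"id": 1, "email": "a@example.com"}]}
result, status = apply_with_rollback(db, bad_migration, emails_present)
```

The validation step is what makes the rollback automatic: the update is only kept if the data still looks right afterwards.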
So, how do you create an automated structure? In infrastructure, people start to ask this question because there are a lot of different tools, like AWS CloudFormation and Terraform, that you can use locally. Essentially, there’s no one-size-fits-all tool; it depends on your requirements.
What kind of takeaways do you want to leave with the audience?
Take a look at your infrastructure release process and start asking questions about your strategy:
- Are you running this like a unit test? Why or why not?
- Are you breaking down your application or infrastructure into smaller bits? Is that working?
- How is what you’re currently doing working? Are you spending a lot of time fixing bugs? Why?
Though release strategies will vary by company, having a solid one is key to having a really successful product lifecycle.