Programmable infrastructure, or infrastructure as code, is an approach to managing infrastructure using software development techniques, rather than the traditional manual processes used by IT operations teams. This approach goes beyond the “aaS” (as a Service) offerings and self-service portals, which many enterprises are now adopting as part of their cloud strategy, to a fully digitized end-to-end process for managing the lifecycle of infrastructure resources.

Many leading technology companies are early adopters of the programmable infrastructure approach, proving its viability and value. Therefore, it’s likely that programmable infrastructure is the next logical step in the evolution of IT infrastructure, building upon IaaS, just as IaaS followed virtualization and the “boxes” (i.e. hardware devices) before it.

Programming your IT infrastructure

Programmable infrastructure brings several benefits.

First, the actual running infrastructure matches the code defining it, unlike runbook documents or human memory, which diverge from reality almost immediately.

The approach also allows teams to replicate the end-to-end hosting environment for an application in its entirety by simply running the infrastructure code, enabling test and staging environments that accurately match production, as well as a repeatable process for rebuilding the production environment itself.

Organizations can even store the infrastructure as code in source control, giving them the capability to track changes through versioning and to roll back to prior versions when a release fails.

Perhaps most important, programmable infrastructure replaces manual and error-prone system configuration with automation, improving quality and reducing labor on relatively menial tasks.

Eight requirements for enablement

In the Kovarus methodology, there are eight core requirements for enabling the programmable infrastructure operating model. These are:

1. Infrastructure APIs

Infrastructure resources are accessed and managed through an API rather than a portal, management user interface or command line tool. A defined API interface enables consistency and rigor when calling infrastructure services and opens up the ability to interact with the infrastructure through any modern programming language. The APIs can be delivered in a fine-grained model by a public cloud provider such as AWS or a private cloud software package such as OpenStack, or as individual API-enabled infrastructure components – for example, an F5 load balancer.

Additionally or alternatively, orchestration tools or cloud management platform software can provide more coarse-grained APIs for managing higher-level composite infrastructure services.

2. Infrastructure as code tooling

In order to call the appropriate APIs to provision the end-to-end infrastructure environment, tooling must be able to define infrastructure as code and process that code or model. More than just a script that executes a single task, this entails describing the full infrastructure needed to host an app, encompassing every step required to go from a blank slate to the full-stack infrastructure that runs it.

Today, there are two types of tooling that can meet this requirement. First, public cloud providers may offer a tool native to their service, such as AWS CloudFormation. This approach is likely to have the lower barrier to entry but will work only in the cloud provider offering the tool. Second, teams can use a configuration management tool such as Chef, Puppet or Ansible as the infrastructure as code toolset. This strategy provides much more flexibility, supporting private and hybrid clouds, as well as multiple public cloud providers. However, for a single public cloud platform, there typically will be more overhead involved in writing infrastructure code with configuration management tools than with a native tool from the provider.
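To make the idea of defining infrastructure as code and then processing that model more concrete, here is a minimal sketch in Python. It is not how CloudFormation, Chef, Puppet or Ansible work internally; the Resource class, the plan function and the resource names are all hypothetical, chosen only to illustrate comparing a declared environment with what is actually running and deriving the steps needed to converge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One declared infrastructure resource (hypothetical model)."""
    name: str
    kind: str            # e.g. "network", "server", "load_balancer"
    size: str = "default"


def plan(desired, actual):
    """Compare desired and actual state; return the actions needed to converge."""
    desired_by_name = {r.name: r for r in desired}
    actual_by_name = {r.name: r for r in actual}
    actions = []
    for name, res in desired_by_name.items():
        if name not in actual_by_name:
            actions.append(("create", name))
        elif actual_by_name[name] != res:
            actions.append(("update", name))
    for name in actual_by_name:
        if name not in desired_by_name:
            actions.append(("delete", name))
    return actions


# The full environment for one app, declared as data rather than as manual steps.
# All names below are illustrative assumptions.
DESIRED = [
    Resource("app-net", "network"),
    Resource("web-1", "server", size="medium"),
    Resource("web-lb", "load_balancer"),
]

if __name__ == "__main__":
    blank_slate = []
    for action, name in plan(DESIRED, blank_slate):
        print(action, name)
```

Run against a blank slate, the plan creates every resource; run against an environment that already matches the declaration, it produces no actions. That repeatability is what separates a full infrastructure definition from a single-task script.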

3. Development skillset

If infrastructure is to be programmed, the team building out the solution needs to have development skills. The choice of tooling determines the level of development skills required. Some tools are declarative, and thereby cater to a wider audience that includes infrastructure engineers with good scripting knowledge. However, other tools are much more similar to application development and may require core coding experience.

Meeting this requirement often requires reorganization within IT. In the spirit of DevOps, organizations may need to restructure teams in order to bring developer skills to an infrastructure project, or infrastructure knowledge to a development team taking on full-stack responsibilities.

4. Source control

Managing infrastructure using software development techniques means adopting many of the best practices that come from that discipline. First and foremost is storing the infrastructure code that has been developed in source control. Just like application code, infrastructure code needs to be stored centrally to enable teamwork and collaboration, versioned to provide branching and rollback capabilities, and treated as the ultimate “source of truth” that other diagrams or documents use as an input. Shifting the source of truth from infrastructure tools to source control is one of the largest cultural shifts for a typical infrastructure operations team, so it deserves special attention in any programmable infrastructure strategy.

5. Backlog management

Another best practice that IT teams can adopt from software development is a backlog management process, which encompasses intake of new features or requests, prioritization of tasks, assignment to resources and progress tracking through completion. In most Enterprise IT organizations, the most straightforward and efficient way to do this is to adopt the tools and processes that your development teams already use. While an initial programmable infrastructure capability may require larger upfront architecture and planning – like a traditional waterfall project – you should manage ongoing improvements and subsequent apps that assume the programmable infrastructure approach using Agile, Scrum and/or XP techniques, just like application development. Think of the whole environment as a product that needs ongoing management and continuous improvement, not as a project that is completed once in a defined timeframe and then left alone.

6. Testing plan

Traditionally, IT Operations teams have minimized changes to their infrastructure environments in order to limit risk. Programmable infrastructure takes the opposite approach, which embraces change and automates the deployment process to reduce risk. In order to decrease risk of failure (and not increase it), the new automated process must actually work, and the IT Operations team must prove it works before making changes to production operations. The mechanism for building this trust and confidence is another software development process called quality assurance (QA).

While a broader QA strategy may be warranted, the core requirement is to build a testing plan that defines:

  • Success criteria for the automated process
  • How the infrastructure code will be tested outside of production
  • What tests will be performed to prove the success criteria are met
  • What tools and processes will be used to audit and validate process adherence

Many aspects of the testing plan may be borrowed from existing application testing procedures. However, the tools used for testing the infrastructure itself, such as ServerSpec, will be different than the ones used to test application code, meaning they will likely be new to the organization.
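As a rough illustration of turning success criteria into automated checks, the sketch below compares facts gathered from a non-production environment against a set of expected values. It is written in plain Python and is not ServerSpec syntax; the function name, the criteria keys and the sample values are assumptions made up for the example.

```python
def check_environment(env, criteria):
    """Return a list of human-readable failures; an empty list means all criteria are met."""
    failures = []
    for key, expected in criteria.items():
        found = env.get(key)
        if found != expected:
            failures.append(f"{key}: expected {expected!r}, found {found!r}")
    return failures


# Hypothetical success criteria for the automated build.
CRITERIA = {"web_port_open": True, "server_count": 2, "tls_enabled": True}

# Hypothetical facts gathered from a staging environment under test.
STAGING = {"web_port_open": True, "server_count": 1, "tls_enabled": True}

if __name__ == "__main__":
    for failure in check_environment(STAGING, CRITERIA):
        print("FAILED", failure)
```

Because the criteria live in code, they can be stored and versioned alongside the infrastructure definition they validate.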

7. Defect management

Defect management is the set of processes used when production is not functioning correctly, and it is critical to risk mitigation when deploying infrastructure as code. IT Operations teams must track defects, or “bugs,” prioritize them, assign their resolution and quickly deploy fixes in production. Additionally, the programmable infrastructure environment should enable rollbacks to the last known good state when a new release fails, a critical feature that separates simple scripting from a closed-loop programmable infrastructure process.
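The rollback capability described above can be sketched as a simple release history that remembers which versions were healthy. This is only an illustration under stated assumptions: the ReleaseHistory class and the version labels are hypothetical, and a real process would take its health signal from monitoring rather than a hand-set flag.

```python
class ReleaseHistory:
    """Tracks deployed infrastructure versions and whether each one was healthy."""

    def __init__(self):
        self._releases = []  # (version, healthy) tuples in deployment order

    def record(self, version, healthy):
        self._releases.append((version, healthy))

    def current(self):
        """Return the most recently deployed version, or None if nothing is deployed."""
        return self._releases[-1][0] if self._releases else None

    def rollback_target(self):
        """Return the most recent healthy version before the current release, or None."""
        for version, healthy in reversed(self._releases[:-1]):
            if healthy:
                return version
        return None


if __name__ == "__main__":
    history = ReleaseHistory()
    history.record("v1", healthy=True)
    history.record("v2", healthy=True)
    history.record("v3", healthy=False)  # the failed release
    print("roll back to", history.rollback_target())
```

Paired with versioned infrastructure code in source control, the rollback target is simply the version to check out and apply again.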

Good intelligence is key to effective defect management. In the infrastructure realm, that entails holistic operational monitoring and log aggregation that enables effective troubleshooting. These are tools that most infrastructure teams have deployed in static environments, but in the programmable infrastructure paradigm, agents need to be provisioned through automation and dynamically hooked into enterprise data stores.

8. Release strategy

The capability to provision infrastructure through code is just a small piece of the overall strategy needed to move to a programmable infrastructure model. The final requirement, release strategy, pulls many of the prior components into an end-to-end process that facilitates agility, quality and low risk.

When running in production, infrastructure is intrinsically different than applications, so the release strategy must match. Techniques such as disposable infrastructure (where hosting environments are thrown away and rebuilt upon every release) and blue-green deployments (where the new deployment is created in parallel to the old environment and user traffic is shifted to it over time) are some of the best practices emerging to facilitate low-risk releases in programmable infrastructure environments.
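The blue-green technique can be sketched as a stepwise traffic shift guarded by a health check. The code below is a simplified model, not a load balancer integration: the function names, the equal-step schedule and the green_is_healthy callback are all assumptions for illustration.

```python
def traffic_schedule(steps):
    """Yield (blue_percent, green_percent) pairs that shift traffic in equal steps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    for i in range(steps + 1):
        green = round(100 * i / steps)
        yield (100 - green, green)


def cut_over(steps, green_is_healthy):
    """Shift traffic step by step; send everything back to blue if green fails a check.

    green_is_healthy is a hypothetical callback that receives the share of traffic
    currently on green and returns True while the new environment looks healthy.
    """
    for blue, green in traffic_schedule(steps):
        if green > 0 and not green_is_healthy(green):
            return (100, 0)  # revert: the old environment is still intact
    return (0, 100)


if __name__ == "__main__":
    for blue, green in traffic_schedule(4):
        print(f"blue {blue}% / green {green}%")
```

The old environment stays untouched until the shift completes, which is why a failed release can be reversed simply by routing traffic back.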

Staying ahead of the curve

Programmable infrastructure is both an extremely advanced practice and one that is early in the adoption curve, making it a higher-level component of the Kovarus cloud enablement methodology. Because programmable infrastructure is poised to be the next major operating model shift after “aaS” and self-service, we recommend that all infrastructure and operations teams that support developers begin researching this topic. Digital businesses have a particularly large incentive to become early adopters of programmable infrastructure, especially because the underlying tools and techniques are rapidly becoming viable for the broader market.

In fact, many first movers within our customer base are already piloting infrastructure as code in their IT operations.