
Shift-Left Cloud FinOps: Embedding Cost-Aware Policies Earlier in the Infrastructure Lifecycle

Organizations usually implement FinOps or cloud cost optimization policies after provisioning cloud resources. However, the growing adoption of Infrastructure as Code (IaC) tools, like Terraform and AWS CloudFormation, presents an opportunity for Engineering and FinOps teams to “shift left” and implement cost optimization policies earlier in the infrastructure lifecycle, e.g., at developer workstations or in CI/CD pipelines.

This proactive approach fixes issues at their inception, preventing them from escalating into costly problems, and focuses on educating engineers and equipping them with a cost-aware mindset. As a result, shifting left transforms resource provisioning from a stage of limited control into one of strategic cloud cost governance, setting the stage for a more efficient, cost-effective, and cost-aware infrastructure development process.

Here are a few policies you can implement earlier in the lifecycle to prevent unnecessary cloud costs and operational inefficiencies:

  • Cloud Tagging: Implement a policy for mandatory cost allocation tags in IaC artifacts. These tags attribute cloud resource costs to the correct teams, projects, or environments. Enforcing cloud tagging at the IaC level allows organizations to ensure proper tagging of every resource before provisioning, which improves the accuracy of cost allocation, ownership tracking, optimization opportunity identification, and reporting.

 


  • Resource Type Modernization: Using the latest cloud resource types often saves money and improves performance. Preventing your engineering team from provisioning legacy resource types early in the infrastructure lifecycle avoids unnecessary costs, performance issues, and the effort of migrating to newer technologies at a later date. For example, enforcing the use of Amazon EBS gp3 volumes instead of the older gp2 volumes can lower storage costs by up to 20%.



  • Appropriate Resource Sizing for Use Cases: Implementing policies based on workload or business unit requirements can help avoid unnecessary costs and eliminate the need for right-sizing or migration efforts later. These policies set standards for selecting instance types, storage options, and other cloud services that match specific business needs. An example could be disallowing expensive GPUs for development machines or io1 volumes for VMs.



  • Log Retention: Setting the right retention policies ensures you are not paying to store old logs that nobody uses. Logs are often overlooked, proliferate quickly, and quietly add to your bill. Defining a reasonable log retention period in the IaC artifact keeps log costs from accumulating and avoids clean-up cycles later.

 


  • Lifecycle Rules Enablement: Properly defining or enabling storage lifecycle rules helps ensure that your object data is stored cost-effectively throughout its lifecycle. Examples include rules that transition data to lower-cost storage classes after a specific period, as it becomes less frequently accessed, or that automatically delete data once no business need requires it. Lifecycle rules could also include setting a “TTL” (Time to Live) for cloud resources during the development and testing phases. Resource costs, particularly storage costs, are optimized automatically when these checks are enabled and enforced in the IaC artifact.
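To make a few of the policies above concrete, here is a minimal Python sketch of how tagging, gp2-to-gp3, and log retention checks could be evaluated against a Terraform plan before anything is provisioned. It assumes the plan has been exported with `terraform show -json` and loaded into a dictionary, and that the resources come from the Terraform AWS provider. The required tag names, the 90-day retention limit, the function names, and the sample plan are all hypothetical choices for illustration, not Stacklet's implementation.

```python
"""Sketch: cost checks over a Terraform plan exported with `terraform show -json`."""

# Hypothetical policy settings; adjust them to your organization's standards.
REQUIRED_TAGS = {"team", "project", "environment"}
MAX_LOG_RETENTION_DAYS = 90


def check_resource(resource_type, after):
    """Return a list of violation messages for one planned resource."""
    violations = []

    # Cloud tagging: only checked where the plan exposes a `tags` attribute.
    if "tags" in after:
        missing = REQUIRED_TAGS - set(after.get("tags") or {})
        if missing:
            violations.append("missing tags: " + ", ".join(sorted(missing)))

    # Resource type modernization: prefer gp3 over gp2 EBS volumes.
    if resource_type == "aws_ebs_volume" and after.get("type") == "gp2":
        violations.append("use gp3 instead of gp2")

    # Log retention: 0 (or unset) means the log group never expires its logs.
    if resource_type == "aws_cloudwatch_log_group":
        days = after.get("retention_in_days") or 0
        if days == 0 or days > MAX_LOG_RETENTION_DAYS:
            violations.append(
                f"set retention_in_days to at most {MAX_LOG_RETENTION_DAYS}"
            )

    return violations


def check_plan(plan):
    """Map each planned resource address to its violations (clean ones omitted)."""
    results = {}
    for change in plan.get("resource_changes", []):
        after = (change.get("change") or {}).get("after") or {}
        found = check_resource(change.get("type", ""), after)
        if found:
            results[change["address"]] = found
    return results


# Hypothetical, hand-written excerpt of a plan, used only to exercise the checks.
SAMPLE_PLAN = {
    "resource_changes": [
        {
            "address": "aws_ebs_volume.data",
            "type": "aws_ebs_volume",
            "change": {"after": {"type": "gp2", "tags": {"team": "payments"}}},
        },
        {
            "address": "aws_cloudwatch_log_group.app",
            "type": "aws_cloudwatch_log_group",
            "change": {
                "after": {
                    "retention_in_days": 0,
                    "tags": {"team": "payments", "project": "api", "environment": "dev"},
                }
            },
        },
    ]
}

for address, messages in check_plan(SAMPLE_PLAN).items():
    for message in messages:
        print(f"{address}: {message}")
```

A check like this could run in a pre-commit hook or a CI job and fail the build whenever `check_plan` returns anything, so the engineer sees the problem before the resource exists. A dedicated policy engine would replace the hand-written conditions, but the flow of inspecting planned attributes and reporting violations stays the same.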

 

More advanced policies may depend on application-level support, so rather than blocking a deployment, they can be surfaced as warnings or guidance for developers to consider as they build applications.

  • Recommend using Spot Instances: Spot Instances offer significant cost savings, of up to 90% compared to on-demand prices, by allowing users to purchase unused compute capacity at reduced rates. This pricing model is ideal for workloads that are flexible in their timing and can tolerate interruption, such as batch processing, data analysis, and background tasks. Inspecting the IaC artifact early to check whether such workloads take advantage of Spot capacity can reduce future cloud costs.

  • Recommend using ARM64 Instances: ARM64 architectures can provide significant cost-efficiency benefits over traditional AMD64/x86 architectures, and multiple cloud providers offer these VMs. For most Linux-based applications, moving to ARM64 instance types simply means rebuilding the application images with dependencies for that architecture.
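As a rough illustration of guidance that warns rather than blocks, the Python sketch below produces recommendations for planned `aws_instance` resources and only fails the pipeline on hard violations. The x86-to-ARM64 family mapping, the `workload` tag and its values, and the function names are hypothetical assumptions; the sketch also does not detect instances that already request Spot capacity.

```python
"""Sketch: advisory (non-blocking) cost recommendations for planned instances."""

# Hypothetical x86-to-ARM64 family suggestions; verify against your provider's catalog.
ARM64_ALTERNATIVES = {"t3": "t4g", "m5": "m6g", "c5": "c6g", "r5": "r6g"}

# Hypothetical values of a `workload` tag marking interruption-tolerant workloads.
INTERRUPTIBLE_WORKLOADS = {"batch", "analytics", "background"}


def advise(resource_type, after):
    """Return non-blocking recommendations for one planned resource."""
    if resource_type != "aws_instance":
        return []
    advice = []

    # ARM64: suggest an equivalent size in the mapped ARM64 family.
    family, _, size = (after.get("instance_type") or "").partition(".")
    if family in ARM64_ALTERNATIVES and size:
        advice.append(f"consider ARM64: {ARM64_ALTERNATIVES[family]}.{size}")

    # Spot: suggest it only for workloads tagged as tolerant of interruption.
    workload = (after.get("tags") or {}).get("workload")
    if workload in INTERRUPTIBLE_WORKLOADS:
        advice.append("consider Spot capacity for this interruptible workload")

    return advice


def report(violations, warnings):
    """Print both kinds of findings; only hard violations produce a failing exit code."""
    for message in warnings:
        print("WARNING:", message)
    for message in violations:
        print("ERROR:", message)
    return 1 if violations else 0


# Hypothetical planned attributes for a single instance.
planned = {"instance_type": "m5.large", "tags": {"workload": "batch"}}
exit_status = report(violations=[], warnings=advise("aws_instance", planned))
```

Returning a zero exit status for warnings keeps the guidance visible in the pull request or CI log without stopping a team whose application cannot yet run on ARM64 or tolerate Spot interruptions.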

 

Prevent Unnecessary Cloud Cost Sprawl and Drive Behavior Change With Stacklet

With Stacklet Platform’s IaC governance capabilities, your FinOps and cloud engineering teams can automatically identify and fix issues related to tagging errors or costly configurations earlier in the infrastructure lifecycle. You can easily implement “cost-aware” guardrails and best practices for your infrastructure code across developer workstations, code reviews, CI pipelines, and deployment pipelines.

  • Single Declarative Language With Out-of-the-Box Policies: Stacklet provides out-of-the-box policies to get you started and a policy language that is highly expressive and human-readable, making it far more straightforward to incorporate your own rules.

  • Multi-Stage Policy Enforcement: Apply cost control policies across your infrastructure lifecycle. Check policies and provide engineer-friendly remediation recommendations at the pre-commit stage on the developer workstation, the pre-merge stage in the CI pipeline, or the pre-deploy stage in the CD pipeline.  

  • Full Lifecycle Governance Designed for Enterprise Use: Stacklet provides a single declarative language, toolset, and workflow to enforce cost guardrails across your entire cloud infrastructure code lifecycle. As a result, Stacklet reduces the operational complexity of managing and operating multiple tools for cost governance. In addition, it provides critical enterprise capabilities such as exception management, policy enforcement dashboards, reports, out-of-the-box policies, and SSO to accelerate policy and tool adoption.

    To learn more or request a demo, you can sign up here

Credits: Thanks to Kapil Thangavelu for providing content and reviewing this blog