Without preventative guardrails in the cloud, managing waste and inefficiencies is like spring cleaning without a plan. Things get cleaned up, but the clutter keeps returning. In the cloud, that clutter represents inefficiencies and waste. Since you’re paying for usage, it becomes an ongoing cost.

Stacklet recently partnered with Centiment to survey over 300 cloud infrastructure and FinOps professionals to understand current cloud waste better, the costly mistakes contributing to it, and their impact. The survey revealed that organizations still need help with waste, with 51% of respondents estimating that more than 40% of their cloud spend could be optimized. This waste grows when engineering teams make costly mistakes due to limited insights, lack of awareness, and inadequate cost controls or guardrails.

62% of organizations reported that these mistakes cost them USD 25,000 or more per month – a significant amount of infrastructure budget lost, equivalent to at least one engineering FTE per month.

The Stacklet Platform, developed by the core creative team behind CNCF’s Cloud Custodian, not only eliminates waste but also fosters a preventative culture to keep inefficiencies from recurring. It automates remediation, enforces guardrails, and continuously optimizes cloud usage and governance, ensuring costly mistakes are minimized and inefficiencies don’t return. This blog post highlights some of the key preventive capabilities of Stacklet that help minimize costly mistakes, drive efficiency, and encourage behavioral changes among engineering teams, making them more cost-aware. With Stacklet, there’s hope for more efficient and cost-effective cloud management.

Comprehensive policies and guardrails, designed for the real world

Stacklet’s robust policy and governance guardrails prevent unnecessary spending and endless optimization backlogs. Our comprehensive approach to infrastructure lifecycle management ensures your cloud environment has minimal waste and is continuously optimized. Key attributes include:

  • Policy Guardrails Across the Cloud Infrastructure Lifecycle: Stacklet is the only solution on the market that allows you to enforce preventative policies across your entire infrastructure via a single, expressive language – during the build, deploy, or operate phases. From Infrastructure as Code (IaC) to runtime, Stacklet identifies costly misconfigurations and can automatically trigger communication and remediation workflows. While adopting IaC is a best practice for ensuring consistency, it’s often not fully implemented across all teams. Some developers may still use the cloud provider’s console directly, which can introduce manual errors and inconsistencies. Stacklet addresses these gaps by applying guardrails not only to IaC but also to runtime environments, ensuring consistent governance regardless of how resources are provisioned. Moreover, not all inefficiencies can be detected during the build or deployment stages. Some forms of cloud waste, such as underutilized or idle resources, are only visible days or weeks after moving to production. Stacklet ensures these inefficiencies are identified and corrected by enforcing runtime policies and minimizing cloud waste throughout the infrastructure lifecycle. This comprehensive approach ensures that organizations can prevent waste at every stage – no matter how or when the infrastructure is deployed or used.
    [Image: Cloud Usage And Governance Feed In Stacklet]
  • Developer-Centric Automated Actions and Workflows: Stacklet takes immediate action or triggers remediation workflows for costly misconfigurations, so workloads and usage are optimized without delay. Whether in Infrastructure as Code (IaC) or runtime environments, Stacklet provides actionable insights and feedback to developers, helping them create and maintain compliant infrastructure. Our platform supports multi-step remediation workflows that can be customized to your organization’s needs and that integrate with tools like Slack, Jira, and ServiceNow. This ensures swift communication and effective action, enabling teams to respond promptly and efficiently. Stacklet’s exception management lets policies accommodate unique situations while maintaining control, balancing flexibility and compliance throughout the infrastructure lifecycle.
    [Image: Example Remediation Workflow In Stacklet]
  • Broad Coverage with Flexible, Out-of-the-Box Policies: Powered by the open-source project Cloud Custodian, Stacklet supports more than 500 cloud resource types and offers more than 1,500 policies to ensure comprehensive coverage, from popular cloud services to experimental new technologies. With a flexible, declarative policy language and a real-time asset database, you can quickly identify and implement custom policies tailored to your evolving needs.

Examples of IaC Cost Governance With Stacklet (“Shift Left FinOps”)

The growing adoption of Infrastructure as Code (IaC) tools, such as Terraform, presents an opportunity for engineering and FinOps teams to “shift left” and implement cost optimization policies earlier in the infrastructure lifecycle, such as at developer workstations or in CI/CD pipelines. Misconfigurations can lead to high cloud bills from the moment infrastructure is provisioned. By proactively addressing issues during the IaC stages, teams can prevent costly problems before they escalate, saving hours of engineers’ time and reducing rework later.

With Stacklet Platform’s IaC governance capabilities, FinOps and cloud engineering teams can automatically identify and correct issues, such as tagging errors or expensive configurations, earlier in the process. This allows the implementation of “cost-aware” guardrails and best practices for your infrastructure code, extending across developer workstations, code reviews, CI pipelines, and deployment pipelines.

A few examples of these include:

  • Resource Type Modernization: Using the latest cloud resource types saves money and improves performance. Preventing your engineering team from provisioning legacy resources early in the infrastructure lifecycle avoids unnecessary costs, performance issues, and the effort of migrating to newer technologies later. For example, enforcing the use of Amazon EBS gp3 volumes instead of the older gp2 type can save up to 20% on costs.
    [Image: Sample Message In GitHub]
  • Appropriate Resource Sizing for Use Cases: Implementing policies based on workload or business unit requirements can help avoid costs and eliminate the need for right-sizing or migration efforts later. These policies would include setting standards for selecting instance types, storage options, and other cloud services that match specific business needs. An example could be disallowing expensive GPUs for development machines or io1 volumes for VMs.
  • Log Retention: Setting appropriate retention policies ensures you are not paying to store old logs that nobody uses. Logs are often overlooked, proliferate quickly, and add cost. Enforcing a reasonable log retention period in the IaC artifact keeps log bills from piling up and avoids clean-up cycles later.
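
To make the IaC examples above concrete, here is a minimal Python sketch of pre-deployment checks run against a Terraform plan in JSON form (the output of `terraform show -json`). It is not Stacklet’s policy engine or policy language. The function name, the messages, and the 90-day retention limit are assumptions made up for illustration.

```python
# Illustrative only: hypothetical cost checks over a Terraform plan (JSON form).
# This is a sketch, not how Stacklet or Cloud Custodian evaluates policies.

MAX_LOG_RETENTION_DAYS = 90  # assumed organizational limit, not a product default


def find_cost_violations(plan: dict) -> list[str]:
    """Return human-readable findings for costly settings in planned resources."""
    findings = []
    for rc in plan.get("resource_changes", []):
        # `after` is null when a resource is being deleted, so fall back to {}.
        after = (rc.get("change") or {}).get("after") or {}
        address = rc.get("address", "<unknown>")
        rtype = rc.get("type")

        if rtype == "aws_ebs_volume":
            # Only explicitly configured volume types are checked here.
            volume_type = after.get("type")
            if volume_type == "gp2":
                findings.append(f"{address}: gp2 volume, use gp3 instead")
            elif volume_type == "io1":
                findings.append(f"{address}: io1 volume is not allowed by this policy")
        elif rtype == "aws_cloudwatch_log_group":
            # A missing value or 0 means the logs never expire.
            days = after.get("retention_in_days") or 0
            if days == 0:
                findings.append(f"{address}: logs never expire, set retention_in_days")
            elif days > MAX_LOG_RETENTION_DAYS:
                findings.append(
                    f"{address}: retention of {days} days exceeds {MAX_LOG_RETENTION_DAYS}"
                )
    return findings
```

In a CI pipeline, a script along these lines could load the plan with `json.load` and fail the build whenever the returned list is not empty.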

More examples of these shift left FinOps policies are given here.

Examples of Runtime or Post-Deployment Prevention With Stacklet

While shifting left with IaC is crucial, many cloud inefficiencies only become apparent once infrastructure is in use. Forms of waste such as underutilized resources, development instances running during off-hours, or storage tiers that were temporarily increased for business reasons often appear post-deployment. Stacklet helps prevent these inefficiencies by continuously scanning cloud environments and event streams and triggering actions and remediation workflows in real time.

A few key examples of these include:

  • Automated Scheduling of Development Machines: Implementing an off-hour policy program, particularly for development machines, can save significant amounts of money. Stacklet Platform helps automate and better orchestrate off-hour policies, making it easy to enforce rules like automatically powering off development instances during off-hours and turning them back on according to a schedule. For more information, see our detailed blog on best practices for off-hour policy programs and Stacklet’s key capabilities in this area.
  • Unused Resources: The platform can help you find unused resources in your environment across a variety of resource types, including VMs, block storage, databases, cloud-based GenAI managed services, middleware services, VPCs, and more.
    [Image: Sample Slack Message]
  • Underutilized Resources: The Stacklet Platform identifies underutilized resources based on metrics and configurations. It facilitates rightsizing overprovisioned resources while keeping development teams informed at every step, ensuring efficient resource use without sacrificing performance.
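
Two of the runtime checks above can be sketched in a few lines of Python. This only illustrates the underlying logic and is not Stacklet’s implementation. The working window (weekdays, 07:00 to 19:00 local time), the 10% average CPU threshold, and both function names are assumptions chosen for the example.

```python
from datetime import datetime, time
from statistics import mean

# Assumed working window for development machines (local time, weekdays only).
ON_HOUR = time(7, 0)
OFF_HOUR = time(19, 0)


def should_be_running(now: datetime) -> bool:
    """Return True if a development instance should be powered on at `now`."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and ON_HOUR <= now.time() < OFF_HOUR


def is_underutilized(cpu_percent_samples: list[float], threshold: float = 10.0) -> bool:
    """Return True if average CPU over the sampled period is below `threshold` percent."""
    if not cpu_percent_samples:
        return False  # no data: do not flag the resource
    return mean(cpu_percent_samples) < threshold
```

A scheduler could call `should_be_running` periodically and stop or start instances tagged as development machines. A rightsizing check could feed `is_underutilized` with a week or more of CPU samples before notifying the owning team.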

Important Note: Many examples from the IaC section, like log retention, can also be enforced in near real-time at runtime. This allows Stacklet to catch misconfigurations made through consoles or scripts – even when IaC isn’t widely used in your organization. This dual approach ensures continuous compliance regardless of how resources are managed.
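
As an illustration of the same rule applied at runtime, the sketch below checks log groups that already exist, regardless of how they were created. The input is modeled on the shape of the CloudWatch Logs `DescribeLogGroups` response, where `retentionInDays` is omitted for log groups that never expire. The function name and the 90-day limit are assumptions, and this is not how Stacklet evaluates policies internally.

```python
def log_groups_needing_retention(log_groups: list[dict], max_days: int = 90) -> list[str]:
    """Return names of log groups with no retention set or retention above `max_days`."""
    names = []
    for group in log_groups:
        days = group.get("retentionInDays")  # absent when logs never expire
        if days is None or days > max_days:
            names.append(group["logGroupName"])
    return names
```

The returned names could then feed a notification or a remediation step that sets an agreed retention period.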

Prevent Waste and Inefficiency With Stacklet

Save Money. Save Time. Drive Cultural Change.

Stacklet enables continuous cloud usage optimization through comprehensive policies enforced across the entire lifecycle. Integrated into development workflows, Stacklet proactively prevents waste and inefficiencies, significantly reducing the time spent on remediation through prevention and automation. By implementing effective preventative measures, organizations can save up to 50% on cloud expenses while fostering a culture of accountability and efficiency. Stacklet empowers engineering teams to build a more cost-aware and optimized cloud environment, promoting long-term sustainability and continuous improvement.

Start your journey toward continuous cloud cost efficiency with Stacklet: schedule a demo today.

 

Credits: Thanks to Sonny Shi and Jamison Roberts for reviewing this blog
