Cloud Cost Management Best Practices

Cloud cost management is a discipline that helps organizations monitor, track, and optimize spending in the public cloud. In this post, I’ll outline some best practices for establishing effective cloud cost management in your organization. 

Introduction: The Promise of the Cloud, and the Inhibitors Stifling Progress

By moving to the cloud, teams can gain the agility to innovate faster and build a competitive advantage in the marketplace. For these reasons, many business leaders are eager to exploit the advantages of the cloud. Too often, however, a lack of proven processes and approaches creates doubt and uncertainty and stifles progress.

One especially critical inhibitor is cost. If cloud spending goes untracked and uncontrolled, teams will sooner or later be hit by unexpected cost overruns. Worse, when initial forays into the cloud create these issues, decision-makers may restrict, reduce, or halt future cloud initiatives outright.

As more organizations scale their cloud usage, establishing best practices for cloud cost management has become increasingly critical.

Having worked in cloud environments for over a decade, I’ve seen first-hand a wide range of cloud cost management approaches: the tactics that have proven effective, and those that haven’t worked at all. In the following sections, I’ll outline some best practices you should consider for controlling and optimizing your costs in the cloud.

Increase Visibility into Cloud Expenditures

A data-driven, transparent approach is a prerequisite for effective cloud cost management. When application engineering teams can see what they are spending, they are more likely to take action to optimize it. Here are a couple of ways to increase visibility:

  • Set up billing alarms. Billing alerts and alarms are a great way to stay on top of your cloud costs. They monitor your billing metrics and notify you when the bill exceeds a threshold amount. AWS, Azure, and GCP all support these alerts, so you don’t get surprised by unexpectedly high bills. Typically, the threshold is based on the previous month’s spending, which gives you a baseline for each account. When an alarm fires, you have an opportunity to investigate what is driving the increased cost and validate or remediate the cause before the billing cycle completes. Different alarms can be set for engineering, operations, and finance teams.

  • Regularly review billing dashboards and reports. Reviewing your cloud billing reports in near real-time, or at least on a regular schedule, gives you a better understanding of what you’re being billed for and where the majority of costs occur. Tagging or labeling all of your cloud resources lets you break down the bill by project name, cost center, environment, resource creator, and so on. You can also use the cloud provider’s billing API to automate custom reports based on your resource tags and values, for example to track costs per project. Comparing the current or last month’s bill to previous months’ bills shows which services are driving increased costs and helps you identify anomalies in the billing cycle. Third-party cloud cost management tools can also provide out-of-the-box dashboards with consistent, standardized insights across multiple cloud platforms.
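To make the tag breakdown and month-over-month comparison concrete, here is a minimal sketch in plain Python. It assumes the billing line items have already been exported from your provider into simple records. The field names (`service`, `cost`, `tags`), the `project` tag key, the sample figures, and the 20% threshold are all illustrative assumptions, not any provider’s actual API or export format.

```python
from collections import defaultdict

def cost_by_tag(line_items, tag_key, untagged="(untagged)"):
    """Sum cost per value of one tag; items missing the tag are grouped together."""
    totals = defaultdict(float)
    for item in line_items:
        tag_value = item.get("tags", {}).get(tag_key, untagged)
        totals[tag_value] += item["cost"]
    return dict(totals)

def cost_by_service(line_items):
    """Sum cost per service name."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item["service"]] += item["cost"]
    return dict(totals)

def flag_increases(current, previous, threshold=0.20):
    """Return services whose cost grew by more than `threshold` (a fraction)
    relative to the previous month. Services with no baseline are flagged
    with a growth of None, since no ratio can be computed."""
    flagged = {}
    for service, cost in current.items():
        baseline = previous.get(service, 0.0)
        if baseline == 0.0:
            if cost > 0:
                flagged[service] = None
        else:
            growth = (cost - baseline) / baseline
            if growth > threshold:
                flagged[service] = growth
    return flagged

# Hypothetical exported line items for two billing periods.
last_month = [
    {"service": "compute", "cost": 100.0, "tags": {"project": "web"}},
    {"service": "storage", "cost": 50.0, "tags": {"project": "web"}},
]
this_month = [
    {"service": "compute", "cost": 150.0, "tags": {"project": "web"}},
    {"service": "storage", "cost": 52.0, "tags": {"project": "data"}},
    {"service": "database", "cost": 30.0, "tags": {}},
]

per_project = cost_by_tag(this_month, "project")
anomalies = flag_increases(cost_by_service(this_month), cost_by_service(last_month))
```

With this sample data, `per_project` also exposes the untagged database spend, which is often the first thing worth chasing down, and `anomalies` flags compute (up 50%) and the new database service while leaving storage alone.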


Proactively Implement Cost Policies

Having a set of organization-wide cost policies for developers to follow is essential for cloud cost management. One way to create your organization’s policy set is to start from the historic billing reports for your cloud accounts, identify which services generate the highest costs, and focus on those first. The cost policies should be broken down by cloud provider and service, and should highlight the ways developers can reduce costs within those services. For example, for AWS EC2, cloud cost management policies might include deleting EBS snapshots after X number of days, deregistering old AMIs after X number of days, deleting unattached EBS volumes, right-sizing EC2 instances whose average CPU utilization in CloudWatch metrics has been below X% for 14 days, and so on.
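As a rough illustration, the EC2 examples above can be expressed as simple rules over a resource inventory. The sketch below is plain Python over hypothetical records. It does not call any AWS API, the `Resource` type and its fields are invented for illustration, and the threshold values are placeholders standing in for the "X" values your organization would choose.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Resource:
    """Hypothetical inventory record; fields are illustrative only."""
    resource_id: str
    kind: str  # "snapshot", "ami", "volume", or "instance"
    age_days: int = 0
    attached: bool = True
    avg_cpu_percent_14d: Optional[float] = None

# Placeholder thresholds (assumptions); pick values that fit your organization.
MAX_SNAPSHOT_AGE_DAYS = 30
MAX_AMI_AGE_DAYS = 90
MIN_AVG_CPU_PERCENT = 10.0

def recommend(resource: Resource) -> Optional[str]:
    """Return the cost action a policy suggests for one resource, or None."""
    if resource.kind == "snapshot" and resource.age_days > MAX_SNAPSHOT_AGE_DAYS:
        return "delete snapshot"
    if resource.kind == "ami" and resource.age_days > MAX_AMI_AGE_DAYS:
        return "deregister AMI"
    if resource.kind == "volume" and not resource.attached:
        return "delete unattached volume"
    if (
        resource.kind == "instance"
        and resource.avg_cpu_percent_14d is not None
        and resource.avg_cpu_percent_14d < MIN_AVG_CPU_PERCENT
    ):
        return "right-size instance"
    return None

def evaluate(resources):
    """Map resource IDs to recommended actions, skipping compliant resources."""
    findings = {}
    for resource in resources:
        action = recommend(resource)
        if action is not None:
            findings[resource.resource_id] = action
    return findings

# Hypothetical inventory used to exercise the rules.
inventory = [
    Resource("snap-old", "snapshot", age_days=45),
    Resource("snap-new", "snapshot", age_days=3),
    Resource("ami-old", "ami", age_days=200),
    Resource("vol-orphan", "volume", attached=False),
    Resource("i-idle", "instance", avg_cpu_percent_14d=4.2),
    Resource("i-busy", "instance", avg_cpu_percent_14d=63.0),
]

findings = evaluate(inventory)
```

Keeping each rule as a small, explicit check makes the policy set easy to review with developers, and the same rules can start out as report-only recommendations before any automated cleanup is switched on.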

As you implement cost policies, make sure they don’t hinder development velocity. Automate policy enforcement wherever possible to minimize manual workflows and extra work for developers. To enforce cost policies, thousands of companies use Cloud Custodian, an open-source project that lets teams define policies as code and enforce them across multiple cloud platforms.

Communicate Your Cost Policies 

Communicating your cloud cost management strategy and policies to your development teams is vital: it ensures the strategy is understood by everyone responsible for creating or managing cloud resources. These communications should explain how to maintain a clean, cost-efficient environment by outlining key procedures such as limiting backup retention to X number of days, deregistering old unused machine images, following rightsizing guidelines for resources such as VMs and databases, and removing unused and unattached resources. Communicating these guidelines and goals is a great first step toward a better-managed cloud.

Build a Culture and Team for FinOps 

FinOps is emerging as a new model for managing and optimizing cloud spending without hindering developer velocity. In this model, individual development teams take responsibility for their own costs, while a central FinOps group supports them closely.

The FinOps group ensures that individual development teams have the right visibility into their cloud spending and are aware of and able to implement organizational policies and industry best practices. FinOps teams also facilitate cross-functional team collaboration on cloud cost management. 

The FinOps group also helps elevate conversations around cost, enabling teams to move from looking solely at expenses to asking how to gain maximum business value from the cloud services they consume. FinOps teams help development and finance groups make informed decisions on cloud spending and justify additional costs. Each expenditure is evaluated based on the value that the application or service brings to the business.

Depending on the size of the organization and its cloud footprint, the FinOps group can vary substantially in size. While FinOps might sound appropriate only for mature organizations with budgets of millions of dollars, the earlier in their cloud journey that teams start employing FinOps principles, the better, even if it’s a team of one practitioner at the outset. Following and joining the FinOps Foundation is a great way to get started with this discipline.

Have you adopted any of the above practices? What else should I have included here?