Today, many enterprises are looking for an effective cloud cost management methodology that will help them track and optimize cloud spend. One approach that has helped some organizations is to combine processes and tools with automation so that cloud spend can be managed more predictably.
This article presents the key principles of Financial Operations (FinOps) on AWS that help improve forecasting accuracy and cost predictability and create a culture of ownership and cost transparency.
Cloud migration increases business agility through the ability to react quickly to fluctuations in demand. This ability to scale up and down has moved the procurement of resources from the sole ownership of the finance team to various parties such as IT, engineering and other teams. This democratization has led to a constantly growing group of cost-focused stakeholders who are responsible for understanding, managing and ultimately optimizing costs.
FinOps is the answer to the ever-growing complexity surrounding cloud cost management. FinOps is not a technology; it is a cultural change and a set of processes, aided by tools and automation, for effective cost management. FinOps aims to address the following challenges:
- Difficulty in identifying and tracking the cost sources
- Unforeseen spikes in costs
- Inadequate cost optimization plan
- Inability to properly leverage different pricing models
- Uncertainties over the chosen pricing models
- Insufficient mechanisms to distribute cost within the organization
FinOps — Core Principles
With AWS Cost Explorer, spend can be broken down by tags for a more accurate view of where it is coming from. Cost allocation tags are really important when dealing with multiple entities like applications, projects, environments, business units, etc. Cost allocation tags are applied like any other tags, but user-defined tags must be activated as cost allocation tags in the Billing and Cost Management console before they appear in cost reports. Spend on resources without cost allocation tags requires additional effort to map back to a specific entity. AWS uses the cost allocation tags to organize resource costs on the cost allocation report, making it easier to categorize and track AWS costs.
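As an illustration of breaking spend down by tag, the following minimal sketch builds the parameters for Cost Explorer's GetCostAndUsage operation and totals the result per tag value. The tag key `Project`, the dates and the helper names are assumptions made for this example, not part of any existing setup.

```python
from datetime import date


def cost_by_tag_request(tag_key: str, start: date, end: date) -> dict:
    """Build keyword arguments for Cost Explorer's GetCostAndUsage call,
    grouping monthly unblended cost by a single cost allocation tag."""
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "TAG", "Key": tag_key}],
    }


def summarize(response: dict) -> dict:
    """Sum the cost per tag value across all periods in a response."""
    totals: dict = {}
    for period in response.get("ResultsByTime", []):
        for group in period.get("Groups", []):
            # Keys look like "Project$alpha"; "Project$" means the tag is missing.
            label = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[label] = totals.get(label, 0.0) + amount
    return totals


if __name__ == "__main__":
    # "Project" is a hypothetical tag key used only for illustration.
    params = cost_by_tag_request("Project", date(2024, 1, 1), date(2024, 2, 1))
    print(params)
    # With credentials configured, the request could be sent with boto3:
    #   import boto3
    #   response = boto3.client("ce").get_cost_and_usage(**params)
    #   print(summarize(response))
```

A group whose key has nothing after the `$` holds spend that carries no value for the tag, which is a quick way to see how much cost still needs to be mapped back to an owner.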
Tagging policies can also be enforced through AWS Organizations in order to maintain consistent tags, including the preferred case treatment of tag keys and tag values across all the accounts that are part of the Organization.
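A tag policy is a JSON document attached through AWS Organizations. The sketch below generates one for a single tag key. The key `CostCenter`, its allowed values and the enforced resource type are hypothetical and would need to match the organization's own tagging standard.

```python
import json


def build_tag_policy(tag_key: str, allowed_values: list, enforced_for: list) -> str:
    """Return a tag policy document (as JSON text) for one tag key.

    tag_key        -- the key, in the exact capitalization to standardize on
    allowed_values -- the values that are considered compliant
    enforced_for   -- resource types for which non-compliant tagging is blocked
    """
    policy = {
        "tags": {
            tag_key: {
                "tag_key": {"@@assign": tag_key},
                "tag_value": {"@@assign": allowed_values},
                "enforced_for": {"@@assign": enforced_for},
            }
        }
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(build_tag_policy("CostCenter", ["100", "200"], ["ec2:instance"]))
    # The document could then be registered from the management account, e.g.
    #   boto3.client("organizations").create_policy(
    #       Name="cost-center-tags", Description="Standard CostCenter tag",
    #       Type="TAG_POLICY", Content=document)
```

Generating the policy from code rather than editing it by hand keeps the allowed values in one reviewable place.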
Defining cost governance policies on paper is very simple, but making sure that the policies are enforced consistently across the board is really hard. This is one of the areas where automation should be leveraged to enforce governance policies instead of relying on humans. Infrastructure as code can help enforce cost governance policies consistently whenever cloud services are created. Enforce governance on who can deploy resources and on the process for identifying, monitoring, and categorizing newly created resources.
- Enforce default tagging and tagging policies for enterprise-wide trackability and visibility.
- Bake tagging policies into automation templates and only publish approved templates with all the governance controls for consumption.
- Use services like AWS Organizations, IAM and Service Catalog to enforce tagging policies.
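One way to bake tagging rules into automation is a small check that runs before a template is published. The following sketch assumes a hypothetical set of required tag keys and a simple mapping of resource names to tags; it is an illustration of the idea, not a replacement for the AWS services listed above.

```python
# Hypothetical required tag keys; adjust to the organization's tagging standard.
REQUIRED_TAGS = ("CostCenter", "Project", "Environment")


def missing_tags(resource_tags: dict, required=REQUIRED_TAGS) -> list:
    """Return the required tag keys that are absent or blank on one resource."""
    return [key for key in required if not str(resource_tags.get(key, "")).strip()]


def audit(resources: dict, required=REQUIRED_TAGS) -> dict:
    """Map each non-compliant resource name to the tag keys it is missing."""
    report = {}
    for name, tags in resources.items():
        gaps = missing_tags(tags, required)
        if gaps:
            report[name] = gaps
    return report


if __name__ == "__main__":
    template_resources = {
        "web-server": {"CostCenter": "100", "Project": "alpha", "Environment": "prod"},
        "batch-queue": {"Project": "alpha", "Environment": " "},
    }
    print(audit(template_resources))
```

A pipeline step could fail the build whenever `audit` returns a non-empty report, so that only compliant templates reach the approved catalog.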
Usage Monitoring and Analysis
Creating cost-related metrics and then tracking them provides a data-driven decision-making mechanism. This makes it easy to understand and manage costs and identify opportunities for savings. AWS provides various mechanisms to monitor and track cost spend continuously.
Amazon CloudWatch provides billing metrics to monitor estimated charges, which are refreshed several times a day. AWS also provides the capability to set ‘Budgets’ as part of the cost management service. Budgets allow spend to be tracked against a defined budget limit as billing data is refreshed.
Amazon CloudWatch in conjunction with Budgets can help keep costs under the desired threshold. With AWS Budgets, more advanced cost tracking against the budget can be done using filters like Tag, API Operation, Service, etc. For example, a budget for a specific project can be defined using the Tag filter, and the project’s usage is then tracked against it.
Monitoring and alerting on spend is not enough to manage costs effectively; the ability to analyse spend is another key aspect of cost management on the cloud. AWS Cost Explorer provides visualization functionality to analyze cost and usage over time. Cost Explorer also provides data exploration functionality, such as the ability to group and filter cost and usage information, to help quickly and easily get to the data needed to make data-driven decisions.
Trending and variance analysis should be performed using either third-party tools or the AWS Cost Anomaly Detection feature of AWS Cost Management.
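The project budget described above could be expressed as a request to the AWS Budgets CreateBudget operation. This is a minimal sketch: the account ID, project name, limit and e-mail address are placeholders, and the `user:Project$<value>` form used for the tag filter is an assumption that should be checked against the current Budgets documentation before use.

```python
def project_budget_request(account_id: str, project: str, monthly_limit_usd: float,
                           alert_percent: float, email: str) -> dict:
    """Build keyword arguments for a monthly cost budget scoped to one project tag,
    with an e-mail alert once actual spend passes alert_percent of the limit."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": f"{project}-monthly",
            "BudgetLimit": {"Amount": f"{monthly_limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
            # Assumed filter format for a user-defined tag: "user:<key>$<value>".
            "CostFilters": {"TagKeyValue": [f"user:Project${project}"]},
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": alert_percent,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    request = project_budget_request("123456789012", "alpha", 500, 80.0,
                                     "finops@example.com")
    print(request)
    # With credentials configured:
    #   import boto3
    #   boto3.client("budgets").create_budget(**request)
```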
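To make trending and variance analysis concrete, here is a small sketch that computes period-over-period variance and flags unusual days with a trailing-window z-score. This is a deliberately simple statistical stand-in chosen for illustration. It is not how AWS Cost Anomaly Detection works internally, and the window size and threshold are arbitrary assumptions.

```python
from statistics import mean, pstdev


def percent_variance(previous: float, current: float) -> float:
    """Percentage change of current spend relative to previous spend."""
    if previous == 0:
        raise ValueError("previous spend must be non-zero")
    return (current - previous) * 100.0 / previous


def flag_anomalies(daily_costs: list, window: int = 7, z_threshold: float = 3.0) -> list:
    """Return the indexes of days whose cost deviates from the mean of the
    preceding `window` days by more than `z_threshold` standard deviations."""
    flagged = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # Flat history: any change at all is treated as unusual.
            if daily_costs[i] != mu:
                flagged.append(i)
        elif abs(daily_costs[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    costs = [100, 102, 98, 101, 99, 100, 100, 180, 101]  # made-up daily spend in USD
    print(percent_variance(1000, 1150))
    print(flag_anomalies(costs))
```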
- Define a budget and track the usage against the budget. Create alarms to get notified when usage exceeds the budget.
- Use CloudWatch billing metrics to track estimated cost throughout the day.
- Create a cost anomaly monitor in AWS Cost Management and subscribe to get notified of any aberrations.
- Review AWS Cost Anomaly Detection findings every week to identify anomalous spend and root causes.
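The CloudWatch billing alarm mentioned in the list above can be sketched as a set of parameters for the PutMetricAlarm operation. The threshold and the SNS topic ARN are placeholders. The sketch also assumes that billing alerts have been enabled for the account, and that the alarm is created in the US East (N. Virginia) Region, where billing metrics are published.

```python
def billing_alarm_request(threshold_usd: float, sns_topic_arn: str) -> dict:
    """Build keyword arguments for a CloudWatch alarm on estimated charges."""
    return {
        "AlarmName": f"estimated-charges-over-{int(threshold_usd)}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold_usd,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    # Placeholder topic ARN for illustration only.
    params = billing_alarm_request(
        1000.0, "arn:aws:sns:us-east-1:123456789012:billing-alerts")
    print(params)
    # With credentials configured:
    #   import boto3
    #   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)
```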
Pricing Model Analysis
Each component of the workload should be analysed to determine whether it will run for extended periods or be dynamic and short-running, so that the best possible pricing model can be chosen.
Savings Plans or Reserved Instances should be utilized for permanently running resources. Short-term capacity that can tolerate interruption should be configured to use Spot Instances or Spot Fleet. Short-term workloads that cannot be interrupted and do not run long enough for reserved capacity should utilize the On-Demand model.
Use the Savings Plans and Reserved Instance recommendations in Cost Explorer to perform regular analysis of commitment discounts at the management account (formerly master account) level.
- Review AWS Savings Plans and Reservation recommendations every two weeks and take the necessary actions.
- Review AWS Cost Management Rightsizing recommendations every two weeks and take the necessary actions.
- Review Amazon CloudWatch utilization metrics every two weeks to identify under-utilized resources and take the necessary actions. Use automation to retrieve the metrics data instead of reviewing it manually in the console.
- Review the AWS Cost Management utilization and coverage reports for Savings Plans and Reservations every two weeks.
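The selection rules above can be written down as a small decision function. This is a simplified sketch: the `Workload` fields are hypothetical, and a real analysis would also weigh commitment term, instance flexibility and expected utilization.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    runs_continuously: bool  # steady, long-lived usage
    interruptible: bool      # can tolerate capacity being reclaimed


def choose_pricing_model(workload: Workload) -> str:
    """Apply the rules from the text: commit for steady usage, use Spot for
    interruptible short-term work, and fall back to On-Demand otherwise."""
    if workload.runs_continuously:
        return "Savings Plans or Reserved Instances"
    if workload.interruptible:
        return "Spot Instances"
    return "On-Demand"


if __name__ == "__main__":
    for w in (
        Workload("orders-database", runs_continuously=True, interruptible=False),
        Workload("nightly-render", runs_continuously=False, interruptible=True),
        Workload("quarter-end-close", runs_continuously=False, interruptible=False),
    ):
        print(w.name, "->", choose_pricing_model(w))
```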
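When reviewing the utilization and coverage reports, it helps to keep the two ratios apart. The sketch below uses simplified definitions, stated here as assumptions: utilization is the share of the purchased commitment that was actually used, and coverage is the share of eligible spend that was covered by a commitment. The figures are made up.

```python
def utilization_percent(used_commitment: float, total_commitment: float) -> float:
    """Share of the purchased commitment that was actually consumed."""
    if total_commitment <= 0:
        raise ValueError("total_commitment must be positive")
    return used_commitment * 100.0 / total_commitment


def coverage_percent(covered_spend: float, on_demand_spend: float) -> float:
    """Share of eligible spend that was covered by a commitment."""
    eligible = covered_spend + on_demand_spend
    if eligible <= 0:
        raise ValueError("eligible spend must be positive")
    return covered_spend * 100.0 / eligible


if __name__ == "__main__":
    # Low utilization suggests over-commitment; low coverage suggests room to commit more.
    print(utilization_percent(used_commitment=90.0, total_commitment=100.0))
    print(coverage_percent(covered_spend=600.0, on_demand_spend=400.0))
```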
Cost Spend Reporting
Regular cost reporting with actionable insights drives better cloud utilization. Consistent visibility into cloud spend should be provided to all relevant parties so that teams can benchmark their costs. Over time, this leads to a set of best practices for managing costs efficiently through mechanisms like right sizing and architectural optimization.
The AWS Cost and Usage Report tracks your AWS usage and provides estimated charges associated with that usage. It gives the most granular insight possible into the costs and usage, and it is the source of truth for the billing pipeline. Cost reporting will help understand resource-level drivers of costs, and identify opportunities for optimization.
The AWS Cost and Usage Report is delivered automatically to a specified S3 bucket, and it can be downloaded directly from there. It can also be ingested into Amazon Redshift or uploaded to Amazon QuickSight for further analysis.
- Generate a cost report every two weeks and share it with all the relevant parties. The cost report must include a cost breakdown by service; recommendations for rightsizing, Savings Plans and reservations; and utilization and coverage information for Savings Plans and reservations.
- Perform trending and variance analysis every four weeks and share the report, with actionable insights, with all the relevant parties.
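Once a report file has been downloaded from the bucket, a cost breakdown by service takes only a few lines. The sketch below assumes the report was delivered in CSV format and has already been decompressed, and it relies on the `lineItem/ProductCode` and `lineItem/UnblendedCost` columns. The sample rows are invented for the example.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_service(cur_csv) -> dict:
    """Total the unblended cost per product code from an open CSV report."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(cur_csv):
        totals[row["lineItem/ProductCode"]] += Decimal(row["lineItem/UnblendedCost"])
    return dict(totals)


# Made-up rows that mimic two columns of a Cost and Usage Report.
SAMPLE_REPORT = """lineItem/ProductCode,lineItem/UnblendedCost
AmazonEC2,12.50
AmazonS3,0.75
AmazonEC2,7.25
"""

if __name__ == "__main__":
    breakdown = cost_by_service(io.StringIO(SAMPLE_REPORT))
    for service, cost in sorted(breakdown.items()):
        print(f"{service}: {cost} USD")
```

`Decimal` is used instead of `float` so that many small line items add up without rounding drift.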
As financial and cost management requirements become more complex, a holistic and centralized approach to cost management becomes vital in a large enterprise. Automation for FinOps should be centralized in order to reduce duplicated effort and improve governance.
Committed use discounts like Savings Plans and Reserved Instances yield bigger savings when managed centrally, because a holistic view of enterprise-wide resource utilization supports a bigger commitment and therefore bigger cost savings.
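A small calculation illustrates why a central view supports a bigger commitment. The sketch makes a simplifying assumption: the safe commitment is taken to be the lowest hourly usage seen in the period. The two accounts and their usage figures are invented.

```python
def compare_commitments(usage_by_account: dict) -> tuple:
    """Compare commitment sizing done per account with sizing done centrally.

    usage_by_account maps an account name to its hourly usage (USD per hour),
    with every list covering the same hours. Returns a pair of
    (sum of per-account baselines, baseline of the combined usage).
    """
    per_account = sum(min(hours) for hours in usage_by_account.values())
    combined = [sum(hour) for hour in zip(*usage_by_account.values())]
    return per_account, min(combined)


if __name__ == "__main__":
    # Two teams whose peaks fall in different hours.
    usage = {
        "team-a": [10, 4, 10, 4],
        "team-b": [4, 10, 4, 10],
    }
    separate, central = compare_commitments(usage)
    print("sized per account:", separate)
    print("sized centrally:  ", central)
```

Because one account's quiet hours can overlap with another account's busy hours, the combined baseline is never lower than the sum of the individual baselines, and it is often higher.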
- Committed use discounts, reservations and volume discounts should be centrally managed and governed.
- A dedicated squad with at least two engineers should be formed during the cloud migration phase to carry out FinOps related work for a big enterprise.
- Use as much automation as possible for data collection, transformation and report generation.
- Don’t try to do everything on day one. Set realistic but aggressive goals and iteratively introduce new mechanisms for cost management.
- Incentivise good cost management practices and identify champions to help spread the culture throughout the organization.
Cloud cost optimization potential can be unlocked through FinOps, and the key is to start early. Although FinOps practices are relatively easy to implement in small environments, a set of processes, tools and automation is required to succeed at scale in large environments and enterprises. Establishing these practices early in the cloud adoption journey can help create the right mindset to ensure success when dealing with complex enterprise environments.