Cloud Squeeze – AWS Cost Optimization & Right Sizing with AI

Capacity Planning For The Cloud – A New Way Of Thinking Needed

Published on September 13, 2017

According to IDC research, the drive from CAPEX to OPEX has created an environment in which, in 2018, the typical IT department will have only a minority of its apps and platforms residing in on-premises data centers.

The traditional mode of capacity planning focused on obtaining servers, funded by whichever applications were able to secure capital investment. Application groups had to obtain the capital needed to fund the compute resources that ran their application. Put simply, CAPEX versus OPEX is like buying a car outright and taking the yearly depreciation benefit versus leasing a vehicle for a monthly cost with some limits on miles.

By analogy, spending in the cloud is like eating at a sushi train, where the set of choices is nearly unlimited and you are billed at the end of your meal for what you actually consumed. At a fine dining restaurant with a 30-minute preparation time, by contrast, failing to plan your meal properly under time constraints means you order either too little or too much. Yet in the cloud, people still seem to use the old-style planning patterns.

A workload is a set of resources running in the cloud. The term usually refers to the compute, memory, storage, and networking characteristics of an application profile run in the cloud.

A lift-and-shift migration is a frequently sought-out method of moving infrastructure to the cloud. When capacity planning for compute, storage, memory, and networking is carried over with the traditional mindset, it creates tremendous waste, and with it opportunities for savings. This mass migration to the cloud has highlighted cost savings opportunities on top of the savings already gained by moving from traditional colocation data centers to the cloud.

If a customer has come to the cloud through a lift-and-shift migration, chances are that savings of somewhere between 30% and 70% are attainable. Remember, in the cloud, capacity is near infinite and available almost on demand when you need it.

The ease of creating capacity on demand creates scenarios of idle capacity: for example, spawning some infrastructure for a test and failing to turn it off (whether by a time-of-day schedule or an explicit shutdown). An example of underutilized capacity is having 100 TB of storage when 30 TB is needed now, 50 TB next year and possibly more the year after.
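The storage example above can be turned into a quick calculation. This is a minimal sketch using only the figures from the text (100 TB provisioned, 30 TB needed now, 50 TB next year); the function name `idle_fraction` is a hypothetical helper, not part of any AWS tool.

```python
def idle_fraction(provisioned_tb: float, needed_tb: float) -> float:
    """Return the share of provisioned capacity that is not currently needed."""
    if provisioned_tb <= 0:
        raise ValueError("provisioned_tb must be positive")
    # Capacity needed beyond what is provisioned is not "idle", so clamp at zero.
    return max(provisioned_tb - needed_tb, 0.0) / provisioned_tb


PROVISIONED_TB = 100  # figure from the example in the text
needs_by_year = {"now": 30, "next year": 50}

for label, needed_tb in needs_by_year.items():
    share = idle_fraction(PROVISIONED_TB, needed_tb)
    print(f"{label}: {share:.0%} of provisioned storage is idle")
```

Even next year, half of the provisioned storage would still be paid for without being used.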

Misappropriated allocation is choosing the wrong type of infrastructure: for example, newer classes of infrastructure may do far more than the class one is currently on. Life cycle management involves choosing the right class of storage. Volume discounts or reservations can be used to reserve capacity for 1- or 3-year periods and take advantage of lower price points in exchange for committing to a longer duration of use. When a reservation is made prematurely, before underutilization is resolved, one can end up with excess capacity.

Choosing to reserve capacity in the cloud before applying the earlier steps can limit your ability to reduce cloud costs. You can get some cost savings from reservations, but you may be forced to pay for a resource irrespective of your ability to use it.
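The trade-off behind reservations can be illustrated with a break-even calculation. This is a rough sketch, not a pricing model: the hourly rates below are made-up numbers chosen only for illustration, and the function names are hypothetical. It assumes a reservation is paid for every hour, while on-demand capacity is paid for only during the hours it actually runs.

```python
def break_even_utilization(on_demand_hourly: float, reserved_hourly: float) -> float:
    """Fraction of hours a resource must run before a reservation pays off."""
    return reserved_hourly / on_demand_hourly


def cheaper_option(on_demand_hourly: float, reserved_hourly: float,
                   utilization: float) -> str:
    """Compare the average hourly cost of each option at a given utilization (0..1)."""
    on_demand_cost = on_demand_hourly * utilization  # pay only for hours used
    reserved_cost = reserved_hourly                  # pay for every hour, used or not
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


# Hypothetical rates for illustration only; real prices vary by service and region.
ON_DEMAND_HOURLY = 0.10
RESERVED_HOURLY = 0.06

print(f"break-even utilization: {break_even_utilization(ON_DEMAND_HOURLY, RESERVED_HOURLY):.0%}")
for utilization in (0.3, 0.9):
    print(f"{utilization:.0%} utilized -> {cheaper_option(ON_DEMAND_HOURLY, RESERVED_HOURLY, utilization)}")
```

Under these assumed rates, a resource that is busy only 30% of the time is cheaper on demand, which is why resolving idle and underutilized capacity first matters.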

The three key metrics when optimizing a workload for the cloud are:

1. At rest capacity:

Simply put, if no one is using this service/application, what is the absolute minimum necessary for its availability? For example, typical public cloud solutions may require a web server, a database, and possibly some middle-tier service. Did you know that entire single page apps (SPAs) built on Angular or React can be deployed fully inside an S3 bucket? There are no servers to manage, just simple storage that can be made available globally using CloudFront, Akamai, or similar caching solutions. The lowest level of at rest capacity is something that should be considered for every application, irrespective of the number of infrastructure components needed.

The at rest configuration of applications can be reduced dramatically when S3 buckets take over functions previously reserved for application servers.

2. Response velocity:

This is also called scale-up-to-demand velocity: how soon can the application respond to an increase in load? One such scale-out architecture involves spawning application servers by time of day or based on increased load. The key question is how quickly the increased demand can be handled.

3. Termination velocity:

How quickly after the burst can the application scale back down to, or close to, the “at rest” state?
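The three metrics above can be sketched in a few lines. The first function is a simplified scale-out rule that combines a time-of-day floor with a load-based count; the second measures response and termination velocity from a timeline of instance counts. Everything here is an assumption made for illustration: the function names, the business hours, the per-instance capacity, and the sample timeline are hypothetical and do not describe any particular autoscaling service.

```python
import math


def desired_instances(hour: int, requests_per_sec: float, *, at_rest: int = 1,
                      business_hours_floor: int = 3,
                      capacity_per_instance: float = 100.0) -> int:
    """Pick an instance count from a time-of-day floor and the current load."""
    scheduled = business_hours_floor if 8 <= hour < 18 else at_rest
    load_based = math.ceil(requests_per_sec / capacity_per_instance)
    return max(at_rest, scheduled, load_based)


def minutes_until(samples, start_minute, reached):
    """Minutes after start_minute until reached(count) first holds, else None.

    samples is a list of (minute, instance_count) pairs sorted by minute.
    """
    for minute, count in samples:
        if minute >= start_minute and reached(count):
            return minute - start_minute
    return None


# Hypothetical timeline: a burst starts at minute 10 and ends at minute 30.
samples = [(0, 2), (10, 2), (15, 5), (20, 8), (30, 8), (40, 6), (50, 3), (60, 2)]
AT_REST, PEAK_NEEDED = 2, 8

response = minutes_until(samples, 10, lambda count: count >= PEAK_NEEDED)
termination = minutes_until(samples, 30, lambda count: count <= AT_REST)
print(f"response velocity: {response} min, termination velocity: {termination} min")
```

Tracking these two durations alongside the at rest instance count gives a simple baseline to compare before and after any optimization.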

By combining tagging mechanisms, the cost of compute resources in the cloud can be attributed down to each application, visitor, or geographic region, introducing new paradigms for capacity planning and cost allocation.
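As a rough illustration of tag-based cost allocation, the sketch below sums billing line items by the value of a chosen tag. The record layout, tag keys, and amounts are invented for the example and do not mirror any specific billing export format.

```python
from collections import defaultdict


def allocate_costs(line_items, tag_key):
    """Sum the cost of each line item under the value of the given tag."""
    totals = defaultdict(float)
    for item in line_items:
        # Resources missing the tag are grouped so they stay visible.
        owner = item["tags"].get(tag_key, "untagged")
        totals[owner] += item["cost"]
    return dict(totals)


# Hypothetical line items for illustration only.
line_items = [
    {"cost": 120.0, "tags": {"application": "storefront", "region": "us-east"}},
    {"cost": 30.0, "tags": {"application": "storefront", "region": "eu-west"}},
    {"cost": 45.5, "tags": {"application": "reporting", "region": "us-east"}},
    {"cost": 10.0, "tags": {}},
]

print(allocate_costs(line_items, "application"))
print(allocate_costs(line_items, "region"))
```

The same line items can be rolled up by application or by region simply by changing the tag key, and the "untagged" bucket shows how much spend cannot yet be attributed.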

If you’d like to learn more about these cost saving opportunities in the cloud and how to transform capacity planning thought processes for your enterprise, you will find this 15-page in-depth white paper valuable.

To download this white paper, please follow these 2 steps:

1. Tag a friend in the comments. Or, write “Yes” if you prefer. This helps us tremendously in getting the message out to as many cloud users as possible. A like, share or comment helps.

2. Download the white paper below: Get the full white paper


