Bulk AWS EC2 pricing

By October 1, 2020 October 16th, 2020 Powerlearnings

Written by Mudit Jain, Principal Architect at Powerupcloud Technologies.

Your Amazon EC2 usage is calculated by either the hour or the second based on the size of the instance, operating system, and the AWS Region where the instances are launched. Pricing is per instance hour consumed for each instance, from the time an instance is launched until it is terminated or stopped. However, there is no bulk upload in a simple monthly calculator and calculator. AWS, with the following options to execute the same.

  • There are paid discovery tools that also do bulk pricing. However, often initial budgeting TCO calculation is done way ahead in the pipeline.
  • Manually add entries to AWS calculator and simple monthly calculator. However, this is not feasible with a large number of VMs on enterprise-level and manual entries are error-prone.
  • You can work with the excels but it has its own limitations.

To  overcome the above-mentioned, we have arrived at an effective 4-steps solution process, they are,

  1. Source of truth download 
  2. Source file cleanup
  3. One-time setup
  4. Bulk Pricing:
    1. Source file preparation
    2. pricing

1. Source of truth download

AWS publishes its service EC2 pricings here.

The documentation for Using the bulk API.

2. Source file cleanup

Use any python with Pandas pre-installed.

#!/usr/bin/env python
# coding: utf-8
# In[1]:
import pandas as pd
# In[2]:
srcindex=pd.read_csv('/Users/mudit/Downloads/index2.csv',header=5,low_memory=True)
# In[3]:
vmsrcindex = srcindex[srcindex['Instance Type'].notna()]
curvmsrcindex=vmsrcindex[vmsrcindex['Current Generation'] == "Yes"]
#remove non-ec2 pricings in ec2 and remove old generations pricings
# In[5]:
tencurvmsrcindex=curvmsrcindex[curvmsrcindex['Tenancy'].isin(['Shared'])]
remove0=tencurvmsrcindex[tencurvmsrcindex['PricePerUnit'] != 0]
remove_unsed=remove0[~remove0.PriceDescription.str.contains("Unused")]
#at the TCO stage 95% of our customer need pricing against shared tenancy and unused pricings  are not as relevant.
# In[8]:
filter1=remove_unsed[['Instance Type','Operating System','TermType','PriceDescription','Pre Installed S/W','LeaseContractLength','PurchaseOption','OfferingClass','Location','Unit','PricePerUnit','vCPU','Memory']]
filter2=filter1.apply(lambda x: x.astype(str).str.lower())
filter3=filter2[~filter2['PurchaseOption'].isin(['partial upfront'])]
filter4=filter3
#most customers don’t use partial upfront.
# In[9]:
filter5=filter4[~filter4['Instance Type'].str.contains("^['r','c','m']4.*")]
filterered=filter5[~filter5.PriceDescription.str.contains("byol")]
#in the aws published excels, even r4,c4 and m4's are marked as current generation. Also, at this stage most customers are not looking for byol pricing.
# In[10]:
filterered.to_csv('/Users/mudit/Downloads/filterered.csv')

3. One-time setup

#!/usr/bin/env python
# coding: utf-8
# In[76]:
import pandas as pd
# In[77]:
filtered=pd.read_csv('/Users/mudit/Downloads/filterered.csv',low_memory=True)
source2=pd.read_csv('/Users/mudit/Desktop/source.csv',header=0,low_memory=True)
#read the source file and filtered file create in previous step
# In[78]:
source2=source2.dropna(subset=['hostname', 'Instance Type', 'Operating System','Location','TermType'])
source2=source2.apply(lambda x: x.astype(str).str.lower())
filtered=filtered.apply(lambda x: x.astype(str).str.lower())
#lowercase everything for string match
# In[79]:
source2=source2[source2.columns.drop(list(source2.filter(regex=('Unnamed'))))]
processedsrc=source2[source2.columns.drop(list(source2.filter(regex=('Validation'))))]
#to drop columns used in source file for validation
# In[85]:
processedsrc_ondemand=processedsrc.loc[lambda processedsrc: processedsrc['TermType']=="ondemand"]
priced_processedsrc_ondemand=pd.merge(processedsrc_ondemand,filtered,on=['Instance Type','Operating System','TermType','Pre Installed S/W','Location'],how='left')
#find pricing for on-demand entries
# In[86]:
processedsrc_reserved=processedsrc.loc[lambda processedsrc: processedsrc['TermType']=="reserved"]
priced_processedsrc_reserved=pd.merge(processedsrc_reserved,filtered,on=['Instance Type','TermType','Operating System','Pre Installed S/W','Location','PurchaseOption','OfferingClass','LeaseContractLength'],how='left')
#find pricing for reserved entries
# In[87]:
output=pd.merge(priced_processedsrc_reserved,priced_processedsrc_ondemand,how='outer')
#summarise the findings
# In[88]:
output.to_csv('/Users/mudit/Desktop/output_pricing.csv')
#export
# In[ ]:

4. Bulk Pricing

a. Source file preparation:

It should be a fixed input format:

Please find attached sample input sheet. Columns description:

1.Hostname - any unique & mandatory string
2.Col2-6 - For reference, non-unique, Optional
3.'Instance Type' - Mandatory valid instance type.
4.'Operating System' - Mandatory and have to be one of the following:

1.'linux'
2.'rhel'
3.'suse'

1.'Pre Installed S/W' - Optional, have to be one of the following:
1.'None'
2.'sql ent'
3.'sql std'
4.'sql web'

1.'Location' - Mandatory and have to be one of the following:
1.'africa (cape town)'
2.'asia pacific (hong kong)'
3.'asia pacific (mumbai)'
4.'asia pacific (osaka-local)'
5.'asia pacific (seoul)'
6.'asia pacific (singapore)'
7.'asia pacific (sydney)'
8.'asia pacific (tokyo)'
9.'aws govcloud (us-east)'
10.'aws govcloud (us-west)'
11.'canada (central)'
12.'eu (frankfurt)'
13.'eu (ireland)'
14.'eu (london)'
15.'eu (milan)'
16.'eu (paris)'
17.'eu (stockholm)'
18.'middle east (bahrain)'
19.'south america (sao paulo)'
20.'us east (n. virginia)'
21.'us east (ohio)'
22.'us west (los angeles)'
23.'us west (n. california)'
24.'us west (oregon)'
25.'eu (stockholm)'
26.'middle east (bahrain)'
27.'south america (sao paulo)'
28.'us east (n. virginia)'
29.'us east (ohio)'
30.'us west (los angeles)'
31.'us west (n. california)'
32.'us west (oregon)'
33.'us west (n. california)'
34.'us west (oregon)'

1.'TermType' - Mandatory and have to be one of the following:
1.'ondemand'
2.'reserved'

1.'LeaseContractLength' - Mandatory for TermType=='reserved' and have to be one of the following:
1.'1yr'
2.'3yr'

1.'PurchaseOption' - Mandatory for TermType=='reserved' and have to be one of the following:
1.'all upfront'
2.'no upfront'

1.'OfferingClass' - Mandatory for TermType=='reserved' and have to be one of the following:
1.'convertible'
2.'standard'

b. pricing:

Now you have pricing against each EC2 and the excel skills can come in handy to do the first level of optimizations etc.

Download the Source Sheet here.

Will come back with more details and steps on Bulk Sizing and Pricing Exercise in our next blog.

Leave a Reply