If you are considering moving some load to the Cloud, then besides a deep knowledge of all the pros and cons and of the differences between Cloud providers, you need to understand the Cloud pricing model so that you don't regret the move later.
As an example, take a look at the Google BigQuery service.
You need to pay for storage, where one of two price tags applies:
- For active data (data modified within the last 90 days), you'll pay 2 cents/GB/month, with the first 10 GB free.
- For long-term data (data in tables that have not been modified for 90 days), the price is 1 cent/GB/month, again with the first 10 GB free.
If you modify data placed in long-term storage, the whole table reverts to active storage.
For example, if your table occupies 100 TB (taken here as 100,000 GB, and ignoring the free 10 GB), you'll pay 1,000 US$/month (12,000 US$ per year) for long-term storage, or 2,000 US$/month (24,000 US$ per year) for active storage.
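The storage arithmetic above can be sketched in a few lines of Python. This is only an illustration based on the prices quoted in this post. The function name is hypothetical, and applying the free 10 GB to a single table is my simplifying assumption.

```python
def monthly_storage_cost_usd(size_gb, active, free_gb=10):
    """Rough monthly storage cost for one table, using the prices from this post.

    Assumptions (not official billing logic): sizes are in GB, and the free
    10 GB is simply subtracted from this one table.
    """
    price_cents_per_gb = 2 if active else 1  # active vs. long-term storage
    billable_gb = max(size_gb - free_gb, 0)  # never bill a negative size
    return billable_gb * price_cents_per_gb / 100


# The 100 TB table from the example, taken as 100,000 GB.
long_term = monthly_storage_cost_usd(100_000, active=False)
active = monthly_storage_cost_usd(100_000, active=True)
print(f"long-term: {long_term:.2f} US$/month, active: {active:.2f} US$/month")
```

The results come out a few cents below 1,000 and 2,000 US$, because the sketch subtracts the free 10 GB that the rounded figures above ignore.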
Unlike with an on-premise solution, you also pay for each query you execute, based on the amount of data the query processes.
Two models are available here:
- on-demand queries (pay based on usage), for cases when you only need to execute some SQL sporadically
- flat-rate queries, where you pay 40,000 US$/month for 2,000 slots and 10,000 US$ for each additional 500 slots
A BigQuery slot is a measure of the computational capacity required to execute SQL queries. BigQuery automatically calculates how many slots each query requires, depending on the query's size and complexity.
1 slot = 1 unit of computational capacity
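To get a feeling for the flat-rate model, here is a minimal sketch that turns a slot count into a monthly bill using the numbers quoted above. The function name is hypothetical. I am also assuming that 2,000 slots is the minimum purchase and that additional slots are only sold in whole blocks of 500.

```python
import math


def flat_rate_monthly_usd(slots):
    """Monthly flat-rate price for a slot count, using this post's numbers.

    Assumptions: 2,000 slots is the smallest package and extra capacity
    comes in whole 500-slot blocks, so partial blocks are rounded up.
    """
    base_slots, base_price = 2000, 40000
    block_slots, block_price = 500, 10000
    if slots <= 0:
        raise ValueError("slots must be positive")
    extra_slots = max(slots - base_slots, 0)
    extra_blocks = math.ceil(extra_slots / block_slots)
    return base_price + extra_blocks * block_price


for wanted in (1500, 2000, 2500, 3200):
    print(wanted, "slots:", flat_rate_monthly_usd(wanted), "US$/month")
```

Under these assumptions, anything up to 2,000 slots costs the same 40,000 US$, which is why flat rate only pays off with a steady, heavy query load.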
On top of all that, if you need to stream data, you pay an additional 1 cent per 200 MB, where only successful inserts are charged.
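The streaming charge is simple enough to estimate yourself. The sketch below assumes you already know how many MB were successfully inserted. The function name is hypothetical, and the calculation ignores any rounding or minimum sizes the provider may apply.

```python
def streaming_cost_usd(successful_mb):
    """Approximate streaming cost at 1 cent per 200 MB of successful inserts.

    Failed inserts are not charged, so they must not be included in
    successful_mb. This is a plain proportional estimate, nothing more.
    """
    if successful_mb < 0:
        raise ValueError("successful_mb must not be negative")
    cents = successful_mb / 200  # one cent for every 200 MB
    return cents / 100


# Example: streaming 1,000,000 MB (about 1 TB) in a month.
print(f"{streaming_cost_usd(1_000_000):.2f} US$")
```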
Here are some tricks to minimize costs while working with Google BigQuery:
- query only the columns you need (under the hood, BigQuery is a columnar store)
- use the table preview option to explore data instead of running a query
- calculate the query price before running it, using one of the following:
  - the dry_run flag in the command line interface
  - the query validator in the UI
  - the GCP pricing calculator
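A dry run or the query validator reports how many bytes a query would process, and converting that number into money is a single multiplication. The sketch below shows it. This post does not quote the on-demand price per TB, so it is a required parameter here, and the value used in the example is only a placeholder. The function name is hypothetical, and treating 1 TB as 10^12 bytes is my assumption, chosen to match the storage arithmetic earlier.

```python
def on_demand_cost_usd(bytes_processed, usd_per_tb, bytes_per_tb=10**12):
    """Estimate the on-demand cost of one query from its processed bytes.

    bytes_processed: the figure reported by a dry run or the query validator.
    usd_per_tb: on-demand price per TB; not given in this post, so the
        caller must look it up and pass it in.
    bytes_per_tb: assumed here to be 10**12; adjust if billing uses 2**40.
    """
    if bytes_processed < 0 or usd_per_tb < 0:
        raise ValueError("inputs must not be negative")
    return bytes_processed / bytes_per_tb * usd_per_tb


PLACEHOLDER_PRICE = 5.0  # hypothetical US$ per TB, for illustration only

# Same table, two queries: all columns vs. only the two columns needed.
all_columns = on_demand_cost_usd(2 * 10**12, PLACEHOLDER_PRICE)
two_columns = on_demand_cost_usd(150 * 10**9, PLACEHOLDER_PRICE)
print(f"all columns: {all_columns:.2f} US$, two columns: {two_columns:.2f} US$")
```

If you use the google-cloud-bigquery Python client, a query job submitted with `QueryJobConfig(dry_run=True)` exposes the byte count as `total_bytes_processed`, which can be passed straight into a function like this one.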
As you can see, working with a Cloud database like Google BigQuery forces you to think about optimization.
If you don't want to follow the tuning recipes above, you'll pay for that.
You can find a similar pricing model at other cloud providers such as AWS, Azure and others.
Besides the fact that you need to be careful about what you are doing to avoid additional charges, the pricing model, as you can see from this example, is far more complex than that of an on-premise solution.
If you can choose among different cloud providers, be aware that although they provide similar services, there are huge differences among them (including in their target audience), but that is probably a topic for the next post.
I wish I could apply the same Cloud costing approach in on-premise environments, as it would force every user to think about optimization in advance.