Everything you must know about AWS Redshift pricing

With the exponential growth of AI and machine learning, the need for compute capacity has reached an all-time high. But behind computing operations comes the need to store and handle massive amounts of data. This is where services such as AWS Redshift come in handy.

Keeping the cost of these services under control becomes vital. As you’ll discover in this article, Redshift pricing is far from simple.

What you must know about AWS Redshift

AWS Redshift is a fully managed, cloud-based data warehouse service. It is designed to handle large-scale data analytics. It enables the storage, processing and analysis of vast amounts of structured and semi-structured data.

You’ve probably heard about it in connection with data-intensive workloads. Its most common use cases are data consolidation, machine learning, predictive analytics, data lake integration and real-time analytics.

To understand why AWS Redshift is such a popular choice among AWS customers, let’s look at its main features. They will also help you understand the Redshift pricing strategy covered further down in this article.

AWS Redshift covers most data warehousing use cases.

The advantages of AWS Redshift:

  • Scalability: Redshift is based on Massively Parallel Processing (MPP) technology, which makes it the go-to solution for rapid analysis of large datasets. It distributes and executes queries across multiple nodes simultaneously, ensuring high performance even with large datasets. For users whose data volume evolves over time, Redshift makes it easy to scale the warehouse up or down by adding or removing nodes without significant downtime.
  • Columnar Storage: Data is stored in columns, which greatly optimizes query performance. Instead of reading entire rows, Redshift reads only the columns a query needs.
  • Advanced Compression: Data on AWS Redshift is automatically compressed to reduce storage costs and improve query performance. The less data there is to read from disk, the better the query performance.
  • Concurrency Scaling: This is a smart feature provided by AWS. It automatically adds and removes capacity to handle unpredictable query workloads. It does not require any manual intervention and ensures consistent performance of the data warehouse.
  • Security and backups: This is a key topic for users, especially for critical datasets. Amazon Redshift takes frequent snapshots of the data, which enables easy recovery in case of data corruption or loss. Moreover, all stored data benefits from AWS encryption at rest and in transit. More broadly, access can be fine-tuned through IAM access management.
  • Integration with AWS Ecosystem: It could seem like a marketing point but it’s not! Having a warehouse that integrates seamlessly with, for example, Amazon S3 for data storage or QuickSight for business intelligence is great.

AWS Redshift pricing explained

AWS Redshift’s base pricing follows a pay-as-you-go model, allowing users to control costs based on usage. Despite its ability to handle massive amounts of data, the smallest per-hour starting point is only $0.25.

With provisioned AWS Redshift, you can choose On-Demand Instances. It means that you pay for your database by the hour with no long-term commitments or upfront fees. You can also opt for Reserved Instances for additional savings. Alternatively, Amazon Redshift Serverless allows you to pay for usage by automatically starting up, shutting down, and scaling capacity up or down based on your application’s needs. You pay only for capacity consumed while processing the workload.

We have listed below all the variables that have an impact on your Redshift costs.

1- AWS Redshift node types

There are two different node types to choose from. The choice depends on the required performance and data quantity you need to store. Those types are RA3 and DC2. If you are using AWS Redshift Serverless, the appropriate resources required to service the workload are automatically provisioned. The Serverless option is best if you don’t want to choose a node type.

When using RA3 nodes, you choose the number of nodes based on your performance requirements. You only pay for the managed storage you use.

RA3 nodes

RA3 nodes come with managed storage, so you can scale and pay for compute and managed storage independently. The metric used to estimate the size of your RA3 cluster is the amount of data you process daily.
Coming back to Redshift Managed Storage (RMS), each RA3 node uses large, high-performance SSDs for local storage. For longer-term, more durable storage, it uses AWS S3 (Simple Storage Service). When data in a node reaches the local SSD limit, RMS automatically offloads it to AWS S3.

In terms of price, there is no difference between data hosted on the local SSD and data hosted on Amazon S3. Moreover, as we previously mentioned, AWS Redshift is built and designed for large amounts of data. When workloads require ever-growing storage, your warehouse storage scales automatically without you having to add and pay for extra nodes.

DC2 nodes

Those are designed for compute-intensive data warehouses and include local SSD storage. It’s up to you to select how many nodes you need based on your performance and data size requirements.
In terms of storage, things differ from RA3 nodes. On DC2 nodes, your data is stored locally, which is why these nodes are mostly used for high-performance applications. When your data grows, you can add more nodes to increase your cluster’s storage capacity.


Tip: AWS recommends using DC2 for datasets up to 1TB for the best price-to-performance ratio. If you expect your dataset to be larger, you are better off using RA3, which lets you size compute and storage independently.

2- AWS Redshift free trial

The easiest way to get started is to give AWS Redshift a try. AWS offers a $300 credit for a 90-day trial of Redshift Serverless.

If Redshift Serverless is not available in your region, it is possible to run the trial on provisioned clusters. For the latest trial conditions, visit the AWS website.

3- On-demand pricing for Redshift

You are billed per hour based on the node types and number of nodes in your cluster. This option provides flexibility and is ideal for unpredictable workloads or for those who want to avoid long-term commitments.

AWS offers a “pause and resume” feature that suspends on-demand billing while the cluster is paused. It can be triggered manually or on a schedule. In terms of cost, while a cluster is paused you pay only for backup storage. This means you don’t have to overprovision in advance, which helps optimize Redshift costs.

How to calculate AWS Redshift on-demand price per TB per year:

(instance hourly price) x (number of hours in a year) / (number of TB per instance)

Note that for RA3 nodes the billing is split between actual data storage and compute node costs.
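The formula above can be sketched in a few lines of Python. The hourly price and per-node capacity used in the example are illustrative placeholders, not AWS list prices; substitute the figures for your own node type and region.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def on_demand_price_per_tb_year(hourly_price_usd: float, tb_per_node: float) -> float:
    """Yearly on-demand cost of one node divided by the TB it can hold."""
    if tb_per_node <= 0:
        raise ValueError("tb_per_node must be positive")
    return hourly_price_usd * HOURS_PER_YEAR / tb_per_node


# Hypothetical node: $0.25 per hour with 0.16 TB of local storage.
example = on_demand_price_per_tb_year(hourly_price_usd=0.25, tb_per_node=0.16)
print(f"${example:,.2f} per TB per year")
```

For RA3 nodes this only covers the compute part, since managed storage is billed separately.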

4- Redshift Serverless

Amazon Redshift Serverless is a pricing model that helps optimize costs

Serverless is a great option to start small and cheap. It can indeed start as low as $3 per hour! If you are already familiar with AWS serverless compute services, the concept is similar here. You only pay for the compute capacity your data warehouse consumes when it’s active. The warehouse capacity follows your workload and scales up or down automatically. Moreover, it stops when you are inactive to save costs.

The billing unit for Redshift Serverless is the RPU (Redshift Processing Unit). What does it mean? You pay on a per-second basis for the workloads you run, with a minimum of 60 seconds. The warehouse startup time isn’t charged by AWS and the price includes the automatic scaling.

To keep Redshift Serverless costs under control and avoid surprises, there is a specific setting called “Base” capacity. Alongside it, you can cap the maximum capacity (Max RPU) and the Max RPU-Hours consumed.
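To make the per-second billing with its 60-second minimum more concrete, here is a minimal sketch. The function name is ours, and the rate of $0.375 per RPU-hour is an assumption chosen so that 8 RPUs cost the $3 per hour mentioned above; check the current rate for your region.

```python
MIN_BILLED_SECONDS = 60.0  # minimum charge per workload, as described above


def serverless_workload_cost(rpus: int, seconds: float, price_per_rpu_hour: float) -> float:
    """Cost of one workload: RPUs x billed hours x price per RPU-hour."""
    billed_seconds = max(seconds, MIN_BILLED_SECONDS)
    return rpus * (billed_seconds / 3600.0) * price_per_rpu_hour


ASSUMED_RATE = 0.375  # USD per RPU-hour, hypothetical

# A full hour at 8 RPUs versus a 20-second query that is billed as 60 seconds.
print(serverless_workload_cost(8, 3600, ASSUMED_RATE))
print(serverless_workload_cost(8, 20, ASSUMED_RATE))
```

Short queries are the ones affected by the minimum: anything under one minute costs the same as a full minute.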

5- AWS Redshift Managed Storage Pricing

This is specific to the RA3 node type. The price of stored data is based on a fixed per-GB monthly rate for your region. Moreover, the rate is the same regardless of data size.

How does AWS track my usage? It’s simply calculated hourly based on the total data you have in managed storage. To keep track of the amount of data in your RA3 cluster, use AWS CloudWatch or your console.
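As a rough illustration of hourly metering, the sketch below turns hourly storage readings into a monthly charge. The proration method (summing hourly GB and dividing by the hours in a month), the 730-hour month and the $0.024 per GB-month rate are all assumptions made for the example, not figures from AWS.

```python
from typing import Sequence


def managed_storage_cost(
    hourly_gb_samples: Sequence[float],
    price_per_gb_month: float,
    hours_in_month: int = 730,  # assumed average month length in hours
) -> float:
    """Convert hourly GB readings to GB-months, then apply the flat rate."""
    gb_months = sum(hourly_gb_samples) / hours_in_month
    return gb_months * price_per_gb_month


# Hypothetical: 1,000 GB held for a full month at an assumed $0.024 per GB-month.
full_month = [1000.0] * 730
print(managed_storage_cost(full_month, price_per_gb_month=0.024))
```

The hourly readings could come from the CloudWatch storage metrics of your cluster.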

Note that no transfer costs apply to data exchanged between your managed storage and RA3 nodes.

Keep in mind that you must pay extra for your backups. Even after you terminate your cluster, you keep being charged for retaining them.

6- Redshift Spectrum pricing

When using Spectrum, you are charged per terabyte of data scanned when querying data directly in Amazon S3. This allows you to run SQL queries on vast amounts of data stored in S3.

Because Redshift Spectrum queries data in Amazon S3, you are also charged standard S3 rates for storing data in your buckets and for the requests made against them.

For Amazon Redshift Serverless, remember that queries of external data in Amazon S3 are not billed for separately. They are included in the amount billed for Amazon Redshift Serverless in RPU-hour amounts.

Here is a tip from Holori to lower your Redshift cloud bill:
It’s possible to improve query performance and reduce costs by storing data in a compressed, partitioned, columnar data format. Compressing data using one of Redshift Spectrum’s supported formats decreases costs because less data is scanned.

Likewise, storing data in a columnar format, such as Optimized Row Columnar (ORC) or Apache Parquet decreases your charges. The simple reason is that Redshift Spectrum only scans columns required by the query.
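To see why compression and columnar formats matter for a per-terabyte-scanned model, here is a small sketch. The $5 per TB rate, the 4:1 compression ratio and the "2 columns out of 10" query are hypothetical values picked for the example, and a terabyte is assumed to be 1024^4 bytes.

```python
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabytes


def spectrum_query_cost(bytes_scanned: float, price_per_tb: float) -> float:
    """Cost of a single query billed on the volume of data scanned."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


ASSUMED_PRICE_PER_TB = 5.0  # USD, hypothetical

# Uncompressed row-oriented data: the query scans the full 4 TB.
raw_cost = spectrum_query_cost(4 * BYTES_PER_TB, ASSUMED_PRICE_PER_TB)

# Same data compressed to 1 TB in a columnar format, reading 2 of 10
# equally sized columns: only 20% of the compressed data is scanned.
columnar_cost = spectrum_query_cost(1 * BYTES_PER_TB * 2 / 10, ASSUMED_PRICE_PER_TB)

print(raw_cost, columnar_cost)
```

Partitioning works the same way: any partition the query can skip is data that is never scanned, so never billed.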

AWS Redshift covers a wide range of use cases from AI to Machine Learning and standard data storage.

7- Redshift Concurrency Scaling pricing

Redshift Concurrency Scaling provides extra capacity (called transient clusters) when your workload exceeds the provisioned resources. You accumulate one hour of free Concurrency Scaling credits per day. Usage beyond the free credits is charged per second.

The advantage here is that you have neither resources to manage nor upfront costs. Also, you won’t be charged for the startup or shutdown time of the transient clusters. You can accumulate up to 30 hours of free Concurrency Scaling credits for each active cluster. Credits do not expire as long as your cluster is not terminated.

When you exceed your free credits, AWS charges you the per-second on-demand rate for a Concurrency Scaling cluster. Note that there is a one-minute minimum charge each time a Concurrency Scaling cluster is activated. The per-second on-demand rate depends on the type and number of nodes in your Redshift cluster.

If you are an AWS Redshift Serverless user, there are no separate charges for Concurrency Scaling. With Serverless, AWS takes care of scaling the resources up and down for you.
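The credit mechanism described above can be modeled roughly as follows. This is a simplified sketch: how the one-minute minimum interacts with free credits is our assumption, and the $2 per hour cluster rate is hypothetical.

```python
CREDIT_SECONDS_PER_DAY = 3600      # one free hour earned per day
MAX_CREDIT_SECONDS = 30 * 3600     # credits are capped at 30 hours
MIN_BILLED_SECONDS = 60            # one-minute minimum per activation


def accrued_credit_seconds(days_running: int) -> int:
    """Free credits earned by a cluster that has not used any yet."""
    return min(days_running * CREDIT_SECONDS_PER_DAY, MAX_CREDIT_SECONDS)


def concurrency_scaling_charge(burst_seconds, credit_seconds, cluster_rate_per_hour):
    """Return (charge in USD, remaining credit seconds) for one activation."""
    billed = max(burst_seconds, MIN_BILLED_SECONDS)
    covered = min(billed, credit_seconds)  # credits are consumed first
    charge = (billed - covered) / 3600 * cluster_rate_per_hour
    return charge, credit_seconds - covered


# Hypothetical: 10 days of credits, then an 11-hour burst at $2 per hour.
credits = accrued_credit_seconds(10)
charge, remaining = concurrency_scaling_charge(11 * 3600, credits, 2.0)
print(charge, remaining)
```

In this example ten of the eleven hours are covered by credits, so only the last hour is billed.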

8- Redshift ML Pricing

Redshift ML is used to create, train, and deploy machine learning (ML) models. If you haven’t used SageMaker before, you are eligible for a free tier that reduces your AWS Redshift costs. To be more precise, you get two free “create model” requests per month, each of which can contain up to 100,000 cells. That’s something you should clearly consider given the high price of Amazon Redshift. Note that each Create Model request also incurs an extra S3 charge, but we estimate that it should be limited to a few dollars.
Moreover, AWS lets you control your Redshift ML costs by defining the MAX_CELLS value. The default setting is 1 million (which shouldn’t incur a training cost of more than $20). Here is a rough estimate for various numbers of cells.

Price per million cells for AWS Redshift

9- Reserved Instances for AWS Redshift

If you are a reader of our blog, you shouldn’t have missed our recent article about AWS Reserved Instance pricing. If not, I encourage you to read it to get a better understanding of the mechanism behind this pricing model.

A Reserved Instance is a pricing model within AWS. It allows users to commit to a specific instance type, region, and duration (one or three years). This commitment can lead to massive savings of up to 75% compared to Redshift On-Demand pricing. You get the same capabilities as On-Demand, with the added benefit of a reduced hourly rate. Moreover, you also have the option of capacity reservation.

There are three options for Reserved Instance pricing:
  • No Upfront: You pay nothing upfront, and you commit to pay monthly over the course of one year.
  • Partial Upfront: You pay a portion of the Reserved Instance upfront, and the remainder over a one- or three-year term.
  • All Upfront: You pay for the entire Reserved Instance term (one or three years) with one upfront payment.

Keep in mind that once you commit, AWS will charge you for the resource regardless of your actual use! Don’t let it sit idle and burn cash for nothing!
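Since a reservation is billed whether or not the cluster runs, a quick break-even check helps before committing. The sketch below is a simplification: the hourly rates are hypothetical, the reserved rate is taken as an effective hourly figure with any upfront payment spread over the term, and the on-demand side assumes you pause the cluster when idle (backup storage during pauses is ignored).

```python
HOURS_PER_YEAR = 24 * 365


def break_even_utilization(on_demand_hourly: float, reserved_effective_hourly: float) -> float:
    """Fraction of the year the cluster must run for the reservation to pay off."""
    return reserved_effective_hourly / on_demand_hourly


def yearly_costs(on_demand_hourly: float, reserved_effective_hourly: float, utilization: float):
    """Return (on-demand cost, reserved cost) in USD for one year."""
    on_demand = on_demand_hourly * HOURS_PER_YEAR * utilization
    reserved = reserved_effective_hourly * HOURS_PER_YEAR  # billed even when idle
    return on_demand, reserved


# Hypothetical rates: $1.00 per hour on demand, $0.40 per hour effective when reserved.
print(break_even_utilization(1.00, 0.40))
print(yearly_costs(1.00, 0.40, utilization=0.25))
```

With these numbers, a cluster used only a quarter of the time is cheaper on demand, while anything above 40% utilization favors the reservation.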

10- Redshift Backup Storage

Amazon Redshift’s backup storage is linked to data warehouse snapshots. Charges apply for manual snapshots created via the console, API, or CLI. Automated snapshots (retained up to 35 days) are free. Recovery points in Amazon Redshift Serverless under 24 hours old are also free. Older ones incur charges as part of RMS. Data in RA3 clusters is billed at RMS rates. Manual snapshots of RA3 clusters are charged at standard Amazon S3 rates.

11- AWS Data Transfer Costs for Redshift

Data transferred between AWS S3 and AWS Redshift isn’t charged if both are in the same region. Be careful when using VPCs, as data transfer charges can also apply.
For other data transfer types, standard AWS egress/ingress pricing applies. Please refer to Amazon Data Transfer pricing for further details on this. For an overview on the egress costs, I recommend a recent article written by my colleague Alex.

How to keep AWS Redshift cost under control?

Congrats, you just went through the 11 pricing components of AWS Redshift. Now you must be wondering how not to get lost.

There are so many options, pricing models and resource types that it’s extremely easy to miss one. To avoid ending up with unexpected costs, here is some advice.

1- Optimize Redshift queries

Compress data and choose efficient storage formats to minimize storage and Redshift Spectrum costs.

Use SQA (Short Query Acceleration), which prioritizes shorter queries. These are less compute-intensive and run ahead of the larger ones. It is enabled in your workload management (WLM) settings.

Use CDC (Change Data Capture), which only processes data that has changed rather than the entire dataset. It sits between your data source and warehouse to ensure that no change goes unseen.

2- Set limits

As mentioned for Redshift ML pricing, AWS gives you the possibility to control your costs. This is done by defining a maximum cells value, which prevents unwanted surprises.

3- Use various AWS Redshift pricing model options

AWS Redshift is such a diverse and powerful tool that Amazon offers numerous pricing models to meet your needs.

When you have no visibility over your future use, opt for on-demand. Use Serverless if you prefer to let AWS handle the automatic scaling up and down of your resources. Use Reserved Instances when you are able to anticipate your workload and have visibility over the next months or years.

Moreover, in addition to the pricing models, make full use of the various credits, quotas and allowances offered by AWS. We demonstrated how the use of free Concurrency Scaling credits during busy periods can help to avoid unnecessary costs.

How Holori can help with Redshift and AWS costs?

Holori cloud cost visibility solution offers comprehensive cloud cost dashboards

Holori brings a new dimension to the understanding of your cloud environment and its cost. With modern FinOps tools such as Holori, getting started is just a matter of minutes. Connect your AWS, GCP or Azure accounts and instantly get access to powerful dashboards. They help you identify cost drivers and trends, generate comprehensive reports and dig into resource configurations. Holori also provides actionable recommendations to optimize cloud costs, such as rightsizing resources, eliminating unused services, and navigating between pricing models.

Comparing how Amazon Redshift purchase models impact your budget becomes much easier. You can define budgets and make sure that thresholds won’t be exceeded without you noticing. Optimizing is often a team effort, so Holori allows you to collaborate with colleagues and keep everyone on the same page.

If you decide to start small with smaller node sizes, follow their real usage. Make the switch to larger ones only when it becomes necessary. By monitoring your performance metrics you also avoid over provisioning which, in the end, lowers your cloud costs.

Getting started on Holori is super easy: create an account on the app, connect your AWS account and we handle the rest.

Conclusion

AWS Redshift’s pricing model can be complex. On the other hand, it’s designed to offer flexibility and scalability for a wide range of use cases. By understanding the key components of Redshift pricing, you can effectively control costs. Indeed, compute, storage, data transfer, and backups each have their own set of pricing rules. Mastering them helps you make the most of Redshift’s powerful data warehousing capabilities.

It is critical to use the right set of tools to keep a close eye on each resource’s cost and avoid bad surprises on your cloud bill.

Redshift’s pricing options can be tailored to meet your needs and help you achieve cost-efficiency in the cloud. Whether you are a small business with limited data or an enterprise running large-scale analytics, Redshift can be the perfect solution.

Manage your AWS cloud costs with Holori now: https://app.holori.com/
