Introducing Chaos Genius for Databricks Cost Optimization

Join the waitlist

Snowflake Storage Costs 101 - An In-Depth Guide to Manage and Optimize Storage Costs

In this article, we will provide valuable insights and techniques for reducing Snowflake storage costs. We'll take a closer look at how Snowflake storage costs work—and point out specific key areas where you can clean it up. This article is suitable for both new and experienced Snowflake users, covering key areas for Snowflake cost optimization and best practices.  By the end of this article, you'll have a better understanding of Snowflake storage costs and the tools & techniques needed for improving your Snowflake ROI.

How Do Snowflake Storage Costs Work?

Before we do a deep dive into the intricacies of understanding and optimizing Snowflake storage costs, let's first understand its "multi-cluster, shared data" design architecture - separating storage, compute, and cloud services. This design allows Snowflake to logically integrate these components while physically separating them. It differs from traditional data warehouses, where storage, compute, and cloud services are tightly coupled. Let's delve further into the design of a multi-cluster, shared data system, focusing on its three components:

  • Storage: This is the persistent storage layer for data stored in Snowflake. It resides in a scalable cloud storage service(such as GCS or S3) that ensures data replication, scaling, and availability without customer management. Snowflake optimizes and stores data in a columnar format within this storage layer, organized into databases as specified by the user.
  • Compute: This is a collection of independent compute resources that execute data processing tasks required for queries.
  • Cloud Services: This is a collection of system services that handle infrastructure, security, metadata, and optimization across the entire Snowflake.
Snowflake Storage Costs
Snowflake Architecture (Source: docs.snowflake.com)

Snowflake storage costs are an important factor to consider when using it. The monthly costs for storing data in Snowflake are based on a flat rate per terabyte (TB) of consumption (after being compressed). The amount charged is determined by your account type (capacity or on demand), Cloud Platform, and the region, whether in the United States, Asia, or Europe.

For example, the Capacity storage rate is ~$23 per terabyte per month for customers located in the United States and ~$24.50 per terabyte per month for customers located in the European Union region. The cost of Snowflake Credits for capacity purchases is determined upon ordering and is actually based on the size of the total committed customer purchase. On the other hand, if you decide to use an Snowflake’s On Demand Pricing, the storage rate for AWS – US East (Northern Virginia) customers is $40 per terabyte per month, whereas it is $45 dollar per terabyte per month for customers who deploy in the EU region.

Note: Snowflake storage costs can vary depending on the type of data you are storing and the duration for which you keep it.

There are several types of storage costs in Snowflake, such as:

Staged file storage cost in Snowflake: It refers to the costs associated with storing files for bulk data loading or unloading in Snowflake. In Snowflake, data loading and unloading are typically done by staging the data in a file format, such as CSV or JSON, before loading it into or unloading from the Snowflake database. When a file is staged, it calculates storage costs based on the size of the file. This means Snowflake's cost fluctuates based on the size of the data. Snowflake does not charge data ingress fees to bring data into your account but does charge for data egress [5].

Note: Data ingress fees are free of charge, however, utilizing Snowpipe or executing COPY or INSERT queries to load data will incur a compute cost.

Take a look at this guide on loading data in Snowflake.

Database storage cost in Snowflake: It refers to the costs associated with storing data in Snowflake's databases. It includes the data stored in the database tables and any historical data maintained for Time travel. So, how does time travel work in Snowflake? Time travel is a feature in Snowflake that allows users to query their data as it existed at a specific point in time. This feature is used to recover lost data, track changes in data over time, or for compliance and auditing purposes. Historical data stored for Time Travel is also included in the database costs. Snowflake automatically compresses all data stored in tables, which helps optimize Snowflake storage used for an account. The compressed file size is used to calculate the total storage used for an account, and this is used to determine the costs associated with the data stored in the databases. The compressed file size is smaller than the original file size, so the cost will eventually be lower.

Time Travel and Fail-safe Costs: These refer to the costs associated with maintaining historical data and ensuring data integrity in Snowflake. These costs are calculated per day based on the amount of data that is changed during that time period.

Time Travel Storage Costs in Snowflake:

  • Snowflake’s Time Travel feature allows users to query their data as it existed at a specific point in time, and it also provides Snowflake query optimization and query cost reduction benefits.
  • The number of days historical data is maintained is based on the table type and the Time Travel retention period for the table.
  • Snowflake minimizes the amount of storage required for historical data by maintaining only the necessary information to restore the individual table rows that were updated or deleted.

Fail-safe Storage Costs in Snowflake:

  • Snowflake’s Fail-safe features allows one to recover data even after the time travel period has ended. It is meant to be used in extreme cases, and it can only be accessed by Snowflake.
  • The costs associated with Fail-safe are calculated based on the amount of data that is stored in multiple copies.
  • Snowflake only maintains full copies of tables when tables are dropped or truncated.

Zero-copy cloning cost: It refers to the costs associated with creating a copy of a database, schema, or table along with its associated data in Snowflake without incurring any additional storage charges until changes are made to the cloned object. It is because zero-copy cloning is a metadata-only operation [4]. For example, Snowflake storage costs charges will be applied at that point if a user clones a database and then adds a new table or deletes rows from a cloned table. Zero-copy cloning has many uses beyond creating backups, such as supporting the development and test environments.

Monitoring Snowflake Storage Costs Using Snowsight.

Monitoring data storage is an important aspect of Snowflake cost reduction and ensuring that your data is stored efficiently. As an ACCOUNTADMIN, you have the ability to view data storage across your entire account as well as for individual tables [1].

Account-wide Storage:

  • As an admin, you can use Snowsight or even the classic web interface to view data storage across your entire account. But for the sake of this article, we will be using Snowsight to do it.
  • In Snowsight, navigate to Admin > Usage > Storage.
  • This page displays the total average Snowflake data storage for your account and all databases, stages, and data in fail-safe.
Snowflake Storage Costs
Monitoring Account-wide storage usage

Individual Table Storage:

  • Any user with the appropriate privileges can view the data storage for individual tables.
  • In the Classic Web UI or Snowsight, navigate to Databases > “db_name” > Tables.
  • This will show the storage usage for each table in the selected database.
Snowflake Storage Costs
Individual Table storage
Note: Keep an eye on how much data storage you're using so you're aware of any unexpected changes. This will help you find and fix any problems that might be affecting the efficiency of your data storage and help you get the most out of your storage costs.

To gain insight into historical Snowflake storage costs, users can access the ACCOUNT_USAGE and ORGANIZATION_USAGE schemas. These schemas provide detailed information about storage costs, including the amount of data stored, the duration of time the data was stored, and the cost associated with storing the data.

Viewing Snowflake Storage History for Your Organization

As we have previously discussed, the total Snowflake storage cost is comprised of various costs associated with different types of storage, including staged file storage, database table storage, and fail-safe and time-travel storage. To better understand Snowflake’s storage costs for your organization, users with the ACCOUNTADMIN role can use Snowsight to view the amount of data stored in Snowflake.

To access this storage info, you must navigate to the Admin section and select Usage from the drop-down menu.

Snowflake Storage Costs
Admin section and Usage dropdown menu

From there, select Storage from the Usage Type drop-down.

Snowflake Storage Costs
Usage section and storage dropdown menu

It is essential to select a virtual warehouse when using the dashboard to view cost and usage information from the shared snowflake database, as it consumes compute resources when retrieving this data, Snowflake recommends using an X-Small(Extra Small) warehouse for this purpose.

Snowflake Storage Costs
Warehouse selection
Snowflake Storage Costs
Warehouse size

Filter by Tag

Usage dashboard in your organization allows you to filter storage usage by a specific tag/value combination. This feature is designed to enable you to attribute Snowflake costs to specific logical units within your organization. Using tags, you can easily identify and track the storage usage for different departments, projects, or any other logical units you have defined in your organization. This feature is similar to filtering credit consumption by tag, enabling you to track and attribute costs to specific units within your organization. In both cases, using tags allows for easy tracking and cost attribution, making it easy to understand and manage your organization's storage and credit usage.

Snowflake Storage Costs
Filter by Tag

View Snowflake Storage by Type or Object

The bar graph on the Usage dashboard of your organization allows you to filter Snowflake storage usage data either By type or By object.

Snowflake Storage Costs
Filter storage usage data

Filtering By Type, you can view the size of storage for each Snowflake storage category, such as Database, Fail Safe, and Stage. Additionally, storage associated with Time Travel is included in the Database category, providing a comprehensive view of storage usage by type[2]. This feature is useful for understanding the breakdown of storage usage across different storage categories and can help identify areas where Snowflake storage usage is particularly high or low.

Filtering By Object, the graph displays the size of storage for each specific object. For example, you can view the size of a particular database or stage [2]. This feature allows you to drill down and see the detailed storage usage for specific objects, which can be useful for identifying and addressing storage usage issues at the individual object level.

Viewing Data Usage for a Table in Snowflake

Snowsight allows users with the appropriate access privileges to view the size (in bytes) of individual tables within a schema or database. This feature is useful for understanding the storage usage of specific tables and can help identify and address storage usage issues at the table level.

To view the size of a table using Snowsight, follow these steps:

  • Select Data > Databases from the main menu.
  • On the left side of the Databases page, use the database object explorer to view the database and schema.
  • Expand the database and any schema in the database to view the tables within it.
  • Click on any table to view its statistics, including its size.
Snowflake Storage Costs
Data section and Databases dropdown menu
Snowflake Storage Costs
Database object explorer
Snowflake Storage Costs
Tables in a database
Snowflake Storage Costs
Table size

Query for Table Size in Snowflake

It is possible to gain insights into tables, including their size, by writing SQL queries, instead of using the Snowsight web interface because this allows users for more flexibility + automation when working with large amounts of data [3]. A user with the proper access/privileges can list data about tables using the SHOW TABLES command.

For example, the following query will list all tables in a specific schema:

SHOW TABLES IN SCHEMA your_schema;
Snowflake Storage Costs
Querying Data for Table Size

Let's go a bit further. So now that we know what the SHOW TABLES command does,it's time to dive deep into actually calculating the size of the individual table so users with the ACCOUNTADMIN role can use SQL to view table size information by executing queries against the TABLE_STORAGE_METRICS view in the ACCOUNT_USAGE schema. This view shows the table-level storage utilization info, which can be utilized to compute the storage billing for every table in the account, including those that have been removed but are still consuming the storage fees. TABLE_STORAGE_METRICS contains storage information about all tables that a particular account holds and tons of information about table size, including the number of bytes currently being stored, the number of bytes that have been stored historically—and more. For a quick example, here is a sample query that will calculate and return the total table size in GB for the table DATASETS which is present in REAL_ESTATE_DATA_ATLAS database:

USE role ACCOUNTADMIN;
SELECT TABLE_NAME,
       TABLE_SCHEMA,
       Sum(((active_bytes + time_travel_bytes + failsafe_bytes + retained_for_clone_bytes)/1024)/1024)/1024 AS TOTAL_STORAGE_USAGE_IN_GB
FROM   REAL_ESTATE_DATA_ATLAS.INFORMATION_SCHEMA.TABLE_STORAGE_METRICS WHERE TABLE_NAME in ("DATASETS");
GROUP  BY 1,2
ORDER  BY 1,2,3
Note: To query the TABLE_STORAGE_METRICS view, the user must have ACCOUNTADMIN privilege and there might be 1-2 hour delay in updating the storage related stats for active_bytes + time_travel_bytes + failsafe_bytes + retained_for_clone_bytes in this view.

Query for Snowflake Storage Costs: ORGANIZATION_USAGE and ACCOUNT_USAGE Schemas

So before we even start to query the data for Snowflake storage cost, we need to understand that Snowflake provides two schemas that make it possible to query storage cost: ORGANIZATION_USAGE and ACCOUNT_USAGE. These schemas contain data related to usage and cost and provide very detailed and granular, analytics-ready usage data to build custom reports or even dashboards.

ORGANIZATION_USAGE schema contains data about usage and costs at the organization level, which includes data about storage usage and cost for all the databases and stages within an organization in a shared database named SNOWFLAKE. Only the users with the ORGADMIN or ACCOUNTADMIN role can access the ORGANIZATION_USAGE schema.

For more information on the ORGANIZATION_USAGE views, please refer to this.

ACCOUNT_USAGE schema contains data about usage and costs at the account level, which includes data about storage usage and cost for all the tables within an account.

For more information on the ACCOUNT_USAGE views, please refer to this.

Most views in these two schemas contain the cost of storage in terms of storage size. But, if you want to view the cost in currency instead of size, you can write queries using the USAGE_IN_CURRENCY_DAILY View. This view converts the storage size into actual currency using the daily price of a TB allowing users to easily understand the actual cost of storage in terms of currency.

SELECT 
  DATE_TRUNC('day', USAGE_DATE) AS "Usage Date", 
  SUM(USAGE) AS "Total Usage", 
  SUM(USAGE_IN_CURRENCY) AS "Total Cost" 
FROM 
  ORGANIZATION_USAGE.USAGE_IN_CURRENCY_DAILY 
WHERE 
  ORGANIZATION_NAME = 'YourOrganizationName'
  AND USAGE_TYPE = 'YourUsageType' 
  AND USAGE_DATE BETWEEN 'start_date' AND 'end_date'
  AND CURRENCY = 'CurrencyOfTheUsage'
GROUP BY 
  1 
ORDER BY 
  1 DESC 
LIMIT 
  10;

This particular query returns the total usage and total cost for a specific organization, usage type, and date range, grouped by day and ordered by decreasing usage date, with a limit of 10 rows.

Snowflake Storage Cost Reduction Techniques for Maximizing Snowflake ROI

Let's talk about some ways to keep Snowflake storage costs as low as possible. We've already talked about how storage, compute, and services are the three main things that make up Snowflake costs. Snowflake's pricing is based on how much you use it, so you only pay for what you use. It's really very important to keep an eye on certain areas to reduce storage costs, as you may be BURNING up space unnecessarily. Here are some important places to keep an eye on to find ways to clean up and cut down on storage costs.

  1. Staged files for bulk data loading and unloading consume storage space. Frequently check the STORAGE_USAGE view in ACCOUNT_USAGE, which might give an idea of how much space is consumed by stages.
  2. Unused Tables: Use Snowflake's ACCESS_HISTORY object to find how frequently the objects are queried and drop those that haven't been used in a certain period of time.
  3. Time travel Costs: Time travel is a unique feature of Snowflake that retains deleted or updated data for a specified number of days, but it is important to note that this stored history data encounters the same cost as active data. Hence, always be careful, make sure to check it, and consider the data criticality and table churn rate when setting up time-travel retention periods.
  4. Transient Tables: Consider having a transient database/schema for environments where data has low importance, such as staging environments. This will help control storage costs and avoid unnecessary data retention.

Conclusion

As with any cloud-based service, you should ALWAYS ALWAYS keep a close eye on your Snowflake storage costs to make sure you're not spending too much on resources you don't actually need. By understanding the different storage costs and using tools and techniques like Snowsight (Snowflake web user interface) and custom queries, users can get valuable information about the storage costs. Using the tips and tricks outlined in this article, you can keep your reduce your Snowflake storage costs to a minimum and improve your Snowflake ROI.


FAQs

How storage cost is calculated in Snowflake?

Snowflake storage costs are based on a flat rate per terabyte of consumption (after compression) and vary depending on the account type (capacity or on demand), Cloud Platform, and region.

What is staged file storage cost in Snowflake?

Staged file storage cost refers to the cost associated with storing files for bulk data loading or unloading in Snowflake.

How can I monitor Snowflake storage costs?

Snowflake provides tools like Snowsight to monitor storage usage at the account-wide and individual table levels. Users can also query the ACCOUNT_USAGE and ORGANIZATION_USAGE schemas to gain detailed information about storage costs.

How much does Snowflake storage cost for 1 TB?

The cost of storing 1 TB of data in Snowflake varies based on factors such as account type, Cloud Platform, and region. For Capacity storage in the United States, it's around ~$23/month, while in the EU, it's approximately ~$24.50/month. With On Demand Pricing, it's ~$40/month in the US and ~$45/month in the EU.


References

[1] “Access Control Considerations — Snowflake Documentation.” Snowflake Documentation, https://docs.snowflake.com/en/user-guide/security-access-control-considerations.html#using-the-accountadmin-role. Accessed 27 January 2023.

[2] “Exploring Snowflake Storage Cost — Snowflake Documentation.” Snowflake Documentation, https://docs.snowflake.com/en/user-guide/cost-exploring-data-storage.html#filter-by-tag.  Accessed 27 January 2023.

[3] “How to Find Snowflake Table Size? | by Alexander | Medium.” Alexander, 8 September 2021, https://alexandersks.medium.com/how-to-find-snowflake-table-size-2125880d07d7.  Accessed 27 January 2023.

[4] Avila, Joyce. Snowflake: the Definitive Guide: Architecting, Designing, and Deploying on the Snowflake Data Cloud. O'Reilly Media, 2022. Accessed 28 January 2023.

[5] “Understanding Overall Snowflake Costs — Snowflake Documentation.” Snowflake Documentation, https://docs.snowflake.com/en/user-guide/cost-understanding-overall.html#how-are-costs-incurred. Accessed 28 January 2023.

[6] Understanding Snowflake Storage Cost — Snowflake Documentation. https://docs.snowflake.com/en/user-guide/cost-understanding-data-storage.html

Pramit Marattha

Technical Content Lead

Pramit is a Technical Content Lead at Chaos Genius.

People who are also involved

“Chaos Genius has been a game-changer for our DataOps at NetApp. Thanks to the precise recommendations, intuitive interface and predictive capabilities, we were able to lower our Snowflake costs by 28%, yielding us a 20X ROI

Chaos Genius has given us a much better understanding of what's driving up our data-cloud bill. It's user-friendly, pays for itself quickly, and monitors costs daily while instantly alerting us to any usage anomalies.

Anju Mohan

Director, IT

Simon Esprit

Chief Technology Officer

Join today to get upto
30% Snowflake
savings

Join today to get upto 30% Snowflake savings

Unlock Snowflake Savings Join waitlist
Great! You've successfully subscribed.
Great! Next, complete checkout for full access.
Welcome back! You've successfully signed in.
Success! Your account is fully activated, you now have access to all content.