Snowflake Zero Copy Clone 101—An Essential Guide (2024)

Snowflake zero copy clone is a useful and advanced feature that lets users clone a database, schema, or table quickly and easily, without additional Snowflake storage costs at the time of cloning. Depending on the size of the source object, a clone typically completes within a few minutes and needs none of the complex manual configuration that copying data often requires in conventional databases. This article covers all you need to know about Snowflake zero copy clone.

Let's dive in!

What is Snowflake zero copy clone?

Snowflake zero copy clone, often referred to as "cloning", is a feature in Snowflake that creates a copy of a database, schema, or table without duplicating any physical data, so the clone consumes no extra storage space when it is created. Instead, the clone references the source object's existing data, and the original and cloned objects can then be modified independently. Snowflake zero copy cloning is fast and flexible, and additional Snowflake storage costs arise only for data that changes after the clone is created.

Use-cases of Snowflake zero copy clone

Snowflake zero copy clone provides users with substantial flexibility and freedom, with use cases like:

  • To quickly perform backups of Tables, Schemas, and Databases.
  • To create a free sandbox to enable parallel use cases.
  • To enable quick object rollback capability.
  • To create various environments (e.g., Development, Testing, Staging, etc.).
  • To test possible modifications or developments without creating a new environment.

Snowflake zero copy clone provides businesses with smarter, faster, and more flexible data management capabilities.
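
As a rough sketch of how the environment and rollback use cases above might look in practice, the Python snippet below composes CREATE ... CLONE statements as plain strings. It does not connect to Snowflake. The object names (PROD_DB, DEV_DB, EMPLOYEE_RESTORED) and the helper clone_statement are hypothetical, and the rollback example assumes Time Travel data is still available for the chosen offset.

```python
# Sketch only: builds Snowflake CLONE statements as strings, nothing is executed.
# All object names used here are hypothetical.

CLONEABLE_TYPES = {"DATABASE", "SCHEMA", "TABLE"}


def clone_statement(object_type, source, target, offset_seconds=None):
    """Return a CREATE ... CLONE statement, optionally as of a past point in time."""
    kind = object_type.upper()
    if kind not in CLONEABLE_TYPES:
        raise ValueError(f"unsupported object type in this sketch: {object_type}")
    sql = f"CREATE {kind} {target} CLONE {source}"
    if offset_seconds is not None:
        # Time Travel clause: a negative offset means "this many seconds ago".
        sql += f" AT (OFFSET => -{abs(int(offset_seconds))})"
    return sql + ";"


# One clone per environment, all based on the same production database.
environment_clones = [
    clone_statement("database", "PROD_DB", f"{env}_DB")
    for env in ("DEV", "TEST", "STAGING")
]

# Rollback-style clone: the table as it was one hour ago.
rollback_clone = clone_statement(
    "table", "EMPLOYEE", "EMPLOYEE_RESTORED", offset_seconds=3600
)

for statement in environment_clones + [rollback_clone]:
    print(statement)
```

Generating the statements from one function keeps the environment clones consistent with each other; the resulting strings could then be run through whichever Snowflake client you already use.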

How does Snowflake zero copy clone work?

The Snowflake zero copy clone feature allows users to clone a database object without making a copy of the data. This is possible because of the Snowflake micro-partitions feature, which divides all table data into small chunks that each contain between 50 and 500 MB of uncompressed data. However, the actual size of the data stored in Snowflake is smaller because the data is always stored compressed. When cloning a database object, Snowflake simply creates new metadata entries pointing to the micro-partitions of the original source object, rather than copying it for storage. This process does not involve any user intervention and does not duplicate the data itself—that's why it's called "zero copy clone".

To gain a better understanding, let's look at an example.

Consider a database table, EMPLOYEE, and its cloned snapshot, EMPLOYEE_CLONE, in a Snowflake database. The metadata layer in Snowflake connects the metadata of EMPLOYEE to the micro-partitions in the storage layer where the actual data resides. When the EMPLOYEE_CLONE table is created, Snowflake generates a new set of metadata pointing to the same micro-partitions that store the data for EMPLOYEE. Essentially, EMPLOYEE_CLONE is a new metadata layer over the data of EMPLOYEE rather than a physical copy of it. The beauty of this approach is that it enables us to create clones of tables quickly without duplicating the actual data, saving time and storage space. The two tables share micro-partitions only as long as the data is unchanged: a later modification to one table is not reflected in the other.

Snowflake zero copy clone illustration
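
The idea can be sketched with a toy model in Python. This is an illustration under simplifying assumptions, not Snowflake's internal implementation: the storage and metadata dictionaries, the clone_table helper, the partition labels, and the sample rows are all made up for the example.

```python
# Toy model, not Snowflake internals: a table is only a list of references
# to immutable micro-partitions held in a shared storage layer.

storage = {  # partition id -> rows (stand-in for compressed columnar data)
    "M-P-1": (("Ann", 10),),
    "M-P-2": (("Bob", 20),),
    "M-P-3": (("Cy", 30),),
}

metadata = {"EMPLOYEE": ["M-P-1", "M-P-2", "M-P-3"]}


def clone_table(source, target):
    """Create the clone by copying the partition references, never the rows."""
    metadata[target] = list(metadata[source])


partitions_before = len(storage)
clone_table("EMPLOYEE", "EMPLOYEE_CLONE")

# The clone points at the same partitions as the source...
print(metadata["EMPLOYEE_CLONE"])
# ...and the storage layer holds exactly as many partitions as before.
print(len(storage) == partitions_before)
```

In this model, cloning touches only the metadata dictionary, which is why the amount of stored data stays the same no matter how large the table is.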

In Snowflake, micro-partitions are immutable: they cannot be altered once they are created. When data within a micro-partition has to change, Snowflake writes a new micro-partition containing the updated data, and the existing partition is retained to support Time Travel and Fail-safe. For instance, when data stored in micro-partition M-P-3 is modified through the EMPLOYEE_CLONE table, Snowflake creates a new micro-partition (M-P-4) with the changes and references it exclusively from EMPLOYEE_CLONE, while EMPLOYEE continues to reference M-P-3. Additional Snowflake storage costs are therefore incurred only for the modified data rather than for the entire clone.

Cloned Data illustration - Snowflake zero copy clone
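
A minimal copy-on-write sketch in Python shows the same behaviour. It is a simplified model rather than Snowflake's actual mechanism; the update_partition helper, the sample rows, and the labels M-P-1 and M-P-2 are assumptions added for the example (M-P-3 and M-P-4 follow the description above).

```python
# Toy copy-on-write model (an illustration, not Snowflake's implementation).

storage = {
    "M-P-1": (("Ann", 10),),
    "M-P-2": (("Bob", 20),),
    "M-P-3": (("Cy", 30),),
}
tables = {
    "EMPLOYEE": ["M-P-1", "M-P-2", "M-P-3"],
    "EMPLOYEE_CLONE": ["M-P-1", "M-P-2", "M-P-3"],  # state right after cloning
}


def update_partition(table, old_id, new_id, new_rows):
    """Write the changed rows to a new partition and repoint only `table` at it."""
    storage[new_id] = tuple(new_rows)  # the old partition is left untouched
    refs = tables[table]
    refs[refs.index(old_id)] = new_id


# Modify data through the clone: M-P-3 is superseded by M-P-4 for the clone only.
update_partition("EMPLOYEE_CLONE", "M-P-3", "M-P-4", [("Cy", 35)])

shared = set(tables["EMPLOYEE"]) & set(tables["EMPLOYEE_CLONE"])
clone_only = set(tables["EMPLOYEE_CLONE"]) - set(tables["EMPLOYEE"])

print(sorted(shared))      # partitions still referenced by both tables
print(sorted(clone_only))  # partitions that add storage for the clone alone
```

Only the partition listed in clone_only represents extra storage attributable to the clone; everything in shared is still stored once.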

What are the benefits of Snowflake zero copy clone?

Snowflake zero copy clone feature offers a variety of beneficial characteristics. Let's look at some of the key benefits:

  • Effective data cloning: Snowflake zero copy clone allows you to create fully-usable copies of data without physically copying the data, significantly reducing the time required to clone large objects.
  • Saves storage space and costs: It doesn't physically duplicate data or underlying storage, so a newly created clone consumes no additional storage space. Storage is added only as data in the clone or the source changes, which can save on Snowflake costs.
  • Hassle-free cloning: It provides a straightforward process for creating copies of your tables, schemas, and databases using the keyword "CLONE" without needing administrative privileges.
  • Single-source data management: It creates a new set of metadata pointing to the same micro-partitions that store the original data. Each clone update generates new micro-partitions that relate solely to the clone.
  • Data Security: It maintains the same level of security as the original data. This ensures that sensitive data is protected even when it's cloned.

What are the limitations of Snowflake zero copy clone?

Snowflake zero copy clone feature offers many benefits. Still, there are certain limitations to keep in mind:

  • Resource requirements and performance impact: Cloning operations require adequate computing resources, so excessive cloning can lead to performance degradation.
  • Longer clone time for many micro-partitions: Cloning a table with a large number of micro-partitions may take longer, although it is still faster than a traditional copy.
  • Unsupported Object Types for Cloning: Cloning does not support all object types.

Which are the objects supported in Snowflake zero copy clone?

Snowflake zero copy clone feature supports cloning of the following database objects:

  • Databases
  • Schemas
  • Tables
  • Views
  • Materialized views
  • Sequences

Note: When a database object is cloned, the clone is not a physical copy of the source object; it initially references the source's underlying data, and modifications to the clone do not affect the source object. The clone has its own set of metadata, including its own access controls, so the user must ensure that the appropriate permissions are granted on the clone.

How does access control work with cloned objects in Snowflake?

When using Snowflake's zero copy clone feature, keep in mind that a cloned object does not automatically inherit the privileges granted on the source object. This means that an account admin (ACCOUNTADMIN) or the owner of the cloned object must explicitly grant any required privileges on the newly created clone.

If the source object is a database or schema, the privileges granted on its child objects are replicated to the corresponding child objects in the clone. To create a clone in the first place, however, the current role must have the necessary privileges on the source object: tables require the SELECT privilege; pipes, streams, and tasks require the OWNERSHIP privilege; and other object types require the USAGE privilege.
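
The privilege rules above can be captured in a small lookup, sketched here in Python. The snippet only encodes what this section describes (with pipes meaning Snowflake PIPE objects) and composes a GRANT statement as a string; the helper names, the role ANALYST, and the table EMPLOYEE_CLONE are hypothetical, and nothing is sent to Snowflake.

```python
# Sketch based on the privilege rules described in this section.
# Role and object names are hypothetical.

REQUIRED_SOURCE_PRIVILEGE = {
    "TABLE": "SELECT",
    "PIPE": "OWNERSHIP",
    "STREAM": "OWNERSHIP",
    "TASK": "OWNERSHIP",
}


def privilege_needed_to_clone(object_type):
    """Privilege the current role needs on the source object before cloning it."""
    # Object types not listed explicitly fall back to USAGE.
    return REQUIRED_SOURCE_PRIVILEGE.get(object_type.upper(), "USAGE")


def grant_statement(privilege, object_type, object_name, role):
    """Compose a GRANT for the clone, which starts without the source's own grants."""
    return f"GRANT {privilege} ON {object_type.upper()} {object_name} TO ROLE {role};"


for kind in ("table", "stream", "schema"):
    print(kind, "->", privilege_needed_to_clone(kind))

print(grant_statement("SELECT", "table", "EMPLOYEE_CLONE", "ANALYST"))
```

A check like this can run before a cloning script starts, so that a missing privilege is reported up front rather than partway through.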

What are the account-level objects not supported in Snowflake zero copy clone?

Some objects cannot be cloned with Snowflake zero copy clone. These include account-level objects, which exist at the account level rather than inside a database or schema. Some examples of account-level objects are:

  • Account-level roles
  • Users
  • Grants
  • Virtual Warehouses
  • Resource monitors
  • Storage integrations

Conclusion

Snowflake zero copy clone provides an innovative and cost-efficient way to clone databases, schemas, and tables without incurring additional Snowflake storage costs at the time of cloning. This streamlines the workflow, because objects can be cloned without setting up separate environments.

This article provided an overview of Snowflake zero copy clone, from how it works to its use cases, benefits, limitations, and access control.

If you're interested in delving into a comprehensive guide that walks you through the process of creating a Snowflake zero copy clone table from the ground up, be sure to take a look at this article!

FAQs

Why is it called zero copy clone?

The term "Zero Copy Clone" is used because Snowflake's cloning process doesn't involve physical data copying. It creates a reference to the source data, eliminating the need for duplication and resulting in zero additional storage costs.

How does Snowflake zero copy clone work?

Snowflake zero copy clone works by creating new metadata entries that point to the micro-partitions of the original source object instead of making a physical copy of the data.

What are the advantages of Snowflake zero copy cloning?

  • Effective data cloning without physical duplication, saving time.
  • Storage space and cost savings as it doesn't consume additional storage.
  • Hassle-free cloning process using the "CLONE" keyword.
  • Single-source data management with new metadata for each clone.
  • Maintaining data security and access controls.

What are the limitations of Snowflake zero copy clone?

  • Resource requirements and potential performance impact.
  • Longer clone time for tables with a large number of micro-partitions.
  • Not all object types are supported for cloning.

Which objects are supported in Snowflake Zero Copy Cloning?

  • Databases
  • Schemas
  • Tables
  • Views
  • Materialized views
  • Sequences

Can Snowflake stages be cloned?

Yes, individual external named stages in Snowflake can be cloned. External stages refer to buckets or containers in external cloud storage. Cloning an external stage does not affect the referenced cloud storage. However, internal (Snowflake) named stages cannot be cloned.

Can you clone internal named stages?

No, internal named stages cannot be cloned.

How does Zero Copy Cloning save time and money?

Zero Copy Cloning eliminates the need for creating multiple development environments in separate accounts, reducing costs and time spent on creating large copies of production tables.

Pramit Marattha

Technical Content Lead

Pramit is a Technical Content Lead at Chaos Genius.
