Introducing Chaos Genius for Databricks Cost Optimization

Join the waitlist

HOW TO: Use Snowflake REPLACE to Manipulate Substrings & String Patterns (2024)

Snowflake offers a wide range of string and binary functions to manipulate and transform string/character data types. These functions are useful for performing various tasks such as data cleansing, data standardization, data extraction—and data analysis processes. One of the most common and powerful string and binary functions in Snowflake is REPLACE(), which replaces all occurrences of a specified substring with another string.

In this article, we will cover everything you need to know about the Snowflake REPLACE() function for solving common string manipulation challenges. We'll explore its syntax, compare it to alternatives such as REGEXP_REPLACE() and TRANSLATE(), demonstrate its use cases with practical hands-on examples, and discuss best practices for its application—and so much more!

Let’s dive right in!!

What Is Snowflake REPLACE() Function?

Snowflake REPLACE() is a string and binary function that allows you to remove all occurrences of a specified substring, and optionally replace them with another string. Snowflake REPLACE() operates on character data types, such as VARCHAR, CHAR, TEXT and STRING.

Snowflake REPLACE() can be used for various purposes, such as:

  • Removing unwanted prefixes/suffixes from a string
  • Correcting spelling errors or typos in a string
  • Changing the format or style of a string
  • Updating outdated or incorrect information in a string
  • Performing dynamic string manipulation based on expressions or variables

Snowflake REPLACE() is simple to use, but it also offers loads of flexibility and customization options. In the next section, we will explain how the Snowflake REPLACE() works and what its arguments are.

How Does a Snowflake REPLACE() Function Work?

As we have already covered, Snowflake REPLACE() function removes all occurrences of a specified substring and optionally replaces them with another string. Put simply, Snowflake REPLACE() performs a case-sensitive search to find occurrences of the substring specified in the second argument within the original string. Whenever it encounters that substring, it will replace it with the new replacement string provided as the third argument.

Syntax for REPLACE() is straightforward:

REPLACE( <subject> , <pattern> [ , <replacement> ] )

Snowflake REPLACE() takes 3 arguments:

  • <subject>: String value to be searched and modified, which can be a literal value, a column name, or an expression that returns a character data type.
  • <pattern>: Substring to be searched for and replaced in the subject, which can be a literal value, a column name, or an expression that returns a character data type.
  • <replacement>: Substring to replace the matched pattern in the subject. This can also be a literal value, a column name, or an expression that returns a character data type.

Note:

  • <pattern> is matched using literal strings, not regular expressions.
  • <replacement> can also be an empty string ('') to remove the pattern from the subject.

Snowflake REPLACE() returns a new string value with all occurrences of the pattern replaced by the replacement. The function preserves the data type of the subject. If any of the arguments is NULL, the function returns NULL. We will discuss in detail in a later section how Snowflake REPLACE() handles NULL values.

Here is one simple example of using the Snowflake REPLACE() function:

SELECT REPLACE('Slash your Snowflake spend with ---', '---', 'Chaos Genius') as result;
Using Snowflake REPLACE to replace the string value
Using Snowflake REPLACE to replace the string value

As you can see, Snowflake REPLACE function replaces all occurrences of the string “---”  in the subject “Slash your Snowflake spend with —” with “Chaos Genius”, and returns the new string “Slash your Snowflake spend with Chaos Genius”.

Snowflake REPLACE() also helps you to remove all the occurrences of a specified substring by using an empty string ('') as the replacement argument.

For example, if you want to remove “with Chaos Genius” from a string, you can use the following query:

SELECT REPLACE('Slash your Snowflake spend with Chaos Genius', 'with Chaos Genius', '') AS result;
Removing all the occurrences of a specified substring by using an empty string as a replacement argument - Snowflake REPLACE
Removing all the occurrences of a specified substring by using an empty string as a replacement argument - Snowflake REPLACE

As you can see, Snowflake REPLACE replaces the substring “with Chaos Genius” in the subject “Slash your Snowflake spend with Chaos Genius” with an empty string, and returns the new string “Slash your Snowflake spend”.

Now, in the next section, we will compare the Snowflake REPLACE() function with another similar function, REGEXP_REPLACE()

What Is the Difference Between Snowflake REPLACE and REGEXP_REPLACE?

Snowflake REPLACE() is not the only way to replace substrings in a string value. There is another similar function called REGEXP_REPLACE that also allows you to perform replacement operations but with some additional features and flexibility.

Syntax for REGEXP_REPLACE() is straightforward:

REGEXP_REPLACE( <subject> , <pattern> [ , <replacement> , <position> , <occurrence> , <parameters> ] )

Snowflake REGEXP_REPLACE function takes 6 (4 of ‘em are optional) arguments:

  • <subject>: String value to be searched and modified.
  • <pattern>: Regular expression to be searched for and replaced in the subject.
  • <replacement>: Substring to replace the matched pattern in the subject. If an empty string is specified, the function removes all matched patterns and returns the resulting string.
  • <position>: Number of characters from the beginning of the string where the function starts searching for matches.
  • <occurrence>: Number of occurrences of the pattern to be replaced in the subject. If 0 is specified, all occurrences are replaced.
  • <parameters>: A string of characters that specify the behavior of the function. This is an optional argument that supports one or more of the following characters:
  • c: Enables case-sensitive matching.
  • i: Enables case-insensitive matching.
  • m: Enables multi-line mode. By default, multi-line mode is disabled.
  • e: Extracts sub-matches.
  • s: Enables the POSIX wildcard character. to match \n. By default, wildcard character matching is disabled.
For more details, see regular expression parameters

Now, let's dive into the main difference between Snowflake REPLACE and REGEXP_REPLACE. The primary difference is that Snowflake REGEXP_REPLACE uses regular expressions to match the pattern, while Snowflake REPLACE uses literal strings. Here is a table that quickly summarizes the differences between the Snowflake REPLACE and REGEXP_REPLACE functions.

Snowflake REPLACE

Snowflake REGEXP_REPLACE

Snowflake REPLACE() uses simple substring matching

Snowflake REGEXP_REPLACE() uses regular expressions (regex) for more complex pattern matching

Syntax: REPLACE( <subject> , <pattern> [ , <replacement> ] )

Syntax: REGEXP_REPLACE( <subject> , <pattern> [ , <replacement> , <position> , <occurrence> , <parameters> ] )

It has 3 required arguments

It has 6 arguments, most of which are optional

Snowflake REPLACE() replaces all occurrences of a specified substring, and optionally replaces them with another string

REGEXP_REPLACE() returns the subject with the specified pattern either removed or replaced by a replacement string

TLDR; Snowflake REPLACE() is best for simple substring substitutions on string data. REGEXP_REPLACE() is more advanced and configurable, but also more complex.

How does Snowflake REPLACE Function Handle Null Values?

Snowflake REPLACE function treats NULL values in a special way. If any of the arguments of the function is NULL, the function returns NULL as the result, meaning that the function does not perform any replacement operation on the subject if the pattern or the replacement is NULL. Similarly, the function does not return any value if the subject itself is NULL.

For example:

SELECT REPLACE(NULL, 'some_string', 'new_string');
Snowflake REPLACE example
Handling NULL values with Snowflake REPLACE
Handling NULL values with Snowflake REPLACE
SELECT REPLACE('string', NULL, 'new_string');
Snowflake REPLACE example
Handling NULL values with Snowflake REPLACE
Handling NULL values with Snowflake REPLACE
SELECT REPLACE('string', 'some_string', NULL);
Snowflake REPLACE example
Handling NULL values with Snowflake REPLACE
Handling NULL values with Snowflake REPLACE

Practical Examples of Snowflake REPLACE Function

Now, in this section, we will see some practical examples of using the Snowflake REPLACE function to perform various data manipulation and transformation tasks. We will use a sample table called gadgets and insert some dummy data into it:

CREATE TABLE gadgets (
  id NUMBER,
  name VARCHAR,
  description VARCHAR,
  price NUMBER
);

INSERT INTO gadgets 
VALUES
  (1, 'IPhone 15', 'Latest smartphone from Apple ', 1799),
  (2, 'Samsung Galaxy', 'Flagship smartphone from Samsung', 799),
  (3, 'MacBook Pro', 'Powerful laptop from Apple with a 13-inch display and M1 chip', 1299),
  (4, 'Dell XPS', 'Sleek laptop from Dell with Intel Core i7 processor', 1299),
  (5, 'AirPods', 'Wireless earbuds from Apple with ANC', 300),
  (6, 'Kindle', 'E-reader from Amazon with an awesome display', 429),
  (7, 'Xiaomi Tablet', 'Tablet from Xiaomi with 10-inch display', 749);
Snowflake REPLACE example
Creating gadgets table and inserting some dummy data into it - Snowflake REPLACE
Creating gadgets table and inserting some dummy data into it - Snowflake REPLACE

Here is what our gadgets table looks like:

Selecting all the records of the Snowflake gadgets table - Snowflake REPLACE
Selecting all the records of the Snowflake gadgets table - Snowflake REPLACE

Example 1—Basic Usage of Snowflake REPLACE Function

Now, we will use the Snowflake REPLACE function to replace substrings in a string. Suppose we want to change the name of the product “IPhone 15 to “Apple IPhone 19” in the name column.

We can use the following query:

SELECT id, REPLACE(name, 'IPhone 15', 'Apple IPhone 19') AS name, description, price
FROM gadgets
WHERE id = 1;
Snowflake REPLACE example
Using the Snowflake REPLACE function to replace substrings in a string
Using the Snowflake REPLACE function to replace substrings in a string

As you can see, Snowflake REPLACE function replaces the substring “iPhoneIPhone 15” in the name column with the substring “Apple IPhone 19”, and returns the new string “Apple iPhoneIPhone 19”.

Example 2—Removing Prefixes/Suffixes

Next, we will use the Snowflake REPLACE function to remove prefixes or suffixes from a string. Suppose we want to remove the gadget names from the name column and only keep the model names.

We can use the following query:

SELECT id, REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(name, 'Apple ', ''), 'Samsung ', ''), 'Xiaomi ', ''), 'Dell ', ''), 'Amazon ', '') AS name, description, price
FROM gadgets
WHERE id IN (1, 2, 3, 4, 5, 6, 7);
Snowflake REPLACE example
Removing Prefixes/Suffixes with Snowflake REPLACE
Removing Prefixes/Suffixes with Snowflake REPLACE

As you can see, Snowflake REPLACE function removes the substrings “Apple”, ”Samsung”, “Xiaomi”, “Dell” and “Amazon” from the gadgets name column by replacing them with empty strings (''), and returns the new strings with only the gadgets model names.

Example 3—Handling null values with Snowflake REPLACE()

As we saw earlier, NULL values result in NULL outputs:

SELECT id, REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(NULL, 'Apple ', ''), 'Samsung ', ''), 'Xiaomi ', ''), 'Dell ', ''), 'Amazon ', '') AS name, description, price
FROM gadgets
WHERE id IN (1, 2, 3, 4, 5, 6, 7);
Snowflake REPLACE example
Handling null values with Snowflake REPLACE()
Handling null values with Snowflake REPLACE()

As you can see, if any of the arguments of the function is NULL, the function returns NULL as the result.

Example 4—Standardizing Data Formats

Now, in this particular example, we will use the Snowflake REPLACE function to standardize the data formats in a string. Suppose we have another table called gadgets_orders and insert some dummy data into it::

CREATE TABLE gadgets_order (
  id NUMBER,
  customer_id NUMBER,
  product_id NUMBER,
  order_date VARCHAR,
  order_amount VARCHAR
);

INSERT INTO gadgets_order
VALUES
  (1, 11, 2, '2024-01-01', '1799.00'),
  (2, 22, 1, '01/02/2024', '799.00 USD'),
  (3, 33, 3, '01-03-2024', '1299.00 US Dollars'), 
  (4, 44, 4, '2024/01/04', '1299.00 Dollars'),
  (5, 55, 5, '2024/01/04', '300.00 Dollars'),
  (6, 66, 7, '2024/01/04', '429.00 Dollars'),
  (7, 77, 6, '2024/01/04', '749.00 Dollars');
Snowflake REPLACE example
Creating gadgets_order table and inserting some dummy data into it - Snowflake REPLACE
Creating gadgets_order table and inserting some dummy data into it - Snowflake REPLACE

Here is what our gadgets_order table looks like:

Selecting all the records of Snowflake gadgets_order table - Snowflake REPLACE
Selecting all the records of Snowflake gadgets_order table - Snowflake REPLACE

Suppose we want to standardize the format of the order_date column to YYYY-MM-DD, and the format of the order_amount column to $XXX.XX. We can use the following query:

SELECT id, customer_id, product_id, 
  REPLACE(
    REPLACE(order_date, '/', '-'), 
    '.', 
    '-'
  ) AS order_date, 
  '$' || REPLACE(
    REPLACE(
      REPLACE(
        REPLACE(order_amount, '$', ''), 
        'USD', 
        ''
      ), 
      'Dollars', 
      ''
    ), 
    'US', 
    ''
  ) AS order_amount 
FROM 
  gadgets_order;
Snowflake REPLACE example
Standardizing Data and Price Formats with Snowflake REPLACE
Standardizing Data and Price Formats with Snowflake REPLACE

As you can see, Snowflake REPLACE function replaces the different separators and currency symbols in the order_date and order_amount columns with the standard ones and returns the new strings.

What Is the Difference Between TRANSLATE() and Snowflake REPLACE()?

Both Snowflake REPLACE() and TRANSLATE() are String functions that can substitute characters within strings, but there are some notable differences:

Snowflake REPLACE()

Snowflake TRANSLATE()

Snowflake REPLACE() works with strings of any length

TRANSLATE() performs single character substitutions

Syntax: REPLACE( <subject> , <pattern> [ , <replacement> ] )

Syntax: TRANSLATE( <subject>, <sourceAlphabet>, <targetAlphabet> )

Snowflake REPLACE() allows more control over the find-and-replace logic

TRANSLATE() has a simpler syntax

Snowflake REPLACE() does a single find-and-replace

TRANSLATE() makes multiple translations in one operation

TLDR; TRANSLATE() is best for fast bulk character substitutions, while Snowflake REPLACE() enables more advanced string manipulation with greater flexibility.

When to Use Snowflake REPLACE Function?

Finally, in this section, we will discuss when to use the Snowflake REPLACE function and explore its benefits and limitations. Here are common use cases where Snowflake REPLACE() can be useful for:

1) Removing or Replacing Substrings

Use Snowflake REPLACE to eliminate unwanted characters, words, or phrases from a string by substituting them with an empty string (''). Also, it helps in replacing existing substrings and altering names, formats, or even styles within a string.

2) Data Cleansing and Transformation

You can use Snowflake REPLACE to correct spelling errors, typos, or inconsistencies within your data. Standardize data formats like dates, numbers, or currencies, thereby enhancing data quality and accuracy.

3) Dynamic String Manipulation

Use Snowflake REPLACE for string operations based on expressions or variables. For instance, concatenate strings, split or extract substrings, and generate new strings based on conditions or logic.

4) Simple String Alterations

You can implement Snowflake REPLACE for straightforward changes to a string, such as adding or removing prefixes or suffixes, altering the case, or reversing the order. This function streamlines string modifications easily.

5) Control Over Replacement Logic

You can use the REPLACE function to customize the replacement operation according to your needs. You can control the case sensitivity, the number of replacements, and the position of the replacement within the string. The REPLACE function gives you more flexibility and customization options than other similar functions (Like TRANSLATE()).

Conclusion

And that’s a wrap! Snowflake REPLACE() is a powerful function that allows you to replace all occurrences of a specified substring in a string value with another substring. Snowflake REPLACE() can help you perform various data manipulation and transformation tasks, such as cleansing, standardization, extraction—and analysis. As we saw in the examples above, Snowflake REPLACE() can be used for simple tasks like correcting typos as well as more complex tasks like dynamic string manipulation. But, you should always be aware of how Snowflake REPLACE() handles the null values and the case sensitivity of the arguments.

In this article, we covered:

  • What Is Snowflake REPLACE() Function?
  • How Does a Snowflake REPLACE() Function Work?
  • What Is the Difference Between Snowflake REPLACE and REGEXP_REPLACE?
  • How does Snowflake REPLACE Function Handle Null Values?
  • Practical Examples of Snowflake REPLACE Function
  • What Is the Difference Between TRANSLATE() and Snowflake REPLACE()?
  • When to Use Snowflake REPLACE Function?

…and so much more!

By now, you should be able to use Snowflake REPLACE to manipulate and transform your string data effectively. It's simple to use, yet customizable for diverse needs, helping you to effectively cleanse, transform—and standardize your string/text data.

FAQs

What is the Snowflake REPLACE() function?

Snowflake REPLACE() is a string function that finds and replaces a specified substring with a new substring in a string value. It replaces all occurrences of the specified substring.

Does Snowflake REPLACE() replace all occurrences or just the first one?

Snowflake REPLACE() will replace all occurrences of the specified substring, not just the first match.

Is Snowflake REPLACE() case-sensitive?

Yes, Snowflake REPLACE() performs case-sensitive matches by default.

How does Snowflake REPLACE() handle NULL values?

If any Snowflake REPLACE() argument is NULL, it returns NULL without performing any replace.

Can Snowflake REPLACE() insert new characters?

Yes, the replacement string can contain new characters not originally present.

Can Snowflake REPLACE() be used to remove substrings?

Yes, you can remove substrings by replacing ‘em with an empty string.

When would Snowflake REPLACE() be useful for data cleansing?

Snowflake REPLACE can correct invalid data entries, standardize formats, and fix typos/inconsistencies to improve data quality.

Can I use column values or expressions as arguments in Snowflake REPLACE()?

Yes, you can use column names or expressions that evaluate to a string instead of just literal strings.

Is there a limit on the string length supported by Snowflake REPLACE()?

The maximum string length is 16MB (16777216 characters) which is the Snowflake STRING/VARCHAR limit.

How is Snowflake TRANSLATE() different from Snowflake REPLACE()?

TRANSLATE does single-character substitutions while Snowflake REPLACE works on entire strings.

Can I use Snowflake REPLACE() to concatenate or split strings?

Yes, Snowflake REPLACE can be used alongside other string functions like CONCAT or SPLIT for such operations.

When should I avoid using Snowflake REPLACE()?

Avoid Snowflake REPLACE if you need very high performance—use TRANSLATE instead. Also if you need more advanced regex patterns use the REGEXP_REPLACE function.

Is Snowflake REPLACE() case-sensitive by default?

Yes, Snowflake REPLACE performs case-sensitive literal substring matching by default.

Can I make Snowflake REPLACE() case-insensitive?

Yes, you can by converting the string to the same case before applying REPLACE.

Does Snowflake REPLACE() support regex or wildcards?

No, Snowflake REPLACE does not support regex or wildcards, only literal substring matching.

Can Snowflake REPLACE() insert strings that don't exist in the original?

Yes, the replacement string can contain new characters not originally present.

What data types does Snowflake REPLACE() support?

Snowflake REPLACE works on STRING, VARCHAR, CHAR, TEXT, and similar string/character data types.

Pramit Marattha

Technical Content Lead

Pramit is a Technical Content Lead at Chaos Genius.

People who are also involved

“Chaos Genius has been a game-changer for our DataOps at NetApp. Thanks to the precise recommendations, intuitive interface and predictive capabilities, we were able to lower our Snowflake costs by 28%, yielding us a 20X ROI

Chaos Genius has given us a much better understanding of what's driving up our data-cloud bill. It's user-friendly, pays for itself quickly, and monitors costs daily while instantly alerting us to any usage anomalies.

Anju Mohan

Director, IT

Simon Esprit

Chief Technology Officer

Join today to get upto
30% Snowflake
savings

Join today to get upto 30% Snowflake savings

Unlock Snowflake Savings Join waitlist
Great! You've successfully subscribed.
Great! Next, complete checkout for full access.
Welcome back! You've successfully signed in.
Success! Your account is fully activated, you now have access to all content.