Introducing Chaos Genius for Databricks Cost Optimization

Join the waitlist

HOW TO: Implement Snowflake Search Optimization (2024)

The Snowflake Search Optimization Service (SOS) is a feature that improves the performance of point-lookup and analytical queries, particularly those involving selective filtering on large tables.

This article is divided into two parts. Part 1 will cover the basics of the Snowflake search optimization service, including its implementation, the associated costs, strategies for cost management, factors affecting the cost and a hands-on example. Part 2 of the article will explore the advantages of using the search optimization service, the types of queries that can benefit from it, the difference between Snowflake Clustering vs Search Optimization Service—and more!!

So, get ready to learn more about this powerful feature, and let's dive right in!

What is Snowflake Search Optimization Service (SOS)?

Search Optimization Service (SOS) is a feature of the Snowflake cloud data platform that utilizes a background maintenance service to create an optimized data structure called the search access path. This path scans all of the table’s micro partitions and records metadata, which is subsequently used by Snowflake to construct the most effective search access path for the table’s data.  When queries which perform point-lookups - many filters that retrieve a small subset of data - are run on a table with Search Optimization, the Snowflake optimizer analyzes the search access path to determine the shortest and most effective access path.

Note: Snowflake search optimization service is only available in the Enterprise edition or higher.

The Search Optimization Service (SOS) is applicable to a wide range of queries, including selective point lookup queries on tables. These queries typically return only one or a small number of distinct rows and are commonly used by business users who need fast response times for critical dashboards with highly selective filters, data scientists exploring large data volumes and looking for specific subsets of data, and data applications retrieving a small set of results based on an extensive set of filtering predicates.

Also, the SOS feature is quite useful for substring and regular expression searches (e.g. LIKE, ILIKE, RLIKE, etc.), queries on fields in  VARIANT, OBJECT and ARRAY columns that use certain types of predicates (equality predicates, IN predicates, predicates that use ARRAY_CONTAINS and ARRAYS_OVERLAP, and predicates that check for NULL values), and queries that use selected geospatial functions with GEOGRAPHY values.

Once you identify the queries that can benefit from the search optimization service, you can configure search optimization for the columns and tables used in those queries, which can lead to significant improvements in query performance.

Check out this official Snowflake documentation to find out more about it.

How Snowflake's Search Optimization Service Boosts Query Performance?

Snowflake search optimization service improves query performance by creating a set of indexes on the columns involved in the query. These indexes serve as a map that helps Snowflake quickly locate the micro-partitions containing the necessary data, reducing the amount of time and computing resources required to execute the query.

In addition to creating indexes, the Snowflake search optimization service applies filters to prune any unnecessary micro-partitions from the query. By doing this, it further reduces the amount of data that needs to be scanned, which helps to accelerate query performance.

This service is especially helpful when working with large tables or tables with a large number of micro-partitions, as it helps to ensure that only the required data is scanned, reducing the computational overhead of executing the query.

How to Enable Snowflake Search Optimization for a Table?

To enable Snowflake search optimization for a table, you need to follow these steps:

  • Make sure you have the necessary privileges to enable Snowflake search optimization.
  • Run the ALTER TABLE command to enable the Snowflake search optimization for the table.

The ALTER TABLE command to enable Snowflake search optimization service for a table is as follows:

ALTER TABLE some_table_name ADD SEARCH OPTIMIZATION

To demonstrate the default Snowflake search optimization service option for tables, we can create a new table called "my_table" that contains all the data from the "CUSTOMER" table in the "TPCH_SF100" database of the "SNOWFLAKE_SAMPLE_DATA" schema and enable Snowflake search optimization service for the table using the default settings.

Here is an example:

CREATE TABLE my_table as SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.CUSTOMER;

We can then enable Snowflake search optimization service for the table using the following command:

ALTER TABLE my_table ADD search OPTIMIZATION
Note: search optimization is done by the Snowflake background process. So it might take a ~ few minutes/hrs for the process to complete.

Also, note that you will see a sudden surge in Snowflake credit consumption while the search optimization runs and starts to build the search access path.

To estimate the costs for specific tables before committing, you can use the SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS function with the <table_name> as a parameter.

select SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS('<table_name>')
Estimating search optimization costs for <table_name> - Snowflake search optimization
Estimating search optimization costs for <table_name>

If you need to disable the service, you can easily do so with the correct privileges by running the ALTER TABLE statement with the DROP search OPTIMIZATION clause and the table name.

ALTER TABLE my_table DROP SEARCH OPTIMIZATION;

How do you check the list of columns for which Snowflake search optimization service is enabled?

To check the list of columns for which Snowflake search optimization service is enabled, you can use the following DESCRIBE command:

DESCRIBE SEARCH OPTIMIZATION ON my_table;
Describing search optimization on my_table - Snowflake search optimization
Describing search optimization on my_table

What Kinds of privileges are required to enable Snowflake search optimization service?

To enable the Snowflake search optimization service requires specific privileges. Users must have either OWNERSHIP or the ADD SEARCH OPTIMIZATION privilege on the schema containing the table they want to enable Snowflake search optimization service.

OWNERSHIP

Users with ownership of a schema can enable Snowflake search optimization service for any table within that schema.

To grant ownership of a schema, use the following command:

GRANT OWNERSHIP ON SCHEMA <schema_name> TO <user_or_role>;

ADD SEARCH OPTIMIZATION privilege on the schema

Users without ownership of a schema must have the ADD SEARCH OPTIMIZATION privilege on the schema containing the table they want to enable Snowflake search optimization service for.

To grant the ADD SEARCH OPTIMIZATION privilege, use the following command:

GRANT ADD SEARCH OPTIMIZATION ON SCHEMA <schema_name> TO <user_or_role>;

But, to use Snowflake search optimization for a query, you only need the SELECT privilege on the table. You don’t need additional privileges since search OPTIMIZATION is a table property that is automatically detected and used when querying a table.

How are Snowflake search optimization services charged?

Snowflake search optimization service is charged based on the amount of compute and storage resources used.

The overall cost of Search optimization is included in the overall cost of running Snowflake workloads.

Compute and Storage Costs

Enabling Snowflake search optimization requires additional compute and storage resources. These resources are used to maintain the search indexes and optimize the search access path.

The compute and storage costs associated with the Snowflake search optimization service are based on the number and size of columns enabled for the Snowflake search optimization service.

What are the factors Affecting the Cost of Snowflake search optimization service?

Several factors can affect the cost of the Snowflake search optimization service.

These include:

  • Length and width of the table: number of rows and columns
  • Number of distinct values for each column: increases the storage used. Can be as much as the table's size if all values are unique.
  • Churn: number of inserts, updates, and deletes on the table. Increases compute the cost for maintaining the search access paths

Ways to keep the Snowflake costs under control:

  • Only enable the search optimization service on tables that will benefit from it. Not all tables need the Snowflake search optimization service enabled.
  • Use Snowflake search optimization service on tables with large amounts of data and high numbers of micro-partitions.
  • Either group/combine INSERT, UPDATE, DELETE and MERGE operations on a table into fewer, larger transactions instead of multiple smaller transactions. By doing so, the background maintenance process needed to maintain the search optimization service will be run less frequently, which can significantly help reduce the overall Snowflake cost.
  • If the table is not clustered, consider dropping the Snowflake search optimization service from the table first, then re-adding the table/columns once you recluster the table.
  • Monitor your usage of Snowflake search optimization service carefully and adjust as needed. Use the Snowflake Account Usage to carefully monitor your usage and make adjustments as necessary to keep the costs under control.

Now that we have a clear understanding of what the Snowflake Search Optimization Service is, how this service boosts query performance, how to enable it, and how charges are applied, let's dive into in-depth performance comparison between Snowflake Search Optimized and Non-Optimized tables.

Example: Performance Comparison of Snowflake Search Optimized and Non-Optimized Tables

The example below shows Snowflake search optimization service in action.

Let’s create a table with 100 million customer records from SNOWFLAKE_SAMPLE_DATA.TPCDS_SF100TCL.CUSTOMER

Table with 100M customer record from Snowflake sample data

create table CUSTOMER as select * from  SNOWFLAKE_SAMPLE_DATA.TPCDS_SF100TCL.CUSTOMER

Step 1: Clone the table

create table customer_SOS clone CUSTOMER;

Step 2: Now, enable Snowflake search optimization service on that cloned table

ALTER TABLE customer_SOS ADD search OPTIMIZATION;

Step 4: Confirm that search Optimization is complete by examining the following fields in the output.

show tables like '%customer_SOS%';

Step 5: let's run a point-lookup query on a table WITHOUT optimization enabled.

SELECT *
FROM CUSTOMER
WHERE C_CUSTOMER_SK = '4722123';
Selecting CUSTOMER record with ID - Snowflake search optimization
Selecting CUSTOMER record with ID

As you can see below, the query profile below shows that it had to scan all the micro partitions to get the required result in 1.3 seconds.

Snowflake query profile - Snowflake search optimization
Snowflake query profile

Step 6: Finally, let's run a point-lookup query on a table WITH optimization enabled.

SELECT *
FROM customer_SOS
WHERE C_CUSTOMER_SK = '4722123';

As you can see below, the query profile is running much faster and using search optimization effectively and returning the results much faster.

Snowflake query profile - Snowflake search optimization
Snowflake query profile
Note: If you didn't see any improvement, it might be due to cached results. To reset the cache, run the following commands:

🚨 Caution: Use at your own risk! 🚨

ALTER SESSION SET USE_CACHED_RESULT = FALSE;
ALTER WAREHOUSE COMPUTE_WH SUSPEND;
ALTER WAREHOUSE COMPUTE_WH RESUME;

Conclusion

Snowflake's search optimization service is a powerful feature that can significantly enhance the performance of complex queries. Activating this service for a table is a straightforward process, but it should only be used for specific use cases with caution. While search optimization can definitely improve query performance, one needs to look deeper into identifying which tables, columns, and queries to use the service for without blowing up your Snowflake costs.

To summarize Part 1 of this article, we covered the basics of the Snowflake search optimization service, including its definition, activation process, required privileges, how they are charged and a performance comparison between optimized and non-optimized tables. In Part 2 of the article, we will delve deeper into the advantages of search optimization, the types of queries that can benefit from it, the difference between Snowflake Clustering and Search Optimization Service—and a whole lot more!


FAQs

What queries can benefit from Snowflake Search Optimization Service?

Snowflake Search Optimization Service (SOS) can benefit the following types of queries:

  • Selective point lookup queries
  • Substring and regular expression searches
  • Queries on fields in VARIANT, OBJECT, and ARRAY columns that use certain types of predicates
  • Queries that use selected geospatial functions with GEOGRAPHY values

How do you check the list of columns with enabled Search Optimization Service?

Use the "DESCRIBE SEARCH OPTIMIZATION ON my_table" command on the table to check the list of columns for which Snowflake Search Optimization Service is enabled.

How can you estimate the search optimization costs for a table?

Use the "SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS" function with the table name as a parameter to estimate the search optimization costs

How can you disable Snowflake Search Optimization for a table?

With the appropriate privileges, use the ALTER TABLE statement with the "DROP SEARCH OPTIMIZATION" clause and the table name to disable the service.

Can I enable search optimization on a table?

Yes, To enable search optimization on a table, contact Snowflake Support. It is not enabled by default.

Does search optimization require automatic clustering to be enabled?

Yes, search optimization requires automatic clustering to be enabled. Contact Snowflake Support to enable automatic clustering services if needed.


Pramit Marattha

Technical Content Lead

Pramit is a Technical Content Lead at Chaos Genius.

People who are also involved

“Chaos Genius has been a game-changer for our DataOps at NetApp. Thanks to the precise recommendations, intuitive interface and predictive capabilities, we were able to lower our Snowflake costs by 28%, yielding us a 20X ROI

Chaos Genius has given us a much better understanding of what's driving up our data-cloud bill. It's user-friendly, pays for itself quickly, and monitors costs daily while instantly alerting us to any usage anomalies.

Anju Mohan

Director, IT

Simon Esprit

Chief Technology Officer

Join today to get upto
30% Snowflake
savings

Join today to get upto 30% Snowflake savings

Unlock Snowflake Savings Join waitlist
Great! You've successfully subscribed.
Great! Next, complete checkout for full access.
Welcome back! You've successfully signed in.
Success! Your account is fully activated, you now have access to all content.