The Snowflake Search Optimization Service (SOS) is a feature that improves the performance of point-lookup and analytical queries, particularly those involving selective filtering on large tables.
This article is divided into two parts. Part 1 will cover the basics of the Snowflake search optimization service, including its implementation, the associated costs, strategies for cost management, factors affecting the cost and a hands-on example. Part 2 of the article will explore the advantages of using the search optimization service, the types of queries that can benefit from it, the difference between Snowflake Clustering vs Search Optimization Service—and more!!
So, get ready to learn more about this powerful feature, and let's dive right in!
What is Snowflake Search Optimization Service (SOS)?
Search Optimization Service (SOS) is a feature of the Snowflake cloud data platform that utilizes a background maintenance service to create an optimized data structure called the search access path. This path scans all of the table’s micro partitions and records metadata, which is subsequently used by Snowflake to construct the most effective search access path for the table’s data. When queries which perform point-lookups - many filters that retrieve a small subset of data - are run on a table with Search Optimization, the Snowflake optimizer analyzes the search access path to determine the shortest and most effective access path.
Note: Snowflake search optimization service is only available in the Enterprise edition or higher.
The Search Optimization Service (SOS) is applicable to a wide range of queries, including selective point lookup queries on tables. These queries typically return only one or a small number of distinct rows and are commonly used by business users who need fast response times for critical dashboards with highly selective filters, data scientists exploring large data volumes and looking for specific subsets of data, and data applications retrieving a small set of results based on an extensive set of filtering predicates.
Also, the SOS feature is quite useful for substring and regular expression searches (e.g. LIKE, ILIKE, RLIKE, etc.), queries on fields in VARIANT, OBJECT and ARRAY columns that use certain types of predicates (equality predicates, IN predicates, predicates that use ARRAY_CONTAINS and ARRAYS_OVERLAP, and predicates that check for NULL values), and queries that use selected geospatial functions with GEOGRAPHY values.
Once you identify the queries that can benefit from the search optimization service, you can configure search optimization for the columns and tables used in those queries, which can lead to significant improvements in query performance.
Check out this official Snowflake documentation to find out more about it.
How Snowflake's Search Optimization Service Boosts Query Performance?
Snowflake search optimization service improves query performance by creating a set of indexes on the columns involved in the query. These indexes serve as a map that helps Snowflake quickly locate the micro-partitions containing the necessary data, reducing the amount of time and computing resources required to execute the query.
In addition to creating indexes, the Snowflake search optimization service applies filters to prune any unnecessary micro-partitions from the query. By doing this, it further reduces the amount of data that needs to be scanned, which helps to accelerate query performance.
This service is especially helpful when working with large tables or tables with a large number of micro-partitions, as it helps to ensure that only the required data is scanned, reducing the computational overhead of executing the query.
How to Enable Snowflake Search Optimization for a Table?
To enable Snowflake search optimization for a table, you need to follow these steps:
- Make sure you have the necessary privileges to enable Snowflake search optimization.
- Run the ALTER TABLE command to enable the Snowflake search optimization for the table.
The ALTER TABLE command to enable Snowflake search optimization service for a table is as follows:
ALTER TABLE some_table_name ADD SEARCH OPTIMIZATION
To demonstrate the default Snowflake search optimization service option for tables, we can create a new table called "my_table" that contains all the data from the "CUSTOMER" table in the "TPCH_SF100" database of the "SNOWFLAKE_SAMPLE_DATA" schema and enable Snowflake search optimization service for the table using the default settings.
Here is an example:
CREATE TABLE my_table as SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF100.CUSTOMER;
We can then enable Snowflake search optimization service for the table using the following command:
ALTER TABLE my_table ADD search OPTIMIZATION
Note: search optimization is done by the Snowflake background process. So it might take a ~ few minutes/hrs for the process to complete.
Also, note that you will see a sudden surge in Snowflake credit consumption while the search optimization runs and starts to build the search access path.
To estimate the costs for specific tables before committing, you can use the SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS function with the <table_name> as a parameter.
If you need to disable the service, you can easily do so with the correct privileges by running the ALTER TABLE statement with the DROP search OPTIMIZATION clause and the table name.
ALTER TABLE my_table DROP SEARCH OPTIMIZATION;
How do you check the list of columns for which Snowflake search optimization service is enabled?
To check the list of columns for which Snowflake search optimization service is enabled, you can use the following DESCRIBE command:
DESCRIBE SEARCH OPTIMIZATION ON my_table;
What Kinds of privileges are required to enable Snowflake search optimization service?
To enable the Snowflake search optimization service requires specific privileges. Users must have either OWNERSHIP or the ADD SEARCH OPTIMIZATION privilege on the schema containing the table they want to enable Snowflake search optimization service.
Users with ownership of a schema can enable Snowflake search optimization service for any table within that schema.
To grant ownership of a schema, use the following command:
GRANT OWNERSHIP ON SCHEMA <schema_name> TO <user_or_role>;
ADD SEARCH OPTIMIZATION privilege on the schema
Users without ownership of a schema must have the ADD SEARCH OPTIMIZATION privilege on the schema containing the table they want to enable Snowflake search optimization service for.
To grant the ADD SEARCH OPTIMIZATION privilege, use the following command:
GRANT ADD SEARCH OPTIMIZATION ON SCHEMA <schema_name> TO <user_or_role>;
But, to use Snowflake search optimization for a query, you only need the SELECT privilege on the table. You don’t need additional privileges since search OPTIMIZATION is a table property that is automatically detected and used when querying a table.
How are Snowflake search optimization services charged?
Snowflake search optimization service is charged based on the amount of compute and storage resources used.
The overall cost of Search optimization is included in the overall cost of running Snowflake workloads.
Compute and Storage Costs
Enabling Snowflake search optimization requires additional compute and storage resources. These resources are used to maintain the search indexes and optimize the search access path.
The compute and storage costs associated with the Snowflake search optimization service are based on the number and size of columns enabled for the Snowflake search optimization service.
What are the factors Affecting the Cost of Snowflake search optimization service?
Several factors can affect the cost of the Snowflake search optimization service.
- Length and width of the table: number of rows and columns
- Number of distinct values for each column: increases the storage used. Can be as much as the table's size if all values are unique.
- Churn: number of inserts, updates, and deletes on the table. Increases compute the cost for maintaining the search access paths
Ways to keep the Snowflake costs under control:
- Only enable the search optimization service on tables that will benefit from it. Not all tables need the Snowflake search optimization service enabled.
- Use Snowflake search optimization service on tables with large amounts of data and high numbers of micro-partitions.
- Either group/combine INSERT, UPDATE, DELETE and MERGE operations on a table into fewer, larger transactions instead of multiple smaller transactions. By doing so, the background maintenance process needed to maintain the search optimization service will be run less frequently, which can significantly help reduce the overall Snowflake cost.
- If the table is not clustered, consider dropping the Snowflake search optimization service from the table first, then re-adding the table/columns once you recluster the table.
- Monitor your usage of Snowflake search optimization service carefully and adjust as needed. Use the Snowflake Account Usage to carefully monitor your usage and make adjustments as necessary to keep the costs under control.
Now that we have a clear understanding of what the Snowflake Search Optimization Service is, how this service boosts query performance, how to enable it, and how charges are applied, let's dive into in-depth performance comparison between Snowflake Search Optimized and Non-Optimized tables.
Example: Performance Comparison of Snowflake Search Optimized and Non-Optimized Tables
The example below shows Snowflake search optimization service in action.
Let’s create a table with 100 million customer records from SNOWFLAKE_SAMPLE_DATA.TPCDS_SF100TCL.CUSTOMER
Table with 100M customer record from Snowflake sample data
create table CUSTOMER as select * from SNOWFLAKE_SAMPLE_DATA.TPCDS_SF100TCL.CUSTOMER
Step 1: Clone the table
create table customer_SOS clone CUSTOMER;
Step 2: Now, enable Snowflake search optimization service on that cloned table
ALTER TABLE customer_SOS ADD search OPTIMIZATION;
Step 4: Confirm that search Optimization is complete by examining the following fields in the output.
show tables like '%customer_SOS%';
Step 5: let's run a point-lookup query on a table WITHOUT optimization enabled.
SELECT * FROM CUSTOMER WHERE C_CUSTOMER_SK = '4722123';
As you can see below, the query profile below shows that it had to scan all the micro partitions to get the required result in 1.3 seconds.
Step 6: Finally, let's run a point-lookup query on a table WITH optimization enabled.
SELECT * FROM customer_SOS WHERE C_CUSTOMER_SK = '4722123';
As you can see below, the query profile is running much faster and using search optimization effectively and returning the results much faster.
Note: If you didn't see any improvement, it might be due to cached results. To reset the cache, run the following commands:
🚨 Caution: Use at your own risk! 🚨
ALTER SESSION SET USE_CACHED_RESULT = FALSE; ALTER WAREHOUSE COMPUTE_WH SUSPEND; ALTER WAREHOUSE COMPUTE_WH RESUME;
Snowflake's search optimization service is a powerful feature that can significantly enhance the performance of complex queries. Activating this service for a table is a straightforward process, but it should only be used for specific use cases with caution. While search optimization can definitely improve query performance, one needs to look deeper into identifying which tables, columns, and queries to use the service for without blowing up your Snowflake costs.
To summarize Part 1 of this article, we covered the basics of the Snowflake search optimization service, including its definition, activation process, required privileges, how they are charged and a performance comparison between optimized and non-optimized tables. In Part 2 of the article, we will delve deeper into the advantages of search optimization, the types of queries that can benefit from it, the difference between Snowflake Clustering and Search Optimization Service—and a whole lot more!
What queries can benefit from Snowflake Search Optimization Service?
Snowflake Search Optimization Service (SOS) can benefit the following types of queries:
- Selective point lookup queries
- Substring and regular expression searches
- Queries on fields in VARIANT, OBJECT, and ARRAY columns that use certain types of predicates
- Queries that use selected geospatial functions with GEOGRAPHY values
How do you check the list of columns with enabled Search Optimization Service?
Use the "DESCRIBE SEARCH OPTIMIZATION ON my_table" command on the table to check the list of columns for which Snowflake Search Optimization Service is enabled.
How can you estimate the search optimization costs for a table?
Use the "SYSTEM$ESTIMATE_SEARCH_OPTIMIZATION_COSTS" function with the table name as a parameter to estimate the search optimization costs
How can you disable Snowflake Search Optimization for a table?
With the appropriate privileges, use the ALTER TABLE statement with the "DROP SEARCH OPTIMIZATION" clause and the table name to disable the service.
Can I enable search optimization on a table?
Yes, To enable search optimization on a table, contact Snowflake Support. It is not enabled by default.
Does search optimization require automatic clustering to be enabled?
Yes, search optimization requires automatic clustering to be enabled. Contact Snowflake Support to enable automatic clustering services if needed.