Snowflake Openflow 101: Complete Setup and Integration Guide (2025)

Moving data in and out of Snowflake can be a real pain. You often have to write complex scripts or use legacy ETL tools, and on top of that you may have to deal with delays, inconsistencies, and escalating costs. The truth is that traditional ETL setups struggle with high-volume loads, complex schemas, and low-latency change data capture, so you're forced into workarounds that often fall apart. There are plenty of modern ETL tools, but they can be hard to set up and slow to scale. Snowflake Openflow comes to the rescue. Openflow is a fully managed service built on top of Apache NiFi, which pairs a Snowflake-managed control plane with BYOC (Bring Your Own Cloud) data planes that you deploy in your VPC. It features a visual flow builder and a wide range of connectors to seamlessly link multiple sources. On top of that, it includes built-in security, scalability, and tracking, all natively baked into your Snowflake environment.

In this article, we will cover everything you need to know about Snowflake Openflow, its architecture, how to set it up from absolute scratch, compare it with existing Snowflake features/tools (alternatives), and build a real-world example that ingests files directly from Google Drive.

What Is Snowflake Openflow?

Put simply, Snowflake Openflow is a managed data integration service for Snowflake. It lets you pull and push data from virtually any source (databases, file shares, APIs, streaming topics, cloud storage buckets, even Google Drive/SharePoint and many more) directly into Snowflake (or vice versa).

Snowflake Openflow (Source: Snowflake)

Snowflake Openflow is built on top of Apache NiFi, an open source, flow-based data engine. It provides an intuitive, drag-and-drop canvas where you string together processors (think: connectors and transformations) to move data into Snowflake tables or stages.

Snowflake Openflow is tailored for modern use cases such as event-driven and streaming data, unstructured data, and low-code development. It has hundreds of processors that can handle everything from JDBC databases to Kafka, REST APIs to file shares, cloud drives to SaaS apps, you name it.

Apache NiFi, the underlying engine, was designed for this sort of work. NiFi provides a visual UI to build "flows" that can ingest, route, transform, and deliver data. It excels at real-time and streaming scenarios. NiFi is flexible, event-driven, and well-suited for flowing data continuously, whereas older ETL tools expect scheduled jobs and static workflows.

Snowflake wraps NiFi in a Snowflake-friendly package. It provides native integration points (connectors that write directly to Snowflake tables or stages) and runs on your cloud account, allowing you to maintain control. The Openflow control plane lets you build and monitor flows. The Openflow data plane is a NiFi cluster that runs the flows themselves (more on this in the architecture section). So, in brief, Snowflake Openflow eliminates the need to manually set up your own NiFi cluster or wrestle with Kafka and messaging systems; Snowflake handles practically everything for you.


Snowflake Openflow and Its Integration with Snowflake

Snowflake Openflow is tightly integrated with Snowflake security and architecture. You manage it through the Snowflake UI and authenticate with your Snowflake account. Under the hood, Openflow uses Snowflake OAuth2 authentication for its runtimes and respects Snowflake roles and privileges for control-plane actions. Only users granted Openflow system privileges can create deployments or runtimes. Data loaded by Openflow lands in your specified database, schema, and warehouse. Because the service runs in your Virtual Private Cloud (VPC), you can enforce the same networking and compliance policies as your Snowflake account.
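
For instance, the account-level grants that gate who can create deployments and runtimes look roughly like this (a minimal sketch; the role name is just an example, and the full walkthrough is in Phase 1 of the setup guide below):

USE ROLE ACCOUNTADMIN;

-- Example role name; use whatever fits your org
CREATE ROLE IF NOT EXISTS OPENFLOW_ADMIN;
GRANT CREATE OPENFLOW DATA PLANE INTEGRATION ON ACCOUNT TO ROLE OPENFLOW_ADMIN;
GRANT CREATE OPENFLOW RUNTIME INTEGRATION ON ACCOUNT TO ROLE OPENFLOW_ADMIN;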

Notable Snowflake Openflow Features

Some of the key notable features and benefits of Snowflake Openflow are:

1) Open and extensible

Snowflake Openflow is built on top of Apache NiFi. You can add custom processors or build your own flows. Snowflake provides a managed NiFi service inside your VPC.

2) Unified data integration platform

Snowflake Openflow allows you to handle complex, bi-directional data extraction and loading through a fully managed service that can be deployed directly inside your own VPC.

3) Enterprise-ready

Snowflake Openflow is fully managed by Snowflake, with enterprise-grade features. You get high availability, monitoring, metadata tracking, and security that integrate with Snowflake’s governance model. It also offers SSL/TLS, OAuth2 Authentication, and role-based access controls.

4) High-speed ingestion

Snowflake Openflow supports large-scale loading of any data (including binary files) into Snowflake. You can perform near-real-time loads or manage large bulk loads efficiently.

5) Continuous multimodal ingestion

Snowflake Openflow is very effective with live data. It can stream Google Drive updates, Kafka events, or database changes in near real time, making the data readily available for AI/ML (via Snowflake Cortex) or analytics.
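
As a quick illustration, once a flow lands parsed document text in a table, you can point Snowflake Cortex straight at it. This is a minimal sketch; the table and column names are placeholders, not objects any connector is guaranteed to create:

-- Hypothetical table/column names; substitute whatever your flow actually creates
SELECT
  file_name,
  SNOWFLAKE.CORTEX.COMPLETE(
    'llama3.1-8b',
    'Summarize this document in two sentences: ' || doc_text
  ) AS summary
FROM my_db.my_schema.drive_documents
LIMIT 10;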

🔮 TL;DR: Snowflake Openflow simplifies data movement. Instead of dealing with complex scripts or external orchestration tools, you use Snowsight (Snowflake’s UI) to launch flows and monitor progress. If a flow fails or stalls, you will receive alerts and logs in Snowflake. Openflow is multi-tenant (control plane in Snowflake) but runs in your own AWS VPC (data plane), providing both ease of use and complete control.


How Snowflake Openflow Works (Architecture)

Snowflake Openflow’s architecture cleanly separates control logic from the execution environment, using two main planes: the Control Plane and the Data Plane.

a) Openflow Control Plane (Snowflake-managed): The Control Plane is where you configure everything. It lives inside Snowflake’s infrastructure, accessible via Snowsight or APIs. You log in to Snowflake Openflow in Snowsight, create "deployments", manage user access, and author flows on the canvas. The control plane keeps track of your pipelines, deployments, runtimes, and system metadata. You never see the servers or clusters – it’s fully managed.

b) Openflow Data Plane (Customer-managed): When you create a deployment, Snowflake generates a CloudFormation template that spins up an EC2 deployment agent and an EKS (Kubernetes) cluster in your AWS account (the Bring Your Own Cloud model). These runtimes actually execute your NiFi flows. In each deployment, you can have one or more runtimes (which are essentially NiFi container clusters). You deploy connectors and custom flows onto those runtimes. Data flows (pulling from sources and pushing to Snowflake) run here. You manage scale by adjusting node count for these runtimes, and Snowflake handles syncing the NiFi container images from its System Image Registry.

Snowflake Openflow Architecture Overview - Snowflake Openflow

Here is how it comes together:

  • A Deployment is a boundary in your cloud. It is like an isolated integration workspace (often tied to a team or project). It has one or more Runtimes, and it’s backed by a set of AWS resources (VPC, EKS, load balancers) provisioned via CloudFormation.
  • A Runtime is a container cluster (an EKS node group) that runs the actual NiFi flows (called "data pipelines"). You can resize it (scale-out nodes) or even run multiple runtimes for Dev/Test vs Prod in the same deployment.
  • The Openflow Control Plane (layer) contains the Openflow UI and API that you use to create deployments/runtimes and observe them. It’s where you go to launch connectors or build flows from scratch.

Each deployment in AWS has its own EKS cluster, NAT gateway, private subnets, etc. A lightweight Snowflake-provided deployment agent EC2 instance boots up first, syncs Openflow container images from the Snowflake registry, and then deploys the rest of the stack. Importantly, this all happens inside your AWS account – Snowflake never manages your data plane infrastructure (except the images).

Authentication and networking: Snowflake Openflow uses OAuth2 authentication for the runtimes to talk to Snowflake, and each NiFi flow connects to Snowflake via a Snowflake service user and key pair (similar to Snowpark Container Services). From a network standpoint, you can enable AWS PrivateLink so that the data plane communicates with Snowflake privately rather than over the public internet. At the very least, the deployment's NAT gateway must allow the NiFi cluster to communicate with Snowflake (to execute COPY commands or schema queries).

🔮 TL;DR: Snowflake Openflow’s architecture splits the job: Snowflake handles the front-end and orchestration (control), and your cloud runs the NiFi engine (data). You can spin up new deployments for different business units or environments, each with its own isolation, but all managed from one Snowflake interface.

Snowflake Openflow vs Snowflake Snowpipe vs Snowflake COPY

Snowflake already offers a couple of data load methods: the Snowflake COPY command (manual or scheduled bulk loads) and Snowflake Snowpipe (continuous file loading). So, where does Snowflake Openflow fit in? It’s important to note that Snowflake Openflow isn’t a direct replacement for these tools but rather a powerful new alternative. Let’s explore the key differences and use cases to understand how it complements the existing options.

Snowflake Integration Comparison

| | Snowflake Openflow | Snowflake Snowpipe | Snowflake COPY Command |
| --- | --- | --- | --- |
| Data Source Types | Virtually any: cloud files, APIs, DBs, messaging, SaaS. Can ingest unstructured content (images, audio). | Cloud storage stages only (files in S3/GCS/Azure). Can handle structured/semi-structured data (CSV, JSON, Parquet). | Cloud storage stages only (same as Snowpipe). Mostly structured/semi-structured data (JSON, CSV). |
| Pipeline Scope | End-to-end data integration (extract and load). It pulls from sources and writes to Snowflake. | Load-only. Snowpipe only loads data already in a stage; it does not extract from source systems. | Load-only (via user-triggered COPY commands on staged files). No extraction logic. |
| Load Mode (Batch vs Stream) | Supports batch, micro-batch, or true streaming, depending on flow design. NiFi can poll or subscribe to streams (Kafka/Kinesis) or poll files/APIs. | Continuous micro-batch for files by default (auto-triggered by events). The separate Snowpipe Streaming feature handles event/row streams via a client. | Batch. You run COPY INTO table ... manually or via a script/SQL schedule. |
| Triggering | Flows can be triggered on a schedule, on file events, or run continuously. You control triggers within NiFi. | Fully automated. Triggered by cloud storage events or REST API calls when new files arrive. | Fully manual. You issue COPY (or use Snowflake Tasks) on a schedule or when ready. |
| Latency | Low (near real-time) for streaming flows; seconds to minutes for file flows, depending on configuration. | Low (~30 seconds median) for file loads. Snowpipe Streaming can be ~5 seconds for small row sets. | Higher: depends on schedule. A large batch COPY might take minutes; no built-in low-latency mode. |
| Compute | NiFi runs on customer-managed compute (or fully managed). Snowflake compute is used when writing (via the Snowflake sink connector). | Snowflake-managed compute. Snowpipe uses Snowflake's serverless workers. | Customer-managed. You need to provision a warehouse to run the COPY command. |
| Scalability | Horizontal scaling: NiFi can run multi-node clusters (scale out within your VPC). You can add nodes for heavy loads. | Automatic scaling internally (Snowflake handles load). | Scale by choosing a larger warehouse or a multi-cluster warehouse. |
| Data Types | Any data type. Ingest JSON, Parquet, ORC, XML, CSV, images, PDFs, audio, video, binary, etc. Great for unstructured and multimodal data. | Structured/semi-structured only (Parquet, JSON, Avro, CSV, XML). Does not natively load images or other blobs. | Same as Snowpipe (semi-structured formats). |
| Integration | Very broad. Openflow can ingest databases (CDC), stream events (Kafka), or crawl SaaS objects (Google Drive, Slack, ...). | Limited to files. Good for log ingest and cloud dumps. | Limited to files. Good for one-off or periodic batch loads of data files. |
| Setup Complexity | Higher: you must deploy a VPC/EKS stack (or use a Snowflake-managed VPC), set up roles and a Snowflake image repository, etc. Once set up, adding new flows is easy. | Low: just define a PIPE with event notifications. No extra infrastructure beyond your storage. | Low: just COPY from an existing stage, or use the Snowflake UI/CLI to run it. |

🔮 TL;DR: Snowflake Snowpipe (and Snowflake COPY) is great if your data is already stored in cloud storage and you just want to load it into Snowflake as it is. But if you need to connect to other systems (on-premise DBs, APIs, SaaS) or handle streaming, CDC, or unstructured data, they fall short. Snowflake Openflow fills that gap by automating the entire ingestion process from source to Snowflake. Snowflake Openflow is like a broader data-moving platform, whereas Snowflake Snowpipe/Snowflake COPY are specific Snowflake loading mechanisms.

Therefore, if your team requires flexible, general-purpose, visual pipelines, especially for non-file sources or complex transformation logic, Snowflake Openflow is the ideal new option. If your workflow primarily involves dropping CSVs into S3 and loading them, Snowflake Snowpipe or the COPY command might still be simpler.
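
For reference, here is roughly what the Snowpipe and COPY approaches look like in SQL. This is an illustrative sketch only; the database, schema, stage, and pipe names are placeholders:

-- One-off or scheduled bulk load with COPY (placeholder object names)
COPY INTO my_db.my_schema.events
  FROM @my_db.my_schema.my_s3_stage/events/
  FILE_FORMAT = (TYPE = 'JSON');

-- Continuous loading with Snowpipe, auto-triggered by S3 event notifications
CREATE PIPE my_db.my_schema.events_pipe
  AUTO_INGEST = TRUE
  AS
  COPY INTO my_db.my_schema.events
    FROM @my_db.my_schema.my_s3_stage/events/
    FILE_FORMAT = (TYPE = 'JSON');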

🔮 Step-by-Step Setup Guide to Activate and Set Up Snowflake Openflow From Scratch

To get Snowflake Openflow up and running, you'll work through three main phases: Phase 1 prepares your Snowflake account, Phase 2 deploys the Openflow service infrastructure in your cloud environment, and Phase 3 creates a runtime environment where your data flows will execute. Let's go through each step.

Prerequisites

  • Snowflake Edition — Snowflake Openflow is generally available (GA) in AWS commercial regions, so make sure your account runs in a supported AWS commercial region. It is not supported on trial accounts and is currently available only on AWS (not Azure or GCP).
  • Privileges:
    • You need ORGADMIN (org-level admin) to accept terms once.
    • You need ACCOUNTADMIN or another high-privileged role to set up everything else (create roles, the Snowflake image repository, etc.).
  • Snowflake Image Repository — Snowflake Openflow runtimes pull container images from Snowflake’s registry, so you need a Snowflake image repository created in your account (in a Snowflake database) for Openflow. If you haven’t used the Snowflake container registry before, you must run CREATE IMAGE REPOSITORY. We’ll cover this in Step 2.
  • AWS Setup — An AWS account where you can launch CloudFormation stacks and create an EKS cluster in a new or existing VPC. If using BYOC (Bring Your Own Cloud) mode, prep an existing VPC (Virtual Private Cloud) with 2 public and 2 private subnets in different AZs. Otherwise, Snowflake can create a new VPC (Virtual Private Cloud) for you.

▶️ Phase 1 — Snowflake Account Preparation (Pre-requisites)

In this phase, we will configure our Snowflake account for Openflow use. That means setting up the Snowflake image repository, granting privileges, and accepting terms.

Step 1—Log in to Snowflake

First, sign in to the Snowflake web UI (Snowsight) with your ACCOUNTADMIN or a similar high-privilege role.

Step 2—Configure Snowflake Image Repository

Snowflake Openflow uses container images stored in a Snowflake-managed repository. Create an internal database and Snowflake image repository for it. Run these commands:

USE ROLE ACCOUNTADMIN;

CREATE OR REPLACE DATABASE OPENFLOW;
CREATE OR REPLACE SCHEMA OPENFLOW.OPENFLOW;

CREATE IMAGE REPOSITORY IF NOT EXISTS OPENFLOW.OPENFLOW.REGISTRY;
GRANT USAGE ON DATABASE OPENFLOW TO ROLE PUBLIC;
GRANT USAGE ON SCHEMA OPENFLOW.OPENFLOW TO ROLE PUBLIC;
GRANT READ ON IMAGE REPOSITORY OPENFLOW.OPENFLOW.REGISTRY TO ROLE PUBLIC;

These commands set up a Snowflake image repository in your account named OPENFLOW.OPENFLOW.REGISTRY. The system will push the required NiFi container images here.

To verify whether the Snowflake image repository exists, run the following command. You should see at least one repository URL in the output:

SHOW IMAGE REPOSITORIES;
Verifying Snowflake Image Repository - Snowflake Openflow

Step 3—Configure Account Level Privileges

Next, tell Snowflake which role can create Snowflake Openflow deployments and runtimes. Typically, this is a custom role. First, let's create a role for Snowflake Openflow administration (if you don't have one).

USE ROLE ACCOUNTADMIN;

CREATE ROLE IF NOT EXISTS OPENFLOW_ADMIN;

Then, grant account-level privileges needed for deployments/runtimes.

GRANT CREATE OPENFLOW DATA PLANE INTEGRATION ON ACCOUNT TO ROLE OPENFLOW_ADMIN;

GRANT CREATE OPENFLOW RUNTIME INTEGRATION ON ACCOUNT TO ROLE OPENFLOW_ADMIN;

The first privilege lets that role create deployment integration objects, and the second lets it create runtime objects. (You can name the role whatever makes sense in your org).

Configuring account-level privileges - Snowflake Role - Snowflake Openflow
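
One step the grants above don't cover: the user who will log in to Openflow needs this role granted to them. Assuming you are using the OPENFLOW_ADMIN role created above (your username is a placeholder here), grant it like so:

GRANT ROLE OPENFLOW_ADMIN TO USER <your_user>;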

Step 4—Set Default Secondary Roles (IMPORTANT)

Snowflake Openflow requires your user’s default secondary roles to be set to ALL. You can do this by altering your user:

USE ROLE ACCOUNTADMIN;

ALTER USER <your_user> SET DEFAULT_SECONDARY_ROLES = ('ALL');

Step 5—Accept Snowflake Openflow Terms of Service

In Snowsight, go to the Data tab, then Openflow. You should see a prompt to accept the Snowflake Openflow terms of service. Click Accept. This only needs to be done once per account.

Accepting Snowflake Openflow Terms of Service - Snowflake Openflow

Step 6—Launch Snowflake Openflow

Still in Snowsight, navigate to Data > Openflow and click Launch Openflow.

Launching Snowflake Openflow - Data Integration Tools - Snowflake Openflow

This opens the Snowflake Openflow control plane UI (the management interface) in a new tab. Finally, log in using your Snowflake credentials.

To access Snowflake Openflow, your user’s default role must not be ACCOUNTADMIN, ORGADMIN, GLOBALORGADMIN, or SECURITYADMIN. For this process, use the custom role created in Step 3 (see above).

Accessing Snowflake Openflow - Snowflake Role - Snowflake Openflow

After these steps, your Snowflake account is ready. You have a Snowflake image repository, an admin role with the right privileges, and you’ve opened the Snowflake Openflow UI. Next, we’ll create an actual Snowflake Openflow deployment in your cloud.

▶️ Phase 2 — Configuring Snowflake Openflow Deployment

In this phase, you will set up the Snowflake Openflow deployment in your cloud (AWS). A deployment defines the necessary AWS resources (VPC, network configurations) where the Openflow data plane will operate. We will use the Openflow UI to generate an AWS CloudFormation template, which you will then launch in your AWS account.

Step 1—Open the Snowflake Openflow Control Plane

If you closed the browser tab, return to Snowsight (Data > Openflow), click Launch Openflow again, and log in using your Snowflake credentials.

Step 2—Start a Deployment

In the Snowflake Openflow control plane, go to the Deployments tab and click Create a deployment to start the setup process and review the prerequisites.

Creating Snowflake Openflow deployment - Snowflake Openflow

Since these requirements were already addressed above, you can proceed by clicking Next.

Creating Snowflake Openflow deployment - Snowflake Openflow

Step 3—Name the Deployment

In the wizard that appears, on the "Deployment location" step, select Amazon Web Services as the cloud provider. Enter a name for your deployment.

Naming Snowflake Openflow deployment - AWS CloudFormation - Snowflake Openflow

Step 4—Choose VPC Mode (AWS BYOC)

On the next step ("Configuration"), you decide whether Snowflake will create a new VPC or use your existing one. With Bring Your Own VPC (BYOC), you provide your own VPC ID and subnets, which gives you maximum control over networking and compliance. With Managed VPC, Snowflake creates and manages the VPC (Virtual Private Cloud) for you, including OS patching and upgrades.

For simplicity, we will select the "Managed VPC" option in this guide.

Step 5—Assign Ownership Role

On the configuration page, navigate to the "Deployment access" section and select the custom role created earlier as the owner role. This role will own and have full control over the deployment objects. (By default, the creating user has ownership).

Configuring Snowflake Openflow deployment - AWS CloudFormation - Snowflake Openflow

Step 6—Create Deployment & Download Template

Continue through any remaining steps (you can skip PrivateLink/custom ingress unless needed). Finally, click Create Deployment. Snowflake will generate an AWS CloudFormation template for you. A dialog box will prompt you to download the template ZIP file. Click Download to save it.

Creating Snowflake Openflow deployment - AWS CloudFormation - Snowflake Openflow

Step 7—Upload Template to AWS CloudFormation

Now, log in to your AWS account (where you want the data plane), open the AWS CloudFormation console, and create a new stack. Upload the template file you just downloaded. On the parameters screen, provide a stack name and the required values (such as your existing VPC ID and private subnet IDs if you chose BYOC). Leave the other values at the Snowflake-provided defaults.

Upload deployment template to AWS CloudFormation - Snowflake Openflow Deployment

Step 8—Configure and Launch the Stack

Review the configuration and launch the AWS CloudFormation stack. This will create the necessary AWS resources (EC2 instances, networking, etc) for the Snowflake Openflow data plane agent. The agent VM will then pull down and install the rest of the infrastructure (NiFi nodes, EKS pods, etc) automatically.

Step 9—Wait for Deployment Creation

The AWS CloudFormation stack takes some time (often ~45–60 minutes) to complete all the work. In the Snowflake Openflow UI (back under Deployments), you can click Refresh or simply watch the stack. Once the AWS side is done, your deployment’s status in the UI should show as Active. At that point, the deployment is live.

Step 10—Confirm Deployment Status

In the Snowflake Openflow control plane, under Deployments, ensure your new deployment is listed and has a green "Active" state. You can click on it to see details. The deployment is now ready to host data plane runtimes.

Confirming Snowflake Openflow deployment - Snowflake Openflow Deployment

At this point, we have a running Snowflake Openflow deployment in AWS. Now we need to define a runtime: essentially the NiFi cluster where the flows will run.

▶️ Phase 3 — Configuring Snowflake Openflow Runtime

In this phase, you will create a Snowflake Openflow runtime, which is a cluster of NiFi nodes associated with your deployment. Once the runtime is created, you will be able to access the NiFi canvas to build your data flows.

Step 1—Create a Runtime in the Control Plane

In the Snowflake Openflow UI, switch to the Runtimes tab and click Create a runtime. This opens a dialog to define a new runtime cluster.

Step 2—Name the Runtime and Choose Deployment

Give the runtime a name, and pick the deployment you just created from the "openflow" drop-down.

Configuring Snowflake Openflow runtime - Snowflake Openflow

Step 3—Choose Node Type and Cluster Size

Select an instance type for the NiFi nodes (this determines CPU/RAM for each node). Then set the minimum and maximum number of nodes. For example, start with 1 min / 1 max. The runtime will start with the minimum nodes and auto-scale up to the max if load requires.

Choosing Node Type and Cluster Size - Snowflake Openflow

Step 4—Wait for Runtime Provisioning

Finally, select the ownership role and usage role, then click "Create". Snowflake (specifically, the Data Plane Agent) will then provision the NiFi cluster based on your specifications. This process may take a few minutes.

Creating Snowflake Openflow runtime - Snowflake Openflow

Step 5—Open the Snowflake Openflow Canvas

Once the runtime creation is finished, you’ll see it in the list under Runtimes. Click on the runtime name, then click Open Canvas. This will launch the NiFi flow editor in a new tab. You may need to log in again (use the Snowflake role/user credentials you set up). Now you should see the NiFi UI with a blank canvas, ready for building flows.

Snowflake Openflow Canvas Structure - Snowflake Openflow

Done – your Snowflake Openflow environment is set up. You have the control plane (Snowflake UI) and a running data plane (NiFi cluster). Next, we’ll illustrate with an example flow: ingesting files from Google Drive into Snowflake.

🔮 Example—Step-by-Step Guide to Ingesting Files from Google Drive to Snowflake via Openflow

Let’s walk through a practical example that demonstrates how to use the Google Drive Snowflake Openflow connector to load documents from a Google Workspace Drive into Snowflake.

Prerequisites:

  • Snowflake Prep — Complete all setup above (phases 1 – 3). Make sure the Snowflake Openflow control plane and a runtime are active. You also need a Snowflake user to manage this flow.
  • Google Workspace Admin — You need Google Workspace Admin privileges to create a service account and grant domain-wide delegation. Have access to the Google Admin console and Google Cloud Console.
  • Snowflake Objects — Create a database and schema where Google Drive data will land.
  • Snowflake Secrets — It’s highly recommended to store keys (Snowflake private key, Google key JSON) in a secrets manager (like AWS Secrets Manager). Configure Snowflake Openflow parameter providers for it, as described in the docs. This way, you only reference parameters, not raw keys, in your flow.

What we aim to achieve: Any file placed in a designated Google Drive Shared Drive should be automatically loaded into Snowflake (Bronze tables). We also want to capture relevant metadata and enable content searchability if desired.

Step 1—Log in to Snowflake

Open Snowsight and sign in with the role you will use for this process.

Step 2—Open the Snowflake Openflow Control Plane

Now, go to Data > Openflow and launch Snowflake Openflow. Confirm you see your deployment and runtime available. (If not, troubleshoot the previous steps).

Step 3—Google Cloud Setup

Make sure that your Google Cloud environment is fully configured. Set up the required credentials and permissions to enable secure access from Snowflake to your Google Drive data.

First, create a new project in the Google Cloud Console. This project serves as the foundation for managing APIs, billing, and permissions across all Google Cloud services.

Creating new Google Cloud Project - Google Drive Integration - Snowflake Openflow

Next, within your Google Cloud project, navigate to the API Library.

Navigating to Google Cloud API library catalog - Snowflake Openflow

Enable the following APIs. These are required to access and index content from Google Drive.

Google Drive API

Enabling Google Drive API - Google Drive Integration - Snowflake Openflow

Google Cloud Search API

Enabling Google Cloud Search API - Google Drive Integration - Snowflake Openflow

Step 4—Create a Google Service Account and Key

You need a Google user with Super Admin rights in your Workspace organization and a Google Cloud Project. In the Google Admin console, verify you have the following roles:

  • Organization Policy Administrator 
  • Organization Administrator

These allow enabling service account keys and delegation.

Google Cloud disables service account key creation by default via the "Disable service account key creation" organization policy. To allow Snowflake Openflow to use the service account JSON key, you need to disable enforcement of this policy.

Log in to the Google Cloud Console using a super admin account that has the Organization Policy Administrator role. Make sure you are viewing at the organization level, rather than within a specific project in your organization. Then, navigate to the Organization Policies section, select the "Disable service account key creation" policy, click "Manage Policy" and disable enforcement. Finally click "Set Policy" to apply the changes.

Creating Google Service Account and Key - Google Drive Integration - Snowflake Openflow

To enable Snowflake Openflow to authenticate with Google APIs, create a Google Cloud service account and download its JSON key. In the Cloud Console, navigate to IAM & Admin > Service Accounts.

Creating Google Service Account and Key - Google Drive Integration - Snowflake Openflow

Click "Create Service Account", enter a name, and click "Create and Continue". 

Creating Google Service Account and Key - Google Drive Integration - Snowflake Openflow

Then click Done to finish the setup. In the list of service accounts, locate your newly created account and copy its OAuth2 Client ID; you will need it to enable domain-wide delegation. Open the account’s Actions menu, select Manage keys > Add key > Create new key, choose the JSON key type (the default), and click Create. A private key file will be downloaded; store this JSON key securely, as Snowflake Openflow requires it for authentication.

Creating Google Service Account and Key - Google Drive Integration - Snowflake Openflow
Creating Google Service Account and Key - Google Drive Integration - Snowflake Openflow

Step 5—Grant Domain-Wide Delegation

In the Google Admin console (admin.google.com, Security > API controls > Domain-wide delegation), add a new client.

Managing Domain-Wide delegation - Google Drive Integration - Snowflake Openflow

On the API Controls screen, click Manage domain-wide delegation.

Managing Domain-Wide Delegation - Google Drive Integration - Snowflake Openflow

Click Add New, enter the OAuth2 Client ID you copied when creating the service account and key, and add the following scopes (to allow reading Drive and directory info):

https://www.googleapis.com/auth/drive.metadata.readonly
https://www.googleapis.com/auth/admin.directory.group.member.readonly
https://www.googleapis.com/auth/admin.directory.group.readonly
https://www.googleapis.com/auth/drive.file
https://www.googleapis.com/auth/drive.metadata
Adding new OAuth2 Client ID and scopes - Google Drive Integration - Snowflake Openflow

Authorize these scopes. This lets the service account act on behalf of users in your domain.

Step 6—Create a Google Drive Folder

In Google Drive, open the left menu and click "Shared drives", then click "Create a shared drive".

Creating Google Drive Folder - Google Drive Integration - Snowflake Openflow

Give the drive a name such as "Snowflake Openflow Data Repo", and click Create. This shared drive will serve as the source folder from which Openflow ingests documents into Snowflake.

Creating Google Drive Folder - Google Drive Integration - Snowflake Openflow

Step 7—Upload Files to Google Drive Folder

Upload or move the relevant documents into the new "Snowflake Openflow Data Repo" folder. Snowflake Openflow will read those files to ingest their contents into Snowflake. Keep files formatted for indexing and follow your organization’s data governance and access-control rules; do not upload sensitive data without proper classification and protection.

Step 8—Prepare Snowflake Objects

Back in Snowflake, create a minimal, securely configured environment for Openflow: a service user, a dedicated role, a small warehouse, and the database/schema that the connector will use. First, create the service user. For example, as USERADMIN, run:

USE ROLE USERADMIN;
CREATE USER openflow_service
  TYPE=SERVICE
  COMMENT='Service account used by Snowflake Openflow connector';
Creating a dedicated Snowflake service user - Snowflake Openflow

Assign a public RSA key to the service user for key-pair authentication rather than a password; Snowflake strongly recommends key-pair authentication for service users. Generate an RSA key pair outside Snowflake (using OpenSSL) and assign the public key to the service user. Store the private key securely. Do not place it in source control or in a connector config file in plain text; always use a secrets manager.

Generate Key Pair (Outside Snowflake):

# Snowflake expects the private key in PKCS#8 format
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out openflow_private_key.p8 -nocrypt
# Extract the matching public key to assign to the Snowflake user
openssl rsa -in openflow_private_key.p8 -pubout -out openflow_public_key.pem

Assign Public Key to User:

Extract the public key content (excluding the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- lines) and assign it to the user.

USE ROLE USERADMIN;
ALTER USER openflow_service SET RSA_PUBLIC_KEY = '<public_key_content>';
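
To confirm the key was registered, describe the user; the RSA_PUBLIC_KEY_FP property should now show the key's fingerprint:

USE ROLE USERADMIN;
DESC USER openflow_service;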

Store Private Key:

  • Preferred Method: Configure a secrets manager in the Snowflake Openflow UI under Controller Settings > Parameter Provider. Specify the path to the private key in the secrets manager (AWS Secrets Manager path: openflow/private_key).
  • Alternative (Less Secure): Upload the private key file directly in the Snowflake Openflow connector settings. Avoid hard-coding the key in configuration files.

Next, create a dedicated role for the Snowflake Openflow connector and assign it to the service user with appropriate privileges.

USE ROLE SECURITYADMIN;
CREATE ROLE IF NOT EXISTS openflow_service_role
  COMMENT = 'Role for Snowflake Openflow connector to manage data ingestion';

-- Grant the role to the service user
GRANT ROLE openflow_service_role TO USER openflow_service;

-- Set the role as the default for the user
ALTER USER openflow_service SET DEFAULT_ROLE = openflow_service_role;
Creating a dedicated Snowflake role for Snowflake Openflow connector

Then, designate a Snowflake warehouse for the Snowflake Openflow connector to execute queries. Start with a small warehouse size (XSMALL) and scale up based on workload requirements.

CREATE WAREHOUSE IF NOT EXISTS openflow_warehouse
  WAREHOUSE_SIZE = XSMALL
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE
  COMMENT = 'Warehouse for Snowflake Openflow connector';
Creating Snowflake Warehouse - Snowflake Openflow

Grant the necessary privileges on the warehouse to the openflow_service_role:

USE ROLE SECURITYADMIN;
GRANT USAGE, OPERATE ON WAREHOUSE openflow_warehouse TO ROLE openflow_service_role;
Granting privileges to the role - Snowflake Openflow

Subsequently, create a Database and Schema to store the ingested data:

USE ROLE SYSADMIN;
CREATE DATABASE IF NOT EXISTS OPENFLOW_DEMO_DB
  COMMENT = 'Database for Snowflake Openflow connector data';

CREATE SCHEMA IF NOT EXISTS OPENFLOW_DEMO_DB.OPENFLOW_DEMO_SCHEMA
  COMMENT = 'Schema for Snowflake Openflow connector data';
Creating Snowflake Database and Schema - Snowflake Openflow

Finally, assign the essential permissions to the openflow_service_role to enable the Snowflake Openflow connector to perform data ingestion and related operations:

USE ROLE SECURITYADMIN;

-- Grant database-level permissions
GRANT USAGE ON DATABASE OPENFLOW_DEMO_DB TO ROLE openflow_service_role;

-- Grant schema-level permissions (CREATE DYNAMIC TABLE and CREATE CORTEX SEARCH SERVICE
-- are only needed if your flow uses those features; see the note below)
GRANT USAGE, CREATE TABLE, CREATE DYNAMIC TABLE, CREATE STAGE, CREATE SEQUENCE, CREATE CORTEX SEARCH SERVICE
  ON SCHEMA OPENFLOW_DEMO_DB.OPENFLOW_DEMO_SCHEMA
  TO ROLE openflow_service_role;
Assigning permissions to the openflow_service_role role - Snowflake Openflow
Note: For basic file ingestion, CREATE TABLE and CREATE STAGE are sufficient. CREATE DYNAMIC TABLE and CREATE CORTEX SEARCH SERVICE are for more advanced features and are typically granted only if those specific functionalities are leveraged by your flow.

Step 9—Install the Google Drive Connector

In the Snowflake Openflow UI, go to the Overview page. In the Featured Connectors section, click View more connectors, find Google Drive, and click Add to runtime. Choose your runtime and click Add.

Installing Google Drive Connector - Snowflake Openflow
Note: Before installation, make sure the target Snowflake database and schema (from step 8) exist. You will then be prompted to authenticate: first, authenticate with your Snowflake account to install the connector, then authenticate to the runtime. After a few moments, a new Google Drive process group will appear on the canvas.

Step 10—Configure Connector Parameters

On the Openflow canvas, right-click the newly added Google Drive process group and select Parameters.

Snowflake Openflow canvas - Snowflake Openflow

To configure the connector, fill in the three main sections: Google Drive source, Snowflake destination, and ingestion specifics.

Configuring connector parameters - Snowflake Openflow

Configure all the parameter fields:

  • Google Delegation User — The Google admin user email to impersonate (often your Super Admin).
  • GCP Service Account JSON — Upload the JSON key file you downloaded in Step 4.
  • Destination Database/Schema — The Snowflake database and schema where files will land (OPENFLOW_DEMO_DB and OPENFLOW_DEMO_SCHEMA in this example).
  • Snowflake Account Identifier — Your account identifier (e.g., myorg-myaccount).
  • Authentication Strategy — Use KEY_PAIR (since we use the service user).
  • Snowflake Private Key — Supply the private key (or key file) of the Snowflake service user. If using a parameter provider, reference the key parameter here.
  • Snowflake Role, Username, Warehouse — The role (openflow_service_role), the username (openflow_service), and the warehouse (openflow_warehouse) to load data.
  • Google Drive ID — The ID of the Shared Drive or folder to ingest.
  • File Extensions To Ingest — Comma-separated extensions (e.g., pdf,txt,png); the connector will filter on these.
  • Snowflake File Hash Table Name — An internal table (in your schema) to track file hashes (e.g., <database>.<schema>.FILE_HASHES) so unchanged files are skipped.

Enter values accordingly.

Step 11—Start the Connector

After filling parameters, click Enable All Controller Services, then right-click the process group and choose Start.

Enabling all controller services and starting the connector - Snowflake Openflow

The connector will begin running, periodically polling Google Drive for files, and will load new or changed files into Snowflake tables in your designated schema.

Step 12—Verify Ingestion

Check Snowflake (OPENFLOW_DEMO_DB.OPENFLOW_DEMO_SCHEMA) for the output tables. You should see the Google Drive files’ metadata and content successfully loaded.

Verifying ingestion - Snowflake Openflow

You can see a table of file listings and a table of file contents (depending on the connector design). Also, make sure to verify permissions: check that the Snowflake service user role has the right grants on the target tables.
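
If you prefer to verify from a worksheet, something like the following works; the exact table names depend on the connector's design, so the SELECT below uses a placeholder:

-- List what the connector has created in the target schema
SHOW TABLES IN SCHEMA OPENFLOW_DEMO_DB.OPENFLOW_DEMO_SCHEMA;

-- <connector_table> is a placeholder; substitute one of the tables returned above
SELECT *
FROM OPENFLOW_DEMO_DB.OPENFLOW_DEMO_SCHEMA.<connector_table>
LIMIT 10;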

And that completes the example. If you have followed along, you have now configured everything from Google Drive to Snowflake via Openflow, all within the Snowflake UI and using standardized flows. You can follow this same pattern for other connectors (such as Box, Slack, Kafka, JDBC, and more): provision credentials in Snowflake, configure secrets, add the connector, and start the flow.


Conclusion

And that's a wrap! Data teams have long faced significant challenges in efficiently moving data between platforms. Snowflake Openflow is specifically designed to address these issues. It is a fully managed integration service capable of connecting virtually any data source to any destination, with hundreds of processors supporting both structured and unstructured data. It is built on Apache NiFi and provides a simple drag-and-drop interface for building data pipelines, as well as ready-made connectors built for speed and reliability. It can also be deployed as a managed service in your own cloud. With auto-scaling and monitoring, you can finally stop juggling multiple ETL tools and writing complex scripts. Instead, you get fast performance, audit trails, and reliable data flowing directly into Snowflake.

In this article, we’ve covered:

  • What is Snowflake Openflow?
  • How Snowflake Openflow Works (Architecture)
  • Snowflake Openflow vs. Snowflake Snowpipe vs. Snowflake COPY
  • A Step-by-Step Setup Guide to Activate and Set Up Snowflake Openflow from Scratch
  • An Example: Step-by-Step Guide to Ingesting Files from Google Drive to Snowflake via Openflow

... and so much more!

FAQs

What is Snowflake Openflow?

Snowflake Openflow is a Snowflake-managed data integration service (GA as of 2025 on AWS) that lets you connect any source or destination via hundreds of NiFi-style processors. It runs in your cloud and is managed through Snowflake’s UI.

Which cloud regions and accounts can use Snowflake Openflow?

Currently, Snowflake Openflow is available to all Snowflake accounts in AWS Commercial regions. It uses AWS BYOC (Bring Your Own Cloud), so your account must be on AWS. (Other clouds are planned but AWS is the only one GA now).

How does Snowflake Openflow differ from Snowpipe and COPY?

Snowflake Snowpipe is for continuous loading of files from cloud storage. COPY is manual batch loading. Snowflake Openflow is broader: it supports any source (including databases, message queues, SaaS APIs, and files) in both batch and streaming modes. It adds inline transformations and schema drift handling, and provides a UI for monitoring (features Snowpipe/COPY lack). In short, Snowflake Snowpipe is file-focused, while Snowflake Openflow is a full ingestion platform built on NiFi.

Which clouds does Snowflake Openflow run on?

Today, Snowflake Openflow runs on AWS (commercial regions) via EKS in your VPC. It doesn’t run on Azure, Azure Gov, or GCP yet.

What are "deployments" and "runtimes" in Snowflake Openflow?

A deployment is like an environment or project area in your AWS account. It encapsulates the VPC/EKS setup. Each deployment can have multiple runtimes, which are NiFi clusters (node groups) inside that deployment. You might use one deployment per team or per business area. Multiple runtimes let you isolate workloads (dev vs prod) or scale independently. All deployments and runtimes are created and managed via the Snowflake control plane.

Where does the Snowflake Openflow runtime run and who manages it?

The runtime (data plane) runs in your AWS account (in your VPC/subnets) or in a Snowflake-managed environment if you choose. If BYOC, you manage the AWS infrastructure (subnets, instances) and bear the AWS costs. The NiFi software itself is fetched from Snowflake’s image registry and orchestrated via the control plane. The control plane (management UI) is run and managed by Snowflake.

How does Snowflake Openflow authenticate to Snowflake?

Snowflake Openflow uses Snowflake credentials to write data into your account. Typically, you configure a Snowflake SERVICE user with key-pair authentication. In the connector parameters you choose the KEY_PAIR strategy and supply the RSA private key (stored securely) for that service user. Snowflake Openflow will then log in as that user to execute SQL (COPY into tables). You can also use session token auth, but key-pair authentication with a SERVICE user is common for automated pipelines.

Do I need special Snowflake privileges to create deployments and runtimes?

Yes. You must have the system privilege CREATE OPENFLOW DATA PLANE INTEGRATION to make deployments, and CREATE OPENFLOW RUNTIME INTEGRATION to make runtimes. Within a deployment, the role with OWNERSHIP has full control (including deleting it). In practice, an ACCOUNTADMIN grants those privileges to an Openflow admin role and then that role is used for all Openflow work.

How do I monitor Snowflake Openflow pipelines and logs?

In the Snowflake Openflow UI, you have a "Monitor" section (under the control plane) that shows pipeline status, recent flows, and any errors. NiFi also keeps provenance and logs for each flow step. You can also query Snowflake’s event or telemetry tables (if you set them up) to see flow execution logs. For low-level logs, you may SSH into the data plane nodes or check AWS CloudWatch if configured.
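
If you have routed Openflow telemetry to an event table, a simple query surfaces recent log records. This sketch assumes the account's default event table, SNOWFLAKE.TELEMETRY.EVENTS; your account may be configured with a different one:

SELECT timestamp, record_type, value
FROM SNOWFLAKE.TELEMETRY.EVENTS
WHERE timestamp > DATEADD('hour', -24, CURRENT_TIMESTAMP())
ORDER BY timestamp DESC
LIMIT 50;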

What data types and modes can Snowflake Openflow handle?

Snowflake Openflow (via NiFi) can ingest virtually anything. It natively supports structured and semi-structured data (CSV, JSON, Parquet, Avro, XML, etc.) and can also fetch and store unstructured blobs like text documents, images, audio, and video.

Can I customize connectors or add custom NiFi processors?

Yes. Snowflake Openflow is based on Apache NiFi, which is open source. You can upload your own NiFi templates or use custom processors (if allowed by Snowflake). Snowflake provides many out-of-the-box connectors, but nothing stops advanced users from building their own NiFi flows or using extra processors (though custom code should be tested carefully).

What Snowflake objects does Snowflake Openflow create?

When you create a deployment, Snowflake Openflow creates an "Openflow deployment" integration object in your Snowflake account (in the SNOWFLAKE database). It may also create an event table in the OPENFLOW database/schema if you choose that option, or use the account’s event table. Runtimes show up as "Openflow runtime" objects. Other than that, the flows create whatever tables you specify for data output. No unexpected tables are created without your flow’s design.
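
To inspect what has been registered in your account, you can list integration objects from a worksheet. This is just a quick sketch; how Openflow deployment and runtime integrations are labeled in the output may vary by version:

-- Lists integration objects in the account; Openflow deployment and runtime
-- integrations should appear here alongside any other integrations you have
SHOW INTEGRATIONS;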

How to secure and govern Snowflake Openflow pipelines for compliance?

Snowflake Openflow integrates with Snowflake’s security model. You control which roles own deployments and runtimes. Data in transit is encrypted (TLS for connectors). Secrets (keys) should be kept in a proper secret store. Since it runs in your VPC, you can apply network controls (security groups, private link). Snowflake logs all API activity for auditing. From a governance perspective, Snowflake Openflow can utilize Snowflake RBAC, object tagging, and so on (since flows write to Snowflake tables under a service role you control).

When should I not use Snowflake Openflow?

If your needs are very simple, such as occasional manual loads of CSV files from a known location, then basic Snowflake Snowpipe or the COPY command might suffice without the extra overhead. If you prefer to use a different integration tool your team already knows well, you might stick with that. Also, Snowflake Openflow is AWS-only in 2025, so if you need GCP/Azure pipelines outside Snowflake’s offerings, you’d use other services. In general, if your pipelines are very lightweight and entirely file-based, Snowflake Snowpipe + external automation could be simpler. But for anything beyond trivial, Snowflake Openflow is the top choice.