16 Oct 2023
Always Fresh: Snowflake and DeltaStream Integrate
Snowflake is a popular data warehouse destination for streaming data. It allows users to process large amounts of data for a variety of purposes, such as dashboards, machine learning, and applications. However, to keep these use cases up to date, users need to keep the data fresh. DeltaStream’s Snowflake integration makes it easy to keep your data fresh in Snowflake with a simple SQL query.
How can I use Snowflake with DeltaStream?
Our new Snowflake integration provides a way to write pre-processed data into Snowflake without needing to stage the data in intermediate storage:
Using DeltaStream queries, data from one or more sources can be transformed into the shape that best fits your Snowflake tables; the query then uses your Snowflake Store to continuously write the resulting data to the target Snowflake table.
DeltaStream’s new Snowflake Store uses Snowflake’s Snowpipe Streaming API to provide the best streaming experience to your Snowflake account without any data loss. Stay tuned for more information on this in upcoming blog posts.
How DeltaStream integrates with Snowflake
Creating a Snowflake Store
In DeltaStream, a Snowflake Store can be created for connectivity with a specific Snowflake Account. Use the SNOWFLAKE store type to create a new Snowflake Store:
CREATE STORE sf WITH (
  'type' = SNOWFLAKE,
  'uris' = 'https://<account_identifier>.snowflakecomputing.com',
  'access_region' = 'AWS us-east-1',
  'snowflake.account_id' = '<account_identifier>',
  'snowflake.username' = '<account_username>',
  'snowflake.role_name' = 'ACCOUNTADMIN',
  'snowflake.client.key_file' = '/path/to/key.p8',
  'snowflake.client.key_passphrase' = '<key_passphrase_if_any>',
  'snowflake.warehouse_name' = '<warehouse_name>'
);
Just like any other Store creation in DeltaStream, the properties.file parameter can also be used to offload the extra store information to a properties file. See Store Parameters for more information.
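For instance, sensitive connection details can live outside the CREATE STORE statement. The sketch below assumes a hypothetical properties file path; the property names in the file mirror the parameters shown above:

```
-- Hypothetical properties file at /path/to/sf.properties:
--   snowflake.username=<account_username>
--   snowflake.role_name=ACCOUNTADMIN
--   snowflake.client.key_file=/path/to/key.p8
--   snowflake.warehouse_name=<warehouse_name>

CREATE STORE sf WITH (
  'type' = SNOWFLAKE,
  'uris' = 'https://<account_identifier>.snowflakecomputing.com',
  'access_region' = 'AWS us-east-1',
  'snowflake.account_id' = '<account_identifier>',
  'properties.file' = '/path/to/sf.properties'
);
```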
Once the Snowflake Store is created, you can perform CRUD operations on databases, schemas, and tables in the specified Snowflake account without leaving your DeltaStream account.
For example, you can list available Snowflake databases:
main_db.public/sf# LIST ENTITIES;
       Entity name
-------------------------
  DELTA_STREAMING
  SNOWFLAKE
  SNOWFLAKE_SAMPLE_DATA
Or operate on the Snowflake schemas and tables:
main_db.public/sf# LIST ENTITIES IN "DELTA_STREAMING";
      Entity name
----------------------
  INFORMATION_SCHEMA
  PUBLIC

main_db.public/sf# CREATE ENTITY "DELTA_STREAMING"."MY_SCHEMA";
main_db.public/sf# LIST ENTITIES IN "DELTA_STREAMING";
      Entity name
----------------------
  INFORMATION_SCHEMA
  MY_SCHEMA
  PUBLIC
main_db.public/sf# LIST ENTITIES IN "DELTA_STREAMING"."PUBLIC";
     Entity name
-------------------
  STREAM_DATA_TBL
  sf_pv_table
The Snowflake entities can also be described for additional information:
main_db.public/sf# DESCRIBE ENTITY "DELTA_STREAMING"."PUBLIC".sf_pv_table;
  Type  |                  Name                   |      Created at      |     Last DDL At      |       Last DDL By       | Retention time |      Updated At      | Comment
--------+-----------------------------------------+----------------------+----------------------+-------------------------+----------------+----------------------+----------
  Table | "DELTA_STREAMING"."PUBLIC".sf_pv_table | 2023-09-05T22:26:24Z | 2023-09-05T22:26:24Z | SNOWPIPE_STREAMING_USER | 1              | 2023-09-05T22:26:25Z |

Columns:
    Name    |       Type        | Nullable | Policy name | Primary key | Unique key | Comment
------------+-------------------+----------+-------------+-------------+------------+----------
  VIEWTIME  | NUMBER(38,0)      | ✓        |             |             |            |
  USERID    | VARCHAR(16777216) | ✓        |             |             |            |
  PAGEID    | VARCHAR(16777216) | ✓        |             |             |            |
Materializing Streaming Data with Snowflake
Once your Snowflake Store has been created, you are ready to write queries that apply any necessary computation to an upstream stream of data and continuously write the results to a corresponding Snowflake table.
Let’s take a look at a query where we count the number of transactions per type of credit card:
CREATE TABLE "CC_TYPE_USAGE" WITH (
  'snowflake.db.name' = 'DELTA_STREAMING',
  'snowflake.schema.name' = 'PUBLIC'
) AS
SELECT
  cc_type AS "CREDIT_TYPE",
  COUNT(*) AS "TOTAL_SO_FAR"
FROM transactions
GROUP BY cc_type;
In this example, we’re using a CREATE TABLE AS SELECT (CTAS) query to tell the system to create a table named CC_TYPE_USAGE in Snowflake under the "DELTA_STREAMING"."PUBLIC" namespace. Once the CTAS query is created, we can describe the new Snowflake table of the same name:
main_db.public/sf# DESCRIBE ENTITY "DELTA_STREAMING"."PUBLIC"."CC_TYPE_USAGE";
  Type  |                    Name                     |      Created at      |     Last DDL At      |  Last DDL By   | Retention time |      Updated At      | Comment
--------+---------------------------------------------+----------------------+----------------------+----------------+----------------+----------------------+----------
  Table | "DELTA_STREAMING"."PUBLIC"."CC_TYPE_USAGE" | 2023-09-06T22:23:18Z | 2023-09-06T22:23:18Z | SFACCOUNTOWNER | 1              | 2023-09-06T22:23:18Z |

Columns:
      Name      |       Type        | Nullable | Policy name | Primary key | Unique key | Comment
----------------+-------------------+----------+-------------+-------------+------------+----------
  CREDIT_TYPE   | VARCHAR(16777216) | ✓        |             |             |            |
  TOTAL_SO_FAR  | NUMBER(38,0)      | ✓        |             |             |            |
By creating the above CTAS query, we have started a process that continuously counts the number of transactions per credit card type in our product. For every updated count for a specific cc_type, the query appends a new row to the CC_TYPE_USAGE table, so the most recent row for each type holds its current total and can be used for any analysis the business requires.
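Because the table is append-only and each count only grows, the current total per credit card type can be read back in Snowflake itself. As a minimal sketch in standard Snowflake SQL (using the table and column names created above):

```
SELECT "CREDIT_TYPE", MAX("TOTAL_SO_FAR") AS "CURRENT_TOTAL"
FROM "DELTA_STREAMING"."PUBLIC"."CC_TYPE_USAGE"
GROUP BY "CREDIT_TYPE";
```

Taking the maximum per type works here precisely because the streaming query emits monotonically increasing counts.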
The result set can be previewed in the CLI:
main_db.public/sf# PRINT ENTITY "DELTA_STREAMING"."PUBLIC"."CC_TYPE_USAGE";
  CREDIT_TYPE | TOTAL_SO_FAR
--------------+---------------
  VISA        | 1
  VISA        | 2
  DISC        | 1
Modern Streaming and Analytics
At DeltaStream, we believe in providing a modern, unified stream processing solution that unlocks the full potential of your data, improving operational efficiency and data security, and ultimately giving every organization a competitive advantage when developing products. We hope that with the new Snowflake integration we open yet another door to data unity across your organization.
If you have any questions or want to try out this integration, reach out to us or request a free trial!