Why Snowflake
Snowflake is a strong landing zone for TrackLayer events because it separates storage from compute, scales cleanly across analysts and automation jobs, and handles semi-structured ecommerce payloads without forcing every property into a rigid table on day one. TrackLayer can send stable top-level fields for event_time, merchant_id, event_name, order_id, value, currency, platform, and user_id while preserving the full original payload for audits, enrichment, and future model changes.
This matters when marketing, analytics, finance, and data science teams all need the same event stream for different jobs. Analysts can query purchases and funnel steps immediately, engineers can replay or inspect raw payloads, and data teams can build dbt marts without asking frontend teams to rewrite tracking code every time the reporting model changes. Snowflake becomes the durable system of record while TrackLayer remains the event normalization and routing layer.
Prerequisites
A Snowflake account with an active virtual warehouse sized for the expected event volume and sync cadence.
A database created for analytics data or permission to create a dedicated TrackLayer database.
A service user with USAGE on the database, schema, and warehouse plus CREATE TABLE in the target schema.
A Snowflake network policy that allows TrackLayer export IPs for the service user.
A TrackLayer Growth+ tier workspace with permission to add warehouse destinations.
Step-by-step setup
Create the database and schema
Start with a dedicated namespace. Keeping TrackLayer raw events separate from modeled marts makes access control, retention, backfills, and dbt source definitions easier to reason about.
CREATE DATABASE IF NOT EXISTS tracklayer;
USE DATABASE tracklayer;
CREATE SCHEMA IF NOT EXISTS events;Create the writer role and grants
Grant only the permissions TrackLayer needs to load data. The writer role should be able to use the warehouse and namespace, create the destination table, and insert rows after the table exists.
CREATE ROLE IF NOT EXISTS tracklayer_writer;
GRANT USAGE ON DATABASE tracklayer TO ROLE tracklayer_writer;
GRANT USAGE ON SCHEMA tracklayer.events TO ROLE tracklayer_writer;
GRANT CREATE TABLE ON SCHEMA tracklayer.events TO ROLE tracklayer_writer;
GRANT USAGE ON WAREHOUSE analytics_wh TO ROLE tracklayer_writer;
GRANT INSERT, SELECT ON FUTURE TABLES IN SCHEMA tracklayer.events
TO ROLE tracklayer_writer;Create the service user
Use a non-human service user with the dedicated role as its default role. Store the generated password in your secret manager and rotate it on the same schedule as other warehouse credentials.
CREATE USER IF NOT EXISTS tracklayer_svc
PASSWORD = 'replace_with_a_long_generated_secret'
DEFAULT_ROLE = tracklayer_writer
DEFAULT_WAREHOUSE = analytics_wh
MUST_CHANGE_PASSWORD = FALSE;
GRANT ROLE tracklayer_writer TO USER tracklayer_svc;Add the Snowflake destination in TrackLayer
In TrackLayer, open Destinations, choose Snowflake, and enter the account identifier, service user, password, warehouse, database, and schema. Use production credentials only after staging has completed a successful test sync.
Account: xy12345.eu-central-1
User: tracklayer_svc
Password: stored secret
Warehouse: analytics_wh
Database: tracklayer
Schema: eventsTest sync and verify rows
Run a manual test sync from TrackLayer, wait for the job to finish, then verify that recent events landed with timestamps, merchant identifiers, event names, revenue fields, and raw payloads intact.
SELECT
event_id,
event_time,
merchant_id,
event_name,
value,
currency
FROM tracklayer.events.raw_events
ORDER BY event_time DESC
LIMIT 20;Raw events table
TrackLayer can create the table during the first sync when the service role has CREATE TABLE. If your data team prefers to manage DDL directly, create the table ahead of time and let TrackLayer append into it. Snowflake uses automatic micro-partitions based on loaded data, so event_time pruning comes from natural ordering and query filters rather than a manual partition declaration.
CREATE TABLE IF NOT EXISTS TRACKLAYER.EVENTS.RAW_EVENTS (
EVENT_ID STRING NOT NULL,
EVENT_TIME TIMESTAMP_NTZ NOT NULL,
LOADED_AT TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
MERCHANT_ID STRING NOT NULL,
EVENT_NAME STRING NOT NULL,
USER_ID STRING,
ANONYMOUS_ID STRING,
ORDER_ID STRING,
PLATFORM STRING,
SOURCE STRING,
VALUE NUMBER(18, 4),
CURRENCY STRING,
CONSENT_STATUS STRING,
PROPERTIES VARIANT,
RAW_PAYLOAD VARIANT
)
CLUSTER BY (MERCHANT_ID, EVENT_NAME);Sample queries
Revenue by platform
SELECT
platform,
currency,
DATE_TRUNC('day', event_time) AS event_day,
SUM(value) AS revenue,
COUNT(DISTINCT order_id) AS orders
FROM tracklayer.events.raw_events
WHERE event_name = 'purchase'
GROUP BY 1, 2, 3
ORDER BY event_day DESC, revenue DESC;Funnel rates
WITH user_steps AS (
SELECT
merchant_id,
user_id,
MAX(IFF(event_name = 'view_content', 1, 0)) AS viewed,
MAX(IFF(event_name = 'add_to_cart', 1, 0)) AS added,
MAX(IFF(event_name = 'begin_checkout', 1, 0)) AS checked_out,
MAX(IFF(event_name = 'purchase', 1, 0)) AS purchased
FROM tracklayer.events.raw_events
WHERE event_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
)
SELECT
merchant_id,
AVG(added) AS add_to_cart_rate,
AVG(checked_out) AS checkout_rate,
AVG(purchased) AS purchase_rate
FROM user_steps
WHERE viewed = 1
GROUP BY 1;Cohort LTV
WITH first_purchase AS (
SELECT
merchant_id,
user_id,
DATE_TRUNC('month', MIN(event_time)) AS cohort_month
FROM tracklayer.events.raw_events
WHERE event_name = 'purchase'
GROUP BY 1, 2
)
SELECT
f.merchant_id,
f.cohort_month,
DATEDIFF('month', f.cohort_month, DATE_TRUNC('month', e.event_time)) AS month_number,
SUM(e.value) AS cohort_revenue,
COUNT(DISTINCT f.user_id) AS customers,
SUM(e.value) / NULLIF(COUNT(DISTINCT f.user_id), 0) AS ltv
FROM first_purchase f
JOIN tracklayer.events.raw_events e
ON e.merchant_id = f.merchant_id
AND e.user_id = f.user_id
WHERE e.event_name = 'purchase'
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;Cost optimization
- Create materialized views for common aggregates such as daily revenue, campaign performance, and funnel steps when the same dashboard queries run many times per day.
- Suspend the warehouse between syncs and use auto-resume for scheduled loads. TrackLayer exports do not require a warehouse to stay warm all day for most stores.
- Consider Snowpark for ML workloads that need session features, propensity scoring, churn prediction, or product recommendation features derived from raw event history.
Integration with dbt
Treat TrackLayer's Snowflake table as a raw source in dbt. The first layer should be staging models that cast timestamps, unpack common VARIANT properties, normalize event names, and deduplicate replayed rows by event_id. Keep staging narrow and predictable so downstream models do not need to understand every raw payload shape TrackLayer preserves.
From staging, build marts around business questions: orders, customers, attribution touches, lifecycle cohorts, product performance, and campaign profitability. That staging → marts pattern lets TrackLayer keep a faithful raw stream while dbt owns semantic decisions such as net revenue, channel grouping, repeat purchase windows, and LTV definitions.
sources:
- name: tracklayer
database: tracklayer
schema: events
tables:
- name: raw_events
models:
staging:
stg_tracklayer_events → cleaned events
marts:
fct_orders → one row per order
fct_funnel_steps → event funnel metrics
dim_customers → customer identity and cohortsTroubleshooting
Incorrect username or password
Confirm the TrackLayer destination uses the service user, not your personal Snowflake login. Rotate the password and update the destination if there is any doubt.
SQL access control error
Check that tracklayer_writer has USAGE on the warehouse, database, and schema plus CREATE TABLE and INSERT privileges in the destination schema.
Network policy blocks login
Add TrackLayer export IPs to the Snowflake network policy attached to tracklayer_svc, then retry the connection test from TrackLayer.
Warehouse is suspended too long
Enable auto-resume on the export warehouse and confirm the service role has USAGE on that warehouse. TrackLayer can trigger the resume when the sync starts.
Rows land but queries feel slow
Filter by event_time, merchant_id, and event_name. Add clustering for high-volume tables and move repeated transformations into dbt models or materialized views.
Common questions
Does TrackLayer overwrite Snowflake rows?
No. The export is designed as an append stream. Use event_id and loaded_at to deduplicate in staging models if you intentionally replay historical events.
Can I use key-pair authentication?
Use the authentication method shown in your TrackLayer destination settings. If key-pair authentication is enabled for your workspace, prefer it over password authentication for production.
Which warehouse size should I start with?
Start small and measure. Most ecommerce event exports can begin on an X-Small or Small warehouse, then scale only if sync windows or downstream transformations need more throughput.
Should raw events be flattened before Snowflake?
Keep raw events flexible. Store stable top-level fields for common filters and metrics, then preserve the original payload in VARIANT for later enrichment and backfills.
How often should exports run?
Use the shortest cadence that your reporting workflows actually need. Operational dashboards may need frequent syncs, while finance and lifecycle marts often work well hourly.
Related implementation guides
Segment destination guide
Route selected Segment events into TrackLayer before forwarding clean canonical events to Snowflake and other destinations.
Read guide →Identity resolution guide
Design stable customer identifiers before modeling Snowflake event tables, cohorts, and lifetime value.
Read guide →Attribution models guide
Turn raw event exports into channel, campaign, and lifecycle reporting models your team can trust.
Read guide →