To sync data from Triple Whale to Google BigQuery, you must provide the necessary project details and grant specific permissions. This guide walks you through the setup process to ensure a seamless connection.
BigQuery Sync Setup
For step-by-step setup guidance, see the BigQuery Integration guide. The following is a quick reference of the setup information you will need:
- Project ID and Dataset ID: You must provide a project ID and dataset ID where data will be written.
- Service Account Permissions: Triple Whale requires you to grant BigQuery editor permission to the srv-big-query-exporter@shofifi.iam.gserviceaccount.com service account on the specific dataset you are syncing, so that it can create and write to tables in that dataset.

To grant BigQuery editor permission to the service account:

1. In your dataset, navigate to Share > Manage Permissions.
2. Click Add Principal.
3. Add srv-big-query-exporter@shofifi.iam.gserviceaccount.com as the New principal, with the role of BigQuery Data Editor.
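The console steps above correspond to a dataset-level access entry, where WRITER is the dataset ACL equivalent of the BigQuery Data Editor role. As a sketch, this is the JSON shape of that entry as it appears in a dataset's access list (useful if you manage dataset ACLs with the bq CLI rather than the console):

```python
import json

# The Triple Whale exporter service account from the setup steps above.
SERVICE_ACCOUNT = "srv-big-query-exporter@shofifi.iam.gserviceaccount.com"

# Sketch of the dataset access entry the console steps create.
# "WRITER" is the dataset-level equivalent of BigQuery Data Editor.
access_entry = {
    "role": "WRITER",
    "userByEmail": SERVICE_ACCOUNT,
}
print(json.dumps(access_entry, indent=2))
```

If you apply this via the CLI, you would merge the entry into the dataset's existing access array rather than replacing it.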
-
Connecting to Triple Whale
Once your BigQuery dataset is set up, connect it to Triple Whale:
1. Go to your Triple Whale Integrations page.
2. Click Connect on the BigQuery integration.
3. Enter the Project ID and Dataset ID created in Step 1.
4. Click the checkbox confirming that you added the Triple Whale service account (Step 2).
5. Click Save to complete the setup.
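If you want a quick sanity check before entering the IDs, Google Cloud's documented naming rules can be encoded in a small helper. The `validate_ids` function and regexes below are hypothetical illustrations, not part of Triple Whale or BigQuery:

```python
import re

# Project IDs: 6-30 chars, lowercase letters, digits, hyphens;
# must start with a letter and not end with a hyphen.
PROJECT_ID_RE = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")
# Dataset IDs: letters, digits, underscores only (up to 1024 chars).
DATASET_ID_RE = re.compile(r"^[A-Za-z0-9_]{1,1024}$")

def validate_ids(project_id: str, dataset_id: str) -> list[str]:
    """Return a list of problems found; an empty list means both IDs look valid."""
    problems = []
    if not PROJECT_ID_RE.fullmatch(project_id):
        problems.append(f"invalid project ID: {project_id!r}")
    if not DATASET_ID_RE.fullmatch(dataset_id):
        problems.append(f"invalid dataset ID: {dataset_id!r}")
    return problems

print(validate_ids("my-analytics-proj", "triple_whale_sync"))  # → []
print(validate_ids("My_Proj", "sales-data"))  # both IDs violate the rules
```

A common mistake this catches is using hyphens in a dataset ID (only underscores are allowed) or uppercase letters in a project ID.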
Creating an Agent to Schedule Your Sync
After connecting your BigQuery warehouse to Triple Whale, you can automate data exports by creating an Agent.
1. Open Agents: Go to Moby > Shop’s Agents in your Triple Whale account. Click New Agent or select an existing one to edit.
2. Set BigQuery Table as Your Destination: In the Agent builder, add a Send to Destinations step and pick Sync to Warehouse. Click Select Provider to choose your connected BigQuery warehouse, and enter the Table ID where you wish to send your data (this should be a unique Table ID that doesn't already exist).
3. Define Query: Check Use dedicated query, and provide your desired SQL query for the data export. (Note: For smaller exports, you may instead write a custom SQL query using a Get Data step in your agent, though we recommend Use dedicated query for larger exports.)
4. Configure Your Schedule: Choose how often you want the Agent to run (e.g., hourly). Once saved, Triple Whale will automatically send new rows to your BigQuery dataset on that schedule.
Finalizing Your Setup: Handling Data De-duplication (Recommended)
Now that your BigQuery dataset is connected to Triple Whale, it's important to handle duplicate records to ensure accurate data in your warehouse. Since each export appends new rows rather than updating existing ones, querying the raw data directly may result in inflated totals if duplicates aren’t managed.
Because unique row keys depend on the structure of your query, Triple Whale does not automatically de-duplicate records. You must define the right approach based on your data model.
To prevent duplication-related issues, follow the best practices outlined in the De-duplication for Data Warehouse Sync guide:
- Tracking the most recent version of each record
- Identifying unique row keys based on your data model
- Filtering out older duplicate rows in your queries
Implementing de-duplication ensures that your analysis reflects the most accurate and up-to-date data in your warehouse.
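The keep-the-latest-row pattern above can be sketched as a window-function query. The table and column names (orders_sync, order_id, _synced_at) are hypothetical, and sqlite3 stands in for BigQuery only to make the sketch runnable; the same ROW_NUMBER() query shape works in BigQuery Standard SQL:

```python
import sqlite3

# Simulate an append-only sync table: each Agent run appends rows,
# so order 1001 appears twice with different sync timestamps.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders_sync (order_id TEXT, amount REAL, _synced_at TEXT);
INSERT INTO orders_sync VALUES
  ('1001', 50.0, '2024-01-01T00:00:00'),
  ('1002', 75.0, '2024-01-01T00:00:00'),
  ('1001', 60.0, '2024-01-02T00:00:00');
""")

# Keep only the most recent row per unique key (order_id here);
# older duplicate rows are filtered out by rn = 1.
DEDUP_QUERY = """
SELECT order_id, amount FROM (
  SELECT *,
         ROW_NUMBER() OVER (
           PARTITION BY order_id        -- the unique row key
           ORDER BY _synced_at DESC     -- newest export first
         ) AS rn
  FROM orders_sync
) AS t
WHERE rn = 1
ORDER BY order_id;
"""

rows = conn.execute(DEDUP_QUERY).fetchall()
print(rows)  # → [('1001', 60.0), ('1002', 75.0)]
```

The unique key in PARTITION BY must match your data model; for line-item exports, for example, it might be a combination of order ID and SKU rather than order ID alone.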