# Google BigQuery
Google BigQuery is a serverless, highly-scalable, and cost-effective cloud data warehouse with an in-memory BI Engine and machine learning built in. The Workato connector to Google BigQuery allows you to automate various actions on datasets in your BigQuery instance such as inserting rows or performing queries on existing datasets. You are also able to trigger recipes off new rows in datasets.
# How to connect to BigQuery on Workato
The BigQuery connector supports two types of authentication: OAuth 2.0 and Service Account.
| Field | Description |
| --- | --- |
| Connection name | Give this BigQuery connection a unique name that identifies which BigQuery instance it is connected to. |
| Authentication Type | Choose OAuth 2.0 when using a user account and Service Account when using a service account to connect to BigQuery. |
| GCP Project service account email (Applicable only to Service Account Auth) | The email of your service account. |
| Private key (Applicable only to Service Account Auth) | The private key of your service account. |
| Advanced settings (Applicable only to Service Account Auth) | Adjust the scopes of your connection. |
# Service Account
You can also authenticate to BigQuery using a Google Cloud service account. A service account is a special type of Google account associated with your Google Cloud project that can be used to run API requests on your behalf. Service accounts can be used in BigQuery to ensure that the solution continues running even if individual users' permissions change. Read more about service accounts here.
To create a service account, you need to log into your Google Cloud Platform (GCP) console. Follow the guide here to create a new service account in your GCP project. Follow this guide to add a new private key and download the key in JSON format. Note that after you download the key file, you cannot download it again.
Copy the private key from -----BEGIN PRIVATE KEY----- to -----END PRIVATE KEY-----\n (both inclusive) and paste it as the Private key input on Workato.
# Setting permissions
For the service account to use all actions and triggers in the BigQuery connector, the service account should be given the BigQuery Admin IAM role or a custom role with the following 11 permissions:
- bigquery.datasets.get
- bigquery.jobs.create
- bigquery.jobs.get
- bigquery.jobs.list
- bigquery.jobs.listAll
- bigquery.tables.create
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.list
- bigquery.tables.update
- bigquery.tables.updateData
TIP
If you would like to only insert data with this connection on Workato, you may:
- Use the BigQuery Data Editor role.
- Add the bigquery.jobs.create permission.

If you would like to only select data with this connection on Workato, you may:
- Use the BigQuery Data Viewer role.
- Add the bigquery.jobs.create permission.
Read this guide to create custom roles for the service account on Google Cloud Platform. More information on each of the required permissions above can be found in the list of BigQuery permissions.
Note that when a service account with a custom role is used, the Select project dropdown in Setup will not load, so you will need to provide the Project ID manually on Workato.
# Rate limits on BigQuery
BigQuery's rate limits on tables indicate that bulk insert operations on tables can only be performed 1,000 times a day. To bypass these limits, use the Insert row and Insert rows actions, which are not affected by such limits.
# Working with the BigQuery connector
After establishing a connection with the BigQuery connector, most actions will require some additional parameter inputs.
| Field | Description |
| --- | --- |
| Project | The project available in the connection to be billed for the query. |
| Dataset | The dataset from which the action or trigger will pull the possible tables. |
| Table | The table inside the dataset. |
| Location | The geographical location where the job should be run. |
# Single row vs batch of rows
The BigQuery connector can read or write to your database either as a single row or in batches. When using batch read actions/triggers, you have to provide the batch size you wish to work with. The batch size can be any number between 1 and 50,000, with 50,000 being the maximum.
Batch trigger inputs
Besides the difference in input fields, there is also a difference between the outputs of these two types of operations. A trigger that processes rows one at a time will have an output datatree that allows you to map data from that single row.
Single row output
However, a trigger that processes rows in batches will output them as an array of rows. The Rows datapill indicates that the output is a list containing data for each row in that batch.
Batch trigger output
As a result, the output of batch triggers/actions needs to be handled differently. Where a downstream batch action accepts a Rows source list input field, you can map the Rows array pill to it to tell Workato that you want to map the entire array.
Using batch trigger output
# WHERE condition
This input field is used to filter and identify rows to perform an action on. It is used in multiple triggers and actions in the following ways:
- filter rows to be picked up in triggers
- filter rows in the Select rows action
- filter rows to be deleted in the Delete rows action
This clause will be used as a WHERE statement in each request and should follow basic SQL syntax. Refer to this BigQuery documentation for a full list of rules for writing SQL statements compatible with BigQuery.
# Simple statements
Knowing the data types of the columns in BigQuery is important for building working queries. When comparing string values, values must be enclosed in single quotes ('') and the columns used must exist in the table. When comparing integer values, the supplied value should not be enclosed in single quotes.
A simple WHERE condition to filter rows based on column values looks like this.
string_column = 'USD' and integer_column = 1111
If used in a Select rows action, this WHERE condition will return all rows that have the value 'USD' in the string_column column and the value 1111 in the integer_column column. Just remember to wrap datapills with single quotes in your inputs.
Using datapills in WHERE condition
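To make this concrete, here is a sketch of the kind of statement the Select rows action effectively runs with the condition above. The project, dataset, and table names are hypothetical; in a recipe you only supply the WHERE clause.

```sql
-- Hypothetical project, dataset, and table names for illustration only.
-- In Workato you supply just the WHERE clause; the connector builds the rest.
SELECT *
FROM `my-project.my_dataset.transactions`
WHERE string_column = 'USD' and integer_column = 1111
```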
# Complex statements
Your WHERE condition can also contain subqueries. The following query can be used on the users table.
id in (select distinct(user_id) from zendesk.tickets where priority = 2)
When used in a Delete rows action, this will delete all rows in the users table where at least one associated row in the tickets table has a value of 2 in the priority column.
Using datapills in WHERE condition with subquery
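As a sketch, assuming the users table lives in the same zendesk dataset (the qualified names below are illustrative), the resulting statement would look roughly like this:

```sql
-- Illustrative dataset/table qualification; in Workato you supply only the WHERE clause.
DELETE FROM zendesk.users
WHERE id in (select distinct(user_id) from zendesk.tickets where priority = 2)
```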
# Defining your output fields
Sometimes, actions and triggers in BigQuery allow you to define the expected output columns of a query. This input field shows up in the following triggers and actions:
- Scheduled query trigger
- Get query job output
- Select rows using custom SQL
Define your output fields easily with the output schema designer's CSV uploader. Simply run a sample query in the BigQuery console and export the results as a CSV. Upload this CSV into the schema wizard to automatically generate all the fields for you.
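For example, a quick way to produce such a CSV is to run a small sample of the query you plan to use in the BigQuery console and export the result. The table and column names below are hypothetical:

```sql
-- Hypothetical table and columns: export the result of a small sample query
-- as a CSV, then upload it to the schema wizard to generate the output fields.
SELECT id, email, created_at
FROM `my-project.my_dataset.users`
LIMIT 10
```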