Prerequisites
Creating a custom role
NetSpring will use a dedicated service account to access your data. In order to use NetSpring to connect and query your Google BigQuery database, you must create a custom role with specific permissions and then assign it to your service account. You should have admin privileges in the project in order to execute the steps that are listed below.
- Navigate to console.cloud.google.com, sign in, and select your project.
- Then, navigate to IAM & admin, select Roles, and choose +Create Role.
- On the Create Role page, do the following:
- Enter a title, description, and create an ID or keep the existing one.
- Select a Role launch stage.
- Select +Add Permissions. In the Add permissions window, select the following permissions:
- bigquery.config.get
- bigquery.datasets.getIamPolicy
- bigquery.jobs.create
- bigquery.jobs.list
- bigquery.savedqueries.get
- bigquery.savedqueries.list
- bigquery.tables.get
- bigquery.tables.getData
- bigquery.tables.list
- resourcemanager.projects.get
- Click Add and select Create.
Setting up your service account
The service account is a scratch space for users. If you already have a service account for BigQuery, check if the following permissions are granted on your account:
- The service account has BigQuery Data Viewer permissions for any datasets you want to analyze with NetSpring.
- The service account has BigQuery Data Owner permissions for the netspring_rw dataset.
- The service account has permissions to run jobs in the project via the BigQuery Job User role.
If you do not have a service account for BigQuery, follow these steps to create a service account.
- Go to console.cloud.google.com, log in, and select your project.
- In the navigation menu onn the upper left, go to IAM & Admin, select Service Accounts, and then click + Create Service Account from the top menu.
- Provide a name, add an optional description, and click Create.
- On the Service account permissions page, click Select a role, choose Custom, and select the custom role you created previously. Then, click Continue.
Creating a JSON key
- On the Create service account page, click + Create Key.
- For Key type, select JSON.
- Click Create. The key will be downloaded to your computer. Once the download is complete, you will see a download confirmation message. Note the filename and click Close.
Sharing your dataset
To share your dataset, follow the steps below:
- Go to console.cloud.google.com, log in, and select your project.
- From the navigation menu in the upper left, navigate to the Big Data section and choose BigQuery. Ensure your project appears on the left side of the page under the search box.
- Click on your project to display its datasets, select the dataset you wish to share, and click Share Dataset.
- Choose Owner to view the complete list of project owners.
- In the Add members box, enter the email address of the service account with which you want to share the dataset.
- Click Select a role, choose Project Owner, and click Add. Finally, click Done.
Creating a writable schema in BigQuery
Product analytic queries are very expensive and require scanning billions of events for every query. Naive approaches like scanning event tables on the fly do not scale and result in poor query performance and very high compute costs. NetSpring constantly analyzes your query patterns to materialize common sub-computations, pre-calculate partial aggregates, optimize the layout of the event stream and other optimizations. Given our warehouse-native architecture, we never move any customer data out of the warehouse. Instead, all these artifacts are stored in an area of the warehouse dedicated for NetSpring. NetSpring needs read-write access to this area so it can create and update these artifacts as needed. We do not need write access to any other part of the warehouse. To help us optimize, follow the steps below.
- Create a new dataset dedicated for NetSpring.
- Click on the (⋮) next to the project in the left pane and select Create dataset.
- Create a new dataset named
netspring_rw
. You can choose any name you want. - For the location type, make sure to use the same settings as your other datasets to minimize data movement costs.
- Click CREATE DATASET.
- Share this dataset with NetSpring with write-access.
- Click on the (⋮) icon next to the
netspring_rw
dataset and select Share. - Click on ADD PRINCIPAL.
- Under Add principals, enter the Service Account Email.
- Under Assign roles, select BigQuery in the left pane and BigQuery Data Editor in the right pane.
- Confirm that the settings look as follows and hit SAVE.
- Click on the (⋮) icon next to the
- Also include the following role to enable NetSpring to debug BigQuery performance issues:
roles/bigquery.resourceViewer
- Alternatively, the following privileges can be directly granted:
`bigquery.jobs.get`
`bigquery.jobs.listAll`
- You also have to add details of the schema with the NetSpring application. To do this,
- On the left panel in your NetSpring app, navigate to Settings.
- Under General Settings, enable Materialization using the toggle and specify the following details:
- Database - the name of the database in the data warehouse where the materialized tables will be created.
- Schema - the name of the schema within the database where the materialized tables will be created. This value will be
netspring_rw
. - Refresh Cron Schedule - the refresh periodicity of the materialized tables using the Cron syntax.
When this is done, NetSpring creates materialized tables within the data warehouse which contain intermediate results for improving performance.