Overview
This article outlines the process of configuring Snowflake as a data source in DvSum, enabling integration for data cataloging and profiling. The steps provided apply to both DvSum Data Insights (DI) and DvSum Data Quality (DQ), with only minor variations based on the specific platform.
Adding a Snowflake source in DvSum:
Prerequisite: Enabling Query History for Snowflake
Before configuring Snowflake as a source, ensure that query history is enabled for your Snowflake account. This is crucial for tracking data lineage and gaining insights into usage patterns. For more information, refer to the Enabling Query History for Data Sources article.
Follow the steps below to configure and authenticate a Snowflake source:
Step 1:
- Go to the Data Sources tab and click the Add Source button.
- A modal opens and asks you to choose the type of data source to add.
- Select Snowflake, enter a source name, and click Save.
Step 2: Once the source is saved, you are redirected to the connection settings page of the new Snowflake source.
- Select the On-premise Web Service checkbox.
- Select a SAWS that is set up and currently running.
- Enter the correct URL, Warehouse, DB Login, and Password.
- Click the “Authenticate” button to authenticate the source.
Note: By default, the SAWS type is Cloud. For more information about Cloud SAWS, refer to the Cloud SAWS documentation.
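Authentication usually fails because of a mistyped value in one of the four fields, so it can help to check them before pasting them into the form. The following is a minimal sketch, not part of DvSum: the `ConnectionSettings` class and `find_problems` function are hypothetical names, and the check assumes a standard Snowflake account URL whose host ends in `.snowflakecomputing.com`. Your deployment may use a different URL format.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass
class ConnectionSettings:
    """Hypothetical container for the four fields on the connection settings page."""
    url: str
    warehouse: str
    db_login: str
    password: str


def find_problems(settings: ConnectionSettings) -> List[str]:
    """Return human-readable problems; an empty list means nothing obvious is wrong."""
    problems = []

    # Accept the URL with or without a scheme so urlparse can extract the host.
    raw = settings.url.strip()
    parsed = urlparse(raw if "://" in raw else "//" + raw)
    host = parsed.hostname or ""

    # Assumption: a standard Snowflake account host ends in .snowflakecomputing.com.
    if not host.endswith(".snowflakecomputing.com"):
        problems.append("URL host does not end with .snowflakecomputing.com")

    # The remaining fields only need to be non-empty for this basic check.
    for label, value in (
        ("Warehouse", settings.warehouse),
        ("DB Login", settings.db_login),
        ("Password", settings.password),
    ):
        if not value.strip():
            problems.append(f"{label} is empty")

    return problems
```

An empty result only means the values look plausible. The “Authenticate” button remains the actual test of the credentials.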
Step 3: Once the source is authenticated, a Database section appears below the Authenticate button. Select the database to scan. This is a single-select field, so only one database can be selected.
Once the database is selected, you can either limit the scan to specific schemas or scan all of them.
A Snowflake source also has a Schema field with a checkbox. To scan all schemas, leave the checkbox unchecked and proceed with saving and scanning the source. To limit the scan to specific schemas, check the checkbox.
Once it is checked, the list of available schemas is displayed. Select one or more schemas from the Available Schemas list and move them to the Selected Schemas tab on the right.
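The schema selection rule described in this step can be summarized as a small function. This is only an illustrative sketch of the rule, not DvSum code: the function name `schemas_to_scan` and its parameters are hypothetical, and the sketch does not model what the application does when the checkbox is checked but no schema is selected.

```python
from typing import List, Sequence


def schemas_to_scan(
    available: Sequence[str],
    limit_to_selected: bool,
    selected: Sequence[str] = (),
) -> List[str]:
    """Return the schemas a scan would cover under the rule described above."""
    # Checkbox unchecked: every schema in the chosen database is scanned.
    if not limit_to_selected:
        return list(available)

    # Checkbox checked: only schemas moved to the Selected Schemas side count.
    unknown = [name for name in selected if name not in available]
    if unknown:
        raise ValueError(f"Not in the available list: {unknown}")

    # Keep the order of the available list for a stable result.
    return [name for name in available if name in selected]
```

For example, with available schemas `PUBLIC`, `SALES`, and `HR`, leaving the checkbox unchecked covers all three, while checking it and selecting `HR` and `PUBLIC` covers only those two.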
Step 4: After the credentials are authenticated and the database and schema(s) are selected, save the source. Scroll to the top of the page and click the “Done” button in the top-right corner.
Then click the “Save” button. The source is saved.
Step 5: Go to the Scan History page and click the "Scan Now" button. This creates a job and starts the scan. When the job status changes to Completed, the scan of the new Snowflake source has finished successfully.
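Waiting for the job to reach the Completed status is a simple polling pattern. The sketch below illustrates that pattern only. It does not call any DvSum API: `get_status` is a hypothetical callable that you would supply (for example, one that reads the status from wherever you track the job), and the default timings are arbitrary assumptions.

```python
import time
from typing import Callable, Sequence


def wait_for_scan(
    get_status: Callable[[], str],
    done_statuses: Sequence[str] = ("Completed",),
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it returns a finished status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in done_statuses:
            return status
        # Give up rather than wait forever on a job that never finishes.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Scan still in status {status!r} after timeout")
        sleep(poll_seconds)
```

The `sleep` parameter exists only so the wait can be replaced in tests. In normal use, the defaults pause five seconds between checks.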
After the scan completes, click the scan name to open its Scan Summary page.
The Scan Summary page shows the results of the scan, i.e., how many new tables and columns were fetched from the schemas selected earlier.
For more details about the tables, click "Data Dictionary" in the sidebar. The table listing view appears. Click the "Recently Refreshed" tab to see all the tables fetched in the most recent scan. Click a table name to open its detail page.
Watch this quick video tutorial on how to add and configure a Snowflake source in the DvSum app.