Data Profiling & Exploration
Data Profiling and Exploration lets you discover and understand what data exists across your data sources. It visualizes your data as a map that you can click to find more information, identify missing relationships, profile the data, and manage multiple audits for the same data element in one place. Data exploration can help with use cases such as:
- A business user has been asked to audit a particular data element for regulatory requirements, but doesn’t know which database or schema contains it.
- An IT team is designing a new database and wants to find out where constraints and relationships (such as primary keys and foreign keys) might be missing.
- Data governance: get information on areas where data is proliferating or duplicated, and on the integrity of that data.
Once your administrator has added data sources into the system, they are available to you for exploration. Alternatively, if you have just signed up, you can use the demo data sources that are automatically available within your account.
Data Profiling is the first step in the PARC process: Profile, Audit, Review, Comply. You would typically profile the data first and then set up audits based on your evaluation of the data profiles.
Data Profiling is the workflow that helps you understand the characteristics of your data. It helps you answer the following questions:
- Table level
  - How many records are in the data table?
  - When was the data table last updated?
  - What is the primary key of the data table?
- Column level
  - Sparseness / Density
  - Unique values
  - Missing / Blank values
  - Minimum and maximum values
  - Pattern of data
  - Distribution of data
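To make the column-level measures above concrete, here is a minimal Python sketch of how they could be computed for a single column. It is an illustration only, not the product's implementation: the names `profile_column` and `pattern_of` are hypothetical, the definition of density as the share of non-blank values is an assumption, and the sample ZIP codes are made up. It also assumes all non-blank values in a column are mutually comparable (for example, all strings).

```python
import re
from collections import Counter


def pattern_of(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    text = re.sub(r"[0-9]", "9", str(value))
    return re.sub(r"[A-Za-z]", "A", text)


def profile_column(values):
    """Compute simple column-level profile measures (hypothetical helper)."""
    total = len(values)
    # Treat None and empty/whitespace-only strings as missing or blank.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    return {
        "count": total,
        "missing": total - len(present),
        # Assumed definition: density = share of non-blank values.
        "density": len(present) / total if total else 0.0,
        "unique": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        # Most common value shapes and most common values.
        "patterns": Counter(pattern_of(v) for v in present).most_common(3),
        "distribution": Counter(present).most_common(3),
    }


# Made-up sample column with one blank, one null, and one malformed value.
zip_codes = ["30301", "30302", "30301", "", None, "3030A"]
profile = profile_column(zip_codes)
for measure, result in profile.items():
    print(f"{measure}: {result}")
```

In this sample, the pattern counts separate the one malformed value (shape `9999A`) from the three well-formed ones (shape `99999`), which is the kind of signal you might use when deciding which audits to set on a column.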
You can access Data Profiling from the left navigation by selecting the source and tables you want to view, or by selecting a view from your history, which is displayed automatically. Profiling information is also available when you review nodes in the Data Exploration section.