Data Integration & Analytics Glossary
Learn about the major concepts and terms for data analytics, business intelligence, and data integration with this in-depth industry glossary.
A
AI Analytics
AI analytics refers to the use of machine learning to automate processes, analyze data, derive insights, and make predictions or recommendations.
Analytics as a Service
Analytics as a service refers to a subscription-based model in which data analytics and BI processes take place on cloud-based, vendor-managed systems rather than using on-premise hardware.
Analytics Dashboard
An analytics dashboard is an interactive graphical user interface that allows you to display, track, and analyze key performance indicators and metrics. Modern dashboards can combine real-time data from multiple sources and provide AI-assisted data preparation, chart creation, and analysis.
Apache Kafka
Apache Kafka is an open-source distributed event streaming platform which is optimized for ingesting and transforming real-time streaming data. By combining messaging, storage, and stream processing, it allows you to store and analyze historical and real-time data.
Augmented Analytics
Augmented analytics describes the use of artificial intelligence (AI) and machine learning technologies within a data analytics platform to enhance human intuition and productivity across the analytics lifecycle.
AutoML
AutoML (short for automated machine learning) refers to the tools and processes which make it easy to build, train, deploy and serve custom machine learning models.
B
BI Dashboard
A BI dashboard is a business intelligence tool which allows users to track, analyze, and report on key performance indicators and other metrics. BI dashboards typically visualize data in charts, graphs, and maps, helping stakeholders understand, share, and collaborate on the information.
Big Data Analytics
Big data analytics is the process of collecting, preparing and analyzing large, diverse data sets to generate valuable insights.
Business Dashboard
A business dashboard is an interactive data visualization and analysis tool, enabling the presentation, monitoring, and examination of key performance indicators (KPIs) and metrics.
Business Intelligence
Business intelligence (BI) combines applications, processes, and infrastructure that enables access to and analysis of information to improve and optimize decisions and performance.
Business Intelligence Reporting
Business Intelligence reporting is broadly defined as the process of using a BI tool to prepare and analyze data to find and share actionable insights.
Business Intelligence Tools
Business intelligence tools are technology or software applications used to collect, combine, and analyze various types of business-relevant information.
C
CDC SQL Server
SQL Server CDC (change data capture) is the process of recording changes in a Microsoft SQL Server database and then delivering those changes to a downstream system.
Change Data Capture
Change data capture (CDC) refers to the process of identifying and capturing changes made to data in a database and then delivering those changes in real time to a downstream process or system.
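The idea can be sketched in a few lines of Python. This is a toy snapshot-diff illustration, not how production CDC tools (which typically read the database transaction log) work; all names and data are invented.

```python
# Toy CDC sketch: diff two snapshots of a table (dicts keyed by primary
# key), emit change events, and deliver them to a downstream replica.

def capture_changes(old_rows, new_rows):
    """Compare snapshots and emit CDC-style insert/update/delete events."""
    events = []
    for key, row in new_rows.items():
        if key not in old_rows:
            events.append({"op": "insert", "key": key, "row": row})
        elif old_rows[key] != row:
            events.append({"op": "update", "key": key, "row": row})
    for key in old_rows:
        if key not in new_rows:
            events.append({"op": "delete", "key": key})
    return events

def apply_changes(replica, events):
    """Deliver captured changes to a downstream copy of the table."""
    for e in events:
        if e["op"] == "delete":
            replica.pop(e["key"], None)
        else:
            replica[e["key"]] = e["row"]
    return replica

old = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
new = {1: {"name": "Ada L."}, 3: {"name": "Cy"}}
events = capture_changes(old, new)
replica = apply_changes(dict(old), events)
```

After the events are applied, the downstream replica matches the source, which is the essential CDC guarantee.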
Cloud Analytics
Cloud analytics is a service model in which data analytics and business intelligence processes occur on a public or private cloud rather than on a company’s on-premise servers, streamlining the journey from raw data to insights.
Cloud Data Warehouse
A cloud data warehouse is a database stored as a managed service in a public cloud and optimized for scalable BI and analytics. It removes the constraint of physical data centers and lets you rapidly grow or shrink your data warehouses to meet changing business needs.
D
Dashboard
A dashboard presents critical data, visualizations, and KPIs focused on the specific needs of analytics user segments, allowing for a quicker, more organized review and analysis of business-critical information and trends.
Dashboard Reporting
Dashboard reporting helps businesses make better informed decisions by allowing users to not only visualize KPIs and track performance, but also interact with data directly within the dashboard to analyze trends and gain insights.
Dashboard Software
Dashboard software allows users to create visual representations of data and KPIs, helping them recognize patterns and make faster, data-driven decisions.
Data Aggregation
Data aggregation is the process of combining datasets from diverse sources into a single format and summarizing them to support analysis and decision-making.
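A minimal sketch of both steps, normalizing records from two hypothetical sources with different field names into one format, then summarizing by region (schemas and figures are invented):

```python
# Two sources with different schemas for the same kind of record
source_a = [{"region": "EU", "amount": 120.0}, {"region": "US", "amount": 80.0}]
source_b = [{"geo": "EU", "total": 40.0}]

# Step 1: combine into a single common format
combined = [{"region": r["region"], "amount": r["amount"]} for r in source_a]
combined += [{"region": r["geo"], "amount": r["total"]} for r in source_b]

# Step 2: summarize -- total amount per region
totals = {}
for rec in combined:
    totals[rec["region"]] = totals.get(rec["region"], 0.0) + rec["amount"]
```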
Data Analytics
Data analytics refers to the use of processes and technology to combine and examine datasets, identify meaningful patterns, correlations, and trends in them, and most importantly, extract valuable insights.
Data Analytics Tools
Data analytics tools are technology or software applications that allow users to find patterns, trends, and relationships in their data.
Data Catalog
A data catalog is an inventory of data assets that uses metadata along with data management and search tools to provide on-demand access to business-ready data.
Data Dashboard
A data dashboard is an interactive tool that allows you to track, analyze, and display KPIs and metrics. Modern dashboards allow you to combine real-time data from multiple sources and provide AI-assisted data preparation, chart creation, and analysis.
Data Discovery
Data discovery refers to the process of exploring and analyzing data to uncover patterns, identify relationships, and gain insights that improve decision making and business performance.
Data Exploration
Data exploration refers to the process of reviewing a raw dataset to uncover characteristics and initial patterns for further analysis.
Data Fabric
Data fabric refers to a machine-enabled data integration architecture that utilizes metadata assets to unify, integrate, and govern disparate data environments.
Data Governance
Data governance refers to the set of roles, processes, policies and tools which ensure proper data quality throughout the data lifecycle and proper data usage across an organization.
Data Ingestion
Data ingestion refers to the tools and processes used to collect data from various sources and move it to a target site, either in batches or in real-time.
Data Integration
Data integration is the process of synchronizing data across applications and data platforms and providing users with comprehensive, accurate, and up-to-date information for business intelligence and analytics.
Data Integrity
Data integrity refers to the accuracy, consistency, and completeness of data throughout its lifecycle.
Data Lake
A data lake is a centralized repository that holds all of your organization's structured and unstructured data. It employs a flat architecture which allows you to store raw data at any scale without the need to structure it first.
Data Lake Architecture
The modern data lake architecture provides rapid data access and analytics by having all necessary compute resources and storage objects internal to the data lake platform.
Data Lake vs Data Warehouse
Data lakes and data warehouses are both repositories for storing large amounts of data, but they serve different purposes. Data lakes typically store large volumes of raw, unstructured data, while data warehouses store structured data that has been processed based on predefined business needs.
Data Lakehouse
A data lakehouse is a data management architecture which combines key capabilities of data lakes and data warehouses. It brings the benefits of a data lake, such as low storage cost and broad data access, plus the benefits of a data warehouse, such as data structures and management features.
Data Lineage
Data lineage refers to the process of understanding and visualizing data flows from source to current location and tracking any alterations made to the data on its journey.
Data Literacy
Data literacy is the ability to read, work with, analyze and communicate with data, building the skills to ask the right questions of data and machines to make decisions and communicate meaning to others.
Data Management
Data management refers to the process of collecting, storing, organizing, and maintaining data to support analysis and decision-making.
Data Mart
A data mart is a structured data repository purpose-built to support the analytical needs of a particular department, line of business, or geographic region within an enterprise.
Data Mesh
Data mesh refers to a data architecture where data is owned and managed by the teams that use it. A data mesh decentralizes data ownership and provides a self-serve data platform and federated computational governance.
Data Migration
Data migration is the process of moving data between storage systems, applications, or formats. Typically a one-time process, it can include prepping, extracting, transforming and loading the data.
Data Mining
Data mining is the process of using statistical analysis and machine learning to discover hidden patterns, correlations, and anomalies within large datasets.
Data Modeling
Data modeling is the process of creating a diagram that represents your data system and defines the structure, attributes, and relationships of your data entities.
Data Pipeline
A data pipeline is a set of tools and processes used to automate the movement and transformation of data between a source system and a target repository. Building data pipelines can break down data silos and create a single, complete picture of your business.
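A pipeline's stages can be sketched as a chain of small functions, each consuming the previous stage's output. The source and target here are plain in-memory structures standing in for real systems, and every name is illustrative:

```python
def extract(source):
    return list(source)                       # read raw records

def transform(records):
    # clean: drop empty entries, normalize whitespace and case
    return [r.strip().lower() for r in records if r.strip()]

def load(records, target):
    target.extend(records)                    # write to the target store
    return target

raw = ["  Alice ", "BOB", "", "Carol"]
warehouse = []
load(transform(extract(raw)), warehouse)
```

Real pipelines add scheduling, error handling, and monitoring around exactly this extract-transform-load shape.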
Data Products
Data products are highly trusted, re-usable, and consumable data assets purposefully designed for domain-specific business outcomes.
Data Quality
Data quality assesses the extent to which a dataset meets established standards for accuracy, consistency, reliability, completeness, and timeliness.
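One of those dimensions, completeness, can be measured directly as the share of non-missing values per field. A minimal sketch with invented records:

```python
rows = [
    {"id": 1, "email": "a@x.com", "age": 34},
    {"id": 2, "email": None,      "age": 28},
    {"id": 3, "email": "c@x.com", "age": None},
]

def completeness(rows, field):
    """Fraction of rows where the field has a non-missing value."""
    filled = sum(1 for r in rows if r.get(field) is not None)
    return filled / len(rows)

scores = {f: completeness(rows, f) for f in ("id", "email", "age")}
```

Similar per-field scores can be defined for the other dimensions, such as validity (values match an expected format) or timeliness (records are recent enough).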
Data Replication
Data replication is the process of creating and maintaining identical copies of data across multiple storage locations, systems, or databases, either in real time or periodically.
Data Science vs Data Analytics
Data science and data analytics are closely related, but there are differences between the two fields. One key difference is that data science involves building custom models and algorithms, while data analytics focuses on examining existing data to answer business questions.
Data Strategy
A data strategy is a structured plan outlining how your organization will manage, utilize, and derive value from your data assets. It encompasses policies, processes, and technologies aimed at ensuring data quality, security, and compliance.
Data Transformation
Data transformation refers to the process of cleaning, validating, and preparing data to match that of a target system.
Data Trends
Data trends are the emerging technologies and practices shaping BI and data analytics, which organizations can track and adopt to their advantage.
Data Vault
A data vault is a flexible, agile, and scalable data modeling approach in data warehousing to handle complex data structures and support enterprise analytics.
Data Visualization
Data visualization enables people to easily uncover actionable insights by presenting information and data in graphical, often interactive formats such as graphs, charts, and maps.
Data Visualization Examples
Data visualization examples illustrate how a well-done chart can turn huge datasets into clear stories on any topic, from food to music to politics.
Data Visualization Tools
Data visualization tools let users create graphics and imagery that help them make sense out of large amounts of data and make more informed decisions.
Data Warehouse
A data warehouse is a data management system which aggregates large volumes of data from multiple sources into a single repository of highly structured and unified historical data.
Data Warehouse Automation
Data warehouse automation is the process of automating the entire data warehouse lifecycle, from data modeling and real-time ingestion to data marts and governance, to accelerate the availability of analytics-ready data.
Data Wrangling
Data wrangling is the process of cleaning, structuring, and transforming raw data into a usable format for analysis. Also known as data munging, it involves tasks such as handling missing or inconsistent data, formatting data types, and merging different datasets.
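Those three tasks can be sketched in a few lines of Python; the field names and records are invented for illustration:

```python
# Raw orders with string-typed fields and a missing value, plus a
# second dataset to merge in
orders = [{"id": "1", "amount": "19.99"}, {"id": "2", "amount": None}]
customers = {1: "Ada", 2: "Bob"}

wrangled = []
for o in orders:
    wrangled.append({
        "id": int(o["id"]),                                    # format data type
        "amount": float(o["amount"]) if o["amount"] else 0.0,  # handle missing value
    })

# Merge: attach the customer name from the second dataset by key
for o in wrangled:
    o["customer"] = customers[o["id"]]
```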
Database Replication
Database replication refers to the process of copying data from a primary database to one or more replica databases in order to improve data accessibility and system fault-tolerance and reliability.
DataOps
DataOps is a data management methodology that aims to improve the communication, integration, and automation of data flows between data management and consumers throughout an organization.
Decision Support System
A decision support system (DSS) is an analytics software program used to gather and analyze data to inform decision making, either by suggesting insights and analyses for humans to perform or by automating calculations and delivering best-case decisions.
Delta Lake
Delta Lake is an open-source storage layer that brings the ACID transaction guarantees of transactional databases to data lakes, improving their reliability, performance, and flexibility.
Descriptive Analytics
Descriptive analytics focuses on summarizing and interpreting historical data to gain insights into events, patterns, and trends in a business.
Digital Dashboard
A digital dashboard is an electronic interface which allows users to track, analyze and report on KPIs and metrics. Modern, interactive dashboards make it easy to combine data from multiple sources and deeply explore and analyze the data directly within the dashboard itself.
E
ELT
ELT stands for “Extract, Load, and Transform” and describes the set of data integration processes to extract data from one system, load it into a target repository, and then transform it for downstream uses such as business intelligence (BI) and big data analytics.
Embedded Analytics
Embedded analytics refers to the seamless integration of data analysis and reporting capabilities directly into other software applications.
ETL
ETL stands for “Extract, Transform, and Load” and describes the set of processes to extract data from one system, transform it, and load it into a target repository.
ETL Pipeline
An ETL pipeline is a set of processes to extract data from one system, transform it, and load it into a target repository. By converting raw data to match the target system before loading, ETL pipelines allow for systematic and accurate data analysis in the target repository.
ETL Tool
An ETL tool is used to consolidate and transform multi-sourced data into a common format and load the transformed data into an easy-to-access storage environment such as a data warehouse or data mart.
ETL vs ELT
The ETL and ELT acronyms both describe processes of extracting, transforming, and loading data from a source into a target repository. In the ETL process, data transformation is performed in a staging area outside of the target repository; in ELT, transformation is performed on an as-needed basis in the target system itself.
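The difference is purely one of ordering and location, which a small sketch can make concrete. Here an in-memory SQLite database stands in for the target repository, and the schema and rows are invented:

```python
import sqlite3

raw = [(" Alice ", "42"), (" Bob ", "17")]   # messy source records

# --- ETL: transform in a staging step, then load clean rows ---
etl_db = sqlite3.connect(":memory:")
etl_db.execute("CREATE TABLE users (name TEXT, age INTEGER)")
staged = [(name.strip(), int(age)) for name, age in raw]   # transform first
etl_db.executemany("INSERT INTO users VALUES (?, ?)", staged)

# --- ELT: load raw rows as-is, transform later inside the target ---
elt_db = sqlite3.connect(":memory:")
elt_db.execute("CREATE TABLE raw_users (name TEXT, age TEXT)")
elt_db.executemany("INSERT INTO raw_users VALUES (?, ?)", raw)
elt_db.execute(
    "CREATE TABLE users AS "
    "SELECT trim(name) AS name, CAST(age AS INTEGER) AS age FROM raw_users"
)

etl_rows = etl_db.execute("SELECT * FROM users ORDER BY name").fetchall()
elt_rows = elt_db.execute("SELECT * FROM users ORDER BY name").fetchall()
```

Both routes end with identical clean tables; ELT simply defers the transformation work to the target system, which suits scalable cloud warehouses.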
Explainable AI
Explainable AI (XAI) refers to a set of techniques and processes that help you understand the rationale behind the output of a machine learning algorithm.
F
Financial Analysis
Financial analysis is the process of examining financial statements and other relevant data to assess the financial health and performance of an organization.
Financial Analytics
Financial analytics is the use of tools and processes to combine and analyze datasets to gain insights into the financial performance of your organization.
I
Interactive Data Visualization
Interactive data visualization is the use of tools and processes to produce a visual representation of data which can be explored and analyzed directly within the visualization itself. This interaction can help uncover insights which lead to better, data-driven decisions.
iPaaS
Integration platform as a service (iPaaS) refers to a cloud-based platform that enables the integration of various applications and data sources across different cloud and on-premise environments.
K
Kafka Streams
Kafka streams integrate real-time data from diverse source systems and make that data consumable as a message sequence by applications and analytics platforms such as data lake systems.
KPI
KPI stands for key performance indicator, a quantifiable measure of performance over time for a specific objective.
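In practice a KPI is just a metric computed consistently over time and compared against a target. A toy sketch with an invented conversion-rate KPI and a made-up 5% target:

```python
monthly = {
    "Jan": {"visits": 1000, "signups": 40},
    "Feb": {"visits": 1200, "signups": 66},
}

TARGET = 0.05  # hypothetical conversion-rate objective

# KPI: signup conversion rate, tracked per month
kpi = {month: d["signups"] / d["visits"] for month, d in monthly.items()}
on_target = {month: rate >= TARGET for month, rate in kpi.items()}
```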
KPI Dashboard
A KPI dashboard displays key performance indicators in interactive charts and graphs, allowing for quick, organized review and analysis.
KPI Examples
KPI examples provide stakeholders guidance in selecting the most impactful key performance indicators for their organization and teams.
KPI Reports
KPI reports provide a graphical, at-a-glance view of key metrics in real time, helping decision-makers track the performance of their company, department, or initiatives, and identify areas in need of improvement.
M
Machine Learning vs AI
Machine learning is a subset of AI focused on algorithms enabling computers to learn and make predictions without being explicitly programmed.
Marketing Analytics
Marketing analytics is the practice of combining and analyzing datasets, identifying patterns, and then coming away with actionable insights that improve the ROI of marketing efforts.
Marketing KPIs
Marketing KPIs are quantifiable measures of performance for specific strategic objectives. Marketing leaders and teams use KPIs to gauge the effectiveness of their efforts, guide their strategy, and optimize their programs and campaigns.
Metadata Management
Metadata management refers to the organization and control of data which describes technical, business, or operational aspects of other data.
Modern Data Stack
A modern data stack (MDS) is a collection of tools and technologies used to gather, store, process, and analyze data in a scalable and cost-effective way.
P
People Analytics
People analytics refers to the tools and processes used to analyze data to gain insights into the hiring, productivity, engagement and retention of talent.
Predictive Analytics
Predictive analytics refers to the use of statistical modeling, data mining techniques and machine learning to make predictions about future outcomes based on historical and current data.
Predictive Analytics Examples
A wide range of industries and job roles leverage predictive analytics for use cases such as fraud detection, forecasting, and healthcare diagnosis.
Predictive Modeling
Predictive modeling is a statistical technique used to predict the outcome of future events based on historical data. Machine learning algorithms are used to train and improve mathematical models to help you make better decisions.
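The simplest instance is an ordinary least-squares linear fit: estimate a line from historical data, then use it to predict an unseen value. A self-contained sketch on toy data (real predictive models are trained on far larger, noisier datasets):

```python
# Historical observations; chosen to lie exactly on y = 2x + 1
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least-squares estimates of slope and intercept
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def predict(x):
    """Predict an outcome for a new, unseen input."""
    return slope * x + intercept
```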
Prescriptive Analytics
Prescriptive analytics is the use of advanced processes and tools to analyze data and content to recommend the optimal course of action or strategy moving forward.
Predictive vs Prescriptive Analytics
Predictive analytics focuses on forecasting outcomes to help you make decisions. Prescriptive analytics takes the extra step to recommend the specific, optimal course of action or strategy for you to take.
R
Real-Time Analytics
Real-time analytics refers to the use of tools and processes to analyze and respond to real-time information about your customers, products, and applications as it is generated.
Real-Time Data
Real-time data refers to information that is made available for use as soon as it is generated. Ideally, this data is passed instantly from source to consuming app.
Reporting vs Analytics
Reporting is the process of gathering and presenting data in a structured way. Analytics is the process of analyzing your data to find patterns and gain insights.
Revenue Operations
Revenue operations (RevOps) is a management model that accelerates revenue by aligning customer-facing teams behind shared objectives.
S
Spatial Analysis
Spatial analysis is the collection, display and manipulation of location data—or geodata—such as addresses, satellite images and GPS coordinates to uncover location-based insights.
Streaming Data
Streaming data refers to data which is continuously flowing from a source system to a target. It is usually generated at high speed by many data sources.
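Stream processing is often sketched with a generator that yields events as they arrive and an operator that computes results over fixed-size windows. The sensor readings and window size below are invented:

```python
def sensor_stream():
    """Stand-in for a live feed: yields readings one at a time."""
    for reading in [10, 12, 14, 20, 22, 24]:
        yield reading

def tumbling_averages(stream, size):
    """Emit the average of each consecutive, non-overlapping window."""
    window = []
    for value in stream:
        window.append(value)
        if len(window) == size:
            yield sum(window) / size
            window = []

averages = list(tumbling_averages(sensor_stream(), 3))
```

The key property is incremental processing: each result is produced as soon as its window fills, without waiting for the stream to end.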
Supply Chain Analytics
Supply chain analytics refers to the tools and processes used to combine and analyze data from multiple systems to gain insights into the procurement, processing and distribution of goods.
V
Visual Analytics
Visual analytics integrates computational analysis techniques with interactive visualizations, offering users an intuitive way to interact with, explore, and manipulate data.