Unified Data Platform Solution Accelerator

A production-ready, enterprise-grade data platform built on Microsoft Fabric with optional integrations for Microsoft Purview governance and Azure Databricks advanced analytics.

License: MIT | Azure | Microsoft Fabric


Overview

The Unified Data Platform Solution Accelerator provides a complete, configurable data foundation using the medallion architecture pattern (Bronze, Silver, Gold). It enables organizations to rapidly deploy a governed, analytics-ready data platform with minimal configuration.

Key Capabilities

  • Medallion architecture with Bronze, Silver, and Gold lakehouses
  • 48 PySpark notebooks for automated data transformations
  • Pre-built Power BI semantic models and dashboards
  • Optional Microsoft Purview integration for enterprise governance
  • Optional Azure Databricks integration for advanced analytics
  • Multi-domain data models (Sales, Finance, Customer, Product)
  • Automated deployment via Azure Developer CLI (azd)

Architecture Options

The solution offers four deployment configurations based on organizational requirements:

| Option | Components | Description |
|--------|------------|-------------|
| Option 1 | Fabric + Power BI | Core medallion architecture with analytics dashboards |
| Option 2 | Option 1 + Purview | Adds data governance, lineage tracking, and classification |
| Option 3 | Option 1 + Databricks | Adds advanced analytics and hybrid processing |
| Option 4 | All components | Complete enterprise data platform |

Option 1: Core Medallion Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     Microsoft Fabric Workspace                   │
├─────────────────┬─────────────────┬─────────────────────────────┤
│  Bronze         │  Silver         │  Gold                       │
│  Lakehouse      │  Lakehouse      │  Lakehouse                  │
│  (Raw Data)     │  (Validated)    │  (Enriched)                 │
├─────────────────┴─────────────────┴─────────────────────────────┤
│  48 PySpark Notebooks  │  Semantic Models  │  Power BI Reports  │
└─────────────────────────────────────────────────────────────────┘
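
The three layers above can be pictured with a plain-Python sketch (no Spark; the column names and sample rows are hypothetical): raw rows land in Bronze as-is, validation filters them into Silver, and Gold aggregates them for reporting.

```python
import csv
import io
from collections import defaultdict

# Bronze: raw CSV exactly as landed (hypothetical columns and values).
raw_csv = """order_id,customer,amount
1001,Contoso,250.00
1002,,75.50
1001,Contoso,250.00
1003,Fabrikam,invalid
"""

# Silver: validated rows -- drop duplicate keys, missing customers,
# and unparsable amounts.
seen, silver = set(), []
for row in csv.DictReader(io.StringIO(raw_csv)):
    key = row["order_id"]
    if not row["customer"] or key in seen:
        continue
    try:
        amount = float(row["amount"])
    except ValueError:
        continue
    seen.add(key)
    silver.append({"order_id": key, "customer": row["customer"], "amount": amount})

# Gold: business-level aggregate, ready for a semantic model or report.
gold = defaultdict(float)
for row in silver:
    gold[row["customer"]] += row["amount"]

print(dict(gold))  # {'Contoso': 250.0}
```

In the actual solution each hop is a PySpark notebook writing Delta tables, but the filtering-then-aggregating shape is the same.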

Option 4: Full Enterprise Architecture

┌──────────────────────────────────────────────────────────────────────────┐
│                         Microsoft Purview                                 │
│  (Unified Catalog, Data Map, Governance Domains, Lineage)                │
└────────────────────────────────┬─────────────────────────────────────────┘
                                 │
┌────────────────────────────────┴─────────────────────────────────────────┐
│                       Microsoft Fabric Workspace                          │
├─────────────────┬─────────────────┬──────────────────────────────────────┤
│  Bronze         │  Silver         │  Gold                                │
│  Lakehouse      │  Lakehouse      │  Lakehouse ←── Shortcuts/Mirroring   │
└─────────────────┴─────────────────┴──────────────┬───────────────────────┘
                                                   │
                                    ┌──────────────┴───────────────┐
                                    │    Azure Databricks          │
                                    │    Unity Catalog             │
                                    │    (Advanced Analytics)      │
                                    └──────────────────────────────┘

Repository Structure

unified-data-platform-solution-accelerator/
├── docs/
│   ├── DeploymentGuide.md              # Main deployment overview
│   ├── DeploymentGuideFabric.md        # Automated Fabric deployment
│   ├── DeploymentGuideFabricManual.md  # Manual Fabric deployment
│   ├── DeploymentGuidePowerBI.md       # Power BI configuration
│   ├── DeploymentGuidePurview.md       # Purview integration
│   ├── DeploymentGuideDatabricks.md    # Databricks deployment
│   ├── NotebooksGuideFabric.md         # Fabric notebooks reference
│   ├── NotebooksGuideDatabricks.md     # Databricks notebooks reference
│   ├── SetupPurview.md                 # Purview provisioning
│   ├── SetupDatabricks.md              # Databricks provisioning
│   ├── TechnicalArchitecture.md        # Architecture details
│   ├── LocalDevelopmentSetup.md        # Local dev environment
│   ├── QuotaCheck.md                   # Azure quota verification
│   └── SampleWorkflow.md               # Validation workflow
├── infra/
│   ├── scripts/
│   │   ├── fabric/                     # Fabric deployment scripts
│   │   ├── databricks/                 # Databricks deployment scripts
│   │   └── utils/                      # Utility scripts
│   └── bicep/                          # Infrastructure as Code
├── src/
│   ├── fabric/
│   │   └── notebooks/
│   │       ├── bronze_to_silver/       # 16 transformation notebooks
│   │       ├── silver_to_gold/         # 16 aggregation notebooks
│   │       ├── schema/                 # 8 schema definition notebooks
│   │       ├── data_management/        # 5 utility notebooks
│   │       ├── run_bronze_to_silver.ipynb
│   │       └── run_silver_to_gold.ipynb
│   └── databricks/
│       └── notebooks/
│           ├── bronze_to_adb_silver/   # 3 data loading notebooks
│           ├── schema/                 # 2 schema notebooks
│           ├── data_management/        # 2 cleanup notebooks
│           └── run_bronze_to_adb.ipynb
├── reports/
│   └── UDPLZ_SalesDashboard.pbix        # Power BI dashboard
├── data/
│   └── samples/                        # Sample CSV data files
├── azure.yaml                          # Azure Developer CLI config
└── README.md

Prerequisites

Required

  • Azure Subscription with Owner or Contributor access
  • Microsoft Fabric Capacity (F2 or higher recommended)
  • Azure CLI (v2.50+)
  • Python 3.9+
  • Git

Optional (Based on Deployment Option)

  • Microsoft Purview Account (Option 2, 4)
  • Azure Databricks Workspace - Premium tier (Option 3, 4)
  • Power BI Pro or Premium Per User license

Permissions

| Component | Required Permission |
|-----------|---------------------|
| Fabric | Workspace Admin on target capacity |
| Purview | Data Curator, Collection Admin |
| Databricks | Workspace Admin, Unity Catalog privileges |
| Azure Resource Group | Contributor |

Quick Start

Step 1: Check Azure Quota

Before deployment, verify quota availability:

# Clone the repository
git clone https://github.com/PatrickGallucci/unified-data-platform.git
cd unified-data-platform-solution-accelerator

# Run quota check
cd infra/scripts
chmod +x quota_check_params.sh
./quota_check_params.sh --models gpt-4o-mini:150 --regions eastus,westus

Step 2: Authenticate

# Azure CLI
az login
az account set --subscription "<your-subscription-id>"

# Azure Developer CLI (for automated deployment)
azd auth login

Step 3: Deploy Option 1 (Fabric + Power BI)

Automated Deployment (Recommended)

# Set environment variables
export AZURE_FABRIC_CAPACITY_NAME="your-capacity-name"

# Deploy using azd
azd up

Manual Deployment

# Navigate to scripts directory
cd infra/scripts/utils

# Set environment variables
export AZURE_FABRIC_CAPACITY_NAME="your-capacity-name"
export AZURE_FABRIC_WORKSPACE_NAME="UDPLZ Data Platform Workspace"

# Run deployment script
pwsh ./run-python-script-fabric.ps1

Step 4: Configure Power BI (Post-Deployment)

  1. Open the udplz_gold lakehouse in Fabric
  2. Navigate to Lakehouse settings and copy the SQL analytics endpoint
  3. Open the deployed Power BI report and verify the connection

For detailed instructions, see Power BI Deployment Guide.


Deployment Guides

| Guide | Description |
|-------|-------------|
| Deployment Overview | Main deployment guide with all options |
| Fabric Automated | One-command deployment using azd |
| Fabric Manual | Step-by-step manual deployment |
| Power BI Setup | Dashboard configuration |
| Purview Integration | Data governance setup |
| Databricks Integration | Advanced analytics setup |
| Local Development | Development environment |

Data Model

The solution includes pre-built data models across multiple business domains:

Schemas

| Schema | Domain | Tables |
|--------|--------|--------|
| shared | Master Data | customer, product |
| sales | Sales Operations | order, orderline, orderpayment |
| salesfabric | Sales (Gold) | order, orderline, orderpayment |
| salesadb | Sales (Databricks) | order, orderline, orderpayment |
| finance | Financial Data | (extensible) |

Notebook Inventory

Fabric Notebooks (48 total)

| Category | Count | Purpose |
|----------|-------|---------|
| Bronze to Silver | 16 | Load CSV to validated tables |
| Silver to Gold | 16 | Transform and enrich data |
| Schema | 8 | Define data models |
| Data Management | 5 | Truncate, drop, utilities |
| Runners | 2 | Orchestration |
| Test/Sample | 1 | Analysis examples |

Databricks Notebooks (8 total)

| Category | Count | Purpose |
|----------|-------|---------|
| Bronze to ADB Silver | 3 | Load data to Unity Catalog |
| Schema | 2 | Model and permissions |
| Data Management | 2 | Cleanup utilities |
| Runner | 1 | Orchestration |

Power BI Dashboard

The included Power BI dashboard provides immediate business insights:

  • YOY Net Sales Comparison - Trend analysis across years
  • Revenue by Customer Segment - Individual, Business, Government breakdown
  • Top Products by Revenue - Product performance ranking
  • Top Products by Quantity - Volume analysis
  • Sales by Gender - Demographic distribution

Connection Configuration

The dashboard connects to the Gold lakehouse via SQL analytics endpoint:

Server: <workspace-name>.datawarehouse.fabric.microsoft.com
Database: udplz_gold
Authentication: Microsoft Entra ID
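
As an illustration, the settings above can be assembled into an ODBC connection string. This is a sketch, not the solution's own tooling: the workspace name below is a placeholder, and with pyodbc plus the Microsoft ODBC Driver 18 for SQL Server installed, the resulting string could be passed to pyodbc.connect().

```python
# Placeholder server name -- substitute the SQL analytics endpoint copied
# from the Gold lakehouse settings in Fabric.
server = "udplz-workspace.datawarehouse.fabric.microsoft.com"
database = "udplz_gold"

# Entra ID interactive authentication via the ODBC driver's
# "Authentication" keyword; the string only needs pyodbc to be used.
conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    f"Server={server};"
    f"Database={database};"
    "Authentication=ActiveDirectoryInteractive;"
    "Encrypt=yes;"
)
print(conn_str)
```

For unattended scenarios, a managed identity or service principal authentication mode would replace the interactive option.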

Extending the Solution

Adding New Domains

  1. Create schema notebook in src/fabric/notebooks/schema/:
# model_<domain>_gold.ipynb
spark.sql("CREATE SCHEMA IF NOT EXISTS <domain>")
spark.sql("""
    CREATE TABLE IF NOT EXISTS <domain>.<table> (
        id STRING,
        name STRING,
        created_at TIMESTAMP
    )
""")
  2. Create a bronze-to-silver notebook in src/fabric/notebooks/bronze_to_silver/
  3. Create a silver-to-gold notebook in src/fabric/notebooks/silver_to_gold/
  4. Register the new notebooks in the runner notebooks
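
The final registration step can be pictured with a small, hypothetical registry. In Fabric, runner notebooks invoke child notebooks through notebook utilities, which this stdlib-only sketch does not model; it only shows the ordering concern.

```python
# Hypothetical registry of notebooks per pipeline stage; the names mirror
# the folder convention used by this repository but are illustrative.
PIPELINE = {
    "bronze_to_silver": ["bronze_to_silver_customer", "bronze_to_silver_order"],
    "silver_to_gold": ["silver_to_gold_order"],
}

def register(stage: str, notebook: str) -> None:
    """Append a new domain's notebook to the end of a stage's run list."""
    PIPELINE.setdefault(stage, []).append(notebook)

def run_order(stage: str) -> list[str]:
    """Return the notebooks a runner would execute, in order."""
    return list(PIPELINE[stage])

# Registering a hypothetical new "inventory" domain:
register("bronze_to_silver", "bronze_to_silver_inventory")
print(run_order("bronze_to_silver"))
# ['bronze_to_silver_customer', 'bronze_to_silver_order', 'bronze_to_silver_inventory']
```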

Adding New Data Sources

  1. Upload source files to Bronze lakehouse Files/samples_fabric/<domain>/
  2. Create corresponding transformation notebooks
  3. Update runner notebooks to include new transformations
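
For step 1's upload location, a small helper (hypothetical, stdlib only) can build the Bronze Files path for a new domain, keeping the folder convention consistent:

```python
from pathlib import PurePosixPath

def sample_upload_path(domain: str, filename: str) -> str:
    """Target location in the Bronze lakehouse Files area for a new domain."""
    return str(PurePosixPath("Files/samples_fabric") / domain / filename)

# Hypothetical "inventory" domain:
print(sample_upload_path("inventory", "inventory.csv"))
# Files/samples_fabric/inventory/inventory.csv
```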

Validation

After deployment, validate the solution using the Sample Workflow:

Quick Validation Checklist

  • Fabric workspace contains 4 folders (lakehouses, notebooks, reports, databricks)
  • Three lakehouses created: udplz_bronze, udplz_silver, udplz_gold
  • 48 notebooks deployed in organized folder structure
  • Sample CSV data loaded in Bronze lakehouse
  • Run run_bronze_to_silver notebook successfully
  • Run run_silver_to_gold notebook successfully
  • Power BI report displays data correctly
  • (Option 2) Purview scan discovers Fabric assets
  • (Option 3) Databricks mirrored catalog accessible in Fabric

SQL Validation

-- Verify table counts in Gold lakehouse
SELECT COUNT(*) FROM [udplz_gold].[salesfabric].[order];
SELECT COUNT(*) FROM [udplz_gold].[salesfabric].[orderline];
SELECT COUNT(*) FROM [udplz_gold].[salesfabric].[orderpayment];
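
Beyond raw counts, a quick cross-layer sanity check can be scripted. The sketch below assumes each hop only filters or aggregates rows, so counts never grow from Bronze to Gold; if a Gold transformation joins or expands rows, this invariant would need adjusting. The counts are placeholders you would fetch yourself (for example, via the SQL endpoint).

```python
# Placeholder counts for one table across the three lakehouses.
counts = {"bronze": 1_000, "silver": 987, "gold": 987}

def medallion_counts_ok(c: dict[str, int]) -> bool:
    """Rows may only be filtered out moving bronze -> silver -> gold."""
    return c["bronze"] >= c["silver"] >= c["gold"] > 0

print(medallion_counts_ok(counts))  # True
```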

Troubleshooting

| Issue | Likely Cause | Resolution |
|-------|--------------|------------|
| Deployment script fails | Missing prerequisites | Verify Azure CLI, Python, and permissions |
| Capacity not found | Wrong capacity name | Run az fabric capacity list to verify |
| Notebook execution fails | Lakehouse not attached | Attach a default lakehouse in the notebook |
| Power BI connection error | Wrong SQL endpoint | Verify the endpoint in lakehouse settings |
| Purview scan fails | Missing API permissions | Enable admin APIs in Fabric tenant settings |
| Databricks mirror fails | External data access disabled | Enable it in Databricks metastore settings |

For detailed troubleshooting, see individual deployment guides.


Security Considerations

Data Protection

  • Enable sensitivity labels in Microsoft Purview
  • Configure row-level security in Power BI semantic models
  • Use managed identities for service authentication
  • Implement network isolation with private endpoints

Access Control

  • Use Microsoft Entra ID for authentication
  • Implement least-privilege access via Fabric workspace roles
  • Configure Unity Catalog permissions for Databricks
  • Enable audit logging in all services

Compliance

  • Microsoft Purview DLP policies for Power BI semantic models
  • Data classification and sensitivity labeling
  • Lineage tracking for regulatory requirements
  • Retention policies for data lifecycle management

Contributing

We welcome contributions to the Unified Data Platform Solution Accelerator.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/new-capability)
  3. Commit changes (git commit -m 'Add new capability')
  4. Push to branch (git push origin feature/new-capability)
  5. Open a Pull Request

Please ensure:

  • Code follows existing patterns and naming conventions
  • Notebooks include documentation cells
  • Updates to architecture require corresponding documentation updates
  • All secrets and credentials are parameterized, never hardcoded

Resources

Microsoft Documentation

Community


License

This project is licensed under the MIT License - see the LICENSE file for details.


Support

For issues and feature requests, please use GitHub Issues.


Maintained by Microsoft | Report an Issue | Request a Feature
