A production-ready, enterprise-grade data platform built on Microsoft Fabric with optional integrations for Microsoft Purview governance and Azure Databricks advanced analytics.
The Unified Data Platform Solution Accelerator provides a complete, configurable data foundation using the medallion architecture pattern (Bronze, Silver, Gold). It enables organizations to rapidly deploy a governed, analytics-ready data platform with minimal configuration.
## Key Capabilities
- Medallion architecture with Bronze, Silver, and Gold lakehouses
- 48 PySpark notebooks for automated data transformations
- Pre-built Power BI semantic models and dashboards
- Optional Microsoft Purview integration for enterprise governance
- Optional Azure Databricks integration for advanced analytics
- Multi-domain data models (Sales, Finance, Customer, Product)
- Automated deployment via Azure Developer CLI (azd)
The solution offers four deployment configurations based on organizational requirements:
| Option | Components | Description |
|---|---|---|
| Option 1 | Fabric + Power BI | Core medallion architecture with analytics dashboards |
| Option 2 | Option 1 + Purview | Adds data governance, lineage tracking, and classification |
| Option 3 | Option 1 + Databricks | Adds advanced analytics and hybrid processing |
| Option 4 | All Components | Complete enterprise data platform |
Option 1 deploys the core Fabric workspace:

```
┌─────────────────────────────────────────────────────────────────┐
│                   Microsoft Fabric Workspace                    │
├─────────────────┬─────────────────┬─────────────────────────────┤
│     Bronze      │     Silver      │            Gold             │
│    Lakehouse    │    Lakehouse    │          Lakehouse          │
│   (Raw Data)    │   (Validated)   │         (Enriched)          │
├─────────────────┴─────────────────┴─────────────────────────────┤
│  48 PySpark Notebooks  │  Semantic Models  │  Power BI Reports  │
└─────────────────────────────────────────────────────────────────┘
```
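The Bronze → Silver → Gold flow above can be sketched in miniature. This is an illustrative sketch only, assuming hypothetical column names; the actual notebooks use PySpark DataFrames, while plain Python dicts are used here to keep the example self-contained.

```python
# Illustrative sketch of the medallion promotion pattern. The real
# notebooks use PySpark; plain Python keeps this runnable anywhere.
# All column names are hypothetical.

bronze = [  # raw ingested rows, as landed (may contain bad records)
    {"order_id": "O-1", "customer_id": "C-9", "amount": "120.50"},
    {"order_id": "O-2", "customer_id": None,  "amount": "75.00"},   # invalid
    {"order_id": "O-3", "customer_id": "C-9", "amount": "30.25"},
]

def to_silver(rows):
    """Bronze -> Silver: validate and type-cast, dropping bad records."""
    silver = []
    for r in rows:
        if r["order_id"] and r["customer_id"]:
            silver.append({**r, "amount": float(r["amount"])})
    return silver

def to_gold(rows):
    """Silver -> Gold: aggregate into an analytics-ready shape."""
    totals = {}
    for r in rows:
        totals[r["customer_id"]] = totals.get(r["customer_id"], 0.0) + r["amount"]
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'C-9': 150.75}
```

The same shape recurs across all 48 notebooks: validation on the way into Silver, aggregation and enrichment on the way into Gold.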
Options 2, 3, and 4 layer Purview governance and/or Azure Databricks on top:

```
┌──────────────────────────────────────────────────────────────────────────┐
│                            Microsoft Purview                             │
│         (Unified Catalog, Data Map, Governance Domains, Lineage)         │
└──────────────────────────────────────────────────┬───────────────────────┘
                                                   │
┌──────────────────────────────────────────────────┴───────────────────────┐
│                        Microsoft Fabric Workspace                        │
├─────────────────┬─────────────────┬──────────────────────────────────────┤
│     Bronze      │     Silver      │                 Gold                 │
│    Lakehouse    │    Lakehouse    │  Lakehouse  ←── Shortcuts/Mirroring  │
└─────────────────┴─────────────────┴──────────────┬───────────────────────┘
                                                   │
                                    ┌──────────────┴───────────────┐
                                    │       Azure Databricks       │
                                    │        Unity Catalog         │
                                    │     (Advanced Analytics)     │
                                    └──────────────────────────────┘
```
```
unified-data-platform-solution-accelerator/
├── docs/
│   ├── DeploymentGuide.md              # Main deployment overview
│   ├── DeploymentGuideFabric.md        # Automated Fabric deployment
│   ├── DeploymentGuideFabricManual.md  # Manual Fabric deployment
│   ├── DeploymentGuidePowerBI.md       # Power BI configuration
│   ├── DeploymentGuidePurview.md       # Purview integration
│   ├── DeploymentGuideDatabricks.md    # Databricks deployment
│   ├── NotebooksGuideFabric.md         # Fabric notebooks reference
│   ├── NotebooksGuideDatabricks.md     # Databricks notebooks reference
│   ├── SetupPurview.md                 # Purview provisioning
│   ├── SetupDatabricks.md              # Databricks provisioning
│   ├── TechnicalArchitecture.md        # Architecture details
│   ├── LocalDevelopmentSetup.md        # Local dev environment
│   ├── QuotaCheck.md                   # Azure quota verification
│   └── SampleWorkflow.md               # Validation workflow
├── infra/
│   ├── scripts/
│   │   ├── fabric/                     # Fabric deployment scripts
│   │   ├── databricks/                 # Databricks deployment scripts
│   │   └── utils/                      # Utility scripts
│   └── bicep/                          # Infrastructure as Code
├── src/
│   ├── fabric/
│   │   └── notebooks/
│   │       ├── bronze_to_silver/       # 16 transformation notebooks
│   │       ├── silver_to_gold/         # 16 aggregation notebooks
│   │       ├── schema/                 # 8 schema definition notebooks
│   │       ├── data_management/        # 5 utility notebooks
│   │       ├── run_bronze_to_silver.ipynb
│   │       └── run_silver_to_gold.ipynb
│   └── databricks/
│       └── notebooks/
│           ├── bronze_to_adb_silver/   # 3 data loading notebooks
│           ├── schema/                 # 2 schema notebooks
│           ├── data_management/        # 2 cleanup notebooks
│           └── run_bronze_to_adb.ipynb
├── reports/
│   └── UDPLZ_SalesDashboard.pbix       # Power BI dashboard
├── data/
│   └── samples/                        # Sample CSV data files
├── azure.yaml                          # Azure Developer CLI config
└── README.md
```
- Azure Subscription with Owner or Contributor access
- Microsoft Fabric Capacity (F2 or higher recommended)
- Azure CLI (v2.50+)
- Python 3.9+
- Git
- Microsoft Purview Account (Option 2, 4)
- Azure Databricks Workspace - Premium tier (Option 3, 4)
- Power BI Pro or Premium Per User license
| Component | Required Permission |
|---|---|
| Fabric | Workspace Admin on target capacity |
| Purview | Data Curator, Collection Admin |
| Databricks | Workspace Admin, Unity Catalog privileges |
| Azure | Resource Group Contributor |
Before deployment, verify quota availability:

```shell
# Clone the repository
git clone https://github.com/PatrickGallucci/unified-data-platform.git unified-data-platform-solution-accelerator
cd unified-data-platform-solution-accelerator

# Run quota check
cd infra/scripts
chmod +x quota_check_params.sh
./quota_check_params.sh --models gpt-4o-mini:150 --regions eastus,westus
```

Then authenticate with Azure:

```shell
# Azure CLI
az login
az account set --subscription "<your-subscription-id>"

# Azure Developer CLI (for automated deployment)
azd auth login
```

### Automated Deployment (Recommended)
```shell
# Set environment variables
export AZURE_FABRIC_CAPACITY_NAME="your-capacity-name"

# Deploy using azd
azd up
```

### Manual Deployment
```shell
# Navigate to scripts directory
cd infra/scripts/utils

# Set environment variables
export AZURE_FABRIC_CAPACITY_NAME="your-capacity-name"
export AZURE_FABRIC_WORKSPACE_NAME="UDPLZ Data Platform Workspace"

# Run deployment script
pwsh ./run-python-script-fabric.ps1
```

Then connect the Power BI report to the Gold lakehouse:

- Open the `udplz_gold` lakehouse in Fabric
- Navigate to Lakehouse settings and copy the SQL analytics endpoint
- Open the deployed Power BI report and verify the connection
For detailed instructions, see Power BI Deployment Guide.
| Guide | Description |
|---|---|
| Deployment Overview | Main deployment guide with all options |
| Fabric Automated | One-command deployment using azd |
| Fabric Manual | Step-by-step manual deployment |
| Power BI Setup | Dashboard configuration |
| Purview Integration | Data governance setup |
| Databricks Integration | Advanced analytics setup |
| Local Development | Development environment |
The solution includes pre-built data models across multiple business domains:
| Schema | Domain | Tables |
|---|---|---|
| `shared` | Master Data | customer, product |
| `sales` | Sales Operations | order, orderline, orderpayment |
| `salesfabric` | Sales (Gold) | order, orderline, orderpayment |
| `salesadb` | Sales (Databricks) | order, orderline, orderpayment |
| `finance` | Financial Data | (extensible) |
### Fabric Notebooks (48 total)
| Category | Count | Purpose |
|---|---|---|
| Bronze to Silver | 16 | Load CSV to validated tables |
| Silver to Gold | 16 | Transform and enrich data |
| Schema | 8 | Define data models |
| Data Management | 5 | Truncate, drop, utilities |
| Runners | 2 | Orchestration |
| Test/Sample | 1 | Analysis examples |
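The two runner notebooks orchestrate the transformation notebooks above. A minimal sketch of that pattern follows; the notebook names are hypothetical, and a stub callable is injected where a Fabric runner would typically invoke `notebookutils.notebook.run`, so the sketch is self-contained.

```python
# Sketch of the orchestration pattern a runner notebook follows:
# execute each transformation notebook in dependency order, stopping
# on the first failure.

def run_in_order(notebooks, run):
    """Run notebooks sequentially; return the names that completed."""
    completed = []
    for name in notebooks:
        run(name)              # raises on failure, halting the pipeline
        completed.append(name)
    return completed

# Example with a stub runner (notebook names are hypothetical):
executed = []
run_in_order(
    ["load_silver_customer", "load_silver_order", "load_silver_orderline"],
    run=lambda name: executed.append(name),
)
print(executed)
```

Running notebooks sequentially keeps dependency handling simple; a failed step leaves downstream tables untouched rather than partially updated.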
### Databricks Notebooks (8 total)
| Category | Count | Purpose |
|---|---|---|
| Bronze to ADB Silver | 3 | Load data to Unity Catalog |
| Schema | 2 | Model and permissions |
| Data Management | 2 | Cleanup utilities |
| Runner | 1 | Orchestration |
The included Power BI dashboard provides immediate business insights:
- YOY Net Sales Comparison - Trend analysis across years
- Revenue by Customer Segment - Individual, Business, Government breakdown
- Top Products by Revenue - Product performance ranking
- Top Products by Quantity - Volume analysis
- Sales by Gender - Demographic distribution
The dashboard connects to the Gold lakehouse via SQL analytics endpoint:
```
Server:         <workspace-name>.datawarehouse.fabric.microsoft.com
Database:       udplz_gold
Authentication: Microsoft Entra ID
```
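External tools can reach the same endpoint over ODBC. The sketch below builds a connection string from those settings; it assumes the ODBC Driver 18 for SQL Server with interactive Entra authentication, and the server value is a placeholder you would replace with the endpoint copied from the lakehouse settings.

```python
# Sketch: assemble an ODBC connection string for the Gold lakehouse
# SQL analytics endpoint (assumes ODBC Driver 18 for SQL Server).

def gold_connection_string(server: str, database: str = "udplz_gold") -> str:
    return (
        "Driver={ODBC Driver 18 for SQL Server};"
        f"Server={server};"
        f"Database={database};"
        "Authentication=ActiveDirectoryInteractive;"  # Microsoft Entra ID sign-in
        "Encrypt=yes;"
    )

conn_str = gold_connection_string("<workspace-name>.datawarehouse.fabric.microsoft.com")
# With pyodbc installed, you could then connect with:
#   import pyodbc
#   conn = pyodbc.connect(conn_str)
print(conn_str)
```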
- Create a schema notebook in `src/fabric/notebooks/schema/`:

  ```python
  # model_<domain>_gold.ipynb
  spark.sql("CREATE SCHEMA IF NOT EXISTS <domain>")
  spark.sql("""
      CREATE TABLE IF NOT EXISTS <domain>.<table> (
          id STRING,
          name STRING,
          created_at TIMESTAMP
      )
  """)
  ```

- Create a bronze-to-silver notebook in `src/fabric/notebooks/bronze_to_silver/`
- Create a silver-to-gold notebook in `src/fabric/notebooks/silver_to_gold/`
- Register the new notebooks in the runner notebooks
- Upload source files to the Bronze lakehouse under `Files/samples_fabric/<domain>/`
- Create corresponding transformation notebooks
- Update runner notebooks to include the new transformations
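A new bronze-to-silver notebook mostly amounts to validating the raw file before promotion. The stand-in below uses the standard library so it runs anywhere (the real notebooks read with PySpark); the column names and sample data are hypothetical.

```python
# Sketch of the load step a new bronze-to-silver notebook performs:
# read a raw CSV from the Bronze lakehouse and enforce a simple schema
# before writing to Silver.

import csv
import io

REQUIRED_COLUMNS = {"id", "name", "created_at"}

def load_validated(csv_text: str):
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = set(reader.fieldnames or [])
    if not REQUIRED_COLUMNS.issubset(fields):
        raise ValueError(f"missing columns: {REQUIRED_COLUMNS - fields}")
    # Drop rows with an empty primary key, mirroring the validation
    # applied when promoting Bronze data to Silver.
    return [row for row in reader if row["id"]]

sample = "id,name,created_at\n1,Widget,2024-01-01\n,Orphan,2024-01-02\n"
rows = load_validated(sample)
print(len(rows))  # 1
```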
After deployment, validate the solution using the Sample Workflow:
- Fabric workspace contains 4 folders (lakehouses, notebooks, reports, databricks)
- Three lakehouses created: `udplz_bronze`, `udplz_silver`, `udplz_gold`
- 48 notebooks deployed in organized folder structure
- Sample CSV data loaded in Bronze lakehouse
- Run `run_bronze_to_silver` notebook successfully
- Run `run_silver_to_gold` notebook successfully
- Power BI report displays data correctly
- (Option 2) Purview scan discovers Fabric assets
- (Option 3) Databricks mirrored catalog accessible in Fabric
```sql
-- Verify table counts in Gold lakehouse
SELECT COUNT(*) FROM [udplz_gold].[salesfabric].[order];
SELECT COUNT(*) FROM [udplz_gold].[salesfabric].[orderline];
SELECT COUNT(*) FROM [udplz_gold].[salesfabric].[orderpayment];
```

| Issue | Likely Cause | Resolution |
|---|---|---|
| Deployment script fails | Missing prerequisites | Verify Azure CLI, Python, permissions |
| Capacity not found | Wrong capacity name | Run `az fabric capacity list` to verify |
| Notebook execution fails | Lakehouse not attached | Attach default lakehouse in notebook |
| Power BI connection error | Wrong SQL endpoint | Verify endpoint in lakehouse settings |
| Purview scan fails | API permissions | Enable admin APIs in Fabric tenant settings |
| Databricks mirror fails | External data access | Enable in Databricks metastore settings |
For detailed troubleshooting, see individual deployment guides.
- Enable sensitivity labels in Microsoft Purview
- Configure row-level security in Power BI semantic models
- Use managed identities for service authentication
- Implement network isolation with private endpoints
- Use Microsoft Entra ID for authentication
- Implement least-privilege access via Fabric workspace roles
- Configure Unity Catalog permissions for Databricks
- Enable audit logging in all services
- Microsoft Purview DLP policies for Power BI semantic models
- Data classification and sensitivity labeling
- Lineage tracking for regulatory requirements
- Retention policies for data lifecycle management
We welcome contributions to the Unified Data Platform Solution Accelerator.
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-capability`)
- Commit changes (`git commit -m 'Add new capability'`)
- Push to branch (`git push origin feature/new-capability`)
- Open a Pull Request
Please ensure:
- Code follows existing patterns and naming conventions
- Notebooks include documentation cells
- Updates to architecture require corresponding documentation updates
- All secrets and credentials are parameterized, never hardcoded
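The last guideline can be illustrated with a small sketch: read credentials from the environment (or a deployment parameter) rather than embedding them in a notebook. The variable name here is hypothetical.

```python
# Sketch of the "parameterize secrets" guideline: the value comes from
# the environment, never from source code. Variable name is hypothetical.

import os

def get_sql_password() -> str:
    password = os.environ.get("UDPLZ_SQL_PASSWORD")  # never hardcode this value
    if password is None:
        raise RuntimeError("UDPLZ_SQL_PASSWORD is not set")
    return password
```

In a deployed environment the variable would be supplied by the deployment pipeline or a secret store, so the notebook source stays free of credentials.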
- Microsoft Fabric Documentation
- Microsoft Purview Documentation
- Azure Databricks Documentation
- Power BI Documentation
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and feature requests, please use GitHub Issues.
Maintained by Microsoft | Report an Issue | Request a Feature