To deploy this solution accelerator, ensure you have access to an Azure subscription with the necessary permissions to create resource groups, resources, app registrations, and role assignments at the resource group level. This should include the Contributor role at the subscription level and the Role Based Access Control Administrator role at the subscription and/or resource group level. Follow the steps in Azure Account Set Up.
Check the Azure Products by Region page and select a region where the following services are available:
- Azure AI Foundry
- Azure OpenAI Service
- Azure AI Content Understanding Service
- Azure Blob Storage
- Azure Container Apps
- Azure Container Registry
- Azure Cosmos DB
- Azure Queue Storage
- GPT Model Capacity
Here are some example regions where the services are available: East US, East US 2, Australia East, UK South, France Central.
If you encounter issues running PowerShell scripts because they are not digitally signed, you can temporarily adjust the execution policy for the current session by running the following command in an elevated PowerShell session:

```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```

This allows the scripts to run for the current session without permanently changing your system's policy.
Pick from the options below to see step-by-step instructions for GitHub Codespaces, VS Code Dev Containers, and Local Environments.
Deploy in GitHub Codespaces
You can run this solution using GitHub Codespaces. The button will open a web-based VS Code instance in your browser:
1. Open the solution accelerator (this may take several minutes).
2. Accept the default values on the create Codespaces page.
3. Open a terminal window if it is not already open.
4. Continue with the deploying steps.
Deploy in VS Code
You can run this solution in VS Code Dev Containers, which will open the project in your local VS Code using the Dev Containers extension:
1. Start Docker Desktop (install it if not already installed).
2. Open the project.
3. In the VS Code window that opens, once the project files show up (this may take several minutes), open a terminal window.
4. Continue with the deploying steps.
Deploy in your local Environment
If you're not using one of the above options for opening the project, then you'll need to:
1. Make sure the following tools are installed:
   - PowerShell (v7.0+) - available for Windows, macOS, and Linux
   - Azure Developer CLI (azd) (v1.15.0+)
   - Python 3.9+
   - Docker Desktop
   - Git
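Before continuing, you can sanity-check that the prerequisites are on your PATH. A minimal optional sketch (the tool names match the list above; `pwsh` is the PowerShell 7+ binary and `azd` is the Azure Developer CLI):

```shell
# Optional sketch: report which prerequisite tools are installed.
# "pwsh" is the PowerShell 7+ binary; "azd" is the Azure Developer CLI.
for tool in pwsh azd python docker git; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done
```

Any tool reported as missing should be installed before running the deployment steps below.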
2. Clone the repository or download the project code via command line:

   ```shell
   azd init -t microsoft/content-processing-solution-accelerator/
   ```

3. Open the project folder in your terminal or editor.
4. Continue with the deploying steps.
Consider the following settings to customize specific aspects of your deployment:
Configurable Deployment Settings
When you start the deployment, most parameters will have default values, but you can update the following settings by following the steps here:
| Setting | Description | Default Value |
|---|---|---|
| Azure Region | The region where resources will be created. | East US |
| Azure AI Content Understanding Location | Location for the Content Understanding service. | Sweden Central |
| Secondary Location | A less busy region for Azure Cosmos DB, useful in case of availability constraints. | eastus2 |
| Deployment Type | Select from a drop-down list. | GlobalStandard |
| GPT Model | Choose from gpt-4o. | gpt-4o |
| GPT Model Version | GPT model version used in the deployment. | 2024-08-06 |
| GPT Model Deployment Capacity | Configure capacity for GPT models. | 30k |
| Use Local Build | Boolean flag to determine if local container builds should be used. | false |
| Image Tag | Image version for deployment (allowed values: latest, dev, hotfix). | latest |
| Existing Log Analytics Workspace | To reuse an existing Log Analytics Workspace ID instead of creating a new one. | (none) |
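Since Image Tag only accepts a fixed set of values, a small pre-flight check can catch a typo before deployment. A minimal sketch (the allowed values come from the table above; the tag value shown is an example):

```shell
# Sketch: validate an image tag against the allowed values from the table.
TAG="latest"   # example value; replace with your chosen tag
case "$TAG" in
  latest|dev|hotfix) echo "ok: $TAG" ;;
  *) echo "invalid tag: $TAG" ;;
esac
```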
[Optional] Quota Recommendations
By default, the GPT model capacity in deployment is set to 30k tokens.
We recommend increasing the capacity to 100k tokens, if available, for optimal performance.
To adjust quota settings, follow these steps.
Reusing an Existing Log Analytics Workspace
Guide to get your Existing Workspace ID
Once you've opened the project in Codespaces, Dev Containers, or locally, you can deploy it to Azure by following these steps:
1. Login to Azure:

   ```shell
   azd auth login
   ```

   To sign in to a specific tenant:

   ```shell
   azd auth login --tenant-id <tenant-id>
   ```
Note: To retrieve the Tenant ID required for local deployment, you can go to Tenant Properties in Azure Portal from the resource list. Alternatively, follow these steps:
- Open the Azure Portal.
- Navigate to Azure Active Directory from the left-hand menu.
- Under the Overview section, locate the Tenant ID field. Copy the value displayed.
2. Provision and deploy all the resources:

   ```shell
   azd up
   ```
3. Provide an azd environment name (e.g., "cpsapp").
4. Select a subscription from your Azure account and choose a location that has quota for all the resources.
   - This deployment will take 4-6 minutes to provision the resources in your account and set up the solution with sample data.
   - If you encounter an error or timeout during deployment, changing the location may help, as there could be availability constraints for the resources.
5. Once the deployment has completed successfully, check the terminal or console output for details. It will display the Name, Endpoint (Application URL), and Azure Portal URL for both the Web and API Azure Container Apps.
   - Use the Azure Portal URL to navigate to the corresponding resource group in the Azure portal.
Important Note: Before accessing the application, ensure that all Post Deployment Steps are fully completed, as they are critical for the proper configuration of Data Ingestion and Authentication functionalities.
6. If you are done trying out the application, you can delete the resources by running:

   ```shell
   azd down
   ```
Optional: Publishing Local Build Container to Azure Container Registry

If you need to rebuild the source code and push the updated container to the deployed Azure Container Registry, follow these steps:

- Linux/macOS:

  ```shell
  cd ./infra/scripts/
  ./docker-build.sh
  ```

- Windows (PowerShell):

  ```powershell
  cd .\infra\scripts\
  .\docker-build.ps1
  ```

This will create a new Azure Container Registry, rebuild the source code, package it into a container, and push it to that registry.
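If you drive this step from a wrapper script, the build script differs only by extension per platform. A minimal sketch for choosing the right one (the paths are the script locations named above; this only prints the command rather than running it):

```shell
# Sketch: pick the docker-build script that matches the current platform.
case "$(uname -s)" in
  Linux|Darwin) echo "run: ./infra/scripts/docker-build.sh" ;;
  *)            echo "run: ./infra/scripts/docker-build.ps1" ;;
esac
```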
Register Schema Files

Want to customize the schemas for your own documents? Learn more about adding your own schemas here.

The steps below add two sample schemas to the solution: Invoice and Property Loss Damage Claim Form.

1. Get the API service's endpoint.
2. Move to the samples/schemas folder in ContentProcessorApi - /src/ContentProcessorApi/samples/schemas.
3. Execute the script to register the schemas:

   Bash:

   ```shell
   ./register_schema.sh https://<<API Service Endpoint>>/schemavault/ schema_info_sh.json
   ```

   Windows (PowerShell):

   ```powershell
   ./register_schema.ps1 https://<<API Service Endpoint>>/schemavault/ .\schema_info_ps1.json
   ```
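The script takes the full schemavault URL as its first argument. If you keep the API endpoint in a variable, the URL can be derived like this (the endpoint shown is a placeholder, not a real deployment):

```shell
# Sketch: build the schemavault URL from the API endpoint (placeholder value).
API_ENDPOINT="https://example-api.azurecontainerapps.io"
echo "${API_ENDPOINT}/schemavault/"
# → https://example-api.azurecontainerapps.io/schemavault/
```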
Import Sample Data

1. Grab the Schema IDs for the Invoice and Property Loss Damage Claim Form schemas from the previous step.
2. Move to the samples folder in ContentProcessorApi - /src/ContentProcessorApi/samples/.
3. Execute the script with the Schema IDs:

   Bash:

   ```shell
   ./upload_files.sh https://<<API Service Endpoint>>/contentprocessor/submit ./invoices <<Invoice Schema Id>>
   ./upload_files.sh https://<<API Service Endpoint>>/contentprocessor/submit ./propertyclaims <<Property Loss Damage Claim Form Schema Id>>
   ```

   Windows (PowerShell):

   ```powershell
   ./upload_files.ps1 https://<<API Service Endpoint>>/contentprocessor/submit .\invoices <<Invoice Schema Id>>
   ./upload_files.ps1 https://<<API Service Endpoint>>/contentprocessor/submit .\propertyclaims <<Property Loss Damage Claim Form Schema Id>>
   ```
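If you have several sample folders, the invocations above can be driven from a folder-to-schema-ID table. A hedged sketch that only prints the commands it would run (the endpoint and schema IDs are placeholders, not real values):

```shell
# Sketch: print the upload command for each folder/schema-ID pair.
# Endpoint and schema IDs below are placeholders, not real values.
API="https://example-api.azurecontainerapps.io"
while read -r folder schema_id; do
  echo "./upload_files.sh ${API}/contentprocessor/submit ${folder} ${schema_id}"
done <<'EOF'
./invoices INVOICE_SCHEMA_ID
./propertyclaims CLAIM_SCHEMA_ID
EOF
```

Replace the `echo` with the actual script invocation once the placeholder values are filled in.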
Add Authentication Provider
- Follow the steps in App Authentication to configure authentication in the app service. Note that authentication changes can take up to 10 minutes.
Deleting Resources After a Failed Deployment
- Follow steps in Delete Resource Group if your deployment fails and/or you need to clean up the resources.
Now that you've completed your deployment, you can start using the solution. Try out these things to start getting familiar with the capabilities:
- Open the web container app URL in your browser and explore the web user interface and upload your own invoices.
- Create your own schema definition, so you can upload and process your own types of documents.
- Use the API to process documents programmatically.


