Conversation

Member

@YunchuWang YunchuWang commented Nov 5, 2025

This pull request introduces a new sample web application, ExportHistoryWebApp, for managing export history jobs via a REST API. The changes add new project files, configuration files, and controller logic, as well as update the solution file to include the new projects and their build configurations.

New Export History Web Application

  • Added new sample project ExportHistoryWebApp to the solution, including its project file (ExportHistoryWebApp.csproj) and necessary references to Durable Task components and the new ExportHistory project.
  • Implemented ExportJobController to provide REST endpoints for creating, retrieving, listing, and deleting export history jobs. This controller uses the Durable Task ExportHistory client and models.
  • Defined the CreateExportJobRequest model to represent the payload for creating export jobs, supporting various configuration options such as mode, time ranges, storage container, and filtering.

Configuration and Environment Setup

  • Added launch settings (launchSettings.json) and application configuration files (appsettings.json, appsettings.Development.json) to support local development and environment variable management for connection strings and storage settings.
  • Added Program.cs to configure dependency injection, Durable Task client and worker services, and controller routing for the web application.

API Usage and Documentation

  • Provided an HTTP request collection file (ExportHistoryWebApp.http) with example requests for creating, listing, retrieving, and deleting export jobs, including tips and usage notes for common scenarios.

Solution File Updates

  • Updated Microsoft.DurableTask.sln to include the new ExportHistory and ExportHistoryWebApp projects, and adjusted build configuration and project nesting accordingly.

@YunchuWang YunchuWang marked this pull request as ready for review November 7, 2025 03:22
@YunchuWang YunchuWang changed the title from "Export job" to "Export History job" Nov 7, 2025
@halspang halspang requested a review from Copilot November 10, 2025 20:00
Copilot AI left a comment

Pull Request Overview

This pull request introduces comprehensive support for exporting orchestration history to Azure Blob Storage through a new ExportHistory library and sample web application. The changes enable users to create batch or continuous export jobs that export completed orchestration instances.

Key changes:

  • Adds new ExportHistory library with entity-based job management, orchestrators for export operations, and Azure Blob Storage integration
  • Introduces ExportHistoryWebApp sample with REST API endpoints for managing export jobs
  • Extends Durable Task client with history streaming and instance ID listing capabilities
  • Includes comprehensive test coverage for all models, entities, and client operations

Reviewed Changes

Copilot reviewed 71 out of 71 changed files in this pull request and generated 20 comments.

File                                       Description
src/ExportHistory/*                        Core library implementation including entities, orchestrators, activities, models, and client abstractions
test/ExportHistory.Tests/*                 Comprehensive unit tests for models, entities, orchestrations, and client operations
src/Client/Core/DurableTaskClient.cs       Adds ListInstanceIdsAsync and StreamInstanceHistoryAsync methods for export support
src/Client/Grpc/GrpcDurableTaskClient.cs   Implements gRPC-based history streaming and instance ID listing
src/Grpc/orchestrator_service.proto        Adds protobuf definitions for ListInstanceIds RPC endpoint
samples/ExportHistoryWebApp/*              Sample ASP.NET Core web app with REST API for export job management
Microsoft.DurableTask.sln                  Updates solution to include new ExportHistory projects

Member

@halspang halspang left a comment

Haven't gotten through to the tests yet, but wanted to publish what I have so far.

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 71 out of 71 changed files in this pull request and generated 2 comments.


Copilot AI review requested due to automatic review settings November 19, 2025 22:11
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 71 out of 71 changed files in this pull request and generated 6 comments.

@YunchuWang YunchuWang requested a review from halspang November 20, 2025 15:49
Member

@halspang halspang left a comment

Overall, things are looking good. I mostly have questions about the intent behind a few decisions, plus a few suggested alterations to the orchestrations.

/// <summary>
/// Represents a request to create a new export job.
/// </summary>
public class CreateExportJobRequest

Member

Is there a reason this is in the sample? Shouldn't we provide a standard request for export jobs in the SDK?

Member

I now see that it's just for the controller, but I wonder if it's better to have the controller take the options? Just wondering what we want to encourage with users.

try
{
    ExportHistoryJobClient jobClient = this.exportHistoryClient.GetJobClient(id);
    await jobClient.DeleteAsync();

Member

From a user perspective, is this deleting an old job, or is this how you stop an ongoing job? If it's for stopping, we should update the API name. If it's strictly like a purge, we can keep it, but maybe use the same terminology from the existing APIs.

/// <param name="lastInstanceKey">The last fetched instance key.</param>
/// <param name="cancellation">The cancellation token.</param>
/// <returns>A page of instance IDs with continuation token.</returns>
public virtual Task<Page<string>> ListInstanceIdsAsync(

Member

Is this related to the PR? Also, if we're going to state in the comment that this is for listing with a filter, we should name it less generically than ListInstanceIdsAsync.

// Create conversion state to maintain orchestration instance across events
// This is required for entity-related events (EntityOperationCalled, EntityLockRequested, etc.)
// which need the parent orchestration instance information from ExecutionStartedEvent
Microsoft.DurableTask.ProtoUtils.EntityConversionState conversionState = new(insertMissingEntityUnlocks: false);

Member

Are entities exported in the same way as an orchestration's history?

Comment on lines +550 to +562
try
{
    moveNextResult = await responseStream.MoveNext(cancellation).ConfigureAwait(false);
}
catch (RpcException e) when (e.StatusCode == StatusCode.Cancelled)
{
    throw new OperationCanceledException(
        $"The {nameof(this.StreamInstanceHistoryAsync)} operation was canceled.", e, cancellation);
}
catch (RpcException e) when (e.StatusCode == StatusCode.NotFound)
{
    throw new ArgumentException($"An orchestration with the instanceId {instanceId} was not found.", e);
}

Member

Do we generally throw the generic exception out? Or should we be wrapping it in something?

/// <param name="lastInstanceKey">The last fetched instance key.</param>
/// <param name="cancellation">The cancellation token.</param>
/// <returns>A page of instance IDs with continuation token.</returns>
public virtual Task<Page<string>> ListInstanceIdsAsync(

Member

Instead of this generic name, if this is designed to always filter by completed time (as per the comment), we should update the method name to reflect that.

Or, if the comment is wrong, update that to have this be more generic. It feels more like a query than a list with all the filters.

Member

Also, how does this differ from GetAllInstancesAsync? Is it just the query/filter ability?

/// </summary>
/// <param name="format">The export format.</param>
/// <returns>The file extension (e.g., "json", "jsonl.gz").</returns>
static string GetFileExtension(ExportFormat format)

Member

Do we get a benefit from supporting two formats from the start? I would assume, to save costs, we'd default to the compressed format. Or, does that introduce an issue with things like Kusto ingestion?

Alternatively, are you trying to provide a lower-compute but higher-storage option?
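
To make the compute-versus-storage question concrete, here is a rough sketch that gzips a JSONL payload with the standard System.IO.Compression APIs and compares sizes. It is not the PR's code: the local GetFileExtension takes a bool in place of the ExportFormat enum, the sample events are made up, and Gzip/Gunzip are hypothetical helpers.

```csharp
#nullable enable
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

// Hypothetical stand-in for the PR's GetFileExtension(ExportFormat): a bool replaces the enum.
string GetFileExtension(bool useGzip) => useGzip ? "jsonl.gz" : "json";

// Compress a payload with gzip: extra CPU at export time, smaller blob in storage.
byte[] Gzip(byte[] payload)
{
    using MemoryStream output = new();
    using (GZipStream gzip = new(output, CompressionLevel.Optimal))
    {
        gzip.Write(payload, 0, payload.Length);
    } // disposing the GZipStream writes the gzip footer into output
    return output.ToArray();
}

// Decompress a gzip blob back to the original bytes.
byte[] Gunzip(byte[] blob)
{
    using MemoryStream input = new(blob);
    using GZipStream gzip = new(input, CompressionMode.Decompress);
    using MemoryStream output = new();
    gzip.CopyTo(output);
    return output.ToArray();
}

// Made-up history events, one JSON object per line (JSONL).
string jsonl = string.Join("\n", Enumerable.Range(0, 200)
    .Select(i => $"{{\"eventId\":{i},\"eventType\":\"TaskCompleted\"}}"));
byte[] raw = Encoding.UTF8.GetBytes(jsonl);
byte[] packed = Gzip(raw);
byte[] roundTripped = Gunzip(packed);

Console.WriteLine($"history.{GetFileExtension(false)}: {raw.Length} bytes");
Console.WriteLine($"history.{GetFileExtension(true)}: {packed.Length} bytes");
```

Repetitive history events like these compress well, so the trade-off is mostly about CPU spent per export versus bytes stored, and whether downstream consumers can read gzip directly.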

/// <summary>
/// Initializes a new instance of the <see cref="ExportJobCreationOptions"/> class.
/// </summary>
/// <param name="mode">The export mode (Batch or Continuous).</param>

Member

I thought originally we were going to say a continuous one was just an export with no end time? Did we decide to have specific modes just for increased specificity?
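
For comparison, the "no end time means continuous" alternative could look roughly like the sketch below. InferMode and the string mode names are hypothetical; the PR passes an explicit mode to ExportJobCreationOptions instead.

```csharp
#nullable enable
using System;

// Hypothetical: derive the mode from the time range instead of passing it explicitly.
// A null end time means the export never finishes, so it is treated as continuous.
string InferMode(DateTimeOffset? completedTimeTo) =>
    completedTimeTo is null ? "Continuous" : "Batch";

DateTimeOffset start = new(2025, 11, 1, 0, 0, 0, TimeSpan.Zero);
Console.WriteLine(InferMode(null));              // open-ended range
Console.WriteLine(InferMode(start.AddDays(7)));  // bounded range
```

An explicit mode costs one more parameter, but it lets validation reject contradictory input, such as a batch job submitted without an end time.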

LastInstanceKey: currentState.Checkpoint?.LastInstanceKey,
MaxInstancesPerBatch: currentState.Config.MaxInstancesPerBatch);

InstancePage pageResult = await context.CallActivityAsync<InstancePage>(

Member

Late to the party, but is this why we use ListInstanceIdsAsync instead of GetAllInstancesAsync? Because we want to do manual paging? If so, you can disregard my previous comment asking why we do it :)
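
As a rough illustration of what manual paging buys here, the sketch below pages through IDs by last key. Everything in it is hypothetical: FetchPage stands in for a call like ListInstanceIdsAsync, the in-memory store replaces the backend, and the tuple only approximates Page<string>. The point is that the caller holds the last key between calls, so it can persist it as a checkpoint and resume from there.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the backend's completed instances, sorted by key.
string[] store = { "inst-001", "inst-002", "inst-003", "inst-004", "inst-005" };

// Hypothetical page fetch: returns up to pageSize IDs after lastKey, plus the key
// to resume from (null once the store is exhausted).
(List<string> Ids, string? NextKey) FetchPage(string? lastKey, int pageSize)
{
    List<string> page = store
        .Where(id => lastKey is null || string.CompareOrdinal(id, lastKey) > 0)
        .Take(pageSize)
        .ToList();
    bool more = page.Count > 0 && string.CompareOrdinal(page[^1], store[^1]) < 0;
    return (page, more ? page[^1] : null);
}

// Caller-driven paging loop: the caller owns the checkpoint.
List<string> exported = new();
int pagesFetched = 0;
string? checkpoint = null;
do
{
    (List<string> ids, string? nextKey) = FetchPage(checkpoint, pageSize: 2);
    exported.AddRange(ids);
    pagesFetched++;
    checkpoint = nextKey; // an orchestration could persist this between batches
}
while (checkpoint is not null);

Console.WriteLine($"Exported {exported.Count} instances in {pagesFetched} pages.");
```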

}

// Process batch with retry logic
BatchExportResult batchResult = await this.ProcessBatchWithRetryAsync(

Member

Should stuff like this be an activity for additional checkpointing? Same with the commit.

3 participants