Large 'download all videos' tasks can hold up task queue #1745

@becky-gilbert

Description

The 'download all videos' task runs on the builder pod and is part of the 'builder' queue, which also contains:

  • EFP builds (ember_build_and_gcp_deploy)
  • Data downloads (build_framedata_dict)

With large video files and/or a large number of files, a single 'download all videos' task can take a long time (e.g. 2 hours for my load test study on staging, which contains 10,000 short videos). There's no limit on how many of these tasks can be triggered simultaneously or in close succession. The 'builds' queue can run only 2 tasks at a time, so long-running tasks hold up every other task in the same queue: two concurrent downloads block it entirely. This could be disruptive and confusing for researchers, who may have to wait a very long time for tasks like rebuilding an EFP study or downloading frame data.
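
To make the blocking concrete, here is a rough simulation of a first-in, first-out queue with 2 worker slots. It is only a sketch: the 120-minute download duration comes from the load test above, while the 10- and 5-minute durations for the other tasks, the `download_all_videos_*` names, and the `simulate_fifo` helper are assumptions made for illustration, not measurements or code from our system.

```python
import heapq

def simulate_fifo(tasks, concurrency):
    """Return {name: (start, finish)} in minutes for tasks run in submission order.

    tasks: list of (name, duration_minutes), all submitted at time 0.
    """
    # Each heap entry is the time at which one worker slot becomes free.
    free_at = [0] * concurrency
    heapq.heapify(free_at)
    schedule = {}
    for name, duration in tasks:
        start = heapq.heappop(free_at)  # earliest available slot
        finish = start + duration
        schedule[name] = (start, finish)
        heapq.heappush(free_at, finish)
    return schedule

# Hypothetical workload: two video downloads are queued just before two short tasks.
tasks = [
    ("download_all_videos_1", 120),       # duration from the staging load test
    ("download_all_videos_2", 120),
    ("ember_build_and_gcp_deploy", 10),   # assumed duration
    ("build_framedata_dict", 5),          # assumed duration
]

schedule = simulate_fifo(tasks, concurrency=2)
for name, (start, finish) in schedule.items():
    print(f"{name}: starts at {start} min, finishes at {finish} min")
```

Under these assumptions, both short tasks sit idle for 120 minutes before they start, even though together they need only 15 minutes of work.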

Possible solutions

  • Move the 'download all videos' task out of our pods and into an AWS serverless system like Fargate (Lambda only supports shorter-running tasks, with a 15-minute execution limit). The advantages are (1) off-loading the tasks from our system, so that we don't have to worry about consuming our static resources, (2) autoscaling and parallel processing, and (3) keeping the files in AWS (probably faster and more secure). The downside is that it is a more drastic change and would take more work to set up.
  • Move the 'download all videos' task into its own queue, separate from the two other 'build' tasks. This would allow us to set the task concurrency to 1 for video zip tasks and to a higher value (4?) for the other build tasks, since those are shorter and more predictable. (Note that these queues would still share resources.)
  • Put the separate 'download all videos' task queue into its own pod/container, which would allow us to isolate its resources. We could also add autoscaling. This would be a larger change to our architecture but the task itself would be pretty much the same.
  • Add prioritization to our tasks, so that shorter jobs can be moved up ahead of 'download all videos' tasks.
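
As a rough comparison of the separate-queue and prioritization options, the sketch below simulates both against the current shared queue. Everything in it is illustrative: the `run_queue` helper, the task names, the 10- and 5-minute durations, and the priority values are assumptions, and it simplifies by submitting all tasks at time 0 and ignoring the fact that separate queues in the same pod would still share resources.

```python
import heapq

def run_queue(tasks, concurrency, prioritize=False):
    """Simulate one queue in which all tasks are submitted at time 0.

    tasks: list of (name, duration_minutes, priority); lower priority runs first.
    Returns {name: start_minute}.
    """
    if prioritize:
        # Stable sort: tasks with equal priority keep their submission order.
        tasks = sorted(tasks, key=lambda t: t[2])
    # Each heap entry is the time at which one worker slot becomes free.
    free_at = [0] * concurrency
    heapq.heapify(free_at)
    starts = {}
    for name, duration, _priority in tasks:
        start = heapq.heappop(free_at)
        starts[name] = start
        heapq.heappush(free_at, start + duration)
    return starts

# Hypothetical workload: (name, duration in minutes, priority).
videos = [("video_zip_a", 120, 1), ("video_zip_b", 120, 1)]
builds = [("efp_build", 10, 0), ("frame_data", 5, 0)]

# Current behaviour: one shared queue, 2 slots, submission order.
shared = run_queue(videos + builds, concurrency=2)

# Separate queues: concurrency 1 for video zips, 4 for the other build tasks.
separate = {**run_queue(videos, concurrency=1), **run_queue(builds, concurrency=4)}

# Prioritization: one shared queue, 2 slots, shorter jobs moved ahead.
prioritized = run_queue(videos + builds, concurrency=2, prioritize=True)

for label, result in [("shared", shared), ("separate", separate), ("prioritized", prioritized)]:
    print(label, result)
```

In this toy scenario both options let the short tasks start immediately. The difference is in what happens to the downloads: with separate queues the second video zip waits for the first, while with prioritization the downloads start as soon as the short jobs free a slot. Note that prioritization only reorders tasks that are still waiting, so a download that is already running would continue to occupy its slot.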

Metadata

    Labels

      • Bug: [Work Type] An issue with the program
      • Performance Tuning: [Work Type] Refactoring that is user-facing, affecting latency and/or space complexity
      • Researcher: [Audience] Researcher-facing
