Configurable block sizes in sealed fraction #330

@cheb0

For search requests on real seq-db instances there are two major contributors to CPU usage:

  • iterating through the query tree (left side of the flamegraph)
  • reading LID blocks from disk (right side of the flamegraph)
[Image: flamegraph]

The first part will be addressed through batched query execution and block skipping. The second part can be partially addressed through smaller LID blocks. Databases commonly align their page size to 4 KB, which typically matches the minimum amount that can be read from disk.

Pros:

  • higher cache granularity and efficiency
  • better skipping granularity
  • faster search queries and histograms

Cons:

  • more independent disk reads on aggregations
  • more independent reads on S3 searches
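
To make the trade-off concrete, here is a rough sketch in Go. It is not seq-db code: it assumes fixed-width 4-byte uncompressed LIDs, that every touched block is read in full exactly once, and the names `lidSize`, `sparseCost` and `scanReads` are made up for illustration. It compares a sparse lookup (where smaller blocks read fewer bytes) with a full scan (where smaller blocks need more independent reads).

```go
package main

import "fmt"

// lidSize is an ASSUMED fixed width of one uncompressed LID, in bytes.
const lidSize = 4

// sparseCost returns the number of block reads and the total bytes fetched
// when looking up LIDs at the given positions. Each touched block is read
// in full, once.
func sparseCost(positions []int, blockSize int) (reads, bytes int) {
	perBlock := blockSize / lidSize
	touched := make(map[int]struct{})
	for _, p := range positions {
		touched[p/perBlock] = struct{}{}
	}
	return len(touched), len(touched) * blockSize
}

// scanReads returns how many independent block reads a full scan over
// totalLIDs needs (ceiling division).
func scanReads(totalLIDs, blockSize int) int {
	perBlock := blockSize / lidSize
	return (totalLIDs + perBlock - 1) / perBlock
}

func main() {
	// Positions a selective query might touch (hypothetical data).
	positions := []int{10, 5000, 20000, 20010, 90000}
	for _, bs := range []int{4 << 10, 64 << 10} {
		reads, bytes := sparseCost(positions, bs)
		fmt.Printf("block=%dKB sparse: reads=%d bytes=%dKB, full scan reads=%d\n",
			bs>>10, reads, bytes>>10, scanReads(100000, bs))
	}
	// Output:
	// block=4KB sparse: reads=4 bytes=16KB, full scan reads=98
	// block=64KB sparse: reads=3 bytes=192KB, full scan reads=7
}
```

With these made-up numbers, 4 KB blocks fetch 12x fewer bytes for the sparse lookup but need 14x more reads for the full scan, which is the pattern the pros and cons above describe.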

Most cons can be addressed through intelligent prefetching and repacking fracs before uploading to S3.
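
One possible shape for the prefetching part, as a sketch only: merge the block indexes a query needs into contiguous spans and fetch each span with a single read (or a single ranged request on S3). The `span` type, the `coalesce` function and the `maxGap` knob are hypothetical and not part of seq-db; `maxGap` is the number of unrequested blocks we are willing to read in order to avoid an extra round trip.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a half-open range of block indexes [first, last).
type span struct{ first, last int }

// coalesce merges the requested block indexes into contiguous spans.
// Two requested blocks end up in the same span when at most maxGap
// unrequested blocks lie between them, so each span can be fetched
// with one read instead of one read per block.
func coalesce(blocks []int, maxGap int) []span {
	if len(blocks) == 0 {
		return nil
	}
	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]int(nil), blocks...)
	sort.Ints(sorted)

	spans := []span{{sorted[0], sorted[0] + 1}}
	for _, b := range sorted[1:] {
		cur := &spans[len(spans)-1]
		if b-cur.last <= maxGap {
			// Close enough (or a duplicate): extend the current span.
			if b+1 > cur.last {
				cur.last = b + 1
			}
			continue
		}
		spans = append(spans, span{b, b + 1})
	}
	return spans
}

func main() {
	// Six requested blocks collapse into three reads.
	fmt.Println(coalesce([]int{8, 0, 1, 2, 7, 40}, 2))
	// Output: [{0 3} {7 9} {40 41}]
}
```

A larger `maxGap` trades extra bytes read for fewer requests, so it would likely be set higher for S3 than for local disk.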

Labels

performance: Features or improvements that positively affect seq-db performance
