This repository was archived by the owner on Aug 27, 2019. It is now read-only.

aggregations not using all classifications in a workflow+version #167

@vrooje


While doing further checks on the aggregations from GZBL (project_id = 3), I noticed that 9 subjects that definitely have classifications in the raw classification exports have no aggregations at all. Their roughly 300 classifications were collected over the past year or so, so they should have appeared in the aggregation of at least the initial task (which I requested after all subjects were retired), but they don't.
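A minimal sketch of this kind of check, assuming both exports are CSV files read into rows. The column names `subject_ids` (raw export) and `subject_id` (aggregations file) are assumptions and may differ in the real files; the data below is made up for illustration.

```python
import csv
import io


def subjects_missing_from_aggregations(raw_rows, agg_rows,
                                       raw_key="subject_ids",
                                       agg_key="subject_id"):
    """Return sorted subject IDs that have raw classifications but no aggregation row.

    raw_key and agg_key are assumed column names, not confirmed ones.
    """
    raw_subjects = {row[raw_key].strip() for row in raw_rows}
    agg_subjects = {row[agg_key].strip() for row in agg_rows}
    return sorted(raw_subjects - agg_subjects)


# Tiny made-up tables standing in for the two export files.
raw_csv = "classification_id,subject_ids\n1,100\n2,100\n3,101\n4,102\n"
agg_csv = "subject_id,num_users\n100,2\n102,1\n"

raw_rows = list(csv.DictReader(io.StringIO(raw_csv)))
agg_rows = list(csv.DictReader(io.StringIO(agg_csv)))

# Subject 101 was classified but never aggregated in this toy data.
missing = subjects_missing_from_aggregations(raw_rows, agg_rows)
print(missing)
```

With real exports, the two `io.StringIO` objects would be replaced by opened files.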

This made me worry that other classifications were missing too, so I summed the num_users column in the aggregations file for the initial task in my main workflow (workflow_id == 3, whose only version is 56.13). The total is about 3000 classifications short of the raw classification count for that workflow+version.
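The comparison can be sketched roughly as follows. This assumes the raw export has `workflow_id` and `workflow_version` columns, that the aggregation rows passed in have already been restricted to the initial task, and that each user contributes at most one classification per subject, so that the `num_users` total is comparable to a classification count. The sample data is made up.

```python
import csv
import io


def classification_shortfall(raw_rows, agg_rows, workflow_id, workflow_version):
    """Compare the raw classification count with the summed num_users column.

    Returns (raw_count, aggregated_count, shortfall). IDs and versions are
    compared as strings, exactly as they appear in the CSV.
    """
    raw_count = sum(
        1 for row in raw_rows
        if row["workflow_id"] == workflow_id
        and row["workflow_version"] == workflow_version
    )
    aggregated_count = sum(int(row["num_users"]) for row in agg_rows)
    return raw_count, aggregated_count, raw_count - aggregated_count


# Made-up raw export: four rows for workflow 3 / version 56.13, one for another version.
raw_export = (
    "classification_id,workflow_id,workflow_version,subject_ids\n"
    "1,3,56.13,100\n"
    "2,3,56.13,100\n"
    "3,3,56.13,101\n"
    "4,3,56.13,102\n"
    "5,3,55.1,102\n"
)
# Made-up aggregations for the initial task only.
agg_export = "subject_id,num_users\n100,2\n102,1\n"

raw_table = list(csv.DictReader(io.StringIO(raw_export)))
agg_table = list(csv.DictReader(io.StringIO(agg_export)))

result = classification_shortfall(raw_table, agg_table, "3", "56.13")
print(result)
```

A nonzero third value means some raw classifications are not accounted for in the aggregations.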

Where did these classifications go, and why weren't they included? I know it's only about 1.5% of the total, but we shouldn't waste them. I was able to extract drawing tasks from them, and that's the last task in the workflow, so I currently have no reason to suspect they're malformed or incomplete. But even if they are bad, or couldn't be included for some other reason, we need to communicate that and state explicitly which classifications were and were not included in the aggregations.
