Possible enhancement: Filters as lists of indexes  #429

Open
@dale-wahl

Description

I had a thought about how filters work. I think it would be relatively easy to implement filters as simple lists or database tables of item indexes into the parent dataset. Creating a filter would then produce a dataset containing only the indexes of the parent items that passed the filter. When iterate_items is called on a filtered dataset, it would iterate through the parent dataset and yield only the items at those indexes.
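A rough sketch of the idea, not how the code currently works. The names `create_index_filter` and `iterate_filtered` are made up for illustration, and the parent dataset is stood in for by a plain list of dicts:

```python
from typing import Callable, Iterable, Iterator, List, Optional


def create_index_filter(parent_items: Iterable[dict],
                        predicate: Callable[[dict], bool]) -> List[int]:
    """Return the positions of the parent items that pass the filter.

    Hypothetical helper: only the indexes are kept, not the items themselves.
    """
    return [index for index, item in enumerate(parent_items) if predicate(item)]


def iterate_filtered(parent_items: Iterable[dict],
                     kept_indexes: Optional[Iterable[int]] = None) -> Iterator[dict]:
    """Yield parent items; for a filtered dataset, yield only the stored indexes.

    Hypothetical stand-in for what iterate_items might do.
    """
    if kept_indexes is None:
        # Not a filtered dataset: behave as usual.
        yield from parent_items
        return

    wanted = set(kept_indexes)  # set lookup keeps the per-item check cheap
    for index, item in enumerate(parent_items):
        if index in wanted:
            yield item
```

With something like this, the filtered dataset only needs to store a list such as `[0, 2]`, and the full items are always read from the parent. It assumes the parent's item order never changes after the filter is created.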

The indexes could be stored as an actual csv/ndjson dataset or in a database table. The idea is to save space by not duplicating the parent's data.

Possible issue: deleting a parent dataset, but wanting to keep a filtered dataset.

We would also have to update how downloading datasets works in the frontend, since filtered datasets would no longer have a flat file.

A migrate script would be complicated, since indexes into the parent dataset are not currently stored, so existing filter datasets might need to be rerun. Alternatively, we could store something like the item id and check for that instead of the index. That makes iterate_items a more expensive calculation (is id in long_list for every item), though if the ids are fetched from a db table it might not be so bad. It would also make the migrate script easier.
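A minimal sketch of the id-based variant, under some loud assumptions: sqlite3 is used here only so the example is self-contained (it is a stand-in, not the project's database), the `filter_items` table and both function names are hypothetical, and every item is assumed to have an `id` field:

```python
import sqlite3
from typing import Iterable, Iterator


def store_filter_ids(connection: sqlite3.Connection, filter_key: str,
                     item_ids: Iterable[str]) -> None:
    """Save the ids of the items kept by a filter (hypothetical table layout)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS filter_items ("
        "filter_key TEXT, item_id TEXT, PRIMARY KEY (filter_key, item_id))"
    )
    connection.executemany(
        "INSERT OR IGNORE INTO filter_items (filter_key, item_id) VALUES (?, ?)",
        [(filter_key, item_id) for item_id in item_ids],
    )
    connection.commit()


def iterate_by_id(connection: sqlite3.Connection, filter_key: str,
                  parent_items: Iterable[dict]) -> Iterator[dict]:
    """Yield only the parent items whose id was stored for this filter."""
    rows = connection.execute(
        "SELECT item_id FROM filter_items WHERE filter_key = ?", (filter_key,)
    )
    # Load the ids into a set once, so the check is not "is id in long_list".
    wanted = {row[0] for row in rows}
    for item in parent_items:
        if item["id"] in wanted:
            yield item
```

Loading the ids into a set once per iteration keeps each membership check roughly constant-time, at the cost of holding all kept ids in memory. A migrate script could in principle fill such a table by reading the ids out of existing filtered files, without rerunning the filters.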
