DATA: Request for Real World Datasets and Pipelines To Test Our Filters #716

Open
@nyoungbq

We are always looking for datasets and pipelines to use in test cases, so that we can identify bugs and optimize bottlenecks for real-world use cases.

PLEASE NOTE THAT PROVIDED DATASETS AND PIPELINES WILL BECOME OPEN SOURCE, AS THEY WILL BE PUBLICLY AVAILABLE IN OUR REPOSITORY.

Steps for Submitting:

  1. Create a branch on your fork of the repository named data/data_submission
  2. Add a new file named SUBMISSION.md at the root level and add the following:
# Data Submission

Name: [your-name-here]
DataSet: [link to where we can find the Data] <- leave blank if not applicable
Pipeline: [link to where we can find the Pipeline] <- leave blank if not applicable

Information:
Write a short paragraph about what it is, what it's for, how it should be used, etc.
  3. Create a pull request from your branch to our repository (see: Create a Pull Request From Fork)
  4. In the description of the PR, add information about where the dataset/pipeline came from, what it is used for, and an acknowledgement that the data will be made public, such as:
I hereby acknowledge that the information is mine or I have received permission from the owner and I provide it with the understanding it will be made public.
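
For anyone who prefers to generate the file rather than type it by hand, here is a minimal sketch that renders the SUBMISSION.md template from the steps above. The function names (`render_submission`, `write_submission`) and their parameters are hypothetical helpers for illustration, not part of this repository; only the template text and the acknowledgement sentence come from this issue.

```python
from pathlib import Path

# Acknowledgement sentence quoted from this issue, for pasting into the PR description.
ACKNOWLEDGEMENT = (
    "I hereby acknowledge that the information is mine or I have received "
    "permission from the owner and I provide it with the understanding it "
    "will be made public."
)


def render_submission(name, information, dataset="", pipeline=""):
    """Return the SUBMISSION.md text; dataset/pipeline stay blank if not applicable."""
    lines = [
        "# Data Submission",
        "",
        f"Name: {name}",
        f"DataSet: {dataset}".rstrip(),
        f"Pipeline: {pipeline}".rstrip(),
        "",
        "Information:",
        information.strip(),
    ]
    return "\n".join(lines) + "\n"


def write_submission(repo_root, **fields):
    """Write SUBMISSION.md at the root of the local clone (hypothetical helper)."""
    path = Path(repo_root) / "SUBMISSION.md"
    path.write_text(render_submission(**fields), encoding="utf-8")
    return path
```

A call such as `write_submission(".", name="Jane Doe", information="Short description here.", dataset="https://example.com/data")`, run from the root of your fork on the `data/data_submission` branch, would produce the file to commit. The link is a placeholder; creating the branch and opening the pull request are still done with your usual git workflow.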

Metadata

Labels: Data (Involving Datasets for testing), good first issue (Good for newcomers), help wanted (Extra attention is needed)
