
Data Submission on GitHub

Welcome to the AI4Casting Hub’s Data Submission Page!

Thank you for your interest in contributing data to the AI4Casting Hub. This guide provides an overview of the standards and procedures required for submitting data to our repository. By following these guidelines, you help ensure that your data is accurately formatted, verified, and easily integrated into our forecasting models. We are currently managing two active forecasting rounds. For detailed information on each forecast, please visit the corresponding GitHub repository through the links below:

  1. Forecasting Hospital Bed Occupancy with Public Health Ontario’s Ontario Respiratory Virus Tool Data
  2. Forecasting Respiratory Viruses Lab with Respiratory Virus Detection Surveillance System (RVDSS) Data

Data Submission Process

1. Format Requirements

  • File Type: Please submit your data in CSV format.
  • Column Headers: Ensure that your dataset includes the following mandatory columns: reference_date, target, horizon, location, target_end_date, output_type, output_type_id, value.
  • Date Format: Use the ISO 8601 format (YYYY-MM-DD) for all date entries.
  • Forecasts: The hub supports only time-series forecasts, so structure your data accordingly.
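
As a rough illustration of the format requirements above, the following Python sketch checks a CSV for the mandatory columns and for YYYY-MM-DD dates. The function name `check_format` and the choice of which columns hold dates are assumptions made for this example; this is not the hub's official validator, and it does not replace the Hubverse validation tools.

```python
import csv
import io
import re
from datetime import date

# Mandatory columns as listed in the format requirements.
REQUIRED_COLUMNS = [
    "reference_date", "target", "horizon", "location",
    "target_end_date", "output_type", "output_type_id", "value",
]
# Assumption: these are the two columns that hold dates.
DATE_COLUMNS = ["reference_date", "target_end_date"]
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_format(csv_text: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
        return problems  # row checks are meaningless without the columns

    # Data rows start on line 2 of the file (line 1 is the header).
    for line_number, row in enumerate(reader, start=2):
        for col in DATE_COLUMNS:
            value = row[col] or ""
            try:
                if not ISO_DATE.fullmatch(value):
                    raise ValueError(value)
                date.fromisoformat(value)  # rejects impossible dates such as 2024-02-30
            except ValueError:
                problems.append(f"line {line_number}: {col}={value!r} is not YYYY-MM-DD")
    return problems
```

Calling `check_format(open("my_forecast.csv").read())` on a local file (the file name here is a placeholder) returns an empty list when no problems are detected.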

2. Step-by-Step Upload Tutorial

AI4Casting Hub Submission Tutorial
  • Step 1: Fork the repository from our main GitHub site.
  • Step 2: Clone the forked repository to your local machine to maintain a backup and streamline your workflow. Alternatively, you can make edits directly on GitHub if that suits your preference.
  • Step 3: If you haven’t already, place your formatted model metadata file (e.g. team_name-model_name.yaml) in the model-metadata/ directory of your forked repository. The hub’s status checks will raise errors if this file is missing.
  • Step 4: Place your weekly forecast CSV file in your designated directory within the model-output/ directory of your forked repository. Follow the format: model-output/team_name-model_name/2024-11-04-team_name-model_name.csv.
  • Step 5: Commit your changes with a descriptive message.
  • Step 6: Push the commit to your forked repository.
  • Step 7: Submit a PR to the main repository, providing details about your submission.
  • Step 8: The hub’s automated status checks will review your PR. If there are no errors, the hub admin will merge it. If you need to make any corrections, you can edit your files and create a new PR as long as it’s within the submission deadline.
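
The file placement described in the steps above can be sketched in Python as follows. The helper names `metadata_path` and `submission_path` are hypothetical, and the sketch assumes that forecast files live under model-output/ in a folder named team_name-model_name, with the date written first in the file name.

```python
from datetime import date
from pathlib import PurePosixPath


def metadata_path(team_name: str, model_name: str) -> PurePosixPath:
    """Path of the model metadata file inside the forked repository."""
    return PurePosixPath("model-metadata") / f"{team_name}-{model_name}.yaml"


def submission_path(team_name: str, model_name: str, file_date: date) -> PurePosixPath:
    """Path of one weekly forecast CSV inside the forked repository."""
    model_id = f"{team_name}-{model_name}"
    filename = f"{file_date.isoformat()}-{model_id}.csv"
    return PurePosixPath("model-output") / model_id / filename


print(metadata_path("team_name", "model_name"))
print(submission_path("team_name", "model_name", date(2024, 11, 4)))
```

Building the path from its parts, instead of typing it by hand, helps keep the folder name and the file name suffix identical, which is an easy detail to get wrong.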

3. Verification Process

  • Pre-Submission Verification: Ensure your data passes all validation checks using the provided Hubverse tools.
  • Common Issues: Some users experience verification failures due to date misalignment or incorrect location codes. Double-check these before submission.
  • Post-Submission Verification: After your PR is submitted, our automated systems will review the data. If issues are detected, you’ll receive feedback to correct and resubmit.
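
Date misalignment, one of the common issues mentioned above, can be screened for locally before opening a PR. The sketch below assumes that `horizon` counts whole weeks after `reference_date`, so that `target_end_date` should equal `reference_date` plus 7 days per horizon step. That rule is an assumption for illustration; confirm the actual definition in the hub's repository. The function name `misaligned_rows` is also hypothetical.

```python
from datetime import date, timedelta


def misaligned_rows(rows: list[dict], days_per_horizon: int = 7) -> list[int]:
    """Return the indices of rows whose target_end_date does not match
    reference_date + horizon * days_per_horizon (an assumed rule)."""
    bad = []
    for index, row in enumerate(rows):
        reference = date.fromisoformat(row["reference_date"])
        offset = timedelta(days=days_per_horizon * int(row["horizon"]))
        if date.fromisoformat(row["target_end_date"]) != reference + offset:
            bad.append(index)
    return bad
```

The rows can come from `csv.DictReader`, which yields one dictionary per CSV row keyed by column header. This check complements, and does not replace, the Hubverse validation tools.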

4. Handling Special Cases

  • Unusual Cases: Occasionally, unique datasets may raise issues that aren’t covered by standard procedures. If you encounter a verification issue, such as discrepancies in time-series data, try troubleshooting it first, and reach out to our support team if it persists.
  • Additional Steps: For datasets with multiple forecast targets or complex models, consider submitting additional documentation to explain the data structure.

5. Additional Resources