COVID-19 PubSeq: Public SARS-CoV-2 Sequence Resource

Public sequences ready for download!

May 2021 update: we are now at 86,377 sequences with normalized metadata on AWS OpenData!

COVID-19 PubSeq (part 4)

1 What does this mean?

This means that when someone uploads a SARS-CoV-2 sequence using one of our tools (CLI or web-based), they add a sequence and some metadata, which triggers a rerun of our workflows.
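As a rough illustration of the upload-then-rerun behaviour described above, the following toy Python model may help. It is only a sketch: the names `Submission`, `ToyResource`, `upload` and `rerun_workflows`, as well as the example metadata field, are assumptions made for illustration and are not part of the PubSeq tools. The real workflows are CWL, not a Python counter.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    """One uploaded record: a sequence plus its metadata (hypothetical shape)."""
    sequence_id: str
    fasta: str
    metadata: dict


@dataclass
class ToyResource:
    """In-memory stand-in for the sequence collection; not PubSeq's real storage."""
    submissions: list = field(default_factory=list)
    workflow_runs: int = 0

    def upload(self, submission: Submission) -> None:
        # Adding a sequence and its metadata...
        self.submissions.append(submission)
        # ...triggers a rerun of the workflows.
        self.rerun_workflows()

    def rerun_workflows(self) -> None:
        # Placeholder: here we only count how often a rerun was triggered.
        self.workflow_runs += 1


resource = ToyResource()
resource.upload(Submission("seq-1", ">seq-1\nACGT\n", {"host": "Homo sapiens"}))
resource.upload(Submission("seq-2", ">seq-2\nACGA\n", {"host": "Homo sapiens"}))
print(len(resource.submissions), resource.workflow_runs)
```

Each call to `upload` stores one record and triggers exactly one rerun, so the number of reruns tracks the number of uploads in this toy model.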

2 Where can I find the workflows?

Workflows are written in the Common Workflow Language (CWL) and listed on GitHub. Because PubSeq is an open project, these workflows can be studied and modified!

3 Modify Workflow

Other documents

We fetch sequence data and metadata. We query the metadata in multiple ways using SPARQL and ontologies.
We submit a sequence to the database. In this BLOG we fetch a sequence from GenBank and add it to the database.
We modify a workflow to get new output.
We modify metadata for all to use! In this BLOG we add a field for a creative commons license.
Dealing with PubSeq localisation data
We explore the Arvados command line and API
Generate the files needed for uploading to EBI/ENA
Documentation for PubSeq REST API