public sequences ready for download!
May 2021 update: we are now at 86,377 sequences with normalized metadata on AWS OpenData!
In Nature, scientists call for fully open sharing of coronavirus genome data. PubSeq aims to provide federated open data with permanent links, in the form of uniform resource identifiers (URIs), under Creative Commons data sharing licenses, so that the data can be used in research publications and in reproducible workflows built on free and open source software. Federated data means that there is no central authority. Free software means that anyone can run the PubSeq website and replicate PubSeq workflows.
PubSeq exists because we believe that (anonymised) pandemic viral data should be out in the open for everyone, with sufficient metadata to trace strains across countries.
PubSeq is also a public online bioinformatics resource with unique metadata. It provides on-the-fly analysis of sequenced SARS-CoV-2 samples and allows a quick turnaround in identifying new viral strains. Anyone can upload sequence material as FASTA or FASTQ files, with accompanying metadata, through the web interface or the REST API. For more information see the FAQ.
Make your sequence data FAIR. Upload your SARS-CoV-2 sequence (FASTA or FASTQ format) with simple metadata (JSON-LD) to the public sequence resource. The upload triggers a recompute that combines all available sequences into a pangenome, which is available for download!
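Before uploading, it can help to check locally that a FASTA file parses and that its metadata serialises as JSON-LD. The following is a minimal sketch, not the PubSeq uploader: the helper names, the `ex:` vocabulary, and the `example.org` context URL are hypothetical placeholders rather than the PubSeq metadata schema, and the check accepts only IUPAC nucleotide codes (an assumption; gap characters are rejected).

```python
import json

# IUPAC nucleotide codes; anything else is treated as an error in this sketch.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_record(header, sequence):
    """Raise ValueError if the record is empty or has non-IUPAC characters."""
    if not sequence:
        raise ValueError(f"record {header!r} has no sequence")
    unexpected = set(sequence) - IUPAC_NUCLEOTIDES
    if unexpected:
        raise ValueError(f"record {header!r} has unexpected characters: {sorted(unexpected)}")


def build_metadata(sample_id, collection_date, location):
    """Build a small JSON-LD document; the 'ex:' terms are hypothetical."""
    return {
        "@context": {"ex": "http://example.org/pubseq-sketch#"},
        "@id": f"ex:sample/{sample_id}",
        "ex:collectionDate": collection_date,
        "ex:collectionLocation": location,
    }


if __name__ == "__main__":
    fasta_text = ">sample1 example record\nACGTACGT\nNNACGT\n"
    for header, sequence in read_fasta(fasta_text):
        check_record(header, sequence)
        print(header, len(sequence))
    print(json.dumps(build_metadata("sample1", "2021-05-01", "New Zealand"), indent=2))
```

The real field names and required values should be taken from the PubSeq documentation; only the `@context` and `@id` keywords here come from JSON-LD itself.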
Your uploaded sequence will automatically be processed and incorporated, with its metadata, into the public pangenome using workflows defined by the High Performance Open Biology Lab. All data is published under a Creative Commons license. You can take the published (GFA/RDF/FASTA) data and store it in a triple store for further processing. Clinical data can be stored securely in REDCap.
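As an illustration of further processing, a downloaded pangenome graph can be summarised with a few lines of Python. This is a minimal sketch that assumes the file follows the GFA version 1 layout (tab-separated lines, `S` for segments and `L` for links); the function name `summarize_gfa` is hypothetical, and the exact contents of the published files are not described here.

```python
def summarize_gfa(text):
    """Count segments, stored bases, and links in GFA 1 text."""
    segments = {}
    links = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        record_type = fields[0]
        if record_type == "S":
            # S <name> <sequence>; '*' means the sequence is not stored inline.
            name, sequence = fields[1], fields[2]
            segments[name] = "" if sequence == "*" else sequence
        elif record_type == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap>
            links.append((fields[1], fields[2], fields[3], fields[4]))
        # Other record types (H, P, ...) are ignored in this sketch.
    return {
        "segments": len(segments),
        "bases": sum(len(seq) for seq in segments.values()),
        "links": len(links),
    }


if __name__ == "__main__":
    example = "H\tVN:Z:1.0\nS\t1\tACGT\nS\t2\tGG\nL\t1\t+\t2\t+\t0M\n"
    print(summarize_gfa(example))
```

For large graphs a streaming reader over the file handle would be preferable to loading the whole text, but the per-line logic stays the same.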
Data can be uploaded from any sequencing platform in FASTA format. We give special attention to workflows for Oxford Nanopore sequencing (see also PubMed) because it offers an affordable platform that is well suited to SARS-CoV-2 sequencing and identification. In New Zealand, Oxford Nanopore sequencing is used for all tracing.