OAS: A diverse database of cleaned, annotated and translated unpaired and paired antibody sequences

Protein Sci. 2021 Oct 15. doi: 10.1002/pro.4205. Online ahead of print.

ABSTRACT

The antibody repertoires of individuals and groups have been used to explore disease states, understand vaccine responses and drive therapeutic development. The arrival of B-cell receptor repertoire sequencing has enabled researchers to get a snapshot of these antibody repertoires, and as more data is generated, increasingly in-depth studies are possible. However, most publicly available data exists only as raw FASTQ files, making the data hard to access, process and compare. The Observed Antibody Space (OAS) database was created in 2018 to offer clean, annotated and translated repertoire data. In this paper we describe an update to OAS that has been driven by the increasing volume of data and the appearance of paired (VH/VL) sequence data. OAS is now accessible via a new web server, with standardised search parameters and a new sequence-based search option. The new database provides both the nucleotide and the amino acid sequence for every entry, with additional sequence annotations to make the data MiAIRR-compliant, and comments on potential problems with each sequence. OAS now contains 25 new studies, including SARS-CoV-2 data and paired sequencing data. The new database is accessible at http://opig.stats.ox.ac.uk/webapps/oas/ and all data is freely available for download. This article is protected by copyright. All rights reserved.

PMID:34655133 | DOI:10.1002/pro.4205