bac_taxonomy_<version>.tsv	GTDB taxonomy for bacterial genomes.
bac120_<version>.tree	Newick tree spanning the dereplicated set of bacterial genomes, inferred from the concatenation of 120 proteins and used to curate the GTDB taxonomy.
bac120_msa_<version>.faa	FASTA file of the trimmed multiple sequence alignment used to infer the bac120 tree.
bac120_msa_marker_info_<version>.tsv	Information about each of the 120 proteins used to infer the bac120 tree. The order of proteins in this file indicates the order in which they were concatenated.
bac120_msa_mask_<version>.txt	Mask indicating which columns were trimmed from the bac120 alignment.
bac120_msa_individual_genes_<version>.tar.gz	Multiple sequence alignments of the 120 bacterial proteins.
bac_metadata_<version>.tsv	Metadata for all bacterial genomes including GTDB, NCBI, SILVA, and Greengenes taxonomies, completeness and contamination estimates, assembly statistics, and genomic properties.
bac_ssu_<version>.fna	FASTA file of 16S rRNA gene sequences identified within the dereplicated set of bacterial genomes. The assigned taxonomy reflects the genome from which the sequence was obtained. In a small number of cases, a 16S rRNA sequence is incongruent with this taxonomic assignment and may therefore not be representative of the genome.
bac_arb_<version>.arb	ARB database containing the bacterial reference tree and metadata used to curate the GTDB taxonomy.
gtdb_uba_mags.tar.gz	Genomic files for 3,087 UBA genomes used to infer the GTDB taxonomy.
NCBIvs<version>_Bacteria.xlsx	Correspondence between standardly named NCBI and GTDB taxa ordered by degree of polyphyly.
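
The relationship between the mask file and the trimmed bac120 alignment can be illustrated with a minimal Python sketch. It assumes the mask is a single string of '0' and '1' characters, one per column of the untrimmed alignment, with '1' marking a retained column. That encoding is an assumption and should be checked against the actual file. The function name apply_column_mask and the toy inputs are hypothetical and are not part of any GTDB tool or release.

```python
def apply_column_mask(aligned_seq: str, mask: str) -> str:
    """Return the columns of aligned_seq whose mask character is '1'.

    Assumption: the mask holds one '0'/'1' character per alignment
    column and '1' means the column is kept.
    """
    mask = mask.strip()  # drop a trailing newline if the mask was read from a file
    if len(aligned_seq) != len(mask):
        raise ValueError("sequence and mask must have the same length")
    return "".join(res for res, keep in zip(aligned_seq, mask) if keep == "1")


# Toy data for illustration only, not taken from a GTDB release.
trimmed = apply_column_mask("MK-LV", "10110")
print(trimmed)
```

Applying the same mask to every sequence of an untrimmed concatenated alignment would, under the assumption above, give rows of equal length.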
