##gff-version 3
##sequence-region NZ_DS995511 1 11090
# conversion-by bp_genbank2gff3.pl
# organism Bacteroides eggerthii DSM 20697
# Note Bacteroides eggerthii DSM 20697 Scfld3, whole genome shotgun sequence.
# date 20-DEC-2019
NZ_DS995511	GenBank	region	1	11090	.	+	.	ID=NZ_DS995511;Dbxref=BioProject:PRJNA224116,taxon:483216;Name=NZ_DS995511;Note=Bacteroides eggerthii DSM 20697 Scfld3%2C whole genome shotgun sequence.;comment1=REFSEQ INFORMATION: The reference sequence was derived from DS995511. Bacteroides eggerthii (GenBank Accession Number for 16S rDNA gene: L16485) is a member of the Bacteroidetes division of the domain bacteria and has been isolated from human feces. The sequenced strain was obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ) (DSM 20697). This is a Newbler assembly (http://www.454.com/enabling-technology/the-software.asp) consisting of 2 full plate runs of a fragment library and 2 full plate runs of a paired end library with a Q20 coverage of 60.8X. This sequenced strain is part of a comprehensive%2C sequence-based survey of members of the normal human gut microbiota. A joint effort of the WU-GSC and the Center for Genome Sciences at Washington University School of Medicine%2C the purpose of this survey is to provide the general scientific community with a broad view of the gene content of 100 representatives of the major divisions represented in the intestine's microbial community. This information should provide a frame of reference for analyzing metagenomic studies of the human gut microbiome. Further details of this effort are described in a white paper entitled 'Extending Our View of Self: the Human Gut Microbiome Initiative (HGMI)' (http://www.genome.gov/Pages/Research/Sequencing/SeqProposals/HGMISeq.pdf). These studies are supported by National Human Genome Research Institute. Coding sequences were predicted using GeneMark v3.3 and Glimmer2 v2.13. Intergenic regions not spanned by GeneMark and Glimmer2 were blasted against NCBI's non-redundant (NR) database and predictions generated based on protein alignments. tRNA genes were determined using tRNAscan-SE 1.23 and non-coding RNA genes by RNAmmer-1.2 and Rfam v8.0. Gene names are generated at the contig level and may not necessarily reflect any known order or orientation between contigs. For answers to your questions regarding this assembly or project%2C or any other GSC genome project%2C please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. Annotation was added to the contigs in December 2008. This is a reference genome for the Human Microbiome Project. This project is co-owned with the Human Microbiome Project DACC. The annotation was added by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Information about PGAP can be found here: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ ##Genome-Annotation-Data-START## Annotation Provider :: NCBI RefSeq Annotation Date :: 12/16/2019 03:49:21 Annotation Pipeline :: NCBI Prokaryotic Genome Annotation Pipeline (PGAP) Annotation Method :: Best-placed reference protein set%3B GeneMarkS-2+ Annotation Software revision :: 4.10 Features Annotated :: Gene%3B CDS%3B rRNA%3B tRNA%3B ncRNA%3B repeat_region Genes (total) :: 3%2C453 CDSs (total) :: 3%2C392 Genes (coding) :: 3%2C323 CDSs (with protein) :: 3%2C323 Genes (RNA) :: 61 rRNAs :: 1%2C 1%2C 1 (5S%2C 16S%2C 23S) complete rRNAs :: 1%2C 1%2C 1 (5S%2C 16S%2C 23S) tRNAs :: 56 ncRNAs :: 2 Pseudo Genes (total) :: 69 CDSs (without protein) :: 69 Pseudo Genes (ambiguous residues) :: 5 of 69 Pseudo Genes (frameshifted) :: 36 of 69 Pseudo Genes (incomplete) :: 34 of 69 Pseudo Genes (internal stop) :: 9 of 69 Pseudo Genes (multiple problems) :: 15 of 69 ##Genome-Annotation-Data-END##;date=20-DEC-2019;host=Homo sapiens;isolation_source=biological product [ENVO:02000043];mol_type=genomic DNA;organism=Bacteroides eggerthii DSM 20697;strain=DSM 20697;submitter_seqid=Scfld3;type_material=type strain of Bacteroides eggerthii
NZ_DS995511	GenBank	gene	1	2271	.	-	.	ID=BACEGG_RS14715;Name=BACEGG_RS14715;old_locus_tag=BACEGG_03245
NZ_DS995511	GenBank	mRNA	1	2271	.	-	.	ID=BACEGG_RS14715.t01;Parent=BACEGG_RS14715
NZ_DS995511	GenBank	CDS	1	2271	.	-	0	ID=BACEGG_RS14715.p01;Parent=BACEGG_RS14715.t01;Name=BACEGG_RS14715;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_009122077.1;old_locus_tag=BACEGG_03245;product=hypothetical protein;protein_id=WP_004291728.1;transl_table=11;translation=length.756
NZ_DS995511	GenBank	exon	1	2271	.	-	.	Parent=BACEGG_RS14715.t01
NZ_DS995511	GenBank	gene	2302	3882	.	-	.	ID=BACEGG_RS14720;Name=BACEGG_RS14720;old_locus_tag=BACEGG_03246
NZ_DS995511	GenBank	mRNA	2302	3882	.	-	.	ID=BACEGG_RS14720.t01;Parent=BACEGG_RS14720
NZ_DS995511	GenBank	CDS	2302	3882	.	-	0	ID=BACEGG_RS14720.p01;Parent=BACEGG_RS14720.t01;Name=BACEGG_RS14720;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_004291730.1;old_locus_tag=BACEGG_03246;product=sialate O-acetylesterase;protein_id=WP_004291730.1;transl_table=11;translation=length.526
NZ_DS995511	GenBank	exon	2302	3882	.	-	.	Parent=BACEGG_RS14720.t01
NZ_DS995511	GenBank	gene	3888	5345	.	-	.	ID=BACEGG_RS14725;Name=BACEGG_RS14725;old_locus_tag=BACEGG_03247
NZ_DS995511	GenBank	mRNA	3888	5345	.	-	.	ID=BACEGG_RS14725.t01;Parent=BACEGG_RS14725
NZ_DS995511	GenBank	CDS	3888	5345	.	-	0	ID=BACEGG_RS14725.p01;Parent=BACEGG_RS14725.t01;Name=BACEGG_RS14725;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_009122075.1;old_locus_tag=BACEGG_03247;product=RagB/SusD family nutrient uptake outer membrane protein;protein_id=WP_004291732.1;transl_table=11;translation=length.485
NZ_DS995511	GenBank	exon	3888	5345	.	-	.	Parent=BACEGG_RS14725.t01
NZ_DS995511	GenBank	gene	5352	8393	.	-	.	ID=BACEGG_RS14730;Name=BACEGG_RS14730;old_locus_tag=BACEGG_03248
NZ_DS995511	GenBank	mRNA	5352	8393	.	-	.	ID=BACEGG_RS14730.t01;Parent=BACEGG_RS14730
NZ_DS995511	GenBank	CDS	5352	8393	.	-	0	ID=BACEGG_RS14730.p01;Parent=BACEGG_RS14730.t01;Name=BACEGG_RS14730;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_009122074.1;old_locus_tag=BACEGG_03248;product=TonB-dependent receptor;protein_id=WP_115615987.1;transl_table=11;translation=length.1013
NZ_DS995511	GenBank	exon	5352	8393	.	-	.	Parent=BACEGG_RS14730.t01
NZ_DS995511	GenBank	gene	8928	11090	.	+	.	ID=BACEGG_RS14735;Name=BACEGG_RS14735;old_locus_tag=BACEGG_03249
NZ_DS995511	GenBank	mRNA	8928	11090	.	+	.	ID=BACEGG_RS14735.t01;Parent=BACEGG_RS14735
NZ_DS995511	GenBank	CDS	8928	11090	.	+	0	ID=BACEGG_RS14735.p01;Parent=BACEGG_RS14735.t01;Name=BACEGG_RS14735;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_004291737.1;old_locus_tag=BACEGG_03249;product=heparinase;protein_id=WP_004291737.1;transl_table=11;translation=length.720
NZ_DS995511	GenBank	exon	8928	11090	.	+	.	Parent=BACEGG_RS14735.t01