##gff-version 3
##sequence-region NZ_EQ973490 1 23343
# conversion-by bp_genbank2gff3.pl
# organism Bacteroides cellulosilyticus DSM 14838
# Note Bacteroides cellulosilyticus DSM 14838 Scfld4, whole genome shotgun sequence.
# date 10-MAY-2022
NZ_EQ973490	GenBank	region	1	23343	.	+	1	ID=NZ_EQ973490;Dbxref=BioProject:PRJNA224116,taxon:537012;Name=NZ_EQ973490;Note=Bacteroides cellulosilyticus DSM 14838 Scfld4%2C whole genome shotgun sequence.;comment1=REFSEQ INFORMATION: The reference sequence is identical to EQ973490.1. Bacteroides cellulosilyticus (GenBank Accession Number for 16S rDNA gene: AJ583243) is a member of the Bacteroidetes division of the domain bacteria and has been isolated from human feces. The sequenced strain was obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ) (DSM 14838). This is a Newbler assembly (http://www.454.com/enabling-technology/the-software.asp) comprised of one full plate FLX PE 454 Data with a Q20 coverage of 16.3X. This sequenced strain is part of a comprehensive%2C sequence-based survey of members of the normal human gut microbiota. A joint effort of the WU-GSC and the Center for Genome Sciences at Washington University School of Medicine%2C the purpose of this survey is to provide the general scientific community with a broad view of the gene content of 100 representatives of the major divisions represented in the intestine's microbial community. This information should provide a frame of reference for analyzing metagenomic studies of the human gut microbiome. Further details of this effort are described in a white paper entitled 'Extending Our View of Self: the Human Gut Microbiome Initiative (HGMI)' (http://www.genome.gov/Pages/Research/Sequencing/SeqProposals/HGMISeq.pdf). These studies are supported by National Human Genome Research Institute. Coding sequences were predicted using GeneMark v3.3 and Glimmer2 v2.13. Intergenic regions not spanned by GeneMark and Glimmer2 were blasted against NCBI's non-redundant (NR) database and predictions generated based on protein alignments. tRNA genes were determined using tRNAscan-SE 1.23 and non-coding RNA genes by RNAmmer-1.2 and Rfam v8.0. Gene names are generated at the contig level and may not necessarily reflect any known order or orientation between contigs. For answers to your questions regarding this assembly or project%2C or any other GSC genome project%2C please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. Annotation was added to the contigs in March 2009. This is a reference genome for the Human Microbiome Project. This project is co-owned with the Human Microbiome Project DACC. Product names were updated in June 2013. The annotation was added by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Information about PGAP can be found here: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ \n##Genome-Annotation-Data-START##\nAnnotation Provider :: NCBI RefSeq\nAnnotation Date :: 05/05/2022 02:16:44\nAnnotation Pipeline :: NCBI Prokaryotic Genome\nAnnotation Pipeline (PGAP)\nAnnotation Method :: Best-placed reference protein\nset%3B GeneMarkS-2+\nAnnotation Software revision :: 6.1\nFeatures Annotated :: Gene%3B CDS%3B rRNA%3B tRNA%3B ncRNA%3B\nrepeat_region\nGenes (total) :: 5%2C175\nCDSs (total) :: 5%2C120\nGenes (coding) :: 4%2C851\nCDSs (with protein) :: 4%2C851\nGenes (RNA) :: 55\nrRNAs :: 1%2C 1%2C 1 (5S%2C 16S%2C 23S)\ncomplete rRNAs :: 1%2C 1%2C 1 (5S%2C 16S%2C 23S)\ntRNAs :: 50\nncRNAs :: 2\nPseudo Genes (total) :: 270\nCDSs (without protein) :: 269\nPseudo Genes (ambiguous residues) :: 0 of 270\nPseudo Genes (frameshifted) :: 97 of 270\nPseudo Genes (incomplete) :: 176 of 270\nPseudo Genes (internal stop) :: 16 of 270\nPseudo Genes (multiple problems) :: 15 of 270\nCRISPR Arrays :: 1\n##Genome-Annotation-Data-END##;culture_collection=DSM:14838;date=10-MAY-2022;host=Homo sapiens;isolation_source=biological product [ENVO:02000043];mol_type=genomic DNA;organism=Bacteroides cellulosilyticus DSM 14838;strain=DSM 14838;submitter_seqid=Scfld4;type_material=type strain of Bacteroides cellulosilyticus
NZ_EQ973490	GenBank	gene	1	4095	.	-	1	ID=BACCELL_RS09475;Name=BACCELL_RS09475;old_locus_tag=BACCELL_02140
NZ_EQ973490	GenBank	mRNA	1	4095	.	-	1	ID=BACCELL_RS09475.t01;Parent=BACCELL_RS09475
NZ_EQ973490	GenBank	CDS	1	4095	.	-	1	ID=BACCELL_RS09475;Parent=BACCELL_RS09475.t01;gO_process=GO:0000160 - phosphorelay signal transduction system [Evidence IEA];Name=BACCELL_RS09475;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007661009.1;old_locus_tag=BACCELL_02140;product=two-component regulator propeller domain-containing protein;protein_id=WP_149949311.1;transl_table=11;translation=length.1364
NZ_EQ973490	GenBank	exon	1	4095	.	-	1	Parent=BACCELL_RS09475.t01
NZ_EQ973490	GenBank	gene	4115	5446	.	-	1	ID=BACCELL_RS09480;Name=BACCELL_RS09480;old_locus_tag=BACCELL_02141
NZ_EQ973490	GenBank	mRNA	4115	5446	.	-	1	ID=BACCELL_RS09480.t01;Parent=BACCELL_RS09480
NZ_EQ973490	GenBank	CDS	4115	5446	.	-	1	ID=BACCELL_RS09480;Parent=BACCELL_RS09480.t01;gO_process=GO:0005975 - carbohydrate metabolic process [Evidence IEA];Name=BACCELL_RS09480;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007661008.1;old_locus_tag=BACCELL_02141;product=family 43 glycosylhydrolase;protein_id=WP_007211515.1;transl_table=11;translation=length.443
NZ_EQ973490	GenBank	exon	4115	5446	.	-	1	Parent=BACCELL_RS09480.t01
NZ_EQ973490	GenBank	gene	5623	8280	.	-	1	ID=BACCELL_RS09485;Name=BACCELL_RS09485;old_locus_tag=BACCELL_02142
NZ_EQ973490	GenBank	mRNA	5623	8280	.	-	1	ID=BACCELL_RS09485.t01;Parent=BACCELL_RS09485
NZ_EQ973490	GenBank	CDS	5623	8280	.	-	1	ID=BACCELL_RS09485;Parent=BACCELL_RS09485.t01;gO_process=GO:0005975 - carbohydrate metabolic process [Evidence IEA];Name=BACCELL_RS09485;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007411427.1;old_locus_tag=BACCELL_02142;product=glycoside hydrolase family 3 C-terminal domain-containing protein;protein_id=WP_007211516.1;transl_table=11;translation=length.885
NZ_EQ973490	GenBank	exon	5623	8280	.	-	1	Parent=BACCELL_RS09485.t01
NZ_EQ973490	GenBank	gene	8329	9387	.	-	1	ID=BACCELL_RS09490;Name=BACCELL_RS09490;old_locus_tag=BACCELL_02143
NZ_EQ973490	GenBank	mRNA	8329	9387	.	-	1	ID=BACCELL_RS09490.t01;Parent=BACCELL_RS09490
NZ_EQ973490	GenBank	CDS	8329	9387	.	-	1	ID=BACCELL_RS09490;Parent=BACCELL_RS09490.t01;Name=BACCELL_RS09490;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007661006.1;old_locus_tag=BACCELL_02143;product=glycoside hydrolase family 43 protein;protein_id=WP_007211517.1;transl_table=11;translation=length.352
NZ_EQ973490	GenBank	exon	8329	9387	.	-	1	Parent=BACCELL_RS09490.t01
NZ_EQ973490	GenBank	gene	9434	10588	.	-	1	ID=BACCELL_RS09495;Name=BACCELL_RS09495;old_locus_tag=BACCELL_02144
NZ_EQ973490	GenBank	mRNA	9434	10588	.	-	1	ID=BACCELL_RS09495.t01;Parent=BACCELL_RS09495
NZ_EQ973490	GenBank	CDS	9434	10588	.	-	1	ID=BACCELL_RS09495;Parent=BACCELL_RS09495.t01;Name=BACCELL_RS09495;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007216414.1;old_locus_tag=BACCELL_02144;product=alpha/beta hydrolase-fold protein;protein_id=WP_007211518.1;transl_table=11;translation=length.384
NZ_EQ973490	GenBank	exon	9434	10588	.	-	1	Parent=BACCELL_RS09495.t01
NZ_EQ973490	GenBank	gene	10596	11762	.	-	1	ID=BACCELL_RS09500;Name=BACCELL_RS09500;old_locus_tag=BACCELL_02145
NZ_EQ973490	GenBank	mRNA	10596	11762	.	-	1	ID=BACCELL_RS09500.t01;Parent=BACCELL_RS09500
NZ_EQ973490	GenBank	CDS	10596	11762	.	-	1	ID=BACCELL_RS09500;Parent=BACCELL_RS09500.t01;Name=BACCELL_RS09500;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_008626297.1;old_locus_tag=BACCELL_02145;product=alpha/beta hydrolase-fold protein;protein_id=WP_052309546.1;transl_table=11;translation=length.388
NZ_EQ973490	GenBank	exon	10596	11762	.	-	1	Parent=BACCELL_RS09500.t01
NZ_EQ973490	GenBank	gene	11999	14503	.	-	1	ID=BACCELL_RS09510;Name=BACCELL_RS09510;old_locus_tag=BACCELL_02147
NZ_EQ973490	GenBank	mRNA	11999	14503	.	-	1	ID=BACCELL_RS09510.t01;Parent=BACCELL_RS09510
NZ_EQ973490	GenBank	CDS	11999	14503	.	-	1	ID=BACCELL_RS09510;Parent=BACCELL_RS09510.t01;gO_process=GO:0005975 - carbohydrate metabolic process [Evidence IEA];Name=BACCELL_RS09510;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: protein motif:HMM:NF012961.2;old_locus_tag=BACCELL_02147;product=glycoside hydrolase family 9 protein;protein_id=WP_081450809.1;transl_table=11;translation=length.834
NZ_EQ973490	GenBank	exon	11999	14503	.	-	1	Parent=BACCELL_RS09510.t01
NZ_EQ973490	GenBank	gene	14832	16566	.	-	1	ID=BACCELL_RS09515;Name=BACCELL_RS09515;old_locus_tag=BACCELL_02148
NZ_EQ973490	GenBank	mRNA	14832	16566	.	-	1	ID=BACCELL_RS09515.t01;Parent=BACCELL_RS09515
NZ_EQ973490	GenBank	CDS	14832	16566	.	-	1	ID=BACCELL_RS09515;Parent=BACCELL_RS09515.t01;Name=BACCELL_RS09515;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_004292104.1;old_locus_tag=BACCELL_02148;product=RagB/SusD family nutrient uptake outer membrane protein;protein_id=WP_007211522.1;transl_table=11;translation=length.578
NZ_EQ973490	GenBank	exon	14832	16566	.	-	1	Parent=BACCELL_RS09515.t01
NZ_EQ973490	GenBank	pseudogenic_exon	16590	17151	.	-	1	ID=BACCELL_RS27335;Name=BACCELL_RS27335;Note=incomplete%3B too short partial abutting assembly gap%3B missing N-terminus%3B Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=2;inference=COORDINATES: similar to AA sequence:RefSeq:WP_017141023.1;old_locus_tag=BACCELL_02149;product=SusC/RagA family TonB-linked outer membrane protein;pseudo=_no_value;transl_table=11
NZ_EQ973490	GenBank	pseudogene	16590	17151	.	-	1	ID=BACCELL_RS27335.pseudogene;Alias=BACCELL_RS27335;Name=BACCELL_RS27335;old_locus_tag=BACCELL_02149;pseudo=_no_value
NZ_EQ973490	GenBank	gene	17172	19741	.	-	1	ID=BACCELL_RS09520;Name=BACCELL_RS09520;old_locus_tag=BACCELL_02150
NZ_EQ973490	GenBank	mRNA	17172	19741	.	-	1	ID=BACCELL_RS09520.t01;Parent=BACCELL_RS09520
NZ_EQ973490	GenBank	CDS	17172	19741	.	-	1	ID=BACCELL_RS09520;Parent=BACCELL_RS09520.t01;gO_component=GO:0045203 - integral component of cell outer membrane [Evidence IEA];gO_function=GO:0022857 - transmembrane transporter activity [Evidence IEA];gO_process=GO:0071702 - organic substance transport [Evidence IEA];Name=BACCELL_RS09520;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007216410.1;old_locus_tag=BACCELL_02150;product=SusC/RagA family TonB-linked outer membrane protein;protein_id=WP_007211524.1;transl_table=11;translation=length.856
NZ_EQ973490	GenBank	exon	17172	19741	.	-	1	Parent=BACCELL_RS09520.t01
NZ_EQ973490	GenBank	gene	20005	21456	.	-	1	ID=BACCELL_RS09525;Name=BACCELL_RS09525;old_locus_tag=BACCELL_02152
NZ_EQ973490	GenBank	mRNA	20005	21456	.	-	1	ID=BACCELL_RS09525.t01;Parent=BACCELL_RS09525
NZ_EQ973490	GenBank	CDS	20005	21456	.	-	1	ID=BACCELL_RS09525;Parent=BACCELL_RS09525.t01;gO_process=GO:0005975 - carbohydrate metabolic process [Evidence IEA];Name=BACCELL_RS09525;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_004292093.1;old_locus_tag=BACCELL_02152;product=family 43 glycosylhydrolase;protein_id=WP_044153754.1;transl_table=11;translation=length.483
NZ_EQ973490	GenBank	exon	20005	21456	.	-	1	Parent=BACCELL_RS09525.t01
NZ_EQ973490	GenBank	gene	21459	22157	.	-	1	ID=BACCELL_RS09530;Name=BACCELL_RS09530;old_locus_tag=BACCELL_02153
NZ_EQ973490	GenBank	mRNA	21459	22157	.	-	1	ID=BACCELL_RS09530.t01;Parent=BACCELL_RS09530
NZ_EQ973490	GenBank	CDS	21459	22157	.	-	1	ID=BACCELL_RS09530;Parent=BACCELL_RS09530.t01;Name=BACCELL_RS09530;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_004292094.1;old_locus_tag=BACCELL_02153;product=hypothetical protein;protein_id=WP_007211527.1;transl_table=11;translation=length.232
NZ_EQ973490	GenBank	exon	21459	22157	.	-	1	Parent=BACCELL_RS09530.t01
NZ_EQ973490	GenBank	gene	22186	23343	.	-	1	ID=BACCELL_RS09535;Name=BACCELL_RS09535;old_locus_tag=BACCELL_02154
NZ_EQ973490	GenBank	mRNA	22186	23343	.	-	1	ID=BACCELL_RS09535.t01;Parent=BACCELL_RS09535
NZ_EQ973490	GenBank	CDS	22186	23343	.	-	1	ID=BACCELL_RS09535;Parent=BACCELL_RS09535.t01;Name=BACCELL_RS09535;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007216407.1;old_locus_tag=BACCELL_02154;product=alpha/beta hydrolase-fold protein;protein_id=WP_044153756.1;transl_table=11;translation=length.385
NZ_EQ973490	GenBank	exon	22186	23343	.	-	1	Parent=BACCELL_RS09535.t01