##gff-version 3
##sequence-region NZ_DS990123 1 15684
# conversion-by bp_genbank2gff3.pl
# organism Phocaeicola plebeius DSM 17135
# Note Phocaeicola plebeius DSM 17135 Scfld_02_6, whole genome shotgun sequence.
# date 06-OCT-2022
NZ_DS990123	GenBank	region	1	15684	.	+	1	ID=NZ_DS990123;Dbxref=BioProject:PRJNA224116,taxon:484018;Name=NZ_DS990123;Note=Phocaeicola plebeius DSM 17135 Scfld_02_6%2C whole genome shotgun sequence.;comment1=REFSEQ INFORMATION: The reference sequence is identical to DS990123.1. On Mar 3%2C 2009 this sequence version replaced NW_002063606.1. Bacteroides plebeius (GenBank Accession Number for 16S rDNA gene: AB200217) is a member of the Bacteroidetes division of the domain bacteria and has been isolated from human feces. The sequenced strain was obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ)(DSM 17135). We have collected 7.9X coverage in plasmid end reads and 454 reads. We have performed one round of automated sequence improvement (pre-finishing)%2C along with manual improvement that includes breaking apart any mis-assembly%2C and making manual joins where possible. Manual edits also are made where the consensus appears to be incorrect. All low quality data on the ends of contigs is removed. Contigs are ordered and oriented where possible. Sequencing/Assembly: The genomic DNA was purified from liquid culture derived from a single bacterial colony. A hybrid sequencing strategy that utilized reads from both 454 GS-20 and ABI 3730xl sequencers was devised and implemented to generate the draft genome sequences. 454 reads were assembled using Newbler (454 Life Sciences) into 454 de novo contigs. These de novo contigs were converted in silico to 800 base paired reads ('superreads') with 400 base overlaps with neighboring superreads. Finally%2C PCAP (Huang%2C et al%2C Genome Research%2C 13:2164%2C (2003)) was used to assemble the super-reads and the conventional 3730xl capillary reads. This sequenced strain is part of a comprehensive%2C sequence-based survey of members of the normal human gut microbiota. A joint effort of the WU-GSC and the Center for Genome Sciences at Washington University School of Medicine%2C the purpose of this survey is to provide the general scientific community with a broad view of the gene content of 100 representatives of the major divisions represented in the intestine's microbial community. This information should provide a frame of reference for analyzing metagenomic studies of the human gut microbiome. Further details of this effort are described in a white paper entitled 'Extending Our View of Self: the Human Gut Microbiome Initiative (HGMI)' (http://www.genome.gov/Pages/Research/Sequencing/SeqProposals/HGMISeq.pdf). These studies are supported by National Human Genome Research Institute. Coding sequences were predicted using GeneMark v3.3 and Glimmer2 v2.13. Intergenic regions not spanned by GeneMark and Glimmer2 were blasted against NCBI's non-redundant (NR) database and predictions generated based on protein alignments. tRNA genes were determined using tRNAscan-SE 1.23 and non-coding RNA genes by RNAmmer-1.2 and Rfam v8.0. Gene names are generated at the contig level and may not necessarily reflect any known order or orientation between contigs. For answers to your questions regarding this assembly or project%2C or any other GSC genome project%2C please visit our Genome Groups web page (http://genome.wustl.edu/genome_group_index.cgi) and email the designated contact person. Annotation was added to the contigs in September 2008. This is a reference genome for the Human Microbiome Project. This project is co-owned with the Human Microbiome Project DACC. The annotation was added by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Information about PGAP can be found here: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ \n##Genome-Annotation-Data-START##\nAnnotation Provider :: NCBI RefSeq\nAnnotation Date :: 10/05/2022 09:16:28\nAnnotation Pipeline :: NCBI Prokaryotic Genome\nAnnotation Pipeline (PGAP)\nAnnotation Method :: Best-placed reference protein\nset%3B GeneMarkS-2+\nAnnotation Software revision :: 6.3\nFeatures Annotated :: Gene%3B CDS%3B rRNA%3B tRNA%3B ncRNA%3B\nrepeat_region\nGenes (total) :: 3%2C688\nCDSs (total) :: 3%2C591\nGenes (coding) :: 3%2C539\nCDSs (with protein) :: 3%2C539\nGenes (RNA) :: 97\nrRNAs :: 6%2C 5%2C 6 (5S%2C 16S%2C 23S)\ncomplete rRNAs :: 6 (5S)\npartial rRNAs :: 5%2C 6 (16S%2C 23S)\ntRNAs :: 78\nncRNAs :: 2\nPseudo Genes (total) :: 52\nCDSs (without protein) :: 52\nPseudo Genes (ambiguous residues) :: 0 of 52\nPseudo Genes (frameshifted) :: 15 of 52\nPseudo Genes (incomplete) :: 43 of 52\nPseudo Genes (internal stop) :: 4 of 52\nPseudo Genes (multiple problems) :: 8 of 52\nCRISPR Arrays :: 1\n##Genome-Annotation-Data-END##;date=06-OCT-2022;host=Homo sapiens;isolation_source=biological product [ENVO:02000043];mol_type=genomic DNA;organism=Phocaeicola plebeius DSM 17135;strain=DSM 17135;type_material=type strain of Bacteroides plebeius
NZ_DS990123	GenBank	gene	1	2028	.	-	1	ID=BACPLE_RS05590;Name=BACPLE_RS05590;old_locus_tag=BACPLE_00332
NZ_DS990123	GenBank	mRNA	1	2028	.	-	1	ID=BACPLE_RS05590.t01;Parent=BACPLE_RS05590
NZ_DS990123	GenBank	CDS	1	2028	.	-	1	ID=BACPLE_RS05590;Parent=BACPLE_RS05590.t01;Name=BACPLE_RS05590;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007660088.1;old_locus_tag=BACPLE_00332;product=RagB/SusD family nutrient uptake outer membrane protein;protein_id=WP_007559719.1;transl_table=11;translation=length.675
NZ_DS990123	GenBank	exon	1	2028	.	-	1	Parent=BACPLE_RS05590.t01
NZ_DS990123	GenBank	gene	2045	5347	.	-	1	ID=BACPLE_RS05595;Name=BACPLE_RS05595;old_locus_tag=BACPLE_00333
NZ_DS990123	GenBank	mRNA	2045	5347	.	-	1	ID=BACPLE_RS05595.t01;Parent=BACPLE_RS05595
NZ_DS990123	GenBank	CDS	2045	5347	.	-	1	ID=BACPLE_RS05595;Parent=BACPLE_RS05595.t01;Name=BACPLE_RS05595;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_005840711.1;old_locus_tag=BACPLE_00333;product=TonB-dependent receptor;protein_id=WP_007559720.1;transl_table=11;translation=length.1100
NZ_DS990123	GenBank	exon	2045	5347	.	-	1	Parent=BACPLE_RS05595.t01
NZ_DS990123	GenBank	gene	5669	9697	.	+	1	ID=BACPLE_RS05600;Name=BACPLE_RS05600;old_locus_tag=BACPLE_00334
NZ_DS990123	GenBank	mRNA	5669	9697	.	+	1	ID=BACPLE_RS05600.t01;Parent=BACPLE_RS05600
NZ_DS990123	GenBank	CDS	5669	9697	.	+	1	ID=BACPLE_RS05600;Parent=BACPLE_RS05600.t01;gO_process=GO:0000160 - phosphorelay signal transduction system [Evidence IEA];Name=BACPLE_RS05600;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_007660937.1;old_locus_tag=BACPLE_00334;product=two-component regulator propeller domain-containing protein;protein_id=WP_040312477.1;transl_table=11;translation=length.1342
NZ_DS990123	GenBank	exon	5669	9697	.	+	1	Parent=BACPLE_RS05600.t01
NZ_DS990123	GenBank	gene	9768	10919	.	-	1	ID=BACPLE_RS05605;Name=BACPLE_RS05605;old_locus_tag=BACPLE_00335
NZ_DS990123	GenBank	mRNA	9768	10919	.	-	1	ID=BACPLE_RS05605.t01;Parent=BACPLE_RS05605
NZ_DS990123	GenBank	CDS	9768	10919	.	-	1	ID=BACPLE_RS05605;Parent=BACPLE_RS05605.t01;Name=BACPLE_RS05605;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_011966837.1;old_locus_tag=BACPLE_00335;product=glycoside hydrolase family 43 protein;protein_id=WP_007559722.1;transl_table=11;translation=length.383
NZ_DS990123	GenBank	exon	9768	10919	.	-	1	Parent=BACPLE_RS05605.t01
NZ_DS990123	GenBank	gene	10960	11952	.	-	1	ID=BACPLE_RS05610;Name=BACPLE_RS05610;old_locus_tag=BACPLE_00336
NZ_DS990123	GenBank	mRNA	10960	11952	.	-	1	ID=BACPLE_RS05610.t01;Parent=BACPLE_RS05610
NZ_DS990123	GenBank	CDS	10960	11952	.	-	1	ID=BACPLE_RS05610;Parent=BACPLE_RS05610.t01;Name=BACPLE_RS05610;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_008619024.1;old_locus_tag=BACPLE_00336;product=glycoside hydrolase family 43 protein;protein_id=WP_007559723.1;transl_table=11;translation=length.330
NZ_DS990123	GenBank	exon	10960	11952	.	-	1	Parent=BACPLE_RS05610.t01
NZ_DS990123	GenBank	gene	12082	13278	.	-	1	ID=BACPLE_RS19680;Name=BACPLE_RS19680
NZ_DS990123	GenBank	mRNA	12082	13278	.	-	1	ID=BACPLE_RS19680.t01;Parent=BACPLE_RS19680
NZ_DS990123	GenBank	CDS	12082	13278	.	-	1	ID=BACPLE_RS19680;Parent=BACPLE_RS19680.t01;Name=BACPLE_RS19680;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_004373664.1;product=DUF2264 domain-containing protein;protein_id=WP_235777910.1;transl_table=11;translation=length.398
NZ_DS990123	GenBank	exon	12082	13278	.	-	1	Parent=BACPLE_RS19680.t01
NZ_DS990123	GenBank	pseudogenic_exon	13288	14295	.	-	1	ID=BACPLE_RS19685;Name=BACPLE_RS19685;Note=incomplete%3B partial in the middle of a contig%3B missing C-terminus%3B Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: protein motif:HMM:NF019113.2;product=glycoside hydrolase family 88 protein;pseudo=_no_value;transl_table=11
NZ_DS990123	GenBank	pseudogene	13288	14295	.	-	1	ID=BACPLE_RS19685.pseudogene;Alias=BACPLE_RS19685;Name=BACPLE_RS19685;pseudo=_no_value
NZ_DS990123	GenBank	gene	14407	15684	.	-	1	ID=BACPLE_RS05620;Name=BACPLE_RS05620;old_locus_tag=BACPLE_00338
NZ_DS990123	GenBank	mRNA	14407	15684	.	-	1	ID=BACPLE_RS05620.t01;Parent=BACPLE_RS05620
NZ_DS990123	GenBank	CDS	14407	15684	.	-	1	ID=BACPLE_RS05620;Parent=BACPLE_RS05620.t01;Name=BACPLE_RS05620;Note=Derived by automated computational analysis using gene prediction method: Protein Homology.;codon_start=1;inference=COORDINATES: similar to AA sequence:RefSeq:WP_009039811.1;old_locus_tag=BACPLE_00338;product=BNR repeat-containing protein;protein_id=WP_007559725.1;transl_table=11;translation=length.425
NZ_DS990123	GenBank	exon	14407	15684	.	-	1	Parent=BACPLE_RS05620.t01