LOCUS NZ_KB290676 5912 bp DNA linear CON 19-MAR-2021 DEFINITION Anaerostipes hadrus DSM 3319 Scfld194, whole genome shotgun sequence. ACCESSION NZ_KB290676 REGION: 26972..32883 VERSION NZ_KB290676.1 DBLINK BioProject: PRJNA224116 BioSample: SAMN02436796 Assembly: GCF_000332875.2 KEYWORDS WGS; HIGH_QUALITY_DRAFT; RefSeq. SOURCE Anaerostipes hadrus DSM 3319 ORGANISM Anaerostipes hadrus DSM 3319 Bacteria; Firmicutes; Clostridia; Eubacteriales; Lachnospiraceae; Anaerostipes. REFERENCE 1 (bases 1 to 5912) AUTHORS Weinstock,G., Sodergren,E., Lobos,E.A., Fulton,L., Fulton,R., Courtney,L., Fronick,C., O'Laughlin,M., Godfrey,J., Wilson,R.M., Miner,T., Farmer,C., Delehaunty,K., Cordes,M., Minx,P., Tomlinson,C., Chen,J., Wollam,A., Pepin,K.H., Bhonagiri,V., Zhang,X., Suruliraj,S., Warren,W., Mitreva,M., Mardis,E.R. and Wilson,R.K. TITLE Direct Submission JOURNAL Submitted (10-MAY-2012) Genome Sequencing Center, Washington University School of Medicine, 4444 Forest Park, St. Louis, MO 63108, USA COMMENT REFSEQ INFORMATION: The reference sequence is identical to KB290676.1. Anaerostipes hadrus DSM 3319 is a member of the Firmicutes division of the domain Bacteria and has been isolated from the GI tract. This is a reference genome for the Human Microbiome Project. This project is co-owned with the Human Microbiome Project DACC. The sequenced strain was obtained directly from ATCC (ATCC 29173). Source DNA was prepared by Michelle Daigneault and Emma Allen-Vercoe (Department of Molecular and Cellular Biology, University of Guelph, Ontario, Canada), and was funded by the NHGRI (through an HMP Technology Development Grant - 1R21HG005811 - 01). Coding sequences were predicted using GeneMark v3.3 and Glimmer3 v3.02. Intergenic regions not spanned by GeneMark and Glimmer3 were blasted against NCBI's non-redundant (NR) database and predictions generated based on protein alignments. tRNA genes were determined using tRNAscan-SE 1.23 and non-coding RNA genes by RNAmmer-1.2 and Rfam v8.1. 
The final gene set is processed through several programs such as Kegg (Release 56), psortB (Version 3.0.3) and Interproscan (Version 4.7) to determine possible function. Gene product names are determined by BER (Version 2.5). Gene names are generated at the contig level and may not necessarily reflect any known order or orientation between contigs. The National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH) is funding the sequence characterization of the Anaerostipes hadrus DSM 3319 genome. 
The annotation was added by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Information about PGAP can be found here: https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ ##Genome-Assembly-Data-START## Finishing Goal :: High-Quality Draft Current Finishing Status :: High-Quality Draft Assembly Method :: Velvet v. 1.1.04 Genome Coverage :: 103x Sequencing Technology :: Illumina ##Genome-Assembly-Data-END## ##Genome-Annotation-Data-START## Annotation Provider :: NCBI RefSeq Annotation Date :: 03/19/2021 01:29:49 Annotation Pipeline :: NCBI Prokaryotic Genome Annotation Pipeline (PGAP) Annotation Method :: Best-placed reference protein set; GeneMarkS-2+ Annotation Software revision :: 5.1 Features Annotated :: Gene; CDS; rRNA; tRNA; ncRNA; repeat_region Genes (total) :: 2,711 CDSs (total) :: 2,653 Genes (coding) :: 2,580 CDSs (with protein) :: 2,580 Genes (RNA) :: 58 rRNAs :: 1, 2, 1 (5S, 16S, 23S) complete rRNAs :: 1 (23S) partial rRNAs :: 1, 2 (5S, 16S) tRNAs :: 50 ncRNAs :: 4 Pseudo Genes (total) :: 73 CDSs (without protein) :: 73 Pseudo Genes (ambiguous residues) :: 5 of 73 Pseudo Genes (frameshifted) :: 26 of 73 Pseudo Genes (incomplete) :: 33 of 73 Pseudo Genes (internal stop) :: 19 of 73 Pseudo Genes (multiple problems) :: 10 of 73 CRISPR Arrays :: 1 ##Genome-Annotation-Data-END## FEATURES Location/Qualifiers source 1..5912 /organism="Anaerostipes hadrus DSM 3319" /mol_type="genomic DNA" /submitter_seqid="Scfld194" /strain="DSM 3319" /isolation_source="stool, GI tract" /host="Homo sapiens" /culture_collection="ATCC:29173" /type_material="type strain of Anaerostipes hadrus" /db_xref="HMP:0369" /db_xref="taxon:649757" gene complement(1..1386) /locus_tag="HMPREF0369_RS08180" /old_locus_tag="HMPREF0369_01715" CDS complement(1..1386) /locus_tag="HMPREF0369_RS08180" /old_locus_tag="HMPREF0369_01715" /inference="COORDINATES: similar to AA sequence:RefSeq:WP_008394278.1" /note="Derived by automated computational analysis using gene prediction method: Protein 
Homology." /codon_start=1 /transl_table=11 /product="glycoside hydrolase family 32 protein" /protein_id="WP_009203922.1" /translation="MKFLDVEKYTPIQKSHEAYLKELMEYCKDTPRHPSYHIHPPCGL VNDPNGLAYFGGKYHVFYQWFPFGPEHGMKHWAHVISEDLIKWERSDQMLIPDQEYEK NGCYSGNSIEADGKLYLYYTANYKTEQGKIPKQAMAVMNSDGTILKSPNNPIIDEQPE GLIGEIRDPFVFEKEGAYWMLLGGGSTDGQARLILYKSTDLENWVYQGNIELTGIDLE LGYMYECPSYIEIDGKDVLFLSLMGRTPMGERFHNEFSSVYFIGELNLEDKTFHVESF DEIDKGFDFYAPQAFYGKDRQPMMFAWLGCGVQELPYAKEDMWIHSLTMPRFLTIKEG KLCQEVPENIKNEYAHLDIDSKRIKPEEDTWYINLTDKQISEIQIGDQEDHLSIKIDW AAGHIVADRSTLKHQFSTEYGIQREVSMSEELKNIEIYYDNTFIEIYLNDGKDTMTLR AFPENVEINLI" gene complement(1388..3295) /locus_tag="HMPREF0369_RS08185" /old_locus_tag="HMPREF0369_01716" CDS complement(1388..3295) /locus_tag="HMPREF0369_RS08185" /old_locus_tag="HMPREF0369_01716" /inference="COORDINATES: similar to AA sequence:RefSeq:WP_008394277.1" /note="Derived by automated computational analysis using gene prediction method: Protein Homology." /codon_start=1 /transl_table=11 /product="PTS sugar transporter subunit IIBCA" /protein_id="WP_008394277.1" /translation="MDYKKISQDVVDHVGGISNIKGATHCVTRLRLILNDETKYDRKA LENIEGVKGVLFNSGQLMIIFGTGTVNNVYDAFIELTGVKEMSQSDAKAEGMSKMGKL QQGFKVFSDIFIPIIPAFVAAAILTGVKALLTANGLFGLEGSLADQSKLVADIAEFLA IMATTFNYLPILVMYSAVKRFGGNPILGILVGIVMVHPDLVNRNTFVLDPSAASYWHF GPLAIAKVAFQGGVFPAILTAWFMSKVEKIAQKYVPAVVSFVFVPTITIFFANVALFT VFGPVGNMIGNGLAAVIDVLYNSFGAFGAFVFAALLQPLVVTGTHQAIQGIEANLIAT TGFNYIQGIWSVSIIAQGGGAIGMYLLAKKHSKDRDIAMSSFIPTLVGISEPAIFAAN LKYSVIPFVCSCLGAGCGGAFMKLAGVRAIGQGLTGILGLLIVVPNKLVFYVIGNLIA FIVPIVLIVIYNKTKGVPHGEEEEEAESALTIDETAKVPVKAVVDGKVIAIEDVKDGI FSEKVLGDGVGIIPTSETVLAPAEGEICTVMEASKHAIGVRTTNNSVFLIHVGIETVS MNGEGFEYLVKKGQHVKEGQPILKFDKAKIEAKGLNPVVVFVKTDEGNQTPVAFKTGI NVKAGKDVIGE" gene complement(3325..4716) /locus_tag="HMPREF0369_RS08190" /old_locus_tag="HMPREF0369_01717" CDS complement(3325..4716) /locus_tag="HMPREF0369_RS08190" /old_locus_tag="HMPREF0369_01717" /inference="COORDINATES: similar to AA sequence:RefSeq:WP_009203923.1" 
/note="Derived by automated computational analysis using gene prediction method: Protein Homology." /codon_start=1 /transl_table=11 /product="family 43 glycosylhydrolase" /protein_id="WP_009203923.1" /translation="MYQLYYQPEGIWVGDIMPYGKEGQFYVYHQRDTRNPGPFGEPFG WALATTKDFVDYKDYGESLKRGTDEEADQFIYAGSVFETEAGFHAFYTGYNREFLKAG KTSQTLLHATSKDGITWEKSKEALEIPPQEGYDKRNWRDPFVLWDEEREEYLLILGAR KGEDKRKQTGRLVHFTSKDLKKWEFKGDFWAPGIYTMFEMPEIFKMGDWWYLVFSEYS EGNKIHYRRSKNLYGPWEAPFDDAFDGRAYYAGRTAFDGERRVLFGWVPTRIDNDDKN AYLWGGTFVPHEVFQKEDGTLGVKPVDQMMEAFDGWKDLFNPCMKTIDTKEETLLCED TGSIAAFKTTVKFEEGTKEFSIRFYKDEETEVSYEYRFFVEENKVVFNKCPNYPWYQC FNIGLERPIKLEADKEYEICLIIDQDISTLYINGTALNARLCDHPGNGLALTVTDGTL EAKNTKIATKINK" gene complement(4935..5912) /locus_tag="HMPREF0369_RS08195" /old_locus_tag="HMPREF0369_01718" CDS complement(4935..5912) /locus_tag="HMPREF0369_RS08195" /old_locus_tag="HMPREF0369_01718" /inference="COORDINATES: similar to AA sequence:RefSeq:WP_008394275.1" /note="Derived by automated computational analysis using gene prediction method: Protein Homology." 
/codon_start=1 /transl_table=11 /product="LacI family DNA-binding transcriptional regulator" /protein_id="WP_008394275.1" /translation="MKPGIKDVAKVAGVSPTTVSRVLNNRGYISEETRKKVYDAMEEI NYYPNEIARALLNNRTYFVGVIVPTVTSPFHGEVVEQIEYYLSQKNYKMLLCNSKNQM DTEKAYIDMLRRNQVDGMIVGTHNAVVETYSKLKMPVVGIDRYLGEHIPVVSCDNYAA GQMATRHLIDKGCQHILCIRGNSKLKMPGNNRSQAYIQEMEKVGLPQMILEVPFIMEN VEKQKLIYDMLNAHPEIDGIFAGDDSLAVIALHVARQKGINIPKDLKIIGVDGTKQIL GFVPELTTIQQPVKQIAKVAVDKLIDLIEGKTAESCMDLPVKLLEGQTT" ORIGIN 1 ctaaattaaa ttaatttcca cattttctgg aaatgcacga agggtcattg tatcttttcc 61 atcatttaga taaatttcaa taaatgtatt atcataatat atctcaatgt ttttgagttc 121 ttctgacatg gaaacttctc tttgaattcc gtattctgta ctaaactgat gcttaagtgt 181 acttcggtct gcaactatat gtcctgctgc ccagtcaatc ttaatactta gatgatcttc 241 ttgatctccg atctgaattt ctgagatctg tttatctgtc agattgatat accatgtatc 301 ttcttctggt ttaattcttt tactgtctat atcaagatgt gcatattcat tcttaatatt 361 ttctggcact tcctggcata actttccttc tttgatcgtt aagaaccttg gcattgtcag 421 actatggatc cacatatctt cttttgcata tgggagttcc tgtaccccac acccaagcca 481 cgcaaacatc attggttgtc tgtcttttcc atagaatgcc tgtggtgcat aaaaatcaaa 541 tcctttgtcg atctcatcaa atgattccac atgaaatgtt ttatcttcaa gatttaactc 601 tccgataaaa tatacactgg aaaactcatt gtgaaaacgt tctcccattg gtgtacgtcc 661 cattaatgat aagaataaaa catcttttcc atctatttca atgtaagatg gacattcata 721 catataacca agctccaggt caattcctgt cagttcaata tttccctgat atacccagtt 781 ttcaagatct gtactcttat acaaaatcaa tcttgcctgc ccatcggtgc ttccaccacc 841 taaaagcatc cagtatgctc cttctttttc aaagacaaat gggtctctga tctccccgat 901 cagtccttct ggctgttcat cgatgatcgg attgtttgga gatttaagaa ttgtaccatc 961 actgttcata actgccattg cctgttttgg gatctttccc tgttctgtct tatagtttgc 1021 tgtataataa agataaagct taccatctgc ttctatggaa tttcctgaat aacatccatt 1081 tttctcatat tcctgatctg gaatgagcat ctgatcagac cgttcccatt ttatgagatc 1141 ttccgagatc acatgtgccc agtgtttcat tccgtgttct ggtccaaatg gaaaccactg 1201 ataaaacaca tggtattttc ctccaaaata cgccagtcca tttggatcgt tcactagtcc 1261 acatggggga tgaatatgat aggatggatg ccgtggagta 
tctttacaat actccataag 1321 ctcctttaaa tatgcttcat ggcttttttg gatcggtgta tatttttcta catctaaaaa 1381 tttcatatta ttccccgatc acgtcttttc ctgctttgac attgattcct gtcttaaatg 1441 ccacaggtgt ctggttacct tcatccgttt tgacaaatac gacgactgga tttaatcctt 1501 ttgcttcaat ctttgctttg tcaaatttga gaattggctg accttccttc acgtgttgtc 1561 ctttctttac aagatattca aatccttctc cattcatgga tactgtttcg attcctacat 1621 ggatcagaaa tacactattg tttgttgttc ggacaccgat ggcatgtttt gaagcttcca 1681 tcactgtaca gatttcacct tctgccggcg ctaagactgt ttcacttgtt gggatgatcc 1741 ccaccccgtc tcctaatact ttttcagaga agattccatc ttttacatct tcaatcgcaa 1801 taacttttcc gtcaacaact gctttgactg gtactttcgc tgtctcatca attgttaatg 1861 cagattcagc ttcttcctct tcttcaccat gtggaacacc ttttgtttta ttataaataa 1921 cgatcagtac gattggtacg ataaatgcga tcaggtttcc aatgacataa aatactaatt 1981 tatttggtac tacgatcaga agtcctaata ttcctgttaa tccctgtcca atggctctta 2041 cgcctgcaag tttcatgaat gctcctccac atccagctcc taagcaagaa catacaaatg 2101 ggatcacgga atattttaag tttgcagcaa agatggctgg ttctgaaatt cctactaatg 2161 ttggaatgaa acttgacatt gcgatatctc tgtctttaga atgtttcttc gctaacaagt 2221 acattccgat ggctccacca ccctgtgcaa tgatcgatac agaccagatt ccctgaatgt 2281 agttaaatcc tgtcgtagcg atcagatttg cttcaattcc ctgaatcgcc tgatgtgttc 2341 ctgtaactac taatggctgt aataacgctg caaagacaaa tgctccaaat gctccgaaac 2401 tgttgtataa aacatcgatc actgctgcta atccgttacc gatcatattt ccaactggtc 2461 cgaatactgt aaataatgct acgtttgcaa agaaaatagt aattgttggt acgaatacaa 2521 atgataccac agctggtaca tatttctgtg caatcttttc tactttagac ataaaccatg 2581 cggttaagat tgctgggaat actccaccct ggaatgcaac ttttgcaatt gctaatggtc 2641 cgaaatgcca gtaacttgcg gcagatgggt ctaatacgaa tgtattacgg tttacaagat 2701 ctggatgtac cataacgatt ccgactaaga taccaaggat cggattccca ccgaatcgtt 2761 tgacagcact atacattaca aggattggca gataattaaa tgttgttgcc atgatcgcta 2821 agaattctgc aatatctgca acaagtttgc tctgatctgc taaggatccc tctaatccaa 2881 ataatccatt ggctgtcaat aatgctttga caccagtcag gatcgctgct gcaacgaatg 2941 ctggaatgat cggaataaag atatcagaga atactttgaa tccctgctgt 
aattttccca 3001 tcttgctcat tccttctgcc ttggcatcag actgagacat ttctttgaca cctgttaatt 3061 caataaatgc atcatagaca ttgttaactg ttcctgttcc gaaaatgatc atcagctgcc 3121 cactattaaa taagacacct ttaacacctt cgatattctc taatgcttta cggtcatatt 3181 ttgtttcatc atttaaaatc aggcgaagtc ttgtcacaca atgtgtcgca cctttgatgt 3241 tactgatccc accaacgtga tctaccacat cttgtgatat ttttttgtaa tccatctata 3301 cttgtctcct ttatttctct ctttttattt gtttattttt gttgctattt ttgtgttttt 3361 tgcttctagt gtaccatctg tcactgtcag tgccaatcca tttcctggat gatcacataa 3421 tcttgcattt aatgccgttc cattaatata taatgttgaa atatcctggt ctatgatcag 3481 acagatttca tattctttgt ctgcttccaa tttgatcgga cgctcaagtc caatgttaaa 3541 gcactgatac catggataat ttggacattt gttaaagaca actttatttt cttccacaaa 3601 gaatctgtat tcatatgaaa cttctgtttc ttcatctttg taaaatctga tagaaaattc 3661 tttcgttcct tcttcaaatt ttacagtggt tttgaatgct gcgatgcttc ctgtatcttc 3721 acaaagaagt gtttcttctt ttgtgtcaat cgttttcata caagggttaa ataaatcttt 3781 ccaaccatcg aatgcttcca tcatctggtc taccggtttt accccaagtg ttccatcttc 3841 tttctggaat acttcgtgtg gcacgaaagt tcctccccac aagtatgcgt tcttgtcatc 3901 gttatcgatt cttgttggca cccatccaaa aagtactctt cgttctccat caaatgctgt 3961 acgtcctgca tagtatgctc taccgtcaaa tgcatcatca aatggtgctt cccaaggccc 4021 gtataggttc ttgcttcttc tgtaatggat tttattaccc tcactgtatt ctgagaatac 4081 aagataccac cagtctccca tcttaaagat ctctggcatc tcaaacattg tgtagattcc 4141 tggtgcccag aaatctcctt taaattccca tttctttaaa tcttttgatg tgaaatgtac 4201 gagacgtcct gtctgttttc tcttatcttc tccctttctt gctccaagga tcaacagata 4261 ttcttctctt tcctcatccc ataggacaaa tggatctctc cagtttctct tatcatatcc 4321 ttcctgtgga ggaatctcta atgcttcttt cgatttctcc catgtgatcc catctttgct 4381 tgtagcgtga agcaatgtct gtgatgtctt tcctgccttt aagaattctc tattatatcc 4441 tgtataaaat gcatgaaatc ctgcctctgt ttcaaaaaca cttcctgcat agatgaactg 4501 atctgcttct tcatccgttc cacgctttaa gctttctcca tagtctttgt aatctacaaa 4561 atcttttgtt gttgctaatg cccaaccgaa tggttctcca aaaggtcctg gatttcttgt 4621 atctctctga tgatatacat aaaactgacc ttcttttcca taaggcataa tatctcctac 
4681 ccaaatacct tctggctgat aatataactg atacataatt taatcctttc tttgatttac 4741 cttgttgctt tctatggctt tattatatgt caatcgtttt catatgtcaa tcgttaccat 4801 attaattttt gcacaaaata tttgggggat ttttggtagt tttgcataaa acttatctta 4861 ttttgtacgt tctagcatat ttttaggcat caaaaaagga agagtagtat tttctcttcc 4921 ttttttattc tctgttatgt tgtctgtcct tccagtaatt ttactggaag atccatacag 4981 ctttctgctg tttttccttc aatcagatcg atcagcttat ccaccgctac ctttgcaatc 5041 tgtttaactg gctgctggat cgttgtcagt tctggaacaa atcctaaaat ctgctttgtt 5101 ccatctactc cgatgatctt tagatctttc ggaatattga ttcctttttg ccttgcgaca 5161 tgaagtgcga tcactgcaag cgaatcatct cctgcaaaaa taccgtcaat ctcagggtgt 5221 gcatttaaca tatcataaat aagtttttgc ttctccacat tctccatgat gaacggtact 5281 tctaaaatca tctgaggaag ccctactttc tccatctctt gaatataagc ctgacttcga 5341 ttgttacctg gcatctttaa tttactgttt cctcggatac ataaaatgtg ctgacatcct 5401 ttatcgatca aatgtctggt tgccatctgc cctgctgcat agttatcaca tgaaacgact 5461 ggaatatgtt ctccaagata acggtcaatt cctacgactg gcatctttaa cttggagtat 5521 gtctctacga ctgcgttatg tgttccaacg atcattccat ctacctgatt gcgtcttagc 5581 atatcaatat aggctttctc tgtatccatc tgatttttgc tgttacacag aagcatctta 5641 taatttttct gtgataaata atattctatc tgttctacaa cttctccatg aaatggtgat 5701 gttacagtag gaacgataac cccgacaaaa tacgttctgt tatttaacaa cgctctggca 5761 atctcatttg gataataatt gatctcttcc atagcatcat atactttttt tcttgtttct 5821 tcgctgatgt atcctcggtt atttaatacc ctggatactg ttgtgggaga aactcccgct 5881 acttttgcca catcttttat tccaggtttc at //