Environment: Human Gut

CAZyme Information: MGYG000002403_00465

Basic Information

Species: Robinsoniella peoriensis
Lineage: Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; Robinsoniella; Robinsoniella peoriensis
CAZyme ID: MGYG000002403_00465
CAZy Family: CBM51
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 2666
CGC: MGYG000002403_19|CGC1
Molecular Weight: 291685.11
Isoelectric Point: 4.6379

Genome Property
Genome Assembly ID: MGYG000002403
Genome Size: 7202103
Genome Type: Isolate
Country: not provided
Continent: not provided
Gene Location: Start: 20585; End: 28585; Strand: +
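
The gene coordinates and the reported protein length can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the Start and End coordinates are 1-based and inclusive and that the gene span includes the stop codon, neither of which is stated in this record. The function name is hypothetical.

```python
# Cross-check gene coordinates against the reported protein length.
# Assumptions (not stated in the record): coordinates are 1-based and
# inclusive, and the gene span includes the terminal stop codon.

def protein_length_from_coords(start: int, end: int) -> int:
    """Return the encoded protein length (residues) for a gene span."""
    span_nt = end - start + 1  # inclusive nucleotide count
    if span_nt % 3 != 0:
        raise ValueError(f"span of {span_nt} nt is not a whole number of codons")
    return span_nt // 3 - 1  # subtract the stop codon


if __name__ == "__main__":
    # 8001 nt -> 2667 codons -> 2666 residues, matching Protein Length
    print(protein_length_from_coords(20585, 28585))
```

Under these assumptions the span of 8001 nucleotides corresponds to 2667 codons, that is 2666 residues plus a stop codon, which agrees with the Protein Length field.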

Full Sequence

MNSKRLISMK  RIMAWILTLC  MILTAVQIPI  DVQAAETATE  ENAALEKTVT  LHKSDGTELP  60
ADYRNPDRPA  SMAVDGIIDA  TGEANYCDFG  KDGDKTALYM  QVDLGGLYDL  NRVNMWRYWK  120
DGRTYDATVI  TTSESGDFTD  EVVIYNSDRS  NVHGFGAGGD  ERYAESASGH  QFAVPGGTKA  180
QAVRVYVFGS  QNGTTNHINE  LQVWGTPHTE  NPDVNSYQVK  IPQGNGYQVI  PYENDPTTVE  240
EGGSFRFQVL  IDSDNGYSAT  SAVKANGVSL  EAVDSVYTIE  NITEDQVITI  EGVHKAQYEV  300
KFPENPQGYS  VEIQNEGSTT  VDYNGSISFK  LIIDEAYNES  VPVVKANGGA  ALGKDEFGVY  360
TIANIQDDIT  VTVEGIQENT  VVKTKTMYLS  DMDWKSAANA  VGATGEKDTP  TKDLNHLQQQ  420
MKLLVNGAEK  SFDKGIGVQT  DSSIVYDLED  KGYTSFHTLA  GVDYSAMEYV  DGEGCDIQFK  480
VYLDDVVVFD  SGVVDASDEA  QEVNVAITSE  NKELKLEAKM  VKEPYNDWGN  WADASFEMAY  540
PEPSNVALNK  TVTVKKTADN  SDSEVNSSRP  GSMAVDGIIG  PTSDSNYCDF  GQDGDNTSRY  600
LQVDLGDVYE  LTQINMFRYW  ADGRVYNGTV  IAVSENADFS  NPTFIYNSDK  ADKHGLGAGS  660
DDTYGETQSG  KLFEVPAGTM  GQYVRVYMAG  SNKGTTNHIA  ELQVMGYNFN  TEPKPYEANA  720
FENAEVYLDM  PTHFQDLDSN  KNDDGSLKHI  GGQVTHPDIQ  VFDQPWNGYK  YWMIYTPNTM  780
ITSQYENPYI  VASEDGQTWV  EPEGISNPIE  PEPPSTRFHN  CDADLLYDSV  NDRLLAYWNW  840
ADDGGGIDDE  LKDQNCQIRL  RISYDGINWG  VPYDKDGNIA  TTADTVVRME  TGDKDFIPAI  900
SEKDRYGMLS  PTFTYDDFRG  IYTMWAQNSG  DAGYNQSGKF  IEMRWSEDGI  NWSEPQKVNN  960
FLGKDENGRQ  LWPWHQDIQY  IPELQEYWGL  SQCFSTSNPD  GSVLYLTKSR  DGVNWEQAGT  1020
QPVLRAGKSG  TWDDFQIYRS  TFYYDNQSDS  PTGGKFRIWY  SALQANTSGK  TVLAPDGTVS  1080
LQVGSQDTRI  WRIGYTENDY  MEVMKALTQN  KNYEEPELVD  AVSLNLSMDK  TSISVGEEAT  1140
VSTAFVPENA  TDRIVKYTSQ  DPEIAVIDPT  GIVTGVKDGT  TTIVAETKSG  AKGELSVTVG  1200
ELQRGEIRFE  VSNDHPMYLE  NYYWSDDAPK  KDGLDANKNY  YGDERVDSPV  MLYNTVPDEL  1260
KDNTVILLIA  ERSLNSTDAV  RDWIKKNVEL  CNENKIPCAV  QIANGETNVN  TTIPLSFWNE  1320
LATNNEYLVG  FNAAEMYNRF  AGDNRSYVMD  MIRLGVSHGV  CMMWTDTNIF  GTNGVLYDWL  1380
TQDEKLSGLM  REYKEYISLM  TKESYGSEAA  NTDALFKGLW  MTDYCENWGI  ASDWWHWQLD  1440
SNGALFDAGS  GGDAWKQCLT  WPENMYTQDV  VRAVSQGATC  FKSEAQWYSN  ATKGMRTPTY  1500
QYSMIPFLEK  LVSKEVKIPT  KEEMLERTKA  IVVGAENWNN  FNYNTTYSNL  YPSTGQYGIV  1560
PYVPSNCPEE  ELSGYDLVVR  ENLGKAGLKS  ALDTVYPVQK  SEGTAYCETF  GDTWYWMNSS  1620
EDKNVSQYTE  FTTAINGAES  VKIAGEPHVF  GIIKENPGSL  NVYLSNYRLD  KTELWDGTIP  1680
GGLSDQGCYN  YVWQMCERMK  NGTGLDTQLR  DTVITVKNAV  EPKVNFVTES  PADRSFAEDN  1740
YVRPYKYTVA  QKEGTTDEWV  ITVSHNGIVE  FNIVTGDEKV  PATSVELSTD  KVDVIRNRTA  1800
VVKATVLPQN  AGNKQLTWTI  ADPEIASVDN  KGTVTGLKEG  KTVLRAAISG  SVYKECEVNV  1860
IDRKVTEVNL  NKTELSLSAG  DSAKLEASIA  PEDPSDSSIT  WTSTNENVAT  VASNGTVTAH  1920
KAGVAQIIAQ  SAYQAKGIAT  VTVNYAASVK  LDRTGMTATA  NSEQSKSGGE  GPASNVLDGK  1980
QDTMWHTSWT  DKPELHPHWI  KIDLNGTKTI  NKFAYTPRTG  ASNGTIYNYV  LIITDPEGNE  2040
KQVAKGVWAA  NADVKYAEFD  AVEATAIKLQ  VDGNDDKASK  GGYGSAAEIN  IFEVAQKPSA  2100
NELAENIKVI  APVKAEDTKV  SIPVITGFDI  VISNSSNPDV  IGIDGSITRP  ENDTVVTLTL  2160
KVKETDSKAV  KEAAAEATTT  VDVLVTATKT  SDVEAESVTL  DKTAAELTVG  GELLLNAVVK  2220
PDNATNKAVT  WSSDKPGTAT  VENGRVKALA  AGEARITAAT  ANGKTAVCVI  NVKEKEEPEV  2280
ILPTEVRLNM  PSAEFSVGDQ  IQLTASVLPA  NAADKTITWK  SDKPEVATVA  NGWVKGIAAG  2340
TAKITATSVN  GKTAVCVITV  KAQPQNLPTG  VSLNKKTASV  KLNKTLTLSA  VVQPSNADNK  2400
TVKWTSDNTY  VATVENGVVK  AVNAGTARIT  AATVNGHKAT  CTITVPGTKI  SKAKVSLASS  2460
KTHTGKALKP  SVKVTYGKNT  LKKNTDYTVS  YKNNINPGTA  SVTITGKGKY  YGTINKTFAI  2520
KAAEGKTYTS  GKGKYKVTDA  SAKNRTVTFM  APVKKTYSSF  SVPSKVKIGN  DTYKVTAVAK  2580
NAFKKNTKLT  KVTIGSNVKT  IGSYAFYGAS  QLKTLTLKTT  GLNSVGKNAF  KKTNAKLTVK  2640
VPKSKLADYK  KLLKGKGLSG  KAKIQK  2666

Enzyme Prediction

No EC number is predicted for MGYG000002403_00465.

CAZyme Signature Domains

Family Start End E-value Coverage
CBM51 388 536 2.9e-29 0.9850746268656716
GH98 1206 1504 1.6e-16 0.9113149847094801
CBM32 1958 2089 2.1e-16 0.9516129032258065
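
To make the domain layout easier to inspect, the three rows above can be loaded into a small structure and checked against the 2666-residue protein length. This is a minimal sketch rather than anything the database provides: the `Domain` class and the helper names are hypothetical, and the residue ranges are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass

PROTEIN_LENGTH = 2666  # from the Protein Length field of this record


@dataclass(frozen=True)
class Domain:
    family: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive
    evalue: float
    coverage: float

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Rows copied from the signature domain table above.
DOMAINS = [
    Domain("CBM51", 388, 536, 2.9e-29, 0.9850746268656716),
    Domain("GH98", 1206, 1504, 1.6e-16, 0.9113149847094801),
    Domain("CBM32", 1958, 2089, 2.1e-16, 0.9516129032258065),
]


def adjacent_overlaps(domains):
    """Return family-name pairs whose ranges overlap after sorting by start."""
    ordered = sorted(domains, key=lambda d: d.start)
    return [(a.family, b.family)
            for a, b in zip(ordered, ordered[1:])
            if b.start <= a.end]


def annotated_fraction(domains, protein_length=PROTEIN_LENGTH):
    """Fraction of the protein covered, valid when domains do not overlap."""
    return sum(d.length for d in domains) / protein_length


if __name__ == "__main__":
    for d in DOMAINS:
        print(d.family, d.length)
    print(adjacent_overlaps(DOMAINS))
    print(round(annotated_fraction(DOMAINS), 3))
```

With these inputs the domains span 149, 299, and 132 residues and do not overlap, so the signature domains account for roughly a fifth of the full-length protein.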

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
pfam08305 NPCBM 5.45e-26 385 537 1 136
NPCBM/NEW2 domain. This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. This domain has also been called the NEW2 domain (Naumoff DG. Phylogenetic analysis of alpha-galactosidases of the GH27 family. Molecular Biology (Engl Transl). (2004)38:388-399.)
pfam08307 Glyco_hydro_98C 4.77e-23 1521 1775 1 269
Glycosyl hydrolase family 98 C-terminal domain. This putative domain is found at the C-terminus of glycosyl hydrolase family 98 proteins. This domain is not expected to form part of the catalytic activity.
smart00776 NPCBM 5.66e-22 384 536 2 144
This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins.
COG5492 YjdB 9.29e-19 2144 2356 144 329
Uncharacterized conserved protein YjdB, contains Ig-like domain [General function prediction only].
COG5492 YjdB 9.91e-19 2312 2505 127 318
Uncharacterized conserved protein YjdB, contains Ig-like domain [General function prediction only].

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
QQR05375.1 2.70e-165 715 1201 22 503
ANU41753.1 2.70e-165 715 1201 22 503
QIA30400.1 3.69e-165 715 1201 22 503
QIG40118.1 4.52e-92 1183 1773 20 566
CBK83841.1 8.35e-46 2423 2666 1535 1782

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
6N1B_A 2.73e-170 719 1201 27 497
Crystal structure of an N-acetylgalactosamine deacetylase from F. plautii in complex with blood group B trisaccharide [Flavonifractor plautii]
6N1A_A 6.24e-169 719 1201 27 497
Crystal structure of an N-acetylgalactosamine deacetylase from F. plautii [Flavonifractor plautii]
2WMH_A 6.11e-43 1204 1773 26 581
Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae TIGR4 in complex with the H-disaccharide blood group antigen. [Streptococcus pneumoniae TIGR4]
2WMF_A 2.67e-42 1204 1773 26 581
Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae TIGR4 (Sp4GH98) in its native form. [Streptococcus pneumoniae TIGR4]
2WMG_A 3.58e-42 1204 1773 26 581
Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae TIGR4 (Sp4GH98) in complex with the LewisY pentasaccharide blood group antigen. [Streptococcus pneumoniae TIGR4]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
P0DTR4 5.41e-166 715 1201 22 503
A type blood N-acetyl-alpha-D-galactosamine deacetylase OS=Flavonifractor plautii OX=292800 PE=1 SV=1
Q0TR53 2.41e-17 1832 2116 527 782
O-GlcNAcase NagJ OS=Clostridium perfringens (strain ATCC 13124 / DSM 756 / JCM 1290 / NCIMB 6125 / NCTC 8237 / Type A) OX=195103 GN=nagJ PE=1 SV=1
Q8XL08 7.12e-17 1832 2091 527 765
O-GlcNAcase NagJ OS=Clostridium perfringens (strain 13 / Type A) OX=195102 GN=nagJ PE=1 SV=1
P33747 6.09e-14 2195 2352 37 191
Uncharacterized protein CA_P0160 OS=Clostridium acetobutylicum (strain ATCC 824 / DSM 792 / JCM 1419 / LMG 5710 / VKM B-1787) OX=272562 GN=CA_P0160 PE=3 SV=2
P85991 2.13e-13 2277 2445 84 257
Ig-like virion protein OS=Serratia phage KSP90 OX=552528 PE=1 SV=2

SignalP and Lipop Annotations

This protein is predicted to carry a signal peptide (SP).

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
0.000256 0.999064 0.000198 0.000160 0.000152 0.000141
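
The class call can be reproduced from the scores above by taking the highest-probability category. The snippet below is a small illustrative sketch, not the predictor's own logic: the dictionary and function names are hypothetical, and it simply assumes the reported call corresponds to the maximum score.

```python
# Scores copied from the table above, keyed by prediction class.
SIGNALP_SCORES = {
    "Other": 0.000256,
    "SP_Sec_SPI": 0.999064,
    "LIPO_Sec_SPII": 0.000198,
    "TAT_Tat_SPI": 0.000160,
    "TATLIP_Sec_SPII": 0.000152,
    "PILIN_Sec_SPIII": 0.000141,
}


def best_class(scores):
    """Return the class with the highest score (assumed to be the call)."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(best_class(SIGNALP_SCORES))             # SP_Sec_SPI
    print(round(sum(SIGNALP_SCORES.values()), 6))  # close to 1
```

The top-scoring class is SP_Sec_SPI, consistent with the SP call reported for this protein, and the six scores sum to approximately 1.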

TMHMM Annotations

start end
12 30