You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000002403_00464

Basic Information

Species: Robinsoniella peoriensis
Lineage: Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; Robinsoniella; Robinsoniella peoriensis
CAZyme ID: MGYG000002403_00464
CAZy Family: CBM51
CAZyme Description: hypothetical protein
CAZyme Property
  Protein Length: 1773 aa
  CGC: MGYG000002403_19|CGC1
  Molecular Weight: 197990.67 Da
  Isoelectric Point: 4.8649
Genome Property
  Genome Assembly ID: MGYG000002403
  Genome Size: 7202103 bp
  Genome Type: Isolate
  Country: not provided
  Continent: not provided
Gene Location: Start: 14987; End: 20308; Strand: +
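
The gene coordinates and the protein length reported above can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and that the gene span includes a stop codon; neither is stated in the record, and the function name `expected_protein_length` is purely illustrative.

```python
# Values copied from this record.
start, end = 14987, 20308   # assumed 1-based, inclusive coordinates
protein_length = 1773       # amino acids


def expected_protein_length(start: int, end: int, has_stop_codon: bool = True) -> int:
    """Return the protein length implied by inclusive gene coordinates."""
    gene_length = end - start + 1          # nucleotides spanned by the gene
    if gene_length % 3 != 0:
        raise ValueError(f"gene length {gene_length} is not a multiple of 3")
    codons = gene_length // 3
    # Assumption: the last codon is a stop codon and is not translated.
    return codons - 1 if has_stop_codon else codons


implied = expected_protein_length(start, end)
print(implied, implied == protein_length)  # prints: 1773 True
```

Under these assumptions the 5322-nucleotide span corresponds to 1774 codons, that is, 1773 residues plus one stop codon, which matches the reported protein length.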

Full Sequence

MKFRAKKLVA  LLTASCMCLA  SPLSAAAESG  TGTRLVKGQT  GYLTEEQAIR  NQEQTTEERE  60
QKLTGEETAE  VLMEDTKDSG  IVQTEEVQTK  EMQTEDAQTK  EVQTEEMQTE  DAQTEEMQTE  120
DAQTEEVQTK  EVPAEETHMK  EIQTQETKKA  SDRNGKARVT  EILEDAQDPA  NRIVYLSDLQ  180
WKSENHTVDS  ELPTRKDKSF  GGGKITLKVD  GTVTEFDKGI  GTQTDSTIVY  DLEGKGYTKF  240
ETYVGVDYSQ  KENIPGEVCD  VKFRVKIDDK  IVSETGVLDP  LSNAVKISVN  IPDTAKTLTL  300
YADKVTETWS  DHANWADAKF  YQALPEPENV  TFKKTVVARK  TSDNSEAPVN  PDSAVNSSKA  360
VDGVIDSSSY  FDFGDQANSG  AVRESLYMEV  DLKGSYLLSD  IQLWRYWKDG  RTYAATAIVV  420
AEDENFENAA  VIYNSDTTGE  IHHLGAGSDM  LYAETESGKT  FPVPENTKAR  YIRVYTYGVN  480
GTSGVTNHIV  ELKVNAYVFG  DEILPEKPDD  SKIFPNAVNP  LKLQGPGTND  QVTHPDVTVF  540
DKPWNGYKYW  MAYTPNKPGS  SYFENPCIAA  SNDGVNWEFP  AQNPVQPRYD  SEIENQNEHN  600
CDTDIVYDPV  NDRLIMYWEW  AQDEAVNGKT  HRSEIRYRVS  YDGINWGVED  EKGVLMTGPT  660
DHGCAIATEG  ERYSDLSPTV  VYDKTEKIYK  MWANDAGDVG  YENKQNNKVW  YRTSQDGISN  720
WSDKTYVENF  LGVNEDGLQM  YPWHQDIQWV  EGFQEYWALQ  QAFPAGSGPD  NSSLRFSKSK  780
DGIHWEPVSE  KALITVGAPG  TWDAGQIYRS  TFWYEPGGAK  GNGTFHIWYA  ALAEGQSHWD  840
IGYTSANYAD  AMYKLTGSRP  EVEKRIEVNN  ENPLLIMPLY  GKSYSESGST  LDWGDDLVSR  900
WKQVPEDLKE  NAVIEIHLGG  KIGLNESDSH  TAKAFYEQQL  AIAQENNIPV  MMVVATAGQQ  960
NYWTGTANLD  AEWIDRMFKQ  HSVLKGIMST  ENYWTDYNKV  ATMGADYLRV  AAENGGYFVW  1020
SEHQEGVIEN  VIANEKFNEA  LKLYGNNFIF  TWKNTPAGTN  SNAGTASYMQ  GLWLTGICAQ  1080
WGGLADTWKW  YEKGFGKLFD  GQYSYNPGGE  EARPVATEPE  ALLGIEMMSI  YTNGGCVYNF  1140
EHPAYVYGSY  NQNSPCFENV  IAEFMRYAIK  NPAPGKEEVL  ADTKAVFYGK  LSSLKSAGNL  1200
LQNGLNWEDA  TLPTQTTGRY  GLIPAVPEAV  DEKTVKAVFG  DIEILNQSSA  QLANKDAKKA  1260
YFEAKYPEQY  TGTAFGQLLN  DTWYLYNSNV  NVDGVQNAKL  PLEGNKSVDI  TMTPHTYVIL  1320
DDQDGELQIK  LNNYRVDKDS  IWEGYGTTVT  DRWDTDHNTK  LQDWIRDEYI  PNPDDDTFRD  1380
TTFELVGLES  EPEVNVTNGL  KDQYQEPVVE  YDAAAGTAMI  TVSGNGWVDL  TIDTNTAEVP  1440
QVDKAKLNSK  IAEAKGIRQG  NYTDESYKAL  QEEIGKSQAV  SNKTDATQEE  VNAQLSRLES  1500
AIARLKEKPA  VVSKTALNAK  IAEAKGIRQG  NYTDESYKAL  QNAIVKAQEL  SNKTDATQQQ  1560
VNDLVSALTN  AIKNLKIDAD  KLAAESAKKV  AAVKVAVKAV  SYKSKEIKLS  WKTVTDADGY  1620
VIRVKTGKKW  STEKTIKNNR  IITYTYKKGT  PGKKYVFEVK  AFKKVNGKTT  YSKYKTATKK  1680
VVPQTVTAKA  KASKNNVVVK  WNKVSGASGY  VVMKKKGKTW  VKAAQVNAKK  LYFTDKKVKK  1740
GKVYSYKVKA  YKVYKGKKVY  GSYSKSVNVK  TKS  1773
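
The sequence block mixes residues with spacing and a running position counter at the end of each line, so it cannot be pasted directly into tools that expect a bare sequence. A minimal sketch of one way to clean it, assuming that every letter is a residue and everything else (digits, whitespace) is layout; the helper name `clean_sequence_block` is hypothetical.

```python
import re


def clean_sequence_block(block: str) -> str:
    """Drop whitespace and position counters, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


# First line of the sequence block above, including its position counter.
first_line = (
    "MKFRAKKLVA  LLTASCMCLA  SPLSAAAESG  TGTRLVKGQT  GYLTEEQAIR  NQEQTTEERE60"
)
residues = clean_sequence_block(first_line)
print(len(residues))  # prints: 60
```

Applied to the whole block, the cleaned string should have the same length as the reported protein length (1773), which gives a quick integrity check after copying.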

Enzyme Prediction

EC: 3.2.1.102, 3.2.1.-

CAZyme Signature Domains

Family Start End Evalue family coverage
GH98 862 1174 4e-63 0.9938837920489296
CBM51 174 320 3.1e-33 0.9925373134328358
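
To read the signature-domain table as a domain layout, it helps to order the hits along the protein and compute how many residues each one spans. The sketch below hard-codes the two rows from the table; the `Domain` type and `domain_span` helper are illustrative names, and treating Start/End as inclusive residue positions is an assumption.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    family: str
    start: int       # first residue of the hit (assumed inclusive)
    end: int         # last residue of the hit (assumed inclusive)
    evalue: float
    coverage: float  # "family coverage" column, copied as reported


# Rows copied from the CAZyme Signature Domains table.
domains = [
    Domain("GH98", 862, 1174, 4e-63, 0.9938837920489296),
    Domain("CBM51", 174, 320, 3.1e-33, 0.9925373134328358),
]
protein_length = 1773


def domain_span(domain: Domain) -> int:
    """Number of residues covered by a domain hit."""
    return domain.end - domain.start + 1


ordered = sorted(domains, key=lambda d: d.start)
for d in ordered:
    span = domain_span(d)
    print(f"{d.family}: residues {d.start}-{d.end}, {span} aa "
          f"({span / protein_length:.1%} of the protein)")
```

Ordered this way, the CBM51 module (147 residues) sits N-terminal to the GH98 module (313 residues).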

CDD Domains

Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
pfam08306 Glyco_hydro_98M 6.15e-61 860 1173 3 328
Glycosyl hydrolase family 98. This domain is the putative catalytic domain of glycosyl hydrolase family 98 proteins.
pfam08307 Glyco_hydro_98C 3.03e-58 1176 1434 1 269
Glycosyl hydrolase family 98 C-terminal domain. This putative domain is found at the C-terminus of glycosyl hydrolase family 98 proteins. This domain is not expected to form part of the catalytic activity.
pfam08305 NPCBM 2.22e-33 174 320 3 135
NPCBM/NEW2 domain. This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. This domain has also been called the NEW2 domain (Naumoff DG. Phylogenetic analysis of alpha-galactosidases of the GH27 family. Molecular Biology (Engl Transl). (2004)38:388-399.)
smart00776 NPCBM 5.33e-24 174 320 5 144
This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins.
pfam07554 FIVAR 4.17e-06 1515 1575 1 68
FIVAR domain. This domain is found in a wide variety of contexts, but mostly occurring in cell wall associated proteins. A lack of conserved catalytic residues suggests that it is a binding domain. From context, possible substrates are hyaluronate or fibronectin (personal obs: C Yeats). This is further evidenced by. Possibly the exact substrate is N-acetyl glucosamine. Finding it in the same protein as pfam05089 further supports this proposal. It is found in the C-terminal part of Bacillus sp. Gellan lyase, which is removed during maturation. Some of the proteins it is found in are involved in methicillin resistance. The name FIVAR derives from Found In Various Architectures.

CAZyme Hits

Hit ID E-Value Query Start Query End Hit Start Hit End
ALG47715.1 1.79e-136 848 1432 193 792
AQM60602.1 1.37e-135 848 1432 193 796
BAB80035.1 1.16e-134 848 1432 193 792
AOY52755.1 1.16e-134 848 1432 193 792
SQG37611.1 1.16e-134 848 1432 193 792

PDB Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
2WMH_A 2.55e-118 835 1432 2 581
Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae TIGR4 in complex with the H-disaccharide blood group antigen. [Streptococcus pneumoniae TIGR4]
2WMF_A 2.55e-118 835 1432 2 581
Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae TIGR4 (Sp4GH98) in its native form. [Streptococcus pneumoniae TIGR4]
2WMG_A 3.18e-117 835 1432 2 581
Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae TIGR4 (Sp4GH98) in complex with the LewisY pentasaccharide blood group antigen. [Streptococcus pneumoniae TIGR4]
2WMI_A 1.25e-116 861 1426 24 598
Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae SP3-BS71 in complex with the A-trisaccharide blood group antigen. [Streptococcus pneumoniae SP3-BS71], 2WMJ_A Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae SP3-BS71 (Sp3GH98) in complex with the B-trisaccharide blood group antigen. [Streptococcus pneumoniae SP3-BS71], 2WMJ_B Crystal structure of the catalytic module of a family 98 glycoside hydrolase from Streptococcus pneumoniae SP3-BS71 (Sp3GH98) in complex with the B-trisaccharide blood group antigen. [Streptococcus pneumoniae SP3-BS71]
4D6D_A 4.91e-116 861 1426 1 575
Crystal structure of a family 98 glycoside hydrolase catalytic module (Sp3GH98) in complex with the blood group A-trisaccharide (X02 mutant) [Streptococcus pneumoniae SP3-BS71]

Swiss-Prot Hits

Hit ID E-Value Query Start Query End Hit Start Hit End Description
Q6RUF5 2.67e-135 848 1432 198 797
Blood-group-substance endo-1,4-beta-galactosidase OS=Clostridium perfringens OX=1502 GN=eabC PE=1 SV=1
P0DTR4 9.56e-84 529 847 55 380
A type blood N-acetyl-alpha-D-galactosamine deacetylase OS=Flavonifractor plautii OX=292800 PE=1 SV=1
Q9L7Q2 8.03e-12 1442 1575 416 544
Zinc metalloprotease ZmpB OS=Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4) OX=170187 GN=zmpB PE=3 SV=2
E8MGH9 3.96e-09 1432 1589 1656 1814
Beta-L-arabinobiosidase OS=Bifidobacterium longum subsp. longum (strain ATCC 15707 / DSM 20219 / JCM 1217 / NCTC 11818 / E194b) OX=565042 GN=hypBA2 PE=1 SV=1

SignalP and Lipop Annotations

This protein is predicted to have a signal peptide (SP).

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
0.000609 0.942425 0.056126 0.000294 0.000289 0.000228
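
The "SP" call can be reproduced from the score row by taking the highest-scoring class. This is only a sketch of that reading: it assumes the six values are per-class probabilities and that the label is assigned by a plain argmax, which the page does not state explicitly.

```python
# Class scores copied from the table above.
scores = {
    "Other": 0.000609,
    "SP_Sec_SPI": 0.942425,
    "LIPO_Sec_SPII": 0.056126,
    "TAT_Tat_SPI": 0.000294,
    "TATLIP_Sec_SPII": 0.000289,
    "PILIN_Sec_SPIII": 0.000228,
}

# Assumption: the reported label is simply the highest-scoring class.
best = max(scores, key=scores.get)
total = sum(scores.values())

print(best)             # prints: SP_Sec_SPI
print(round(total, 3))  # prints: 1.0
```

The scores sum to approximately 1, which is consistent with reading them as probabilities over mutually exclusive classes.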

TMHMM Annotations

There are no transmembrane helices in MGYG000002403_00464.