You are browsing environment: HUMAN GUT

CAZyme Information: MGYG000003136_01367



Basic Information

Species: (not given)
Lineage: Bacteria; Firmicutes; Bacilli; Lactobacillales; Streptococcaceae; Streptococcus
CAZyme ID: MGYG000003136_01367
CAZy Family: GH101
CAZyme Description: Endo-alpha-N-acetylgalactosaminidase

CAZyme Property
Protein Length (aa): 1425
CGC: (not given)
Molecular Weight (Da): 157783.02
Isoelectric Point: 5.4077

Genome Property
Genome Assembly ID: MGYG000003136
Genome Size (bp): 1781350
Genome Type: MAG
Country: United States
Continent: North America
Gene Location: Start: 1336; End: 5613; Strand: -
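
The gene coordinates and the protein length can be cross-checked with a little arithmetic. The sketch below assumes the Start/End coordinates are 1-based and inclusive and that the span includes the stop codon; neither is stated in this record. `span_to_codons` is a hypothetical helper name used only for illustration.

```python
def span_to_codons(start: int, end: int) -> tuple[int, int, int]:
    """Return (nucleotides, whole codons, leftover nucleotides) for a gene span.

    Assumes 1-based, inclusive coordinates (an assumption, not stated in the record).
    """
    nucleotides = end - start + 1
    codons, leftover = divmod(nucleotides, 3)
    return nucleotides, codons, leftover


nucleotides, codons, leftover = span_to_codons(1336, 5613)
# Assuming the final codon is a stop codon, the protein has codons - 1 residues.
protein_length = codons - 1
print(nucleotides, codons, leftover, protein_length)
```

Under these assumptions the span is 4278 nt, or 1426 codons with no remainder, which is consistent with a 1425-residue protein followed by one stop codon.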

Full Sequence

MTVRLQVVDN  QLHFDVTKIV  NHNQVTPGQK  IDDERKLLSS  INFLGNSLVS  VSSDQAGAKF60
DGATMSNNTH  VSGDDHIEVT  NPMKDLAKGY  MYGFVSTDKL  AAGVWSNSQN  SYGGGSNDWT120
RLTAYKETVG  NANYVGIHSS  EWQWEKAYKG  IVFPEYTKEL  PSAKVVITED  ANADKKVDWQ180
DGAIAYRSIM  NNPQGWEKVK  DITAYRIAMN  FGSQAQNPFL  MTLDGIKKIN  LHTDGLGQGV240
LLKGYGSEGH  DSGHLNYADI  GKRIGGVEDF  KALIEKAKKY  GAHLGIHVNA  SETYPESKYF300
NENILRKNPD  GSYSYGWNWL  DQGINIDAAY  DLAHGRLARW  EELKNKLGEG  LDFIYVDVWG360
NGQSGDNGAW  ATHVLAKEIN  KQGWRFAIEW  GHGGEYDSTF  QHWAADLTYG  GYTNKGINSV420
ITRFIRNHQK  DSWVGDYRSY  GGAANYPLLG  GYSMKDFEGW  QGRSDYNGYV  TNLFAHDVMT480
KYFQHFTVSK  WENGTPVTMS  DNGSTYKWTP  EMRVELVDAD  NNKVVVTRKS  NDINSPQYRE540
RTVTLNGRVI  QDGSAYLTPW  NWDANGKKLP  TDKEKMYYFN  TQAGATTWTL  PSDWANSKVY600
LYKLTDQGKT  EEQELTVKDG  KITLDLLANQ  PYVLYRSKQT  NPEMSWSEGM  HIYDQGFNSG660
TLKHWTISGD  ASKAEIVKSQ  GANEMLRIQG  NKSTVSLTQK  LTGLKPNTKY  AVYVGVDNRS720
NAKASITVNT  GEKEVTSYTN  KSLALNYVKA  YAHNTRRDNA  TVDDTSYFQN  MYAFFTTGSD780
VSNVTLTLSR  EAGDEATYFD  EIRTFENNSS  MYGDNHDTAK  GTFKQDFENV  AQGIFPFVIG840
GIEGVEDNRT  HLSEKHDPYT  QRDWNGKKVD  DVIEGNWSLK  TNGLVSRRNL  VYQTIPQNFR900
FEAGKTYRVT  FEYEAGSDNT  YAFVVGKGEF  QSGRRGNQAS  NLEMHELPNT  WTDSKKAKKV960
TFLVTGAETG  DTWVGIYSTG  NASNTRGDSG  GNANFRGYND  FIMDKLQIEE  ITLTGKMLTE1020
NALKNYLPTV  AMTNYTKESM  DALKEAVFNL  SQADDDISVE  EARAEIAKID  ALKNALVQKK1080
TALVAEDFES  LNAPAQAGED  LANAFDGNLS  SLWHTSWSGG  DVGKPATMVL  KEATEITGFR1140
YVPRGSGSNG  NLRDVKLVVT  DESGKEHTFT  VTDWPNNNKP  KDIDFGKTIK  AKKIVLTGTK1200
SYGDGGDKYQ  AAAELIFSRP  QVAETALDLS  GYETALAKAQ  KLTSKEHQEE  VASVVASMKY1260
ATDNHLLTER  MVAYFAEYLN  QLQDQTTKPD  APTSSKGEEA  APILEVPEYK  GPLGTAGEEE1320
APTLATQPEF  NGGVNAVEAL  VNEKPAYTGL  LATAGDQAAP  TIEKPEYQIS  QLGQGKLAES1380
KTSVSTEDKK  RLPETGESQS  DTAIFLAGVS  LALSAAVLAT  KRKEN1425
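
The sequence above is laid out in blocks of ten residues, with a running position counter attached to the end of each line. A minimal sketch for recovering the plain residue string is shown below; `clean_sequence` is a hypothetical helper, and the approach assumes the block contains only residue letters, whitespace, and digits.

```python
import re


def clean_sequence(block: str) -> str:
    """Drop whitespace and position counters, keeping only residue letters."""
    return re.sub(r"[\s\d]+", "", block)


# First and last lines of the sequence block, copied from this record.
first_line = "MTVRLQVVDN  QLHFDVTKIV  NHNQVTPGQK  IDDERKLLSS  INFLGNSLVS  VSSDQAGAKF60"
last_line = "KTSVSTEDKK  RLPETGESQS  DTAIFLAGVS  LALSAAVLAT  KRKEN1425"

print(len(clean_sequence(first_line)))
print(clean_sequence(last_line))
```

Applying the same function to the whole block and comparing the result's length with the stated protein length (1425) is a simple way to confirm that nothing was lost in copying.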

Enzyme Prediction

No EC number is predicted for MGYG000003136_01367.

CAZyme Signature Domains

[Figure: signature domain map along the 1425-residue query; GH101 at positions 1-637]
Family  Start  End  E-value  Family coverage
GH101   1      637  0        0.8967468175388967

CDD Domains

[Figure: CDD domain map along the 1425-residue query; Gal_mutarotas_3 (1-179), Glyco_hydro_101 (180-470), GH_101_like (193-493), Glyco_hyd_101C (476-610), GalBD_like (800-997)]
Cdd ID Domain E-Value qStart qEnd sStart sEnd Domain Description
pfam12905 Glyco_hydro_101 7.28e-161 180 470 1 273
Endo-alpha-N-acetylgalactosaminidase. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by the S. pneumoniae protein Endo-alpha-N-acetylgalactosaminidase, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor.
cd14244 GH_101_like 4.83e-130 193 493 1 298
Endo-alpha-N-acetylgalactosaminidase and related glycosyl hydrolases. This family contains the enzymatically active domain of cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins (EC:3.2.1.97). It has been classified as glycosyl hydrolase family 101 in the CAZy resource. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae and other commensal human bacteria is largely determined by their ability to degrade host glycoproteins and to metabolize the resultant carbohydrates.
pfam17974 GalBD_like 7.70e-104 800 997 1 190
Galactose-binding domain-like. Proteins containing a galactose-binding domain-like fold can be found in several different protein families, in both eukaryotes and prokaryotes. The common function of these domains is to bind to specific ligands, such as cell-surface-attached carbohydrate substrates for galactose oxidase and sialidase, phospholipids on the outer side of the mammalian cell membrane for coagulation factor Va, membrane-anchored ephrin for the Eph family of receptor tyrosine kinases, and a complex of broken single-stranded DNA and DNA polymerase beta for XRCC1. The structure of the galactose-binding domain-like members consists of a beta-sandwich, in which the strands making up the sheets exhibit a jellyroll fold.
pfam18080 Gal_mutarotas_3 3.42e-79 1 179 75 243
Galactose mutarotase-like fold domain. This domain is found in endo-alpha-N-acetylgalactosaminidase present in Streptococcus pneumoniae. Endo-alpha-N-acetylgalactosaminidase is a cell surface-anchored glycoside hydrolase involved in the breakdown of mucin type O-linked glycans. The domain, known as domain 2, exhibits strong structural similarity to the galactose mutarotase-like fold but lacks the active site residues. Domains structurally similar to domain 2, found in a number of glycoside hydrolases, confer stability on multidomain architectures.
pfam17451 Glyco_hyd_101C 2.40e-41 476 610 1 111
Glycosyl hydrolase 101 beta sandwich domain. Virulence of pathogenic organisms such as the Gram-positive Streptococcus pneumoniae is largely determined by the ability to degrade host glycoproteins and to metabolize the resultant carbohydrates. This family is the enzymatic region, EC:3.2.1.97, of the cell surface proteins that specifically cleave Gal-beta-1,3-GalNAc-alpha-Ser/Thr (T-antigen, galacto-N-biose), the core 1 type O-linked glycan common to mucin glycoproteins. This reaction is exemplified by a S. pneumoniae protein, where Asp764 is the catalytic nucleophile-base and Glu796 the catalytic proton donor. This domain represents the C-terminal beta sandwich domain.

CAZyme Hits

[Figure: CAZyme hit map along the 1425-residue query; all five hits (QQQ35090.1, QKL33722.1, AYF95093.1, VEF78371.1, QLL96372.1) span positions 1-1425 and are annotated CBM32|GH101]
Hit ID E-Value Query Start Query End Hit Start Hit End
QQQ35090.1 0.0 1 1425 683 2176
QKL33722.1 0.0 1 1425 677 2204
AYF95093.1 0.0 1 1425 677 2187
VEF78371.1 0.0 1 1425 672 2141
QLL96372.1 0.0 1 1425 672 2121

PDB Hits

[Figure: PDB hit map along the 1425-residue query; 5A58_A (1-1019), 5A56_A (1-1019), 5A57_A (1-1021), 2ZXQ_A (1-1215), 5A59_A (1-1019)]
Hit ID E-Value Query Start Query End Hit Start Hit End Description
5A58_A 0.0 1 1019 93 1111
The structure of GH101 D764N mutant from Streptococcus pneumoniae TIGR4 in complex with serinyl T-antigen [Streptococcus pneumoniae TIGR4]
5A56_A 0.0 1 1019 93 1111
The structure of GH101 from Streptococcus pneumoniae TIGR4 in complex with 1-O-methyl-T-antigen [Streptococcus pneumoniae TIGR4]
5A57_A 0.0 1 1021 93 1113
The structure of GH101 from Streptococcus pneumoniae TIGR4 in complex with PUGT [Streptococcus pneumoniae TIGR4]
2ZXQ_A 0.0 1 1215 105 1369
Crystal structure of endo-alpha-N-acetylgalactosaminidase from Bifidobacterium longum (EngBF) [Bifidobacterium longum]
5A59_A 0.0 1 1019 93 1111
The structure of GH101 E796Q mutant from Streptococcus pneumoniae TIGR4 in complex with T-antigen [Streptococcus pneumoniae TIGR4]; 5A5A_A: The structure of GH101 E796Q mutant from Streptococcus pneumoniae TIGR4 in complex with PNP-T-antigen [Streptococcus pneumoniae TIGR4]

Swiss-Prot Hits

[Figure: Swiss-Prot hit map along the 1425-residue query; sp|Q2MGH6|GH101_STRPN (1-1424), sp|Q8DR60|GH101_STRR6 (1-1424), sp|A9WNA0|GH101_RENSM (32-1011), sp|P29767|NANH_CLOSE (1095-1215)]
Hit ID E-Value Query Start Query End Hit Start Hit End Description
Q2MGH6 0.0 1 1424 408 1767
Endo-alpha-N-acetylgalactosaminidase OS=Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4) OX=170187 GN=SP_0368 PE=1 SV=1
Q8DR60 0.0 1 1424 408 1767
Endo-alpha-N-acetylgalactosaminidase OS=Streptococcus pneumoniae (strain ATCC BAA-255 / R6) OX=171101 GN=spr0328 PE=1 SV=1
A9WNA0 8.47e-112 32 1011 145 1037
Putative endo-alpha-N-acetylgalactosaminidase OS=Renibacterium salmoninarum (strain ATCC 33209 / DSM 20767 / JCM 11484 / NBRC 15589 / NCIMB 2235) OX=288705 GN=RSal33209_1326 PE=3 SV=2
P29767 1.97e-09 1095 1215 63 181
Sialidase OS=Clostridium septicum OX=1504 PE=3 SV=1
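
The Query Start and Query End columns show how much of the 1425-residue query each hit aligns to. As a rough illustration, the sketch below turns those columns into a query coverage fraction. It assumes 1-based, inclusive alignment coordinates, which this record does not state; `query_coverage` is a hypothetical helper name.

```python
def query_coverage(q_start: int, q_end: int, query_length: int) -> float:
    """Fraction of the query spanned by an alignment (1-based, inclusive; assumed)."""
    return (q_end - q_start + 1) / query_length


QUERY_LENGTH = 1425  # protein length given in this record

# Query Start / Query End values copied from the Swiss-Prot hits table.
hits = {
    "Q2MGH6": (1, 1424),
    "Q8DR60": (1, 1424),
    "A9WNA0": (32, 1011),
    "P29767": (1095, 1215),
}

for hit_id, (q_start, q_end) in hits.items():
    print(hit_id, round(query_coverage(q_start, q_end, QUERY_LENGTH), 3))
```

Coverage computed this way reflects only the span between the alignment endpoints; it does not account for gaps or identity within the alignment.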

SignalP and Lipop Annotations

This protein is predicted as OTHER

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
1.000047 0.000000 0.000000 0.000000 0.000000 0.000000

TMHMM Annotations

There are no transmembrane helices in MGYG000003136_01367.