Environment: HUMAN GUT

CAZyme Information: MGYG000000398_01228


Basic Information

Species:
Lineage: Bacteria; Firmicutes_A; Clostridia; Lachnospirales; Lachnospiraceae; TF01-11;
CAZyme ID: MGYG000000398_01228
CAZy Family: GH59
CAZyme Description: hypothetical protein

CAZyme Property
Protein Length: 1110
CGC: MGYG000000398_7|CGC1
Molecular Weight: 120655.68
Isoelectric Point: 5.4396

Genome Property
Genome Assembly ID: MGYG000000398
Genome Size: 2936252
Genome Type: MAG
Country: Sweden
Continent: Europe
Gene Location: Start: 35698; End: 39030; Strand: +
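
The gene coordinates and the reported protein length can be cross-checked with a little arithmetic. The sketch below assumes the interval is 1-based and inclusive and that it includes the stop codon; both are assumptions about the gene caller's convention, not something stated on this page. The helper name `coding_length` is hypothetical.

```python
def coding_length(start: int, end: int) -> tuple[int, int]:
    """Return (nucleotides, codons) for a 1-based inclusive gene interval."""
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("interval length is not a multiple of 3")
    return nucleotides, nucleotides // 3


# Coordinates taken from the Gene Location field of this record.
nt, codons = coding_length(35698, 39030)

# Assumption: the last codon is a stop codon, so it encodes no residue.
residues = codons - 1
print(nt, codons, residues)
```

Under those assumptions the interval spans 3333 nucleotides, or 1111 codons, which agrees with the listed protein length of 1110 residues.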

Full Sequence

MLKRFLKQAG  ACLLSVSMIL  TSNVGSAGVS  ASSRKGDAVE  TKTGSIVING  DDIKADNVNG  60
LTYKGFGLLS  ANSTSDLLMD  YKSQNPEKYA  ELMQYLFGGQ  YPIFTHVKLE  MGNDSNTSTG  120
PESATMRTRD  EKANVLRNPG  WQLAADAKKI  NPNLKVSILS  WCTPTWVKKD  EDKYYWYKQS  180
ILAAYEQYGF  MVDYINPNTN  ESWGGKGDIA  TTKKFAQWIA  EESADTISDE  KELALFHKIK  240
LIVSDEANVV  SDSVAENLKS  DQEFMDAVDV  VGYHYKTADD  SNGGMKWFAE  EMDKEVWNSE  300
EQATFSNSAF  RPAGNDKAPT  VAGTGIGGSG  SALEMGNTVI  KSFVDSRRSH  VIYQPAIGSF  360
YEGGEYSFKE  LVSARDPWSG  WMHYDAGLLV  LAHISKFAVT  GWENESNTAG  IWRGVPSASK  420
ASAVQTTSSN  AVDGRVGGEN  YMTLAAPTKD  NFSTVIVNDS  EYPMTYTLKT  ENMNLKADQK  480
LELWETRAAD  DGAFNENYMK  RIQEVSADAE  GVYSFEVKPN  SAVTVTTLEV  SDSEEHTKAL  540
PVEGERTVLD  TDATGDVQDT  ANEYLYADNF  EYTGKTVPVL  DGKGGFTGET  EDYIASRGGD  600
TGAMARYTHT  LNGAFEVYKD  ENGNHVLRQQ  RDKQATGVGR  AWNKGDPVTL  IGDYRWTNYA  660
ASIDVMFERA  ADSQYAQVGI  RQTGRTYNIS  NCAGYSLKVN  DNGTWALYRA  KFGSSSASGE  720
ELASGSVDAS  QVTPETWFNL  ELRGEGNVIK  AYVNNSLIAE  YEDANPITSG  RIAIGSGNTY  780
TRFDNLEVKK  LEGYAPYYNE  YIDNMETYDL  TPEKNPKLVY  DDKWSITCQN  HGMYTYQRSA  840
SYSTGVGATL  SYTFKGTGLE  ILGYNKSDKT  TLNVTVDGNI  YQTAAPLWKS  DNMCTEYQLA  900
GLEDAQHTVT  IEVASGSLAV  DAVAVTGSVY  GAKADETPSP  APSATAVPSA  APSGAPAQDP  960
QTQTLTVKKG  DTFKVKDITY  VVTNAEKKTV  VLKKAENKKS  KAIVVPASVK  TAAGTFRVTG  1020
ISDKAFAGCS  SLTKVTIGKN  VTSIGKEAFA  KDKNLKKIVI  KSSGLKKVGK  NAIKGISGKA  1080
KISCGKKNVT  AYKKLFTGKT  GYVKSMKITK  1110
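
To reuse the sequence in other tools, the block spacing and the running position counters at the end of each line need to be stripped. A minimal sketch, assuming every non-letter character in the block is layout and not sequence; `parse_numbered_sequence` is a hypothetical helper name.

```python
import re


def parse_numbered_sequence(block: str) -> str:
    """Drop whitespace and position counters, keeping only residue letters."""
    return re.sub(r"[^A-Za-z]", "", block).upper()


# First line of the sequence block above, including its position counter.
first_line = (
    "MLKRFLKQAG  ACLLSVSMIL  TSNVGSAGVS  ASSRKGDAVE  TKTGSIVING  DDIKADNVNG60"
)
residues = parse_numbered_sequence(first_line)
print(len(residues))
```

Applied to the whole block, the cleaned string should have the same length as the Protein Length field, which makes a quick sanity check after copying.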

Enzyme Prediction

EC 3.2.1.23

CAZyme Signature Domains

Family  Start  End  E-value   Family Coverage
GH59    57     789  1.2e-171  0.993660855784469

CDD Domains

Cdd ID     Domain          E-Value   qStart  qEnd  sStart  sEnd
pfam02057  Glyco_hydro_59  1.29e-93  63      395   1       292
sd00036    LRR_3           3.53e-12  1003    1075  65      127
sd00036    LRR_3           8.67e-12  1003    1082  19      88
sd00036    LRR_3           3.65e-11  1018    1075  48      104
sd00036    LRR_3           5.48e-11  1021    1083  5       66

Domain Description
Glyco_hydro_59: Glycosyl hydrolase family 59.
LRR_3: Leucine-rich repeats. A leucine-rich repeat (LRR) is a structural protein motif of 20-30 amino acids that is unusually rich in the hydrophobic amino acid leucine. The conserved eleven-residue sequence motif (LxxLxLxxN/CxL) within the LRRs corresponds to the beta-strand and adjacent loop regions, whereas the remaining parts of the repeats are variable. LRRs fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Leucine-rich repeats are usually involved in protein-protein interactions.
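
The eleven-residue LRR consensus quoted in the domain description (LxxLxLxxN/CxL) can be written as a regular expression. The sketch below is a strict, literal reading of that consensus, with `x` taken as any residue; the function name and the test peptide are made up for illustration. Real LRRs often substitute other hydrophobic residues for leucine, so a strict scan like this can miss repeats that a profile search such as CDD detects.

```python
import re

# L x x L x L x x N/C x L, with "x" meaning any residue (11 positions).
# The lookahead lets overlapping candidate motifs be reported as well.
LRR_CONSENSUS = re.compile(r"(?=(L..L.L..[NC].L))")


def find_lrr_consensus(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched segment) for each strict consensus hit."""
    return [(m.start() + 1, m.group(1)) for m in LRR_CONSENSUS.finditer(sequence)]


# Made-up peptide for illustration; it is not taken from this protein.
example = "AAALTKLDLSGNKLAAA"
print(find_lrr_consensus(example))
```

An empty result on a real sequence therefore does not contradict an LRR_3 annotation; it only means the repeat does not follow the textbook consensus letter for letter.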

CAZyme Hits

Hit ID      Hit Families    E-Value    Query Start  Query End  Hit Start  Hit End
ACL75594.1  CBM6|GH59       5.87e-298  47           932        35         915
ABG76970.1  CBM6|GH59       5.87e-298  47           932        35         915
AJQ96197.1  CBM2|CBM6|GH59  1.42e-289  45           925        308        1183
AUX40340.1  GH59            2.22e-289  45           925        56         931
QNK59157.1  GH59            9.16e-289  46           925        315        1188

PDB Hits

This protein has no PDB hits.

Swiss-Prot Hits

Hit ID  E-Value   Query Start  Query End  Hit Start  Hit End  Description
P54804  8.89e-25  45           789        19         666      Galactocerebrosidase OS=Canis lupus familiaris OX=9615 GN=GALC PE=1 SV=1
B5X3C1  8.22e-24  60           668        34         550      Galactocerebrosidase OS=Salmo salar OX=8030 GN=galc PE=2 SV=1
O02791  6.17e-23  60           789        52         682      Galactocerebrosidase OS=Macaca mulatta OX=9544 GN=GALC PE=1 SV=2
Q5SNX7  9.91e-23  60           788        30         657      Galactocerebrosidase OS=Danio rerio OX=7955 GN=galc PE=2 SV=1
P54803  9.93e-22  60           789        52         682      Galactocerebrosidase OS=Homo sapiens OX=9606 GN=GALC PE=1 SV=3

SignalP and Lipop Annotations

This protein is predicted to have a signal peptide (SP).

Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII
0.000395 0.998863 0.000211 0.000195 0.000178 0.000156
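
The prediction can be read off the score row by taking the highest-scoring class. A small sketch using the labels and values from the table above; treating the call as a plain argmax is an assumption about how the page derives its summary, not documented behavior.

```python
labels = (
    "Other SP_Sec_SPI LIPO_Sec_SPII TAT_Tat_SPI TATLIP_Sec_SPII PILIN_Sec_SPIII"
).split()
scores = [
    float(x)
    for x in "0.000395 0.998863 0.000211 0.000195 0.000178 0.000156".split()
]

# Pair each class label with its score.
probabilities = dict(zip(labels, scores))

# Assumption: the reported call is simply the class with the highest score.
best_class = max(probabilities, key=probabilities.get)
print(best_class, probabilities[best_class])
```

Here the top class is `SP_Sec_SPI`, which matches the summary line, and the six scores sum to approximately 1.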

TMHMM Annotations

There are no transmembrane helices in MGYG000000398_01228.