CAZyme3D

Information for CAZyme ID: AIT18899.1

Basic Information

GenBank ID: AIT18899.1
Family: CBM18, GH18
Sequence Length: 1174
UniProt ID: A0A097F8K9 (100,100)
Average pLDDT: 74.88
CAZy50 ID: 8410
CAZy50 Rep: Yes, AIT18899.1
Structure Cluster: SC_GH18_clus9
EC Number(s): -
Substrate(s): -

Taxonomy

Tax ID: 42368
Kingdom: Eukaryota
Phylum: Ascomycota
Class: Sordariomycetes
Order: Hypocreales
Family: Ophiocordycipitaceae
Genus: Hirsutella
Species: Hirsutella thompsonii

Protein Sequence:
90 < pLDDT <= 100;
70 < pLDDT <= 90;
50 < pLDDT <= 70;
0 <= pLDDT <= 50

MKSLWVPLVA  LGVLVGFSTQ  QQCSLTQLCE  AGCCSSAGFC  GYGPEYCGKG  CQSTCDRKAD  60
CNPGWDGSDW  SKRDKCPLNV  CCSPHGFCGF  TEEFCEGNEV  KRPSCSVGDT  LVTRVVGYYE  120
GWASSKRSCY  GLMPEEIPYG  QYTHIIFSFL  TVNPETFEVT  AGGQDTQVML  SRMEAIRILQ  180
PDIKLWVAVG  GWAFNDPGPT  QTVFSDVAAS  PQKTTKFIKS  LLATMMRYGF  DGVDIDWEYP  240
VAEDRHGRDE  DYENIVTFMR  NLKTRMSFHV  KGVSMTLPAS  YWYLQHFDIK  ALEKHVDWFN  300
LMTYDIHGAW  DIDNKWTGPW  ANAHTNLTEI  QSGLDLLWRN  EISPKKVTIG  MSYYSRSFTL  360
ADPSCNGVGC  RVSSAGAAGR  CSGTAGVLLH  PEIQEIVSEK  GLKPVLNRKA  AVKTVSWDNQ  420
WVSFDDTVTW  RLKANHLRSQ  CIEGFMVWAI  SQDDKKGTNA  QALTKALGRP  VRDFPNLKGK  480
TEKPEMQLSG  PKTCRWSSCF  EGCPSGFKEV  QRDGHKEIML  DTTHCKNFGT  KMSRLCCPMS  540
SNMPVCRWRG  HSNSGKCKGG  CNQDEIEVGT  IWAGCKSGYQ  SACCTKTEST  AGYGKCMWMP  600
CSKKAGKDMC  VFGTFKDFIT  TSSIASGGWQ  SCGKNSKSTL  CCESPPPDEM  SGICGWVQKT  660
GHSNSFEQSL  ICEASCRDDQ  FRVGLESGDK  APDHGKEKCK  GEMAYCCDKS  SPKVPRYDRD  720
TGSAQAKEFK  KLMASYMDNP  TCPATILFPS  LTDSQPEARS  LEAESRQYEI  LRGRAQDCTL  780
DNWSRLLNYV  VLLFTTKQLA  MSPLIKIWDD  DFAGSYDKIL  EQKSLQEYFG  ESPTEDVRVT  840
LEYVLYNPLE  AGDGIRNSKA  FKKDLCTESP  MNNRRRNIDE  EINIIQHGEL  VIMDDEHEID  900
RRIINAFSKI  TSKTKTQPKL  QAIIQAIRGR  ILPLEYARWQ  WYNARTGHHQ  PGPFLELAYR  960
IGPRIGVNGG  RGFDQYRDNN  PTSTRGGGRN  QFILFHMHID  PSTQWLQRSQ  GTTFLGITSL  1020
TMYHSNSATS  TYFRGANNGY  IQGGWRAINH  QEGVTGRTKL  SCELLPTNNH  ARLWWVGSQS  1080
PASAGPQNPW  LDDLRLWGQD  LHAQGYVSRP  ALNLILNRPN  PNAELGPEDN  RRLVLNNAIN  1140
PSLEVEPYRW  NFRVENGRIV  FDKNTKPPRI  KDDL  1174
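
The listing above groups residues in blocks of ten and ends each line with a running position count. A minimal sketch for turning such a listing into one contiguous sequence string is shown below; the function name `parse_sequence_block` is hypothetical and not part of CAZyme3D, and only the first two lines of the listing are used as sample input.

```python
import re


def parse_sequence_block(text: str) -> str:
    """Join a space-grouped, position-numbered sequence listing into one string.

    Hypothetical helper: everything that is not a letter (whitespace and the
    running position counters at the end of each line) is dropped.
    """
    return re.sub(r"[^A-Za-z]", "", text).upper()


# Sample input: the first two lines of the listing for AIT18899.1.
listing = """
MKSLWVPLVA  LGVLVGFSTQ  QQCSLTQLCE  AGCCSSAGFC  GYGPEYCGKG  CQSTCDRKAD60
CNPGWDGSDW  SKRDKCPLNV  CCSPHGFCGF  TEEFCEGNEV  KRPSCSVGDT  LVTRVVGYYE120
"""

seq = parse_sequence_block(listing)
print(len(seq), seq[:10])
```

Applied to the full listing, the length of the result should match the Sequence Length given under Basic Information (1174).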

Predicted 3D structure by AlphaFold2 with average pLDDT = 74.88

pLDDT is a per-residue confidence score (0 to 100) for the predicted structure. A higher value indicates a more reliable prediction for that residue. For more detail, please see AlphaFold.

Residues are colored according to pLDDT (blue: high quality; red: low quality).
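
The four pLDDT ranges listed in the sequence legend can be applied to any score with a small helper. The sketch below is only an illustration: the function name `plddt_bin` is hypothetical, and the bin boundaries are taken directly from the legend on this page.

```python
def plddt_bin(score: float) -> str:
    """Return the legend bin that a pLDDT score falls into.

    Hypothetical helper; boundaries follow the legend on this page.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be in [0, 100], got {score}")
    if score > 90:
        return "90 < pLDDT <= 100"
    if score > 70:
        return "70 < pLDDT <= 90"
    if score > 50:
        return "50 < pLDDT <= 70"
    return "0 <= pLDDT <= 50"


# The average pLDDT reported for this entry.
print(plddt_bin(74.88))
```

For this entry, the average score of 74.88 falls into the second bin (70 < pLDDT <= 90).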

Carbohydrate-binding residues predicted by CAPSIF from the 3D structure

Residues were colored according to prediction score:

Nonbinder, CAPSIF:G Predicted Binder, CAPSIF:V Predicted Binder, CAPSIF:V and CAPSIF:G Predicted Binder

The CArbohydrate–Protein interaction Site IdentiFier (CAPSIF) predicts non-covalent carbohydrate-binding sites on proteins using two models: (1) a 3D-UNet voxel-based neural network (CAPSIF:V) and (2) an equivariant graph neural network (CAPSIF:G).

Details:
⋆B-Factor = 0.0 : Nonbinder.
⋆B-Factor = 40.0 : CAPSIF:G Predicted Binder.
⋆B-Factor = 59.9 : CAPSIF:V Predicted Binder.
⋆B-Factor = 99.9 : CAPSIF:V and CAPSIF:G Predicted Binder.

For more detail please see CAPSIF.
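
The B-factor encoding listed under Details can be decoded per residue with a short script. The sketch below assumes the downloadable structure is a standard fixed-column PDB file with the prediction stored in the B-factor field (columns 61 to 66) of each ATOM record; that file layout is an assumption, and the names `capsif_label`, `residue_labels`, and `demo_atom_line` are hypothetical. The demo records are synthetic, not taken from the real structure.

```python
# B-factor values and labels as listed under Details.
CAPSIF_LABELS = {
    0.0: "Nonbinder",
    40.0: "CAPSIF:G Predicted Binder",
    59.9: "CAPSIF:V Predicted Binder",
    99.9: "CAPSIF:V and CAPSIF:G Predicted Binder",
}


def capsif_label(b_factor: float, tol: float = 0.05) -> str:
    """Map a B-factor value to its CAPSIF label, allowing for rounding."""
    for value, label in CAPSIF_LABELS.items():
        if abs(b_factor - value) <= tol:
            return label
    raise ValueError(f"Unexpected B-factor value: {b_factor}")


def residue_labels(pdb_lines):
    """Return {(chain, residue number): label} from the CA atoms of ATOM records.

    Assumes standard PDB columns: atom name 13-16, chain 22,
    residue number 23-26, B-factor 61-66.
    """
    labels = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]))
            labels[key] = capsif_label(float(line[60:66]))
    return labels


def demo_atom_line(resnum: int, b_factor: float) -> str:
    """Build a synthetic fixed-column CA ATOM record (demo data only)."""
    return (
        f"{'ATOM':<6}{resnum:>5}{'':1}{' CA ':4}{'':1}{'GLY':3}{'':1}{'A':1}"
        f"{resnum:>4}{'':4}{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


demo = [demo_atom_line(1, 0.0), demo_atom_line(2, 40.0), demo_atom_line(3, 99.9)]
for key, label in residue_labels(demo).items():
    print(key, label)
```

Reading only the CA atom of each residue keeps one label per residue, which is sufficient if all atoms of a residue carry the same B-factor value.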

dbCAN3 predicted domain(s): CBM18(27-55)+CBM18(72-97)+GH18(113-462)
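
The domain string above follows a `FAMILY(start-end)` pattern joined by `+`, so it can be split into structured records. The sketch below is an illustration only: the names `Domain` and `parse_domains` are hypothetical, and it assumes the coordinates are 1-based and inclusive, which this page does not state explicitly.

```python
import re
from typing import List, NamedTuple


class Domain(NamedTuple):
    family: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive


# Family name (letters + number, optional _subfamily) followed by (start-end).
DOMAIN_RE = re.compile(r"([A-Za-z]+\d+(?:_\d+)?)\((\d+)-(\d+)\)")


def parse_domains(annotation: str) -> List[Domain]:
    """Parse a domain annotation string into a list of Domain records."""
    return [
        Domain(family, int(start), int(end))
        for family, start, end in DOMAIN_RE.findall(annotation)
    ]


domains = parse_domains("CBM18(27-55)+CBM18(72-97)+GH18(113-462)")
for d in domains:
    # Length under the inclusive-coordinate assumption.
    print(d.family, d.start, d.end, d.end - d.start + 1)
```

With a full sequence string `seq`, the residues of a domain would be `seq[d.start - 1:d.end]` under the same coordinate assumption.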

Predicted CAZyme domains from dbCAN

Domains were colored according to CAZyme classification: (AA), (CE), (PL), (GH), (GT), (CBM), & (Null)

The dbCAN3 server is a web server for automated Carbohydrate-active enzyme ANnotation.

Details:
dbCAN3 server integrates three state-of-the-art tools/databases for automated CAZyme annotation:
⋆HMMER search for CAZyme family annotation vs. dbCAN CAZyme domain HMM database
⋆DIAMOND search for BLAST hits in the CAZy database
⋆HMMER search for CAZyme subfamily annotation vs. dbCAN-sub HMM database of CAZyme subfamilies (derived from eCAMI classification of CAZyDB families)

For more details, please see dbCAN3.

Similarities between sequences in the same cluster, from DIAMOND