dbCAN-seq: a database of CAZyme sequence and annotation

You are browsing environment: FUNGIDB

Cite us: 2018 and 2022

Browse by Taxonomy

Arthropoda (11)	Ascomycota (182)	Basidiomycota (23)	Blastocladiomycota (1)
Chytridiomycota (4)	Mucoromycota (9)	Oomycota (25)

Browse by Family

AA1 (2002)	AA2 (923)	AA3 (3244)	AA4 (515)	AA5 (437)	AA6 (520)	AA7 (3611)
AA8 (420)	AA9 (1275)	AA10 (2)	AA11 (496)	AA12 (188)	AA13 (66)	AA14 (88)
AA15 (48)	AA16 (176)	AA17 (795)

CBM1 (136)	CBM5 (3)	CBM6 (37)	CBM9 (50)	CBM13 (277)	CBM14 (1)	CBM18 (79)
CBM19 (17)	CBM20 (266)	CBM21 (419)	CBM24 (264)	CBM25 (3)	CBM32 (10)	CBM35 (16)
CBM38 (64)	CBM42 (85)	CBM43 (358)	CBM46 (15)	CBM47 (28)	CBM48 (252)	CBM50 (422)
CBM51 (1)	CBM52 (21)	CBM63 (113)	CBM66 (47)	CBM67 (380)	CBM87 (65)	CBM91 (412)
CBM506 (5)	CBM508 (48)

CE1 (391)	CE2 (123)	CE3 (354)	CE4 (1336)	CE5 (834)	CE8 (547)	CE9 (265)
CE11 (23)	CE12 (214)	CE13 (1)	CE15 (89)	CE16 (428)	CE17 (80)	CE18 (66)
CE20 (6)

GH1 (534)	GH2 (850)	GH3 (2647)	GH4 (2)	GH5 (2880)	GH6 (313)	GH7 (453)
GH8 (15)	GH9 (54)	GH10 (445)	GH11 (329)	GH12 (449)	GH13 (2173)	GH14 (5)
GH15 (500)	GH16 (3252)	GH17 (1252)	GH18 (2515)	GH19 (48)	GH20 (469)	GH23 (26)
GH24 (106)	GH25 (79)	GH26 (117)	GH27 (416)	GH28 (1212)	GH29 (86)	GH30 (405)
GH31 (1157)	GH32 (503)	GH33 (74)	GH35 (424)	GH36 (310)	GH37 (436)	GH38 (280)
GH39 (104)	GH42 (11)	GH43 (1757)	GH44 (3)	GH45 (123)	GH46 (16)	GH47 (1498)
GH49 (30)	GH51 (300)	GH53 (154)	GH54 (106)	GH55 (628)	GH62 (131)	GH63 (336)
GH64 (172)	GH65 (169)	GH67 (113)	GH68 (1)	GH71 (564)	GH72 (1147)	GH74 (84)
GH75 (190)	GH76 (1270)	GH77 (1)	GH78 (725)	GH79 (375)	GH81 (544)	GH84 (17)
GH85 (41)	GH88 (155)	GH89 (105)	GH92 (440)	GH93 (234)	GH94 (20)	GH95 (162)
GH104 (1)	GH105 (352)	GH106 (120)	GH109 (133)	GH114 (204)	GH115 (170)	GH117 (3)
GH123 (20)	GH125 (323)	GH127 (73)	GH128 (517)	GH130 (44)	GH131 (237)	GH132 (381)
GH134 (79)	GH135 (203)	GH136 (8)	GH139 (25)	GH140 (67)	GH141 (32)	GH142 (48)
GH144 (1)	GH146 (39)	GH151 (1)	GH152 (172)	GH154 (195)	GH162 (29)	GH171 (10)
GH172 (42)

GT1 (1552)	GT2 (4030)	GT3 (269)	GT4 (1212)	GT5 (25)	GT7 (14)	GT8 (965)
GT10 (49)	GT14 (8)	GT15 (906)	GT17 (115)	GT19 (32)	GT20 (1130)	GT21 (236)
GT22 (1051)	GT24 (270)	GT25 (286)	GT28 (19)	GT30 (25)	GT31 (836)	GT32 (1057)
GT33 (259)	GT34 (548)	GT35 (223)	GT39 (817)	GT41 (356)	GT43 (29)	GT44 (7)
GT45 (74)	GT47 (50)	GT48 (505)	GT49 (103)	GT50 (257)	GT51 (1)	GT55 (14)
GT57 (534)	GT58 (257)	GT59 (238)	GT60 (183)	GT61 (84)	GT62 (647)	GT64 (56)
GT66 (283)	GT69 (606)	GT71 (1464)	GT74 (10)	GT76 (267)	GT77 (19)	GT81 (1)
GT90 (960)	GT91 (173)	GT92 (6)	GT96 (15)	GT100 (1)	GT101 (1)	GT105 (30)
GT109 (195)	GT110 (3)

PL1 (813)	PL3 (574)	PL4 (265)	PL7 (30)	PL8 (21)	PL9 (57)	PL11 (14)
PL14 (104)	PL20 (44)	PL22 (4)	PL26 (67)	PL27 (17)	PL29 (2)	PL33 (2)
PL35 (44)	PL38 (100)	PL42 (82)

Browse by Substrate

agarose (7)	alginate (1)	alpha-galactan (1)
alpha-glucan (7)	alpha-mannan (9)	arabinan (27)
arabinogalactan (39)	arabinoxylan (21)	beta-glucan (197)
beta-mannan (9)	capsule (4)	carrageenan (9)
cellobiose (8)	cellulose (29)	chitin (20)
fructan (2)	galactan (4)	galactomannan (6)
glucomannan (4)	glycogen (1)	glycosaminoglycan (1)
host glycan (43)	human milk oligosaccharide (4)	mucin (7)
pectin (153)	raffinose (1)	starch (5)
xylan (89)	xyloglucan (29)

alpha-glucan (1)	alpha-mannan (2)	alpha-rhamnoside (1)
arabinan (3)	arabinogalactanprotein (1)	beta-fucosides (8)
beta-galactan (7)	beta-glucan (105)	beta-mannan (6)
cellulose (20)	chitin (69)	chitooligosaccharide (30)
chitosan (1)	fructan (6)	glycogen (8)
glycolipid (1)	hostglycan (9)	lignin (7)
pectin (38)	peptidoglycan (16)	polyphenol (10)
rhamnose (4)	starch (33)	sucrose (5)
trehalose (1)	xylan (80)	xyloglucan (18)

pectin (21)	beta-glucan (7)	xyloglucan (10)
xylan (22)	fructan (1)	beta-mannan (1)
polyphenol (3)	arabinan (4)

Introduction

We published the original dbCAN-seq database in 2018 to provide pre-computed CAZyme and CGC (CAZyme gene cluster) sequence and annotation data for 5,349 bacterial isolate genomes.

In the past five years, numerous microbiomes have been sequenced, and hundreds of thousands of metagenome-assembled genomes (MAGs) from various ecological environments are now available in public databases such as the MGnify database of the European Bioinformatics Institute and the IMG/M database of the Joint Genome Institute. Currently, no database collects CAZymes and CGCs from microbiome MAGs and provides them on the web.

In the meantime, the CAZyme bioinformatics field has continued to develop. It is now possible to infer carbohydrate substrates for CAZymes and CGCs, which is of great interest to applied microbiome research.

To provide a comprehensive CAZyme and CGC database for the community, we collected MAGs from four ecological environments to update the dbCAN-seq database. This update makes the following major advances:

(i) ~498,000 CAZymes and ~169,000 CAZyme gene clusters (CGCs) from 9,421 MAGs of four ecological (human gut, human oral, cow rumen, and marine) environments;
(ii) Glycan substrates for 41,447 (24.54%) CGCs inferred by two novel approaches (dbCAN-PUL homology search and eCAMI subfamily majority voting);
(iii) A redesigned CGC page that includes a graphical display of CGC gene composition, an alignment of the query CGC against the subject PUL (polysaccharide utilization locus) from dbCAN-PUL to illustrate the substrate inference, and an eCAMI subfamily table to support substrates predicted from eCAMI subfamilies;
(iv) A statistics page to organize all the data for easy CGC access according to substrates and taxonomic phyla; and
(v) A batch download page.
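
The "subfamily majority voting" idea in item (ii) can be illustrated with a minimal sketch. This page does not describe the actual dbCAN-seq procedure, so everything below is an assumption made for illustration: the function name `vote_substrate`, the `min_votes` threshold, and the rule that ties yield no prediction are all hypothetical. The sketch only shows the general technique of picking the substrate label that most of a CGC's CAZyme subfamilies agree on.

```python
from collections import Counter
from typing import List, Optional


def vote_substrate(
    subfamily_substrates: List[Optional[str]],
    min_votes: int = 2,  # hypothetical threshold, not taken from dbCAN-seq
) -> Optional[str]:
    """Pick a substrate for one CGC by majority vote over its subfamilies.

    Each list entry is the substrate label associated with one CAZyme
    subfamily in the cluster, or None if that subfamily has no known label.
    Returns None when no label reaches min_votes or when the top two
    labels are tied (an illustrative rule, not the published one).
    """
    counts = Counter(s for s in subfamily_substrates if s is not None)
    if not counts:
        return None

    ranked = counts.most_common(2)
    top_label, top_votes = ranked[0]

    if top_votes < min_votes:
        return None
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None  # tie between the two most common labels
    return top_label
```

For example, `vote_substrate(["xylan", "xylan", "pectin", None])` selects `"xylan"`, while a cluster with one `"xylan"` and one `"pectin"` subfamily yields no prediction under these assumed rules.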

All predicted substrate assignments for CGCs in dbCAN-seq require experimental validation. We hope that these predicted CGCs and substrates from the microbiomes of four ecological environments will facilitate the experimental characterization of new polysaccharide utilization loci (PULs) by the carbohydrate community.

Copyright 2022 © YIN LAB, UNL. All rights reserved. Designed by Jinfang Zheng and Boyang Hu. Maintained by Yanbin Yin.