{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Building Glycans manually\n",
"\n",
"In this tutorial we will build a glycan manually by attaching individual sugar residues together. The best thing about this is that we can easily incorporate non-standard or modified sugars into our glycans."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
],
"source": [
"import plotly\n",
"plotly.offline.init_notebook_mode()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import glycosylator as gl"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's build the glycan with [GlyCosmos ID G02259AO](https://glycosmos.org/glycans/show/G02259AO)..."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# for reference, the glycan's IUPAC string is given by:\n",
"# (just copy/past from GlyCosmos)\n",
"iupac = \"GlcNAc(b1-3)[Gal(b1-4)GlcNAc(b1-6)]Gal(b1-4)Glc\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The glycan consists of Glucose (Glc), Galactose (Gal), and N-acetyl-glucose (GlcNAc). We can get the individual sugars by just making glycans from their names."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# let's get the sugars using their name abbreviations\n",
"glc = gl.glycan(\"Glc\")\n",
"gal = gl.glycan(\"Gal\")\n",
"glcnac = gl.glycan(\"GlcNAc\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Assembling glycans \n",
"\n",
"Now let's start assembling them. Glycosylator is built on top of BuildAMol, a general-purpose fragment-based molecular assembly tool. We can attach molecules (glycans) together to form new and larger ones. To that end, we need to specify a `linkage` between the two molecules, specifying which atoms to connect and which atoms to remove while connecting. Luckily, Glycosylator (and BuildAMol) already come with a repository of linkages pre-installed, so we do not need to specify any for this tutorial. If you find that your glycan cannot be built directly because some linkage is missing, you will need to make your own linkage, but that's easy to do, so don't worry about it!\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Linkage(SCK0),\n",
" Linkage(SCK1),\n",
" Linkage(LLLO),\n",
" Linkage(CERA),\n",
" Linkage(CERB),\n",
" Linkage(DAGA),\n",
" Linkage(DAGB),\n",
" Linkage(INS2A),\n",
" Linkage(INS2B),\n",
" Linkage(INS6A),\n",
" Linkage(INS6B),\n",
" Linkage(SGPA),\n",
" Linkage(TGPA),\n",
" Linkage(SGPB),\n",
" Linkage(TGPB),\n",
" Linkage(NGLA),\n",
" Linkage(11aa),\n",
" Linkage(11ab),\n",
" Linkage(11bb),\n",
" Linkage(12aa),\n",
" Linkage(12ab),\n",
" Linkage(12ba),\n",
" Linkage(12bb),\n",
" Linkage(13aa),\n",
" Linkage(13ab),\n",
" Linkage(13ba),\n",
" Linkage(13bb),\n",
" Linkage(14aa),\n",
" Linkage(14ab),\n",
" Linkage(14ba),\n",
" Linkage(14bb),\n",
" Linkage(16aa),\n",
" Linkage(16ab),\n",
" Linkage(16bb),\n",
" Linkage(16ba),\n",
" Linkage(SUCR),\n",
" Linkage(LCTL),\n",
" Linkage(AB15),\n",
" Linkage(SA23AB),\n",
" Linkage(LINK),\n",
" Linkage(ASN-glyco),\n",
" Linkage(SER-glyco),\n",
" Linkage(THR-glyco),\n",
" Linkage(23ab),\n",
" Linkage(23ba),\n",
" Linkage(11ba),\n",
" Linkage(21aa),\n",
" Linkage(21ab),\n",
" Linkage(21ba),\n",
" Linkage(21bb),\n",
" Linkage(31aa),\n",
" Linkage(31ab),\n",
" Linkage(31ba),\n",
" Linkage(31bb),\n",
" Linkage(41aa),\n",
" Linkage(41ab),\n",
" Linkage(41ba),\n",
" Linkage(41bb),\n",
" Linkage(51aa),\n",
" Linkage(51ab),\n",
" Linkage(51ba),\n",
" Linkage(51bb),\n",
" Linkage(61aa),\n",
" Linkage(61ab),\n",
" Linkage(61ba),\n",
" Linkage(61bb),\n",
" Linkage(71aa),\n",
" Linkage(71ab),\n",
" Linkage(71ba),\n",
" Linkage(71bb),\n",
" Linkage(81aa),\n",
" Linkage(81ab),\n",
" Linkage(81ba),\n",
" Linkage(81bb),\n",
" Linkage(22aa),\n",
" Linkage(22ab),\n",
" Linkage(22ba),\n",
" Linkage(22bb),\n",
" Linkage(32aa),\n",
" Linkage(32ab),\n",
" Linkage(32ba),\n",
" Linkage(32bb),\n",
" Linkage(42aa),\n",
" Linkage(42ab),\n",
" Linkage(42ba),\n",
" Linkage(42bb),\n",
" Linkage(52aa),\n",
" Linkage(52ab),\n",
" Linkage(52ba),\n",
" Linkage(52bb),\n",
" Linkage(62aa),\n",
" Linkage(62ab),\n",
" Linkage(62ba),\n",
" Linkage(62bb),\n",
" Linkage(72aa),\n",
" Linkage(72ab),\n",
" Linkage(72ba),\n",
" Linkage(72bb),\n",
" Linkage(82aa),\n",
" Linkage(82ab),\n",
" Linkage(82ba),\n",
" Linkage(82bb),\n",
" Linkage(23aa),\n",
" Linkage(23bb),\n",
" Linkage(33aa),\n",
" Linkage(33ab),\n",
" Linkage(33ba),\n",
" Linkage(33bb),\n",
" Linkage(43aa),\n",
" Linkage(43ab),\n",
" Linkage(43ba),\n",
" Linkage(43bb),\n",
" Linkage(53aa),\n",
" Linkage(53ab),\n",
" Linkage(53ba),\n",
" Linkage(53bb),\n",
" Linkage(63aa),\n",
" Linkage(63ab),\n",
" Linkage(63ba),\n",
" Linkage(63bb),\n",
" Linkage(73aa),\n",
" Linkage(73ab),\n",
" Linkage(73ba),\n",
" Linkage(73bb),\n",
" Linkage(83aa),\n",
" Linkage(83ab),\n",
" Linkage(83ba),\n",
" Linkage(83bb),\n",
" Linkage(24aa),\n",
" Linkage(24ab),\n",
" Linkage(24ba),\n",
" Linkage(24bb),\n",
" Linkage(34aa),\n",
" Linkage(34ab),\n",
" Linkage(34ba),\n",
" Linkage(34bb),\n",
" Linkage(44aa),\n",
" Linkage(44ab),\n",
" Linkage(44ba),\n",
" Linkage(44bb),\n",
" Linkage(54aa),\n",
" Linkage(54ab),\n",
" Linkage(54ba),\n",
" Linkage(54bb),\n",
" Linkage(64aa),\n",
" Linkage(64ab),\n",
" Linkage(64ba),\n",
" Linkage(64bb),\n",
" Linkage(74aa),\n",
" Linkage(74ab),\n",
" Linkage(74ba),\n",
" Linkage(74bb),\n",
" Linkage(84aa),\n",
" Linkage(84ab),\n",
" Linkage(84ba),\n",
" Linkage(84bb),\n",
" Linkage(15aa),\n",
" Linkage(15ab),\n",
" Linkage(15ba),\n",
" Linkage(15bb),\n",
" Linkage(25aa),\n",
" Linkage(25ab),\n",
" Linkage(25ba),\n",
" Linkage(25bb),\n",
" Linkage(35aa),\n",
" Linkage(35ab),\n",
" Linkage(35ba),\n",
" Linkage(35bb),\n",
" Linkage(45aa),\n",
" Linkage(45ab),\n",
" Linkage(45ba),\n",
" Linkage(45bb),\n",
" Linkage(55aa),\n",
" Linkage(55ab),\n",
" Linkage(55ba),\n",
" Linkage(55bb),\n",
" Linkage(65aa),\n",
" Linkage(65ab),\n",
" Linkage(65ba),\n",
" Linkage(65bb),\n",
" Linkage(75aa),\n",
" Linkage(75ab),\n",
" Linkage(75ba),\n",
" Linkage(75bb),\n",
" Linkage(85aa),\n",
" Linkage(85ab),\n",
" Linkage(85ba),\n",
" Linkage(85bb),\n",
" Linkage(26aa),\n",
" Linkage(26ab),\n",
" Linkage(26ba),\n",
" Linkage(26bb),\n",
" Linkage(36aa),\n",
" Linkage(36ab),\n",
" Linkage(36ba),\n",
" Linkage(36bb),\n",
" Linkage(46aa),\n",
" Linkage(46ab),\n",
" Linkage(46ba),\n",
" Linkage(46bb),\n",
" Linkage(56aa),\n",
" Linkage(56ab),\n",
" Linkage(56ba),\n",
" Linkage(56bb),\n",
" Linkage(66aa),\n",
" Linkage(66ab),\n",
" Linkage(66ba),\n",
" Linkage(66bb),\n",
" Linkage(76aa),\n",
" Linkage(76ab),\n",
" Linkage(76ba),\n",
" Linkage(76bb),\n",
" Linkage(86aa),\n",
" Linkage(86ab),\n",
" Linkage(86ba),\n",
" Linkage(86bb),\n",
" Linkage(17aa),\n",
" Linkage(17ab),\n",
" Linkage(17ba),\n",
" Linkage(17bb),\n",
" Linkage(27aa),\n",
" Linkage(27ab),\n",
" Linkage(27ba),\n",
" Linkage(27bb),\n",
" Linkage(37aa),\n",
" Linkage(37ab),\n",
" Linkage(37ba),\n",
" Linkage(37bb),\n",
" Linkage(47aa),\n",
" Linkage(47ab),\n",
" Linkage(47ba),\n",
" Linkage(47bb),\n",
" Linkage(57aa),\n",
" Linkage(57ab),\n",
" Linkage(57ba),\n",
" Linkage(57bb),\n",
" Linkage(67aa),\n",
" Linkage(67ab),\n",
" Linkage(67ba),\n",
" Linkage(67bb),\n",
" Linkage(77aa),\n",
" Linkage(77ab),\n",
" Linkage(77ba),\n",
" Linkage(77bb),\n",
" Linkage(87aa),\n",
" Linkage(87ab),\n",
" Linkage(87ba),\n",
" Linkage(87bb),\n",
" Linkage(18aa),\n",
" Linkage(18ab),\n",
" Linkage(18ba),\n",
" Linkage(18bb),\n",
" Linkage(28aa),\n",
" Linkage(28ab),\n",
" Linkage(28ba),\n",
" Linkage(28bb),\n",
" Linkage(38aa),\n",
" Linkage(38ab),\n",
" Linkage(38ba),\n",
" Linkage(38bb),\n",
" Linkage(48aa),\n",
" Linkage(48ab),\n",
" Linkage(48ba),\n",
" Linkage(48bb),\n",
" Linkage(58aa),\n",
" Linkage(58ab),\n",
" Linkage(58ba),\n",
" Linkage(58bb),\n",
" Linkage(68aa),\n",
" Linkage(68ab),\n",
" Linkage(68ba),\n",
" Linkage(68bb),\n",
" Linkage(78aa),\n",
" Linkage(78ab),\n",
" Linkage(78ba),\n",
" Linkage(78bb),\n",
" Linkage(88aa),\n",
" Linkage(88ab),\n",
" Linkage(88ba),\n",
" Linkage(88bb),\n",
" Linkage(CER-glyco),\n",
" Linkage(SPL-glyco)]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# let's first check what linkages are available\n",
"gl.available_linkages()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can refer to each of these linkages by their id, e.g. `\"14bb\"` for a 1-4 beta glycosidic linkage, etc. Hence, we can connect the Galactose to the Glucose using:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"customdata": [
[
"C1",
1,
1,
"GLC",
"A"
],
[
"C2",
2,
1,
"GLC",
"A"
],
[
"C3",
3,
1,
"GLC",
"A"
],
[
"C4",
4,
1,
"GLC",
"A"
],
[
"C5",
5,
1,
"GLC",
"A"
],
[
"C6",
6,
1,
"GLC",
"A"
],
[
"C1",
24,
2,
"GLA",
"A"
],
[
"C2",
25,
2,
"GLA",
"A"
],
[
"C3",
26,
2,
"GLA",
"A"
],
[
"C4",
27,
2,
"GLA",
"A"
],
[
"C5",
28,
2,
"GLA",
"A"
],
[
"C6",
29,
2,
"GLA",
"A"
]
],
"hovertemplate": "atom_element=C
x=%{x}
y=%{y}
z=%{z}
__marker_size=%{marker.size}
atom_id=%{customdata[0]}
atom_serial=%{customdata[1]}
residue_serial=%{customdata[2]}
residue_name=%{customdata[3]}
chain_id=%{customdata[4]}
x=%{x}
y=%{y}
z=%{z}
__marker_size=%{marker.size}
atom_id=%{customdata[0]}
atom_serial=%{customdata[1]}
residue_serial=%{customdata[2]}
residue_name=%{customdata[3]}
chain_id=%{customdata[4]}
x=%{x}
y=%{y}
z=%{z}
__marker_size=%{marker.size}
atom_id=%{customdata[0]}
atom_serial=%{customdata[1]}
residue_serial=%{customdata[2]}
residue_name=%{customdata[3]}
chain_id=%{customdata[4]}