1.3. GROMACS案例#
新手的第一个GROMACS案例——水中的溶菌酶
该教程来自弗吉尼亚理工大学生物化学系的分子动力学教程:www.mdtutorials.com
该模拟流程完整清晰,符合一个GROMACS模拟的标准流程,足够简单不需要复杂的准备工作,是了解GROMACS基本操作的入门案例.
此案例将模拟一个包含一个溶菌酶分子的简单的水系统,通过此案例,将实际了解GROMACS执行MD的具体流程,基本命令操作.
1.3.1. 准备工作#
1.3.1.1. 环境及软件准备#
如果你按照上一小节的教程安装了 GROMACS及额外的模拟软件, 里面包含了本节需要的可视化工具 VMD.
首先需要 安装GROMACS,这里默认你使用的是Windows版或者是WSL中的GROMACS
VMD 工具用来可视化分子结构和模拟轨迹
Grace 用来将xvg文件绘制成图表
因为grace的版本不同, 调用grace的命令可能是:
xmgrace,qtgrace,grace如果还没有安装grace绘图程序,且你正在使用的是ubuntu系统,可以使用
sudo atp-get install grace命令安装。
1.3.1.2. GROMACS初识#
GROMACS的所有工具都是以命令形式调用,所以需要在命令行界面操作。命令格式形如:
gmx <module> <parameters>module 是各种GROMACS工具,可以完成生成拓扑参数、构建盒子、添加溶剂、添加离子、执行模拟计算等功能.
parameters 规定了一条GROMACS命令具体的输入与输出文件,以及一些可选项.
例如,想要查询一个GROMACS工具可以接受的参数,使用
gmx <module> -h,输出依次是:程序在电脑中的路径
GROMACS的安装路径
当前的工作目录
输入的命令
该工具的语法、描述、可选项、其它选项、
最后是GROMACS程序的提示语(提示语没用,但看到它表示此次命令执行成功)。
$ gmx pdb2gmx -h
:-) GROMACS - gmx pdb2gmx, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/ms
Command line:
gmx pdb2gmx -h
SYNOPSIS
gmx pdb2gmx [-f [<.gro/.g96/...>]] [-o [<.gro/.g96/...>]] [-p [<.top>]]
[-i [<.itp>]] [-n [<.ndx>]] [-q [<.gro/.g96/...>]]
[-chainsep <enum>] [-merge <enum>] [-ff <string>] [-water <enum>]
[-[no]inter] [-[no]ss] [-[no]ter] [-[no]lys] [-[no]arg]
[-[no]asp] [-[no]glu] [-[no]gln] [-[no]his] [-angle <real>]
[-dist <real>] [-[no]una] [-[no]ignh] [-[no]missing] [-[no]v]
[-posrefc <real>] [-vsite <enum>] [-[no]heavyh] [-[no]deuterate]
[-[no]chargegrp] [-[no]cmap] [-[no]renum] [-[no]rtpres]
DESCRIPTION
...
OPTIONS
...
Other options:
...
GROMACS reminds you: "If we knew what it was we were doing, it would not be called research, would it?" (Albert Einstein)
1.3.1.3. 文件准备#
获得蛋白质结构文件,去RCSB网站下载溶菌酶的pdb文件,通过 PDB ID 搜索 “1AKI” >点击右侧“Download Files” >“Legacy PDB Format”,将得到"1aki.pdb"文件.
打开 “1aki.pdb” 文件,可以看到其中包含了结晶水"HOH",删除所有"HOH"所在行,命名为"1aki-h2o.pdb", 使用Linux命令实现:
grep -v HOH 1aki.pdb > 1aki-h2o.pdb注意检查蛋白质pdb文件结构,一个标准的蛋白质pdb文件包含了:
文件标题 HEADER
实验数据和元数据 REMARK
残基序列 SEQRES
二级和三级结构信息 HELIX SHEET
晶体学信息 CRYST1
二硫键 SSBOND
原子坐标 ATOM
非标准残基或配体的原子 HETATM
原子之间的连接关系 CONECT
TER 用于标识链的结束
MODEL 用于多模型结构的描述等等.
本节教程使用的所有文件打包在此: 百度网盘下载
1.3.2. 步骤1-生成拓扑结构#
GROMACS工具:pdb2gmx
pdb2gmx命令会生成三个文件:
分子的拓扑结构.top
位置限制文件.itp
GROMACS格式的结构文件.gro
需要输入数字来选择力场和水模型,相关信息将写入top文件中。力场选择"AMBER99SB",水模型无特殊需求选择简单点电荷水即可"TIP3P"
$ gmx pdb2gmx -f 1aki-h2o.pdb -p 1aki.top -i 1aki.itp -o 1aki.gro
:-) GROMACS - gmx pdb2gmx, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx pdb2gmx -f 1aki-h2o.pdb -p 1aki.top -i 1aki.itp -o 1aki.gro
Select the Force Field:
From '/home/zz/miniconda3/envs/md/share/gromacs/top':
1: AMBER03 protein, nucleic AMBER94 (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003)
2: AMBER94 force field (Cornell et al., JACS 117, 5179-5197, 1995)
3: AMBER96 protein, nucleic AMBER94 (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996)
4: AMBER99 protein, nucleic AMBER94 (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000)
5: AMBER99SB protein, nucleic AMBER94 (Hornak et al., Proteins 65, 712-725, 2006)
6: AMBER99SB-ILDN protein, nucleic AMBER94 (Lindorff-Larsen et al., Proteins 78, 1950-58, 2010)
7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002)
8: CHARMM27 all-atom force field (CHARM22 plus CMAP for proteins)
9: GROMOS96 43a1 force field
10: GROMOS96 43a2 force field (improved alkane dihedrals)
11: GROMOS96 45a3 force field (Schuler JCC 2001 22 1205)
12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)
13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)
14: GROMOS96 54a7 force field (Eur. Biophys. J. (2011), 40,, 843-856, DOI: 10.1007/s00249-011-0700-9)
15: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)
$ 5
Using the Amber99sb force field in directory amber99sb.ff
Opening force field file /home/zz/miniconda3/envs/md/share/gromacs/top/amber99sb.ff/watermodels.dat
Select the Water Model:
1: TIP3P TIP 3-point, recommended
2: TIP4P TIP 4-point
3: TIP4P-Ew TIP 4-point optimized with Ewald, recommended
4: TIP5P TIP 5-point (see https://gitlab.com/gromacs/gromacs/-/issues/1348 for issues)
5: SPC simple point charge
6: SPC/E extended simple point charge
7: None
$ 1
going to rename amber99sb.ff/aminoacids.r2b
Opening force field file /home/zz/miniconda3/envs/md/share/gromacs/top/amber99sb.ff/aminoacids.r2b
......
Analyzing pdb file
......
Total mass 14313.332 a.m.u.
Total charge 8.000 e
Writing topology
Writing coordinate file...
--------- PLEASE NOTE ------------
You have successfully generated a topology from: 1aki-h2o.pdb.
The Amber99sb force field and the tip3p water model are used.
--------- ETON ESAELP ------------
GROMACS reminds you: "The Carpenter Goes Bang Bang" (The Breeders)
现在使用VMD观察 1aki.gro 文件, 先输入命令
vmd 1aki.gro打开VMD, 然后在命令窗口输入pbc box回车, 会发现蛋白质没有完全在盒子内, 接下来进行调整.
1.3.3. 步骤2-调整模拟盒子#
GROMACS工具:editconf
使用简单的立方体作为盒子,将蛋白质剧中,距离边界至少0.2nm
$ gmx editconf -f 1aki.gro -o 1aki_newbox.gro -bt cubic -c -d 0.2
:-) GROMACS - gmx editconf, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx editconf -f 1aki.gro -o 1aki_newbox.gro -bt cubic -c -d 0.2
Note that major changes are planned in future for editconf, to improve usability and utility.
Read 1960 atoms
Volume: 123.376 nm^3, corresponds to roughly 55500 electrons
No velocities found
system size : 3.817 4.234 3.454 (nm)
diameter : 5.010 (nm)
center : 2.781 2.488 0.017 (nm)
box vectors : 5.906 6.845 3.052 (nm)
box angles : 90.00 90.00 90.00 (degrees)
box volume : 123.38 (nm^3)
shift : -0.076 0.217 2.688 (nm)
new center : 2.705 2.705 2.705 (nm)
new box vectors : 5.410 5.410 5.410 (nm)
new box angles : 90.00 90.00 90.00 (degrees)
new box volume : 158.35 (nm^3)
Back Off! I just backed up 1aki_newbox.gro to ./#1aki_newbox.gro.1#
GROMACS reminds you: "Predictions can be very difficult - especially about the future." (Niels Bohr)
再次使用VMD观察"1aki_newbox.gro",可以看到盒子蛋白质已经位于盒子中央.
1.3.4. 步骤3-添加溶剂(溶剂化蛋白质)#
GROMACS工具:solvate
使用水模型
spc216.gro(这是一个通用的三点水模型)填充盒子,并指定更新top文件(因为引入了新的分子,每次向盒子中添加新分子都需要更新top文件)注意这里重复使用了"1aki.top"文件名,因此原来的文件将被备份为"#1aki.top.1#"
$ gmx solvate -cp 1aki_newbox.gro -cs spc216.gro -o 1aki_solvate.gro -p 1aki.top
:-) GROMACS - gmx solvate, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx solvate -cp 1aki_newbox.gro -cs spc216.gro -o 1aki_solvate.gro -p 1aki.top
Reading solute configuration
......
Generating solvent configuration
......
Output configuration contains 15373 atoms in 4600 residues
Volume : 158.347 (nm^3)
Density : 1004.79 (g/l)
Number of solvent molecules: 4471
Processing topology
Adding line for 4471 solvent molecules with resname (SOL) to topology file (1aki.top)
Back Off! I just backed up 1aki.top to ./#1aki.top.1#
GROMACS reminds you: "I wanted to have a real language where programmers could write real programs, and see whether they would find the idea of data abstraction at
all useful." (Barbara Liskov)
重新观察一下新的"1aki.top"文件,最下面将多一行"SOL 4471"表示4471个溶剂分子
[ molecules ]
; Compound #mols
Protein_chain_A 1
SOL 4471
可以用VMD查看一下新的结构文件"1aki_solvate.gro",可以看到水分子充满了盒子,溶菌酶被包裹在中间。
1.3.5. 步骤4-添加离子#
GROMACS工具:grompp和genion
如果你注意pdb2gmx命令的输出,可以看到溶菌酶的净电荷为+8e :“Total charge 8.000 e”,因此需要添加离子中和。
genion工具需要读取扩展名为.tpr的文件,因此首先用grompp工具先生成tpr文件
要使用grompp生成一个.tpr文件,还需要一个扩展名为.mdp的文件,可以在此处下载示例 ions.mdp 文件
grompp将动力学参数、结构信息、拓扑参数三部分信息整理成二进制的tpr文件。
$ gmx grompp -f ions.mdp -c 1aki_solvate.gro -p 1aki.top -o ions.tpr
:-) GROMACS - gmx grompp, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx grompp -f ions.mdp -c 1aki_solvate.gro -p 1aki.top -o ions.tpr
Setting the LD random seed to 1608515534
......
NOTE 1 [file 1aki.top, line 18409]:
System has non-zero total charge: 8.000000
Total charge should normally be an integer. See
https://manual.gromacs.org/current/user-guide/floating-point.html
for discussion on how close it should be to an integer.
......
Analysing residue names:
There are: 129 Protein residues
There are: 4471 Water residues
Analysing Protein...
Number of degrees of freedom in T-Coupling group rest is 32703.00
The integrator does not provide a ensemble temperature, there is no system ensemble temperature
NOTE 2 [file ions.mdp]:
You are using a plain Coulomb cut-off, which might produce artifacts.
You might want to consider using PME electrostatics.
......
This run will generate roughly 1 Mb of data
There were 2 NOTEs
GROMACS reminds you: "The future always gets twisted and turned" (Lisa o Piu)
接下来正式添加离子,指定阳离子为钠离子,阴离子为氯离子,使体系保持电中性 “-neutral”,并更新top文件.
过程中需要输入溶剂组"SOL"对应的数字.
$ gmx genion -s ions.tpr -o 1aki_solvate_ions.gro -pname NA -nname CL -neutral -p 1aki.top
:-) GROMACS - gmx genion, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx genion -s ions.tpr -o 1aki_solvate_ions.gro -pname NA -nname CL -neutral -p 1aki.top
Reading file ions.tpr, VERSION 2024.3-conda_forge (single precision)
Reading file ions.tpr, VERSION 2024.3-conda_forge (single precision)
Will try to add 0 NA ions and 8 CL ions.
Select a continuous group of solvent molecules
Group 0 ( System) has 15373 elements
Group 1 ( Protein) has 1960 elements
Group 2 ( Protein-H) has 1001 elements
Group 3 ( C-alpha) has 129 elements
Group 4 ( Backbone) has 387 elements
Group 5 ( MainChain) has 517 elements
Group 6 ( MainChain+Cb) has 634 elements
Group 7 ( MainChain+H) has 646 elements
Group 8 ( SideChain) has 1314 elements
Group 9 ( SideChain-H) has 484 elements
Group 10 ( Prot-Masses) has 1960 elements
Group 11 ( non-Protein) has 13413 elements
Group 12 ( Water) has 13413 elements
Group 13 ( SOL) has 13413 elements
Group 14 ( non-Water) has 1960 elements
Select a group:
$ 13
Selected 13: 'SOL'
Number of (3-atomic) solvent molecules: 4471
Processing topology
Replacing 8 solute molecules in topology file (1aki.top) by 0 NA and 8 CL ions.
Back Off! I just backed up 1aki.top to ./#1aki.top.2#
Using random seed -54592018.
Replacing solvent molecule 855 (atom 4525) with CL
Replacing solvent molecule 232 (atom 2656) with CL
Replacing solvent molecule 1855 (atom 7525) with CL
Replacing solvent molecule 4162 (atom 14446) with CL
Replacing solvent molecule 72 (atom 2176) with CL
Replacing solvent molecule 3463 (atom 12349) with CL
Replacing solvent molecule 4122 (atom 14326) with CL
Replacing solvent molecule 1838 (atom 7474) with CL
GROMACS reminds you: "Torture numbers, and they'll confess to anything." (Greg Easterbrook)
现在检查一下新的top文件,最下面应该多了"CL 8"
[ molecules ]
; Compound #mols
Protein_chain_A 1
SOL 4463
CL 8
1.3.6. 步骤5-能量最小化#
GROMACS工具:grompp, mdrun, energy
在开始动力学之前,必须确保系统没有空间冲突或不适当的几何形状(尤其是对于自行绘制的分子结构,你没有办法精确的控制没个键长、键角的数值),通过能量最小化(EM)的过程来使得分子结构合理化。
但凡设计到使用grompp生成tpr文件,都需要mdp动力学参数文件,可以使用这里的示例 em.mdp 文件。
$ gmx grompp -f em.mdp -c 1aki_solvate_ions.gro -p 1aki.top -o em.tpr
:-) GROMACS - gmx grompp, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx grompp -f em.mdp -c 1aki_solvate_ions.gro -p 1aki.top -o em.tpr
NOTE 1 [file em.mdp]:
With Verlet lists the optimal nstlist is >= 10, with GPUs >= 20. Note
that with the Verlet scheme, nstlist has no effect on the accuracy of
your simulation.
......
This run will generate roughly 1 Mb of data
There was 1 NOTE
GROMACS reminds you: "Youth is wasted on the young" (The Smashing Pumpkins)
然后执行能量最小化计算
-deffnm em简写方法,意思是让mdrun工具以em为名的tpr文件为输入,并使得使出的各种格式的文件以em名字命名。-v表示在命令行中显示计算过程
$ gmx mdrun -deffnm em -v
:-) GROMACS - gmx mdrun, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx mdrun -deffnm em -v
The current CPU can measure timings more accurately than the code in
gmx mdrun was configured to use. This might affect your simulation
speed as accurate timings are needed for load-balancing.
Please consider rebuilding gmx mdrun with the GMX_USE_RDTSCP=ON CMake option.
Reading file em.tpr, VERSION 2024.3-conda_forge (single precision)
1 GPU selected for this run.
Mapping of GPU IDs to the 1 GPU task in the 1 rank on this node:
PP:0
PP tasks will do (non-perturbed) short-ranged interactions on the GPU
PP task will update and constrain coordinates on the CPU
Using 1 MPI thread
Using 12 OpenMP threads
Steepest Descents:
Tolerance (Fmax) = 1.00000e+03
Number of steps = 50000
Step= 0, Dmax= 1.0e-02 nm, Epot= -9.63828e+04 Fmax= 1.83369e+05, atom= 252
Step= 1, Dmax= 1.0e-02 nm, Epot= -1.13756e+05 Fmax= 6.57749e+04, atom= 1367
Step= 2, Dmax= 1.2e-02 nm, Epot= -1.32426e+05 Fmax= 3.07926e+04, atom= 1367
......
writing lowest energy coordinates.
Steepest Descents converged to Fmax < 1000 in 336 steps
Potential Energy = -2.2477358e+05
Maximum force = 9.9712384e+02 on atom 736
Norm of force = 3.6471184e+01
GROMACS reminds you: "First off, I'd suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it's a great symbolic gesture." (Linus Torvalds)
计算完成后将得到四个文件
em.log:日志文件
em.edr:二进制能量文件
em.trr:二进制轨迹文件
em.gro:能量最小化结构
如何确定EM是否成功?
势能:Epot 应该是负的,对于水中的简单蛋白质应该在 \(10^5 - 10^6\) 的数量级上
最大力:Fmax 一般不大于 \(1000 kJ·mol^{-1}·nm^{-1}\)
由于本次、模拟个体系很简单,而且蛋白质来源于PDB数据库,本身是一个较合理的结构,在很快的几百步计算下就完成了优化。
简单的看一下能量变化,em.edr文件包含GROMACS在EM期间收集的所有能量项,可以使用GROMACS energy 模块分析任何.edr文件:
过程中需要键入
10 0,10表示选择Potential,0表示终止输入。将输出Epot的平均值,以及一个名为“potential.xvg”的文件。
$ gmx energy -f em.edr -o potential.xvg
:-) GROMACS - gmx energy, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx energy -f em.edr -o potential.xvg
Opened em.edr as single precision energy file
Select the terms you want from the following list by
selecting either (part of) the name or the number or a combination.
End your selection with an empty line or a zero.
-------------------------------------------------------------------
1 Bond 2 Angle 3 Proper-Dih. 4 Per.-Imp.-Dih.
5 LJ-14 6 Coulomb-14 7 LJ-(SR) 8 Coulomb-(SR)
9 Coul.-recip. 10 Potential 11 Pressure 12 Vir-XX
13 Vir-XY 14 Vir-XZ 15 Vir-YX 16 Vir-YY
17 Vir-YZ 18 Vir-ZX 19 Vir-ZY 20 Vir-ZZ
21 Pres-XX 22 Pres-XY 23 Pres-XZ 24 Pres-YX
25 Pres-YY 26 Pres-YZ 27 Pres-ZX 28 Pres-ZY
29 Pres-ZZ 30 #Surf*SurfTen 31 T-rest
$ 10 0
Last energy frame read 265 time 335.000
Statistics over 336 steps [ 0.0000 through 335.0000 ps ], 1 data sets
All statistics are over 266 points (frames)
Energy Average Err.Est. RMSD Tot-Drift
-------------------------------------------------------------------------------
Potential -213378 5900 15542.1 -39018.4 (kJ/mol)
GROMACS reminds you: "This Puke Stinks Like Beer" (LIVE)
然后使用绘图工具可以绘制能量变化曲线,例如使用xmgrace程序,命令行输入
xmgrace potential.xvg,结果图应该看起来像如下图:展示了Epot的良好稳定收敛。
图. 1.3.1 系统的总能量变化#
1.3.7. 步骤6-NVT平衡#
GROMACS工具:grompp, mdrun, energy
平衡通常分两个阶段进行:
第一阶段在NVT系综下进行。nvt平衡目的是为了稳定系统的温度
第二阶段在NPT系综下进行。npt平衡目的是为了稳定系统的压力
这里对重原子施加位置限制,grompp命令相比于前面多了一个
-r参数,表示为模拟的位置限制提供初始的参考结构。位置约束的意义在于:允许平衡蛋白质周围的溶剂,而不会对蛋白质结构施加变化。
注意这里的动力学参数文件第一行
define = -DPOSRES,开启了位置限制,用到了前面pdb2gmx生成的位置限制文件。在这里可以找到 nvt.mdp 文件。进行时长100ps的平衡
$ gmx grompp -f nvt.mdp -c em.gro -r em.gro -p 1aki.top -o nvt.tpr
:-) GROMACS - gmx grompp, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx grompp -f nvt.mdp -c em.gro -r em.gro -p 1aki.top -o nvt.tpr
......
This run will generate roughly 37 Mb of data
GROMACS reminds you: "Be less curious about people and more curious about ideas." (Marie Curie)
// 然后运行mdrun
$ gmx mdrun -deffnm nvt -v
:-) GROMACS - gmx mdrun, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx mdrun -deffnm nvt -v
......
starting mdrun 'LYSOZYME in water'
50000 steps, 100.0 ps.
step 14200: timed with pme grid 48 48 48, coulomb cutoff 1.000: 65.5 M-cycles
step 14400: timed with pme grid 42 42 42, coulomb cutoff 1.073: 61.0 M-cycles
step 14600: timed with pme grid 36 36 36, coulomb cutoff 1.252: 187.5 M-cycles
step 14800: timed with pme grid 40 40 40, coulomb cutoff 1.127: 102.2 M-cycles
step 15000: timed with pme grid 42 42 42, coulomb cutoff 1.073: 95.6 M-cycles
step 15200: timed with pme grid 44 44 44, coulomb cutoff 1.025: 81.5 M-cycles
step 15400: timed with pme grid 48 48 48, coulomb cutoff 1.000: 76.6 M-cycles
optimal pme grid 42 42 42, coulomb cutoff 1.073
step 26700, remaining wall clock time: 10 s
// 等待计算完成
Writing final coordinates.
step 50000, remaining wall clock time: 0 s
Core t (s) Wall t (s) (%)
Time: 121.848 20.310 599.9
(ns/day) (hour/ns)
Performance: 425.410 0.056
GROMACS reminds you: "Alas, You're Welcome" (Prof. Dumbledore in Potter Puppet Pals)
再次使用energy来分析温度的变化:在提示符处键入“16 0”以选择系统的温度并退出。
$ gmx energy -f nvt.edr -o temperature.xvg
:-) GROMACS - gmx energy, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx energy -f nvt.edr -o temperature.xvg
Opened nvt.edr as single precision energy file
Select the terms you want from the following list by
selecting either (part of) the name or the number or a combination.
End your selection with an empty line or a zero.
-------------------------------------------------------------------
1 Bond 2 Angle 3 Proper-Dih. 4 Per.-Imp.-Dih.
5 LJ-14 6 Coulomb-14 7 LJ-(SR) 8 Disper.-corr.
9 Coulomb-(SR) 10 Coul.-recip. 11 Position-Rest. 12 Potential
13 Kinetic-En. 14 Total-Energy 15 Conserved-En. 16 Temperature
17 Pres.-DC 18 Pressure 19 Constr.-rmsd 20 Vir-XX
21 Vir-XY 22 Vir-XZ 23 Vir-YX 24 Vir-YY
25 Vir-YZ 26 Vir-ZX 27 Vir-ZY 28 Vir-ZZ
29 Pres-XX 30 Pres-XY 31 Pres-XZ 32 Pres-YX
33 Pres-YY 34 Pres-YZ 35 Pres-ZX 36 Pres-ZY
37 Pres-ZZ 38 #Surf*SurfTen
39 T-Protein 40 T-Water_and_ions
41 Lamb-Protein 42 Lamb-Water_and_ions
$ 16 0
Last energy frame read 100 time 100.000
Statistics over 50001 steps [ 0.0000 through 100.0000 ps ], 1 data sets
All statistics are over 501 points
Energy Average Err.Est. RMSD Tot-Drift
-------------------------------------------------------------------------------
Temperature 300.134 0.1 3.27203 0.620947 (K)
GROMACS reminds you: "You are wrong!" (NOFX)
输入
xmgrace temperature.xvg查看温度变化,生成的图应如下:从图中可以看出系统的温度迅速达到目标值(300 K)附近。
图. 1.3.2 系统的温度变化#
1.3.8. 步骤7-NPT平衡#
GROMACS工具:grompp, mdrun, energy
过程几乎与nvt平衡类似,进行100ps,开启位置限制。
这里可以找到 npt.mdp 文件,注意mdp文件中的一些变化
continuation = yes表示从NVT平衡阶段继续模拟gen_vel = no表示从轨迹中读取速度
-t指定上一步生成的检查点文件.cpt,存储了上一步模拟过程中的完整状态信息。配合上述mdp文件的参数使用。这里可能会出现一个警告warning,在最新版本的GROMACS中会出现,不用管它,这是官方在推荐使用一种新的恒压方法。在命令最后添加
-maxwarn 1重新运行。
$ gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p 1aki.top -o npt.tpr
WARNING 1 [file npt.mdp]:
The Berendsen barostat does not generate any strictly correct ensemble,
and should not be used for new production simulations (in our opinion).
We recommend using the C-rescale barostat instead.
......
There was 1 WARNING
-------------------------------------------------------
Program: gmx grompp, version 2024.3-conda_forge
Source file: src/gromacs/gmxpreprocess/grompp.cpp (line 2751)
Fatal error:
Too many warnings (1).
If you are sure all warnings are harmless, use the -maxwarn option.
For more information and tips for troubleshooting, please check the GROMACS
website at https://manual.gromacs.org/current/user-guide/run-time-errors.html
-------------------------------------------------------
// warning 忽略它重新运行
$ gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p 1aki.top -o npt.tpr -maxwarn 1
......
// 然后运行mdrun
$ gmx mdrun -deffnm npt
......
再次使用energy来分析压强、密度的变化:
$ gmx energy -f npt.edr -o pressure.xvg
......
$ 18 0
......
$ gmx energy -f npt.edr -o density.xvg
......
$ 24 0
......
再次执行绘图命令
xmgrace pressure.xvg,xmgrace density.xvg,压力和密度均很快达到平衡,表明系统现在在压力和密度方面平衡良好。压力值在平衡阶段的过程中波动很大,但不用担心,压力是一个在MD模拟过程中波动很大的量。
图. 1.3.3 系统的压力变化#
图. 1.3.4 系统的密度变化#
1.3.9. 步骤8-MD模拟#
在完成两个平衡阶段后,系统现在在所需的温度和压力下充分平衡。
操作流程几乎类似npt平衡,区别在于关闭位置限制,因此不需要
-r参数;继续从上一个模拟过程开始,因此-t指定上一个过程生成的cpt文件。动力学模拟的参数文件可以在这里找到 md.mdp
同样使用
-maxwarn 1忽略压力平衡器的警告
$ gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p 1aki.top -o md0.tpr -maxwarn 1
:-) GROMACS - gmx grompp, 2024.3-conda_forge (-:
Executable: /home/zz/miniconda3/envs/md/bin.AVX2_256/gmx
Data prefix: /home/zz/miniconda3/envs/md
Working dir: /mnt/e/share/teach
Command line:
gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p 1aki.top -o md0.tpr -maxwarn 1
......
WARNING 1 [file md.mdp]:
13397 atoms are not part of any center of mass motion removal group.
This may lead to artifacts.
In most cases one should use one group for the whole system.
......
There was 1 WARNING
GROMACS reminds you: "Is That a Real Poncho ?" (F. Zappa)
// 然后执行MD模拟,花费时间因机器性能而异(几分钟~1个多小时)
$ gmx mdrun -deffnm md0 -v
:-) GROMACS - gmx mdrun, 2024.3-conda_forge (-:
......
step 499900, remaining wall clock time: 0 s
Writing final coordinates.
step 500000, remaining wall clock time: 0 s
Core t (s) Wall t (s) (%)
Time: 830.063 138.346 600.0
(ns/day) (hour/ns)
Performance: 624.522 0.038
GROMACS reminds you: "Developer accused of unreadable code refuses to comment" (Molly Struve)
恭喜你完成了一个完整的GROMACS模拟案例! 接下来是简单的分析.
1.3.10. 模拟后处理与分析#
1.3.10.1. 观察运动轨迹#
用VMD查看模拟轨迹,可以发现溶菌酶分子在水中的运动情况。
由于执行MD模拟时开启了去除蛋白质的质心平移(在 md.mdp 文件中: “comm-grps = Protein”), 溶菌酶分子几乎在盒子中心旋转而不会越出盒子边界.
图. 1.3.5 溶菌酶在水中的模拟轨迹图#
1.3.10.2. 轨迹文件处理#
GROMACS工具:trjconv
如果你的目标分子跑出盒子处于盒子边界处(这里不会因为mdp文件中设置消除了蛋白质的质心运动),需要
trjconv来调整轨迹文件位置。本案例可以不执行.
$ gmx trjconv -s md0.tpr -f md0.xtc -o md0_noPBC.xtc -pbc mol -center
// 选择1(“Protein”)作为要居中的组,选择0(“System”)作为输出。
此外
trjconv还可以转换轨迹文件格式,调整轨迹步数间隔,输出某一段的轨迹文件、输出某些步的结构文件等等。
1.3.10.3. 计算根均方偏差RMSD#
GROMACS工具:rms
蛋白质的RMSD反映了模拟的收敛性和蛋白的稳定性.
rms计算模拟过程中的结构与初始结构的RMSD,程序会对结构进行最小二乘拟合并给出RMSD随时间的变化,将得到一个xvg文件,包含了RMSD随时间的变化。-tu表示设置输出时间单位, 这里设置为 ns
// 以经过两步平衡后的结构为参考结构
$ gmx rms -s md0.tpr -f md0.xtc -o rmsd.xvg -tu ns
// 选择3(“α碳”),两次选择相同的组
$ 3 3
// 再与晶体结构作为参考结构做一下对比
$ gmx rms -s em.tpr -f md0.xtc -o rmsd_crystal.xvg -tu ns
$ 3 3
// 然后用xmgrace绘图
$ xmgrace rmsd.xvg rmsd_crystal.xvg
结果看起来像这样:
图. 1.3.6 溶菌酶Cα的 RMSD#
可以看到RMSD很快稳定到一个较小的值,并在后续保持。这说明溶菌酶分子在水中结构非常稳定波动很小,且该模拟很容易达到平衡,与该晶体结构略有细微差异。
1.3.10.4. 蛋白质回旋半径#
GROMACS工具:gyrate
蛋白质的回转半径是其致密性的衡量标准:回旋半径取决于某(些)原子质量与分子重心的关系, 因此可用于表征蛋白质结构的密实度。
如果一个蛋白质是稳定折叠态,它将可能保持相对稳定的Rg值。如果一个蛋白质展开,它的Rg会随着时间的推移而改变。
$ gmx gyrate -s md0.tpr -f md0.xtc -o gyrate.xvg
// 选择第1组(“Protein”)进行分析
$ 1
// 绘图
$ xmgrace gyrate.xvg
从平稳的Rg值看出,溶菌酶在300 K 下 1 ns 的过程中保持非常稳定的折叠状态:
图. 1.3.7 溶菌酶分子的回旋半径#
至此,我们完成了水中溶菌酶的全部模拟并进行了简单的分析!你应该熟悉了GROMACS的基本模拟流程,一些常用的GROMACS工具(模块)及其命令。