Explicit Membrane Protein Simulations in NAMD/VMD
( By Richard J. Law – Aug. 2004)
This workshop is intended as a quick (<90mins I hope!) passage through the set-up of a membrane protein simulation. For detailed instructions on the various aspects of MD simulation and set-up in NAMD got to the NAMD manual. Similar for use of VMD.
We are choosing to use NAMD/VMD for several reasons:
a) VMD provides a graphical interface for many steps of the process.
b) Many of the steps are automated to ease the process.
c) NAMD is by far the fastest MD code on large parallel machines (with PME electrostatics) which is what we need for such large systems (often > 150,000 atoms for some membrane proteins.)
But no package is perfect and processing either manually or by your own scripts may be required to overcome the problems particular to any system that you’re working with. (That should not be necessary in this exercise)
In many ways, Gromacs is an easier package to use as more of the steps are automated, and there is also the best post-processing and analysis tools of any MD package. It is at least as fast as NAMD on a small number of processors, and a large number with cut-off electrostatics. For comparison, here are some suggestions for an intermediate type system (helix in an octane slab) set-up in Gromacs.
1. Getting the protein
For a list of all the available membrane protein structures (not homology or fold-recognition models) go to Steve White’s collection. Generally speaking, protein structures can be downloaded from the PDB.
For this exercise, we’ll set-up a simple system with just a transmembrane helix dimer as this will avoid some of the complications - which I’ll allude to as we go.
I have already downloaded a protein helix for use in our membrane set-up. It is the NMR structure of the transmembrane fragment of glycophorin A (1AFO at the PDB).
Glycophorin A Ensemble
Out of the full pdb file I have cut out the pdb we are going to use. USE THIS -> 1afo_model4.pdb
When you have an ensemble of structures from a method like NMR, homology modeling, or any other form of modeling, assessing which one to use may involve a scoring function specific to that method, it may be that one structure suits your needs for a very specific reason, but if those arguments do not apply then the best way is to check the quality of the structure. The Biotech site at EBI enables protein quality checking by online submission to run programs such as Procheck, Whatcheck, and Prove on your protein structures.
The other thing you could do is use a program like nmrclust which basically finds the average structure from an ensemble. In this case, I just one where the termini didn’t bend back into the membrane!
Also at this stage, we might want to use an electrostatics (PB) calculation package such as Delphi or UHBD, to get detailed pka information about ionisable residues in the structure. (Andy McCammon talked about this last week.) With membrane proteins this situation is complicated by the need for more than external dielectric around the protein. Either way, we won’t go into this step at the moment!
2. Making a membrane
First we will make a membrane in which to insert protein. Open VMD. We will need to activate the membrane making package. In the console window, type:
package require membrane
Should say: “1.0”.
(Will need vmd1.8.?/lib/vmd/plugins/noarch/tcl/membrane1.0/pkgIndex.tcl to be included in the VMD source file.)
To see the options in Membrane1.0, just type:
We’ll make a 50Å x 50Å bilayer which will give us ~60 POPC lipid molecules – 30 in each leaflet. Do:
membrane –l popc –x 50 –y 50
This creates two files – membrane.pdb and membrane.psf. You can see the pdb structure in the display window. Notice that VMD very kindly adds some waters around the headgroups of the lipids (1019 of them, in this case!). In case you didn’t know already, a psf file is a structure file that defines the atom connectivity and segment set-up for NAMD. (It’s pretty much the same as a psf in Charmm.) A “segment” is usually a protein chain, or a particular group of molecules. But often segments are more split up than that – partially for convenience, and also because, in the case of water segments, no segment can contain more than 9999 molecules/residues. (Biological simulations are always mostly water!) We’ll return to this when we run psfgen later!
3. Insert protein into membrane
There are lots of ways to do this and most of them are more complicated than what we’re going to do today – although not necessarily better! We’re just going to use the graphical interface of VMD to position the protein in the membrane.
Load the glycophorin into VMD. (File > New molecule, etc.)
Use the graphical interface to display the protein in “New ribbons”, colored by “structure”. It might also help to show some aromatic residues (in case TYR and PHE) in VDW to help align the protein.
With the membrane and the protein displayed as you want them, in the main menu, do:
Mouse > Move > Molecule
You can now move the the protein separately from the membrane, with the mouse. Hold down the shift-key to rotate them separately. Align the protein with the centre of the membrane. It would help you a lot later if you don’t move the membrane – just the protein! (It’s okay to rotate the view – that won’t get saved.)
We now need to save the new coordinates of the protein and membrane. Select the protein molecule in the VMD Main window. Then do:
File > Save coordinates
The “Save trajectory” window pops up. In the “Selected Molecule” menu, select the protein. Click file type “pdb” in the menu below. Now click “save”. Put in a filename – something like protein_moved.pdb
Mine is 1afo_model4_moved.pdb
Close VMD when you’re done.
4. Make a psf for the protein
For every set of coordinates, we need a structure file to describe the atoms and connectivities for a certain forcefield. Here we’ll be using the Charmm27 forcefield. You can look at it here if you really want: top_all27_prot_lipid.inp
But we need the protein in a separate pdb file for each chain. As the chains have chain identifiers “A” and “B”, the easiest thing to do is this (make sure you include a space either side of the letter!):
grep “ A “ protein_moved.pdb > chainA.pdb
grep “ B “ protein_moved.pdb > chainB.pdb
We are now going to use a program called psfgen (part of NAMD) to create the psf.
The input script is here (glycophorin_psfgen.inp). Again it’s pretty obvious what it’s doing. We’re going to add acetone patches to the termini since this is a protein fragment – this is done like adding a residue as the new first and last residue, inside the segment routine. Notice the “alias” commands. These are because charmm needs histidine residues to be called HSE rather than HIS, and atom CD1 in isoleucine needs to be called CD! The guesscoord command allows the patch residue atom coordinates to be added, as well as any missing hydrogens.
To run the script, do:
psfgen < glycophorin_psfgen.inp > glycophorin_psfgen.log
We now have a new set of coordinates and a psf file for the protein:
protein_psfgen.pdb and protein_only.psf
4. Combining the membrane and the protein.
The membrane and protein are in the same position but we need to combine them so that the lipid and water molecules that are overlapping with the protein are deleted. (We’re not going to do anything more complicated than this today with initially packing the lipid around the protein.)
We’re going to use a tcl script and run it under VMD in text mode.
The tcl script is here (combine.tcl). Take a look at it. As it uses segment names and simple languages from VMD, it’s easy to see roughly what it’s going to do.
vmd –dispdev text < combine.tcl > combine.log
We now have our combined protein-membrane system:
protein_and_membrane.pdb and protein_and_membrane.psf
Take a look at it in VMD. Should look a bit like this (if you displayed it like this!):
Glycophorin TM fragment inserted into the membrane!
5. Add some more water
The protein sticks out from the membrane/water slightly so we need to add some more bulk water. Often large extra-membranous domains require solvation. The other reason that bulk solvation is required is that we need to shield the protein from interaction with itself in the periodic system – otherwise we’ll be simulating a crystal – which isn’t really what we want to do.
NB – Proteins, even in crystals, are highly solvated – that is they contain a lot of water within the boundaries of the structure. Today we’re just going to do simple, geometric solvation but narrow cavities in proteins, especially binding sites and transmembrane pores may require a more rigorous approach – e.g. the use of a monte-carlo style simulation of water addition to find the optimum solvation of these areas such as MMC or Solvate.
We can do a geometric solvation of the membrane-protein system in VMD.
In the console window, type:
package require solvate
It should say “1.2”.
If you type “solvate” you can see the options. There is more than one way to use this plug-in and get the right result. Feel free to play. The exact dimensions of your system may vary from mine slightly so either just use my system or try to work out the measurements for yours (I suggest using “Mouse > query” to measure the system).
The easiest thing to do is to just add 5Å of water in each z direction (the axis of the membrane – unless you moved it!).
To do this, with an exclusion radii of 3.3Å:
solvate protein_and_membrane.psf protein_and_membrane.pdb –b 3.3 –z 5 +z 5 –o solvated
This should produce the solvated psf and pdb: solvated.psf and solvated.pdb
VMD won’t update the structure in the display if you opened the protein_and_membrane.pdb to work on it. Undisplay it and load in the new solvated.pdb file to look at it.
You’ll notice that the system is not perfect. You may be able to set it up better by trial and error at each stage (insertion, combining, solvation) but however well you get it, a period of equilibration (after minimization) will be required to fix these packing imperfections. Most of the waters in the bilayer will be expelled at this time.
Don’t close the solvated system from VMD. We’re going to add some ions to it!
8. Add counter-ions
We need to add counter-ions to balance the charge on the system, especially when using PME in the electrostatics calculation (Darden, 1998). Several programs, such as grid and gen_ion (in Gromacs) although you to calculate the most energetically favorable position for the ion – but this isn’t really what you want to do, unless you’re adding ions to an ion channel pore.
The following script can be run in VMD using the tcl plugin. You should have solvated.pdb displayed. Using File > Load Data into molecule, read in the solvated.psf file.
Now do: Extensions > tkcon
This opens up the tcl/tk console. From here we can run the tcl script. (This is the long winded, visual way – rather than just doing it from the command line with text VMD.)
In the tcl console, type:
(don’t worry about the error message – it just tells you how to use the script.)
addions ions_added.psf ions_added.pdb
And it should create those files with ions added at random positions, at least 5Å from the lipid and the protein. It should add 4 chloride ions (CLA) (to neutralize the +4 charge), which you can see if you load the ion_added.pdb into VMD (and display them in vdw).
You may also want to add ions to simulate a particular ionic solution. I haven’t done that here but basically the process is the same – replace random waters with the right number of ions.
Ok. You’ve basically set-up a membrane-protein system! Well done! Although, it is not ready for a production MD run until we have minimized and equilibrated the system. We don’t though have time to actually run these in the session today. (And due to bad planning on my part – I didn’t have time to run it either!) But if we did, an input script for minimization of the system in NAMD would look something like this – glycoph_min.in
This is perhaps the most important phase of the set-up. If you don’t do this bit right (for a real protein-membrane simulation where you care about the answer!) then your results will be distorted by packing imperfections in the system.
Basically what we want to do is pack the lipid and water around the protein while holding the protein rigid. The water would pack very quickly but lipid can take much longer. Usually I would suggest around 500ps of simulation time, with the protein positions restrained (see NAMD manual) and monitoring the lipid density is a good way to tell if the system is better equilibrated yet or not.
Here is an example of an equilibration (or “restrained”) run input file: restrained_md_example.in
8. Production run
You’re ready to rock! Here’s an example of a production MD input file: md.in