1.What features should I look for when I visualize a docked compound?
Q:The first thing to check is that the ligand fits into some kind of pocket on the receptor. The second is that there is a chemical match between the atoms in the ligand and those in the receptor. For example, check that carbon atoms in the ligand are near hydrophobic atoms in the receptor, while nitrogens and oxygens in the ligand are near complementary hydrogen bonding atoms. Check for charge complementarity. Also consider whatever else you may know about your particular system: for instance, if you know the mechanism of action of the enzyme and which residues are involved, examine how the ligand binds to these residues. In the case of HIV-1 Protease, good inhibitors bind in a mode that mimics the transition state, placing a hydroxyl group near the two catalytic aspartic acid sidechains.
2.I get very high Reference RMSD values in my DLG (docking log file); what went wrong?
Q:The "Reference RMSD" values that are printed in the "RMSD TABLE" in the DLG are computed from the coordinates of
either the input ligand (PDBQ or PDBQT) file specified by the "move" command in the DPF, if you did not include the "rmsref" command in your DPF;
or the ligand (PDBQ or PDBQT) file you specified in the "rmsref" command in the DPF.
If you do not specify the "rmsref" command, and the ligand input coordinates happen to be translated far from the receptor, you will appear to get high Reference RMSD values. _Don't Panic!_ This is normal, and you don't need to worry about these high values.
You may want to check that the input ligand coordinates are far from the crystallographically observed binding position in active site, by reading in the input ligand and the receptor in ADT.
We usually use the "rmsref" DPF-command to specify the x-ray crystallographic coordinates of a known binding mode taken from a complex PDB structure. This can be a useful way of checking if your redocking is successful, if the Reference RMSD values are less than 2-3 Å from the crystal structure position of the ligand.
3. I used "get-dockings" to extract the docked conformations. Where is the macromolecule?
Q:After you extract the docked conformations from the AutoDock log file using get-dockings, you obtain a PDB formatted file that contains the docked conformations of the ligand. These are sorted in order of increasing energy, and in accordance with the conformational clustering. This PDB file puts each docked conformation in between MODEL and ENDMDL records. This file does not contain the macromolecule coordinates which you docked to. Don't Panic! These are still in the original PDBQS file you used to generate the AutoGrid maps. In other words, the output docked coordinates of the ligand are written in the same reference frame as the original macromolecule PDBQS.
To view the dockings in relation to the macromolecule, in InsightII 2000, for example, read in the PDB file of MODELs first (with "Keep all frames" turned on). Then read in the macromolecule PDBQS file using the first docked conformation as the "Reference" structure. You can read in the PDBQS file as a PDB formatted file (it works in InsightII, except you do not see PDBQS files in the file browser: you must type in the PDBQS file name yourself.) You should now see the macromolecule and the docked conformations together.
Alternatively, you can add the macromolecule and the docked conformations together into one PDB file that contains everything. At the UNIX prompt, type this:
% get-dockings mydocking.dlg
% pdbqtopdb mymacromolecule.pdbqs >> mydocking.dlg.pdb
This will append the macromolecule (in PDB format) to the stacked MODELs in the "mydocking.dlg.pdb" file, so you can now read in this file and you should see everything together.
4.How do I evaluate AutoDock's clustering results? If I have more than one cluster after doing conformational cluster analysis, which one do I choose?
Q:Open a DLG (docking log) in a text editor and search for the word "HISTOGRAM", and if you used the "analysis" keyword in your DPF (docking parameter file), then you will find AutoDock's conformational clustering histogram. This sorts the docking results into conformationally similar bins, according to the RMSD tolerance you set using the rmstol keyword, and according to whether you used the rmsnosym command. (By default, AutoDock tries to compute the minimum RMSD by taking into consideration the symmetry in the molecule, and works well if the two conformations are very similar; using the rmsnosym command guarantees that a 1-to-1 correspondence of atoms is considered in computing the RMSD. ADT does not consider symmetry in the RMSD calculations in clustering, and uses the same algorithm as the rmsnosym command.)
NOTE: by default, AutoDock 4 uses only the ligand atoms for the cluster analysis, if you have sidechains that are flexible in the receptor . You can use the new command rmsatoms all to include all the moving atoms in the RMSD calculation; the alternative form of the command, rmsatoms ligand_only computes the RMSD for only the atoms in the ligand, although this command is not necessary since this is the default.
If you find more than one cluster, which one should you choose?
The answer depends on a number of factors: first of all, it's best to have done at least 50 runs, to get a good sampling of results to cluster (I prefer to do at least 100 dockings). Also, use an RMSD tolerance that is appropriate for the size of your ligand: larger ligands need larger rmstol values, typically at least 2 Angstroms.
The next question is, did each docking search for long enough? In other words, did the number of energy evaluations (ga_num_evals in GA and LGA dockings; cycles, accs and rejs in SA dockings) match the dimensionality of the search problem? This depends on the number of torsions in the ligand (and protein, if flexible), and how these torsions are arranged in the molecule (are they arranged linearly, or are they nested?). Ideally, if you run a docking for long enough, you should always converge on the lowest energy solution, and obtain just one cluster.
However, if you obtain two or more clusters, and the lowest-energy cluster is less populated than another cluster with higher energy, which one is the "right" answer? What happens to the clustering results if you increase the number energy evaluations? Does the size of the lowest energy cluster increase to exceed the number in the other cluster?
Which cluster you choose should also depend on a visual inspection of the binding modes, comparing how the ligand interacts with the receptor. Does one binding mode look more chemically-reasonable than the other(s)?
Also bear in mind that the if the difference in the energies between the mean energies of the two clusters is less than about 2.5 kcal/mol, this is within the standard deviation of the AutoDock force field, and it is difficult to say which one is the "correct" one.
If you have two ligands, and they bind to the same receptor, but one forms just one cluster, while the other forms more than one cluster, yet they both bind with about the same estimated binding free energy, which one will be better? This is a key question that we are currently investigating ways to quantify: stay tuned!
5.I see MAX_ATOMS defined to 2048 - which seems to be a huge number for a "small molecule" (Also hydrogen bonds are anyway eliminated reducing the actual number of atoms participating in docking..)
The ind.pdbqt (small), hsg (protein) files that I use for testing my GPU code - have a total of 73 atoms (true_ligand_atoms + flexres). Is this a typical number? Do ligands exist that have 1000s of atoms?
Q:Some people use autodock for protein-protein docking...or docking large peptide ligands...or carbohydrates...or DNA etc. Some people also try to do flexible docking with really large numbers of flexible residues. Those applications would require large MAXATOMS. However, autodock isn't really designed to do those things well and there are other specialized software packages available for many of those applications.
My suggestion is to size it to accommodate ligands up to maybe 1-2kDa and ~6-8 flexible residues. That would probably be plenty for most users
6.I have run screening studies on an enzyme using polar hydrogens identified by Autodock (4.1). Recently, I have come across information regarding a closely related enzyme, in which an important tyrosine residue is half-deprotonated at the relevant pH (i.e. it is 50% likely to be deprotonated). I'm assuming it is impossible to simulate this using Autodock, so I would like to run the screen with the deprotonated tyrosine, having already ran it with the tyrosine protonated. However, when I delete the hydrogen, there doesn't seem to be any effect on the charge of the oxygen atom. If I need to set this manually, what would be the best way to do so? What would be an appropriate value for the charge?
Q:AutoDockTools (you were referring to it, probably) can be used to handle general cases, but it hasn't been designed to work with "exotic" protonation states and their corresponding charges.
As you said, to model the partially deprotonated state you could run dockings also with the deprotonated state and compare them with the regularones.
Concerning charges, usually the extra charge resulting from the de-protonation is spread over several atoms, and to model this with a reasonable level of accuracy, you probably need at least a semi-empirical method (i.e. MOPAC) if not a full QM calculation. |