Titrations

Practical Aspects | Slow and Fast Exchange | Combining Chemical Shifts | Chemical Shift Changes | Global vs. Individual Fitting | Protein Ligand Binding Equation | Monomer-Dimer Binding Equation | Langmuir/Michaelis-Menten Model | Variable Protein Concentration | Protein-Ligand Binding with Multiple Binding Sites

When you do a titration from which you want to extract quantitative information (e.g. a binding constant) there are different ways of doing this on a practical level and also different types of system that you might be dealing with. Each of these will require different approaches and equations for fitting your data. We'll try and summarise some of these here and also provide derivations of equations which will help you understand which fitting equations you need to use when doing a titration.

The easiest way to perform a titration is simply to take an NMR tube containing your protein of interest and gradually add your ligand in increasing amounts. The problem with this approach is that as you add your ligand you are increasing the volume of your sample. The concentration of your protein will therefore change between the different titration points. This makes the equations you need to use rather complicated.

It is much better practice to make separate samples which all have the same protein concentration but varying ligand concentration. That is all very well if you have lots of protein and can make lots of samples, but what if you are limited in the amount of protein you have? In that case, it is good practice to start with two samples of equal volume containing equal concentrations of your protein and the concentration of your ligand at the two extremes of the range to be explored. You can then construct the intermediate points by successively exchanging volumes between the two samples. The concentration of the protein is thus kept constant.
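The two-sample scheme can be sketched numerically. The snippet below is only an illustration under stated assumptions: the function name `exchange`, the 10% exchange fraction and the 0 and 2000 µM starting concentrations are all made up, and it assumes that equal aliquots are removed from both tubes before being swapped so that both volumes stay the same.

```python
def exchange(conc_a, conc_b, fraction):
    """Swap the same fraction of volume between two equal-volume samples.

    Returns the new ligand concentrations (a, b). The protein concentration
    is unchanged because both samples contain the same amount of protein.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    new_a = (1.0 - fraction) * conc_a + fraction * conc_b
    new_b = (1.0 - fraction) * conc_b + fraction * conc_a
    return new_a, new_b


# Hypothetical start: 0 and 2000 µM ligand, swapping 10% of the volume each time.
a, b = 0.0, 2000.0
points = [(a, b)]
for _ in range(4):
    a, b = exchange(a, b, 0.10)
    points.append((a, b))

for low, high in points:
    print(f"{low:8.1f} µM   {high:8.1f} µM")
```

Each exchange gives you two new titration points (one from each tube), and the two concentrations converge towards the midpoint of the range.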

Another thing to remember is that you should make sure that your solvent composition is the same at each titration point. If you are working with a small molecule that isn't soluble in your protein buffer, then you need to make sure that the concentration of your small molecule solvent (e.g. DMSO) is the same at every titration point. Achieving this is again much easier if you have a separate sample at each point, or work with two samples that you mix. If you do not manage to keep the solvent composition identical across your titration, the chemical shift changes you observe could be due to the altered solvent environment rather than the addition of the ligand.

In NMR titration experiments we distinguish between the so-called fast and slow exchange regimes. If you have fast exchange, then you will see one peak which gradually moves from the free chemical shift value, δ_{free}, to the bound one, δ_{bound}, as you add your ligand. In a slow exchange system you will see two peaks, one at δ_{free} and the other at δ_{bound}, and as you titrate your ligand in, the peak at δ_{free} will gradually reduce in size while the peak at δ_{bound} increases. For a full explanation and discussion of fast and slow exchange see Section 3 of Williamson 2013. The important thing to note is that we will only be discussing fast exchange systems below.

An NMR titration is typically recorded on 2D data (usually ^{1}H-^{15}N or ^{1}H-^{13}C HSQCs/HMQCs), thus we obtain two measurements per point: the ^{1}H chemical shift and the ^{15}N or ^{13}C chemical shift. Often it is useful to combine the chemical shifts from each residue, so as to obtain a single value per residue, δ_{comb}:

$$δ_{comb} = \sqrt{\frac{1}{N} \sum_{i=1}^{N}(\alpha_{i}δ_{i})^2}$$

where N is the number of nuclei being combined and α_{i} is a weighting factor for each nucleus, i. If we are considering chemical shift differences, then this is in effect the average Euclidean distance between two peaks:

$$∆δ_{comb} = \sqrt{\frac{1}{N} \sum_{i=1}^{N}(\alpha_{i}∆δ_{i})^2}$$

If the number of nuclei being combined per residue is the same, then the 1/N factor can also be excluded.

There is no theoretically justifiable value for α. It can be chosen to reflect the different chemical shift ranges of the nuclei in question, their gyromagnetic ratios or even be amino-acid type specific (Schumann et al. 2007). Most commonly, α is set to 1 for ^{1}H and 0.14 for ^{15}N (Williamson 2013) so as to reflect the relative chemical shift ranges of the ^{1}H^{N} and ^{15}N^{H} nuclei. A more recent study suggests that these are reasonable values, though for some proteins the ^{15}N value should be slightly higher, closer to 0.2 or 0.21 (Hobbs et al. 2022). For a ^{1}H-^{15}N HSQC experiment, the combined chemical shift is thus given by

$$δ_{comb} = \sqrt{\frac{δ_{H}^2 + (0.14δ_{N})^2}{2}}$$

For ^{13}C methyl titrations you might expect an α value of around 0.2 (similarly if you were considering ^{1}H^{α}-^{13}C^{α} peaks).
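As a minimal sketch of the weighted combination above: the function name `combined_shift` and the example shift changes are made up for illustration, and the weights are the commonly used 1 (^{1}H) and 0.14 (^{15}N) mentioned in the text.

```python
import math


def combined_shift(deltas, alphas):
    """Weighted root-mean-square combination of per-nucleus shifts (ppm)."""
    if len(deltas) != len(alphas) or not deltas:
        raise ValueError("need one weighting factor per nucleus")
    total = sum((a * d) ** 2 for a, d in zip(alphas, deltas))
    return math.sqrt(total / len(deltas))


# Hypothetical shift changes for one amide peak: 0.05 ppm (1H), 0.40 ppm (15N).
d_comb = combined_shift([0.05, 0.40], [1.0, 0.14])
print(f"combined shift change: {d_comb:.4f} ppm")
```

Passing different weights (for example 0.2 for ^{15}N) lets you check how sensitive your residue ranking is to the choice of α.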

Before starting to think about fitting the data, it is useful to investigate the chemical shift perturbations. Typically, ∆δ_{comb} is calculated as

$$∆δ_{comb} = δ_{comb,bound} - δ_{comb,free}$$

where δ_{comb,free} is δ_{comb} for the free protein, and δ_{comb,bound} is δ_{comb} for the fully bound complex (ideally the chemical shift observed at your final titration point).

By plotting residue vs. ∆δ_{comb} you can see which parts of your protein undergo the most significant conformational changes upon ligand binding. CcpNmr Analysis enables you both to create the residue vs. ∆δ_{comb} chart as a bar graph and to plot the values (in the form of different colours) directly onto your protein structure, if you have one. Usually you would expect the largest chemical shift changes to cluster around the ligand binding site, but be aware that allosteric changes could induce chemical shift changes elsewhere in the protein, too.

Hobbs et al. 2022 make some interesting observations and comments about how to fit titration data, e.g. whether to fit each peak or each nucleus individually, or whether to do a simultaneous global fit to all values. There are advantages to doing both individual and global fits. Fitting each residue individually and looking at a residue vs. K_{d} graph allows you to get an initial understanding of your data. (Do not confuse such a graph with your residue vs. ∆δ_{comb} graph!) You may find that the K_{d} varies significantly between different residues. In particular, you may find some residues where the fitted K_{d} is much larger (= weaker binding) than for most residues and where the chemical shifts vary in an almost linear manner. Hobbs et al. suggest that this "may arise from very weak secondary binding, or possibly from effects of the ligand on solvent structure, sensed by the protein as small and almost linear shift changes (Bye et al. 2016). [It] may also derive from small changes in pH during the titration." Similarly, you may find that there are a few residues with very small fitted K_{d} values where the chemical shift essentially only changes upon the first addition of ligand. Neither of these phenomena represents genuine residue-specific binding. One option would be to exclude these values and average the remaining ones, but it is difficult to find non-arbitrary cut-off points for including/excluding residues in the calculation of the global K_{d}. Instead, Hobbs et al. suggest doing a global fit, i.e. fitting all nuclei separately (without combining them via a Euclidean distance) but simultaneously, with a single shared K_{d}. As yet a global fit is not implemented in CcpNmr Analysis, but we have plans to do this. A global fit reduces errors in the fit and also removes the need to exclude residues manually. (It might, however, make sense to exclude overlapped peaks from your analysis.)

Hobbs et al. make some further interesting observations about differential K_{d} values across your protein. In particular, they identify proteins where different residues appear to experience different binding strengths depending on their proximity to the ligand. In one case, they suggest, this may indicate different binding affinities for different parts of the (flexible) ligand. So do read that paper if you find that you have systematic differences in binding affinities across your protein, as they may not be coincidental.

In conclusion, it is probably worth doing both global and residue-specific fits. The global fit will give you an overall K_{d} value, but the residue-specific fits/K_{d} values might also give you further information about the binding event.
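To make the difference between the two kinds of fit concrete, here is a rough sketch. It is not how CcpNmr Analysis or Hobbs et al. do it: it uses the 1:1 fast-exchange binding equation derived later on this page, gives every nucleus its own ∆δ_{max} (solved by linear least squares for each trial K_{d}), and simply scans a log-spaced grid of K_{d} values. All names, the grid limits and the synthetic data are assumptions for illustration.

```python
import math


def fraction_bound(l_tot, p_tot, kd):
    """Fraction of protein bound for P + L <-> PL (same units throughout)."""
    s = l_tot + p_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def curve_sse(kd, l_tots, shifts, p_tot):
    """Residual for one nucleus, using its own best delta_max at this Kd."""
    f = [fraction_bound(l, p_tot, kd) for l in l_tots]
    d_max = sum(y * fi for y, fi in zip(shifts, f)) / sum(fi * fi for fi in f)
    return sum((y - d_max * fi) ** 2 for y, fi in zip(shifts, f))


def fit_kd(curves, l_tots, p_tot, lo=1e-3, hi=1e5, steps=400):
    """Kd (on a log-spaced grid) minimising the summed residual of all curves."""
    def total(kd):
        return sum(curve_sse(kd, l_tots, y, p_tot) for y in curves)
    grid = [lo * (hi / lo) ** (i / steps) for i in range(steps + 1)]
    return min(grid, key=total)


# Synthetic example (µM): three nuclei share Kd = 50 but differ in delta_max.
p_tot, true_kd = 100.0, 50.0
l_tots = [25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
# (delta_max in ppm, small made-up systematic error in ppm)
nuclei = [(0.30, 0.004), (0.12, -0.003), (-0.80, 0.010)]
curves = [
    [d * fraction_bound(l, p_tot, true_kd) + off * (i % 2)
     for i, l in enumerate(l_tots)]
    for d, off in nuclei
]

for n, curve in enumerate(curves):
    print(f"nucleus {n}: individual Kd = {fit_kd([curve], l_tots, p_tot):.1f} µM")
print(f"global Kd = {fit_kd(curves, l_tots, p_tot):.1f} µM")
```

Passing a single curve gives an individual fit; passing all curves gives the shared-K_{d} global fit. In real work you would use a proper optimiser and estimate uncertainties rather than scanning a grid.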

The protein-ligand binding equation derived here is for the reaction P + L ⇄ PL, where P is the protein and L the ligand. Throughout the titration the total concentration of P stays constant and the total concentration of L is varied. [P] is the concentration of P in the unbound state, [L] the concentration of L in the unbound state and [PL] the concentration of the protein-ligand complex. The total concentrations [P]_{T} and [L]_{T} for P and L, respectively, are thus given by

$$\begin{align}[P]_{T} &= [P] + [PL] \\ [L]_{T} &= [L] + [PL] \end{align}$$

The equilibrium dissociation constant is

$$K_{d} = \frac{[P][L]}{[PL]}$$

Therefore

$$\begin{align} [P] &= \frac{K_{d}[PL]}{[L]} \\ &= \frac{K_{d}([P]_{T} − [P])}{[L]} \\ &= \frac{K_{d}([P]_{T} − [P])}{[L]_{T}−[PL]} \\ &= \frac{K_{d}([P]_{T} − [P])}{[L]_{T}−[P]_{T}+[P]}\end{align}$$

This becomes

$$[P]([L]_{T} − [P]_{T} + [P]) = K_{d}([P]_{T} − [P])$$

Multiplying this out, we get

$$[P][L]_{T} − [P][P]_{T} + [P]^2 = K_{d}[P]_{T} − K_{d}[P]$$

Collecting all the terms on one side this gives us a quadratic equation in [P]:

$$[P]^2 +([L]_{T}−[P]_{T}+K_{d})[P]−[P]_{T}K_{d} =0$$

Therefore

$$\begin{align} [P] &= \frac{−([L]_{T}−[P]_{T}+ K_{d})+\sqrt{([L]_{T}−[P]_{T}+K_{d})^2+4[P]_{T}K_{d}}}{2} \\ &= \frac{[P]_{T}-[L]_{T}- K_{d}+\sqrt{([L]_{T}−[P]_{T}+K_{d})^2+4[P]_{T}K_{d}}}{2} \\ &= \frac{[P]_{T}-[L]_{T}- K_{d}+\sqrt{([L]_{T}+[P]_{T}+K_{d})^2-4[P]_{T}[L]_{T}}}{2} \end{align}$$

Note that we cannot have the other root because it would give a negative [P].

Dividing both sides by -[P]_{T} gives us

$$-\frac{[P]}{[P]_{T}} = -\frac{1}{2}+\frac{[L]_{T}+K_{d}-\sqrt{([L]_{T}+[P]_{T}+K_{d})^2-4[P]_{T}[L]_{T}}}{2[P]_{T}}$$

After adding 1 to both sides we then have

$$\frac{[P]_{T}-[P]}{[P]_{T}} = \frac{[L]_{T}+[P]_{T}+K_{d}-\sqrt{([L]_{T}+[P]_{T}+K_{d})^2-4[P]_{T}[L]_{T}}}{2[P]_{T}}$$

The fraction of bound P is

$$f_{bound} = \frac{[PL]}{[P]_{T}} = \frac{[P]_{T} − [P]}{[P]_{T}}$$

and the fraction of unbound P is

$$f_{free} =1−f_{bound}$$

Let δ_{bound} be the chemical shift of the bound P and δ_{free} be the chemical shift of the unbound P.

For fast exchange the observed chemical shift is

$$\begin{align} δ_{obs} &= f_{bound}δ_{bound} + f_{free}δ_{free} \\ &= f_{bound}δ_{bound} + (1 − f_{bound})δ_{free} \\ &= f_{bound}δ_{bound} - f_{bound}δ_{free} + δ_{free} \\ &= δ_{free} + (δ_{bound} − δ_{free})f_{bound} \\ &= δ_{free} + (δ_{bound} − δ_{free})\left( \frac{[P]_{T}-[P]}{[P]_{T}}\right) \\&= δ_{free} + (δ_{bound} − δ_{free})\left( \frac{[L]_{T}+[P]_{T}+K_{d}-\sqrt{([L]_{T}+[P]_{T}+K_{d})^2-4[P]_{T}[L]_{T}}}{2[P]_{T}} \right)\end{align}$$

We normally know what δ_{free} is and so we introduce

$$∆δ_{obs} = δ_{obs} − δ_{free}$$

and

$$∆δ_{max} = δ_{bound} − δ_{free}$$

therefore

$$∆δ_{obs} = ∆δ_{max}\left( \frac{[L]_{T}+[P]_{T}+K_{d}-\sqrt{([L]_{T}+[P]_{T}+K_{d})^2-4[P]_{T}[L]_{T}}}{2[P]_{T}} \right)$$

This is equation (6) in Williamson 2013 and the derivation above is the "little [bit of] algebra" mentioned.
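If you want to convince yourself of the result, the quadratic binding equation (equation (6) in Williamson 2013) is easy to evaluate numerically. The sketch below uses made-up numbers (100 µM protein, K_{d} = 50 µM, ∆δ_{max} = 0.30 ppm) and hypothetical function names; it also back-calculates K_{d} from the resulting free and bound concentrations as a sanity check.

```python
import math


def fraction_bound(l_tot, p_tot, kd):
    """Fraction of protein bound for P + L <-> PL (same units throughout)."""
    s = l_tot + p_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def delta_obs(l_tot, p_tot, kd, delta_max):
    """Fast-exchange shift change relative to the free protein."""
    return delta_max * fraction_bound(l_tot, p_tot, kd)


# Hypothetical numbers: 100 µM protein, Kd = 50 µM, delta_max = 0.30 ppm.
p_tot, kd, delta_max = 100.0, 50.0, 0.30
for l_tot in (0.0, 50.0, 100.0, 200.0, 400.0, 800.0):
    f = fraction_bound(l_tot, p_tot, kd)
    pl = f * p_tot
    # Mass-action check: [P][L]/[PL] should reproduce Kd.
    kd_check = (p_tot - pl) * (l_tot - pl) / pl if pl > 0 else float("nan")
    dd = delta_obs(l_tot, p_tot, kd, delta_max)
    print(f"[L]T={l_tot:6.1f}  f_bound={f:.3f}  dd={dd:.3f} ppm  Kd={kd_check:.1f}")
```

Note that with [L]_{T} = [P]_{T} = 100 µM and K_{d} = 50 µM exactly half of the protein is bound, which would not be the case if [L]_{T} were simply equated with the free ligand concentration.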

We know [P]_{T} and [L]_{T}. Sometimes we know δ_{bound}, in which case the only thing left to fit is K_{d}. Sometimes we do not know δ_{bound}, in which case we have to fit that as well (or, equivalently, ∆δ_{max}).
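A two-parameter fit of K_{d} and ∆δ_{max} to a single peak can be sketched with nothing but the standard library. This is a substitute technique chosen to keep the example self-contained, not the algorithm used by CcpNmr Analysis: because the model is linear in ∆δ_{max}, that parameter is solved in closed form for each trial K_{d}, and K_{d} itself is found by a log-spaced scan followed by a golden-section search. Function names, search limits and the synthetic data are assumptions.

```python
import math


def fraction_bound(l_tot, p_tot, kd):
    s = l_tot + p_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def best_delta_max(kd, l_tots, shifts, p_tot):
    """Least-squares delta_max for a fixed Kd (the model is linear in it)."""
    f = [fraction_bound(l, p_tot, kd) for l in l_tots]
    return sum(y * fi for y, fi in zip(shifts, f)) / sum(fi * fi for fi in f)


def sse(kd, l_tots, shifts, p_tot):
    d_max = best_delta_max(kd, l_tots, shifts, p_tot)
    return sum((y - d_max * fraction_bound(l, p_tot, kd)) ** 2
               for l, y in zip(l_tots, shifts))


def fit_kd(l_tots, shifts, p_tot, lo=1e-3, hi=1e5, steps=400):
    """Coarse log-spaced scan of Kd followed by golden-section refinement."""
    grid = [lo * (hi / lo) ** (i / steps) for i in range(steps + 1)]
    errs = [sse(k, l_tots, shifts, p_tot) for k in grid]
    i = min(range(len(grid)), key=errs.__getitem__)
    a = math.log(grid[max(i - 1, 0)])
    b = math.log(grid[min(i + 1, steps)])
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(math.exp(c), l_tots, shifts, p_tot) < sse(math.exp(d), l_tots, shifts, p_tot):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    kd = math.exp((a + b) / 2.0)
    return kd, best_delta_max(kd, l_tots, shifts, p_tot)


# Synthetic, noise-free data (µM and ppm): Kd = 50, delta_max = 0.30.
p_tot = 100.0
l_tots = [0.0, 25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
shifts = [0.30 * fraction_bound(l, p_tot, 50.0) for l in l_tots]
kd_fit, d_max_fit = fit_kd(l_tots, shifts, p_tot)
print(f"Kd = {kd_fit:.2f} µM, delta_max = {d_max_fit:.3f} ppm")
```

If δ_{bound} is known, you would skip `best_delta_max` and fix ∆δ_{max} instead, leaving K_{d} as the only fitted parameter.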

Note that in Version 2 of Analysis, this equation was further modified by setting

$$\begin{align}A &= \frac{∆δ_{max}}{2} \\ B &= 1 + \frac{K_{d}}{[P]_{T}} \\ x &= \frac{[L]_{T}}{[P]_{T}} \\ y &= ∆δ_{obs} \end{align}$$

Then the equation above becomes

$$y=A \left(B+x−\sqrt{(B+x)^2−4x}\right)$$

which is the equation used in CcpNmr Analysis Version 2. We measure chemical shifts at various values of [L]_{T}. We fit both A and B.

Note that

$$K_{d} =[P]_{T}(B−1)$$

which means (as mentioned above) that this fitting only makes sense if the total concentration of the protein [P]_{T} is kept constant throughout the experiments. If it varies, then y has two independent variables (in effect, [P]_{T} and [L]_{T}). This would require the variable protein concentration fitting described below.
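As a quick check of the A/B parametrisation, the following sketch evaluates the Version 2 form and converts a fitted B back into K_{d}. The function names are hypothetical and the numbers (100 µM protein, K_{d} = 50 µM, ∆δ_{max} = 0.30 ppm) are made up.

```python
import math


def analysis_v2_form(x, a, b):
    """y = A(B + x - sqrt((B + x)^2 - 4x)) with x = [L]_T / [P]_T."""
    return a * (b + x - math.sqrt((b + x) ** 2 - 4.0 * x))


def kd_from_b(b, p_tot):
    """Recover Kd from the fitted B; only valid for constant [P]_T."""
    return p_tot * (b - 1.0)


# Hypothetical values: [P]_T = 100 µM, Kd = 50 µM, delta_max = 0.30 ppm.
p_tot, kd, delta_max = 100.0, 50.0, 0.30
a = delta_max / 2.0
b = 1.0 + kd / p_tot

for l_tot in (50.0, 100.0, 400.0):
    y = analysis_v2_form(l_tot / p_tot, a, b)
    print(f"[L]T = {l_tot:5.1f} µM  ->  y = {y:.4f} ppm")
print(f"Kd recovered from B: {kd_from_b(b, p_tot):.1f} µM")
```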

The monomer-dimer binding equation derived here is for the reaction A + A ⇄ AA, where A is a protein that dimerises. [A] is the concentration of the monomer and [AA] the concentration of the dimer. The total concentration [A]_{T} of A is thus given by

$$[A]_{T} = [A] + 2[AA] $$

The equilibrium dissociation constant is

$$K_{d} = \frac{[A]^2}{[AA]} $$

The ratio, r, of the dimer concentration to the total protein concentration is

$$r = \frac{[AA]}{[A]_{T}} $$

Therefore

$$[AA] = r[A]_{T} $$

Thus

$$[A] = [A]_{T} - 2[AA] = (1-2r)[A]_{T} $$

Note that 0 <= r <= 1/2. Therefore we have

$$K_{d} = \frac{(1-2r)^2}{r}[A]_{T} $$

Therefore

$$4r^2 - 4r + 1 =(1-2r)^2 = \frac{K_{d}}{[A]_{T}}r $$

Turning this into a quadratic equation in r:

$$r^2 - \left(1+\frac{K_{d}}{4[A]_{T}}\right)r + \frac{1}{4} = 0 $$

Therefore

$$\begin{align} r &= \frac{1}{2} \left( 1 + \frac{K_{d}}{4[A]_{T}} - \sqrt{\left(1+\frac{K_{d}}{4[A]_{T}}\right)^2 - 1}\right) \\ &= \frac{1}{8[A]_{T}} \left(K_{d}+4[A]_{T}- \sqrt{(K_{d}+4[A]_{T})^2 - 16[A]_{T}^2}\right) \end{align}$$

The positive root cannot be taken because that would make r > 1/2.

Let δ_{A} be the chemical shift of the monomer and δ_{AA} be the chemical shift of the dimer and define

$$∆δ_{max} = δ_{AA} - δ_{A} $$

In general, we do not know either δ_{A} or δ_{AA}.

For fast exchange the observed chemical shift is

$$\begin{align} δ_{obs} &= \frac{[A]δ_{A}+2[AA]δ_{AA}}{[A]_{T}} \\ &=(1-2r)δ_{A}+2rδ_{AA} \\ &= δ_{A} + 2r(δ_{AA}-δ_{A}) \\ &=δ_{A}+2r∆δ_{max} \\ &= δ_{A} + \frac{∆δ_{max}}{4[A]_{T}} \left(K_{d}+ 4[A]_{T}- \sqrt{(K_{d}+4[A]_{T})^2 - 16[A]_{T}^2}\right) \end{align}$$

There is no point subtracting anything here since, in general, we do not know either δ_{A} or δ_{AA}.

Again, in CcpNmr Analysis Version 2 this was modified by setting

$$\begin{align} A &= ∆δ_{max} \\ B &= K_{d} \\ C &= δ_{A} \\ x &= [A]_{T} \\ y &= δ_{obs} \end{align}$$

Then the equation above becomes

$$\begin{align} y &=A \frac{\left( B+4x−\sqrt{(B+4x)^2−16x^2} \right)}{4x} +C \\ &= A \left( 1+\frac{B}{4x}-\sqrt{\left(1+\frac{B}{4x} \right)^2 - 1} \right) + C \end{align}$$

which was the equation provided in CcpNmr Analysis Version 2.
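The monomer-dimer expressions can be checked numerically in the same way. In this sketch the function names and all numbers (K_{d} = 100 µM, δ_{A} = 8.20 ppm, ∆δ_{max} = 0.25 ppm) are invented for illustration; the loop back-calculates K_{d} from [A] and [AA] to confirm that the chosen root satisfies the equilibrium.

```python
import math


def dimer_ratio(a_tot, kd):
    """r = [AA]/[A]_T for A + A <-> AA; always between 0 and 1/2."""
    b = 1.0 + kd / (4.0 * a_tot)
    return 0.5 * (b - math.sqrt(b * b - 1.0))


def delta_obs_dimer(a_tot, kd, delta_a, delta_max):
    """Fast-exchange observed shift for a monomer-dimer equilibrium."""
    return delta_a + 2.0 * dimer_ratio(a_tot, kd) * delta_max


# Hypothetical values: Kd = 100 µM, monomer shift 8.20 ppm, delta_max = 0.25 ppm.
kd, delta_a, delta_max = 100.0, 8.20, 0.25
for a_tot in (10.0, 100.0, 1000.0):
    r = dimer_ratio(a_tot, kd)
    dimer = r * a_tot
    monomer = a_tot - 2.0 * dimer
    kd_check = monomer ** 2 / dimer  # should reproduce Kd
    shift = delta_obs_dimer(a_tot, kd, delta_a, delta_max)
    print(f"[A]T={a_tot:7.1f}  r={r:.3f}  d_obs={shift:.3f} ppm  Kd={kd_check:.1f}")
```

When [A]_{T} equals K_{d}, r is exactly 1/4, i.e. half of the protein chains are in the dimer.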

Many will be familiar with the Langmuir/Michaelis-Menten approximation typically used in the context of enzyme kinetics. Here the ligand is present in large excess, thus

$$[L] \gg [PL] $$

Therefore we can make the following approximation

$$[L] \approx [L]_{T} $$

The algebra above can therefore be simplified in the following way:

$$\begin{align} [P] &= \frac{K_{d}[PL]}{[L]} \\ &= \frac{K_{d}([P]_{T}-[P])}{[L]} \\ &\approx \frac{K_{d}([P]_{T}-[P])}{[L]_{T}} \end{align}$$

Therefore

$$[P][L]_{T} = K_{d}[P]_{T} - K_{d}[P]$$

and

$$[P]([L]_{T}+K_{d}) = K_{d}[P]_{T}$$

giving us

$$[P] = \frac{K_{d}[P]_{T}}{[L]_{T}+K_{d}} $$

which is a considerable simplification compared to the quadratic equation we obtain when we do not use the Langmuir/Michaelis-Menten approximation. Now let's rearrange to have everything in terms of ([P]_{T} - [P])/[P]_{T}:

$$-\frac{[P]}{[P]_{T}} = -\frac{K_{d}}{[L]_{T}+K_{d}}$$

Adding 1 to each side gives

$$\frac{[P]_{T}}{[P]_{T}} - \frac{[P]}{[P]_{T}} = \frac{[L]_{T}+K_{d}}{[L]_{T}+K_{d}} - \frac{K_{d}}{[L]_{T}+K_{d}} $$

And finally

$$\frac{[P]_{T}-[P]}{[P]_{T}} = \frac{[L]_{T}}{[L]_{T} + K_{d}}$$

Using the definitions above we can change this to

$$∆δ_{obs} = ∆δ_{max} \left(\frac{[L]_{T}}{[L]_{T} + K_{d}}\right)$$

If we set

$$\begin{align} A &= ∆δ_{max} \\ x &= [L]_{T} \\ y &= ∆δ_{obs} \end{align}$$

Then the equation above becomes

$$ y =A \left(\frac{x}{x + K_{d}}\right) $$

which is the familiar Langmuir/Michaelis-Menten equation also provided in CcpNmr Analysis.

Please note that in an NMR titration you are unlikely to have the situation where [L] ≫ [PL], and thus you would not generally expect to use this equation (if you do and the approximation does not hold, then you will get an incorrect value for your K_{d}).
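To see how badly the approximation can fail, you can compare the bound fraction from the Langmuir form with the full quadratic expression. The numbers below are made up (K_{d} = 50 µM, with either 1 µM or 100 µM protein) and the function names are hypothetical.

```python
import math


def f_quadratic(l_tot, p_tot, kd):
    """Exact bound fraction for P + L <-> PL."""
    s = l_tot + p_tot + kd
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def f_langmuir(l_tot, kd):
    """Bound fraction assuming free ligand equals total ligand."""
    return l_tot / (l_tot + kd)


kd = 50.0  # µM, hypothetical
for p_tot in (1.0, 100.0):
    print(f"[P]T = {p_tot} µM")
    for l_tot in (50.0, 200.0, 800.0):
        exact = f_quadratic(l_tot, p_tot, kd)
        approx = f_langmuir(l_tot, kd)
        print(f"  [L]T={l_tot:6.1f}  exact={exact:.3f}  Langmuir={approx:.3f}")
```

At protein concentrations well below K_{d} the two agree closely; at typical NMR concentrations (comparable to or above K_{d}) the Langmuir form overestimates the bound fraction, particularly at the early titration points.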

If we simply add the ligand step-wise to a single sample of protein, the volume will gradually increase, leading to a reduction of the total protein concentration, [P]_{T}, with each titration point. To account for this, we can express the volume in terms of the ligand concentration:

$$v_{t} = v_{0} \times \frac{[L]_{S}}{[L]_{S}-[L]_{T}} $$

where [L]_{S} is the ligand stock concentration. This can be rearranged as follows:

$$\frac{v_{0}}{v_{t}} = \frac{[L]_{S}-[L]_{T}}{[L]_{S}} = 1- \frac{[L]_{T}}{[L]_{S}}$$

The total protein concentration, [P]_{T} therefore becomes

$$[P]_T = [P]_{0} \times \frac{v_{0}}{v_{t}} = [P]_{0}\left(1-\frac{[L]_{T}}{[L]_{S}} \right)$$

where [P]_{0} is the initial protein concentration. Substituting this into the protein-ligand binding equation for constant volume derived above gives us:

$$∆δ_{obs} = ∆δ_{max}\left( \frac{[L]_{T}+[P]_{0}(1-\frac{[L]_{T}}{[L]_{S}})+K_{d}-\sqrt{([L]_{T}+[P]_{0}(1-\frac{[L]_{T}}{[L]_{S}})+K_{d})^2-4[P]_{0}(1-\frac{[L]_{T}}{[L]_{S}})[L]_{T}}}{2[P]_{0}(1-\frac{[L]_{T}}{[L]_{S}})} \right)$$
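The dilution-corrected model can be written as a small function. This is only a sketch with hypothetical names and numbers (100 µM protein in 500 µL, a 10 mM ligand stock, K_{d} = 50 µM, ∆δ_{max} = 0.30 ppm); the loop derives [L]_{T} from the added stock volume and confirms that the dilution formula gives the same [P]_{T} as the volume ratio.

```python
import math


def p_tot_diluted(l_tot, p0, l_stock):
    """Total protein concentration after dilution by the added ligand stock."""
    return p0 * (1.0 - l_tot / l_stock)


def delta_obs_variable(l_tot, p0, l_stock, kd, delta_max):
    """Quadratic binding equation with [P]_T corrected for the volume change."""
    p_tot = p_tot_diluted(l_tot, p0, l_stock)
    s = l_tot + p_tot + kd
    return delta_max * (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


# Hypothetical titration: 100 µM protein in 500 µL, 10 mM ligand stock.
p0, l_stock, v0 = 100.0, 10000.0, 500.0  # µM, µM, µL
kd, delta_max = 50.0, 0.30               # µM, ppm
for v_add in (0.0, 5.0, 10.0, 25.0, 50.0):
    l_tot = l_stock * v_add / (v0 + v_add)
    p_from_volume = p0 * v0 / (v0 + v_add)  # direct dilution, for comparison
    p_tot = p_tot_diluted(l_tot, p0, l_stock)
    dd = delta_obs_variable(l_tot, p0, l_stock, kd, delta_max)
    print(f"+{v_add:4.1f} µL  [L]T={l_tot:7.1f}  [P]T={p_tot:6.2f} "
          f"({p_from_volume:6.2f})  dd={dd:.3f} ppm")
```

The more concentrated your ligand stock, the smaller the correction, which is another reason to keep added volumes small if you titrate into a single tube.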

*Not yet implemented.*