## Abstract

In the last two decades, single-molecule force measurements using optical and magnetic tweezers and atomic force spectroscopy have dramatically expanded our knowledge of nucleic acids and proteins. These techniques characterize the force on a biomolecule required to produce a given molecular extension. When stretching long DNA molecules, the observed force–extension relationship exhibits a characteristic plateau at approximately 65 pN where the DNA may be extended to almost twice its B-DNA length with almost no increase in force. In the present review, I describe this transition in terms of the Poland–Scheraga model and summarize recent related studies.

- DNA stretching
- helix–coil transition
- mechanical properties of DNA
- single-molecule biophysics

## Introduction

Under physiological conditions *in vitro*, the thermodynamically stable configuration of DNA is the Watson–Crick double helix (however, *in vivo* DNA is supercoiled and constrained by proteins such as nucleoid-associated proteins in prokaryotes and histones in eukaryotes). In this configuration, the nucleotides A, T, G and C of each strand pair with those of the complementary strand according to the key–lock principle, such that only the base pairs AT and GC can form [1]. Since hydrogen bonds contribute only slightly to helix stability, the major stabilizing contribution comes from stacking of base pairs [2]. However, important biological processes require the unzipping (denaturation) of a specific region of base pairs [3,4]. Examples include the docking of single-stranded-DNA-binding proteins to DNA, such as DNA replication via DNA helicase and polymerase and transcription via RNA polymerase [1,3]. Local unzipping and subsequent rejoining of base pairs occur spontaneously *in vivo* due to thermal fluctuations in a process named breathing of double-stranded DNA, which opens up denaturation zones of a few tens of base pairs; such breathing fluctuations have been studied experimentally by following the exchange of protons from imino groups with water [5] and by fluorescent labelling of synthetic DNA constructs [6]. In a theoretical study, the probability density of finding a region of open base pairs of size *n* at time *t* was obtained using the Fokker–Planck equation [7]. Breathing fluctuations may be supported by accessory proteins which bind to transient single-stranded DNA, thereby lowering the DNA base-pair stability [4].

DNA denaturation *in vitro* is accomplished traditionally by heating, or by titration with acid or alkali. Since GC base pairs are bound more strongly than AT base pairs (by three compared with two hydrogen bonds), they denature at a higher melting temperature, *T*_{m}. Accordingly, upon heating, double-stranded DNA starts to unwind in regions rich in AT base pairs and then proceeds to regions of higher GC content. Depending on the fraction of GC base pairs, *T*_{m} ranges between 60 and 110°C [2]. An important application of thermal DNA melting is PCR, in which copies of a DNA sample are created by repeated cycles of thermal unwinding and subsequent reannealing in a solution of invariable primers and single nucleotides [8].

A different way to induce local unwinding in DNA *in vitro* is subjecting the molecule to mechanical stress. Single-molecule force spectroscopy has opened the possibility to induce denaturation regions in DNA by traction in an atomic force microscope or optical tweezers instrument. In these experiments, the end-to-end separation, *L*, of a single DNA molecule is varied and the force, *F*, between the ends is recorded as a function of *L*. When stretching long DNA, e.g. λ-DNA, the observed force–extension relationship, *F*(*L*), exhibits a distinct plateau at approximately 65 pN where the DNA may be extended to approximately 1.7 times its natural B-DNA length with almost no increase in force, before the DNA duplex unbinds at larger forces [9–12]; this process has been named the overstretching transition.

Since its discovery, the overstretched configuration of DNA has been a matter of debate. In the early stretching experiments of DNA, it was proposed that the overstretching transition at 65 pN is a transition of B-DNA to a new stretched form of DNA, named S-DNA, in which base-pairing remains but canonical intra-strand base stacking is absent [9–13]. This hypothesis was supported by molecular modelling studies of DNA duplex stretching which reproduced a transition to a base-paired ladder-like structure [14–16]. The occurrence of S-DNA was also supported by thermodynamic examination of experimental force–extension relationships [13] and by studying the dynamic behaviour in a model of DNA overstretching [17]. A different scenario for the DNA overstretching transition assumes force-induced melting of B-DNA into two strands of single-stranded DNA, similar to that observed for thermal melting (Figure 1). Williams, Rouzina, Bloomfield and co-workers have shown that force-induced DNA melting quantitatively explains experimental DNA force–extension relationships for a broad range of solution conditions set by ionic strength, pH, temperature [18–21], and the presence of DNA-binding ligands [22–26]. Force-induced melting has also been observed in molecular dynamics simulations of short DNA oligomers when entropic contributions of denatured DNA regions are taken into account [27,28]. DNA overstretching in the presence of glyoxal demonstrated that base pairs were indeed exposed to solution during overstretching [29]. Experiments have demonstrated force-induced melting of λ-DNA directly using fluorescent labels specific to double-stranded and single-stranded DNA [30]. Force-induced melting was also observed for stretching short DNA duplex molecules of given GC and AT content [31].
In DNA-stretching experiments, strand separation may occur either from free ends or nicks by one strand peeling off from the other [30–34], or by DNA melting in the middle, in close analogy to thermal melting; interestingly, DNA overstretching was found to occur at 65 pN for topologically closed but rotationally free DNA, and at 110 pN for torsionally constrained DNA, which implies that overstretching DNA does not require peeling from free ends or nicks [35]. Whereas single-stranded-DNA-binding proteins are expected to favour DNA melting by peeling from free ends or nicks, intercalators and other DNA-binding ligands that stabilize double-stranded DNA preferentially lead to DNA melting in the middle [36].

Force-induced destabilization of DNA has become a valuable tool to probe the interaction of proteins that bind to single-stranded or double-stranded DNA at physiological melting temperatures *T*_{m}(*F*), well below *T*_{m}(0) of free DNA [22–26] (for a review on protein and small-molecule binding on DNA probed by DNA-stretching experiments, see [37]). The current body of experimental data suggests that force-induced DNA melting is the prevalent mechanism in the presence of DNA-binding ligands, whereas S-DNA may occur as an intermediate in the absence of binding ligands.

DNA denaturation can be described by the classical PS (Poland–Scheraga) model, which views DNA at the transition as an alternating sequence of bound double-stranded and denatured single-stranded domains. Double-stranded regions are governed by hydrogen-bonding of base pairs as well as base stacking, whereas denatured regions are governed by the entropy gain upon disruption of base pairs [38,39]. The PS model is fundamental in biological physics and has been progressively refined to obtain a quantitative understanding of the DNA-melting process [40,41]. In what follows, I describe the force-induced melting transition of DNA in the framework of the PS model in its simplest form, in which bound domains are modelled as rigid segments of double-stranded DNA and denatured domains as flexible freely jointed chains of single-stranded DNA [42]. Modelling force-induced DNA melting in terms of the PS model is appropriate when DNA melts in the middle, such as in the presence of DNA-binding ligands which stabilize double-stranded DNA. The PS model reproduces essential features of force-induced DNA-denaturation experiments, including a lower melting temperature *T*_{m}(*F*) for finite stretching forces *F* compared with free DNA, and a plateau in the force–extension relationship *F*(*L*).

## PS model for stretched DNA

The denaturation transition is most easily discussed in the grand-canonical ensemble in which the total number *N* of base pairs and the end-to-end vector **R** of the chain are allowed to fluctuate. The grand-canonical partition function of a chain with an applied external force *F* in the *x*-direction is given by [42]

$$Z(z,F)=\sum_{N} z^{N}\int \mathrm{d}^{3}R\; Z_{\mathrm{can}}(N,\mathbf{R})\,\exp(\beta F R_{x}) \tag{1}$$

where β=(*k*_{B}*T*)^{−1} (*T* is the temperature and *k*_{B} is Boltzmann's constant) and *z* is the fugacity conjugate to *N*. *Z*_{can}(*N*,**R**) is the canonical partition function of a chain with fixed *N* and **R**, and *R*_{x} is the *x*-component of **R** (Figure 2). The grand-canonical partition function *Z*(*z*,*F*) in eqn (1) can be expressed in terms of a sum over alternating sequences of bound segments and denatured loops. Assuming that bound segments and denatured loops are statistically independent, each sequence factorizes and one obtains (Figure 2)

$$Z(z,F)=\frac{\Omega_{e}(z,F)\,B(z,F)\,\Omega_{e}(z,F)}{1-\Omega(z,F)\,B(z,F)} \tag{2}$$

*B*(*z*,*F*) and Ω(*z*,*F*) are the statistical weights of bound segments and denatured loops within the chain respectively, and Ω_{e}(*z*,*F*) is the statistical weight of the end unit at both ends of the chain. Eqn (2) is generally valid as long as bound segments and denatured loops are statistically independent; conversely, the statistical weights *B*, Ω and Ω_{e} as functions of *z*, *T* and *F* (and possibly other control parameters) depend on the choice of specific models. In what follows, we discuss the simplest case in which bound segments are modelled as rigid rods and denatured loops as flexible FJCs (freely jointed chains).

### Bound segments

Double-stranded DNA is fairly stiff on microscopic length scales, corresponding to a relatively large persistence length *P*_{ds} of 50 nm. It is therefore justified to model a segment of bound intact DNA with *k* base pairs as a rigid rod of length *kx*_{ds} where *x*_{ds}=0.34 nm is the length of a bound base pair in B-DNA [18] (Figure 3a). For simplicity, we assume that the binding energy per base pair has the same value *E*_{0} < 0 for all base pairs. The statistical weight of a bound segment with *k* base pairs and fixed spatial orientation is then given by ω^{*k*} where ω=exp(βϵ) and ϵ=−*E*_{0}>0. Assuming that the segment is free to rotate about one end while the other end is subject to a force *F* in the *x*-direction, the statistical weight becomes [42]

$$B(k,F)=\omega^{k}\,\frac{1}{4\pi}\int_{S}\mathrm{d}^{2}\hat{r}\;\exp(\beta F x)=\omega^{k}\,\frac{\sinh(k\varphi)}{k\varphi} \tag{3}$$

where **r** is the end-to-end vector of the segment and *x* is the *x*-component of **r** (Figure 3a). In eqn (3), we have introduced the dimensionless force variable

$$\varphi=\beta F x_{\mathrm{ds}} \tag{4}$$

The integration in eqn (3) is over the surface of the three-dimensional unit sphere *S* of area 4π, corresponding to an integration over orientations of the unit vector **r**/*r*, where *r*=|**r**|=*kx*_{ds}. *B*(*k*,*F*) is normalized such that *B*=ω^{*k*} for *F*=0, in accordance with previous calculations for free DNA [43]. If the number of base pairs *k* of the segment is allowed to fluctuate at given fugacity *z*, the statistical weight becomes

$$B(z,F)=\sum_{k=1}^{\infty}z^{k}B(k,F)=\frac{1}{2\varphi}\ln\!\left[\frac{1-\omega z\,e^{-\varphi}}{1-\omega z\,e^{\varphi}}\right] \tag{5}$$

For *F*=0, corresponding to the limit φ→0 in eqn (5), one obtains *B*(*z*,0)=ω*z*/(1−ω*z*) as found previously for free DNA [43].
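The rigid-rod weights can be checked numerically. The following is a minimal sketch, assuming the weight of a segment of *k* base pairs is ω^{*k*} sinh(*k*φ)/(*k*φ) and that the sum over *k* starts at *k*=1; the function names are hypothetical and not taken from [42]. It compares the closed-form sum with a truncated series and with the *F*=0 limit ω*z*/(1−ω*z*).

```python
import math

def bound_weight_k(k, omega, phi):
    """Weight of a rigid bound segment with k base pairs (assumed form):
    omega**k * sinh(k*phi) / (k*phi), reducing to omega**k at phi = 0."""
    if phi == 0.0:
        return omega ** k
    return omega ** k * math.sinh(k * phi) / (k * phi)

def bound_weight_z(z, omega, phi):
    """Closed form of sum_{k>=1} z**k * bound_weight_k(k, omega, phi).
    The series only converges for omega * z * exp(phi) < 1."""
    y = omega * z
    if y * math.exp(phi) >= 1.0:
        raise ValueError("series diverges: need omega*z*exp(phi) < 1")
    if phi < 1e-6:
        # phi -> 0 limit, avoids cancellation between the two logarithms
        return y / (1.0 - y)
    num = math.log1p(-y * math.exp(-phi)) - math.log1p(-y * math.exp(phi))
    return num / (2.0 * phi)

def bound_weight_z_series(z, omega, phi, kmax=200):
    """Direct truncated sum, used only as a cross-check of the closed form."""
    return sum(z ** k * bound_weight_k(k, omega, phi) for k in range(1, kmax + 1))

if __name__ == "__main__":
    print(bound_weight_z(0.2, 2.0, 0.5), bound_weight_z_series(0.2, 2.0, 0.5))
```

The truncation at `kmax=200` is an arbitrary choice that keeps `sinh` within floating-point range for moderate φ.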

### Denatured loops

Single-stranded DNA in denatured loops is far more flexible than double-stranded DNA. Values of the persistence length *P*_{ss} for single-stranded DNA were found to range between 0.7 nm [10] and 2 nm [44], corresponding to approximately one to three nucleotides of length *x*_{ss}=0.6 nm [18]. It is therefore justified to model the single-stranded DNA in a denatured loop as an FJC with 2*k* segments of fixed length *b* (Figure 3b). In the absence of an energetic cost for bending between successive segments, *b* is equivalent to the Kuhn length of the chain, corresponding to twice its persistence length *P*_{ss}. To mimic the key–lock principle of natural DNA, we assume perfect matching of base pairs, which implies that both arcs of a denatured loop have the same length. The statistical weight of a loop with 2*k* segments and displacement vector **r** thus corresponds to the number of conformations of an FJC starting at the origin *O*, visiting the point **r** after *k* segments, and returning to *O* after 2*k* segments (Figure 3b). This conformation number is given by
$$\Omega(k,\mathbf{r})=\left[C(k)\,\rho_{k}(\mathbf{r})\,a^{3}\right]^{2} \tag{6}$$

*C*(*k*) is the number of conformations of a linear FJC with *k* segments which is fixed at one end, and ρ_{k}(**r**) is the probability density for the end-to-end vector **r** of this chain; *a* is a microscopic length, e.g. the lattice constant of a supporting lattice. Thus *C*(*k*)ρ_{k}(**r**)*a*^{3} in eqn (6) is the number of conformations of a linear FJC with *k* segments starting at *O* and ending in the volume element *a*^{3} at **r**, corresponding to an arc of the loop. Since the loop has two such arcs, Ω(*k*,**r**) in eqn (6) is the square of this number. The conformation number *C*(*k*) has the general form

$$C(k)=\mu^{k} \tag{7}$$

where μ is the connectivity constant of the supporting lattice. The number μ is a measure of the degrees of freedom of one individual segment in the chain, e.g. μ=6 for a random walk on a three-dimensional cubic lattice.

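If the displacement vector **r** of a loop can move freely, subject to an applied external force *F* in the *x*-direction, the statistical weight becomes

$$\begin{aligned}\Omega(k,F)&=\sum_{\mathbf{r}}\Omega(k,\mathbf{r})\,\exp(\beta F x)\\&=\frac{1}{a^{3}}\int \mathrm{d}^{3}r\;\Omega(k,\mathbf{r})\,\exp(\beta F x)\end{aligned} \tag{8}$$

with Ω(*k*,**r**) from eqn (6) and *x* the *x*-component of **r**. The sum in the first line of eqn (8) is over all displacement vectors **r** on a supporting lattice with lattice constant *a*, and in the second line, we replace this sum by an integral using the volume element d^{3}*r*=*a*^{3}. The statistical weight Ω(*k*,*F*) in eqn (8) is determined by the probability density ρ_{k}(**r**); in the simplest case, the FJC is treated in the Gaussian limit (corresponding to the limit *k*→∞, *b*→0 in such a way that *kb*^{2} stays fixed), for which

$$\rho_{k}(\mathbf{r})=\left(\frac{3}{2\pi k b^{2}}\right)^{3/2}\exp\!\left(-\frac{3r^{2}}{2kb^{2}}\right) \tag{9}$$

Inserting ρ_{k}(**r**) from eqn (9) into eqn (8) one obtains

$$\Omega(k,F)=a^{3}\,C(k)^{2}\left(\frac{3}{4\pi k b^{2}}\right)^{3/2}\exp\!\left(\frac{k b^{2}(\beta F)^{2}}{12}\right) \tag{10}$$

This equation can be expressed as

$$\Omega(\ell,F)=A\,\frac{s^{\ell}\exp(\alpha\varphi^{2}\ell)}{\ell^{3/2}} \tag{11}$$

where ℓ is the number of broken base pairs in a loop of 2*k* segments, i.e. *k* segments in each of the two arcs (Figure 3b). Since single-stranded DNA contains *b*/*x*_{ss} nucleotides per segment of length *b*, one obtains

$$\ell=\frac{k\,b}{x_{\mathrm{ss}}} \tag{12}$$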

The variable φ=β*Fx*_{ds} in eqn (11) has been introduced in eqn (4), and *s*=μ^{2*x*_{ss}/*b*} and α=*bx*_{ss}/(12*x*_{ds}^{2}) are dimensionless parameters. Using *b*=2.5 nm, *x*_{ds}=0.34 nm and *x*_{ss}=0.6 nm, one finds α=1.08. The amplitude *A* in eqn (11) is given by

$$A=\sigma_{0}\left(\frac{3a^{2}}{4\pi b\,x_{\mathrm{ss}}}\right)^{3/2} \tag{13}$$
where we have included the loop initiation factor σ_{0}≪1, quantifying the initiation of a loop in previously intact double-stranded DNA. Thus *A*≪1 if σ_{0}≪1 is included.
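To give a feeling for the magnitudes involved, the sketch below evaluates α from the lengths quoted above and the dimensionless force φ=β*Fx*_{ds} at the overstretching force of 65 pN. The temperature of 298 K is an assumption made here for illustration and is not specified in the text; the function names are hypothetical.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def dimensionless_force(force_pN, temperature_K, x_ds_nm=0.34):
    """phi = beta * F * x_ds, with F in pN and x_ds in nm."""
    kT_pN_nm = K_B * temperature_K * 1e21  # 1 J = 1e21 pN nm
    return force_pN * x_ds_nm / kT_pN_nm

def alpha_param(b_nm=2.5, x_ss_nm=0.6, x_ds_nm=0.34):
    """alpha = b * x_ss / (12 * x_ds**2), dimensionless."""
    return b_nm * x_ss_nm / (12.0 * x_ds_nm ** 2)

if __name__ == "__main__":
    print("alpha =", alpha_param())
    print("phi at 65 pN, 298 K (assumed) =", dimensionless_force(65.0, 298.0))
```

With these inputs φ is of order 5, so the plateau force lies well outside the small-φ regime.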

If the number ℓ of broken base pairs in a loop is allowed to fluctuate at given fugacity *z*, the statistical weight of the loop becomes, using eqn (11),
$$\Omega(z,F)=\sum_{\ell=1}^{\infty}z^{\ell}\,\Omega(\ell,F)=A\,\mathrm{Li}_{c}(u) \tag{14}$$

where *c*=3/2 and

$$u=s\,z\,\exp(\alpha\varphi^{2}),\qquad \mathrm{Li}_{c}(u)=\sum_{\ell=1}^{\infty}\frac{u^{\ell}}{\ell^{c}} \tag{15}$$

The function Li_{c}(*u*) in eqn (14) is the polylogarithm; for |*u*|<1, Li_{c}(*u*) converges for any value of *c*. For *u*=1, there are three cases: (i) *c*≤1, Li_{c}(1) diverges; (ii) 1<*c*≤2, Li_{c}(1) converges, but Li_{c}′(*u*)|_{u=1} diverges; (iii) *c*>2, both Li_{c}(1) and Li_{c}′(*u*)|_{u=1} converge. The limit *u*=1 corresponds to the value

$$z_{m}=\frac{\exp(-\alpha\varphi^{2})}{s} \tag{16}$$

of the fugacity (recall φ=β*Fx*_{ds}), thus Ω(*z*,*F*) is only well-defined for *z*≤*z*_{m} and diverges for *z*>*z*_{m}. The statistical weight Ω_{e} of an end unit modelled as an FJC in the Gaussian limit may be derived in a similar way and one obtains Ω_{e}(*z*,*F*)=*A*_{e}Li_{0}(*u*) (Figure 3b).
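The loop weight can be evaluated with a few lines of code. The sketch below assumes Ω(*z*,*F*)=*A* Li_{c}(*u*) with *u*=*sz* exp(αφ^{2}) and *z*_{m}=exp(−αφ^{2})/*s*; these forms and all function names are assumptions of this illustration. The polylogarithm is summed directly for *u*<1, and the boundary value Li_{c}(1)=ζ(*c*) is obtained from a partial sum with an Euler–Maclaurin tail correction, since the plain series converges too slowly there.

```python
import math

def zeta(c, n=1000):
    """Riemann zeta(c) for c > 1: partial sum plus Euler-Maclaurin tail."""
    if c <= 1.0:
        raise ValueError("zeta(c) diverges for c <= 1")
    head = sum(l ** -c for l in range(1, n))
    tail = n ** (1.0 - c) / (c - 1.0) + 0.5 * n ** -c + c * n ** (-c - 1.0) / 12.0
    return head + tail

def polylog(c, u, tol=1e-15, max_terms=1_000_000):
    """Li_c(u) = sum_{l>=1} u**l / l**c for 0 <= u <= 1."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("this sketch only handles 0 <= u <= 1")
    if u == 1.0:
        return zeta(c)  # raises for c <= 1, where Li_c(1) diverges
    total, power = 0.0, 1.0
    for l in range(1, max_terms + 1):
        power *= u
        term = power / l ** c
        total += term
        if term < tol:
            break
    return total

def z_m(phi, s, alpha):
    """Fugacity at which u = 1, i.e. the largest z with a finite loop weight."""
    return math.exp(-alpha * phi * phi) / s

def loop_weight(z, phi, A, s, alpha, c=1.5):
    """Omega(z, F) = A * Li_c(u), u = s * z * exp(alpha * phi**2)."""
    u = s * z * math.exp(alpha * phi * phi)
    return A * polylog(c, u)

if __name__ == "__main__":
    print("Li_{3/2}(1) =", zeta(1.5))
    print("Omega(z = z_m / 2) =", loop_weight(0.5 * z_m(1.0, 5.0, 1.0), 1.0, 0.01, 5.0, 1.0))
```

For *c*=3/2 the value at *u*=1 is finite while the slope diverges, which is case (ii) above.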

### Fraction of bound base pairs and DNA extension

From the grand-canonical partition function in eqn (1), one obtains the average number of base pairs
$$\langle N\rangle=\frac{\partial \ln Z}{\partial \ln z} \tag{17}$$

If the number of base pairs *N* is set, one has to choose the fugacity *z* such that *N*=<*N*>, which implies that *z* becomes a function of *N*. The average number <*N*_{b}> of bound base pairs is given by (see eqns 2 and 3)

$$\langle N_{b}\rangle=\frac{\partial \ln Z}{\partial \ln \omega} \tag{18}$$

Similarly, the average of the *x*-component of DNA extension is given by

$$\langle R_{x}\rangle=\frac{\partial \ln Z}{\partial(\beta F)} \tag{19}$$

Thus *z* acts as a fugacity for base pairs (bound or unbound), ω acts as a fugacity for bound base pairs, and β*F* acts as a fugacity for the *x*-component of DNA extension. The average fraction of bound base pairs is given by

$$\Theta=\frac{\langle N_{b}\rangle}{\langle N\rangle} \tag{20}$$

Similarly, the average of the *x*-component of DNA extension per base pair in units of *x*_{ds} is given by

$$R_{x}=\frac{\langle R_{x}\rangle}{\langle N\rangle\,x_{\mathrm{ds}}} \tag{21}$$

The fraction Θ is the order parameter of the denaturation transition for free unstretched DNA and can be measured by light absorption at wavelengths of approximately 260 nm. For free DNA, the temperature-dependence Θ(*T*) is referred to as the melting curve. Conversely, the force–extension relationship *R*_{x}(*T*,*F*) is measured directly in DNA-stretching experiments.

In order to obtain the phase diagram of the model in the (*T*,*F*)-plane, the average fraction of bound base pairs Θ should be evaluated in the thermodynamic limit <*N*>→∞ (in practice, results derived for the thermodynamic limit are accurate for *N* of the order of a few tens of base pairs and larger). Formally, the extensive parameters <*N*>, <*N*_{b}> and <*R*_{x}>/*x*_{ds} may be obtained as derivatives of the thermodynamic potential
(22)
with respect to the intensive parameters ln *z*, ln ω and φ (where φ=β*Fx*_{ds}). (In eqn 22, we assume that the system is enclosed in a finite volume *V*, with corresponding pressure *p*=−∂Φ/∂*V*; without the inclusion of *V*, the Euler equation implies Φ=0.) The Gibbs–Duhem relationship for the thermodynamic potential Φ in eqn (22) is given by (for constant pressure *p*)
(23)

Dividing by <*N*>, one obtains
(24)
with Θ and *R*_{x} from eqns (20) and (21). Eqn (24) may be used to carry out the thermodynamic limit <*N*>→∞; noting that *z*=*z*(ω,φ,<*N*>), one obtains for <*N*>→∞
(25)
where *z**(ω,φ) is the value of the fugacity in the limit <*N*>→∞. Similarly, Θ*(ω,*F*)=Θ(*z**,ω,φ) and *R**_{x}(ω,*F*)=*R*_{x}(*z**,ω,φ) are the values of Θ and *R*_{x} for <*N*>→∞. For constant φ, one obtains from eqn (25)
(26)

Similarly, for constant ω, one obtains
(27)

Eqns (26) and (27) yield the fraction of bound base pairs Θ and the DNA extension *R*_{x} per base pair in units of *x*_{ds} in the thermodynamic limit *N*→∞.

### Phase diagrams

The quantity *z**(ω,φ) in eqns (26) and (27) is the lowest value of *z* for which <*N*> in eqn (17) diverges. For *T*<*T*_{m}, this limit occurs for the case that the denominator on the right-hand side of eqn (2) vanishes. This implies that *z**(ω,φ) is implicitly determined by the condition
$$\Omega(z^{*},F)\,B(z^{*},F)=1 \tag{28}$$

Graphically, *z** is obtained by intersection of Ω(*z*) and *B*(*z*)^{−1} as functions of *z*. Both Ω(*z*) and *B*(*z*) are increasing in *z*, thus *B*(*z*)^{−1} is decreasing. Figure 4 shows schematically the behaviour of Ω(*z*) and *B*(*z*)^{−1} as *T* is varied through the transition temperature *T*_{m}. Here we consider a value *c*>1 of the exponent *c* in eqn (14), so that Ω(*z*_{m}) is finite {for 1<*c*≤2, the slope of Ω(*z*) at *z*_{m} diverges, whereas for *c*>2, the slope of Ω(*z*) at *z*_{m} is finite; as a result, one finds that in the former case the denaturation transition is of second order, whereas in the latter case it is of first order [43]}.
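The graphical construction can be mimicked numerically. The sketch below treats free DNA (*F*=0) only and assumes Ω(*z*)=*A* Li_{c}(*sz*) and *B*(*z*)=ω*z*/(1−ω*z*); it finds the intersection Ω(*z**)*B*(*z**)=1 by bisection. It then estimates the bound fraction as Θ=−∂ ln *z**/∂ ln ω by a central finite difference, which is an assumption of this illustration: it follows if the canonical partition function grows as (*z**)^{−*N*} and is not necessarily the form in which the result is stated in the text. Temperature enters through the dimensionless *t*=*k*_{B}*T*/ϵ, so that ω=exp(1/*t*). For simplicity the sketch requires ω>*s*, which keeps the polylogarithm argument below 1; all function names are hypothetical.

```python
import math

def polylog(c, u, tol=1e-16, max_terms=200_000):
    """Li_c(u) by direct summation, valid here for 0 <= u < 1."""
    total, power = 0.0, 1.0
    for l in range(1, max_terms + 1):
        power *= u
        term = power / l ** c
        total += term
        if term < tol:
            break
    return total

def z_star(omega, A, s, c=1.5, iters=200):
    """Root of A*Li_c(s*z) * omega*z/(1 - omega*z) = 1 on 0 < z < 1/omega."""
    if omega <= s:
        raise ValueError("sketch assumes omega > s (deep enough in the bound phase)")
    lo, hi = 0.0, 1.0 / omega
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        y = omega * mid
        if y >= 1.0:  # guard against rounding at the upper end
            hi = mid
            continue
        h = A * polylog(c, s * mid) * y / (1.0 - y) - 1.0
        if h > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def theta(t, A, s, c=1.5, step=1e-4):
    """Bound fraction from -d ln z* / d ln omega (assumed relation), omega = exp(1/t)."""
    ln_omega = 1.0 / t
    z_plus = z_star(math.exp(ln_omega + step), A, s, c)
    z_minus = z_star(math.exp(ln_omega - step), A, s, c)
    return -(math.log(z_plus) - math.log(z_minus)) / (2.0 * step)

if __name__ == "__main__":
    for t in (0.3, 0.4, 0.5, 0.6):
        print(t, theta(t, A=0.01, s=5.0))
```

Scanning *t* in this way traces out a melting curve Θ(*t*) on the bound side of the transition.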

Consider a bound state of the DNA chain at *T*<*T*_{m} where the fraction of bound base pairs is finite, i.e. Θ>0. As *T* increases, the curve *B*(*z*)^{−1} increases, whereas Ω(*z*) remains constant. Thus the value *z** of *z* at the intersection of the two curves increases (Figure 4). However, *z** can only increase until it reaches the value *z*_{m} given by eqn (16) because Ω diverges for *z*>*z*_{m}. This can only occur for *c*>1, since Ω(*z*_{m}) itself diverges for *c*≤1, which implies that no thermodynamic phase transition is possible for *c*≤1. Conversely, for *c*>1, the denaturation transition takes place for *z**=*z*_{m}, where *z** is determined by eqn (28) and *z*_{m} is the smallest value of *z* for which Ω(*z*) diverges, given by eqn (16). Combining the condition *z**=*z*_{m} with eqn (28) yields an implicit relationship for the transition line *F*_{m}(*T*):
$$A\,\zeta(c)\,B(z_{m},F_{m})=1 \tag{29}$$

Here we have used Ω(*z*_{m})=*A*Li_{c}(1)=*A*ζ(*c*), where ζ is the Riemann zeta function, independent of *T* and *F*; see eqn (16). For free DNA, i.e. *F*=0, *z*_{m} is independent of *T* and eqn (29) reduces to the condition for the *T*_{m} for free DNA.

Figure 5 shows transition lines *f*_{m}(*t*) in terms of the dimensionless variables *f*=*Fx*_{ds}/ϵ and *t*=*k*_{B}*T*/ϵ, using eqn (29) in conjunction with eqns (10) and (5). The lines *f*_{m}(*t*) separate a finite region of bound states from an infinite region of denatured states. The point (*t*_{0},*f*=0) with *t*_{0}=*t*_{m}(*f*=0) corresponds to the melting transition for free DNA, i.e. *F*=0. For *F*>0, the shape of the transition lines *f*_{m}(*t*) depends on the parameters *A*, α and *s* entering the statistical weight Ω of denatured loops in eqn (11); Figure 5 shows *f*_{m}(*t*) for *A*=1 and *A*=0.01, with α=1, *s*=5. The value *A*=0.01 corresponds to the more realistic case of a small loop initiation factor σ_{0}≪1 (see eqn 13). The lines *f*_{m}(*t*) contain a region in which *f*_{m}(*t*) decreases with *t*, such that increased stretching forces *f* lower the melting temperature *t*_{m}(*f*), corresponding to force-induced destabilization of DNA as observed in experiments [22–26,37]. Interestingly, *f*_{m}(*t*) vanishes for both *t*→*t*_{0} and *t*→0. This re-entrant behaviour implies that, for given 0<*f*_{0}<*f*_{max}, where *f*_{max} is the maximum of *f*_{m}(*t*), the chain does not only denature at a large value *t*_{m}^{+}(*f*_{0}), but also at a smaller value *t*_{m}^{−}(*f*_{0}).
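A transition line of this kind can be traced with a short script. The sketch below is an illustration under stated assumptions, not the procedure used to produce Figure 5. It assumes *c*=3/2, ω=exp(1/*t*), φ=*f*/*t*, *z*_{m}=exp(−αφ^{2})/*s*, the closed-form rigid-rod weight *B*=(2φ)^{−1} ln[(1−ω*z*e^{−φ})/(1−ω*z*e^{φ})], and the transition condition *A*ζ(*c*)*B*(*z*_{m})=1. A state is counted as bound if *B*(*z*_{m}) diverges or exceeds 1/(*A*ζ(*c*)). All function names are hypothetical, and the bisection assumes *t* below the free-DNA melting point so that *f*=0 is bound.

```python
import math

ZETA_3_2 = 2.612375348685488  # Riemann zeta(3/2) = Li_{3/2}(1)

def t0_free(A, s):
    """Free-DNA melting point in units of eps/k_B, from omega/s = 1/(1 + A*zeta(3/2))."""
    return 1.0 / math.log(s / (1.0 + A * ZETA_3_2))

def is_bound(t, f, A, s, alpha):
    """True if the chain is in the bound phase at dimensionless (t, f)."""
    phi = f / t
    ln_y = 1.0 / t - math.log(s) - alpha * phi * phi  # ln(omega * z_m)
    x = ln_y + phi                                    # ln(omega * z_m * e^phi)
    if x >= 0.0:
        return True  # B diverges at some z below z_m, hence z* < z_m
    if phi < 1e-6:
        y = math.exp(ln_y)
        B = y / (1.0 - y)  # phi -> 0 limit of the rod weight
    else:
        # log(-expm1(x)) = ln(1 - e^x), accurate when x is close to 0
        B = (math.log1p(-math.exp(ln_y - phi)) - math.log(-math.expm1(x))) / (2.0 * phi)
    return A * ZETA_3_2 * B >= 1.0

def f_m(t, A, s, alpha, f_hi=10.0, iters=200):
    """Bisection for the force at which the bound phase is lost (assumes t < t0)."""
    if is_bound(t, f_hi, A, s, alpha):
        raise ValueError("f_hi is still in the bound phase; increase it")
    lo, hi = 0.0, f_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_bound(t, mid, A, s, alpha):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for t in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(t, f_m(t, A=0.01, s=5.0, alpha=1.0))
```

For *A*=0.01, α=1 and *s*=5 the computed force first rises with *t* at low temperature, consistent with the re-entrant low-temperature branch described above.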

## Summary

Single-molecule force measurements using optical and magnetic tweezers and atomic force microscopy have dramatically expanded our knowledge of nucleic acids and proteins. Specifically, stretching single DNA molecules by an optical tweezers instrument can induce the unwinding of the two strands of the DNA duplex. The induced structural and thermodynamic changes in the DNA double helix upon stretching alter interactions with DNA-binding ligands in a controllable and measurable way. Therefore single-molecule force measurements of DNA and DNA–ligand interactions provide an unprecedented opportunity for quantitative study of a wide range of physiologically important phenomena associated with DNA helix destabilization and DNA–ligand binding.

In spite of the progress made in single-molecule force experiments, a poor understanding of the structural and thermodynamic response of biomolecules to mechanical stress has limited the insight that such experiments have provided into helix destabilization and DNA–ligand binding. A notorious difficulty for modelling force-induced melting is the vast range of length and time scales spanned by the process. An important objective of future studies is therefore to develop biophysical models that bridge the length and time scales from the atomistic to the macromolecular level. In the present paper, I have shown that the PS model provides a potential starting point to better understand force–extension relationships of stretched DNA.

## Funding

This work was supported by the National Institutes of Health [grant number 5SC3GM083779-03].

## Acknowledgments

I thank Mark C. Williams for helpful discussions.

## Footnotes

Topological Aspects of DNA Function and Protein Folding: An Independent Meeting held at the Isaac Newton Institute for Mathematical Sciences, Cambridge, U.K., 3–7 September 2012, as part of the Isaac Newton Institute Programme Topological Dynamics in the Physical and Biological Sciences (16 July–21 December 2012). Organized and Edited by Andrew Bates (University of Liverpool, U.K.), Dorothy Buck (Imperial College London, U.K.), Sarah Harris (University of Leeds, U.K.), Andrzej Stasiak (University of Lausanne, Switzerland) and De Witt Sumners (Florida State University, U.S.A.).

**Abbreviations:**
FJC, freely jointed chain;
PS, Poland–Scheraga

- © The Authors Journal compilation © 2013 Biochemical Society