1 Classical scalar field

In quantum mechanics (QM) we only considered non-relativistic single particles. By the time QM was developed, special relativity was already well known and many physicists tried to find a combined theory of special relativity and QM, similar to how we are trying to find a combined theory with general relativity today. The first attempt at this was the Klein-Gordon (KG) equation which, in essence, is constructed in the same way as the Schrödinger equation. Recall how we identify operators in QM

$$ p\to-i\hbar\partial_{x}\quad\text{and}\quad E\to i\hbar\partial_{t}\,. \tag{1} $$

The Schrödinger equation is just the definition of the non-relativistic energy $E=T+V$ with these identifications applied

$$ \begin{split}E&=T+V=\frac{p^{2}}{2m}+V\,,\\ i\hbar\partial_{t}\Psi&=\Big[-\frac{\hbar^{2}}{2m}\partial_{x}^{2}+V\Big]\Psi\,.\end{split} \tag{2} $$

It is now natural to try to extend this to the relativistic case $E^{2}-c^{2}p^{2}=c^{4}m^{2}$ (which for $p=0$ is just $E=mc^{2}$)

$$ \hbar^{2}\Big(-\frac{1}{c^{2}}\partial_{t}^{2}+\partial_{x}^{2}\Big)\Psi=m^{2}c^{2}\Psi\,. \tag{3} $$

This is the KG equation. Unfortunately, this equation is, as it stands, inconsistent as a single-particle theory and Schrödinger himself discarded it immediately, choosing instead to expand the relativistic mass-energy relation, which led him to (2).
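The expansion mentioned here can be sketched as a Taylor series in $p/(mc)$. This is a worked step added for illustration, taking the positive root of the mass-energy relation:

```latex
\begin{align*}
E &= \sqrt{m^{2}c^{4}+c^{2}p^{2}}
   = mc^{2}\sqrt{1+\frac{p^{2}}{m^{2}c^{2}}} \\
  &= mc^{2}+\frac{p^{2}}{2m}-\frac{p^{4}}{8m^{3}c^{2}}+\mathcal{O}\big(p^{6}\big)\,.
\end{align*}
```

Dropping the constant rest energy $mc^{2}$ and the terms of order $p^{4}$ and higher leaves the kinetic term $p^{2}/(2m)$ that enters (2).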

The solution to this problem will be to treat the number of particles as variable rather than fixed to one. Recall how in perturbation theory we had to sum over all possible states rather than just the lowest energy one. Here we have much the same.

Before we can do this, we need to briefly revisit some aspects of classical physics. We will spend the rest of this chapter reviewing relativity, Lagrangian mechanics, and classical field theory.

1.1 Relativity

At its heart, physics is the study of the symmetries of nature and their consequences. By studying how systems change or, more importantly, do not change under transformations, we can identify important properties of the system (cf. Noether's theorem). Arguably one of the most important symmetries is Lorentz symmetry, i.e. invariance under spacetime rotations in special relativity. These consist of rotations in 3D space and boosts. The following is merely a brief summary of what is required for this course and should not be viewed as complete.

1.1.1 Rotation

A rotation in 3D can be expressed using a $3\times 3$ matrix $R$ such that

$$ \begin{split}\vec{x}\to\vec{x}^{\,\prime}&=R\vec{x}\,,\\ x_{i}\to x_{i}^{\prime}&=\sum_{j=1}^{3}R_{ij}x_{j}=R_{ij}x_{j}\,.\end{split} \tag{4} $$

Here we have started using the Einstein summation convention where a sum is implicit over indices that appear exactly twice. Since a rotation of the entire system is supposed to not change the lengths of vectors or the angles between them, let us consider what happens to a scalar product between two vectors $\vec{x}$ and $\vec{y}$

$$ \vec{x}\cdot\vec{y}=\sum_{i=1}^{3}x_{i}y_{i}=x_{i}y_{i}\to x_{i}^{\prime}y_{i}^{\prime}=R_{ij}x_{j}R_{ik}y_{k}\,. \tag{5} $$

The only way for this to equal $x_{j}y_{j}$ for all vectors is if

$$ R_{ij}R_{ik}=(R^{T})_{ji}R_{ik}=\delta_{jk}\quad\text{or in other words}\qquad R^{T}R=RR^{T}=1\,, \tag{6} $$

with the Kronecker delta $\delta_{jk}$. This means that $R$ has to be an orthogonal matrix. Since we usually want to disallow reflections, we also require $\det R=1$ to arrive at the symmetry group

$$ \mathrm{SO}(3)=\Big\{R\ \Big|\ RR^{T}=1\ \text{and}\ \det R=1\Big\}\,. \tag{7} $$
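The two defining conditions of $\mathrm{SO}(3)$ can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`rotation_z`, `r_transpose_r`, `det3`) and the chosen angle and vectors are illustrative assumptions, not part of the lecture.

```python
import math

def rotation_z(theta):
    """Rotation by the angle theta about the z-axis as a nested list R[i][j]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(r, x):
    """x'_i = R_ij x_j."""
    return [sum(r[i][j] * x[j] for j in range(3)) for i in range(3)]

def dot(x, y):
    """Euclidean scalar product x_i y_i."""
    return sum(a * b for a, b in zip(x, y))

def r_transpose_r(r):
    """(R^T R)_jk = R_ij R_ik, the combination appearing in (6)."""
    return [[sum(r[i][j] * r[i][k] for i in range(3)) for k in range(3)]
            for j in range(3)]

def det3(r):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (r[0][0] * (r[1][1] * r[2][2] - r[1][2] * r[2][1])
            - r[0][1] * (r[1][0] * r[2][2] - r[1][2] * r[2][0])
            + r[0][2] * (r[1][0] * r[2][1] - r[1][1] * r[2][0]))

R = rotation_z(0.7)
x, y = [1.0, 2.0, 3.0], [-0.5, 0.25, 4.0]

# Largest deviation of R^T R from the Kronecker delta
identity_defect = max(abs(r_transpose_r(R)[j][k] - (1.0 if j == k else 0.0))
                      for j in range(3) for k in range(3))
# Change of the scalar product under the rotation
dot_change = abs(dot(apply(R, x), apply(R, y)) - dot(x, y))

print(identity_defect, det3(R), dot_change)
```

Up to floating-point rounding, the defect and the change of the scalar product vanish and the determinant is one.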

1.1.2 Minkowski space

What is the equivalent of distances and rotations in special relativity, i.e. what do we require to remain invariant under transformation? Measurements tell us that the ‘distance’ in spacetime is defined as

$$ s^{2}=c^{2}t^{2}-\vec{x}^{2}\,, \tag{8} $$

which is the quantity that remains invariant, taking over the role of the previous $s^{2}=\vec{x}^{2}$. This is very similar to the rotations except for the extra sign. In 3D we have worked in Euclidean space while we now need to work in Minkowski space. Similarly, the metric is now called the Minkowski metric (formally, the Minkowski metric is not a metric in the mathematical sense because $s^{2}$ can be negative or zero without $x=0$). For simplicity, let us collect the time component and $\vec{x}$ into the vector

$$ x^{\mu}=(x^{0},x^{1},x^{2},x^{3})=(ct,\vec{x})\,, \tag{9} $$

which we need to distinguish from the covector, which has lower indices

$$ x_{\mu}=(x^{0},-x^{1},-x^{2},-x^{3})=(ct,-\vec{x})\,, \tag{10} $$

to properly account for the metric. In the summation convention, we may only contract upper with lower indices, i.e.

$$ x^{\mu}x_{\mu}=c^{2}t^{2}-\vec{x}^{2}=s^{2}\,, \tag{11} $$

is valid while $x^{\mu}y^{\mu}$ is not. It would therefore be helpful to have a way to raise or lower indices, which is done using the metric itself

$$ \eta_{\mu\nu}=\begin{pmatrix}1&0&0&0\\ 0&-1&0&0\\ 0&0&-1&0\\ 0&0&0&-1\end{pmatrix}\,. \tag{12} $$

Signs of the metric

Note that there are two competing conventions for the signs in $\eta$. In this lecture (and the broader particle physics community) we will use the ‘mostly minus’ convention, sometimes called the west coast metric. Alternatively, people use a metric which has the minus sign in the time component, which is more common in string theory and cosmology. This is often referred to as the ‘mostly plus’ or east coast convention. The west coast convention has the nice property that the momentum of a massive particle squares to its mass, i.e. $p^{\mu}p_{\mu}=m^{2}$ rather than $p^{\mu}p_{\mu}=-m^{2}$ (for $c=1$). When reading other resources, please make sure you understand the metric the author uses to avoid making sign mistakes!

$\eta$ takes a role not dissimilar from the Kronecker delta in Euclidean space. We can now write

$$ x_{\mu}=\eta_{\mu\nu}x^{\nu}\quad\text{and}\quad x^{\mu}=\eta^{\mu\nu}x_{\nu}\,, \tag{13} $$

where we have introduced the inverse metric $\eta^{\mu\nu}$. Luckily, its matrix representation is the same as that of $\eta_{\mu\nu}$ since

$$ x^{\mu}=\eta^{\mu\nu}\eta_{\nu\rho}x^{\rho}\quad\text{and therefore}\quad\eta^{\mu\nu}\eta_{\nu\rho}=\delta^{\mu}{}_{\rho}\,. \tag{14} $$

Suggested Exercise

What is $\eta_{\mu\nu}\eta^{\nu\mu}$?

Please note that for a general tensor, i.e. an object with multiple indices, the order of indices matters. You can easily convince yourself that it does not for the Kronecker delta $\delta^{\mu}{}_{\rho}$ but you should be careful. Raising and lowering indices also works for tensors, e.g.

$$ w^{\mu\nu}=\eta^{\nu\rho}w^{\mu}{}_{\rho}=\eta^{\mu\sigma}\eta^{\nu\rho}w_{\sigma\rho}\,. \tag{15} $$

It is very important to remember that $w^{\mu\nu}$, $w^{\mu}{}_{\rho}$, $w_{\sigma\rho}$ and even $w_{\mu}{}^{\rho}$ are all different objects!
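To see index gymnastics at work, the sketch below lowers the index of a vector and raises either the first or the second index of a two-index tensor. It uses plain nested lists; the function names and the sample numbers are hypothetical and only serve as an illustration of (13) and (15).

```python
# Minkowski metric in the 'mostly minus' convention. Numerically, eta^{mu nu}
# and eta_{mu nu} have the same entries, so one table serves for both.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def lower(v_up):
    """x_mu = eta_{mu nu} x^nu."""
    return [sum(ETA[m][n] * v_up[n] for n in range(4)) for m in range(4)]

def contract(v_up, v_down):
    """Contraction of one upper with one lower index."""
    return sum(u * d for u, d in zip(v_up, v_down))

def raise_first(w_down):
    """w^mu_rho = eta^{mu sigma} w_{sigma rho}."""
    return [[sum(ETA[m][s] * w_down[s][r] for s in range(4)) for r in range(4)]
            for m in range(4)]

def raise_second(w_down):
    """w_mu^rho = eta^{rho sigma} w_{mu sigma}."""
    return [[sum(ETA[r][s] * w_down[m][s] for s in range(4)) for r in range(4)]
            for m in range(4)]

x_up = [5.0, 1.0, 2.0, 3.0]          # (ct, x, y, z)
s2 = contract(x_up, lower(x_up))     # 25 - (1 + 4 + 9)

# A tensor w_{sigma rho} whose only non-zero component is w_{01} = 1
w_down = [[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
first, second = raise_first(w_down), raise_second(w_down)

print(s2, first[0][1], second[0][1])
```

The time-space component keeps its sign when the first (time) index is raised but flips when the second (space) index is raised, which illustrates why $w^{\mu}{}_{\rho}$ and $w_{\mu}{}^{\rho}$ must not be confused.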

Another important object to consider is the derivative operator, which also exists as a vector and a covector

$$ \partial^{\mu}=\frac{\partial}{\partial x_{\mu}}=\Big(\frac{1}{c}\partial_{t},-\vec{\nabla}\Big)\,, \tag{16} $$

$$ \partial_{\mu}=\frac{\partial}{\partial x^{\mu}}=\Big(\frac{1}{c}\partial_{t},+\vec{\nabla}\Big)\,. \tag{17} $$

The derivative of $x$ itself works as expected

$$ \partial_{\mu}x^{\nu}=\delta_{\mu}{}^{\nu}\quad\text{and}\quad\partial^{\mu}x_{\nu}=\delta^{\mu}{}_{\nu}\,. \tag{18} $$

One can easily show that the momentum

$$ p^{\mu}=\Big(E/c,\vec{p}\Big) \tag{19} $$

is a vector, which allows us to write the identification (1) covariantly as $p^{\mu}\to i\hbar\partial^{\mu}$.
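As a quick consistency check of this identification, one can act with $i\hbar\partial^{\mu}$ on a plane wave. The plane wave is an assumption introduced here for illustration; we write $p\cdot x=p^{\mu}x_{\mu}=Et-\vec{p}\cdot\vec{x}$ and use (16):

```latex
\begin{align*}
i\hbar\,\partial^{\mu}\,e^{-ip\cdot x/\hbar}
  &= i\hbar\Big(\frac{1}{c}\partial_{t},-\vec{\nabla}\Big)\,
     e^{-i(Et-\vec{p}\cdot\vec{x})/\hbar} \\
  &= \Big(\frac{E}{c},\vec{p}\Big)\,e^{-i(Et-\vec{p}\cdot\vec{x})/\hbar}
   = p^{\mu}\,e^{-ip\cdot x/\hbar}\,.
\end{align*}
```

The time component reproduces $E\to i\hbar\partial_{t}$ and the spatial components reproduce $\vec{p}\to-i\hbar\vec{\nabla}$, in agreement with (1).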

1.1.3 Lorentz transformation

We now have the tools to actually study Lorentz transformations. We begin by defining

  • the scalar product in Minkowski space, which works the same for vectors and covectors

    $$ x\cdot y=x^{\mu}y_{\mu}=x_{\mu}y^{\mu}=\eta_{\mu\nu}x^{\mu}y^{\nu}=\eta^{\mu\nu}x_{\mu}y_{\nu}\,. \tag{20} $$

  • the Lorentz transformation as a linear transformation

    $$ x^{\mu}\to(x^{\prime})^{\mu}=\Lambda^{\mu}{}_{\nu}x^{\nu}\,. \tag{21} $$

    Suggested Exercise

    Find an explicit form of $\Lambda$.

We want this transformation to preserve scalar products, i.e. $x^{\prime}\cdot y^{\prime}=x\cdot y$

$$ x^{\prime}\cdot y^{\prime}=\eta_{\mu\nu}(x^{\prime})^{\mu}(y^{\prime})^{\nu}=\eta_{\mu\nu}\Lambda^{\mu}{}_{\rho}x^{\rho}\Lambda^{\nu}{}_{\sigma}y^{\sigma}\overset{!}{=}\eta_{\sigma\rho}x^{\rho}y^{\sigma}\,. \tag{22} $$

This allows us to specify the full Lorentz group

$$ \mathrm{O}(1,3)=\Big\{\Lambda\ \Big|\ \eta_{\mu\nu}\Lambda^{\mu}{}_{\rho}\Lambda^{\nu}{}_{\sigma}=\eta_{\sigma\rho}\Big\}\,. \tag{23} $$

This relation is very similar to the one defining the orthogonal group $\mathrm{O}(3)$ and hence the group has a similar name. We use the arguments $1,3$ to indicate that there is one time dimension and three spatial dimensions that have opposite signs. Similarly to how we wanted rotations to not include reflections, we can single out the following classes of transformations (of which the orthochronous and the proper ones each form a subgroup)

  • orthochronous $\mathrm{O}^{+}(1,3)$ which preserves the direction of time by requiring that $\Lambda^{0}{}_{0}\geq 1$.

  • proper $\mathrm{SO}(1,3)$ which preserves orientation by requiring that $\det\Lambda=+1$.

  • improper which flips orientation, i.e. $\det\Lambda=-1$.

  • non-orthochronous $\mathrm{O}^{-}(1,3)$ which flips the direction of time by requiring that $\Lambda^{0}{}_{0}\leq -1$.

When we talk about the Lorentz group, we often refer to the proper orthochronous group $\mathrm{SO}^{+}(1,3)$.
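The defining condition (23) and the classification above can be tested numerically. The sketch below assumes one possible candidate for $\Lambda$, a boost along the $x$ direction with velocity $\beta=v/c$, and compares it with time reversal; both matrices and all helper names are illustrative choices rather than part of the lecture.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, -1.0]]

def boost_x(beta):
    """Candidate Lambda^mu_nu: boost along x with beta = v/c (assumed form)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Time reversal: also satisfies (23) but is neither proper nor orthochronous
TIME_REVERSAL = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def preserves_metric(lam, tol=1e-12):
    """Check eta_{mu nu} Lambda^mu_rho Lambda^nu_sigma = eta_{rho sigma}."""
    for r in range(4):
        for s in range(4):
            total = sum(ETA[m][n] * lam[m][r] * lam[n][s]
                        for m in range(4) for n in range(4))
            if abs(total - ETA[r][s]) > tol:
                return False
    return True

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def classify(lam):
    """Return (is_proper, is_orthochronous) for a Lorentz transformation."""
    return det(lam) > 0, lam[0][0] >= 1.0

boost = boost_x(0.6)
print(preserves_metric(boost), classify(boost))
print(preserves_metric(TIME_REVERSAL), classify(TIME_REVERSAL))
```

Both matrices satisfy (23), but only the boost has $\det\Lambda=+1$ and $\Lambda^{0}{}_{0}\geq 1$ and therefore lies in $\mathrm{SO}^{+}(1,3)$.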

Suggested Exercise

Prove that $\mathrm{SO}^{+}(1,3)$ is a group.

Suggested Exercise

Prove that $p^{2}=p\cdot p=p^{\mu}p^{\nu}\eta_{\mu\nu}$ is invariant under Lorentz transformations. What does this mean for the mass-energy relation $E^{2}/c^{2}-\vec{p}^{2}=m^{2}c^{2}$?

When we combine the Lorentz group with spacetime translations, we obtain the full group of spacetime symmetries, the Poincaré group.

Since the proper orthochronous Lorentz group is connected, we can obtain every element of $\mathrm{SO}^{+}(1,3)$ by concatenating infinitesimal Lorentz transformations starting from the identity transformation $\delta$, i.e.

$$ \Lambda^{\mu}{}_{\nu}=\delta^{\mu}{}_{\nu}+\omega^{\mu}{}_{\nu}+\mathcal{O}(\omega^{2})\,. \tag{24} $$

By substituting this into the condition for the Lorentz group (23), we find

$$ \begin{split}\eta_{\sigma\rho}&=\eta_{\mu\nu}\Lambda^{\mu}{}_{\rho}\Lambda^{\nu}{}_{\sigma}=\eta_{\mu\nu}\Big(\delta^{\mu}{}_{\rho}+\omega^{\mu}{}_{\rho}\Big)\Big(\delta^{\nu}{}_{\sigma}+\omega^{\nu}{}_{\sigma}\Big)+\mathcal{O}(\omega^{2})\\ &=\eta_{\rho\sigma}+\eta_{\rho\nu}\omega^{\nu}{}_{\sigma}+\eta_{\mu\sigma}\omega^{\mu}{}_{\rho}+\mathcal{O}(\omega^{2})=\eta_{\rho\sigma}+\omega_{\rho\sigma}+\omega_{\sigma\rho}+\mathcal{O}(\omega^{2})\,.\end{split} \tag{25} $$

In other words, $\omega$ is antisymmetric

$$ \omega_{\rho\sigma}=-\omega_{\sigma\rho}\,, \tag{26} $$

which is why it is so important to keep the order of indices correct.
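The role of the antisymmetry condition (26) can be made tangible numerically. In the sketch below, the concatenation of many infinitesimal steps is summed as a matrix exponential, $\Lambda=\exp(\omega)$ with $\omega^{\mu}{}_{\nu}=\eta^{\mu\rho}\omega_{\rho\nu}$. The exponential form, the truncated Taylor series, and the sample values of $\omega_{\rho\sigma}$ are assumptions made for illustration.

```python
ETA = [[1.0, 0.0, 0.0, 0.0], [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, -1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def exp_series(m, terms=40):
    """Matrix exponential as a truncated Taylor series sum_n m^n / n!."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(4)]
                  for i in range(4)]
    return result

def metric_defect(lam):
    """Largest entry of |Lambda^T eta Lambda - eta|, i.e. the violation of (23)."""
    lam_t = [[lam[j][i] for j in range(4)] for i in range(4)]
    prod = matmul(lam_t, matmul(ETA, lam))
    return max(abs(prod[i][j] - ETA[i][j]) for i in range(4) for j in range(4))

# omega_{rho sigma} with a boost part (01) and a rotation part (12)
omega_antisym = [[0.0, 0.3, 0.0, 0.0], [-0.3, 0.0, 0.2, 0.0],
                 [0.0, -0.2, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
# A symmetric counterexample that violates (26)
omega_sym = [[0.0, 0.3, 0.0, 0.0], [0.3, 0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]

# Raise the first index with eta before exponentiating
lam_good = exp_series(matmul(ETA, omega_antisym))
lam_bad = exp_series(matmul(ETA, omega_sym))

print(metric_defect(lam_good), metric_defect(lam_bad))
```

The antisymmetric generator yields a transformation that preserves the metric up to rounding, whereas the symmetric one does not.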

Aside from (21), it is also useful to know how derivatives transform

$$ \frac{\partial}{\partial x^{\mu}}=\frac{\partial(x^{\prime})^{\nu}}{\partial x^{\mu}}\frac{\partial}{\partial(x^{\prime})^{\nu}}=\Lambda^{\nu}{}_{\mu}\frac{\partial}{\partial(x^{\prime})^{\nu}}\,. \tag{27} $$

1.1.4 Transformation of the KG field

If we want $\mathrm{SO}^{+}(1,3)$ to be a symmetry of nature, our theories need to be invariant under these transformations. To ensure this, we need to study how, for example, the solution $\Psi(x)$ of the KG equation (3) transforms. $\Psi$ maps every point $x$ in spacetime to a (complex) number $\Psi(x)$. If $x$ is for example the point where $\Psi(x)$ is maximal, this property needs to be retained even after the transformation, i.e. the transformed field has to be maximal at the transformed point

$$ \begin{split}x&\to x^{\prime}=\Lambda x\,,\\ \Psi(x)&\to\Psi^{\prime}(x^{\prime})=\Psi^{\prime}(\Lambda x)\overset{!}{=}\Psi(x)\,.\end{split} \tag{28} $$

The way to ensure this is to require

$$ \Psi(x)\to\Psi^{\prime}(x)=\Psi(\Lambda^{-1}x)\,. \tag{29} $$
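The transformation rule (29) can be illustrated in $1+1$ dimensions. The sketch below assumes a boost with velocity $\beta$ (units $c=1$) as the Lorentz transformation and a Gaussian bump as a hypothetical field profile; neither is taken from the lecture.

```python
import math

def apply_boost(beta, point):
    """(t, x) -> Lambda (t, x) for a boost with velocity beta (units c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    t, x = point
    return (gamma * (t - beta * x), gamma * (x - beta * t))

def psi(point):
    """Hypothetical field profile with its maximum at (t, x) = (1, 2)."""
    t, x = point
    return math.exp(-((t - 1.0) ** 2 + (x - 2.0) ** 2))

def transformed(field, beta):
    """Psi'(x) = Psi(Lambda^{-1} x); the inverse of a boost has velocity -beta."""
    return lambda point: field(apply_boost(-beta, point))

peak = (1.0, 2.0)
beta = 0.6
psi_prime = transformed(psi, beta)
new_peak = apply_boost(beta, peak)

# The maximum of Psi' sits at the transformed point Lambda x
print(psi(peak), psi_prime(new_peak), psi_prime(peak))
```

The transformed field takes its maximal value at $\Lambda x$ rather than at the original point, which is exactly the requirement (28).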

For the KG equation we need

$$ \eta^{\mu\nu}\partial_{\mu}\partial_{\nu}\Psi(x)\to\underbrace{\eta^{\mu\nu}(\Lambda^{-1})^{\rho}{}_{\mu}(\Lambda^{-1})^{\sigma}{}_{\nu}}_{\eta^{\rho\sigma}}(\partial_{\rho}\partial_{\sigma}\Psi)(\Lambda^{-1}x)=(\eta^{\rho\sigma}\partial_{\rho}\partial_{\sigma}\Psi)(\Lambda^{-1}x)\,, \tag{30} $$

where we have again used the defining property of the Lorentz group, applied to $\Lambda^{-1}$. With this it is obvious that the KG equation (3) (now rewritten in more compact notation)

$$ (\hbar^{2}\partial_{\mu}\partial^{\mu}+m^{2}c^{2})\Psi=0 \tag{31} $$

is invariant

$$ (\hbar^{2}\partial_{\mu}\partial^{\mu}+m^{2}c^{2})\Psi(x)\to(\hbar^{2}\partial_{\mu}\partial^{\mu}+m^{2}c^{2})\Psi(\Lambda^{-1}x)\,. \tag{32} $$

1.2 A brief digression on units

In the above discussion we often had to write $\hbar$ and $c$. To avoid doing this, it is common to choose a unit system where $\hbar=c=1$ and we only have a single unit (such as ${\rm GeV}$). This is perfectly permissible as we will always know how to convert back to SI units by multiplying with the correct powers of $\hbar$ and $c$. With the 2019 revision of the SI system, both $c$ and $\hbar$ have definite values without uncertainties as they are used to define the meter and kilogram respectively. A helpful value to remember is $\hbar c=197.3\,{\rm MeV}\cdot{\rm fm}$.
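Converting back from natural units amounts to multiplying by the right powers of $\hbar$ and $c$. The sketch below collects a few such conversions; the function names are hypothetical, the constants are the standard values of $\hbar c$, $\hbar$, $c$ and the GeV in joules, and the proton mass is used as a sample input.

```python
HBAR_C_GEV_FM = 0.1973269804     # hbar * c in GeV * fm
HBAR_GEV_S = 6.582119569e-25     # hbar in GeV * s
C_M_PER_S = 299792458.0          # speed of light in m / s (exact)
GEV_IN_JOULE = 1.602176634e-10   # 1 GeV in J (exact since 2019)

def length_fm(value_inv_gev):
    """Length: GeV^-1 -> fm, one power of hbar * c."""
    return value_inv_gev * HBAR_C_GEV_FM

def time_s(value_inv_gev):
    """Time: GeV^-1 -> s, one power of hbar."""
    return value_inv_gev * HBAR_GEV_S

def area_mb(value_inv_gev2):
    """Area: GeV^-2 -> mb, two powers of hbar * c and 1 fm^2 = 10 mb."""
    return value_inv_gev2 * HBAR_C_GEV_FM ** 2 * 10.0

def mass_kg(value_gev):
    """Mass: GeV -> kg, dividing the energy in joules by c^2."""
    return value_gev * GEV_IN_JOULE / C_M_PER_S ** 2

print(length_fm(1.0))     # 1 GeV^-1 as a length in fm
print(time_s(1.0))        # 1 GeV^-1 as a time in s
print(area_mb(1.0))       # 1 GeV^-2 as an area in mb
print(mass_kg(0.938272))  # proton mass in kg
```

One inverse GeV corresponds to roughly $0.2\,{\rm fm}$ or $6.6\times10^{-25}\,{\rm s}$, and one ${\rm GeV}^{-2}$ to roughly $0.39\,{\rm mb}$.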

Suggested Exercise

Convert the following values back to SI units:

  • the total cross section for $pp$ collisions at the LHC is $\sigma\approx 250\,{\rm GeV}^{-2}$. How much is this in ${\rm fb}$?

  • the muon lifetime is $\tau\approx 3.3\times 10^{18}\,{\rm GeV}^{-1}$. How much is this in $\mu{\rm s}$?

  • the electron mass is $m\approx 0.511\,{\rm MeV}$. How much is this in ${\rm kg}$?

1.3 Lagrangian mechanics

Let us briefly review a classical system with $n$ degrees of freedom such as a collection of $n/3$ particles that can all move independently of each other. This system is completely described by $n$ generalised coordinates $q_{1},...,q_{n}$ and $n$ generalised velocities $\dot{q}_{1},...,\dot{q}_{n}$. To find their time evolution, we need the Lagrangian $L$. For a system of particles, this is

$$ L(\{q_{i}\},\{\dot{q}_{i}\},t)=T-V=\frac{1}{2}\sum_{i=1}^{n}m_{i}\dot{q}_{i}^{2}-V(\{q_{i}\},\{\dot{q}_{i}\})\,. \tag{33} $$

To derive the equations of motion for this system, we define the action functional $S$

$$ S[\{q_{i}(t)\}]=\int_{t_{0}}^{t_{1}}{\rm d}t\ L(\{q_{i}\},\{\dot{q}_{i}\},t) \tag{34} $$

and use the variational principle, i.e. we require that the action is extremal (this is sometimes referred to as the principle of least action; however, the action does not need to be minimised, even if it often is, as long as it is extremal)

$$ \delta S=0\,. \tag{35} $$

From this you can easily derive the Euler-Lagrange equation

$$ \frac{{\rm d}}{{\rm d}t}\frac{\partial L}{\partial\dot{q}_{i}}-\frac{\partial L}{\partial q_{i}}=0\,. \tag{36} $$

Derivation of the Euler-Lagrange equation for $n=1$

In the $n=1$ case we have only one $q$ so that the action only depends on the function $q(t)$. Fixing the boundary conditions $q(t_{0})=q_{0}$ and $q(t_{1})=q_{1}$, we vary the path by a small $\epsilon\,\omega(t)$ with $\omega(t_{0})=\omega(t_{1})=0$. Then,

$$ \begin{split}\frac{{\rm d}}{{\rm d}\epsilon}S[q+\epsilon\,\omega]&=\int_{t_{0}}^{t_{1}}{\rm d}t\ \frac{{\rm d}}{{\rm d}\epsilon}L\big(q(t)+\epsilon\,\omega(t),\dot{q}(t)+\epsilon\,\dot{\omega}(t),t\big)\\ &=\int_{t_{0}}^{t_{1}}{\rm d}t\ \Bigg[\omega(t)\frac{\partial L}{\partial q}\big(q(t)+\epsilon\,\omega(t),\dot{q}(t)+\epsilon\,\dot{\omega}(t),t\big)+\dot{\omega}(t)\frac{\partial L}{\partial\dot{q}}\big(q(t)+\epsilon\,\omega(t),\dot{q}(t)+\epsilon\,\dot{\omega}(t),t\big)\Bigg]\,,\end{split} \tag{37} $$

since $t$ does not depend on $\epsilon$. For the action to be extremal, this derivative has to vanish at $\epsilon=0$, where it reads

$$ \frac{{\rm d}S[q+\epsilon\,\omega]}{{\rm d}\epsilon}\bigg|_{\epsilon=0}=\int_{t_{0}}^{t_{1}}{\rm d}t\ \Bigg[\omega(t)\frac{\partial L}{\partial q}\big(q(t),\dot{q}(t),t\big)+\dot{\omega}(t)\frac{\partial L}{\partial\dot{q}}\big(q(t),\dot{q}(t),t\big)\Bigg]\,. \tag{38} $$

Applying integration by parts to the second term, with $\omega(t_{0})=\omega(t_{1})=0$ such that the boundary terms vanish, we find

$$ \frac{{\rm d}S[q+\epsilon\,\omega]}{{\rm d}\epsilon}\bigg|_{\epsilon=0}=\int_{t_{0}}^{t_{1}}{\rm d}t\ \omega(t)\Bigg[\frac{\partial L}{\partial q}\big(q(t),\dot{q}(t),t\big)-\frac{{\rm d}}{{\rm d}t}\frac{\partial L}{\partial\dot{q}}\big(q(t),\dot{q}(t),t\big)\Bigg]\,. \tag{39} $$

Since $\omega$ is an arbitrary smooth function, the fundamental lemma of the calculus of variations requires that the bracket vanishes.
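The statement that the first-order variation vanishes only on a solution of the equation of motion can be checked numerically. The sketch below assumes a toy harmonic oscillator, $L=\tfrac{1}{2}\dot{q}^{2}-\tfrac{1}{2}q^{2}$ with unit mass and spring constant, and evaluates the integral of $\omega\,\partial L/\partial q+\dot{\omega}\,\partial L/\partial\dot{q}$ with the midpoint rule; the model, the variation $\omega$, and all names are illustrative assumptions.

```python
import math

def action_derivative(q, q_dot, omega, omega_dot, t0=0.0, t1=1.0, steps=2000):
    """Midpoint-rule estimate of dS/d(epsilon) at epsilon = 0.

    For L = q_dot^2 / 2 - q^2 / 2 we have dL/dq = -q and dL/dq_dot = q_dot.
    """
    h = (t1 - t0) / steps
    total = 0.0
    for n in range(steps):
        t = t0 + (n + 0.5) * h
        total += (omega(t) * (-q(t)) + omega_dot(t) * q_dot(t)) * h
    return total

def omega(t):
    """Variation that vanishes at both end points t = 0 and t = 1."""
    return math.sin(math.pi * t)

def omega_dot(t):
    return math.pi * math.cos(math.pi * t)

# q(t) = sin(t) solves q_ddot = -q with q(0) = 0 and q(1) = sin(1)
on_shell = action_derivative(math.sin, math.cos, omega, omega_dot)

# A straight line with the same end points does not solve the equation of motion
slope = math.sin(1.0)
off_shell = action_derivative(lambda t: slope * t, lambda t: slope,
                              omega, omega_dot)

print(on_shell, off_shell)
```

For the true path the variation vanishes up to discretisation error, while for the straight line it is clearly non-zero.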

Because of its prevalence in mechanics, we define the conjugate momentum to $q_{i}$

$$ \pi_{i}=\frac{\partial L}{\partial\dot{q}_{i}}\,. \tag{40} $$

Next, we can define the Hamiltonian

$$ H=\sum_{j=1}^{n}\dot{q}_{j}\pi_{j}-L(\{q_{i}\},\{\dot{q}_{i}\},t)\,, \tag{41} $$

which in the case of (33) is just the total energy of the system

$$ H=\sum_{j=1}^{n}\dot{q}_{j}\underbrace{\frac{\partial L}{\partial\dot{q}_{j}}}_{\dot{q}_{j}m_{j}}-L=T+V\,. \tag{42} $$

We can now ask when energy is conserved by calculating the total derivative of $H$ w.r.t. $t$

$$ \frac{{\rm d}H}{{\rm d}t}=\underbrace{\ddot{q}_{j}\frac{\partial L}{\partial\dot{q}_{j}}+\dot{q}_{j}\frac{{\rm d}}{{\rm d}t}\frac{\partial L}{\partial\dot{q}_{j}}}_{{\rm d}(\dot{q}_{j}\pi_{j})/{\rm d}t}\underbrace{-\frac{\partial L}{\partial q_{j}}\dot{q}_{j}-\frac{\partial L}{\partial\dot{q}_{j}}\ddot{q}_{j}-\frac{\partial L}{\partial t}}_{-{\rm d}L/{\rm d}t}\,, \tag{43} $$

where we have left the sum over $j$ implicit. Using (36), we can cancel the remaining terms and are left with

$$ \frac{{\rm d}H}{{\rm d}t}=-\frac{\partial L}{\partial t}\,. \tag{44} $$

In other words, if the Lagrangian $L$ does not explicitly depend on $t$ (as it would, for example, through a time-dependent potential $V(t)$), energy is conserved. This is the first example of what is often called a Noether current or Noether charge, that is, a conserved quantity that arises because the Lagrangian has a certain symmetry such as time-invariance.

Noether Theorem

In 1918, the German mathematician Emmy Noether proved that if the Lagrangian $L$ is invariant under small perturbations of the time variable and the coordinates $q_{i}$, there exists a conserved quantity for each such symmetry. To quantify this, let $T$ be the generator of the shift in time and $Q_{i}$ the generator of the shift in the coordinates

$$ t\to t+\delta t=t+\epsilon T\,, \tag{45} $$

$$ q_{i}\to q_{i}+\delta q_{i}=q_{i}+\epsilon Q_{i}\,. \tag{46} $$

If $L$ is invariant under this transformation,

$$ \mathbb{O}=H\,T-\pi_{i}Q_{i} \tag{47} $$

is conserved. Examples include

  • $T=1$, $Q_{i}=0$: energy is conserved if the potential is not time-dependent.

  • $T=0$, $Q_{i}=1$: linear momentum is conserved if the potential is shift-invariant.

  • $T=0$, for each particle $r$ $\vec{Q}_{r}=\vec{n}\times\vec{q}_{r}$ with some vector $\vec{n}$: angular momentum along the axis $\vec{n}$ is conserved if the Lagrangian is spherically symmetric.

  • If $L$ is invariant under Lorentz boosts, the centre of mass moves with constant velocity.

In addition to Lagrangian mechanics, it is sometimes helpful to consider the Hamiltonian equations of motion (EoMs)

$$ \dot{q}_{j}=\frac{\partial H}{\partial\pi_{j}}\quad\text{and}\quad\dot{\pi}_{j}=-\frac{\partial H}{\partial q_{j}}\,. \tag{48} $$

Suggested Exercise

Derive these using the Euler-Lagrange equation.
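The Hamiltonian EoMs (48) lend themselves to direct numerical integration, which also illustrates energy conservation (44). The sketch below assumes a toy harmonic oscillator, $H=\tfrac{1}{2}\pi^{2}+\tfrac{1}{2}q^{2}$, and the leapfrog (kick-drift-kick) scheme, which is suitable for Hamiltonians of the form $T(\pi)+V(q)$; both the model and the integrator are choices made here for illustration.

```python
def leapfrog(q, pi, dH_dq, dH_dpi, dt, steps):
    """Integrate q_dot = dH/dpi and pi_dot = -dH/dq with the leapfrog scheme."""
    for _ in range(steps):
        pi -= 0.5 * dt * dH_dq(q)   # half kick
        q += dt * dH_dpi(pi)        # full drift
        pi -= 0.5 * dt * dH_dq(q)   # half kick
    return q, pi

def hamiltonian(q, pi):
    """Assumed toy model H = pi^2 / 2 + q^2 / 2 (unit mass and spring constant)."""
    return 0.5 * pi * pi + 0.5 * q * q

q0, pi0 = 1.0, 0.0
# 6283 steps of size 0.001 cover approximately one period 2 * pi
q1, pi1 = leapfrog(q0, pi0, dH_dq=lambda q: q, dH_dpi=lambda p: p,
                   dt=0.001, steps=6283)

energy_drift = abs(hamiltonian(q1, pi1) - hamiltonian(q0, pi0))
print(q1, pi1, energy_drift)
```

After one period the system returns close to its initial state and the energy is unchanged up to the integration error, as expected for a Lagrangian without explicit time dependence.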

1.4 Classical field theory

Before we can study quantum field theories, let us review classical field theories. A classical field $\phi(\vec{x},t)$ is a function that assigns a value to each point in space and time. From a Lagrangian point of view, this means that we have one degree of freedom for each point, requiring us to use integrals rather than finite sums. Since we also want to consider relativity, we further want to avoid talking about space differently from time. Therefore, the Lagrangian functional $L$ is now less useful and we instead use the Lagrangian density $\mathcal{L}$ which confusingly is also often called Lagrangian. The generalised coordinate now becomes the field $\phi(x)$ and the generalised velocity the derivative of the field $\partial^{\mu}\phi$. The action functional is still defined the same way

$$ S[\phi]=\int{\rm d}t\ L[\phi]=\int{\rm d}t\int{\rm d}^{3}x\ \mathcal{L}[\phi,\partial^{\mu}\phi]=\int{\rm d}^{4}x\ \mathcal{L}[\phi,\partial^{\mu}\phi]\,. \tag{49} $$

Similarly to the $n=1$ case above, we can derive the Euler-Lagrange equation by requiring $\delta S=0$

$$ \frac{\partial}{\partial x_{\mu}}\frac{\partial\mathcal{L}}{\partial(\partial^{\mu}\phi)}-\frac{\partial\mathcal{L}}{\partial\phi}=\frac{\partial}{\partial x_{\mu}}\pi_{\mu}-\frac{\partial\mathcal{L}}{\partial\phi}=0\,, \tag{50} $$

where we have once again defined the conjugate momentum $\pi$.

Suggested Exercise

Do this by adding a small variation $\phi\to\phi+\delta\phi$ and use Gauss’s theorem to remove the surface terms.

Now let us consider the KG equation as defining a field $\phi$ instead of a wavefunction $\Psi$ and derive the EoM (31). We first require the Lagrangian $\mathcal{L}$. While we could write down a field theory that depends on $\phi$ and $\partial^{\mu}\phi$ in any way we like, we usually want $\mathcal{L}$ to be polynomial in the fields and derivatives. There are plenty of examples where this is not the case though (such as the Sine-Gordon theory or Higgs Effective Theory). We do require $\mathcal{L}$ to be a Lorentz scalar though, meaning that it cannot have open indices, and that $\mathcal{L}$ has units of ${\rm GeV}^{4}$ so that the action $S$ is dimensionless. The first condition implies that derivatives can only enter $\mathcal{L}$ through $(\partial_{\mu}\phi)(\partial^{\mu}\phi)$ as there are no other vectors to contract with. It turns out that the Lagrangian we need is

$$ \mathcal{L}=\frac{1}{2}(\partial_{\mu}\phi)(\partial^{\mu}\phi)-\frac{1}{2}m^{2}\phi^{2}\,. \tag{51} $$

Note that the normalisation of $1/2$ in front of the kinetic term does not matter classically as it does not impact the Euler-Lagrange equations. This changes once we start studying quantum fields so we will keep it canonically normalised already now. Let us verify that this reduces to the KG equation by computing

$$ \frac{\partial\mathcal{L}}{\partial(\partial^{\mu}\phi)}=\frac{1}{2}\frac{\partial}{\partial(\partial^{\mu}\phi)}\Big[\eta_{\alpha\beta}(\partial^{\alpha}\phi)(\partial^{\beta}\phi)\Big]=\frac{1}{2}\eta_{\alpha\beta}\Big[(\partial^{\alpha}\phi)\delta^{\beta}{}_{\mu}+\delta^{\alpha}{}_{\mu}(\partial^{\beta}\phi)\Big]=\partial_{\mu}\phi\,. \tag{52} $$

Therefore, we find for (50)

$$ \frac{\partial}{\partial x_{\mu}}\partial_{\mu}\phi-\frac{1}{2}\frac{\partial(-m^{2}\phi^{2})}{\partial\phi}=(\partial^{\mu}\partial_{\mu}+m^{2})\phi=0\,, \tag{53} $$

which is just (31) in units where $\hbar=c=1$.

For the Hamiltonian formulation, we need the conjugate momentum, which is defined through the time derivative

$$ \pi(x)=\frac{\partial\mathcal{L}}{\partial\dot{\phi}}=\dot{\phi}(x)\,, \tag{54} $$

which, like $\phi$ itself, is a scalar field in that it assigns a scalar value to every point in spacetime.

When we considered the case where $n$ is finite, our next subject was to show that total energy is conserved. We can do something very similar here by defining the Hamiltonian density $\mathcal{H}$

$$ \mathcal{H}=\dot{\phi}\pi-\mathcal{L}=\frac{1}{2}\pi^{2}+\frac{1}{2}(\nabla\phi)^{2}+\frac{1}{2}m^{2}\phi^{2}\,, \tag{55} $$

but it is actually helpful to think more broadly. Once we start considering relativity, energy and momentum become frame-dependent and it would be nice to have a covariant description that works with four-vectors rather than focussing on energy and momentum separately. Such an object is the energy-momentum tensor

$$ T^{\mu\nu}=(\partial^{\mu}\phi)\frac{\partial\mathcal{L}}{\partial(\partial_{\nu}\phi)}-\mathcal{L}\eta^{\mu\nu}\,, \tag{56} $$

which encodes energy and momentum densities but also their fluxes as well as pressure and stress. $T^{\mu\nu}$ is a symmetric tensor, which is trivial to see in our specific example,

$$ \begin{split}T^{\mu\nu}&=(\partial^{\mu}\phi)(\partial^{\nu}\phi)-\frac{1}{2}(\partial_{\sigma}\phi)(\partial_{\rho}\phi)\eta^{\rho\sigma}\eta^{\mu\nu}+\frac{1}{2}m^{2}\phi^{2}\eta^{\mu\nu}\\ &=\Big(\eta^{\rho\mu}\eta^{\sigma\nu}-\frac{1}{2}\eta^{\rho\sigma}\eta^{\mu\nu}\Big)(\partial_{\sigma}\phi)(\partial_{\rho}\phi)+\frac{1}{2}m^{2}\phi^{2}\eta^{\mu\nu}\,.\end{split} \tag{57} $$

This is very similar to $\mathcal{H}$ and in fact $T^{00}=\mathcal{H}$ is the energy density.
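The identification of $T^{00}$ with the Hamiltonian density can be verified in two lines. The worked step below uses (51), (54) and (57) together with $(\partial_{\mu}\phi)(\partial^{\mu}\phi)=\dot{\phi}^{2}-(\vec{\nabla}\phi)^{2}$ in units where $c=1$:

```latex
\begin{align*}
T^{00} &= (\partial^{0}\phi)(\partial^{0}\phi)-\mathcal{L}\,\eta^{00}
        = \dot{\phi}^{2}-\Big[\tfrac{1}{2}\dot{\phi}^{2}
          -\tfrac{1}{2}(\vec{\nabla}\phi)^{2}-\tfrac{1}{2}m^{2}\phi^{2}\Big] \\
       &= \tfrac{1}{2}\dot{\phi}^{2}+\tfrac{1}{2}(\vec{\nabla}\phi)^{2}
          +\tfrac{1}{2}m^{2}\phi^{2}
        = \tfrac{1}{2}\pi^{2}+\tfrac{1}{2}(\vec{\nabla}\phi)^{2}
          +\tfrac{1}{2}m^{2}\phi^{2}=\mathcal{H}\,.
\end{align*}
```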

Since we have four translation symmetries (one in time and three in space), we expect four conserved Noether currents, one for each value of $\mu$. Indeed,

$$ \begin{split}\partial_{\nu}T^{\mu\nu}&=(\partial_{\nu}\partial^{\mu}\phi)\frac{\partial\mathcal{L}}{\partial(\partial_{\nu}\phi)}+(\partial^{\mu}\phi)\Bigg(\partial_{\nu}\frac{\partial\mathcal{L}}{\partial(\partial_{\nu}\phi)}\Bigg)-\underbrace{\Bigg(\frac{\partial\mathcal{L}}{\partial\phi}\partial_{\nu}\phi+\frac{\partial\mathcal{L}}{\partial(\partial_{\rho}\phi)}\partial_{\nu}(\partial_{\rho}\phi)\Bigg)}_{\partial_{\nu}\mathcal{L}}\eta^{\mu\nu}\\ &=(\partial_{\nu}\partial^{\mu}\phi)\frac{\partial\mathcal{L}}{\partial(\partial_{\nu}\phi)}+(\partial^{\mu}\phi)\Bigg(\partial_{\nu}\frac{\partial\mathcal{L}}{\partial(\partial_{\nu}\phi)}\Bigg)-\Bigg(\partial^{\sigma}\frac{\partial\mathcal{L}}{\partial(\partial^{\sigma}\phi)}\partial_{\nu}\phi+\frac{\partial\mathcal{L}}{\partial(\partial_{\rho}\phi)}\partial_{\nu}(\partial_{\rho}\phi)\Bigg)\eta^{\mu\nu}=0\,.\end{split} \tag{58} $$

Here we have used the chain rule to expand $\partial_{\nu}\mathcal{L}$ and (50) for the first term in the $\eta^{\mu\nu}$ bracket. After contracting with $\eta^{\mu\nu}$, the terms cancel pairwise. We have now shown that the energy-momentum tensor is conserved in a local (differential) sense.

Let us pause for a moment to understand what this means by considering the $\mu=0$ case, i.e. $j^{\nu}=T^{0\nu}$ for which we still have the same conservation law

$$ 0=\partial_{\nu}j^{\nu}=\frac{\partial j^{0}}{\partial t}+\vec{\nabla}\cdot\vec{j}\,. \tag{59} $$

By integrating this over a region of space $V$, we find using Gauss’s divergence theorem for the second term

$$ \frac{\partial}{\partial t}\int_{V}{\rm d}^{3}x\ j^{0}=-\oint_{\partial V}{\rm d}\vec{S}\cdot\vec{j}\,. \tag{60} $$

Since we have already identified $j^{0}$ as the energy density, this implies that the rate at which the energy in the volume decreases is equal to the flux of energy out of this volume. We will often encounter conservation laws like this, from energy or momentum like here to probability densities in wavefunctions or electric charge in electrodynamics.
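The local conservation law (59) can be checked numerically for a simple solution. The sketch below works in $1+1$ dimensions with $\hbar=c=1$, assumes the real plane wave $\phi=\cos(Et-kx)$ with $E^{2}=k^{2}+m^{2}$ as a solution of the KG equation, and uses central finite differences; the parameter values and function names are illustrative.

```python
import math

M, K = 1.0, 0.7                  # assumed mass and wave number
E = math.sqrt(K * K + M * M)     # on-shell energy

def phi(t, x):
    """Real plane-wave solution of the KG equation."""
    return math.cos(E * t - K * x)

def d_dt(f, t, x, h=1e-4):
    """Central finite difference in t."""
    return (f(t + h, x) - f(t - h, x)) / (2 * h)

def d_dx(f, t, x, h=1e-4):
    """Central finite difference in x."""
    return (f(t, x + h) - f(t, x - h)) / (2 * h)

def energy_density(t, x):
    """j^0 = T^{00} = (phi_dot^2 + phi_x^2 + m^2 phi^2) / 2."""
    return 0.5 * (d_dt(phi, t, x) ** 2 + d_dx(phi, t, x) ** 2
                  + M * M * phi(t, x) ** 2)

def energy_flux(t, x):
    """j^1 = T^{01} = (d^0 phi)(d^1 phi) = -phi_dot * phi_x."""
    return -d_dt(phi, t, x) * d_dx(phi, t, x)

def continuity_residual(t, x):
    """d_t j^0 + d_x j^1, which should vanish for a solution."""
    return d_dt(energy_density, t, x) + d_dx(energy_flux, t, x)

t, x = 1.0, 0.2
print(d_dt(energy_density, t, x), d_dx(energy_flux, t, x),
      continuity_residual(t, x))
```

Neither term vanishes on its own, but their sum does up to the finite-difference error: the local energy density changes only because energy flows in or out.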

Before concluding this chapter, let us derive the general solution of the KG equation. Since this is a wave equation, we begin by making an ansatz in terms of plane wave solutions

$$ \phi(x)=A{\rm e}^{-ik\cdot x}+B{\rm e}^{+ik\cdot x}\,, \tag{61} $$

where $k^{\mu}$ is a four-vector whose spatial part indicates the direction of travel of our plane wave. Since we have considered only the real KG field where $\phi=\phi^{*}$, we also need a real solution, i.e. $B=A^{*}$. We can constrain $k$ by substituting the ansatz into the KG equation

$$ \partial^{2}\phi+m^{2}\phi=-k^{2}\phi+m^{2}\phi=0\,. \tag{62} $$

Therefore, $k^{2}=m^{2}$ which means that the momentum of the plane wave needs to be on its so-called mass-shell. Of course, the general solution is a linear combination of plane waves

$$ \phi(x)=\int\frac{{\rm d}^{3}k}{(2\pi)^{3}}\frac{1}{\sqrt{2E_{\vec{k}}}}\Big[a_{\vec{k}}{\rm e}^{-ik\cdot x}+a_{\vec{k}}^{*}{\rm e}^{ik\cdot x}\Big]\,, \tag{63} $$

where $k^{2}=(k^{0})^{2}-\vec{k}^{2}=E_{\vec{k}}^{2}-\vec{k}^{2}=m^{2}$. The factor $1/\sqrt{2E_{\vec{k}}}$ does not really matter at this point but it will become convenient later.
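The mass-shell condition can be tested by inserting plane waves into a finite-difference version of the KG operator. The sketch below works in $1+1$ dimensions with $\hbar=c=1$; the mass, the wave number, and the off-shell value $k^{0}=2$ are arbitrary illustrative choices.

```python
import math

M = 1.0  # assumed mass

def plane_wave(k0, k):
    """Real plane wave cos(k.x) = cos(k0 t - k x), i.e. (61) with A = B = 1/2."""
    return lambda t, x: math.cos(k0 * t - k * x)

def kg_residual(phi, t, x, h=1e-3):
    """Finite-difference estimate of (d_t^2 - d_x^2 + m^2) phi at (t, x)."""
    d2t = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / (h * h)
    d2x = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / (h * h)
    return d2t - d2x + M * M * phi(t, x)

k = 0.7
on_shell = plane_wave(math.sqrt(k * k + M * M), k)   # k^2 = m^2
off_shell = plane_wave(2.0, k)                       # k^2 != m^2

print(kg_residual(on_shell, 0.3, 0.5), kg_residual(off_shell, 0.3, 0.5))
```

Only the wave with $k^{0}=E_{\vec{k}}=\sqrt{\vec{k}^{2}+m^{2}}$ satisfies the KG equation up to discretisation error; for the off-shell wave the residual equals $(m^{2}-k^{2})\phi$, in line with (62).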

Suggested Exercise

Repeat the above discussion for the complex KG field where $\phi$ and $\phi^{*}$ are independent degrees of freedom. Start with the Lagrangian

$$ \mathcal{L}=(\partial_{\mu}\phi)(\partial^{\mu}\phi^{*})-m^{2}\phi\phi^{*}\,. \tag{64} $$

In a QFT context, $\phi^{*}$ would be the antiparticle to $\phi$’s particle.