1 Classical scalar field

In quantum mechanics (QM) we only considered non-relativistic single particles. By then special relativity was already well-known and many physicists tried to find a combined theory of special relativity and QM, similar to how we are trying to find a combined theory with general relativity today. The first attempt at this was the Klein-Gordon (KG) equation which, in essence, was very similar to the Schrödinger equation. Recall how we identify operators in QM

\displaystyle p\to-i\hbar\partial_{x}\quad\text{and}\quad E\to i\hbar\partial_% {t}\,.

(1)

The Schrödinger equation is just the definition of the non-relativistic energy $E=T+V$

\displaystyle\begin{split}E&=T+V=\frac{p^{2}}{2m}+V\,,\\ i\hbar\partial_{t}\Psi&=\Big{[}-\frac{\hbar^{2}}{2m}\partial_{x}^{2}+V\Big{]}% \Psi\,.\end{split}

(2)

It is now natural to try and extend this to the relativistic case $E^{2}-c^{2}p^{2}=c^{4}m^{2}$ (which of course for $p=0$ is just $E=mc^{2}$ )

\displaystyle\hbar^{2}\Big{(}-\frac{1}{c^{2}}\partial_{t}^{2}+\partial_{x}^{2}% \Big{)}\Psi=m^{2}c^{2}\Psi\,.

(3)

This is the KG equation. Unfortunately, as it stands this equation is inconsistent, and Schrödinger himself discarded it immediately, choosing instead to expand the relativistic mass-energy relation, which led him to (2).

The solution to this problem will be to let the number of particles vary rather than fixing it to one. Recall how in perturbation theory we had to sum over all possible states rather than just the lowest-energy one. Here the situation is much the same.

Before we can do this, we need to briefly revisit some aspects of classical physics. We will spend the rest of this chapter reviewing relativity, Lagrangian mechanics, and classical field theory.

1.1 Relativity

At its heart, physics is the study of the symmetries of nature and their consequences. By studying how systems change or, more importantly, do not change under transformations, we can identify important properties of the system (cf. Noether's theorem). Arguably one of the most important symmetries is Lorentz symmetry, i.e. invariance under spacetime rotations in special relativity. These consist of rotations in 3D space and boosts. The following is merely a brief summary of what is required for this course and should not be viewed as complete.

1.1.1 Rotation

A rotation in 3D can be expressed using a $3\times 3$ matrix $R$ such that

\displaystyle\begin{split}\vec{x}\to\vec{x}^{\,\prime}&=R\vec{x}\,,\\ x_{i}\to x_{i}^{\prime}&=\sum_{j=1}^{3}R_{ij}x_{j}=R_{ij}x_{j}\,.\end{split}

(4)

Here we have started using the Einstein summation convention, where a sum is implicit over indices that appear exactly twice. Since a rotation of the entire system is supposed to change neither the lengths of vectors nor the angles between them, let us consider what happens to the scalar product of two vectors $\vec{x}$ and $\vec{y}$

\displaystyle\vec{x}\cdot\vec{y}=\sum_{i=1}^{3}x_{i}y_{i}=x_{i}y_{i}\to x_{i}^{\prime}y_{i}^{\prime}=R_{ij}x_{j}R_{ik}y_{k}\,.

(5)

The transformed product equals the original $x_{j}y_{j}$ for all vectors only if

\displaystyle R_{ij}R_{ik}=(R^{T})_{ji}R_{ik}=\delta_{jk}\quad\text{or in other words}\qquad R^{T}R=RR^{T}=1\,,

(6)

with the Kronecker delta $\delta_{jk}$ . This means that $R$ has to be an orthogonal matrix. Since we usually want to disallow reflections, we also require $\det R=1$ to arrive at the symmetry group

\displaystyle\mathrm{SO}(3)=\Big{\{}R\ \Big{|}\ RR^{T}=1\ \text{and}\ \det R=1% \Big{\}}\,.

(7)
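The defining conditions of $\mathrm{SO}(3)$ can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`rotation_z`, `matmul`, `det3`, `rotate`) and the test vectors are illustrative choices, not part of the lecture. It builds a rotation about the $z$-axis and verifies $RR^{T}=1$, $\det R=1$ and the invariance of the scalar product.

```python
import math

def rotation_z(angle):
    """3x3 matrix for a rotation by `angle` (radians) about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matmul(a, b):
    """(a b)_{ik} = a_{ij} b_{jk}, with the sum over j written out."""
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def rotate(r, x):
    """x'_i = R_{ij} x_j."""
    return [sum(r[i][j] * x[j] for j in range(3)) for i in range(3)]

R = rotation_z(0.7)                      # arbitrary angle
RRt = matmul(R, transpose(R))            # should be the identity
x, y = [1.0, 2.0, 3.0], [-0.5, 0.25, 4.0]

print(RRt)
print(det3(R))
print(dot(x, y), dot(rotate(R, x), rotate(R, y)))
```

Replacing `R` by a reflection such as `[[1, 0, 0], [0, 1, 0], [0, 0, -1]]` would still satisfy $RR^{T}=1$ but give $\det R=-1$, which is why the determinant condition is imposed separately.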

1.1.2 Minkowski space

What is the equivalent of distances and rotations in special relativity, i.e. what do we require to remain invariant under transformation? Measurements tell us that the ‘distance’ in spacetime is defined as

\displaystyle s^{2}=c^{2}t^{2}-\vec{x}^{2}\,,

(8)

which remains invariant, in analogy with the previous $s^{2}=\vec{x}^{2}$ . This is very similar to the rotations except for the extra sign. In 3D we have worked in Euclidean space while we now need to work in Minkowski space. Similarly, the metric $s$ is now called the Minkowski metric (formally, the Minkowski metric is not a metric in the mathematical sense because $s^{2}$ can be negative or zero without $x=0$ ). For simplicity, let us combine the time and space components into a single vector

\displaystyle x^{\mu}=(x^{0},x^{1},x^{2},x^{3})=(ct,\vec{x})\,,

(9)

which we need to distinguish from the covector which has lower indices

\displaystyle x_{\mu}=(x^{0},-x^{1},-x^{2},-x^{3})=(ct,-\vec{x})\,,

(10)

to properly account for the metric. In sum convention, we may only contract upper with lower indices, i.e.

\displaystyle x^{\mu}x_{\mu}=c^{2}t^{2}-\vec{x}^{2}=s^{2}\,,

(11)

is valid while $x^{\mu}y^{\mu}$ is not. It would therefore be helpful to have a way to raise or lower indices which is done using the metric itself

\displaystyle\eta_{\mu\nu}=\begin{pmatrix}1&0&0&0\\ 0&-1&0&0\\ 0&0&-1&0\\ 0&0&0&-1\end{pmatrix}\,.

(12)

Signs of the metric

Note that there are two competing conventions for the signs in $\eta$ . In this lecture (and in the broader particle physics community) we will use the ‘mostly minus’ convention, sometimes called the west coast metric. Alternatively, people use a metric which has the minus sign in the time component, which is more common in string theory and cosmology. This is often referred to as the ‘mostly plus’ or east coast convention. The west coast convention has the nice property that the momentum of a massive particle squares to its mass squared, i.e. $p^{\mu}p_{\mu}=m^{2}$ rather than $p^{\mu}p_{\mu}=-m^{2}$ . When reading other resources, please make sure you understand the metric the author uses to avoid making sign mistakes!

$\eta$ takes a role not dissimilar from the Kronecker delta in Euclidean space. We can now write

\displaystyle x_{\mu}=\eta_{\mu\nu}x^{\nu}\quad\text{and}\quad x^{\mu}=\eta^{% \mu\nu}x_{\nu}\,,

(13)

where we have defined the new $\eta^{\mu\nu}$ . Luckily, its matrix representation is the same since

\displaystyle x^{\mu}=\eta^{\mu\nu}\eta_{\nu\rho}x^{\rho}\quad\text{and % therefore}\quad\eta^{\mu\nu}\eta_{\nu\rho}=\delta^{\mu}_{\phantom{\mu}\rho}\,.

(14)
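Raising and lowering indices as in (13) amounts to a matrix-vector product with $\eta$. The sketch below illustrates this in plain Python with $c=1$; the function names and the sample four-vector are hypothetical and chosen only for illustration.

```python
# Minkowski metric in the 'mostly minus' convention of (12), with c = 1.
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def lower(x_up):
    """x_mu = eta_{mu nu} x^nu."""
    return [sum(ETA[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]

def raise_index(x_down):
    """x^mu = eta^{mu nu} x_nu; eta^{mu nu} has the same entries as eta_{mu nu}."""
    return [sum(ETA[mu][nu] * x_down[nu] for nu in range(4)) for mu in range(4)]

def contract(x_up, y_down):
    """x^mu y_mu: one upper index contracted with one lower index."""
    return sum(a * b for a, b in zip(x_up, y_down))

x_up = [5.0, 1.0, 2.0, 3.0]      # (t, x, y, z), illustrative values
x_down = lower(x_up)             # spatial components change sign
s2 = contract(x_up, x_down)      # t^2 - x^2 - y^2 - z^2

print(x_down)
print(s2)
print(raise_index(x_down) == x_up)
```

Lowering and then raising an index returns the original vector, which is the statement $\eta^{\mu\nu}\eta_{\nu\rho}=\delta^{\mu}_{\phantom{\mu}\rho}$ of (14).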

Suggested Exercise

What is $\eta_{\mu\nu}\eta^{\nu\mu}$ ?

Please note that for a general tensor, i.e. an object with multiple indices, the order of indices matters. You can easily convince yourself that it does not for the Kronecker delta $\delta^{\mu}_{\phantom{\mu}\rho}$ but you should be careful. Raising and lowering indices works also for tensors, e.g.

\displaystyle w^{\mu\nu}=\eta^{\nu\rho}w^{\mu}_{\phantom{\mu}\rho}=\eta^{\mu% \sigma}\eta^{\nu\rho}w_{\sigma\rho}\,.

(15)

It is very important to remember that $w^{\mu\nu}$ , $w^{\mu}_{\phantom{\mu}\rho}$ , $w_{\sigma\rho}$ and even $w_{\mu}^{\phantom{\mu}\rho}$ are all different objects!

Another important object to consider is the derivative operator which also exists as a vector and a covector

	$\displaystyle\partial^{\mu}$	$\displaystyle=\frac{\partial}{\partial x_{\mu}}=\Big{(}\frac{1}{c}\partial_{t}% ,-\vec{\nabla}\Big{)}\,,$		(16)
	$\displaystyle\partial_{\mu}$	$\displaystyle=\frac{\partial}{\partial x^{\mu}}=\Big{(}\frac{1}{c}\partial_{t}% ,+\vec{\nabla}\Big{)}\,.$		(17)

The derivative of $x$ itself works as expected

\displaystyle\partial_{\mu}x^{\nu}=\delta_{\mu}^{\ \nu}\quad\text{and}\quad% \partial^{\mu}x_{\nu}=\delta^{\mu}_{\ \nu}\,.

(18)

One can easily show that the momentum

\displaystyle p^{\mu}=\Big{(}E/c,\vec{p}\Big{)}

(19)

is a vector which allows the identification (1).

1.1.3 Lorentz transformation

We now have the tools to actually study Lorentz transformations. We begin by defining

•

the scalar product in Minkowski space which works the same for vectors and covectors

$\displaystyle x\cdot y=x^{\mu}y_{\mu}=x_{\mu}y^{\mu}=\eta_{\mu\nu}x^{\mu}y^{% \nu}=\eta^{\mu\nu}x_{\mu}y_{\nu}\,.$ (20)
•

the Lorentz transformation as a linear transformation

$\displaystyle x^{\mu}\to(x^{\prime})^{\mu}=\Lambda^{\mu}_{\phantom{\mu}\nu}x^{% \nu}\,.$ (21)

Suggested Exercise

Find an explicit form of $\Lambda$ .

We want this transformation to preserve scalar products, i.e. $x^{\prime}\cdot y^{\prime}=x\cdot y$

\displaystyle x^{\prime}\cdot y^{\prime}=\eta_{\mu\nu}(x^{\prime})^{\mu}(y^{% \prime})^{\nu}=\eta_{\mu\nu}\Lambda^{\mu}_{\phantom{\mu}\rho}x^{\rho}\Lambda^{% \nu}_{\phantom{\nu}\sigma}y^{\sigma}\stackrel{{\scriptstyle!}}{{=}}\eta_{% \sigma\rho}x^{\rho}y^{\sigma}\,.

(22)

This allows us to specify the full Lorentz group

\displaystyle\mathrm{O}(1,3)=\Big{\{}\Lambda\ \Big{|}\ \eta_{\mu\nu}\Lambda^{% \mu}_{\phantom{\mu}\rho}\Lambda^{\nu}_{\phantom{\nu}\sigma}=\eta_{\sigma\rho}% \Big{\}}\,.

(23)

This relation is very similar to the defining relation of the orthogonal group $\mathrm{O}(3)$ and the group hence has a similar name. We use the arguments $1,3$ to indicate that there is one time dimension and three spatial dimensions that enter with opposite signs. Similarly to how we wanted rotations to not include reflections, we can classify the elements of $\mathrm{O}(1,3)$ as

•

orthochronous $\mathrm{O}^{+}(1,3)$ which preserves the direction of time by requiring that $\Lambda^{0}_{\phantom{0}0}\geq 1$ .
•

proper $\mathrm{SO}(1,3)$ which preserves orientation by requiring that $\det\Lambda=+1$ .
•

improper which flips orientation, i.e. $\det\Lambda=-1$ .
•

non-orthochronous $\mathrm{O}^{-}(1,3)$ which flips the direction of time by requiring that $\Lambda^{0}_{\phantom{0}0}\leq-1$ .

When we talk about the Lorentz group, we often refer to the proper orthochronous group $\mathrm{SO}^{+}(1,3)$ .

Suggested Exercise

Prove that $\mathrm{SO}^{+}(1,3)$ is a group.

Suggested Exercise

Prove that $p^{2}=p\cdot p=p^{\mu}p^{\nu}\eta_{\mu\nu}$ is invariant under Lorentz transformations. What does this mean for the mass-energy relation $E^{2}/c^{2}-\vec{p}^{2}=m^{2}c^{2}$ ?

When we combine the Lorentz group with invariance under shifts, we obtain the largest group of spacetime symmetry, the Poincaré group.

Since the proper orthochronous Lorentz group is connected, we can obtain every element of $\mathrm{SO}^{+}(1,3)$ by concatenating infinitesimal Lorentz transformations starting from the identity transformation $\delta$ , i.e.

\displaystyle\Lambda^{\mu}_{\phantom{\mu}\nu}=\delta^{\mu}_{\phantom{\mu}\nu}+% \omega^{\mu}_{\phantom{\mu}\nu}+\mathcal{O}(\omega^{2})\,.

(24)

By substituting this into the condition for the Lorentz group (23), we find

\displaystyle\begin{split}\eta_{\sigma\rho}&=\eta_{\mu\nu}\Lambda^{\mu}_{% \phantom{\mu}\rho}\Lambda^{\nu}_{\phantom{\nu}\sigma}=\eta_{\mu\nu}\Big{(}% \delta^{\mu}_{\phantom{\mu}\rho}+\omega^{\mu}_{\phantom{\mu}\rho}\Big{)}\Big{(% }\delta^{\nu}_{\phantom{\nu}\sigma}+\omega^{\nu}_{\phantom{\nu}\sigma}\Big{)}+% \mathcal{O}(\omega^{2})\\ &=\eta_{\rho\sigma}+\eta_{\rho\nu}\omega^{\nu}_{\phantom{\nu}\sigma}+\eta_{\mu% \sigma}\omega^{\mu}_{\phantom{\mu}\rho}+\mathcal{O}(\omega^{2})=\eta_{\rho% \sigma}+\omega_{\rho\sigma}+\omega_{\sigma\rho}+\mathcal{O}(\omega^{2})\,.\end% {split}

(25)

In other words, $\omega$ is antisymmetric

\displaystyle\omega_{\rho\sigma}=-\omega_{\sigma\rho}\,,

(26)

which is why it is so important to keep the order of indices correct.
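The role of the antisymmetry (26) can be illustrated numerically. The sketch below, in plain Python, picks an arbitrary antisymmetric $\omega_{\rho\sigma}$ (the numbers are made up), raises its first index with $\eta$, and checks that $\Lambda=\delta+\epsilon\,\omega$ violates the Lorentz condition (23) only at order $\epsilon^{2}$. The helper names are hypothetical.

```python
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

# Arbitrary antisymmetric omega_{rho sigma} (both indices down); illustrative values.
OMEGA_DOWN = [[0.0, 0.3, -0.1, 0.2],
              [-0.3, 0.0, 0.5, -0.4],
              [0.1, -0.5, 0.0, 0.6],
              [-0.2, 0.4, -0.6, 0.0]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][j] * b[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Raise the first index: omega^mu_sigma = eta^{mu rho} omega_{rho sigma}.
OMEGA_MIXED = matmul(ETA, OMEGA_DOWN)

def residual(eps):
    """Largest entry of |Lambda^T eta Lambda - eta| for Lambda = 1 + eps * omega."""
    lam = [[(1.0 if mu == nu else 0.0) + eps * OMEGA_MIXED[mu][nu]
            for nu in range(4)] for mu in range(4)]
    lhs = matmul(transpose(lam), matmul(ETA, lam))
    return max(abs(lhs[r][s] - ETA[r][s]) for r in range(4) for s in range(4))

r1, r2 = residual(1e-2), residual(1e-3)
print(r1, r2, r1 / r2)   # the ratio should be close to 100, i.e. residual ~ eps^2
```

If `OMEGA_DOWN` is given a symmetric part, the residual becomes linear in `eps` and the ratio drops to roughly 10, which signals that $\Lambda$ is no longer a Lorentz transformation to first order.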

Aside from (21), it is also useful to know how derivatives transform

\displaystyle\frac{\partial}{\partial x^{\mu}}=\frac{\partial(x^{\prime})^{\nu% }}{\partial x^{\mu}}\frac{\partial}{\partial(x^{\prime})^{\nu}}=\Lambda^{\nu}_% {\phantom{\nu}\mu}\frac{\partial}{\partial(x^{\prime})^{\nu}}\,.

(27)

1.1.4 Transformation of the KG field

If we want $\mathrm{SO}^{+}(1,3)$ to be a symmetry of nature, our theories need to be invariant under these transformations. To ensure this, we need to study how, for example, the solution $\Psi(x)$ of the KG equation (3) transforms. $\Psi(x)$ maps every point $x$ in spacetime to a (complex) number $\Psi(x)$ . If $x$ is, for example, the point where $\Psi(x)$ is maximal, this property needs to be retained even after the transformation, i.e.

\displaystyle\begin{split}x&\to x^{\prime}=\Lambda x\,,\\ \Psi(x)&\to\Psi^{\prime}(x^{\prime})=\Psi^{\prime}(\Lambda x)\stackrel{{% \scriptstyle!}}{{=}}\Psi(x)\,.\end{split}

(28)

The way to ensure this is to require

\displaystyle\Psi(x)\to\Psi^{\prime}(x)=\Psi(\Lambda^{-1}x)\,.

(29)

For the KG equation we need

\displaystyle\eta^{\mu\nu}\partial_{\mu}\partial_{\nu}\Psi(x)\to\underbrace{% \eta^{\mu\nu}(\Lambda^{-1})^{\rho}_{\phantom{\rho}\mu}(\Lambda^{-1})^{\sigma}_% {\phantom{\sigma}\nu}}_{\eta^{\rho\sigma}}(\partial_{\rho}\partial_{\sigma}% \Psi)(\Lambda^{-1}x)=(\eta^{\rho\sigma}\partial_{\rho}\partial_{\sigma}\Psi)(% \Lambda^{-1}x)\,,

(30)

where we have used again the defining properties of $\Lambda$ . With this it is obvious that the KG equation (3) (now rewritten in more compact notation)

\displaystyle(\hbar^{2}\partial_{\mu}\partial^{\mu}+m^{2}c^{2})\Psi=0

(31)

is invariant

\displaystyle(\hbar^{2}\partial_{\mu}\partial^{\mu}+m^{2}c^{2})\Psi(x)\to(% \hbar^{2}\partial_{\mu}\partial^{\mu}+m^{2}c^{2})\Psi(\Lambda^{-1}x)

(32)

1.2 A brief digression on units

In the above discussion we often had to write $\hbar$ and $c$ . To avoid doing this, it is common to choose a unit system where $\hbar=c=1$ and we only have a single unit (such as ${\rm GeV}$ ). This is perfectly permissible as we will always know how to convert back to SI units by multiplying with the correct powers of $\hbar$ and $c$ . With the 2019 revision of the SI system, both $c$ and $\hbar$ have definite values without uncertainties as they are used to define the meter and kilogram respectively. A helpful value to remember is $(\hbar c)=197.3\,{\rm MeV}\cdot{\rm fm}$ .
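Converting back from natural units is a matter of multiplying with the right powers of $\hbar c$ and $c$. The following is a small sketch in plain Python, assuming the value $\hbar c=197.3\,{\rm MeV}\cdot{\rm fm}$ quoted above and the exact SI value of $c$; the function names are hypothetical.

```python
HBAR_C_MEV_FM = 197.3          # hbar*c in MeV fm, as quoted in the text
C_M_PER_S = 299_792_458.0      # speed of light in m/s (exact in the SI)

def length_gev_inv_to_fm(length_gev_inv):
    """Convert a length from GeV^-1 to fm by multiplying with hbar*c."""
    hbar_c_gev_fm = HBAR_C_MEV_FM * 1e-3          # MeV fm -> GeV fm
    return length_gev_inv * hbar_c_gev_fm

def time_gev_inv_to_s(time_gev_inv):
    """Convert a time from GeV^-1 to seconds by multiplying with hbar = (hbar*c)/c."""
    hbar_c_gev_m = HBAR_C_MEV_FM * 1e-3 * 1e-15   # GeV fm -> GeV m
    return time_gev_inv * hbar_c_gev_m / C_M_PER_S

print(length_gev_inv_to_fm(1.0))   # 1 GeV^-1 as a length in fm
print(time_gev_inv_to_s(1.0))      # 1 GeV^-1 as a time in s
```

With these inputs, $1\,{\rm GeV}^{-1}$ corresponds to about $0.1973\,{\rm fm}$ or about $6.58\times 10^{-25}\,{\rm s}$. Areas and masses follow the same pattern with $(\hbar c)^{2}$ and $1/c^{2}$ respectively.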

Suggested Exercise

Convert the following values back to SI units:

•

the total cross section for $pp$ collisions at the LHC is $\sigma\approx 250\,{\rm GeV}^{-2}$ . How much is this in ${\rm fb}$ ?
•

the muon lifetime is $\tau\approx 3.3\times 10^{18}\,{\rm GeV}^{-1}$ . How much is this in $\mu{\rm s}$ ?
•

the electron mass is $m\approx 0.511\,{\rm MeV}$ . How much is this in ${\rm kg}$ ?

1.3 Lagrangian mechanics

Let us briefly review a classical system with $n$ degrees of freedom such as a collection of $n/3$ particles that can all move independently of each other. This system is completely described by $n$ generalised coordinates $q_{1},...,q_{n}$ and $n$ generalised velocities $\dot{q}_{1},...,\dot{q}_{n}$ . To find these, we need the Lagrangian $L$ . For a system of particles, this is

\displaystyle L(\{q_{i}\},\{\dot{q}_{i}\},t)=T-V=\frac{1}{2}\sum_{i=1}^{n}m_{i}\dot{q}_{i}^{2}-V(\{q_{i}\},\{\dot{q}_{i}\})\,.

(33)

To derive the equations of motion for this system, we define the action functional $S$

\displaystyle S[\{q_{i}(t)\}]=\int_{t_{0}}^{t_{1}}{\rm d}t\ L(\{q_{i}\},\{\dot% {q}_{i}\},t)

(34)

and use the variational principle, i.e. we require that the action is extremal (this is sometimes referred to as the principle of least action; however, the action does not need to be minimised, even if it often is, as long as it is extremal)

\displaystyle\delta S=0\,.

(35)

From this you can easily derive the Euler-Lagrange equation

\displaystyle\frac{{\rm d}}{{\rm d}t}\frac{\partial L}{\partial\dot{q}_{i}}-% \frac{\partial L}{\partial q_{i}}=0\,.

(36)

Derivation of the Euler-Lagrange equation for $n=1$

In the $n=1$ case we have only one $q$ so that the action only depends on the function $q(t)$ . Fixing the boundary conditions $q(t_{0})=q_{0}$ and $q(t_{1})=q_{1}$ , we vary the path by a small $\epsilon\,\omega(t)$ with $\omega(t_{0})=\omega(t_{1})=0$ . Then,

	$\displaystyle\frac{{\rm d}}{{\rm d}\epsilon}S[q+\epsilon\,\omega]$	$\displaystyle=\int_{t_{0}}^{t_{1}}{\rm d}t\ \frac{{\rm d}}{{\rm d}\epsilon}L% \big{(}q(t)+\epsilon\,\omega(t),\dot{q}(t)+\epsilon\,\dot{\omega}(t),t\big{)}$
		$\displaystyle=\int_{t_{0}}^{t_{1}}{\rm d}t\ \Bigg{[}\omega(t)\frac{\partial L}% {\partial q}\big{(}q(t)+\epsilon\,\omega(t),\dot{q}(t)+\epsilon\,\dot{\omega}(% t),t\big{)}+\dot{\omega}(t)\frac{\partial L}{\partial\dot{q}}\big{(}q(t)+% \epsilon\,\omega(t),\dot{q}(t)+\epsilon\,\dot{\omega}(t),t\big{)}\Bigg{]}\,,$		(37)

since $t$ does not depend on $\epsilon$ . For the action to be extremal, this derivative has to vanish at $\epsilon=0$ , where it reads

\displaystyle\frac{{\rm d}S[q]}{{\rm d}\epsilon}=\int_{t_{0}}^{t_{1}}{\rm d}t\ \Bigg{[}\omega(t)\frac{\partial L}{\partial q}\big{(}q(t),\dot{q}(t),t\big{)}+\dot{\omega}(t)\frac{\partial L}{\partial\dot{q}}\big{(}q(t),\dot{q}(t),t\big{)}\Bigg{]}\,.

(38)

Applying integration by parts to the second term, with $\omega(t_{0})=\omega(t_{1})=0$ such that the boundary terms vanish, gives

\displaystyle\frac{{\rm d}S[q]}{{\rm d}\epsilon}=\int_{t_{0}}^{t_{1}}{\rm d}t\ \omega(t)\Bigg{[}\frac{\partial L}{\partial q}\big{(}q(t),\dot{q}(t),t\big{)}-\frac{{\rm d}}{{\rm d}t}\frac{\partial L}{\partial\dot{q}}\big{(}q(t),\dot{q}(t),t\big{)}\Bigg{]}\,.

(39)

Since $\omega$ is an arbitrary smooth function, the fundamental lemma of the calculus of variations requires that the bracket vanishes.
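The statement that the first variation vanishes on a solution of the Euler-Lagrange equation can be tested numerically. The sketch below is an illustration under assumptions not made in the text: a unit-mass harmonic oscillator with $L=\dot{q}^{2}/2-q^{2}/2$ on $t\in[0,1]$, the variation $\omega(t)=\sin(\pi t)$, and a simple midpoint discretisation of the action. All function names are hypothetical.

```python
import math

def action(path, t0=0.0, t1=1.0, steps=2000):
    """Discretised S = int dt (q'^2/2 - q^2/2) for a unit-mass harmonic oscillator."""
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        q_a, q_b = path(t0 + i * h), path(t0 + (i + 1) * h)
        velocity = (q_b - q_a) / h            # finite-difference velocity
        midpoint = 0.5 * (q_a + q_b)          # position at the interval midpoint
        total += h * (0.5 * velocity ** 2 - 0.5 * midpoint ** 2)
    return total

def first_variation(path, eps=1e-4):
    """Central-difference estimate of dS/d(eps) at eps = 0 for omega(t) = sin(pi t)."""
    def omega(t):
        return math.sin(math.pi * t)          # vanishes at t = 0 and t = 1
    s_plus = action(lambda t: path(t) + eps * omega(t))
    s_minus = action(lambda t: path(t) - eps * omega(t))
    return (s_plus - s_minus) / (2.0 * eps)

def classical(t):
    return math.sin(t)                        # solves q'' = -q

def straight(t):
    return t * math.sin(1.0)                  # same endpoints, not a solution

dS_classical = first_variation(classical)
dS_straight = first_variation(straight)
print(dS_classical, dS_straight)
```

For the classical path the estimate is zero up to discretisation error, while for the straight line with the same endpoints it is close to the analytic value $-\sin(1)/\pi$, so only the former extremises the action.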

Because of its prevalence in mechanics, we define the conjugate momentum to $q_{i}$

\displaystyle\pi_{i}=\frac{\partial L}{\partial\dot{q}_{i}}

(40)

Next, we can define the Hamiltonian

\displaystyle H=\sum_{j=1}^{n}\dot{q}_{j}\pi_{j}-L(\{q_{i}\},\{\dot{q}_{i}\},t% )\,,

(41)

which in the case of (33) is just the total energy of the system

\displaystyle H=\sum_{j=1}^{n}\dot{q}_{j}\underbrace{\frac{\partial L}{\partial\dot{q}_{j}}}_{\dot{q}_{j}m_{j}}-L=T+V\,.

(42)

We can now ask when energy is conserved by calculating the total derivative of $H$ w.r.t. $t$

\displaystyle\frac{{\rm d}H}{{\rm d}t}=\underbrace{\ddot{q}_{j}\frac{\partial L}{\partial\dot{q}_{j}}+\dot{q}_{j}\frac{{\rm d}}{{\rm d}t}\frac{\partial L}{\partial\dot{q}_{j}}}_{{{\rm d}(\dot{q}_{j}\pi_{j})}/{{\rm d}t}}\underbrace{-\frac{\partial L}{\partial q_{j}}\dot{q}_{j}-\frac{\partial L}{\partial\dot{q}_{j}}\ddot{q}_{j}-\frac{\partial L}{\partial t}}_{-{{\rm d}L}/{{\rm d}t}}\,,

(43)

where we have left the sum over $j$ implicit. Using (36), we can cancel the remaining terms and are left with

\displaystyle\frac{{\rm d}H}{{\rm d}t}=-\frac{\partial L}{\partial t}\,.

(44)

In other words, if the Lagrangian $L$ does not explicitly depend on $t$ (for example through a time-dependent potential $V(t)$ ), energy is conserved. This is the first example of what is often called a Noether current or Noether charge, that is a conserved quantity that arises because the Lagrangian has a certain symmetry such as time-invariance.

Noether Theorem

In 1918, the German mathematician Emmy Noether proved that if the Lagrangian $L$ is invariant under small perturbations of the time variable and the coordinates $q_{i}$ , there exists a conserved quantity for each of the $n$ coordinates. To quantify this, let $T$ be the generator of time evolution and $Q_{i}$ the generator of the symmetry

	$\displaystyle t$	$\displaystyle\to t+\delta t=t+\epsilon T\,,$		(45)
	$\displaystyle q_{i}$	$\displaystyle\to q_{i}+\delta q_{i}=q_{i}+\epsilon Q_{i}\,.$		(46)

If $L$ is invariant under this transformation,

\displaystyle\mathbb{O}=H\ T-\pi_{i}Q_{i}

(47)

is conserved. Examples include

•

$T=1$ , $Q_{i}=0$ : energy is conserved if the potential is not time-dependent.
•

$T=0$ , $Q_{i}=1$ : linear momentum is conserved if the potential is shift-invariant.
•

$T=0$ , for each particle $r$ $\vec{Q}_{r}=\vec{n}\times\vec{q}_{r}$ with some vector $\vec{n}$ : angular momentum along the axis $\vec{n}$ is conserved if the Lagrangian is spherically symmetric.
•

If $L$ is invariant under Lorentz boosts, the centre-of-mass system moves with constant velocity.
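The second example ( $T=0$ , $Q_{i}=1$ ) can be illustrated with a small simulation. The sketch below assumes two unit-mass particles in one dimension coupled by the shift-invariant potential $V=k(q_{1}-q_{2})^{2}/2$ and integrates their motion with a velocity-Verlet scheme; the setup, parameter values and function names are illustrative assumptions, not taken from the lecture.

```python
def simulate(q1, q2, p1, p2, k=2.0, dt=1e-3, steps=5000):
    """Velocity-Verlet integration of two unit-mass particles with V = k (q1 - q2)^2 / 2."""
    def force_on_1(a, b):
        return -k * (a - b)        # particle 2 feels the opposite force

    f = force_on_1(q1, q2)
    for _ in range(steps):
        p1 += 0.5 * dt * f         # half kick
        p2 -= 0.5 * dt * f
        q1 += dt * p1              # drift (unit masses: velocity = momentum)
        q2 += dt * p2
        f = force_on_1(q1, q2)
        p1 += 0.5 * dt * f         # second half kick
        p2 -= 0.5 * dt * f
    return q1, q2, p1, p2

start = (0.0, 1.0, 0.3, -1.0)      # q1, q2, p1, p2 (illustrative values)
q1, q2, p1, p2 = simulate(*start)

total_before = start[2] + start[3]
total_after = p1 + p2
print(total_before, total_after)   # total momentum is unchanged
print(start[2], p1)                # individual momenta are not conserved
```

Because the potential only depends on $q_{1}-q_{2}$ , shifting both coordinates by the same amount leaves $L$ invariant, and the conserved quantity (47) is, up to a sign, the total momentum $\pi_{1}+\pi_{2}$ .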

In addition to Lagrangian mechanics, it is sometimes helpful to consider the Hamiltonian EoMs

\displaystyle\dot{q}_{j}=\frac{\partial H}{\partial\pi_{j}}\quad\text{and}% \quad\dot{\pi}_{j}=-\frac{\partial H}{\partial q_{j}}\,.

(48)

Suggested Exercise

Derive these using the Euler-Lagrange equation.
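The Hamiltonian EoMs (48) can also be integrated directly. The following sketch assumes a one-dimensional harmonic oscillator with $H=\pi^{2}/2m+kq^{2}/2$ and $m=k=1$ , evaluates the partial derivatives of $H$ by central differences, and steps forward with a leapfrog scheme. The names and parameter values are hypothetical; this is an illustration, not a general-purpose integrator.

```python
import math

def hamiltonian(q, pi, m=1.0, k=1.0):
    """H = pi^2 / (2m) + k q^2 / 2 for a one-dimensional harmonic oscillator."""
    return pi * pi / (2.0 * m) + 0.5 * k * q * q

def dH_dq(q, pi, h=1e-6):
    return (hamiltonian(q + h, pi) - hamiltonian(q - h, pi)) / (2.0 * h)

def dH_dpi(q, pi, h=1e-6):
    return (hamiltonian(q, pi + h) - hamiltonian(q, pi - h)) / (2.0 * h)

def evolve(q, pi, dt=1e-3, steps=10000):
    """Leapfrog steps of q' = dH/dpi and pi' = -dH/dq."""
    for _ in range(steps):
        pi -= 0.5 * dt * dH_dq(q, pi)     # half step in pi
        q += dt * dH_dpi(q, pi)           # full step in q
        pi -= 0.5 * dt * dH_dq(q, pi)     # half step in pi
    return q, pi

q0, pi0 = 1.0, 0.0
q1, pi1 = evolve(q0, pi0)                 # evolve to t = 10
energy_drift = hamiltonian(q1, pi1) - hamiltonian(q0, pi0)

print(q1, math.cos(10.0))                 # exact solution is q(t) = cos(t)
print(energy_drift)
```

Since this $H$ has no explicit time dependence, (44) predicts that the energy stays constant, and the numerical drift is correspondingly tiny.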

1.4 Classical field theory

Before we can study quantum field theories, let us review classical field theories. A classical field $\phi(\vec{x},t)$ is a function that takes a value at each point in space and time. From a Lagrangian point of view, this means that we have one degree of freedom for each point, requiring us to use integrals rather than finite sums. Since we also want to consider relativity, we further want to avoid treating space differently from time. Therefore, the Lagrangian functional $L$ is now less useful and we instead use the Lagrangian density $\mathcal{L}$ which, confusingly, is also often called the Lagrangian. The generalised coordinate now becomes the field $\phi(x)$ and the generalised velocity the derivative of the field $\partial^{\mu}\phi$ . The action functional is still defined the same way

\displaystyle S[\phi]=\int{\rm d}t\ L[\phi]=\int{\rm d}t\int{\rm d}^{3}x\ % \mathcal{L}[\phi,\partial^{\mu}\phi]=\int{\rm d}^{4}x\ \mathcal{L}[\phi,% \partial^{\mu}\phi]\,.

(49)

Similarly to the $n=1$ case above, we can derive the Euler-Lagrange equation by requiring $\delta S=0$

\displaystyle\frac{\partial}{\partial x_{\mu}}\frac{\partial\mathcal{L}}{\partial(\partial^{\mu}\phi)}-\frac{\partial\mathcal{L}}{\partial\phi}=\frac{\partial}{\partial x_{\mu}}\pi_{\mu}-\frac{\partial\mathcal{L}}{\partial\phi}=0\,,

(50)

where we have once again defined the conjugate momentum $\pi$ .

Suggested Exercise

Do this by adding a small variation $\phi\to\phi+\delta\phi$ and use Gauss’s theorem to remove the surface terms.

Now let us consider the KG equation as defining a field $\phi$ instead of a wavefunction $\Psi$ and derive the EoM (31). We first require the Lagrangian $\mathcal{L}$ . While we could write down a field theory that depends on $\phi$ and $\partial^{\mu}\phi$ in any way we like, we usually want $\mathcal{L}$ to be polynomial in the fields and derivatives. There are plenty of examples where this is not the case though (such as the Sine-Gordon theory or Higgs Effective Theory). We do require $\mathcal{L}$ to be a Lorentz scalar though, meaning that it cannot have open indices, and that $\mathcal{L}$ has units of ${\rm GeV}^{4}$ so that the action $S$ is dimensionless. The first condition implies that derivatives can only enter $\mathcal{L}$ through the combination $(\partial_{\mu}\phi)(\partial^{\mu}\phi)$ as there are no other vectors to contract with. It turns out that the Lagrangian we need is

\displaystyle\mathcal{L}=\frac{1}{2}(\partial_{\mu}\phi)(\partial^{\mu}\phi)-% \frac{1}{2}m^{2}\phi^{2}\,.

(51)

Note that the overall normalisation of $\mathcal{L}$ , fixed here by the $1/2$ in front of the kinetic term, does not matter classically as it does not impact the Euler-Lagrange equations. This changes once we start studying quantum fields, so we keep it canonically normalised from the start. Let us verify that this reduces to the KG equation by computing

\displaystyle\frac{\partial\mathcal{L}}{\partial(\partial^{\mu}\phi)}=\frac{1}% {2}\frac{\partial}{\partial(\partial^{\mu}\phi)}\Big{[}\eta_{\alpha\beta}(% \partial^{\alpha}\phi)(\partial^{\beta}\phi)\Big{]}=\frac{1}{2}\eta_{\alpha% \beta}\Big{[}(\partial^{\alpha}\phi)\delta^{\beta}_{\phantom{\beta}\mu}+\delta% ^{\alpha}_{\phantom{\alpha}\mu}(\partial^{\beta}\phi)\Big{]}=\partial_{\mu}% \phi\,.

(52)

Therefore, we find for (50)

\displaystyle\frac{\partial}{\partial x_{\mu}}\partial_{\mu}\phi-\frac{1}{2}% \frac{\partial(-m^{2}\phi^{2})}{\partial\phi}=(\partial^{\mu}\partial_{\mu}+m^% {2})\phi=0\,,

(53)

which is just (31).

For the Hamiltonian formulation, we need the conjugate momentum which is written through the time derivative

\displaystyle\pi(x)=\frac{\partial\mathcal{L}}{\partial\dot{\phi}}=\dot{\phi}(% x)\,,

(54)

which, like $\phi$ itself, is a scalar field in that it assigns a scalar value to every point in spacetime.

When we considered the case where $n$ is finite, our next subject was to show that total energy is conserved. We can do something very similar here by defining the Hamiltonian density $\mathcal{H}$

\displaystyle\mathcal{H}=\dot{\phi}\pi-\mathcal{L}=\frac{1}{2}\pi^{2}+\frac{1}{2}(\nabla\phi)^{2}+\frac{1}{2}m^{2}\phi^{2}\,,

(55)

but it is actually helpful to think broader. Once we start considering relativity, energy and momentum become frame-dependent and it would be nice to have a covariant description that works with four-vectors rather than focussing on energy and momentum separately. Such an object is the energy-momentum tensor

\displaystyle T^{\mu\nu}=(\partial^{\mu}\phi)\frac{\partial\mathcal{L}}{% \partial(\partial_{\nu}\phi)}-\mathcal{L}\eta^{\mu\nu}\,,

(56)

which contains the energy and momentum densities as well as their fluxes, pressure, and stress. In our specific example, $T^{\mu\nu}$ is a symmetric tensor, which is easy to see,

\displaystyle\begin{split}T^{\mu\nu}=(\partial^{\mu}\phi)(\partial^{\nu}\phi)-% \frac{1}{2}(\partial_{\sigma}\phi)(\partial_{\rho}\phi)\eta^{\rho\sigma}\eta^{% \mu\nu}+\frac{1}{2}m^{2}\phi^{2}\eta^{\mu\nu}\\ =(\eta^{\rho\mu}\eta^{\sigma\nu}-\frac{1}{2}\eta^{\rho\sigma}\eta^{\mu\nu})(% \partial_{\sigma}\phi)(\partial_{\rho}\phi)+\frac{1}{2}m^{2}\phi^{2}\eta^{\mu% \nu}\,.\end{split}

(57)

This is very similar to $\mathcal{H}$ and in fact $T^{00}=\mathcal{H}$ is the energy density.
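As a worked check of $T^{00}=\mathcal{H}$ , one can set $\mu=\nu=0$ in (57) and use $(\partial_{\mu}\phi)(\partial^{\mu}\phi)=\dot{\phi}^{2}-(\nabla\phi)^{2}$ together with $\pi=\dot{\phi}$ from (54), all in natural units:

```latex
\begin{split}
T^{00} &= (\partial^{0}\phi)(\partial^{0}\phi)-\mathcal{L}\,\eta^{00}
        = \dot{\phi}^{2}-\Big[\frac{1}{2}\dot{\phi}^{2}-\frac{1}{2}(\nabla\phi)^{2}-\frac{1}{2}m^{2}\phi^{2}\Big]\\
       &= \frac{1}{2}\dot{\phi}^{2}+\frac{1}{2}(\nabla\phi)^{2}+\frac{1}{2}m^{2}\phi^{2}
        = \frac{1}{2}\pi^{2}+\frac{1}{2}(\nabla\phi)^{2}+\frac{1}{2}m^{2}\phi^{2}=\mathcal{H}\,.
\end{split}
```

The last expression agrees term by term with (55).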

Since we have four translation symmetries (one time and three spatial ones), we expect four conserved Noether currents

\displaystyle\begin{split}\partial_{\nu}T^{\mu\nu}&=(\partial_{\nu}\partial^{% \mu}\phi)\frac{\partial\mathcal{L}}{\partial(\partial_{\nu}\phi)}+(\partial^{% \mu}\phi)\Bigg{(}\partial_{\nu}\frac{\partial\mathcal{L}}{\partial(\partial_{% \nu}\phi)}\Bigg{)}-\underbrace{\Bigg{(}\frac{\partial\mathcal{L}}{\partial\phi% }\partial_{\nu}\phi+\frac{\partial\mathcal{L}}{\partial(\partial_{\rho}\phi)}% \partial_{\nu}(\partial_{\rho}\phi)\Bigg{)}}_{\partial_{\nu}\mathcal{L}}\eta^{% \mu\nu}\\ &=(\partial_{\nu}\partial^{\mu}\phi)\frac{\partial\mathcal{L}}{\partial(% \partial_{\nu}\phi)}+(\partial^{\mu}\phi)\Bigg{(}\partial_{\nu}\frac{\partial% \mathcal{L}}{\partial(\partial_{\nu}\phi)}\Bigg{)}-\Bigg{(}\partial^{\sigma}% \frac{\partial\mathcal{L}}{\partial(\partial^{\sigma}\phi)}\partial_{\nu}\phi+% \frac{\partial\mathcal{L}}{\partial(\partial_{\rho}\phi)}\partial_{\nu}(% \partial_{\rho}\phi)\Bigg{)}\eta^{\mu\nu}=0\,.\end{split}

(58)

Here we have used the chain rule to expand $\partial_{\nu}\mathcal{L}$ and (50) for the first term in the $\eta^{\mu\nu}$ bracket. We have now shown that the energy-momentum tensor is a conserved quantity in a more differential sense.

Let us pause for a moment to understand what this means by considering the $\mu=0$ case, i.e. $j^{\nu}=T^{0\nu}$ for which we still have the same conservation law

\displaystyle 0=\partial_{\nu}j^{\nu}=\frac{\partial j^{0}}{\partial t}+\vec{% \nabla}\cdot\vec{j}\,.

(59)

By integrating this over a small region of space $V$ , we find using Gauss’s divergence theorem for the second term

\displaystyle\frac{\partial}{\partial t}\int_{V}{\rm d}^{3}x\ j^{0}=-\oint_{% \partial V}{\rm d}\vec{S}\cdot\vec{j}\,.

(60)

Since we have already identified $j^{0}$ as the energy density, this implies that the change in energy in the volume is equal to the flux of energy out of this volume. We will often encounter conservation laws like this, from energy or momentum like here to probability densities in wavefunctions or electric charge in electrodynamics.

Before concluding this chapter, let us derive the general solution of the KG equation. Since this is a wave equation, we begin by making an ansatz in terms of plane wave solutions

\displaystyle\phi(x)=A{\rm e}^{-ik\cdot x}+B{\rm e}^{+ik\cdot x}\,,

(61)

where $k^{\mu}$ is a four-vector indicating the direction of travel of our plane wave. Since we have considered only the real KG field where $\phi=\phi^{*}$ , we also need a real solution, i.e. $B=A^{*}$ . We can constrain $k$ by substituting the ansatz into the KG equation

\displaystyle\partial^{2}\phi+m^{2}\phi=-k^{2}\phi+m^{2}\phi=0\,.

(62)

Therefore, $k^{2}=m^{2}$ , which means that the momentum of the plane wave needs to be on its so-called mass shell. Of course, the general solution is a linear combination of plane waves

\displaystyle\phi(x)=\int\frac{{\rm d}^{3}k}{(2\pi)^{3}}\frac{1}{\sqrt{2E_{% \vec{k}}}}\Big{[}a_{\vec{k}}{\rm e}^{-ik\cdot x}+a_{\vec{k}}^{*}{\rm e}^{ik% \cdot x}\Big{]}

(63)

where $k^{2}=(k^{0})^{2}-\vec{k}^{2}=E_{\vec{k}}^{2}-\vec{k}^{2}=m^{2}$ . The factor $1/\sqrt{2E}$ does not really matter at this point but it will become convenient later.
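The mass-shell condition can be checked with finite differences. The sketch below works in $1+1$ dimensions with $\hbar=c=1$ and writes the real plane wave $A{\rm e}^{-ik\cdot x}+A^{*}{\rm e}^{ik\cdot x}$ as a cosine; the parameter values, the evaluation point and the function names are arbitrary illustrative choices.

```python
import math

def plane_wave(k, m, energy=None, amplitude=0.7, phase=0.4):
    """Real plane wave 2|A| cos(E t - k x + phase); on-shell unless `energy` is given."""
    if energy is None:
        energy = math.sqrt(k * k + m * m)    # mass-shell condition E^2 = k^2 + m^2
    def field(t, x):
        return 2.0 * amplitude * math.cos(energy * t - k * x + phase)
    return field

def kg_residual(field, t, x, m, h=1e-3):
    """(d_t^2 - d_x^2 + m^2) phi at (t, x), using central second differences."""
    d2t = (field(t + h, x) - 2.0 * field(t, x) + field(t - h, x)) / h ** 2
    d2x = (field(t, x + h) - 2.0 * field(t, x) + field(t, x - h)) / h ** 2
    return d2t - d2x + m * m * field(t, x)

on_shell = kg_residual(plane_wave(1.5, 2.0), 0.3, -1.2, 2.0)
off_shell = kg_residual(plane_wave(1.5, 2.0, energy=3.0), 0.3, -1.2, 2.0)

print(on_shell)    # close to zero, limited by the finite-difference error
print(off_shell)   # clearly non-zero: E = 3 violates E^2 = k^2 + m^2 here
```

Only the wave with $E_{\vec{k}}=\sqrt{\vec{k}^{2}+m^{2}}$ solves the KG equation, which is why the integral in (63) runs over $\vec{k}$ alone with $k^{0}$ fixed by the mass shell.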

Suggested Exercise

Repeat the above discussion for the complex KG field where $\phi$ and $\phi^{*}$ are independent degrees of freedom. Start with the Lagrangian

\displaystyle\mathcal{L}=(\partial_{\mu}\phi)(\partial^{\mu}\phi^{*})-m^{2}% \phi\phi^{*}\,.

(64)

In a QFT context, $\phi^{*}$ would be the antiparticle to $\phi$ ’s particle.