The trapezoidal integration method

Trapezoidal integration

Say we want to approximate the definite integral \( \int_a^bf(x)dx\) 1 of some smooth function \( f: \mathbb{R}\rightarrow\mathbb{R}\). Depending on the nature of \( f\) we might not be able to obtain an exact solution. Instead we can use numerical integration methods to approximate the value of the integral. Here we will be looking at the trapezoidal method.

The trapezoidal method works by building trapez-shaped panels under the function. With a single panel it could look like this:

Of course the idea here is that it is easy to calculate the area of the panel. We can observe 2 that the area is really the same as this area:

And we have our first approximation:

\( \int_a^bf(x)\approx (b-a)\frac{f(b)+f(a))} 2\)

Now the idea is to keep increasing the number of panels until we have enough precision.

Generally we use \( n\) panels of length \( h=\frac{b-a}n\). Then the panels start, meet and end at \( x_j=a+j\cdot h\) for \( j=0,1,\ldots,n\). The total area of the approximation becomes

\[ T_n(f)=\sum_{j=0}^n \frac{h(f(x_j)+f(x_{j+1}))}2=\frac h2 \sum_{j=0}^n (f(x_j)+f(x_j+h))\]

Of course the more panels we add, the better our approximation becomes. At the same time we need to perform more arithmetic and, what might be worse, we need to have access to more information about the function. The relationship between the quality of the approximation and the amount of function access (and caculation work) is captured by the error.

Trapezoidal error

The error is defined as \( \left|\int_a^b f(x) - T_n(f)\right|\) and can be bounded by the sum of the error in each of the intervals.

The error in a single interval is

\[ \left|\int_{x_j}^{x_j+h}f(x)-\frac h2 (f(x_j)+f(x_j+h))\right|\]

Clearly there is always some 'maximal' point \( x_j\leq c \leq x_j+h\) such that the error is upper bounded by \( h f(c)\):a

(As a sidenote the mean value theorem states that there is also some $c$ where the error is $0$).

But we can find a much better bound than that.

Let \[ g_j(h)=\frac h2 (f(x_j)+f(x_j+h))-\int_{x_j}^{x_j+h}f(x)\] A large \( g\) (negative or positive) equates to a large error, so we will seek to bound \( g\) close to \( 0\).

We observe that:

\[ \frac \delta {\delta h} g_j(h) = \frac 1 2 (f(x_j)+f(x_j+h)+h f'(x_j+h))-f(x_j+h)\]

And so:

\[ g''_j(h) = \frac 1 2 h f''(x_j+h)\]

Now choose \( c_j\) such that \( f''(x_j+h)\leq f''(c_j)\).

It is clear from the definition of \( g_j(h)\) that \( g_j(0)=0\) and hence \( g_j'(0)=0\) and \( g_j''(0)=0\) which we use to see:

\[ \int_0^h g''_j(x) \leq \frac 1 4 h^2 f''(c_n)\]

and so

\[ g_j(h)=\int_0^h g'_j(x) \leq \frac 1 {12} h^3 f''(c_n)\]

By choosing some \( c_u\) such that \( \forall j . f''(c_j)\leq f''(c_u)\) we get an upper bound on \( g\):

\[ \sum_{i=0}^n g_i(h)\leq \frac 1 {12} h^2(b-a) f''(c_u)\]

We can follow the same steps to get a lower bound by choosing \( c_j\) values that minimize \( g''_j(h)\) and then choosing \( c_l\) such that \( \forall j . f''(c_j)\leq f''(c_u)\). Combining these bounds we can get the error bound:

\[ \left|\int_a^b f(x) - T_n(f)\right|\leq \frac 1 {12} h^2(b-a) f''(c),\text{ where } c\in[a,b]\]

Which is a much tighter bound on the error. The critical part here is the \( h^2\) indicating that the error decreases quadratically as the number of panels \( n\) grows.

\[ \left|\int_a^b f(x) - T_n(f)\right| = \mathcal{O}(n^{-2})\]


We will be integrating over \( x\) in the rest of the text so the \( dx\) will be omitted.

\( f(a)(b-a)+(f(b)-f(a))(b-a)/2= f(b)+f(a)(b-a)/2\)