0. Introduction
This tutorial explains how to use the Excel numerical library XLPack. You can download the example worksheets. It is recommended to run them on your PC and it will help you understand this tutorial.
In this chapter, overview of the library is described. Chapter 1 explains how to obtain and use example worksheets. In the following chapters, usage of numerical library routines is described with some examples. All examples can be run by using XLPack Basic (6.0).
The details of the numerical methods are not described in this tutorial, but other series of documents describing the detail algorithms of numerical methods will be posted.
Structure of XLPack
XLPack consists of worksheet function library, solver and VBA subroutine/function library. These can be divided into two types: tools for numerical calculations without programming and tools that support programming.
Numerical calculation without programming
Worksheet functions or solver allow you to perform calculations without the need for programming.
Worksheet function library
Worksheet functions can be directly entered in Excel worksheet. There are two types: functions that return one value and functions that return a group of values, such as a vector or matrix.
The major examples of the former are special functions, which are used as part of formulas like ordinary Excel functions. The other example of this type is the data input type quadrature function.
The latter is a function which returns the array type result unlike ordinary Excel functions. The major ones are the linear calculation functions (system of linear equations, eigenvalue and eigenvectors, linear least squares method, singular value decomposition, etc.). Further, interpolation, algebraic equation and FFT routines are included in this type. To enter such functions in the Excel worksheet, select the cells where the solution is to be placed, and click “fx” of formula bar or press Shift+F3 then “Insert Function” dialog box appears. After selecting function and entering all parameters, do not click “OK” nor press Enter, but press Ctrl+Shift+Enter.
XLPack solver
Like the Excel solver, XLPack solver is the add-in that calculates by referencing cells specified by menu operations. Calculations can be performed simply by entering the formula and the necessary data into the worksheet.
XLPack solver can obtain the solutions for nonlinear equations, nonlinear optimization, nonlinear least squares method, quadrature and ordinary differential equations.
Supporting VBA numerical programming
VBA subroutine/function library
VBA subroutine/function library is a set of subroutines and functions which can be called from VBA programs. When VBA numerical programming is required, by using this library, users just need to develop mainly input/output part and professional programming on numerical calculation is not required since the library covers it.
You can use sample worksheets as a template so that you can use the library easily. You can also refer to the sample codes included in the reference manual except for simple ones.
Problems covered by XLPack
XLPack covers almost all areas of general numerical computation. Each area is briefly described below.
Special function
System of linear equations
Eigenvalues and eigenvectors
Approximation
Nonlinear equations
Nonlinear optimization
Numerical integration (Quadrature)
Ordinary differential equations
Fast Fourier transform (FFT)
Note – Partial differential equations and their related areas such as systems of linear equations with large sparse coefficient matrices are under development. Statistical computation is currently not included.
Special functions
Worksheet functions and VBA functions are provided for special functions such as Bessel functions, Airy functions, elliptic integrals, exponential and logarithmic integrals, gamma functions, beta functions, psi functions, and error functions. These functions can be used as the builtin functions in the formulas.
System of linear equations
For example, a system of two linear equations looks like:
a11x1 + a12x2 = b1 a21x1 + a22x2 = b2
where, x1 and x2 are unknowns.
This system of equations can be written in matrix form as follows.
Ax = b where, A = (a11 a12) (a21 a22) b = (b1) (b2) x = (x1) (x2)
A is called as a coefficient matrix, b is called as a right hand side vector, and x is called as a solution vector.
If the inverse matrix A-1 is computed, let’s multiply it to both sides of the equation from the left.
A-1Ax = A-1b
A-1A is a unit matrix, then x can be computed by the following formula.
x = A-1b
Since Excel has the builtin function MINVERSE which computes an inverse matrix, a solution vector x can be computed by multiplying the inverse matrix, which is computed by using this function, to a right hand side vector b from the left. However, it is recommended to compute a solution directory without computing an inverse matrix, because the inverse requires more arithmetic operations and generally produces less accurate solutions.
Then the methods based on Gaussian elimination without computing an inverse matrix are widely used. LU decomposition for general nonsymmetric matrices and Cholesky decomposition for symmetric matrices are the today’s standard methods.
First, a coefficient matrix A is decomposed as below.
A = LU, where L is lower triangular matrix, U is upper triangular matrix -- LU decomposition A = LDLT, where L is lower triangular matrix, D is diagonal matrix -- (modified) Cholesky decomposition
Next, in the case of LU decomposition, Ux is substituted by y and y is computed from Ly = b, and finally x is computed from Ux = y.
LUx = b Ly = b Ux = y
In the above process, solutions can be easily computed since L and U are triangular matrices. The solution for Cholesky decomposition of symmetric matrix can be similarly computed.
Dgesv is the VBA subroutine for the former and Dposv for the latter. Worksheet functions are WDgesv and WDposv, respectively.
When solving system of linear equations, it is recommended to compute a condition number.
The condition number is defined as follows.
cond(A) = ||A|| ||A-1||
The value of a condition number is 1 for the unit matrix I which has the best condition, ∞ for the singular matrix which has the worst condition, and the intermediate value between 1 and ∞ for other matrices. To compute a condition number straightforward by the definition, the inverse of the matrix is required and much computation effort is necessary. Therefore, some estimate is used in general.
By using condition number, relative error of obtained solution x can be estimated as below.
||x - x||/||x|| ≦ cond(A) ε
Where bold face x is the true solution, ε is the machine epsilon(*). When a condition number is very large, a problem is called as an ill-conditioned problem. If a condition number is greater than 1015, there may be no significant digits in the obtained solution.
Worksheet functions WDgesv and WDposv return estimated condition number. For the case of VBA programs, Dlange and Dgecon or Dlansy and Dpocon can be used to compute an estimated condition number.
—
(*) Machine epsilon is the error caused by an approximation of a real number by a floating point number. It is 1.11×10-16 in the case of Excel (double precision in IEEE form).
Eigenvalues and eigenvectors
Eigenvalue problem of matrix A is to obtain scalar λ and vector x which satisfy the following equation.
Ax = λx, x ≠ 0
λ is called as an eigenvalue and x is called as an eigenvector. The matrix of order n has in general n eigenvalues and corresponding n eigenvactors. Since symmetric matrices appear in many applications, let’s consider about the symmetric matrix case here. The eigenvalues of symmetric matrices are real numbers.
Eigenvalues satisfy the following equation.
det(A - λI) = 0
This is called characteristic equation which is n-th degree polynomial in λ. Although λ can be obtained by solving this polynomial equation, more efficient method is used in numerical computation.
First, matrix A is reduced to tridiagonal matrix (matrix with non-zero elements only on the diagonal and first diagonal below and above it) without changing its eigenvalues. This transformation can be completed with finite number of arithmetic. Next, eigenvalues of this tridiagonal matrix are computed by iterative method such as QR method. This iterative process on the tridiagonal matrix should be efficiently converged with less total effort than adopting iterative method from the initial stage without tridiagonalization. Using subroutine Dsyev with this method or worksheet function WDsyev, all eigenvalues and eigenvectors of symmetric matrices can be computed.
Approximation
To approximate functions, least squares method is often used. This can be regarded as fitting the experimental data by a model function. When a model function is a linear combination of some basis functions, linear least squares method is used. For other model functions, nonlinear least squares method is used. These two solving methods are completely different. The interpolation is sometimes confused with the least squares methods. However, it is used to obtain unknown function values in between given function values.
Linear least squares problems
Suppose m measured data
(xi, yi) = (xi + exi, yi + eyi), i = 1, ..., m
are given. Where each data (xi, yi) contains measurement error (exi and eyi) against true value (xi, yi).
Suppose each true value satisfies some function F.
yi = F(xi)
Let’s approximate the function F(x) by the linear combination f(x) of n basis functions φj.
F(x) ≒ f(x) = c1φ1(x) + c2φ2(x) + ... + cnφn(x)
Where f(x) is called as the model function. When enough many numbers of data points are available (m ≫ n), the parameters (cj) are determined to best fit the experimental data by the model function.
The most commonly used basis function φj of the model function is a polynomial of degree (j -1).
φj(x) = xj-1
This is the polynomial approximation.
f(x) = c1 + c2x + ... + cnxn-1
Linear least squares method can also be used for model functions other than polynomials if they are linear combinations of some basis functions. For example,
Trigonometric:
f(x) = c1sin(x) + c2sin(2x) + ... + cnsin(nx)
Exponential (λj is constant):
f(x) = c1exp(λ1x) + c2exp(λ2x) + ... + cnexp(λnx)
When m measured data are substituted for the model function and arranged lengthwise, the following system of linear equations is obtained (called as observation equation).
Ac = y where, A is m x n coefficient matrix, and Aij = φj(xi) (i = 1 to m, j = 1 to n) c is n solution vector, and the elements are the parameters cj (j=1 to n) y is m right hand side vector, and the elements are the data yi (i=1 to m)
This system cannot be solved in usual way since the number of equations is greater than the number of unknowns (m > n). Then let’s compute the approximate solution. One of the methods is the least squares method which minimizes the square of the norm ||Ac – y|| (*). In this case, solution c is called as the least squares solution.
Least squares solution can be obtained as the solution where the derivative of ||Ac – y||2 equals to 0 and it is the solution of the following system of linear equations.
ATAc = ATy
This equation is called as the normal equation which can be solved in usual way since (number of unknowns) = (number of equations) = n. However, the condition number of the coefficient matrix of normal equation is likely to be large because Cond(ATA) = (Cond(A))2. Then the truncation error may be large and this method is not used very much.
Instead, QR decomposition (A = QR) is widely used. Q is mxm orthogonal matrix (QTQ = I). R is upper triangular matrix ((n+1)-th to m-th rows are 0). After this decomposition, the original system of linear equations is transformed as below.
Ac = y QRc = y QTQRc = QTy Rc = QTy
In the last equation, c can easily be computed since R on left hand side is upper triangular matrix. The obtained c is the lease squares solution.
The number of parameters n should not be unnecessarily large. For example, approximation with 10th order polynomial will be meaningless if the data are obviously on the straight line. If n is too large, columns of the coefficient matrix A becomes linearly dependent and eventually QR decomposition will fail. This is called as the rank deficiency (rank is the maximum number of linearly independent columns of the matrix). In such case, the model function should be reconsidered.
Using subroutine Dgels and worksheet function WDgels which implement the above computation method, an accurate linear least squares solutions can be computed.
—–
(*) This means to minimize the square sum of residuals Σri2 (i = 1~m), where ri = f(xi) – yi
Nonlinear least squares problems
Different from linear least squares method, a model function is a general function f(x) with n parameters.
F(x) ≒ f(x; c1, c2, ... , cn)
One of the parameter fitting methods is the nonlinear least squares method, i.e. the parameters (cj) are determined to minimize the sum of squares of the residuals.
Σri2 (i = 1 to m) where, ri = f(xi) - yi
In the case of linear least squares method, the parameters are computed by solving system of linear equations. In the case of nonlinear least squares method, however, the parameters are obtained by the iterative computation. The computation is started from the approximated solution given as the initial value and the process to improve the approximation is iterated until the convergence to the least squares solution. If the appropriate initial approximation is not given, the iterative process may not be converged.
To solve nonlinear least squares problems, VBA subroutine Lmdif1 can be used. It is the implementation of Marquardt method. It is easy-to-use routine which calculates derivatives (Jacobian matrix) by forward-difference approximation and users are not required to compute derivatives analytically.
Lmdif1 can also be used from XLPack solver.
Interpolation
The interpolation method is sometimes confused with the least squares method. In the least squares method, it is assumed that the errors are included in the data points, and the approximation formula is in general not exactly on the data points. The interpolation, however, smoothly connects the data points which do not include the errors. The approximation formula passes through all the data points. If the errors are included, unnaturally bent formula will be obtained.
There are numerical tables for a variety of functions. When using them, interpolation was used to obtain the values of the points not listed in the numerical table. There is no such demand these days because we have a computer or a calculator. However, interpolation is still useful, for example, to quickly obtain the values of the functions which require large effort to compute.
The major interpolation methods are as follows.
- Polynomial interpolation (Lagrange interpolation)
Approximation with (n-1)-th degree polynomial which passes through n data points
- Piecewise polynomial interpolation
The interval is divided into a collection of subintervals and an approximating polynomial is constructed on each subinterval. The most common method of this type is cubic spline interpolation. Cubic polynomials are used between each successive pair of nodes, and the interpolation is constructed twice continuously differentiable on the interval to smoothly joint the subintervals
To compute cubic spline coefficients and interpolant values, VBA subroutines Pchse and Pchfe or worksheet functions WPchse and WPchfe can be used.
Nonlinear equations
There is in general no numerical method to solve a nonlinear equation with finite number of arithmetic operations. The iterative method starting from the given initial approximated solution and improving it to converge to the solution is used.
Algebraic equation
When a nonlinear function is a polynomial function, the equation is called as an algebraic equation.
p(x) = a0xn + a1xn-1 + ... + an-1x + an = 0
2, 3 and 4th order equations can be solved by analytic methods. However, 5th and higher order equations should be solved by iterative methods. There are many methods proposed for higher order algebraic equations and users should choose an appropriate one depending on the problem. When only one real solution is necessary, the numerical method for single variable general nonlinear equation described below can also be used.
VBA subroutine Rpzero2 and worksheet function WRpzero2 solve the algebraic equation of real coefficients by Newton method. Complex solutions can be computed.
Nonlinear equation of one variable
Let’s find the real root of single variable general nonlinear function.
f(x) = 0
The most common numerical method for this problem is Newton’s method. It converges fast and many numerical methods have been developed based on it. Other popular methods are the iterative methods which start from the initial interval including a root and reduce it such as bisection method. These methods are not fast to converge but stable.
VBA subroutine Dfzero is the combination of bisection and interpolation methods. It finds a root within given initial interval faster than bisection method.
Dfzero can also be used from XLPack solver.
System of nonlinear equations of multiple variables
System of nonlinear equations with multiple variables is one of the problems of numerical methods which are difficult to solve.
f1(x1, x2, ..., xn) = 0 f2(x1, x2, ..., xn) = 0 ... fn(x1, x2, ..., xn) = 0
Any nonsingular system of linear equations has a unique solution. In the case of system of nonlinear equations, however, may have no real solution, or may have multiple solutions.
Newton’s method (which is generalized to multiple variables) or its modified algorithm are used. These methods find the solution near given initial approximation but do not find the global solutions. To obtain a desired solution, therefore, the initial approximation as near as the target solution should be supplied.
To solve a system of nonlinear equations, VBA subroutine Hybrd1 can be used. It is the implementation of Powell’s hybrid method which is one of the modified Newton’s methods. It is easy-to-use routine which calculates derivatives (Jacobian matrix) by forward-difference approximation.
Hybrd1 can also be used from XLPack solver.
Nonlinear optimization
Nonlinear optimization is the problem of finding a minimum point (or maximum point when the sign is inverted) of nonlinear function. The iterative method is used to solve this problem.
Nonlinear optimization of function of one variable
Let’s find the minimum point of single variable general nonlinear function.
min f(x)
The typical method is the iterative method which starts from the initial interval including a minimum point and reduces it such as golden section search. This method is not fast to converge but stable like a bisection method for nonlinear equation.
VBA subroutine Dfmin is the combination of golden section and quadratic interpolation methods. It finds a minimum point within given initial interval. It is similar to subroutine Zeroin in the case of nonlinear equation.
Dfmin can also be used from XLPack solver.
Nonlinear optimization of function of several variables
Let’s find the minimum point of general nonlinear function of several variables. This is also one of the problems of numerical methods which are difficult to solve.
min f(x1, x2, …, xn)
There are various methods for solving nonlinear optimization. The direct search methods such as simplex method use only function values. The conjugate gradient method, quasi-Newton method and other methods use derivatives as well as function values. These methods find the local minimums but can not find the global minimum. There may be multiple local minimums. Therefore the initial approximation as near as the target minimum should be supplied. Otherwise the algorithm may converge to the undesired minimum.
The nonlinear optimization may be applied to the problems of nonlinear least squares or system of nonlinear equations by using the residual sum of squares as the object function. However, a dedicated numerical method for each problem should be used since the nonlinear optimization does not make full use of the information characterizing each problem.
To solve a nonlinear optimization, VBA subroutine Optif0 can be used. It is the implementation of quasi-Netwon method. It is easy-to-use routine which calculates derivatives (Jacobian matrix) by finite difference approximation.
Optif0 can also be used from XLPack solver.
Numerical integration (Quadrature)
Suppose f(x) is defined on the interval [a, b], we numerically compute the definite integral.
∫ f(x) dx [a, b]
There are two types of quadrature routines. One is a function input type which is used when integrand values can be computed at any points in the interval of integration on request. The other is a data input type which is used when integrand values at some points including the interval are known.
Function input type
There are many classical quadrature formulas, for example trapezoid, Simpson, Newton-Cotes, Gauss, etc. However, it is not easy to estimate the error in the result obtained by these formulas. In these days, new formulas, which are effective when calculating both integral value and its error estimate, are used. A Gauss-Kronrod formula is one of such formulas. The error estimate provides users with a useful information on the accuracy of the result. It also takes an important role for an adaptive quadrature algorithm. An adaptive quadrature is reliable and efficient general purpose routine which adaptively refines subintervals according to the error estimate and computes the integral value satisfying given target accuracy.
VBA subroutine Qk15 is the Gauss-Kronrod quadrature with fixed abscissas (15 points). It computes the integral and its estimated error with 15 function evaluations. VBA subroutine Qag is the Gauss-Kronrod adaptive quadrature. For smooth integrand functions, Qk15 will compute with satisfactory accuracy. Qag adaptive quadrature is recommended when an integrand is not smooth function or a user wants to compute with the specific target accuracy even if more function evaluations are required.
When one or both of the interval ends are not finite values, VBA subroutine Qagi can be used. It is also the Gauss-Kronrod adaptive quadrature
Qag and Qagi can also be used from XLPack solver.
Data input type
A value of integral is obtained by interpolating the given data points and computing the integration of interpolant. VBA subroutines Pchse and Pchia, or worksheet functions WPchse and WPchia for cubic spline interpolation can be used for this purpose.
Ordinary differential equations
An ordinary differential equation (ODE) appears frequently in the scientific calculation. Since not many ODEs can be solved analytically, numerical methods are often used.
Initial value problem
First order ordinary differential equation (ODE) can be written in the following form.
y' = f(y, t)
In the initial value problem of ODE, a function value y0 is given at initial point t = t0 and a function value y(tf) at final point t = tf is computed. The following is called as the initial condition.
y(t0) = y0
System of first order differential equations
An initial value problem of first order ODE can be extended to a system of first order ODEs by replacing scalar y by vector y.
y' = f(y, t) y(t0) = y0
For example, the system of two equations can be written as follows.
y1' = f1(y1, y2, t) y2' = f2(y1, y2, t)
In this case, two initial conditions are required.
y1(t0) = y1,0 y2(t0) = y2,0
Higher order ordinary differential equations
Higher order ODE, i.e. ODE including second or higher derivatives, can be solved by reducing it to a system of first order ODEs. Let’s consider the following equation as an example.
y''' = f(y, y', y'', t)
By replacing y1 = y, y2 = y’, y3 = y”, the following system of first order ODEs is obtained.
y1' = y2 y2' = y3 y3' = f(y1, y2, y3, t)
In this case, three initial conditions are required.
y(t0) = y0 y'(t0) = y'0 y''(t0) = y''0
A system of higher order ODEs can be solved with similar method.
Solving initial value problem of ODE
An initial value problem of ODE can be solved by one-step methods such as Euler method and Runge-Kutta method, or multi-step methods such as Adams method. Similar to new formula of quadrature, some methods which are efficient to calculate both solution and its error estimate have been developed. For example, the embedding formula of Runge-Kutta method can computes two approximate solutions of different orders using one formula. One of these solutions is used to estimate the error. Then an adaptive initial value solver with automatic step size control depending on the error estimation has also been developed.
VBA subroutine Derkf is the adaptive solver by 4(5)th order Runge-Kutta-Fehlberg method which is one of embedding forms of Runge-Kutta methods. This subroutine is equivalent to the RKF45 routine which has been widely used. Derkf can handle a system of first order ODEs.
Derkf can also be used from XLPack solver.
Fast Fourier transform (FFT)
The discrete Fourier transform (DFT) computes the Fourier transform of the discrete data numerically. It is used for a variety of fields such as digital signal processing, image processing, etc. The algorithm called as the fast Fourier transform (FFT) allows us to compute DFT with about two orders of magnitude less effort.
Suppose f(x) is the periodic function with the period of 2π, and the function value at the breakpoints xj=2πj/N which divide the interval [0,2π] into N subintervals, f(xj) is denoted as fj (for j=0 to N-1).
If the f(x) is real function, DFT can be computed by the following equation.
Fk = (1/2)(ak - i bk)
Where, the coefficients ak and bk are given by
ak = (2/N)Σ fj cos(2πkj/N) (Σ for j=0 to N-1) (k=1 to N2) bk = (2/N)Σ fj sin(2πkj/N) (Σ for j=0 to N-1) (k=1 to N2) where N2=N/2-1(if N is even number), N2=(N-1)/2(if N is odd number)
b0 is zero, bN/2 is zero if N is the even number, and a0 and aN/2 are given by the following formula with factor of 1/2.
a0 = (1/N)Σ fj (Σ for j=0 to N-1) aN/2 = (1/N)Σ fj cos(jπ) = (1/N)Σ(-1)j fj (Σ for j=0 to N-1)
If the f(x) is real function, fj equals to conj(fj), and then the DFT will be Fk=conj(FN-k). Therefore, Fk can be computed only for k=0 to N/2 (conj() denotes the conjugate complex).
The inverse transform of DFT, which is called as the inverse discrete Fourier transform (IDFT), is defined as follows.
fj = Σ(ak cos(2πkj/N) + bk sin(2πkj/N)) (Σ for k=0 to N/2) (j=0 to N-1)
Note ? The normalization coefficients (1/N) of DFT/IDFT and the sign of exponential part may be different in the different documents.
VBA subroutine Rfft1f, Rfft1b, Rfft1i or worksheet function WRfft1f, WRfft1b can be used to compute one-dimensional real Fourier transform or backward transform.