Integration of Reaction Mechanisms and Estimation of Mechanism Parameters:

A Description of libparest.a

Overview
This file describes the usage of the driver programs which link the parameter estimation package GREG with the integration and sensitivity analysis of DDASAC and DASSLSO, as well as the input files needed by the main() program.

The hope is that this driver program will be general enough for almost all applications of GREG. If anyone has ideas about how to better extend this program, please email David Dooling with your ideas.

Instructions on how to create the executable are at the end of this file.

The user must supply five functions and two const double values in a C file, typically model.c. The first line of the file should have the following line:

#include <parest.h>
The rest of the file should define the following functions and constants:
const double Renergy, Rgas;

int ModelODEs(double *y, double *yprime, double *rpar, int *ipar,
InputStructure *Input);

int AlgebraicEqns(double *y, double *f, double *rpar, int *ipar,
InputStructure *Input, int check);

int EqnType(double *ework, int nSpecies, int nSurfaceSpecies,
InputStructure *Input);

void TimeFcn(double time, double *y, double *yprime, double *flowin,
int nSpecies, int nSurfaceSpecies, double *rpar, int *ipar,
InputStructure *Input);

void dump2f(FILE *ffile, double t, int nstvar, double *y, double *f,
double *wt, int nspar, double *deriv, int lenfc,
int nSpecies, int nSurfaceSpecies, InputStructure *Input,
int iout);

These functions contain all of the model specific information needed for integration and parameter estimation. The constant values allow the user to specify the units for energy, pressure, and volume (these are discussed in the integrator input file section). ModelODEs() should contain all of the species balance equations. AlgebraicEqns() should contain any model specific algebraic equations which override the species balance equations such as a site balance equation. In addition, AlgebraicEqns() should check that the algebraic equations are satisfied, e. g., assigning a variable forcing the equations to be satisfied, when int check is true (non-zero). EqnType() must place a zero (0.0e0) in the ework array for any species described by an algebraic equation. TimeFcn() allows the user to specify any time (or distance in the case of a PFR) dependent parts of the model, such as changing input flows for a CSTR or an imposed temperature profile in a PFR. The user must use the dump2f() function to place the appropriate responses into f[] for GREG to compare to obs[]. Also if nspar > 0, the user must place the appropriate sensitivities into deriv[]. If f = NULL, the program should not place either the responses or the sensitivities in f or deriv arrays, respectively. NOTE: The responses passed to dump2f for PFR reactors are flows, not partial pressures; therefore, the user must convert them to partial pressures if they are desired. The program ode_gen generates a file with the appropriate forms of the above functions given an input file describing the reactions the species undergo. If they do not exist in the present working directory, ode_gen also outputs default forms of the GREG, DDASAC/DASSLSO, and parameter input files, named greg10.in, ddat.in, and param.dat, respectively. The format of the ode_gen input file is identical to the output of NetGen (isn't that convenient?).

NOTE: the reactor equations used in libparest.a assume surface species in catalytic reactions are the first nSurfaceSpecies elements of the array of species, y. If your model includes surface species and you do not have them first in the model, the results you obtain with these codes will be incorrect (if the integrator even works at all!). ode_gen automatically places any surface species in the model (species with `m' as the first character of their symbol) in the correct place in the array.

Several programs are provided in libparest.a to help the user calculate rate constants, They are:

double k_Arrhenius(double A, double E);
double k_lfer(double A, double E0, double alpha, double delH);
double k_lferuni(double A, double E0, double alpha, double delH);
double k_lferbi(double A, double E0, double alpha, double delH);
double k_ads(double k0, double Tref);
double k_temkin(double A, double E0, double range, double coverage);
double k_adsrep(double A, double E0, double range, double coverage);
k_Arrhenius does what one would expect, it calculates the value of the rate constant at the current temperature give a preexponential factor A and an activation energy E. k_lfer, k_lferuni, and k_lferbi are all identical and it is recommended you use k_lfer, since the other two simply call this one. These three functions use a linear free energy relationship (LFER) to calculate the value of the rate constant at the current temperature given a preexponential factor A, intrinsic activation barrier E0, transfer coefficient alpha, and heat of reaction delH. The linear free energy relationship calculates the activation energy:
E = E0 + alpha * delH.
The activation energy and preexponential factor are then passed to k_Arrhenius. k_ads calculates the rate of unactivated adsorption at the reactor temperature, given the rate at at a reference temperature and the reference temperature. This function assumes an inverse square root dependence of the rate on temperature. k_temkin calculates a rate constant whose activation energy depends on the coverage of a surface species. The formula for the activation energy is:
E = E0 - range * coverage.
The advantage of this function to simply mapping its arguments to k_lfer is that k_lfer contains checks for thermodynamic consistency which are inappropriate for coverage dependent activation energies (although it does check to make sure the activation energy is nonnegative). k_adsrep calculates a rate constant whose activation energy varies due to adsorbate-adsorbate repulsions. The function assumes adsorbates will be isolated until a coverage of 0.25 monolayer, therefore no change in activation energy occurs until the coverage is greater than 0.25. At that point, the function uses the equation
E = E0 - range * (coverage - 0.25)/0.75
to calculate the activation energy.

The functionality and options of the driver program are best exhibited through the explanation of the input files required by the driver program. The format of the input files are below. The usage of the program is:

usage: parest [options]
available options:
-b
Read parameter bounds from parameter file
-d FILE
Read integrator and model information from FILE, rather
the default, ddat.in
-g FILE
Read optimization information from parameter FILE
If not specified, only integration is performed
-h
Display this message
-l [XXX]
Take natural logarithm of parameters larger than XXX
or 1.0e+5 if XXX is not specified
-o FILE
Read observation information from FILE
-p FILE
Read parameter information from FILE
-t
Read parameter type from parameter file
-x
Read chimax value from parameter file
All arguments are optional, but will most often be used for parameter estimation. The greginfile contains the input to GREG if parameter optimization is to be done. If only integration is desired, the default values assumed by the program are appropriate and no greginfile is needed. The program assumes an integration input file named ddat.in is located in the working directory if the -d flag is not used. Observation and parameter files need only be specified if they are used. Note that observations and parameters are necessary for parameter optimization.

Three or four output files are generated during execution of the parameter estimation. The file greg.out contains the output from GREG. The file fort.lun+1 (where lun is the value supplied by the user in the greginfile) contains any information printed by DDASAC (typically error messages and therefore not always generated). DASSLSO prints errors to stdout. The final two output files are graph.out and f.out. The file graph.out contains all of the responses from the most recent execution of the integrator. f.out is the file pointer supplied to the user supplied dump2f() function in which the user can write anything she desires. If species classes/lumps are used, a file named graphClass.out is also generated which contains the responses for each class.


--------------------------------------------------------------------------------

Input File Description and Explanation
This section describes the optional input files for the integration/parameter estimation functions.


--------------------------------------------------------------------------------

greginfile
This file contains the parameters needed by GREG in the following order (Notes/explanations are in curly brackets, {}):

level
iprob
lun
itmax {-999 if grprep default, 25, to be used}
lists {-999 if grprep default, 2, to be used}
idif {-999 if grprep default, 1, to be used}
jnext {-999 if grprep default, 0, to be used}
atol {< 0 if grprep default, 1.0e-6, to be used}
rptol {< 0 if grprep default, 1.0e-5, to be used}
rstol {< 0 if grprep default, 1.0e-1, to be used}
lenf {if level == 12}
nresp {if level >= 20}
ndet {if level >= 20}
These parameters are described in the GREG documentation. If zero (0) is entered for itmax, GREG is not called and only integration is performed. Note this is also done when no greginfile is specified.


--------------------------------------------------------------------------------

obsfile
This file contains information about the experimental observations in the following format (optional inputs are in parentheses, ()):

nob
wf_type {see below}
obs[0] (wf[0])
obs[1] (wf[1])
...
obs[nob - 1] (wf[nob - 1])
The type of weighting factor, wf_type can be 'w' (or any string that begins with 'w' or 'W'), 'a' (or any string that begins with 'a' or 'A'), or anything else. A string that begins with 'w' or 'W' indicates that you are going to provide the weighting factors, wf[i], for each observation which should be placed on the same line as the observation. A string that begins with 'a' or 'A' (auto) tells the program to calculate a weighting factor for you. The weighting factor, wf[i], will be equal to 1/(sqrt(0.10 * obs[i])). This formula assumes a relative error in each observation of 10% and calculates the weighting factor accordingly. Therefore, lower values are given higher weighting, which one normally desires. If any other string is entered (one must be entered), a weighting factor of one (1.0e0) is given to all observations, indicating that all observations have the same absolute error. Note that different sets of weighting factors can give widely different results.


--------------------------------------------------------------------------------

parfile
This file contains the information about and the initial guesses for the parameters to be estimated by GREG and/or used by the integrator. The format is as follows:

npar_total
npar_opt
npar_sensitivity
nkrescale
(type) par[0] (bndlw[0] bndup[0]) (chmax[0])
(type) par[1] (bndlw[1] bndup[1]) (chmax[1])
...
(type) par[npar - 1] (bndlw[npar-1] bndup[npar-1]) (chmax[npar-1])
(type) par[npar]
(type) par[npar + 1]
...
(type) par[npar_total - 1]
npar_total is the total number of parameters, optimizable plus constant. npar_opt is the total number of parameters to be estimated by GREG and npar_sensitivity is the number of optimizeable parameters the integrator should provide sensitivities for (NOTE: npar_opt must be greater than or equal to npar_sensitivity. Therefore, if you want sensitivities without optimization, set npar_opt equal to npar_sensitivity and do not use the -g flag). Any remaining optimizable parameters will have their sensitivities estimated by GREG. Parameters which are not to be estimated by GREG but which are needed by the integrator are supported. The number of this type of parameter is calculated automatically from npar_total and npar_opt. The values of these constant parameters should be entered in the parfile after the values for the optimization parameters if the -t flag is not used. The optional command line argument, -b, indicates whether or not the user is supplying upper and lower bounds on the parameters to override the GRPREP values of -1.0e30 and 1.0e30, respectively. The -x flag indicates the user is supplying chmax values for the parameters to override the default GRPREP value of -0.5 (see GREG documentation for an explanation of chmax).

The -t flag allows the user to specify the type of parameter being entered. There are five types: c, o, r, rs, and s. `c' specifies a constant parameter, `o' specifies a parameter to be optimized with GREG providing the sensitivities, `r' specifies a parameter to be reparameterized (see below) and optimized with GREG providing sensitivities, `rs' is the same a `r' except the integrator is to provide the sensitivities, and `s' specifies a parameter to be optimized with the integrator providing the sensitivities. This option allows more flexibility when specifying parameters to be optimized so that fewer compilations will be required. Without the -t option, no type information can be specified and the parameters must be ordered as described above.

The driver program also supports the reparameterization of the rate constants to speed up the convergence of the parameters. This reparameterization scales all the rate constants against the values of the rate constants at the mean inverse temperature. To take advantage of this rescaling, insert the number of rate constants, i. e., number of preexponential factors plus the number of activation energies divided by two, nkrescale, into the third line of the input file, otherwise put zero (0) there. The parameters to be rescaled must be the first 2 * nkrescale positions of the par array and entered in the order A1, E1, A2, E2, etc. If the -t option is used, these parameters need not be at the beginning of the list nor even adjacent to each other, but must still alternate between preexponential factors and activation energies. Note that the user provided model functions must take this rescaling into account. To aid in this, the function double ArefT(double lnkrefT, double E, double Tref) which calculates the value of A given the current parameters has been written and is included in the libparest.a. The Tref value is stored in par[npar_total]. The value npar_total is stored in ipar[0].

Finally, the driver also supports the optimization of log(A) rather than A (log means natural log, ln). If the -l flag is specified, the program automatically sends log(par[I]) if par[I] is greater than 1.0e5 for both constant and optimized parameters. Again, this must be taken into account by user functions by using exp(k[I]). The default lower cutoff for using the natural logarithm of the parameter may be overridden by giving an argument to the -l flag. The format of the -l argument is -l XXX, where XXX is any floating point format accepted by C, e. g., 1.0e5, 1.e+5, 100000, or 100000.0. You may not enter a negative number as a argument to the -l flag. To take the logarithm of all parameters, provide a small number greater than zero.


--------------------------------------------------------------------------------

integfile
This file must contain the integrator parameters, experimental conditions, and initial species concentrations. This information will be passed to the model and the integrator in an appropriate manner. The contents of the file are:

integrator {ddasslso or ddasac}
class_flag {use classes = 1, otherwise = 0}
nInputStructure {number of input structures}
nSpecies {number of species}
nSurfaceSpecies {number of surface species}
rtol {relative tolerance used in integrator}
atol {absolute tolerance used in integrator}
increment {tout - t, for integrator}
numStruct {begin input structure 1}
reactor_type {batch, PFR, or CSTR}
units {units of y0 array, pressure or concentration}
Ft0 {input total molar flow rate}
ptot {total pressure}
Sr {total number of sites}
V {reactor volume}
T {reactor temperature}
beta {temperature ramp}
ntout {number of output times}
tout[0]
tout[1]
...
tout[ntout - 1]
y0[0] {initial concentrations}
y0[1]
...
y0[nSpecies - 1]
numStruct {begin input structure 2}
...
(class[0]) {class for first species, if class_flag == 1}
(class[1])
...
(class[nSpecies - 1])
It is the user's responsibility to ensure that the units for all of the inputs are consistent. To this end, two const double must be declared in the model.c file, Renergy and Rgas. Renergy is used in all Arrhenius type expressions, so its units must be the same as the units for user supplied activation energies and heats of adsorption. Rgas is used in all ideal gas law calculations, therefore its units must be consistent with the units of the rate expressions supplied in ModelODEs() and the input y0, ptot, and V. At present, only pressure units are supported for CSTR and PFR reactor types.

If equally spaced output time intervals are desired, enter a value of ntout < 1. Then enter three values into tout[]. tout[0] contains the first output time (not starting time, which is always zero (0.0e0)), tout[1] contains the final output time, and tout[2] contains the tout increment to step from tout[0] to tout[1]. That is, it is structured like a for loop.

If the integration is sufficiently stiff, the desired output times may not be spaced closely enough to allow the integrator to converge. If this is the case, enter a nonzero value for increment. The program will then split up the time between output times into intervals no larger than increment. The integrator will be called on to solve between these smaller intervals, but no output will be printed and the dump2f routine will not be called. This feature is not recommended for PFRs. To suppress this feature, enter an increment of zero (0.0e0).

The class_flag controls whether species lumping is used and printed to the graphClass.out file. If this is desired, enter 1 for the flag and enter the class number for each species after all of the other input. The classes must be positive integers.


--------------------------------------------------------------------------------

Make Instructions
On a Digital UNIX machine, you can simply build the executable like this (gmake, rather than make, must be used):

% gmake -f /carol/software/parest/make.parest
In this case, the model must be contained in a file named model.c in the present working directory and the executable will be named parest. On an SGI, replace /carol with /files in all commands. An appropriate model.c file can be made using ode_gen:
% ode_gen model

To get an executable appropriate for full symbolic debugging, type:

% gmake -f /carol/software/parest/make.parest DEBUG=1
For an optimized (-O2) executable, type:
% gmake -f /carol/software/parest/make.parest OPT=1
To check malloc/free errors during execution, type:
% gmake -f /carol/software/parest/make.parest DEBUG=1 MDEBUG=1
or
% dmalloc -l logfile -i 100 low
% gmake -f /carol/software/parest/make.parest DEBUG=1 DMALLOC=1
Before running these executable, read the memdebug manual and dmalloc manual.

(Actually the '1' in the above commands can be replaced by any string.) If you define both DEBUG and OPT, the definition of DEBUG overrides OPT, and a debuggable executable is created.

 

Bugs
If you are certain you have found a bug in libparest.a, email David Dooling.


 

 

Northwestern Home | Plan-It Purple | Sites A-Z | Search | Last updated December 2, 2003
  Broabelt Research Group | 2145 Sheridan Road | Evanston, IL 60202 | 847.491.5351
World Wide Web Disclaimer and University Policy Statements   © 2003 Northwestern University