I've long thought that there was an error in the way I calculated the shadow elasticity of substitution (SES) in my 2012 paper on interfuel substitution in the *Journal of Economic Surveys*. This would have been a big problem, as the paper carries out a meta-analysis of SESs. But no primary paper reported its results in terms of the SES, so I computed all of them from the various ways results were presented in the original studies. I never got around to doing anything about it or even checking carefully whether there was a mistake. I suppose this is because I hate finding mistakes in my papers and as a result procrastination goes into superdrive.

Yesterday a student wrote to me and requested the data. I have now checked the derivation of the SES in my database and also computed it in an alternative way. There is in fact no mistake. This is great news!

I thought that there was a mistake because of the confusing notation used for the Morishima Elasticity of Substitution (MES). Conventionally, the MES is written as MES_ij for the elasticity of substitution between inputs i and j when the price of i changes. By contrast, the cross-price elasticity is written eta_ij for the elasticity of demand for the quantity of input i with respect to the price of input j!*
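In symbols, the two conventions can be written as follows. This is only a sketch of the standard definitions: the symbols x_i (quantity of input i) and p_i (price of input i) are my own notation here, not taken from the spreadsheet.

```latex
\[
\eta_{ij} = \frac{\partial \ln x_i}{\partial \ln p_j},
\qquad
\mathrm{MES}_{ij} = \frac{\partial \ln (x_j / x_i)}{\partial \ln p_i}
                  = \eta_{ji} - \eta_{ii}.
\]
```

So the second subscript of eta is the price that changes, while under this convention the first subscript of MES is the price that changes.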

I have now uploaded the database used for the meta-analysis to my data website. The following is a description of what is in the Excel spreadsheet:

Each line in the main "data" worksheet is for a specific sample/model in
a specific paper. Each of these typically has multiple elasticity estimates.

Column A: Identification number for each paper.

Columns B to L: Characteristics of the authors, including their rank
in the Coupe ranking that was popular at the time.

Column M: Year paper was published.

Columns N to V: Characteristics of the journals in which the papers
were published. This includes in Column O the estimated impact factor in the
year of publication. Others are impact factors in later years.

Column W: Number of citations the paper had received in the Web of
Science at the time the database was compiled.

Column X: Number of citations the lead author has had in their career
apart from for this paper.

Columns Y to AO: Characteristics of the sample used for
the estimates on that line. So looking at the first line in the table, as an example, we have:

Data from Canada for 1959-1973. Annual observations. This is a panel for
different industries. N=2, so there are two industries but a single
estimate for both. T is the length of the time series dimension. Sample
size is N*T*Number of equations, i.e., if there are 4 fuels, usually 3
equations are estimated. This could be different if the cost function
itself is also estimated, but it looks like no papers did that. (There are also papers using time series for
individual industries etc. and cross-sections at one point in time.)
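The sample size rule can be sketched in a few lines of Python. The function name is hypothetical, the number of fuels for the first line is assumed to be 4 purely for illustration, and T is taken as the 15 annual observations implied by 1959-1973, which may not match the value recorded in the spreadsheet.

```python
def sample_size(n_units: int, n_periods: int, n_fuels: int) -> int:
    """Return N * T * number of equations.

    Assumes a system of cost share equations with one equation dropped,
    so that n_fuels fuels give n_fuels - 1 estimated equations.
    """
    n_equations = n_fuels - 1
    return n_units * n_periods * n_equations


# Illustrative values only: N = 2 industries, annual data for 1959-1973,
# and an assumed 4 fuels (so 3 equations).
t = 1973 - 1959 + 1
print(sample_size(n_units=2, n_periods=t, n_fuels=4))
```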

Column AH: Whether fixed effects estimation was used or not (only
makes sense for panel data).

Column AC: The standard deviation of change in the real oil price in that
period.

Column AD: PPP GDP per capita of the country from the Penn World Table. Probably the mean for the sample period.

Column AE: Population of the country in millions. Looks like the
mean for the sample period.

Columns AP to AZ are the specification of the model:

Column AP: Not4 - if there weren't 4 fuels in the analysis.

Column AQ: Partial elasticity - this is holding the level of total
energy use constant.

Column AR: Total elasticity - this allows the level of total energy use
to change.

Columns AS and AT: If this is a dynamic model these are estimates of the
short-run or the long-run elasticity.

Column AU: Whether the model is derived from a cost function or something else.

Column AV: Functional form of the model.

Column AW: Form of the equations estimated, usually cost shares. "Log ratios" means the log of the ratio of cost shares.

Columns AX to AZ: How technical change is modeled. Many papers don't
model any technical change explicitly. Energy model means there is
biased technical change for energy inputs. Aggregate model means that if
other inputs are also modeled they also have biased technical change.
Kalman means that the Kalman filter was used to estimate stochastic
technical change.

Columns BA to the end have the actual estimates. Different papers
provide different information. All the various estimates are eventually
converted into Shadow Elasticities of Substitution.

Columns BA to BP: Own price and cross-price elasticities of
demand. For example: Coal-Oil means the cross-price
elasticity of demand for coal with respect to the price of oil.

Columns BQ to CF: Reported translog cost function parameters.

Columns CP to CS: Cost shares at the sample mean. These
are used in various elasticity formulae. They were derived in a variety
of ways from the information in papers. One of these methods is the
quadratic solution in Columns CG to CO. It uses demand elasticities and
translog parameters to reverse engineer the cost shares. Other estimates take the
ratio of demand and Allen elasticities.
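As a rough illustration of these two routes to the cost shares, here is a minimal Python sketch. It relies on two textbook relationships: the cross-price elasticity equals the cost share times the Allen elasticity (eta_ij = S_j * sigma_ij), and the translog own-price elasticity is eta_ii = beta_ii / S_i + S_i - 1, which rearranges to a quadratic in S_i. The function names are hypothetical and the exact formulas in Columns CG to CO may differ from this.

```python
import math
from typing import List


def share_from_allen(eta_ij: float, sigma_ij: float) -> float:
    """Cost share S_j from eta_ij = S_j * sigma_ij."""
    return eta_ij / sigma_ij


def shares_from_translog(eta_ii: float, beta_ii: float) -> List[float]:
    """Candidate cost shares S_i solving S^2 - (1 + eta_ii) * S + beta_ii = 0.

    Only real roots strictly between 0 and 1 are returned. There can be
    two such roots, so extra information may be needed to choose one.
    """
    b = -(1.0 + eta_ii)
    disc = b * b - 4.0 * beta_ii
    if disc < 0:
        return []
    roots = [(-b + sign * math.sqrt(disc)) / 2.0 for sign in (1.0, -1.0)]
    return [r for r in roots if 0.0 < r < 1.0]


# Round trip with made-up numbers: start from a share and a translog
# parameter, compute the implied own-price elasticity, then recover the share.
true_share, beta = 0.3, 0.1
eta_own = beta / true_share + true_share - 1.0
print(shares_from_translog(eta_own, beta))
print(share_from_allen(eta_ij=0.15, sigma_ij=0.5))
```

In the round trip above the quadratic has two admissible roots, which is why the sketch returns a list rather than a single value.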

Columns CT to DE: Morishima elasticities of substitution. These are asymmetric -
so we have oil-coal and coal-oil. Here the terminology is very
confusing. The standard terminology is that MES_ij is for a change in
the price of i. So coal-oil is for a change in the price of coal. This
is the reverse of what is used for cross-price elasticities! It is
super-confusing.
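To make the labeling concrete, here is a small Python sketch that builds Morishima elasticities from cross-price elasticities using the textbook relationship MES_ij = eta_ji - eta_ii. The elasticity values are made up for illustration and the function name is hypothetical; nothing here is taken from the spreadsheet.

```python
# Cross-price elasticity labels: (quantity, price).
# ("Coal", "Oil") is the elasticity of demand for coal w.r.t. the price of oil.
eta = {
    ("Coal", "Coal"): -0.5,
    ("Coal", "Oil"): 0.2,
    ("Oil", "Coal"): 0.1,
    ("Oil", "Oil"): -0.3,
}


def morishima(eta: dict, i: str, j: str) -> float:
    """MES_ij for a change in the price of i: eta_ji - eta_ii."""
    return eta[(j, i)] - eta[(i, i)]


# MES labels: the FIRST fuel named is the one whose price changes.
mes_coal_oil = morishima(eta, "Coal", "Oil")  # price of coal changes
mes_oil_coal = morishima(eta, "Oil", "Coal")  # price of oil changes
print(round(mes_coal_oil, 3), round(mes_oil_coal, 3))
```

Note that the coal-oil MES uses the oil-coal cross-price elasticity, which is exactly where the two labeling conventions cross over.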

Columns DF to DK have the shadow elasticities I actually used in the
meta-analysis.
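One way to cross-check a shadow elasticity is to compute it by two routes and compare. The Python sketch below uses the textbook definition of the SES as a cost-share-weighted average of the two Morishima elasticities, and separately the equivalent expression in Allen elasticities. The function names are hypothetical and this is not claimed to be the exact calculation in the spreadsheet. A CES technology makes a convenient check, because every SES should then equal the common elasticity of substitution.

```python
from itertools import combinations


def ces_demand_elasticities(shares, sigma):
    """eta[i][j] for a CES technology with elasticity of substitution sigma.

    eta_ij = S_j * sigma for i != j, and eta_ii = -(1 - S_i) * sigma.
    """
    n = len(shares)
    return [
        [shares[j] * sigma if i != j else -(1.0 - shares[i]) * sigma
         for j in range(n)]
        for i in range(n)
    ]


def shadow_elasticity(eta, shares, i, j):
    """SES_ij = [S_j (eta_ji - eta_ii) + S_i (eta_ij - eta_jj)] / (S_i + S_j)."""
    mes_price_i = eta[j][i] - eta[i][i]  # Morishima, price of i changes
    mes_price_j = eta[i][j] - eta[j][j]  # Morishima, price of j changes
    return (shares[j] * mes_price_i + shares[i] * mes_price_j) / (
        shares[i] + shares[j]
    )


def shadow_from_allen(eta, shares, i, j):
    """Alternative route via Allen elasticities sigma_ab = eta_ab / S_b."""
    def allen(a, b):
        return eta[a][b] / shares[b]
    numerator = 2.0 * allen(i, j) - allen(i, i) - allen(j, j)
    return numerator / (1.0 / shares[i] + 1.0 / shares[j])


# Made-up cost shares for four fuels and an assumed CES elasticity of 0.7.
shares = [0.1, 0.2, 0.3, 0.4]
eta = ces_demand_elasticities(shares, sigma=0.7)
for i, j in combinations(range(len(shares)), 2):
    print(i, j, round(shadow_elasticity(eta, shares, i, j), 6),
          round(shadow_from_allen(eta, shares, i, j), 6))
```

Unlike the Morishima elasticities, the SES is symmetric, so only one value per pair of fuels is needed.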

Columns DL to EA have the Allen elasticities of substitution. Some of these are reported in the papers and some I computed from the cross-price elasticities.

* You can learn more about all these elasticities in my 2011 *Journal of Productivity Analysis* paper on the topic.