Friday, August 6, 2021

Data for "Interfuel Substitution: A Meta-Analysis"

I've long thought that there was an error in the way I calculated the shadow elasticity of substitution (SES) in my 2012 paper on interfuel substitution in the Journal of Economic Surveys. This would have been a big problem, as the paper carries out a meta-analysis of SESs, yet no primary paper reported its results in terms of the SES. I computed all of these values from the various ways results were presented in the original studies. I never got around to doing anything about it, or even checking carefully whether there was a mistake. I suppose this is because I hate finding mistakes in my papers, and as a result procrastination goes into superdrive.

Yesterday a student wrote to me and requested the data. I have now checked the derivation of the SES in my database and also computed it in an alternative way. There is in fact no mistake. This is great news!

I thought there was a mistake because of the confusing notation used for the Morishima Elasticity of Substitution (MES). Conventionally, the MES is written as MES_ij for the elasticity of substitution between inputs i and j when the price of i changes. By contrast, the cross-price elasticity is written eta_ij for the elasticity of demand for the quantity of input i with respect to the price of input j!*

I have now uploaded the database used for the meta-analysis to my data website. The following is a description of what is in the Excel spreadsheet:

Each line in the main "data" worksheet is for a specific sample/model in a specific paper. Each of these typically has multiple elasticity estimates.

Column A: Identification number for each paper.

Columns B to L: Characteristics of the authors, including their rank in the Coupe ranking that was popular at the time.

Column M: Year paper was published.

Columns N to V: Characteristics of the journals in which the papers were published. This includes in Column O the estimated impact factor in the year of publication. Others are impact factors in later years.

Column W: Number of citations the paper had received in the Web of Science at the time the database was compiled.

Column X: Number of citations the lead author has received in their career, excluding citations to this paper.

Columns Y to AO: Characteristics of the sample used for the estimates on that line. So looking at the first line in the table, as an example, we have:

Data from Canada for 1959-1973. Annual observations. This is a panel for different industries. N=2, so there are two industries but a single estimate for both. T is the length of the time series dimension. Sample size is N*T*(number of equations); for example, if there are 4 fuels, usually 3 equations are estimated. This could be different if the cost function itself is also estimated, but it looks like no papers did that. (There are also papers using time series for individual industries, etc., and cross-sections at one point in time.)
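As a quick illustration of the sample size arithmetic, here is a minimal Python sketch. The function name is made up, and T = 15 is only an assumed value for annual data covering 1959-1973; it is not read from the spreadsheet. The sketch assumes one share equation is dropped, which is the usual case described above.

```python
def sample_size(n_units: int, n_periods: int, n_fuels: int) -> int:
    """Return N * T * (number of estimated equations).

    Assumes a cost-share system in which one share equation is dropped,
    so the number of equations is the number of fuels minus one.
    """
    n_equations = n_fuels - 1
    return n_units * n_periods * n_equations


# Assumed inputs: N = 2 industries, T = 15 annual observations, 4 fuels.
print(sample_size(2, 15, 4))  # 2 * 15 * 3 = 90
```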

Column AH: Whether fixed effects estimation was used or not (only makes sense for panel data).

Column AC: The standard deviation of change in the real oil price in that period.

Column AD: PPP GDP per capita of the country from the Penn World Table. Probably the mean for the sample period.

Column AE: Population of the country in millions. Looks like the mean for the sample period.

Columns AP to AZ are the specification of the model:

Column AP: Not4 - if there weren't 4 fuels in the analysis.

Column AQ: Partial elasticity - this is holding the level of total energy use constant.

Column AR: Total elasticity - this allows the level of total energy use to change.

Columns AS and AT: If this is a dynamic model these are estimates of the short-run or the long-run elasticity.

Column AU: Whether the model is derived from a cost function or from something else.

Column AV: Functional form of the model.

Column AW: Form of the equations estimated, usually cost shares. "Log ratios" means the log of the ratio of cost shares.

Columns AX to AZ: How technical change is modeled. Many papers don't model any technical change explicitly. "Energy model" means there is biased technical change for energy inputs. "Aggregate model" means that if other inputs are also modeled they also have biased technical change. "Kalman" means that the Kalman filter was used to estimate stochastic technical change.

Columns BA to the end have the actual estimates. Different papers provide different information. All the various estimates are eventually converted into shadow elasticities of substitution.

Columns BA to BP: Own price and cross-price elasticities of demand. For example: Coal-Oil means the cross-price elasticity of demand for coal with respect to the price of oil.

Columns BQ to CF: Reported translog cost function parameters.

Columns CP to CS: Cost shares at the sample mean. These are used in various elasticity formulae. They were derived in a variety of ways from the information in papers. One of these methods is the quadratic solution in Columns CG to CO. It uses demand elasticities and translog parameters to reverse engineer the cost shares. Other estimates take the ratio of demand and Allen elasticities.
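To give a feel for how a quadratic can arise here, the sketch below starts from the standard translog own-price elasticity formula, eta_ii = (beta_ii + S_i^2 - S_i) / S_i, and solves it for the cost share S_i. This is only a guess at the kind of calculation involved; the function name and the numbers are hypothetical, and the actual formulas in Columns CG to CO may differ.

```python
import math


def shares_from_translog_own_price(eta_ii, beta_ii):
    """Candidate cost shares S solving S**2 - (1 + eta_ii) * S + beta_ii = 0.

    Assumes the standard translog own-price formula
    eta_ii = (beta_ii + S**2 - S) / S. Hypothetical helper, not a
    transcription of the spreadsheet.
    """
    b = 1.0 + eta_ii
    disc = b * b - 4.0 * beta_ii
    if disc < 0:
        return []  # no real solution for these inputs
    root = math.sqrt(disc)
    candidates = [(b - root) / 2.0, (b + root) / 2.0]
    # Keep only roots that are valid cost shares.
    return [s for s in candidates if 0.0 < s < 1.0]


# Made-up inputs: own-price elasticity -0.55 and translog parameter 0.05.
print(shares_from_translog_own_price(-0.55, 0.05))
```

Both roots can fall between 0 and 1, as they do for these made-up inputs, so some other information is needed to choose between them.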

Columns CT to DE: Morishima elasticities of substitution. These are asymmetric - so we have oil-coal and coal-oil. Here the terminology is very confusing. The standard terminology is that MES_ij is for a change in the price of i. So coal-oil is for a change in the price of coal. This is the reverse of what is used for cross-price elasticities! It is super-confusing.
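The index reversal is easier to see in code. The sketch below assumes the standard relationship MES_ij = eta_ji - eta_ii, where eta_ij is the elasticity of demand for i with respect to the price of j and MES_ij is for a change in the price of i. The function name and the elasticity values are made up for illustration and are not taken from the database.

```python
def morishima(eta):
    """Compute Morishima elasticities from demand elasticities.

    eta[i][j]: elasticity of demand for input i w.r.t. the price of input j.
    Returns mes, where mes[i][j] is the MES between i and j when the
    price of i changes: mes[i][j] = eta[j][i] - eta[i][i].
    """
    mes = {}
    for i in eta:
        mes[i] = {}
        for j in eta:
            if i != j:
                mes[i][j] = eta[j][i] - eta[i][i]
    return mes


# Made-up elasticities; each row sums to zero (homogeneity in prices).
eta = {
    "coal": {"coal": -0.75, "oil": 0.5, "gas": 0.25},
    "oil": {"coal": 0.125, "oil": -0.5, "gas": 0.375},
    "gas": {"coal": 0.25, "oil": 0.25, "gas": -0.5},
}
mes = morishima(eta)
print(mes["coal"]["oil"])  # price of coal changes: 0.125 - (-0.75) = 0.875
print(mes["oil"]["coal"])  # price of oil changes: 0.5 - (-0.5) = 1.0
```

Note that the coal-oil MES uses the oil-coal cross-price elasticity, which is exactly the reversal described above.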

Columns DF to DK have the shadow elasticities I actually used in the meta-analysis.
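For readers who want to see how a shadow elasticity relates to the Morishima elasticities, the sketch below uses the textbook result that the SES is a cost-share-weighted average of the two MESs. With MES_ij defined for a change in the price of i, the weight on MES_ij is the cost share of j. This is an illustration of that relationship with made-up numbers and a hypothetical function name, not a copy of the spreadsheet formulas.

```python
def shadow_elasticity(mes_ij, mes_ji, share_i, share_j):
    """Shadow elasticity of substitution between inputs i and j.

    mes_ij: MES when the price of i changes.
    mes_ji: MES when the price of j changes.
    share_i, share_j: cost shares of inputs i and j.
    """
    return (share_j * mes_ij + share_i * mes_ji) / (share_i + share_j)


# Made-up values: MES_coal-oil = 0.875, MES_oil-coal = 1.0,
# coal cost share 0.25, oil cost share 0.5.
ses = shadow_elasticity(0.875, 1.0, 0.25, 0.5)
print(round(ses, 4))
```

Unlike the MES, the result is symmetric: swapping the roles of i and j gives the same number.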

Columns DL to EA have the Allen elasticities of substitution. Some of these are reported in the papers and some I computed from the cross-price elasticities.
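The conversion between cross-price and Allen elasticities rests on the standard identity AES_ij = eta_ij / S_j, where S_j is the cost share of input j. A minimal sketch, with hypothetical function names and made-up numbers:

```python
def allen_from_cross_price(eta_ij, share_j):
    """Allen elasticity from the cross-price elasticity of demand for i
    w.r.t. the price of j, and the cost share of j."""
    return eta_ij / share_j


def share_from_ratio(eta_ij, aes_ij):
    """Cost share of j as the ratio of the demand and Allen elasticities."""
    return eta_ij / aes_ij


# Made-up values: coal-oil cross-price elasticity 0.5, oil cost share 0.25.
aes = allen_from_cross_price(0.5, 0.25)
print(aes)                         # 2.0
print(share_from_ratio(0.5, aes))  # 0.25
```

The second function is the same identity run in reverse, which is the ratio of demand and Allen elasticities mentioned above for recovering cost shares.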

* You can learn more about all these elasticities in my 2011 Journal of Productivity Analysis paper on the topic.