Data Sources¶
The Penn World Tables¶
The PWT’s main purpose is to construct panel data for GDP and its components in constant international prices.
Background¶
National Accounts report GDP and its components in local currency units (LCU).
We could use exchange rates to convert these figures into dollars.
The data would then imply that people in low income countries are shockingly poor, not just very poor.
This would be misleading because prices are systematically lower in low income countries.
To compare living standards across countries (and over time), it is necessary to deflate GDP with the local price of a consistent bundle of goods.
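A small worked example makes the difference concrete. The numbers below are made up for illustration (they are not PWT or ICP figures), and the sketch is written in Python rather than Matlab purely for brevity.

```python
# Hypothetical numbers for one low income country (not real data).
gdp_lcu_per_capita = 60_000.0  # GDP per capita in local currency units (LCU)
exchange_rate = 60.0           # LCU per US dollar at the market exchange rate
ppp_rate = 20.0                # LCU cost of a bundle that costs 1 dollar in the US

# Conversion at the market exchange rate.
gdp_usd_exchange = gdp_lcu_per_capita / exchange_rate

# Conversion at the local price of the common bundle.
gdp_usd_ppp = gdp_lcu_per_capita / ppp_rate

# Local price level relative to the US; below 1 means goods are cheaper locally.
price_level = ppp_rate / exchange_rate

print(gdp_usd_exchange)  # 1000.0
print(gdp_usd_ppp)       # 3000.0
```

With these assumed numbers, the exchange rate conversion understates income by a factor of three because the local price level is one third of the US level.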
How Is This Done?¶
The International Comparison Program (ICP) collects data on local prices every few years (benchmark years).
Between benchmark years, prices are interpolated.
Even though there is no theoretically “best” price index, there is a substantial theory of how to construct price indices.
The PWT does this and reports GDP in international prices.
GDP in international prices comes in two flavors:
- RGDPE: to compare living standards (think: deflated by the consumer price index)
- RGDPO: to compare productive capacity (think: deflated by a producer price index)
Obtaining the data¶
The data can be downloaded here.
I recommend downloading the Stata file and converting it into a Matlab dataset using Stat/Transfer.
Format of the Stata file:
- each row is a country / year combination
- countries are identified by their three-letter ISO codes
- each column is a variable (except for the first 4 columns which are country / year info)
Since Mathworks plans to phase out datasets, it would make sense to then convert the whole thing into a Matlab table using dataset2table.
It would now make sense to make each variable into an array indexed by [country, year].
- Fortunately, someone has already done this, so you can download the result from the PWT web site.
- Unfortunately, that file contains errors (as of 2014-Dec). So you cannot use it!
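The reshaping step described above (from one row per country / year to an array indexed by [country, year]) can be sketched as follows. This is a minimal illustration in Python, not the course's Matlab code; the column names `countrycode`, `year` and `rgdpo` and all values are assumptions made for the example.

```python
import math

def long_to_country_year(rows, var_name):
    """Reshape long-format rows into a matrix indexed by [country, year].

    rows: list of dicts, one per country / year combination.
    Returns (countries, years, matrix); cells without data are NaN.
    """
    countries = sorted({r["countrycode"] for r in rows})
    years = sorted({r["year"] for r in rows})
    c_idx = {c: i for i, c in enumerate(countries)}
    y_idx = {y: j for j, y in enumerate(years)}
    matrix = [[math.nan] * len(years) for _ in countries]
    for r in rows:
        value = r.get(var_name)
        if value is not None:
            matrix[c_idx[r["countrycode"]]][y_idx[r["year"]]] = float(value)
    return countries, years, matrix

# Hypothetical rows; the column names and values are made up.
rows = [
    {"countrycode": "USA", "year": 2000, "rgdpo": 10.0},
    {"countrycode": "USA", "year": 2001, "rgdpo": 11.0},
    {"countrycode": "IND", "year": 2001, "rgdpo": 3.0},
]
countries, years, rgdpo = long_to_country_year(rows, "rgdpo")
```

Filling missing country / year cells with NaN keeps the array rectangular even when the panel is unbalanced.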
PWT: Matlab Code¶
Directory structure:¶
- baseDir
  - outDir: figures and tables
  - matDir: generated mat files
  - dataDir: original data files
  - progDir: program files
Programs¶
The code is entirely general purpose (not specific to the course).
- go_pwt8: startup; add dir to path
- const_pwt8: set constants
- run_all_pwt8: runs everything in sequence
- import_pwt8: imports Stata file into Matlab
- var_load_yc_pwt8: loads one variable by [country, year]
- country_list_pwt8: makes list of countries and years
Basic Steps¶
- Download Stata file.
- Make Stata file into a Matlab dataset using Stat/Transfer.
- import_pwt8: break the Stata file into individual variables and save them as Matlab matrices, indexed by [year, country]
- var_load_yc_pwt8: loads one variable for a given set of years and countries
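The last step, loading one variable for a given set of years and countries, might look roughly like the sketch below. It is written in Python as an illustration and is not the actual var_load_yc_pwt8; the function name, its arguments and the sample values are all hypothetical.

```python
import math

def var_load_yc(matrix, all_years, all_countries, years, countries):
    """Return the sub-matrix for the requested years (rows) and countries (columns).

    matrix is indexed by [year, country]. Requested years or countries that
    are not in the data yield NaN instead of raising an error.
    """
    y_pos = {y: i for i, y in enumerate(all_years)}
    c_pos = {c: j for j, c in enumerate(all_countries)}
    out = []
    for y in years:
        row = []
        for c in countries:
            if y in y_pos and c in c_pos:
                row.append(matrix[y_pos[y]][c_pos[c]])
            else:
                row.append(math.nan)
        out.append(row)
    return out

# Hypothetical stored variable: 3 years by 2 countries (values made up).
all_years = [1999, 2000, 2001]
all_countries = ["IND", "USA"]
stored = [[1.0, 9.0], [2.0, 10.0], [3.0, 11.0]]

subset = var_load_yc(stored, all_years, all_countries, [2000, 2005], ["USA", "IND"])
```

Returning NaN for years or countries that are not in the data is one possible design choice: the output then always has the requested shape, so variables from different sources can be lined up without further checks.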
Exercises¶
- Write code that imports the Stata file (import_pwt8).
- Write var_load_yc_pwt8.
- Plot the density of real output per worker in 2000 (rgdp_density_growth821).
- Plot the price of consumption against real GDP per worker for the year 2000. What do you find?
World Development Indicators¶
A collection of cross-country data on
- GDP and its components
- employment
- demographics
- education
- government finances
- and much more
Getting the Data¶
One typically downloads one variable at a time from here.
The best format is probably xls.
One gets a file with years as columns.
Getting the Data Into Matlab¶
Replace the column headers for the years with something that makes a valid variable name (e.g. x1970).
Run the file through Stat/Transfer to obtain a Matlab dataset.
Load the file into Matlab. Convert the data portion of the dataset into a matrix indexed by [year, country].
Countries are identified by World Bank WITS codes (essentially the same as ISO codes).
Note: the code now reads the xls file directly.
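The renaming and reshaping steps above can be sketched as follows. This is an illustrative Python version that reads a small comma-separated extract instead of an xls file; the "Country Code" header and all values are assumptions made for the example.

```python
import csv
import io

# Hypothetical extract in the wide layout: one row per country, one column per year.
raw = """Country Code,1970,1971,1972
IND,1.0,,1.2
USA,5.0,5.1,5.2
"""

reader = csv.reader(io.StringIO(raw))
header = next(reader)
rows = list(reader)

# Year headers such as "1970" are not valid variable names; prefix them with "x".
year_cols = header[1:]
valid_names = ["x" + y for y in year_cols]

countries = [r[0] for r in rows]
years = [int(y) for y in year_cols]

def to_float(cell):
    """Empty cells mark missing observations and become NaN."""
    return float(cell) if cell.strip() else float("nan")

# Transpose so that the result is indexed by [year, country].
by_year_country = [[to_float(r[1 + j]) for r in rows] for j in range(len(years))]
```

Here `by_year_country[0]` holds the 1970 values for all countries, in the order given by `countries`.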
Matlab Code¶
Directories¶
- prog: programs
- excel: raw xls files and generated mat files (one for each variable)
- outDir: tables and figures
Programs¶
- const_wdi2013: set constants
- var_load_yc_wdi2013: load one variable by [country, year]
- country_list_wdi2013: make a list of all countries in the data (WITS codes)
Exercises¶
- Construct GDP per worker (i.e. per employed person) in constant international dollars.
- Save it as a Matlab matrix by [country, year].
- How does it compare with PWT data?
- Plot the fraction of self-employed workers against GDP per capita in constant international dollars for the year 2000. What do you find?
- Download a variable from the WDI web site and import it into Matlab.
Barro-Lee Schooling Data¶
The standard dataset for working with cross-country schooling data.
Covers educational attainment by [year, country, age group, sex].
How is this constructed?
- mainly from household surveys
- since there are not many surveys for most countries, there is a lot of interpolation
Getting the Data¶
Download Stata files from the Barro-Lee web site.
Each row is a country / year.
Each column is a variable.
Each file is one sex.
Importing Into Matlab¶
Make the Stata file into a Matlab dataset using Stat/Transfer.
Write a function var_load_yc_bl2013 that extracts one variable for a given set of years and countries.