| Title: | Maddison Project Data |
|---|---|
| Description: | Provides relatively easy access to the 2023 version of the Maddison project data, downloaded 2025-08-28. This project collates all the credible data on population and GDP for 169 countries, with some dating back to year 1 of the current era. One function makes it easy to find the leaders for each year, allowing users to exclude countries with narrow economies, such as OPEC members, to focus on technology leaders. Another function makes it easy to plot data for only selected countries or years. A third function makes it relatively easy to obtain references to the original sources, which must be cited per the copyright rules of the Maddison Project for different uses of their data. |
| Authors: | Spencer Graves [aut, cre] (ORCID: <https://orcid.org/0009-0005-5387-729X>) |
| Maintainer: | Spencer Graves <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.0 |
| Built: | 2026-05-20 06:11:38 UTC |
| Source: | https://github.com/sbgraves237/maddisondata |
The Maddison project collates historical economic statistics from many sources.
They have a citation policy: CONDITIONS UNDER WHICH ALL ORIGINAL PAPERS MUST BE CITED:
a) If the data is shown in any graphical form
b) If subsets of the full dataset that include less than a dozen (12) countries are used for statistical analysis or any other purposes
When neither a) or b) apply, then the MDP as a whole can be cited.
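The two conditions can be read as a simple decision rule. The following is a minimal sketch in base R; `mustCiteAllSources` is a hypothetical helper written only for illustration and is not a function of this package.

```r
# Hypothetical helper (not part of MaddisonData): encodes conditions a) and b).
mustCiteAllSources <- function(nCountries, plot = TRUE) {
  # a) any graphical display, or b) fewer than 12 countries used
  isTRUE(plot) || nCountries < 12
}

mustCiteAllSources(2, plot = FALSE)    # TRUE: fewer than 12 countries
mustCiteAllSources(169, plot = TRUE)   # TRUE: data shown graphically
mustCiteAllSources(12, plot = FALSE)   # FALSE: the MDP as a whole can be cited
```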
getMaddisonSources returns a data.frame of relevant sources for a
particular application.
    getMaddisonSources(
      ISO = NULL,
      plot = TRUE,
      sources = MaddisonData::MaddisonSources,
      years = MaddisonData::MaddisonYears
    )
ISO |
either NULL to return all sources or a character vector of ISO
codes for the countries included in the analysis or a |
plot |
logical indicating whether the use does or does not include
plotting data. The Maddison project requires citing all relevant
|
sources |
list of sources in the format of |
years |
|
a data.frame with 3 columns:
3-letter ISO code for country.
character vector of years or year ranges for which source applies.
character vector of sources.
in the format of MaddisonSources.
    getMaddisonSources() # all
    getMaddisonSources(plot=FALSE) # only MDP
    GBR <- getMaddisonSources('GBR') # GBR
    getMaddisonSources(names(MaddisonSources)[1:12], FALSE) # only MDP
    getMaddisonSources(data.frame(ISO=c('GBR', 'USA'),
        yearBegin=rep(1500, 2)) ) #GBR, USA since 1500
    getMaddisonSources('AUS') # AUS: no special sources for AUS.
ggplot paths
ggplotPath plots y vs. x (typically year) with a separate line for
each group with options for legend placement, horizontal and vertical lines
and labels.
    ggplotPath(
      x = "year", y, group, data, scaley = 1, logy = TRUE, ylab,
      legend.position, hlines, vlines, labels, fontsize = 10,
      color, linetype
    )
x |
name of a numeric column in |
y |
name of column in |
group |
name of grouping variable, i.e., plot a separate line for each
level of |
data |
|
scaley |
number to divide y by for plotting. Default = 1, but for data
in monetary terms, e.g., for |
logy |
logical: if |
ylab |
y axis label. Default =
|
legend.position |
argument passed to |
hlines |
numeric vector of locations on the |
vlines |
numeric vector of locations on the |
labels |
= |
fontsize |
for legend and axes labels in theme(text=element_text(size=fontsize)); default = 10. |
color |
for lines to pass to |
linetype |
optional vector. Default
|
an object of class ggplot2::ggplot, which can be subsequently
edited, and whose print method produces the desired plot.
    str(GBR_USA <- subset(MaddisonData::MaddisonData, ISO %in% c('GBR', 'USA')))
    GBR_USA1 <- ggplotPath('year', 'gdppc', 'ISO', GBR_USA, 1000)
    GBR_USA1a <- ggplotPath('year', 'gdppc', 'ISO', GBR_USA, 1000,
        color=c('red', 'blue'))
    GBR_USA1+ggplot2::coord_cartesian(xlim=c(1500, 1850)) # for only 1500-1850
    GBR_USA1+ggplot2::coord_cartesian(xlim=c(1600, 1700), ylim=c(.9, 3))
    # label the lines
    ISOll <- data.frame(x=c(1500, 1800), y=c(2.5, 1.7), label=c('GBR', 'USA'),
        srt=c(0, 30), col=c('red', 'green'), size=c(2, 9))
    GBR_USA2 <- ggplotPath('year', 'gdppc', 'ISO', GBR_USA, 1000,
        labels=ISOll, fontsize = 20)
    # h, vlines, manual legend only
    Hlines <- c(1,3, 10, 30)
    Vlines = c(1849, 1929, 1933, 1939, 1945)
    (GBR_USA3 <- ggplotPath('year', 'gdppc', 'ISO', GBR_USA, 1000,
        ylab='GDP per capita (2011 PPP K$)', legend.position = NULL,
        hlines=Hlines, vlines=Vlines, labels=ISOll))
    # do.call(ggplotPath, ...) with 1 line
    list1 <- list(x='Time', y='lvl', data=data.frame(Time=1:4, lvl=sqrt(1:4)))
    doCallPlot <- do.call(ggplotPath, list1)
ggplotPath in multiple panels with no space between panels and a shared
horizontal axis on the bottom.
ggplotPath2 accepts a variety of inputs. The most general is with object
= a list of sublists to be fed individually to do.call(ggplotPath,..) and
then assembled into the desired plot after eliminating the space between the
individual plots. Optional \dots arguments give default values for the
individual calls to ggplotPath.
    ggplotPath2(object, ...)
    ## S3 method for class 'list'
    ggplotPath2(object, ...)
    ## S3 method for class 'mts'
    ggplotPath2(object, Time, ...)
    ## S3 method for class 'KFS'
    ggplotPath2(object, ...)
    ## Default S3 method:
    ggplotPath2(
      object, Time, object2, scaley, logy, ylab, hlines,
      vlines = numeric(0), labels, fontsize = 10, color, linetype, ...
    )
object |
with an associated matrix to be plotted. The following classes of objects are supported:
|
... |
optional arguments. |
Time |
optional numeric vector with length being either the number of
rows of the matrix obtained from
|
object2 |
an optional vector or matrix-type object with the same
number of rows as |
scaley |
= optional numeric vector of the length |
logy |
optional character vector of length
The default is 'exp_log' for the first panel and '' for the rest. |
ylab |
optional character vector of length |
hlines |
optional list of at most |
vlines |
optional numeric vector of locations on the |
labels |
= optional |
fontsize |
optional number for legend and axes labels, used in ggplotPath(..., fontsize=fontsize); default = 10. |
color |
optional vector or list of length ' |
linetype |
optional vector or list of length |
Alternatively, the first object argument can be a matrix, data.frame,
tibble, or multivariate time series (of class mts). An optional Time
argument must be a numeric vector with length equal to the number of rows of
object. If Time is provided, it overrides any values for the horizontal
axis that could be inferred from rownames or time of object. A
second optional object2 can be either a vector whose length matches the
number of rows of object or a matrix or ts object similarly matching the
number of rows of object.
an object of class ggplot2::ggplot, which can be subsequently
edited, and whose print method produces the desired plot.
    # matrix examples
    Mat <- cbind(lvl=1:5, vel=rep(1:2, length=5), acc=sin(1:5))
    Mat1 <- Mat
    rownames(Mat1) <- 1951:1955
    Mat2 <- as.data.frame(cbind(Mat, year=1951:1955))
    # mts example
    MTS <- ts(Mat, 1951)
    # Do
    Matp <- ggplotPath2(Mat)
    # with object2 = vector
    Matp1 <- ggplotPath2(Mat, object2=sqrt(1:5))
    # with object2= = 2 column matrix for first 2 panels.
    Matp2 <- ggplotPath2(Mat, object2=2*Mat[, 2:1])
    Mat1p <- ggplotPath2(Mat1)
    Mat2p <- ggplotPath2(Mat2[, 1:3], Time=Mat2[, 'year'])
    MTSp <- ggplotPath2(MTS)
    MTSep <- ggplotPath2(MTS, logy=c('exp_log', 'log', ''))
    # list example
    List2 <- list( level=list('year', 'lvl'), slope=list('year', 'vel'),
        accel=list('year', 'acc'))
    Mat2l <- ggplotPath2(List2, data=Mat2)
    # State space / Kalman filtering model for GBR
    GBR <- subset(MaddisonData, (ISO=='GBR') & !is.na(gdppc))
    # model example
    growthFormula <- (log(gdppc)~ -1 +
        SSMbespoke(growthModel(.04, GBR$gdppc) ))
    library(KFAS)
    GBR2m <-SSModel(growthFormula, GBR, H=matrix(NA) )
    # NOTE: This call ignores Time
    GBRgrowthFit1 <- fitSSM(GBR2m, inits=-6, method = "BFGS",
        updatefn = growthUpdateFn)
    # NOTE: This call currently also ignores Time; MUST BE FIXED
    GBRgrowthFit1t <- fitSSM(GBR2m, inits=-6, method = "BFGS",
        updatefn = growthUpdateFn, Time=GBR$year)
    GBRfitp <- ggplotPath2(GBRgrowthFit1t)
    #KFS example
    GBR_KFS <- KFAS::KFS(GBRgrowthFit1$model)
    GBR_KFSt <- KFAS::KFS(GBRgrowthFit1t$model)
    GBR_KFSp0 <- ggplotPath2(GBR_KFS)
    GBR_KFSp <- ggplotPath2(GBR_KFS$a)
    GBR_KFStp <- ggplotPath2(GBR_KFSt$a)
    # label the lines
    ISOll1 <- data.frame(x=c(1500, 1800), y=c(2.5, 1.7),
        label=c('GBR', 'Napoleon'), srt=c(0, 30), col=c('red', 'green'),
        size=c(2, 9), component=1)
    GBR_KFSp1 <- ggplotPath2(GBR_KFS$a, labels=ISOll1)
    GBR_KFSp <- ggplotPath2(GBR_KFS, labels=ISOll1)
    ISOll2 <- ISOll1
    ISOll2$component <- 1:2
    GBR_KFSp2 <- ggplotPath2(GBR_KFS, labels=ISOll2)
    # hlines, vlines
    zero <- 0
    attr(zero, 'color') <- 'red'
    attr(zero, 'lty') <- 'dashed'
    Hlines1 <- list(c(1,3, 10, 30), zero)
    Vlines <- c(1649, 1929, 1933, 1945)
    GBR_KFSp3 <- ggplotPath2(GBR_KFS, labels=ISOll2, hlines=Hlines1,
        vlines=Vlines)
KFAS
growthModel returns a list returned by KFAS::SSMcustom() for a model
with potentially irregularly spaced univariate observations with a
2-dimensional (level, growthRate) state.
    growthModel(
      sigma, y, a1, Time,
      stateNames = c("level", "growthRate"),
      Log = TRUE, ...
    )
sigma |
a numeric vector, forced to length 2 by replacing it by
|
y |
= optional numeric vector or |
a1 |
= optional numeric vector of length 2 to pass to
|
Time |
= optional integer vector of times at which non-missing
observations are available. Default = |
stateNames |
= |
Log |
default = TRUE. |
... |
optional arguments passed to |
a list returned by KFAS::SSMcustom() with an additional Time
component.
    GBR <- subset(MaddisonData, (ISO=='GBR') & !is.na(gdppc))
    growthMdl1 <- growthModel(.1, GBR$gdppc, Time=GBR$year)
    GBRgdppc1 <- with(GBR[-1, ], ts(gdppc, year[1]))
    growthMdl2 <- growthModel(c(.1, .2), GBRgdppc1)
    growthMdl0 <- growthModel(.1, GBR$gdppc, a1=c(10, 1), Log=FALSE,
        stateNames=c('lvl', 'vel'))
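For readers unfamiliar with a (level, growthRate) state observed at irregular times, the sketch below builds the textbook local linear trend transition matrix for unequal time steps. It is only an assumption that growthModel uses this form; `trendTransition` is a hypothetical base-R helper and is not a function from this package or from KFAS.

```r
# Hypothetical helper: transition matrix for the state (level, growthRate)
# over a gap of dt time units, ASSUMING
#   level[t + dt]      = level[t] + dt * growthRate[t]
#   growthRate[t + dt] = growthRate[t]
trendTransition <- function(dt) {
  nms <- c("level", "growthRate")
  matrix(c(1, 0, dt, 1), nrow = 2, dimnames = list(nms, nms))
}

Time <- c(1, 1000, 1500, 1600, 1601)   # irregularly spaced observation times
Tmats <- lapply(diff(Time), trendTransition)

# Propagate a made-up state across the first gap of 999 time units
state0 <- c(level = 7, growthRate = 0.001)
state1 <- drop(Tmats[[1]] %*% state0)
state1[["level"]]   # 7 + 999 * 0.001
```

With equally spaced observations every gap is 1 and all transition matrices coincide, which is the usual regular-time-series case.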
growthUpdateFn
    growthUpdateFn(pars, model, Time)
pars |
= |
model |
= list assumed to have components |
Time |
= optional integer vector of times at which non-missing
observations are available. Default is |
returns a list returned by KFAS::SSMcustom() for a model
with potentially irregularly spaced univariate observations with a
2-dimensional (level, growthRate) state.
a model with components H and Q updated as described.
    GBR <- subset(MaddisonData, (ISO=='GBR') & !is.na(gdppc))
    growthMdl1 <- growthModel(.1, GBR$gdppc, Time=GBR$year)
    growthMdl1v0 <- growthUpdateFn(.1, growthMdl1)
    growthMdl1v <- growthUpdateFn(.1, growthMdl1, Time=GBR$year)
    growthMdl1v3 <- growthUpdateFn(1:3, growthMdl1, Time=GBR$year)
    growthFml <- (log(gdppc)~ -1 +
        SSMbespoke(growthModel(.1, GBR$gdppc) ))
    library(KFAS)
    growthSSmdl <-SSModel(growthFml, GBR, H=matrix(NA) )
    growthSSmdl1 <- growthUpdateFn(.1, growthSSmdl)
logMaddison returns a tibble::tibble of data on selected countries
extracted from MaddisonData, appending columns lnGDPpc and lnPop =
natural logarithms of gdppc and pop.
    logMaddison(ISO = NULL)
ISO |
either NULL to select all the data in |
a tibble::tibble with 6 columns:
ISO: 3-letter ISO code for countries selected
year: numeric year in the current era.
gdppc: Gross domestic product per capita adjusted for inflation to 2011 dollars at purchasing power parity.
pop: Population, mid-year (thousands)
lnGDPpc: log(gdppc)
lnPop: log(pop)
    logMaddison() # all
    logMaddison(c('GBR', 'USA')) # GBR, USA
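The appended columns are plain natural logarithms. The sketch below reproduces the idea in base R on a toy data.frame in the documented layout; `addLogs` is a hypothetical stand-in, the numbers are made up, and it returns a data.frame rather than a tibble.

```r
# Toy data in the documented layout (ISO, year, gdppc, pop); values are made up.
toy <- data.frame(
  ISO   = c("AAA", "AAA", "BBB"),
  year  = c(1900, 1950, 1950),
  gdppc = c(1000, 2000, NA),
  pop   = c(500, 800, 1200),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in: select countries, then append the two log columns.
addLogs <- function(dat, ISO = NULL) {
  if (!is.null(ISO)) dat <- dat[dat$ISO %in% ISO, , drop = FALSE]
  dat$lnGDPpc <- log(dat$gdppc)
  dat$lnPop   <- log(dat$pop)
  dat
}

toyLog <- addLogs(toy, "AAA")
names(toyLog)   # ISO, year, gdppc, pop, lnGDPpc, lnPop
```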
MadDateRanges returns a data.frame with 3 numeric columns:
yearBegin, yearEnd, and sourceNum from the vector of dateRanges
associated with different sources in MaddisonSources.
    MadDateRanges(dateRanges)
dateRanges |
character vector of date ranges, each associated with a different source. |
a data.frame with 3 columns
yearBegin, yearEnd: numeric years
sourceNum: 1, 2, 3, ... for the location in dateRanges
    MadDateRanges(c('1', '700 – 1500', '1252–1700 (England)',
        '1915-1919 & 1949', '1820, 1870, 1913, 1950'))
    # equal
    data.frame(
      yearBegin=c(1, 700, 1252, 1820, 1870, 1913, 1950),
      yearEnd =c(1, 1500, 1700, 1820, 1870, 1913, 1950),
      sourceNum=c(1, 2, 3, rep(4, 4)))
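To make the parsing step concrete, here is a minimal base-R sketch of how such strings could be split into begin and end years. `parseDateRanges` is a hypothetical re-implementation for illustration only; it does not claim to reproduce how MadDateRanges handles every input format.

```r
# Hypothetical re-implementation for illustration; not the package's code.
parseDateRanges <- function(dateRanges) {
  out <- lapply(seq_along(dateRanges), function(i) {
    # drop parenthetical notes such as " (England)"
    s <- gsub("\\(.*?\\)", "", dateRanges[i], perl = TRUE)
    # separate entries joined by "," or "&"
    pieces <- trimws(strsplit(s, "[,&]")[[1]])
    pieces <- pieces[nzchar(pieces)]
    # each piece is a single year or "begin-end" (hyphen or en dash)
    yrs <- lapply(strsplit(pieces, "\\s*[-\u2013]\\s*"), as.numeric)
    data.frame(
      yearBegin = vapply(yrs, function(v) v[1], numeric(1)),
      yearEnd   = vapply(yrs, function(v) v[length(v)], numeric(1)),
      sourceNum = i
    )
  })
  do.call(rbind, out)
}

parsed <- parseDateRanges(c("1", "700 \u2013 1500", "1252\u20131700 (England)"))
parsed   # one row per range, numbered by position in the input
```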
The Maddison project collates historical economic statistics from many sources.
MaddisonCountries is a data.frame of all (countrycode, country,
region) combinations in those data.
    MaddisonCountries
MaddisonCountries
A data frame with 3 columns:
ISO: 3-letter ISO country code
country: Country name used by the Maddison project
region: Geographic region including country
Its rownames = ISO.
https://www.rug.nl/ggdc/historicaldevelopment/maddison/releases/maddison-project-database-2020?lang=en "Groningen Growth and Development Centre"
    # Get the country for a countrycode (ISO)
    subset(MaddisonCountries, ISO=='GBR', country)
    # Or
    MaddisonCountries['GBR', 'country']
    # Find Yugoslavia
    subset(MaddisonCountries, grepl('Yugo', country), 1:3)
    # number of countries by region
    table(MaddisonCountries$region)
    # What are "Western Offshoots"?
    subset(MaddisonCountries, grepl('Of', region), c(country, ISO))
The Maddison project collates historical economic statistics from many sources.
MaddisonData is a data.frame that provides easy access to the 2023 version
of the Maddison project data downloaded 2025-08-28. The companion
MaddisonCountries is a data.frame of all (countrycode, country, region)
combinations in those data.
    MaddisonData
MaddisonData
A data frame with 4 columns:
ISO: 3-letter ISO country code
year: numeric year starting with year 1 CE
gdppc: Gross domestic product (GDP) per capita in 2011 dollars at purchasing power parity (PPP)
pop: Population, mid-year (thousands)
https://www.rug.nl/ggdc/historicaldevelopment/maddison/releases/maddison-project-database-2020?lang=en "Groningen Growth and Development Centre"
    # Get the countrycode for a country
    subset(MaddisonCountries, country=='United Kingdom', ISO)
    # Select
    str(GBR <- MaddisonData[MaddisonData$ISO=='GBR', ])
MaddisonLeaders computes the countries with the highest gdppc for each
year.
    MaddisonLeaders(
      except = character(0),
      y = "gdppc",
      group = "ISO",
      data = MaddisonData::MaddisonData,
      x = "year"
    )
except |
either NULL to select all the data in |
y |
name of column in |
group |
name of column in |
data |
|
x |
time variable. Default = |
an object of class c('MaddisonLeaders', 'data.frame'), with
columns
paste0(x, 'Begin'),
paste0(x, 'End'),
paste0(y, '0'),
paste0(y, '1'),
{{group}},
paste0('d', x, '0') =
paste0(x, 'End') - paste0(x, 'Begin') + dx, where
dx = min(diff(sort(unique(data[, x])))), and
paste0('d', x, '1') =
c(tail(paste0(x, 'Begin'), -1) - head(paste0(x, 'End'), -1), NA)
(defaults:
yearBegin,
yearEnd,
gdppc0,
gdppc1, and
ISO, plus
dyear0 = yearEnd - yearBegin + 1 and
dyear1 = c(tail(yearBegin, -1) - head(yearEnd, -1), NA)
),
with an attribute LeaderByYear = a data.frame with columns {{x}},
paste0('max', y), and {{group}} (defaults: year, maxgdppc, ISO).
    Leaders0 <- MaddisonLeaders() # max GDPpc for each year.
    # Presumed technology leaders without commodity leaders with narrow
    # economies
    Leaders1 <- MaddisonLeaders(c('ARE', 'KWT', 'QAT'))
    # since 1600
    MadDat1600 <- subset(MaddisonData, year>1600)
    Leaders1600 <- MaddisonLeaders(c('ARE', 'KWT', 'QAT'), data=MadDat1600)
    # max pop by region within percentiles of gdppc
    noGDP <- is.na(MaddisonData$gdppc)
    MadDat <-MaddisonData[!noGDP, ]
    gdpPcts <- quantile(MadDat$gdppc, seq(0, 1, .01), na.rm=TRUE)
    gdpPct <- unique(as.numeric(gdpPcts[-1]))
    gdpPc <-c(gdpPct[-100], tail(gdpPct, 1)*(1+sqrt(.Machine$double.eps)))
    gdp100 <- MadDat$gdppc
    nObs <- nrow(MadDat)
    for(i in 1:nObs){gdp100[i] <- min(gdpPc[MadDat$gdppc[i]<gdpPc])}
    MadDat$gdp100 <- gdp100
    MadDat$region <- MaddisonCountries[MadDat$ISO, 'region', drop=TRUE]
    MadPopRgnGDP<-MaddisonLeaders(y='pop',group='region',data=MadDat,x='gdp100')
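The idea of "leader for each year, collapsed into runs" can be sketched in base R on toy data. `leaderRuns` is a hypothetical helper, the numbers are made up, and the reading of gdppc0 and gdppc1 as the values at the start and end of each run is an assumption, not something the documentation above states.

```r
# Toy panel; values are made up for illustration.
toy <- data.frame(
  ISO   = rep(c("AAA", "BBB"), each = 4),
  year  = rep(2001:2004, times = 2),
  gdppc = c(10, 11, 12, 13,   9, 10, 13, 14),
  stringsAsFactors = FALSE
)

# Hypothetical helper: leader for each year, then runs of consecutive years.
leaderRuns <- function(dat, except = character(0)) {
  dat <- dat[!(dat$ISO %in% except) & !is.na(dat$gdppc), ]
  # one row per year: the group with the largest gdppc
  byYear <- do.call(rbind, lapply(split(dat, dat$year), function(d) {
    d[which.max(d$gdppc), c("year", "gdppc", "ISO")]
  }))
  # collapse consecutive years with the same leader
  runs  <- rle(byYear$ISO)
  end   <- cumsum(runs$lengths)
  begin <- end - runs$lengths + 1
  data.frame(
    yearBegin = byYear$year[begin],
    yearEnd   = byYear$year[end],
    gdppc0    = byYear$gdppc[begin],  # assumed: value at start of run
    gdppc1    = byYear$gdppc[end],    # assumed: value at end of run
    ISO       = runs$values,
    stringsAsFactors = FALSE
  )
}

runsAll   <- leaderRuns(toy)           # AAA leads 2001-2002, BBB 2003-2004
runsNoBBB <- leaderRuns(toy, "BBB")    # AAA leads throughout
```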
The Maddison project collates historical economic statistics from many sources.
MaddisonSources is a list of tibble::tibbles with ISO names
giving the sources of GDP per capita for different years for the said
country.
MaddisonYears is a data.frame giving yearBegin and yearEnd and the
number of each source in MaddisonSources for each ISO.
    MaddisonSources
    MaddisonYears
MaddisonSources
A named list of tibble::tibbles, one for each country, named with the
ISO country codes. Each tibble has one row for each source for the indicated
ISO and two columns:
character variable of year(s) for this source starting with year 1 CE.
character variable giving the source for the years
described.
In addition, MaddisonSources has an attribute since2008, which says,
"gdppc since 2008: Total Economy Database (TED) from the Conference Board
for all countries included in TED and UN national accounts statistics for
all others."
MaddisonYears
A data.frame with 4 columns:
ISO: 3-letter country code.
yearBegin, yearEnd: Integer year begin and end for each source.
sourceNum: Integer of the source within MaddisonSources[[ISO]].
An object of class data.frame with 133 rows and 4 columns.
https://www.rug.nl/ggdc/historicaldevelopment/maddison/releases/maddison-project-database-2020?lang=en "Groningen Growth and Development Centre"
    MaddisonSources[['GBR']]
    MaddisonSources[['GBR']][, 1, drop=TRUE]
    # = c('1', '1252–1700 (England)', '1700–1870')
    # for data from the year 1
    # and for England only between 1252 and 1700, etc.
    MaddisonSources[['IRN']][, 1, drop=TRUE]
    # = '1820, 1870, 1913, 1950'
    # for those 4 years only.
    MaddisonSources[c('GBR', 'USA')]
    MaddisonSources[['GBR']][, 1, drop=TRUE]
    # = c('1', '1252–1700 (England)', '1700–1870')
    MaddisonYears[MaddisonYears$ISO=='GBR', ] = data.frame(
      ISO=rep('GBR', 3), yearBegin=c(1, 1252, 1700),
      yearEnd =c(1, 1700, 1870), sourceNum=1:3 )
    MaddisonSources[['EGY']][, 1, drop=TRUE]
    # = c('1', '700 – 1500', '1820, 1870, 1913, 1950')
    MaddisonYears[MaddisonYears$ISO=='EGY', ] = data.frame(
      ISO=rep('EGY', 6), yearBegin=c(1, 700, 1820, 1870, 1913, 1950),
      yearEnd =c(1, 1500, 1820, 1870, 1913, 1950),
      sourceNum=c(1, 2, rep(3, 4)) )
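A table of (yearBegin, yearEnd, sourceNum) rows makes it straightforward to look up which source covers a given year. The sketch below uses only the GBR rows quoted in the examples above, typed in by hand so it runs without the package; `sourceNumFor` is a hypothetical helper.

```r
# GBR rows as quoted in the examples above, entered by hand.
yearsGBR <- data.frame(
  ISO       = rep("GBR", 3),
  yearBegin = c(1, 1252, 1700),
  yearEnd   = c(1, 1700, 1870),
  sourceNum = 1:3,
  stringsAsFactors = FALSE
)

# Hypothetical helper: which source number(s) cover a given year?
sourceNumFor <- function(year, years) {
  years$sourceNum[years$yearBegin <= year & year <= years$yearEnd]
}

sourceNumFor(1500, yearsGBR)   # 2
sourceNumFor(1700, yearsGBR)   # 2 and 3: both quoted ranges include 1700
sourceNumFor(1900, yearsGBR)   # integer(0): no row covers 1900
```

The returned number indexes the rows of MaddisonSources[[ISO]], as described for the sourceNum column.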
path_package2 returns a character vector of matches to target.
It differs from system.file() in that it supports searching for a target
file or folder possibly in subdirs of the working directory or in
nparents of its parents.
    path_package2(
      target,
      package = NULL,
      nparents = 1,
      subdirs = c("extdata", paste("inst", "extdata", sep = .Platform$file.sep))
    )
target |
A regular expression describing the file or folder desired. |
package |
Name of the package in which to search. If |
nparents |
integer indicating the number of parents of the working directory in which to search; default = 1. |
subdirs |
= |
This works in a vignette searching for a target that could be in the
vignettes directory of its parent package or in the package directory
or in, e.g., one of subdirs = c('extdata', paste('inst', 'extdata', sep=.Platform$file.sep)).
Returns the full path to match(es) if found and a character vector of length
0 if no matches are found. The returned object also has a searched
attribute being a character vector of the directories searched.
This was inspired by a desire to share with others a vignette describing how to create data objects from a file that could not itself be shared on CRAN. This is not easy, because the working directory available to code in a vignette changes depending on how that code is run.
path_package2 allows the user to store the target locally, e.g., in
inst/extdata but include it in .gitignore to prevent it from leaving the
local computer. The vignette then decides what to do after calling
path_package2() based on the length of the object returned.
a character vector with an attribute searched giving the full
paths of all directories searched for target.
    # search for a file matching a regular expression
    path_package2('^mpd.*xlsx$')
    # search only in the working directory
    path_package2('^mpd.*xlsx$', nparents=0, subdirs=character(0))
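The search strategy described above (working directory, some of its parents, and named subdirectories of each) can be sketched in base R. `findTarget` and its `start` argument are hypothetical and written only for illustration; the toy package layout and the file name mpd2023.xlsx are made up, and this is not the package's implementation.

```r
# Hypothetical base-R sketch; not the package's implementation.
findTarget <- function(target, nparents = 1, subdirs = "extdata",
                       start = getwd()) {
  # the starting directory and up to nparents of its ancestors
  roots <- start
  for (i in seq_len(nparents)) {
    roots <- c(roots, dirname(roots[length(roots)]))
  }
  # each root plus each requested subdirectory of each root
  searched <- unique(c(
    roots,
    file.path(rep(roots, each = length(subdirs)), subdirs)
  ))
  searched <- searched[dir.exists(searched)]
  found <- unlist(lapply(searched, list.files, pattern = target,
                         full.names = TRUE))
  if (is.null(found)) found <- character(0)
  attr(found, "searched") <- searched
  found
}

# Usage: a file kept in <pkg>/inst/extdata, searched for from <pkg>/vignettes
pkg <- file.path(tempdir(), "toyPkg")
dir.create(file.path(pkg, "inst", "extdata"), recursive = TRUE,
           showWarnings = FALSE)
dir.create(file.path(pkg, "vignettes"), showWarnings = FALSE)
invisible(file.create(file.path(pkg, "inst", "extdata", "mpd2023.xlsx")))

hit <- findTarget("^mpd.*xlsx$", nparents = 1,
                  subdirs = c("extdata", file.path("inst", "extdata")),
                  start = file.path(pkg, "vignettes"))
miss <- findTarget("^nothing-here$", nparents = 0, subdirs = character(0),
                   start = file.path(pkg, "vignettes"))

# A vignette could branch on the length of the result
if (length(hit) > 0) message("found: ", basename(hit[1]))
```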
MaddisonLeaders
summary.MaddisonLeaders returns a data.frame with columns ISO,
paste0(x, 'Begin'), paste0(x, 'End'), n, and p.
    ## S3 method for class 'MaddisonLeaders'
    summary(object, sortBy = "ISO", decreasing = FALSE, ...)
object |
= object of class |
sortBy |
= column of output used for sorting; default = |
decreasing |
default = |
... |
= optional arguments for |
a data.frame with columns
ISO = one row for each level of ISO in unique(object[, 'ISO'])
paste0(x, 'Begin') = earliest object[, paste0(x, 'Begin')] for ISO
paste0(x, 'End') = last object[, paste0(x, 'End')] for ISO
n = sum of (paste0(x, 'End') - paste0(x, 'Begin') + 1) for ISO
p = n/(paste0(x, 'End') - paste0(x, 'Begin') + 1)
(defaults:
yearBegin = earliest object[, 'yearBegin'] for ISO
yearEnd = last object[, 'yearEnd'] for ISO
n = sum of (yearEnd - yearBegin + 1) for ISO
p = n/(yearEnd - yearBegin + 1)
)
    Leaders0 <- MaddisonLeaders() # max GDPpc for each year.
    summary(Leaders0)
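The definitions of n and p can be checked by hand on a small table. The sketch below applies them in base R to a made-up leaders table with the default column names; `summarizeLeaders` is a hypothetical helper, not the package's summary method.

```r
# Toy leaders table with the default column names; values are made up.
leaders <- data.frame(
  yearBegin = c(1900, 1910, 1920),
  yearEnd   = c(1909, 1919, 1929),
  ISO       = c("AAA", "BBB", "AAA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper following the definitions of n and p given above.
summarizeLeaders <- function(object) {
  rows <- lapply(split(object, object$ISO), function(d) {
    yearBegin <- min(d$yearBegin)               # earliest begin for this ISO
    yearEnd   <- max(d$yearEnd)                 # last end for this ISO
    n <- sum(d$yearEnd - d$yearBegin + 1)       # years as leader
    data.frame(ISO = d$ISO[1], yearBegin = yearBegin, yearEnd = yearEnd,
               n = n, p = n / (yearEnd - yearBegin + 1),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

smry <- summarizeLeaders(leaders)
smry   # AAA: n = 20 of 30 years (p = 2/3); BBB: n = 10 of 10 years (p = 1)
```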
yr converts a Date to a year and fraction. For example, 2025-01-01
becomes 2025.00000, while 2025-01-02 becomes 2025.00274, because (2-1)/365
is 0.00274 to 3 significant digits. However, 2024-01-02 becomes 2024.00273,
because 2024 is a leap year and (2-1)/366 is only 0.00273 to 3 significant digits.
    yr(x, ...)
x |
quantity that can be converted to a |
... |
arguments passed to |
a number (numeric vector).
lubridate::decimal_date(), lubridate::ymd()
    Jan2_24_25 <- c('2024-01-02', '2025-01-02')
    J2yr <- yr(Jan2_24_25)
    J2y <- yr(as.POSIXct(Jan2_24_25))
    all.equal(J2yr, J2y)
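The year-plus-fraction arithmetic can be reproduced in base R without extra packages. `decimalYear` below is a hypothetical sketch of that arithmetic for dates given as Date objects or ISO strings; it is not the code of yr and ignores time of day.

```r
# Hypothetical base-R sketch of the year-plus-fraction arithmetic.
decimalYear <- function(x) {
  d     <- as.Date(x)
  y     <- as.integer(format(d, "%Y"))
  start <- as.Date(paste0(y, "-01-01"))        # first day of the same year
  end   <- as.Date(paste0(y + 1L, "-01-01"))   # first day of the next year
  # days elapsed divided by days in the year (365 or 366)
  y + as.numeric(d - start) / as.numeric(end - start)
}

dy <- decimalYear(c("2024-01-02", "2025-01-02"))
sprintf("%.5f", dy)   # "2024.00273" "2025.00274"
```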