Technological change
NOTE: The present notebook is coded in R. It relies heavily on the tidyverse ecosystem of packages. We load the tidyverse below as a prerequisite for the rest of the notebook - along with a few other libraries.
\(\rightarrow\) Don’t forget that code flows sequentially. A random chunk may not work if the previous ones have not been executed.
library(tidyverse) # Package for data wrangling
library(readxl) # Package to import MS Excel files
library(latex2exp) # Package for LaTeX expressions
library(quantmod) # Package for stock data extraction
library(highcharter) # Package for reactive plots
library(ggcorrplot) # Package for correlation plots
library(plm) # Package for panel models
library(WDI) # Package for World Bank data
library(broom) # Package for neat regression output
The content of the notebook is heavily inspired by the book Advanced Macro-economics - An Easy Guide.
Context
It’s not easy to explain growth endogenously. Adding new factors (e.g., human capital) to the production function does not help if constant returns to scale (CRS) are assumed.
Back to Solow
Recall a Cobb-Douglas production function, \(y=Ak^\alpha\). Suppose now that technology allows \(A\) to grow, i.e., \(\dot{A}_t/A_t=\gamma_A\) (\(A_t=A_0e^{\gamma_A t}\)). Then,
\[\dot{y}_t= \dot{A}_t k_t^\alpha + A_t\alpha \dot{k}_tk_t^{\alpha-1}\]
and \[\frac{\dot{y}_t}{y_t}= \frac{\dot{A}_t}{A_t}+\alpha \frac{\dot{k}_t}{k_t}=\gamma_A + \alpha \gamma_k.\] The trick is that technology does not enter the budget constraint (unlike capital). This is why it can have a nonzero growth rate in equilibrium. Capital remains constant, but technology grows indefinitely.
This is cheating, i.e., growth is exogenous.
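The decomposition \(\gamma_y=\gamma_A+\alpha\gamma_k\) can be checked numerically. The sketch below assumes illustrative values for \(\alpha\), \(\gamma_A\) and \(\gamma_k\) (none of them come from the text) and compares a finite-difference growth rate of \(y\) with the formula.

```r
# Illustrative parameters (assumptions, not taken from the text)
alpha   <- 0.3    # capital share
gamma_A <- 0.02   # exogenous growth rate of technology
gamma_k <- 0.01   # assumed growth rate of capital
A0 <- 1
k0 <- 1

# Output per capita when A and k both grow exponentially
y <- function(t) A0 * exp(gamma_A * t) * (k0 * exp(gamma_k * t))^alpha

# Growth rate of y = time derivative of log(y), via a central difference
h  <- 1e-4
t0 <- 5
growth_numeric <- (log(y(t0 + h)) - log(y(t0 - h))) / (2 * h)
growth_formula <- gamma_A + alpha * gamma_k

c(numeric = growth_numeric, formula = growth_formula)
```

Setting `gamma_k <- 0` reproduces the equilibrium case described above: output then grows at exactly `gamma_A`.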
A seemingly segmented economy
One possible “way out” is to posit a novel form of the production function. Here we follow Romer’s Endogenous Technological Change. Originally, the model assumes a variety of products, \(X(i)\) - \(i\) being the index. Now, there are also different types of products: the final ones and the intermediate (or raw) ones, which serve as inputs for final output. \(X(i)\) refers to the quantity of intermediate input of variety \(i\) used by the economy. But in fact, in the end (as we’ll see), the only thing that matters is the dichotomy between intermediate products and the final output. Indeed, the production function is seemingly more intricate:
\[Y(X)=\left(\int_0^M X(i)^\alpha di \right)^{1/\alpha},\] where \(M\) is the range of varieties (basically, the integral is a sum). Another way to see this is to imagine sectors that are infinitely small (hence the integral). Labor is left out to ease the computations - but can also be viewed as already incorporated in the \(X(i)\).
The firms that produce the final output take the prices of intermediate products (\(p(i)\)) as given. They seek to minimize costs for a given unit of good produced, i.e.,
\[\min_{X(i)} \int_0^Mp(i)X(i)di, \quad s.t. \quad \int_0^MX(i)^\alpha di=1 \tag{1}\]
The Lagrange formulation is
\[L=\int_0^Mp(i)X(i)di - \lambda \left(\int_0^MX(i)^\alpha di-1 \right)\]
and
\[\frac{\partial L}{\partial X(i)}=p(i)-\lambda \alpha X(i)^{\alpha-1}\]
so that the FOCs lead to \[X(i)=\left(\frac{\alpha \lambda}{p(i)} \right)^{1/(1-\alpha)}.\]
Demand is logically downward sloping: if variety \(i\) costs more, then demand for it will shrink.
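To make the demand schedule concrete, here is a small sketch with a discrete set of varieties (a sum replaces the integral). The three prices and the value of \(\alpha\) are hypothetical. Substituting the FOC into the constraint of problem (1) pins down \(\lambda\), which gives the closed form used in the code.

```r
alpha <- 0.5
p     <- c(1, 2, 4)   # hypothetical prices of three varieties

# FOC: X(i) = (alpha * lambda / p(i))^(1 / (1 - alpha)).
# Plugging it into sum(X^alpha) = 1 eliminates lambda and yields:
X <- p^(-1 / (1 - alpha)) * sum(p^(-alpha / (1 - alpha)))^(-1 / alpha)

# Multiplier implied by the FOC for each variety (should be identical)
lambda_implied <- p * X^(1 - alpha) / alpha

data.frame(price = p, demand = X, lambda = lambda_implied)
sum(X^alpha)   # the constraint: one unit of final output
```

Demand falls as the price rises, and the implied multiplier is the same across varieties, as the FOCs require.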
But in fact, under the simplifying assumption that intermediate products are homogeneous in the cross-section, the distinction vanishes. Indeed, setting \(X(i)=Z/M\), where \(Z\) represents the total resources required to produce the intermediate inputs, we get
\[Y=(M(Z/M)^\alpha)^{1/\alpha}=ZM^{1/\alpha-1},\]
which, from the perspective of \(Z\), is equivalent to the \(AK\) model. Since \(\alpha \in (0,1)\), the exponent \(1/\alpha-1=(1-\alpha)/\alpha\) is positive: more varieties raise output for given resources \(Z\).
Importantly, the varieties are not fixed once and for all; they may change, due to innovations and R&D. Hence, while \(Z\) is fixed, \(M\) is not and, for simplicity, we assume \(\dot{M}_t=\gamma_M M_t\) (i.e., \(M_t=M_0e^{\gamma_M t}\)). It holds that
\[\dot{Y}_t=Z (1/\alpha-1)M_t^{1/\alpha-2}\dot{M}_t\] so
\[\frac{\dot{Y}_t}{Y_t}=(1/\alpha-1)\frac{\dot{M}_t}{M_t}=(1/\alpha-1)\gamma_M.\] In the original paper, \(\gamma_M\) depends on \(\alpha\), on labor and, crucially, on the productivity of innovation. It is linearly increasing in the latter two variables.
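A quick numerical sanity check of the symmetric case, assuming illustrative values for \(\alpha\), \(Z\) and \(\gamma_M\) (none of them come from the text): \(Y\) is evaluated directly from its definition with \(X(i)=Z/M\), and its growth rate is compared with \(\gamma_M\).

```r
# Illustrative parameters (assumptions)
alpha   <- 0.4
Z       <- 10
gamma_M <- 0.03
M0      <- 1

# Final output in the symmetric case X(i) = Z / M, straight from the definition
Y <- function(M) (M * (Z / M)^alpha)^(1 / alpha)
M <- function(t) M0 * exp(gamma_M * t)

# Growth rate of Y via a central difference on log(Y)
h  <- 1e-4
t0 <- 10
growth_Y <- (log(Y(M(t0 + h))) - log(Y(M(t0 - h)))) / (2 * h)

c(ratio_numeric = growth_Y / gamma_M,     # growth of Y relative to growth of M
  ratio_formula = (1 - alpha) / alpha)    # (1 - alpha) / alpha = 1 / alpha - 1
```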
Semi-endogenous growth
Here we follow R&D-based models of economic growth.
The total labor force is split in two: \(L_Y\) for the labor that directly produces output and \(L_A\) for the workforce employed in R&D. The production function is
\[Y=K^{1-a}(AL_Y)^a\]
and the interesting part here is the evolution of \(A\), which is defined as the productivity of knowledge. We know that specifying \(\dot{A}/A=\delta\) is cheating, as this leads to exogenous growth. Instead, suppose
\[\dot{A}=\tilde{\delta} L_A^\lambda, \quad \lambda \in (0,1],\]
i.e., change in innovation is driven by the R&D headcount, but possibly at a power smaller than one. \(\tilde{\delta}\) is the rate at which “scientists” discover new ideas and products. This rate could depend on the level of knowledge in the economy. Here we assume that
\[\tilde{\delta}=\delta A^\phi,\]
where \(\phi\) determines the returns of knowledge. Note that it can be negative! In the end,
\[\dot{A}=\delta A^\phi L_A^\lambda \quad \Leftrightarrow \quad \gamma_A= \frac{\dot{A}}{A}=\delta A^{\phi-1}L_A^\lambda \]
If we differentiate with respect to \(t\), we get
\[\frac{\partial \gamma_A}{\partial t}=\delta(\lambda L_A^{\lambda-1}A^{\phi-1}\dot{L}_A+\dot{A}(\phi-1)A^{\phi-2}L_A^\lambda)\]
If the growth rate of \(A\) remains constant, the above quantity is zero. Dividing by \(\delta A^{\phi-1}L_A^\lambda\) gives \(\lambda \dot{L}_A/L_A+(\phi-1)\dot{A}/A=0\), i.e., for \(\phi<1\),
\[\frac{\lambda}{1-\phi}\frac{\dot{L_A}}{L_A}=\frac{\dot{A}}{A}\]
If the growth rate of \(L_A\) is \(n\), then we have
\[\gamma_A=\frac{\lambda n}{1-\phi}, \tag{2}\]
hence the parameter \(\phi\) plays a crucial role. This is all the more evident if we recall that, under standard assumptions, production factors follow dynamics such as
\[ \gamma_x= \frac{\dot{x}_t}{x_t}=s\frac{y_t}{x_t}-(\delta+n),\]
(where \(\delta\) now stands for the depreciation rate, not the R&D productivity parameter above). Hence, if \(\gamma_x\) is constant, the ratio \(y_t/x_t\) should be constant too, i.e., all quantities grow at the same rate, which will be given by Equation 2 in the model.
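Equation 2 can be illustrated with a small simulation of \(\dot{A}=\delta A^\phi L_A^\lambda\) when \(L_A\) grows at rate \(n\). This is only a sketch: all parameter values are assumptions chosen for illustration, and the law of motion is discretized with an Euler step on \(\log A\).

```r
# Illustrative parameter values (assumptions, not calibrated to any data)
delta  <- 0.05   # productivity of R&D
lambda <- 0.5    # returns to the R&D headcount, in (0, 1]
phi    <- 0.5    # returns to existing knowledge (must be < 1 here)
n_LA   <- 0.02   # growth rate of the R&D workforce (n in the text)

dt      <- 0.1     # time step, in years
n_steps <- 20000   # 2,000 years: convergence is slow
log_A   <- 0       # log(A_0), with A_0 = 1
gamma_A <- numeric(n_steps)

for (k in seq_len(n_steps)) {
  log_LA     <- n_LA * (k - 1) * dt   # log(L_A), with L_A(0) = 1
  gamma_A[k] <- delta * exp((phi - 1) * log_A + lambda * log_LA)
  log_A      <- log_A + gamma_A[k] * dt   # Euler step on log(A)
}

gamma_star <- lambda * n_LA / (1 - phi)   # prediction of Equation 2
c(start = gamma_A[1], end = gamma_A[n_steps], predicted = gamma_star)
```

Whatever the starting growth rate, \(\gamma_A\) drifts toward \(\lambda n/(1-\phi)\); changing `delta` alters the path but not the long-run rate, which is why growth here is called semi-endogenous.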
What the data says
The model
We now turn to an empirical exploration of the concepts and variables covered so far. Indeed, models are only worthwhile if they can explain and predict salient empirical properties of the economy.
We follow here a panel approach from Economic Growth in a Cross Section of Countries:
\[g_{t,i}=\textbf{X}_{t,i}\boldsymbol{\beta}+a \log(y_{t-1,i}) + e_{t,i},\]
where \(g_{t,i}\) is the growth rate of country \(i\) at date (year) \(t\) and \(y_{t,i}\) is GDP per capita. The matrix \(\textbf{X}_{t,i}\) embeds all variables of interest.
Fetching & wrangling the data
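impute <- function(v, n = 12){            # Imputation function (forward fill)
  for(j in 1:n){                          # Each pass extends the fill by one position
    ind <- which(is.na(v))                # Positions of missing values
    if(length(ind) > 0){
      if(ind[1] == 1){ind <- ind[-1]}     # Nothing to copy from before position 1
      v[ind] <- v[ind - 1]                # Copy the preceding value
    }
  }
  return(v)
}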
wb_growth <- WDI(                                # World Bank data
  indicator = c(
    "labor" = "SL.TLF.TOTL.IN",                  # Labor force (# individuals)
    "savings_rate" = "NY.GDS.TOTL.ZS",           # Savings rate (% GDP)
    "inflation" = "FP.CPI.TOTL.ZG",              # Inflation rate
    "trade" = "NE.TRD.GNFS.ZS",                  # Trade as % of GDP
    "pop" = "SP.POP.TOTL",                       # Population
    "pop_growth" = "SP.POP.GROW",                # Population growth
    "capital_formation" = "NE.GDI.TOTL.ZS",      # Gross capital formation (% GDP)
    "gdp_percap" = "NY.GDP.PCAP.CD",             # GDP per capita
    "RD_percap" = "GB.XPD.RSDV.GD.ZS",           # R&D expenditure (% GDP), despite the name
    "educ_level" = "SE.SEC.CUAT.LO.ZS",          # % pop reaching second. educ. level
    "educ_spending" = "SE.XPD.TOTL.GD.ZS",       # Education spending (% GDP)
    "debt" = "GC.DOD.TOTL.GD.ZS",                # Central gov. debt (% of GDP)
    "gdp" = "NY.GDP.MKTP.CD"                     # Gross Domestic Product (GDP)
  ), extra = TRUE,
  start = 1960,
  end = 2024) |>
  mutate(across(everything(), as.vector)) |>
  select(-status, -lending, -iso2c, -iso3c) |>
  filter(lastupdated == max(lastupdated)) |>
  arrange(country, year) |>
  mutate(capital_percap = capital_formation / labor, .before = "region")
We make a few adjustments to the data, adding GDP per capita growth and imputing a few points along the way (to increase sample size).
wb_growth <- wb_growth |>
  filter(region != "Aggregates") |>                       # Remove continents & co.
  group_by(country) |>
  mutate(gdp_growth = gdp_percap/dplyr::lag(gdp_percap) - 1, .before = "region") |>
  mutate(across(labor:capital_percap, ~ impute(.x, n = 3))) |>
  ungroup()
wb_growth |> head(9)
country | year | lastupdated | labor | savings_rate | inflation | trade | pop | pop_growth | capital_formation | gdp_percap | RD_percap | educ_level | educ_spending | debt | gdp | capital_percap | gdp_growth | region | capital | longitude | latitude | income |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Afghanistan | 1960 | 2024-09-19 | NA | NA | NA | NA | 8622466 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1961 | 2024-09-19 | NA | NA | NA | NA | 8790140 | 1.925952 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1962 | 2024-09-19 | NA | NA | NA | NA | 8969047 | 2.014879 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1963 | 2024-09-19 | NA | NA | NA | NA | 9157465 | 2.078997 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1964 | 2024-09-19 | NA | NA | NA | NA | 9355514 | 2.139651 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1965 | 2024-09-19 | NA | NA | NA | NA | 9565147 | 2.216007 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1966 | 2024-09-19 | NA | NA | NA | NA | 9783147 | 2.253524 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1967 | 2024-09-19 | NA | NA | NA | NA | 10010030 | 2.292638 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
Afghanistan | 1968 | 2024-09-19 | NA | NA | NA | NA | 10247780 | 2.347351 | NA | NA | NA | NA | NA | NA | NA | NA | NA | South Asia | Kabul | 69.1761 | 34.5228 | Low income |
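To see what the imputation step does within each country, here is a standalone sketch on a toy vector. The function is restated so that the chunk runs on its own, and the toy values are made up. The argument `n` caps how many consecutive missing values get filled with the last known value; a leading `NA` is never filled because there is nothing before it to copy.

```r
# Forward-fill imputation, restated from the notebook
impute <- function(v, n = 12){
  for(j in 1:n){
    ind <- which(is.na(v))
    if(length(ind) > 0){
      if(ind[1] == 1){ind <- ind[-1]}
      v[ind] <- v[ind - 1]
    }
  }
  return(v)
}

toy <- c(NA, 1, NA, NA, NA, 5)   # made-up series with a gap of three

fill_2 <- impute(toy, n = 2)     # only the first two values of the gap are filled
fill_3 <- impute(toy, n = 3)     # the whole gap is filled
rbind(toy, fill_2, fill_3)
```

With `n = 3`, as in the pipeline above, a value is carried forward by at most three years, which limits how stale an imputed observation can be.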
First analyses
Let’s have a look at missing data.
vars <- c("savings_rate", "inflation", "trade", "debt", "capital_formation" , "pop_growth",
          "RD_percap", "educ_level", "educ_spending")
wb_growth |> select(all_of(vars)) |> is.na() |> colMeans()
savings_rate inflation trade debt
0.37827035 0.36271802 0.35268895 0.82957849
capital_formation pop_growth RD_percap educ_level
0.38190407 0.01780523 0.78953488 0.71722384
educ_spending
0.53393895
Debt and R&D are missing in roughly 80% of observations, so including them costs a lot of data.
Which countries are the most represented in the data?
vars <- c("savings_rate", "inflation", "trade", "debt", "capital_formation" , "pop_growth",
          "RD_percap", "educ_level", "educ_spending")
wb_growth |>
  select(all_of(c(vars, "country"))) |>
  na.omit() |>
  group_by(country) |>
  count(sort = TRUE) |>
  head(12)
country | n |
---|---|
Canada | 28 |
Portugal | 26 |
Hungary | 23 |
Italy | 23 |
Luxembourg | 23 |
Bulgaria | 22 |
Malaysia | 22 |
Romania | 22 |
Sweden | 22 |
Australia | 21 |
Croatia | 21 |
Denmark | 21 |
Canada and Portugal make it to the top.
Next, let us check whether there is collinearity among the variables. Indeed, high correlations between independent variables are likely to perturb inference.
wb_growth |>
  select(all_of(vars)) |>
  na.omit() |>
  cor() |>
  ggcorrplot(lab = TRUE, digits = 1L) +
  scale_fill_viridis_c(alpha = 0.7)
Usually, a correlation of 0.5 (in absolute value) is considered already high. A value above 0.7 is prohibitive…
So here, it seems we are fine.
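Reading thresholds off a plot gets tedious with many variables. The sketch below lists the pairs whose absolute correlation exceeds 0.5 instead. It runs on synthetic data (the variables `x1`, `x2`, `x3` are made up, with `x2` built to be strongly correlated with `x1`), so it illustrates the check rather than reproducing the result above.

```r
set.seed(1)
n_obs <- 200
x1 <- rnorm(n_obs)
x2 <- 0.9 * x1 + sqrt(1 - 0.9^2) * rnorm(n_obs)   # correlated with x1 by construction
x3 <- rnorm(n_obs)                                # independent of the others

cm <- cor(cbind(x1, x2, x3))

# Keep each pair once (upper triangle) and flag |correlation| > 0.5
idx <- which(abs(cm) > 0.5 & upper.tri(cm), arr.ind = TRUE)
flagged <- data.frame(var1 = rownames(cm)[idx[, "row"]],
                      var2 = colnames(cm)[idx[, "col"]],
                      cor  = cm[idx])
flagged
```

The same three lines would work on `cor()` applied to the notebook's variables; an empty `flagged` table would correspond to the "we are fine" verdict.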
Panel estimation
plm(formula = gdp_growth ~ . ,
    data = wb_growth |>
      dplyr::select(all_of(c("country", "year", "gdp_growth", vars))) |>
      na.omit() |>
      group_by(country) |>
      mutate(n = n()) |>
      filter(n > 10),
    effect = "twoways",
    index = c("country", "year"),
    model = "within") |>
  tidy()
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
savings_rate | 0.0033133 | 0.0008066 | 4.1078015 | 0.0000440 |
inflation | -0.0003243 | 0.0007822 | -0.4146078 | 0.6785385 |
trade | -0.0001284 | 0.0001632 | -0.7867243 | 0.4316728 |
debt | 0.0000349 | 0.0001578 | 0.2210084 | 0.8251415 |
capital_formation | 0.0027687 | 0.0007905 | 3.5022791 | 0.0004865 |
pop_growth | -0.0167985 | 0.0042139 | -3.9864333 | 0.0000731 |
RD_percap | 0.0078383 | 0.0109675 | 0.7146836 | 0.4750100 |
educ_level | -0.0002141 | 0.0005232 | -0.4092601 | 0.6824567 |
educ_spending | 0.0011688 | 0.0044388 | 0.2633038 | 0.7923833 |
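To make the two-way "within" specification more concrete, here is a sketch on a synthetic balanced panel (made-up data, not the World Bank sample). It relies on the textbook equivalence between the within estimator and a least-squares regression with country and year dummies, so base `lm()` is used instead of `plm()`. The true slope `beta_true` and all other values are assumptions.

```r
set.seed(42)
n_countries <- 30
n_years     <- 20

# Balanced synthetic panel: one row per country-year
panel <- expand.grid(country = paste0("C", seq_len(n_countries)),
                     year    = 2000 + seq_len(n_years))
country_fe <- rnorm(n_countries)[as.integer(panel$country)]   # country effects
year_fe    <- rnorm(n_years)[panel$year - 2000]               # year effects

beta_true <- 0.5
panel$x <- 0.8 * country_fe + rnorm(nrow(panel))   # regressor correlated with country effects
panel$y <- beta_true * panel$x + country_fe + year_fe + rnorm(nrow(panel), sd = 0.5)

pooled <- lm(y ~ x, data = panel)                                    # ignores the effects
twoway <- lm(y ~ x + factor(country) + factor(year), data = panel)   # dummy-variable version

c(true   = beta_true,
  pooled = unname(coef(pooled)["x"]),
  twoway = unname(coef(twoway)["x"]))
```

Because the regressor is correlated with the country effects, the pooled slope is biased upward, while the dummy-variable regression recovers a value close to `beta_true`. This is the motivation for `effect = "twoways"` in the estimation above.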
To detect which variables matter, we look at p-values: each one is the probability of obtaining an estimate at least as “extreme” as the one observed, under the assumption that the true coefficient is equal to zero (this hypothesis is called the null). Hence, a p-value close to zero is evidence against the null, i.e., it supports the existence of a link (not necessarily causal) between the dependent and independent variables.
Here, the savings rate and capital formation both have significant positive coefficients, while population growth has a significant negative one.
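The columns of the table are linked: the statistic is the estimate divided by its standard error, and the p-value is the two-sided tail probability of that statistic under the null. The sketch below redoes the computation for `savings_rate` with the rounded figures from the table. The residual degrees of freedom are not reported, so a normal approximation is assumed; it should give a p-value slightly smaller than the t-based one shown above.

```r
# Rounded values copied from the savings_rate row of the table
estimate  <- 0.0033133
std_error <- 0.0008066

t_stat   <- estimate / std_error       # test statistic under the null (coefficient = 0)
p_normal <- 2 * pnorm(-abs(t_stat))    # two-sided p-value, normal approximation (assumption)

c(statistic = t_stat, p_value = p_normal)
```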
A step back: heuristic sources of growth
(see Campante, Sturzenegger and Velasco (CSV), section 7.2)
- luck: initial conditions may have an impact.
- geography: natural resources (minerals, grains, cattle, etc.) are a key driver of growth. Diseases are also more frequent in some parts of the globe.
- culture (customary beliefs and values): they can drive economic decisions, but are also hard to measure.
- institutions: property rights, labor markets, regulation.