vignettes/introduction_to_life_tables.Rmd
A life table is a table that includes information to describe the dying out of a birth cohort. This can also be a synthetic birth cohort, in which case we refer to it as a period life table.
Life tables are one of the most important devices in demography – they have been used since the 1600s! They can also be useful for other fields, because they are generalizable to other discrete “time to event” data.
Typically, life tables have one row per age group, with columns representing life table metrics, also known as parameters. The life table parameters used in this package are ${}_nm_x$, ${}_na_x$, ${}_nq_x$, ${}_np_x$, $l_x$, ${}_nd_x$, ${}_nL_x$, $T_x$, and $e_x$. In this notation, $x$ refers to age, and the metrics apply either to age $x$ directly or to the interval between ages $x$ and $x+n$, where $n$ indicates the length of the interval, typically in years. We often use a shorthand that drops the “n” from the notation, with interval length implied.
Here’s an example of a period life table, for males in Austria in
1992, which is saved in the package as austria_1992_lt
(source: Preston 2001):
| x | x+n | deaths | pop | mx | ax | qx | px | lx | dx | nLx | Tx | ex |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 419 | 47925 | 0.01 | 0.07 | 0.01 | 0.99 | 1.00 | 0.01 | 0.99 | 72.89 | 72.89 |
| 1 | 5 | 70 | 189127 | 0.00 | 1.63 | 0.00 | 1.00 | 0.99 | 0.00 | 3.96 | 71.90 | 72.53 |
| 5 | 10 | 36 | 234793 | 0.00 | 2.50 | 0.00 | 1.00 | 0.99 | 0.00 | 4.95 | 67.94 | 68.63 |
| 10 | 15 | 46 | 238790 | 0.00 | 3.14 | 0.00 | 1.00 | 0.99 | 0.00 | 4.94 | 62.99 | 63.68 |
| 15 | 20 | 249 | 254996 | 0.00 | 2.72 | 0.00 | 1.00 | 0.99 | 0.00 | 4.93 | 58.04 | 58.74 |
| 20 | 25 | 420 | 326831 | 0.00 | 2.52 | 0.01 | 0.99 | 0.98 | 0.01 | 4.90 | 53.11 | 54.01 |
| 25 | 30 | 403 | 355086 | 0.00 | 2.48 | 0.01 | 0.99 | 0.98 | 0.01 | 4.87 | 48.21 | 49.35 |
| 30 | 35 | 441 | 324222 | 0.00 | 2.60 | 0.01 | 0.99 | 0.97 | 0.01 | 4.84 | 43.34 | 44.61 |
| 35 | 40 | 508 | 269963 | 0.00 | 2.70 | 0.01 | 0.99 | 0.96 | 0.01 | 4.80 | 38.50 | 39.90 |
| 40 | 45 | 769 | 261971 | 0.00 | 2.66 | 0.01 | 0.99 | 0.96 | 0.01 | 4.75 | 33.70 | 35.25 |
| 45 | 50 | 1154 | 238011 | 0.00 | 2.70 | 0.02 | 0.98 | 0.94 | 0.02 | 4.66 | 28.95 | 30.73 |
| 50 | 55 | 1866 | 261612 | 0.01 | 2.68 | 0.04 | 0.96 | 0.92 | 0.03 | 4.52 | 24.29 | 26.42 |
| 55 | 60 | 2043 | 181385 | 0.01 | 2.64 | 0.05 | 0.95 | 0.89 | 0.05 | 4.32 | 19.77 | 22.29 |
| 60 | 65 | 3496 | 187962 | 0.02 | 2.62 | 0.09 | 0.91 | 0.84 | 0.07 | 4.01 | 15.45 | 18.43 |
| 65 | 70 | 4366 | 153832 | 0.03 | 2.62 | 0.13 | 0.87 | 0.76 | 0.10 | 3.58 | 11.43 | 14.97 |
| 70 | 75 | 4337 | 105169 | 0.04 | 2.59 | 0.19 | 0.81 | 0.66 | 0.12 | 3.01 | 7.86 | 11.86 |
| 75 | 80 | 5279 | 73694 | 0.07 | 2.52 | 0.30 | 0.70 | 0.54 | 0.16 | 2.28 | 4.84 | 9.00 |
| 80 | 85 | 6460 | 57512 | 0.11 | 2.42 | 0.44 | 0.56 | 0.37 | 0.16 | 1.45 | 2.56 | 6.84 |
| 85 | Inf | 6146 | 32248 | 0.19 | 5.25 | 1.00 | 0.00 | 0.21 | 0.21 | 1.11 | 1.11 | 5.25 |
For reference, the following is a list of life table metrics and their definitions.
${}_nm_x$: mortality rate between ages $x$ and $x+n$. Shorthand to $m_x$ with implied interval width ($n$). Equals deaths divided by person-years lived in the interval. Mid-year population is commonly used as an adequate approximation of the person-years denominator.
${}_na_x$: mean person-years lived between ages $x$ and $x+n$ by those who die within the interval. Shorthand to $a_x$ with implied interval width ($n$).
${}_nq_x$: probability of death between ages $x$ and $x+n$, conditional on survival to age $x$. Shorthand to $q_x$ with implied interval width ($n$). Equals deaths in the interval divided by survivors to the $x$-th birthday. Examples: ${}_5q_0$ = probability of death between birth and age 5; ${}_{45}q_{15}$ = probability of death between age 15 and age 60 conditional on survival to age 15.
$l_x$: proportion of the cohort surviving to age $x$.
$e_x$: life expectancy at age $x$, that is, the mean number of years lived after the $x$-th birthday by those surviving to age $x$. Life expectancy at birth is $e_0$.
${}_nL_x$: total person-years lived between ages $x$ and $x+n$.
$T_x$: total person-years lived above age $x$.
${}_nd_x$: proportion of the cohort dying between ages $x$ and $x+n$. Shorthand to $d_x$.
${}_np_x$: probability of survival between ages $x$ and $x+n$, conditional on survival to age $x$. Complement of ${}_nq_x$, so ${}_np_x = 1 - {}_nq_x$.
We often reduce life tables to age patterns of log probability of death ($\log(q_x)$) or to survival curves ($l_x$ over age), which can be easily displayed and vetted in plots.


The demCore package includes many utility functions that leverage the mathematical relationships between life table metrics to build out a complete life table. This section describes these functions and their underlying methods, with examples, by rebuilding the example life table above from death counts and population.
Note that this document and this package do not contain an exhaustive list of relationships between metrics. Additionally, some equations presented rely on assumptions and others are true relationships that are always valid. For more details, see the Preston Demography textbook, from which many of these details were drawn.
From raw death count and population data, the place to start with a life table is $m_x$.
Using our example data, $m_x$ is deaths divided by population (for example, $419 / 47925 \approx 0.0087$ for age 0).
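A minimal base R sketch of this calculation, using four rows of the example table. The column names `age_start` and `age_end` follow the code used elsewhere in this vignette, while `population` is an assumed name and the data frame is built by hand rather than loaded from the package.

```r
# Four age groups from the example table (Austrian males, 1992).
# The column name "population" is an assumption for illustration.
dt_sketch <- data.frame(
  age_start  = c(0, 1, 5, 85),
  age_end    = c(1, 5, 10, Inf),
  deaths     = c(419, 70, 36, 6146),
  population = c(47925, 189127, 234793, 32248)
)

# Mortality rate: deaths divided by mid-year population, which stands in
# for person-years lived in the interval.
dt_sketch$mx <- dt_sketch$deaths / dt_sketch$population

round(dt_sketch$mx, 4)
```

The same idea carries over to a data.table, where the rate can be added by reference.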
If we have $m_x$ and $q_x$ we can directly calculate $a_x$. However, we often use $m_x$ and $a_x$ to get $q_x$ in the first place, and so have to make some assumptions to get $a_x$. Empirical calculations of $a_x$ would require detailed and accurate data on age at death in days (such as paired date of birth and date of death), which is typically unavailable.
Rule of thumb:
One option is to assume all deaths occur in the middle of the interval, so $a_x = n/2$. This assumption works well for most ages, but it doesn’t work as well for the very young or very old, where mortality can change rapidly over the interval.
Another assumption we can make is that the age-specific death rate is constant between ages $x$ and $x+n$. Under this assumption,

$$ {}_na_x = n + \frac{1}{{}_nm_x} - \frac{n}{1 - e^{-n \cdot {}_nm_x}} $$

The function mx_to_ax implements this assumption.
Using our example data, we get:

    dt <- hierarchyUtils::gen_length(dt, col_stem = "age")
    dt[, ax := mx_to_ax(mx = mx, age_length = age_length)]

Note that we can use hierarchyUtils::gen_length to add the age_length column given age_start and age_end.
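To see what the constant-rate assumption implies numerically, here is a base R sketch of the formula. `ax_constant_rate` is a hypothetical helper written for illustration and is not demCore's implementation. For an open-ended interval it uses $1/m_x$, which is the limit of the formula as $n$ grows without bound.

```r
# Hypothetical helper: mean years lived in the interval by those who die
# in it, assuming a constant death rate mx over an interval of length n.
ax_constant_rate <- function(mx, n) {
  ifelse(
    is.infinite(n),
    1 / mx,                               # limit of the formula as n -> Inf
    n + 1 / mx - n / (1 - exp(-n * mx))
  )
}

mx_sketch <- c(36 / 234793, 6460 / 57512, 6146 / 32248)  # ages 5-9, 80-84, 85+
n_sketch  <- c(5, 5, Inf)

ax_sketch <- ax_constant_rate(mx_sketch, n_sketch)
round(ax_sketch, 2)
```

For the closed intervals the result sits slightly below $n/2$, and the gap widens as the mortality rate rises.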
1a0 and 4a1:
Preston et al. adapted an analysis first completed by Coale and Demeny (1983) to derive a relationship between the infant mortality rate (${}_1m_0$) and under-5 $a_x$ values (${}_1a_0$ and ${}_4a_1$). In the absence of reliable data to produce $a_x$, these relationships can be used to predict $a_x$ from infant $m_x$:
| | Males | Females |
|---|---|---|
| 1a0: | | |
| If 1m0 >= 0.107 | 0.330 | 0.350 |
| If 1m0 < 0.107 | 0.045 + 2.684 * 1m0 | 0.053 + 2.800 * 1m0 |
| 4a1: | | |
| If 1m0 >= 0.107 | 1.352 | 1.361 |
| If 1m0 < 0.107 | 1.651 - 2.816 * 1m0 | 1.522 - 1.518 * 1m0 |
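The table can be read as a small lookup on sex and the level of 1m0. The sketch below encodes it in base R. `u5_ax` and its return names `a0` and `a1` are hypothetical, chosen for illustration, and this is not the code behind gen_u5_ax_from_mx.

```r
# Hypothetical helper encoding the table above for a single 1m0 value.
u5_ax <- function(m0, sex = c("male", "female")) {
  sex  <- match.arg(sex)
  high <- m0 >= 0.107
  if (sex == "male") {
    a0 <- if (high) 0.330 else 0.045 + 2.684 * m0
    a1 <- if (high) 1.352 else 1.651 - 2.816 * m0
  } else {
    a0 <- if (high) 0.350 else 0.053 + 2.800 * m0
    a1 <- if (high) 1.361 else 1.522 - 1.518 * m0
  }
  c(a0 = a0, a1 = a1)  # a0 is 1a0, a1 is 4a1
}

u5_sketch <- u5_ax(419 / 47925, sex = "male")
round(u5_sketch, 2)
```

Rounded to two decimals, these values agree with the ax column in the first two rows of the example life table.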
Use the gen_u5_ax_from_mx function to implement this method:

    dt[, sex := "male"]
    gen_u5_ax_from_mx(dt, id_cols = c("age_start", "age_end", "sex"))

Graduation method: One strategy for selecting $a_x$ values is based on the level and slope of the $m_x$ function. Comparing two populations with the same $m_x$, the population with the more rapidly rising mortality rate with respect to age will have deaths that are more concentrated in the later part of the interval (higher $a_x$). Comparing two populations with the same slope in $m_x$, the one with the higher mortality rate will have more deaths at the beginning of the interval (lower $a_x$).
To utilize this theory, we can implement iteration as described in the Preston book, and originally proposed by Keyfitz (1966):

$$ {}_na_x = \frac{-\frac{n}{24}\,{}_nd_{x-n} + \frac{n}{2}\,{}_nd_x + \frac{n}{24}\,{}_nd_{x+n}}{{}_nd_x} $$

Where ${}_nd_x$ is derived from the conversion from $q_x$ to $d_x$. However, since the $m_x$ to $q_x$ conversion requires $a_x$, this requires us to pick a starting place for $a_x$ (like $n/2$), solve for $q_x$, solve for $d_x$, update $a_x$, and so on until convergence. Use demCore::iterate_ax to implement this method.
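The loop can be sketched in base R as follows. This is an illustration under simplifying assumptions and not the code behind demCore::iterate_ax: all intervals have the same width `n`, the first and last groups keep $a_x = n/2$ because they lack a neighbour on one side, and `iterate_ax_sketch` with its arguments is a hypothetical name.

```r
# Hypothetical sketch of the graduation iteration for equal-width groups.
iterate_ax_sketch <- function(mx, n = 5, tol = 1e-6, max_iter = 100) {
  k    <- length(mx)
  ax   <- rep(n / 2, k)          # starting values
  iter <- 0
  for (iter in seq_len(max_iter)) {
    qx <- n * mx / (1 + (n - ax) * mx)        # mx and ax -> qx
    lx <- c(1, cumprod(1 - qx))[seq_len(k)]   # survivors at start of each group
    dx <- lx * qx                             # qx -> dx
    ax_new <- ax
    i <- 2:(k - 1)                            # interior groups only
    ax_new[i] <- (-n / 24 * dx[i - 1] + n / 2 * dx[i] + n / 24 * dx[i + 1]) / dx[i]
    converged <- max(abs(ax_new - ax)) < tol
    ax <- ax_new
    if (converged) break
  }
  list(ax = ax, iterations = iter)
}

# Death rates for ages 50-54 through 70-74 from the example data.
mx_grad  <- c(1866 / 261612, 2043 / 181385, 3496 / 187962,
              4366 / 153832, 4337 / 105169)
res_grad <- iterate_ax_sketch(mx_grad)
round(res_grad$ax, 2)
```

Because deaths rise with age across these groups, the interior $a_x$ values move above $n/2$, in line with the reasoning above.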
From $m_x$ and $a_x$, we can solve directly for $q_x$:

$$ {}_nq_x = \frac{n \cdot {}_nm_x}{1 + (n - {}_na_x) \cdot {}_nm_x} $$

For the terminal age group, $q_x$ should be 1 because all individuals surviving to the terminal age group will die in that age group (probability of death = 1).
The mx_ax_to_qx function combines the equation for $q_x$ and the requirement that terminal $q_x$ equal one by setting $q_x = 1$ if age_length = Inf.

    dt[, qx := mx_ax_to_qx(mx = mx, ax = ax, age_length = age_length)]

Other functions that utilize this relationship but solve for different metrics are mx_qx_to_ax and qx_ax_to_mx.
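As a base R illustration of the $m_x$, $a_x$ to $q_x$ relationship, including the terminal-group rule, consider the sketch below. `qx_from_mx_ax` is a hypothetical helper and not the package function, and the $a_x$ inputs are the rounded values shown in the example table.

```r
# Hypothetical helper: qx from mx and ax, with qx forced to 1 for an
# open-ended (terminal) interval.
qx_from_mx_ax <- function(mx, ax, n) {
  qx <- n * mx / (1 + (n - ax) * mx)
  qx[is.infinite(n)] <- 1   # everyone reaching the terminal group dies in it
  qx
}

mx_term <- c(6460 / 57512, 6146 / 32248)  # ages 80-84 and 85+
ax_term <- c(2.42, 5.25)                  # rounded ax from the example table
n_term  <- c(5, Inf)

qx_term <- qx_from_mx_ax(mx_term, ax_term, n_term)
round(qx_term, 2)
```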
You can also solve for $q_x$ under the assumption of a constant mortality rate within an interval, which removes $a_x$ from the relationship:

$$ {}_nq_x = 1 - e^{-n \cdot {}_nm_x} $$

    dt[, qx_compare := mx_to_qx(mx = mx, age_length = age_length)]

These two $q_x$ values are the same, because the implied $a_x$ in mx_to_qx is equivalent to the $a_x$ we generate under the constant-rate assumption in mx_to_ax.
To calculate the proportion of a cohort surviving to age $x$ ($l_x$), we set $l_0 = 1$ (100% survive to birth), and recursively calculate:

$$ l_{x+n} = l_x \cdot (1 - {}_nq_x) $$

or in words, the proportion surviving to age $x$ times the proportion of those survivors who do not die between $x$ and $x+n$ is the proportion surviving to age $x+n$.
Our gen_lx_from_qx function can perform this
calculation:
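
    gen_lx_from_qx(dt, id_cols = c("age_start", "age_end"))

The proportion of the cohort dying between ages $x$ and $x+n$ (${}_nd_x$) is ${}_nq_0$ for the first age group (because $l_0 = 1$), then $l_x - l_{x+n}$ thereafter (the difference between the proportion surviving to age $x$ and the proportion surviving to age $x+n$).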
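The $l_x$ recursion and the $d_x$ differences can be sketched together in base R. The $q_x$ values below are made up for illustration, and the last group is treated as terminal.

```r
# Illustrative qx values; the final (terminal) group has qx = 1.
qx_demo <- c(0.01, 0.02, 0.05, 1)

# l_0 = 1, then l_{x+n} = l_x * (1 - qx); drop the value past the terminal group.
lx_demo <- c(1, cumprod(1 - qx_demo))[seq_along(qx_demo)]

# Proportion of the cohort dying in each group; equals l_x - l_{x+n}.
dx_demo <- lx_demo * qx_demo

data.frame(qx = qx_demo, lx = lx_demo, dx = dx_demo)
```

Since every member of the cohort eventually dies, the $d_x$ column sums to 1, and the terminal $d_x$ equals the terminal $l_x$.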
To calculate $d_x$, use gen_dx_from_lx:
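
    gen_dx_from_lx(dt, id_cols = c("age_start", "age_end"))

The person-years lived between ages $x$ and $x+n$ (${}_nL_x$) can be broken down into the person-years lived by those who survive the interval ($n \cdot l_{x+n}$) and the person-years lived by those who die during it (${}_na_x \cdot {}_nd_x$), such that:

$$ {}_nL_x = n \cdot l_{x+n} + {}_na_x \cdot {}_nd_x $$

For the terminal age group:

$$ {}_\infty L_x = \frac{l_x}{{}_\infty m_x} $$

Use the gen_nLx function to calculate ${}_nL_x$ with this method.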
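A base R sketch of the two ${}_nL_x$ cases, applied to the last two rows of the example table. `nLx_sketch` is a hypothetical helper and not the package function. Because the inputs are the rounded table values, the results match the nLx column only approximately.

```r
# Hypothetical helper: person-years lived in each age group.
nLx_sketch <- function(lx, dx, ax, mx, n) {
  lx_next <- c(lx[-1], 0)   # l_{x+n}; nobody survives past the terminal group
  ifelse(
    is.infinite(n),
    lx / mx,                # terminal age group
    n * lx_next + ax * dx   # survivors' years plus years lived by those dying
  )
}

nLx_demo <- nLx_sketch(
  lx = c(0.37, 0.21),                  # ages 80-84 and 85+
  dx = c(0.16, 0.21),
  ax = c(2.42, 5.25),
  mx = c(6460 / 57512, 6146 / 32248),
  n  = c(5, Inf)
)
round(nLx_demo, 2)
```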
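Life expectancy at age $x$ (mean person-years lived above age $x$) is equal to the total person-years lived above age $x$ divided by the proportion surviving to age $x$:

$$ e_x = \frac{T_x}{l_x} $$

where $T_x$ is the sum of ${}_nL_x$ over all age groups starting at age $x$ or older.
For the terminal age group, $e_x = a_x$ because everyone surviving to the interval dies in the interval.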
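In base R, $T_x$ is a reverse cumulative sum of ${}_nL_x$, and $e_x$ follows by division. The sketch below uses the rounded values from the last three rows of the example table, so $e_x$ agrees with the table only to within rounding error.

```r
# Rounded values for ages 75-79, 80-84 and 85+ from the example table.
nLx_tail <- c(2.28, 1.45, 1.11)
lx_tail  <- c(0.54, 0.37, 0.21)

# T_x: person-years lived in this age group and all older ones.
Tx_tail <- rev(cumsum(rev(nLx_tail)))

# e_x: remaining person-years per survivor to age x.
ex_tail <- Tx_tail / lx_tail

data.frame(nLx = nLx_tail, Tx = Tx_tail, ex = round(ex_tail, 2))
```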
Calculate $e_x$ with gen_ex:

    gen_ex(dt)

One possible set of steps for calculating a complete period life table from deaths and mid-year population is:

1. Calculate $m_x$ as deaths divided by mid-year population.
2. Calculate under-5 $a_x$ with gen_u5_ax_from_mx, and set $a_x$ over age 5 as $n/2$.
3. Calculate $q_x$ with mx_ax_to_qx.
4. Use iterate_ax to modify $a_x$ and $q_x$ values, improving $a_x$ over the naive $n/2$ values.
5. Calculate the remaining metrics with the lifetable function. The lifetable function combines many of the functions described in this vignette for convenience.

Preston SH, Heuveline P, Guillot M. Demography: measuring and modeling population processes. Malden, MA: Blackwell Publishing; 2001.
Coale AJ, Demeny P, Vaughan B. Regional model life tables and stable populations: studies in population. Elsevier; 2013 Oct 22.
Keyfitz N. A life table that agrees with the data. Journal of the American Statistical Association. 1966 Jun 1;61(314):305-12.