Package 'propop'

Title: Project Population Growth in Switzerland using the Cohort Component Method
Description: The purpose of this package is to project the development of the population at different spatial levels (e.g., cantons, districts, municipalities) using the cohort component method and the parameters provided by the Federal Statistical Office (FSO).
Authors: Norah Efosa [aut, cre], Adrian Gadient [aut], Tina Cornioley [aut], Jan Wunder [aut], Niklas Haffert [aut], Andrea Plüss [ctb], Nadine Herrmann [ctb], Lutz Benson [ctb], Eric Meyer [ctb], Statistik Aargau [fnd, cph]
Maintainer: Norah Efosa <[email protected]>
License: GPL (>= 3)
Version: 1.2.1
Built: 2025-01-16 10:32:26 UTC
Source: https://github.com/statistik-aargau/propop

Help Index


Aggregate evaluation measures

Description

Returns descriptive summary statistics of model accuracy and bias measures across demographic groups and years. The returned statistics are particularly useful for comparing the model performance for different groups or different models.

Usage

aggregate_measures(data, weight_groups = NULL)

Arguments

data

data frame created with function compute_measures.

weight_groups

character, optional argument indicating one or more column names to obtain evaluation criteria weighted for specific groups (e.g., age groups, nationality).

Value

A data frame. The data frame includes the following summary measures:

  • mpe is the mean percentage error (mpe; or mean algebraic percentage error malpe); it is a bias indicator as it takes the direction of the error into account. Positive values indicate that the projections were, overall, too high. Negative values indicate that the projections were, overall, too low. The closer the value is to zero, the lower the bias.

  • medpe is the median (or middle value) of the percentage error (medpe). Particularly useful for small samples or skewed distributions. The closer the value is to zero, the lower the bias.

  • mape is the mean absolute percentage / proportional error (mape). It considers variance (or amplitude) and can be seen as a measure of precision. The smaller the value, the lower is the average error.

  • medape is the median (or middle value) of the absolute percentage error (medape). Particularly useful for small samples or skewed distributions. The smaller the value, the lower is the average error.

  • rmse is the root mean square error; it is an indication of the robustness or quality of the projection. The smaller the value, the more robust the projection.

  • wmape is the weighted mean absolute percentage error (wmape); in contrast to mape, this measure weights each absolute percentage error according to the population size of the "focal" group (e.g., nationality, age group) and thus considers domain size. Put differently, errors count more in populous groups than in less populous groups. This measure is particularly useful when population sizes vary strongly. The closer the value is to zero, the more precise the projection.

  • n_measure is the number of times a summary measure occurs (per weight group if requested).

  • ape_under_1 is a measure to gauge the error distribution; it indicates the proportion of observations that have absolute percentage errors smaller than 1%.

  • ape_under_5 is a measure to gauge the error distribution; it indicates the proportion of observations that have absolute percentage errors smaller than 5%.
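As a rough illustration of how these summary measures relate to each other, the following base R sketch computes them from a small toy data set. The data are made up, and the formulas (in particular the weights used for wmape, here each group's share of the total benchmark population) are assumptions for illustration; aggregate_measures() may compute them differently.

```r
# Toy data (hypothetical): benchmark vs. projected counts for four groups
n_benchmark <- c(1000, 500, 200, 100)
n_projected <- c(1020, 490, 212, 100)

pe  <- (n_projected - n_benchmark) / n_benchmark * 100  # percentage error
ape <- abs(pe)                                          # absolute percentage error

# Assumed weights: each group's share of the total benchmark population
weight <- n_benchmark / sum(n_benchmark)

measures <- data.frame(
  mpe         = mean(pe),                                # bias (mean)
  medpe       = median(pe),                              # bias (median)
  mape        = mean(ape),                               # precision (mean)
  medape      = median(ape),                             # precision (median)
  rmse        = sqrt(mean((n_projected - n_benchmark)^2)),
  wmape       = sum(ape * weight),                       # size-weighted precision
  ape_under_1 = mean(ape < 1),                           # proportion with ape < 1%
  ape_under_5 = mean(ape < 5)                            # proportion with ape < 5%
)
print(measures)
```

In this toy example the largest percentage error occurs in a small group, so wmape is smaller than mape.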

References

Baker, J., et al. (2015). Sub-county population estimates using administrative records: A municipal-level case study in New Mexico. In M. N. Hoque & L. B. Potter (Eds.), Emerging techniques in applied demography (pp. 63-79). Springer, https://doi.org/10.1007/978-94-017-8990-5_6

Bérard-Chagnon, J. (2015) Using tax data to estimate the number of families and households in Canada. In M. N. Hoque & L. B. Potter (Eds.), Emerging techniques in applied demography (pp. 137-153). Springer, https://doi.org/10.1007/978-94-017-8990-5_10

Reinhold M. & Thomsen, S. L. (2015) Subnational population projections by age: An evaluation of combined forecast techniques, Population Research and Policy Review, 34, 593-613, https://doi.org/10.1007/s11113-015-9362-0

Wilson, T. (2012). Forecast accuracy and uncertainty of Australian Bureau of Statistics state and territory population projections, International Journal of Population Research, 1, 419824, https://doi.org/10.1155/2012/419824

Wilson, T. (2016). Evaluation of alternative cohort-component models for local area population forecasts, Population Research and Policy Review, 35, 241-261, https://doi.org/10.1007/s11113-015-9380-y


Calculate shares for distributing people among subregions

Description

Calculate shares for distributing people among subregions

Usage

calculate_shares(data, col, age_group = "default")

Arguments

data

data frame, historical records (e.g., immigration from other cantons or countries) aggregated across demographic groups.

col

character, name of the column which contains the data for historical occurrences.

age_group

character (optional), either 1-year, 5-year, or 10-year age group used as basis for calculating shares. If the argument is not specified, the default attempts to avoid age groups without any observations. It prioritizes age groups based on their resolution (1-year age groups = most informative and highest priority, 10-year age groups = least informative and lowest priority). Users can override the default and enforce the use of a specific age group for all demographic groups by setting the argument to "age_group_5" or "age_group_10".

Value

Returns the input data frame with the following new columns:

  • age_group_5: character, indicates the 5-year age group to which the 1-year age group is assigned.

  • age_group_10: character, indicates the 10-year age group to which the 1-year age group is assigned.

  • sum_5: numeric, total number of people in the 5-year age group.

  • prop_5: numeric, proportion of the 5-year age group total that is allocated to each 1-year age group.

  • sum_10: numeric, total number of people in the 10-year age group.

  • prop_10: numeric, proportion of the 10-year age group total that is allocated to each 1-year age group.

  • use_age_group: character, preference for 1-year, 5-year, or 10-year age group. Defaults to age_group_1 if at least one observation was recorded in all 5 years belonging to the respective 5-year age groups.

  • n: numeric, number of people to be used according to use_age_group to compute the share.

  • n_sum: numeric, total per demographic group and across all spatial units.

  • share: numeric, the spatial unit's share relative to the total of people within the same demographic group (across all spatial units; i.e., n / n_sum).
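The relationship between n, n_sum, and share, as well as between sum_5 and prop_5, can be sketched in base R as follows. The records, the age group labels, and the choice to compute sum_5 per spatial unit are assumptions made for this illustration; they are not taken from the implementation of calculate_shares().

```r
# Hypothetical historical records: two spatial units, one demographic group,
# ages 0-4 (which together form a single 5-year age group)
records <- data.frame(
  spatial_unit = rep(c("A", "B"), each = 5),
  age          = rep(0:4, times = 2),
  n            = c(4, 6, 5, 3, 2, 1, 2, 3, 2, 2)
)

# Assumed labelling of the 5-year age group for each 1-year age class
lower <- (records$age %/% 5) * 5
records$age_group_5 <- paste0(lower, "-", lower + 4)

# Total per spatial unit and 5-year age group, and the proportion of that
# total that falls on each 1-year age class
records$sum_5  <- ave(records$n, records$spatial_unit, records$age_group_5,
                      FUN = sum)
records$prop_5 <- records$n / records$sum_5

# Share of each spatial unit within the same age class across all units
records$n_sum <- ave(records$n, records$age, FUN = sum)
records$share <- records$n / records$n_sum

print(records)
```

Within each age class, the shares of all spatial units add up to 1, which is what allows them to be used for distributing people among subregions.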


Compute evaluation measures

Description

Uses the differences between a benchmark and the results from a projection to compute performance measures.

Usage

compute_measures(combined, weight_groups = NULL)

Arguments

combined

data frame created with propop::prepare_evaluation().

weight_groups

character, optional argument indicating one or more column names to obtain evaluation criteria weighted for specific groups (e.g., age groups, nationality).

Details

The input is a data frame created with propop::prepare_evaluation(). It includes a benchmark (typically the observed population records, i.e., the number of people per spatial unit, demographic group, and year) and the corresponding projected number of people. The input can range from low resolution (e.g., total number of people per municipality) to high resolution (e.g., 101 age classes, nationality, sex).

For more details on usage, see vignette("evaluate", package = "propop").

Value

A data frame. The following evaluation criteria can directly be interpreted and used for descriptive comparisons:

  • error is the forecast error; it quantifies the level of under-projection (negative values) and over-projection (positive values) relative to the benchmark n_benchmark.

  • pe is the percentage error and expresses the under- / over-projection in percent of the benchmark n_benchmark.

  • ape is the absolute percentage error; it is the absolute deviation in percent of the benchmark n_benchmark, thus only showing the extent of the error but not the direction.

  • w_ape is the weighted absolute percentage error; it weighs each absolute percentage error according to the population size of the focal group (e.g., nationality, age group). The weighted version is useful as an aggregated measure when groups vary strongly in terms of population size. Only returned when the argument weight_groups contains at least one grouping variable.

The following helper variables are used to compute aggregate measures. They are only returned when weight groups are provided via the argument weight_groups.

  • n_tot is the total number of people (i.e., sum of the number of people in all demographic groups); used to compute the weighted absolute percentage error.

  • group_tot is the number of people in the focal group; used to compute the weighted absolute percentage error.

  • weight is the share of the (optional) focal group (e.g., municipality type / size, nationality, age group) relative to all people; used to compute the weighted absolute percentage error.
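The following base R sketch shows one way the row-level measures and helper variables described above could be derived. The toy data and the exact definitions (e.g., computing the weight from the benchmark population and w_ape as ape multiplied by weight) are assumptions for illustration and may differ from what compute_measures() does internally.

```r
# Hypothetical combined data: benchmark and projection for four rows
combined <- data.frame(
  nat         = c("ch", "ch", "int", "int"),
  n_benchmark = c(600, 200, 150, 50),
  n_projected = c(612, 196, 153, 45)
)

# Row-level measures: positive values indicate over-projection
combined$error <- combined$n_projected - combined$n_benchmark
combined$pe    <- combined$error / combined$n_benchmark * 100
combined$ape   <- abs(combined$pe)

# Helper variables for weighting by nationality (assumed definitions)
combined$n_tot     <- sum(combined$n_benchmark)
combined$group_tot <- ave(combined$n_benchmark, combined$nat, FUN = sum)
combined$weight    <- combined$group_tot / combined$n_tot
combined$w_ape     <- combined$ape * combined$weight

print(combined)
```

Here the last row has the largest ape (10%), but because foreign nationals make up only 20% of the toy population, its weighted error is of similar size to the errors in the Swiss rows.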

References

Baker, J., et al. (2015). Sub-county population estimates using administrative records: A municipal-level case study in New Mexico. In M. N. Hoque & L. B. Potter (Eds.), Emerging techniques in applied demography (pp. 63-79). Springer, https://doi.org/10.1007/978-94-017-8990-5_6

Wilson, T. (2012). Forecast accuracy and uncertainty of Australian Bureau of Statistics state and territory population projections, International Journal of Population Research, 1, 419824, https://doi.org/10.1155/2012/419824

Wilson, T. (2016). Evaluation of alternative cohort-component models for local area population forecasts, Population Research and Policy Review, 35, 241-261, https://doi.org/10.1007/s11113-015-9380-y

Examples

## Not run: 
# Get evaluation measures without weights
compute_measures(combined)
# Get evaluation measures weighted for groups
compute_measures(combined, weight_groups = c("age", "nat"))

## End(Not run)

Sample parameters to run population projection

Description

Data frame containing the rates and number of people from the Federal Statistical Office (FSO) required to project the development of four demographic groups for a selected canton (Aargau). The parameters are from the model published in 2020. The sample data only include the reference scenario and the years 2019-2030.

Usage

fso_parameters

Format

The example data include the required parameters for each demographic group (nationality (2) X sex (2) X age classes (101)) for the years 2019-2030.

Demographic groups

The returned data frame includes parameters for each unique combination of the following demographic variables:

  • nat: ch = Swiss; int = foreign / international.

  • sex: f = female, m = male.

  • age: 101 one-year age classes, ranging from 0 to 100 (including those older than 100).

Parameters

The following parameters are included in the returned data frame:

  • year: numeric, year of projection.

  • scen: character, projection scenario.

  • birthrate: numeric, total number of live human births per 1,000 inhabitants (formerly birth_rate).

  • int_mothers: numeric, proportion of children with Swiss nationality born to non-Swiss mothers (formerly births_int_ch).

  • mor: numeric, prospective mortality (probability of death).

  • emi_int: numeric, rate of people emigrating to other countries (formerly emi).

  • emi_nat: numeric, rate of people emigrating to other cantons (new parameter).

  • acq: numeric, rate of acquisition of Swiss citizenship.

  • imm_int_n: numeric, number of people immigrating from abroad (formerly imm_int).

  • imm_nat_n: numeric, number of people immigrating from other cantons (new parameter).

  • emi_nat_n: numeric, number of people emigrating to other cantons (parameter previously used to compute mig_nat_n).

  • mig_nat_n: numeric, national / inter-cantonal net migration (number of immigrants minus number of emigrants; formerly mig_ch, will soon be obsolete and removed).

  • spatial_unit: character, indicating the user requested spatial unit(s).

Details about calculated variables

int_mothers (formerly births_int_ch) is calculated by dividing the number of live newborns with Swiss citizenship born to non-Swiss mothers by the number of all live newborns born to non-Swiss mothers.

mig_nat_n (formerly mig_ch) is calculated as the difference between the immigration from other cantons and the emigration to other cantons.

Source

Data obtained from the Swiss Federal Statistical Office (FSO):


Sample population data from the Federal Statistical Office

Description

Data frame containing the starting population required to project the development of four demographic groups for a selected canton (Aargau). The data from 2018 were obtained from the Federal Statistical Office (FSO).

Usage

fso_population

Format

The example population records include the number of people of each demographic group (nationality (2) X sex (2) X age classes (101)) for the canton of Aargau in 2018.

Value

A data frame. For each of the four demographic groups (female / male, Swiss / foreign nationals), there are 101 age classes, resulting in a total of 404 rows per requested year and spatial unit. Columns included in the returned data frame:

year

numeric, year in which the population was recorded.

spatial_unit

character, indicating the spatial entities (e.g., cantons, districts, municipalities).

nat

character, ch = Swiss, int = foreign / international.

sex

character, f = female, m = male.

age

numeric, 101 one-year age classes, ranging from 0 to 100 (including those older than 100).

n

numeric, number of people per year, spatial entity, and demographic group.

Source

Federal Statistical Office: https://www.pxweb.bfs.admin.ch/pxweb/en/px-x-0102010000_101/-/px-x-0102010000_101.px/


Get projection parameters from FSO

Description

Users who do not have the mandatory projection parameters for propop::propop() can use this convenience function to download them from the Federal Statistical Office (FSO). The parameters are only available on the level of cantons. For smaller-scale projections, the parameters must be scaled down. In addition to the parameters, the function also returns the projected population (i.e., number of people estimated in the FSO model released in 2020). All parameters and projections are from the FSO model published in 2020. The variables int_mothers and mig_nat_n are not directly available from the FSO. They are calculated within this function.

To get projection parameters, you must use the spelling defined in the corresponding FSO table. See vignette("prepare_data", package = "propop").

Changes to the API interface may break this function. If problems occur, we recommend following the step-by-step procedure described in vignette("prepare_data", package = "propop").

Usage

get_parameters(
  number_fso_ref = "px-x-0104020000_101",
  number_fso_high = "px-x-0104020000_102",
  number_fso_low = "px-x-0104020000_103",
  number_fso_rates = "px-x-0104020000_109",
  number_fso_births = "px-x-0104020000_106",
  year_first,
  year_last,
  spatial_units
)

Arguments

number_fso_ref

character, px-x table ID for number parameters (reference scenario), defaults to "px-x-0104020000_101".

number_fso_high

character, px-x table ID for number parameters (high growth scenario), defaults to "px-x-0104020000_102".

number_fso_low

character, px-x table ID for number parameters (low growth scenario), defaults to "px-x-0104020000_103".

number_fso_rates

character, px-x table ID for rate parameters, defaults to "px-x-0104020000_109".

number_fso_births

character, px-x table ID required to compute the share of Swiss newborns from non-Swiss mothers, defaults to "px-x-0104020000_106".

year_first

numeric, first year for which the parameters and projections are to be downloaded.

year_last

numeric, last year for which the parameters and projections are to be downloaded.

spatial_units

character vector, indicating at least one spatial entity for which the projection will be run. Typically a canton.

Value

A data frame with the rates and number of people from the Federal Statistical Office (FSO) required to project the population development of the requested spatial entities. For each of the four demographic groups (nationality x sex), there are 101 age classes, resulting in a total of 404 rows per requested year and spatial unit.

Demographic groups

The returned data frame includes parameters for each unique combination of the following demographic variables:

  • nat: ch = Swiss; int = foreign / international.

  • sex: f = female, m = male.

  • age: 101 one-year age classes, ranging from 0 to 100 (including those older than 100).

Parameters

The following parameters are included in the returned data frame:

  • year: numeric, year of projection.

  • scen: character, projection scenario.

  • birthrate: numeric, total number of live human births per 1,000 inhabitants (formerly birth_rate).

  • int_mothers: numeric, proportion of children with Swiss nationality born to non-Swiss mothers (formerly births_int_ch).

  • mor: numeric, prospective mortality (probability of death).

  • emi_int: numeric, rate of people emigrating to other countries (formerly emi).

  • emi_nat: numeric, rate of people emigrating to other cantons (new parameter).

  • acq: numeric, rate of acquisition of Swiss citizenship.

  • imm_int_n: numeric, number of people immigrating from abroad (formerly imm_int).

  • imm_nat_n: numeric, number of people immigrating from other cantons (new parameter).

  • emi_nat_n: numeric, number of people emigrating to other cantons (parameter previously used to compute mig_nat_n).

  • mig_nat_n: numeric, national / inter-cantonal net migration (number of immigrants minus number of emigrants; formerly mig_ch, will soon be obsolete and removed).

  • spatial_unit: character, indicating the user requested spatial unit(s).

Projected population

n_projected is the number of people per demographic group and year on December 31 (as projected by the FSO in the 2020 model).

Details about calculated variables

int_mothers (formerly births_int_ch) is calculated by dividing the number of live newborns with Swiss citizenship born to non-Swiss mothers by the number of all live newborns born to non-Swiss mothers.

mig_nat_n (formerly mig_ch) is calculated as the difference between the immigration from other cantons and the emigration to other cantons.
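The two calculations described above amount to a simple ratio and a simple difference. The following base R sketch uses made-up counts and hypothetical helper names for the birth counts; only int_mothers, imm_nat_n, emi_nat_n, and mig_nat_n correspond to parameters documented here.

```r
# Hypothetical counts for one canton and year
births_to_int_mothers_all <- 400   # all live newborns born to non-Swiss mothers
births_to_int_mothers_ch  <- 120   # of which newborns with Swiss citizenship

# Proportion of children with Swiss nationality born to non-Swiss mothers
int_mothers <- births_to_int_mothers_ch / births_to_int_mothers_all
int_mothers  # 0.3

# Inter-cantonal net migration: immigrants minus emigrants
imm_nat_n <- 950
emi_nat_n <- 900
mig_nat_n <- imm_nat_n - emi_nat_n
mig_nat_n  # 50
```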

Source

Data obtained from the Swiss Federal Statistical Office (FSO):

Examples

## Not run: 
one_canton <- get_parameters(
  year_first = 2025,
  year_last = 2050,
  spatial_units = "Aargau"
)
two_cantons_4years <- get_parameters(
  year_first = 2018,
  year_last = 2021,
  spatial_units = c("Aargau", "Zug")
)

## End(Not run)

Get population data from FSO

Description

Users who do not have the required population data can use this convenience function to get the mandatory starting population for propop::propop() from the Federal Statistical Office (FSO). The function can also be used to obtain the population records for several years (e.g., for model performance evaluations).

To get the population data, you must use the spelling defined in the corresponding FSO table. For more details see vignette("prepare_data", package = "propop").

Changes to the API interface may break this function.

Usage

get_population(
  number_fso = "px-x-0102010000_101",
  year,
  year_last = NULL,
  spatial_units
)

Arguments

number_fso

character, px-x table ID for population records, defaults to "px-x-0102010000_101".

year

numeric, year for which the population records are to be downloaded. This usually is the starting population. To download longer time periods, use year to indicate the onset of the period.

year_last

numeric (optional); specifies the final year of the time period for which data will be downloaded.

spatial_units

character vector, indicating at least one spatial entity for which the projection will be run. Typically cantons, districts, or municipalities.

Value

A data frame. For each of the four demographic groups (female / male, Swiss / foreign nationals), there are 101 age classes, resulting in a total of 404 rows per requested year and spatial unit. Columns included in the returned data frame:

year

numeric, year in which the population was recorded.

spatial_unit

character, indicating the spatial entities (e.g., cantons, districts, municipalities).

nat

character, ch = Swiss, int = foreign / international.

sex

character, f = female, m = male.

age

numeric, 101 one-year age classes, ranging from 0 to 100 (including those older than 100).

n

numeric, number of people per year, spatial entity, and demographic group.

Source

Federal Statistical Office: https://www.pxweb.bfs.admin.ch/pxweb/en/px-x-0102010000_101/-/px-x-0102010000_101.px/

Examples

## Not run: 
get_population(
  number_fso = "px-x-0102010000_101",
  year = 2018,
  year_last = 2019,
  spatial_units = "- Aargau"
)
get_population(
  year = 2018,
  spatial_units = c("- Aargau", "......0301 Aarberg")
)

## End(Not run)

Prepare data for evaluation

Description

This function takes benchmark data (typically population records) and population projections and prepares a combined data frame to evaluate the performance of the projection. For more details on usage, see vignette("evaluate", package = "propop").

Usage

prepare_evaluation(
  data_benchmark,
  n_benchmark,
  data_projected,
  n_projected,
  age_groups = NULL
)

Arguments

data_benchmark

data frame containing benchmark data (e.g., actual / official population records obtained with propop::get_population()).

n_benchmark

numeric column containing the benchmark population of each demographic group.

data_projected

data frame containing population projections; can be created with propop::propop().

n_projected

numeric column containing the projected size of each demographic group.

age_groups

character, optional argument ("age_groups_3") indicating whether the data shall be aggregated into the three predefined age groups (0-19, 20-64, 65 years and older). Using aggregated groups will lead to smaller projection errors than using 101 age classes. Currently only one option is available for aggregating age groups. Defaults to using 101 one-year age classes.
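To illustrate what aggregating into three age groups means, the following base R sketch assigns one-year age classes to the groups and sums the counts per group. The labels, the toy counts, and the treatment of age 65 as part of the oldest group are assumptions; this is not the code used inside prepare_evaluation().

```r
# Hypothetical one-year age classes with made-up counts
age <- c(0, 19, 20, 64, 65, 100)
n   <- c(10, 12, 30, 25, 8, 1)

# Left-closed intervals: [0, 20), [20, 65), [65, Inf)
age_group <- cut(
  age,
  breaks = c(0, 20, 65, Inf),
  labels = c("0-19", "20-64", "65+"),
  right  = FALSE
)

# Number of people per aggregated age group
n_grouped <- tapply(n, age_group, sum)
n_grouped
```

Because positive and negative errors of neighbouring age classes can cancel out within a group, evaluation measures based on aggregated groups tend to be smaller than those based on 101 age classes.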

Value

Returns a data frame with the number of people from the benchmark and from the projection. Each row contains a unique combination of year, spatial unit, and demographic group.

Input data and variables

Both input data frames must contain the following variables for the same range of years:

year

character, year in which the population was recorded.

spatial_unit

character, indicating the spatial entities (e.g., cantons, districts, municipalities).

nat

character, ch = Swiss, int = foreign / international.

sex

character, f = female, m = male.

age

numeric, 101 one-year age classes, ranging from 0 to 100 (including those older than 100).

n

numeric, number of people per year, spatial entity, and demographic group.

Examples

## Not run: 
combined <- prepare_evaluation(
  data_benchmark = output_get_population,
  data_projected = output_propop
)
combined_grouped <- prepare_evaluation(
  data_benchmark = output_get_population,
  data_projected = output_propop,
  age_groups = "age_groups_3"
)

## End(Not run)

Project population development (raw results)

Description

Core function that uses the cohort component method and matrix algebra to project population development. The function can be used for different spatial levels (e.g., cantons, municipalities) and for one scenario at a time.

This function provides projections in a raw version in which key information is missing (e.g., which age groups the rows represent). To conveniently obtain an enriched, more informative output, use the wrapper function propop::propop() (which internally uses propop::project_raw()).

The parameters and starting populations for different spatial levels can be obtained from the Swiss Federal Statistical Office (FSO). For instructions on how to download this information from STAT-TAB, see vignette("prepare_data", package = "propop").

The projection parameters need to be passed to project_raw as a single data frame (with the parameters as columns). The column types, names, and factor levels need to match those specified below.

The method used to calculate the projections is a 'cohort-component analysis', implemented with matrices because they perform better than data frames. In a nutshell, the starting population ('n') is multiplied by the survival rate to obtain the number of people who transition into the next projected year (year + 1). Then, the absolute number of people immigrating from outside Switzerland and the migration balance with other cantons are added to the surviving population. This results in the starting population for projecting the next year. Newborn children are added separately to the new starting population of each year.

The starting population is clustered in 404 groups: 101 age groups times two nationalities times two sexes. The survival rate is calculated in the function 'create_transition_matrix()', resulting in the matrix 'L'. We use the rates for mortality, emigration to countries outside Switzerland, and the acquisition of Swiss citizenship by the foreign population to calculate survival rates. The model from the FSO also includes the rate of emigration to other cantons in the survival rate. In contrast, we account for immigration from and emigration to other cantons by adding the migration balance (German: 'Saldo'; immigration minus emigration) afterwards.

Steps in this function:

  1. Checks: Checking input data and parameter settings for correct formats.

  2. Data preparation: Preparing vectors e.g. for the projection time frame and creating empty vectors to be filled with data later on.

  3. Loop over years for calculating the projections

    • Subsetting parameters: Depending on the selected projection year and on the demographic unit, the parameters for mortality, fertility, acquisition of the Swiss citizenship as well as migration parameters are subset by demographic group.

    • Create matrices: Matrices are built for the survival rate, mortality, fertility and for calculating the number of newborn babies.

    • Creating vectors: Vectors are built for mortality and migration parameters.

    • Projection: The transition matrix 'L' is multiplied by the starting population to obtain the surviving population of the next year. Migrating people are added in absolute numbers. People who are 100 years old and older are clustered into one age group (age = 100). The newborn babies are added to the resulting starting population for the next projection year.

  4. Aggregating the data: All projected years are aggregated into one data frame. The function 'propop()', in which this function is contained, automatically adds relevant metadata to the results.
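The projection step described above can be sketched for a strongly simplified case with three age classes, one sex, and one nationality. All numbers are made up, the survival rate is assumed to be one minus mortality and international emigration (no naturalisation), and births are assumed to be computed from the starting population. This is an illustration of the general idea, not the package's create_transition_matrix() or project_raw() code.

```r
# Toy inputs for three age classes (0, 1, 2+); all values are made up
n_start   <- c(100, 80, 60)        # starting population
mor       <- c(0.01, 0.02, 0.10)   # probability of death
emi_int   <- c(0.02, 0.03, 0.01)   # rate of emigration abroad
imm_int_n <- c(5, 4, 1)            # immigrants from abroad (absolute numbers)
mig_nat_n <- c(2, -1, 0)           # migration balance with other cantons
birthrate <- c(0, 0.5, 0)          # births per person in each age class

# Assumed survival rate: people who neither die nor emigrate abroad
surv <- 1 - mor - emi_int

# Transition matrix 'L': survivors move up one age class;
# the last age class is open-ended and keeps its own survivors
k <- length(n_start)
L <- matrix(0, nrow = k, ncol = k)
L[cbind(2:k, 1:(k - 1))] <- surv[-k]
L[k, k] <- surv[k]

# Surviving population in year + 1, then migration in absolute numbers
survivors <- as.vector(L %*% n_start)
n_next    <- survivors + imm_int_n + mig_nat_n

# Newborns are added separately to the youngest age class
births    <- sum(birthrate * n_start)
n_next[1] <- n_next[1] + births

n_next
```

In this sketch, n_next serves as the starting population for the following projection year, so the same step can be repeated in a loop over years.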

Usage

project_raw(
  parameters,
  year_first,
  year_last,
  age_groups = 101,
  fert_first = 16,
  fert_last = 50,
  share_born_female = 100/205,
  n,
  subregional
)

Arguments

parameters

data frame containing the FSO rates and numbers to run the projection for a specific spatial level (e.g., canton, municipality).

  • year: projection year.

  • spatial_unit: ID of spatial entity (e.g., canton, municipality) for which to run the projections.

  • scen: projection scenario, used to subset data frames with multiple scenarios (r = reference, l = low growth, h = high growth scenario).

  • nat: nationality (ch = Swiss; int = foreign / international).

  • sex: sex (f = female, m = male).

  • age: age classes; typically ranging from 0 to 100 (incl. >100).

  • birthrate: numeric, total number of live human births per 1,000 inhabitants.

  • int_mothers: proportion of children with Swiss nationality born to non-Swiss mothers.

  • mor: prospective mortality rate (probability of death).

  • acq: rate of acquisition of Swiss citizenship.

  • emi_int: rate of people emigrating abroad.

  • emi_nat: rate of people emigrating to other cantons.

  • imm_int_n: number of people immigrating from abroad.

  • imm_nat_n: number of people immigrating from other cantons.

  • mig_sub: within canton net migration. Useful to account for movements between different subregions (e.g., municipalities). This argument is optional.

year_first

numeric, first year to be projected.

year_last

numeric, last year to be projected.

age_groups

numeric, number of age classes. Creates a vector with 1-year age classes running from 0 to (age_groups - 1). Defaults to 101 (FSO standard number of age groups).

fert_first

numeric, first year of female fertility. Defaults to 16 (FSO standard value).

fert_last

numeric, last year of female fertility. Defaults to 50 (FSO standard value).

share_born_female

numeric, fraction of female babies. Defaults to 100 / 205 (FSO standard value).
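The arguments age_groups, fert_first, fert_last, and share_born_female can be read as follows. The base R sketch below uses the default values from the usage section; the number of births is hypothetical, and the way the package applies these values internally may differ.

```r
age_groups        <- 101
fert_first        <- 16
fert_last         <- 50
share_born_female <- 100 / 205

# One-year age classes from 0 to (age_groups - 1)
ages <- 0:(age_groups - 1)
range(ages)  # 0 100

# Age classes considered fertile (both limits included)
fertile <- ages >= fert_first & ages <= fert_last
sum(fertile)  # 35 age classes

# Splitting a hypothetical number of births by sex
births_total  <- 410
births_female <- births_total * share_born_female
births_male   <- births_total - births_female
c(female = births_female, male = births_male)
```

With the default share of 100 / 205, roughly 105 boys are born per 100 girls.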

n

number of people per demographic group and year; should be the year before year_first. Typically extracted from data frame created with propop::get_population().

subregional

boolean, TRUE indicates that subregional migration patterns (e.g., movement between municipalities within a canton) are part of the projection.

Value

Returns an unformatted and unlabeled data frame. It includes the number of people for each demographic group per year (starting year and projected years). The number of rows corresponds to the product of years and demographic groups (e.g., nationality (2) X sex (2) X age groups (101) = 404). Variables included in the output:

n

number of people per demographic group.

IMM_INT

number of immigrants from other countries.

MIG_NAT

number of people migrating from / to other superordinate spatial units (typically cantons).

MIG_SUB

number of migrants within the superordinate spatial unit (typically a canton).

MOR

number of deaths (among people older than 0).

EMI_INT

number of emigrants to other countries.

ACQ

number of foreigners who acquire Swiss citizenship (naturalisations).

BIRTHS

number of births.

See Also

propop()

Examples

# load package data
data(fso_parameters)
data(fso_population)

# run projection
project_raw(
  parameters = fso_parameters,
  year_first = 2019,
  year_last = 2019,
  n = fso_population |> dplyr::pull(n),
  subregional = FALSE
) |>
  head(10)

Project population development

Description

Wrapper function to project population development using the cohort component method.

You can either use your own parameters and starting population or download these data from the Swiss Federal Statistical Office (FSO). For instructions on how to download this information from STAT-TAB, see vignette("prepare_data", package = "propop").

For more details on how to use this function to project the population development on the level of a canton, see vignette("run_projections", package = "propop").

The projection parameters need to be passed to propop::propop() as a single data frame (with the parameters as columns). The column types, names, and factor levels need to match the specifications listed below under parameters:

Usage

propop(
  parameters,
  population,
  year_first,
  year_last,
  age_groups = 101,
  fert_first = 16,
  fert_last = 50,
  share_born_female = 100/205,
  subregional = FALSE,
  binational = TRUE,
  spatial_unit = "spatial_unit"
)

Arguments

parameters

data frame containing the FSO rates and numbers to run the projection for a specific spatial level (e.g., canton, municipality).

  • year, character, projection year.

  • spatial_unit, character, ID of spatial entity (e.g., canton, municipality) for which to run the projections.

  • scen, character, projection scenario, is used to subset data frames with multiple scenarios (r = reference, l = low growth, h = high growth).

  • nat (optional), character, nationality (ch = Swiss; int = foreign / international).

  • sex, character (f = female, m = male).

  • age, numeric, typically ranging from 0 to 100 (incl. >100).

  • birthrate, numeric, number of children per year.

  • int_mothers (optional), numeric, proportion of children with Swiss nationality born to non-Swiss mothers.

  • mor, numeric, prospective mortality rate (probability of death).

  • acq (optional), numeric, rate of acquisition of Swiss citizenship.

  • emi_int, numeric, rate of people emigrating abroad.

  • emi_nat, numeric, rate of people emigrating to other cantons.

  • imm_int_n, numeric, number of people immigrating from abroad.

  • imm_nat_n, numeric, number of people immigrating from other cantons.

  • mig_sub (optional), numeric, net migration per subregion; this is the migration from / to other subregions (e.g., municipalities, districts) within the main superordinate projection unit (e.g., a canton). Accounts for movements between different subregions. Needs to be provided by the user.
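Before calling propop(), it can help to verify that a parameters data frame carries the columns listed above. The following is a minimal sketch in base R. check_parameter_columns() is a hypothetical helper and not part of the package, and the toy data frame contains made-up values that only illustrate the check.

```r
# Column names taken from the list above; entries marked "(optional)" are
# kept in a separate vector.
required_cols <- c(
  "year", "spatial_unit", "scen", "sex", "age", "birthrate", "mor",
  "emi_int", "emi_nat", "imm_int_n", "imm_nat_n"
)
optional_cols <- c("nat", "int_mothers", "acq", "mig_sub")

# Hypothetical helper (not part of propop): returns the names of required
# columns that are missing (character(0) if nothing is missing).
check_parameter_columns <- function(parameters) {
  setdiff(required_cols, names(parameters))
}

# Toy data frame with made-up values; the `mor` column is left out on purpose.
toy_parameters <- data.frame(
  year = "2019", spatial_unit = "Aargau", scen = "r", sex = "f", age = 30,
  birthrate = 0.05, emi_int = 0.01, emi_nat = 0.02,
  imm_int_n = 10, imm_nat_n = 12
)

missing_cols <- check_parameter_columns(toy_parameters)
present_optional <- intersect(optional_cols, names(toy_parameters))
print(missing_cols)
```

The check only covers column names. Column types and factor levels would need separate checks.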

population

data frame including the starting population of each demographic group and each spatial unit. Possible values are the same as in parameters (apart from year). The data frame only includes one year, usually the one preceding the first year of the projection.

  • year, character, should be year_first - 1.

  • spatial_unit, character.

  • nat, character.

  • sex, character.

  • age, numeric.

  • n, numeric, number of people per demographic group.
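The structure of the population data frame described above can be sketched with base R as follows. All values are placeholders chosen for illustration (the name "Aargau" and the constant n = 100 are assumptions); real starting populations should come from the FSO or from your own data.

```r
year_first <- 2019

# One row per combination of age, sex, nationality, and spatial unit
toy_population <- expand.grid(
  age = 0:100,
  sex = c("f", "m"),
  nat = c("ch", "int"),
  spatial_unit = "Aargau",
  stringsAsFactors = FALSE
)

# The starting population refers to the year before the first projected year
toy_population$year <- as.character(year_first - 1)

# Placeholder head count per demographic group (made-up value)
toy_population$n <- 100

# Order the columns as in the list above
toy_population <- toy_population[
  , c("year", "spatial_unit", "nat", "sex", "age", "n")
]
head(toy_population)
```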

year_first

numeric, first year to be projected.

year_last

numeric, last year to be projected.

age_groups

numeric, number of age classes. Creates a vector with 1-year age classes running from 0 to (age_groups - 1). Must currently be set to 101 (FSO standard number of age groups).

fert_first

numeric, first year of age in which women are considered fertile. Defaults to 16 (FSO standard value).

fert_last

numeric, last year of age in which women are considered fertile. Defaults to 50 (FSO standard value).

share_born_female

numeric, fraction of female babies. Defaults to 100 / 205 (FSO standard value).
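To make the default values of age_groups, fert_first, fert_last, and share_born_female more concrete, the following base R sketch derives the age classes, the fertile ages, and the split of a number of births by sex. The total of 205 births is a made-up figure, and the calculation only illustrates the arithmetic; it is not the package's internal code.

```r
age_groups <- 101
fert_first <- 16
fert_last <- 50
share_born_female <- 100 / 205

# 1-year age classes from 0 to (age_groups - 1)
ages <- 0:(age_groups - 1)

# Ages within the fertility window (both bounds included)
fertile_ages <- ages[ages >= fert_first & ages <= fert_last]
n_fertile_ages <- length(fertile_ages)

# Split a made-up number of births into girls and boys
total_births <- 205
female_births <- total_births * share_born_female
male_births <- total_births - female_births

print(c(female = female_births, male = male_births))
```

With the default share, roughly 100 out of 205 newborns are assigned to the female group and the remaining 105 to the male group.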

subregional

boolean, TRUE indicates that subregional migration patterns (e.g., movement between municipalities within a canton) are part of the projection. Requires input (parameters and population) on the level of subregions.

binational

boolean, TRUE indicates that projections distinguish between two groups of nationalities. FALSE indicates that only one projection is run without distinguishing between nationalities.

spatial_unit

character, name of variable containing the names of the region or subregions for which the projection shall be performed.

Value

Returns a data frame that includes the number of people for each demographic group per year (for projected years) and spatial unit. The number of rows is the product of the number of projected years, the number of demographic groups, and the number of spatial units. The output includes several identifiers that indicate the demographic group, year, and spatial unit to which the results in each row refer:

year

integer, indicating the projected years.

spatial_unit

factor, spatial units for which the projection was run (e.g., canton, districts, municipalities).

age

integer, ranging from 0 to 100 (including those older than 100).

sex

factor, female (f) and male (m).

nat

factor, indicates if the nationality is Swiss (ch) or international / foreign (int). This variable is only returned if binational = TRUE.
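The expected number of rows can be worked out from these identifiers. The sketch below assumes a projection for the years 2019 to 2022, 101 age classes, two sexes, two nationality groups, and a single spatial unit; the numbers are illustrative.

```r
years <- 2019:2022
n_ages <- 101           # ages 0 to 100
n_sexes <- 2            # f, m
n_nationalities <- 2    # ch, int (only if binational = TRUE)
n_spatial_units <- 1

# Demographic groups are all combinations of age, sex, and nationality
n_demographic_groups <- n_ages * n_sexes * n_nationalities

# Rows = projected years x demographic groups x spatial units
expected_rows <- length(years) * n_demographic_groups * n_spatial_units
print(expected_rows)
```

Comparing this figure with nrow() of the returned data frame is a quick plausibility check.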

The output also includes columns related to the size and change of the population:

n_jan

numeric, start-of-year population per demographic group.

n_dec

numeric, end-of-year population per demographic group.

delta_n

numeric, population change per demographic group from the start to the end of the year in absolute numbers.

delta_perc

numeric, population change per demographic group from the start to the end of the year in percentages.
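The relationship between these four columns can be illustrated with a small base R sketch. It assumes that delta_n is the difference between n_dec and n_jan and that delta_perc expresses this difference relative to n_jan; the exact formulas used inside the package are not stated here, so treat this as an assumption. The population figures are made up.

```r
toy_output <- data.frame(
  n_jan = c(1000, 250),
  n_dec = c(1010, 245)
)

# Absolute change from the start to the end of the year
toy_output$delta_n <- toy_output$n_dec - toy_output$n_jan

# Relative change in percent (assumed to be relative to n_jan);
# groups with n_jan = 0 would yield NaN or Inf here
toy_output$delta_perc <- toy_output$delta_n / toy_output$n_jan * 100

print(toy_output)
```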

The components that are used to project the development of the population are also included in the output:

births

numeric, number of births (non-zero values are only available for age = 0).

mor

numeric, number of deaths.

emi_int

numeric, number of people who emigrate to other countries.

emi_nat

numeric, number of people who emigrate to other cantons.

imm_int

numeric, number of people who immigrate from other countries.

imm_nat

numeric, number of people who immigrate from other cantons.

acq

numeric, number of people who acquire Swiss citizenship.

Examples

# Run projection for the sample data (whole canton of Aargau)
propop(
  parameters = fso_parameters,
  year_first = 2019,
  year_last = 2022,
  population = fso_population,
  subregional = FALSE,
  binational = TRUE
)
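The output can be summarised further with standard tools. The following sketch sums the end-of-year population per projected year using base R. It uses a small toy data frame with made-up values that mimics the documented columns year, sex, and n_dec; the result of propop() could be used in place of toy_result.

```r
# Toy stand-in for the projection output (made-up numbers)
toy_result <- data.frame(
  year = rep(2019:2020, each = 2),
  sex = rep(c("f", "m"), times = 2),
  n_dec = c(100, 110, 102, 111)
)

# Total end-of-year population per projected year
totals <- aggregate(n_dec ~ year, data = toy_result, FUN = sum)
print(totals)
```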