---
title: "Prepare data"
output: 
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{Prepare data}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 84
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Overview

To run projections with `{propop}`, you need a starting population and
projection parameters. You can use your own data, provided that the input files
have the required structure. If you don't have the relevant data, you can
download them from the Federal Statistical Office (FSO). This vignette explains
how to get the data and how to prepare them for population projections with
`{propop}`.

*Note that some data are only available for certain administrative levels.*

# Required data

If you don't have the data required to run `propop::propop()`
(or `propop::project_raw()`), you can download most of them from
[STAT-TAB](https://www.pxweb.bfs.admin.ch/pxweb). More specifically,
you need information from the following tables:

+---------------------+--------------------------------+----------------------------+
| Table ID            | Parameters expressed as...     | Variables required for     |
|                     |                                | projection                 |
+=====================+================================+============================+
| px-x-0104020000_101 | number of people (*reference*  | -   Inter-cantonal         |
|                     | scenario)                      |     immigration            |
|                     |                                | -   Inter-cantonal         |
|                     |                                |     emigration             |
|                     |                                | -   International          |
|                     |                                |     immigration            |
|                     |                                | -   International          |
|                     |                                |     emigration             |
|                     |                                | -   end of year population |
|                     |                                |     size                   |
+---------------------+--------------------------------+----------------------------+
| px-x-0104020000_102 | number of people (*high        | -   same as 101            |
|                     | growth* scenario)              |                            |
+---------------------+--------------------------------+----------------------------+
| px-x-0104020000_103 | number of people (*low growth* | -   same as 101            |
|                     | scenario)                      |                            |
+---------------------+--------------------------------+----------------------------+
| px-x-0104020000_109 | rates / probabilities (five    | -   Births per mother      |
|                     | scenarios)                     |                            |
|                     |                                | -   Mortality              |
|                     |                                |                            |
|                     |                                | -   International          |
|                     |                                |     emigration             |
|                     |                                |                            |
|                     |                                | -   Inter-cantonal         |
|                     |                                |     emigration             |
|                     |                                |                            |
|                     |                                | -   Acquisition of Swiss   |
|                     |                                |     citizenship            |
+---------------------+--------------------------------+----------------------------+
| px-x-0104020000_106 | share of newborns with Swiss   | -   Live newborns          |
|                     | nationality born to non-Swiss  |                            |
|                     | mothers                        | -   Live births by age and |
|                     |                                |     nationality of the     |
|                     |                                |     mother (varies between |
|                     |                                |     cantons)               |
+---------------------+--------------------------------+----------------------------+
| Constant            |                                | -   Start (16) and end (50)|
| parameters          |                                |     of the fertile age of  |
| **not** directly    |                                |     women                  |
| available from      |                                |                            |
| STAT-TAB are        |                                | -   Proportion of newborns |
| provided as         |                                |     with female sex        |
| arguments           |                                |     (100/205)              |
+---------------------+--------------------------------+----------------------------+

: Overview of required FSO tables (STAT-TAB)
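
The constants in the last row of the table are plain numbers. As a small sketch,
the chunk below spells out what they imply. The variable names mirror the
`propop()` arguments `fert_first`, `fert_last`, and `share_born_female`; the
interpretation in the comments is simple arithmetic, not something taken from
FSO documentation.

```{r eval = FALSE}
# Constants from the last row of the table
fert_first <- 16
fert_last <- 50
share_born_female <- 100 / 205

# Number of single-year ages in the fertile range (both ends included)
n_fertile_ages <- length(fert_first:fert_last)
n_fertile_ages

# 100 / 205 means 100 girls among 205 newborns, i.e. 105 boys per 100 girls
share_born_male <- 1 - share_born_female
round(c(female = share_born_female, male = share_born_male), 4)
```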


# Convenient way to get FSO data

The `propop` package provides two convenience functions to download data from
the FSO: `get_population()` and `get_parameters()`. Both build heavily on the
`BFS` package and its documentation.

To get the starting population for a spatial unit, you must spell its name
exactly as in the corresponding FSO table. The entries in the FSO tables may
contain special characters, and the spelling may vary between FSO tables.

`BFS::bfs_get_metadata()` is helpful to identify the required spelling(s).
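
Once you have the labels of a table, finding the right spelling is a simple text
search. The sketch below uses a hypothetical helper, `find_spelling()`, and a few
illustrative labels instead of downloaded metadata. The commented-out line shows
how the labels might be pulled from `BFS::bfs_get_metadata()`, assuming that its
result contains a `valueTexts` list column; check the `BFS` documentation for the
actual structure.

```{r eval = FALSE}
# Hypothetical helper (not part of propop or BFS):
# return every label that contains the search term, exactly as spelled
find_spelling <- function(labels, term) {
  grep(term, labels, fixed = TRUE, value = TRUE)
}

# Illustrative labels only; real labels come from the FSO metadata, e.g.
# (assumption about the returned structure, requires internet access):
# labels <- unlist(BFS::bfs_get_metadata(number_bfs = "px-x-0102010000_101")$valueTexts)
labels <- c("Schweiz", "- Aargau", "- Bern")

spelling <- find_spelling(labels, "Aargau")
spelling
```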

Here's an example of how to get the population for the canton of Aargau:

```{r eval = FALSE, warning=FALSE}
library(propop) 
ag_population <- get_population(
  number_fso = "px-x-0102010000_101",
  year = 2022,
  spatial_units = "- Aargau"
)
```

Next, get the parameters for the same canton. Use the spelling from the FSO
tables (see the note above); here it differs from the spelling used for the
population table:

```{r get ag params, eval = FALSE}
ag_parameters <- get_parameters(
  year = 2023,
  year_last = 2026,
  spatial_units = "Aargau"
)
```

The projection can then be run as follows:

```{r eval = FALSE}
# select reference scenario
ag_parameters_ref <- ag_parameters |>
  dplyr::filter(scen == "reference")

propop(
  parameters = ag_parameters_ref,
  year = 2023,
  year_last = 2026,
  age_groups = 101,
  fert_first = 16,
  fert_last = 50,
  share_born_female = 100 / 205,
  population = ag_population,
  subregional = FALSE,
  binational = TRUE
)
```
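
Because the downloaded parameters cover several scenarios, it can be worth
confirming that the filter left exactly one scenario before running the
projection. The sketch below uses a toy data frame as a stand-in for the
downloaded parameters. Only the `scen` column and the value `"reference"` come
from the example above; the `year` column and the other scenario labels are
assumptions made for illustration.

```{r eval = FALSE}
# Toy stand-in for downloaded parameters (not real FSO data)
toy_parameters <- data.frame(
  scen = c("reference", "reference", "high", "low"),
  year = c(2023, 2024, 2023, 2023)
)

# Same filter as above, written in base R to keep the sketch self-contained
toy_ref <- toy_parameters[toy_parameters$scen == "reference", ]

# Stop early if the filter returned nothing or more than one scenario
stopifnot(
  nrow(toy_ref) > 0,
  length(unique(toy_ref$scen)) == 1
)
```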


**Note of caution:**
The download functions only work as long as the FSO's API and the underlying
data structure remain stable. Changes to the API are likely to break them.