Introduction to tidyndr

“To measure is to know. If you cannot measure it, you cannot improve it.”

- Lord Kelvin

The Nigeria National Data Repository (NDR) houses the de-identified patient-level information for the HIV program in Nigeria. Perhaps the most versatile feature of the repository is that it allows users with login access to download this patient-level information, analyze it, and make informed decisions to improve the response of their program towards achieving the UNAIDS three 95s targets.

Analysis of the de-identified patient-level information is traditionally conducted in Microsoft Excel. While this provides a great platform, it has some downsides which include:

  1. The software must be installed on the user’s computer.

  2. The user must be familiar with the formula for calculation of the indicator of interest.

  3. Because of the point-and-click nature of the analysis, it is error-prone. These errors sometimes go unnoticed by users, giving false results.

  4. Performing the same analysis periodically can be quite tedious and time-consuming as the analysis is usually done afresh every time.

The aim of the {tidyndr} package is to eliminate these hurdles by providing the user with a tool that runs on free and open-source software, lets the user focus on the task to be performed rather than the formula, removes user-introduced errors, and allows for easy automation of routine activities.

The {tidyndr} functions are grouped into four categories for performing related actions.

  1. Importing functions

  2. Treatment functions

  3. Viral Load functions

  4. Summary functions

library(tidyndr)

Importing your NDR line-list

The read_ndr() function allows you to import your line-list in a nicely configured format for data analysis. It:

  1. Reads your .csv file.

  2. Formats the data type for each of the column variables as necessary (converts all date variables to dates and categorical variables to factors).

  3. Converts all column names to snake case format.

  4. Adds three new columns to your data: date_lost (calculated by adding 28 days to the sum of last_drug_pickup_date and days_of_arv_refill), appointment_date (calculated by adding days_of_arv_refill to last_drug_pickup_date), and current_status (calculated by classifying the patient as “active” or “inactive” using the value of the time_stamp argument as a reference).
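The three derived columns can be illustrated on a toy data frame. This is only a base R sketch of the arithmetic described in step 4, not the code used inside read_ndr(); in particular, the exact comparison used to classify a client as “active” (here: date_lost falls after the reference date) is an assumption.

```r
# Toy line-list with the two columns the derived dates depend on
# (column names follow the snake_case names used in this vignette).
toy <- data.frame(
  last_drug_pickup_date = as.Date(c("2020-12-01", "2020-10-01")),
  days_of_arv_refill    = c(90, 30)
)

# Reference date, playing the role of the `time_stamp` argument.
time_stamp <- as.Date("2021-02-15")

# appointment_date = last drug pick-up date + days of ARV refill
toy$appointment_date <- toy$last_drug_pickup_date + toy$days_of_arv_refill

# date_lost = last drug pick-up date + days of ARV refill + 28 days
toy$date_lost <- toy$appointment_date + 28

# Assumption: a client counts as "active" while date_lost is after time_stamp.
toy$current_status <- ifelse(toy$date_lost > time_stamp, "active", "inactive")

print(toy)
```

In this toy data the first client is classified as active and the second as inactive, because the second client's date_lost falls before the reference date.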

read_ndr() has 5 arguments that can be specified. The first two are always required:


## import file from the computer. This uses the example file that comes with the {tidyndr} package.

file_path <- system.file("extdata",
                         "ndr_example.csv",
                         package = "tidyndr")

ex_ndr <- read_ndr(file_path, time_stamp = "2021-02-15")

## import file from the computer using a few of the `...` arguments and setting `quiet` to TRUE

ndr_example <- read_ndr(file_path,
                        time_stamp = "2021-02-15",
                        skip = 0,
                        comment = "",
                        quiet = TRUE)

Treatment functions

This group of indicators is based on the PEPFAR MER treatment indicators and their supporting indicators. It includes:

  1. tx_new()
  2. tx_curr()
  3. tx_ml()
  4. tx_ml_outcomes()
  5. tx_rtt()
  6. tx_appointment()
  7. tx_mmd()

tx_new()

This generates the line-list of clients who started ART within a period. It accepts 5 arguments, of which only the first is compulsory:

## generate tx_new clients for the FY for all states in the data
tx_new(ndr_example)
#> # A tibble: 5,239 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 1004       0003           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 3008       0003           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2008       0003           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30012      0003           
#> # ... with 5,235 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>

## generate tx_new for only one state (State 1 in the data) for January 2021.

tx_new(ndr_example,
       from = "2021-01-01",
       to = "2021-01-31",
       states = "State 1")
#> # A tibble: 244 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 1004       0003           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 10070      00029          
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 10085      00030          
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 10087      00033          
#> # ... with 240 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>
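The filtering idea behind the second call can be sketched in base R on toy data. This is a simplified illustration rather than the package's implementation; it assumes that "started ART within a period" means art_start_date lies between from and to (both inclusive) and that states are matched on the state column.

```r
# Toy data using two of the column names shown in the output above.
toy <- data.frame(
  state            = c("State 1", "State 1", "State 2", "State 1"),
  art_start_date   = as.Date(c("2021-01-05", "2020-12-20",
                               "2021-01-20", "2021-01-31")),
  stringsAsFactors = FALSE
)

from <- as.Date("2021-01-01")
to   <- as.Date("2021-01-31")

# Keep clients in "State 1" whose ART start date falls inside the period
# (assumption: both ends of the period are inclusive).
new_clients <- toy[toy$state %in% "State 1" &
                     toy$art_start_date >= from &
                     toy$art_start_date <= to, ]

nrow(new_clients)
```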

tx_curr()

This generates the line-list of all clients who are still active on treatment. It has 4 parameters, of which only the first is compulsory. The parameters are:

## generate current clients using the calculated `current_status` column
tx_curr(ndr_example)
#> # A tibble: 16,696 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 1001       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2001       0001           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 3001       0001           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 1002       0001           
#> # ... with 16,692 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>

## generate current clients using the default `current_status_28_days` column
tx_curr(ndr_example,
        status = "default")
#> # A tibble: 27,024 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 3001       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 1002       0001           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 3003       0001           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 1003       0002           
#> # ... with 27,020 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>

tx_ml()

This generates the line-list of clients who were active at the beginning of the reference period and have become inactive by the to date specified. The default is to generate the line-list of all clients who became inactive in the current Fiscal Year (i.e. were active at the beginning of the FY but have now become inactive). It accepts the 5 arguments listed below:

## generate the line-list of clients who were active at the beginning of October 2020 
## (beginning of FY21) but became inactive at the end of December 2020.
tx_ml(new_data = ndr_example,
      from = "2020-10-01",
      to = "2020-12-31")

## if data from two periods are available, you can supply these to determine the `tx_ml`

file_path <- "https://raw.githubusercontent.com/stephenbalogun/example_files/main/ndr_example.csv"
ndr_old <- read_ndr(file_path, time_stamp = "2021-02-15")
ndr_new <- ndr_example
tx_ml(old_data = ndr_old,
      new_data = ndr_new)

## generate the line-list of clients who have become inactive for "State 1" and "State 2" 
## since the beginning of October 2020.
tx_ml(new_data = ndr_example,
      states = c("State 1", "State 2"))

tx_ml_outcomes()

For the inactive clients generated, you might be interested in subsetting those with specific final outcomes of interest. Currently, the NDR recognizes only two final outcomes (“dead” and “transferred out”). These are the ones that can be subset using the tx_ml_outcomes() function. This function takes only two arguments:

## generate the line-list of all clients who became inactive this Fiscal Year
ml_example <- tx_ml(new_data = ndr_example)

## subset inactive clients who were transferred out
tx_ml_outcomes(ml_example, outcome = "transferred out")

tx_rtt()

You can filter for clients who were previously inactive but have returned to treatment and are still active at the end of the period of interest. This is the only function that always requires two different sets of data: the first contains the inactive clients, while the second is where their change in status is checked. The acceptable arguments to tx_rtt() include:

## location of the old line-list that contains the list of inactive clients
file_path <- "https://raw.githubusercontent.com/stephenbalogun/example_files/main/ndr_example.csv"

old_data <- read_ndr(file_path,
                     time_stamp = "2021-02-15")

new_data <- ndr_example 
tx_rtt(old_data, new_data)
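The comparison between the two line-lists can be sketched in base R. The toy data frames below are made up, and matching clients on patient_identifier alone is an assumption made for brevity (in real data the identifier may need to be combined with the facility); this is not the package's internal logic.

```r
# Two toy snapshots; `patient_identifier` and `current_status` mirror
# column names shown in the outputs above.
old_toy <- data.frame(
  patient_identifier = c("A1", "A2", "A3", "A4"),
  current_status     = c("inactive", "inactive", "active", "inactive"),
  stringsAsFactors   = FALSE
)
new_toy <- data.frame(
  patient_identifier = c("A1", "A2", "A3", "A5"),
  current_status     = c("active", "inactive", "active", "active"),
  stringsAsFactors   = FALSE
)

# Clients who were inactive in the old line-list ...
previously_inactive <-
  old_toy$patient_identifier[old_toy$current_status == "inactive"]

# ... and are active in the new one (assumption: matched on patient_identifier).
returned <- new_toy[new_toy$current_status == "active" &
                      new_toy$patient_identifier %in% previously_inactive, ]

print(returned$patient_identifier)
```

Only client A1 qualifies here: A2 is still inactive, A3 was never inactive, A4 is missing from the new line-list, and A5 was not in the old one.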

tx_appointment()

Sometimes, you are interested in knowing the number of active clients who are due for medication refill/drug pick-up within a period of time. This can help you to plan for the visits, forecast medication appointments and also identify active clients who have missed their appointment. tx_appointment() is one of the supporting treatment indicators that helps in this regard. It takes 6 arguments, viz:

## generate list of clients with medication appointment in Q2 of FY21
q2_appt <- tx_appointment(ndr_example,
              from = "2021-01-01",
              to = "2021-03-31")

## print the number of clients with appointments in Q2
nrow(q2_appt)
#> [1] 11465
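As a rough illustration of the idea, the same kind of subset can be written in base R on toy data. The sketch assumes that only clients with current_status equal to "active" are considered and that both ends of the period are inclusive; the package's own rules may differ.

```r
# Toy data using the derived `appointment_date` and `current_status` columns.
toy <- data.frame(
  patient_identifier = c("A1", "A2", "A3", "A4"),
  appointment_date   = as.Date(c("2021-01-15", "2021-04-02",
                                 "2021-03-31", "2021-02-10")),
  current_status     = c("active", "active", "active", "inactive"),
  stringsAsFactors   = FALSE
)

from <- as.Date("2021-01-01")
to   <- as.Date("2021-03-31")

# Active clients whose appointment falls within the period
# (assumption: inclusive bounds).
due <- toy[toy$current_status == "active" &
             toy$appointment_date >= from &
             toy$appointment_date <= to, ]

nrow(due)
```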

tx_mmd()

Knowing the number of months of medication dispensed during the last medication refill allows you to calculate the number of active clients who are on MMD (Multi-month Dispensing), i.e. clients who were given between 3 and 6 months of medication during the last clinic visit. You might also be interested in knowing the details of clients who did not have MMD, or who had more than 6 months of medication refill (some of which might be due to data entry errors). The arguments that can be supplied to this function include:

tx_mmd(ndr_example)
#> # A tibble: 14,310 x 53
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 1001       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2001       0001           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 3001       0001           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 1002       0001           
#> # ... with 14,306 more rows, and 45 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>, months_dispensed <dbl>

## filter clients who had more than 6 months of ARV
tx_mmd(ndr_example,
       months = c(7, Inf))
#> # A tibble: 0 x 53
#> # ... with 53 variables: ip <fct>, state <fct>, lga <fct>, facility <fct>,
#> #   datim_code <fct>, sex <fct>, patient_identifier <chr>,
#> #   hospital_number <chr>, date_of_birth <date>, age_at_art_initiation <dbl>,
#> #   current_age <dbl>, art_start_date <date>, art_start_date_source <fct>,
#> #   last_drug_pickup_date <date>, last_drug_pickup_date_q1 <date>,
#> #   last_drug_pickup_date_q2 <date>, last_drug_pickup_date_q3 <date>,
#> #   last_drug_pickup_date_q4 <date>, last_regimen <fct>,
#> #   last_clinic_visit_date <date>, days_of_arv_refill <dbl>,
#> #   pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>, months_dispensed <dbl>

## list of clients who had either more than 6 months, or < 3 months medications dispensed
tx_mmd(ndr_example,
       months = c(1, 2, 7, Inf))
#> # A tibble: 2,237 x 53
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 20012      0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 20021      0008           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30027      0009           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30031      00010          
#> # ... with 2,233 more rows, and 45 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>, months_dispensed <dbl>
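The months_dispensed column in the outputs above suggests that the days of refill are converted to months before filtering. A minimal base R sketch of that idea follows; the conversion used here (30 days per month, rounded down) is an assumption and may not match how the package derives the column.

```r
# Toy data with the days of ARV refill at the last pick-up.
toy <- data.frame(
  patient_identifier = c("A1", "A2", "A3", "A4"),
  days_of_arv_refill = c(30, 90, 180, 240),
  stringsAsFactors   = FALSE
)

# Assumption: one month of medication is taken as 30 days.
toy$months_dispensed <- floor(toy$days_of_arv_refill / 30)

# Clients on MMD (3 to 6 months dispensed) ...
on_mmd <- toy[toy$months_dispensed >= 3 & toy$months_dispensed <= 6, ]

# ... and clients with more than 6 months dispensed (possible data entry errors).
over_six <- toy[toy$months_dispensed > 6, ]

print(on_mmd$patient_identifier)
print(over_six$patient_identifier)
```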

Viral Load Indicators

The viral load group of indicators allows you to generate the list of clients who are eligible for viral load; have a viral load result within the last one year (for adults of 20 years and above) or 6 months (for pediatrics and adolescents); and are virally suppressed. These indicators include:

  1. tx_vl_eligible()

  2. tx_pvls_den()

  3. tx_pvls_num()

  4. tx_vl_unsuppressed()

tx_vl_eligible()

Two groups of clients are eligible for a routine viral load test: clients who have been on ARV medication for at least 6 months and have not had a viral load result, and clients whose last viral load result was more than 12 months ago (for adults) or more than 6 months ago (for pediatrics and adolescents). tx_vl_eligible() generates this line-list using a combination of the 6 arguments below:

## list of clients who are eligible for VL sample collection by the end of March 2021
tx_vl_eligible(ndr_example,
               ref = "2021-03-31",
               sample = TRUE)
#> # A tibble: 7,017 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 1001       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 3006       0002           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 1005       0004           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 20012      0001           
#> # ... with 7,013 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>

## filter clients who are eligible for VL test (result) by 31st of March 2021
tx_vl_eligible(ndr_example,
               ref = "2021-03-31")
#> # A tibble: 14,314 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 1001       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2001       0001           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 3001       0001           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 1002       0001           
#> # ... with 14,310 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>
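The two eligibility groups described for this indicator can be expressed as a small base R sketch on toy data. It is illustrative only: it approximates 6 and 12 months as 180 and 365 days, takes adults as clients aged 20 years and above, and ignores any further conditions the package may apply.

```r
ref <- as.Date("2021-03-31")

# Toy data using column names shown in the outputs above.
toy <- data.frame(
  patient_identifier = c("A1", "A2", "A3", "A4", "A5", "A6"),
  current_age        = c(35, 35, 12, 12, 40, 40),
  art_start_date     = as.Date(c("2019-01-10", "2019-01-10", "2019-06-01",
                                 "2019-06-01", "2021-01-05", "2020-01-05")),
  date_of_current_viral_load = as.Date(c("2020-11-01", "2020-01-15",
                                         "2020-11-01", "2020-06-01", NA, NA)),
  stringsAsFactors   = FALSE
)

# Group 1: on ART for at least 6 months (assumed 180 days) with no VL result.
on_art_6m <- as.numeric(ref - toy$art_start_date) >= 180
no_result <- is.na(toy$date_of_current_viral_load)

# Group 2: last result older than 12 months (adults, assumed 365 days)
# or older than 6 months (clients below 20 years, assumed 180 days).
days_since_vl <- as.numeric(ref - toy$date_of_current_viral_load)
window        <- ifelse(toy$current_age >= 20, 365, 180)
overdue       <- !no_result & days_since_vl > window

eligible <- (on_art_6m & no_result) | overdue
print(toy$patient_identifier[eligible])
```

In this toy data A2 (adult, last result over a year old), A4 (child, last result over 6 months old) and A6 (over 6 months on ART, no result) are flagged, while A5 is not because treatment started less than 6 months before the reference date.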

tx_pvls_den()

This determines clients who have a documented viral load result and are not due for a repeat viral load test. It helps you to know the viral load coverage (the number of those who have a VL result / the number of those eligible [when sample is set to FALSE]). It accepts all the arguments that can be supplied to tx_vl_eligible() above.

## determine clients whose viral load result is within the last 1 year for adults (>= 20 years) 
## and 6 months for paediatrics and adolescents
tx_pvls_den(ndr_example)
#> # A tibble: 7,458 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2001       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 3001       0001           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 1002       0001           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 3005       0001           
#> # ... with 7,454 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>

## list of clients who will not be due for a repeat VL test by the end of March 2021
tx_pvls_den(ndr_example,
            ref = "2021-03-31")
#> # A tibble: 7,297 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2001       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 3001       0001           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 1002       0001           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 3005       0001           
#> # ... with 7,293 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>
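As a worked example of the coverage formula described for this indicator, the row counts printed in this vignette for a reference date of 31 March 2021 can be combined directly. The sketch assumes that the 14,314 rows returned by tx_vl_eligible(ndr_example, ref = "2021-03-31") are the appropriate denominator, i.e. that sample defaults to FALSE in that call.

```r
# Row counts reported in the outputs above, both with ref = "2021-03-31".
with_result <- 7297   # rows from tx_pvls_den(ndr_example, ref = "2021-03-31")
eligible    <- 14314  # rows from tx_vl_eligible(ndr_example, ref = "2021-03-31")

# Viral load coverage as a percentage.
vl_coverage <- 100 * with_result / eligible
round(vl_coverage, 1)
```

With these figures, roughly half (about 51%) of the eligible clients in the example data have a documented viral load result.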

tx_pvls_num()

This determines the number of clients whose viral load test results are below a particular limit among those with a documented viral load result, thus are said to be virally suppressed. It helps you to know how many clients are doing well on their current medication. It accepts the following arguments:

## clients whose last viral load result is within the last 1 year for adults (>= 20 years) 
## and 6 months for paediatrics and adolescents, and are virally suppressed
tx_pvls_num(ndr_example)
#> # A tibble: 2,889 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2005       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2009       0001           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30014      0005           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30021      0006           
#> # ... with 2,885 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>

## generate the list of clients whose viral load result is less than 200
tx_pvls_num(ndr_example,
            n = 200)
#> # A tibble: 2,687 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 2009       0001           
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30014      0005           
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30021      0006           
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 20018      0004           
#> # ... with 2,683 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>
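Conceptually, the `n` argument acts as a cut-off on the viral load column. The following is a minimal base R sketch on an invented data frame, not the package's implementation: the helper `below_limit()` and the toy values are hypothetical, only the column name `current_viral_load` is taken from the output above, and the check on the date of the result is left out for brevity.

```r
# Toy line-list; the values are invented for illustration only.
toy <- data.frame(
  patient_identifier = c("A", "B", "C", "D"),
  current_viral_load = c(40, 150, 950, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep clients with a documented result below n.
below_limit <- function(data, n) {
  has_result <- !is.na(data$current_viral_load)
  data[has_result & data$current_viral_load < n, , drop = FALSE]
}

nrow(below_limit(toy, n = 1000)) # 3 clients (40, 150 and 950)
nrow(below_limit(toy, n = 200))  # 2 clients (40 and 150)
```

Lowering the limit can only shrink the list, which matches the drop from 2,889 to 2,687 rows in the two calls above.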

tx_vl_unsuppressed

This generates the line-list of clients with a documented viral load result that is above a particular limit, who are therefore said to be virally unsuppressed. It accepts the following arguments:

## clients whose last viral load result is within the last 1 year for adults (>= 20 years) 
## and 6 months for paediatrics and adolescents but were unsuppressed
tx_vl_unsuppressed(ndr_example)
#> # A tibble: 369 x 52
#>   ip     state  lga   facility datim_code sex   patient_identif~ hospital_number
#>   <fct>  <fct>  <fct> <fct>    <fct>      <fct> <chr>            <chr>          
#> 1 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 30097      00036          
#> 2 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 10081      00027          
#> 3 IP_na~ State~ LGA0~ Facilit~ datim_cod~ F     State 200381     000118         
#> 4 IP_na~ State~ LGA0~ Facilit~ datim_cod~ M     State 100249     00087          
#> # ... with 365 more rows, and 44 more variables: date_of_birth <date>,
#> #   age_at_art_initiation <dbl>, current_age <dbl>, art_start_date <date>,
#> #   art_start_date_source <fct>, last_drug_pickup_date <date>,
#> #   last_drug_pickup_date_q1 <date>, last_drug_pickup_date_q2 <date>,
#> #   last_drug_pickup_date_q3 <date>, last_drug_pickup_date_q4 <date>,
#> #   last_regimen <fct>, last_clinic_visit_date <date>,
#> #   days_of_arv_refill <dbl>, pregnancy_status <fct>, current_viral_load <dbl>,
#> #   date_of_current_viral_load <date>, current_viral_load_q1 <dbl>,
#> #   date_of_current_viral_load_q1 <date>, current_viral_load_q2 <dbl>,
#> #   date_of_current_viral_load_q2 <date>, current_viral_load_q3 <dbl>,
#> #   date_of_current_viral_load_q3 <date>, current_viral_load_q4 <dbl>,
#> #   date_of_current_viral_load_q4 <date>, current_status_28_days <fct>,
#> #   current_status_90_days <fct>, current_status_q1_28_days <fct>,
#> #   current_status_q1_90_days <fct>, current_status_q2_28_days <fct>,
#> #   current_status_q2_90_days <fct>, current_status_q3_28_days <fct>,
#> #   current_status_q3_90_days <fct>, current_status_q4_28_days <fct>,
#> #   current_status_q4_90_days <fct>, patient_has_died <lgl>,
#> #   patient_deceased_date <date>, patient_transferred_out <lgl>,
#> #   transferred_out_date <date>, patient_transferred_in <lgl>,
#> #   transferred_in_date <date>, x49 <chr>, date_lost <date>,
#> #   appointment_date <date>, current_status <chr>
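The row counts of the two line-lists can be combined into a suppression rate. This is a rough sketch using the counts printed above; it assumes both calls used the same limit and reference period, so that together they cover every client with a valid result. With the package loaded, `nrow()` on each line-list would supply the counts instead of the hard-coded numbers.

```r
suppressed   <- 2889 # rows returned by tx_pvls_num(ndr_example) above
unsuppressed <- 369  # rows returned by tx_vl_unsuppressed(ndr_example) above

# Assumption: these two groups make up all clients with a valid result.
with_result <- suppressed + unsuppressed
suppression_rate <- suppressed / with_result

round(100 * suppression_rate, 1) # 88.7 (per cent)
```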

Summary Indicators

Summary indicators provide aggregates for a particular indicator of interest. {tidyndr} provides two aggregate functions. These are:

  1. summarise_ndr()

  2. disaggregate()

summarise_ndr

All the previous functions generate patient-level line-lists. Most often, you will be interested in a tabular summary of the information they provide. This is the work of summarise_ndr() (and its partner summarize_ndr()). It takes all the line-lists that you have generated and displays a summary table with one column for each of them. summarise_ndr() accepts three arguments:

curr <- tx_curr(ndr_example) # generate active clients and assign to "curr"

new <- tx_new(ndr_example) # generate TX_NEW for the FY and assign to "new"

summarise_ndr(curr, new, level = "state", names = c("curr", "tx_new"))
#> # A tibble: 4 x 4
#>   ip      state    curr tx_new
#>   <chr>   <chr>   <int>  <int>
#> 1 IP_name State 1  3726    662
#> 2 IP_name State 2  5058   1239
#> 3 IP_name State 3  7912   3338
#> 4 Total   -       16696   5239
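To make the shape of this table concrete, here is a small base R sketch that builds a similar summary by hand: one count column per line-list, one row per state, and a final total row. It only illustrates the idea and is not how summarise_ndr() is implemented; the toy data and the helper `count_by_state()` are hypothetical.

```r
# Toy line-lists; states and row counts are invented for illustration.
curr_toy <- data.frame(state = c("State 1", "State 1", "State 2", "State 3"),
                       stringsAsFactors = FALSE)
new_toy  <- data.frame(state = c("State 1", "State 3"),
                       stringsAsFactors = FALSE)

states <- sort(unique(c(curr_toy$state, new_toy$state)))

# Count rows per state; fixing the factor levels keeps states with zero rows.
count_by_state <- function(data) {
  as.integer(table(factor(data$state, levels = states)))
}

summary_toy <- data.frame(
  state  = states,
  curr   = count_by_state(curr_toy),
  tx_new = count_by_state(new_toy),
  stringsAsFactors = FALSE
)

# Append a total row, mirroring the last row of the table above.
summary_toy <- rbind(
  summary_toy,
  data.frame(state = "Total",
             curr = sum(summary_toy$curr),
             tx_new = sum(summary_toy$tx_new),
             stringsAsFactors = FALSE)
)
summary_toy
```

The same check applies to the real output: each entry in the Total row equals the sum of its column (3726 + 5058 + 7912 = 16696 and 662 + 1239 + 3338 = 5239).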

disaggregate

This is a very powerful function that allows you to summarise your generated line-list, disaggregated by a particular variable. The disaggregation options currently available are “age (or current age)”, “sex (or gender)”, and “pregnancy status”. It accepts three arguments:

## generate list of inactive clients
inactives <- tx_ml(new_data = ndr_example)

## disaggregate inactive clients by gender at state level
disaggregate(inactives,
             by = "sex")

## disaggregate inactive clients by "age group" at country level
disaggregate(inactives,
             by = "current_age",
             level = "country")
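The two calls above amount to cross-tabulations. The base R sketch below shows the idea on an invented data frame: counting by sex within each state, and counting by age band for the whole country. It is not the package's implementation, and the age bands are hypothetical and may differ from the ones disaggregate() uses.

```r
# Toy line-list; the values are invented for illustration only.
toy <- data.frame(
  state       = c("State 1", "State 1", "State 2", "State 2", "State 2"),
  sex         = c("F", "M", "F", "F", "M"),
  current_age = c(8, 34, 17, 45, 62),
  stringsAsFactors = FALSE
)

# By sex at state level: one row per state, one column per sex.
by_sex <- table(toy$state, toy$sex)
by_sex

# By age group at country level: bin the ages first, then count.
# Hypothetical bands; each interval includes its lower bound only.
age_group <- cut(toy$current_age,
                 breaks = c(0, 15, 25, 50, Inf),
                 right  = FALSE,
                 labels = c("<15", "15-24", "25-49", "50+"))
by_age <- table(age_group)
by_age
```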