psrwe: Propensity Score-Integrated Approaches for Incorporating Real-World Evidence in Clinical Studies

Chenguang Wang

2021-04-16

## Loading required package: psrwe
## Loading required package: rstan
## Loading required package: StanHeaders
## Loading required package: ggplot2
## rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## Loading required package: Rcpp

Introduction

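The R package psrwe implements propensity score-integrated approaches for leveraging real-world evidence in clinical study design and analysis: a power prior approach and a composite likelihood approach for single arm studies, and a composite likelihood approach for randomized studies.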

Propensity score estimation

The approaches implemented in psrwe are mostly based on propensity score adjustment. The function rwe_ps estimates the propensity scores and assigns patients to propensity score strata.

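library(psrwe)

# Example dataset shipped with psrwe
data(ex_dta)

# Estimate propensity scores from covariates V1-V7 and form 5 PS strata
dta_ps <- rwe_ps(ex_dta,
                 v_covs = paste0("V", 1:7),
                 v_grp = "Group",
                 cur_grp_level = "current",
                 nstrata = 5)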
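
To make the two steps behind this call concrete, the sketch below estimates propensity scores with a logistic regression and cuts them into five strata at the quintiles of the current-study scores. It uses simulated data and base R only. The variable names, the logistic model, and the trimming rule are assumptions for illustration and may differ from what rwe_ps does internally.

```r
set.seed(1)

# Simulated data (hypothetical): 200 current-study and 1000 external patients
n_cur <- 200
n_ext <- 1000
dta <- data.frame(
  Group = rep(c("current", "rwd"), c(n_cur, n_ext)),
  V1    = c(rnorm(n_cur, mean = 0.3), rnorm(n_ext, mean = 0)),
  V2    = c(rbinom(n_cur, 1, 0.6), rbinom(n_ext, 1, 0.4))
)

# Propensity score: probability of belonging to the current study given covariates
dta$Z  <- as.integer(dta$Group == "current")
fit    <- glm(Z ~ V1 + V2, data = dta, family = binomial())
dta$ps <- fitted(fit)

# Stratum boundaries: quintiles of the propensity score among current-study patients
cuts <- quantile(dta$ps[dta$Z == 1], probs = seq(0, 1, length.out = 6))

# External patients whose score falls outside the current-study range get NA (trimmed)
dta$stratum <- cut(dta$ps, breaks = cuts, include.lowest = TRUE, labels = FALSE)

# Patients per group and stratum
table(dta$Group, dta$stratum, useNA = "ifany")
```

Defining the cut points on the current-study scores gives every stratum the same number of current-study patients, while the number of external patients per stratum varies.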

Always evaluate the results of the propensity score adjustment before using them in an analysis. psrwe provides plots of the covariate distributions and of the propensity score distributions within the propensity score strata, so that balance between current and external patients can be inspected visually.

plot(dta_ps, "balance")

plot(dta_ps, "ps")
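
A numeric companion to these plots is the standardized mean difference (SMD) of a covariate between current and external patients, computed overall and within each stratum. The sketch below is a base R illustration on simulated data with a single hypothetical covariate v1. It is not a psrwe function, and dividing by the overall pooled standard deviation is one convention among several.

```r
set.seed(2)

# Simulated data (hypothetical): z = 1 for current study, z = 0 for external data
n_cur <- 200
n_ext <- 1000
z  <- rep(c(1, 0), c(n_cur, n_ext))
v1 <- rnorm(n_cur + n_ext, mean = 0.5 * z)

# Propensity scores and quintile strata defined on the current-study scores
ps      <- fitted(glm(z ~ v1, family = binomial()))
cuts    <- quantile(ps[z == 1], probs = seq(0, 1, length.out = 6))
stratum <- cut(ps, breaks = cuts, include.lowest = TRUE, labels = FALSE)

# SMD: difference in group means divided by the overall pooled standard deviation
pooled_sd <- sqrt((var(v1[z == 1]) + var(v1[z == 0])) / 2)
smd <- function(idx) {
  (mean(v1[idx & z == 1]) - mean(v1[idx & z == 0])) / pooled_sd
}

smd_all    <- smd(rep(TRUE, length(z)))
smd_strata <- sapply(1:5, function(s) smd(!is.na(stratum) & stratum == s))

round(c(overall = smd_all, smd_strata), 3)
```

Within-stratum values that are much smaller than the overall value suggest that stratification has made the two groups more comparable on that covariate.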

PS-integrated power prior approach for single arm studies

For single arm studies with one external data source, the function rwe_ps_powerp conducts the analysis proposed in Wang et al. (2019). The method uses propensity scores to pre-select a subset of the real-world data containing patients who are similar, in terms of covariates, to those in the current study, and to stratify the selected patients together with the current-study patients into more homogeneous strata. The power prior approach is then applied within each stratum to obtain stratum-specific posterior distributions, which are combined to complete the Bayesian inference for the parameters of interest.
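
The idea of down-weighting external data within each stratum can be sketched with a conjugate Beta-binomial model. This is a simplified stand-in, not the Stan model that psrwe fits: the per-stratum counts, the Beta(1, 1) initial prior, the fixed power parameter a_s = A_s / n0_s, and the equal-weight combination are all assumptions made for illustration.

```r
set.seed(42)

# Hypothetical per-stratum summaries; illustrative values, not taken from ex_dta
strata <- data.frame(
  n1 = c(40, 40, 40, 40, 40),     # current-study patients
  y1 = c(16, 10, 8, 14, 13),      # current-study responders
  n0 = c(720, 143, 95, 57, 16),   # external patients
  y0 = c(290, 40, 20, 20, 5),     # external responders
  A  = c(10, 9, 8, 7, 6)          # assumed nominal number of patients borrowed
)

# Power parameter: each external patient counts as a fraction a_s of a patient
a <- pmin(1, strata$A / strata$n0)

# Conjugate posterior per stratum: Beta(1, 1) prior, external likelihood raised to a_s
shape1 <- 1 + strata$y1 + a * strata$y0
shape2 <- 1 + (strata$n1 - strata$y1) + a * (strata$n0 - strata$y0)

# Posterior draws: one column per stratum
n_draw <- 4000
draws  <- mapply(function(s1, s2) rbeta(n_draw, s1, s2), shape1, shape2)

# Overall parameter: average of the stratum parameters (equal weights, equal n1)
theta <- rowMeans(draws)
c(mean = mean(theta), variance = var(theta))
```

With these choices the external data contribute the equivalent of 40 patients in total, regardless of how many external patients each stratum contains.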

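# Distance in PS distributions between current and external patients
ps_dist   <- rwe_ps_dist(dta_ps)

# Power prior analysis of the binary outcome Y with total_borrow set to 40
post_smps <- rwe_ps_powerp(dta_ps,
                           total_borrow = 40,
                           v_distance   = ps_dist$Dist[seq_len(dta_ps$nstrata)],
                           outcome_type = "binary",
                           v_outcome    = "Y")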
## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 4.8e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.48 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.14399 seconds (Warm-up)
## Chain 1:                0.126873 seconds (Sampling)
## Chain 1:                0.270863 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.5e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.149957 seconds (Warm-up)
## Chain 2:                0.106409 seconds (Sampling)
## Chain 2:                0.256366 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 3.2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.32 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.154825 seconds (Warm-up)
## Chain 3:                0.092672 seconds (Sampling)
## Chain 3:                0.247497 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'powerpsbinary' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1.6e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.162727 seconds (Warm-up)
## Chain 4:                0.07583 seconds (Sampling)
## Chain 4:                0.238557 seconds (Total)
## Chain 4:
## Warning: There were 44 divergent transitions after warmup. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems

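Check the mixing of the posterior samples to confirm that the sampler has converged. Stan reported 44 divergent transitions after warmup in the run above, so the trace plots deserve a close look before the results are interpreted.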

traceplot(post_smps$stan_rst, pars = c("theta", "thetas"))
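
Trace plots can be complemented by a numeric diagnostic. The sketch below computes the classical Gelman-Rubin potential scale reduction factor (Rhat) in base R for simulated chains. The chains are hypothetical, and this simple formula is not the rank-normalized split Rhat that newer Stan tooling reports.

```r
set.seed(7)

# Hypothetical posterior draws: 1000 iterations (rows) from 4 well-mixed chains (columns)
chains <- matrix(rnorm(4000, mean = 0.3, sd = 0.03), ncol = 4)

# Classical Gelman-Rubin statistic: compares between-chain and within-chain variance
rhat <- function(x) {
  n <- nrow(x)
  W <- mean(apply(x, 2, var))   # average within-chain variance
  B <- n * var(colMeans(x))     # between-chain variance
  sqrt(((n - 1) / n * W + B / n) / W)
}

rhat_good <- rhat(chains)

# Shift one chain to mimic a chain stuck in a different region
stuck      <- chains
stuck[, 4] <- stuck[, 4] + 0.1
rhat_bad   <- rhat(stuck)

c(well_mixed = rhat_good, one_chain_shifted = rhat_bad)
```

Values close to 1 indicate that the chains agree, while clearly larger values point to poor mixing. If post_smps$stan_rst is a stanfit object, as the traceplot call suggests, summary(post_smps$stan_rst)$summary should report Rhat and n_eff for each parameter.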

Results can be further summarized as:

summary(post_smps)
## $overall_mean
## [1] 0.3092625
## 
## $overall_variance
## [1] 0.0008288703
## 
## $theta_by_stratum
##   Strata     Theta    Variance
## 1      1 0.4057454 0.004544006
## 2      2 0.2594251 0.003561362
## 3      3 0.2063270 0.003229119
## 4      4 0.3506278 0.004614629
## 5      5 0.3241875 0.004484550
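
The overall mean printed above is consistent with a simple average of the five stratum-specific estimates. This reading is inferred from the printed numbers rather than taken from the package documentation, and the check below only repeats that arithmetic in base R.

```r
# Stratum-specific posterior means copied from the summary above
theta_s <- c(0.4057454, 0.2594251, 0.2063270, 0.3506278, 0.3241875)

# Equal-weight average across the five strata
overall <- mean(theta_s)
overall
```

The result agrees with the reported 0.3092625 up to the rounding of the printed inputs. Equal weights are plausible here because each stratum contains the same number of current-study patients (N1 = 40 in the composite likelihood summary later in this document).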

PS-integrated composite likelihood approach for single arm studies

For single arm studies with one external data source, the function rwe_ps_cl conducts the analysis proposed in Wang et al. (2020). Within each propensity score stratum, a composite likelihood function is specified that down-weights the information contributed by the external data source. The stratum-specific parameters are estimated by maximizing the composite likelihood function, and these estimates are then combined into an overall population-level estimate of the parameter of interest.
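
A minimal sketch of this idea for a binary outcome follows. All inputs are hypothetical: the stratum counts, the similarity values r, and the rule that splits the total amount of borrowing in proportion to r are assumptions for illustration, not the documented behaviour of rwe_ps_borrow or rwe_ps_cl.

```r
# Hypothetical per-stratum summaries; illustrative values, not taken from ex_dta
n1 <- c(40, 40, 40, 40, 40)      # current-study patients
y1 <- c(16, 10, 8, 14, 13)       # current-study responders
n0 <- c(720, 143, 95, 57, 16)    # external patients
y0 <- c(290, 40, 20, 20, 5)      # external responders
r  <- c(0.9, 0.8, 0.7, 0.6, 0.4) # assumed similarity of PS distributions per stratum

# Split the total nominal number of borrowed patients in proportion to r
total_borrow <- 40
lambda <- pmin(n0, total_borrow * r / sum(r))

# Each external patient's likelihood contribution is raised to the power w_s
w <- lambda / n0

# Composite log-likelihood of a Bernoulli parameter theta in stratum s
cl_loglik <- function(theta, s) {
  (y1[s] + w[s] * y0[s]) * log(theta) +
    ((n1[s] - y1[s]) + w[s] * (n0[s] - y0[s])) * log(1 - theta)
}

# Maximize numerically in each stratum
theta_hat <- sapply(seq_along(n1), function(s) {
  optimize(cl_loglik, interval = c(1e-6, 1 - 1e-6), s = s, maximum = TRUE)$maximum
})

# Closed-form maximizer for comparison: a weighted response proportion
closed_form <- (y1 + w * y0) / (n1 + w * n0)

# Overall estimate: stratum estimates weighted by current-study stratum size
overall <- sum(n1 * theta_hat) / sum(n1)
round(rbind(theta_hat, closed_form), 4)
overall
```

For a Bernoulli outcome the maximizer has a closed form, so the numerical search is included only to show the general recipe that would apply to other outcome models.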

ps_borrow <- rwe_ps_borrow(total_borrow = 40, ps_dist)
rst_cl    <- rwe_ps_cl(dta_ps, v_borrow = ps_borrow, v_outcome = "Y")
summary(rst_cl)
## $overall_mean
## [1] 0.3009473
## 
## $jackknife_variance
## [1] 0.0007589453
## 
## $theta_by_stratum
##   Strata N1  N0     Theta    Variance
## 1      1 40 720 0.4010585 0.003486166
## 2      2 40 143 0.2503177 0.003297067
## 3      3 40  95 0.1924889 0.002852081
## 4      4 40  57 0.3440490 0.004661073
## 5      5 40  16 0.3168223 0.004414814

PS-integrated composite likelihood approach for randomized studies

For randomized studies with one external data source that contains control subjects, the function rwe_ps_cl2arm conducts the analysis proposed in Chen et al. (2020). This propensity score-integrated composite likelihood approach augments the control arm of a two-arm randomized controlled trial with patients from the external data source. An example is given below.

data(ex_dta_rct)
dta_ps_2arm <- rwe_ps(ex_dta_rct,
                      v_covs = paste("V", 1:7, sep = ""),
                      v_grp = "Group",
                      cur_grp_level = "current",
                      nstrata = 5)

rst_2arm <- rwe_ps_cl2arm(dta_ps_2arm,
                          v_arm = "Arm",
                          trt_arm_level = 1,
                          outcome_type = "continuous",
                          v_outcome = "Y",
                          total_borrow = 40)

print(rst_2arm)
## $treatment
## $treatment$overall_mean
## [1] 368.4005
## 
## $treatment$jackknife_variance
## [1] 19.15712
## 
## $treatment$theta_by_stratum
##   Strata N1 N0    Theta  Variance
## 1      1 21 21 386.4337 101.98887
## 2      2 15 15 353.5991  50.85397
## 3      3 22 22 374.9533 102.84782
## 4      4 19 19 367.9795  69.01982
## 5      5 23 23 355.6684 105.97465
## 
## 
## $control
## $control$overall_mean
## [1] 358.1482
## 
## $control$jackknife_variance
## [1] 10.25328
## 
## $control$theta_by_stratum
##   Strata N1  N0    Theta Variance
## 1      1 19 720 373.8150 41.71388
## 2      2 25 143 362.4157 32.31339
## 3      3 18  95 356.2292 99.33514
## 4      4 21  57 354.5110 37.34027
## 5      5 17  16 340.8876 47.54278
## 
## 
## $effect
## $effect$Estimate
## [1] 10.1551
## 
## $effect$Variance
## [1] 27.55722
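
The printed summaries can be related to one another with a little arithmetic. The weighting used below is inferred from the numbers in the output, not taken from the package documentation: each arm's overall mean matches an average of its stratum estimates weighted by that arm's current-study stratum size (N1), and the effect matches an average of the stratum-specific differences weighted by the total current-study stratum size.

```r
# Stratum-level results copied from the output above
trt <- data.frame(n     = c(21, 15, 22, 19, 23),
                  theta = c(386.4337, 353.5991, 374.9533, 367.9795, 355.6684))
ctl <- data.frame(n     = c(19, 25, 18, 21, 17),
                  theta = c(373.8150, 362.4157, 356.2292, 354.5110, 340.8876))

# Arm-level means: stratum estimates weighted by the arm's own stratum size
trt_mean <- weighted.mean(trt$theta, trt$n)
ctl_mean <- weighted.mean(ctl$theta, ctl$n)

# Effect: stratum-specific differences weighted by the total current-study size per stratum
n_s    <- trt$n + ctl$n
effect <- weighted.mean(trt$theta - ctl$theta, n_s)

# Naive alternative: difference of the two arm-level means
naive_diff <- trt_mean - ctl_mean

round(c(trt_mean = trt_mean, ctl_mean = ctl_mean,
        effect = effect, naive_diff = naive_diff), 4)
```

The reported effect estimate of 10.1551 is therefore not the difference between the two overall means, which would be about 10.25. The two differ because the arms are weighted differently across strata.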

References

Chen, W. C., Wang, C., Li, H., Lu, N., Tiwari, R., Xu, Y., & Yue, L. Q. (2020). Propensity score-integrated composite likelihood approach for augmenting the control arm of a randomized controlled trial by incorporating real-world data. Journal of Biopharmaceutical Statistics, 30(3), 508-520.

Wang, C., Lu, N., Chen, W. C., Li, H., Tiwari, R., Xu, Y., & Yue, L. Q. (2020). Propensity score-integrated composite likelihood approach for incorporating real-world evidence in single-arm clinical studies. Journal of Biopharmaceutical Statistics, 30(3), 495-507.

Wang, C., Li, H., Chen, W. C., Lu, N., Tiwari, R., Xu, Y., & Yue, L. Q. (2019). Propensity score-integrated power prior approach for incorporating real-world evidence in single-arm clinical studies. Journal of Biopharmaceutical Statistics, 29(5), 731-748.