Generate a simulated dataset — gendataset • admr

gendataset generates a simulated dataset based on the model structure and random effects specified in the options. The function can generate data in either raw format or nlmixr format, with optional residual error.

Usage

gendataset(opts, seed = 1, reserr = TRUE, nlmixrform = FALSE)

Arguments

opts: A list of model options generated by genopts(). Contains settings for the model, including the prediction function, time points, parameter structure, and simulation settings.
seed: Random seed for reproducibility. Default is 1.
reserr: Logical indicating whether to add residual error to the simulated data. Default is TRUE.
nlmixrform: Logical indicating whether to return the data in nlmixr format. If TRUE, returns a data frame with columns: dv, time, id, amt, evid, cmt. If FALSE, returns a matrix of simulated observations. Default is FALSE.

Value

If nlmixrform is FALSE, returns a matrix of simulated observations with dimensions nsim x length(time). If nlmixrform is TRUE, returns a data frame in nlmixr format with columns:

dv: Dependent variable (e.g., concentration measurements)
time: Observation time points
id: Subject identifier
amt: Dose amount (NA for observations, typically in mg)
evid: Event identifier (101 for dosing, 0 for observation)
cmt: Compartment number (1 for depot/dosing, 2 for central/observation)

Details

The function generates simulated data following these steps:

Generating random effects (η) from a multivariate normal distribution
Computing individual parameters (θᵢ) using log-normal transformations
Simulating concentration-time profiles using the prediction function
Adding residual error components if requested: - Proportional error: y = f(t,θ)(1 + ε), where ε ~ N(0,σ²_prop) - Additive error: y = f(t,θ) + ε, where ε ~ N(0,σ²_add)
Formatting output in either raw matrix or nlmixr-compatible format

The residual error model can include:

Proportional error only (specified by Sigma_prop)
Additive error only (specified by Sigma_add)
Combined error model (both Sigma_prop and Sigma_add)

The function supports reproducible simulations through the seed parameter and is compatible with both population PK modeling and simulation workflows.

Examples

# Load required libraries
library(admr)
library(rxode2)
library(nlmixr2)
library(dplyr)
library(tidyr)
library(mnorm)

# Define RxODE model
rxModel <- function(){
model({
  # Parameters
  ke = cl / v1             # Elimination rate constant
  k12 = q / v1             # Rate constant for central to peripheral transfer
  k21 = q / v2             # Rate constant for peripheral to central transfer

  # Differential equations
  d/dt(depot)    = -ka * depot
  d/dt(central)  = ka * depot - ke * central - k12 * central + k21 * peripheral
  d/dt(peripheral) = k12 * central - k21 * peripheral

  # Concentration in central compartment
  cp = central / v1
})
}

rxModel <- rxode2(rxModel)
#>  
#>  
#> ℹ parameter labels from comments are typically ignored in non-interactive mode
#> ℹ Need to run with the source intact to parse comments
rxModel <- rxModel$simulationModel
#>  
#>  

# Define prediction function for a two-compartment model
predder <- function(time, theta_i, dose = 100) {
  n_individuals <- nrow(theta_i)
  if (is.null(n_individuals)) n_individuals <- 1

  # Create event table for dosing and sampling
  ev <- eventTable(amount.units="mg", time.units="hours")
  ev$add.dosing(dose = dose, nbr.doses = 1, start.time = 0)
  ev$add.sampling(time)

  # Solve ODE system
  out <- rxSolve(rxModel, params = theta_i, events = ev, cores = 0)

  # Return matrix of predictions
  matrix(out$cp, nrow = n_individuals, ncol = length(time), byrow = TRUE)
}

# Create options for a two-compartment model
opts <- genopts(
  f = predder,
  time = c(.1, .25, .5, 1, 2, 3, 5, 8, 12),
  p = list(
    # Population parameters (fixed effects)
    beta = c(cl = 5,    # Clearance (L/h)
            v1 = 10,    # Central volume (L)
            v2 = 30,    # Peripheral volume (L)
            q = 10,     # Inter-compartmental clearance (L/h)
            ka = 1),    # Absorption rate (1/h)

    # Between-subject variability (30% CV on all parameters)
    Omega = matrix(c(0.09, 0, 0, 0, 0,
                    0, 0.09, 0, 0, 0,
                    0, 0, 0.09, 0, 0,
                    0, 0, 0, 0.09, 0,
                    0, 0, 0, 0, 0.09), nrow = 5, ncol = 5),

    # Residual error (20% CV)
    Sigma_prop = 0.04
  ),
  nsim = 2500,  # Number of Monte Carlo samples
  n = 500,      # Number of individuals
  fo_appr = FALSE  # Use Monte Carlo approximation
)

# Generate a dataset with 100 individuals
dataset <- gendataset(opts, n = 100)