Catch Errors

Error catching is an important consideration when building Monte Carlo simulations. Iterative algorithms sometimes 'fail to converge', or crash for other reasons (e.g., sparse data). SimDesign makes this much easier to manage because the internal functions are automatically wrapped within try blocks, so simulations do not terminate unexpectedly. The caught errors are also recorded, since they may alert the author that something unexpected is going wrong in the code base. Below we demonstrate what happens when errors are thrown and caught, and how this information is tracked in the returned object.

Define the functions

As usual, define the functions of interest.

library(SimDesign)
# SimFunctions(comments=FALSE)

Design <- data.frame(N = c(10,20,30))
Generate <- function(condition, fixed_objects = NULL) {
    ret <- with(condition, rnorm(N))
    ret
}

Analyse <- function(condition, dat, fixed_objects = NULL) {
    whc <- sample(c(0,1,2,3), 1, prob = c(.7, .20, .05, .05))
    if(whc == 0){
       ret <- mean(dat)
    } else if(whc == 1){
        ret <- t.test() # missing arguments
    } else if(whc == 2){
        ret <- t.test('invalid') # invalid arguments
    } else if(whc == 3){
        # throw error manually 
        stop('Manual error thrown') 
    }
    # manual warnings
    if(sample(c(TRUE, FALSE), 1, prob = c(.1, .9)))
        warning('This warning happens rarely')
    if(sample(c(TRUE, FALSE), 1, prob = c(.5, .5)))
        warning('This warning happens much more often')
    ret
}

Summarise <- function(condition, results, fixed_objects = NULL) {
    ret <- c(bias = bias(results, 0))
    ret
}

The above simulation is just an example of how errors are tracked in SimDesign, as well as how to throw a manual error when the data should be re-drawn based on the user's decision (e.g., when a model would eventually converge, but does not do so within a predefined number of iterations).
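As a minimal sketch of the manual-error idea, the Analyse function below fits a logistic regression with glm() and calls stop() when the fit reports non-convergence, so that the data would be redrawn. The names Generate_logit and Analyse_logit, the variables x and y, and the maxit element of fixed_objects are assumptions made for this illustration; they are not part of the example above.

```r
# Hypothetical data generator: logistic regression data with N rows
Generate_logit <- function(condition, fixed_objects = NULL) {
    x <- rnorm(condition$N)
    y <- rbinom(condition$N, size = 1, prob = plogis(0.5 * x))
    data.frame(x = x, y = y)
}

# Hypothetical Analyse function: throw a manual error when glm() reports
# that it hit the iteration limit without converging
Analyse_logit <- function(condition, dat, fixed_objects = NULL) {
    maxit <- fixed_objects$maxit   # assumed iteration limit
    # glm()'s own non-convergence warning is silenced here because the
    # explicit stop() below reports the same problem
    mod <- suppressWarnings(
        glm(y ~ x, data = dat, family = binomial(),
            control = glm.control(maxit = maxit)))
    if(!mod$converged)
        stop('Model did not converge within maxit iterations')
    unname(coef(mod)['x'])
}

# Call the functions directly to see both outcomes
set.seed(1)
condition <- data.frame(N = 200)
dat <- Generate_logit(condition)
ok  <- Analyse_logit(condition, dat, fixed_objects = list(maxit = 25))
bad <- try(Analyse_logit(condition, dat, fixed_objects = list(maxit = 1)),
           silent = TRUE)
```

Here ok holds the slope estimate, while bad is a 'try-error' object carrying the manual error message, which is the kind of condition that triggers a redraw.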

Run the simulation

result <- runSimulation(Design, replications = 100, 
                       generate=Generate, analyse=Analyse, summarise=Summarise)
## Design row: 1/3;   Started: Thu Jun 27 23:42:34 2019;   Total elapsed time: 0.00s 
## Design row: 2/3;   Started: Thu Jun 27 23:42:34 2019;   Total elapsed time: 0.09s 
## Design row: 3/3;   Started: Thu Jun 27 23:42:34 2019;   Total elapsed time: 0.17s
print(result)
##    N     bias
## 1 10  0.03782
## 2 20  0.01479
## 3 30 -0.01443
##   ERROR: .Error in t.test.default("invalid") : not enough 'x' observations\n
## 1                                                                          4
## 2                                                                         17
## 3                                                                          5
##   ERROR: .Error in t.test.default() : argument "x" is missing, with no default\n
## 1                                                                             19
## 2                                                                             26
## 3                                                                             32
##   ERROR: .Manual error thrown\n
## 1                             7
## 2                             8
## 3                             9
##   WARNING: .This warning happens much more often
## 1                                             44
## 2                                             40
## 3                                             47
##   WARNING: .This warning happens rarely REPLICATIONS SIM_TIME
## 1                                     5          100    0.09s
## 2                                     9          100    0.08s
## 3                                     9          100    0.08s
##                  COMPLETED       SEED
## 1 Thu Jun 27 23:42:34 2019  570175513
## 2 Thu Jun 27 23:42:34 2019  799129990
## 3 Thu Jun 27 23:42:34 2019 1230193230

What you'll immediately notice from this output object is that the messages of the errors thrown, along with the functions that threw them, are included as additional columns in the output with the prefix ERROR:. The frequency with which each error occurred is also reported for each design condition (here the t.test.default() error, where no inputs were supplied, occurred more often than both the manually thrown error and the invalid-input error). WARNING messages are tracked in the same way, in case they offer clues as to why estimation models are having difficulty (or in case the warnings point to something more serious).

Finally, SimDesign has a built-in safety feature, controlled by the max_errors argument, to avoid getting stuck in infinite redrawing loops. By default, if more than 50 errors are returned consecutively then the simulation is halted and the final error message is returned. This safeguard exists because too many consecutive stop() calls generally indicate a major problem in the simulation code that should be fixed before continuing.
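To make the redraw-and-count logic concrete, here is a base R sketch of a replication loop that retries after an error, tallies the error messages, and halts once too many errors occur in a row. It only illustrates the idea and is not SimDesign's internal code: run_replications(), flaky(), always_fails(), and the bookkeeping variables are hypothetical names, and only the max_errors name mirrors the runSimulation() argument.

```r
# Hypothetical helper: collect `replications` successful draws, redrawing after
# each error and halting once more than `max_errors` errors occur in a row
run_replications <- function(one_replication, replications, max_errors = 50) {
    results <- vector('list', replications)
    error_msgs <- character(0)
    consecutive <- 0
    done <- 0
    while(done < replications) {
        out <- try(one_replication(), silent = TRUE)
        if(inherits(out, 'try-error')) {
            msg <- conditionMessage(attr(out, 'condition'))
            error_msgs <- c(error_msgs, msg)      # tally the message
            consecutive <- consecutive + 1
            if(consecutive > max_errors)
                stop('Halted after too many consecutive errors; last error: ', msg)
            next                                   # redraw
        }
        consecutive <- 0                           # a success resets the counter
        done <- done + 1
        results[[done]] <- out
    }
    list(results = unlist(results), errors = table(error_msgs))
}

# A replication that fails roughly 20% of the time
set.seed(42)
flaky <- function() {
    if(runif(1) < 0.2) stop('Manual error thrown')
    mean(rnorm(10))
}
out <- run_replications(flaky, replications = 100)
out$errors   # frequency of each error message

# A replication that always fails triggers the safeguard
always_fails <- function() stop('Manual error thrown')
halted <- try(run_replications(always_fails, replications = 100, max_errors = 5),
              silent = TRUE)
```

The first call still returns 100 valid results because every failed draw is replaced, while the second call stops early and reports the last error message.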

What to do (explicit debug catch)

If errors occur too often, then the affected design conditions should either be removed from the simulation or inspected further to determine whether they can be fixed (e.g., by providing better starting values, adjusting the convergence criteria, or increasing the number of iterations). The debugging features can also help track down issues. For example, manually wrap the problematic functions in a try() call, and add the line if(is(object, 'try-error')) browser() to jump into the location/replication where the object unexpectedly encountered an error. Jumping into the exact location where the error occurred makes it much easier to see what went wrong in the simulation state, allowing you to quickly locate and fix the issue.
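A minimal sketch of this pattern follows, assuming a hypothetical Analyse function in which t.test() is the call under suspicion. It uses inherits(), a base R counterpart of the is() check, and guards browser() with interactive() so that batch runs are not interrupted; both choices are assumptions of this sketch rather than requirements.

```r
Analyse <- function(condition, dat, fixed_objects = NULL) {
    # wrap the suspect call so that a failure returns a 'try-error' object
    # instead of immediately leaving the function
    object <- try(t.test(dat), silent = TRUE)
    if(inherits(object, 'try-error')) {
        # inspect dat, condition, and object at the moment of failure
        if(interactive()) browser()
        # re-throw so the failed replication is still reported as an error
        stop(conditionMessage(attr(object, 'condition')))
    }
    object$p.value
}

# a single observation makes t.test() fail with "not enough 'x' observations"
res <- try(Analyse(data.frame(N = 1), dat = 0.5), silent = TRUE)
```

In an interactive session the browser opens inside Analyse() with dat and condition available for inspection; once the problem is understood, the try() wrapper and browser() line can be removed again.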

What to do (stored error seed debugging)

An alternative approach to locating errors is to use the information stored within the SimDesign object at the time of completion. By default, all .Random.seed states associated with errors are stored within the final object, and these can be extracted using the extract_error_seeds() function. This function returns a data.frame object with each seed stored column-wise, where the associated error message is contained in the column name itself (after being coerced into a valid data.frame column name).

seeds <- extract_error_seeds(result)
head(seeds[,1:3])
##   Design_row_1.1..Error.in.t.test.default..invalid.....not.enough..x..observations.
## 1                                                                               403
## 2                                                                               207
## 3                                                                        1561779751
## 4                                                                        -764922243
## 5                                                                       -1307744396
## 6                                                                       -1528964262
##   Design_row_1.2..Error.in.t.test.default..invalid.....not.enough..x..observations.
## 1                                                                               403
## 2                                                                               251
## 3                                                                        1561779751
## 4                                                                        -764922243
## 5                                                                       -1307744396
## 6                                                                       -1528964262
##   Design_row_1.3..Error.in.t.test.default.....argument..x..is.missing..with.no.default.
## 1                                                                                   403
## 2                                                                                   272
## 3                                                                            1561779751
## 4                                                                            -764922243
## 5                                                                           -1307744396
## 6                                                                           -1528964262

Given these seeds, replicating an exact error can be achieved by a) extracting a single column into an integer vector, and b) passing this vector to the load_seed input. For example, the first error message can be replicated as follows. In this case it makes the most sense to enter the debugging mode immediately via the debug input.

Note: It is important to manually select the correct Design row using this error extraction approach; otherwise, the seed will clearly not replicate the exact problem state.

picked_seed <- seeds$Design_row_1.1..Error.in.t.test.default..invalid.....not.enough..x..observations.
runSimulation(Design[1,], replications = 100, load_seed=picked_seed, debug='analyse',
              generate=Generate, analyse=Analyse, summarise=Summarise)

The .Random.seed state will be loaded at this exact state, and will always be reloaded at this state as well (e.g., if c is typed in the debugger, or if the error proves harder to find while walking through the debug mode). Hence, users must type Q to exit the debugger once they have understood the nature of the error first-hand.
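The reason a stored seed reproduces an error exactly can be illustrated in base R, independent of SimDesign: restoring a saved .Random.seed vector makes the subsequent random draws identical. This is a sketch of the underlying idea only, and the object names are hypothetical.

```r
set.seed(1234)                                            # initialise the generator
saved_seed <- get('.Random.seed', envir = globalenv())    # integer vector of the state

first_draw <- rnorm(5)                                    # advances the generator state

# restore the saved state, then draw again
assign('.Random.seed', saved_seed, envir = globalenv())
second_draw <- rnorm(5)

identical(first_draw, second_draw)                        # TRUE
```

Because the generator state fully determines the draws, any data set produced after loading a stored seed matches the one that originally triggered the error.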