This function uses the mice package to multiply impute missing values based on the statistical relationships among a set of variables. The mice documentation and tutorials are worth consulting when developing or checking this function.

impute_data_mice(data, var_names, var_methods, n_imputations)

Arguments

data

Data table - the Health Survey for England dataset with missing values

var_names

Character vector - the names of the variables to be considered in the multiple imputation.

var_methods

Character vector - the name of the mice imputation method used to predict each variable in var_names, given in the same order - see the mice documentation.

n_imputations

Integer - the number of different versions of the imputed data to produce.

Value

Returns a list containing

  • data - all versions of the multiply imputed data in a single data table.

  • object - the mice multiple imputation object.
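As a rough illustration of the workflow described above, the sketch below calls mice directly and returns a list with the same two elements. It is not this package's actual implementation: impute_sketch() is a hypothetical stand-in, the nhanes2 example data ships with mice rather than with this package, and the fixed seed and the use of complete(action = "long") to stack the imputations are assumptions.

```r
library(mice)

# Hypothetical stand-in for impute_data_mice(); the real function may differ.
impute_sketch <- function(data, var_names, var_methods, n_imputations) {
  # Keep only the variables that take part in the imputation model.
  sub <- as.data.frame(data)[, var_names, drop = FALSE]

  # One method per column, in the same order as var_names.
  # An empty string tells mice not to impute that column.
  imp <- mice::mice(sub, m = n_imputations, method = var_methods,
                    printFlag = FALSE, seed = 1)

  # Stack all completed datasets; mice adds .imp and .id index columns.
  stacked <- mice::complete(imp, action = "long")

  list(data = stacked, object = imp)
}

# nhanes2: age is complete, hyp is a binary factor, bmi and chl are numeric.
res <- impute_sketch(
  data = nhanes2,
  var_names = c("age", "bmi", "hyp", "chl"),
  var_methods = c("", "pmm", "logreg", "pmm"),
  n_imputations = 3
)

str(res$data)
```

In this sketch the .imp column identifies which imputed version each row belongs to, so a single version can be selected by filtering on it.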

Examples


if (FALSE) {

# Imputation method for each variable type:
# "logreg"  - binary variable: logistic regression
# "polr"    - ordered categorical variable: proportional odds model
# "polyreg" - unordered categorical variable: polytomous logistic regression

imp_obj <- impute_data_mice(
  data = test_data,
  var_names = c("binary_variable", "order_categorical_variable", "unordered_categorical_variable"),
  var_methods = c("logreg", "polr", "polyreg"),
  n_imputations = 5
)

}
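Because the returned object element is a mice multiple imputation object, it can in principle be passed to the standard mice analysis functions. The sketch below shows that pattern under stated assumptions: the mids object is built directly from the nhanes2 data that ships with mice as a stand-in for imp_obj$object, and the linear model is purely illustrative.

```r
library(mice)

# Stand-in for imp_obj$object: a mids object created directly with mice.
imp <- mice::mice(nhanes2, m = 3, printFlag = FALSE, seed = 1)

# Fit the same model within each imputed dataset.
fits <- with(imp, lm(bmi ~ age + chl))

# Combine the per-imputation estimates using Rubin's rules.
pooled <- mice::pool(fits)
summary(pooled)
```

Pooling in this way carries the between-imputation variability into the standard errors, which analysing a single completed dataset would ignore.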