All functions in the high-level validate_*() family return a <hub_validations> S3 class object.

Structure of <hub_validations> object

A hub_validations object is effectively a list that collects the output of the series of checks performed by a high-level validate_*() function.

Each named element of the list contains the result of an individual check and inherits from the <hub_check> class. The name of each element is the name of the check.

Let’s examine an example output of a model output file validation using validate_submission().

hub_path <- system.file("testhubs/simple", package = "hubValidations")

v <- validate_submission(hub_path,
  file_path = "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
)

str(v, max.level = 1)
#> List of 20
#>  $ valid_config      :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_exists       :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_name         :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_location     :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ round_id_valid    :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_format       :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ metadata_exists   :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_read         :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ valid_round_id_col:List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ unique_round_id   :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ match_round_id    :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ colnames          :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ col_types         :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ valid_vals        :List of 5
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ rows_unique       :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ req_vals          :List of 5
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ value_col_valid   :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ value_col_non_desc:List of 5
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ value_col_sum1    :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_info" "hub_check" "rlang_message" "message" ...
#>  $ submission_time   :List of 4
#>   ..- attr(*, "class")= chr [1:5] "check_failure" "hub_check" "rlang_warning" "warning" ...
#>  - attr(*, "class")= chr [1:2] "hub_validations" "list"

The condition class of the object returned in each element depends on the status of the check:

  • If a check succeeds, a <message/check_success> condition class object is returned.

  • If a check is skipped, a <message/check_info> condition class object is returned.

  • Checks vary with respect to whether they return an <error/check_error> or <warning/check_failure> condition class object if the check fails. Ultimately, both will cause overall validation to fail and the two classes are used primarily to communicate the severity of a failing check.
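The status of each check can be read directly off the class vector. The following is a minimal sketch in base R, assuming a hand-built stand-in object (`mock_v`, created by the hypothetical helper `new_check()`) rather than real validate_submission() output; for brevity the mock elements carry only the status class and "hub_check", omitting the rlang condition classes shown above.

```r
# Hypothetical helper: builds a minimal stand-in for a <hub_check> element.
# Real elements also carry rlang condition classes, which are omitted here.
new_check <- function(status, message) {
  structure(
    list(
      message = message,
      where = "example.csv",
      call = "mock_check",
      use_cli_format = TRUE
    ),
    class = c(status, "hub_check")
  )
}

# Hand-built stand-in for a <hub_validations> object (illustrative data only)
mock_v <- structure(
  list(
    file_exists = new_check("check_success", "File exists."),
    value_col_sum1 = new_check("check_info", "Check skipped."),
    submission_time = new_check("check_failure", "Outside submission window.")
  ),
  class = c("hub_validations", "list")
)

# The first class of each element gives the status of the check
status <- vapply(mock_v, function(x) class(x)[1], character(1))

# A check counts as failed if it is a check_failure or a check_error
failed <- vapply(
  mock_v,
  function(x) inherits(x, c("check_failure", "check_error")),
  logical(1)
)

# Names of the failing checks, and whether validation passed overall
names(mock_v)[failed]
overall_valid <- !any(failed)
```

The same `vapply()` calls should work on a real hub_validations object such as `v`, since they rely only on the class vector of each element.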

hub_validations print method

hub_validations objects have their own print method which displays the result, the file name and message of each check:

  • ✔ indicates a check was successful (a <message/check_success> condition class object was returned)
  • ✖ indicates a high severity check failed (an <error/check_error> condition class object was returned)
  • ! indicates a lower severity check failed (a <warning/check_failure> condition class object was returned)
  • ℹ indicates a check was skipped (a <message/check_info> condition class object was returned)

v
#> ✔ simple: All hub config files are valid.
#> ✔ 2022-10-08-team1-goodmodel.csv: File exists at path
#>   model-output/team1-goodmodel/2022-10-08-team1-goodmodel.csv.
#> ✔ 2022-10-08-team1-goodmodel.csv: File name "2022-10-08-team1-goodmodel.csv" is
#>   valid.
#> ✔ 2022-10-08-team1-goodmodel.csv: File directory name matches `model_id`
#>   metadata in file name.
#> ✔ 2022-10-08-team1-goodmodel.csv: `round_id` is valid.
#> ✔ 2022-10-08-team1-goodmodel.csv: File is accepted hub format.
#> ✔ 2022-10-08-team1-goodmodel.csv: Metadata file exists at path
#>   model-metadata/team1-goodmodel.yaml.
#> ✔ 2022-10-08-team1-goodmodel.csv: File could be read successfully.
#> ✔ 2022-10-08-team1-goodmodel.csv: `round_id_col` name is valid.
#> ✔ 2022-10-08-team1-goodmodel.csv: `round_id` column "origin_date" contains a
#>   single, unique round ID value.
#> ✔ 2022-10-08-team1-goodmodel.csv: All `round_id_col` "origin_date" values match
#>   submission `round_id` from file name.
#> ✔ 2022-10-08-team1-goodmodel.csv: Column names are consistent with expected
#>   round task IDs and std column names.
#> ✔ 2022-10-08-team1-goodmodel.csv: Column data types match hub schema.
#> ✔ 2022-10-08-team1-goodmodel.csv: `tbl` contains valid values/value
#>   combinations.
#> ✔ 2022-10-08-team1-goodmodel.csv: All combinations of task ID
#>   column/`output_type`/`output_type_id` values are unique.
#> ✔ 2022-10-08-team1-goodmodel.csv: Required task ID/output type/output type ID
#>   combinations all present.
#> ✔ 2022-10-08-team1-goodmodel.csv: Values in column `value` all valid with
#>   respect to modeling task config.
#> ✔ 2022-10-08-team1-goodmodel.csv: Values in `value` column are non-decreasing
#>   as output_type_ids increase for all unique task ID value/output type
#>   combinations of quantile or cdf output types.
#> ℹ 2022-10-08-team1-goodmodel.csv: No pmf output types to check for sum of 1.
#>   Check skipped.
#> ! 2022-10-08-team1-goodmodel.csv: Submission time must be within accepted
#>   submission window for round.  Current time 2024-02-13 09:47:58.533777 is
#>   outside window 2022-10-02 EDT--2022-10-09 23:59:59 EDT.

Structure of a <hub_check> object

Let’s look more closely at the structure of the first few elements of the hub_validations object returned by validate_submission().

v <- validate_submission(hub_path,
  file_path = "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
)

str(head(v))
#> List of 6
#>  $ valid_config  :List of 4
#>   ..$ message       : chr "All hub config files are valid. \n "
#>   ..$ where         : chr "simple"
#>   ..$ call          : chr "check_config_hub_valid"
#>   ..$ use_cli_format: logi TRUE
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_exists   :List of 4
#>   ..$ message       : chr "File exists at path \033[34mmodel-output/team1-goodmodel/2022-10-08-team1-goodmodel.csv\033[39m. \n "
#>   ..$ where         : chr "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
#>   ..$ call          : chr "check_file_exists"
#>   ..$ use_cli_format: logi TRUE
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_name     :List of 4
#>   ..$ message       : chr "File name \033[34m\"2022-10-08-team1-goodmodel.csv\"\033[39m is valid. \n "
#>   ..$ where         : chr "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
#>   ..$ call          : chr "check_file_name"
#>   ..$ use_cli_format: logi TRUE
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_location :List of 4
#>   ..$ message       : chr "File directory name matches `model_id`\n                                           metadata in file name. \n "
#>   ..$ where         : chr "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
#>   ..$ call          : chr "check_file_location"
#>   ..$ use_cli_format: logi TRUE
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ round_id_valid:List of 4
#>   ..$ message       : chr "`round_id` is valid. \n "
#>   ..$ where         : chr "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
#>   ..$ call          : chr "check_valid_round_id"
#>   ..$ use_cli_format: logi TRUE
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...
#>  $ file_format   :List of 4
#>   ..$ message       : chr "File is accepted hub format. \n "
#>   ..$ where         : chr "team1-goodmodel/2022-10-08-team1-goodmodel.csv"
#>   ..$ call          : chr "check_file_format"
#>   ..$ use_cli_format: logi TRUE
#>   ..- attr(*, "class")= chr [1:5] "check_success" "hub_check" "rlang_message" "message" ...

Each <hub_check> object contains the following elements:

  • message: the result message containing details about the check.
  • where: where the check was performed, usually the model output file name.
  • call: the function used to perform the check.
  • use_cli_format: whether the message is formatted using cli format, almost always TRUE.
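
Because each <hub_check> is a list, its elements can be pulled out with `$` or `[[`. The sketch below assumes a hand-built stand-in list (`mock_v`, illustrative only) whose element values are copied from the str() output above; it shows one way to flatten the per-check fields into a data frame.

```r
# Hand-built stand-ins for two <hub_check> elements, using the fields
# shown in the str() output above (rlang condition classes omitted).
mock_v <- list(
  file_exists = structure(
    list(
      message = "File exists at path model-output/team1-goodmodel/2022-10-08-team1-goodmodel.csv. \n ",
      where = "team1-goodmodel/2022-10-08-team1-goodmodel.csv",
      call = "check_file_exists",
      use_cli_format = TRUE
    ),
    class = c("check_success", "hub_check")
  ),
  file_format = structure(
    list(
      message = "File is accepted hub format. \n ",
      where = "team1-goodmodel/2022-10-08-team1-goodmodel.csv",
      call = "check_file_format",
      use_cli_format = TRUE
    ),
    class = c("check_success", "hub_check")
  )
)

# Access a single field of a single check
mock_v$file_format$call

# Collect one row per check: name, status class, checking function, message
summary_tbl <- data.frame(
  check = names(mock_v),
  status = unname(vapply(mock_v, function(x) class(x)[1], character(1))),
  call = unname(vapply(mock_v, function(x) x[["call"]], character(1))),
  message = unname(vapply(mock_v, function(x) trimws(x[["message"]]), character(1))),
  stringsAsFactors = FALSE
)
summary_tbl
```

`trimws()` is used here only to drop the trailing whitespace and newline that the stored messages carry.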

Extra information

Some <hub_check> objects contain extra information about the failing check to help identify affected rows in submissions.

For example, the valid_vals check verifies that all columns in a model output file (excluding the value column) contain valid combinations of task ID / output type / output type ID values. The <hub_check> object it returns contains an additional element called error_tbl, with details of the invalid value combinations in the affected rows.

To access error_tbl from the output of validate_submission() stored in an object v, you would use:

v$valid_vals$error_tbl
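
Since only some checks carry extra information, it can help to test for it before use. The following is a minimal sketch, assuming a hand-built stand-in list (`mock_v`) in which the call name, message text and the columns of `error_tbl` are hypothetical and chosen purely for illustration; the real table's columns depend on the hub configuration.

```r
# Hand-built stand-ins (illustrative only): one check without extra
# information and one failing valid_vals check carrying an error_tbl.
mock_v <- list(
  file_exists = structure(
    list(
      message = "File exists.",
      where = "example.csv",
      call = "check_file_exists",
      use_cli_format = TRUE
    ),
    class = c("check_success", "hub_check")
  ),
  valid_vals = structure(
    list(
      message = "`tbl` contains invalid values/value combinations.",
      where = "example.csv",
      call = "mock_check", # hypothetical function name
      use_cli_format = TRUE,
      # Hypothetical columns and values describing the affected rows
      error_tbl = data.frame(
        location = "XX",
        output_type = "quantile",
        stringsAsFactors = FALSE
      )
    ),
    class = c("check_error", "hub_check")
  )
)

# Find the checks that carry an error_tbl; `[[` returns NULL when absent
has_tbl <- vapply(
  mock_v,
  function(x) is.data.frame(x[["error_tbl"]]),
  logical(1)
)
names(mock_v)[has_tbl]

# Inspect the table only when it is present
if (is.data.frame(mock_v$valid_vals[["error_tbl"]])) {
  affected <- mock_v$valid_vals[["error_tbl"]]
  nrow(affected)
}
```

Using `[["error_tbl"]]` rather than `$error_tbl` avoids partial name matching on lists.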