R/interval_assertions.R
overlapping_intervals_dt.Rd
Checks whether the specified interval variable contains overlapping intervals.
assert_no_overlapping_intervals_dt(
  dt,
  id_cols,
  col_stem,
  identify_all_possible = FALSE,
  quiet = FALSE
)
identify_overlapping_intervals_dt(
  dt,
  id_cols,
  col_stem,
  identify_all_possible = FALSE,
  quiet = FALSE
)
dt
[data.table()]
Data containing the interval variable to check. Should include all 'id_cols'.
id_cols
[character()]
ID columns that uniquely identify each row of dt. Should include the interval columns '{col_stem}_start' and '{col_stem}_end' (for example, 'age_start' and 'age_end' when col_stem is "age").
col_stem
[character(1)]
The name of the interval variable to check. Should not include the '_start' or '_end' suffix.
identify_all_possible
[logical(1)]
Whether to return all overlapping intervals ('TRUE') or try to identify just the less granular interval ('FALSE'). Default is 'FALSE'. Setting this to 'TRUE' is useful when it is not clear which interval is the less granular one.
quiet
[logical(1)]
Should progress messages be suppressed while the function runs? Default is 'FALSE'.
identify_overlapping_intervals_dt returns a [data.table()] with the id_cols of the rows that have overlapping intervals. If no intervals overlap, a zero-row [data.table()] is returned.
assert_no_overlapping_intervals_dt returns nothing but throws an error if identify_overlapping_intervals_dt returns a non-empty data.table.
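The relationship between the two functions (one identifies, the other asserts) can be illustrated with a minimal base R sketch. The helpers identify_overlaps_sketch and assert_no_overlaps_sketch are hypothetical stand-ins written for this illustration, not the package's implementation. The sketch assumes left-closed, right-open intervals held in columns named start and end, and its error message is made up.

```r
# Hypothetical stand-in for the "identify" step: returns the rows of `df`
# whose [start, end) interval overlaps the interval of any other row.
identify_overlaps_sketch <- function(df) {
  n <- nrow(df)
  overlaps <- vapply(seq_len(n), function(i) {
    others <- setdiff(seq_len(n), i)
    any(df$start[i] < df$end[others] & df$start[others] < df$end[i])
  }, logical(1))
  df[overlaps, , drop = FALSE]
}

# Hypothetical stand-in for the "assert" step: silent when nothing overlaps,
# otherwise stops with an error.
assert_no_overlaps_sketch <- function(df) {
  overlapping <- identify_overlaps_sketch(df)
  if (nrow(overlapping) > 0) {
    stop(nrow(overlapping), " row(s) have overlapping intervals", call. = FALSE)
  }
  invisible(NULL)
}

clean <- data.frame(start = c(0, 5, 10), end = c(5, 10, Inf))
assert_no_overlaps_sketch(clean) # passes silently

# Adding [3, 7) makes it overlap both [0, 5) and [5, 10).
dirty <- rbind(clean, data.frame(start = 3, end = 7))
result <- tryCatch(
  assert_no_overlaps_sketch(dirty),
  error = function(e) conditionMessage(e)
)
```

Splitting the work this way lets interactive users inspect the offending rows with the identify function while pipelines fail fast with the assert function.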
identify_overlapping_intervals_dt works by first identifying each unique set of intervals in dt. It then checks, one at a time, the groups of rows of dt that match each set of intervals.
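The two-step approach described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the location column, the interval_signature and has_overlap helpers, and the use of a text signature to label an interval set are all hypothetical. Intervals are assumed to be left-closed and right-open.

```r
# Hypothetical input: locations "a" and "b" share one set of age intervals,
# while "c" has an extra interval, [3, 7), that overlaps its neighbours.
df <- data.frame(
  location  = c("a", "a", "b", "b", "c", "c", "c"),
  age_start = c(0, 5, 0, 5, 0, 5, 3),
  age_end   = c(5, Inf, 5, Inf, 5, Inf, 7)
)

# Step 1: label each group of rows with a signature of its sorted interval set.
interval_signature <- function(start, end) {
  ord <- order(start, end)
  paste0("[", start[ord], ", ", end[ord], ")", collapse = ",")
}
groups <- split(df, df$location)
signatures <- vapply(
  groups,
  function(g) interval_signature(g$age_start, g$age_end),
  character(1)
)

# Step 2: check each unique interval set once. After sorting by start, an
# interval overlaps an earlier one if it starts before the largest end so far.
has_overlap <- function(start, end) {
  ord <- order(start, end)
  start <- start[ord]
  end <- end[ord]
  n <- length(start)
  if (n < 2) return(FALSE)
  any(start[-1] < cummax(end)[-n])
}
unique_sets <- unique(signatures)
set_overlaps <- vapply(unique_sets, function(sig) {
  g <- groups[[which(signatures == sig)[1]]]
  has_overlap(g$age_start, g$age_end)
}, logical(1))

# Map the per-set result back to every group that shares that set.
group_overlaps <- setNames(set_overlaps[signatures], names(signatures))
```

Checking each unique set once means that groups sharing the same intervals (here "a" and "b") are examined a single time rather than once per group.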
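# five-year age intervals from [0, 5) up to [95, Inf)
input_dt <- data.table::data.table(
  age_start = seq(0, 95, 5),
  age_end = c(seq(5, 95, 5), Inf)
)
# add a wider interval, [15, 60), that overlaps several five-year intervals
input_dt <- rbind(
  input_dt,
  data.table::data.table(age_start = 15, age_end = 60)
)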
# identify everything that is overlapping
overlapping_dt <- identify_overlapping_intervals_dt(
  dt = input_dt,
  id_cols = c("age_start", "age_end"),
  col_stem = "age",
  identify_all_possible = TRUE
)
#> Interval group 1 of 1: [0, 5),[5, 10),[10, 15),[15, 20),[15, 60),[20, 25),[25, 30),[30, 35),[35, 40),[40, 45),[45, 50),[50, 55),[55, 60),[60, 65),[65, 70),[70, 75),[75, 80),[80, 85),[85, 90),[90, 95),[95, Inf)
# identify only the less granular (largest) overlapping intervals
overlapping_dt <- identify_overlapping_intervals_dt(
  dt = input_dt,
  id_cols = c("age_start", "age_end"),
  col_stem = "age",
  identify_all_possible = FALSE
)
#> Interval group 1 of 1: [0, 5),[5, 10),[10, 15),[15, 20),[15, 60),[20, 25),[25, 30),[30, 35),[35, 40),[40, 45),[45, 50),[50, 55),[55, 60),[60, 65),[65, 70),[70, 75),[75, 80),[80, 85),[85, 90),[90, 95),[95, Inf)