R/categorical_trees.R
create_tree.Rd
Creates a hierarchical data.tree object that defines how different levels of categorical or interval data are related to each other.
create_agg_tree(mapping, exists, col_type)
create_scale_tree(mapping, exists, col_type, collapse_missing = FALSE)
vis_tree(tree)
[data.table()]
For 'categorical' variables, defines how different levels of the
hierarchical variable relate to each other. For aggregating 'interval'
variables, it specifies the intervals to aggregate to; when scaling,
the mapping is instead inferred from the available intervals in dt.
[character()]
Names of the variables in mapping for which data exists.
[character(1)]
The type of variable that is being aggregated or scaled over. Can be either
'categorical' or 'interval'.
[logical(1)]
When scaling a categorical variable, whether to collapse missing
intermediate levels in mapping. Default is FALSE, in which case the
function errors out because of the missing data.
[data.tree()] as returned by create_agg_tree() or create_scale_tree().
create_agg_tree() and create_scale_tree() return a [data.tree()] with
attributes for whether each node has data available ('exists') and whether
aggregation to or scaling of each node is possible ('agg_possible' or
'scale_possible'). For create_scale_tree(), the tree also includes a field
for whether children of each node can be scaled ('scale_children_possible').
vis_tree() uses networkD3::diagonalNetwork() to create 'D3' network graphs.
Hierarchical tree data structures are used to represent the different levels of the categorical or interval data. The R package data.tree is used to implement these data structures.
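As a rough illustration of the kind of structure involved, the sketch below builds a small tree by hand with data.tree and attaches an 'exists' field to every node. The node names and the has_data vector are hypothetical, and this is not how create_agg_tree() or create_scale_tree() build their trees internally; it only shows how per-node fields can live on a data.tree object.

```r
library(data.tree)

# Hypothetical three-level hierarchy (names are illustrative only)
country <- Node$new("Country")
north <- country$AddChild("North")
south <- country$AddChild("South")
north$AddChild("North-A")
north$AddChild("North-B")

# Hypothetical set of nodes that have data directly provided
has_data <- c("North-A", "North-B", "South")

# Store a logical 'exists' field on every node of the tree
country$Do(function(node) node$exists <- node$name %in% has_data)

# Print the tree together with the custom field
print(country, "exists")
```

Fields stored this way can be read back for all nodes with country$Get("exists"), which returns the values in traversal order.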
When vis_tree() is used to visualize a tree returned by create_agg_tree(),
nodes with data directly provided are colored green, nodes where
aggregation is possible are colored blue, and nodes without data directly
provided where aggregation is impossible because of missing nodes are
colored red.
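The coloring rule for aggregation trees can be sketched in base R without any packages. The example below assumes that aggregation to a node is possible when the node has children and every child either has data or can itself be aggregated to; that rule, the children list, and the helper names agg_possible() and node_colour() are assumptions made for illustration and are not taken from the package source.

```r
# Hypothetical parent -> children mapping (illustrative names only)
children <- list(
  "Country" = c("North", "South"),
  "North"   = c("North-A", "North-B"),
  "South"   = character(0),
  "North-A" = character(0),
  "North-B" = character(0)
)

# Hypothetical flags for nodes with data directly provided
has_data <- c(
  "Country" = FALSE, "North" = FALSE, "South" = FALSE,
  "North-A" = TRUE, "North-B" = TRUE
)

# Assumed rule: a node can be aggregated to when it has children and every
# child either has data or can itself be aggregated to
agg_possible <- function(node) {
  kids <- children[[node]]
  if (length(kids) == 0) {
    return(FALSE)
  }
  all(vapply(kids, function(k) has_data[[k]] || agg_possible(k), logical(1)))
}

# Map each node to the color described in the text above
node_colour <- function(node) {
  if (has_data[[node]]) {
    "green"
  } else if (agg_possible(node)) {
    "blue"
  } else {
    "red"
  }
}

colours <- vapply(names(children), node_colour, character(1))
print(colours)
```

Under these assumptions "South" has no data and no children, so it is red, and that in turn makes aggregation to "Country" impossible even though "North" can be aggregated from its two children.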
When vis_tree() is used to visualize a tree returned by create_scale_tree(),
nodes with data directly provided and that can be scaled are colored green,
nodes with data directly provided but that cannot be scaled are colored
blue and nodes without data directly provided
# aggregation example where all present day locations exist except for Tehran
locations_present <- iran_mapping[!grepl("[0-9]+", child) &
                                    child != "Tehran", child]
agg_tree <- create_agg_tree(iran_mapping, exists = locations_present,
                            col_type = "categorical")
vis_tree(agg_tree)
# scaling example where all present day locations exist without collapsing
locations_present <- c(iran_mapping[!grepl("[0-9]+", child), child], "Iran")
scale_tree <- create_scale_tree(iran_mapping,
                                exists = locations_present,
                                col_type = "categorical")
vis_tree(scale_tree)
# scaling example where all present day locations exist and collapsing tree
scale_tree <- create_scale_tree(iran_mapping,
                                exists = locations_present,
                                col_type = "categorical",
                                collapse_missing = TRUE)
vis_tree(scale_tree)