The RELSA package comprises a set of functions for RELlative Severity Assessment in Laboratory animals. In animal-based research, the problem of severity classification is crucial. As animals cannot communicate their state of well-being, scientists need reliable tools for monitoring severity as closely as possible. It has been shown that a diversity of behavioral tests (and others) may serve this purpose. However, the main issue with these approaches is that they are rather specific and challenging to transfer to other animal models. A comprehensive and easy-to-use toolbox for assessing and comparing different variables and animal models is missing. RELSA offers a first insight into this topic by providing a composite score for an arbitrary number of input variables and the possibility of comparing different animal models based on actual value changes within a relative context. The RELSA procedure will allow close severity monitoring of individual animals and bulk analysis of overall reached severity levels of entire models.
Foremost, RELSA is not (yet) a predictor of death in laboratory animals!
At its current stage of development, it is meant as a guideline or tool for researchers working with laboratory animals. The package was designed for monitoring the relative severity of individual animals and can also be used to determine the overall severity level of entire animal models. It is not (yet) meant as a unified solution for a) the definition, b) grading, and c) objective classification of severity. Nevertheless, it may help in finding more objective thresholds with more data.
In order to run the RELSA package the following additional packages are needed. Both can be loaded from CRAN: dichromat (Yes, Waichler, and Schemes 2015), FactoMineR (Lê, Josse, and Husson 2008) and dplyr (Wickham et al. 2020).
The following commands can be used in R to download and install the developer version of the RELSA package. The second command is required to build the associated vignettes (if needed).
devtools::install_github("mytalbot/relsa", build_vignettes = TRUE)
RELSA in one sentence: The RELSA score is a generalized quantification of multi-variable differences.
Variable changes are monitored concerning their baseline or physiological states, and they are stored. For this to work, input data need to have a single baseline value (e.g., the mean value of a previously run baseline period, etc.).
The severity context can be extrapolated with data from a well-defined reference set. Trivially, the extrema (min/max) of each variable in the reference set serve as ranges for the given severity context. However, there is one caveat: researchers must provide some sort of estimation about the quality of severity in the reference experiment. This assessment is one reason why the RELSA score is relative. Therefore, all variable changes in other data are mapped towards the general severity level of the reference set.
It does not matter, however, if this severity is classified as “low”, “medium” or “severe”. The relative context will shift the scale to lower or higher values, depending on the maximum escalations in the reference set. The final RELSA score is calculated from weight factors (specific RELSA effect sizes) and is regularized by the extrema in the reference set. At last, particular focus is on the fact that larger weights are more important than more minor changes in terms of severity, which is included in the final score.
Missing data are set to NA and not to zero or imputed otherwise. If set to zero, this would allow recognizing those elements in the calculations of, e.g., the mean. The result would be a skewed reflection of what happens in the data towards unimportant values. Introducing NAs allows a more fair calculation of the score regarding potential severity, as these values will not be included in the calculations. This lays more focus on actual present variables and calculates RELSA scores for data sets with lots of missing data. Thus, the RELSA score is an abstraction of what happens in the data at specific time points.
Besides providing context, the reference set has another purpose: it regularizes the possible ranges of variables. This can prove essential, as variables behave differently when animals are negatively affected. For example, a loss of 17 % in body weight is generally recognized as a rather substantial threat to animal health. At the same time, burrowing behavior may drop to zero %. Here, a difference of 17 % is equivalent to 100 % at the same time of observation. However, in other animal models, this may not be the case. By comparing the actual ranges to the reference set, this kind of bias is regularized, resulting in the weight factors.
A conclusion would be that certain parameters are lagging, others are leading, or interactions are model-dependent. The main issues here are not the parameters themselves but rather the experimental design and sampling frequency. For example, body weight is measured, e.g., once per day (i.e. in the morning) and burrowing behavior after a specific time (overnight). The sampling rates in these cases are a) not equal and b) not frequent enough to catch minute changes. Transient ranges of some variables thus appear as “all-or-nothing” parameters. They change much faster than the sampling rates so that the exact development over time cannot be seen. Although the sampling rate cannot be corrected with RELSA, the skewness in distribution can be adjusted to a certain degree by including extreme values of a training or reference model with known severity into the calculation. These values are called baseline (severity) values for the given animal model. The RELSA score itself is pretty easy to interpret. A RELSA of, e.g., 0.85 means that the variables in a test animal have reached 85 % of the (generalized) maximum deviation level of the reference set. If values are larger than 1 (100 %), this indicates higher severity than within the reference set. This is particularly interesting for data sets with different variables: some might perform better than others.
For the current version of RELSA, the tabular data must follow a specific format. It can be loaded as *.txt file in the tab-delimited format using the
relsa_load function. Variable names should be recognizable but straightforward, such as bw (body weight), body weight change (bwc), burON (burrowing overnight), etc. The variable names must match the names in the reference model. The table should have an index column (the first one) without a header. The animal id should be a short but unique identifier. The treatment column contains experimental information, e.g., about the nature of an experiment (Transmitter/Sham, etc.). The condition column includes additional information such as anesthesia, death, or any other “condition” that should be included. Currently, only two columns can be used to specify conditions for filtering. However, both can be used as specifications in the
relsa_load function to load subsets of the data.
The “day” column contains the time information of the data. This column should be of equal length for each animal in the table even if there are no data. Just leave the fields empty. If data are not entered with the same length, the
relsa_load function will take care of this (and will show a warning). The RELSA convention for baseline coding is -1. However, you can design this column differently (e.g., hours instead of days), as long as it continuously increases.
IMPORTANT: The variables in the baseline/training data must match the names in the test data, which are very likely to come from a different table.
For additional information: we provide a turorial for collecting data in small animal studies.
Further, we provide a general data format for collecting experimental data and meta information ( DB_FORm). (Talbot et al. 2020)
# This function will load RELSA raw data as a data.frame. relsa_load(file, treatment=NULL, condition=NULL)
relsa_load function is used to import training and test data. It uses the
file argument with a complete string to the sample file (including the testfile.txt). The
treatment/condition fields can be used to subsample the loaded files (e.g., if the data contains both transmitter and sham data. With
treatment="transmitter", only data from the “transmitter” experiment will be loaded (the spelling is case sensitive!).
# This function will normalize specified variables to the range [0;100]. relsa_norm(set, normthese=NULL, ontime=1)
set contains the loaded data frame obtained by
relsa_load. The specified variables in
normthese will be normalized to a range of [0;100]. Important note: variables that are not specified are not normalized! This function can also be used to normalize other data sets. For example, some variables like body weigh change (bwc) that are sometimes pre-calculated in the data tables do not need normalization and must, therefore, not be specified in the function. The field
ontime specifies the timepoint in the data set to which the data in each column should be normalized. Please note that this is not a string for the actual name of the time unit but for the absolute position (row name) in the data frame (if, e.g., baseline values (day = -1) are at position 1, and this is the very first entry in the table, ontime=1).
# This function will calculate k+1 levels from kmeans analysis of the reference data for severity classification. relsa_levels(refset, mypath, filename, drops=NULL, turns=NULL, relsaNA=NA, k=4, showScree="no", customCol=NULL, seed=123, myYlim=c(0,1.4), saveTiff="yes")
refset is the normalized reference data frame. With the arguments
filename the path and name of the final *.tiff file can be specified.
turns determines the number of clusters in the data. The exact number must be found heuristically by using the internalized Scree plot. If
showScree is set to “yes”, the Scree plot is shown.
customCol can be used to specify custom colors for the clusters. The color vector should be as long as the number of clusters.
seed controls the random seeding and can be turned off with NULL. With
myYlim the range of the y-axis can be controlled, and
saveTiff controls whether the final *.tiff file shall be plotted.
# This function will establish baseline values for RELSA calculation. relsa_baselines(dataset=NULL, bslday=-1, variables=NULL, turnvars=NULL)
relsa_baselines function builds baseline values from the provided reference data. It may be necessary to load and normalize the data with the two previously mentioned functions.
dataset should contain only normalized values (range [0;100]).
bslday points to the actual value of the time column that is to be used as baseline value (the default value is day = -1).
variables contains a string with the actual variables for which RELSA scores shall be calculated. The test variables in the later test set should be included here as well. Otherwise, they will not be calculated, or the function will produce an error. The field
turnvars is important for variables that show increasing values under duress instead of variables showing a decrease in values such as body weight. These variables can be specified as such:
turnvars=c("hr","temp") etc. The function’s output is twofold: a) a matrix for each variable in the model with 100 % as maximum normalized value and b) a vector with maximum/minimum reached variable values (not normalized!) in the training model.
# This function will load the default transmitter implantation / post-op reference model relsa_reference()
No further objects can be specified. The output is a list with bsl as [] and the overall RELSA levels as []. If more specific variations for the reference model are needed, the
relsa_load function should be used. This function uses the internal postop data set.
# This is the actual RELSA function. relsa (set, bsl, a=1, drop=NULL, turnvars=NULL, relsaNA=0 )
relsa accepts normalized (test data) from the
relsa_norm function for analysis. However, it is possible to assess the training model as well.
bsl is the object obtained from the
relsa_baselines function for model referencing.
a is the number of unique animals/animals in the set.
drop can be used to exclude variables that are contained in the training model but shall be excluded from further analysis (e.g. drop=c(“bw”)).
turnvars is again used to indicate variables with switched orientation. Sometimes calculations produce NaN.
relsaNA determines how to substitute these values (default is 0).
# This function can be used to plot the RELSA score. relsa_plot (set=NULL, RELSA, levels=NULL, animal=1, plotvar=NULL, plotRELSA=TRUE, myylim=c(70,110), myYlim=c(0,2), mypch=1, mycol="red")
set contains the loaded data frame obtained by
RELSA the output from the
relsa function. The
levels argument accepts the level output from the
relsa_levels function. The specific animal to be plotted is defined with the unique animal number in
animal. Each variable can be plotted individually by controlling the variable name in the
plotvar field (e.g. plotvar=“bwc”). With the
plotRELSA argument, the RELSA score is plotted in addition to the variable.
myylim controls the range of the first Y-axis on the left side of the plot and
myYlim for the RELSA score on the left side if plotvar is not NULL. The pch argument controls the dot type (default is 1).
# This function will provide extensive RELSA analysis and the outputs for further research. relsa_wrapper(querydata, bsl = NULL, levels = levels, treatment = NULL, condition = NULL, normthese = NULL, turnsQuery = NULL, dropsQuery = NULL, animalnr = NULL, ymax = 1.2, pcadims = 2, studylabel = NULL, severity = NULL, colorlabel = NULL)
querydata requires a path including the *.txt filename with the raw data in the RELSA format.
bsl is the field for the baseline information from either the
levels contain the specific RELSA levels from the k-means clustering of the reference model. These values can be changed here (this is not recommended). The
treatment/condition fields can subsample the loaded data. The
normthese field defines the variables that need normalisation.
turnsQuery/dropsQuery code for the variables in the querydata that need “turning” or that shall be dropped from the analysis. This can exclude data columns that have no representation in the reference model. After the wrapper function has finished, an example animal is shown, specified with
ymax codes the maximum range of the y-axis.
pcadims codes the maximum number of principal components are analyzed for variable contributions. A custom
studylabel can be assigned to the output data frames. A custom
severity level can be assigned to the output data frames. A custom
colorlabel can be assigned to the output data frames.
Uses the internal
library(RELSA) # Build model ------------------------------------------------------------- raw <- surgery vars <- c("bwc", "burON", "hr", "hrv", "temp", "act") turnvars <- c("hr","temp") pre <- relsa_norm(raw, normthese = c("hr", "hrv", "temp", "act")) refmodel <- relsa_reference() # Test model -------------------------------------------------------------- animal <- 1 RELSA <- relsa(set = pre, bsl=refmodel[], a = animal, drop=NULL, turnvars = turnvars) head(RELSA$relsa$rms)
Lê, Sébastien, Julie Josse, and François Husson. 2008. “FactoMineR: A Package for Multivariate Analysis.” Journal of Statistical Software 25 (1): 1–18. https://doi.org/10.18637/jss.v025.i01.
Talbot, Steven R, Stefan Bruch, Fabian Kießling, Michael Marschollek, Branko Jandric, René H Tolba, and André Bleich. 2020. “Design of a Joint Research Data Platform: A Use Case for Severity Assessment.” Laboratory Animals 54 (1): 33–39. https://doi.org/10.1177/0023677219872228.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2020. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.
Yes, Lazyload, Scott Waichler, and Color Schemes. 2015. “Package ‘ dichromat ’.”