Lab Meeting
Fixed and Random effects
Different types of effect
lm(y ~ treatment + batch)
design(dds) <- ~ treatment + batch
treatment can be one of a pre-chosen set of options
batch takes arbitrary labels that could go on indefinitely
Random and Fixed effects
For treatment we’re interested in expected values (of treated and control, and their difference). FIXED effect
For batch we’re less interested in the individual expected values. We probably assume that they’re zero on average, but might be interested in their variance - how much ‘noise’ they’re introducing. RANDOM effect
Notation
lm(y ~ treatment + batch)
library(lme4)
lmer(y ~ treatment + (1|batch))
These are pretty much equivalent, the latter saying ‘allow a different intercept (constant term) for each batch’.
The former would phrase results in terms of individual treatment effects and individual batch effects.
The latter would make the ‘noise due to batch’ the primary piece of information about batch; treatment is reported as usual.
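A minimal sketch of the two phrasings on simulated data. The batch count, effect sizes and noise levels are invented for illustration and do not come from a real experiment; lme4 is assumed to be installed.

```r
library(lme4)

set.seed(1)
n_batch <- 8    # hypothetical number of batches
n_per   <- 10   # hypothetical samples per batch (5 control, 5 treated)
d <- data.frame(
  batch     = factor(rep(seq_len(n_batch), each = n_per)),
  treatment = factor(rep(c("control", "treated"), times = n_batch * n_per / 2))
)
batch_shift <- rnorm(n_batch, sd = 2)   # simulated per-batch intercept shifts
d$y <- 10 + 3 * (d$treatment == "treated") +
  batch_shift[as.integer(d$batch)] + rnorm(nrow(d), sd = 1)

fit_fixed  <- lm(y ~ treatment + batch, data = d)
fit_random <- lmer(y ~ treatment + (1 | batch), data = d)

coef(fit_fixed)       # treatment effect plus one coefficient per non-reference batch
fixef(fit_random)     # intercept and treatment effect only
VarCorr(fit_random)   # batch summarised as a standard deviation, next to the residual
```

Because this simulated design is balanced, the two fits should agree on the treatment effect; they differ in how batch is reported.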
Quick win
Different batches have different uptake of treatment
lm(y ~ treatment * batch)
lmer(y ~ treatment + (treatment|batch))
First model’s treatment coefficient is the treatment effect in the (arbitrary) batch₀
Second model’s treatment coefficient is for the ‘average’ batch.
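The difference between the two treatment coefficients can be checked on simulated data. This is a sketch only: the number of batches, the spread of batch-specific treatment effects and the noise levels are assumptions made up for the example.

```r
library(lme4)

set.seed(2)
n_batch <- 12   # hypothetical number of batches
n_per   <- 20   # hypothetical samples per batch (10 control, 10 treated)
d <- expand.grid(
  rep       = seq_len(n_per / 2),
  treatment = factor(c("control", "treated")),
  batch     = factor(seq_len(n_batch))
)
b          <- as.integer(d$batch)
intercepts <- rnorm(n_batch, mean = 0, sd = 1)     # simulated batch baselines
slopes     <- rnorm(n_batch, mean = 3, sd = 1.5)   # simulated batch-specific treatment effects
d$y <- 10 + intercepts[b] + slopes[b] * (d$treatment == "treated") +
  rnorm(nrow(d), sd = 1)

fit_int <- lm(y ~ treatment * batch, data = d)
fit_rs  <- lmer(y ~ treatment + (treatment | batch), data = d)

coef(fit_int)[["treatmenttreated"]]   # treatment effect in the reference batch (batch 1)
fixef(fit_rs)[["treatmenttreated"]]   # treatment effect for the 'average' batch
VarCorr(fit_rs)                       # how much baseline and treatment effect vary by batch
```

Changing which batch is the reference level (for example with relevel) changes the first number but not the second.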
Generalises
Most experiments have the same number of samples (fastqs) as biological units (mice). But frequently this is not the case:
Left and right half of brain
Three arbitrary regions per tissue per mouse
Patient follow-up visits.
We believe patients may vary (in baseline, and in response?). But we’re interested in them as a whole, not individually.
Formulae
lmer(y ~ treatment + side + (1|mouse))
# lmer(y ~ treatment + side + (side|mouse))
lmer(y ~ treatment + (1|mouse))
#lmer(y ~ treatment + (1|mouse/region))
lmer(y ~ gender + time*Arm + (1|patient))
#lmer(y ~ gender + rcs(time,3)*Arm + (time|patient))
So we still need to think about how to specify the model!
But the language helps us focus on the true role of predictors.
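As a sketch of the left/right brain case, the first formula can be tried on simulated data. The mouse count and all effect sizes below are invented for illustration.

```r
library(lme4)

set.seed(3)
n_mouse <- 10   # hypothetical number of mice
mice <- data.frame(
  mouse     = factor(seq_len(n_mouse)),
  treatment = factor(rep(c("control", "treated"), each = n_mouse / 2))
)
# Two samples (left and right) per mouse: 20 samples, 10 biological units.
d <- merge(mice, data.frame(side = factor(c("left", "right"))))
mouse_shift <- rnorm(n_mouse, sd = 1.5)   # simulated mouse-to-mouse baseline variation
d$y <- 5 + 2 * (d$treatment == "treated") + 0.5 * (d$side == "right") +
  mouse_shift[as.integer(d$mouse)] + rnorm(nrow(d), sd = 0.5)

fit <- lmer(y ~ treatment + side + (1 | mouse), data = d)
fixef(fit)     # treatment and side effects
VarCorr(fit)   # between-mouse SD alongside the residual SD
ngrps(fit)     # number of mice, as opposed to nobs(fit) samples
```

Here treatment varies between mice and side varies within a mouse; the (1|mouse) term is what tells the model that the two halves of one brain are not independent samples.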
Why bother
Should let us auto-rephrase those nasty gender:recoded_patient + gender:visit + gender formulae that DESeq2 forces upon us.
More ‘intelligent’ plots: we don’t need a legend with 73 patients in ‘different’ colours (but we might like to know which genes are patient-variable).
Don’t suggest fitting the models using lme4:: (you need lots of instances, each with lots of measurements) - but it might be a handy sanity check occasionally.
Package of the week
gt and gtsummary - like ggplot, but for tables. Static table
generation in rmarkdown, with pipe-able formatting, sorting and
margins.
gtsummary gives a ‘Table 1’ summary of RCT/observational data to
give a quick overview of the properties of a data-frame, perhaps
stratified by a subset of columns.
It can also give the much-maligned ‘Table 2’ assessing the association of each independent variable with the outcome. Don’t ‘conclude’ anything from these.
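A minimal ‘Table 1’ sketch, assuming gtsummary and gt are installed. It uses the trial example data set that ships with gtsummary; the choice of columns and the title text are arbitrary.

```r
library(gtsummary)

# `trial` is an example data set shipped with gtsummary.
tbl1 <- tbl_summary(
  trial,
  by      = trt,                      # stratify columns by treatment arm
  include = c(age, grade, response)   # variables to summarise
)
tbl1 <- add_overall(tbl1)             # add a column covering all patients

# Convert to a gt object for further formatting, e.g. a title.
gt_tbl <- as_gt(tbl1)
gt_tbl <- gt::tab_header(gt_tbl, title = "Table 1: patient characteristics")
```

In an rmarkdown document, printing gt_tbl in a chunk renders the static table.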