Use all labels for pseudobulk model fitting or split labels into separate model fits? #13

micahpf · 2024-05-16T20:09:16Z

Originally, I had been implementing the edgeR-voom-limma pseudobulk pipeline by splitting each cell type label into a separate DGEList object and fitting a model to each one independently.
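For reference, a minimal sketch of the per-label approach. The counts are simulated and all object names (`meta`, `counts`, `fit_one_label`), the two labels, and the two-condition design are made up for illustration; this is not the actual pipeline code. It assumes edgeR and limma are installed.

```r
library(edgeR)  # loading edgeR also attaches limma

set.seed(1)
# Hypothetical pseudobulk data: 200 genes x 12 columns (6 samples x 2 labels).
n_genes <- 200
meta <- expand.grid(sample = paste0("s", 1:6), label = c("A", "B"),
                    stringsAsFactors = FALSE)
meta$condition <- factor(ifelse(meta$sample %in% paste0("s", 1:3), "ctrl", "trt"))
counts <- matrix(rnbinom(n_genes * nrow(meta), mu = 100, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("g", seq_len(n_genes)),
                                 paste(meta$sample, meta$label, sep = "_")))

# Fit one voom-limma model using only the columns of a single label.
fit_one_label <- function(lab) {
  keep_cols <- meta$label == lab
  m <- meta[keep_cols, ]
  y <- DGEList(counts[, keep_cols])
  design <- model.matrix(~ condition, data = m)
  # Filtering and normalization are done within the label only.
  y <- y[filterByExpr(y, design), , keep.lib.sizes = FALSE]
  y <- calcNormFactors(y)
  v <- voom(y, design)  # mean-variance trend estimated per label
  fit <- eBayes(lmFit(v, design))
  topTable(fit, coef = "conditiontrt", number = Inf)
}

labels <- unique(meta$label)
results <- setNames(lapply(labels, fit_one_label), labels)
```

Each element of `results` is a per-label table of trt vs ctrl statistics. Because each label gets its own filtering, normalization, and mean-variance trend, nothing here supports a direct cross-label contrast.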

More recently, the UM5 folks asked to see contrasts across labels, so I implemented a version that creates a single DGEList including all cell type labels and fits a single model, so that I could pull out cross-label contrasts directly. This seems to be the default assumption of the edgeR::Seurat2PB function, which creates a single DGEList with a column for each sample x label combination.
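A minimal sketch of the single-model version, again on simulated counts. In practice the DGEList would come from edgeR::Seurat2PB rather than being built by hand; the `group` factor, the label and condition names, and the contrast names below are assumptions for illustration only.

```r
library(edgeR)  # loading edgeR also attaches limma

set.seed(1)
# Hypothetical pseudobulk data: one column per sample x label combination.
n_genes <- 200
meta <- expand.grid(sample = paste0("s", 1:6), label = c("A", "B"),
                    stringsAsFactors = FALSE)
meta$condition <- ifelse(meta$sample %in% paste0("s", 1:3), "ctrl", "trt")
# One group per label x condition combination: A_ctrl, A_trt, B_ctrl, B_trt.
meta$group <- factor(paste(meta$label, meta$condition, sep = "_"))
counts <- matrix(rnbinom(n_genes * nrow(meta), mu = 100, size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("g", seq_len(n_genes)),
                                 paste(meta$sample, meta$label, sep = "_")))

# Single DGEList and a single fit across all labels.
y <- DGEList(counts)
design <- model.matrix(~ 0 + group, data = meta)
colnames(design) <- levels(meta$group)
y <- y[filterByExpr(y, design), , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
v <- voom(y, design)  # one mean-variance trend shared by all labels
fit <- lmFit(v, design)

# Within-label and cross-label contrasts come from the same fit.
contr <- makeContrasts(
  trt_in_A    = A_trt - A_ctrl,
  A_vs_B_ctrl = A_ctrl - B_ctrl,
  interaction = (A_trt - A_ctrl) - (B_trt - B_ctrl),
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, contr))
cross_label <- topTable(fit2, coef = "A_vs_B_ctrl", number = Inf)
```

The convenience is that `A_vs_B_ctrl` and `interaction` fall out of one fit, but the filtering, normalization, and mean-variance trend are all shared across labels.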

However, the OSCA multi-sample book recommends against using all labels for model fitting:

We do not use all labels for GLM fitting as the strong DE between labels makes it difficult to compute a sensible average abundance to model the mean-dispersion trend. Moreover, label-specific batch effects would not be easily handled with a single additive term in the design matrix for the batch. Instead, we arbitrarily pick one of the labels to use for this demonstration.

Note that some of the authors of OSCA were co-developers of edgeR, so I tend to trust their judgement on best practices.

So I think we should default back to fitting each cell type label separately. Cross-label comparisons may need to be made in other ways.

I may post a Bioconductor forum question about this later. Just leaving this here for future reference.
