Parallelisation #3

vincr04 · 2020-02-27T09:41:51Z

Hi, thanks for developing IKAP, which seems to be useful software. I am running it on a large Seurat object and am wondering if there are ways to make it quicker (e.g. parallelisation)? Any hint on how to do this would be really appreciated!

hummuscience · 2020-04-27T17:22:21Z

Since IKAP uses some functions from Seurat, I guess parallelization can be done the same way Seurat does it.

According to the Seurat documentation, the following functions can use parallelization:
NormalizeData
ScaleData
JackStraw
FindMarkers
FindIntegrationAnchors
FindClusters - if clustering over multiple resolutions

IKAP uses ScaleData and FindAllMarkers from Seurat, both of which would benefit from parallel processing (FindAllMarkers calls FindMarkers internally). Data scaling is done at the beginning, so parallelizing it probably doesn't change much. FindAllMarkers is probably called multiple times, so I guess it would benefit a lot from parallelization.

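To set up parallel processing, use the future package:

library(future)
# "multiprocess" is deprecated in recent versions of future; "multisession" is its replacement
plan("multisession", workers = 4)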
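If you don't want to hard-code the number of workers, something like the following might work. It is only a sketch: it assumes the multisession backend of future, and the choice to leave one core free is my own, not something IKAP or Seurat requires. The variable name n_workers is just for illustration.

```r
library(future)

# Number of cores this R session may use, minus one so the machine stays responsive.
# Leaving one core free is an assumption, adjust it to your setup.
n_workers <- max(1, availableCores() - 1)

# Start background R sessions as workers.
plan("multisession", workers = n_workers)

# Check how many workers the current plan provides.
nbrOfWorkers()

# ... run the Seurat / IKAP steps here ...

# Switch back to sequential processing when done, which shuts the workers down.
plan("sequential")
```

Keep in mind that each worker is a separate R session that receives its own copy of the exported data, so more workers also means more memory.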

If you get an error like this:

Error in getGlobalsAndPackages(expr, envir = envir, globals = TRUE) : 
  The total size of the X globals that need to be exported for the future expression ('FUN()') is X GiB. This exceeds the maximum allowed size of 500.00 MiB (option 'future.globals.maxSize'). The X largest globals are ...

You need to raise the future.globals.maxSize option (the value is in bytes), for example to 1000 MiB:
options(future.globals.maxSize = 1000 * 1024^2)

Be careful: setting this too high might eat up too much RAM and crash R.
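Rather than guessing a limit, you could derive it from the size of the object you are working with. This is a rough sketch using only base R: the matrix below is a hypothetical stand-in for your Seurat object, and the 1.5x headroom factor is an arbitrary assumption.

```r
# Hypothetical stand-in for a large object; replace with your own Seurat object.
big_object <- matrix(0, nrow = 2000, ncol = 1000)

# Size of the object in bytes.
size_bytes <- as.numeric(object.size(big_object))

# Allow some headroom on top of the object size (1.5x is an arbitrary choice).
limit_bytes <- ceiling(size_bytes * 1.5)

# future.globals.maxSize is given in bytes.
options(future.globals.maxSize = limit_bytes)

# Print both values in MiB for a quick sanity check.
cat("object size (MiB):", round(size_bytes / 1024^2, 1), "\n")
cat("limit (MiB):", round(limit_bytes / 1024^2, 1), "\n")
```

The limit applies to what is exported for each future, so the total memory used can still grow with the number of workers.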

hummuscience · 2020-06-14T16:35:48Z

Another option would be to use Snakemake to run the workflow.
