Parallel computing

*The Advanced section is only available in English.

Parallel alignment of FASTQs

Read alignment and annotation are typically the most resource-intensive steps of the analysis, so running transcriptome alignment in parallel is especially worthwhile. Whether alignment runs in parallel depends on the available memory and the number of threads assigned to the analysis, in both local mode and cluster mode.

The number of parallel tasks is the smaller of [maximum free memory / memory required for an alignment subtask] and [threads set for the run / threads required for an alignment subtask], rounded down to a whole number.
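The rule above can be sketched as a small shell function. This is only an illustration of the arithmetic, not part of SAW: the function name `parallel_tasks`, its argument order, and the sample numbers are all assumptions made for this example.

```shell
# Hypothetical helper, not a SAW command: estimates how many alignment
# subtasks can run at the same time under the rule described above.
# Arguments: free memory (GB), memory per subtask (GB),
#            threads set for the run, threads per subtask.
parallel_tasks() {
  by_mem=$(( $1 / $2 ))        # integer division rounds down
  by_threads=$(( $3 / $4 ))
  # The smaller of the two limits decides the degree of parallelism.
  if [ "$by_mem" -lt "$by_threads" ]; then
    echo "$by_mem"
  else
    echo "$by_threads"
  fi
}

# Illustrative numbers only: 256 GB free, 64 GB per subtask,
# 48 threads for the run, 16 threads per subtask -> min(4, 3)
parallel_tasks 256 64 48 16   # prints 3
```

A result of 1 means the alignment subtasks run one after another.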

Resource settings

For instance, suppose a SAW count run is started with 150 GB of memory and 24 threads:

#analysis script
saw count \
...
--id=FASTQ_parallel_alignment \
--memory=150 \
--threads-num=24 \
...

The resource configuration file, either one generated specifically for the run or the default /saw-8.2.X/config/resources.yaml, sets the resources for each alignment subtask:

#resource configuration yaml
version: "1.0"

global:
  scheduler: "sge"
  default_setting:
    max_retries: 0

schedulers: 
  sge:
    queue: " "

rules:
...
  alignment:  #read alignment
    threads: 16
    mem_gb: null  ##assume the memory consumption of a subtask is 60GB
    retry_factor: 1.5
...

The memory required to align a pair of FASTQs does not need to be specified. The program calculates it automatically from the number of CID sites in the Stereo-seq chip mask.

Assume that aligning one pair of FASTQs, that is, one read alignment subtask, requires 60 GB of memory. The number of parallel tasks is then the minimum of [150 GB / 60 GB] = 2.5 and [24 / 16] = 1.5, which leaves room for only one task at a time. Under these settings, the alignment can only run serially.

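The most straightforward way to enable parallel alignment is to modify the resource settings. Either give the run more threads in the analysis script, or keep --threads-num=24 and reduce the threads for a single alignment subtask in the resource configuration file: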

#analysis script
#memory is enough; give more threads, 32/16=2
saw count \
...
--id=FASTQ_parallel_alignment \
--memory=150 \
--threads-num=32 \
...

#resource configuration yaml
version: "1.0"

global:
  scheduler: "sge"
  default_setting:
    max_retries: 0

schedulers: 
  sge:
    queue: " "

rules:
...
  alignment:  #read alignment
    threads: 12   #reduce the number of threads for a single task, 24/12=2
    mem_gb: null  ##assume the memory consumption of a subtask is 60GB
    retry_factor: 1.5
...
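As a quick sanity check, the three settings discussed above can be compared with a few lines of shell. This is a hypothetical sketch rather than a SAW command, and it relies on the same assumption as the text: each alignment subtask needs 60 GB of memory.

```shell
# Hypothetical check, not part of SAW: applies the "smaller of the two
# ratios" rule to each setting, assuming 60 GB per alignment subtask.
mem_gb=150
mem_per_task=60

# Each entry is "<threads set for the run> <threads per subtask>".
for setting in "24 16" "32 16" "24 12"; do
  set -- $setting              # $1 = run threads, $2 = threads per subtask
  by_mem=$(( mem_gb / mem_per_task ))
  by_threads=$(( $1 / $2 ))
  if [ "$by_mem" -lt "$by_threads" ]; then
    tasks=$by_mem
  else
    tasks=$by_threads
  fi
  echo "threads-num=$1, threads per subtask=$2 -> $tasks parallel task(s)"
done
```

Under these assumptions, the original setting gives one task, while both modified settings give two. In both modified cases the memory limit (150 / 60, rounded down to 2) also caps the result, so adding even more threads would not help without more memory.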
