How to find the selected isoforms
To find information on selected isoforms, please follow these instructions:
FastOMA/Nextflow provides progress updates as it runs. If you've already saved the output, you can use grep to extract the relevant information:
$ grep "infer_roothogs " fastoma_log | tail -3
[xx/yyy] process > infer_roothogs (1) [100%] 1 of 1 ✔
[xx/yyy] process > infer_roothogs (1) [100%] 1 of 1 ✔
Each run has an id of the form xx/yyy, which is the beginning of the path of the relevant directory in the work folder. The work folder sits in the same folder as FastOMA's output folder (fastoma_output_dir).
Use the id from your own output instead of xx/yyy, then cd into that folder. Type the start of the path and press Tab so that the terminal autocompletes the rest of the folder name:
$ cd work/94/930a
Pressing Tab expands this to:
$ cd work/94/930abc390e3b83a310bb5bdbcbdbd9
You should then be able to see the selected_isoforms folder:
$ ls selected_isoforms/
ARTHA_selected_isoforms.tsv SOLCW_selected_isoforms.tsv TS117_selected_isoforms.tsv
$ head -n 3 selected_isoforms/TS222_selected_isoforms.tsv
Sopim_TS222_01T000001.1 Sopim_TS222_01T000001.1
Sopim_TS222_01T000002.1 Sopim_TS222_01T000002.1
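The log-to-directory steps above can also be sketched as a small shell function. This is a minimal sketch, not part of FastOMA: the function name find_roothogs_dir is hypothetical, and it assumes the saved log is a plain-text file whose infer_roothogs lines start with the id in square brackets, as in the example output above.

```shell
# Sketch: resolve the Nextflow work directory of infer_roothogs from a saved log.
# Assumptions: log lines look like "[xx/yyy] process > infer_roothogs ...",
# and find_roothogs_dir is a hypothetical helper name, not a FastOMA command.
find_roothogs_dir() {
    log_file="$1"    # saved FastOMA/Nextflow log
    work_dir="$2"    # path to the Nextflow work folder
    # Take the last infer_roothogs line and keep the text between the first [ and ].
    task_id=$(grep "infer_roothogs " "$log_file" | tail -n 1 \
        | sed -E 's/^[^[]*\[([^]]+)\].*/\1/')
    # Stop if the log has no infer_roothogs line.
    [ -n "$task_id" ] || return 1
    # The id is only a prefix of the directory name, so let the shell expand it.
    for dir in "$work_dir"/"$task_id"*/; do
        [ -d "$dir" ] && printf '%s\n' "${dir%/}"
    done
}

# Example use, run from the folder that contains work:
# ls "$(find_roothogs_dir fastoma_log work)/selected_isoforms/"
```

The glob plays the same role as pressing Tab: it completes the short id from the log to the full directory name.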
If you don't have the FastOMA log, you can still locate the relevant directory by running find inside the work folder:
find . -name .command.sh | xargs grep "fastoma-infer-roothogs"
The output of this command shows the directory inside work that includes selected_isoforms. There might be a few hits; you can ls each one and check which one has the selected_isoforms folder.
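As an alternative sketch, you can search for the selected_isoforms folder by name instead of checking each hit by hand. The function name list_isoform_dirs is hypothetical, and the sketch assumes selected_isoforms is a real directory sitting directly inside the task directory, as in the ls example above.

```shell
# Sketch: list task directories in the work folder that contain selected_isoforms.
# Assumption: selected_isoforms is a real directory directly inside the task directory;
# list_isoform_dirs is a hypothetical helper name, not a FastOMA command.
list_isoform_dirs() {
    work_dir="${1:-work}"   # path to the Nextflow work folder (defaults to ./work)
    # Print the parent directory of every selected_isoforms directory found.
    find "$work_dir" -type d -name selected_isoforms | while IFS= read -r hit; do
        dirname "$hit"
    done
}

# Example use, run from the folder that contains work:
# list_isoform_dirs work
```

Because -type d matches only real directories, task directories that merely hold a symbolic link to the folder are not listed.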
Regarding the number of gained genes, we use the pyham package for phylostratigraphy. The count of gained genes also includes all genes that were not mapped to any group, as well as isoforms that were not selected. Note that this only affects the extant species level; counts at ancestral levels (internal nodes) are accurate.