KGïA: Leveraging Disease-Specific Topologies and Counterfactual Relationships in Knowledge Graphs for Inductive Reasoning in Drug Repurposing
This code generates the validation (valid.txt) and test (test.txt) splits for the semi-inductive and transductive settings, given a train split (train.txt).
python split_KG.py -kg_filepath 'biomedical_KG.txt' -train_filepath 'path/to/your/train_split/train.txt'
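As a sanity check on the generated splits, the sketch below lists evaluation entities that never occur in train.txt. It assumes the usual definitions (a transductive split has no unseen entities, a semi-inductive split has some) and assumes each line holds one tab-separated head, relation, tail triple. Neither the file format nor the helper names `read_triples` and `unseen_entities` come from split_KG.py; they are illustrative only.

```python
from pathlib import Path


def read_triples(path):
    """Read (head, relation, tail) triples; tab-separated lines are an assumed format."""
    triples = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        parts = line.strip().split("\t")
        if len(parts) == 3:  # skip blank or malformed lines
            triples.append(tuple(parts))
    return triples


def unseen_entities(train_triples, eval_triples):
    """Return entities in eval_triples that never occur in train_triples."""
    seen = {e for h, _, t in train_triples for e in (h, t)}
    evaluated = {e for h, _, t in eval_triples for e in (h, t)}
    return evaluated - seen
```

For example, `unseen_entities(read_triples("train.txt"), read_triples("test.txt"))` should come back empty for a transductive split and non-empty for a semi-inductive one.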
Generate augmentation edges for train.txt using different settings of treatment/manipulation, applicable for both transductive and semi-inductive settings. This step introduces counterfactual links for triples with a TREAT relation. You can modify the treatment settings as needed.
python main.py --dataset pubmed --metric auc --alpha 1 --beta 1 --gamma 30 --lr 0.1 --embraw mvgrl --t kcore --neg_rate 40 --jk_mode mean --batch_size 12000 --epochs 200 --patience 50 --trail 20
Make sure edges_f_t0.npy, edges_f_t1.npy, and int_to_entity.pkl are in a directory named after your chosen treatment.
python augment.py -treatment kcore
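Before running augment.py, it can help to confirm that the three files are where they are expected. The sketch below is a minimal check, not part of the repository: the helper name `missing_treatment_files` is hypothetical, and it assumes the treatment directory sits under the current working directory unless `root` says otherwise.

```python
from pathlib import Path

# File names taken from the instructions above.
REQUIRED_FILES = ("edges_f_t0.npy", "edges_f_t1.npy", "int_to_entity.pkl")


def missing_treatment_files(treatment, root="."):
    """Return the required file names absent from <root>/<treatment>/."""
    directory = Path(root) / treatment  # directory location is an assumption
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]
```

Calling `missing_treatment_files("kcore")` returns an empty list when everything is in place, and otherwise names the files that still need to be copied.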
Ensure that your data splits are stored under ULTRA/datasets.
To run a case study that annotates each triple in your test file with its score and rank, add the argument -this_is_a_case_study to the command.
python script/run.py -c config/inductive/inference.yaml --dataset GPKG_SInd --version 'na' --epochs 0 --bpe null --gpus [0] --ckpt /home/cxo147/ULTRA_PDR/ckpts/ultra_50g.pth
python script/run.py -c config/transductive/inference.yaml --dataset GPKG_T --epochs 0 --bpe null --gpus [0] --ckpt /home/cxo147/ULTRA_PDR/ckpts/ultra_50g.pth
python -m torch.distributed.launch --nproc_per_node=4 script/run.py -c config/inductive/inference.yaml --dataset GPKG_FI --version 'na' --epochs 10 --bpe 100 --gpus [0,1,2,3] --ckpt /home/cxo147/ULTRA_PDR/ckpts/ultra_50g.pth
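To illustrate what a per-triple score and rank mean in the case study setting, the sketch below ranks a true triple against a set of candidate scores. It is a generic illustration under stated assumptions, not the logic in script/run.py: the function name `rank_of` is hypothetical, and the tie-handling convention (only strictly higher scores push the rank down) is one common choice that may differ from the one the model code uses.

```python
def rank_of(true_score, candidate_scores):
    """Rank of the true triple: 1 plus the number of candidates scoring strictly higher.

    Ties do not worsen the rank here; this convention is an assumption.
    """
    return 1 + sum(1 for score in candidate_scores if score > true_score)


def reciprocal_rank(true_score, candidate_scores):
    """Reciprocal of the rank, the per-triple term averaged in MRR."""
    return 1.0 / rank_of(true_score, candidate_scores)
```

A rank of 1 means no candidate outscored the true triple; larger ranks mean the model preferred other candidates.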