Project in Molecular Life Science: a protein structure predictor using an SVM.
SVM models
Datasets, whole and reduced
Performs cross-validation with different window sizes
Performs cross-validation and creates a confusion matrix figure
Trains the final single-sequence information SVM
Trains the final multiple-sequence information SVM
SVM and random forest functions with cross-validation
Model preparation (functions only)
Checks different values of the C parameter
Polynomial kernel
Takes the model and predicts the topology of a FASTA sequence
Cross-validation for the multiple-sequence information SVM
Creates frequency matrices from the PSSM
RBF kernel
Splits the globular and transmembrane (TM) sequences into two files, then takes one part TM to two parts globular
Runs random forest
Plots graphs
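
The window sizes mentioned above refer to how many neighbouring residues are encoded around each position before training. A minimal sketch of that idea follows. It assumes one-hot encoding over the 20 standard amino acids with zero padding at the sequence ends. The names AMINO_ACIDS, one_hot and window_features are hypothetical and are not taken from the project scripts, which may encode sequences differently.

```python
# Hypothetical helper names; the encoding scheme is an assumption, not the project's code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Return a 20-element 0/1 vector; padding and unknown residues give all zeros."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def window_features(sequence, window_size):
    """Encode every residue together with its neighbours in a centred window."""
    if window_size < 1 or window_size % 2 == 0:
        raise ValueError("window_size must be a positive odd number")
    half = window_size // 2
    # Pad both ends so that the first and last residues also get a full window.
    padded = "-" * half + sequence + "-" * half
    features = []
    for i in range(len(sequence)):
        vector = []
        for residue in padded[i:i + window_size]:
            vector.extend(one_hot(residue))
        features.append(vector)
    return features


if __name__ == "__main__":
    feats = window_features("MKT", 3)
    # One feature vector per residue, each of length window_size * 20.
    print(len(feats), len(feats[0]))
```

Each feature vector has window_size * 20 entries, so trying a different window size in cross-validation only changes the window_size argument. The resulting vectors could then be passed to any SVM or random forest implementation.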