quantms runs #550

Draft
wants to merge 9 commits into
base: main

enryH
Member

@enryH enryH commented Jan 25, 2025

  • downstream ion parsing does not work yet

@enryH
Member Author

enryH commented Jan 25, 2025

try:
    input_df = load_input_file(input_file_loc, input_format)
except pd.errors.ParserError as e:
    raise ParseError(
        f"Error parsing {input_format} file, please make sure the format is correct and the correct software tool is chosen: {e}"
    )
except Exception as e:
    raise ParseSettingsError(f"Error parsing the input file: {e}")

# Parse settings file
try:
    parse_settings = ParseSettingsBuilder(
        parse_settings_dir=self.parse_settings_dir, module_id=self.module_id
    ).build_parser(input_format)
except KeyError as e:
    raise ParseSettingsError(f"Error parsing settings file for parsing, settings seem to be missing: {e}")
except FileNotFoundError as e:
    raise ParseSettingsError(f"Could not find the parsing settings file: {e}")
except Exception as e:
    raise ParseSettingsError(f"Error parsing settings file for parsing: {e}")

try:
    standard_format, replicate_to_raw = parse_settings.convert_to_standard_format(input_df)
except KeyError as e:
    raise ConvertStandardFormatError(f"Error converting to standard format, key missing: {e}")
except Exception as e:
    raise ConvertStandardFormatError(f"Error converting to standard format: {e}")

# calculate quantification scores
try:
    quant_score = QuantScores(
        self.precursor_name, parse_settings.species_expected_ratio(), parse_settings.species_dict()
    )
except Exception as e:
    raise QuantificationError(f"Error generating quantification scores: {e}")

# generate intermediate data structure
try:
    intermediate_data_structure = quant_score.generate_intermediate(standard_format, replicate_to_raw)
except Exception as e:
    raise IntermediateFormatGenerationError(f"Error generating intermediate data structure: {e}")

Reading currently works, and I get as far as generating the intermediate output (the last step above), but that step then fails with proteobench.exceptions.IntermediateFormatGenerationError: Error generating intermediate data structure: "['precursor ion'] not in index"

@RobbinBouwmeester any idea why?
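
For context, the quoted message has the shape of the KeyError that pandas raises when a list-based column selection includes a column that is absent. Below is a minimal sketch that reproduces it. The toy frame, its column names ("Raw file", "Intensity"), the helper select_columns, and the use of RuntimeError as a stand-in for IntermediateFormatGenerationError are all assumptions for illustration, not proteobench code.

```python
import pandas as pd

# Toy frame standing in for the standard format; column names are assumptions.
df = pd.DataFrame({"Raw file": ["run_1", "run_2"], "Intensity": [1.0, 2.0]})


def select_columns(frame: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Select columns; wrap the pandas KeyError but keep it as __cause__."""
    try:
        return frame[columns]
    except KeyError as e:
        # RuntimeError is a hypothetical stand-in for the project's own exception.
        raise RuntimeError(f"Error generating intermediate data structure: {e}") from e


try:
    # "precursor ion" was never created, so the selection fails.
    select_columns(df, ["Raw file", "precursor ion"])
except RuntimeError as err:
    message = str(err)
    cause = err.__cause__
```

Using "raise ... from e" keeps the original KeyError attached to the wrapped exception, which makes it easier to see which selection failed.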

@RobbinBouwmeester
Contributor

Hmmm interesting. I do know there is an ion warning that is sometimes thrown without good reason, could be that.

Issue there, if I remember correctly, is that it tries to read the charge state. But due to some complications at that point in the code it cannot (yet). We need to build in the check later.

Will have a look ASAP.

@enryH
Member Author

enryH commented Jan 26, 2025

Thanks. But that is already a good hint. I will double check where the charge is used.

@enryH
Member Author

enryH commented Jan 26, 2025

Execution continues even though these columns might be required. If we add an explicit error, we would need to test all parsing again.

if self.analysis_level == "ion":
    if "proforma" in df_filtered_melted.columns and "Charge" in df_filtered_melted.columns:
        df_filtered_melted["precursor ion"] = (
            df_filtered_melted["proforma"] + "|Z=" + df_filtered_melted["Charge"].astype(str)
        )
    else:
        print("Not all columns required for making the ion are available.")
    return df_filtered_melted, replicate_to_raw
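
A stricter variant could fail early instead of printing. This is only a sketch under assumptions: the helper name add_precursor_ion, the constant REQUIRED_ION_COLUMNS, and the exact error wording are hypothetical and not part of the proteobench code base.

```python
import pandas as pd

# Columns needed to build the ion identifier (taken from the snippet above).
REQUIRED_ION_COLUMNS = ("proforma", "Charge")


def add_precursor_ion(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a 'precursor ion' column; raise if inputs are missing."""
    missing = [col for col in REQUIRED_ION_COLUMNS if col not in df.columns]
    if missing:
        # Hypothetical wording; naming the missing columns makes the failure actionable.
        raise ValueError(
            f"Not all columns required for making the ion are available, missing: {missing}"
        )
    out = df.copy()
    out["precursor ion"] = out["proforma"] + "|Z=" + out["Charge"].astype(str)
    return out


example = pd.DataFrame({"proforma": ["PEPTIDEK", "AC[Carbamidomethyl]DK"], "Charge": [2, 3]})
ions = add_precursor_ion(example)["precursor ion"].tolist()
```

Whether raising is acceptable depends on which input formats legitimately lack these columns, so every parser would have to be checked before adopting something like this.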

@enryH
Member Author

enryH commented Jan 26, 2025

@rodvrees Was there any reason not to raise a ValueError?

enryH added 2 commits January 26, 2025 16:45
ParseModifications. logic in
proteobench/io/parsing/parse_settings.py

add 'proforma' column which is required.
@enryH
Member Author

enryH commented Jan 26, 2025

So the answer is yes, according to the tests, which fail with:

FAILED test/test_module_dda_quant.py::TestOutputFileReading::test_benchmarking - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format.
FAILED test/test_module_dda_quant.py::TestOutputFileReading::test_input_file_initial_parsing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dda_quant.py::TestOutputFileReading::test_input_file_processing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dda_quant.py::TestWrongFormatting::test_MaxQuant_file - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format.
FAILED test/test_module_dia_quant.py::TestOutputFileReading::test_benchmarking - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dia_quant.py::TestOutputFileReading::test_input_file_initial_parsing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dia_quant.py::TestOutputFileReading::test_input_file_processing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dia_quant.py::TestWrongFormatting::test_DIANN_file - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format: Not all columns required for making the ion are available: 'proforma' and 'Charge'.

@enryH
Member Author

enryH commented Jan 26, 2025

Regarding the precursor ion: the failure was mainly because I started from the custom format, which does not have a [modifications_parser] section in its TOML file, so no proforma column is generated.

parser = ParseSettings(parse_settings, parse_settings_module)
if "modifications_parser" in parse_settings.keys():
    parser = ParseModificationSettings(parser, parse_settings)

using the ParseSettings functionality:

class ParseModificationSettings:
    def __init__(self, parser: ParseSettings, parse_settings: Dict[str, Any]):
        """
        Initialize the ParseModificationSettings object.

        Args:
            parser (ParseSettings): The base parse settings object.
            parse_settings (Dict[str, Any]): The modifications-specific parse settings.
        """
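
The conditional wrapping described above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: BaseParser and ModificationParser are simplified stand-ins for ParseSettings and ParseModificationSettings, the "sequence_column" key is invented, and a plain dict of columns replaces the DataFrame. It only shows why a settings file without a modifications_parser section ends up without a proforma column.

```python
from typing import Any, Dict, List

Table = Dict[str, List[Any]]  # column name -> values; stand-in for a DataFrame


class BaseParser:
    """Hypothetical stand-in for ParseSettings: passes the table through."""

    def convert(self, table: Table) -> Table:
        return dict(table)


class ModificationParser:
    """Hypothetical stand-in for ParseModificationSettings: wraps a parser and adds 'proforma'."""

    def __init__(self, parser: BaseParser, settings: Dict[str, Any]):
        self.parser = parser
        # "sequence_column" is an invented key used only for this sketch.
        self.sequence_column = settings["modifications_parser"]["sequence_column"]

    def convert(self, table: Table) -> Table:
        out = self.parser.convert(table)
        # Real modification parsing is more involved; here the sequence is copied as-is.
        out["proforma"] = list(out[self.sequence_column])
        return out


def build_parser(settings: Dict[str, Any]):
    """Wrap the base parser only when the settings contain a modifications section."""
    parser = BaseParser()
    if "modifications_parser" in settings:
        parser = ModificationParser(parser, settings)
    return parser


table = {"Sequence": ["PEPTIDEK"], "Charge": [2]}
with_mods = build_parser({"modifications_parser": {"sequence_column": "Sequence"}}).convert(table)
without_mods = build_parser({}).convert(table)
has_proforma = ("proforma" in with_mods, "proforma" in without_mods)
```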

@enryH enryH changed the title from "✅ start reading the data" to "quantms runs" Jan 26, 2025