quantms runs #550

enryH · 2025-01-25T17:23:33Z

downstream ion parsing does not work yet

enryH · 2025-01-25T17:36:35Z

try:
    input_df = load_input_file(input_file_loc, input_format)
except pd.errors.ParserError as e:
    raise ParseError(
        f"Error parsing {input_format} file, please make sure the format is correct and the correct software tool is chosen: {e}"
    )
except Exception as e:
    raise ParseSettingsError(f"Error parsing the input file: {e}")

# Parse settings file
try:
    parse_settings = ParseSettingsBuilder(
        parse_settings_dir=self.parse_settings_dir, module_id=self.module_id
    ).build_parser(input_format)
except KeyError as e:
    raise ParseSettingsError(f"Error parsing settings file for parsing, settings seem to be missing: {e}")
except FileNotFoundError as e:
    raise ParseSettingsError(f"Could not find the parsing settings file: {e}")
except Exception as e:
    raise ParseSettingsError(f"Error parsing settings file for parsing: {e}")

try:
    standard_format, replicate_to_raw = parse_settings.convert_to_standard_format(input_df)
except KeyError as e:
    raise ConvertStandardFormatError(f"Error converting to standard format, key missing: {e}")
except Exception as e:
    raise ConvertStandardFormatError(f"Error converting to standard format: {e}")

# calculate quantification scores
try:
    quant_score = QuantScores(
        self.precursor_name, parse_settings.species_expected_ratio(), parse_settings.species_dict()
    )
except Exception as e:
    raise QuantificationError(f"Error generating quantification scores: {e}")

# generate intermediate data structure
try:
    intermediate_data_structure = quant_score.generate_intermediate(standard_format, replicate_to_raw)
except Exception as e:
    raise IntermediateFormatGenerationError(f"Error generating intermediate data structure: {e}")

Reading currently works and I get as far as generating the intermediate output (the last step above), but it then fails with proteobench.exceptions.IntermediateFormatGenerationError: Error generating intermediate data structure: "['precursor ion'] not in index"
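For reference, a minimal sketch of how pandas produces this message: selecting a list of columns where one is missing raises a KeyError, and wrapping it in an f-string gives the quoted text above. The toy DataFrame, the helper name select_ion_columns and the locally defined exception class are stand-ins for illustration, not ProteoBench code.

```python
import pandas as pd


class IntermediateFormatGenerationError(Exception):
    """Local stand-in for the ProteoBench exception of the same name."""


def select_ion_columns(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical helper: list-based column selection raises a KeyError
    # when any of the requested columns is absent.
    try:
        return df[["precursor ion", "Intensity"]]
    except Exception as e:
        raise IntermediateFormatGenerationError(
            f"Error generating intermediate data structure: {e}"
        ) from e


# Toy frame without a "precursor ion" column (assumed column names).
df = pd.DataFrame({"Sequence": ["PEPTIDE"], "Intensity": [1.0]})

try:
    select_ion_columns(df)
except IntermediateFormatGenerationError as err:
    message = str(err)
    print(message)
```

The double quotes in the message come from str() of a KeyError, which shows the repr of its argument.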

@RobbinBouwmeester any idea why?

RobbinBouwmeester · 2025-01-26T12:45:40Z

Hmmm interesting. I do know there is an ion warning that is sometimes thrown without good reason, could be that.

The issue there, if I remember correctly, is that it tries to read the charge state, but due to some complications at that point in the code it cannot (yet). We need to build in the check later.

Will have a look ASAP.

enryH · 2025-01-26T14:39:08Z

Thanks. But that is already a good hint. I will double check where the charge is used.

enryH · 2025-01-26T14:56:51Z

Execution continues even when the columns are missing, although they might be required downstream. If we add an explicit error, we would need to test all parsing again.

ProteoBench/proteobench/io/parsing/parse_settings.py

Lines 168 to 175 in b52149e

    
    if self.analysis_level == "ion":
        if "proforma" in df_filtered_melted.columns and "Charge" in df_filtered_melted.columns:
            df_filtered_melted["precursor ion"] = (
                df_filtered_melted["proforma"] + "|Z=" + df_filtered_melted["Charge"].astype(str)
            )
        else:
            print("Not all columns required for making the ion are available.")
        return df_filtered_melted, replicate_to_raw

enryH · 2025-01-26T14:59:32Z

@rodvrees Was there any reason not to raise a ValueError?

The ParseModificationSettings logic in proteobench/io/parsing/parse_settings.py adds the 'proforma' column, which is required.
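The explicit error asked about here could look roughly like the sketch below. It assumes a standalone helper (add_precursor_ion is a hypothetical name) instead of the real method in parse_settings.py, and the error wording is illustrative.

```python
import pandas as pd


def add_precursor_ion(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the ion branch quoted above."""
    required = ["proforma", "Charge"]
    missing = [col for col in required if col not in df.columns]
    if missing:
        # Fail loudly instead of printing a message and continuing.
        raise ValueError(
            f"Not all columns required for making the ion are available, missing: {missing}"
        )
    out = df.copy()
    # Same construction as in the snippet: proforma string plus charge suffix.
    out["precursor ion"] = out["proforma"] + "|Z=" + out["Charge"].astype(str)
    return out


ok = add_precursor_ion(pd.DataFrame({"proforma": ["PEPTIDE"], "Charge": [2]}))
print(ok["precursor ion"].tolist())
```

With a check like this, a format that never creates 'proforma' fails during conversion to the standard format instead of later with a KeyError.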

enryH · 2025-01-26T15:51:21Z

So the answer is yes, judging by the tests, which fail with:

FAILED test/test_module_dda_quant.py::TestOutputFileReading::test_benchmarking - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format.
FAILED test/test_module_dda_quant.py::TestOutputFileReading::test_input_file_initial_parsing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dda_quant.py::TestOutputFileReading::test_input_file_processing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dda_quant.py::TestWrongFormatting::test_MaxQuant_file - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format.
FAILED test/test_module_dia_quant.py::TestOutputFileReading::test_benchmarking - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dia_quant.py::TestOutputFileReading::test_input_file_initial_parsing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dia_quant.py::TestOutputFileReading::test_input_file_processing - ValueError: Not all columns required for making the ion are available: 'proforma' and 'Charge'.
FAILED test/test_module_dia_quant.py::TestWrongFormatting::test_DIANN_file - proteobench.exceptions.ConvertStandardFormatError: Error converting to standard format: Not all columns required for making the ion are available: 'proforma' and 'Charge'.

enryH · 2025-01-26T15:54:43Z

Regarding the precursor ion: the error was mainly caused by starting from the custom format, which does not have a [modifications_parser] section in its TOML file, so no proforma column is generated.

ProteoBench/proteobench/io/parsing/parse_settings.py

Lines 67 to 69 in b52149e

    
    parser = ParseSettings(parse_settings, parse_settings_module)
    if "modifications_parser" in parse_settings.keys():
        parser = ParseModificationSettings(parser, parse_settings)

using the ParseSettings functionality:

ProteoBench/proteobench/io/parsing/parse_settings.py

Lines 188 to 196 in b52149e

    
    class ParseModificationSettings:
        def __init__(self, parser: ParseSettings, parse_settings: Dict[str, Any]):
            """
            Initialize the ParseModificationSettings object.

            Args:
                parser (ParseSettings): The base parse settings object.
                parse_settings (Dict[str, Any]): The modifications-specific parse settings.
            """

enryH added 2 commits January 25, 2025 18:23

✅ start reading the data

c666cc4

- downstream ion parsing does not work yet

🔧🚧 add not entirely correct configuration

e280fc6

enryH added 2 commits January 26, 2025 16:45

🎨 raise errors if proforma is missing and explicit error handling

9d0ddc0

🐛 add proforma manually as no modification parsing is specified.

0321595

ParseModifications. logic in proteobench/io/parsing/parse_settings.py add 'proforma' column which is required.

🐛 move to separate issue #556

49e7929

enryH mentioned this pull request Jan 26, 2025

Update Contributing documentation #557

Open

enryH changed the title ~~✅ start reading the data~~ quantms runs Jan 26, 2025

Merge branch 'main' into quantms_dda

977d1f4

RobbinBouwmeester approved these changes Jan 27, 2025

enryH added 3 commits January 29, 2025 12:54

🚧 Start file reading of quantms parameter files

38058a8

🐛 json could not be reloaded

05f71b0

🚧 continue mapping parameters

038fcf8
