refactor: replace dict-based query with ParsedQuery dataclass #133

Merged (1 commit) on Jan 17, 2025

filipchristiansen
Collaborator

This PR replaces the old dictionary-based query object with a new ParsedQuery dataclass, improving type safety and readability across the codebase.

Key points:

  • Introduces a ParsedQuery dataclass that consolidates all query-related fields (repo, branch, commit, patterns, etc.)

  • Updates the following functions in the query_parser module to return a query of type ParsedQuery (instead of dict[str, Any]):

    • _parse_repo_source
    • _parse_path
    • parse_query
  • Updates the following functions in the query_ingestion module to take a query argument of type ParsedQuery (instead of dict[str, Any]):

    • run_ingest_query
    • _ingest_directory
    • _ingest_single_file
    • _create_tree_structure
    • _create_summary_string
    • _extract_files_content
    • _process_item
    • _process_symlink
    • _scan_directory
  • Switches ignore/include patterns to sets for clearer overrides and deduplication

  • Moves the maximum size constants (MAX_FILE_SIZE, etc.) to config.py and imports them from there

  • Updates all references and tests to use the new dataclass

This unification makes it easier to understand and maintain how query data flows through the ingestion process.
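
For readers who have not opened the diff, here is a minimal sketch of the shape this change describes. It is not the actual class from this PR: the field names, the default values, the stand-in `MAX_FILE_SIZE` value, and the `apply_include_override` helper are all assumptions made for illustration. It only shows the general idea of a dataclass replacing a `dict[str, Any]`, with ignore/include patterns held as sets.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Stand-in for the constant that lives in config.py; the value is an assumption.
MAX_FILE_SIZE = 10 * 1024 * 1024


@dataclass
class ParsedQuery:
    """Illustrative query container; field names are hypothetical."""

    repo_name: str | None = None
    branch: str | None = None
    commit: str | None = None
    max_file_size: int = MAX_FILE_SIZE
    # Sets deduplicate patterns automatically and support set arithmetic.
    ignore_patterns: set[str] = field(default_factory=set)
    include_patterns: set[str] = field(default_factory=set)


def apply_include_override(query: ParsedQuery) -> ParsedQuery:
    """Hypothetical helper: an explicitly included pattern is no longer ignored."""
    query.ignore_patterns -= query.include_patterns
    return query


query = ParsedQuery(
    repo_name="demo",
    ignore_patterns={"*.md", "*.pyc", "*.md"},  # duplicate collapses in the set
    include_patterns={"*.md"},
)
apply_include_override(query)
print(sorted(query.ignore_patterns))
```

Compared with a plain dict, attribute access such as `query.branch` can be checked by a type checker, and a misspelled field name fails immediately instead of silently creating a new key.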

@filipchristiansen filipchristiansen force-pushed the refactor/dict-to-dataclass branch 2 times, most recently from 14b607e to a9373d4 Compare January 15, 2025 07:40
- Introduce ParsedQuery dataclass to store query parameters and metadata
- Update ingestion and parser modules to use ParsedQuery instead of dict[str, Any]
- Convert ignore_patterns and include_patterns to sets
- Clean references to max size and pattern handling
- Update tests to reflect new dataclass usage
@filipchristiansen filipchristiansen force-pushed the refactor/dict-to-dataclass branch from a9373d4 to 0c6242a Compare January 17, 2025 08:45
Owner

@cyclotruc cyclotruc left a comment

Tested it, looks good

@cyclotruc cyclotruc merged commit d721b00 into main Jan 17, 2025
8 checks passed
@filipchristiansen filipchristiansen deleted the refactor/dict-to-dataclass branch January 19, 2025 14:43