Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(QueryParser): add support for patterns like "field:*". #2513

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

LebranceBW
Copy link
Contributor

ChangeLog

  • Add entries in LogicalLiteral and LogicalAst to support ExistsQuery pattern.

));
return (None, errors);
}
if !json_path.is_empty() && field_type.is_json() {
Copy link
Contributor

@trinity-1686a trinity-1686a Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this condition is wrong. it generates an error when the json path contains something and the field is json. It should error when there is a path, but the field isn't json (and maybe when there is no path and we're on a json field, i don't remember off hand if tantivy allows things at the root of a json field. I'll edit this comment after verifying)
edit: we don't support putting values at the root of a json field currently, so either it's a json field and there must be a path, or it isn't and there mustn't (though the error messages should likely differ)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. I'll fix it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExistsQuery does not support JSON fields, so checking them is redundant. Just make sure fat field field + JSON path combination is banned.

Comment closed, if there still are some issue. feel free to reopen the it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what leads you to believe json fields are not supported in ExistsQuery, i'm pretty sure we use them inside Quickwit, and it seems to work when providing the full path and not just the field to the ExistsQuery (the changes in examples/json_fields.rs are just a poor (wo)man's test, not something that should be committed)

diff --git a/examples/json_field.rs b/examples/json_field.rs
index 334ea10e4..a9e1286a4 100644
--- a/examples/json_field.rs
+++ b/examples/json_field.rs
@@ -14,7 +14,7 @@ fn main() -> tantivy::Result<()> {
     let mut schema_builder = Schema::builder();
     schema_builder.add_date_field("timestamp", FAST | STORED);
     let event_type = schema_builder.add_text_field("event_type", STRING | STORED);
-    let attributes = schema_builder.add_json_field("attributes", STORED | TEXT);
+    let attributes = schema_builder.add_json_field("attributes", STORED | TEXT | FAST);
     let schema = schema_builder.build();
 
     // # Indexing documents
@@ -99,7 +99,7 @@ fn main() -> tantivy::Result<()> {
     }
     {
         // Default json fields can still be accessed by full path
-        let query = query_parser.parse_query("attributes.event_type:holiday-sale")?;
+        let query = query_parser.parse_query("attributes.event_type:*")?;
         let hits = searcher.search(&*query, &TopDocs::with_limit(2))?;
         assert_eq!(hits.len(), 1);
     }
diff --git a/src/query/query_parser/query_parser.rs b/src/query/query_parser/query_parser.rs
index a7433df97..36248a164 100644
--- a/src/query/query_parser/query_parser.rs
+++ b/src/query/query_parser/query_parser.rs
@@ -862,7 +862,7 @@ impl QueryParser {
                     ));
                     return (None, errors);
                 }
-                if !json_path.is_empty() && field_type.is_json() {
+                if json_path.is_empty() == field_type.is_json() {
                     errors.push(QueryParserError::UnsupportedQuery(format!(
                         "Json path is not supported for field {:?}",
                         field_entry.name()
@@ -870,7 +870,7 @@ impl QueryParser {
                     return (None, errors);
                 }
                 let logical_ast = LogicalAst::Leaf(Box::new(LogicalLiteral::Exist {
-                    field_name: field_entry.name().to_string(),
+                    field_name: full_path,
                 }));
                 (Some(logical_ast), errors)
             }

Copy link
Contributor Author

@LebranceBW LebranceBW Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what leads you to believe json fields are not supported in ExistsQuery, i'm pretty sure we use them inside Quickwit, and it seems to work when providing the full path and not just the field to the ExistsQuery (the changes in examples/json_fields.rs are just a poor (wo)man's test, not something that should be committed)

diff --git a/examples/json_field.rs b/examples/json_field.rs
index 334ea10e4..a9e1286a4 100644
--- a/examples/json_field.rs
+++ b/examples/json_field.rs
@@ -14,7 +14,7 @@ fn main() -> tantivy::Result<()> {
     let mut schema_builder = Schema::builder();
     schema_builder.add_date_field("timestamp", FAST | STORED);
     let event_type = schema_builder.add_text_field("event_type", STRING | STORED);
-    let attributes = schema_builder.add_json_field("attributes", STORED | TEXT);
+    let attributes = schema_builder.add_json_field("attributes", STORED | TEXT | FAST);
     let schema = schema_builder.build();
 
     // # Indexing documents
@@ -99,7 +99,7 @@ fn main() -> tantivy::Result<()> {
     }
     {
         // Default json fields can still be accessed by full path
-        let query = query_parser.parse_query("attributes.event_type:holiday-sale")?;
+        let query = query_parser.parse_query("attributes.event_type:*")?;
         let hits = searcher.search(&*query, &TopDocs::with_limit(2))?;
         assert_eq!(hits.len(), 1);
     }
diff --git a/src/query/query_parser/query_parser.rs b/src/query/query_parser/query_parser.rs
index a7433df97..36248a164 100644
--- a/src/query/query_parser/query_parser.rs
+++ b/src/query/query_parser/query_parser.rs
@@ -862,7 +862,7 @@ impl QueryParser {
                     ));
                     return (None, errors);
                 }
-                if !json_path.is_empty() && field_type.is_json() {
+                if json_path.is_empty() == field_type.is_json() {
                     errors.push(QueryParserError::UnsupportedQuery(format!(
                         "Json path is not supported for field {:?}",
                         field_entry.name()
@@ -870,7 +870,7 @@ impl QueryParser {
                     return (None, errors);
                 }
                 let logical_ast = LogicalAst::Leaf(Box::new(LogicalLiteral::Exist {
-                    field_name: field_entry.name().to_string(),
+                    field_name: full_path,
                 }));
                 (Some(logical_ast), errors)
             }

On my. I'm sorry that I reached this arbitrary conclusion without actually verifying it through coding. In fact, this idea came from here.
src/fastfield/mod.rs which says basic types are supported. It appears that works plain fields within JSON leaves as well.
Thank you for correcting my mistaken understanding; I will rewrite the code again and add the corresponding unit tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants