This is a JSONPath parser. It strictly follows RFC 9535 and passes the JSONPath Compliance Test Suite.
It reads a JSON input file and a query, and uses the query to find and return a set of matching values from the input. This does for JSON the same job that XPath does for XML.
This project includes:
* ruby library to run jsonpath queries on a JSON input
* command-line tool to do the same
Install the gem from the command-line:
gem install janeway-jsonpath
or add it to your Gemfile:
gem 'janeway-jsonpath', '~> 0.6'
Give it a query and some input JSON, and it prints a JSON result. Use single quotes around the JSON query to avoid shell interaction. Example:
$ cat store.json
{ "store": {
"book": [
{ "category": "reference",
"author": "Nigel Rees",
"title": "Sayings of the Century",
"price": 8.95
},
{ "category": "fiction",
"author": "Evelyn Waugh",
"title": "Sword of Honour",
"price": 12.99
},
{ "category": "fiction",
"author": "Herman Melville",
"title": "Moby Dick",
"isbn": "0-553-21311-3",
"price": 8.99
},
{ "category": "fiction",
"author": "J. R. R. Tolkien",
"title": "The Lord of the Rings",
"isbn": "0-395-19395-8",
"price": 22.99,
"value": true
}
],
"bicycle": {
"color": "red",
"price": 399
}
}
}
### Find and return matching values
$ janeway '$..book[? @.price==8.95 || @.price==8.99].title' store.json
[
"Sayings of the Century",
"Moby Dick"
]
### Delete matching values from the input, print what remains
$ janeway -d '$.store.book' store.json
{
"store": {
"bicycle": {
"color": "red",
"price": 399
}
}
}
You can also pipe JSON into it:
$ cat store.json | janeway '$..book[? @.price<10]'
See the help message for more capabilities: janeway --help
To use the Janeway library in ruby code, provide a jsonpath query and an input object (Hash or Array) to search. Here's an example that finds values in a JSON document:
require 'janeway'
require 'json'
data = JSON.parse(File.read(ARGV.first))
Janeway.enum_for('$.store.book[0].title', data)
This returns an Enumerator, which offers instance methods for using the query to operate on matching values in the input object.
Following are examples showing how to work with the Enumerator methods.
The #search method returns all values that match the query.
results = Janeway.enum_for('$..book[? @.price<10]', data).search
# Returns every book in the store cheaper than $10
Alternatively, parse the query once, and share it between threads or ractors with different data sources:
# Create ractors with their own data sources
ractors =
Array.new(4) do |index|
Ractor.new(index) do |i|
query = receive
data = JSON.parse File.read("input-file-#{i}.json")
puts query.enum_for(data).search
end
end
# Construct JSONPath query object and send it to each ractor
query = Janeway.parse('$..book[? @.price<10]')
ractors.each { |ractor| ractor.send(query).take }
The #each method iterates through matches one by one, without holding the entire set in memory. It also provides context for each match. The first value yielded is the matched value:
data = {
  'birds' => ['eagle', 'stork', 'cormorant'],
  'dogs' => ['poodle', 'pug', 'saint bernard'],
}
Janeway.enum_for('$.birds.*', data).each do |bird|
puts "the bird is a #{bird}"
end
String values can be modified in place:
Janeway.enum_for('$.birds[? @=="storck"]', data).each do |bird|
bird.gsub!('ck', 'k') # Fix a typo: "storck" => "stork"
end
# input list is now ['eagle', 'stork', 'cormorant']
However, this doesn't work with numeric values because they can't be modified in place. Here's an example illustrating the same problem but using strings:
Janeway.enum_for('$.birds.*', data).each do |bird|
bird = "bald eagle" if bird == 'eagle'
# local variable 'bird' now points to a new value, but the original list is unchanged
end
# input list is still ['eagle', 'stork', 'cormorant']
To replace list values, you need access to the parent object (a Hash or Array) and the array index / hash key. These are available as the second and third yield parameters. This allows the list or hash to be modified:
Janeway.enum_for('$.birds[? @=="eagle"]', data).each do |_bird, parent, index|
parent[index] = "golden eagle"
end
# input list is now ['golden eagle', 'stork', 'cormorant']
The parent / index parameters can also be used to delete values, but this requires caution to avoid index errors. For example, iterating over an array may yield index values 1, 2 and 3. Deleting the value at index 1 shifts every later value down by one position. The next iteration might yield index 2, but the value it was meant to refer to is now at index 1.
To avoid having to deal with such problems, use the #delete method instead.
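The index-shifting problem is not specific to Janeway, so it can be shown with a plain ruby array. The following is a minimal sketch that makes no Janeway calls; the variable names are made up for illustration. It compares deleting collected indexes in ascending order with deleting them from the highest index down:

```ruby
birds = ['eagle', 'stork', 'cormorant', 'swan']

# Indexes of the values we intend to delete: 'eagle' and 'stork'
doomed = [0, 1]

# Naive approach: delete in ascending order.
# After index 0 is removed, 'stork' has shifted to index 0,
# so delete_at(1) removes 'cormorant' by mistake.
naive = birds.dup
doomed.each { |index| naive.delete_at(index) }

# Safer approach: delete from the highest index down, so an earlier
# deletion never moves a value that is still waiting to be deleted.
safe = birds.dup
doomed.sort.reverse_each { |index| safe.delete_at(index) }

p naive # ["stork", "swan"]
p safe  # ["cormorant", "swan"]
```

The #delete method removes the need for this kind of bookkeeping.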
Lastly, the #each iterator's fourth yield parameter is the normalized path to the matched value. This is a jsonpath query string that uniquely points to the matched value.
# Collect the normalized path of every object in the input, at all levels
paths = []
Janeway.enum_for('$..*', data).each do |_bird, _parent, _index, path|
paths << path
end
paths
# [
# "$['birds']", "$['dogs']", "$['birds'][0]", "$['birds'][1]",
# "$['birds'][2]", "$['dogs'][0]", "$['dogs'][1]", "$['dogs'][2]"
# ]
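To make the shape of a normalized path concrete, here is a minimal plain-ruby sketch. The helpers `normalized_path` and `value_at` are hypothetical and not part of Janeway. They assume the path is given as a list of hash keys (Strings) and array indexes (Integers). Only backslashes and single quotes are escaped; control characters are not handled.

```ruby
# Hypothetical helper, not part of Janeway: builds a normalized path string
# from a list of hash keys (Strings) and array indexes (Integers).
def normalized_path(keys)
  keys.reduce('$') do |path, key|
    if key.is_a?(Integer)
      "#{path}[#{key}]"
    else
      # Escape backslashes first, then single quotes, inside the member name
      escaped = key.to_s.gsub('\\') { '\\\\' }.gsub("'") { "\\'" }
      "#{path}['#{escaped}']"
    end
  end
end

# Hypothetical helper: follows the same keys to fetch the value they point to
def value_at(data, keys)
  keys.reduce(data) { |node, key| node[key] }
end

data = {
  'birds' => ['eagle', 'stork', 'cormorant'],
  'dogs' => ['poodle', 'pug', 'saint bernard'],
}

path = normalized_path(['birds', 1])
value = value_at(data, ['birds', 1])

puts path  # $['birds'][1]
puts value # stork
```

Each key or index list maps to exactly one path string, which is why a normalized path identifies a single location in the input.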
The #delete method deletes matched values from the input.
# delete any bird whose name starts with "s" or is "eagle"
Janeway.enum_for('$.birds[? search(@, "^s") || @=="eagle"]', data).delete
# input bird list is now ['cormorant']
# delete all dog names
Janeway.enum_for('$.dogs.*', data).delete
# dog list is now []
The #delete_if method yields matched values to a block, and deletes them if the block returns a truthy result. This allows values to be deleted based on conditions that can't be tested by a JSONPath query:
# delete any book from the store json data that is not in the database
Janeway.enum_for('$.store.book.*', data).delete_if do |book|
results = db.query('SELECT * FROM books WHERE author=? AND title=?', book['author'], book['title'])
results.size == 0
end
The #replace method replaces every value matched by the query with the given value. The replacement does not need to be the same type (eg. you can replace a string value with a Hash or nil).
Alternatively, provide a block which receives the value and returns a replacement value.
# Set every price to nil
Janeway.enum_for('$..price', data).replace(nil)
# Suppose the prices were serialized as strings, eg. "price" => "9.99"
# Convert them to floating point values:
Janeway.enum_for('$..price', data).replace { |price| price.to_f }
# Same thing, but using more terse ruby:
Janeway.enum_for('$..price', data).replace(&:to_f)
# Convert price to hash:
Janeway.enum_for('$..price', data).replace do |price|
{
'price' => price.to_f,
'currency' => 'CAD',
}
end
The #insert method adds the given value to the input data, at the place specified by the JSONPath query.
The query must be a singular query, meaning it can only be made of name selectors (hash keys) and index selectors (array indexes).
The normalized paths which are yielded to #each are usable here.
Examples of singular queries:
$.store.book[0].price
$['store']['book'][0]
Examples of queries that are valid JSONPath but are not singular, and so are not usable here:
$.store.book.*
$.store.book[1:2]
$.store.book[? @.price >= 8.99]
Additionally, some other restrictions apply:
- The "parent" node must exist, eg. for $.a[1].name, the path $.a[1] must exist and be a Hash
- Cannot create array index n unless the array's last existing index is n-1 (that is, it contains exactly n elements)
- If query path already exists, the block is called if provided. Otherwise an exception is raised.
Here is an example of adding a new book to the store:
# Add a new book to the store
book_count = data['store']['book'].size
Janeway.enum_for("$.store.book[#{book_count}]", data).insert do
  # Use string keys, to match the keys of the parsed JSON data
  { "category" => "fiction",
    "author" => "Kameron Hurley",
    "title" => "The Light Brigade",
    "price" => 33.11
  }
end
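The restrictions listed above can be sketched in plain ruby. The `checked_insert` helper below is a hypothetical stand-in written for illustration. It is not Janeway's implementation, and it leaves out the block handling for paths that already exist. It assumes the caller has already located the parent object and the final key or index.

```ruby
# Hypothetical stand-in for the checks described above; not Janeway's code.
# parent: the Hash or Array that should receive the value
# key:    a String (hash key) or a non-negative Integer (array index)
def checked_insert(parent, key, value)
  case parent
  when Hash
    raise ArgumentError, "key #{key.inspect} already exists" if parent.key?(key)
  when Array
    unless key.is_a?(Integer) && key >= 0
      raise ArgumentError, 'array index must be a non-negative Integer'
    end
    raise ArgumentError, "index #{key} already exists" if key < parent.size
    # Only the next free index may be created, so the array never gets gaps
    if key > parent.size
      raise ArgumentError, "cannot create index #{key}, array has #{parent.size} elements"
    end
  else
    raise ArgumentError, 'parent must exist and be a Hash or Array'
  end
  parent[key] = value
end

books = [{ 'title' => 'Moby Dick' }]

# Allowed: the array has 1 element, so index 1 is the next free index
checked_insert(books, books.size, { 'title' => 'The Light Brigade' })

# Rejected: index 5 would leave a gap in the array
begin
  checked_insert(books, 5, { 'title' => 'Too far' })
rescue ArgumentError => e
  gap_error = e.message
end

puts books.size
puts gap_error
```

Using the array's current size as the index, as in the book example above, always satisfies the array restriction.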
The Janeway.enum_for and Janeway::Query#enum_for methods return an enumerator, so you can use the usual ruby enumerator methods, such as:
#map returns the matched elements, as modified by the block.
# return all store prices, but take a dollar off each one
sale_prices = Janeway.enum_for('$.store..price', data).map { |price| price - 1 }
# [7.95, 11.99, 7.99, 21.99, 398]
#select returns only the values that match the JSONPath query and for which the block returns a truthy value. This solves a common JSON problem: you want to do a numeric comparison on a value in your JSON, but the value is stored as a string.
# Poorly serialized, the prices are strings
data =
{ "store" => {
"book" => [
{ "title" => "Sayings of the Century", "price" => "8.95" },
{ "title" => "Sword of Honour", "price" => "12.99" },
{ "title" => "Moby Dick", "price" => "8.99" },
{ "title" => "The Lord of the Rings", "price" => "22.99" },
]
}
}
# Can't use a filter query with a numeric comparison on a string price
Janeway.enum_for('$.store.book[? @.price > 10.00]', data).search
# result: []
# Solve the problem with ruby by filtering with #select and converting string to number:
Janeway.enum_for('$.store.book.*', data).select { |book| book['price'].to_f > 10 }
# result: [{"title" => "Sword of Honour", "price" => "12.99"}, {"title" => "The Lord of the Rings", "price" => "22.99"}]
#reject returns only the values that match the JSONPath query and for which the block returns a falsy value.
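This description corresponds to ruby's standard Enumerable#reject. As a minimal sketch that runs without the gem, the snippet below uses a plain array standing in for the books matched by a query such as '$.store.book.*' (an assumption made for illustration) and compares #reject with #select:

```ruby
# Plain array standing in for the values matched by a query
books = [
  { 'title' => 'Sayings of the Century', 'price' => 8.95 },
  { 'title' => 'Sword of Honour', 'price' => 12.99 },
  { 'title' => 'Moby Dick', 'price' => 8.99 },
  { 'title' => 'The Lord of the Rings', 'price' => 22.99 },
]

# #reject keeps the values for which the block returns a falsy result
cheap_titles = books.reject { |book| book['price'] > 10 }.map { |book| book['title'] }

# #select is the mirror image: it keeps the values for which the block is truthy
pricey_titles = books.select { |book| book['price'] > 10 }.map { |book| book['title'] }

p cheap_titles  # ["Sayings of the Century", "Moby Dick"]
p pricey_titles # ["Sword of Honour", "The Lord of the Rings"]
```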
#filter_map passes each value that matches the jsonpath query to the block. Instead of returning the value from the data, it returns the result of the block, keeping only the truthy results:
# Return titles of books by certain authors which cost more than the minimum price
APPROVED_AUTHORS = ['Evelyn Waugh', 'Herman Melville']
Janeway.enum_for('$.store.book[? @.price >= 8.99]', data).filter_map do |book|
book['title'] if APPROVED_AUTHORS.include?(book['author'])
end
# [ "Sword of Honour", "Moby Dick" ]
This combines the functionality of #select and #map.
#find returns the first value that matches the jsonpath query and for which the block returns a truthy value:
Janeway.enum_for('$.store.book[? @.price >= 10.99]', data).find do |book|
book['title'].start_with?('T')
end
# Returns the matching book (a Hash), here the one titled "The Lord of the Rings"
There are many other Enumerable methods too; see the ruby Enumerable module documentation for more.
joshbuddy/jsonpath is the classic 'jsonpath' ruby gem. It has been around since 2008. It is not compliant with RFC 9535, because it was written long before the standard was finalized, but it's a capable and useful parser and has long been the best jsonpath library available for ruby.
See Porting for tips on converting a ruby project from joshbuddy/jsonpath to janeway.
There are also many non-ruby implementations of RFC 9535.
Goals:
- maintain perfect compliance with IETF RFC 9535
- raise helpful query parse errors designed to help users understand and improve queries, rather than describing issues in the code
- don't use regular expressions for parsing, for performance
- don't use eval, which is known to be an attack vector
- be simple and fast with minimal dependencies
- provide ruby-like accessors (eg. #each, #delete_if) for processing results
- idiomatic, linted ruby 3 code with frozen string literals everywhere
Non-goals:
- Changing behavior to follow other implementations
The JSONPath RFC was in draft status for a long time and has seen many changes. There are many implementations based on older drafts, and others which add features that were never in the RFC at all.
The goal is adherence to RFC 9535 rather than adding features that are in other implementations. This implementation's results are intended to be identical to those of other RFC-compliant implementations in dart, python and other languages.
The RFC was finalized in 2024. With the finalized RFC and the rigorous suite of compliance tests, it is now possible to have JSONPath implementations in many languages with identical behavior.