jq is awesome, and a lot more powerful than gron, but with that power comes complexity. gron aims to make it easier to use the tools you already know, like grep and sed.
gron's primary purpose is to make it easy to find the path to a value in a deeply nested JSON blob when you don't already know the structure; much of jq's power is unlocked only once you know that structure.
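For example, given a made-up blob (this is the shape of gron's output):

$ echo '{"user":{"name":"alice","roles":["admin"]}}' | gron
json = {};
json.user = {};
json.user.name = "alice";
json.user.roles = [];
json.user.roles[0] = "admin";
$ echo '{"user":{"name":"alice","roles":["admin"]}}' | gron | grep admin
json.user.roles[0] = "admin";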
In simpler words, as a user: The jq query language [1] is obtuse, obscure, and incredibly hard to learn if you only need it for quick one-liners once in a blue moon. I've tried, believe me, but I should probably spend that much effort learning Chinese instead.
It's just operating at the wrong abstraction level, whereas gron is orders of magnitude easier to understand and _explore_.
> In simpler words, as a user: The jq query language [1] is obtuse, obscure, and incredibly hard to learn if you only need it for quick one-liners once in a blue moon.
I don't agree that jq's query language is obtuse. It's a DSL for JSON document trees, and it's largely unfamiliar, but so is xpath or any other DOM transformation language.
The same thing is said about regex.
My take is that "it's obtuse" just translates to "I'm not familiar with it and I never bothered to get acquainted with it".
One thing that we can agree though is that jq's docs are awful at providing a decent tutorial for new users to ramp up.
I'm pretty good with regular expressions. I have spent a lot of time trying to get familiar with jq. The problem is that I never use it outside of parsing JSON files, yet I use regular expressions all over the place: on the command line, in Python and Javascript and Java code. They are widely applicable. Their syntax is terse, but relatively small.
jq has never come naturally. Every time I try to intuit how to do something, my intuition fails. This is despite having read its man page a dozen times or more, and consulted it even more frequently than that.
I've spent 20+ years on the Unix command line. I know my way around most of it. I can use sed and awk and perl to great effect. But I just can't seem to get jq to stick.
Aside, but there are a lot of times when it's "I know jq can do this, but I forget exactly how; let me find it in the man page"... and then I find jq's man page as difficult as jq itself when trying to use it as a reference.
Anyway, $0.02.
Edited to add: as a basic query language, I find it easy to use. It's when I'm dealing with json that embeds literal json strings that need to be parsed as json a second time, or when I'm trying to manipulate one or more fields in some way before outputting that I struggle. So it's when I'm trying to compose filters and functions inside jq that I find it hard to use.
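For concreteness, the kind of thing I mean, with a made-up payload (fromjson is the incantation I can never remember):

$ echo '{"payload":"{\"a\":1}"}' | jq '.payload | fromjson | .a'
1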
Agreed. I only use jq once a month, or once every two months at most. Every time I want to do something I just search for my use case, since I can't seem to remember the syntax.
But I have tried to learn jq's syntax (it's pretty much a mini-language) and it has been incredibly difficult.
I also remember that when I first tried learning regex it was very difficult. That is, until I learned about finite state machines and regular languages; after that CS fundamentals class I was able to make sense of regex in a way that stuck.
Is there a comparable theory for jq's mini-language?
Not a theory per se, but my "lightbulb moment" with jq came when I thought about it like this:
jq is basically a templating language, like Jsonnet or Jinja2. What jq calls a "filter" can also be called a template for the output format.
Like any template, a jq filter will have the same structure as the desired output, but may also include dynamically calculated (interpolated) data, which can be a selection from the input data.
So, at a high level, write your filter to look like your output, with hardcoded data. Then, replace the "dynamic" parts of the output data with selectors over the input.
Don't worry about any of the other features (e.g. conditionals, variables) until you need them to write your "selectors."
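For instance, to pull a small report out of some made-up input, write the output shape first, then fill in the selectors:

$ echo '{"user":{"first":"Ada","last":"Lovelace"},"id":7}' | jq '{name: (.user.first + " " + .user.last), id: .id}'
{
  "name": "Ada Lovelace",
  "id": 7
}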
I don't know of any formal theory, but it feels a bit like functional programming because you don't often use variables (an advanced feature, as the manual says). I kind of got a feel for it by realizing that it wants to push a stream of objects through transformations, and that's about it. A few operators/functions can "split" the stream, or pull the stream back into place. Like, uh,
in.json {"a":1,"b":2}
jq -c '{a}' in.json
{"a":1}
The . is the current stream, so if I just do ". , .", it's kind of pushing two streams along:
$ jq -c '.,. | {a}' in.json
{"a":1}
{"a":1}
Then, of course, say:
$ jq -c '{a, b, c: .}' in.json
{"a":1,"b":2,"c":{"a":1,"b":2}}
It was going through the . stream, and I pulled the . stream right back in while doing so.
So it helps to keep straight in my head when I've got multiple streams going, vs. multiple values.
Someone (almost anyone) can probably explain it better with formal theory, but I just got a feel for it, and this is roughly how I describe it.
I think it's more like "I'm not familiar with it and getting it to do something that seems like it should be easy is surprisingly hard, even though I'm putting in some effort." I've become pretty good at jq lately, but for several years before that I would occasionally have some problem that I knew jq could solve, and resolved to sit down and learn the damn thing already, and every time, found it surprisingly difficult. Until you get a really good understanding of it (and good techniques for debugging your expressions), it's often easier just to write a python script.
I love jq, and without detracting from it, gron looks like an extremely useful, "less difficult" complement to it.
Adding: in fact, gron's simplicity is downright inspired. It looks like all it does is convert your json blob into a bunch of assignment statements that have the full path from root to node, and the ability to parse that back into an object. Not sure why I didn't think of that intermediate form being way more greppable. Kudos to the author.
Just as an example, it took me about a minute to get the data I wanted, whereas I probably spent half an hour on the same task yesterday with jq.
I spend a decent amount of time at the command line wrangling data files. It's fun for me to get clever with other tools like awk and perl when stringing together one liners and I enjoy building my knowledge of these tools, but jq has just never stuck.
Quite possibly; I did first play with Perl about 15 years before encountering jq. Some days I do feel as though my head is simply out of room, as my brain has been replaced by a large heap of curly braces, semicolons and stack traces.
I mean, I've been using grep and sed for 15 years now and I still struggle with anything beyond matching literals, since they use a "nonstandard" regexp syntax, and the GNU and BSD variants behave very differently, making for a lot of bugs in scripts that need to work on both Linux and macOS. (Of course you can install the GNU tools on macOS and the BSD ones on Linux, but the whole advantage of bash scripts is that you can assume certain things are installed on the user's system; if you can't satisfy that assumption you may as well use Python or similar.) I think gron has value for those simpler grep cases, but for anything beyond that, jq is the way to go. (Incidentally, I'm very dissatisfied with all of the tools that aspire to be "jq for YAML", and with the relative dearth of tools for converting YAML to JSON on the command line.)
> The GNU and BSD variants behave very differently, making for a lot of bugs in scripts that need to work on both Linux and macOS
Perl shines for this use case (assuming it is present on the machines you are working with). It is slower than grep/sed/awk in most cases, but it is more powerful and more portable across platforms.
> Perl shines for this use case (assuming it is present on the machines you are working with). It is slower than grep/sed/awk in most cases, but it is more powerful and more portable across platforms.
Agreed.
For better or worse, when performance is not a concern in my scripts, I just shell out to "perl -pe" rather than trying to deal with grep, sed or awk.
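For example, rather than puzzling over which sed dialect I'm on, a trivial made-up substitution looks like:

$ echo 'timeout=30' | perl -pe 's/(\d+)/$1 * 2/e'
timeout=60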
`jq` started to click for me after watching this introductory video[1] closely and playing with some examples as I went.
The slides are linked in the video description and at [2]. You'll need them because unfortunately the video is produced in such a way that the speaker video window often obscures important parts of his presentation.
While it's true that jq's DSL has a bit of a learning curve, being able to try expressions and see immediate feedback can help immensely.
Here is a demo of a small script I wrote that shows jq results as you type using FZF: https://asciinema.org/a/349330 (link to script is in the description)
It also includes the ability to easily "bookmark" expressions and return to them, so you don't have to worry about losing an expression that's almost working in order to experiment with another one.
As a jq novice, I've personally found it to be super useful.
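If you just want the core of the trick without my script, it leans on fzf's {q} placeholder; a rough sketch (not the actual linked script, and it assumes some in.json to play with) is:

$ echo '' | fzf --print-query --preview 'jq -C {q} in.json'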
I really can't relate to the language being hard to learn. I've been using jq for a while now and have only had to look at the docs once when a key contained special characters and dot notation didn't work. I've usually been able to just guess the syntax in a couple of tries, but that might be because I'm just used to using weird notation for manipulating data structures (css, emmet, xpath, various "functional" globs of map/filter/reduce/zip...)
It's just an array programming language. Not everything has to be C, and I think it's unfair to call a language obtuse and incredibly hard to learn just because you're not used to the style.
Hey! That's kind of how I use the CLI (API?) at AWS. It works pretty well! And fortunately (for me), not too much thinking involved.
BTW: I have a D3 front-end dashboard/console for the app (not admin) that makes this a little bit harder, but D3 is pretty organized (and well-documented), if you can figure out what you are trying to do with it.
It feels like finding a deeply nested key in a structured document is a job for XPath. Most people, myself included until recently, don't realize that XPath 3.1 operates on JSON.
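For example (going from memory of the 3.1 syntax, so treat this as a sketch), the ? lookup operator drills into parsed JSON:

parse-json($input)?spec?containers?*?ports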
I like that jq's query expression syntax is command line (bash) friendly. My hunch is that xpath expressions would be awkward to work with.
I've done too much xpath, xquery, xslt, and css selectors. For my own work (dogfooding), I settled on mostly using very simple globbing expressions, then using the host language's 'foreach' equivalent for iterating over result sets.
Looping back to command-line xpath: there's always some impedance mismatch between the query and host languages. IIRC, one of the shells (chubot's oilshell? fish?) has more rational expression evaluation compared to bash.
You especially see this with regexes. It's a major language design fail that others haven't adopted Perl's first-class regex intrinsics. C# has LINQ, sure, but that's more xquery than xpath. And I've never liked xquery.
In other words, "blue collar" programming languages should have intrinsic path expressions. Whatever the syntax.
Technically no, because it offers no comparable way to interface with the line-based world of unix tools.
Practically, most things you'd do with gron and grep, sed, awk, ... you could do using only jq as well. jq comes with massive cognitive overhead though, and has a bunch of very unpleasant gotchas (like silently corrupting numbers with absolute value > 2^53, although very recent jq graciously no longer does that, iff you do no processing on the number).
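For example, on older jq (and on newer jq whenever the number is actually touched), a made-up but representative case:

$ echo '9007199254740993' | jq '. + 0'
9007199254740992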
I find jq pretty useful, but I have no love for it.
Actually, I think it might be possible to implement gron in jq (you can produce plain text, not just JSON, and the processing facilities jq offers might be powerful enough to escape everything appropriately), but I'm not curious enough to try it and find out.
It really depends on what you want to do, and thus on what you think gron does.
If all you want to do is search for properties with a given value then yes, jq does that very well.
Unlike gron, jq even allows users to output search results as valid json docs. Hell, jq allows users to transform entire JSON docs.
However, if all you want to do is expand the JSON path at each value, then I don't know if jq supports that use case. But then again, why would anyone want to do that?
I think you should just invest the hour or so it takes to learn jq. Yes, it's far from a marvel of programming language design. But it covers all of the edge cases, and once you learn it, you can be very productive. (But the strategy of "copy-paste a one-liner from Stack Overflow the one time a year I need it" isn't going to work.)
I think structured data is so common now, that you have to invest in learning tools for processing it. Personally, I invested the time once, and it saves me every single day. In the past, I would have a question like "which port is this Pod listening on", and write something like "kubectl get pod foo -o yaml | grep port -A 3". Usually you get your answer after manually reading through the false-positives. But with "jq", you can just drive directly to the correct answer: "kubectl get pod foo -o json | jq '.spec.containers[].ports'"
Maybe it's kind of obtuse, but it's worth your time, I promise.
But how do you get that '.spec.containers[].ports'?
It seems to me that for your example use case, gron is at least useful to first understand the json structure before making your jq request. And, for simple use cases like this one, enough to replace jq altogether.
Well, the schema of the JSON is something you have to come up with on your own. I happen to have seen like 8 trillion pod manifests so I know what I'm looking for, but if you don't, you have to figure out the schema in some other way. To reverse engineer something, I usually pipe into keys (jq keys, jq '.spec | keys', jq '.spec.containers[] | keys', etc.)
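On a made-up pod-shaped file, that drill-down looks something like:

$ jq -c 'keys' pod.json
["metadata","spec"]
$ jq -c '.spec | keys' pod.json
["containers"]
$ jq -c '.spec.containers[] | keys' pod.json
["image","name","ports"]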
For Kubernetes specifically, "kubectl explain pod", "kubectl explain pod.spec", etc. will help you find what you're looking for.
> Well, the schema of the JSON is something you have to come up with on your own. I happen to have seen like 8 trillion pod manifests so I know what I'm looking for, but if you don't, you have to figure out the schema in some other way.
Well, or you just do
kubectl get pod foo -o json | gron | grep port
and you will get the answer to the original question + the path.
I think this is a good point. It's definitely hard to grep for certain parts of gron's output, especially where there are arrays involved, because of the square brackets. I find that using fgrep/grep -F can help with that in situations where you don't need regular expressions though.
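For example, with a made-up blob:

$ echo '{"roles":["admin","user"]}' | gron | grep -F 'roles[0]'
json.roles[0] = "admin";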
It's not an ideal output format for sure, but it does meet some criteria that I considered to be desirable.
Firstly: it's unambiguous. While your suggested format is easier to grep, it is also lossy as you mention. One of my goals with gron was to make the process reversible (i.e. with gron -u), which would not be possible with such a lossy format.
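A quick round trip on a made-up blob:

$ echo '{"user":{"name":"alice"},"debug":true}' | gron | grep user | gron -u
{
  "user": {
    "name": "alice"
  }
}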
Secondly: it's valid JavaScript. Perhaps that's a minor thing, but it means that the statements are eval-able in either Node.js or in a browser. It's a fairly small thing, but it is something I've used on a few occasions. Using JavaScript syntax also means I didn't need to invent new rules for how things should be done, I could just follow a subset of existing rules.
FWIW, personally I'm usually using gron to help gain a better understanding of the structure of an object; often trying to find where a piece of known data exists, which means grepping for the value rather than the key/path - avoiding many of the problems you mention.
Thanks for your input :) I'd like to see your jq script to help me learn some more about jq!
While I like what jq lets me do, I actually find it really difficult to use. It's very rare that I can use it without having to consult the docs, and when I try to do anything remotely complex it often takes ages to figure out.
I very much like the look of gron for the simpler stuff!
"Or you could create a shell script in your $PATH named ungron or norg to affect all users"
You could also check argv[0] to see if you were called via the `ungron` name. Then it would be as simple as a symlink, which is very easy to add at install/packaging time.
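Assuming gron grew that argv[0] check, the packaging side really would just be a symlink:

$ ln -s "$(command -v gron)" /usr/local/bin/ungron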
I'm sorry to hear that - I didn't mean it as a criticism (you're free to work on or not work on whatever you want, of course!). I was just surprised at the level of traction the post was getting, I suppose.
Wow, I am typically hesitant to adopt new tools into my flow. Oftentimes they either don't solve my problem all that much better, or they try to do too much.
This looks perfect. Does one thing and does it well. I will be adopting this :-)
If you'd like to use something like this in your own APIs to let your clients filter requests or on the CLI (as is the intention with gron), consider giving "json-mask" a try (you'll need Node.js installed):
If you've ever used Google APIs' `fields=` query param you already know how to use json-mask; it's super simple (there's a quick example after this list):
a,b,c - comma-separated list will select multiple fields
a/b/c - path will select a field from its parent
a(b,c) - sub-selection will select many fields from a parent
a/*/c - the star * wildcard will select all items in a field
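For example, via a quick Node one-liner against the library's mask(object, fields) API (data invented here):

$ node -e 'const mask = require("json-mask"); console.log(JSON.stringify(mask({a: 1, b: {x: 2, y: 3}, c: 4}, "a,b/x")))'
{"a":1,"b":{"x":2}}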
I have installed gron on all my development machines.
Will probably use it heavily when working with awscli. I'm not conversant enough in the jq query language to not have to look things up when writing even somewhat complex scripts. And I don't want to learn awscli's custom query syntax. :)
Thought at first that it might be possible to replicate gron's functionality with some magic composition of jq, xargs, and grep, but that was before I understood the full awesomeness of gron: piping through grep or sed maintains the gron context, so you can still ungron later.
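e.g., on a made-up blob:

$ echo '{"name":"alice"}' | gron | sed 's/alice/bob/' | gron -u
{
  "name": "bob"
}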
Very nice! Though I don't like that it can also make network requests. It's a potential security hole, and completely unnecessary given that we already have curl and pipes for that.
1. Is there a name/"standard" for the format gron transforms JSON into?
2. Thesis:
jq is cumbersome when used on a JSON input of serious size/complexity, because upfront knowledge of the structure of the JSON is needed to formulate correct search queries.
gron supports that "uninformed search" use case much better.
Prove me wrong ;)
1. There isn't really a name for it, but it's a subset of JavaScript and the grammar is available here specified in EBNF, with some railroad diagrams to aid understanding: https://tomnomnom.github.io/gron/
2. That's pretty much exactly why I wrote the tool :)
Is this better than the old solution of json_pp < jsonfile | grep 'pattern'?
While that's only useful for picking out specific named keys without context, that's often good enough to get the job done. An added bonus is that json_pp and grep are usually installed by default, so you don't have to install anything.
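For example (exact whitespace varies between json_pp versions):

$ echo '{"spec":{"port":8080}}' | json_pp | grep port
      "port" : 8080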
JMESPath has a spec, that is true, but JMESPath also has some serious limitations [1]. If I'm doing any JSON manipulation on the command line then I'll reach for jq.
That said, gron certainly looks like it offers simplicity in cases where using jq would be too complicated.