Comma-delimited data within json element #22
Comments
hatmandu changed the title from "Comma-delimited data with json element" to "Comma-delimited data within json element" on Jun 17, 2014
Anyone?
Looks like the dev's on a vacation.
I'd still love to see a solution for this :)
@hatmandu did you fix it? I'm having the same problem
Sadly not - and nobody ever replied.
Thanks for your fantastically useful json2csv script - I've been using it to parse data from OpenLibrary dumps. It's working very well, even though the OL data is very inconsistently structured. One question, though, if I may...
In a case where a value is a list of several comma-separated items, e.g.
{"subjects": ["Books and reading -- Fiction.", "Storytelling -- Fiction.", "Death -- Fiction.", "Jews -- Germany -- History -- 1933-1945 -- Fiction."]}
json2csv appears to strip out the commas within the value, so the four different subjects all get merged into one. It comes out like this for -k subjects:
[Books and reading -- Fiction. Storytelling -- Fiction. Death -- Fiction. Jews -- Germany -- History -- 1933-1945 -- Fiction.]
Is there a straightforward way to get it to preserve those multiple items within a value? (I don't need them as separate fields in the CSV, but would like to preserve the distinction within the 'subjects' field, if you see what I mean - so they could be delimited by something other than a comma.)
(I tried using the -d flag to set a different field delimiter, e.g. semicolon, but it still stripped out the commas as above.)
Edit: another example...
"subject_places": ["United States", "China"]
comes out as
[United States China]
so, alas, there's no practical way to split those items apart again automatically.
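For anyone hitting the same thing, one possible workaround is to do the conversion outside json2csv. What follows is only a minimal sketch in Python, not a json2csv feature. It assumes the input has one JSON object per line, and the names `flatten_value` and `json_lines_to_csv` are made up for illustration. List values are joined with a separator of your choice (here `|`), so the items stay distinguishable inside a single CSV field.

```python
import csv
import io
import json


def flatten_value(value, item_sep="|"):
    """Join list items with item_sep; pass other values through unchanged."""
    if isinstance(value, list):
        return item_sep.join(str(item) for item in value)
    return "" if value is None else value


def json_lines_to_csv(lines, keys, out, item_sep="|"):
    """Write one CSV row per JSON line, keeping only the requested keys."""
    writer = csv.writer(out)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        writer.writerow([flatten_value(record.get(k), item_sep) for k in keys])


# Hypothetical sample record modelled on the examples in this issue.
sample = [
    '{"subjects": ["Books and reading -- Fiction.", "Storytelling -- Fiction."], '
    '"subject_places": ["United States", "China"]}'
]
buffer = io.StringIO()
json_lines_to_csv(sample, ["subjects", "subject_places"], buffer)
print(buffer.getvalue())
```

The separator should be a character that never appears inside the items themselves; if `|` can occur in your data, pass a different `item_sep`.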