Problem with the client-side ("serviceprovider") implementation of ListRecords #278
Comments
Hmm, I actually don't understand what's going on. Looking at the existing RecordParser tests, I don't really get how they are passing.
Ok, I see, the tests are passing because of
@poikilotherm I want to close this issue, since I opened it based on not understanding how that parser was supposed to work. (I warned upfront that that was a possibility.) (As you can see, Dataverse hasn't been using this parser at all.)
I may ask for, and/or make a PR adding an extra feature to record processing.
I'm closing this issue (opened because of a misunderstanding, as explained above).
It appears that harvesting via ListRecords is broken. The reason we never noticed is that the Dataverse OAI client hasn't been using it, relying instead on making a ListIdentifiers call and then calling GetRecord for each non-deleted identifier. I am, however, working on adding optional support for harvesting via ListRecords as well.
To skip directly to the punchline, I believe it all comes down to this line:
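For readers less familiar with OAI-PMH, here is a rough sketch of the requests behind the two harvesting strategies mentioned above: one ListIdentifiers call followed by a GetRecord per identifier, versus a single ListRecords call. The verb and parameter names come from the OAI-PMH protocol; the class `OaiRequestUrls`, its method names, and the example base URL are hypothetical and are not part of xoai or Dataverse.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only: builds OAI-PMH request URLs.
public class OaiRequestUrls {

    // Strategy 1, step 1: list the identifiers available in the repository.
    public static String listIdentifiers(String baseUrl, String metadataPrefix) {
        return baseUrl + "?verb=ListIdentifiers&metadataPrefix=" + enc(metadataPrefix);
    }

    // Strategy 1, step 2: fetch one record per (non-deleted) identifier.
    public static String getRecord(String baseUrl, String identifier, String metadataPrefix) {
        return baseUrl + "?verb=GetRecord&identifier=" + enc(identifier)
                + "&metadataPrefix=" + enc(metadataPrefix);
    }

    // Strategy 2: fetch headers and metadata together in one response.
    public static String listRecords(String baseUrl, String metadataPrefix) {
        return baseUrl + "?verb=ListRecords&metadataPrefix=" + enc(metadataPrefix);
    }

    // Percent-encodes a query parameter value.
    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String base = "https://example.org/oai"; // assumed endpoint
        System.out.println(listIdentifiers(base, "oai_dc"));
        System.out.println(getRecord(base, "oai:example.org:1", "oai_dc"));
        System.out.println(listRecords(base, "oai_dc"));
    }
}
```

With the first strategy every record arrives as its own GetRecord response, which is why a bug in ListRecords parsing can go unnoticed.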
xoai/xoai-service-provider/src/main/java/io/gdcc/xoai/serviceprovider/parsers/MetadataParser.java
Line 34 in 7584005
The problem being that the <metadata> tag in question has already been parsed by the RecordParser before this parser has been called, here:
xoai/xoai-service-provider/src/main/java/io/gdcc/xoai/serviceprovider/parsers/RecordParser.java
Line 48 in 7584005
A larger fragment:
xoai/xoai-service-provider/src/main/java/io/gdcc/xoai/serviceprovider/parsers/RecordParser.java
Lines 48 to 60 in 7584005
In other words, when it's trying to parse a record fragment of a ListRecords response, the content String in line 49 above will only contain the <oai_dc:dc ...> ... </oai_dc:dc> part, and that's where the next parser bombs.
The fix appears to be as simple as commenting out line 34 in MetadataParser.java 😄.
But it would be prudent to add a test or two that attempt to parse some example fragments.
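To illustrate the kind of mismatch described above, here is a minimal, self-contained sketch using the JDK's StAX API. It does not use the actual xoai parsers: `MetadataFragmentDemo` and its methods are hypothetical, and the sample XML is made up for the example. It only shows that a check expecting a leading <metadata> element passes on the wrapped fragment and fails once the wrapper has been consumed and only the <oai_dc:dc> content remains.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; not part of xoai.
public class MetadataFragmentDemo {

    // Returns the local name of the first start element in the XML text, or null if none.
    public static String firstElementName(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    // Stand-in for a parser step that insists on seeing <metadata> first.
    public static boolean startsWithMetadata(String xml) throws XMLStreamException {
        return "metadata".equals(firstElementName(xml));
    }

    public static void main(String[] args) throws XMLStreamException {
        // Made-up sample content, as it might look after the wrapper was already consumed.
        String inner =
                "<oai_dc:dc xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\""
                        + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
                        + "<dc:title>Example</dc:title></oai_dc:dc>";
        String wrapped = "<metadata>" + inner + "</metadata>";

        System.out.println(startsWithMetadata(wrapped)); // wrapper still present
        System.out.println(startsWithMetadata(inner));   // wrapper already consumed
    }
}
```

A real regression test would feed sample ListRecords fragments to the actual parsers; this sketch only isolates the "who consumes the wrapper tag" question.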