Marker metadata for tracking #43
Comments
Yep, indeed. Here it is in action: Line 296 in 3b32f25
We just need to settle the precise keyword to be used for deposits rather than frictionless. "deposits" is sufficiently unambiguous, but perhaps not sufficiently informative? I imagine the procedure will be automatic, rather than opt-in. The data are still exposed to, and controllable by, users, so in my current view we only need to demonstrate the possibility of manually removing the keyword.
Cool. While I'd love to have high-coverage data, I'd want to be maximally transparent and opt-in with this. Yes, "deposits" is probably not a great keyword. (This makes me wonder if another name would be better for the package.) That said, if there is already a "frictionlessdata" tag, perhaps we could only place our marker inside data package.json, and we could query that file in repositories with that tag. It would be more intensive but doable, and would avoid tag clutter that users might not want.
That would actually be a great example for a tutorial!
The necessary precursor issue of keywords #36 is now done. Output is copied here to demonstrate the functionality needed for this issue. Keywords always have to be defined in "subject", not "description". The following code illustrates the new functionality, starting with what happens when "keywords" are defined in the wrong field:

library (deposits)
packageVersion ("deposits")
#> [1] '0.1.0.53'
metadata <- list (
title = "New Title",
abstract = "This is the abstract",
creator = list (list (name = "A. Person"), list (name = "B. Person")),
description = paste0 (
"This is the description\n\n",
"## keywords\none, two\nthree\n\n## version\n1.0"
)
)
cli <- depositsClient$new (service = "zenodo", metadata = metadata, sandbox = TRUE)
#> Error: Metadata source for [keywords] should be [subject] and not [description]
cli <- depositsClient$new (service = "figshare", metadata = metadata)
#> Error: Metadata source for [keywords] should be [subject] and not [description]

The error message for both services is sufficiently informative to know what to do next:

metadata$description <- "This is the description\n\n## version\n1.0"
metadata$subject <- "## keywords\none, two\nthree"
cli <- depositsClient$new (service = "zenodo", metadata = metadata, sandbox = TRUE)
cli$deposit_new ()
#> ID of new deposit : 1177062
cli$hostdata$metadata$keywords
#> [[1]]
#> [1] "one"
#>
#> [[2]]
#> [1] "two"
#>
#> [[3]]
#> [1] "three"
cli <- depositsClient$new (service = "figshare", metadata = metadata)
cli$deposit_new ()
#> Files for private Figshare deposits can only be downloaded manually; no metadata can be retrieved for this deposit.
#> ID of new deposit : 22348531
cli$hostdata$tags
#> [1] "one" "two" "three"

Created on 2023-03-28 with reprex v2.0.2

Keywords are thus appropriately translated into service-specific terms, with the services themselves then returning their own representations. This issue then just needs optional or automatic insertion of a deposits-specific keyword, potentially alongside the "frictionlessdata" keyword illustrated in this Zenodo search query.

@peterdesmet Can you comment on any "official" frictionless positions on the use of such keywords? Is "frictionlessdata" supported or encouraged, or just something you personally use? (It seems to be the latter from the Zenodo records.) Do you have any advice or recommendations for how we could extend your own usage to flag ours as a direct extension of frictionless? Any advice or input would be really appreciated 👍 😄
What keyword to use ( Regarding:
Can you clarify your use case? Is it "what keywords to automatically assign to a deposit in Zenodo/... that was created with the |
Yes, that is precisely what I meant. We intend to have a (likely optional, but possibly default) keyword that we can use to identify all deposits created via this package. Those deposits will also likely include an additional keyword to align with your current "frictionlessdata" usage. So ultimately two keywords.
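To make the "two keywords" idea concrete, here is a minimal sketch of how opt-in insertion might look. The helper `add_marker_keywords()` is hypothetical and not part of the deposits package; it only assumes the "## keywords" layout of the "subject" field shown in the reprex above, and uses "deposits-client" (the keyword suggested in the original issue text) and "frictionlessdata" as placeholder markers.

```r
# Hypothetical helper, NOT part of the deposits package.
# Assumes `metadata$subject` follows the "## keywords\none, two\nthree"
# layout used in the reprex in this thread.
add_marker_keywords <- function (metadata,
                                 markers = c ("deposits-client", "frictionlessdata"),
                                 opt_in = FALSE) {

    # Leave the metadata untouched unless the user has explicitly agreed
    if (!opt_in) {
        return (metadata)
    }

    subject <- metadata$subject
    if (is.null (subject) || !nzchar (subject)) {
        subject <- "## keywords"
    }

    # Strip the header, then split existing keywords on commas or newlines
    body <- sub ("^## keywords\\s*", "", subject)
    existing <- trimws (strsplit (body, "[,\n]") [[1]])
    existing <- existing [nzchar (existing)]

    # Append markers, dropping any the user has already supplied
    all_keywords <- unique (c (existing, markers))
    metadata$subject <- paste0 ("## keywords\n", paste (all_keywords, collapse = ", "))

    metadata
}
```

With `opt_in = FALSE` (the default here) nothing changes, so users who never agree get no extra tags. Because the markers end up as ordinary keywords, removing them manually would just be a matter of editing the "subject" field again.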
I think the proper way to do it would be to assign a related identifier with
In any case, I tried it out for one of the animal tracking datasets I published with the |
That's a great idea! deposits builds from a DCMI metadata structure which includes a few terms in which that might fit. And Zenodo has a "related_identifiers" field which allows the compiled option, and it also has the ability to construct custom search queries on any fields. So that should work for that, which will mean also for Dryad, which we'll soon expand to. (We currently do figshare too, but full functionality there is not so important.)
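For the search side, a rough sketch of how a query for marked records might be assembled against Zenodo's records endpoint. The function `zenodo_marker_query()` is hypothetical, and the field name "keywords" in the query string is an assumption that should be checked against Zenodo's search documentation; a field addressing related identifiers could be substituted through the `field` argument. The code only builds the URL and does not send any request.

```r
# Hypothetical helper, NOT part of the deposits package.
# Builds (but does not send) a search URL for records carrying all markers.
# The default field name "keywords" is an assumption about Zenodo's
# query syntax and may need adjusting.
zenodo_marker_query <- function (markers = c ("deposits-client", "frictionlessdata"),
                                 field = "keywords",
                                 base_url = "https://zenodo.org/api/records") {

    # One quoted `field:"value"` term per marker, combined with AND
    terms <- sprintf ("%s:\"%s\"", field, markers)
    q <- paste (terms, collapse = " AND ")

    # Percent-encode the query, including reserved characters such as ':'
    paste0 (base_url, "?q=", utils::URLencode (q, reserved = TRUE))
}

url <- zenodo_marker_query ()
```

The resulting `url` could then be passed to whichever HTTP client the package already uses. Equivalent queries for Dryad or figshare would need their own field names and endpoints.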
Is it possible for us to include metadata information that would allow us to search for records created by deposits? Perhaps with some opt-in mechanism at deposition time, like "Would you like to add the keyword deposits-client to make it possible for us to track records created by deposits?" Ideally it's something in a minimally obtrusive or visible metadata field, but something we could find via the various search APIs across repositories.