Thanks to the super new stats by @gsautter, we now have a better way to look at what we do, to get an idea of the errors and, last but not least, to understand where we should invest in the future to minimize errors in production. They also help us communicate the limitations of this utterly "stupid" thing we do: liberating imprisoned biodiversity data.
.....................................
Hi Donat,
To get more numbers on our QC efforts than that little specialized tool could previously provide, I spent the past two days finally building some dedicated stats for the error protocols; see
https://tb.plazi.org/GgServer/ephStats
Here, for instance, is an overview of all the error categories:
https://tb.plazi.org/GgServer/ephStats/stats?outputFields=errorCat.name+errorCat.label+errorCat.errorsRemoved+errorCat.falsePosAdded+errorCat.typeCount&groupingFields=errorCat.name&FA-errorCat.label=max&FA-errorCat.typeCount=max&format=HTML
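The stats links in this message all target the same `ephStats/stats` endpoint and differ only in their query parameters. As a rough sketch, such a query could be assembled in Python as shown here. The parameter names (`outputFields`, `groupingFields`, `orderingFields`, `format`, and the `FA-`/`FP-` prefixed ones) are copied from the links. That `+` separates field names, that `FA-` selects an aggregate and that `FP-` sets a filter are only inferences from those links, not documented behaviour. `build_stats_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://tb.plazi.org/GgServer/ephStats/stats"


def build_stats_url(output_fields, grouping_fields=(), ordering_fields=(),
                    extra=(), fmt="HTML"):
    """Assemble a stats query URL (hypothetical helper, not part of GgServer).

    Field lists are joined with spaces; urlencode() turns each space into
    '+', which matches the separator seen in the links.
    """
    params = [("outputFields", " ".join(output_fields))]
    if grouping_fields:
        params.append(("groupingFields", " ".join(grouping_fields)))
    if ordering_fields:
        params.append(("orderingFields", " ".join(ordering_fields)))
    # Extra pairs such as ("FA-errorCat.label", "max") are passed through
    # unchanged; their meaning is assumed, not documented here.
    params.extend(extra)
    params.append(("format", fmt))
    return BASE_URL + "?" + urlencode(params)


# Rebuild the error category overview query.
url = build_stats_url(
    output_fields=[
        "errorCat.name",
        "errorCat.label",
        "errorCat.errorsRemoved",
        "errorCat.falsePosAdded",
        "errorCat.typeCount",
    ],
    grouping_fields=["errorCat.name"],
    extra=[("FA-errorCat.label", "max"), ("FA-errorCat.typeCount", "max")],
)
print(url)
```

Keeping the parameters as an ordered list of pairs, rather than a dict, preserves the parameter order used in the links.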
The same for the error types:
https://tb.plazi.org/GgServer/ephStats/stats?outputFields=errorType.name+errorType.label+errorType.parentCat+errorType.errorsRemoved+errorType.falsePosAdded&groupingFields=errorType.name+errorType.parentCat&orderingFields=errorType.parentCat&FA-errorType.label=max&format=HTML
Here is an overview of what the individual users did:
https://tb.plazi.org/GgServer/ephStats/stats?outputFields=doc.docId+doc.subjectDocId+doc.updateUser+doc.prevDocId+doc.errorsRemoved+doc.falsePosAdded&groupingFields=doc.updateUser&orderingFields=doc.subjectDocId&FP-doc.prevDocId=0-&format=HTML
Note1: "Replaces Error Protocol UUID" not being empty indicates there is an earlier error protocol for the same document, before the respective users did their QC work. The "Number of Errors Fixed" and the "Number of Errors Marked as False Positives" are deltas to the respective predecessor error protocol, so the sums of said deltas reflect what the individual users did.
Note2: Keep in mind that computing these deltas is only possible if there are at least two error protocols for a document. So if anyone runs the batch on their desktop machine and does the QC before even uploading the IMF to the server, there is no way of telling the amount of QC work done.
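To make the two notes concrete, here is a minimal Python sketch of the per-user summation. The record layout, the field names and the user names are all hypothetical and do not reflect how the server stores error protocols. The sketch only assumes that each protocol carries its deltas relative to the protocol it replaces.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorProtocol:
    # Hypothetical record layout, for illustration only.
    protocol_id: str
    document_id: str
    update_user: str
    replaces_id: Optional[str]   # None: no earlier protocol for this document
    errors_fixed: int            # delta relative to the predecessor
    false_positives_marked: int  # delta relative to the predecessor


def qc_work_per_user(protocols):
    """Sum the deltas per user, skipping protocols without a predecessor."""
    totals = defaultdict(lambda: {"errors_fixed": 0, "false_positives_marked": 0})
    for p in protocols:
        if p.replaces_id is None:
            continue  # no predecessor, so there is no delta to attribute
        totals[p.update_user]["errors_fixed"] += p.errors_fixed
        totals[p.update_user]["false_positives_marked"] += p.false_positives_marked
    return dict(totals)


protocols = [
    ErrorProtocol("p1", "docA", "batch", None, 0, 0),
    ErrorProtocol("p2", "docA", "alice", "p1", 12, 3),
    ErrorProtocol("p3", "docA", "bob", "p2", 4, 1),
    # QC done on a desktop machine before upload: only one protocol exists,
    # so none of that work shows up in the totals.
    ErrorProtocol("p4", "docB", "alice", None, 0, 0),
]

work = qc_work_per_user(protocols)
print(work)
```

In this made-up data, the QC work on `docB` stays invisible because its only protocol has no predecessor, which is the limitation Note2 describes.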
This is now an overview of all the errors extant in (the QCed part of) our data collection:
https://tb.plazi.org/GgServer/ephStats/stats?outputFields=doc.docId+doc.subjectDocId+doc.nextDocId+doc.errors+doc.falsePos+doc.errorsBlocker+doc.falsePosBlocker+doc.errorsCritical+doc.falsePosCritical+doc.catCount+doc.typeCount&orderingFields=doc.subjectDocId&FP-doc.nextDocId=-0&FA-doc.nextDocId=min&FA-doc.catCount=max&FA-doc.typeCount=max&format=HTML
Note1: "Replaced by Error Protocol UUID" being empty indicates there is no later error protocol for a specific document, i.e., this restricts the numbers to the error protocols indicating the current status of their respective documents.
Note2: The number under "Document UUID" also indicates the number of documents we have QCed so far.
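The restriction described in these two notes can be sketched in a few lines of Python. The dictionary keys and sample values are hypothetical. The only assumption taken from the notes is that a protocol with no successor describes the current status of its document.

```python
def current_protocols(protocols):
    """Keep only protocols that no later protocol has replaced."""
    return [p for p in protocols if not p["replaced_by"]]


# Hypothetical sample data: docA has been QCed once, docB not yet.
protocols = [
    {"protocol_id": "p1", "document_id": "docA", "replaced_by": "p2", "errors": 20},
    {"protocol_id": "p2", "document_id": "docA", "replaced_by": None, "errors": 5},
    {"protocol_id": "p3", "document_id": "docB", "replaced_by": None, "errors": 7},
]

current = current_protocols(protocols)

# One current protocol per document, so counting distinct document IDs
# gives the number of documents covered.
documents_covered = len({p["document_id"] for p in current})
remaining_errors = sum(p["errors"] for p in current)
print(documents_covered, remaining_errors)
```

Summing over the unfiltered list would count `docA` twice and mix its outdated error count with the current one, which is why the filter comes first.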
Same as the above, expanded by the individual error types:
https://tb.plazi.org/GgServer/ephStats/stats?outputFields=doc.nextDocId+errorType.name+errorType.label+errorType.parentCat+errorType.errors+errorType.falsePos+errorType.errorsBlocker+errorType.falsePosBlocker+errorType.errorsCritical+errorType.falsePosCritical&groupingFields=errorType.name+errorType.parentCat&orderingFields=errorType.parentCat&FP-doc.nextDocId=-0&FA-doc.nextDocId=min&FA-errorType.label=max&format=HTML
Best,
Guido