Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reject Duplicate Submissions #5047

Open
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

rajpatel24
Copy link
Contributor

@rajpatel24 rajpatel24 commented Aug 5, 2024

Checklist

  1. If you've added code that should be tested, add tests
  2. If you've changed APIs, update (or create!) the documentation
  3. Ensure the tests pass
  4. Make sure that your code lints and that you've followed our coding style
  5. Write a title and, if necessary, a description of your work suitable for publishing in our release notes
  6. Mention any related issues in this repository (as #ISSUE) and in other repositories (as kobotoolbox/other#ISSUE)
  7. Open an issue in the docs if there are UI/UX changes

Description

We have identified a race condition in the submission processing that causes duplicate submissions with identical UUIDs and XML hashes. This issue is particularly problematic under conditions with multiple remote devices submitting forms simultaneously over unreliable networks.

To address this issue, a PR has been raised with the following proposed changes:

  • Race Condition Resolution: A locking mechanism has been added to prevent the race condition when checking for existing instances and creating new ones. This aims to eliminate duplicate submissions.

  • UUID Enforcement: Submissions without a UUID are now explicitly disallowed. This ensures that every submission is uniquely identifiable and further mitigates the risk of duplicate entries.

  • Introduction of root_uuid:

    • To ensure a consistent submission UUID throughout its lifecycle and prevent duplicate submissions with the same UUID, a new root_uuid column has been added to the Instance model with a unique constraint (root_uuid per xform).

      • If the <meta><rootUuid> is present in the submission XML, it is stored in the root_uuid column.

      • If <meta><rootUuid> is not present, the value from <meta><instanceID> is used instead.

    • This approach guarantees that the root_uuid remains constant across the lifecycle of a submission, providing a reliable identifier for all instances.

  • UUID Handling Improvement: Updated the logic to strip only the uuid: prefix while preserving custom, non-UUID ID schemes (e.g., domain.com:1234). This ensures compliance with the OpenRosa spec and prevents potential ID collisions with custom prefixes.

  • Error Handling:

    • 202 Accepted: Returns when content is identical to an existing submission and successfully processed.
    • 409 Conflict: Returns when a duplicate UUID is detected but with differing content.

These changes should improve the robustness of the submission process and prevent both race conditions and invalid submissions.

Notes

  • Implemented a fix to address the race condition that leads to duplicate submissions with the same UUID and XML hash.
  • Incorporated improvements from existing work, ensuring consistency and robustness in handling concurrent submissions.
  • The fix aims to prevent duplicate submissions, even under high load and unreliable network conditions.

Related issues

Supersedes kobotoolbox/kobocat#876

@rajpatel24 rajpatel24 changed the base branch from main to beta August 5, 2024 12:43
@noliveleger noliveleger self-requested a review August 5, 2024 13:43
@rajpatel24 rajpatel24 force-pushed the 862-reject_duplicate_submissions branch from 28eefe5 to 039ed95 Compare August 5, 2024 16:20
@rajpatel24 rajpatel24 changed the title 862 Reject Duplicate Submissions Reject Duplicate Submissions Aug 5, 2024
@rajpatel24 rajpatel24 force-pushed the 862-reject_duplicate_submissions branch from 039ed95 to 6a177da Compare August 7, 2024 13:51
@noliveleger noliveleger self-assigned this Aug 21, 2024
@noliveleger noliveleger changed the base branch from beta to kobocat-django-app-part-2-refactor-mock-deployment-backend August 21, 2024 19:47
 # Conflicts:
 #	kobo/apps/openrosa/apps/logger/exceptions.py
 #	kobo/apps/openrosa/apps/logger/models/instance.py
 #	kobo/apps/openrosa/apps/logger/models/xform.py
 #	kobo/apps/openrosa/apps/logger/xform_instance_parser.py
 #	kobo/apps/openrosa/libs/utils/logger_tools.py
@noliveleger noliveleger changed the base branch from kobocat-django-app-part-2-refactor-mock-deployment-backend to beta-refactored August 26, 2024 15:05
@rajpatel24 rajpatel24 force-pushed the 862-reject_duplicate_submissions branch 2 times, most recently from bf597d5 to 2958052 Compare September 3, 2024 10:51
- Add test case for duplicate submission with an attachment
- Improve logic to extract UUID from xml
- Add logic to reject submission without UUID
@rajpatel24 rajpatel24 force-pushed the 862-reject_duplicate_submissions branch from 2958052 to f3c89f6 Compare September 3, 2024 17:29
Copy link

kobo/apps/openrosa/apps/logger/tests/test_parsing.py Outdated Show resolved Hide resolved
kobo/apps/openrosa/apps/logger/tests/test_parsing.py Outdated Show resolved Hide resolved
kobo/apps/openrosa/apps/logger/xform_instance_parser.py Outdated Show resolved Hide resolved
kpi/tests/api/v2/test_api_submissions.py Show resolved Hide resolved
kobo/apps/openrosa/libs/utils/logger_tools.py Outdated Show resolved Hide resolved
kobo/apps/openrosa/libs/utils/logger_tools.py Outdated Show resolved Hide resolved
@rajpatel24 rajpatel24 force-pushed the 862-reject_duplicate_submissions branch from 86e9313 to 306888b Compare September 30, 2024 09:00
@noliveleger noliveleger changed the base branch from beta-refactored to main October 7, 2024 18:31
noliveleger and others added 6 commits October 7, 2024 15:03
 # Conflicts:
 #	kobo/apps/openrosa/apps/api/tests/viewsets/test_xform_submission_api.py
 #	kobo/apps/openrosa/apps/logger/tests/test_simple_submission.py
 #	kobo/apps/openrosa/apps/logger/xform_instance_parser.py
 #	kobo/apps/openrosa/libs/utils/logger_tools.py
 # Conflicts:
 #	kobo/apps/openrosa/libs/utils/logger_tools.py
@rajpatel24 rajpatel24 force-pushed the 862-reject_duplicate_submissions branch from 5f50949 to 362bfa8 Compare October 17, 2024 21:06
Comment on lines +222 to +237
int_lock = int.from_bytes(
hashlib.shake_128(
f'{xform_id}!!{submission_uuid}!!{xml_hash}'.encode()
).digest(7), 'little'
)
acquired = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rajpatel24 please write a comment to explain why we use this thing.

Copy link
Contributor

@Guitlle Guitlle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good. We don't have preview steps, though, should we?

@rajpatel24 rajpatel24 force-pushed the 862-reject_duplicate_submissions branch from 578b2be to b0a4208 Compare January 9, 2025 12:27
@rajpatel24 rajpatel24 requested a review from noliveleger January 9, 2025 13:34
@noliveleger
Copy link
Contributor

The code looks good. We don't have preview steps, though, should we?

Indeed, but this PR predates the new PR workflow.
I'm relying on the the tests to reproduce the "previous steps".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants