added utf-8 support for SEND JSON POST Request #504
Closed
Description
This PR adds UTF-8 encoding support for the JSON POST requests that Jigasi's transcription module sends to the remote URLs configured via:
org.jitsi.jigasi.transcription.SEND_JSON_REMOTE_URLS=https://ts.meet.jit.si/transcriptions
This ensures proper handling of non-ASCII characters, especially for languages like Hindi, Tamil, Japanese, etc.
Changes:
Explicitly set the Content-Type header to application/json; charset=UTF-8 to indicate that the JSON data is UTF-8 encoded.
Modified the byte conversion of the JSON string to use UTF-8 encoding.
Change-1:
To:
Change-2:
To:
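A minimal sketch of what the two changes could look like, assuming the request is sent with `java.net.HttpURLConnection`. The class name `Utf8JsonPost` and the methods `encodeBody` and `post` are hypothetical and are not taken from the Jigasi source; only the header value and the UTF-8 byte conversion come from the description above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not the actual Jigasi class).
public class Utf8JsonPost {
    // Change-1: declare the charset explicitly in the Content-Type header.
    static final String CONTENT_TYPE = "application/json; charset=UTF-8";

    // Change-2: convert the JSON string to bytes with an explicit charset
    // instead of relying on the platform default.
    static byte[] encodeBody(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Sends the JSON body to the given address and returns the HTTP status code.
    static int post(String address, String json) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", CONTENT_TYPE);
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(encodeBody(json));
        }
        return conn.getResponseCode();
    }
}
```

Keeping the header and the byte conversion on the same charset matters: the receiver decodes the body according to the declared charset, so the two must agree.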
Motivation:
Transcription worked well in English, but switching the language to Hindi or another non-Latin-script language produced text full of question marks on the receiving side, which points to an encoding problem. Sending the data explicitly as UTF-8 is intended to resolve this so that non-ASCII characters are interpreted correctly.
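The question marks are consistent with text being encoded in a charset that cannot represent the characters. The sketch below illustrates that behavior, on the assumption (not confirmed by this description) that the previous code converted the string with a non-UTF-8 charset, for example a platform default such as ISO-8859-1. The class `EncodingDemo` and the method `roundTrip` are hypothetical names used only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
public class EncodingDemo {
    // Encodes the text to bytes and decodes it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The Hindi word "namaste", written with Unicode escapes so the
        // source file compiles regardless of its own encoding.
        String hindi = "\u0928\u092e\u0938\u094d\u0924\u0947";

        // ISO-8859-1 cannot represent Devanagari, so String.getBytes
        // substitutes '?' for every character.
        System.out.println(roundTrip(hindi, StandardCharsets.ISO_8859_1)); // prints ??????

        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(roundTrip(hindi, StandardCharsets.UTF_8).equals(hindi)); // prints true
    }
}
```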
Testing:
Tested the transcription feature with multiple languages, including Hindi, Tamil, and Japanese.
Verified that JSON POST requests to the URL configured in Jigasi's sip-communicator.properties are sent with UTF-8 encoding:
org.jitsi.jigasi.transcription.SEND_JSON_REMOTE_URLS=https://ts.meet.jit.si/transcriptions
Impact:
With this change, Jigasi can send transcripts in non-ASCII languages to remote JSON endpoints without the text being corrupted by encoding mismatches.