Implement the IMSC HRM for EBU-TT-D documents #66

Open: nigelmegitt (Collaborator) wants to merge 32 commits into base: master

@nigelmegitt nigelmegitt commented Dec 7, 2023

This PR improves support for creating EBU-TT-D documents from XML sources and implements validation of those documents against the IMSC HRM (https://www.w3.org/TR/imsc-hrm/). It incorporates the tests from w3c/imsc-hrm-tests that are valid EBU-TT-D, and runs them.

Also closes #62 by incorporating the changes proposed in #63.

@nigelmegitt nigelmegitt marked this pull request as ready for review December 7, 2023 12:20
Includes all the imsc-hrm-tests test files that can be valid EBU-TT-D, fixed up to be so.
Iterate through the characters getting the glyphs.
Decide if they need to be rendered or copied from the cache.
Compute NRGA, and check the Glyph Cache size on each iteration.
Still to do: compute copyDur and renderDur. Needs UAX24 implementation too.
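
A minimal sketch of the glyph loop described above, for readers unfamiliar with the HRM. It is not the code in this PR: the `Glyph` class, `classify_glyphs`, and the `NGBS` value are hypothetical names and an assumed constant, and the rule for when a glyph counts as "copied" is one plausible reading that should be checked against the IMSC HRM specification.

```python
from dataclasses import dataclass

# Assumed normalised glyph buffer size; check the IMSC HRM spec for the
# normative value before relying on it.
NGBS = 1.0


@dataclass(frozen=True)
class Glyph:
    """Hypothetical glyph key: the character plus the style affecting its shape."""
    char: str
    font_family: str
    font_size: float  # font size relative to the root container height

    @property
    def nrga(self) -> float:
        # Normalised rendered glyph area: the relative font size, squared.
        return self.font_size ** 2


def classify_glyphs(glyphs, previous_cache):
    """Split one ISD's glyphs into rendered and copied, and build its cache.

    A glyph is treated as copied if it was in the previous ISD's cache or
    has already been seen in this ISD; otherwise it must be rendered.
    The cache size is checked on every iteration.
    """
    rendered, copied = [], []
    new_cache = set()
    cache_area = 0.0
    for glyph in glyphs:
        if glyph in new_cache or glyph in previous_cache:
            copied.append(glyph)
        else:
            rendered.append(glyph)
        if glyph not in new_cache:
            new_cache.add(glyph)
            cache_area += glyph.nrga
            if cache_area > NGBS:
                raise ValueError("glyph cache size exceeded")
    return rendered, copied, new_cache
```

The cache returned for one ISD is passed as `previous_cache` when processing the next, so repeated text becomes cheap copies rather than fresh renders.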
Parses the Unicode UAX24 scripts list and generates a Python file that specifies those lists in a form that can be queried later. Needed for the IMSC HRM implementation.
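
For illustration only, here is roughly what such a generator could look like. It assumes the input follows the layout of the Unicode `Scripts.txt` data file (code point or range, semicolon, script name, trailing `#` comment). The function names and the shape of the generated module are hypothetical and may differ from the `uax24.py` in this PR.

```python
import re
from bisect import bisect_right

# Matches "0041..005A ; Latin" or "00AA ; Latin" once comments are stripped.
_LINE = re.compile(
    r"^([0-9A-Fa-f]{4,6})(?:\.\.([0-9A-Fa-f]{4,6}))?\s*;\s*(\w+)"
)

SAMPLE = [
    "# Sample lines in the Scripts.txt layout",
    "0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z",
    "00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR",
    "3041..3096    ; Hiragana # Lo  [86] HIRAGANA LETTER SMALL A..HIRAGANA LETTER SMALL KE",
]


def parse_scripts(lines):
    """Return a sorted list of (first, last, script) code point ranges."""
    ranges = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        match = _LINE.match(line)
        if match is None:
            continue
        first = int(match.group(1), 16)
        last = int(match.group(2) or match.group(1), 16)
        ranges.append((first, last, match.group(3)))
    ranges.sort()
    return ranges


def render_module(ranges):
    """Return Python source text defining the ranges as a module constant."""
    return "SCRIPT_RANGES = " + repr(ranges) + "\n"


def script_of(char, ranges):
    """Look up the script of one character; 'Unknown' if no range matches."""
    code_point = ord(char)
    starts = [first for first, _, _ in ranges]  # rebuilt per call for brevity
    index = bisect_right(starts, code_point) - 1
    if index >= 0 and ranges[index][0] <= code_point <= ranges[index][1]:
        return ranges[index][2]
    return "Unknown"
```

Generating a module once, rather than parsing the data file at run time, keeps the lookup table importable without shipping or re-reading the Unicode data.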
* Fix NRGA calculation to square the area
* Calculate _GCpy and _Ren based on uax24 script
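
The two fixes above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the `FACTORS` table, its `DEFAULT`, and the numbers in them are placeholders chosen for the example, and the normative script list and factor values are those in the IMSC HRM specification.

```python
# Placeholder factors as (copy factor, render factor) per UAX24 script name.
# These numbers are assumptions for illustration; take the real ones
# from the IMSC HRM specification.
FACTORS = {
    "Han": (3.0, 0.6),
}
DEFAULT = (12.0, 1.2)


def nrga(font_size_px, root_height_px):
    """Normalised rendered glyph area: the relative font size, squared."""
    return (font_size_px / root_height_px) ** 2


def text_duration(rendered, copied, root_height_px):
    """Sum the time to render and copy glyphs for one ISD.

    `rendered` and `copied` are lists of (script, font_size_px) pairs.
    Each glyph costs its NRGA divided by the factor for its script.
    """
    total = 0.0
    for script, font_size_px in rendered:
        _, ren = FACTORS.get(script, DEFAULT)
        total += nrga(font_size_px, root_height_px) / ren
    for script, font_size_px in copied:
        gcpy, _ = FACTORS.get(script, DEFAULT)
        total += nrga(font_size_px, root_height_px) / gcpy
    return total
```

Squaring matters because the cost scales with glyph area rather than height: doubling the font size quadruples the contribution of each glyph.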

Passes all but 2 of the tests.

TODO: tidy up stdout and log messages
For example, as might occur if text is broken with a `<br/>` child element of the `<span>`
Change all the `print()`s to `log.debug()`s so the log is readable.
Fix a bug where a `<br/>` after some text content would cause the previous text to be processed and counted again, which caused dur014-pass to fail incorrectly.
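
To show the class of bug, here is a sketch of visiting each text run exactly once when a `<span>` contains `<br/>` children. It assumes the standard library's `xml.etree.ElementTree` model, where text following a child element is stored as that child's `tail`. The project's own XML bindings may expose this differently, and `text_runs` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def text_runs(elem):
    """Yield each text run under elem exactly once, in document order."""
    if elem.text:
        yield elem.text  # text before the first child
    for child in elem:
        if local_name(child.tag) != "br":
            yield from text_runs(child)  # a br has no text of its own
        if child.tail:
            yield child.tail  # text after this child, owned by the parent


span = ET.fromstring(
    '<span xmlns="http://www.w3.org/ns/ttml">Hello<br/>world</span>'
)
runs = list(text_runs(span))
```

Yielding `text` once per element and `tail` once per child means no run is revisited when a later `<br/>` is encountered.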
Should have 0.5s available render time for second ISD.
And regenerate uax24.py script
Previously, each `p` in an ISD was processed as if it were a separate ISD, which was wrong. Now all the elements in each ISD are gathered together and processed as a group. Also tidies up the time handling.
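
One way to picture the grouping, as a rough sketch rather than the PR's implementation: collect every begin and end time as a change point, then list the elements active in each interval between consecutive change points. The `(begin, end, ident)` tuple shape and the `build_isds` name are assumptions made for this example.

```python
def build_isds(paragraphs):
    """Group timed elements into intermediate synchronic documents (ISDs).

    `paragraphs` is a list of (begin, end, ident) tuples with begin < end,
    times in seconds. Returns a list of (begin, end, [idents]) covering
    every interval between consecutive change points, including intervals
    where nothing is active.
    """
    times = sorted({t for begin, end, _ in paragraphs for t in (begin, end)})
    isds = []
    for start, stop in zip(times, times[1:]):
        # An element is active if it has begun and has not yet ended.
        active = [
            ident for begin, end, ident in paragraphs if begin <= start < end
        ]
        isds.append((start, stop, active))
    return isds
```

With overlapping paragraphs, a single interval lists several elements, so their glyphs and regions are costed together as one ISD instead of one ISD per `p`.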

When an ISD has a region with an opaque background colour and showBackground="always", there can _never_ be an empty ISD, because the background always needs to be painted.
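
The rule can be expressed as a small predicate. This sketch uses a hypothetical `Region` record, not the PR's data model, and it assumes that any non-transparent background (alpha above zero) needs painting, which is slightly broader than the "opaque" case described above.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical summary of one region within an ISD."""
    background_alpha: float  # 0.0 is fully transparent, 1.0 fully opaque
    show_background: str     # "always" or "whenActive"
    has_content: bool        # True if any content is selected into the region


def region_is_painted(region):
    """Return True if the region draws anything in this ISD."""
    if region.has_content:
        return True
    # With no content, only an always-shown visible background is painted.
    return region.show_background == "always" and region.background_alpha > 0


def isd_is_empty(regions):
    """An ISD is empty only if none of its regions is painted."""
    return not any(region_is_painted(region) for region in regions)
```

Under this reading, a document with one such always-shown region never produces an empty ISD, even in the gaps between subtitles.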
The log handling in --verbose mode is flaky, not sure why.
Makes the tests behave as expected.
The upstream repo has had those conversions applied, so the tests are now the same.
Came back during rebase
Successfully merging this pull request may close these issues.

Support reading EBU-TT-D documents