Did you use other scripts beside Latin in MLT2017 for pre-training? #13

Closed
rubycheen opened this issue Oct 21, 2022 · 6 comments


@rubycheen

Did you use other scripts besides Latin in MLT2017 for pre-training?

@zx1239856
Contributor

Our model only works with Latin scripts. You may refer to #12 if you want to extend the character set.

@rubycheen
Author

@zx1239856 Thanks a lot! I also found that the COCO-format data has a "polys" column with 16 numbers (8 points?) per entry. How did you convert the ICDAR dataset, which only has four-point bounding boxes, to that format?

@zx1239856
Contributor

When "polys" holds 8 points (16 numbers), they are the control points of Bezier curves. For polygon annotations, we use 16 points. We reuse this column name for both types of annotations.

Please refer to https://github.com/mlpc-ucsd/TESTR/blob/main/adet/data/builtin.py#L19-L47. For example, totaltext_poly_train is polygonal TotalText, whereas totaltext_train is the Bezier variant.

As for converting quadrilateral boxes to 16-point polygons, I suppose you could add 3 more points to each side of the box. I've also seen other approaches based on dividing the perimeter, i.e. you divide the perimeter evenly into 13 segments so that 12 points can be added.

@rubycheen
Author

Thanks! So is your illustration exactly the same as your conversion method (dividing the perimeter evenly into 13 segments so that 12 points can be added)?

@zx1239856
Contributor

We simply adopted the first approach I mentioned above, i.e. inserting 3 more points on each side.
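For readers who want to try this, here is a minimal sketch of the "3 more points on each side" idea. It is not the conversion script used for TESTR. The function names `quad_to_polygon` and `flatten`, the even spacing of the inserted points, and the assumption that the four corners are already in the order your dataset expects are all illustrative choices.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def quad_to_polygon(quad: Sequence[Point], extra_per_side: int = 3) -> List[Point]:
    """Insert `extra_per_side` evenly spaced points on each side of a quadrilateral.

    With the default of 3, the 4 corners plus 4 * 3 inserted points give 16 points.
    The corner order of `quad` is preserved (assumed to be already consistent).
    """
    if len(quad) != 4:
        raise ValueError("expected exactly 4 corner points")

    steps = extra_per_side + 1  # number of equal segments per side
    polygon: List[Point] = []
    for i in range(4):
        (x0, y0), (x1, y1) = quad[i], quad[(i + 1) % 4]
        # k = 0 emits the corner itself; k = 1..extra_per_side emit interior points.
        # The next corner is emitted by the following side, so nothing is duplicated.
        for k in range(steps):
            t = k / steps
            polygon.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return polygon


def flatten(points: Sequence[Point]) -> List[float]:
    """Flatten [(x, y), ...] into [x0, y0, x1, y1, ...]."""
    return [coord for point in points for coord in point]


if __name__ == "__main__":
    box = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
    polygon = quad_to_polygon(box)
    print(len(polygon), len(flatten(polygon)))
```

Whether the flattened list can be written directly into the "polys" column depends on the point ordering the dataset loader expects, so check that against an existing polygonal annotation file before relying on it.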

@rubycheen
Author

I got it, thank you so much:)
