How do I get line-text images and labels? #43

wksuixin · 2021-03-03T02:35:35Z

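A line-text image can be obtained by cropping the minimum bounding rectangle of the characters in that line. But how do I get the line-text label? My main uncertainty is whether the order in which individual characters appear in train.jsonl matches the order in which they appear in the line of text.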

yuantailing · 2021-03-03T08:15:25Z

Hello, the instance order is the same as the reading order.

sentence:
[instance_0, instance_1, instance_2, ...]                 # MUST NOT be empty

instance:
{
    polygon: [[x0, y0], [x1, y1], [x2, y2], [x3, y3]],    # x, y are floating-point numbers
    text: str,                                            # the length of the text MUST be exactly 1
    is_chinese: bool,
    attributes: [attr_0, attr_1, attr_2, ...],            # MAY be an empty list
    adjusted_bbox: [xmin, ymin, w, h],                    # x, y, w, h are floating-point numbers
}

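Annotators generally label characters in natural reading order. In the structure above, the order of the instances in the sentence array preserves that annotation order. So to get the label for a line of text, just concatenate the instance.text fields in sequence.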
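
As a rough illustration of the above, here is a minimal Python sketch. It assumes a sentence has already been loaded as a list of instance dicts with the `polygon` and `text` fields shown in the structure; how a sentence is located inside a train.jsonl record is not covered in this thread and is left out. The function names `sentence_label` and `sentence_bbox` are hypothetical, and the box is a simple axis-aligned one rather than a rotated minimum-area rectangle.

```python
def sentence_label(sentence):
    """Join the per-character text fields in their stored (annotation) order."""
    return "".join(instance["text"] for instance in sentence)


def sentence_bbox(sentence):
    """Return (xmin, ymin, xmax, ymax) enclosing every character polygon.

    Simplification: this is an axis-aligned box, not a rotated rectangle.
    """
    xs = [x for instance in sentence for x, _ in instance["polygon"]]
    ys = [y for instance in sentence for _, y in instance["polygon"]]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up sample data for illustration only; not taken from the dataset.
sample_sentence = [
    {"polygon": [[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]], "text": "中"},
    {"polygon": [[12.0, 1.0], [22.0, 1.0], [22.0, 11.0], [12.0, 11.0]], "text": "国"},
]

label = sentence_label(sample_sentence)
box = sentence_bbox(sample_sentence)
print(label, box)
```

The resulting box can then be used to crop the line image with whatever image library you prefer; for strongly tilted text, a rotated rectangle would fit more tightly than this axis-aligned box.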

wksuixin · 2021-03-03T08:19:28Z

OK, thank you.

wksuixin closed this as completed Mar 3, 2021
