Parsing Table Structures in the Wild

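This paper builds a complex, large-scale dataset for table structure parsing and introduces Cycle-CenterNet, which optimizes a cycle-pairing module with a pairing loss so that discrete cells can be accurately grouped into structured tables. It achieves state-of-the-art (SOTA) table structure parsing performance on the WTW and ICDAR2019 datasets.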

Apr 19, 2022

Contents

Introduction The WTW Dataset Cycle-CenterNet Cycle-Pairing Module Parsing-Processing Module Experiment

Introduction

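The contributions of this paper are as follows.

It builds a complex, large-scale dataset for table structure parsing.

It introduces Cycle-CenterNet, which optimizes a cycle-pairing module with a pairing loss so that discrete cells can be accurately grouped into structured tables.

It achieves SOTA table structure parsing performance on the WTW and ICDAR2019 datasets.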

The WTW Dataset

Image Collection and Annotation

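The dataset consists of 50% natural images, 30% archival documents, and 20% printed documents.

After all images were collected, they were categorized into seven challenging cases, with each case represented in a reasonable proportion.

For every table shown in each image, the cell coordinates and row/column information were annotated.

When an image contains two or more tables, instance information was recorded as well.

To keep the training and test distributions nearly identical, about 75% of the original images were randomly selected as the training set, and the remaining samples are used for testing and evaluation.

The WTW dataset contains 10,970 training samples and 3,611 test samples.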

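A sound evaluation protocol is essential for comparing different approaches quantitatively. A given table structure parser is evaluated from two aspects: (1) the accuracy of the physical structure and (2) the accuracy of the logical structure.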

Precision, Recall and F-score for physical structure estimation.

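Cell detection accuracy is evaluated by computing the precision, recall, and F1 score of the parsing results against the ground truth on the test split of the WTW dataset. Unlike general object detection, table structure parsing has little tolerance for error and requires precisely localized table cells. A detected cell whose IoU with the ground truth is below 0.9 is therefore counted as a false positive.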
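The thresholded matching described above can be sketched as follows. This is a minimal illustration, not the paper's evaluation script: it assumes axis-aligned boxes (WTW cells are quadrilaterals) and a simple greedy one-to-one matching, and the names `iou` and `cell_detection_scores` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), an assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cell_detection_scores(pred: List[Box], gt: List[Box], thr: float = 0.9):
    """Greedy one-to-one matching; a prediction is a true positive only if IoU >= thr."""
    unmatched = list(range(len(gt)))
    tp = 0
    for p in pred:
        best, best_iou = None, 0.0
        for g in unmatched:
            v = iou(p, gt[g])
            if v > best_iou:
                best, best_iou = g, v
        if best is not None and best_iou >= thr:
            unmatched.remove(best)  # each ground-truth cell can be matched once
            tp += 1
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gt) if gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

With a threshold of 0.9, a predicted cell that covers only 80% of its ground-truth cell is rejected even though it would pass the 0.5 threshold common in general object detection.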

Precision, Recall, F-score and TEDS [24] for adjacent relationship estimation.

For logical structure accuracy, the paper follows the evaluation protocol used for document images and computes precision, recall, and F-score on cell adjacency relations, together with the tree-edit-distance similarity (TEDS).
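As a rough illustration of the adjacency-based scores, the sketch below compares two sets of adjacency relations. The relation format (a pair of cell identifiers plus a direction) and the function name `adjacency_scores` are assumptions made for this example; how cells are identified and matched in the actual protocol is not covered here.

```python
from typing import Set, Tuple

# (cell_a, cell_b, direction): an assumed encoding of one adjacency relation
Relation = Tuple[str, str, str]


def adjacency_scores(pred: Set[Relation], gt: Set[Relation]):
    """Precision, recall, and F-score over exactly matching adjacency relations."""
    correct = len(pred & gt)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gt) if gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1
```

For a 2x2 table there are four ground-truth relations (two horizontal, two vertical); a parser that recovers two of them and adds one wrong relation gets a precision of 2/3 and a recall of 1/2.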

Cycle-CenterNet

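Building on CenterNet, a Cycle-Pairing module and a pairing loss are added to learn the common vertices shared by adjacent cells.

Through these common vertices, all cells can be connected together to obtain the complete table structure.

Finally, the same parsing-processing step is used to obtain the row/column information.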

Cycle-Pairing Module

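To recognize the table structure, the paper proposes the Cycle-Pairing module, which localizes cells and learns how neighboring cells are spliced together.

The center-to-vertex branch regresses the offsets from the center of a table cell to its vertices.

The vertex-to-center branch learns the offsets between a common vertex and the center points of the cells surrounding it.

Finally, the splicing information of the table can be inferred in the parsing-processing step.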

Center-to-Vertex branch for cells localization

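The feature map extracted by the DLA-34 backbone is fed into the center-to-vertex branch, which predicts the center-to-vertex offsets.

These are the coordinate offsets between a center point P and its four vertices.

They are predicted for every center point, and the total count is the number of center points over all table cells.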
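To make the regression target concrete, the sketch below computes the offsets from a cell center to its four vertices. It is only an illustration under stated assumptions: the center is taken as the mean of the four vertices, the vertex order is whatever the caller passes in, and the function name `center_to_vertex_offsets` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def center_to_vertex_offsets(quad: List[Point]) -> Tuple[Point, List[Point]]:
    """Return the cell center and the (dx, dy) offset from it to each of the 4 vertices."""
    if len(quad) != 4:
        raise ValueError("a cell is expected to have exactly 4 vertices")
    # Assumption: the center is the mean of the four vertex coordinates.
    cx = sum(x for x, _ in quad) / 4.0
    cy = sum(y for _, y in quad) / 4.0
    offsets = [(x - cx, y - cy) for x, y in quad]
    return (cx, cy), offsets
```

Adding each offset back to the center recovers the quadrilateral, which is how a predicted center plus four predicted offsets yields a cell of arbitrary quadrilateral shape.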

Vertex-to-Center branch for cells grouping

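The feature map extracted by the DLA-34 backbone is fed into the vertex-to-center branch, which predicts the vertex-to-center offsets.

These are the coordinate offsets between a common vertex and its four surrounding center points.

They are predicted for every common vertex, and the total count is the number of all common vertices.

If fewer than four cells share a vertex K, the regression values for the missing cells are set to 0.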
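The zero-filling rule can be illustrated with a small sketch. It assumes that each common vertex has four fixed neighbor slots (for example top-left, top-right, bottom-right, bottom-left; this ordering is an assumption of the example) and that a missing neighbor is passed as `None`. The function name `vertex_to_center_offsets` is hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def vertex_to_center_offsets(vertex: Point,
                             centers: List[Optional[Point]]) -> List[Point]:
    """Offsets from a common vertex to the centers of up to 4 cells sharing it.

    Slots without a cell (None) get a zero offset, mirroring the rule above.
    """
    if len(centers) != 4:
        raise ValueError("expected exactly 4 neighbor slots")
    vx, vy = vertex
    offsets = []
    for c in centers:
        if c is None:
            offsets.append((0.0, 0.0))  # fewer than 4 cells share this vertex
        else:
            offsets.append((c[0] - vx, c[1] - vy))
    return offsets
```

A vertex on the table border, shared by only two cells, therefore produces two real offsets and two zero entries.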

Pairing Loss for Cycle-Pairing Module

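The regression terms are the L1 losses between the predicted offsets and the ground truth.

The coefficients that balance these terms are hyperparameters.

The dynamic weight scales the contribution to the total loss according to how well the regression is performing.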

Dynamic weighting function

The regression error score is defined for each center-vertex pair.

Parsing-Processing Module

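As the final step, a parsing-processing module is proposed to recover the complete table structure information, including the table id, the start row/column, and the end row/column.

First, every cell is split into its four bounding edges; the top and bottom edges are merged into horizontal lines, and the left and right edges are merged into vertical lines.

The horizontal lines and the vertical lines are then each sorted and indexed from 0.

Finally, each cell is indexed from its line indices, and its row/column information is output.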
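The three steps above can be sketched as follows. This is a simplified illustration rather than the paper's implementation: it assumes axis-aligned cells from a single table, merges edges with a hypothetical pixel tolerance `tol`, and does not assign table ids. The names `merge_lines`, `nearest_index`, and `index_cells` are made up for this example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), an assumed format


def merge_lines(coords: List[float], tol: float) -> List[float]:
    """Sort 1-D edge positions and merge neighbors closer than tol into one line each."""
    groups: List[List[float]] = []
    for c in sorted(coords):
        if groups and c - groups[-1][-1] <= tol:
            groups[-1].append(c)  # close to the previous edge: same line
        else:
            groups.append([c])    # start a new line
    return [sum(g) / len(g) for g in groups]  # list position is the 0-based line index


def nearest_index(lines: List[float], value: float) -> int:
    """Index of the merged line closest to an edge position."""
    return min(range(len(lines)), key=lambda i: abs(lines[i] - value))


def index_cells(cells: List[Box], tol: float = 3.0) -> List[Dict[str, int]]:
    """Assign start/end row and column to each cell from its bounding line indices."""
    h_lines = merge_lines([y for _, y0, _, y1 in cells for y in (y0, y1)], tol)
    v_lines = merge_lines([x for x0, _, x1, _ in cells for x in (x0, x1)], tol)
    result = []
    for x0, y0, x1, y1 in cells:
        top, bottom = nearest_index(h_lines, y0), nearest_index(h_lines, y1)
        left, right = nearest_index(v_lines, x0), nearest_index(v_lines, x1)
        # A cell between lines i and j spans rows (or columns) i .. j - 1.
        result.append({"start_row": top, "end_row": bottom - 1,
                       "start_col": left, "end_col": right - 1})
    return result
```

For a header cell spanning two columns above two ordinary cells, the header gets columns 0 to 1 in row 0, and the two cells below get columns 0 and 1 in row 1, even if their top edges differ by a fraction of a pixel.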

Experiment

CascadeTabNet and Split+Heuristic, which were designed for well-conditioned tabular images, performed very poorly on the WTW dataset, which contains many images captured in the wild.

Compared with the CenterNet baseline, performance improves step by step as the proposed changes are applied: replacing horizontal rectangle regression with arbitrary quadrilateral regression, and then additionally introducing Cycle-CenterNet and the pairing loss.

Despite achieving SOTA, the method still performs poorly on the difficult subsets such as Curved, Occluded & blurred, and Overlaid.

Although Cycle-CenterNet was designed for wired tables, it also achieves an F1 score of 91.7% on the ICDAR2013 dataset, which contains wireless tables.

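Precision, recall, and F1 score are computed separately at each IoU threshold from 0.6 to 0.9 and then combined into a weighted-average F1. On this metric, the method achieves SOTA with a 15.2% improvement over previous methods.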
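The weighted average can be sketched as below. The weighting scheme is an assumption of this example: it follows the ICDAR 2019 cTDaR convention as commonly described, in which each F1 score is weighted by its own IoU threshold. The function name `weighted_average_f1` and the sample scores are hypothetical.

```python
from typing import Dict


def weighted_average_f1(f1_by_iou: Dict[float, float]) -> float:
    """Weighted-average F1, with each score weighted by its IoU threshold (assumed scheme)."""
    total_weight = sum(f1_by_iou)  # sum of the IoU thresholds (dict keys)
    if total_weight == 0:
        raise ValueError("at least one non-zero IoU threshold is required")
    return sum(thr * f1 for thr, f1 in f1_by_iou.items()) / total_weight


# Made-up F1 scores at IoU thresholds 0.6, 0.7, 0.8, 0.9
scores = {0.6: 0.9, 0.7: 0.8, 0.8: 0.6, 0.9: 0.3}
wavg = weighted_average_f1(scores)
```

Under this scheme, scores at stricter thresholds count more, so a method that localizes cells precisely is rewarded over one that only matches loosely.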
