References
- Y. Xiao, M. Xue, T. Lu, Y. Wu, and S. Palaiahnakote, "A Text-Context-Aware CNN Network for Multi-oriented and Multi-language Scene Text Detection," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 695-700, 2019.
- O. Tursun, R. Zeng, S. Denman, S. Sivapalan, S. Sridharan, and C. Fookes, "MTRNet: A Generic Scene Text Eraser," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 39-44, 2019.
- K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
- S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, "Aggregated Residual Transformations for Deep Neural Networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1492-1500, 2017.
- G. Pavlakos, L. Zhu, X. Zhou, and K. Daniilidis, "Learning to Estimate 3D Human Pose and Shape from a Single Color Image," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 459-468, 2018.
- J. Hu, L. Shen, and G. Sun, "Squeeze-and-Excitation Networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132-7141, 2018.
- W. Wang, E. Xie, X. Li, W. Hou, T. Lu, G. Yu, and S. Shao, "Shape Robust Text Detection With Progressive Scale Expansion Network," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9336-9345, 2019.
- S. S. Paliwal, R. Rahul, M. Sharma, and L. Vig, "TableNet: Deep Learning Model for End-to-end Table Detection and Tabular Data Extraction from Scanned Document Images," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pp. 128-133, 2019.
- R. Smith, "An Overview of the Tesseract OCR Engine," in Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2007.
- O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," Medical Image Computing and Computer-Assisted Intervention, vol. 9351, pp. 234-241, 2015.
- L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation," in Proceedings of the European Conference on Computer Vision, pp. 801-818, 2018.
- R. Caruana, "Multitask Learning," Machine Learning, vol. 28, no. 1, pp. 41-75, 1997. https://doi.org/10.1023/A:1007379606734
- Z. Chen, R. Zhang, G. Zhang, Z. Ma, and T. Lei, "Digging Into Pseudo Label: A Low-Budget Approach for Semi-Supervised Semantic Segmentation," IEEE Access, vol. 8, pp. 41830-41837, 2020. https://doi.org/10.1109/access.2020.2975022
- H. Wu, S. Zheng, J. Zhang, and K. Huang, "Fast End-to-End Trainable Guided Filter," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1838-1847, 2018.