References
- J.M. Lim, S.J. Jang, M.Y. Kim, J. H. Lee, "2014 Status of Utilization of Internet," Korea Internet Agency, 2014
- Deng C., Shipeng Y., Ji-Rong W., Wei-Ying M., "VIPS: a Vision-based Page Segmentation Algorithm," Microsoft Technical Report(MSR-TR-2003-79), 2003.
- Suhit G., Gail E. K., Peter G., Michael F. C., Justin S., "Automating Content Extraction of HTML Documents," World Wide Web, vol.8, Issue 2, pp.179-224, 2005. https://doi.org/10.1007/s11280-004-4873-3
- Jeff P., Dan R., "Extracting Article Text from the Web with Maximum Subsequence Segmentation," The 18th international conference on World wide web, pp.971-980, 2009.
- Stefan E., "A lightweight and efficient tool for cleaning Web pages," The 6th International Conference on Language Resources and Evaluation, 2008.
- Christian K., Peter F., Wolfgang N., "Boilerplate Detection using Shallow Text Features," The third ACM international conference on Web search and data mining, pp.441-450, 2010.
- Jian F., Ping L., Suk Hwan L., Sam L., Parag J., Jerry L., "Article Clipper-A System for Web Article Extraction," 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.743-746, 2011.
- Tim W., William H. H., Jiawei H., "CETR-Content Extraction via Tag Ratios," 19th international conference on World wide web, pp.971-980, 2010.
- Jung-chan Yun, Sung-dae Yun, "Design of personalized Web mining using association rules," Journal of Korea multimedia society, Vol. 11-11, pp.1566-1574, 2008.
- Hyung-woo Lee, Tae-su Kim, "Research of knowledge inference algorithm with associated mining method based on Ontology", Journal of Korea multimedia society, Vol. 11-11, pp.1601-1614, 2008.
- Tomaz K., Evaluating Text Extraction Algorithms. [Online]. Available: http://tomazkovacic.com/blog/ (downloaded 2012, Jul.)
- W3C Recommendation. (1999, Dec. 24). HTML 4.01 Specification [Online]. Available: http://www.w3.org/TR/html401/ (downloaded 2012, Jul.)
- Ju-gil Hong, Eun-young Shin, Jue-il Lee, Won-Seok Lee, "Automatic Hierarchical Classification of news articles using association rules", Journal of Korea multimedia society, Vol. 14-6, pp.730-741, 2011. https://doi.org/10.9717/kmms.2011.14.6.730
- Won-moon Song, Woo-seung Kim, Mung-won Kim, "HTML document, extraction using the context of the surrounding text blocks", Journal of Korean Institute of Information Scientists and Engineers : Software and Applications, Vol. 40-3, pp.155-163, 2013.
- S.-H. Lin, J.-M. Ho, Discovering Informative Content Blocks from Web Documents. Proc. of 8th ACM SIGKDD Intl. Conf. Knowledge Discovery and Data Mining, 2002.
- Young-gu Lee, "Study on the article text extraction from news web page", Journal of Korea Society for Information Management, Vol. 26, pp.305-320, 2009. https://doi.org/10.3743/KOSIM.2009.26.1.305
- L. Bing, Y. Wang, Y. Zhang, Primary Content Extraction with Mountain Model. Proc. 8th IEEE CIT, 2008.
Cited by
- Analysis of HTML Standards and Semantics Levels of Korean Internet Newspapers vol.18, pp.5, 2015, https://doi.org/10.9728/dcs.2017.18.5.949
- Software Implementation to Convert Table and Text-Based Hangul Files(.hwp) to HTML vol.17, pp.12, 2015, https://doi.org/10.14801/jkiit.2019.17.12.155