Search | Korea Science

Trends in the Construction of Software Reference Datasets (소프트웨어 참조 데이터세트 구축 동향)

Kim, Ki-Bom;Park, Sang-Seo
- Review of KIISC
- v.18 no.1
- pp.70-77
- 2008
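To make the analysis of evidence data in digital forensics more efficient, well-known files must be excluded from analysis, or the presence of specific files must be checked. A software reference dataset serves this purpose: it stores precomputed hash values for files that need no analysis, such as system files, font files, and application program files, and for malicious files such as rootkits, backdoors, and exploit code. This paper surveys the main trends in building software reference datasets. In particular, it examines in detail, from the standpoint of usability, the NSRL RDS of the United States, which leads the construction of such datasets. The analysis shows that the NSRL RDS is very difficult to use in actual computer forensic tools.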
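The filtering idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's tooling: the function names, the in-memory `(name, contents)` pairs, and the two digest sets are hypothetical, SHA-1 is chosen only for the example, and the actual NSRL RDS file layout is not modeled.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of a file's contents as lowercase hex."""
    return hashlib.sha1(data).hexdigest()


def triage(files, known_benign, known_malicious):
    """Sort (name, contents) pairs into skip / flag / review buckets.

    known_benign and known_malicious are hypothetical sets of hex digests
    standing in for a reference dataset; they are not the NSRL RDS format.
    """
    buckets = {"skip": [], "flag": [], "review": []}
    for name, data in files:
        digest = sha1_hex(data)
        if digest in known_malicious:
            buckets["flag"].append(name)      # known bad: report it
        elif digest in known_benign:
            buckets["skip"].append(name)      # known good: exclude from analysis
        else:
            buckets["review"].append(name)    # unknown: needs an examiner
    return buckets


# Illustrative data only: the byte strings stand in for real file contents.
benign = {sha1_hex(b"system font bytes")}
malicious = {sha1_hex(b"rootkit bytes")}
evidence = [
    ("arial.ttf", b"system font bytes"),
    ("dropper.bin", b"rootkit bytes"),
    ("notes.txt", b"unknown document"),
]
result = triage(evidence, benign, malicious)
```

Checking the malicious set first is a design choice in this sketch, so that a digest present in both sets is flagged rather than silently skipped.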

Font Data-driven Oriental Brush-Art Calligraphy Generation (폰트 데이터 기반의 동양적 붓글씨 필적 생성)

Ahn, Jeong-Ho;Lee, In-Kwon
- Proceedings of the Korean Information Science Society Conference
- 2010.06b
- pp.275-278
- 2010
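This paper proposes a method that analyzes the curve data of an existing typeface and produces the effect of the same characters having been rewritten with a brush, as in calligraphy. The topological skeleton that forms a character is split into curves so that each character is represented as several separate strokes, and the brush trajectory is then generated automatically from the dimensionality, length, and curvature of the curve corresponding to each stroke. The problem to be solved is how to use existing glyph data to automatically create the manipulation path that determines how the brush trajectory is rendered, and thereby generate brush handwriting.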

A study of improve vectorising technique on the internet (인터넷에서의 개선된 벡터라이징 기법에 관한 연구)

김용호;이윤배
- Journal of the Korea Institute of Information and Communication Engineering
- v.6 no.2
- pp.271-281
- 2002
Currently, most web designers guarantee high quality by using bitmap graphics with a fixed font size, but this approach has drawbacks in file size and flexibility. In particular, to provide high-quality banner and advertising text, one must first use a bitmap editing program and then add the result to the HTML document as bitmap data. This study introduces a couple of new tags placed at the front of HTML documents and shows methods by which diverse effects can be presented. Because text information is stored and drawn on screen from simple control points and outline information, Hangul characters can be displayed in a web browser with the same quality as printed documents. Regardless of the platform, characters can be rendered accurately and with diverse effects.

Implementation of closed caption service S/W module on DTV receiver (DTV 수신기의 자막방송 S/W 모듈의 구현)

Kim Sun-Gwon;No Seung-Yong
- Journal of the Institute of Electronics Engineers of Korea SP
- v.41 no.1
- pp.69-76
- 2004
Recently, the development of DTV receivers and the demand for additional services on them have grown rapidly. In this paper, we implement a new closed caption engine on a DTV receiver for deaf and hard-of-hearing viewers and for language learners. The domestic closed caption specification largely adopts EIA-608A. Following the specification fully, we present how to implement closed caption functions with a new algorithm. The functions include paint-on, pop-on, roll-up/down, and others. Experimental results show that the proposed technique provides satisfactory performance on a DTV receiver.

A Study on the OCR of Korean Sentence Using DeepLearning (딥러닝을 활용한 한글문장 OCR연구)

Park, Sun-Woo
- Annual Conference on Human and Language Technology
- 2019.10a
- pp.470-474
- 2019
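To improve Korean (Hangul) OCR performance, we sought to improve the character recognition stage using deep learning models. In this paper, we generated Korean sentence image data for training deep learning models directly from fonts and dictionary data, and used it to run experiments on various model combinations for raising OCR performance on Korean sentences. The deep learning models follow the STR (Scene Text Recognition) structure, and 24 model combinations were built from the transformation, extraction, sequence, and prediction modules. In the OCR experiments, the combination best suited to Korean sentences used the transformation module, with BiLSTM for the sequence module and attention for the prediction module, and it outperformed the other combinations. Compared with previous Korean OCR research, the significance of this paper lies in extending the scope from the character level to the sentence level and in using the kinds of data frequently found in real document images to make practical application more feasible.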
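How four modules can yield 24 combinations may be easier to see as an enumeration. The abstract names only the four modules, BiLSTM, and attention; the option lists below are an assumption (they mirror choices commonly used in four-stage STR benchmarks and were picked so that the product is 24), not the paper's confirmed configuration.

```python
from itertools import product

# Assumed options per module; only "BiLSTM" and "Attention" appear in the abstract.
MODULE_OPTIONS = {
    "transformation": ["None", "TPS"],
    "extraction": ["VGG", "RCNN", "ResNet"],
    "sequence": ["None", "BiLSTM"],
    "prediction": ["CTC", "Attention"],
}

# One dict per combination, e.g. {"transformation": "TPS", "extraction": "VGG", ...}.
combinations = [
    dict(zip(MODULE_OPTIONS, choice))
    for choice in product(*MODULE_OPTIONS.values())
]

# The family the abstract reports as strongest: a transformation module is used,
# with BiLSTM for the sequence module and attention for the prediction module.
reported_best = [
    combo for combo in combinations
    if combo["transformation"] != "None"
    and combo["sequence"] == "BiLSTM"
    and combo["prediction"] == "Attention"
]
```

Under these assumed options, `reported_best` still contains one entry per extraction backbone, since the abstract does not say which extractor was used.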

Development of an emotional subtitle editor for the deaf and hearing impaired people (청각장애인을 위한 감성자막 편집기 개발)

Kim, Hyunsoon;Oh, Juhyun
- Proceedings of the Korean Society of Broadcast Engineers Conference
- 2020.07a
- pp.469-471
- 2020
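As broadcasting goes digital, the lack of information accessibility for marginalized groups relative to non-disabled people can deepen the information gap they face. Accordingly, research is under way on quantitative and qualitative improvements to broadcasting services for people with disabilities, such as character sign language broadcasting and caption broadcasting. Caption broadcasting services are provided in accordance with the relevant laws, and terrestrial UHD broadcasting has likewise built and operated a closed caption service system since regular broadcasting began. Because these existing caption services convey content in a monotonous text form, there is demand for conveying content more richly in varied styles. This paper therefore introduces the emotional subtitle service, an improved form of caption service for terrestrial UHD broadcasting, and describes the development of emotional subtitle editor technology for it. The emotional subtitle service additionally provides the speaker's emotion information in the caption metadata, so that captions can be presented with various emoticons or different font styles according to emotion. The emotional subtitle editor is a system for adding and editing this emotional caption metadata and generating emotional subtitle files, and it was developed with the terrestrial UHD transmission system and closed caption standards in mind.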

Proposal for Deep Learning based Character Recognition System by Virtual Data Generation (가상 데이터 생성을 통한 딥러닝 기반 문자인식 시스템 제안)

Lee, Seungju;Park, Gooman
- Journal of Broadcast Engineering
- v.25 no.2
- pp.275-278
- 2020
In this paper, we propose a deep learning based character recognition system that relies on virtual data generation. Virtual data was created to secure training data, which carries the greatest weight in supervised learning. After the virtual data was created, data generalization was performed using augmentation parameters so that the system can cope with varied data. The final training set was composed by assigning various values to the augmentation and font parameters. Test data for measuring character recognition performance was constructed by cropping text areas from actual image data, and it was augmented to account for the image distortion that may occur in real environments. The deep learning algorithm is YOLO v3, which performs detection in real time. The inference output passes through post-processing to yield the final detection result.
https://doi.org/10.5909/JBE.2020.25.2.275

Printed Numeric Character Recognition on Giro Form (지로 서식 문서의 인쇄체 숫자 인식)

김진숙;변영철;김경환;최영우;이일병
- Proceedings of the Korean Information Science Society Conference
- 1999.10b
- pp.446-448
- 1999
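This paper describes a template matching method for recognizing printed numeral strings on Giro forms, which are commonly encountered in everyday life. Printed numerals on Giro forms may differ in stroke thickness, height, and width because of printing errors, but they share two characteristics: there is basically only one font type, and the kinds of errors that can occur are limited to a few. We therefore describe a template matching method that defines templates able to accommodate these data characteristics efficiently and then recognizes numerals through matching. The experiments showed that printed numeral strings could be recognized efficiently even with this relatively simple method.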
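As a rough illustration of template matching on a single-font numeral set, the sketch below scores a glyph against each template by counting agreeing pixels and picks the best. Everything in it is hypothetical: the 3x5 bitmaps, the three digits, and the scoring rule are illustrative assumptions, and the paper's own templates, which are designed to absorb thickness, height, and width variation, are not reproduced here.

```python
# Hypothetical 3x5 binary templates ("#" = ink, "." = background).
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": ["..#", "..#", "..#", "..#", "..#"],
    "7": ["###", "..#", "..#", "..#", "..#"],
}


def match_score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the digit whose template agrees with the glyph on the most pixels."""
    return max(TEMPLATES, key=lambda digit: match_score(glyph, TEMPLATES[digit]))


# A "7" with one stray ink pixel, imitating a small printing error.
noisy_seven = ["###", "..#", ".##", "..#", "..#"]
recognized = recognize(noisy_seven)
```

A pixel-agreement count tolerates a few flipped pixels but not shifts or scaling, which is why a practical system would normalize glyph size and position first.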

Improved Tag Selection for Tag-cloud using the Dynamic Characteristics of Tag Co-occurrence (태그 동시 출현의 동적인 특징을 이용한 개선된 태그 클라우드의 태그 선택 방법)

Kim, Du-Nam;Lee, Kang-Pyo;Kim, Hyoung-Joo
- Journal of KIISE:Computing Practices and Letters
- v.15 no.6
- pp.405-413
- 2009
A tagging system allows internet users to assign new metadata, called tags, to articles, photos, videos, and so on, to facilitate searching and browsing of web content. The tag cloud, a visual interface, is widely used for browsing the tag space. A tag cloud selects the tags with the highest frequency and presents them alphabetically, with font size reflecting their popularity. However, the conventional tag selection method has known weaknesses. We therefore propose a novel tag selection method, Freshness, which helps users find fresh web content. Freshness is the mean of the Kullback-Leibler divergences between consecutive changes in the tag co-occurrence probability distribution. We collected tag data from three websites, Allblog, Eolin, and Technorati, and built a system, 'Fresh Tag Cloud', which collects tag data and creates our tag cloud. In an experimental comparison between Fresh Tag Cloud and the conventional tag cloud on data from Allblog, ours shows an 87.5% lower average overlap, which means Fresh Tag Cloud outperforms the conventional tag cloud.
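The Freshness definition above can be sketched as follows. This is one reading of the abstract, under stated assumptions: each snapshot is a dict mapping co-occurring tags to probabilities, the divergence is taken from the newer snapshot to the older one, and the `eps` floor for unseen tags is a smoothing choice introduced here. None of these details are confirmed by the source.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions given as {tag: probability}.

    Tags missing from q are floored at eps (an assumed smoothing choice)
    so the logarithm stays finite.
    """
    total = 0.0
    for tag, p_tag in p.items():
        if p_tag == 0.0:
            continue  # zero-probability terms contribute nothing
        q_tag = max(q.get(tag, 0.0), eps)
        total += p_tag * math.log(p_tag / q_tag)
    return total


def freshness(snapshots):
    """Mean KL divergence between each pair of consecutive snapshots."""
    if len(snapshots) < 2:
        return 0.0
    divergences = [
        kl_divergence(current, previous)
        for previous, current in zip(snapshots, snapshots[1:])
    ]
    return sum(divergences) / len(divergences)


# Hypothetical co-occurrence distributions for one tag over three time steps.
stable_tag = [{"news": 0.5, "blog": 0.5}] * 3
shifting_tag = [
    {"news": 0.5, "blog": 0.5},
    {"news": 0.9, "blog": 0.1},
    {"news": 0.1, "blog": 0.9},
]
stable_score = freshness(stable_tag)
shifting_score = freshness(shifting_tag)
```

A tag whose co-occurring neighbors keep changing scores higher than one whose neighbors are static, which is the property the selection method relies on.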

Streamlined GoogLeNet Algorithm Based on CNN for Korean Character Recognition (한글 인식을 위한 CNN 기반의 간소화된 GoogLeNet 알고리즘 연구)

Kim, Yeon-gyu;Cha, Eui-young
- Journal of the Korea Institute of Information and Communication Engineering
- v.20 no.9
- pp.1657-1665
- 2016
Deep learning with CNNs (Convolutional Neural Networks) is being researched in various fields and shows excellent performance in image recognition. In this paper, we present a streamlined GoogLeNet, a CNN architecture capable of learning a large-scale Korean character database. The experimental data used in this paper is PHD08, a large-scale Korean character database. PHD08 has 2,187 samples for each of 2,350 Korean characters, for a total of 5,139,450 samples. After training, the streamlined GoogLeNet showed over 99% test accuracy on PHD08. To ensure objectivity, we also created additional Korean character data in fonts not included in PHD08 and compared the classification performance of the streamlined GoogLeNet with that of other OCR programs. While the other OCR programs showed classification success rates of 66.95% to 83.16%, the streamlined GoogLeNet achieved a higher rate of 89.14%.
https://doi.org/10.6109/jkiice.2016.20.9.1657
