• Title/Abstract/Keywords: Text Security

Profane or Not: Improving Korean Profane Detection using Deep Learning

  • Woo, Jiyoung;Park, Sung Hee;Kim, Huy Kang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 1
    • /
    • pp.305-318
    • /
    • 2022
  • Abusive behavior has become a common problem on many online social media platforms, and profanity is one of its most common forms. Social media platforms operate filtering systems based on lists of popular profane words, but this approach has two drawbacks: it can be bypassed with altered word forms, and it can flag normal sentences as profanity. In Korean in particular, a syllable is composed of graphemes and a word is composed of multiple syllables, so a word can be decomposed into graphemes without losing its meaning, and the surface form of a profane word can carry a different meaning within a sentence. This work focuses on the problem of filtering systems mistaking normal phrases for profane ones. To address it, we propose a deep learning-based framework that combines word embedding based on grapheme and syllable separation with an appropriate CNN structure. The proposed model was evaluated on chat content from a well-known online game in South Korea and achieved 90.4% accuracy.
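The grapheme separation mentioned in this abstract can be illustrated with plain Unicode arithmetic. The sketch below is not the authors' preprocessing code; it only shows, under the standard Unicode layout of precomposed Hangul syllables (U+AC00 to U+D7A3), how a syllable splits into its lead consonant, vowel, and optional tail consonant. The function name `to_graphemes` is a hypothetical choice for this example.

```python
# Minimal sketch (not the paper's code): split precomposed Hangul syllables
# into graphemes. Syllables are laid out as (lead * 21 + vowel) * 28 + tail
# starting at U+AC00, so integer division recovers each part.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            # 19 lead consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
TAILS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
         "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
         "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]    # 27 tails + "none"

HANGUL_BASE = 0xAC00
HANGUL_COUNT = 19 * 21 * 28  # 11172 precomposed syllables


def to_graphemes(text):
    """Return a list of graphemes; non-Hangul characters pass through unchanged."""
    out = []
    for ch in text:
        offset = ord(ch) - HANGUL_BASE
        if 0 <= offset < HANGUL_COUNT:
            lead, rest = divmod(offset, 21 * 28)
            vowel, tail = divmod(rest, 28)
            out.append(LEADS[lead])
            out.append(VOWELS[vowel])
            if TAILS[tail]:          # skip the empty "no tail" slot
                out.append(TAILS[tail])
        else:
            out.append(ch)
    return out
```

A grapheme sequence like this could feed a character-level embedding, so altered spellings of the same word share most of their input symbols; how the paper actually builds its embedding is not specified in the abstract.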

Analysis of AI Content Detector Tools

  • Yo-Seob Lee;Phil-Joo Moon
    • International journal of advanced smart convergence
    • /
    • Vol. 12, No. 4
    • /
    • pp.154-163
    • /
    • 2023
  • With the rapid development of AI technology, ChatGPT and other AI content creation tools are becoming common, and curious users are adopting them. Unlike search engines, these tools generate results from user prompts, which puts them at risk of inaccuracy or plagiarism. This allows unethical users to create inappropriate content and raises greater educational and corporate data security concerns. AI content detection is needed to identify AI-generated text and address misinformation and trust issues. Along with the positive use of AI tools, monitoring and regulation of their ethical use is essential. AI-generated content can be detected efficiently by choosing the detection tool appropriate to the usage environment and purpose. In this paper, we collect data on AI content detection tools and compare and analyze their functions and characteristics to help meet these needs.

Exploring trends in blockchain publications with topic modeling: Implications for forecasting the emergence of industry applications

  • Jeongho Lee;Hangjung Zo;Tom Steinberger
    • ETRI Journal
    • /
    • Vol. 45, No. 6
    • /
    • pp.982-995
    • /
    • 2023
  • Technological innovation generates products, services, and processes that can disrupt existing industries and lead to the emergence of new fields. Distributed ledger technology, or blockchain, offers novel transparency, security, and anonymity characteristics in transaction data that may disrupt existing industries. However, research attention has largely examined its application to finance. Less is known of any broader applications, particularly in Industry 4.0. This study investigates academic research publications on blockchain and predicts emerging industries using academia-industry dynamics. This study adopts latent Dirichlet allocation and dynamic topic models to analyze large text data with a high capacity for dimensionality reduction. Prior studies confirm that research contributes to technological innovation through spillover, including products, processes, and services. This study predicts emerging industries that will likely incorporate blockchain technology using insights from the knowledge structure of publications.

A Design of Information Security Education Training Databank System for Preventing Computer Security Incident

  • 모은수;이재필;이재광;이준현;이재광
    • 한국정보통신학회:학술대회논문집
    • /
    • 한국정보통신학회 2015 Spring Conference
    • /
    • pp.277-280
    • /
    • 2015
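  • Personal information security has become a pressing issue because of privacy breaches such as smishing and phishing. Such incidents occur because users lack awareness of how to manage personal information. Unlike existing question bank systems based on an XML tag structure, the system in this paper uses JSON, a key-value, text-based data interchange format that has the advantage of being language-independent. The proposed system classifies questions by information security field into high, medium, and low difficulty, and serves users through smart devices and PCs without constraints of place or time. For stable operation of the education and training server (training server), it uses open-source Node.js and Apache load balancing. In addition, whether an answer is correct or incorrect is determined in the web page without a request to the training server, and the result is sent to the training server using jQuery Ajax. Results are stored in the database keyed by user ID and are used as education and training statistics. In this paper, we designed a level-based education and training system to strengthen users' information security awareness.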
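The flow described in this abstract (a question stored as key-value JSON, graded on the client, with only the result sent to the server) can be sketched as follows. This is an illustration only: the paper's client runs in a web page with jQuery Ajax, whereas this sketch uses Python, and every field name (`id`, `field`, `difficulty`, `answer`, `user_id`) and function name is a hypothetical choice, not taken from the paper.

```python
import json

# Hypothetical question record; the schema is an assumption for illustration.
QUESTION_JSON = """
{
  "id": "q-001",
  "field": "phishing",
  "difficulty": "low",
  "text": "Which message is most likely a smishing attempt?",
  "choices": ["A", "B", "C", "D"],
  "answer": 2
}
"""


def grade(question, selected):
    """Decide correct/incorrect locally, without asking the training server."""
    return selected == question["answer"]


def build_result(user_id, question, selected):
    """Build the JSON payload that would be sent to the server afterwards."""
    return json.dumps(
        {
            "user_id": user_id,              # results are keyed by user ID
            "question_id": question["id"],
            "difficulty": question["difficulty"],
            "correct": grade(question, selected),
        },
        sort_keys=True,
    )


question = json.loads(QUESTION_JSON)
payload = build_result("user-42", question, selected=2)
```

Grading on the client keeps the per-answer load off the training server; the server only receives one small result record per answer for its statistics.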

Improved Side Channel Analysis Using Power Consumption Table

  • 고가영;진성현;김한빛;김희석;홍석희
    • 정보보호학회논문지
    • /
    • Vol. 27, No. 4
    • /
    • pp.961-970
    • /
    • 2017
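  • Differential power analysis computes an intermediate value for each guess of the secret, feeds it into a power consumption model to obtain a predicted power consumption, and analyzes that prediction against the actually measured power consumption to recover the secret used in encryption. Commonly used power consumption models are the Hamming weight model and the Hamming distance model, and power modeling techniques are used to obtain a more accurate model. However, when the target device does not match the assumed power consumption model, the model fails to reflect the power consumption of the intermediate value correctly. This paper proposes storing the power consumption measured on the actual target device in table form and using it as the power consumption model. The proposed method uses the power consumption at the point in time when information available during encryption (plaintext, ciphertext, etc.) is used. The method requires no template construction in advance and, because it uses power consumption measured on the actual target device, accurately reflects that device's power consumption. To verify performance, we conducted simulations and experiments and confirmed that the proposed method improves side-channel attack performance over existing power modeling techniques.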
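The baseline attack that this abstract starts from (guess a key, compute an intermediate value, map it through a Hamming weight model, compare against measured power) can be sketched on a toy target. The sketch below is not the paper's table-based method. It assumes a correlation-based comparison (often called correlation power analysis), a 4-bit key, the 4-bit S-box of the PRESENT cipher chosen only to keep the example small, and simulated noiseless traces; `recover_key` and `SECRET` are hypothetical names.

```python
from math import sqrt

# 4-bit S-box (PRESENT), used here only as a small nonlinear intermediate step.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(x):
    """Hamming weight model: predicted power is the number of set bits."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def recover_key(plaintexts, traces):
    """Return the key guess whose predicted power best matches the traces."""
    def score(guess):
        predicted = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        return pearson(predicted, traces)
    return max(range(16), key=score)


# Simulated measurement: the device leaks exactly the Hamming weight (assumption).
SECRET = 0xB
plaintexts = list(range(16))
traces = [float(hamming_weight(SBOX[p ^ SECRET])) for p in plaintexts]
recovered = recover_key(plaintexts, traces)
```

Here the simulated device matches the assumed model exactly, so the correct guess correlates perfectly. The abstract's point is that real devices may not match the assumed model, which is the motivation for replacing `hamming_weight` with a table of power values measured on the target.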

Research Suggestion for Disaster Prediction using Safety Report of Korea Government

  • 이준;신진동;조상명;이상화
    • 한국방재안전학회논문집
    • /
    • Vol. 12, No. 4
    • /
    • pp.15-26
    • /
    • 2019
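  • 안전신문고 (Safety Report) has been in operation since 2014, and about 1 million reports had accumulated by July 2019. In this information age, this study analyzes the content of more than about 1.16 million Safety Report submissions to determine how much weight and meaning the public's voice and concerns actually carry. We focus in particular on predictive power, asking whether the content of the reports is related to disasters that may occur later. To this end, we obtained the reported data as text and analyzed it using natural language processing (NLP) methods. On that basis, we analyzed newspaper articles from the same period and examined the correlation between the Safety Report content and the articles. The results show that accidents occurred within a few months after the number of reports related to responses and confirmation increased, which suggests that analyzing Safety Report content that records social concerns in advance can be used to predict future disasters.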

A Classification Model for Illegal Debt Collection Using Rule and Machine Learning Based Methods

  • Kim, Tae-Ho;Lim, Jong-In
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 26, No. 4
    • /
    • pp.93-103
    • /
    • 2021
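  • Despite efforts by the financial authorities, such as debt collection guidelines and direct supervision of collection agencies, illegal and unfair debt collection against debtors persists. Preventing it effectively requires a way to strengthen monitoring of illegal collection practices with limited staff, using technologies such as machine learning on unstructured data. In this study, we obtained recorded collection calls from lending companies, converted them to text, and propose an illegal debt collection classification model that combines rule-based detection of illegal and non-compliant acts with machine learning such as SVM (Support Vector Machine); we then compared how accurately each machine learning algorithm identified such acts. The study shows that combining rule-based detection with machine learning yields higher accuracy than previously studied classification models that apply machine learning alone. This is the first attempt to classify illegality by combining rule-based detection with machine learning, and with follow-up research to refine the model it could contribute greatly to protecting consumers from illegal debt collection.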

Hate Speech Detection Using Modified Principal Component Analysis and Enhanced Convolution Neural Network on Twitter Dataset

  • Majed, Alowaidi
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 23, No. 1
    • /
    • pp.112-119
    • /
    • 2023
  • The Internet, traditionally used for networking computers and communications, has been evolving since its beginning. It is the backbone of much of the web, including social media. Social networking, a concept that started in the early 1990s, has grown along with the Internet. Social Networking Sites (SNSs) sprang up and have remained an important part of Internet usage, mainly because of the services they offer on the web. Twitter and Facebook have become the primary means by which most individuals keep in touch with others and carry on substantive conversations. These sites allow users to post photos and videos and support audio and video storage that can be shared among users. Although attractive, these features have also created problems for the sites, such as the posting of offensive material. Some SNS users promote hate through their words, which is difficult to curtail once it has been uploaded. Hence, this article outlines a process for extracting user reviews from the Twitter corpus in order to identify instances of hate speech. We identify hate speech in text using MPCA (Modified Principal Component Analysis) and ECNN (Enhanced Convolutional Neural Network). With NLP, a fully autonomous system for assessing syntax and meaning can be established. There is a strong emphasis on pre-processing, feature extraction, and classification. Normalization cleanses the text by removing extra spaces, punctuation, and stop words. The processed text is then used for feature extraction, which applies the MPCA algorithm: it takes a set of related features and pulls out the ones that tell us the most about the given dataset. The proposed classification method is then applied to detect instances of hate speech or abusive language. It is argued that ECNN is superior to other methods for identifying hateful content online: it can take in massive amounts of data and quickly return accurate results, especially for larger datasets. As a result, the proposed MPCA+ECNN algorithm improves not only the F-measure but also accuracy, precision, and recall.
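The normalization step named in this abstract (removing extra spaces, punctuation, and stop words) can be sketched with the Python standard library. This is a minimal illustration, not the author's pipeline: the stop-word list is a small hypothetical sample, and lowercasing is an added assumption that the abstract does not mention.

```python
import string

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and"}

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace, remove stop words."""
    text = text.lower()                  # assumption: case is not informative
    text = text.translate(_STRIP_PUNCT)  # remove punctuation
    tokens = text.split()                # split() also collapses extra spaces
    return [t for t in tokens if t not in STOP_WORDS]
```

For example, `normalize("The  movie, is GREAT!!")` yields the tokens `movie` and `great`. Note that stripping all punctuation also removes the `#` and `@` of hashtags and mentions, which a Twitter-specific pipeline might prefer to handle separately.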

Copyright Protection for Digital Image by Watermarking Technique

  • Ali, Suhad A.;Jawad, Majid Jabbar;Naser, Mohammed Abdullah
    • Journal of Information Processing Systems
    • /
    • Vol. 13, No. 3
    • /
    • pp.599-617
    • /
    • 2017
  • Due to the rapid growth and expansion of the Internet, digital multimedia such as images, audio, and video is available to everyone. Anyone can make unauthorized copies of any digital product, so the owners of these products cannot protect their ownership. This situation will restrict future improvements in digital media production. Procedures such as cryptography and watermarking have been proposed to protect these products. Watermarking means embedding a message, such as text or an image, called the watermark, in a host, such as text, an image, audio, or video, called the cover. Watermarking can provide security, data authentication, and copyright protection for digital media. In this paper, a new watermarking method for still images is proposed for the purpose of copyright protection. The watermark is embedded in a transform domain: the method uses the discrete cosine transform (DCT) and embeds the watermark in coefficients selected according to several criteria. This procedure minimizes the deterioration of the image to achieve high invisibility. Unlike traditional techniques, this paper suggests a new method for selecting the best blocks of DCT coefficients. After the best blocks are selected, the best coefficients within them are chosen as the host in which each watermark bit is embedded. Coefficient selection is based on a weighting function that uses the values and locations of the candidate coefficients. The experimental results show that the proposed method achieves good imperceptibility and robustness against different types of attacks.

A Study on the Smart Tourism Awareness through Bigdata Analysis

  • LEE, Song-Yi;LEE, Hwan-Soo
    • 산경연구논집
    • /
    • Vol. 11, No. 5
    • /
    • pp.45-52
    • /
    • 2020
  • Purpose: In the 4th industrial revolution, services that incorporate various smart technologies in the tourism sector have begun to gain popularity. Accordingly, academic discussions on smart tourism have also started to become active in various fields. Despite recent research, the definition of smart tourism is still ambiguous, and it is not easy to differentiate its scope or characteristics from traditional tourism concepts. Thus, this study aims to analyze the perception of smart tourism exposed online to identify the current point of smart tourism in Korea and present the research direction for conceptualizing smart tourism suitable for the domestic situation. Research design, data, and methodology: This study analyzes the perception of smart tourism exposed online based on 20,198 news data from portal sites over the past six years. Data on words used with smart tourism were collected from the leading portal sites Naver, Daum, and Google. Text mining techniques were applied to identify the social awareness status of smart tourism. Network analysis was used to visualize the results between words related to smart tourism, and CONCOR analysis was conducted to derive clusters formed by words having similarity. Results: As a result of keyword analysis, the frequency of words related to the development and construction of smart tourism areas was high. The analysis of the centrality of the connection between words showed that the frequency of keywords was similar, and that the words "smartphones" and "China" had relatively high connection centrality. The results of network analysis and CONCOR indicated that words were formed into eight groups including related technologies, promotion, globalization, service introduction, innovation, regional society, activation, and utilization guide. The overall results of data analysis showed that the development of smart tourism cities was a noticeable issue. 
Conclusions: This study is meaningful in that it clearly reflects the differences in the perception of smart tourism between online and research trends despite various efforts to develop smart tourism in Korea. In addition, this study highlights the need to understand smart tourism concepts and enhance academic discussions. It is expected that such academic discussions will contribute to improving the competitiveness of smart tourism research in Korea.