• Title/Summary/Keyword: Image to Speech

Design of detection method for malicious URL based on Deep Neural Network (뉴럴네트워크 기반에 악성 URL 탐지방법 설계)

  • Kwon, Hyun;Park, Sangjun;Kim, Yongchul
    • Journal of Convergence for Information Technology / v.11 no.5 / pp.30-37 / 2021
  • Various devices are connected to the Internet, and attacks that exploit the Internet are occurring. Some of these attacks use malicious URLs to lead users to phishing sites or to distribute malicious viruses. Detecting such malicious URL attacks is therefore an important security issue. Among recent deep learning technologies, neural networks show good performance in image recognition, speech recognition, and pattern recognition, and they can be applied to research that analyzes and detects patterns in the characteristics of malicious URLs. In this paper, the performance of a neural-network-based method for detecting malicious URLs was analyzed while changing the activation function, learning rate, and neural network structure. The experimental data were built by crawling Alexa top 1 million and Whois, and TensorFlow was used as the machine learning library. In the experiment, when the number of layers was 4, the learning rate was 0.005, and the number of nodes in each layer was 100, an accuracy of 97.8% and an F1 score of 92.94% were obtained.
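
The abstract reports accuracy and F1 as separate figures. The two can diverge when malicious URLs are a minority of the data, because F1 looks only at the malicious class. The following is a minimal sketch of how both metrics come out of a confusion matrix; the function name and the counts are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) where the positive class is 'malicious'."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts for 1,000 URLs, 150 of them malicious (not the paper's data).
acc, f1 = accuracy_and_f1(tp=130, fp=5, fn=20, tn=845)
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

With an imbalanced set like this one, the many correctly classified benign URLs raise accuracy, while F1 stays lower because it is driven by the missed malicious URLs.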

Artificial Intelligence for Assistance of Facial Expression Practice Using Emotion Classification (감정 분류를 이용한 표정 연습 보조 인공지능)

  • Dong-Kyu, Kim;So Hwa, Lee;Jae Hwan, Bong
    • The Journal of the Korea institute of electronic communication sciences / v.17 no.6 / pp.1137-1144 / 2022
  • In this study, an artificial intelligence (AI) was developed to help users practice facial expressions for expressing emotions. The developed AI used multimodal inputs consisting of sentences and facial images for deep neural networks (DNNs). The DNNs calculated the similarity between the emotion predicted from a sentence and the emotion predicted from a facial image. The user practiced facial expressions based on the situation given by the sentence, and the AI provided numerical feedback based on that similarity. A ResNet34 structure was trained on the public FER2013 data to predict emotions from facial images. To predict emotions from sentences, a KoBERT model was trained by transfer learning on the conversational speech dataset for emotion classification released to the public by AIHub. The DNN that predicts emotions from facial images demonstrated 65% accuracy, which is comparable to human emotion classification ability. The DNN that predicts emotions from sentences achieved 90% accuracy. The performance of the developed AI was evaluated through experiments with changing facial expressions in which an ordinary person participated.
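
The abstract does not say which similarity measure produces the numerical feedback. As an illustration only, the sketch below assumes cosine similarity between the two predicted emotion distributions, scaled to 0-100; the function names, the three-class label order, and the probability values are all hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def feedback_score(text_probs, face_probs):
    """Map the similarity to a 0-100 score (assumed feedback scale)."""
    return round(100 * cosine_similarity(text_probs, face_probs), 1)


# Hypothetical class order: [happy, sad, angry]
text_probs = [0.7, 0.2, 0.1]   # emotion predicted from the sentence
face_probs = [0.6, 0.3, 0.1]   # emotion predicted from the facial image
print(feedback_score(text_probs, face_probs))
```

Any measure that compares two distributions (for example, a divergence) could fill the same role; cosine similarity is used here only because it is bounded and easy to read as a score.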

A Study on the Semiotics and Poetic Meaning of Literature Content - at the Center of Moon Sam-seok's Children's Poetry - (문학콘텐츠의 기호학적 시적의미 연구 -문삼석의 동시(童詩)를 중심으로-)

  • Sung, Hyun-Ju
    • The Journal of the Korea Contents Association / v.19 no.6 / pp.72-79 / 2019
  • This study examines the poetic beauty of the space deconstructed by the medium that appears in Moon Sam-seok's children's poetry, in order to support children's poetry education and teaching methodology. The research method is based on reading semiotic spatial images. In other words, we intend to derive the poetic beauty of the space in which the great pole space built by is deconstructed by the intervention of by the medium term . The research texts, drawn from Moon Sam-seok's series of works, are "The Wind and the Fire," "The Wind and the Empty Bottle," "The Wind and Salt," and "The Wind and the Rock." According to the study, first, the wind deconstructed a space that was differentiated by the presence or absence of matter into a "coexistence space." These poetic spaces symbolize poetic beauty as ideal places of life that coexist with distinction but without discrimination. Second, the wind eliminated the gap between alienation, suffering, and solitude. In other words, the poetic space deconstructed by the wind produced poetic beauty as a 'space of communication' based on the homogeneity of the nature of existence. In conclusion, Moon's poetic speech can be seen as intending to express the discreteness of the poetic space as 'communication' and 'common life' by introducing a medium that deconstructs it through deviation and convergence.

Design and Implementation of a news Archive System using Shot Types (샷의 타입을 이용한 뉴스 아카이브 시스템의 설계 및 구현)

  • Han, Keun-Ju;Nang, Jong-Ho;Ha, Myung-Hwan;Jung, Byung-Hee;Kim, Kyeong-Soo
    • Journal of KIISE:Computing Practices and Letters / v.7 no.5 / pp.416-428 / 2001
  • In order to build a news archive system, the news video stream should first be segmented into several articles, and their contents should be abstracted effectively. This abstraction helps users understand the contents of an article without playing the whole video stream. This paper proposes a new article boundary detection scheme for news video streams, together with a new news article abstraction scheme that uses the shot types of the news video data. The shots in the news video are classified into anchor person shots, interview shots, speech shots, reporting shots, graphic shots, and others. Since a news article starts with an anchor shot that is relatively longer than other shots and has a special screen structure, the proposed scheme detects the article boundary by computing the length of the shot and checking the screen structure. For effective abstraction of the article video, the proposed scheme primarily uses the graphic image located in the top right of the anchor shot frames, since it is an abstraction of the article made by the news producer according to its contents and therefore contains a lot of meaningful information. The key frames of the other shots, except interview and report shots, are also used to abstract the contents of the articles. In experiments, the precision and recall of the proposed article boundary detection scheme were 92% and 96%, respectively. This paper also presents the design and implementation of a prototype news archive system on the WWW that consists of an indexing tool, an authoring tool, a database for news meta-data, and a browsing tool.
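
The boundary rule described above (a long anchor shot opens a new article) and the reported precision and recall can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes shots are already classified by type, it omits the screen-structure check, and the `Shot` class, the 8-second threshold, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    kind: str        # "anchor", "interview", "speech", "report", "graphic", "other"
    duration: float  # seconds


def article_boundaries(shots, min_anchor_seconds=8.0):
    """Indices of shots assumed to open a new article (long anchor shots)."""
    return [i for i, s in enumerate(shots)
            if s.kind == "anchor" and s.duration >= min_anchor_seconds]


def precision_recall(predicted, actual):
    """Precision and recall of predicted boundary indices against ground truth."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


shots = [Shot("anchor", 12.0), Shot("report", 5.0), Shot("interview", 7.0),
         Shot("anchor", 10.5), Shot("graphic", 3.0), Shot("anchor", 4.0)]
found = article_boundaries(shots)
print(found, precision_recall(found, actual=[0, 3, 5]))
```

In this toy data the short anchor shot at index 5 is a true boundary that the length threshold misses, which lowers recall while precision stays at 1.0.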

Analysis of Generative AI Technology Trends Based on Patent Data (특허 데이터 기반 생성형 AI 기술 동향 분석)

  • Seongmu Ryu;Taewon Song;Minjeong Lee;Yoonju Choi;Soonuk Seol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.17 no.1 / pp.1-9 / 2024
  • This paper analyzes the trends in generative AI technology based on patent application documents. To achieve this, we selected 5,433 generative AI-related patents filed in South Korea, the United States, and Europe from 2003 to 2023, and analyzed the data by country, technology category, year, and applicant, presenting it visually to find insights and understand the flow of technology. The analysis shows that patents in the image category account for 36.9%, the largest share, with a continuous increase in filings, while filings in the text/document and music/speech categories have either decreased or remained stable since 2019. Although the company with the highest number of filings is a South Korean company, four out of the top five filers are U.S. companies, and all companies have filed the majority of their patents in the U.S., indicating that generative AI is growing and competing centered around the U.S. market. The findings of this paper are expected to be useful for future research and development in generative AI, as well as for formulating strategies for acquiring intellectual property.

EFFECT OF PULSED ELECTROMAGNETIC FIELD STIMULATION ON THE EARLY BONE CONSOLIDATION AFTER DISTRACTION OSTEOGENESIS IN RABBIT MANDIBLE MODEL (가토 하악골 골신장 후 맥동전자기장이 조기 골경화에 미치는 효과에 대한 연구)

  • Hwang, Kyung-Kyun;Cho, Tae-Hyung;Song, Yun-Mi;Kim, Do-Kyun;Han, Sung-Hee;Kim, In-Sook;Hwang, Soon-Jung
    • Maxillofacial Plastic and Reconstructive Surgery / v.29 no.2 / pp.123-131 / 2007
  • Introduction: Distraction osteogenesis is widely used for bone lengthening in patients with maxillofacial deformity and alveolar bone atrophy. One of the major problems in distraction osteogenesis is the long consolidation period of 2-3 months, during which the devices have to remain fixed on the bone to prevent relapse. This results in scar formation on the face and disturbance of mastication and speech. This study was performed to evaluate the stimulating effect of a pulsed electromagnetic field on early bone consolidation in distraction osteogenesis. Materials and methods: A total of 10 rabbits were used (5 in the control group, 5 in the experimental group). A vertical osteotomy was performed in the mandibular body and the distraction device was fixed. After 5 days, distraction was performed at 1 mm per day for 7 days. In the experimental group, a pulsed electromagnetic field (38 Gauss, 60 Hz) was applied for 8 hours per day for 5 days immediately after distraction. Both groups were sacrificed after 2 weeks. Histological specimens with H&E and Masson Trichrome staining were made and histomorphometrically analysed with an image analyser. Results: The distraction device was displaced in one animal in each group; therefore, only four animals in each group were evaluated. In both groups, new bone formation was observed in the distracted area after 2 weeks. Bone formation was enhanced in the experimental group ($31.76 \pm 8.68\%$) compared with the control group ($9.94 \pm 3.23\%$), and the difference was statistically significant (p<0.001). Conclusion: This study suggests that electrical stimulation with an electromagnetic field may be effective for early bone formation after distraction osteogenesis. Further studies with a larger number of animals are needed before clinical application.

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields such as computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among those architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because they have shown prominent applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. Using this architecture makes convolutional networks fast to train. This, in turn, helps us train deep, multi-layer networks, which are very good at classifying images. These days, most neural networks for image recognition are deep convolutional networks. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling.
By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each local receptive field. This means that all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers. Pooling layers are usually used immediately after convolutional layers. They simplify the information in the output of the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Several years ago, training deep learning networks took weeks, but thanks to progress in GPUs and algorithmic enhancements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network that allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, which includes vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through the layers, which makes learning in early layers extremely slow. The problem gets worse in RNNs, since gradients are propagated backward not just through layers but also through time. If the network runs for a long time, the gradient can become extremely unstable and hard to learn from.
It has been possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
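
The three convolutional ideas named in the abstract (local receptive fields, shared weights, and pooling) can be illustrated with a small pure-Python sketch. The function names, the 2x2 kernel, and the toy image are assumptions made for this example; real networks learn the kernel values and use optimized library routines.

```python
def conv2d(image, kernel, bias=0.0):
    """Slide one shared kernel over the image (cross-correlation, as is usual
    in deep learning). Each output value sees only a small local region."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[bias + sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# Toy 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright step

features = conv2d(image, edge_kernel)  # 4x4 feature map
pooled = max_pool(features)            # 2x2 summary
print(features)
print(pooled)
```

Because the same kernel is reused at every position, the feature map responds to the edge wherever it appears, and pooling keeps only whether the feature was found in each block.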

Detecting Adversarial Example Using Ensemble Method on Deep Neural Network (딥뉴럴네트워크에서의 적대적 샘플에 관한 앙상블 방어 연구)

  • Kwon, Hyun;Yoon, Joonhyeok;Kim, Junseob;Park, Sangjun;Kim, Yongchul
    • Convergence Security Journal / v.21 no.2 / pp.57-66 / 2021
  • Deep neural networks (DNNs) provide excellent performance for image, speech, and pattern recognition. However, DNNs sometimes misrecognize certain adversarial examples. An adversarial example is a sample created by adding optimized noise to the original data so that the DNN misclassifies it, although nothing looks wrong to the human eye. Studies on defenses against adversarial example attacks are therefore required. In this paper, we experimentally analyzed the detection success rate for adversarial examples while adjusting various parameters. The performance of the ensemble defense method was analyzed against three adversarial example attack methods: the fast gradient sign method, the DeepFool method, and the Carlini & Wagner method. We used MNIST as the experimental data and TensorFlow as the machine learning library. The performance analysis covered the three adversarial example attack methods, the threshold, the number of models, and random noise. As a result, with 7 models and a threshold of 1, the detection rate for adversarial examples was 98.3%, and an accuracy of 99.2% on the original samples was maintained.
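
Of the three attacks named above, the fast gradient sign method is the simplest to show: it perturbs each input feature by a fixed step in the direction that increases the loss. The sketch below applies it to a tiny logistic-regression classifier instead of a DNN on MNIST, so the weights, input, and epsilon are hypothetical, and the clipping of pixel values that an image attack would normally need is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fgsm(weights, bias, x, label, epsilon):
    """x_adv = x + epsilon * sign(dLoss/dx) for binary cross-entropy loss."""
    p = predict(weights, bias, x)
    # For this model, dLoss/dx_i = (p - label) * w_i.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


weights, bias = [2.0, -1.0], 0.0     # hypothetical trained parameters
x, label = [0.5, 0.2], 1             # sample correctly classified as class 1
x_adv = fgsm(weights, bias, x, label, epsilon=0.5)

print(predict(weights, bias, x) > 0.5)      # original prediction is class 1
print(predict(weights, bias, x_adv) > 0.5)  # adversarial prediction flips
```

An ensemble detector of the kind the abstract evaluates would compare the outputs of several models on the same input; how that comparison and its threshold are defined is not given in the abstract, so it is not sketched here.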