• Title/Summary/Keyword: Research Information Systems

Search Results: 12,210

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun; Hyun, Yoonjin; Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.21-44 / 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have produced massive amounts of text data, distributed through media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has drawn the interest of many researchers and created demand for professionals capable of classifying relevant information; hence, text classification was introduced. Text classification is a challenging task in modern data analysis that assigns a text document to one or more predefined categories or classes. Various techniques are available in this field, such as K-Nearest Neighbor, the Naïve Bayes algorithm, Support Vector Machines, Decision Trees, and Artificial Neural Networks. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge: performance varies with the vocabulary of the corpus and the features created for classification. Most previous attempts have proposed a new algorithm or modified an existing one, and such research can be said to have reached its limits for further improvement. In this study, rather than proposing or modifying an algorithm, we focus on finding a better way to use the data. It is widely known that classifier performance is influenced by the quality of the training data upon which the classifier is built. Real-world datasets usually contain noise, and this noisy data can affect the decisions made by classifiers built from it.
In this study, we consider that data from different domains, that is, heterogeneous data, may carry noise-like characteristics that can be exploited in the classification process. Machine learning classifiers are normally built on the assumption that the characteristics of the training data and the target data are the same or very similar. However, for unstructured data such as text, the features are determined by the vocabulary of the documents, and if the learning data and the target data are written from different viewpoints, their features may differ. We therefore attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into its construction. Data coming from various sources are likely to be formatted differently, which causes difficulties for traditional machine learning algorithms: they were not designed to recognize different data representations at once and fold them into the same generalization. To utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning. Because unlabeled data can degrade classifier performance, we further propose the Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA), which selects only the documents that contribute to improving the classifier's accuracy. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data, and the most confident classification rules are selected and applied for the final decision.
In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.
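
The selection idea behind the ensemble semi-supervised step can be sketched as follows. This is a hypothetical minimal reading, not the paper's exact algorithm: an unlabeled document is pseudo-labeled only when every view (classifier) agrees on its label with high confidence. The view functions, threshold, and toy documents below are all illustrative.

```python
# Hedged sketch: confidence-based selection of unlabeled documents in a
# multi-view semi-supervised ensemble. Names and threshold are assumptions.

def select_confident_documents(unlabeled_docs, views, threshold=0.9):
    """Return (doc, label) pairs on which all views agree confidently.

    views: list of callables doc -> (label, confidence in [0, 1]).
    """
    selected = []
    for doc in unlabeled_docs:
        predictions = [view(doc) for view in views]
        labels = {label for label, _ in predictions}
        if len(labels) == 1 and all(conf >= threshold for _, conf in predictions):
            selected.append((doc, labels.pop()))
    return selected

# Toy views: keyword heuristics standing in for classifiers trained on
# heterogeneous sources (e.g. news vocabulary vs. blog vocabulary).
view_a = lambda d: ("sports", 0.95) if "goal" in d else ("politics", 0.95)
view_b = lambda d: ("sports", 0.92) if "match" in d else ("politics", 0.60)

docs = ["late goal decides the match", "parliament passes the budget"]
print(select_confident_documents(docs, [view_a, view_b]))
# Only the first document qualifies: both views agree with confidence >= 0.9.
```

Documents that pass this filter would then be added to the labeled training set, which is one common way such ensembles avoid letting noisy unlabeled data degrade the classifier.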

Video Analysis System for Action and Emotion Detection by Object with Hierarchical Clustering based Re-ID (계층적 군집화 기반 Re-ID를 활용한 객체별 행동 및 표정 검출용 영상 분석 시스템)

  • Lee, Sang-Hyun; Yang, Seong-Hun; Oh, Seung-Jin; Kang, Jinbeom
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.89-106 / 2022
  • Recently, the amount of video data collected from smartphones, CCTVs, black boxes, and high-definition cameras has increased rapidly, and with it the demand for analysis and utilization. Because many industries lack the skilled manpower to analyze video, machine learning and artificial intelligence are actively used to assist. In this context, demand for computer vision technologies such as object detection and tracking, action detection, emotion detection, and re-identification (Re-ID) has also grown rapidly. However, object detection and tracking face conditions that degrade performance, such as an object re-appearing after leaving the recording location, and occlusion. Consequently, action and emotion detection models built on top of object detection and tracking also have difficulty extracting data for each object. In addition, deep learning architectures composed of various models suffer performance degradation due to bottlenecks and lack of optimization. In this study, we propose a video analysis system consisting of a YOLOv5-based DeepSORT object tracking model, a SlowFast-based action recognition model, a Torchreid-based Re-ID model, and AWS Rekognition, an emotion recognition service. The proposed system uses single-linkage hierarchical clustering for Re-ID together with processing methods that maximize hardware throughput. It achieves higher accuracy than a re-identification model using simple metrics, near real-time processing performance, and prevents tracking failures caused by object departure and re-emergence, occlusion, and similar conditions. By continuously linking the action and facial emotion detection results of each object to the same identity, videos can be analyzed efficiently.
The re-identification model extracts a feature vector from the bounding box of each object image detected by the tracking model in every frame, and applies single-linkage hierarchical clustering over the feature vectors from past frames to identify objects whose tracks were lost. Through this process, an object that left the frame or was occluded can be re-tracked when it re-appears, so the action and facial emotion detection results of a newly recognized object can be linked to those of the object that appeared in the past. As a way to improve processing performance, we introduce a per-object Bounding Box Queue and a Feature Queue, which reduce RAM requirements while maximizing GPU memory throughput. We also introduce the IoF (Intersection over Face) algorithm, which links facial emotions recognized through AWS Rekognition with object tracking information. The academic significance of this study is that the two-stage re-identification model achieves real-time performance, even in the costly setting of simultaneous action and facial emotion detection, through these processing techniques rather than by sacrificing accuracy with simple metrics. The practical implication is that industries that need action and facial emotion detection but struggle with object tracking failures can analyze videos effectively with the proposed model. With its high re-identification accuracy and processing performance, the model can be used in fields such as intelligent monitoring, observation services, and behavioral or psychological analysis services, where integrating tracking information with extracted metadata creates great industrial and business value.
In the future, to measure object tracking performance more precisely, an experiment using the MOT Challenge dataset, which is widely used at international conferences, is needed. We will also investigate the cases the IoF algorithm cannot handle in order to develop a complementary algorithm, and we plan to apply the model to datasets from various fields related to intelligent video analysis.
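
The abstract names an IoF (Intersection over Face) rule for attaching a recognized face to a tracked body box. A plausible minimal reading, with boxes as (x1, y1, x2, y2): IoF is the overlap area divided by the face area (rather than the union, as in IoU), so a face fully inside a larger body box scores 1.0. This is a sketch of that reading, not the paper's exact formulation.

```python
# Hedged sketch of an Intersection-over-Face score; box layout is assumed.

def iof(face_box, body_box):
    """Intersection area / face area; returns 0.0 for degenerate faces."""
    fx1, fy1, fx2, fy2 = face_box
    bx1, by1, bx2, by2 = body_box
    ix1, iy1 = max(fx1, bx1), max(fy1, by1)
    ix2, iy2 = min(fx2, bx2), min(fy2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    face_area = (fx2 - fx1) * (fy2 - fy1)
    return inter / face_area if face_area > 0 else 0.0

face = (10, 10, 20, 20)             # 10x10 face box
body = (0, 0, 50, 100)              # tracked person box containing the face
print(iof(face, body))              # 1.0: face lies fully inside the body box
print(iof((45, 10, 55, 20), body))  # 0.5: half the face overlaps the body box
```

Normalizing by the face area rather than the union makes the score insensitive to how much larger the body box is, which is the property needed when matching small face detections to full-body tracks.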

A Study on Business Diversification and Business Performance of Korean Mass Media Enterprises (국내 매스미디어 기업의 사업다각화와 경영성과에 관한 연구)

  • Chang, Yun-Hi
    • Korean journal of communication and information / v.43 / pp.173-208 / 2008
  • This study analyzes the business performance associated with the business diversification of Korean mass media enterprises from 2003 to 2006. The conclusions fall into five main parts. First, newspaper companies pursue unrelated diversification across various industrial areas to maximize profit, while broadcasting companies concentrate on providing better service by diversifying their major content. Second, the surveyed companies overall show a steady decline in profit from their major business areas and are therefore establishing strategies to broaden their diversification of any sort. Third, a cluster analysis on the diversification measure yielded three groups. The group with the widest diversification range earned the highest ROE, the lowest ROE volatility, and a lower probability of risk, suggesting that companies broadening their business areas through focused diversification are relatively stable in the profit they earn. Fourth, the middle-level group in terms of sales scale, debt, enterprise history, and major shareholder rate, together with the high-ROE group, diversifies progressively; sales scale affects diversification positively, while the major shareholder rate affects it negatively. Fifth, diversification overall contributes to successful outcomes. Since few prior studies in the media area were available for reference, the researcher interviewed and held panel discussions with numerous strategists and managers in charge of diversification at media companies. However, with only four years of data, the research cannot be considered a generalized study, and it does not reflect the time gap between business diversification and business performance.
Future studies should account for media companies' specificity relative to other industries, classify media companies by media type, and consider the time gap between diversification activities and business performance.

Performance of Uncompressed Audio Distribution System over Ethernet with a L1/L2 Hybrid Switching Scheme (L1/L2 혼합형 중계 방법을 적용한 이더넷 기반 비압축 오디오 분배 시스템의 성능 분석)

  • Nam, Wie-Jung; Yoon, Chong-Ho; Park, Pu-Sik; Jo, Nam-Hong
    • Journal of the Institute of Electronics Engineers of Korea TC / v.46 no.12 / pp.108-116 / 2009
  • In this paper, we propose an Ethernet-based audio distribution system with a new L1/L2 hybrid switching scheme and evaluate its performance. The proposed scheme not only offers the guaranteed low latency and jitter essential for distributing high-quality uncompressed audio traffic, but also provides efficient transmission of data traffic in the Ethernet environment. The system consists of a master node and a number of relay nodes, all mutually connected in a daisy-chain topology through uplinks and downlinks. The master node generates an audio frame every 125 µs cycle; the frame has 24 time slots carrying 24 stereo channels of 16-bit PCM-sampled audio. On receiving the audio frame from its upstream node via the downlink, each intermediate node inserts its audio traffic into the time slot reserved for it, then relays the frame to the next node through physical-layer (L1) repeating. After reaching the end node, the audio frame is looped back through the uplink; as the frame passes, each node copies the audio slot it is meant to receive and plays the audio. When audio transmission is complete, each node works as a normal L2 switch, so data frames are switched during the remaining period. To support this L1/L2 hybrid switching capability, we insert glue logic for parsing and multiplexing audio and data frames at the MII (Media Independent Interface) between the physical and data link layers. The proposed scheme provides better delay performance and transmission efficiency than legacy Ethernet-based audio distribution systems. To verify its feasibility, we simulated the L1/L2 hybrid switching scheme in OMNeT++ under various parameters.
The simulation results show that the proposed scheme provides outstanding jitter characteristics for audio traffic together with high transmission efficiency for data traffic.
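
Some back-of-envelope arithmetic for the 125 µs audio cycle described above, under our reading (not stated explicitly in the abstract) that each of the 24 slots carries one stereo pair of 16-bit PCM samples per cycle:

```python
# Payload arithmetic for the audio frame cycle; the per-slot payload layout
# is an assumption for illustration.

CYCLE_S = 125e-6                 # audio frame cycle: 125 microseconds
CYCLES_PER_SECOND = 1 / CYCLE_S  # 8000 frames per second
CHANNELS = 24                    # time-slotted audio channels per frame
BITS_PER_SLOT = 16 * 2           # 16-bit PCM, stereo pair per slot

payload_bits_per_frame = CHANNELS * BITS_PER_SLOT
payload_mbps = payload_bits_per_frame * CYCLES_PER_SECOND / 1e6

print(payload_bits_per_frame)  # 768 bits of audio payload per frame
print(payload_mbps)            # 6.144 Mbit/s of raw audio payload
```

Under this assumption the periodic audio payload consumes only a few percent of a 100 Mbit/s link, which is consistent with the design goal of leaving the remainder of each cycle for normal L2 data switching.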

The effect of Big-data investment on the Market value of Firm (기업의 빅데이터 투자가 기업가치에 미치는 영향 연구)

  • Kwon, Young jin; Jung, Woo-Jin
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.99-122 / 2019
  • According to a recent IDC (International Data Corporation) report, by 2025 the total volume of data is estimated to reach 163 zettabytes, ten times that of 2016, and the main generators of information are shifting from consumers to corporations. The so-called "wave of big data" is arriving, and its aftermath affects entire industries and firms, respectively and collectively, making effective management of vast amounts of data more important than ever for the firm. However, while many previous studies quantitatively measure the effects of IT investment, no previous study has measured the effects of big data investment. We therefore quantitatively analyze the effects of big data investment in a way that can assist firms' investment decisions. This study applies the event study methodology, which rests on the efficient market hypothesis, to measure the effect of firms' big data investments on the response of market investors. In addition, five sub-variables were set to analyze this effect in more depth: firm size, industry (finance and ICT), investment completion status, and vendor involvement. To measure the impact of big data investment announcements, 91 announcements from 2010 to 2017 were used, and the investment effect was observed empirically through changes in corporate value immediately after disclosure. Data on big data investment were collected from the 'News' category of Naver, the largest portal site in Korea, and the target companies were drawn from disclosures of listed companies in the KOSPI and KOSDAQ markets.
During collection, the keywords 'Big data construction', 'Big data introduction', 'Big data investment', 'Big data order', and 'Big data development' were searched. The results of the empirical analysis are as follows. First, we found that the market value of the 91 publicly listed firms that announced big data investments increased by 0.92%. In particular, the market value of finance firms, non-ICT firms, and small-cap firms increased significantly. This can be interpreted as market investors perceiving the enterprises' big data investments positively. Second, the increase in the market value of financial and non-ICT firms after a big data investment announcement was statistically demonstrated. Third, to measure the effect of firm size, companies were divided into the top 30% and the bottom 30% by market capitalization, excluding the middle range to maximize the difference between groups. The analysis showed that the investment effect was greater for the smaller companies, and the difference between the two groups was clear. Fourth, one of the most significant features of this study is that the announcements were classified according to vendor status. We show that the investment effect for the group with vendor involvement is very large, indicating that market investors view the participation of specialist big data vendors very positively. Last but not least, it is also interesting that market investors evaluate an investment more positively when, at the time of the announcement, the system is scheduled to be built rather than already completed.
Applied to industry, this suggests that disclosing a big data investment at the time the decision is made would be effective for increasing market value. Our study has academic implications, as prior research on the impact of big data investment has been nonexistent, and practical implications as a reference for business decision makers considering big data investment.
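
The event-study methodology used above works, in miniature, like this: fit a market model R_i = alpha + beta * R_m over an estimation window, measure abnormal returns AR_t = R_i,t − (alpha + beta * R_m,t) around the announcement, and sum them into a cumulative abnormal return (CAR). The sketch below uses invented numbers; the study's actual windows and sample are described in the text.

```python
# Hedged sketch of a market-model event study; all return series are toy data.

def ols(x, y):
    """Simple least-squares fit y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    beta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
    return my - beta * mx, beta

# Estimation window: daily returns of the firm and the market index.
market_est = [0.010, -0.020, 0.015, 0.000, 0.005]
firm_est   = [0.012, -0.018, 0.020, 0.001, 0.007]
alpha, beta = ols(market_est, firm_est)

# Event window: the announcement day and the day after.
market_evt = [0.004, -0.001]
firm_evt   = [0.015, 0.002]
abnormal = [r - (alpha + beta * m) for r, m in zip(firm_evt, market_evt)]
car = sum(abnormal)
print(round(car, 4))  # a positive CAR indicates a favorable market reaction
```

Averaging CARs across all 91 announcements, and testing whether the mean differs from zero, is the standard way such a study turns per-firm reactions into the reported aggregate effect.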

Analyzing the discriminative characteristic of cover letters using text mining focused on Air Force applicants (텍스트 마이닝을 이용한 공군 부사관 지원자 자기소개서의 차별적 특성 분석)

  • Kwon, Hyeok; Kim, Wooju
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.75-94 / 2021
  • The low birth rate and shortened military service period are raising concerns about selecting excellent military officers. The Republic of Korea entered a low-birth-rate society in 1984 and an aged society in 2018, and is expected to become a super-aged society in 2025. In addition, the troop-oriented military is changing into a state-of-the-art weapons-oriented military, and the military service period was reduced in 2018 to ease the burden of service on young people and let them enter society earlier. Some observe that the application rate for military officers is falling due to shrinking manpower resources and a preference for the shortened mandatory service over an officer career, which calls for further consideration of policies for securing excellent military officers. Most related studies have used social-science methodologies, but this study applies text mining, which is suited to analyzing large document collections. It extracts words with discriminative characteristics from the cover letters of Republic of Korea Air Force non-commissioned officer applicants and analyzes their polarity with respect to pass and fail. The method consists of three steps. First, applications are divided into general and technical fields, and the characteristic words in the cover letters are ranked by the difference in their frequency ratios between the fields: the greater the difference, the more discriminative the word is considered. On this basis, we extract the top 50 discriminative words for the general field and the top 50 for the technical field. Second, the appropriate number of topics for the full set of cover letters is determined through LDA, using the perplexity score and the coherence score.
Given the appropriate number of topics, we then use LDA to generate topics and their word probabilities and estimate which topic each discriminative word belongs to. Subsequently, keyword indicators from the application questions are used as labeling candidates, and the most appropriate indicator, considering each topic's word distribution, is set as the topic's label. Third, using L-LDA with the cover letters labeled pass or fail, we generate topics and probabilities for each field under both labels, extract only the discriminative words belonging to labeled topics, and compute, for each such word, the difference between its probability under the pass label and its probability under the fail label. A positive figure indicates pass polarity and a negative figure fail polarity. This is the first research to examine the characteristics of cover letters of Republic of Korea Air Force non-commissioned officer applicants rather than applicants in the private sector. Moreover, by applying text mining to large document collections rather than survey or interview methods, the approach reduces analysis time and increases reliability over the entire population, so the proposed methodology is also applicable to other document collections in the field of military personnel. This study shows that L-LDA is more suitable than LDA for extracting the discriminative characteristics of these cover letters, and it proposes a methodology that combines LDA and L-LDA.
Therefore, through this analysis of Republic of Korea Air Force non-commissioned officer selection results, we aim to provide information for recruitment and promotional policies and to propose a methodology for research in the field of military manpower acquisition.
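
The final polarity step described above can be sketched directly: for each discriminative word, take its probability under the pass-labeled topics minus its probability under the fail-labeled topics; positive values lean pass, negative lean fail. The word distributions below are invented stand-ins for L-LDA output, not results from the paper.

```python
# Hedged sketch of the pass/fail polarity computation; probabilities are toys.

def polarity(words, p_word_given_pass, p_word_given_fail):
    """Map each word to P(word | pass topics) - P(word | fail topics)."""
    return {w: p_word_given_pass.get(w, 0.0) - p_word_given_fail.get(w, 0.0)
            for w in words}

p_pass = {"teamwork": 0.08, "responsibility": 0.06, "hobby": 0.01}
p_fail = {"teamwork": 0.02, "responsibility": 0.07, "hobby": 0.05}

scores = polarity(["teamwork", "responsibility", "hobby"], p_pass, p_fail)
print(scores)
# "teamwork" leans pass (about +0.06); "hobby" leans fail (about -0.04).
```

Sorting these differences then gives exactly the kind of ranked pass-polarity and fail-polarity word lists the study reports.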

A Study of the Application of 'Digital Heritage ODA' - Focusing on the Myanmar cultural heritage management system - (디지털 문화유산 ODA 적용에 관한 시론적 연구 -미얀마 문화유산 관리시스템을 중심으로-)

  • Jeong, Seongmi
    • Korean Journal of Heritage: History & Science / v.53 no.4 / pp.198-215 / 2020
  • Official development assistance (ODA) refers to assistance provided by governments and other public institutions in donor countries to promote economic development and social welfare in developing countries. The purpose of this research is to examine the construction of the "Myanmar Cultural Heritage Management System," underway as part of an ODA project to strengthen cultural and artistic capabilities, and to analyze the achievements and challenges of digital cultural heritage ODA. The digital cultural heritage management system is intended to achieve the permanent preservation and sustainable utilization of tangible and intangible cultural heritage materials: cultural heritage can be stored in digital archives, approached anew using computer analysis technology, and its information used in multiple dimensions. First, the digital cultural heritage ODA was able to permanently preserve cultural heritage content that urgently needed digitization by documenting heritage in Myanmar under threat of being lost, damaged, degraded, or distorted. Second, information on Myanmar's cultural heritage can be systematically managed and used in many ways through linkages between materials. Third, cultural maps based on accurate geographical information about where heritage is located or inherited can be implemented; various items of cultural heritage were collectively and intensively visualized to maximize utility and convenience for academic, policy, and practical purposes. Fourth, the one-sided limitations of cultural ODA in donor-recipient relations could be overcome. Fifth, a capacity-building program for officials of the beneficiary country, perhaps the most important form of sustainable development in cultural ODA, was operated alongside the project.
Sixth, it is a form of ODA that can proceed relatively smoothly and contact-free, without requiring personnel to travel between countries during the current global pandemic. However, the following tasks remain to be addressed through active discussion and deliberation. First, the content of the data uploaded to the system should be verified. Second, the digital cultural heritage must be protected from various threats; for example, local experts should be trained to handle errors caused by computer viruses, stored data, or operating systems. Third, given the rapidly changing computer technology environment, measures should be discussed for the problems that tend to arise when new versions and programs are released after the ODA project ends, or when developers stop maintaining their programs. Fourth, since the classification criteria and decisions on whether data will be disclosed follow Myanmar's political judgment, the beneficiary country needs to understand the ultimate purpose of the cultural ODA project.

UX Methodology Study by Data Analysis Focusing on deriving persona through customer segment classification (데이터 분석을 통한 UX 방법론 연구 고객 세그먼트 분류를 통한 페르소나 도출을 중심으로)

  • Lee, Seul-Yi; Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.151-176 / 2021
  • As the information technology industry develops, various kinds of data are being created, and processing them for use in industry is now essential. Analyzing and utilizing the digital data collected online and offline is necessary to provide customers with an appropriate experience. To create new businesses, products, and services, customer data collected in various ways must be used to deeply understand potential customers' needs and to analyze behavior patterns for hidden signals of desire. However, data analysis and UX methodology, which should be conducted in parallel for effective service development, are in practice conducted separately, and examples of their combined use in industry are scarce. In this work, we construct a single process that applies data analysis methods together with UX methodologies. The study is significant in that it applies methodologies actively used in practice and is therefore highly likely to be adopted. We conducted a survey to identify and cluster the associations between factors in order to establish customer classifications and target customers. The research method is as follows. First, we conduct factor and regression analyses to determine the associations between factors in a survey on happiness, grouping respondents by the results and identifying the relationships among 34 questions covering psychological stability, family life, relational satisfaction, health, economic satisfaction, work satisfaction, daily life satisfaction, and residential environment satisfaction. Second, we classify clusters based on the factors affecting happiness, extract the optimal number of clusters, and cross-analyze the characteristics of each cluster. Third, for service definition, we analyze correlations with keywords related to happiness.
We leverage keyword analysis from Thumb Trend to derive ideas based on the interest in and associations of each keyword. We also collected approximately 11,000 news articles based on the top three keywords most highly related to happiness, derived issues between keywords through text mining analysis in SAS, and used them in defining services after ideation. Fourth, based on the characteristics identified through the data analysis, we selected segmentation and targeting appropriate for service discovery: the factor characteristics were grouped into four segments, profiles were drawn up, and the main target customers were selected. Fifth, based on the characteristics of the main target customers, interviewees were selected and in-depth interviews were conducted to discover the causes of happiness and unhappiness and the needs for services. Sixth, we derive customer behavior patterns from the segment results and the interviews, and specify objectives associated with each characteristic. Seventh, a typical persona from the qualitative survey and a persona from the data were produced, and the two were compared to analyze the characteristics, pros, and cons of each. Existing market segmentation classifies customers by purchasing factors, while UX methodology measures users' behavioral variables to establish criteria and redefine user classifications. Applying these segment classification methods to the process of producing user classifications and personas in UX methodology can yield more accurate customer classification schemes. The significance of this study is twofold: first, the idea of using data to create a variety of services is linked to the UX methodology used to plan IT services; second, user classification is further enhanced by applying segment analysis methods not currently common in UX methodologies.
To provide a consistent experience when creating a single service, from the large scale to the small, customers must be defined around common goals; to this end, personas must be derived and various stakeholders persuaded. Under these circumstances, designing a consistent experience from beginning to end through fast, concrete user descriptions is a very effective way to produce a successful service.
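
The segmentation step above can be sketched as clustering respondents by their factor scores and profiling each cluster as a candidate segment. Below is plain k-means on two invented dimensions (say, economic and relational satisfaction) with fixed starting centers for reproducibility; the study additionally searched for the optimal number of clusters, which this sketch does not.

```python
# Hedged sketch: k-means segmentation of survey respondents; data are toys.

def kmeans(points, centers, iters=20):
    """Lloyd's algorithm; returns final centers and the point groups."""
    for _ in range(iters):
        # Assign each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            dists = [sum((a - c) ** 2 for a, c in zip(p, ctr)) for ctr in centers]
            groups[dists.index(min(dists))].append(p)
        # Move each center to the mean of its assigned points.
        centers = [
            tuple(sum(coords) / len(g) for coords in zip(*g)) if g else ctr
            for g, ctr in zip(groups, centers)
        ]
    return centers, groups

# Toy factor scores: two low-satisfaction and three high-satisfaction respondents.
respondents = [(1.0, 1.2), (1.1, 0.9), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
centers, segments = kmeans(respondents, centers=[(0.0, 0.0), (5.0, 5.0)])
print([len(s) for s in segments])  # segment sizes: [2, 3]
```

Each resulting segment would then be profiled (means per factor, demographics) to decide which one becomes the main target for persona construction.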

Analysis of Munitions Contract Work Using Process Mining (프로세스 마이닝을 이용한 군수품 계약업무 분석 : 공군 군수사 계약업무를 중심으로)

  • Joo, Yong Seon; Kim, Su Hwan
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.41-59 / 2022
  • The timely procurement of military supplies is essential for maintaining the military's operational capabilities, and contract work is the first step toward timely procurement. Rapid signing of a contract also lets the requesting units set comfortable delivery dates and increases the likelihood of budget execution, so improving the contract process is essential to prevent rushed early execution of the budget and its transfer or disuse. Recently, research using big data has been active in various fields, and process analysis with big data, together with process mining as an improvement technique, is widely used in the private sector. In the military, however, analysis of contract work has been limited to case-by-case inquiry, such as identifying the cause of individual budget transfer and disuse cases from the experience and fragmentary information of the person in charge. To improve the contract process, this study applied process mining to data on 560 contract tasks contracted directly by the Department of Finance of the Air Force Logistics Command over about one year from November 2019. Process maps were derived by synthesizing the distributed data, and process flow analysis, execution time analysis, bottleneck analysis, and further detailed analyses were conducted. The analysis found that review/modification occurred repeatedly after the request stage in many contracts. Repeated reviews/modifications significantly delay the completion of cost calculation, which the bottleneck visualization also clearly revealed. Review/modification occurs in more than 60% of contracts from the top five requesting departments and is concentrated in the first half of the year when requests peak, which means a thorough review is required before departments submit contract requests.
In addition, while the Department of Finance carried out its contract work according to the procedures prescribed by laws and regulations, the order of some tasks was found to need adjustment. This study is the first case of applying process mining to the analysis of contract work in the military; if further research applies process mining to other military tasks, similar efficiency gains can be expected.
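
The bottleneck analysis described above can be shown in miniature: from an event log of (case id, activity, timestamp) rows, build the directly-follows relation and the mean time spent on each activity-to-activity transition; the slowest transitions are bottleneck candidates. The log below is invented, with activity names loosely mirroring the contract flow in the text.

```python
# Hedged sketch of directly-follows bottleneck analysis; the log is toy data
# with times given in days from each case's start.

from collections import defaultdict

def directly_follows(log):
    """log: list of (case, activity, time) rows -> mean duration per transition."""
    cases = defaultdict(list)
    for case, activity, t in sorted(log, key=lambda r: (r[0], r[2])):
        cases[case].append((activity, t))
    durations = defaultdict(list)
    for events in cases.values():
        for (a, t1), (b, t2) in zip(events, events[1:]):
            durations[(a, b)].append(t2 - t1)
    return {pair: sum(ds) / len(ds) for pair, ds in durations.items()}

log = [
    ("c1", "request", 0), ("c1", "review", 2), ("c1", "cost-calc", 12), ("c1", "award", 14),
    ("c2", "request", 0), ("c2", "review", 1), ("c2", "review", 6),
    ("c2", "cost-calc", 18), ("c2", "award", 20),
]
mean_days = directly_follows(log)
bottleneck = max(mean_days, key=mean_days.get)
print(bottleneck, mean_days[bottleneck])
# The review -> cost-calc transition dominates, echoing the finding that
# repeated reviews delay cost calculation.
```

Note case "c2" contains a review -> review self-loop, which is how the repeated review/modification pattern reported in the study would appear in a directly-follows graph.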

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun; Cha, Kyung-Ae; Koh, Jaepil; Hong, Won-Kee
    • Journal of Service Research and Studies / v.14 no.1 / pp.13-26 / 2024
  • Multi-modal generation is the process of generating results from several kinds of information, such as text, images, and audio. With the rapid development of AI technology, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition of a person's description to generate a montage image. While existing montage generation technology is based on Western appearances, the system developed in this paper learns a model based on Korean facial features, so it can create more accurate and effective Korean montage images from Korean-specific multi-modal voice and text input. Since the app's output can serve as a draft montage, it can dramatically reduce the manual labor of montage production staff. For this purpose, we used the persona-based virtual person montage data provided by AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform that aims to provide a one-stop service by building the training data needed for developing AI technologies and services. The image generation system was implemented using VQGAN, a deep learning model for generating high-resolution images, and KoDALLE, a Korean-language image generation model. The trained model creates a montage of a face very similar to the one described by voice and text. To verify the practicality of the app, 10 testers used it, and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal investigation, to describe and visualize facial features.