• Title/Abstract/Keywords: Text data

Search results: 2,953 items (processing time: 0.032 seconds)

A Deep Learning-based Regression Model for Predicting Government Officer Education Satisfaction (공무원 직무 전문교육 만족도 예측을 위한 딥러닝 기반 회귀 모델 설계)

  • Sumin Oh;Sungyeon Yoon;Minseo Park
    • The Journal of the Convergence on Culture Technology / Vol. 10, No. 5 / pp. 667-671 / 2024
  • Professional job training for government officers emphasizes establishing desirable values as public officials and improving professionalism in public service. To provide customized education, some studies have analyzed factors affecting education satisfaction. However, there is a lack of research predicting education satisfaction from educational contents. Therefore, we propose a deep learning-based regression model that predicts government officer education satisfaction from educational contents. We use education information data for government officers. We use one-hot encoding to encode categorical variables collected in text format, such as education targets, education classifications, and education types. We quantify the education contents stored in text format using TF-IDF. We train our deep learning-based regression model and validate model performance with 10-fold cross-validation. Our proposed model showed 99.87% accuracy on test sets. We expect that customized education recommendations based on our model will help provide and improve optimized education content.
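
The two feature-encoding steps named in the abstract above (one-hot encoding for categorical fields, TF-IDF for free text) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the abstract does not state which TF-IDF variant or tokenizer was used, so the unsmoothed textbook formula and whitespace tokenization below are assumptions, and the sample values are hypothetical.

```python
import math
from collections import Counter

def one_hot(values):
    """Map each categorical value to a 0/1 vector over the sorted set of categories."""
    categories = sorted(set(values))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, vectors

def tf_idf(docs):
    """TF-IDF for non-empty, whitespace-tokenized documents (assumed variant).

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(df)
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([(counts[t] / len(tokens)) * math.log(n_docs / df[t])
                     for t in vocab])
    return vocab, rows

# Hypothetical education types and course descriptions.
categories, type_vectors = one_hot(["online", "offline", "online"])
vocab, content_vectors = tf_idf(["ethics law", "law budget"])
```

A term that occurs in every document (here "law") receives weight zero under this variant, which is why TF-IDF emphasizes words that distinguish one course description from another.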

Primer on Generative Artificial Intelligence and Large Language Models in Medical Imaging (의료영상에서 생성형 인공지능과 대형 언어 모델 입문)

  • Kiduk Kim;Gil-Sun Hong;Namkug Kim
    • Journal of the Korean Society of Radiology / Vol. 85, No. 5 / pp. 848-860 / 2024
  • The recent advent of large language models (LLMs), such as ChatGPT, has drawn attention to generative artificial intelligence (AI) in a number of fields. Generative AI can produce different types of data including text, images, and voice, depending on the training methods and datasets used. Additionally, recent advancements in multimodal techniques, which can simultaneously process multiple data types like text and images, have expanded the potential of using multimodal generative AI in the medical environment where various types of clinical and imaging information are used together. This review summarizes the concepts and types of LLMs, image generative AI, and multimodal AI, and it examines the status and future possibilities of generative AI in the field of radiology.

Cyber-Lecture Management System based on XML (XML 기반의 사이버 강좌관리 시스템)

  • Kim, Hye-Young;Kim, Hwa-Sun;Kim, Heung-Sik;Kim, Sang-Gyun;Choi, Heung-Kook
    • The KIPS Transactions:PartA / Vol. 10A, No. 5 / pp. 529-538 / 2003
  • The rapid development of the World Wide Web is driving the growth of internet-based tools for remote instruction. Information is exchanged and presented on the web in the form of HTML (HyperText Markup Language). However, the structural limitations of HTML create a need for the more powerful XML (eXtensible Markup Language), which can store all kinds of data and transform them into other forms. It is nevertheless necessary that the XML standard be applied appropriately. Because the existing lecture data of cyber education sites cannot be shared, users can only passively use the functions offered by each cyber school. To overcome this limitation, this study defines a standardized XML data structure and provides a system model for processing between the server and the client. By storing the lecture data of cyber education sites as XML on the web, the stored data can be reused on any site without modification. From the users' point of view, they can use the internet service with the equipment they want, at any place and any time. To support control of the CLMS (Cyber Lecture Management System) by administrators and users, we offered a variety of multimedia applications and an easy interface, and built a new style of CLMS. Thus, by storing and extracting data related to the virtual education of secondary schools in XML form, information can be interchanged and shared effectively and its utilization maximized.

Korean High School Students' Understanding of the Concept of Correlation (우리나라 고등학생들의 상관관계 이해도 조사)

  • No, A Ra;Yoo, Yun Joo
    • Journal of Educational Research in Mathematics / Vol. 23, No. 4 / pp. 467-490 / 2013
  • Correlation is a basic statistical concept necessary for understanding the relationship between two variables as their values change. In the middle school curriculum of Korea, only an informal definition of correlation is taught, with two-way data representations such as scatter plots and contingency tables. In this study, we investigated Korean high school students' understanding of correlation using a test consisting of 35 items on the interpretation of scatter plots, contingency tables, and text in realistic situations. A total of 216 students from a high school in Seoul took the test for 20 minutes. From the results, we observed the following. First, students did not have the right criteria for determining the strength of correlation presented in scatter plots. Most students could determine whether there is a correlation and whether it is positive or negative from the data presented in scatter plots. However, they judged strength not by closeness to the regression line but by the closeness between data points. Second, when statements comparing the strength of correlation in real-life contexts were given in text, the students had difficulty understanding the distribution-related characteristics of the bivariate data. Students had difficulty figuring out the local distribution characteristics of the data, which cannot be guessed merely from the expression 'The correlation is strong' without statistical knowledge of correlation. Third, a large number of students could not judge the association between two variables using conditional proportions when qualitative data were given in 2-by-2 tables. They made judgments based on absolute cell counts, and when the marginal sums of the two categories of the explanatory variable differed, they thought the association could not be determined. 
From these results, we concluded that educational measures are required in order to remove such misconceptions and to improve understanding of correlation. Considering that the current mathematics curriculum does not cover the concept of correlation, we need to improve the curriculum as well.
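
The third finding above (judging association in a 2-by-2 table by conditional proportions rather than absolute cell counts) can be made concrete with a small worked sketch. The table values are hypothetical and are not taken from the study's test items; rows are assumed to be the categories of the explanatory variable.

```python
def conditional_proportions(table):
    """Divide each cell by its row total, giving the response distribution
    conditional on each category of the explanatory (row) variable."""
    return [[cell / sum(row) for cell in row] for row in table]

# Hypothetical 2-by-2 table with unequal marginal sums (100 vs. 50).
counts = [
    [30, 70],  # explanatory category A
    [15, 35],  # explanatory category B
]
proportions = conditional_proportions(counts)
# Both rows give the same conditional split, so this table shows no association
# even though every absolute count in row A is larger than in row B.
```

Comparing rows of `proportions` rather than rows of `counts` is what makes tables with different marginal sums comparable.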

Sensitivity Identification Method for New Words of Social Media based on Naive Bayes Classification (나이브 베이즈 기반 소셜 미디어 상의 신조어 감성 판별 기법)

  • Kim, Jeong In;Park, Sang Jin;Kim, Hyoung Ju;Choi, Jun Ho;Kim, Han Il;Kim, Pan Koo
    • Smart Media Journal / Vol. 9, No. 1 / pp. 51-59 / 2020
  • From PC communication to the development of the internet, new terms have been coined on social media; with the spread of smartphones a social media culture has formed, and newly coined words are becoming part of that culture. With the advent of social networking sites and smartphones serving as a bridge, the amount of data has increased in real time. The use of new words can have many advantages, including allowing short sentences that work around the character limits of various messengers and reduce data. However, new words do not have dictionary meanings, which limits and degrades the performance of algorithms such as data mining. Therefore, in this paper, the opinion of a document is determined by collecting data through web crawling, extracting the new words contained in the text data, and establishing an emotional classification. The experiment proceeds in three stages. First, new words collected from social media are learned as positive or negative. Next, to derive and verify emotional values using standard-language documents, TF-IDF is used to score the sentiment of nouns and assign emotional values to the data. As with the new words, the classified emotional values are applied to verify that emotions are classified in standard-language documents. Finally, a combination of the newly coined words and the standard emotional values is used to perform a comparative analysis of the technique.

A Development Plan for Co-creation-based Smart City through the Trend Analysis of Internet of Things (사물인터넷 동향분석을 통한 Co-creation기반 스마트시티 구축 방안)

  • Park, Ju Seop;Hong, Soon-Goo;Kim, Na Rang
    • Journal of Korea Society of Industrial Information Systems / Vol. 21, No. 4 / pp. 67-78 / 2016
  • Recently, many countries around the world have been actively promoting smart city projects to address various urban problems such as traffic congestion, housing shortage, and energy scarcity. The development of the Internet of Things (IoT) has enabled the development of smart cities with sustainability, convenience, and environment-friendliness through the effective control and reuse of urban resources. The purpose of this study is to analyze the technical trends of IoT and present a development plan for the smart city, which is one of the applications of the IoT. To this end, the news articles of the Electronic Times between 2013 and 2015 were analyzed using text mining, and smart city development cases of other countries were investigated. The analysis results revealed the close relationships of big data, cloud, platforms, and sensors with the smart city. For the successful development of a smart city, first, all the interested parties in the city must work together to create new value throughout the entire value chain. Second, they must utilize big data and disclose public data more actively than they are doing now. This study makes an academic contribution in that it presents a big data analysis method and stimulates follow-up studies. As a practical contribution, the results of this study provide useful data for the policy making of local governments and administrative agencies for smart city development. This study may have limitations in capturing overall trends because only the news articles of the Electronic Times were selected to analyze the technical trends of the IoT.

A Curriculum Study to Strengthen AI and Data Science Job Competency (AI·데이터 사이언스 분야 직무 역량 강화를 위한 커리큘럼 연구)

  • Kim, Hyo-Jung;Kim, Hee-Woong
    • Informatization Policy / Vol. 28, No. 2 / pp. 34-56 / 2021
  • With the Fourth Industrial Revolution, demand for and interest in jobs in the field of AI and data science - such as artificial intelligence/data analysts - are increasing. In order to keep pace with this trend, and to supply human resources that can effectively perform such jobs in the relevant fields in a timely manner, job seekers must develop the competencies required by companies, and universities must be in charge of training. However, it is difficult to devise appropriate response strategies at the level of job seekers, companies, and universities, which are the stakeholders in supplying suitably competent personnel. Therefore, the purpose of this study is to determine which competencies are required in practice in order to cultivate and supply talent equipped with the necessary job competencies, and to propose plans for developing the required competencies at the university level. To identify the required competencies in the field of AI and data science, data on job postings on LinkedIn, a recruitment platform, were analyzed using text mining techniques. Then, concrete plans for competency development at the university level were devised and proposed by comparing the results of the topic model with an analysis of international graduate school curricula in the field of AI and data science and with the results of interviews with hiring managers.

An Analysis of the Experience of Visitors of Fishing Experience Recreation Village Using Big Data - A Focus on Baekmi Village in Hwaseong-si and Susan Village in Yangyang-gun - (빅데이터를 활용한 어촌체험휴양마을 방문객의 경험분석 - 화성시 백미리와 양양군 수산리 어촌체험휴양마을을 대상으로 -)

  • Song, So-Hyun;An, Byung-Chul
    • Journal of Korean Society of Rural Planning / Vol. 27, No. 4 / pp. 13-24 / 2021
  • This study used big data to analyze visitors' experiences in Fishing Experience Recreation Villages. Using portal site posting data for the past six years, the experience of visiting the Fishing Experience Villages in Baekmi and Susan was analyzed. The analysis used text mining and social network analysis, which are big data analysis techniques. Data were collected using Textom, and experience keywords were extracted by analyzing the frequency and importance of experience texts. Afterwards, the characteristics of the experience of visiting the Fishing Experience Villages were identified by analyzing the interactions between the experience keywords using 'UCINET 6.0' and 'NetDraw'. First, in terms of TF and TF-IDF values, keywords such as "Gungpyeong Port", "Susan Port", and "Yacht Marina", which refer to port names and port facilities, appeared at the top. This is interpreted to mean that the name of the port has the greatest impact on the recognition of the Fishing Experience Villages, and that visitors showed a lot of interest in the port facilities. Second, based on the values of degree, closeness, and betweenness centrality, the unique elements of port facilities and fishing villages such as "mud flat experience", "fishing village experience", "Gungpyeong port", "Susan port", "yacht marina", and "beach" were interpreted as interacting with various experiences. Third, the CONCOR analysis confirmed that the visitors' experience was focused on dynamic behavior, that the experience program had the greatest influence on the visitors' experience, and that static and dynamic experiences were relatively balanced. In conclusion, the experience of visitors in the Fishing Experience Villages is most affected by the environment of the fishing village, such as the tidal flats and the coast, and by the fishing village experience programs conducted at the fishing port facilities. 
In particular, it was found that fishing port facilities such as ports and marinas had a high influence on the awareness of the Fishing Experience Villages. Therefore, it is important to actively utilize the scenery and environment unique to fishing villages in order to revitalize the Fishing Experience Villages experience and improve the quality of the visitor experience. This study is significant in that it studied visitors' experiences in fishing village recreation villages using big data and derived the connection between fishing village and fishing village infrastructure in fishing village experience tourism.
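
Of the three centrality measures mentioned in the abstract above, degree centrality is the simplest to illustrate. The sketch below is not the study's UCINET workflow; it uses the common normalization degree / (n - 1) on an undirected keyword co-occurrence graph, and the edge list is hypothetical.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph:
    number of distinct neighbors divided by (number of nodes - 1)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical co-occurrence edges between experience keywords.
edges = [
    ("port", "yacht marina"),
    ("port", "beach"),
    ("port", "mud flat experience"),
]
centrality = degree_centrality(edges)
```

In this toy graph "port" co-occurs with every other keyword and therefore has the highest possible normalized degree, while each remaining keyword is linked only to "port".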

Deep Learning-based Professional Image Interpretation Using Expertise Transplant (전문성 이식을 통한 딥러닝 기반 전문 이미지 해석 방법론)

  • Kim, Taejin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / Vol. 26, No. 2 / pp. 79-104 / 2020
  • Recently, as deep learning has attracted attention, its use is being considered as a method for solving problems in various fields. In particular, deep learning is known to perform excellently when applied to unstructured data such as text, sound, and images, and many studies have proven its effectiveness. Owing to the remarkable development of text and image deep learning technology, interest in image captioning technology and its applications is rapidly increasing. Image captioning is a technique that automatically generates relevant captions for a given image by handling both image comprehension and text generation simultaneously. In spite of the high entry barrier of image captioning, in that analysts must be able to process both image and text data, image captioning has established itself as one of the key fields in AI research owing to its wide applicability. In addition, much research has been conducted to improve the performance of image captioning in various aspects. Recent studies attempt to create advanced captions that not only describe an image accurately but also convey the information contained in the image in a more sophisticated way. Despite many recent efforts to improve the performance of image captioning, it is difficult to find any research that interprets images from the perspective of domain experts in each field rather than from the perspective of the general public. Even for the same image, the parts of interest may differ according to the professional field of the person who encounters the image. Moreover, the way of interpreting and expressing the image also differs according to the level of expertise. The public tends to recognize the image from a holistic and general perspective, that is, from the perspective of identifying the image's constituent objects and their relationships. 
On the contrary, domain experts tend to recognize the image by focusing on the specific elements necessary to interpret the given image based on their expertise. This implies that the meaningful parts of an image differ depending on the viewer's perspective, even for the same image. Image captioning therefore needs to reflect this phenomenon. In this study, we propose a method to generate captions specialized for each domain by utilizing the expertise of experts in the corresponding domain. Specifically, after pre-training on a large amount of general data, the expertise in the field is transplanted through transfer learning with a small amount of expertise data. However, a simple application of transfer learning using expertise data may invoke another type of problem. Simultaneous learning with captions of various characteristics may invoke the so-called 'inter-observation interference' problem, which makes it difficult to learn each characteristic point of view in isolation. When learning with a vast amount of data, most of this interference is self-purified and has little impact on learning results. In contrast, in the case of fine-tuning, where learning is performed on a small amount of data, the impact of such interference on learning can be relatively large. To solve this problem, we propose a novel 'Character-Independent Transfer-learning' that performs transfer learning independently for each character. To confirm the feasibility of the proposed methodology, we performed experiments utilizing the results of pre-training on the MSCOCO dataset, which comprises 120,000 images and about 600,000 general captions. Additionally, according to the advice of an art therapist, about 300 pairs of 'image / expertise captions' were created, and the data were used for the experiments on expertise transplantation. 
The experiments confirmed that the proposed methodology generates captions from the perspective of the implanted expertise, whereas captions generated through learning on general data contain a number of contents irrelevant to expert interpretation. In this paper, we propose a novel approach to specialized image interpretation. To achieve this goal, we present a method that uses transfer learning to generate captions specialized for a specific domain. In the future, by applying the proposed methodology to expertise transplantation in various fields, we expect that much research will be actively conducted to solve the problem of a lack of expertise data and to improve the performance of image captioning.

Deriving Basic Living Service Items and Establishing Spatial Data in Rural Areas (농촌 생활권 기초생활서비스 항목 설정 및 공간데이터 구축을 위한 기초연구)

  • Kim, Suyeon;Kim, Sang-Bum
    • Journal of the Korean Institute of Rural Architecture / Vol. 24, No. 3 / pp. 39-46 / 2022
  • This study aims to derive basic living service facility items in rural areas and construct related spatial data. To do this, we conducted a literature review of the laws and systems related to the residential environment and services in rural areas, rural spatial planning, and the 'Rural Convention' strategic plan reports for the Jeolla and Gyeongsang regions in 2021. Primary data collection and a review of the list of basic living service items in rural areas derived from the analysis were then conducted. After data collection, 12 sectors and 44 types of rural basic living service items were derived; the data were selected based on the clarity of the subject of data management, whether the data were established nationwide, whether they were disclosed and provided, whether they were periodically updated, and whether they had an underlying legal basis. Afterwards, spatial data on the derived rural basic living service items were constructed. Because open data provided through various institutions were employed, data structures such as attribute values and code names had to be unified, and abnormal data such as address errors and omissions were refined. After that, the data provided in text form were converted into spatial data through geocoding, and through a comparative review of the distribution of the converted data and the provided addresses, spatial data related to rural basic living services were finally constructed for about 540,000 cases. Finally, implications for data construction for diagnosing rural living areas were derived from the data collection and construction process. 
The derived implications include data unification, data update system establishment, the establishment of attribute values necessary for rural living area diagnosis and spatial planning, data establishment plan for facilities that provide various services, rural living area analysis method, and diagnostic index development. This study is meaningful in that it laid the foundation for data-based rural area diagnosis and rural planning, by selecting the basic rural living service items, and constructing spatial data on the selected items.