• Title/Summary/Keyword: Text data


GPT-enabled SNS Sentence writing support system Based on Image Object and Meta Information (이미지 객체 및 메타정보 기반 GPT 활용 SNS 문장 작성 보조 시스템)

  • Dong-Hee Lee;Mikyeong Moon;Bong-Jun, Choi
    • Journal of the Institute of Convergence Signal Processing / v.24 no.3 / pp.160-165 / 2023
  • In this study, we propose an SNS sentence writing assistance system that uses YOLO and GPT to help users write text accompanied by images, such as SNS posts. We use the YOLO model to extract objects from images inserted during writing, extract meta-information such as GPS location and creation time, and use both as prompt values for GPT. We trained the YOLO model on form image data; its mAP score is about 0.25 on average. GPT was trained on 1,000 blog texts on the topic of 'restaurant reviews', and the resulting model was used to generate sentences from the two types of keywords extracted from the images. To evaluate the practicality of the generated sentences, we conducted a closed-ended survey so that the results could be analyzed clearly. Respondents were shown the inserted image and the sentences generated from its keywords and rated them on three evaluation items. The results showed that the keywords extracted from the images produced meaningful sentences. We also found that the accuracy of image-based sentence generation depends on the relationship between the image keywords and the content GPT was trained on.
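As a rough illustration of how detected objects and image metadata might be combined into a prompt, the sketch below assembles a single prompt string. The function name `build_prompt`, the prompt wording, and the sample values are all hypothetical; the abstract does not describe the actual prompt format, and the object labels are assumed to come from a detector that has already run.

```python
from datetime import datetime


def build_prompt(objects, place, created_at):
    """Combine detected object labels and image metadata into one prompt string.

    The wording of the prompt is an assumption made for illustration only.
    """
    keywords = ", ".join(objects)
    when = created_at.strftime("%Y-%m-%d %H:%M")
    return (
        f"Write a short SNS post about a photo showing {keywords}, "
        f"taken at {place} on {when}."
    )


# Hypothetical detector output and metadata for a single image.
prompt = build_prompt(["pasta", "salad"], "Busan", datetime(2023, 5, 1, 12, 30))
print(prompt)
```

Keeping prompt construction in one small function makes it easy to vary the wording without touching the detection or generation steps.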

Automatic scoring of mathematics descriptive assessment using random forest algorithm (랜덤 포레스트 알고리즘을 활용한 수학 서술형 자동 채점)

  • Inyong Choi;Hwa Kyung Kim;In Woo Chung;Min Ho Song
    • The Mathematical Education / v.63 no.2 / pp.165-186 / 2024
  • Despite growing attention to artificial intelligence-based automated scoring technology as a way to support the introduction of descriptive items in school environments and large-scale assessments, there is a noticeable lack of foundational research in mathematics compared to other subjects. This study developed an automated scoring model for two descriptive items in first-year middle school mathematics using the Random Forest algorithm, evaluated its performance, and explored ways to enhance this performance. The accuracy of the final models for the two items ranged from 0.95 to 1.00 and from 0.73 to 0.89, respectively, which is relatively high compared to automated scoring models in other subjects. We found that strategically selecting the number of evaluation categories, taking into account the amount of data, is crucial for the effective development and performance of automated scoring models. Additionally, text preprocessing by mathematics education experts proved effective in improving both the performance and interpretability of the automated scoring model. Selecting a vectorization method that matches the characteristics of the items and data was identified as one way to enhance model performance. Furthermore, we confirmed that oversampling is a useful way to supplement performance when practical limitations hinder balanced data collection. To enhance educational utility, further research is needed on how to use the feature importance derived from the Random Forest-based automated scoring model to generate useful information for teaching and learning, such as feedback. This study is significant as foundational research on automated scoring of descriptive mathematics items, and various follow-up studies are needed through close collaboration between AI experts and mathematics education experts.
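The oversampling idea mentioned in the abstract can be sketched with plain random oversampling, which duplicates minority-class samples until every class matches the largest one. The abstract does not say which oversampling method was used, so this variant, the function name `random_oversample`, and the sample data are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical scored answers: score 2 is over-represented.
answers = ["a1", "a2", "a3", "a4", "a5", "a6"]
scores = [2, 2, 2, 2, 1, 0]
balanced_answers, balanced_scores = random_oversample(answers, scores)
class_counts = Counter(balanced_scores)
print(class_counts)
```

In practice the oversampling would be applied only to the training split, so that duplicated answers do not leak into the evaluation data.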

A Study on the Perception of Pit and Fissure Sealant using Unstructured Big Data (비정형 빅데이터를 이용한 치면열구전색(치아홈메우기)에 대한 인식분석)

  • Han-A Cho
    • Journal of Korean Dental Hygiene Science / v.6 no.2 / pp.101-114 / 2023
  • Background: This study aimed to explore the overall perception of pit and fissure sealants and suggest ways to revitalize their currently stagnant use. Methods: To determine how social perception changed with the coverage policy for pit and fissure sealants, we divided the study span into five time periods: the first period (December 1, 2009 to November 30, 2010), the second period (December 1, 2010 to September 30, 2012), the third period (October 1, 2012 to May 5, 2013), the fourth period (May 6, 2013 to September 30, 2017), and the fifth period (October 1, 2017 to December 31, 2022). We used text mining, an unstructured big data analysis method. Keywords were collected and analyzed using Textom, and frequency analysis of the top 30 keywords, analysis of the structural features of the semantic network, centrality analysis, QAP correlation analysis, and co-occurrence analysis were conducted. Results: The frequency analysis showed that the top keywords for each time period were 'Cavities', 'Treatment', and 'Children'. In the structural features of the semantic network of pit and fissure sealants by time period, the density index was around 1.00 for all time periods. The QAP correlation analysis showed the highest correlation between the first and second periods and between the fourth and fifth periods, with a correlation coefficient of 0.834. The co-occurrence analysis showed that 'cavities' and 'prevention' were the top two words across all time periods. Conclusion: This study showed that pit and fissure sealants are well accepted by society as a preventive treatment for caries. However, awareness of health education related to these sealants was found to be low. Efforts to revitalize the stagnant use of pit and fissure sealants need to be strengthened with effective education.

Analysis of the AI Convergence Science Education Research Trends Using Text Mining (텍스트 마이닝을 활용한 AI융합 과학교육 연구 동향 분석)

  • Lee, Ju-Young
    • Journal of Korean Elementary Science Education / v.43 no.4 / pp.544-553 / 2024
  • The purpose of this study was to analyze trends in research on artificial intelligence and science education and to derive important problems, topics, and research trends. The analysis targeted 83 articles on awareness of artificial intelligence, research trends, and the design, development, and application of education programs related to artificial intelligence. The data were collected through RISS and refined using Excel and Textom, and the main keywords were identified and analyzed through frequency analysis and keyword network analysis. The connection centrality of the keywords was confirmed using the CONCOR analysis. The results showed that AI convergence science education research was expanding in both quantitative and qualitative terms, and that the main keywords were 'AI,' 'AI convergence education,' 'AI convergence science education,' 'AI education,' 'science education,' 'science,' 'machine learning,' 'elementary school,' 'generative AI,' and 'educational program.' The connection centrality analysis and CONCOR analysis confirmed that clusters formed around 'naming,' 'content and method,' 'elementary,' and 'data' in AI integrated science education. Based on the results, the main topics and trends of research integrating artificial intelligence into science subjects were derived, and implications and directions for follow-up research were set forth.

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing them to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of the fields of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing large streaming datasets from Twitter in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets containing candidates' names and the election from Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects social trends effectively. The system also retrieves the list of terms that co-occur with given query terms. We compare the results of term co-occurrence retrieval using the names of the influential candidates, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two types of patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track an issue faster than other media, namely newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are formed on Twitter: users build relationships by exchanging mentions. We visualize and analyze mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all candidates' names regardless of their political tendencies. This case study shows that Twitter could be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks could reveal different aspects of user behavior as a network structure unique to Twitter.
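The term co-occurrence retrieval described above can be illustrated with a minimal sketch that counts which terms appear in the same tweet as a query term. The function name `cooccurring_terms`, the whitespace tokenization, and the toy tweets are assumptions made for illustration; the abstract does not describe the system's actual tokenization or ranking.

```python
from collections import Counter


def cooccurring_terms(tweets, query, top_n=3):
    """Count terms that appear in the same tweet as the query term."""
    query = query.lower()
    counts = Counter()
    for tweet in tweets:
        # Each term is counted at most once per tweet.
        tokens = set(tweet.lower().split())
        if query in tokens:
            counts.update(tokens - {query})
    return counts.most_common(top_n)


# Toy tweets, already reduced to keywords (hypothetical data).
tweets = [
    "moon poll election",
    "moon election debate",
    "park poll election",
    "ahn debate",
]
top_terms = cooccurring_terms(tweets, "moon")
print(top_terms)
```

Running the same function with each candidate's name as the query gives per-candidate term lists that can then be compared, as in the case study.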

Implementation of Realtime B2B System using Mobile Terminal (모바일 단말기를 이용한 실시간 B2B 시스템 구현)

  • Lee Hyae-Jung;Joung Suck-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.1 / pp.1-6 / 2006
  • Before business computerization in industry, recording every account book by hand was inefficient. Recently, many companies have raised their competitiveness through improved task competence and cost reduction enabled by electronic data processing systems. Moreover, the sudden increase in internet users calls for effective management information systems and information sharing over network connections. Accordingly, it is necessary to move from text-based information systems to multimedia-based systems and to introduce an e-catalog system for merchandise information. In addition, technological development and merchandising that offer mobility, real-time operation, ubiquity, and portability are required. In this paper, we implement a portable B2B system for the precious metals and jewelry field using mobile terminals and mobile technology.

A Study on Implementation of Writing Supporting System(ICWS) for Interactive Storytelling Contents (인터렉티브 스토리텔링 콘텐츠 저작지원도구 설계 및 구현에 관한 연구)

  • Lee, Eun Ryoung;Kim, Kio Chung
    • Journal of Digital Convergence / v.11 no.2 / pp.263-269 / 2013
  • This paper applies a writing support system to the writing tool data model developed in a previous study on interactive storytelling about family stories. The family story writing support system enables users to create text, images, videos, and other digital content based on experimental knowledge collected from the first and second generations. This study of a writing tool system for family stories aims to create high-quality, documentary-based content about each family member and the family's history, while also overcoming generation gaps and the lack of creation infrastructure. Through this process, the authors aim to contribute to the expansion of creation tools that can be applied to other research and writing tools.

Morphological Analysis Study for the Development of DB on the Medicinal Herbs Manufacturing Process - with focus on the manufacturing method of Rehmanniae radix - (본초 제조 공정의 DB화를 위한 형태소 분석 연구 - 숙지황 제조 공정을 중심으로 -)

  • Kim, Thaeyul;Kim, Kiwook;Kim, Byungchul;Lee, Byungwook
    • Journal of Society of Preventive Korean Medicine / v.20 no.1 / pp.111-124 / 2016
  • Objectives: Drug-based treatment has long been used in Korean medicine, and databases have been developed and used to manage such treatments more efficiently. Most of these drug knowledge databases consist of the origin, efficacy, temperament, ingredients, and application examples of standardized drugs. Knowledge is also exchanged with other specialized areas through the efficacies and ingredients of the drugs. However, although the manufacturing process of a drug affects its efficacy and ingredients, details of manufacturing processes are largely limited to simple text sentences, which results in substantially lower utilization and makes systematic research on the factors involved in manufacturing more difficult than for other drug knowledge. In this study, we aimed to build a data structure for the terminology that represents the manufacturing process of herbs. Methods: This study extracted the factors needed to develop a database by performing morphological analysis of descriptions of the herb manufacturing process. Results: The factors are 'Order', 'Act', 'Raw material', 'Tools', 'Supporting materials', 'Intensity', 'Duration Time', 'Interval', 'Focus', 'Repetition Number', and 'Until'. Using these factors and a simple structured query language, we were able to distinguish differences between manufacturing processes. Conclusions: Morphological analysis of the medicinal herb manufacturing process contributes to standardizing manufacturing process information and helps create a quality management system through the database.

The Effect of the Science Process Skills and Science Related Attitude on the Science-play through the Science Class (과학 놀이를 이용한 과학수업이 과학 탐구 능력과 과학 관련 태도에 미치는 영향)

  • Heo, Kwi-Hee;Lee, Ji-Hwa;Moon, Seong-Bae
    • Journal of the Korean Society of Earth Science Education / v.7 no.1 / pp.1-10 / 2014
  • The purpose of this study is to introduce science-play into regular classes in order to stimulate students' curiosity, motivate them, and encourage them to take an active part in science class. To make science classes effective, we developed science-play activities to replace the experiments in the textbook and applied them in class. The experimental group showed statistically significant results in science process skills, especially in sub-elements such as observation, deduction, expectation, data analysis, and assumption establishment (p<.01). The comparison group, however, showed no significant results in science process skills. The average science-related attitude score of the experimental group increased only slightly and the change was not statistically significant, whereas that of the comparison group decreased during the same period. In the experimental group, the science-play activities were repeated and science-related attitude increased a little. Although the change was not statistically significant (p>.05), the science-play activity appeared to have a positive effect on science-related attitude. These results suggest that science-play activities can improve students' science process skills and science-related attitude, and that science-play programs should be further developed and applied to make science classes easy and effective.

Design of Automatic Document Classifier for IT documents based on SVM (SVM을 이용한 디렉토리 기반 기술정보 문서 자동 분류시스템 설계)

  • Kang, Yun-Hee;Park, Young-B.
    • Journal of IKEEE / v.8 no.2 s.15 / pp.186-194 / 2004
  • Due to the exponential growth of information on the internet, it is becoming difficult to find and organize relevant information. To reduce the heavy overhead of accessing information, automatic text classification that can handle enormous numbers of documents is necessary. In this paper, we describe the structure and implementation of a document classification system for web documents. We use an SVM as the document classification model, constructed from a training set and its representative terms in a directory. In our system, the SVM is trained and used for document classification with a word set extracted from web documents related to information and communication. In addition, we use the vector-space model to represent document characteristics based on TF-IDF, and the training data consist of positive and negative classes represented by weighted characteristic sets. Experiments show the categorization results and the correlation with vector length.
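The TF-IDF vector-space representation mentioned in the abstract can be sketched as follows. This uses one common weighting, tf × log(N / df), which is an assumption; the abstract does not state which TF-IDF variant was used, and the SVM training step is not shown. The function name `tfidf_vectors` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


# Hypothetical IT-related documents, already reduced to keywords.
docs = [
    "network router protocol",
    "network database query",
    "database index query",
]
vectors = tfidf_vectors(docs)
print(vectors[0])
```

Terms that occur in fewer documents receive larger weights, so 'router' outweighs 'network' in the first document; weighted vectors of this kind would then serve as the input features for a classifier such as an SVM.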
