• Title/Summary/Keyword: linguistic analysis

Search Results: 518

A Study on the Fraud Detection in an Online Second-hand Market by Using Topic Modeling and Machine Learning (토픽 모델링과 머신 러닝 방법을 이용한 온라인 C2C 중고거래 시장에서의 사기 탐지 연구)

  • Dongwoo Lee; Jinyoung Min
    • Information Systems Review, v.23 no.4, pp.45-67, 2021
  • As the transaction volume of the C2C second-hand market grows, the number of frauds, which aim to earn unfair gains by sending products different from those specified or by not sending them to buyers at all, is also increasing. This study explores a model that can identify frauds in the online C2C second-hand market by examining transaction postings. For this goal, we collected 145,536 field records from an actual C2C second-hand market. A model was then built from the characteristics of the postings, such as the topic and the linguistic characteristics of the product description, together with the characteristics of the products, postings, sellers, and transactions, and was trained with the XGBoost machine learning algorithm. The final results show that fraudulent postings contain less, and less specific, information; have fewer nouns and images; have a higher ratio of digits and white space; and are shorter than genuine postings. Also, while genuine postings focus on product information in their nouns, delivery information in their verbs, and actions in their adjectives, fraudulent postings did not show those characteristics. This study shows that various features can be extracted from postings written for C2C second-hand transactions and used to construct an effective fraud detection model. The proposed model can also be applied to other C2C platforms. Overall, the model proposed in this study can be expected to help suppress and prevent fraudulent behavior in online C2C markets.
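
A minimal sketch of how such posting-level features might feed an XGBoost classifier, in the spirit of the model described above. The file name, feature names, and hyperparameters below are illustrative assumptions, not the paper's actual pipeline.

```python
# Hedged sketch: training a fraud classifier on posting-level features with XGBoost.
# The feature table, column names, and settings are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from xgboost import XGBClassifier

# Hypothetical table: one row per posting, engineered features plus a fraud label.
df = pd.read_csv("postings_features.csv")  # assumed file
features = ["noun_count", "image_count", "digit_ratio", "whitespace_ratio",
            "text_length", "topic_id", "seller_rating", "price"]
X, y = df[features], df["is_fraud"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1,
                      eval_metric="logloss")
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```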

Analysis of High School Students' Conceptual Change in Model-Based Instruction for Blood Circulation (혈액 순환 모형 기반 수업에서 고등학생들의 개념 변화 분석)

  • Kim, Mi-Young; Kim, Heui-Baik
    • Journal of The Korean Association For Science Education, v.27 no.5, pp.379-393, 2007
  • The purpose of this article is to analyze the conceptual change of nine 11th graders after implementing model-based instruction on blood circulation, using a multidimensional framework, and to draw implications for teaching strategies that improve conceptual understanding. The model-based instruction consisted of four periods: (1) an introduction to induce students' interest using an episode from the science history of blood circulation, (2) a vivisection experiment on rats, (3) visual-linguistic model instruction using a videotape of the heartbeat, and (4) a modeling activity on the path of blood flow. Based on data from a pre-test, a post-test, and interviews, we classified students' models of the path of blood flow and investigated their ontological features and the conceptual status of blood circulation. Most students could describe the path of blood flow and the changes of substances in the blood precisely after the instruction. However, the modeling activity was not sufficient to improve students' understanding of the mechanisms of blood distribution across various organs and of the material exchange between blood and tissues. From the interviews with the nine students, we obtained informative results about the conceptual status elements that helped, hindered, or were not used in students' understanding. It was also found that students' conceptual status depended on the ontological categories into which their conceptions of blood circulation fell. The results of this study can help design effective teaching strategies for understanding concepts in the equilibrium category.

A Narrative Inquiry on Korean Language Learners' Korean Proficiency and Academic Adjustment in College Life (학문 목적 한국어 학습자의 한국어 능력과 학업 적응에 관한 연구)

  • Cheong Yeun Sook
    • Journal of the International Relations & Interdisciplinary Education, v.4 no.1, pp.57-83, 2024
  • This study investigated the impact of foreign exchange students' scores on the Test of Proficiency in Korean (TOPIK) on their academic adaptation. Seven students were recruited with Institutional Review Board (IRB) approval, and their interview contents were analyzed using a six-stage comprehensive analysis procedure based on pragmatic eclecticism (Lee & Kim, 2014). As a result, the factors influencing the academic adaptation of Korean language learners for academic purposes were categorized into three dimensions: academic, daily life, and psychological-emotional. On the academic front, interviewees pointed to difficulties in adapting to specialized terminology and to studying in their majors, as well as significant challenges with Chinese characters and Sino-Korean words. From a daily life perspective, even participants with advanced TOPIK scores faced difficulties adapting to university life, emphasizing the need for practical expressions and an extensive vocabulary to adjust properly to life in Korea. Lastly, within the psychological-emotional dimension, despite being advanced TOPIK holders, they were found to experience considerable stress in conversations or presentations with Koreans. Their lack of sociocultural and everyday cultural knowledge also led to linguistic errors and contributed to psychological-emotional difficulties, despite their proficiency in Korean. Based on these narratives, the study concluded that promoting the academic adaptation of Korean language learners requires providing opportunities for Korean language learning. With this goal in mind, efforts should be directed toward enhancing learners' academic proficiency in their majors, improving their Korean language fluency, and fostering interpersonal relationships within the academic community. Furthermore, the researchers suggested implementing various extracurricular activities tailored to foreign learners.

A Discourse Analysis of Science Teachers' Scientific Modeling Activities: A Case from Earth Science Teacher Training (과학 모델링 활동에 나타난 교사의 담화 분석 -지구과학 교사 연수 사례-)

  • Heungjin Eom; Hyunjin Shim
    • Journal of The Korean Association For Science Education, v.44 no.4, pp.301-311, 2024
  • We developed a small-group training program for in-service teachers focused on scientific modeling, collected the discourses of the teachers who participated in the activity, and analyzed them by type. The training program employed a collaborative approach in which a small group completed tasks and produced outputs on the theme of 'galaxies and the Universe', to enable practical application in classes. Three in-service science teachers participated in the program. Their discourses were recorded, transcribed, and classified into types based on individual turns and interaction units. The teachers' language expressions reflected the characteristics of the teaching profession: each participant had preferred language expression types, although the overall prevalence of any specific expression type across participants was low. Differences in discourse characteristics related to the modeling theme, task presentation method, and model types revealed that variations in the proportion of interaction unit types during the modeling design, build, and evaluation stages were primarily influenced by the teachers' familiarity with the modeling theme. While the task presentation method also influenced interaction types, model types had little impact on the distribution of interaction types. Considering these findings, training programs on modeling for in-service teachers should include a checklist to encourage sufficient interaction between participants as well as propose appropriate questions that can be effectively addressed through collaboration.

A Study of 'Emotion Trigger' by Text Mining Techniques (텍스트 마이닝을 이용한 감정 유발 요인 'Emotion Trigger'에 관한 연구)

  • An, Juyoung; Bae, Junghwan; Han, Namgi; Song, Min
    • Journal of Intelligence and Information Systems, v.21 no.2, pp.69-92, 2015
  • The explosion of social media data has led researchers to apply text-mining techniques to big social media data in a more rigorous manner. Even though social media text analysis algorithms have improved, previous approaches still have limitations. In sentiment analysis of social media written in Korean, there are two typical approaches. One is the linguistic approach using machine learning, which is the most common; some studies have added grammatical factors to the feature sets used to train classification models. The other approach applies semantic analysis to sentiment analysis, but it has mainly been applied to English texts. To overcome these limitations, this study applies the Word2Vec algorithm, a neural-network-based method, to capture the more extensive semantic features that have been underestimated in existing sentiment analysis. The result of adopting the Word2Vec algorithm is compared with the result of co-occurrence analysis to identify the difference between the two approaches. The results show that Word2Vec extracts about three times as many words expressing emotion about the keyword as co-occurrence analysis does. This difference comes from Word2Vec's vectorization of semantic features; the Word2Vec algorithm is therefore able to catch hidden related words that have not been found by traditional analysis. In addition, Part-of-Speech (POS) tagging for Korean is used to detect adjectives as 'emotion words'. The emotion words extracted from the text are converted into word vectors by the Word2Vec algorithm to find related words. Among these related words, nouns are selected because each of them may have a causal relationship with the emotion word in the sentence. The process of extracting these trigger factors of emotion words is named 'Emotion Trigger' in this study. As a case study, the datasets were collected by searching with three keywords: professor, prosecutor, and doctor, because these keywords attract rich public emotion and opinion. Preliminary data collection was conducted to select secondary keywords for data gathering. The secondary keywords used to gather the data for the actual analysis are as follows: Professor (sexual assault, misappropriation of research money, recruitment irregularities, polifessor), Doctor (Shin Hae-chul Sky Hospital, drinking and plastic surgery, rebate), Prosecutor (lewd behavior, sponsor). The text data comprise about 100,000 documents (Professor: 25,720; Doctor: 35,110; Prosecutor: 43,225), gathered from news, blogs, and Twitter to reflect various levels of public emotion. Gephi (http://gephi.github.io) was used for visualization, and all programs used for text processing and analysis were written in Java. The contributions of this study are as follows. First, different approaches to sentiment analysis are integrated to overcome the limitations of existing approaches. Second, finding Emotion Triggers can detect hidden connections to public emotion that existing methods cannot. Finally, the approach used in this study can be generalized regardless of the type of text data. The limitation of this study is that it is hard to confirm that a word extracted by the Emotion Trigger process has a significant causal relationship with the emotion word in a sentence. Future work will clarify this causal relationship by comparing the extracted relationships with manually tagged ones. Furthermore, the text data used for Emotion Trigger include Twitter posts, which have a number of distinct features not dealt with in this study; these will be considered in further work.
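
A hedged sketch of the Word2Vec step described above: train word vectors over POS-tagged text, then keep the nouns closest to a seed emotion word as candidate triggers. The toy corpus, tokenization, and seed word are placeholders, and the original study used Java tooling rather than this Python/gensim stack.

```python
# Hedged sketch of the Word2Vec-based "Emotion Trigger" idea:
# train word vectors, then look up nouns most similar to a seed emotion word.
# The corpus, tags, and seed word below are illustrative assumptions.
from gensim.models import Word2Vec

# Assume each document is already POS-tagged into (token, tag) pairs.
tagged_docs = [
    [("교수", "Noun"), ("논란", "Noun"), ("심각하다", "Adjective")],   # "professor", "controversy", "serious"
    [("검찰", "Noun"), ("불신", "Noun"), ("화나다", "Adjective")],     # "prosecution", "distrust", "angry"
]

sentences = [[tok for tok, _ in doc] for doc in tagged_docs]
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

noun_vocab = {tok for doc in tagged_docs for tok, tag in doc if tag == "Noun"}
seed_emotion_word = "화나다"  # placeholder adjective treated as the emotion word

# Candidate triggers: nouns among the words most similar to the emotion word.
similar = model.wv.most_similar(seed_emotion_word, topn=20)
triggers = [(word, score) for word, score in similar if word in noun_vocab]
print(triggers)
```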

Analysis of activity tasks using multiple intelligences in middle school 「Technology·Home Economics」 textbooks - Focusing on the 'Dietary Life' unit according to the 2015 revised Practical Arts (Technology·Home Economics) curriculum - (중학교 기술·가정 교과서 다중지능 활용 활동과제 분석 - 2015 개정 실과(기술·가정) 교육과정에 따른 '식생활' 단원을 중심으로 -)

  • Choi, Seong-Youn; Lee, Young-Sun; Choi, Ye-Ji; Joo, Hyun-Jung; Kim, Seung-Hee; Park, Mi-Jeong
    • Journal of Korean Home Economics Education Association, v.30 no.3, pp.19-42, 2018
  • The purpose of this study is to analyze the 'dietary life' tasks in textbooks developed according to the 2015 revised middle school 「Technology·Home Economics」 curriculum, based on teaching and learning methods that use multiple intelligences. To accomplish this purpose, the units 'Nutrition and Dietary Behavior of Adolescents', 'Planning and Choosing Meals', and 'Choosing Foods and Safe Cooking' in 12 middle school 「Technology·Home Economics」 textbooks were examined; excluding the questions, the tasks that students can perform were analyzed in terms of teaching and learning methods using multiple intelligences. The analysis followed the content analysis method, focusing on learning activities: the sub-questions of an activity were all counted as part of that activity, and preparation steps forming a continuous sequence were grouped as one activity. Three researchers analyzed the activities and revised and supplemented the analysis criteria through consultation, and the other three researchers confirmed the results. Across the 12 textbooks, the number of activity tasks ranged from 25 to 74 per textbook, for a total of 527 activities. By the distribution of multiple intelligences, 35% of the tasks used logical-mathematical intelligence, followed by linguistic (26.8%), intrapersonal (23%), interpersonal (7.2%), spatial (3.8%), bodily-kinesthetic (2.7%), and musical intelligence (1.5%). No activity task used naturalist intelligence; all other intelligences were utilized. This can be interpreted to mean that the dietary life unit of the home economics curriculum is convergent, in that it reorganizes contents and methods extracted from other subjects related to dietary life. This study is expected to provide a framework of varied teaching and learning methods for activating student-participation classes and to offer an alternative for realizing convergence education in the home economics curriculum.

Revisiting the cause of unemployment problem in Korea's labor market: The job seeker's interests-based topic analysis (취업준비생 토픽 분석을 통한 취업난 원인의 재탐색)

  • Kim, Jung-Su; Lee, Suk-Jun
    • Management & Information Systems Review, v.35 no.1, pp.85-116, 2016
  • The present study aims to explore the causes of employment difficulty on the basis of job applicants' interests from a P-E (person-environment) fit perspective. Our approach relied on a textual analytic method to reveal insights from their situational interests in job searching amid changes in the labor market. To investigate their major interests and psychological responses, user-generated texts were collected from an online community where job seekers share information and opinions, covering January 1, 2013 through December 31, 2015. The results of the topic analysis indicated that users' primary interests fell into four types: perception of vocational expectations, employment pre-preparation behaviors, perception of the labor market, and job-seeking stress. Specifically, job applicants were mainly concerned with monetary reward and the form of employment rather than with their work values or career exploration, and young job applicants expressed their psychological responses in contextualized language (e.g., slang, vulgarisms), projecting their unstable state under the uncertainty of environmental change. Additionally, they perceived a restricted set of preparation activities (e.g., certifications, English exams) as the determining factors of employment success and suffered from job-seeking stress. On the basis of these findings, current unemployment problems can be attributed to the absence of a pursuit of the value of vocation and work by individuals, organizations, and society. Concretely, job seekers are preoccupied with the social prestige of occupations and hold undecided vocational values, while most companies do not perceive the importance of human resources and have overlooked the need to develop a proper work environment that stimulates individual motivation. This study's attempt to reinterpret the effect of the environment by classifying job applicants' interests with reference to linguistic and psychological theories not only supports a more comprehensive understanding of social issues but also suggests new directions for future research on job applicants' psychological factors (e.g., attitudes, motivation) using topic analysis.
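
The abstract does not name the specific topic model used, so the following is a hedged sketch of one plausible choice, LDA via gensim, applied to pre-tokenized community posts; the documents and the number of topics are placeholder assumptions, not the study's actual setup.

```python
# Hedged sketch of a topic analysis over job-seeker community posts.
# The paper does not specify its model or preprocessing; LDA with gensim is
# shown here as one plausible approach, and the documents are placeholders.
from gensim import corpora
from gensim.models import LdaModel

posts = [
    ["toeic", "certificate", "resume", "deadline"],
    ["salary", "contract", "intern", "conversion"],
    ["interview", "stress", "anxiety", "waiting"],
    ["company", "benefits", "overtime", "culture"],
]  # assumed pre-tokenized posts

dictionary = corpora.Dictionary(posts)
corpus = [dictionary.doc2bow(doc) for doc in posts]

lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=4,
               passes=10, random_state=42)
for topic_id, terms in lda.print_topics(num_words=4):
    print(topic_id, terms)
```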


Understanding the Legal Structure of German Human Gene Testing Act (GenDG) (독일 유전자검사법의 규율 구조 이해 - 의료 목적 유전자검사의 문제를 중심으로 -)

  • Kim, Na-Kyoung
    • The Korean Society of Law and Medicine, v.17 no.2, pp.85-124, 2016
  • The German Human Gene Testing Act (GenDG) starts from the characteristic feature of gene testing, namely its dualistic structure consisting of analysis on the one side and interpretation on the other. The linguistic distinction between 'testing', 'analysis', and 'judgment' in the act is a fine example. Another important basis of the regulation is the ideological purpose of the law, that is, informational autonomy. The normative texts as such and this founding principle form the basis for the classification of testing types. In particular, gene testing for medical purposes is classified into testing for diagnostic purposes and testing for predictive purposes. However, these two types are not always clearly differentiated, because predictive value is common to both. In the legal regulation of gene testing it is therefore important to manage the uncertainty and subjectivity inherent in gene analysis and judgment. GenDG sets up a system for ensuring the quality of analysis, and the GEKO (Committee for Gene Testing), based on Section 23 of GenDG, concretizes the criteria of validity through guidelines. In the case of gene testing for medical purposes it is also very important to set up a system that ensures the procedural rationality of interpretation. The interpretation of analysis results spans a wide spectrum, owing to the continuous development of technology on the one hand and the differing understandings of the various actors who perform gene testing on the other. Therefore, the process should include communication with patients so that they can understand the meaning of gene testing and make plans for their lives. GenDG regulates the process of genetic counselling, and the GEKO concretizes this regulation very precisely. The regulatory approach of GenDG thus seems highly suggestive for Korean legal policy concerning gene testing.


A MVC Framework for Visualizing Text Data (텍스트 데이터 시각화를 위한 MVC 프레임워크)

  • Choi, Kwang Sun; Jeong, Kyo Sung; Kim, Soo Dong
    • Journal of Intelligence and Information Systems, v.20 no.2, pp.39-58, 2014
  • As the importance of big data and related technologies continues to grow in industry, visualizing the results of processing and analyzing big data has come into the spotlight. Data visualization gives people an effective and clear understanding of analysis results, and it also serves as the GUI (Graphical User Interface) that supports communication between people and analysis systems. To make development and maintenance easier, these GUI parts should be loosely coupled from the data processing and analysis parts. To implement such a loosely coupled architecture, it is necessary to adopt design patterns such as MVC (Model-View-Controller), which minimizes coupling between the UI part and the data processing part. Big data can be classified as structured or unstructured, and visualizing structured data is relatively easy compared to visualizing unstructured data. Nevertheless, as the use and analysis of unstructured data has spread, people usually develop a visualization system separately for each project to overcome the limitations of traditional visualization systems built for structured data. For text data, which makes up a large part of unstructured data, visualization is even more difficult. This results from the complexity of text analysis technologies, such as linguistic analysis, text mining, and social network analysis, and from the fact that these technologies are not standardized. This situation makes it harder to reuse one project's visualization system in other projects. We assume the reason is a lack of commonality in the design of visualization systems with expansion to other systems in mind. In our research, we suggest a common information model for visualizing text data and propose a comprehensive and reusable framework, TexVizu, for visualizing text data. First, we survey representative research in the text visualization area and identify common elements of text visualization and common patterns across its various cases. We then review and analyze these elements and patterns from three viewpoints: structural, interactive, and semantic. On this basis we design an integrated model of text data that represents the elements for visualization. The structural viewpoint identifies structural elements of various text documents, such as title, author, and body. The interactive viewpoint identifies the types of relations and interactions between text documents, such as post, comment, and reply. The semantic viewpoint identifies semantic elements that are extracted by linguistic analysis of the text and represented as tags classifying entity types such as person, place or location, time, and event. We then extract and choose common requirements for visualizing text data, categorized into four types: structure information, content information, relation information, and trend information. Each requirement type comprises the required visualization techniques, the data, and the goal (what to know). These are the key requirements for designing a framework that keeps the visualization system loosely coupled from the data processing and analysis systems. Finally, we designed a common text visualization framework, TexVizu, which is reusable and extensible across visualization projects by collaborating with various Text Data Loaders and Analytical Text Data Visualizers through common interfaces such as ITextDataLoader and IATDProvider. TexVizu comprises the Analytical Text Data Model, Analytical Text Data Storage, and Analytical Text Data Controller; external components are specified by the interfaces required to collaborate with the framework. As an experiment, we applied this framework to two text visualization systems: a social opinion mining system and an online news analysis system.
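
A hedged sketch of the interface-based decoupling described for TexVizu. Only the names ITextDataLoader, IATDProvider, Analytical Text Data Storage, and Analytical Text Data Controller come from the abstract; the method names, types, and the Python rendering of what is presumably a Java-style framework are illustrative assumptions.

```python
# Hedged sketch of the interface-based decoupling described for TexVizu.
# Only the interface/component names come from the abstract; method names,
# signatures, and the Python rendering are illustrative assumptions.
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class ITextDataLoader(ABC):
    """External component that loads raw text documents into the framework."""

    @abstractmethod
    def load(self, source: str) -> List[Dict[str, Any]]:
        ...


class IATDProvider(ABC):
    """External component that provides Analytical Text Data to a visualizer."""

    @abstractmethod
    def get_analytical_data(self, query: Dict[str, Any]) -> Dict[str, Any]:
        ...


class AnalyticalTextDataController:
    """Framework-side controller: mediates between loaders, storage, and views."""

    def __init__(self, loader: ITextDataLoader, provider: IATDProvider) -> None:
        self.loader = loader
        self.provider = provider
        self.storage: List[Dict[str, Any]] = []  # stand-in for Analytical Text Data Storage

    def ingest(self, source: str) -> None:
        # Pull documents through the loader interface into storage.
        self.storage.extend(self.loader.load(source))

    def data_for_view(self, query: Dict[str, Any]) -> Dict[str, Any]:
        # Views request analytical data only through the provider interface.
        return self.provider.get_analytical_data(query)
```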

Validation of Korean Diagnostic Scale of Multiple Intelligence (한국형 다중지능 진단도구의 타당화)

  • Moon, Yong-Lin; Yu, Gyeong-Jae
    • (The) Korean Journal of Educational Psychology, v.23 no.3, pp.645-663, 2009
  • The purpose of this study is to develop and validate a Korean Diagnostic Scale of Multiple Intelligences (MI), an alternative test that avoids the problems of Shearer's earlier MI test and adopts H. Gardner's suggestions for developing MI assessments. The test was developed in five versions: for kindergartners, lower elementary graders, upper elementary graders, middle schoolers, and high schoolers. The item formats are of three types: multiple-choice items for achievement, true-or-false items for ability, and self-reported Likert-scale items for interest and ability. Following H. Gardner's suggestions, we tried to reanalyze the key components of MI, analyze overlapping or hierarchical relationships between intelligences, develop intelligence-fair items, and diversify the item formats. We developed the final standardized test through first and second preliminary-test analyses and sampled 5,585 students by age, gender, and region. From this sample we obtained norm scores, allowing individuals' scores to be compared with those of others. To validate the test, we analyzed behavioral observations, means, standard deviations, percentages of correct answers, the reliability of each test version, correlations between intelligence scales, and a Kruskal-Wallis test of the mean ranks of career choice by intelligence. The correlation analysis between sub-intelligence scales indicates that this MI test satisfies the assumption of independence between intelligences. Furthermore, the non-parametric Kruskal-Wallis test of career choice by intelligence shows that MI is related to the domain of career choice. The test is not biased toward linguistic and logical-mathematical intelligence but is intelligence-fair, and it allows an individual's potential to be compared against a norm score. It could also be useful for educational prescription or counseling by comparing an individual's ability, interest, and achievement. However, the test is limited for factor or correlation analysis between sub-test types, because the number of items was minimized due to time constraints and the burden on test takers. If it could be administered with more items over two sessions, further research could overcome these constraints and carry out additional validation analyses.
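
A hedged sketch of two of the validation statistics mentioned above, an inter-scale correlation and a Kruskal-Wallis test of scores across career-choice groups, using scipy on placeholder data; none of the numbers reflect the study's actual sample or results.

```python
# Hedged sketch of the validation statistics named in the abstract:
# a correlation between two intelligence scales and a Kruskal-Wallis test of
# score ranks across career-choice groups. All data here are placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical scores of 300 students on two intelligence scales.
linguistic = rng.normal(50, 10, 300)
logical_math = rng.normal(50, 10, 300)
r, p = stats.pearsonr(linguistic, logical_math)
print(f"inter-scale correlation r={r:.2f}, p={p:.3f}")

# Hypothetical linguistic-intelligence scores grouped by intended career domain.
career_groups = [rng.normal(mu, 10, 100) for mu in (48, 52, 55)]
h, p_kw = stats.kruskal(*career_groups)
print(f"Kruskal-Wallis H={h:.2f}, p={p_kw:.3f}")
```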