A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
  • Recently, numerous documents containing unstructured data and text have been created due to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, manual categorization cannot guarantee accuracy and requires a large amount of time and cost. Many studies have been conducted on automatic categorization to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that one document can be assigned to only one category. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, these are also limited in that their learning process requires training on a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement of a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that extends the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document for multiple categories.
The results imply that a document can be classified into a certain category if and only if its matching score is higher than the predefined threshold. For example, we can classify a document into three categories whose matching scores exceed the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other ordinary text documents. We collected news articles from July 2012 to June 2013. The number of articles varies considerably across categories. This is because readers have different levels of interest in each category and because the frequency of events differs from category to category. To minimize the distortion caused by the differing numbers of articles per category, we extracted 3,000 articles from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the collected news articles, we calculated document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of 23,089 documents.
Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, whereas they were 0.838, 0.290, and 0.431 when the top 1-3 predicted categories were considered. It was very interesting to find a large variation among the eight categories in precision, recall, and F-score.
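
The scoring steps described in this abstract can be illustrated with a minimal sketch. It assumes that the document/category score is the sum, over topics, of the document/topic weight multiplied by the topic/category weight; the abstract does not give the exact formula, and the function names, toy weights, and threshold below are hypothetical. The last function only checks the arithmetic of the reported F-scores as the harmonic mean of precision and recall.

```python
from typing import Dict, List


def category_scores(doc_topics: List[float],
                    topic_categories: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine document/topic weights with topic/category weights.

    doc_topics[i] is the (assumed) weight of topic i in the document;
    topic_categories[i] maps category names to the (assumed) weight of
    topic i for that category.
    """
    scores: Dict[str, float] = {}
    for weight, cat_weights in zip(doc_topics, topic_categories):
        for category, corr in cat_weights.items():
            scores[category] = scores.get(category, 0.0) + weight * corr
    return scores


def assign_categories(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return categories whose score exceeds the threshold, best first."""
    selected = [c for c, s in scores.items() if s > threshold]
    return sorted(selected, key=lambda c: -scores[c])


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy example with hypothetical weights (not data from the paper).
doc_topics = [0.6, 0.3, 0.1]
topic_categories = [
    {"Economy": 0.8, "Politics": 0.2},
    {"Politics": 0.7, "World": 0.3},
    {"Sports": 1.0},
]
scores = category_scores(doc_topics, topic_categories)
labels = assign_categories(scores, threshold=0.3)
```

With these toy weights, the document would receive "Economy" and "Politics", since only those two scores exceed the hypothetical threshold of 0.3.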

Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System (추천 시스템의 성능 안정성을 위한 예측적 군집화 기반 협업 필터링 기법)

  • Lee, O-Joun;You, Eun-Soon
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.119-142 / 2015
  • With the explosive growth in the volume of information, Internet users are experiencing considerable difficulty in obtaining the information they need online. Against this backdrop, ever-greater importance is being placed on recommender systems that provide information tailored to user preferences and tastes in order to address information overload. To this end, a number of techniques have been proposed, including content-based filtering (CBF), demographic filtering (DF), and collaborative filtering (CF). Among them, CBF and DF require external information and thus cannot be applied to a variety of domains. CF, on the other hand, is widely used since it is relatively free from this domain constraint. CF techniques are broadly classified into memory-based CF, model-based CF, and hybrid CF. Model-based CF addresses the drawbacks of CF by using a Bayesian model, clustering model, or dependency network model. This approach not only alleviates the sparsity and scalability issues but also boosts predictive performance. However, it involves expensive model building and results in a tradeoff between performance and scalability. This tradeoff is attributed to reduced coverage, which is a type of sparsity issue. In addition, expensive model building may lead to performance instability, since changes in the domain environment cannot be immediately incorporated into the model because of the high costs involved. Cumulative changes in the domain environment that are not reflected eventually undermine system performance. This study combines a Markov model of transition probabilities and the concept of fuzzy clustering with CBCF to propose predictive clustering-based CF (PCCF), which addresses the issues of reduced coverage and unstable performance. The method reduces performance instability by tracking changes in user preferences and bridging the gap between the static model and dynamic users.
Furthermore, the issue of reduced coverage is also alleviated by expanding the coverage based on transition probabilities and clustering probabilities. The proposed method consists of four processes. First, user preferences are normalized in preference clustering. Second, changes in user preferences are detected from review score entries during preference transition detection. Third, user propensities are normalized using patterns of changes (propensities) in user preferences in propensity clustering. Lastly, the preference prediction model is developed to predict user preferences for items during preference prediction. The proposed method was validated by testing its robustness against performance instability and the scalability-performance tradeoff. The initial test compared and analyzed the performance of recommender systems based on IBCF, CBCF, ICFEC, and PCCF in an environment where data sparsity had been minimized. The following test varied the optimal number of clusters in CBCF, ICFEC, and PCCF for a comparative analysis of the resulting changes in system performance. The test results revealed that the proposed method produced only an insignificant improvement in performance over the existing techniques. In addition, it failed to achieve a significant improvement in the standard deviation, which indicates the degree of data fluctuation. Nevertheless, it showed a marked improvement over the existing techniques in terms of range, which indicates the level of performance fluctuation. The level of performance fluctuation before and after model generation improved by 51.31% in the initial test. In the following test, the level of performance fluctuation driven by changes in the number of clusters improved by 36.05%. This signifies that the proposed method, despite its slight performance improvement, clearly offers better performance stability than the existing techniques.
Further research will be directed toward enhancing the recommendation performance, which failed to show a significant improvement over the existing techniques. Future research will consider introducing a high-dimensional parameter-free clustering algorithm or a deep learning-based model to improve recommendation performance.
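
To make the idea of combining transition probabilities with fuzzy cluster memberships more concrete, the following is a minimal sketch and not the authors' PCCF implementation. It assumes a first-order Markov model estimated from each user's sequence of preference-cluster labels, and a fuzzy membership expressed as a dictionary of degrees; all function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def transition_probabilities(sequences: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate first-order transition probabilities between clusters.

    Each sequence is one user's history of preference-cluster labels.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }


def next_cluster_distribution(membership: Dict[str, float],
                              transitions: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Propagate a fuzzy membership vector one step through the chain."""
    result: Dict[str, float] = defaultdict(float)
    for cluster, degree in membership.items():
        for nxt, prob in transitions.get(cluster, {}).items():
            result[nxt] += degree * prob
    return dict(result)


# Toy histories for two users (hypothetical cluster labels).
histories = [["A", "A", "B"], ["A", "B", "B"]]
transitions = transition_probabilities(histories)

# A user who currently belongs mostly to cluster A.
predicted = next_cluster_distribution({"A": 0.75, "B": 0.25}, transitions)
```

In this toy case the user is expected to drift toward cluster B, which is the kind of preference change a static clustering model would not capture until it is rebuilt.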

An Analysis on Consumers' Awareness of a Rural Specialties Exhibition Shop and the Design Development : Focusing on Rural Tourism Village (농촌 농특산품 전시판매시설 디자인 소비자 의식 분석 및 디자인 개발 - 농촌관광마을을 중심으로 -)

  • Jin, Hye-Ryeon;Seo, Ji-Ye;Jo, Lok-Hwan
    • Journal of Korean Society of Rural Planning / v.20 no.4 / pp.253-262 / 2014
  • This study, conducted as linked research for the design improvement and model development of exhibition shops in rural tourism communities, aims to secure objective data by analyzing consumers' awareness of and demand for agricultural-specialty exhibition shops. Survey questions on consumers' awareness and demand were determined through brainstorming by an expert council, and 30 rural communities with considerably high visit rates were selected to recruit 200 consumers. On-site surveys and in-depth interviews were carried out, and the responses were analyzed through frequency analysis and a summary of opinions. The results revealed that consumers have a strong intention to visit rural tourism communities to purchase agricultural specialties, take part in learning programs, and trade directly on site, and that their view of on-site direct trade was very positive. It was also confirmed that their satisfaction with purchasing agricultural specialties through on-site direct trade, their pleasure in the purchase, their satisfaction with services, and their intention to repurchase were very high, while their satisfaction with the exhibition shops was very low. The on-site survey made it possible to hear consumers' opinions in depth. Almost all of them said they were highly satisfied with their trip to the rural tourism communities because they could visit in person to relieve the stress of modern life, experience healing, and see the goods on site. Their satisfaction was also attributed to their trust in the purchases and to the warm-heartedness of rural residents.
Regarding their awareness of exhibition shops, they responded positively to on-site direct trade in rural communities, but they considered the space in the exhibition shops insufficient and asked for more systematic counters in more accessible and affordable exhibition shops. The reported need for exhibition shops selling agricultural specialties was over 80%, which indicates that the need is very high. As to the suitable function, some held that the shops should focus on sales because visitors already obtain information during their trip to the rural communities, while others held that agricultural products, being seasonal items, should be exhibited and sold at the same time. More than 90% of the respondents viewed on-site direct trade of agricultural specialties positively, which shows a very favorable response. They preferred permanent shops equipped with roll-around table booths. In addition, it was revealed that they want systematic exhibition shops in rural communities because they frequent those communities for on-site direct purchase. The preferred types and opinions identified from the analysis of consumer demand were reflected in the development of practical designs. A structure based on variable principles was designed so that display-case and table-booth types could be created. The result of this study provides useful data as a design model that can be used in rural communities, and it will be commercialized to verify its validity.

A Cases of Crane Breeding(養鶴) in the Palace of the Joseon Dynasty Period (조선시대 궁궐에서의 양학(養鶴) 사례)

  • Hong, Hyoung-Soon
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.38 no.3 / pp.1-10 / 2020
  • The purpose of this study is to identify whether cranes were bred in the palace during the Joseon Dynasty period and to consider the related cases. The temporal range of this study is the Joseon Dynasty period, and the spatial range is the entire palace, including the naejeon(內殿) and oijeon(外殿), the government offices inside(闕內各司), and the government offices outside(闕外各司). The reference materials for this study were partly extracted and translated from the original documents, and Korean translations in the database of the Institute for the Translation of Korean Classics were also used. The results of this study are summarized as follows. First, cranes were bred from the early Joseon Dynasty period in Uijeongbu(議政府), the highest government office of the Joseon Dynasty. After the Japanese Invasion of Korea in 1592, crane breeding in Uijeongbu seems to have been suspended due to the damage to the government building and the change in the status of the government office. Second, crane breeding in Hongmungwan(弘文館), which was responsible for the classics colloquium(經筵) and public opinion and assisted the king at his side, continued from the early Joseon Dynasty period (Jungjong's era) to the late Joseon Dynasty period (Jeongjo's era), after the Japanese Invasion of Korea in 1592. Third, in Jeongjo's era, cranes were also bred in Gyujanggak(奎章閣), which was newly established as the central institution of learning to strengthen royal authority. At that time, it seems that several cranes were bred in Gyujanggak. Fourth, it is judged that crane breeding in the core government offices of Joseon, such as Uijeongbu, Hongmungwan, and Gyujanggak, was meaningful as a symbol of identity, such as the status and character of the institution. Fifth, it seems that the cranes bred in the palace, including Hongmungwan, were customarily supplied by Baecheon County of Hwanghae-do.
This convention caused minor conflicts between the central and local government offices during Yeongjo's era, but it seems to have continued throughout Jeongjo's era. A limitation of this study is that it was conducted mostly on the basis of local data. If further materials are discovered and translated in the future, more cases will be identified. In-depth follow-up studies are also needed, apart from the cases of rearing cranes in local government offices and temples.

Exploring the Factors Influencing on the Accuracy of Self-Reported Responses in Affective Assessment of Science (과학과 자기보고식 정의적 영역 평가의 정확성에 영향을 주는 요소 탐색)

  • Chung, Sue-Im;Shin, Donghee
    • Journal of The Korean Association For Science Education / v.39 no.3 / pp.363-377 / 2019
  • This study examines science-specific sources of subjectivity in test results when science-related affective characteristics are assessed through self-report items. A science-specific response was defined as a response that arises from a student's recognition of the nature or characteristics of science when an attempt is made to measure his or her concepts or perceptions of science. We searched for cases in which science-specific responses interfere with the measurement objective or with accurate self-report. Errors due to science-specific factors were identified from quantitative data on 649 first- and second-year high school students and qualitative data from interviews with 44 students. The perspective on science and the characteristics of science that students internalize from everyday life and science learning experiences interact with the items that form the test tool. As a result, it was found that there were obstacles to accurate self-report in three aspects: characteristics of science, personal science experience, and science in tool. In terms of the characteristics of science, which relate to the essential aspect of science, students respond to items regardless of the constructs being measured, because of their views and perceived characteristics of science based on subjective recognition. The personal science experience factor, representing the learner side, consists of students' science motivation, interaction with science experience, and perception of science and life. Finally, from the instrumental point of view, science in tool leads to terminological confusion due to the uncertainty of science concepts and ultimately moves responses away from accurate self-report.
Implications from the results of the study are as follows: reviewing whether science-specific factors are included, taking care to clarify the concept being measured, checking science-specific factors at the development stage, and making efforts to cross the boundaries between everyday science and school science.

Verification the Systems Thinking Factor Structure and Comparison of Systems Thinking Based on Preferred Subjects about Elementary School Students' (초등학생의 시스템 사고 요인 구조 검증과 선호 과목에 따른 시스템 사고 비교)

  • Lee, Hyonyong;Jeon, Jaedon;Lee, Hyundong
    • Journal of The Korean Association For Science Education / v.39 no.2 / pp.161-171 / 2019
  • The purposes of this study are: 1) to verify the systems thinking factor structure of elementary school students and 2) to compare systems thinking according to their preferred subjects in order to draw implications for follow-up research. In the pre-test, data from 732 elementary school students were collected using the STMI (Systems Thinking Measuring Instrument) developed by Lee et al. (2013), and exploratory factor analysis was conducted to identify the factor structure of the students. Based on the results of the pre-test, an expert council revised the STMI so that elementary school students could respond according to the 5-factor structure that the STMI intended. In the post-test, 503 responses to the modified STMI were analyzed and exploratory factor analysis was performed. The results of the study are as follows. First, in the pre-test, elementary school students' responses to the STMI yielded a two-factor structure (personal internal factors and personal external factors). The total reliability of the instrument was .932 and the reliabilities of the two factors were .857 and .894. Second, responses to the modified STMI yielded a 4-factor structure: Team Learning, Shared Vision, and Personal Mastery emerged as independent factors, while Mental Model and Systems Analysis merged into a single factor. The total reliability of the instrument was .886 and the reliability of each factor ranged from .686 to .864. Finally, a comparison of systems thinking according to preferred subjects showed a significant difference between students who preferred science (engineering) and those who preferred art (music and physical education). In conclusion, it was confirmed that statistically meaningful results could be obtained using the STMI modified with terms and sentence structures appropriate for elementary school students, and that it is necessary to study the relation of systems thinking to various student variables such as preferred subjects.
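
The abstract reports reliability values without naming the coefficient. As an illustration only, the sketch below computes Cronbach's alpha, a common internal-consistency measure; treating the reported reliabilities as Cronbach's alpha is an assumption, and the function name and toy scores are hypothetical.

```python
from statistics import variance
from typing import List


def cronbach_alpha(item_scores: List[List[float]]) -> float:
    """Cronbach's alpha for a set of items.

    item_scores[i][j] is respondent j's score on item i.
    Uses sample variances for both the items and the total score.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent (sum across items).
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy data: two items answered by four respondents (not data from the study).
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

Items that rise and fall together across respondents push alpha toward 1, while items that vary independently pull it toward 0.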

The Design Improvement Plan of Seoul Forest Visitor Centers for Little Children (서울시 유아숲체험장의 공간 개선 방안)

  • Kim, Minjung;Jeong, Wookju
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.6 / pp.49-63 / 2021
  • A Forest Visitor Center for Little Children (that is, preschoolers) is an educational facility that supports holistic growth through forest experience. It should not be considered complete once specific facilities are installed in the forest environment; it should be a space where preschoolers can play freely in the forest environment itself. This study comprehensively evaluated the current status of the Seoul Forest Visitor Centers for Little Children and suggested space improvement measures to enhance the effectiveness of forest experience. Through a theoretical review, seven spatial elements that enhance the effect of forest experience and six areas composing outdoor play areas were derived to prepare an analysis table for evaluating the current status, and field surveys were conducted at 24 centers in Seoul. Through expert interviews, the physical status was examined from the perspective of childhood education and the experiences of the users were summarized. As a result of the study, the Seoul Forest Visitor Centers for Little Children were classified into six types according to location characteristics and spatial structure, each with its own characteristics. The effectiveness of forest experience can be enhanced by identifying and revealing the environmental strengths of individual centers. In the outdoor experience learning zones, the proportion of exercise play areas was very large. By organizing the forest experience space evenly across areas, it will be possible to provide more diverse experiences to preschoolers. However, the uniform, facility-oriented state of the centers cannot simply be viewed as a factor that lowers the effect of forest experience. The key to increasing the effect of forest experience by inducing creative activities is a spatial composition that considers the surrounding natural environment. Facilities should be a medium that helps preschoolers' interest move into the forest.
This study prepared data to understand the average physical status of the Seoul Forest Visitor Center for Little Children and suggested space improvement measures to increase the effectiveness of forest experience. This can be used as basic data for research to improve the quality level of the Seoul Forest Visitor Center for Little Children about 10 years after the project was implemented.

Development of Pedagogical Content Knowledge of Novice Secondary Science Teachers through Collaborative Reflection (초임 중등 과학교사들의 협력적 성찰을 통한 수업 전문성 발달)

  • Shin, Minkyoung;Kim, Heui-Baik
    • Journal of The Korean Association For Science Education / v.42 no.1 / pp.77-96 / 2022
  • This study investigated how collaborative reflection between novice secondary science teachers promoted the development of teaching professionalism. We intentionally selected research participants who shared sufficient rapport. Data were collected by videotaping the classes taught by participants, pre-talk, post-interviews, and nine collaborative reflection processes. All data were transcribed and analyzed. Results indicated that all three teachers showed changes in teaching practice. Minyoung's practice involved a teacher-led lecture, but through collaborative reflection, she could create a learning environment to enhance students' power and ownership in her class. Emphasizing academic rigor, Soyoung used to teach content outside the scope of the curriculum, but through collaborative reflection, she became more considerate of students' understanding. Finally, in Jiyeon's classes, inquiry activities and theoretical explanations were separated from each other. However, she repeated her efforts to improve her class after collaborative reflection, allowing students to construct explanations through activities. In this study, three factors that promoted the development of teachers' pedagogical content knowledge through collaborative reflection were identified. First, the different teaching orientations of the three teachers who participated in this study promoted sharing of opinions through collaborative reflection. Second, reflection based on teaching practice enabled practical feedback on the class, which enhanced the development of teachers' pedagogical content knowledge. Third, the equal status and formation of rapport between the three teachers created an environment for productive reflection. These results suggest that future teacher education programs should target communities that can promote collaborative reflection based on teachers' teaching practice.

A COVID-19 Diagnosis Model based on Various Transformations of Cough Sounds (기침 소리의 다양한 변환을 통한 코로나19 진단 모델)

  • Minkyung Kim;Gunwoo Kim;Keunho Choi
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.57-78 / 2023
  • COVID-19, which started in Wuhan, China in November 2019, spread beyond China in 2020 and worldwide in March 2020. It is important to prevent a highly contagious virus like COVID-19 in advance and to treat it actively once confirmed, but because it spreads quickly, it is even more important to identify infections quickly and prevent their spread. However, the PCR test used to check for infection is costly and time-consuming, and although self-test kits are easy to access, the cost of a kit is hard to bear every time. Therefore, if COVID-19 infection could be determined from the sound of a cough, anyone could easily check their status anytime and anywhere, with great economic advantages. In this study, an experiment was conducted on a method to identify whether or not a person is positive for COVID-19 based on a cough sound. Cough sound features were extracted through MFCC, Mel-Spectrogram, and spectral contrast. To ensure cough sound quality, noisy data were removed based on the signal-to-noise ratio (SNR), and only the cough sound was extracted from each audio file through chunking. Since the objective is to classify COVID-19 positive and negative cases, learning was performed with the XGBoost, LightGBM, and FCNN algorithms, which are often used for classification, and the results were compared. Additionally, we conducted a comparative experiment on the performance of the model using multidimensional vectors obtained by converting cough sounds into both images and vectors. The experimental results showed that the LightGBM model utilizing features obtained by converting basic information about health status and cough sounds into multidimensional vectors through MFCC, Mel-Spectrogram, spectral contrast, and Spectrogram achieved the highest accuracy of 0.74.
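
The preprocessing mentioned in this abstract (SNR-based filtering and chunk extraction) is not specified in detail, so the following is only a rough sketch of one plausible approach. It assumes SNR is computed in decibels from a signal segment and a noise segment, and that cough chunks are found by keeping consecutive frames whose RMS energy exceeds a threshold; the function names, frame length, and threshold are hypothetical.

```python
import math
from typing import List, Tuple


def power(samples: List[float]) -> float:
    """Mean squared amplitude of a segment."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(signal: List[float], noise: List[float]) -> float:
    """Signal-to-noise ratio in decibels."""
    return 10 * math.log10(power(signal) / power(noise))


def loud_chunks(samples: List[float], frame_len: int,
                threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs of loud frames.

    A frame counts as loud when its RMS amplitude exceeds the threshold.
    """
    chunks: List[Tuple[int, int]] = []
    start = None
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        loud = math.sqrt(power(frame)) > threshold
        if loud and start is None:
            start = i
        elif not loud and start is not None:
            chunks.append((start, i))
            start = None
    if start is not None:
        chunks.append((start, len(samples)))
    return chunks


# Toy recording: quiet background, one loud burst, quiet background.
recording = [0.01] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01] * 4
chunks = loud_chunks(recording, frame_len=4, threshold=0.1)
ratio = snr_db([0.5, -0.5], [0.05, -0.05])
```

A recording whose estimated SNR falls below some cutoff could then be discarded, and features such as MFCC would be computed only on the extracted chunks; the cutoff value would need to be tuned on real data.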

Seeking for a Curriculum of Dance Department in the University in the Age of the 4th Industrial Revolution (4차 산업혁명시대 대학무용학과 커리큘럼의 방향모색)

  • Baek, Hyun-Soon;Yoo, Ji-Young
    • Journal of Korea Entertainment Industry Association / v.13 no.3 / pp.193-202 / 2019
  • This study focuses on what changes are required in the curriculum of university dance departments in the age of the 4th industrial revolution. By comparing and analyzing the curricula of dance departments at five universities in Seoul, five academic subjects are presented that cover what should be learned in dance education in the age of the 4th industrial revolution. First, integrative dance education, which integrates creativity and science education, is a subject that stimulates ideas and creativity and raises artistic sensitivity based on STEAM. Second, a curriculum built around predicting future prospects through Big Data can be put to good use in dance performance, the career paths of dance majors, and job creation by analyzing public opinion, evaluations, and feelings. Third is video education. As images, the major medium of modern times, tend to occupy most of the expressive area of art, dance through video enables existing dance works to be recreated as a new form of art, expanding the boundaries of dance from both academic and performing-arts viewpoints. Fourth, VR and AR are essential techniques in the era of smart media. Whether upcoming dance studies take the form of performance, education, or industry, learning about VR and AR is indispensable if they are to be applied digitally in every relevant field in keeping with the times. Last, a subject on the 4th industrial revolution and the dance art curriculum is needed to foresee the changes of the 4th industrial revolution and to teach the changes, development, and directions of the dance curriculum.