• Title/Abstract/Keywords: inverse category frequency

9 search results (processing time: 0.026 s)

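An Automatic Classification System of Official Documents in Middle Schools Using Term Weighting of Titles

  • 강현희;진민
    • 정보교육학회논문지 / Vol. 7, No. 2 / pp.219-226 / 2003
  • Official documents at schools and educational institutions are still classified by hand, which takes considerable time. To address this, the paper proposes an automatic document classification method that uses the words in document titles. Meaningful words are first extracted from the titles of existing documents, an inverse document frequency (IDF) weight is computed for each word per category, and a term-weight dictionary is built. When a document is submitted for classification, the dictionary is used to sum the per-category weights of the words in its title, and the document is assigned to the category with the largest sum. The classification performance of the proposed method was evaluated on actual official documents from middle schools.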

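The weighting-and-summing procedure described in the abstract above can be sketched roughly as follows. The abstract does not give the exact weighting formula, so this sketch assumes one plausible form: the share of a category's titles that contain the word, multiplied by an inverse category frequency factor. The function names, the formula, and the sample titles are all hypothetical illustrations, not the authors' implementation.

```python
import math
from collections import defaultdict


def build_weight_dictionary(labeled_titles):
    """Build {word: {category: weight}} from (category, title_words) pairs.

    Assumed formula (not taken from the paper):
        weight = (titles in category containing word / titles in category)
                 * (log(number of categories / categories containing word) + 1)
    """
    titles_per_category = defaultdict(int)
    doc_freq = defaultdict(lambda: defaultdict(int))  # word -> category -> count
    for category, words in labeled_titles:
        titles_per_category[category] += 1
        for word in set(words):  # count each word once per title
            doc_freq[word][category] += 1

    n_categories = len(titles_per_category)
    weights = {}
    for word, per_category in doc_freq.items():
        # Words that occur in fewer categories get a larger factor.
        icf = math.log(n_categories / len(per_category)) + 1.0
        weights[word] = {
            category: (count / titles_per_category[category]) * icf
            for category, count in per_category.items()
        }
    return weights


def classify(title_words, weights):
    """Return the category with the largest summed weight, or None if no word is known."""
    scores = defaultdict(float)
    for word in title_words:
        for category, weight in weights.get(word, {}).items():
            scores[category] += weight
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical training titles, already split into meaningful words.
training = [
    ("personnel", ["teacher", "transfer", "notice"]),
    ("personnel", ["teacher", "promotion", "list"]),
    ("finance", ["budget", "execution", "notice"]),
    ("finance", ["budget", "report"]),
]
dictionary = build_weight_dictionary(training)
print(classify(["teacher", "notice"], dictionary))  # personnel
print(classify(["budget", "notice"], dictionary))   # finance
```

In this toy data, "notice" appears in both categories and so contributes equally to each, while "teacher" and "budget" are category-specific and decide the outcome.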

Evaluation of the Work Posture Burden of Forklift Drivers

  • 임창호;장통일;임현교
    • 한국산업안전학회:학술대회논문집 / Proceedings of the 1998 Spring Conference of 한국안전학회 / pp.307-312 / 1998
  • In forklift operations, awkward postures caused by backward driving may put drivers at risk of CTD or low back pain. In this research, 6 forklift drivers were surveyed with OWAS for objective posture evaluation and with body maps for self-report evaluation. As expected, backward driving occurred more frequently than forward driving, and as work hours passed, the drivers naturally tended to assume easier work postures in inverse proportion to the frequency of backward operations. According to the OWAS results, 60% of the work postures in forklift operations belonged to categories II, III, and IV, which are classified as serious. In backward driving in particular, postures with the neck twisted more than $45^{\circ}$ accounted for 82.4%. In addition, discomfort in the neck, left shoulder, and low back was frequently reported in the self-reports.


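Classifying Sub-Categories of Apartment Defect Repair Tasks: A Machine Learning Approach

  • 김은혜;지홍근;김지나;박은일;엄재용
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 10, No. 9 / pp.359-366 / 2021
  • Construction companies in South Korea invest considerable manpower and money in operating systems that accumulate apartment defect information and manage repair work. This study proposes machine learning models that classify the sub-category of defect repair work from the free-text details of defect reports. Using two word embedding methods (bag-of-words and term frequency-inverse document frequency (TF-IDF)) and two classifiers (support vector machine and random forest), the sub-categories were classified from more than 650,000 defect reports written in Korean. In particular, the study examined a binary classification model and a multi-class classification model for the nine sub-categories of one type of facility work (finishing work): home appliances, wallpapering, painting, plastering, stonework, interior finishing, built-in furniture, kitchen equipment, and tiling. Both classification models using TF-IDF and random forest achieved accuracy, precision, recall, and F1 scores of 90% or higher.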
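For readers unfamiliar with the TF-IDF representation named in the abstract above, the following is a minimal pure-Python sketch. It uses the plain textbook form, term frequency times log(N / document frequency); libraries such as scikit-learn apply smoothed variants by default, and the abstract does not say which variant the authors used. The function name and the sample defect reports are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Return (vocabulary, vectors) using tf * log(N / df) for each term."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # each term counted once per document

    vocabulary = sorted(doc_freq)
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocabulary}

    vectors = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append([
            (counts[term] / total) * idf[term] if total else 0.0
            for term in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical, already tokenized defect reports.
reports = [
    ["wallpaper", "torn", "living", "room"],
    ["tile", "cracked", "bathroom"],
    ["wallpaper", "stain", "bedroom"],
]
vocabulary, vectors = tfidf_vectors(reports)
for term, value in zip(vocabulary, vectors[0]):
    if value > 0:
        print(f"{term}: {value:.3f}")
```

Terms that occur in many reports (here "wallpaper") receive a lower weight than terms unique to one report, which is what lets a downstream classifier focus on distinguishing words.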

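Estimation of explosion risk potential in fuel gas supply systems for LNG fuelled ships

  • 이상익
    • Journal of Advanced Marine Engineering and Technology / Vol. 39, No. 9 / pp.918-922 / 2015
  • As international regulations on pollutants and greenhouse gases emitted from ships continue to tighten, interest in using liquefied natural gas (LNG) as a marine fuel is growing. This study carried out an explosion risk potential analysis for two types of fuel gas supply systems used on LNG-fuelled ships. An 8,500 TEU container ship was selected as the target vessel, an LNG storage tank was designed, and pressure conditions were assumed for operating each fuel supply type. Leak hole sizes were grouped into three categories; the leak frequency was calculated for each category, and the representative hole size and release amount were estimated. Increases in release rate were inversely related to leak frequency: the pump-type fuel supply system showed a higher leak frequency, while the pressurized-type system showed a higher release rate. The explosion risk potential was analyzed through computational fluid dynamics simulations, and the results for the two fuel supply systems were compared.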

A Feasibility Study on Adopting Individual Information Cognitive Processing as Criteria of Categorization on Apple iTunes Store

  • Zhang, Chao;Wan, Lili
    • 한국정보시스템학회지:정보시스템연구 / Vol. 27, No. 2 / pp.1-28 / 2018
  • Purpose: More than 7.6 million mobile apps have been approved on the Apple iTunes Store and Google Play. To manage these existing apps, Apple Inc. established twenty-four primary categories, and Google Play had thirty-three. However, both categorizations have shown more and more problems in managing and classifying numerous apps, such as miscategorized apps, cross-attribution problems, and the lack of a categorization keyword index. The purpose of this study was to introduce individual information cognitive processing as the classification criteria for updating the current categorization on the Apple iTunes Store, and to observe the effectiveness of the new criteria in a classification process on the Apple iTunes Store. Design/Methodology/Approach: A research approach with four stages was carried out, and a series of mixed methods was developed to assess the feasibility of adopting individual information cognitive processing as categorization criteria. Keyword lists were extracted using machine-learning techniques with Term Frequency-Inverse Document Frequency (TF-IDF) and Singular Value Decomposition (SVD). Individual information cognitive processing was developed from prior research results on the categorization of car apps, and a further keyword extraction step was applied to the extracted keyword lists. Findings: Using TF-IDF and SVD, keyword lists were extracted from more than five thousand apps. The individual information cognitive processing that was developed included a categorization teaching process and a learning process. The top three keywords for each category were extracted. When the extracted results were compared with prior studies, the inter-rater reliability between the two methods was significant, which indicated that individual information cognitive processing is reliable as a criterion for categorization on the Apple iTunes Store. Suggestions for updating the Apple iTunes Store are discussed, and the results may help app store hosts improve current categorizations and make app discovery and location more efficient for both app developers and users.

Dietary Factors Associated with Attention Deficit Hyperactivity Disorder (ADHD) in School-aged Children

  • 안민지;안효진;황효정;권호장;하미나;홍윤철;홍수종;오세영
    • 대한지역사회영양학회지 / Vol. 23, No. 5 / pp.397-410 / 2018
  • Objectives: An association between dietary patterns and mental health in children has been suggested in a series of studies, yet detailed analyses of dietary patterns and their effects on ADHD (attention deficit hyperactivity disorder) are limited. Methods: We included 4569 children who had dietary intake data as part of the CHEER (Children's Health and Environmental Research) study conducted nationwide from 2005 to 2010. We assessed ADHD with DuPaul's ADHD Rating Scales and dietary intake with a semi-quantitative food frequency questionnaire. Using the intake data, we constructed five dietary patterns: "Plant foods & fish," "Sweets," "Meat & fish," "Fruits & dairy products," and "Wheat based." Results: The overall proportion of ADHD was 12.3%. Boys (17.8%) showed a higher rate of ADHD than girls (6.5%). The total intake of calories (85 kcal) and plant fat (2 g) in the ADHD group was significantly higher than that of the normal group. ADHD was significantly negatively associated with dietary habits such as having breakfast and meal frequency, and positively associated with eating speed, unbalanced diet, overeating, and rice consumption. Regarding dietary patterns, the "Sweets" category was associated with high ADHD risk (OR 1.59, 95% CI: 1.18, 2.15 for Q5 vs. Q1) in a linear relationship. An inverse, non-linear association was found between "Fruits & dairy products" and ADHD (OR 0.55, 95% CI: 0.39, 0.76 for Q4 vs. Q1). Conclusions: Our study confirms both positive and negative associations between diet and ADHD in elementary school age children. Moreover, the linear and non-linear associations between diet and ADHD draw attention to a possible threshold role of nutrients. Further studies may consider the characteristics of diet in more detail to develop better interventions or management in terms of diet and health.

Application of Text-Classification Based Machine Learning in Predicting Psychiatric Diagnosis

  • 백두현;황민규;이민지;우성일;한상우;이연정;황재욱
    • 생물정신의학 / Vol. 27, No. 1 / pp.18-26 / 2020
  • Objectives: The aim was to find effective vectorization and classification models to predict a psychiatric diagnosis from text-based medical records. Methods: Electronic medical records (n = 494) of present illness were collected retrospectively from inpatient admission notes with one of three diagnoses: major depressive disorder, type 1 bipolar disorder, or schizophrenia. The data were split into 400 training records and 94 independent validation records. The data were vectorized with two different models, term frequency-inverse document frequency (TF-IDF) and Doc2vec. Machine learning models for classification, including stochastic gradient descent, logistic regression, support vector classification, and deep learning (DL), were applied to predict the three psychiatric diagnoses. Five-fold cross-validation was used to find an effective model. Accuracy, precision, recall, and F1-score were measured to compare the models. Results: Five-fold cross-validation on the training data showed that the DL model with Doc2vec was the most effective model for predicting the diagnosis (accuracy = 0.87, F1-score = 0.87). However, these metrics decreased on the independent test data set with the final working DL models (accuracy = 0.79, F1-score = 0.79), while logistic regression and support vector machine models with Doc2vec showed slightly better performance (accuracy = 0.80, F1-score = 0.80) than the DL models with Doc2vec and the other models with TF-IDF. Conclusions: The current results suggest that vectorization may have more impact on classification performance than the choice of machine learning model. However, the data set had a number of limitations, including small sample size, imbalance among the categories, and limited generalizability. In this regard, research with multiple sites and large samples is needed to improve the machine learning models.
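The evaluation metrics named in the abstract above (accuracy, precision, recall, F1-score) can be computed for a three-class problem as in the sketch below. The abstract does not state how per-class scores were averaged, so macro averaging here is an assumption; the function name, the class labels (MDD, BD, SCZ), and the sample predictions are hypothetical and are not data from the study.

```python
def evaluate_predictions(y_true, y_pred):
    """Return (accuracy, per_class, macro), where per_class maps each label
    to (precision, recall, f1) and macro is the unweighted mean of each."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = (precision, recall, f1)

    # Macro average: every class counts equally, regardless of its size.
    macro = tuple(
        sum(scores[i] for scores in per_class.values()) / len(labels)
        for i in range(3)
    )
    return accuracy, per_class, macro


# Hypothetical labels and predictions for six records.
y_true = ["MDD", "MDD", "BD", "BD", "SCZ", "SCZ"]
y_pred = ["MDD", "BD", "BD", "BD", "SCZ", "MDD"]
accuracy, per_class, macro = evaluate_predictions(y_true, y_pred)
print(f"accuracy = {accuracy:.2f}")
print("macro precision/recall/F1 =", [round(v, 2) for v in macro])
```

With imbalanced categories, which the abstract lists as a limitation, macro-averaged scores can differ noticeably from accuracy, so it is worth reporting both.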

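Study of Annoyance in Relation to Exposure Time to Demonstration Noise

  • 박형우;배명진
    • 한국인터넷방송통신학회논문지 / Vol. 16, No. 6 / pp.103-108 / 2016
  • Cities today are growing in scale and becoming more complex in function, and diverse people live in them while carrying out diverse activities. Much of a person's life in a city is also connected to the people nearby. City life involves many kinds of man-made activity, and the resulting sound can be noise pollution for those nearby. For this reason, man-made noise is loud; in particular, noise along roads inside the four main gates of downtown Seoul averages 73 dB. City residents are easily exposed to noise pollution, and loudspeakers used at rallies, demonstrations, or for promotion in particular generate considerable noise that causes stress to others. Moreover, although the Assembly and Demonstration Act and related laws regulate loudspeaker use, in practice the rules are often not observed. The legal standard was recently tightened, for example by lowering the limit by 5 dB, yet the noise remains a source of stress. Therefore, rather than only limiting and regulating the noise level at rallies and demonstrations, reducing harm by also taking exposure time into account is a more effective approach. Stress caused by noise persists for a long time even after brief exposure, so recommending that people in such an environment leave the noisy space quickly is also one of the harm reduction measures proposed in this study.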

A Prospective Study on Attitude of Professional Student toward Population Related Issues in Korea

  • 이경식;김화중
    • Journal of Preventive Medicine and Public Health / Vol. 9, No. 1 / pp.11-24 / 1976
  • This study was part of a large-scale prospective study on the attitudes of professional students in medicine, nursing, and teaching toward population-related issues in Korea. The study was first conducted in May 1974 and again in May 1975 for the 1974 class cohort, using a questionnaire consisting of attitude scales and other items developed by Lee. The purpose of the study was twofold: to determine differences among students by specialization, and differences between the first and second years of the 1974 class cohort, regarding the subject matter. A one-way analysis of variance was used for the attitude scales, and absolute and relative frequencies were computed for the non-attitude-scale items; Fisher's ratio and Duncan's multiple range test at the 5% level and the chi-square test at the 5% level were employed as significance tests. The hypothesis 'students in health professions are more likely than students in the teaching profession to have progressively more positive attitudes toward population-related issues as the class year advances' was tested, and the following results were obtained: 1) Nursing students were more likely to display favorable attitudes toward family planning than medical or teaching students, although the class cohort showed a slight negative change in the second year. Medical and teaching students appeared to have slightly improved attitudes in the second year. 2) Respondents in general perceived the national family planning program as a means of population control, and as the class year advanced this tendency was stronger among nursing students than in the other two professional groups. Students in the teaching profession appeared to perceive it more as a means to improve individual family welfare, while health students were likely to see it as a means to improve maternal and child health. This tendency strengthened progressively as the class year advanced. 3) The majority of students, regardless of specialization, believed that the family planning program should be directed toward the improvement of individual family welfare. No progressive changes in the class cohort were observed. 4) Regarding plans to use contraceptives in the future, no significant differences were observed among specializations or between class years. However, the majority were confirmed to have a plan to use contraceptives in the future. An increasing proportion of the undecided category was observed among health students as the class year advanced. 5) Students in health professions were found to be more favorable toward 'more leisure opportunities' as a motive for limiting the number of children, whereas education students gave 'facilitate ambitions' and 'economic base' as reasons. Progressive changes in the positive direction were observed in both groups as the class years advanced. 6) Attitudes of health students toward induced abortion were positively related to class year, while an inverse relationship was found in teaching students, who were much less favorable toward the subject than health students. This phenomenon may be due to the different learning environments unique to the respective specializations. 7) Health students were found to have more favorable attitudes toward population education in general than teaching students. The teaching students appeared to have shifted further in the negative direction in their second year, while no such development was observed in health students. The teaching students seemed to hold a very conservative position with regard to sex education in schools. 8) Regarding the equality of the sexes, the nursing group was found to be most favorable, while the reverse was true of the teaching group. A change in the negative direction as the class year advanced was found in the teaching group. 9) Regarding questions related to fertility values, 10 percent of respondents, regardless of specialization, indicated that they would remain single in the future; no change was observed in the second year. The majority of students desired two children, with the proportion highest in nursing, followed by medicine and teaching. No changes between class years were observed. Nursing students saw childless marriage as a problem more than other students did, but a slight change in the positive direction was found when the nursing students reached the second year. In sum, as the data above support, students in health professions demonstrated more favorable attitudes toward population-related issues than teaching students, and this tendency became more apparent in the second year. Health students were more conscious of the health aspects of population and the family planning program, while teaching students gave more attention to the socioeconomic aspects. The sex variable seemed to have operated in the item related to the equality of the sexes. In conclusion, as the data presented above show, the hypothesis of this study was accepted except for a few items. A limitation of this study is the short duration of observation for measuring possible attitude changes. Future studies should include curriculum analysis for the respective specializations in order to identify the areas where the curriculum affects students.
