• Title/Summary/Keyword: 누락정보 (missing information)

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.141-166 / 2019
  • Recently, channels such as social media and SNS have been generating enormous amounts of data. Within this data, the share of unstructured data represented as text has grown geometrically. Because it is difficult to read all of this text, it is important to access it rapidly and grasp its key points. Driven by this need for efficient understanding, many studies have proposed text summarization methods for handling and using large volumes of text. In particular, many recent methods use machine learning and artificial intelligence algorithms to generate summaries objectively and effectively, an approach called "automatic summarization". However, almost all text summarization methods proposed to date build the summary around how frequently content appears in the original documents. Such summaries tend to leave out low-weight subjects that are mentioned less often in the original text. If a summary covers only the major subjects, bias occurs and information is lost, making it hard to ascertain every subject the documents contain. To avoid this bias, one can summarize with a view to balance among the topics in a document so that every subject can be ascertained, but the distribution across those subjects can still remain unbalanced. To retain the balance of subjects in a summary, it is necessary to consider the proportion of every subject in the original documents and also to allocate the subjects' portions equally, so that even sentences on minor subjects are sufficiently included. In this study, we propose a "subject-balanced" text summarization method that secures balance among all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summarization, we use two summary evaluation metrics, "completeness" and "succinctness".
Completeness means that the summary should fully cover the contents of the original documents, and succinctness means that the summary should contain minimal duplication within itself. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From these weights, the terms most strongly related to each topic can be identified, and the subjects of the documents can be found from topics composed of terms with similar meanings. A few terms that represent each subject well are then selected; in this method they are called "seed terms". However, the seed terms are too few to describe each subject adequately, so a well-constructed subject dictionary needs enough additional terms similar to the seed terms. Word2Vec is used for this word expansion, finding terms similar to the seed terms. Word vectors are created by Word2Vec modeling, and the similarity between any two terms can be derived from those vectors using cosine similarity. The higher the cosine similarity between two terms, the stronger the relationship between them is taken to be. Terms with high similarity to the seed terms of each subject are therefore selected, and the subject dictionary is finally constructed by filtering these expanded terms. The second phase allocates a subject to every sentence in the original documents. To grasp the content of each sentence, frequency analysis is first conducted on the terms that make up the subject dictionaries. The TF-IDF weight of each subject is then calculated, which shows how much each sentence says about each subject. However, TF-IDF weights have no upper bound, so the weights of every subject for each sentence are normalized to values between 0 and 1.
Each sentence is then allocated to the subject with the maximum normalized TF-IDF weight, which yields a sentence group for each subject. The last phase is summary generation. Sen2Vec is used to measure the similarity between the sentences of each subject, and a similarity matrix is formed. By selecting sentences iteratively, it is possible to generate a summary that fully covers the contents of the original documents while minimizing duplication within the summary itself. To evaluate the proposed method, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between summaries from the proposed method and frequency-based summaries verified that the proposed method better retains the balance of all subjects that the documents originally contain.
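
The second and third phases described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the normalization or the exact selection rule, so min-max scaling and a greedy coverage-minus-redundancy score (similar in spirit to maximal marginal relevance) are assumed here. All function names, the toy TF-IDF weights, and the toy sentence vectors (standing in for Sen2Vec output) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def min_max(values):
    """Rescale values to [0, 1]. Assumed choice: the abstract names no formula."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def allocate_subjects(tfidf):
    """tfidf maps subject -> one raw weight per sentence.

    Returns, for each sentence, the subject with the largest normalized weight.
    """
    norm = {subject: min_max(weights) for subject, weights in tfidf.items()}
    n = len(next(iter(tfidf.values())))
    return [max(norm, key=lambda s: norm[s][i]) for i in range(n)]


def greedy_select(sim, k):
    """Pick up to k row indices from a square similarity matrix.

    Assumed rule: favor sentences similar to the whole group (completeness)
    and penalize similarity to sentences already chosen (succinctness).
    """
    n = len(sim)
    selected = []
    while len(selected) < min(k, n):
        def score(i):
            coverage = sum(sim[i]) / n
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return coverage - redundancy

        candidates = [i for i in range(n) if i not in selected]
        selected.append(max(candidates, key=score))
    return selected


def balanced_summary(tfidf, vectors, per_subject):
    """Give every subject the same quota of sentences, then merge the picks."""
    labels = allocate_subjects(tfidf)
    picked = []
    for subject in tfidf:
        group = [i for i, label in enumerate(labels) if label == subject]
        sim = [[cosine(vectors[a], vectors[b]) for b in group] for a in group]
        picked += [group[i] for i in greedy_select(sim, per_subject)]
    return sorted(picked)


# Toy data: four sentences, two subjects (hypothetical values).
tfidf = {"room": [3.0, 0.0, 1.0, 0.5], "food": [0.0, 4.0, 2.0, 0.0]}
vectors = [[1.0, 0.0], [0.0, 1.0], [0.2, 0.9], [0.9, 0.1]]

print(allocate_subjects(tfidf))  # ['room', 'food', 'food', 'room']
print(balanced_summary(tfidf, vectors, per_subject=1))
```

Giving each subject the same quota is one simple way to read "allocate the portion of subjects equally"; a quota that also reflects each subject's share of the original text would be an equally plausible reading.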

Actual Conditions and Perception of Safety Accidents by School Foodservice Employees in Chungbuk (충북지역 학교급식 조리종사원의 안전사고 실태 및 인식)

  • Cho, Hyun A;Lee, Young Eun;Park, Eun Hye
    • Journal of the Korean Society of Food Science and Nutrition / v.43 no.10 / pp.1594-1606 / 2014
  • The purpose of this study was to examine safety accidents related to school foodservice, the working and operating environments of school foodservice, the status and awareness of safety education, educational needs, and information for the qualitative improvement of school foodservice. The subjects were 234 cooks at elementary and secondary schools in Chungbuk. A survey was conducted from July 30 to August 8, 2012, and of the 202 questionnaires gathered, 194 completed questionnaires were analyzed. Statistical analyses were performed using SPSS version 19.0. The main results were as follows: 44.3% of workers had experienced safety accidents. The most common number of accidents experienced was 'once' (60.5%), and most safety accidents took place between June and August (31.4%). Most safety accidents happened between 8 and 11 am. Most safety accidents happened during cooking (52.3%) and while using a soup pot or frying pot (52.4%). The most common accidents were 'burns', 'wrist and arm pain', and 'slips and falls'. Of the respondents who had experienced safety accidents, 57.6% reported dealing with injuries at their own expense, and only 35.3% used industrial accident insurance. In terms of the operating environment, the score for 'offering information and application' was highest (3.76 points), whereas that for 'security of budget' was lowest (1.77 points). As for safety education, employees received it approximately 3.45 times and for 5.10 hours per year. Improving the working environment of school foodservice cooks requires administrative and financial support. Furthermore, educational materials and guidelines based on the working environment and safety accident status of school foodservice cooks are required in order to minimize potential risk factors and control safety accidents in school foodservice.