• Title/Summary/Keyword: Score ratio

Search Result 1,677, Processing Time 0.038 seconds

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one way to handle large data sets in text mining. It must take into account the density of the data, which strongly influences the performance of sentence classification. Higher-dimensional data require more computation, which raises computational cost and can cause the model to overfit, so a dimension reduction step is needed to improve model performance. Proposed methods range from simply reducing noise in the data, such as misspellings and informal text, to incorporating semantic and syntactic information. In addition, the representation and selection of text features affect the performance of classifiers for sentence classification, one of the fields of natural language processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods use various algorithms for dimensionality reduction, such as feature extraction and feature selection. Word embeddings, which learn low-dimensional vector representations of words that capture semantic and syntactic information, are also used. To improve performance, recent studies have suggested modifying the word dictionary according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations: once a feature selection algorithm identifies words that are not important, words similar to them are assumed to have no impact on sentence classification either. This study proposes two ways to achieve more accurate classification, both of which eliminate words selectively under specific rules and then construct word embeddings with Word2Vec. 
To select words of low importance from the text, we use information gain to measure importance and cosine similarity to find similar words. In the first method, we eliminate words with comparatively low information gain from the raw text and then build the word embedding. In the second, we additionally select, for elimination, words that are similar to those low-information-gain words and then build the word embedding. The filtered text and word embeddings are then fed to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle on Amazon.com, IMDB, and Yelp as datasets and classifies each with the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes exceeded 70% were classified as helpful reviews. Yelp shows only the number of helpful votes. We extracted 100,000 reviews with more than five helpful votes by random sampling from 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters from the text, was applied to each dataset. To evaluate the proposed methods, we compared them against Word2Vec and GloVe word embeddings built from all the words. One of the proposed methods performed better than the embeddings built from all the words. Removing unimportant words improves performance, but removing too many words lowers it. Future research should consider more diverse preprocessing and an in-depth analysis of word co-occurrence for measuring similarity among words. Also, we applied the proposed method only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo could be combined with the proposed methods to identify which pairings of embedding method and elimination method work.
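The two elimination steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the thresholds, the representation of each document as a set of tokens, and the toy word vectors are all assumptions made for the example. In the paper the vectors would come from a trained Word2Vec model.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(docs, labels, word):
    """Entropy reduction obtained by splitting documents on presence of `word`."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_word, without_word) if part)
    return entropy(labels) - remainder

def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def words_to_remove(docs, labels, vectors, ig_threshold, sim_threshold):
    """Return (low-IG words, extra words similar to some low-IG word).

    `vectors` is a hypothetical word -> vector mapping standing in for a
    trained embedding; both thresholds are illustrative assumptions.
    """
    vocab = set().union(*docs)
    low_ig = {w for w in vocab if information_gain(docs, labels, w) < ig_threshold}
    similar = {
        w for w in vocab - low_ig
        if w in vectors and any(
            s in vectors and cosine(vectors[w], vectors[s]) >= sim_threshold
            for s in low_ig
        )
    }
    return low_ig, similar
```

Removing only the first returned set corresponds to the first method; removing both sets corresponds to the second. The abstract's caveat that removing too many words lowers performance suggests both thresholds would need tuning on held-out data.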

Differences of Obstetric Complications and Clinical Characteristics between Autism Spectrum Disorder and Intellectual Disability (자폐스펙트럼장애와 지적 장애의 산과적 합병증 및 임상적 특성의 차이)

  • Lee, Seul Bee;Kim, Ji Yong;Chung, Hee Jung;Kim, Seong Woo;Im, Woo Young;Song, Jung-Eun
    • Korean Journal of Psychosomatic Medicine / v.24 no.2 / pp.165-173 / 2016
  • Objectives : As awareness of autism spectrum disorder (ASD) grows, increasing numbers of infants and toddlers are being referred to specialized clinics for differential diagnosis, and the importance of early detection of ASD is emphasized. This study aims to identify differences between ASD and intellectual disability (ID) by comparing demographics, clinical characteristics, and obstetric complications. Methods : The participants were 816 toddlers who visited the developmental delay clinic (DDC) at National Health Insurance Ilsan Hospital. Of these, 324 were diagnosed with ASD and 492 with ID. Of the 114 toddlers who returned to the DDC, 75 had been diagnosed with ID at the first visit, and 7 of them had their diagnosis changed to ASD at the second visit. After comparing ASD with ID at the first visit, we analyzed the characteristics of the toddlers whose diagnosis changed to ASD at the second visit. Results : In the comparison at the first visit, the ASD group had a higher ratio of boys, fewer obstetric complications, and lower language assessment scores. The toddlers whose diagnosis changed at the second visit were all boys; they more often had a family history of developmental delay and had lower receptive language developmental quotient scores. Conclusions : These findings suggest that sex, language characteristics, and obstetric complications could be useful in the early detection of ASD.

A re-appraisal of scoring items in state assessment of NATM tunnel considering influencing factors causing longitudinal cracks (종방향균열 영향인자 분석을 통한 NATM터널 정밀안전진단 상태평가 항목의 재검토)

  • Choo, Jin-Ho;Yoo, Chang-Kyoon;Oh, Young-Chul;Lee, In-Mo
    • Journal of Korean Tunnelling and Underground Space Association / v.21 no.4 / pp.479-499 / 2019
  • State assessment of an operational tunnel is usually done by visual inspection and durability tests, following the detailed guideline for safety inspection (SI) and/or precision inspection for safety and diagnosis (PISD). In this study, 12 NATM tunnels that have been in operation for more than 10 years were inspected to identify the causes of longitudinal cracks, with the aim of modifying the scoring items in the state assessment of NATM tunnels that relate to longitudinal cracks and concrete lining thickness. The investigated tunnels were classified into four groups according to their shape and usage. The causes of longitudinal cracking were analyzed by investigating the correlations between longitudinal cracks and four factors: the pattern of ground excavation; the construction state of the primary support system; the material properties of the concrete lining; and the lining thickness obtained from Ground Penetrating Radar (GPR) tests. The factors causing longitudinal cracks in the lining were found to be closely related to the construction condition of the primary support system, i.e. shotcrete, rockbolt, and steel-rib; crack occurrence was not much affected by the excavation pattern. Among the properties of the concrete lining materials, longitudinal cracking was mostly affected by three items: w/c ratio, cement content, and lining strength. When the lining thickness is estimated by GPR tests and the thickness effect is taken into account in the state assessment, the index score needs to increase by an average of 0.03 (ranging from 0.01 to 0.071); a more realistic state assessment should therefore account for the increase in index score caused by insufficient lining thickness.

Incidence and Procedure-Related Risk Factors of Delirium in Patients Admitted to an Intensive Care Unit (중환자실 입원 환자의 섬망 발생과 처치 관련 위험인자)

  • Ahn, Jee Seon;Oh, Jooyoung;Park, Jaesub;Kim, Jae-Jin;Park, Jin Young
    • Korean Journal of Psychosomatic Medicine / v.27 no.1 / pp.35-41 / 2019
  • Objectives : Although delirium is a common complication among patients hospitalized in intensive care units(ICUs), little is known about the roles that diagnostic and therapeutic procedures play in its development. This study investigates the procedure-related risk factors of delirium in ICU patients. Methods : All the consecutive patients admitted to the ICU between June 2016 and May 2017 were routinely evaluated for delirium by psychiatrists. In total, 1156 patients met the inclusion criteria and were retrospectively analyzed. A multiple logistic regression analysis was conducted to investigate independent risk factors of delirium development while adjusting for other characteristics. Results : The age, Acute Physiology and Chronic Health Evaluation (APACHE II) score, proportion of patients who had undergone an operation, and proportion of patients who were foley catheterized, mechanically ventilated, and physically restrained were higher in the delirium group. The multiple logistic regression analysis confirmed that the use of restraint was an independent risk factor of delirium (odds ratio : 10.006 ; 95% confidence interval : 6.120-16.360 ; p<0.001). The patient factors independently associated with delirium were an advanced age and a higher APACHE II score. The incidence of delirium was 15.3%. Conclusions : There is a high prevalence of delirium influenced by potentially harmful procedures in patients in ICU settings. The use of physical restraint had the strongest association with the development of delirium. These findings advocate the need to target procedure-related risk factors such as the use of restraints as preventive intervention measures for ICU delirium.
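As a reading aid for the odds ratio reported above, the sketch below shows how an odds ratio and its confidence interval relate to a logistic-regression coefficient. It assumes a Wald-type interval (symmetric on the log-odds scale, z = 1.96), which the abstract does not state; the function names are illustrative only.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval from a coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coefficient_from_ci(lower, upper, z=1.96):
    """Coefficient and SE implied by a reported CI, assuming a Wald interval."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Reported interval for physical restraint: 6.120-16.360.
beta, se = coefficient_from_ci(6.120, 16.360)
odds_ratio, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f}, SE={se:.3f}, OR={odds_ratio:.3f}")
```

Under that assumption, the geometric midpoint of the reported interval comes out at about 10.006, matching the reported odds ratio, which is consistent with the interval having been computed on the log-odds scale.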

Analyzing the discriminative characteristic of cover letters using text mining focused on Air Force applicants (텍스트 마이닝을 이용한 공군 부사관 지원자 자기소개서의 차별적 특성 분석)

  • Kwon, Hyeok;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.75-94 / 2021
  • The low birth rate and shortened military service period are raising concerns about the selection of excellent military officers. The Republic of Korea became a low-birth-rate society in 1984 and an aged society in 2018, and is expected to become a super-aged society in 2025. In addition, the troop-oriented military is shifting to one centered on state-of-the-art weapons, and the military service period was shortened in 2018 to ease the burden of service on young people and let them enter society earlier. Some observe that the application rate for military officers is falling because of shrinking manpower resources and a preference for the shortened mandatory service over an officer career. This calls for further consideration of policies for securing excellent military officers. Most related studies have used social science methodologies, but this study applies text mining, which is suited to large-scale document analysis. This study extracts words with discriminative characteristics from the cover letters of Republic of Korea Air Force non-commissioned officer applicants and analyzes their pass/fail polarity. It consists of three steps. First, applications are divided into general and technical fields, and the words in the cover letters are ordered by the difference in their frequency ratio between the fields. The greater the difference in a word's proportion between the application fields, the 'more discriminative' the word is defined to be. On this basis, we extract the top 50 discriminative words for the general field and the top 50 for the technical field. Second, the appropriate number of topics across all cover letters is determined with LDA, using the perplexity score and the coherence score. 
Based on the appropriate number of topics, we then use LDA to generate topics and probabilities and estimate which topic each discriminative word belongs to. Subsequently, the keyword indicators of the questions are used to set candidate label indices, and the most appropriate indicator is set as the label for each topic in light of the topic-specific word distribution. Third, using L-LDA, with the cover letters labeled as pass or fail, we generate topics and probabilities for each field under the pass and fail labels. We then extract only the discriminative words that belong to labeled topics from the topics and probabilities generated under the pass and fail labels. Next, for each labeled discriminative word, we compute the difference between its probability under the pass label and its probability under the fail label. A positive value indicates pass polarity, and a negative value indicates fail polarity. This study is the first to reflect the characteristics of cover letters of Republic of Korea Air Force non-commissioned officer applicants rather than those in the private sector. Moreover, the methodology applies text mining to multiple documents, rather than survey or interview methods, which reduces analysis time and increases reliability for the entire population. For this reason, the proposed methodology is also applicable to other kinds of document collections in the field of military personnel. This study shows that L-LDA is more suitable than LDA for extracting discriminative characteristics from Republic of Korea Air Force non-commissioned officer cover letters. Furthermore, this study proposes a methodology that combines LDA and L-LDA. 
Therefore, by analyzing the results of Republic of Korea Air Force non-commissioned officer acquisition, we aim to provide information useful for acquisition and promotional policies and to propose a methodology for research in the field of military manpower acquisition.
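The first step (ranking words by the difference in frequency ratio between fields) and the final polarity computation (pass probability minus fail probability) can be sketched as below. This covers only those two arithmetic steps and not the LDA or L-LDA modelling; the function names, the tokenized-document input format, and the word-probability dictionaries are assumptions made for illustration.

```python
from collections import Counter

def frequency_ratios(docs):
    """Share of all tokens in `docs` (lists of tokens) taken up by each word."""
    counts = Counter(token for doc in docs for token in doc)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def discriminative_words(general_docs, technical_docs, top_n=50):
    """Rank words by the gap in frequency ratio between the two fields."""
    general = frequency_ratios(general_docs)
    technical = frequency_ratios(technical_docs)
    diff = {w: general.get(w, 0.0) - technical.get(w, 0.0)
            for w in set(general) | set(technical)}
    ranked = sorted(diff, key=diff.get, reverse=True)
    top_general = [w for w in ranked if diff[w] > 0][:top_n]
    top_technical = [w for w in reversed(ranked) if diff[w] < 0][:top_n]
    return top_general, top_technical, diff

def polarity(p_pass, p_fail):
    """Per-word difference: positive leans 'pass', negative leans 'fail'."""
    return {w: p_pass.get(w, 0.0) - p_fail.get(w, 0.0)
            for w in set(p_pass) | set(p_fail)}
```

Here `p_pass` and `p_fail` stand in for the word probabilities that a labeled topic model would produce under each label; any source of per-label word probabilities could be substituted.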

Correlation between fish consumption and the risk of mild cognitive impairment in the elderly living in rural areas (농촌지역에 거주하는 노인의 생선 섭취량과 인지기능저하 위험도 간의 상관성)

  • Yu, Areum;Kim, Jihye;Choi, Bo Youl;Kim, Mi Kyung;Yang, Yoonkyoung;Yang, Yoon Jung
    • Journal of Nutrition and Health / v.54 no.2 / pp.139-151 / 2021
  • Purpose: This study examines the correlation between fish consumption and the risk of mild cognitive impairment in the elderly living in rural areas. Methods: The Yangpyeong cohort data collected from Yangpyeong in July 2009 and August 2010 was used as the data set. Adults greater than or equal to 60 years who have completed the Korean version of the Mini-Mental State Examination (MMSE-KC) were selected for the study. After excluding participants with less than 500 kcal of energy intake (n = 2), a total of 806 adults were enrolled as the final subjects. Cognitive function was assessed using the MMSE-KC, and dietary intake was collected using the quantitative food frequency questionnaire comprising 106 foods or food groups. Results: The educational level, proportion of people who exercise, fruits and vegetable intake, and energy intake, tended to increase with fish intake among men, while increasing age resulted in decreased fish consumption. Among women, the educational level, proportion of subjects who exercise, proportion of subjects currently taking dietary supplements, fruits and vegetable intake, and energy intake, tended to increase with fish consumption, whereas increasing age showed decreasing fish consumption. Increased fish intake resulted in a higher MMSE-KC score after adjusting for the confounding variables in women (p for trend = 0.016), but no significant trend was observed between fish intake and MMSE-KC score in men. Fish intake was inversely related to the risk of mild cognitive impairment after adjusting for covariates in women (Q1 vs. Q4; odds ratio, 0.46 [0.23-0.90]; p for trend = 0.009). Conclusion: This study determined that increased fish consumption is correlated with reduced risk of mild cognitive impairment in the female elderly. Further longitudinal studies with larger samples are required to determine a causal relationship between fish intake and cognitive function.

Effect of Restriction of Vitamin A and D on Carcass Characteristics in Hanwoo Steers (비타민 A와 D의 공급제한이 거세 한우의 육질등급에 미치는 영향)

  • Kim, W.Y.;Park, J.K.;Cho, S.Y.;Nam, K.T.;Yeo, J.M.
    • Journal of Practical Agriculture & Fisheries Research / v.18 no.1 / pp.13-24 / 2016
  • Sixty Hanwoo steers (15 months of age; 409±29.2 kg of BW) were used to evaluate the effects of dietary vitamin A and D restriction on carcass characteristics. Steers were allotted randomly to 1 of 4 treatments: Control (diet supplemented with vitamins A, D and E), -A (diet supplemented with vitamins D and E), -D (diet supplemented with vitamins A and E) and -AD (diet supplemented with vitamin E only). Steers were fed the experimental diet for 8 months (until 23 months of age), and then supplemented with vitamins A and D at 0.05% of the diet (as-fed basis) from 24 to 26 months of age, and at 0.1% of the diet from 27 to 31 months of age (harvesting time). Dietary restriction of vitamins A and D did not affect DM intake, daily gain, or feed conversion ratio. However, serum retinol concentration was significantly (P<0.05) decreased by vitamin A restriction, with the lowest concentration seen at 23 months of age (345.0 ㎍/L and 326.7 ㎍/L for control and -D treatments versus 169.3 ㎍/L and 175.4 ㎍/L for -A and -AD treatments). The serum concentration of 25(OH)D3 was also significantly (P<0.05) decreased by vitamin D restriction, with the lowest concentration seen at 18 months of age (53.7 ng/ml and 61.8 ng/ml for control and -A treatments versus 24.0 ng/ml and 24.5 ng/ml for -D and -AD treatments). After the restriction period, the concentrations of retinol and 25(OH)D3 in the -A, -D and -AD treatments recovered to control levels. Dietary restriction of vitamins A and D did not affect carcass weight, backfat thickness, ribeye area, quality grade, or yield grade. However, marbling score was significantly increased by vitamin A restriction compared with control (6.73, 6.87 and 5.73 for -A, -AD and control, respectively). These results suggest that dietary vitamin A restriction could improve marbling score in Hanwoo steers.

Effects of Moisture Absorbent Application Timing on Performance, Blood Cell Characteristics and Footpad Dermatitis in Broiler Houses (육계 계사 내 수분흡수제 도포 시기가 생산성, 혈구 성상 및 발바닥피부염에 미치는 영향)

  • Eui-Chul Hong;Jin-Joo Jeon;Hee-Jin Kim
    • Korean Journal of Poultry Science / v.50 no.3 / pp.125-132 / 2023
  • This study investigated the effect of moisture absorbent (MA) application timing for litter management on broiler performance, blood cell characteristics, litter moisture content, incidence of footpad dermatitis (FPD), and economic feasibility. Treatments were an untreated control (NC), 3-week-old litter treatment (PC), and application of moisture absorbent at 0 weeks of age (W1), at 0 and 3 weeks of age (W2), and at 3 weeks of age (W3). Six hundred eighty broilers (1-day-old, 42.0±0.24 g) were divided into 5 treatments (4 replications per treatment, 34 birds per replication) and raised for 5 weeks in floor pens (2 m² per pen). There was no significant difference among treatments in performance, blood cell characteristics, or H/L ratio according to the application timing of litter and moisture absorbent. Litter moisture content and FPD score were significantly decreased in the litter and moisture absorbent treatments at 5 weeks of age (P<0.05). The FPD score of broilers was lowest in the PC treatment compared with the NC treatment (P<0.05). The incidence of FPD was lower in the PC and W3 treatments than in the other treatments and was highest in the NC treatment. In the economic feasibility analysis, expenditure was highest in the PC treatment and lowest in the W3 treatment. Income was highest in the W3 treatment and lowest in the NC treatment. Profit was highest in the W3 treatment, at 185,859 won (1,367 won/unit). In conclusion, applying MA to the litter of the broiler house at 3 weeks of age improved litter moisture content and FPD.

Estimation of Chlorophyll-a Concentration in Nakdong River Using Machine Learning-Based Satellite Data and Water Quality, Hydrological, and Meteorological Factors (머신러닝 기반 위성영상과 수질·수문·기상 인자를 활용한 낙동강의 Chlorophyll-a 농도 추정)

  • Soryeon Park;Sanghun Son;Jaegu Bae;Doi Lee;Dongju Seo;Jinsoo Kim
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.655-667 / 2023
  • Algal bloom outbreaks are frequently reported around the world, and serious water pollution problems arise every year in Korea. The aquatic ecosystem needs to be protected through continuous management and rapid response. Many studies use satellite images to estimate the concentration of chlorophyll-a (Chl-a), an indicator of algal bloom occurrence. However, because spectral characteristics and atmospheric correction errors vary with the water system, Chl-a is difficult to calculate accurately, and machine learning models have recently been used instead. Factors affecting algal bloom need to be considered along with the satellite spectral index. Therefore, this study constructed a dataset combining water quality, hydrological, and meteorological factors with Sentinel-2 images. Two representative ensemble models, random forest and extreme gradient boosting (XGBoost), were used to predict the concentration of Chl-a at eight weirs on the Nakdong River over the past five years. R-squared score (R2), root mean square error (RMSE), and mean absolute error (MAE) were used as model evaluation indicators; XGBoost achieved an R2 of 0.80, an RMSE of 6.612, and an MAE of 4.457. Shapley additive explanations (SHAP) analysis showed that water quality factors (suspended solids, biochemical oxygen demand, and dissolved oxygen) and the band ratio using red edge bands were of high importance in both models. Using varied input data was confirmed to help improve model performance, and the approach appears applicable to domestic and international algal bloom detection.
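For reference, the three evaluation indicators named in this abstract can be computed as in the sketch below. It uses only the standard definitions; the function name and the sample values are illustrative assumptions and are unrelated to the study's data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R-squared, RMSE, and MAE for paired observed/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Hypothetical observed vs. predicted Chl-a values, for illustration only.
metrics = regression_metrics([10, 20, 30, 40], [12, 18, 33, 39])
print(metrics)
```

RMSE and MAE are in the units of the target variable, so the reported 6.612 and 4.457 are in the same units as the Chl-a concentrations, whereas R2 is unitless.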

The relationship between the prevalence of anemia and dietary intake among adults according to household types based on data from the 7th (2016-2018) Korea National Health and Nutrition Examination Survey (국민건강영양조사 제7기 (2016-2018년)에서의 가구 유형에 따른 성인의 빈혈 유병율과 식이 섭취)

  • Hye Won Kim;Ji-Myung Kim
    • Journal of Nutrition and Health / v.56 no.5 / pp.510-522 / 2023
  • Purpose: In this study, data from the 7th Korea National Health and Nutrition Examination Survey (2016-2018) were used to examine the relationship between the prevalence of anemia and dietary intake among adults according to household type. Methods: Using data from a total of 10,646 subjects (4,428 men and 6,218 women), general information, body measurements, biochemical examination results, food and nutrient intake, and meal quality evaluation were analyzed according to household type. Results: The prevalence of anemia was higher in men from single-person households (SPH) than in those from multi-person households (MPH), whereas among women it was higher in MPH than in SPH. The men in SPH had a lower total food intake of nuts, vegetables, fruits, fish, and seaweed than the men in MPH, and consumed higher quantities of milk, oil, and processed foods. The women in SPH had a lower intake of seaweed and a higher intake of milk than those in MPH. In addition, compared with the men in MPH, the men in SPH had a lower iron intake, a lower iron intake per 1,000 kcal, a lower iron intake from plant-based foods, and a lower ratio of iron intake to the reference nutrient intake. The total Korean Healthy Eating Index (KHEI) score was lower in both men and women in SPH than in those in MPH. When the relationship between household type and anemia risk was analyzed after adjusting for confounding variables, the risk of anemia was higher in men in SPH than in men in MPH, but no significant association was found in women. There was no relationship between the total KHEI score and the risk of anemia by gender and household type. Conclusion: Since anemia in men living in SPH is a matter of concern, it is essential to develop guidelines for anemia-related nutrition education for men living alone.