• Title/Summary/Keyword: Co-occurrence

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one of the methods for handling big data in text mining. For dimensionality reduction, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data require more computation, which can lead to high computational cost and overfitting in the model. Thus, a dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed, ranging from merely reducing noise in the data, such as misspellings or informal text, to incorporating semantic and syntactic information. In addition, the representation and selection of text features affect the performance of the classifier in sentence classification, which is one of the fields of natural language processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods use various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, which learn low-dimensional vector space representations of words and can capture semantic and syntactic information from data, are also used. To improve performance, recent studies have suggested methods in which the word dictionary is modified according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm identifies words that are not important, we assume that words similar to those words also have no impact on sentence classification. This study proposes two ways to achieve more accurate classification: conducting selective word elimination under specific rules and constructing word embeddings based on Word2Vec.
To select words of low importance from the text, we use the information gain algorithm to measure importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form the word embedding. Second, we additionally select words that are similar to the words with low information gain values and form the word embedding. Finally, the filtered text and word embeddings are applied to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle products on Amazon.com, IMDB, and Yelp as datasets and classifies each dataset using the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes was over 70% were classified as helpful reviews. Also, Yelp shows only the number of helpful votes. We extracted 100,000 reviews that received more than five helpful votes by random sampling from 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters from the text, was applied to each dataset. To evaluate the proposed methods, we compared their performance with that of Word2Vec and GloVe word embeddings that used all the words. We showed that one of the proposed methods performs better than the embeddings with all the words. Removing unimportant words improves performance; however, removing too many words lowers it. Future research should consider diverse preprocessing approaches and an in-depth analysis of word co-occurrence for measuring similarity values among words. Also, we applied the proposed method only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo can be combined with the proposed methods, making it possible to identify the possible combinations of word embedding methods and elimination methods.
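
The two elimination steps described in this abstract (dropping words with low information gain, then dropping words that are similar to them) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the toy documents, the hand-written two-dimensional vectors standing in for Word2Vec embeddings, the function names, and both thresholds are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(docs, labels, word):
    """Information gain of the presence/absence of `word` for the class label."""
    with_w = [y for d, y in zip(docs, labels) if word in d]
    without_w = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (len(with_w) / n) * entropy(with_w) + (len(without_w) / n) * entropy(without_w)
    return entropy(labels) - conditional

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def words_to_remove(docs, labels, vectors, ig_threshold, sim_threshold):
    """Return (low-IG words, remaining words similar to some low-IG word)."""
    vocab = set().union(*docs)
    low_ig = {w for w in vocab if information_gain(docs, labels, w) < ig_threshold}
    similar = {
        w for w in vocab - low_ig
        if w in vectors and any(
            v in vectors and cosine(vectors[w], vectors[v]) >= sim_threshold
            for v in low_ig
        )
    }
    return low_ig, similar

# Toy data (hypothetical): each document is a set of tokens; 1 = helpful review.
docs = [{"great", "the"}, {"awful", "the"}, {"great", "the", "this"}, {"awful", "the"}]
labels = [1, 0, 1, 0]
# Hand-written stand-ins for word embeddings (assumption, not real Word2Vec output).
vectors = {"the": [1.0, 0.0], "this": [0.9, 0.1], "great": [0.0, 1.0], "awful": [0.1, -1.0]}

low_ig, similar = words_to_remove(docs, labels, vectors, ig_threshold=0.1, sim_threshold=0.9)
print(sorted(low_ig), sorted(similar))
```

In this toy setting, "the" appears in every document and so carries no information gain, while "this" is removed in the second step only because its vector lies close to that of "the". In practice the vectors would come from a trained embedding model, and both thresholds would need tuning, since the abstract notes that removing too many words lowers performance.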

Denied Boarding and Compensation for Passengers in the EU Air Transport Legal Framework and Cases (항공여객운송에서의 탑승거부와 여객보상기준)

  • Sur, Ji-Min
    • The Korean Journal of Air & Space Law and Policy / v.34 no.1 / pp.203-234 / 2019
  • The concept of denied boarding is defined in Article 2(j) of Regulation 261/2004 thus: "denied boarding means a refusal to carry passengers on a flight, although they have presented themselves for boarding under the conditions laid down in Article 3(2), except where there are reasonable grounds to deny them boarding, such as reasons of health, safety or security, or inadequate travel documentation." So far as relevant here, to be entitled to compensation if denied boarding, a passenger must first come within the scope of protection of the Regulation, which under Article 3(2) applies on the following conditions: "... that passengers (a) have a confirmed reservation on the flight concerned and, except in the case of cancellation referred to in Article 5, present themselves for check-in, as stipulated and at the time indicated in advance and in writing (including by electronic means) by the air carrier, the tour operator or an authorised travel agent, or, if no time is indicated, not later than 45 minutes before the published departure time." This paper reviews EU cases such as Rodríguez Cachafeiro v. Iberia [2012] Case C-321/11; Finnair Oyj v. Timy Lassooy [2012] Case C-22/11; and Caldwell v. easyJet Airline Co. Ltd. [2015] ScotSC 64. The ECJ and the Sheriff Court of Scotland held that the concept of denied boarding, within the meaning of Articles 2(j) and 4 of Regulation No 261/2004 establishing common rules on compensation and assistance to passengers in the event of denied boarding and of cancellation or long delay of flights, and repealing Regulation No 295/91, must be interpreted as relating not only to cases where boarding is denied because of overbooking but also to those where boarding is denied on other grounds, such as operational reasons. 
Also, the ECJ ruled that Articles 2(j) and 4(3) must be interpreted as meaning that the occurrence of extraordinary circumstances resulting in an air carrier rescheduling flights after those circumstances arose cannot give grounds for denying boarding on those later flights or for exempting that carrier from its obligation, under Article 4(3) of that regulation, to compensate a passenger to whom it denies boarding on such a flight.

A Review of Recent Climate Trends and Causes over the Korean Peninsula (한반도 기후변화의 추세와 원인 고찰)

  • An, Soon-Il;Ha, Kyung-Ja;Seo, Kyong-Hwan;Yeh, Sang-Wook;Min, Seung-Ki;Ho, Chang-Hoi
    • Journal of Climate Change Research / v.2 no.4 / pp.237-251 / 2011
  • This study reviews recent climate change over the Korean peninsula, which has been affected by human-induced global warming more strongly than other regions. Recent measurements of carbon dioxide concentrations over the Korean peninsula show a faster rise than the global average, and the increasing trend in surface temperature over this region is much larger than the global mean trend. Recent observational studies reporting weakened cold extremes and intensified warm extremes over the region consistently support the increase in mean temperature. Surface vegetation greening in spring has also advanced relatively more quickly. Summer precipitation over the Korean peninsula has increased by about 15% since 1990 compared to the previous period, mainly due to an increase in August. On the other hand, a slight decrease in precipitation (about 5%) was observed during the Changma period (the rainy season of the East Asian summer monsoon). Heavy rainfall amounts exhibit an increasing trend, particularly since the late 1970s, and the number of consecutive dry days has also increased, primarily over the southern area. This indicates that the duration of precipitation events has shortened while their intensity has become stronger. During the past decades, more strong typhoons have affected the Korean peninsula, making landfall preferentially over the southeastern area. Meanwhile, the urbanization effect is likely to have contributed to the rapid warming, explaining about 28% of the total temperature increase during the past 55 years. The impact of El Niño on seasonal climate over the Korean peninsula has been well established: winter [summer] temperatures are generally higher [lower] than normal, and summer rainfall tends to increase during El Niño years. It is suggested that the more frequent occurrence of the 'central-Pacific El Niño' during recent decades may have induced warmer summers and falls over the Korean peninsula. 
In short, detection and attribution studies have provided fundamental information needed to construct more reliable projections of future climate change, and therefore more comprehensive research is required for a better understanding of past climate variations.

Floristic features of upland fields in South Korea (우리나라 밭 경작지에 출현하는 식물상 특성)

  • Kim, Myung-Hyun;Eo, Jinu;Kim, Min-Kyeong;Oh, Young-Ju
    • Korean Journal of Environmental Biology / v.38 no.4 / pp.528-553 / 2020
  • Upland fields are characterized by dry environments, a high degree of disturbance by farming practices such as double-cropping, and a high diversity of crops compared to other field types. This study focused on the floristic composition and characteristics of upland fields in South Korea. Flora surveys were conducted in 36 areas in nine provinces at two times (June and August) in 2015. The results showed that the vascular plants in the upland fields in South Korea comprised 532 taxa: 100 families, 322 genera, 483 species, nine subspecies, 37 varieties, one form, and two hybrids. Among the 100 families, Asteraceae was the most diverse (75 taxa), followed by Poaceae (68 taxa), Fabaceae (34 taxa), Polygonaceae (21 taxa), Rosaceae (19 taxa), and Liliaceae (17 taxa). In terms of occurrence frequency, Acalypha australis L. (100%) and Artemisia indica Willd. (100%) were the most frequent species, followed by Humulus scandens (Lour.) Merr., Rorippa palustris (L.) Besser, Conyza canadensis (L.) Cronquist, Erigeron annuus (L.) Pers., Lactuca indica L., Commelina communis L., Digitaria ciliaris (Retz.) Koeler, Echinochloa crus-galli (L.) P.Beauv., Cyperus microiria Steud., and Oxalis corniculata L. The biological type of upland fields in South Korea was determined to be the Th-R5-D4-e type. Eleven taxa of rare plants were found: Taxus cuspidata Siebold & Zucc., Magnolia kobus DC., Clematis trichotoma Nakai, Aristolochia contorta Bunge, Buxus sinica (Rehder & E.H.Wilson) M.Cheng var. koreana (Nakai ex Rehder) Q.L.Wang, Melothria japonica (Thunb.) Maxim., Mitrasacme indica Wight, Lithospermum arvense L., Carpesium rosulatum Miq., Allium senescens L., and Pseudoraphis sordida (Thwaites) S.M.Phillips & S.L.Chen. Naturalized plants accounted for 97 taxa, reported as 24 families, 68 genera, 97 species, one variety, and one form. The urbanization and naturalization indices were 30.5% and 18.4%, respectively.

Analysis of Physical Status on COVID-19: Based on Impacts of Physical Activity (COVID-19에 대한 운동중재효과 분석)

  • Kim, Kwi-Baek;Kwak, Yi Sub
    • Journal of Life Science / v.31 no.6 / pp.603-608 / 2021
  • The purpose of this perspective research is to discuss the potential role of exercise interventions in COVID-19, in terms of prevention and prognosis, in the era of COVID-19 vaccines. SARS-CoV-2 (COVID-19) was detected as a new virus causing severe cardiovascular and respiratory complications. It emerged as a global public health emergency and pandemic. It caused more than 1 million deaths in the first 6 months of the pandemic and resulted in huge social and economic fluctuations internationally. Unprecedented stressful situations, such as COVID-19 blue and COVID-19 red, affect many health problems. In healthy individuals, COVID-19 infection may induce no symptoms (i.e., asymptomatic), whereas others may experience flu-like symptoms or severe outcomes such as ARDS, pneumonia, and death. Poor health status, such as obesity and cardiovascular and respiratory complications, is a major risk factor affecting COVID-19 prevention, occurrence, and prognosis. Several COVID-19 vaccines are currently in human trials. However, the efficacy and safety of COVID-19 vaccines, including potential side effects such as anaphylaxis (a life-threatening allergic reaction) and rare blood clots, still need to be investigated. On the basis of direct and indirect evidence, it seems that regular and moderate physical exercise can be recommended as a nonpharmacological, efficient, and safe way to cope with COVID-19. Physical inactivity and metabolic abnormalities are directly associated with reduced immune responses, including reduced innate, CMI, and AMI responses. Due to prolonged viral shedding, quarantine for inactive, obese, and diseased people should likely be longer than for physically active people. Multicomponent and systemic exercise should be considered for obese, diseased, and elderly people. More research on the underlying mechanisms is needed in this area.

Operation Measures of Sea Fog Observation Network for Inshore Route Marine Traffic Safety (연안항로 해상교통안전을 위한 해무관측망 운영방안에 관한 연구)

  • Joo-Young Lee;Kuk-Jin Kim;Yeong-Tae Son
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.2 / pp.188-196 / 2023
  • Among marine accidents caused by bad weather, visibility restrictions due to sea fog cause accidents such as ship stranding and hull bottom damage, along with the resulting casualties, and such accidents continue to occur every year. In addition, low visibility at sea is emerging as a social problem, causing considerable inconvenience to islanders who rely on sea transportation, because passenger ships are collectively delayed and controlled even when conditions differ between regions. Moreover, such measures are becoming more problematic because they cannot be objectively quantified, owing to regional deviations and observation criteria that differ from person to person. Currently, the VTS of each port controls ship operations if the visibility distance is less than 1 km; in this case, objective data collection is limited because the assessment of sea fog visibility depends on a visibility meter or visual observation. The government is building marine weather signal signs and sea fog observation networks for sea fog detection and prediction as part of its efforts to remove these obstacles to marine traffic safety, but the system for observing locally occurring sea fog remains very inadequate in practice. Accordingly, this paper examines domestic and foreign policy trends for solving social problems caused by low visibility at sea and, by investigating and analyzing those problems, provides basic data on the need for government support to ensure maritime traffic safety under sea fog. It also aims to establish a more stable maritime traffic operation system by blocking in advance the marine safety risks that may ultimately arise from sea fog.

Analysis of inundation and rainfall-runoff in mountainous small catchment using the MIKE model - Focusing on the Var river in France - (MIKE 모델을 이용한 산지소유역 강우유출 및 침수 분석 - 프랑스 Var river 유역을 중심으로 -)

  • Lee, Suwon;Jang, Dongwoo;Jung, Seungkwon
    • Journal of Korea Water Resources Association / v.56 no.1 / pp.53-62 / 2023
  • Recently, due to the influence of climate change, damage from heavy rain has been increasing around the world, and the frequency of heavy rain that delivers a large amount of rainfall in a short period of time is also increasing. Heavy rains generate a large amount of runoff in a short time, causing flooding downstream of mountainous areas before the flow joins small and medium-sized rivers. To reduce flood damage in downstream areas, it is very important to calculate the runoff from mountainous areas caused by torrential rains. However, sewer network flooding analysis, currently the most commonly conducted analysis in Korea, uses the time-area method with existing data rather than calculating rainfall runoff in the mountainous area, so it is difficult to conclude that the soil characteristics of the region are accurately reflected. Therefore, if rainfall runoff is analyzed for mountainous areas that can cause flooding downstream in a short period of time due to large runoff, the accuracy of flood analysis for the downstream area can be improved, and the results can be used as data for evacuating residents and estimating the extent of damage. In this study, rainfall runoff in the mountainous area was calculated using MIKE SHE from the MIKE series, and flooding in the downstream area was analyzed with MIKE 21 FM (Flood model). Through this study, it was possible to confirm the runoff volume and the time to reach the downstream area during rainfall in the mountainous area, and the results can be used to protect human life and property through advance evacuation in the downstream area.

Estimation of Chlorophyll-a Concentration in Nakdong River Using Machine Learning-Based Satellite Data and Water Quality, Hydrological, and Meteorological Factors (머신러닝 기반 위성영상과 수질·수문·기상 인자를 활용한 낙동강의 Chlorophyll-a 농도 추정)

  • Soryeon Park;Sanghun Son;Jaegu Bae;Doi Lee;Dongju Seo;Jinsoo Kim
    • Korean Journal of Remote Sensing / v.39 no.5_1 / pp.655-667 / 2023
  • Algal bloom outbreaks are frequently reported around the world, and serious water pollution problems arise every year in Korea. It is necessary to protect the aquatic ecosystem through continuous management and rapid response. Many studies using satellite images are being conducted to estimate the concentration of chlorophyll-a (Chl-a), an indicator of algal bloom occurrence. However, machine learning models have recently been used because it is difficult to calculate Chl-a accurately, owing to spectral characteristics and atmospheric correction errors that change depending on the water system. It is necessary to consider the factors affecting algal bloom as well as the satellite spectral index. Therefore, this study constructed a dataset that combines water quality, hydrological, and meteorological factors with Sentinel-2 images. The representative ensemble models random forest and extreme gradient boosting (XGBoost) were used to predict the concentration of Chl-a at eight weirs located on the Nakdong River over the past five years. The R-squared score (R²), root mean square error (RMSE), and mean absolute error (MAE) were used as model evaluation indicators; for XGBoost, R² was 0.80, RMSE was 6.612, and MAE was 4.457. Shapley additive explanations (SHAP) analysis showed that the water quality factors suspended solids, biochemical oxygen demand, and dissolved oxygen, together with the band ratio using red edge bands, were of high importance in both models. Using various input data was confirmed to help improve model performance, and the approach appears applicable to domestic and international algal bloom detection.
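
For readers unfamiliar with the three evaluation indicators named in this abstract, the following is a minimal sketch of how R², RMSE, and MAE are computed from observed and predicted values. The function name and the four toy values are hypothetical and have no connection to the study's Chl-a data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root mean square error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination
    return r2, rmse, mae

# Toy values (hypothetical), not taken from the study.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]
r2, rmse, mae = regression_metrics(y_true, y_pred)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

RMSE penalizes large errors more heavily than MAE, so RMSE is never smaller than MAE on the same predictions, which is consistent with the values reported in the abstract.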

Effectiveness of controlled atmosphere container on the freshness of exported PMRsupia melon (CA 컨테이너를 이용한 수출 멜론의 선도유지 효과)

  • Haejo Yang;Min-Sun Chang;Puehee Park;Hyang Lan Eum;Jae-Han Cho;Ji Weon Choi;Sooyeon Lim;Yeo Eun Yun;Han Ryul Choi;Me-Hea Park;Yoonpyo Hong;Ji Hyun Lee
    • Food Science and Preservation / v.30 no.5 / pp.822-832 / 2023
  • This study investigates the effectiveness of CA (controlled atmosphere) containers in maintaining the freshness of exported melons. The melons were harvested on June 5, 2023, in the Yeongam area of Jeollanam-do, Korea. The CA container was loaded with melon samples packed in an export box. The temperature inside the container was set at 4℃, and the gas composition was set at 5% oxygen, 12% carbon dioxide, and 83% nitrogen. Following two weeks of simulated transportation, quality analysis was conducted at 10℃. The melons were inoculated with spore suspensions, and the decay rate was determined to investigate the effect of the gas composition inside the CA container on suppressing the occurrence of Penicillium oxalicum in melons. The results were compared with those for a reefer container set at the same temperature. The samples transported in the CA container exhibited lower weight loss. Pulp softening, respiration rate, and ethylene production were slower in the CA container. Moreover, the decay rate during the distribution period was lower in the CA container than in the reefer container. In contrast, the firmness of melons transported in the reefer container decreased significantly (from 9.03 N to 5.18 N) immediately after transportation. The soluble solid content (SSC) of melons transported in the reefer container also decreased rapidly. The results suggest that the CA container is the optimal export container for maintaining the freshness of melons.

Radiomics Analysis of Gray-Scale Ultrasonographic Images of Papillary Thyroid Carcinoma > 1 cm: Potential Biomarker for the Prediction of Lymph Node Metastasis (Radiomics를 이용한 1 cm 이상의 갑상선 유두암의 초음파 영상 분석: 림프절 전이 예측을 위한 잠재적인 바이오마커)

  • Hyun Jung Chung;Kyunghwa Han;Eunjung Lee;Jung Hyun Yoon;Vivian Youngjean Park;Minah Lee;Eun Cho;Jin Young Kwak
    • Journal of the Korean Society of Radiology / v.84 no.1 / pp.185-196 / 2023
  • Purpose: This study aimed to investigate radiomics analysis of ultrasonographic images to develop a potential biomarker for predicting lymph node metastasis in papillary thyroid carcinoma (PTC) patients. Materials and Methods: This study included 431 PTC patients from August 2013 to May 2014 and classified them into training and validation sets. A total of 730 radiomics features were obtained, including texture features from the gray-level co-occurrence matrix and gray-level run-length matrix, single-level discrete two-dimensional wavelet transform features, and other functions. The least absolute shrinkage and selection operator method was used to select the most predictive features in the training data set. Results: Lymph node metastasis was associated with the radiomics score (p < 0.001). It was also associated with other clinical variables such as young age (p = 0.007) and large tumor size (p = 0.007). The area under the receiver operating characteristic curve was 0.687 (95% confidence interval: 0.616-0.759) for the training set and 0.650 (95% confidence interval: 0.575-0.726) for the validation set. Conclusion: This study showed the potential of ultrasonography-based radiomics to predict cervical lymph node metastasis in patients with PTC; thus, ultrasonography-based radiomics can act as a biomarker for PTC.
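
Since the gray-level co-occurrence matrix (GLCM) is the feature that ties this entry to the search keyword, a minimal sketch of how such a matrix and one derived texture feature (contrast) can be computed may help. It is an illustration only, not the feature extraction pipeline used in the study: the 3x3 toy image, the choice of three gray levels, the horizontal offset, and the function names are all assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized (non-symmetric) gray-level co-occurrence matrix.

    Entry [i][j] is the share of pixel pairs in which a pixel with gray
    level i has a neighbor with gray level j at offset (dx, dy).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """GLCM contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

# Toy 3x3 image quantized to three gray levels (hypothetical data).
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
p = glcm(image, levels=3)      # horizontal neighbor, one pixel to the right
print(p)
print(contrast(p))
```

Radiomics toolkits typically compute such matrices for several offsets and angles and derive many features from each, which is one way a feature count in the hundreds, as reported in the abstract, can arise.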