• Title/Summary/Keyword: Term Statistics

Document classification using a deep neural network in text mining (텍스트 마이닝에서 심층 신경망을 이용한 문서 분류)

  • Lee, Bo-Hui;Lee, Su-Jin;Choi, Yong-Seok
    • The Korean Journal of Applied Statistics / v.33 no.5 / pp.615-625 / 2020
  • In text mining, the document-term frequency matrix is built from terms extracted from documents that carry group information. In this study, we generated a document-term frequency matrix for classifying documents by research field. We applied the traditional term weighting function, term frequency-inverse document frequency (TF-IDF), to this matrix, and also applied term frequency-inverse gravity moment (TF-IGM). To improve classification accuracy, we extracted keywords and generated a document-keyword weighted matrix. Based on the extracted keyword matrix, we classified documents using a deep neural network. To find the optimal network, we verified classification accuracy while varying the number of hidden layers and hidden nodes. The model with eight hidden layers showed the highest accuracy, and classification accuracy with TF-IGM was higher than with TF-IDF under every parameter setting. The deep neural network was also more accurate than the support vector machine. We therefore propose applying TF-IGM and a deep neural network to document classification.
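The two weighting schemes named in this abstract can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function names are hypothetical, the IDF uses the plain `log(N / df)` form, and the IGM form and default `lam=7.0` follow the commonly cited TF-IGM definition rather than anything stated in the abstract.

```python
import math

def tf_idf(doc_term):
    """Weight a document-term count matrix (list of rows) by TF-IDF."""
    n_docs = len(doc_term)
    n_terms = len(doc_term[0])
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in doc_term if row[j] > 0) for j in range(n_terms)]
    # Plain idf = log(N / df); a term that appears nowhere gets weight 0.
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in doc_term]

def igm(class_freqs):
    """Inverse gravity moment of one term from its per-class frequencies.

    Frequencies are sorted in descending order and ranked 1..m; a term
    concentrated in a single class scores 1.0 (assumed standard form).
    """
    ranked = sorted(class_freqs, reverse=True)
    moment = sum(f * r for r, f in enumerate(ranked, start=1))
    return ranked[0] / moment if moment else 0.0

def tf_igm(tf, class_freqs, lam=7.0):
    """TF-IGM weight: tf * (1 + lam * igm); lam is an assumed default."""
    return tf * (1.0 + lam * igm(class_freqs))
```

A term that occurs in only one class receives the maximum IGM of 1.0, while a term spread evenly over classes is down-weighted, which is the class-aware behaviour that TF-IDF lacks.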

Air passenger demand forecasting for the Incheon airport using time series models (시계열 모형을 이용한 인천공항 이용객 수요 예측)

  • Lee, Jihoon;Han, Hyerim;Yoon, Sanghoo
    • Journal of Digital Convergence / v.18 no.12 / pp.87-95 / 2020
  • Incheon Airport is a gateway to and from the Republic of Korea and has a great influence on the image of the country. Long-term prediction of passenger numbers is therefore necessary to maintain the airport's quality of service. In this study, we compared the predictive performance of several time series models for air passenger demand at Incheon Airport. The passenger data from 2002 to 2019 exhibit trend and seasonality. We considered the naive method, the decomposition method, exponential smoothing, SARIMA, and PROPHET. To compare the airport's capacity with future passenger numbers, short-term, mid-term, and long-term forecasts were produced with these models. For the short-term forecast, the exponential smoothing model, which weights recent data more heavily, performed best, predicting about 73.5 million annual passengers in 2020. For the mid-term forecast, the SARIMA model, which accounts for stationarity, performed best, predicting about 79.8 million annual passengers in 2022. The PROPHET model performed best for long-term prediction, with about 99.0 million annual passengers expected in 2024.
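Two of the simpler baselines listed in this abstract can be sketched in a few lines. This is a minimal sketch under assumptions: the function names and the toy numbers are hypothetical, the naive method is shown in its seasonal form, and the smoothing shown is the simplest (level-only) variant, which ignores the trend and seasonality the study's data exhibit.

```python
def seasonal_naive(series, season_length, horizon):
    """Forecast each future step with the value one season earlier."""
    start = len(series) - season_length
    return [series[start + (h % season_length)] for h in range(horizon)]

def simple_exp_smoothing(series, alpha):
    """Return the one-step-ahead forecast of simple exponential smoothing.

    alpha in (0, 1]: larger values put more weight on recent observations.
    """
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

# Toy monthly-style data with a season of length 3 (illustrative only).
history = [1, 2, 3, 4, 5, 6]
print(seasonal_naive(history, season_length=3, horizon=4))
print(simple_exp_smoothing([10, 20, 30], alpha=0.5))
```

The weighting of recent data that the abstract credits for the short-term result is visible in the update rule: each new observation enters with weight `alpha`, and older observations decay geometrically.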

Counseling Elderly People in Long-term Care Service (장기요양서비스에 대한 노인상담 실태와 영향 요인)

  • Lee, Hung-Sa;Kim, Chun-Mi
    • Research in Community and Public Health Nursing / v.22 no.2 / pp.141-150 / 2011
  • Purpose: The purpose of this study was to examine satisfaction with counseling in long-term care service, and to compare counseling satisfaction scores according to variables among beneficiaries of Korean long-term care services. Methods: Questionnaires measuring satisfaction with counseling were completed by 445 beneficiaries of long-term care insurance. The study used a cross-sectional descriptive design. Data were analyzed using descriptive statistics, t-tests, and ANOVA to evaluate differences in satisfaction with counseling according to variables including economic status, the level of long-term care insurance approval, duration of long-term care service, and conditions of counseling. Results: The satisfaction score for counseling was fairly high, at 71.67. Among the subcategories of satisfaction, the score for counselor's attitude was the highest. The factors that influenced satisfaction with counseling were the frequency and time of counseling (F=12.19, p<.001). Conclusion: Home-based individual counseling is necessary for the elderly who need long-term care service. The National Long-term Care Insurance Corporation should offer counseling and assistance to elders and their caregivers about long-term care insurance.

The Dynamics of Economic Growth in Underdeveloped Regions: A Case Study in Indonesia

  • JUMONO, Sapto;BASKARA, Ika;ABDURAHMAN, Abdurrahman;MALA, Chajar Matari Fath
    • The Journal of Asian Finance, Economics and Business / v.8 no.4 / pp.643-651 / 2021
  • This study aims to determine the response of regional economic growth to the financial performance of regional economies with regard to liquidity conditions, saving-investment gaps, trade openness, inflation, and national economic growth. The theoretical framework rests on the principles of open economics and financial intermediary systems. The study uses secondary data in the form of quarterly time series for the period from 2008 to 2019. The data were obtained from various publications, such as the Central Statistics Agency (CSA), Regional Financial Economics Statistics (RFES), Indonesian Banking Statistics (IBS), and the Financial Services Authority (FSA). Data were processed through VAR/VECM analysis, and short-term and long-term equilibrium analyses were carried out. The results illustrate that regional economic growth and the conditions of liquidity, saving-investment gaps, trade openness, inflation, and national economic growth are related and lead to significant impact variations in the provinces of Papua and West Papua. In conclusion, the findings support the supply-leading hypothesis and suggest reformulating the strategy and policy of economic development, bearing in mind that there are still many underdeveloped districts in these two provinces.

A Short Note on Empirical Penalty Term Study of BIC in K-means Clustering Inverse Regression

  • Ahn, Ji-Hyun;Yoo, Jae-Keun
    • Communications for Statistical Applications and Methods / v.18 no.3 / pp.267-275 / 2011
  • According to recent studies, the Bayesian information criterion (BIC) has been proposed to determine the structural dimension of the central subspace through sliced inverse regression (SIR) with high-dimensional predictors. The BIC may also be useful in K-means clustering inverse regression (KIR) with high-dimensional predictors. However, direct application of the BIC to KIR may be problematic, because the slicing scheme in SIR is not the same as that of KIR. In this paper, we present empirical studies of the BIC penalty term in KIR to identify the most appropriate one. Numerical studies and a real data analysis are presented.

GLOBAL WEAK SOLUTIONS FOR THE RELATIVISTIC VLASOV-KLEIN-GORDON SYSTEM IN TWO DIMENSIONS

  • Xiao, Meixia;Zhang, Xianwen
    • Bulletin of the Korean Mathematical Society / v.55 no.2 / pp.591-598 / 2018
  • This paper is concerned with the global existence of weak solutions to the relativistic Vlasov-Klein-Gordon system. The energy of this system is conserved, but the interaction term $\int_{\mathbb{R}^n} \rho\varphi\,dx$ in it need not be positive. So far, the existence of global weak solutions has been established only for small initial data [9, 14]. In two dimensions, this paper shows that the interaction term can be estimated by the kinetic energy to the power of $\frac{4q-4}{3q-2}$ for $1 < q < 2$. As a consequence, global existence of weak solutions for general initial data is obtained.
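A short check of the exponent quoted in this abstract, written as a worked derivation. The symbol $\theta(q)$ is notation introduced here for convenience, not the paper's:

$$
\theta(q) = \frac{4q-4}{3q-2}, \qquad \theta'(q) = \frac{4(3q-2) - 3(4q-4)}{(3q-2)^2} = \frac{4}{(3q-2)^2} > 0,
$$

$$
\theta(1) = 0, \qquad \theta(2) = 1 \quad\Longrightarrow\quad 0 < \theta(q) < 1 \ \text{ for } 1 < q < 2.
$$

The exponent is therefore strictly below 1 on the stated range. That sublinear growth is presumably what allows the kinetic energy to control the interaction term without a smallness condition on the data; the abstract itself does not spell out this step.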

Stochastic precipitation modeling based on Korean historical data

  • Kim, Yongku;Kim, Hyeonjeong
    • Journal of the Korean Data and Information Science Society / v.23 no.6 / pp.1309-1317 / 2012
  • Stochastic weather generators are commonly used to simulate time series of daily weather, especially precipitation amount. Recently, a generalized linear model (GLM) has been proposed as a convenient approach to fitting these weather generators. In this paper, a stochastic weather generator is considered to model the time series of daily precipitation in Seoul, South Korea. Global temperature is introduced as a covariate to relate a long-term temporal-scale predictor to short-term temporal predictands. One limitation of stochastic weather generators is a marked tendency to underestimate the observed interannual variance of monthly, seasonal, or annual total precipitation. To reduce this underestimation, we incorporate time series of seasonal total precipitation into the GLM weather generator as covariates. It is verified that the addition of these covariates does not distort the performance of the weather generator in other respects.

Parametric Modeling and Shape Optimization of Offshore Structures

  • Birk, Lothar
    • International Journal of CAD/CAM / v.6 no.1 / pp.29-40 / 2006
  • The paper presents an optimization system that integrates a parametric design tool, 3D diffraction-radiation analysis, and hydrodynamic performance assessment based on short- and long-term wave statistics. Controlled by formal optimization strategies, the system is able to design offshore structure hulls with superior seakeeping qualities. The parametric modeling tool enables the designer to specify the geometric characteristics of the design, from displacement and principal dimensions down to local shape properties. The computer generates the hull form and passes it on to the hydrodynamic analysis, which computes response amplitude operators (RAOs) for forces and motions. Combining the RAOs with short- and long-term wave statistics provides a realistic assessment of the quality of the design. The optimization algorithm changes selected shape parameters in order to minimize forces and motions, thus increasing the availability and safety of the system. Constraints ensure that only feasible designs with sufficient stability in operation and survival conditions are generated. As an example, the optimization study of a semisubmersible is discussed. It illustrates how offshore structures can be optimized for a specific target area of operation.
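How an RAO combines with short-term wave statistics can be sketched as below. This is a minimal illustration assuming standard linear response theory (response spectrum equals squared RAO times wave spectrum) and a narrow-band response, so that the significant amplitude is twice the square root of the zeroth spectral moment. The function names and sample values are hypothetical, and the abstract does not state which formulation the paper actually uses.

```python
import math

def response_m0(omegas, rao, wave_spectrum):
    """Zeroth moment of the response spectrum S_r = RAO^2 * S_wave.

    omegas: increasing wave frequencies; rao and wave_spectrum are
    sampled at those frequencies. Integration uses the trapezoidal rule.
    """
    resp = [(r ** 2) * s for r, s in zip(rao, wave_spectrum)]
    m0 = 0.0
    for i in range(1, len(omegas)):
        m0 += 0.5 * (resp[i] + resp[i - 1]) * (omegas[i] - omegas[i - 1])
    return m0

def significant_amplitude(m0):
    """Significant response amplitude under the narrow-band assumption."""
    return 2.0 * math.sqrt(m0)

# Illustrative flat inputs only; real RAOs and spectra vary with frequency.
m0 = response_m0([0.0, 1.0, 2.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
print(m0, significant_amplitude(m0))
```

An optimizer of the kind described would evaluate a quantity like this for each candidate hull and sea state, then weight the results by the long-term probability of each sea state.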

A Lattice Model Based on Molecular Clusters for Supercritical Fluids (초임계 유체를 위한 분자 클러스터 기반의 격자모델)

  • Shin, Moon-Sam
    • Proceedings of the KAIS Fall Conference / 2010.05b / pp.961-964 / 2010
  • A semi-empirical fluctuation term is presented to improve a classical equation of state (EOS) for volumetric properties in the critical region. The term is based on two assumptions: (1) the Helmholtz energy divides into separate classical and long-range density fluctuation contributions; (2) all molecules form clusters near the critical region due to long-range density fluctuation. To formulate such molecular clusters, we extended the Veytsman statistics, originally developed for clusters due to hydrogen bonding. The probability function in the statistics is modified to represent the characteristics of long-range density fluctuation, which vanishes far from the critical region. The proposed fluctuation contribution was incorporated into the Sanchez-Lacombe EOS, and the combined model, with 6 adjustable parameters, was tested against experimental VLE data. The combined model is found to represent well the flattened critical isotherm for methane and the top of the coexistence curve for the tested components. The predictions for caloric data are in good agreement with the experimental data.

Short-Term Load Forecasting Using Multiple Time-Series Model Including Dummy Variables (더미변수(Dummy Variable)를 포함하는 다변수 시계열 모델을 이용한 단기부하예측)

  • 이경훈;김진오
    • The Transactions of the Korean Institute of Electrical Engineers A / v.52 no.8 / pp.450-456 / 2003
  • This paper proposes a multiple time-series model with dummy variables for one-hour-ahead load forecasting. We used 11 dummy variables classified by day characteristics such as day of the week, holiday, and special holiday. Model specification and the selection of input variables, including dummy variables, were based on test statistics such as the AIC (Akaike Information Criterion) and the t-test statistic of each coefficient. The OLS (Ordinary Least Squares) method was used for estimation and forecasting. We found that the model specifications for each hour are usually not identical at a 30% optimal significance level, and that dummy variables reduce the forecasting error if they are classified properly. The proposed model produces more accurate forecasts, with a lower MAPE (Mean Absolute Percentage Error).
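The error measure named in this abstract, and the general idea of day-type dummy variables, can be sketched as follows. This is a minimal sketch under assumptions: the abstract does not define its 11 dummy variables, so the encoding here (six weekday indicators with Monday as the baseline, plus one holiday flag) is a hypothetical stand-in, and the function names are illustrative.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent; actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

def day_dummies(weekday, is_holiday=False):
    """Encode a day as 0/1 regressors (hypothetical scheme).

    weekday: 0 = Monday ... 6 = Sunday. Monday is the baseline, so it
    maps to all zeros; the last element flags a holiday.
    """
    dummies = [1 if weekday == d else 0 for d in range(1, 7)]
    dummies.append(1 if is_holiday else 0)
    return dummies

print(mape([100.0, 200.0], [110.0, 180.0]))
print(day_dummies(2, is_holiday=True))
```

Leaving one category out as the baseline avoids perfect collinearity with the intercept when the dummies enter an OLS regression alongside lagged load values.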