• Title/Summary/Keyword: Feature-based Modeling


Implications and numerical application of the asymptotical shock wave model (점진적 충격파모형의 함축적 의미와 검산)

  • Cho, Seong-Kil
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.11 no.4
    • /
    • pp.51-62
    • /
    • 2012
  • According to Lighthill and Whitham's shock wave model, a shock wave exists even under a homogeneous speed condition. They referred to this wave as unobservable, analogous to a radio wave that cannot be seen. Recent research has attempted to identify how such a counterintuitive conclusion follows from Lighthill and Whitham's model and has derived a new asymptotical shock wave model. The asymptotical model showed that the shock wave speed in a homogeneous-speed traffic stream is identical to the ambient vehicle speed; thus, no radio-wave-like shock wave exists. However, performance tests of the asymptotical model using numerical values had not yet been performed. We investigated the new model by examining its implications and tested it with numerical values based on a test scenario. Our investigation showed that the only difference between the two models lies in the third term of their equations, and that this difference plays a crucial role in the model output. Incorporation of the model parameter α is another distinctive feature of the asymptotical model: it makes the model more flexible, and the choice of α values allows calibration to accommodate various traffic flow situations, which is not possible in Lighthill and Whitham's model. Our numerical tests showed that the new model yields significantly different outputs: for the same test environment, the predicted shock wave speeds of the asymptotical model tend to lean toward the downstream direction in most cases compared with those of Lighthill and Whitham's model. Statistical significance tests also indicate that the outputs of the new model differ significantly from the corresponding outputs of Lighthill and Whitham's model.
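
For orientation, the classical Lighthill-Whitham shock speed referenced above is the slope of the chord between two flow-density states on the fundamental diagram. The minimal sketch below computes it for illustrative values; the asymptotical model and its parameter α are defined in the paper itself and are not reproduced here.

```python
# Minimal sketch: classical Lighthill-Whitham (LW) shock wave speed.
# The shock between two traffic states moves at the slope of the chord
# connecting them on the flow-density diagram:
#     w = (q2 - q1) / (k2 - k1)
# The asymptotical model discussed in the paper modifies a term of this
# relation and adds a parameter alpha; that form is not reproduced here.

def lw_shock_speed(q1: float, k1: float, q2: float, k2: float) -> float:
    """Shock speed (km/h) between states (k1, q1) and (k2, q2).

    q: flow in veh/h, k: density in veh/km.
    """
    if k1 == k2:
        raise ValueError("States have equal density; LW speed is undefined.")
    return (q2 - q1) / (k2 - k1)

# Example: upstream free flow meeting downstream congestion.
w = lw_shock_speed(q1=1800.0, k1=20.0, q2=1000.0, k2=100.0)
print(f"LW shock speed: {w:.1f} km/h")  # -10.0 km/h (moves upstream)
```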

Efficiency evaluation of nursing homes in China's eastern areas Based on DEA-Malmquist Model (DEA-Malmquist를 활용한 중국 동부지역 요양원의 효율성 평가에 관한 연구)

  • Chu, Ting;Sim, Jae-yeon
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.7
    • /
    • pp.273-282
    • /
    • 2021
  • Nursing homes play a key role in providing elderly care in the context of China's rapid population aging, but little is known about their efficiency. In this paper, we investigated the efficiency of nursing homes using Data Envelopment Analysis (DEA) and the Malmquist index (MPI), modeling the number of nursing home beds, fixed assets, and medical personnel as input variables, and the number of elderly residents capable of self-care, the number capable of partial self-care, the number who are bed-ridden, and the income of nursing homes as output variables. Stratification analysis showed that the top two provinces by DEA-CCR score over the five-year survey period were Beijing and Shanghai. Four provinces (Beijing, Jiangsu, Shandong, and Shanghai) scored 1.00 on the DEA-BCC measure. The MPI analysis showed that Hainan had the highest five-year average among the included provinces. In terms of resource utilization, internal management, and operation scale, nursing homes in provinces with high efficiency evaluations showed both high efficiency and technological progress, whereas areas with low evaluations nevertheless showed improving technical efficiency.
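
As a rough illustration of the DEA machinery the study relies on, the sketch below solves the standard input-oriented CCR envelopment program with scipy for a toy set of decision-making units. The variable set (beds, fixed assets, personnel; resident counts and income) and the Malmquist decomposition used in the paper are not reproduced; the data here are illustrative.

```python
# Hedged sketch: input-oriented CCR DEA efficiency via linear programming.
# X: (m inputs x n DMUs), Y: (s outputs x n DMUs).
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X: np.ndarray, Y: np.ndarray, j0: int) -> float:
    """Return the CCR efficiency score (theta) of DMU j0."""
    m, n = X.shape
    s, _ = Y.shape
    # Decision variables: [theta, lambda_1, ..., lambda_n]
    c = np.r_[1.0, np.zeros(n)]                      # minimize theta
    # Inputs:  sum_j lambda_j * x_ij <= theta * x_i,j0
    A_in = np.c_[-X[:, [j0]], X]
    # Outputs: sum_j lambda_j * y_rj >= y_r,j0
    A_out = np.c_[np.zeros((s, 1)), -Y]
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(m), -Y[:, j0]],
                  bounds=[(None, None)] + [(0, None)] * n,
                  method="highs")
    return res.x[0]

# Toy data: 3 inputs x 4 nursing homes, 2 outputs x 4 nursing homes.
X = np.array([[100., 120., 90., 150.], [5., 6., 4., 8.], [20., 25., 18., 30.]])
Y = np.array([[80., 85., 80., 90.], [1.0, 1.1, 1.2, 1.0]])
for j in range(X.shape[1]):
    print(f"DMU {j}: theta = {ccr_efficiency(X, Y, j):.3f}")
```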

Geophysical Studies on Major Faults in the Gyeonggi Massif : Gravity and Electrical Surveys In the Gongju Basin (경기육괴내 주요 단층대의 지구물리학적 연구: 공주분지의 중력 및 지전기 탐사)

  • Kwon Byung-Doo;Jung Gyung-Ja;Baag Chang-Eob
    • The Korean Journal of Petroleum Geology
    • /
    • v.2 no.2 s.3
    • /
    • pp.43-50
    • /
    • 1994
  • The geologic structure of the Gongju Basin, a Cretaceous sedimentary basin located on the boundary between the Gyeonggi Massif and the Ogcheon Belt, is modeled using gravity data and interpreted in relation to the basin-forming tectonism. An electrical survey with a dipole-dipole array was also conducted to examine the development of fractures in the two fault zones that form the boundaries of the basin. In the gravity data reduction, the terrain correction was performed using a conic prism model, which gave better results especially for steep topography. The gravity model of the basin's geologic structure was obtained by forward modeling based on the surface geology and density inversion. It reveals that the width of the basin is about 4 km at its central part and about 2.5 km at the southern part. The depth of the crystalline basement beneath the basin's sedimentary rocks is about 400-700 m below sea level, and the sedimentary fill is thinner in the center than at the margins. The fault on the southeastern boundary appears more clearly than that on the northwestern boundary, and its fracture zone may extend to a depth of more than 1 km. Therefore, the tectonic movement along the southeastern boundary fault is thought to have been much stronger. These results coincide with the broad low-resistivity anomaly that appears at the southeastern boundary of the basin in the resistivity section. Low-density fracture zones are also recognized inside the basin from the gravity model. The swelling of the basement and the fractures in the basin's sedimentary rocks suggest that compressional tectonic stress was also involved after the deposition of the Cretaceous sediments.
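
For a sense of the scale of gravity signal such a basin produces, the sketch below uses the textbook infinite-slab (Bouguer) approximation with illustrative thickness and density-contrast values. The paper's actual conic-prism terrain correction and forward modeling are considerably more elaborate and are not reproduced here.

```python
# Hedged sketch: first-order gravity effect of a sediment-filled basin,
# approximated by an infinite Bouguer slab: dg = 2*pi*G*drho*t.
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def bouguer_slab_mgal(thickness_m: float, delta_rho: float) -> float:
    """Gravity effect (mGal) of an infinite slab of given thickness (m)
    and density contrast (kg/m^3)."""
    return 2.0 * math.pi * G * delta_rho * thickness_m * 1e5  # m/s^2 -> mGal

# Example: ~700 m of sediments with a -400 kg/m^3 contrast against
# crystalline basement (illustrative values only, not the paper's).
print(f"{bouguer_slab_mgal(700.0, -400.0):.2f} mGal")  # about -11.7 mGal
```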


Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.141-156
    • /
    • 2013
  • Social media is a representative form of Web 2.0 that shapes users' information behavior by allowing them to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data play a significant role in the emerging Big Data arena, and research areas such as social network analysis and opinion mining have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. Most previous studies, however, have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big stream datasets of Twitter in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets containing the candidates' names and the word 'election' on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects social trends effectively. The system also retrieves the list of terms that co-occur with given query terms. We compared the term co-occurrence results for the influential candidates' names 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn' as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park'; 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon'; and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic can show one of two patterns, a rising tendency or a falling tendency, depending on the change of its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compared topic trends with related news articles and found that Twitter can track issues faster than other media such as newspapers. The user network in Twitter differs from those of other social media because of the distinctive way relationships are made in Twitter: users form relationships by exchanging mentions. We visualized and analyzed the mention-based networks of 136,754 users, using the three candidates' names 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn' as query terms. The results show that Twitter users mention all candidates' names regardless of their political tendencies. This case study discloses that Twitter can be an effective tool to detect and predict dynamic changes in social issues, and that mention-based user networks can show different aspects of user behavior as a network uniquely found in Twitter.
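
To make the topic-modeling step concrete, the sketch below runs multinomial Latent Dirichlet Allocation over a few illustrative tweet-like strings, with scikit-learn standing in for whatever implementation the authors used; the corpus and topic count are placeholders.

```python
# Hedged sketch: LDA topic extraction over tweets (illustrative data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    "presidential election public opinion poll results",
    "single candidacy agreement between candidates",
    "proclamation in support of the candidate",
    "time of voting extension debated before the election",
]
vec = CountVectorizer(stop_words="english")
dtm = vec.fit_transform(tweets)          # document-term matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(dtm)

# Print the top terms of each topic; tracking doc-topic distributions
# over time windows would reveal the rising/falling tendencies.
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[::-1][:5]]
    print(f"topic {k}: {', '.join(top)}")
```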

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.141-166
    • /
    • 2019
  • Recently, channels like social media and SNS create enormous amounts of data. Among all kinds of data, the portion of unstructured data represented as text has increased geometrically. Because it is difficult to examine all of this text, it is important to access it rapidly and grasp its key points, and many studies on text summarization for handling and using tremendous amounts of text data have therefore been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms have recently been proposed to generate summaries objectively and effectively, an approach called 'automatic summarization'. However, almost all text summarization methods proposed to date construct the summary based on the frequency of contents in the original documents, and such summaries tend to omit low-weight subjects that are mentioned less in the original text. If a summary includes only the major subject, bias occurs and information is lost, making it hard to ascertain every subject the documents contain. To avoid this bias, one can summarize with a balance among the topics of the document so that every subject is represented, but an unbalanced distribution among those subjects still remains. To retain the balance of subjects in the summary, it is necessary to consider the proportion of every subject the documents originally have, and to allocate the portions of the subjects equally, so that even sentences of minor subjects are sufficiently included in the summary. In this study, we propose a 'subject-balanced' text summarization method that preserves the balance among all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summarization, we use two summary evaluation metrics, 'completeness' and 'succinctness': completeness means that the summary should fully include the contents of the original documents, and succinctness means that the summary has minimal internal duplication. The proposed method has three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate the degree to which each term is related to each topic. From the derived weights, highly related terms for every topic can be identified, and the subjects of the documents can be found from topics composed of terms with similar meanings. A few terms that represent each subject well, called 'seed terms' in this method, are then selected. However, these terms alone are too few to explain each subject, so terms sufficiently similar to the seed terms are needed for a well-constructed subject dictionary. Word2Vec is used for word expansion to find terms similar to the seed terms: word vectors are created by Word2Vec modeling, and from those vectors the similarity between all terms can be derived using cosine similarity, where a higher cosine similarity between two terms indicates a stronger relationship. Terms with high similarity to the seed terms of each subject are selected, and after filtering these expanded terms, the subject dictionary is finally constructed. The next phase allocates subjects to every sentence of the original documents. To grasp the contents of all sentences, frequency analysis is first conducted with the specific terms that compose the subject dictionaries. The TF-IDF weight of each subject is calculated after the frequency analysis, making it possible to determine how much each sentence explains each subject. However, TF-IDF weights can grow without bound, so the TF-IDF weights of every subject in each sentence are normalized to values between 0 and 1. Each sentence is then assigned the subject with the maximum TF-IDF weight, and sentence groups are finally constructed for each subject. The last phase is summary generation. Sen2Vec is used to compute the similarity between subject sentences, forming a similarity matrix, and by repetitive sentence selection it is possible to generate a summary that fully covers the contents of the original documents while minimizing duplication within the summary itself. To evaluate the proposed method, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between summaries from the proposed method and frequency-based summaries verified that summaries from the proposed method better retain the balance of all the subjects the documents originally have.
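
The subject-allocation phase can be illustrated with a small sketch: score each sentence against subject dictionaries, normalize the scores to [0, 1], and assign each sentence to its maximum-weight subject. The dictionaries and the simple term-count score below are hypothetical placeholders for the paper's topic-model/Word2Vec dictionaries and TF-IDF weights.

```python
# Hedged sketch of subject allocation with normalized per-subject scores.
import numpy as np

subject_dict = {
    "room":    {"room", "bed", "clean", "view"},
    "service": {"staff", "service", "friendly", "helpful"},
    "food":    {"breakfast", "food", "restaurant", "menu"},
}
sentences = [
    "the room was clean with a great view",
    "staff were friendly and the service was helpful",
    "breakfast menu at the restaurant was rich",
]

def subject_scores(sentence: str) -> np.ndarray:
    tokens = sentence.lower().split()
    # Raw score: count of dictionary terms (a stand-in for TF-IDF weight).
    return np.array([sum(t in terms for t in tokens)
                     for terms in subject_dict.values()], dtype=float)

S = np.array([subject_scores(s) for s in sentences])
# Normalize each subject's scores to [0, 1] so no subject dominates.
spread = np.ptp(S, axis=0)
S = (S - S.min(axis=0)) / np.where(spread == 0, 1.0, spread)

labels = list(subject_dict)
for sent, row in zip(sentences, S):
    print(f"{labels[int(row.argmax())]:>8}: {sent}")
```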

The Effects of Evaluation Attributes of Cultural Tourism Festivals on Satisfaction and Behavioral Intention (문화관광축제 방문객의 평가속성 만족과 행동의도에 관한 연구 - 2006 광주김치대축제를 중심으로 -)

  • Kim, Jung-Hoon
    • Journal of Global Scholars of Marketing Science
    • /
    • v.17 no.2
    • /
    • pp.55-73
    • /
    • 2007
  • Festivals are an indispensable feature of cultural tourism (Formica & Uysal, 1998). Cultural tourism festivals are increasingly being used as instruments for promoting tourism and boosting the regional economy, so much research related to festivals is undertaken from a variety of perspectives. Plans to revisit a particular festival have been viewed as an important research topic both in academia and in the tourism industry, and festivals have frequently been labeled as cultural events. Cultural tourism festivals have become a crucial component of the attractiveness of tourism destinations (Prentice, 2001). As a result, a considerable number of tourist studies have been carried out on diverse cultural tourism festivals (Backman et al., 1995; Crompton & Mckay, 1997; Park, 1998; Clawson & Knetch, 1996). Much of the previous literature empirically shows a close linkage between tourist satisfaction and behavioral intention at festivals. The main objective of this study is to investigate the effects of the evaluation attributes of cultural tourism festivals on satisfaction and behavioral intention. To accomplish this objective, evaluation items for cultural tourism festivals were identified through a literature review and an empirical study. Using a varimax rotation with Kaiser normalization, the research obtained four factors from the 18 evaluation attributes of cultural tourism festivals. Some empirical studies have examined the relationship between behavioral intention and actual behavior; to understand the link between tourist satisfaction and behavioral intention, this study suggests five hypotheses and a hypothesized model. The analysis is based on primary data collected from visitors who participated in the '2006 Gwangju Kimchi Festival'. In total, 700 self-administered questionnaires were distributed and 561 usable questionnaires were obtained. Respondents rated the 18 satisfaction items on a scale from 1 (strongly disagree) to 7 (strongly agree). The dimensionality and stability of the scale were evaluated by factor analysis with varimax rotation. Four factors emerged with eigenvalues greater than 1, explaining 66.40% of the total variance, with Cronbach's alpha ranging from 0.774 to 0.876. The four factors were named: advertisement and guides, programs, food and souvenirs, and convenient facilities. To test and estimate the hypothesized model, a two-step approach with an initial measurement model and a subsequent structural model for Structural Equation Modeling was used. The AMOS 4.0 analysis package was used to conduct the analysis, with the maximum likelihood procedure used to estimate the model. The Chi-square test, the most common model goodness-of-fit test, was used, and, following the Structural Equation Modeling literature, additional fit indexes were used to assess the suggested model: the goodness-of-fit index (GFI) and root mean square error of approximation (RMSEA) as absolute fit indexes, and the normed fit index (NFI) and non-normed fit index (NNFI) as incremental fit indexes. The results of t-tests and ANOVAs revealed significant differences (0.05 level); therefore H1 (tourist satisfaction levels differ by demographic traits) is supported. According to the multiple regression analysis and AMOS results, H2 (tourist satisfaction positively influences revisit intention), H3 (tourist satisfaction positively influences word of mouth), H4 (the evaluation attributes of cultural tourism festivals influence tourist satisfaction), and H5 (tourist satisfaction positively influences behavioral intention) are also supported. The conclusions of this study are as follows. First, satisfaction levels differed according to the demographic characteristics of visitors; not all visitors had the same degree of satisfaction with their cultural tourism festival experience. It is therefore necessary to understand tourist satisfaction if the experiences provided are to meet visitors' expectations, and in making festival plans the organizer should consider demographic variables when explaining and segmenting visitors to cultural tourism festivals. Second, satisfaction with the evaluation attributes of cultural tourism festivals had a significant direct impact on visitors' intention to revisit such festivals and on the word-of-mouth publicity they shared. The results indicate that visitor satisfaction is a significant antecedent of intention to revisit, so festival organizers should strive to forge long-term relationships with visitors; it is also necessary to understand how revisit intention changes over time and to identify the critical satisfaction factors. Third, it is confirmed that behavioral intention is enhanced by satisfaction. The strong link between satisfaction and visitors' behavioral intentions is ensured by high-quality advertisement and guides, programs, food and souvenirs, and convenient facilities. Thus, examining revisit intention from a temporal viewpoint may be of great significance for both practical and theoretical reasons, and festival organizers should pay special attention to visitor satisfaction, as satisfied visitors are more likely to return sooner. The findings of this research have several practical implications for festival managers. The promotion of cultural festivals should be based on an understanding of tourist satisfaction for the long-term success of tourism, and this study can help managers carry out this task in a more informed and strategic manner by examining the effects of demographic traits on tourist satisfaction and behavioral intention; in other words, differentiated marketing strategies should be stressed and executed by the relevant parties. The limitations of this study are as follows: the results cannot be generalized to other cultural tourism festivals because many different kinds of festivals were not explored. A future study should comparatively analyze other festivals and different visitor segments, and further efforts should be directed toward developing more comprehensive temporal models that can explain the behavioral intentions of tourists.
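
As a concrete stand-in for the factor-analysis step, the sketch below extracts four varimax-rotated factors from synthetic 7-point responses with scikit-learn. The synthetic data and library choice are assumptions; the paper's own analysis used varimax rotation with Kaiser normalization and AMOS 4.0 for the structural model.

```python
# Hedged sketch: varimax-rotated factor extraction from Likert items,
# analogous to the paper's four-factor solution (synthetic responses
# stand in for the 561 festival questionnaires).
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_respondents, n_items = 561, 18
# Synthetic 1-7 Likert responses; real data would come from the survey.
responses = rng.integers(1, 8, size=(n_respondents, n_items)).astype(float)

fa = FactorAnalysis(n_components=4, rotation="varimax", random_state=0)
fa.fit(responses)

# Loadings (items x factors): inspect which items define each factor,
# e.g. "advertisement and guides", "programs", "food and souvenirs",
# "convenient facilities" in the paper.
loadings = fa.components_.T
print(loadings.shape)  # (18, 4)
```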


Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.1-32
    • /
    • 2018
  • Beyond stakeholders such as the managers, employees, creditors, and investors of bankrupt companies, corporate defaults have a ripple effect on the local and national economy. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing various corporate default models; as a result, even large corporations, the so-called 'chaebol enterprises', went bankrupt. Even afterwards, the analysis of past corporate defaults focused on specific variables, and when the government restructured firms immediately after the global financial crisis, it concentrated on a few main variables such as the debt ratio. A multifaceted study of corporate default prediction models is essential to serve diverse interests and to avoid situations like the 'Lehman Brothers case' of the global financial crisis, where total collapse occurs in a single moment. The key variables used in corporate default prediction vary over time: comparing the analyses of Beaver (1967, 1968) and Altman (1968) with Deakin's (1972) study confirms that the major factors affecting corporate failure have changed, and Grice's (2001) study likewise examined the importance of the predictive variables in Zmijewski's (1984) and Ohlson's (1980) models. However, past studies use static models, and most do not consider the changes that occur over time. Therefore, to construct consistent prediction models, it is necessary to compensate for time-dependent bias by means of a time series algorithm that reflects dynamic change. Centered on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data, from 2000 to 2009. The data are divided into training, validation, and test sets of 7, 2, and 1 years, respectively. To construct a bankruptcy model that is consistent over time, we first train a deep learning time series model using the data before the financial crisis (2000-2006). Parameter tuning of the existing models and the deep learning time series algorithm is conducted with validation data covering the financial crisis period (2007-2008); as a result, we construct a model whose patterns are similar to the training results and which shows excellent prediction power. Each bankruptcy prediction model is then retrained by merging the training and validation data (2000-2008) and applying the optimal parameters from the validation step. Finally, each corporate default prediction model is evaluated and compared using the test data (2009), based on the models trained over the nine years, and the usefulness of the corporate default prediction model based on the deep learning time series algorithm is demonstrated. In addition, by adding Lasso regression to the existing variable-selection methods (multiple discriminant analysis and the logit model), we show that the deep learning time series model is useful for robust corporate default prediction across the three bundles of variables. The definition of bankruptcy used is the same as in Lee (2015). The independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable groups. The influence of the multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and the deep learning time series algorithms are compared. Corporate data pose the problems of nonlinear variables, multi-collinearity among variables, and lack of data: the logit model handles nonlinearity, the Lasso regression model addresses multi-collinearity, and the deep learning time series algorithm, combined with a variable data generation method, compensates for the lack of data. Big Data technology, a leading technology of the future, is moving from simple human analysis to automated AI analysis, and finally toward interconnected AI applications. Although the study of corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis for corporate default prediction modeling, and it also delivers better predictive power. Through the Fourth Industrial Revolution, the current government and other governments overseas are working hard to integrate such systems into the everyday life of their nations and societies, yet deep learning time series research for the financial industry is still insufficient. This is an initial study on deep learning time series analysis of corporate defaults, and we hope it will serve as comparative material for non-specialists who begin studies combining financial data with deep learning time series algorithms.
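
A minimal sketch of the kind of LSTM classifier described above, assuming yearly panels of financial ratios per firm; the shapes, hyperparameters, and random data below are illustrative assumptions, not the paper's configuration.

```python
# Hedged sketch: an LSTM default-prediction model over yearly financial
# ratios, mirroring the time-series setup described above.
import numpy as np
import tensorflow as tf

n_firms, n_years, n_ratios = 1000, 7, 20   # e.g. a 2000-2006 training window
X = np.random.rand(n_firms, n_years, n_ratios).astype("float32")
y = np.random.randint(0, 2, size=(n_firms, 1)).astype("float32")  # default flag

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_years, n_ratios)),
    tf.keras.layers.LSTM(32),                        # summarize the firm's history
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(default)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)
```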

The Intelligent Determination Model of Audience Emotion for Implementing Personalized Exhibition (개인화 전시 서비스 구현을 위한 지능형 관객 감정 판단 모형)

  • Jung, Min-Kyu;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.1
    • /
    • pp.39-57
    • /
    • 2012
  • Recently, with the introduction of high-tech equipment in interactive exhibits, much attention has been focused on interactive exhibits that can double the exhibition effect through interaction with the audience. Such exhibitions also make it possible to measure a variety of audience reactions. Among these, this research uses the changes in facial features that can be collected in an interactive exhibition space. We develop an artificial neural network-based prediction model that predicts the response of the audience by measuring the change in facial features when the audience is given a stimulus from a non-excited state. To represent the emotional state of the audience, this research uses a Valence-Arousal model and suggests an overall framework composed of the following six steps. The first step collects data for modeling; the data were collected from people who participated in the 2012 Seoul DMC Culture Open and were used for the experiments. The second step extracts 64 facial features from the collected data and compensates the facial feature values. The third step generates the independent and dependent variables of an artificial neural network model. The fourth step extracts the independent variables that affect the dependent variable using statistical techniques. The fifth step builds an artificial neural network model and performs the learning process using the train and test sets. The final, sixth step validates the prediction performance of the artificial neural network model using the validation data set. The proposed model is compared with a statistical prediction model to see whether it performs better; as a result, although the data set in this experiment contained much noise, the proposed model showed better results than the multiple regression analysis model. If this prediction model of audience reaction were used in a real exhibition, it could provide countermeasures and services appropriate to the audience's reaction to the exhibits. Specifically, if the audience's arousal toward an exhibit is low, action could be taken to increase it, for instance by recommending other preferred contents or by using light or sound to focus attention on the exhibit. In other words, when planning future exhibitions, it would be possible to plan exhibitions that satisfy various audience preferences, and a personalized environment that helps visitors concentrate on the exhibits could be fostered. However, the proposed model still shows low prediction accuracy, for the following reasons. First, the data cover diverse visitors to real exhibitions, so it was difficult to control an optimized experimental environment; the collected data therefore contain much noise, which lowers accuracy. In further research, data collection will be conducted in a more controlled experimental environment, and research to increase the prediction accuracy of the model will be carried out. Second, changes of facial expression alone are thought to be insufficient for extracting audience emotions; if facial expression were combined with other responses, such as sound or audience behavior, better results could be obtained.
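
A minimal sketch of the prediction step, assuming 64 compensated facial-feature changes as inputs and a (valence, arousal) pair as the target; scikit-learn's MLPRegressor and the synthetic data below stand in for the paper's artificial neural network and exhibition recordings.

```python
# Hedged sketch: neural-network mapping from facial-feature deltas to a
# (valence, arousal) pair, analogous to the model described above.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))          # changes of 64 facial features
Y = rng.uniform(-1, 1, size=(500, 2))   # target: (valence, arousal)

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X_tr, Y_tr)                   # multi-output regression

# On synthetic noise the score is meaningless; with real recordings this
# R^2 would be compared against the multiple regression baseline.
print("R^2 on held-out set:", model.score(X_te, Y_te))
```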