• Title/Summary/Keyword: Algorithm Model


Analysis on the Snow Cover Variations at Mt. Kilimanjaro Using Landsat Satellite Images (Landsat 위성영상을 이용한 킬리만자로 만년설 변화 분석)

  • Park, Sung-Hwan;Lee, Moung-Jin;Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.28 no.4 / pp.409-420 / 2012
  • Since the Industrial Revolution, CO2 levels have been increasing along with climate change. In this study, we quantitatively analyze time-series changes in snow cover and statistically predict its vanishing point using remote sensing. The study area is Mt. Kilimanjaro, Tanzania. Twenty-three Landsat-5 TM and Landsat-7 ETM+ images, spanning the 27 years from June 1984 to July 2011, were acquired. First, atmospheric correction was performed on each image using the COST atmospheric correction model. Second, the snow cover area was extracted using the NDSI (Normalized Difference Snow Index) algorithm. Third, the minimum height of snow cover was determined using the SRTM DEM. Finally, the vanishing point of snow cover was predicted using a linear trend line. The analysis was performed both on all 23 images and on the 17 dry-season images. Results show that the snow cover area decreased by approximately 6.47 km², from 9.01 km² to 2.54 km², a 73% reduction. The minimum height of snow cover increased by approximately 290 m, from 4,603 m to 4,893 m. The trend lines show that the snow cover area decreased by approximately 0.342 km² per year in the dry season and 0.421 km² per year overall, while the minimum height of snow cover increased by approximately 9.848 m per year in the dry season and 11.251 m per year overall. Based on this analysis, snow cover is predicted to vanish by 2020 at the 95% confidence level. This study can support monitoring of global climate change by providing the observed change in snow cover area and reference data for future research on this or similar areas.
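
As a hedged illustration of two steps described above, the sketch below computes an NDSI snow mask from green and shortwave-infrared reflectance and fits a linear trend line to a snow-cover area series; the band arrays, the 0.4 threshold, and the toy area values are assumptions for demonstration, not the paper's data.

```python
import numpy as np

def ndsi_snow_mask(green, swir, threshold=0.4):
    """Snow mask from NDSI = (green - SWIR) / (green + SWIR).
    The 0.4 threshold is a common default, not the paper's value."""
    ndsi = (green - swir) / np.maximum(green + swir, 1e-6)
    return ndsi > threshold

def snow_area_km2(mask, pixel_size_m=30.0):
    """Snow-covered area for a boolean mask of 30 m Landsat pixels."""
    return mask.sum() * (pixel_size_m ** 2) / 1e6

# Hypothetical per-year areas (km^2); the paper's series is not reproduced here.
years = np.array([1984, 1990, 1995, 2000, 2005, 2011], dtype=float)
areas = np.array([9.01, 7.8, 6.5, 5.1, 3.9, 2.54])

slope, intercept = np.polyfit(years, areas, deg=1)  # linear trend line
vanishing_year = -intercept / slope                 # where the line hits zero
print(f"trend: {slope:.3f} km^2/yr, predicted vanishing year: {vanishing_year:.0f}")
```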

Latent topics-based product reputation mining (잠재 토픽 기반의 제품 평판 마이닝)

  • Park, Sang-Min;On, Byung-Won
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.39-70 / 2017
  • Data-driven analytics techniques have recently been applied to public surveys. Instead of simply gathering survey results or expert opinions to research the preference for a recently launched product, enterprises need a way to collect and analyze various types of online data and then accurately figure out customer preferences. In existing data-based survey methods, the sentiment lexicon for a particular domain is first constructed by domain experts, who judge the positive, neutral, or negative meanings of the words frequently used in the collected text documents. To research the preference for a particular product, the existing approach (1) collects review posts related to the product from several product review web sites; (2) extracts sentences (or phrases) from the collection after pre-processing steps such as stemming and stop-word removal; (3) classifies the polarity (positive or negative) of each sentence (or phrase) based on the sentiment lexicon; and (4) estimates the positive and negative ratios of the product by dividing the numbers of positive and negative sentences (or phrases) by the total number of sentences (or phrases) in the collection. Furthermore, the existing approach automatically finds important sentences (or phrases) carrying positive or negative meaning toward the product. As a motivating example, given a product like the Sonata made by Hyundai Motors, customers often want to see a summary note of the positive points in the 'car design' aspect as well as the negative points in the same aspect. They also want more useful information on other aspects such as 'car quality', 'car performance', and 'car service.' Such information will enable customers to make good choices when purchasing brand-new vehicles. In addition, automobile makers will be able to figure out the preferences and positive/negative points for new models on the market, and the weak points of those models can then be improved based on the sentiment analysis. For this, the existing approach computes the sentiment score of each sentence (or phrase) and then selects the top-k sentences (or phrases) with the highest positive and negative scores. However, the existing approach has several shortcomings that limit its use in real applications: (1) The main aspects (e.g., car design, quality, performance, and service) of a product (e.g., Hyundai Sonata) are not considered. With sentiment analysis that ignores aspects, the summary note reported to customers and car makers contains only the overall positive and negative ratios of the product and the top-k sentences (or phrases) with the highest sentiment scores in the entire corpus. This is not enough; the main aspects of the target product need to be considered in the sentiment analysis. (2) Since the same word has different meanings across domains, a sentiment lexicon proper to each domain needs to be constructed, and an efficient way to build it is required because lexicon construction is labor intensive and time consuming.
To address these problems, in this article we propose a novel product reputation mining algorithm that (1) extracts topics hidden in review documents written by customers; (2) mines main aspects based on the extracted topics; (3) measures the positive and negative ratios of the product using the aspects; and (4) presents a digest in which a few important sentences with positive and negative meanings are listed for each aspect. Unlike the existing approach, using hidden topics lets experts construct the sentiment lexicon easily and quickly. Furthermore, by reinforcing topic semantics, we can improve the accuracy of product reputation mining well beyond that of the existing approach. In the experiments, we collected a large set of review documents on domestic vehicles such as the K5, SM5, and Avante; measured the positive and negative ratios of the three cars; showed the top-k positive and negative summaries per aspect; and conducted statistical analysis. Our experimental results clearly show the effectiveness of the proposed method compared with the existing method.
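
A minimal sketch of the topic-extraction step under stated assumptions: scikit-learn's LDA over a toy review corpus, with each extracted topic read as a candidate aspect. The corpus, topic count, and library choice are illustrative, not the authors' configuration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy review corpus; real input would be crawled product review posts.
reviews = [
    "the design of this car is sleek and modern",
    "engine performance is strong but fuel economy is poor",
    "service center was slow and the staff unhelpful",
    "beautiful exterior design, comfortable interior",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(reviews)

# Extract hidden topics; each topic is read as a candidate aspect
# (e.g., design, performance, service).
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(X)

terms = vec.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = [terms[i] for i in comp.argsort()[-4:][::-1]]
    print(f"topic {k}: {top}")
```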

Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

  • Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.1-16 / 2015
  • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to World Health Organization statistics, approximately 1.24 million deaths occurred on the world's roads in 2010. In order to reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, driver training programs, and so on. Records on traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationships between traffic accidents and related factors, including vehicle design, road design, weather, driver behavior, etc. Insights derived from such analyses can be used for accident prevention. Traffic accident data mining is an activity that finds useful knowledge about such relationships that is not well known and that users may be interested in. Many studies on mining accident data have been reported over the past two decades. Most of them focused on predicting the risk of accidents using accident-related factors. Supervised learning methods like decision trees, logistic regression, k-nearest neighbors, and neural networks are used for such prediction. However, the prediction models derived from these algorithms are too complex for humans to understand, because the main purpose of these algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups themselves are still not easy for humans to understand, so additional analytic work is necessary. Rule-based learning methods are adequate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent relationships between the target feature and the other features. Rules are fairly easy for humans to understand, and can therefore provide insight and comprehensible results. Association rule learning and subgroup discovery are representative rule-based learning methods for descriptive tasks. These two families of algorithms have been used in a wide range of areas, from transaction analysis and accident data analysis to detection of statistically significant patient risk groups and discovery of key persons in social communities. We use both association rule learning and subgroup discovery to find useful patterns in a traffic accident dataset consisting of many features, including driver profile, accident location, accident type, vehicle information, regulation violations, and so on. Association rule learning, an unsupervised method, searches for frequent item sets in the data and translates them into rules. In contrast, subgroup discovery is a kind of supervised learning that discovers rules about user-specified concepts satisfying a certain degree of generality and unusualness. Depending on which aspect of the data we focus on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, post-processing steps are taken to make the rule set more compact and easier to understand by removing uninteresting or redundant rules.
We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. The experiments reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features. Under a supervised setting with a combinatorial target feature, however, subgroup discovery finds good rules much more easily than association rule learning, which requires considerable effort to tune its parameters.
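
As a rough illustration of the association-rule side, the sketch below runs Apriori over a tiny one-hot accident table; the feature names, thresholds, and the mlxtend dependency are assumptions for demonstration and do not reproduce the paper's dataset or settings.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical one-hot encoded accident records (one row per accident).
data = pd.DataFrame({
    "night":        [1, 1, 0, 1, 0, 1],
    "rain":         [1, 0, 0, 1, 1, 1],
    "young_driver": [1, 1, 0, 0, 0, 1],
    "severe":       [1, 1, 0, 1, 0, 1],
}).astype(bool)

# Frequent item sets, then if-then rules filtered by confidence.
frequent = apriori(data, min_support=0.3, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.8)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```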

Visualization and Localization of Fusion Image Using VRML for Three-dimensional Modeling of Epileptic Seizure Focus (VRML을 이용한 융합 영상에서 간질환자 발작 진원지의 3차원적 가시화와 위치 측정 구현)

  • 이상호;김동현;유선국;정해조;윤미진;손혜경;강원석;이종두;김희중
    • Progress in Medical Physics / v.14 no.1 / pp.34-42 / 2003
  • In medical imaging, three-dimensional (3D) display using the Virtual Reality Modeling Language (VRML) as a portable file format can convey intuitive information more efficiently on the World Wide Web (WWW). Web-based 3D visualization of functional images combined with anatomical images has not been studied much in systematic ways. The goal of this study was to achieve simultaneous observation of 3D anatomic and functional models with planar images on the WWW, providing their locational information in 3D space with a measuring implement built in VRML. MRI and ictal-interictal SPECT images were obtained from one epileptic patient. Subtraction ictal SPECT co-registered to MRI (SISCOM) was performed to improve identification of the seizure focus. SISCOM image volumes were thresholded at one standard deviation (1-SD) and two standard deviations (2-SD). SISCOM foci and the boundaries of gray matter, white matter, and cerebrospinal fluid (CSF) in the MRI volume were segmented and rendered to VRML polygonal surfaces by the marching cubes algorithm. Line profiles along the x- and y-axes representing real lengths on an image were acquired; their maximum lengths were both 211.67 mm. The ratio of the real size to the rendered VRML surface size was approximately 1 to 605.9. A VRML measuring tool was made and merged with the VRML surfaces. User interface tools were embedded with JavaScript routines to display MRI planar images as cross sections of the 3D surface models and to set the transparencies of the 3D surface models. When the transparencies were properly controlled, a fused display of the brain geometry with the 3D distributions of focal activated regions intuitively conveyed the spatial correlations among the three 3D surface models. The epileptic seizure focus was in the right temporal lobe of the brain. The real position of the seizure focus could be verified with the VRML measuring tool, and the anatomy corresponding to the seizure focus could be confirmed from the MRI planar images crossing the 3D surface models. The VRML application developed in this study has several advantages. First, 3D fused display and control of anatomic and functional images were achieved on the WWW. Second, vector analysis of a 3D surface model was enabled by the VRML measuring tool based on real size. Finally, the anatomy corresponding to the seizure focus was intuitively identified through correlation with the MRI images. Our web-based visualization of 3D fusion images and their localization will be helpful for online research and education in diagnostic radiology, therapeutic radiology, and surgery applications.
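
A minimal sketch of the SISCOM thresholding step, assuming the ictal and interictal SPECT volumes are already co-registered NumPy arrays; the normalization and toy data are illustrative.

```python
import numpy as np

def siscom_focus(ictal, interictal, n_sd=1.0):
    """Subtraction ictal SPECT co-registered to MRI (SISCOM):
    normalize, subtract, and keep voxels more than n_sd standard
    deviations above the mean difference (1-SD or 2-SD above)."""
    ictal = ictal / ictal.mean()
    interictal = interictal / interictal.mean()
    diff = ictal - interictal
    threshold = diff.mean() + n_sd * diff.std()
    return diff > threshold  # boolean mask of candidate seizure focus

# Toy volumes standing in for co-registered SPECT data.
rng = np.random.default_rng(0)
ictal = rng.random((64, 64, 32)) + 1.0
interictal = rng.random((64, 64, 32)) + 1.0
mask_1sd = siscom_focus(ictal, interictal, n_sd=1.0)
mask_2sd = siscom_focus(ictal, interictal, n_sd=2.0)
print(mask_1sd.sum(), mask_2sd.sum())
```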


Reconstruction of Metabolic Pathway for the Chicken Genome (닭 특이 대사 경로 재확립)

  • Kim, Woon-Su;Lee, Se-Young;Park, Hye-Sun;Baik, Woon-Kee;Lee, Jun-Heon;Seo, Seong-Won
    • Korean Journal of Poultry Science / v.37 no.3 / pp.275-282 / 2010
  • The chicken is an important livestock species, both as a valuable biomedical model and as food for humans, and there is a strong rationale for improving our understanding of the metabolism and physiology of this organism. The first draft of the chicken genome assembly was released in 2004, enabling elaboration of the linkage between genetic and metabolic traits of the chicken. The objectives of this study were thus to reconstruct the metabolic pathways of the chicken genome and to construct a chicken-specific pathway genome database (PGDB). We developed a comprehensive genome database for the chicken by integrating all the known annotations for chicken genes and proteins using a pipeline written in Perl. Based on these comprehensive genome annotations, the metabolic pathways of the chicken genome were reconstructed using the PathoLogic algorithm in the Pathway Tools software. We identified a total of 212 metabolic pathways, 2,709 enzymes, 71 transporters, 1,698 enzymatic reactions, 8 transport reactions, and 1,360 compounds in the current chicken genome build, Gallus_gallus-2.1. Comparative metabolic analysis with the human, mouse, and cattle genomes revealed that core metabolic pathways are highly conserved in the chicken genome. The results also indicated that the quality of the assembly and annotations of the chicken genome needs to be improved, and that more research is required to improve our understanding of the function of genes and metabolic pathways of avian species. We conclude that the chicken PGDB is useful for studies on avian and chicken metabolism and provides a platform for comparative genomic and metabolic analysis in animal biology and biomedicine.

Tegumental ultrastructure of juvenile and adult Echinostoma cinetorchis (이전고환극구흡충 유약충 및 성충의 표피 미세구조)

  • 이순형;전호승
    • Parasites, Hosts and Diseases / v.30 no.2 / pp.65-74 / 1992
  • The tegumental ultrastructure of juvenile and adult Echinostoma cinetorchis (Trematoda: Echinostomatidae) was observed by scanning electron microscopy. Three-day-old (juvenile) and 16-day-old (adult) worms were harvested from rats (Sprague-Dawley) experimentally fed metacercariae from the laboratory-infected freshwater snail Hippeutis cantori. The worms were fixed with 2.5% glutaraldehyde, processed routinely, and observed with an ISI Korea DS-130 scanning electron microscope. The 3-day-old juvenile worms were elongated and ventrally curved, with the ventral sucker near the anterior two-fifths of the body. The head crown bore 37∼38 collar spines arranged in a zigzag pattern. The lips of the oral and ventral suckers had 8 and 5 type II sensory papillae, respectively, and between the spines a few type III papillae were observed. Tongue- or spade-shaped spines were distributed anterior to the ventral sucker, whereas peg-like spines were distributed posteriorly and became sparse toward the posterior body. The spines of the dorsal surface were similar to those of the ventral surface. The 16-day-old adults were leaf-like, with their oral and ventral suckers located very close together. The aspinous head crown and the oral and ventral suckers had type II and type III sensory papillae, and numerous type I papillae were distributed on the tegument anterior to the ventral sucker. Scale-like spines, with broad bases and rounded tips, were densely distributed on the tegument anterior to the ventral sucker but became sparse posteriorly. On the dorsal surface, spines were observed only occasionally, on the anterior body. The results showed that the tegument of E. cinetorchis is similar to that of other echinostomes, but differs in the number and arrangement of collar spines, the shape and distribution of tegumental spines, and the type and distribution of sensory papillae.


Quantification of Myocardial Blood flow using Dynamic N-13 Ammonia PET and factor Analysis (N-13 암모니아 PET 동적영상과 인자분석을 이용한 심근 혈류량 정량화)

  • Choi, Yong;Kim, Joon-Young;Im, Ki-Chun;Kim, Jong-Ho;Woo, Sang-Keun;Lee, Kyung-Han;Kim, Sang-Eun;Choe, Yearn-Seong;Kim, Byung-Tae
    • The Korean Journal of Nuclear Medicine / v.33 no.3 / pp.316-326 / 1999
  • Purpose: We evaluated the feasibility of extracting pure left ventricular blood pool and myocardial time-activity curves (TACs) and of generating factor images from human dynamic N-13 ammonia PET using factor analysis. The myocardial blood flow (MBF) estimates obtained with factor analysis were compared with those obtained with a user-drawn region-of-interest (ROI) method. Materials and Methods: Stress and rest N-13 ammonia cardiac PET images were acquired for 23 min in 5 patients with coronary artery disease using a GE Advance tomograph. Factor analysis generated physiological TACs and factor images using the normalized TACs from each dixel (dynamic pixel). Four steps were involved in this algorithm: (a) data preprocessing; (b) principal component analysis; (c) oblique rotation with positivity constraints; and (d) factor image computation. Areas under the curves and MBF estimated using the two-compartment N-13 ammonia model were used to validate the accuracy of the factor-analysis-generated physiological TACs. The MBF estimated by factor analysis was compared with the values estimated using the ROI method. Results: MBF values obtained by factor analysis were linearly correlated with MBF obtained by the ROI method (slope = 0.84, r = 0.91). Left ventricular blood pool TACs obtained by the two methods agreed well (area-under-curve ratios: 1.02 (0∼1 min), 0.98 (0∼2 min), 0.86 (1∼2 min)). Conclusion: The results of this study demonstrate that MBF can be measured accurately and noninvasively with dynamic N-13 ammonia PET imaging and factor analysis. This method is simple and accurate, and can measure MBF without blood sampling, ROI definition, or spillover correction.
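
The four-step factor analysis above can be sketched as follows, using non-negative matrix factorization as a stand-in that enforces the same positivity constraint as steps (b)-(c); the toy TAC matrix and factor count are assumptions, not the study's data.

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy dynamic data: rows are dixels (dynamic pixels), columns are time frames.
rng = np.random.default_rng(0)
t = np.linspace(0, 23, 24)                      # 23-minute acquisition
blood = np.exp(-t / 2.0)                        # blood-pool-like TAC
myo = 1.0 - np.exp(-t / 4.0)                    # myocardium-like TAC
mixing = rng.random((500, 2))                   # per-dixel factor weights
data = mixing @ np.vstack([blood, myo]) + 0.01 * rng.random((500, 24))

# (a) preprocessing: normalize each dixel TAC
data /= data.sum(axis=1, keepdims=True)

# (b)-(c) factorization with positivity constraints; NMF stands in for
# PCA followed by oblique rotation with positivity constraints.
model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
weights = model.fit_transform(np.clip(data, 0, None))  # (d) factor "images"
factor_tacs = model.components_                        # physiological TACs
print(weights.shape, factor_tacs.shape)
```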


Comparative analysis of auto-calibration methods using QUAL2Kw and assessment on the water quality management alternatives for Sum River (QUAL2Kw 모형을 이용한 자동보정 방법 비교분석과 섬강의 수질관리 대안 평가)

  • Cho, Jae Heon
    • Journal of Environmental Impact Assessment / v.25 no.5 / pp.345-356 / 2016
  • In this study, auto-calibration methods for a water quality model were compared and analyzed using QUAL2Kw, which can estimate optimum parameters by integrating a genetic algorithm with QUAL2K. QUAL2Kw was applied to the Sum River, which is greatly affected by the pollution loads of Wonju city. Two auto-calibration methods were examined: applying a single parameter set to the whole river reach, and applying separate parameter sets to each of multiple reaches. Analysis of the CV(RMSE) and the fitness of the GA shows that the separate-parameter auto-calibration method is more precise than the single-parameter method, so the separate-parameter method was applied in the water quality modelling of this study. The calibrated QUAL2Kw was then used for three water quality management scenarios for the Sum River, and the water quality impact on the river was analyzed. In scenario 1, which improves the effluent water quality of the Wonju WWTP, BOD and TP concentrations at the Sum River 4-1 station, a representative station of the mid-watershed, decrease by 17.7% and 29.1%, respectively, and immediately downstream of the Wonjucheon confluence BOD and TP concentrations decrease by 50.4% and 40.5%, respectively. In scenario 2, the Wonju water supply intake is closed and a multi-regional water supply, drawn from watersheds other than the Sum River, is provided; the river's water quality is slightly improved as its flow increases, with BOD and TP concentrations immediately downstream of the Wonjucheon confluence decreasing by 0.18 mg/L and 0.0063 mg/L, respectively. In scenario 3, the alternatives of scenarios 1 and 2 are implemented simultaneously, and the Sum River water quality improves slightly more than in scenario 1. The water quality predictions for the three scenarios indicate that improving the effluent quality of the Wonju WWTP is the most efficient alternative for water quality management of the Sum River; in particular, the water quality immediately downstream of the Wonjucheon confluence is greatly improved. When the Wonju water supply intake is closed and a multi-regional supply is provided, the Sum River water quality improves only slightly.
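
A hedged sketch of the genetic-algorithm calibration idea: evolve a parameter vector, one value per reach as in the separate-parameter variant, to minimize RMSE against observations. The toy water-quality "model", observations, and GA settings are illustrative; QUAL2Kw's actual fitness function is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
N_REACH = 4
observed = np.array([2.1, 1.8, 2.6, 2.2])  # hypothetical BOD observations

def simulate(params):
    """Stand-in for a QUAL2K run: one decay parameter per reach."""
    return 3.0 * np.exp(-params)

def rmse(params):
    return np.sqrt(np.mean((simulate(params) - observed) ** 2))

# Simple (mu + lambda) GA with mutation only: keep the fittest, mutate copies.
pop = rng.uniform(0.0, 2.0, size=(40, N_REACH))
for generation in range(200):
    scores = np.array([rmse(ind) for ind in pop])
    parents = pop[np.argsort(scores)[:10]]
    children = parents[rng.integers(0, 10, size=30)] + rng.normal(0, 0.05, (30, N_REACH))
    pop = np.vstack([parents, children])

best = pop[np.argmin([rmse(ind) for ind in pop])]
print("calibrated per-reach parameters:", best.round(3), "RMSE:", rmse(best).round(4))
```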

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.1-19 / 2018
  • Since forecasting stock movements is an important issue both academically and practically, studies related to stock price prediction have been actively conducted. Stock price forecasting research is classified into structured-data and unstructured-data approaches, and more specifically into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research on stock price prediction that combines big data is actively underway. Building on large amounts of data, stock prediction research mainly focuses on machine learning techniques. Methods that incorporate media effects have attracted particular attention recently, among which studies that analyze online news and use it to forecast stock prices have become mainstream. Previous studies predicting stock prices through online news mostly perform sentiment analysis of the news, building a separate corpus for each company and constructing a dictionary that predicts stock prices by recording responses to past stock prices. Such studies therefore examine the impact of online news on individual companies; for example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. More recently, methods that consider influences among highly related companies have also been studied; for example, stock movements of Samsung Electronics are predicted using news about both Samsung Electronics and a highly related company such as LG Electronics. These studies examine the effect of news from an industrial sector assumed to be homogeneous on an individual company, with homogeneous industries classified according to the Global Industry Classification Standard. In other words, the existing studies assume that industries divided by the Global Industry Classification Standard are homogeneous. However, they do not take into account influential companies with high relevance, nor do they reflect the heterogeneity that exists within the same Global Industry Classification Standard sectors. Examining various sectors, we find that some industrial sectors are not homogeneous groups. To overcome these limitations, our study proposes a methodology that reflects the heterogeneous effects of the industrial sector on the stock price by applying k-means clustering. Multiple Kernel Learning is mainly used to integrate data with various characteristics: it has several kernels, each of which receives and predicts different data. To incorporate the effects of the target firm and its relevant firms simultaneously, we used Multiple Kernel Learning, with each kernel assigned to predict stock prices from the financial news of one of the industrial groups produced by k-means cluster analysis around the target firm. To show that the suggested methodology is appropriate, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information from the industrial sectors related to the target company contains meaningful information for predicting its stock movements, and that the machine learning algorithm has better predictive power when the news of the relevant companies is considered together with the target company's news.
(2) It is important to predict stock movements with a number of clusters that varies with the level of homogeneity in the industrial sector. In other words, when stock prices within an industrial sector are homogeneous, it is better to use the relational effect at the level of the whole industry group, without clustering or with a small number of clusters; when they are heterogeneous, it is important to cluster the firms into groups. This study contributes by verifying that firms classified under the Global Industry Classification Standard exhibit heterogeneity, and by suggesting that relevance should be defined through machine learning and statistical analysis rather than taken directly from the Global Industry Classification Standard. It also demonstrates the efficiency of a prediction model that reflects this heterogeneity.
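
A minimal sketch of the clustering step under stated assumptions: firms in one sector are clustered by their daily-return profiles with k-means, and news features would then be built per cluster rather than per GICS sector. The tickers, return matrix, and k are illustrative; the Multiple Kernel Learning stage is not shown.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical daily-return matrix: rows are firms in one GICS sector,
# columns are trading days.
rng = np.random.default_rng(0)
firms = ["A", "B", "C", "D", "E", "F"]
returns = rng.normal(0, 0.02, size=(len(firms), 250))
returns[:3] += rng.normal(0, 0.01, 250)  # first three firms co-move

# Cluster firms so that each cluster is a (more) homogeneous sub-group;
# news features would then be built per cluster rather than per sector.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(returns)
for firm, label in zip(firms, km.labels_):
    print(firm, "-> cluster", label)
```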

Development of Gated Myocardial SPECT Analysis Software and Evaluation of Left Ventricular Contraction Function (게이트 심근 SPECT 분석 소프트웨어의 개발과 좌심실 수축 기능 평가)

  • Lee, Byeong-Il;Lee, Dong-Soo;Lee, Jae-Sung;Chung, June-Key;Lee, Myung-Chul;Choi, Heung-Kook
    • The Korean Journal of Nuclear Medicine / v.37 no.2 / pp.73-82 / 2003
  • Objectives: A new software package (Cardiac SPECT Analyzer, CSA) was developed for quantification of volumes and ejection fraction on gated myocardial SPECT. Volumes and ejection fraction computed by CSA were validated by comparison with those quantified by the Quantitative Gated SPECT (QGS) software. Materials and Methods: Gated myocardial SPECT was performed in 40 patients with ejection fractions from 15% to 85%. In 26 patients, gated myocardial SPECT was acquired a second time with the patients in situ. A cylinder model was used to eliminate noise semi-automatically, and profile data were extracted using Gaussian fitting after smoothing. The boundary points of the endo- and epicardium were found using an iterative learning algorithm. End-diastolic volume (EDV), end-systolic volume (ESV), and ejection fraction (EF) were calculated. These values were compared with those calculated by QGS; in addition, the same gated SPECT data were quantified repeatedly by CSA, and the variation of the values was assessed across sequential measurements of the same patients and across the repeated acquisitions. Results: In the 40 patients, EF, EDV, and ESV by CSA correlated with those by QGS, with correlation coefficients of 0.97, 0.92, and 0.96. Two standard deviations (2SD) of EF on the Bland-Altman plot was 10.1%. Repeated measurements of EF, EDV, and ESV by CSA correlated with each other, with coefficients of 0.96, 0.99, and 0.99, respectively. On repeated acquisition, reproducibility was also excellent, with correlation coefficients of 0.89, 0.97, and 0.98; coefficients of variation of 8.2%, 5.4 mL, and 8.5 mL; and 2SD of 10.6%, 21.2 mL, and 16.4 mL on the Bland-Altman plots for EF, EDV, and ESV. Conclusion: We developed the CSA software for quantification of volumes and ejection fraction on gated myocardial SPECT. The volumes and ejection fraction quantified using this software were found to be valid in terms of both correctness and precision.
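
As a small worked example of the quantities above, the sketch below computes ejection fraction from EDV/ESV and the Bland-Altman 2SD agreement measure; the paired values are hypothetical, not the study's measurements.

```python
import numpy as np

def ejection_fraction(edv, esv):
    """EF (%) = (EDV - ESV) / EDV * 100."""
    return (edv - esv) / edv * 100.0

# Hypothetical paired EF estimates from two methods (e.g., CSA vs. QGS).
ef_a = np.array([55.0, 62.0, 48.0, 35.0, 70.0])
ef_b = np.array([53.0, 64.0, 50.0, 33.0, 71.0])

diff = ef_a - ef_b
bias = diff.mean()
two_sd = 2.0 * diff.std(ddof=1)  # Bland-Altman limits-of-agreement half-width
print(f"bias {bias:.2f}%, 2SD {two_sd:.2f}%")

print(f"EF for EDV=120 mL, ESV=50 mL: {ejection_fraction(120, 50):.1f}%")
```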