Title/Summary/Keyword: Multi-Instance Data

Factors Clustering Approach to Parametric Cost Estimates And OLAP Driver

  • JaeHo, Cho; BoSik, Son; JaeYoul, Chun
    • International conference on construction engineering and project management / 2009.05a / pp.707-716 / 2009
  • The role of the cost modeller is to facilitate the design process through the systematic application of cost factors, maintaining a sensible and economic relationship between cost, quantity, utility, and appearance, and thus helping to meet the client's requirements within an agreed budget. There has been a considerable body of research on cost estimates in the early design stage, focused on improving accuracy or identifying impact factors, and it establishes that cost estimates are undertaken progressively throughout the design stage, making use of the information available at each phase. Moreover, early-design-stage cost estimates must analyze information under various preconditions before the design matures, because a design can be modified at any point in the process depending on the client's requirements. Parametric cost estimating models have been adopted to support decision making in this changeable early-design environment. These models draw on similar instances or patterns from historical cases, comprising project information, geographic and design features, and data related to quantity or cost. The OLAP technique analyzes subject data from multi-dimensional points of view, supporting the querying, analysis, and comparison of required information through diverse queries, and its data structure matches the multi-view analysis framework well. Accordingly, this study implements a multi-dimensional information system for case-based quantity data related to design information using OLAP technology, and then analyzes the impact factors of quantity by design criteria or parameters of equivalent meaning. On the basis of these factors, the study generates rules on quantity measures and produces resemblance classes using data-mining clustering (see the sketch below). The resulting knowledge base consists of data classified into group patterns, which provide an appropriate foundation for the parametric cost estimating method.
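
A hedged illustration of the clustering step described above: grouping historical cases into resemblance classes by their design parameters so that a new project can be matched against the most similar group. The factor names and values are hypothetical; the paper's actual OLAP cube and factor set are not reproduced here.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical case base: [gross floor area (m2), storeys, span (m), storey height (m)]
cases = np.array([
    [12000, 10, 8.4, 3.6],
    [13500, 12, 8.1, 3.6],
    [4200,   4, 7.2, 3.3],
    [3900,   3, 7.5, 3.3],
    [25000, 20, 9.0, 3.9],
])

# Standardize so no single design factor dominates the distance metric.
X = StandardScaler().fit_transform(cases)

# Partition the case base into resemblance classes; a new project would then be
# estimated from the quantity rules of its nearest class.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(model.labels_)   # class membership of each historical case
```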

Analysis on Geo-stress and casing damage based on fluid-solid coupling for Q9G3 block in Jibei oil field

  • Ji, Youjun; Li, Xiaoyu
    • Geomechanics and Engineering / v.15 no.1 / pp.677-686 / 2018
  • To address the serious casing damage that occurs during oilfield development by water injection, a mathematical model for petroleum production under water flooding was established based on seepage mechanics, fluid mechanics, and rock mechanics, with multi-physics coupling theory taken into account, and a method to solve the coupled model was presented that combines the Abaqus and Eclipse software packages (a simplified sketch of such a coupling loop follows). Taking the Q9G3 block in the Jibei oilfield as a case study, well-log data and geological survey data were employed to build a numerical model of the block, and the method was applied to simulate the evolution of seepage and stress. Production data were imported into the model for history matching, and the fitting accuracy was good. The main mechanism of casing damage in the block was analyzed, wells with probable casing damage were identified, and the simulated displacement of the well wall matched the field test data very well. Finally, based on the simulation results, measures for preventing casing damage in the Jibei oilfield were proposed.
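
A minimal sketch of the staggered (iteratively coupled) scheme the abstract describes, with Eclipse computing the seepage/pressure field and Abaqus the stress/displacement field each time step. The two solver calls below are stand-in toy functions; in the real workflow the data exchange (pressure maps out, updated porosity/permeability back in) runs through the simulators' own interfaces.

```python
def flow_step(perm, p_prev):
    # Stand-in for an Eclipse step: pressure drawdown grows as permeability declines.
    return p_prev - 0.5 * (50.0 / perm)

def mechanics_step(pressure):
    # Stand-in for an Abaqus step: effective stress rises as pore pressure falls.
    return 60.0 - pressure

def update_properties(stress, perm):
    # Stress-dependent permeability: compaction slowly closes flow paths.
    return perm * (1.0 - 0.0005 * stress)

pressure, perm = 30.0, 50.0                    # MPa, mD (hypothetical initial state)
for step in range(30):                         # staggered coupling, one pass per step
    pressure = flow_step(perm, pressure)       # flow solve at frozen rock properties
    stress = mechanics_step(pressure)          # mechanics solve at frozen pressure
    perm = update_properties(stress, perm)     # feed deformation back to the flow model

print(f"final pressure {pressure:.1f} MPa, effective stress {stress:.1f} MPa")
```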

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik; Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.147-168 / 2017
  • Accurate stock market forecasting has long been studied in academia, and a variety of forecasting models now exist, including recent attempts to predict stock indices with machine learning methods such as deep learning. Of the two traditional approaches to stock investment analysis, fundamental and technical analysis, technical analysis is the more useful for short-term trading prediction and for statistical and mathematical techniques. Most studies based on technical indicators have modeled stock price prediction as binary classification of future market movement (usually the next trading day): rising or falling. However, binary classification has clear drawbacks for predicting trends, identifying trading signals, and signaling portfolio rebalancing. In this study, we instead predict the stock index trend as a multi-class problem with three states: upward trend, boxed, and downward trend. Rather than relying on techniques such as multinomial logistic regression (MLOGIT), multiple discriminant analysis (MDA), or artificial neural networks (ANN), we address this multi-class problem with multi-class support vector machines (MSVM), which have shown superior prediction performance, and we propose an optimization model that uses a genetic algorithm as a wrapper to improve that performance. In particular, the proposed model, GA-MSVM, is designed to maximize performance by optimizing not only the MSVM kernel parameters but also the selection of input variables (feature selection) and of training cases (instance selection). To verify its usefulness, we applied GA-MSVM to forecasting the trend of Korea's KOSPI200 stock index, aiming primarily at predicting trend segments in order to capture trading signals and short-term trend transition points. The experimental data set comprises technical indicators of the KOSPI200 index, such as price and volatility indices (2004~2017), and macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using statistical methods including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, the trend class, took three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for validation. For comparison, experiments were conducted with MDA, MLOGIT, CBR, ANN, and plain MSVM models; the MSVM adopted the one-against-one (OAO) approach, known as the most accurate among the various MSVM schemes. The results show that the proposed method outperforms the conventional MSVM, previously known to give the best prediction performance, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was confirmed to play a very important role in predicting the stock index trend, contributing more to the model's improvement than the other optimized components. Although some limitations remain, the final experimental results demonstrate that GA-MSVM performs at a significantly higher level than all comparative models. (A condensed sketch of the wrapper idea follows.)
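
A condensed sketch of the GA-MSVM idea: an evolutionary wrapper that jointly evolves (i) a feature-selection mask, (ii) an instance-selection mask, and (iii) SVM kernel hyperparameters, scoring each chromosome by the validation accuracy of a one-against-one multiclass SVM. This is a toy reconstruction from the abstract (mutation-only evolution standing in for the authors' full genetic algorithm), not their implementation, and the data below are random placeholders for the KOSPI200 indicator set.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def fitness(chrom, X_tr, y_tr, X_va, y_va):
    f_mask, i_mask, log_C, log_g = chrom
    if f_mask.sum() == 0 or i_mask.sum() < 30:
        return 0.0                                   # degenerate chromosome
    clf = SVC(C=10 ** log_C, gamma=10 ** log_g, kernel="rbf")  # OAO internally
    clf.fit(X_tr[i_mask][:, f_mask], y_tr[i_mask])
    return clf.score(X_va[:, f_mask], y_va)

def random_chrom(n_feat, n_inst):
    return [rng.random(n_feat) < 0.5,                # feature-selection mask
            rng.random(n_inst) < 0.8,                # instance-selection mask
            rng.uniform(-2, 3), rng.uniform(-4, 1)]  # log10 C, log10 gamma

def mutate(chrom, p=0.05):
    f, i, c, g = chrom
    return [f ^ (rng.random(f.size) < p), i ^ (rng.random(i.size) < p),
            c + rng.normal(0, 0.2), g + rng.normal(0, 0.2)]

# Toy 3-class data: -1 (downward), 0 (boxed), 1 (upward).
X = rng.normal(size=(600, 15))
y = rng.integers(-1, 2, size=600)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

pop = [random_chrom(15, len(y_tr)) for _ in range(20)]
for gen in range(10):                                # keep the best, mutate the best
    scored = sorted(pop, key=lambda c: fitness(c, X_tr, y_tr, X_va, y_va), reverse=True)
    pop = scored[:10] + [mutate(c) for c in scored[:10]]

best = max(pop, key=lambda c: fitness(c, X_tr, y_tr, X_va, y_va))
print("validation accuracy:", fitness(best, X_tr, y_tr, X_va, y_va))
```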

3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training

  • Yeon-Seung Choo; Boeun Kim; Hyun-Sik Kim; Yong-Suk Park
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.3 / pp.670-684 / 2024
  • 3D cross-modal retrieval (3DCMR) is the task of retrieving 3D objects across modalities such as images, meshes, and point clouds. One of the most prominent methods for 3DCMR is the cross-modal center loss function (CLF), which applies the conventional center-loss strategy to 3D cross-modal search and retrieval. Because CLF is based on center loss, its center features are susceptible to subtle changes in hyperparameters and to external interference; for instance, performance degrades when the batch size is too small. Furthermore, the mean squared error (MSE) used in CLF cannot adapt to changes in batch size and, because it relies on a simple Euclidean distance between multi-modal features, is vulnerable to the data variations that occur during actual inference. To address the problems arising from small-batch training, we propose a noisy center loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in the data. As a result, the proposed approach improves performance on the ModelNet40 dataset compared to conventional methods. (A minimal center-loss sketch in this spirit follows.)
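
A minimal PyTorch sketch of a center-loss module in the spirit of the paper: per-class center features shared across modalities, toward which each modality's embeddings are pulled. The "noisy" perturbation of the centers shown here is an illustrative guess at the mechanism; the paper's exact NCL formulation and its SimSiam projection heads cannot be reconstructed from the abstract alone.

```python
import torch
import torch.nn as nn

class NoisyCenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, noise_std=0.01):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.noise_std = noise_std

    def forward(self, feats, labels):
        centers = self.centers[labels]
        if self.training:                    # perturb centers during training only
            centers = centers + self.noise_std * torch.randn_like(centers)
        return ((feats - centers) ** 2).sum(dim=1).mean()

# Usage: one loss term per modality (image/mesh/point-cloud embeddings), all
# sharing the same class centers so the modalities are pulled together.
loss_fn = NoisyCenterLoss(num_classes=40, feat_dim=256)  # ModelNet40 has 40 classes
img_feats = torch.randn(8, 256)
labels = torch.randint(0, 40, (8,))
loss = loss_fn(img_feats, labels)
```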

Looking Beyond the Numbers: Bibliometric Approach to Analysis of LIS Research in Korea

  • Yang, Kiduk; Lee, Jongwook; Choi, Wonchan
    • Journal of the Korean Society for Library and Information Science / v.49 no.4 / pp.241-264 / 2015
  • Bibliometric analysis for research performance evaluation can produce erroneous assessments for various reasons: applying the same evaluation metric across different domains, for instance, can yield unfair results, while analysis based on incomplete data can lead to incorrect conclusions. This study examines bibliometric data on library and information science (LIS) research in Korea to investigate whether research performance should be evaluated uniformly in a multi-disciplinary field such as LIS, and how data incompleteness can affect bibliometric assessment outcomes. The initial analysis of the study data, consisting of 4,350 citations to 1,986 domestic papers published between 2001 and 2010 by 163 LIS faculty members in Korea, showed an anomalous citation pattern caused by data incompleteness, which was addressed via data projection based on past citation trends. Subsequent analysis of the augmented data revealed ample evidence of bibliometric pattern differences across subject areas. In addition to highlighting the need for subject-specific assessment of research performance, the study demonstrates the importance of rigorous analysis and careful interpretation of bibliometric data: identifying and compensating for deficiencies in the data source, examining per capita as well as overall statistics, and considering the various facets of research in order to interpret what the numbers reflect rather than taking them at face value as quantitative measures of research performance. (Two of these corrections are illustrated below.)
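
A small illustration of two of the corrections the paper argues for: normalizing citation counts per faculty member rather than comparing raw totals across subject areas, and projecting an incompletely observed recent period from past trends. All numbers are hypothetical; the paper's actual projection model is not specified in the abstract.

```python
subareas = {
    # subject area: (total citations, faculty members) -- hypothetical figures
    "information retrieval": (820, 31),
    "archival studies":      (140, 12),
}
for area, (cites, faculty) in subareas.items():
    print(f"{area}: {cites / faculty:.1f} citations per capita")

# Naive trend-based projection of a final year whose citations are still accruing.
observed = [310, 365, 428]          # complete years
partial_final_year = 260            # incomplete due to citation lag
growth = observed[-1] / observed[-2]
projected = observed[-1] * growth   # estimate for the incomplete year
print(f"projected {projected:.0f} vs. {partial_final_year} observed so far")
```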

Statistical implications of extrapolating the overall result to the target region in multi-regional clinical trials

  • Kang, Seung-Ho; Kim, Saemina
    • Communications for Statistical Applications and Methods / v.25 no.4 / pp.341-354 / 2018
  • One of the principles described in ICH E9 is that only results obtained from statistical methods pre-specified in the protocol are regarded as confirmatory evidence. In multi-regional clinical trials, however, even significant results from the pre-specified methods do not guarantee that the test treatment will be approved by regional regulatory agencies: there is no global approval, and each regional agency makes its own decision on the same data set from the trial. In this situation, a regional regulatory agency has two natural ways to estimate the treatment effect in a particular region. The first is to use the overall treatment estimate, i.e., to extrapolate the overall result to the region of interest; the second is to use the regional treatment estimate. If the treatment effect were completely identical across all regions, the overall estimator would clearly be more efficient than the regional estimator. However, complete identity of the treatment effect across regions cannot be confirmed statistically, and some regional differences within the range of clinical relevance may naturally exist, due for instance to intrinsic and extrinsic factors. Nevertheless, if the magnitude of the regional differences is relatively small, the conventional way to estimate the treatment effect in the region of interest is to extrapolate the overall result to that region. The purpose of this paper, written from the viewpoint of regional regulatory agencies, is to investigate the effects of this type of extrapolation on estimation, followed by hypothesis testing, of the treatment effect in the region of interest. (The simulation below illustrates the underlying bias-variance trade-off.)
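
A small simulation of the trade-off discussed above: when regional effects differ only slightly, the overall (pooled) estimate of the effect in a target region can have lower mean squared error than the region-only estimate, because its much smaller variance outweighs its small bias. All effect sizes and sample sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
true_effects = np.array([1.0, 1.2, 1.15])     # 3 regions; target = region 0
n_per_region, sigma, reps = 50, 2.0, 20000

sq_errors = {"regional": [], "overall": []}
for _ in range(reps):
    # Simulate each region's sample mean directly.
    means = [rng.normal(mu, sigma / np.sqrt(n_per_region)) for mu in true_effects]
    sq_errors["regional"].append((means[0] - true_effects[0]) ** 2)
    sq_errors["overall"].append((np.mean(means) - true_effects[0]) ** 2)

for name, errs in sq_errors.items():
    print(name, "MSE ~", round(float(np.mean(errs)), 4))
# Expected: overall MSE (~0.04, small bias + small variance) beats
# regional MSE (~0.08, unbiased but high variance).
```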

Open Source Cloud Computing: An Experience Case of Geo-based Image Handling in Amazon Web Services

  • Lee, Ki-Won
    • Korean Journal of Remote Sensing / v.28 no.3 / pp.337-346 / 2012
  • From the viewpoint of most application system developers and users, cloud computing has become popular in recent years and is still evolving, yet it is not easy to bring it to the level of actual operations. Even so, cloud computing at the practical stage is known to provide a new pattern for deploying geo-spatial applications. Domestically, however, the implementation and operation of geo-spatial applications based on this scheme is only at the beginning stage, and this is the motivation for the present work. Although the study is introductory, a simple and practical result is presented. The work was carried out on the Amazon Web Services platform, used as infrastructure as a service for the geo-spatial domain: a cloud instance was generated hosting a web and mobile system previously implemented as a multi-layered structure on geo-spatial open-source database and application server software. Judging from this example, cloud services offering geo-processing functions and large-volume data handling are very likely to be the crucial point leading to a new business model for civilian remote sensing applications and the geo-spatial enterprise industry. Extending geo-spatial applications within the cloud computing paradigm is left for further work. (A present-day provisioning sketch follows.)
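
A present-day sketch of the kind of IaaS provisioning the paper performed in 2012: launching an EC2 instance to host a geo-spatial stack (for example, an open-source spatial database plus an application/web server). This uses today's boto3 SDK rather than the 2012-era tooling, and the AMI id, instance type, and key name are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-northeast-2")  # Seoul region
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI prepared with the geo stack
    InstanceType="t3.medium",          # placeholder size
    KeyName="my-keypair",              # placeholder key pair for SSH access
    MinCount=1,
    MaxCount=1,
)
print(resp["Instances"][0]["InstanceId"])
```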

An Optimization Model for Assignment of Freight Trains to Transshipment Tracks and Allocation of Containers to Freight Trains (화물열차 작업선배정 및 열차조성을 위한 수리모형 및 해법)

  • Kim, Kyung-Min; Kim, Dong-Hee; Park, Bum-Hwan
    • Journal of the Korean Society for Railway / v.13 no.5 / pp.535-540 / 2010
  • We present an optimization model for assigning freight trains to transshipment tracks and allocating containers to those trains in a rail container terminal. We formulate the problem as a multi-criteria integer program that minimizes the makespan of the job schedule while simultaneously maximizing the loading throughput, i.e., the number of containers handled per day. We apply the model to an instance built from real-world data of the Uiwang Inner Container Depot. The experiments show an improvement of approximately 6% in makespan, indicating that the model can improve the terminal's container-handling capacity without additional expansion of facilities. (A toy version of the model is sketched below.)
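
A toy version of the multi-criteria integer program described above: assign trains to transshipment tracks, trading off makespan against loading throughput via a weighted objective. The data and weight are hypothetical, and the paper's full container-allocation constraints are omitted.

```python
import pulp

trains, tracks = ["T1", "T2", "T3"], ["K1", "K2"]
work_time = {"T1": 4, "T2": 6, "T3": 3}      # handling hours per train (hypothetical)
containers = {"T1": 40, "T2": 55, "T3": 30}  # containers loadable per train
w = 5.0                                      # weight trading throughput against makespan

prob = pulp.LpProblem("track_assignment", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", (trains, tracks), cat="Binary")
makespan = pulp.LpVariable("makespan", lowBound=0)

for t in trains:                             # a train is handled on at most one track
    prob += pulp.lpSum(x[t][k] for k in tracks) <= 1
for k in tracks:                             # makespan bounds every track's workload
    prob += pulp.lpSum(work_time[t] * x[t][k] for t in trains) <= makespan

throughput = pulp.lpSum(containers[t] * x[t][k] for t in trains for k in tracks)
prob += throughput - w * makespan            # weighted multi-criteria objective
prob.solve(pulp.PULP_CBC_CMD(msg=False))

print("makespan:", makespan.value())
for t in trains:
    print(t, "->", [k for k in tracks if x[t][k].value() == 1])
```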

Performance of pilot-assisted coded-OFDM-CDMA using low-density parity-check coding in Rayleigh fading channels (레일리 페이딩 채널에서 파일럿 기법과 LDPC 코딩이 적용된 COFDM-CDMA의 성능 분석)

  • 안영신; 최재호
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.5C / pp.532-538 / 2003
  • In this paper we investigate a novel approach that applies low-density parity-check (LDPC) coding to a COFDM-CDMA system operating over a multi-path fading mobile channel. Developed as a linear block channel code, the LDPC code is known for superior signal reception in AWGN and flat fading channels as the encoding rate increases, but its performance degrades when the channel becomes multi-path fading. For a typical multi-path fading mobile channel at an SNR of 16 dB or lower, achieving a BER below 10^-4 with LDPC codes at rates below 1:3 requires not only the inherent parity-check information but also pilot information to periodically refresh the front-end equalizer taps of the COFDM-CDMA receiver. For instance, whereas a conventional 1:3-rate LDPC-coded transmission symbol consists of data bits and parity-check bits in a 1:3 proportion, in the proposed method a transmission symbol of the same rate contains data bits, parity-check bits, and pilot bits in a 1:2:1 proportion. The included pilot bits are effective not only for channel estimation and equalization but also for symbol decoding, assisting the parity-check bits and hence improving the SNR-versus-BER performance over the conventional 1:3-rate LDPC code. The performance of the proposed system has been verified by computer simulation over multi-path Rayleigh fading channels, and the results show that the proposed method outperforms general LDPC channel coding in terms of SNR-versus-BER. (The frame composition is sketched below.)
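
A sketch of the frame composition contrasted above: for the same transmitted symbol length, the proposed scheme replaces part of the parity budget with known pilot bits (data : parity : pilot = 1 : 2 : 1) so the receiver can refresh its equalizer taps periodically. The bit values are illustrative, and the LDPC encoder is a placeholder, not a real parity computation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data = 64
data = rng.integers(0, 2, n_data)

def ldpc_parity(bits, n_parity):
    # Placeholder for the real LDPC encoder: returns n_parity check bits.
    return rng.integers(0, 2, n_parity)

# Conventional frame: data and parity bits only (1 : 3 proportion).
conventional = np.concatenate([data, ldpc_parity(data, 3 * n_data)])

# Proposed frame: shorter parity block plus a known pilot sequence (1 : 2 : 1).
pilot = np.ones(n_data, dtype=int)           # known to transmitter and receiver
proposed = np.concatenate([data, ldpc_parity(data, 2 * n_data), pilot])

assert conventional.size == proposed.size    # same transmitted symbol length
```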

Gradual Certification Correspond with Sensual Confidence by Network Paths (본인인증의 네트워크 경로와 감성신뢰도에 연동한 점진적 인증방법)

  • Suh, Hyo-Joong
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.12 / pp.955-963 / 2017
  • Fintech has become the key technology of mobile banking and payments, and the Korean financial market has shifted from traditional face-to-face processes to fintech-based non-face-to-face trading and payment. At the core of this transition are smartphones, which carry several sensors for personal identification, such as fingerprint and iris recognition sensors. These, however, bring security risks originating from data-path attacks, for instance hacking and pharming. Multi-level certification and security systems are applied to counter these threats effectively, but such protections can cause inconvenience in non-face-to-face certification and financing processes. In this paper, I confirm that users' perceived trust differs noticeably with the smartphone's data connection path, such as WiFi networks versus mobile communication networks, and I propose a gradual certification method that alleviates the inconvenience by defining risk levels for the data paths (sketched below).
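
A minimal sketch of the gradual-certification idea: map the current data path to a risk level and require correspondingly stronger authentication. The path names, risk levels, and factor sets below are illustrative assumptions, not the paper's actual definitions.

```python
RISK_BY_PATH = {
    "home_wifi":     1,   # user-controlled network, higher perceived trust
    "carrier_lte":   1,   # operator network
    "public_wifi":   3,   # shared network, pharming/sniffing risk
    "unknown_proxy": 4,
}

FACTORS_BY_RISK = {
    1: ["fingerprint"],
    2: ["fingerprint", "pin"],
    3: ["fingerprint", "pin", "sms_otp"],
    4: ["fingerprint", "pin", "sms_otp", "certificate"],
}

def required_factors(path: str) -> list[str]:
    risk = RISK_BY_PATH.get(path, 4)          # unknown paths treated as riskiest
    return FACTORS_BY_RISK[risk]

print(required_factors("public_wifi"))        # ['fingerprint', 'pin', 'sms_otp']
```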