Efficiently Processing Skyline Query on Multi-Instance Data

  • Chiu, Shu-I;Hsu, Kuo-Wei
    • Journal of Information Processing Systems / v.13 no.5 / pp.1277-1298 / 2017
  • Related to the maximum vector problem, a skyline query discovers the dominating tuples in a set of tuples, where each tuple describes an object (such as a hotel) in several dimensions (such as the price and the distance to the beach). A tuple, an instance of an object, dominates another tuple if it is equally good or better in all dimensions and better in at least one dimension. Traditionally, skyline queries are defined upon single-instance data, that is, upon objects each of which is associated with one instance. However, in some cases, an object is associated not with a single instance but with multiple instances. For example, on a review website, many users assign scores to a product or a service, and a user's score is an instance of the object representing the product or the service. Such data is an example of multi-instance data. Unlike most (if not all) prior work, which considers the traditional setting, we consider skyline queries defined upon multi-instance data. We define the dominance calculation and propose an algorithm to reduce its computational cost. We use synthetic and real data to evaluate the proposed methods, and the results demonstrate their utility.
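
The dominance rule stated in this abstract can be illustrated for the traditional single-instance setting with a minimal sketch. It is not the authors' multi-instance algorithm, which the abstract does not specify. The function names, the sample hotels, and the assumption that smaller values are better in every dimension are all illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumption: smaller is better in every dimension (e.g. price, distance).
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(tuples: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Naive O(n^2) skyline: keep every tuple that no other tuple dominates."""
    return [t for t in tuples if not any(dominates(o, t) for o in tuples)]


# Hypothetical hotels as (price, distance to the beach in km).
hotels = [(100, 2.0), (80, 3.0), (120, 1.0), (110, 2.5), (80, 3.5)]
result = skyline(hotels)
print(result)  # [(100, 2.0), (80, 3.0), (120, 1.0)]
```

Here (110, 2.5) is dominated by (100, 2.0), and (80, 3.5) is dominated by (80, 3.0), so neither appears in the skyline.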

Learning Multiple Instance Support Vector Machine through Positive Data Distribution (긍정 데이터 분포를 반영한 다중 인스턴스 지지 벡터 기계 학습)

  • Hwang, Joong-Won;Park, Seong-Bae;Lee, Sang-Jo
    • Journal of KIISE / v.42 no.2 / pp.227-234 / 2015
  • This paper proposes a modified MI-SVM algorithm by considering data distribution. The previous MI-SVM algorithm seeks the margin by considering the "most positive" instance in a positive bag. Positive instances included in positive bags are located in a similar area in a feature space. In order to reflect this characteristic of positive instances, the proposed method selects the "most positive" instance by calculating the distance between each instance in the bag and a pivot point that is the intersection point of all positive instances. This paper suggests two ways to select the "most positive" pivot point in the training data. First, the algorithm seeks the "most positive" pivot point along the current predicted parameter, and then selects the nearest instance in the bag as a representative from the pivot point. Second, the algorithm finds the "most positive" pivot point by using a Diverse Density framework. Our experiments on 12 benchmark multi-instance data sets show that the proposed method results in higher performance than the previous MI-SVM algorithm.
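
The selection step described in this abstract (pick the instance in a bag that lies nearest to a pivot point) can be sketched as follows. This is only an illustration: the abstract does not state the distance metric or how the pivot is computed, so Euclidean distance is assumed and the pivot is taken as given. The function name and sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Instance = Tuple[float, ...]


def select_representative(bag: Sequence[Instance], pivot: Instance) -> Instance:
    """Return the instance in the bag closest to the pivot point.

    Assumption: Euclidean distance; the paper may use a different measure.
    """
    return min(bag, key=lambda inst: math.dist(inst, pivot))


# Hypothetical positive bag with three 2-D instances and a given pivot.
bag = [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0)]
pivot = (1.2, 0.9)
representative = select_representative(bag, pivot)
print(representative)  # (1.0, 1.0)
```

In an MI-SVM style training loop, the selected representative would stand in for its bag when the margin is recomputed; that outer loop is omitted here.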

DATA MINING APPROACH TO PARAMETRIC COST ESTIMATE IN EARLY DESIGN STAGE AND ANALYTICAL CHARACTERIZATION ON OLAP (ON-LINE ANALYTICAL PROCESSING)

  • JaeHo Cho;HyunKyun Jung;JaeYoul Chun
    • International conference on construction engineering and project management / 2011.02a / pp.176-181 / 2011
  • The role of a cost modeler is to facilitate the design process through the systematic application of cost factors so as to maintain sensible and economic relationships between cost, quantity, utility and appearance. These relationships help to achieve the client's requirements within an agreed budget. The purpose of this study is to develop a parametric cost estimating model for the early design stage by using the multi-dimensional system of OLAP (On-line Analytical Processing), based on case quantity data related to architectural design features. Parametric cost estimating models have been adopted to support decision making in the early design stage. These models typically use a similar instance or a pattern from historical cases. To use this type of data model effectively, data classification and prediction methods must be defined. One such method is to find the similar class according to an attribute selection measure in the multi-dimensional data model. Therefore, this research analyzes the relevant attributes influenced by architectural design features, using the case-based quantity data that feed the parametric cost estimating model. The relevant attributes can be analyzed by Analytical Characterization, which helps determine which attributes to include in the OLAP multi-dimensional model.
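
An attribute selection measure of the kind this abstract mentions can be sketched with information gain. The abstract does not name the measure the authors use, so information gain is an assumption here, and the attribute names and case records below are hypothetical toy data, not data from the paper.

```python
import math
from collections import Counter
from typing import Dict, List


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows: List[Dict[str, str]], attribute: str, target: str) -> float:
    """Entropy reduction in `target` obtained by splitting the rows on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups: Dict[str, List[str]] = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical historical cases; "cost_class" is the class to be predicted.
cases = [
    {"structure": "RC", "use": "office", "cost_class": "high"},
    {"structure": "RC", "use": "housing", "cost_class": "high"},
    {"structure": "steel", "use": "office", "cost_class": "low"},
    {"structure": "steel", "use": "housing", "cost_class": "low"},
]
gains = {a: information_gain(cases, a, "cost_class") for a in ("structure", "use")}
print(gains)
```

In this toy data, "structure" separates the cost classes perfectly while "use" carries no information, so a relevance analysis of this kind would keep "structure" as a dimension and drop "use".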

Intelligent Intrusion Detection and Prevention System using Smart Multi-instance Multi-label Learning Protocol for Tactical Mobile Adhoc Networks

  • Roopa, M.;Raja, S. Selvakumar
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.6 / pp.2895-2921 / 2018
  • Security has become one of the major concerns in mobile ad hoc networks (MANETs). Data and voice communication among roaming battlefield entities (such as platoons of soldiers, inter-battlefield tanks and military aircraft) served by MANETs poses several challenges. A complex security strategy is required to address threats such as unauthorized network access, man-in-the-middle attacks and denial of service, and to provide highly reliable communication among the nodes. An Intrusion Detection and Prevention System (IDPS) is a crucial ingredient in addressing these threats. The IDPS in a MANET is managed by a Command, Control, Communication and Intelligence (C3I) system, which consists of networked computers in the tactical battle area that give commanders comprehensive situation awareness for timely and optimal decision-making. A key issue in such an IDPS mechanism is the lack of a smart learning engine. We propose a novel behavior-based "Smart Multi-Instance Multi-Label Intrusion Detection and Prevention System (MIML-IDPS)" that follows a distributed and centralized architecture to support a robust C3I system. This protocol is deployed in a virtually clustered non-uniform network topology with dynamic election of several virtual head nodes, each acting as a client intrusion detection agent connected to a centralized server IDPS located at the Command and Control Center. The distributed virtual client nodes serve as the intelligent decision processing units, and the centralized IDPS server acts as a smart MIML decision-making unit. Simulation and experimental analysis show that the proposed protocol exhibits computational intelligence with counterattacks, efficient memory utilization, classification accuracy and decision convergence in securing a C3I system in a tactical battlefield environment.

A Geometric Active Contour Model Using Multi Resolution Level Set Methods (다중 해상도 레벨 세트 방식을 이용한 기하 활성 모델)

  • Kim, Seong-Gon;Kim, Du-Yeong
    • The Transactions of the Korea Information Processing Society / v.6 no.10 / pp.2809-2815 / 1999
  • Level set and active contour (snake) models are extensively used for image segmentation and shape extraction in computer vision. Snakes rely on energy minimization, while the level set method is based on curve evolution, to extract contours from image data. Each of these two models has its own drawbacks. For instance, a snake performs poorly unless it is placed close to the desired shape boundary, and it has difficulty when the image contains multiple objects to be extracted. The level set method, in contrast, is free of the initial curve position problem and can handle the topology of multiple objects. Nevertheless, the level set method requires much more computation time than the snake model. In this paper, we combine the strengths of the two models and also apply a multi-resolution algorithm in order to speed up the process without degrading the performance of the shape extraction.

A Study on the Business Model of a Fan Community Platform 'Weverse'

  • Song, Minzheong
    • International journal of advanced smart convergence / v.10 no.4 / pp.172-182 / 2021
  • We look at the business model development of the fan community platform 'Weverse' from a two-sided platform (TSP) to a multi-sided platform (MSP) and investigate its platform business model. From the theoretical perspective of the Rocket Model, the results reveal that Weverse first focuses on inviting as many artists as possible, starting with BTS, and then naturally attracts the new artists' fans. To make this TSP succeed, it forms an MSP, 'Weverse Shop', to meet the relevant needs of both sides in a timely and filtered manner. In the third stage, connection, various partnerships are attempted as part of open platform strategies. For instance, by combining 'VLive' and Weverse, Naver's fan platform business is transferred to Weverse. For the core transaction, several co-branding activities are tried through direct and indirect monetization. Lastly, regarding optimization, the new Weverse, to be launched in the first half of 2022, is expected to create further synergies with Naver's R&D capabilities in data, AI, and other technologies such as the metaverse platform 'ZEPETO', which already sells clothing items of Weverse artists.

WAVE MODEL DEVELOPMENT IN MULTI-ION PLASMAS (다중 이온 플라즈마 파동모델 개발)

  • 송성희;이동훈;표유선
    • Journal of Astronomy and Space Sciences / v.16 no.1 / pp.41-52 / 1999
  • Near-Earth space is composed of plasmas that carry a number of plasma waves. Space plasmas consist of electrons and multiple ion species, which determine the local wave propagation characteristics. In multi-ion plasmas, it is generally difficult to obtain analytic solutions from the dispersion relation. In this work, we have developed a model that allows an arbitrary magnetic field and density as well as multi-ion plasmas. This model allows us to investigate how plasma waves behave when they propagate along realistic magnetic field lines, which are assumed to follow the IGRF (International Geomagnetic Reference Field). The results are found to be useful for the analysis of in situ observational data in space. For instance, if waves are assumed to propagate from the equatorial region into the polar region, our model shows quantitatively how the polarization is altered along the travel path.

The Expert Search System using keyword association based on Multi-Ontology (멀티 온톨로지 기반의 키워드 연관성을 이용한 전문가 검색 시스템)

  • Jung, Kye-Dong;Hwang, Chi-Gon;Choi, Young-Keun
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.1 / pp.183-190 / 2012
  • This study constructs an expert search system that provides a mutual cooperation function based on theses and author profiles. The proposed methodology is as follows. First, we propose a weighting method that can find a keyword and its most relevant keywords. Second, we propose a method that uses this weighting to search for experts efficiently. As a preliminary step, keywords and author profiles are extracted from the papers, and experts can then be searched with this method. The system is applicable to many fields of social networking. However, the required information is distributed across many systems, so we propose a method that uses a multi-ontology to integrate the distributed data. The multi-ontology is composed of a meta ontology, an instance ontology, a location ontology and an association ontology. The association ontology is constructed dynamically through analysis of keyword associations. An expert network is built from this multi-ontology, and experts can be found in this network by tracing keyword associations. The expert network also lets users check an expert's detailed area of expertise through the research list provided by the system.

Privacy Disclosure and Preservation in Learning with Multi-Relational Databases

  • Guo, Hongyu;Viktor, Herna L.;Paquet, Eric
    • Journal of Computing Science and Engineering / v.5 no.3 / pp.183-196 / 2011
  • There has recently been a surge of interest in relational database mining that aims to discover useful patterns across multiple interlinked database relations. It is crucial for a learning algorithm to explore the multiple inter-connected relations so that important attributes are not excluded when mining such relational repositories. However, from a data privacy perspective, it becomes difficult to identify all possible relationships between attributes from the different relations, considering a complex database schema. That is, seemingly harmless attributes may be linked to confidential information, leading to data leaks when building a model. Thus, we are at risk of disclosing unwanted knowledge when publishing the results of a data mining exercise. For instance, consider a financial database classification task to determine whether a loan is considered high risk. Suppose that we are aware that the database contains another confidential attribute, such as income level, that should not be divulged. One may thus choose to eliminate, or distort, the income level from the database to prevent potential privacy leakage. However, even after distortion, a learning model against the modified database may accurately determine the income level values. It follows that the database is still unsafe and may be compromised. This paper demonstrates this potential for privacy leakage in multi-relational classification and illustrates how such potential leaks may be detected. We propose a method to generate a ranked list of subschemas that maintains the predictive performance on the class attribute, while limiting the disclosure risk, and predictive accuracy, of confidential attributes. We illustrate and demonstrate the effectiveness of our method against a financial database and an insurance database.

Variational Auto-Encoder Based Semi-supervised Learning Scheme for Learner Classification in Intelligent Tutoring System (지능형 교육 시스템의 학습자 분류를 위한 Variational Auto-Encoder 기반 준지도학습 기법)

  • Jung, Seungwon;Son, Minjae;Hwang, Eenjun
    • Journal of Korea Multimedia Society / v.22 no.11 / pp.1251-1258 / 2019
  • An intelligent tutoring system enables users to learn effectively by utilizing various artificial intelligence techniques. For instance, it can recommend a proper curriculum or learning method to individual users based on their learning history. To do this effectively, users' characteristics need to be analyzed and classified based on various aspects such as interest, learning ability, and personality. Although data labeled with these characteristics are required for more accurate classification, it is not easy to acquire a sufficient amount of labeled data because of the labeling cost. Unlabeled data, on the other hand, require no labeling process, so a large amount of them can be collected and utilized. In this paper, we propose a semi-supervised learning method based on a feedback variational auto-encoder (FVAE), which uses both labeled and unlabeled data. FVAE is a variation of the variational auto-encoder (VAE) in which a multi-layer perceptron is added to give feedback. Using unlabeled data, we train the FVAE and take its encoder. We then extract features from the labeled data with the encoder and train classifiers on the extracted features. In the experiments, FVAE-based semi-supervised learning was superior to the VAE-based method in terms of accuracy and F1 score.