• Title/Summary/Keyword: large-scale problem

Confidence Value based Large Scale OWL Horst Ontology Reasoning (신뢰 값 기반의 대용량 OWL Horst 온톨로지 추론)

  • Lee, Wan-Gon;Park, Hyun-Kyu;Jagvaral, Batselem;Park, Young-Tack
    • Journal of KIISE / v.43 no.5 / pp.553-561 / 2016
  • Several machine learning techniques can automatically populate ontology data from web sources, and interest in large-scale ontology reasoning is increasing. However, data collected this way carries uncertainty, so reasoning over it can produce speculative results. The reliability of data obtained from the web therefore needs to be considered. Large-scale ontology reasoning methods based on trust values are required because inference that accounts for the reliability of quantitative ontologies remains insufficient. In this study, we propose a large-scale OWL Horst reasoning method based on confidence values using Spark, a distributed in-memory framework. We describe a method for integrating the confidence values of duplicated data, and a distributed parallel heuristic algorithm that addresses the resulting degradation in inference performance. To evaluate the confidence-based reasoning method, we conducted an experiment using LUBM3000. The results show that our approach performs reasoning twice as fast as existing reasoning systems such as WebPIE.

A Range Query Method using Index in Large-scale Database Systems (대규모 데이터베이스 시스템에서 인덱스를 이용한 범위 질의 방법)

  • Kim, Chi-Yeon
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.7 no.5 / pp.1095-1101 / 2012
  • As the amount of data increases explosively, large-scale database systems have emerged to store, retrieve, and manipulate it. Such environments raise several issues, including consistency, availability, and fault tolerance. In this paper, we address an efficient range-query method for large-scale database systems in which data management services are separated from transaction management services. A previous study used partitions to preserve the independence of the two modules and to resolve the phantom problem, but that method was efficient only when the range query was specified by a key. We therefore present a new method that improves efficiency when a range query is specified by a key attribute as well as by other attributes. The presented method guarantees the independence of the separated modules and reduces range-query overhead by using a partial index.

An Estimated Closeness Centrality Ranking Algorithm and Its Performance Analysis in Large-Scale Workflow-supported Social Networks

  • Kim, Jawon;Ahn, Hyun;Park, Minjae;Kim, Sangguen;Kim, Kwanghoon Pio
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.3 / pp.1454-1466 / 2016
  • This paper implements an estimated ranking algorithm for closeness centrality measures in large-scale workflow-supported social networks. Traditional ranking algorithms for large-scale networks suffer from a time complexity problem: as the network grows, the computation time increases dramatically. To solve the problem of calculating ranks of closeness centrality measures in a large-scale workflow-supported social network, this paper takes an estimation-driven ranking approach, in which the ranking algorithm calculates estimated closeness centrality measures by applying an approximation method and then picks out a candidate set of top k actors based on the ranks of their estimated measures. Finally, the exact ranking of the candidate set is obtained by the pure closeness centrality algorithm [1], which computes the exact closeness centrality measures. The ranking algorithm of this estimation-driven approach, developed specifically for workflow-supported social networks, is named RankCCWSSN (Rank Closeness Centrality Workflow-supported Social Network). Based on the algorithm, we conduct performance evaluations and compare the outcomes with the results from the pure algorithm. Additionally, we extend the algorithm so that it can be applied to weighted workflow-supported social networks represented by weighted matrices. We confirm that the time efficiency of the estimation-driven approach with our ranking algorithm is much higher (about a 50% improvement) than that of the traditional approach.
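
The two-stage idea in this abstract (estimate, shortlist, then rank the shortlist exactly) can be sketched as follows. This is a minimal illustration, not the RankCCWSSN algorithm itself: the abstract does not say which approximation method is used, so pivot sampling with breadth-first search is an assumption here, as are the function names, the `candidate_factor` parameter, and the restriction to an undirected, connected, unweighted graph.

```python
import random
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def exact_closeness(graph, node):
    """Closeness = (reachable nodes - 1) / sum of distances; 0.0 if isolated."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


def estimated_top_k(graph, k, num_pivots, candidate_factor=2, seed=0):
    """Shortlist by sampled distance sums, then rank the shortlist exactly.

    Assumes an undirected, connected graph (hypothetical simplification).
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    pivots = rng.sample(nodes, min(num_pivots, len(nodes)))

    # Stage 1: approximate each node's total distance using pivots only.
    dist_sum = {n: 0 for n in nodes}
    for pivot in pivots:
        for node, d in bfs_distances(graph, pivot).items():
            dist_sum[node] += d

    # A smaller estimated distance sum means a higher estimated closeness.
    candidates = sorted(nodes, key=lambda n: dist_sum[n])[:candidate_factor * k]

    # Stage 2: exact closeness for the candidate set only.
    exact = {n: exact_closeness(graph, n) for n in candidates}
    return sorted(candidates, key=lambda n: -exact[n])[:k]
```

The saving comes from running one full traversal per pivot plus one per candidate, rather than one per node in the network.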

Improved Service Restoration technique by Using Dijkstra Algorithm in Distribution Systems (다익스트라 알고리즘을 이용한 배전계통의 향상된 사고복구 기법)

  • Kim, Nark-Kyung;Kim, Jae-Chul;Jeon, Young-Jae;Kim, Hoon
    • The Transactions of the Korean Institute of Electrical Engineers A / v.50 no.2 / pp.67-75 / 2001
  • This paper presents a fast and effective methodology for service restoration in large-scale distribution systems. The service restoration problem is formulated as a constrained optimization problem and requires both fast computation and a high-quality solution, because as much of the unfaulted out-of-service area as possible should be restored as quickly as possible. The proposed methodology is designed to achieve fast computation and priority-based service restoration by using Dijkstra's algorithm and fuzzy theory in large-scale distribution systems. Simulation results demonstrate the validity and effectiveness of the proposed methodology on 26-bus and 140-bus systems.
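
For readers unfamiliar with the shortest-path step mentioned in this abstract, a minimal sketch of Dijkstra's algorithm is given below. The abstract does not describe how the network is modeled or how edge costs are derived from fuzzy theory, so the graph layout (a dict of neighbor-to-cost dicts) and the function name `shortest_path` are assumptions made for illustration only.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm: return (cost, path), or (inf, []) if unreachable.

    graph maps each node to a dict {neighbor: non-negative edge cost}.
    """
    best = {source: 0.0}
    parent = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            # Walk parent links back to the source to rebuild the path.
            path = [node]
            while path[-1] != source:
                path.append(parent[path[-1]])
            return cost, path[::-1]
        if cost > best[node]:
            continue  # stale heap entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return float("inf"), []
```

In a restoration setting, the source could stand for an energized feeder and the target for an out-of-service load, with edge costs encoding switching effort or loading margin; that mapping is a guess, not something the abstract states.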

Preprocessing in large scale linear programming problems (대형선형계획문제의 사전처리)

  • 성명기;박순달
    • Proceedings of the Korean Operations and Management Science Society Conference / 1996.10a / pp.285-288 / 1996
  • MPS, a format standardized by IBM, is generally the input format for large-scale linear programming problems, and such inputs may contain unnecessary variables or constraints. These can be discarded by preprocessing. Because preprocessing reduces the size of a problem, the running time is also reduced. Moreover, the infeasibility of a problem may be detected before a solution method is applied. When the preprocessing implemented in this paper is applied to NETLIB problems, it reduces the number of variables and constraints by 21% and 15%, respectively. Preprocessing yields an average 21% reduction in running time when the interior point method is applied. Preprocessing can detect 10 out of 30 infeasible NETLIB problems.
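
The kind of reductions this abstract refers to can be illustrated with a small presolve sketch. The abstract does not list which rules the paper implements, so the three rules below (dropping empty rows, turning singleton rows into bounds, and substituting fixed variables) are common textbook reductions chosen as an assumption. The function name `presolve` and the data layout (sparse rows as dicts with nonzero coefficients, constraints of the form sum <= rhs) are hypothetical.

```python
def presolve(rows, rhs, lower, upper):
    """Simplify  sum_j rows[i][j] * x_j <= rhs[i]  with  lower[j] <= x_j <= upper[j].

    rows holds sparse rows {variable: nonzero coefficient}.
    Returns (rows, rhs, lower, upper, fixed); raises ValueError if infeasible.
    """
    rows = [dict(r) for r in rows]
    rhs = list(rhs)
    lower, upper = dict(lower), dict(upper)
    fixed = {}
    changed = True
    while changed:
        changed = False
        kept_rows, kept_rhs = [], []
        for row, b in zip(rows, rhs):
            if not row:
                # Empty row reads 0 <= b: redundant if true, infeasible if not.
                if b < 0:
                    raise ValueError(f"infeasible empty row: 0 <= {b}")
                changed = True
            elif len(row) == 1:
                # Singleton row a * x_j <= b becomes a bound on x_j.
                (j, a), = row.items()
                if a > 0:
                    upper[j] = min(upper[j], b / a)
                else:
                    lower[j] = max(lower[j], b / a)
                changed = True
            else:
                kept_rows.append(row)
                kept_rhs.append(b)
        rows, rhs = kept_rows, kept_rhs

        for j in list(lower):
            if lower[j] > upper[j]:
                raise ValueError(f"infeasible bounds on {j}")
            if lower[j] == upper[j]:
                # Fixed variable: move its contribution to the right-hand side.
                value = lower[j]
                fixed[j] = value
                for i, row in enumerate(rows):
                    if j in row:
                        rhs[i] -= row.pop(j) * value
                del lower[j], upper[j]
                changed = True
    return rows, rhs, lower, upper, fixed
```

The loop repeats because one reduction can enable another: fixing a variable may shrink a row to a singleton, which in turn tightens a bound.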

Large Scale Cooperative Coevolution Differential Evolution (대규모 협동진화 차등진화)

  • Shin, Seong-Yoon;Tan, Xujie;Shin, Kwang-Seong;Lee, Hyun-Chang
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.665-666 / 2022
  • Differential evolution is an efficient algorithm for continuous optimization problems. However, applying differential evolution to solve large-scale optimization problems quickly degrades performance and exponentially increases runtime. To overcome this problem, a new cooperative coevolution differential evolution based on Spark (referred to as SparkDECC) is proposed. The divide-and-conquer strategy is used in SparkDECC.
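
The divide-and-conquer strategy named in this abstract can be sketched on a single machine as below. This is not SparkDECC: it uses no Spark, runs the subproblems one after another rather than in parallel, and the grouping scheme (contiguous blocks of dimensions), the DE/rand/1/bin variant, the search range, and all parameter values are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Simple separable test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def de_subproblem(f, context, dims, rng, pop_size=20, gens=30,
                  F=0.5, CR=0.9, lo=-5.0, hi=5.0):
    """DE/rand/1/bin over the coordinates in dims; others stay at context."""
    def evaluate(sub):
        trial = list(context)
        for d, v in zip(dims, sub):
            trial[d] = v
        return f(trial)

    pop = [[rng.uniform(lo, hi) for _ in dims] for _ in range(pop_size)]
    pop[0] = [context[d] for d in dims]  # keep the incumbent so we never get worse
    fit = [evaluate(p) for p in pop]
    for _ in range(gens):
        for i in range(pop_size):
            others = [p for k, p in enumerate(pop) if k != i]
            a, b, c = rng.sample(others, 3)
            j_rand = rng.randrange(len(dims))  # forces at least one mutated gene
            child = [
                min(hi, max(lo, a[j] + F * (b[j] - c[j])))
                if (rng.random() < CR or j == j_rand) else pop[i][j]
                for j in range(len(dims))
            ]
            child_fit = evaluate(child)
            if child_fit <= fit[i]:  # greedy one-to-one selection
                pop[i], fit[i] = child, child_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best]


def cooperative_de(f, n_dims, group_size, cycles=5, seed=0):
    """Split dimensions into groups and optimize each group in turn."""
    rng = random.Random(seed)
    context = [rng.uniform(-5.0, 5.0) for _ in range(n_dims)]
    groups = [list(range(s, min(s + group_size, n_dims)))
              for s in range(0, n_dims, group_size)]
    for _ in range(cycles):
        for dims in groups:
            best_sub = de_subproblem(f, context, dims, rng)
            for d, v in zip(dims, best_sub):
                context[d] = v
    return context, f(context)
```

A call such as `cooperative_de(sphere, 20, 5)` optimizes a 20-dimensional problem as four 5-dimensional subproblems. In a distributed setting the per-group searches are the natural unit to farm out to workers, which is presumably where Spark comes in.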

Stereo matching for large-scale high-resolution satellite images using new tiling technique

  • Hong, An Nguyen;Woo, Dong-Min
    • Journal of IKEEE / v.17 no.4 / pp.517-524 / 2013
  • Stereo matching has attracted the attention of researchers because it plays an important role in computer vision, remote sensing, and photogrammetry. Although most methods perform well on small images, experiments applying them to large-scale data sets under uncontrolled conditions are still lacking. In this paper, we present an empirical study on stereo matching for large-scale high-resolution satellite images. A new method is studied to solve the problem of huge image size and memory requirements when dealing with such images. By integrating the tiling technique with the well-known dynamic programming and coarse-to-fine pyramid scheme, and by using memory carefully, the suggested method can be applied to huge stereo satellite images. In an analysis of 350 points from an image of size 8192 x 8192, the disparity results attain acceptable accuracy, with an RMS error of 0.5459. Considering the trade-off between computational cost and accuracy, our method provides efficient stereo matching for huge satellite image files.
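
The tiling step can be pictured with a short sketch that splits a large image into overlapping boxes, each small enough to be matched within memory. The abstract does not describe the paper's new tiling technique, so the fixed square tile, the constant overlap margin, and the function names below are assumptions; the matching itself (dynamic programming, pyramid scheme) is not shown.

```python
def tile_starts(length, tile, step):
    """Start offsets along one axis; stops once a tile reaches the image edge."""
    starts, pos = [], 0
    while True:
        starts.append(pos)
        if pos + tile >= length:
            return starts
        pos += step


def tile_bounds(width, height, tile, overlap):
    """Return (x0, y0, x1, y1) boxes, at most tile x tile, covering the image.

    Neighboring boxes share `overlap` pixels so that matches near a tile
    border still have context on both sides (hypothetical design choice).
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    step = tile - overlap
    return [
        (x0, y0, min(x0 + tile, width), min(y0 + tile, height))
        for y0 in tile_starts(height, tile, step)
        for x0 in tile_starts(width, tile, step)
    ]
```

For the 8192 x 8192 image size mentioned above, `tile_bounds(8192, 8192, 2048, 64)` yields a 5 x 5 grid of boxes; the tile size and overlap in that call are example values, not figures from the paper.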

A study on high dimensional large-scale data visualization (고차원 대용량 자료의 시각화에 대한 고찰)

  • Lee, Eun-Kyung;Hwang, Nayoung;Lee, Yoondong
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1061-1075 / 2016
  • In this paper, we discuss various methods for visualizing high-dimensional large-scale data and review some issues associated with visualizing this type of data. High-dimensional data can be presented in a 2-dimensional space using a few selected important variables. We can visualize more variables through various aesthetic attributes in graphics, or use the projection pursuit method to find an interesting low-dimensional view. For large-scale data, we discuss jittering and alpha blending, which address the problem of overlapping points. We also review the R packages tabplot and scagnostics, as well as other R packages for interactive web applications with visualization.

COVID-19 recommender system based on an annotated multilingual corpus

  • Barros, Marcia;Ruas, Pedro;Sousa, Diana;Bangash, Ali Haider;Couto, Francisco M.
    • Genomics & Informatics / v.19 no.3 / pp.24.1-24.7 / 2021
  • Tracking the most recent advances in Coronavirus disease 2019 (COVID-19)-related research is essential, given the disease's novelty and its impact on society. However, with the publication pace speeding up, researchers and clinicians require automatic approaches to keep up with the incoming information regarding this disease. A solution to this problem requires the development of text mining pipelines, the efficiency of which strongly depends on the availability of curated corpora. However, there is a lack of COVID-19-related corpora, even more so when considering languages other than English. This project's main contribution was the annotation of a multilingual parallel corpus and the generation of a recommendation dataset (EN-PT and EN-ES) regarding relevant entities, their relations, and recommendation, providing this resource to the community to improve text mining research on COVID-19-related literature. This work was developed during the 7th Biomedical Linked Annotation Hackathon (BLAH7).

Consideration on Pre-Feasibility Studies for Large-scale Offshore Wind Farms Led by Local Governments, Focusing on the Case of Shinan-gun (지자체 주도 대규모 해상풍력단지 사전 타당성 조사에 대한 고찰, 신안군 사례 중심으로)

  • Min Cheol Park;Ji Hoon Park;Gi Yun Lee;Chang Min Lee;Gwang Hyeok Yu;Hee Woong Jang;Hyun Sig Park
    • New & Renewable Energy / v.20 no.2 / pp.65-70 / 2024
  • The major challenge in promoting large-scale offshore wind power projects is securing local acceptance. Several recent studies have emphasized the crucial role of local governments in addressing this problem. However, local governments have difficulty achieving clear results because they lack expertise and manpower in offshore wind power projects, which makes them passive in promoting these initiatives. In this context, we briefly introduce the case of Shinan-gun, which recently completed a successful pre-feasibility study on a large-scale offshore wind power complex led by the local government.