• Title/Abstract/Keywords: uncertain datasets

Search results: 5 (processing time: 0.019 s)

High Utility Itemset Mining over Uncertain Datasets Based on a Quantum Genetic Algorithm

  • Wang, Ju;Liu, Fuxian;Jin, Chunjie
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 8 / pp. 3606-3629 / 2018
  • Discovered high potential utility itemsets (HPUIs) have significant influence in a variety of areas, such as retail marketing, web click analysis, and biological gene analysis. In this paper, we propose an algorithm called HPUIM-QGA (mining high potential utility itemsets based on a quantum genetic algorithm) to mine HPUIs over uncertain datasets. The proposed algorithm handles the lack of a downward closure property by developing an upper bound of the potential utility (UBPU), which prunes unpromising itemsets at an early stage, and handles combinatorial explosion by introducing a quantum genetic algorithm (QGA), which finds optimal solutions quickly and requires only a few parameters. Furthermore, a pruning strategy is designed to avoid the meaningless and redundant itemsets generated during the evolution process of the QGA. To evaluate HPUIM-QGA, a substantial number of experiments are performed on real-life and synthetic datasets, covering runtime, memory usage, analysis of the discovered itemsets, and convergence. The results show that the proposed algorithm is reasonable and acceptable for mining meaningful HPUIs from uncertain datasets.
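The abstract does not define potential utility or the UBPU, so the following is only a minimal sketch under assumed definitions: an itemset's utility is the sum of quantity times unit profit over the transactions that contain it, its probability is the sum of those transactions' existence probabilities, and the upper bound is a transaction-weighted utility. The data, `UNIT_PROFIT`, and all function names are hypothetical and are not taken from the paper.

```python
# Hypothetical toy data (not from the paper): each transaction maps items to
# purchase quantities and carries an existence probability.
TRANSACTIONS = [
    ({"a": 2, "b": 1}, 0.9),
    ({"a": 1, "c": 3}, 0.6),
    ({"b": 2, "c": 1}, 0.8),
]
UNIT_PROFIT = {"a": 5, "b": 3, "c": 1}  # assumed per-unit profits


def containing(itemset):
    """Yield (items, probability) for transactions that contain every item."""
    for items, prob in TRANSACTIONS:
        if set(itemset).issubset(items):
            yield items, prob


def utility(itemset):
    """Assumed utility: quantity * unit profit of the itemset's own items."""
    return sum(
        items[i] * UNIT_PROFIT[i]
        for items, _ in containing(itemset)
        for i in itemset
    )


def probability(itemset):
    """Assumed probability measure: sum of existence probabilities."""
    return sum(prob for _, prob in containing(itemset))


def upper_bound(itemset):
    """Assumed bound: total utility of every transaction containing the itemset."""
    return sum(
        q * UNIT_PROFIT[i]
        for items, _ in containing(itemset)
        for i, q in items.items()
    )
```

On this toy data, `utility({"c"})` is 4 while `utility({"a", "c"})` is 8, so utility itself is not downward closed. The assumed bound is: `upper_bound({"c"})` is 15 and `upper_bound({"a", "c"})` is 8, which is the kind of property that allows early pruning.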

Detecting Uncertain Boundary Algorithm using Constrained Delaunay Triangulation

  • 조성환
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography (한국측량학회지) / Vol. 32, No. 2 / pp. 87-93 / 2014
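  • The set of polygons that make up cadastral parcels is the most fundamental dataset reflecting the real-world national territory. Cadastral parcels must therefore be topologically consistent: they must neither overlap one another nor leave gaps between them. For various reasons, however, overlaps and gaps between parcels do occur. In such cases the polygon boundaries do not coincide exactly with those of the neighboring polygons, which produces unintended overlap and gap areas. A polygon that contains at least one such uncertain edge, that is, a boundary that does not coincide exactly with its neighbor, is called an uncertain region. This paper proposes the TTA method for detecting such regions. TTA first extracts points and polylines from the polygon dataset and performs a constrained Delaunay triangulation. Next, each triangle is tagged with the number of dataset faces that it overlaps. Triangles whose tag is 0 (a gap) or greater than 1 (an overlap) are then extracted, and connected triangles are merged to find the regions with topological inconsistencies. In the experiments, the proposed algorithm was automated and applied to real-world cadastral data with intersecting boundaries.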
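The extraction and merging steps described in the abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes the constrained Delaunay triangulation and the tagging have already been done, that a tag of 0 marks a gap and a tag above 1 marks an overlap, and that "connected" means sharing an edge. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def find_inconsistent_regions(triangles, tags):
    """Group flagged triangles (tag 0 or tag > 1) that share an edge.

    triangles: list of (v0, v1, v2) vertex-index triples (assumed input).
    tags: number of dataset faces overlapping each triangle.
    Returns a sorted list of regions, each a sorted list of triangle indices.
    """
    flagged = [i for i, tag in enumerate(tags) if tag == 0 or tag > 1]
    parent = {i: i for i in flagged}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edge_owner = {}
    for i in flagged:
        a, b, c = triangles[i]
        for edge in (frozenset((a, b)), frozenset((b, c)), frozenset((a, c))):
            if edge in edge_owner:
                # A flagged neighbor shares this edge: merge the two groups.
                parent[find(i)] = find(edge_owner[edge])
            else:
                edge_owner[edge] = i

    regions = defaultdict(list)
    for i in flagged:
        regions[find(i)].append(i)
    return sorted(sorted(region) for region in regions.values())


# Toy mesh: triangles 0 and 1 share edge (1, 2); triangle 2 is consistent.
toy_triangles = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (5, 6, 7)]
toy_tags = [0, 2, 1, 0]
toy_regions = find_inconsistent_regions(toy_triangles, toy_tags)
```

Here `toy_regions` groups triangles 0 and 1 into one region and leaves triangle 3 as its own region. A real pipeline would likely keep gap and overlap regions apart, which this sketch does not do.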

Mining Highly Reliable Dense Subgraphs from Uncertain Graphs

  • LU, Yihong;HUANG, Ruizhi;HUANG, Decai
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 6 / pp. 2986-2999 / 2019
  • The uncertainty in an uncertain graph makes the traditional definitions and algorithms for mining dense subgraphs from certain graphs inapplicable. The subgraph obtained by maximizing the expected density of an uncertain graph always contains many low-probability edges, which gives it low reliability and a low expected edge density. Based on the concept of a $\beta$-subgraph, the concept of an optimal $\beta$-subgraph is proposed to overcome the low reliability of the densest subgraph. An efficient greedy algorithm is also developed to find the optimal $\beta$-subgraph. Simulation experiments on multiple datasets show that the average edge probability of the optimal $\beta$-subgraph improves by nearly 40%, and the expected edge density reaches 0.9 on average. The parameter $\beta$ is scalable and applicable to multiple scenarios.
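To make the quantities named in the abstract concrete, here is a minimal sketch under common definitions, which are assumptions here because the abstract does not state them: expected density is the sum of edge probabilities divided by the number of nodes, and expected edge density is that sum divided by the number of possible node pairs. The $\beta$-subgraph itself is not defined in the abstract and is not modeled. The graph and function names are hypothetical.

```python
def induced_edges(edges, nodes):
    """Edges (with probabilities) whose endpoints both lie in `nodes`."""
    return {e: p for e, p in edges.items() if e[0] in nodes and e[1] in nodes}


def expected_density(edges, nodes):
    """Assumed definition: expected number of edges per node."""
    return sum(induced_edges(edges, nodes).values()) / len(nodes)


def expected_edge_density(edges, nodes):
    """Assumed definition: expected edges over all possible node pairs."""
    n = len(nodes)
    return sum(induced_edges(edges, nodes).values()) / (n * (n - 1) / 2)


def average_edge_probability(edges, nodes):
    """Mean probability of the edges inside the induced subgraph."""
    sub = induced_edges(edges, nodes)
    return sum(sub.values()) / len(sub)


# Hypothetical uncertain graph: a reliable triangle plus one unlikely edge.
toy_edges = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.7, ("c", "d"): 0.1}
core = {"a", "b", "c"}
everything = {"a", "b", "c", "d"}
```

On this toy graph, adding node `d` through its low-probability edge lowers all three measures, which illustrates why low-probability edges reduce the reliability of a subgraph.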

Force-deformation relationship prediction of bridge piers through stacked LSTM network using fast and slow cyclic tests

  • Omid Yazdanpanah;Minwoo Chang;Minseok Park;Yunbyeong Chae
    • Structural Engineering and Mechanics / Vol. 85, No. 4 / pp. 469-484 / 2023
  • A deep recursive bidirectional CUDA Deep Neural Network Long Short-Term Memory (Bi-CuDNNLSTM) layer is employed in this paper to predict the entire force time histories and the corresponding hysteresis and backbone curves of reinforced concrete (RC) bridge piers from experimental fast and slow cyclic tests. The proposed stacked Bi-CuDNNLSTM layers take multiple uncertain input variables, including horizontal actuator displacements, vertical actuator axial loads, the effective height of the bridge pier, the moment of inertia, and mass. The functional application programming interface of the Keras Python library is used to develop a deep learning model that considers all of these input attributes. For a robust and reliable prediction, the dataset for both the fast and slow cyclic tests is split into three mutually exclusive subsets: training, validation, and testing (unseen). The full dataset comprises 17 experimentally tested RC bridge piers, ten under fast and seven under slow cyclic tests. The results show that the mean absolute error, used as the loss function, decreases monotonically to zero for both the training and validation datasets after 5000 epochs, and that the correlation between the predicted and experimentally measured force time histories exceeds 90% for all datasets. The maximum mean of the normalized error for unseen data, obtained from box-and-whisker plots and a Gaussian distribution of the normalized error, is about 10% and 3% for the fast and slow cyclic tests, respectively. In summary, the stacked Bi-CuDNNLSTM layer implemented in this study can reduce the time and experimental cost of conducting new fast and slow cyclic tests and provides fast and accurate insight into the hysteretic behavior of bridge piers.
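The three-way split mentioned in the abstract can be illustrated with a small sketch. The abstract gives neither the split ratios nor the level at which the data are split, so this example assumes a specimen-level split with hypothetical 20% validation and 20% test fractions. The specimen identifiers and the function name are made up for illustration.

```python
import random


def split_specimens(ids, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Split specimen IDs into disjoint train, validation, and test lists.

    The fractions and the seed are illustrative assumptions, not values
    reported in the paper.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    n_val = max(1, round(len(shuffled) * val_fraction))
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical IDs matching the counts in the abstract: 10 fast, 7 slow.
fast_ids = [f"fast_{i}" for i in range(10)]
slow_ids = [f"slow_{i}" for i in range(7)]
fast_train, fast_val, fast_test = split_specimens(fast_ids)
slow_train, slow_val, slow_test = split_specimens(slow_ids)
```

Splitting the fast and slow groups separately keeps both test types represented in each subset, and slicing one shuffled list guarantees that the three subsets cannot share a specimen.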

Overcoming taxonomic challenges in DNA barcoding for improvement of identification and preservation of clariid catfish species

  • Piangjai Chalermwong;Thitipong Panthum;Pish Wattanadilokcahtkun;Nattakan Ariyaraphong;Thanyapat Thong;Phanitada Srikampa;Worapong Singchat;Syed Farhan Ahmad;Kantika Noito;Ryan Rasoarahona;Artem Lisachov;Hina Ali;Ekaphan Kraichak;Narongrit Muangmai;Satid Chatchaiphan;Kednapat Sriphairoj;Sittichai Hatachote;Aingorn Chaiyes;Chatchawan Jantasuriyarat;Visarut Chailertlit;Warong Suksavate;Jumaporn Sonongbua;Witsanu Srimai;Sunchai Payungporn;Kyudong Han;Agostinho Antunes;Prapansak Srisapoome;Akihiko Koga;Prateep Duengkae;Yoichi Matsuda;Uthairat Na-Nakorn;Kornsorn Srikulnath
    • Genomics & Informatics / Vol. 21, No. 3 / pp. 39.1-39.15 / 2023
  • DNA barcoding without assessment of reliability and validity causes taxonomic errors in species identification, which disrupt conservation and the aquaculture industry. Although DNA barcoding facilitates molecular identification and phylogenetic analysis of species, its availability in the clariid catfish lineage remains uncertain. In this study, DNA barcoding was developed and validated for clariid catfish. A total of 2,970 barcode sequences from the mitochondrial cytochrome c oxidase I (COI) and cytochrome b (Cytb) genes and D-loop sequences were analyzed for 37 clariid catfish species. The highest intraspecific nearest neighbor distances were 85.47%, 98.03%, and 89.10% for COI, Cytb, and D-loop sequences, respectively. This suggests that the Cytb gene is the most appropriate for identifying clariid catfish and can serve as a standard region for DNA barcoding. A positive barcoding gap between interspecific and intraspecific sequence divergence was observed in the Cytb dataset but not in the COI and D-loop datasets. Intraspecific variation was typically less than 4.4%, whereas interspecific variation was generally more than 66.9%. However, a species complex was detected in walking catfish, and significant intraspecific sequence divergence was observed in North African catfish. These findings suggest the need to focus on developing a DNA barcoding system that classifies clariid catfish properly and to validate its efficacy for a wider range of clariid catfish. With an enriched database of multiple sequences from a target species and its genus, species identification can be more accurate and biodiversity assessment of the species can be facilitated.
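The idea of a barcoding gap can be shown with a short sketch. It assumes pre-aligned sequences of equal length and uses the simple p-distance (fraction of differing sites), since the abstract does not say which distance model the authors used. The gap is taken as the smallest interspecific distance minus the largest intraspecific distance. The sequences and names below are invented for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def barcoding_gap(records):
    """Minimum interspecific distance minus maximum intraspecific distance.

    records: list of (species_label, aligned_sequence) pairs.
    A positive value means the two distance ranges do not overlap.
    """
    intra, inter = [], []
    for (sp_a, seq_a), (sp_b, seq_b) in combinations(records, 2):
        target = intra if sp_a == sp_b else inter
        target.append(p_distance(seq_a, seq_b))
    return min(inter) - max(intra)


# Invented toy alignment: two species, two sequences each.
toy_records = [
    ("species_A", "ACGTACGT"),
    ("species_A", "ACGTACGA"),
    ("species_B", "TTGTACCA"),
    ("species_B", "TTGTACCT"),
]
toy_gap = barcoding_gap(toy_records)
```

For the toy alignment the largest within-species distance is 0.125 and the smallest between-species distance is 0.375, so `toy_gap` is 0.25, a positive gap.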