• Title/Summary/Keyword: data partition


Camera Model Identification Based on Deep Learning (딥러닝 기반 카메라 모델 판별)

  • Lee, Soo Hyeon;Kim, Dong Hyun;Lee, Hae-Yeoun
    • KIPS Transactions on Software and Data Engineering / v.8 no.10 / pp.411-420 / 2019
  • Camera model identification has long been studied in digital forensics. As crimes grow more sophisticated, offenses such as illegal filming account for a large share of cases because ever-smaller cameras make them hard to detect. Technology that can identify which camera captured a particular image could therefore serve as evidence when a suspect denies the offense. This paper proposes a deep learning model that identifies the camera model used to acquire an image. The proposed model consists of four convolutional layers and two fully connected layers, and a high-pass filter is applied for data pre-processing. To verify performance, the Dresden Image Database was used, and the dataset was generated with the sequential partition method. The proposed model is compared with existing studies that use a 3-layer model or a model with GLCM. It achieves 98% accuracy, which is similar to that of the latest technology.

Single-step genomic evaluation for growth traits in a Mexican Braunvieh cattle population

  • Jonathan Emanuel Valerio-Hernandez;Agustin Ruiz-Flores;Mohammad Ali Nilforooshan;Paulino Perez-Rodriguez
    • Animal Bioscience / v.36 no.7 / pp.1003-1009 / 2023
  • Objective: The objective was to compare (pedigree-based) best linear unbiased prediction (BLUP), genomic BLUP (GBLUP), and single-step GBLUP (ssGBLUP) methods for genomic evaluation of growth traits in a Mexican Braunvieh cattle population. Methods: Birth (BW), weaning (WW), and yearling weight (YW) data of a Mexican Braunvieh cattle population were analyzed with the BLUP, GBLUP, and ssGBLUP methods. These methods differ in the additive genetic relationship matrix included in the model and in the animals under evaluation. The predictive ability of the model was evaluated using random partitions of the data into training and testing sets, with about 20% of the genotyped animals predicted in every partition. For each partition, the Pearson correlation coefficient was computed between the phenotypes adjusted for fixed effects and non-genetic random effects and the estimated breeding values (EBV). Results: The random contemporary group (CG) effect explained about 50%, 45%, and 35% of the phenotypic variance in BW, WW, and YW, respectively. For the three methods, the CG effect explained the highest proportion of the phenotypic variances (except for YW-GBLUP). For BW, the lowest heritability estimate was obtained with GBLUP and the highest with BLUP. For WW, the highest heritability estimate was obtained with BLUP; the estimates obtained with GBLUP and ssGBLUP were similar. For YW, the heritability estimates obtained with GBLUP and BLUP were similar, and the lowest heritability was obtained with ssGBLUP. Pearson correlation coefficients between adjusted phenotypes for non-genetic effects and EBVs were the highest for BLUP, followed by ssGBLUP and GBLUP. Conclusion: The successful implementation of genetic evaluations that include genotyped and non-genotyped animals in our study indicates a promising method for use in genetic improvement programs of Braunvieh cattle. Our findings showed that simultaneous evaluation of genotyped and non-genotyped animals improved prediction accuracy for growth traits even with a limited number of genotyped animals.
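
The validation scheme in this abstract (hold out about 20% of animals at random, then correlate adjusted phenotypes with EBVs) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the seed handling, and the toy numbers are all assumptions, and the EBVs would in practice come from a BLUP, GBLUP, or ssGBLUP model fitted on the training set only.

```python
import math
import random

def random_partition(animal_ids, test_fraction=0.2, seed=0):
    """Shuffle the IDs and hold out `test_fraction` of them for testing."""
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, testing)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy values invented for illustration; real EBVs would be predicted
# for the testing animals by a model that never saw their phenotypes.
adjusted_phenotype = {i: 30.0 + 0.5 * i for i in range(10)}
ebv = {i: 0.2 * i - 1.0 for i in range(10)}

train_ids, test_ids = random_partition(adjusted_phenotype.keys())
r = pearson([adjusted_phenotype[i] for i in test_ids],
            [ebv[i] for i in test_ids])
```

Repeating the partition with different seeds and averaging `r` gives one predictive-ability figure per method.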

The application of fuzzy spatial overlay method to the site selection using GSIS (GSIS를 이용한 입지선정에 있어 퍼지공간중첩기법의 적용에 관한 연구)

  • 임승현;조기성
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.17 no.2 / pp.177-187 / 1999
  • To date, many GSIS applications have used vector-based spatial overlay or grid-based spatial algebra to extract and analyze spatial data. Because these methods rest on traditional crisp sets, they partition spatial data with sharp boundaries, which does not agree with how data are actually distributed in the real world. As a result, they carry the error of restricting a region or object to a single attribute (one-entity-one-value). To improve on crisp-set methods, this study applies fuzzy sets, which are well suited to expressing the vagueness or ambiguity of spatial boundaries, to the spatial overlay process. Two methods are given: the first is a fuzzy interval partition by fuzzy subsets for spatially continuous data, and the second is a fuzzy boundary set applied to categorical data. In a case study producing a land suitability map for selecting a new town development site, we compared the results of Boolean analysis and fuzzy spatial overlay. We found that the suitability map from the fuzzy spatial overlay method provides more reasonable information about the development site and is better suited for presentation.
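
The contrast between a crisp (Boolean) criterion and a fuzzy membership function can be illustrated with a small sketch. The slope thresholds, the linear membership shape, and the use of `min` as the fuzzy intersection are assumptions chosen for illustration; the abstract does not state which membership functions or operators the authors used.

```python
def crisp_suitable(slope_pct, limit=15.0):
    """Boolean criterion: a cell is either suitable (1.0) or not (0.0)."""
    return 1.0 if slope_pct <= limit else 0.0

def fuzzy_suitable(slope_pct, full=10.0, zero=20.0):
    """Linear membership: 1.0 up to `full`, falling to 0.0 at `zero`."""
    if slope_pct <= full:
        return 1.0
    if slope_pct >= zero:
        return 0.0
    return (zero - slope_pct) / (zero - full)

def fuzzy_overlay(*memberships):
    """Fuzzy intersection (AND) of several layers, taken as the minimum."""
    return min(memberships)

# A cell just over the crisp limit is rejected outright by the Boolean
# rule but keeps a graded suitability under the fuzzy rule.
boolean_score = crisp_suitable(15.1)
graded_score = fuzzy_overlay(fuzzy_suitable(15.1), 0.8)  # 0.8: a second, hypothetical layer
```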


Influence of Self-driving Data Set Partition on Detection Performance Using YOLOv4 Network (YOLOv4 네트워크를 이용한 자동운전 데이터 분할이 검출성능에 미치는 영향)

  • Wang, Xufei;Chen, Le;Li, Qiutan;Son, Jinku;Ding, Xilong;Song, Jeongyoung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.6 / pp.157-165 / 2020
  • With the development of neural networks and self-driving data sets, dividing the data set is one way to improve a network model's performance in detecting moving objects. Within the Darknet framework, the YOLOv4 (You Only Look Once v4) network model was trained and tested on the Udacity data set. The Udacity data set was divided into training, validation, and test sets according to 7 different proportions. The K-means++ algorithm was used to cluster the dimensions of the object boxes in each of the 7 groups. By adjusting the hyperparameters of the YOLOv4 network during training, optimal model parameters were obtained for each of the 7 groups. These model parameters were then used to run detection on the 7 test sets and compare the results. The experimental results showed that YOLOv4 can effectively detect the large, medium, and small moving objects represented by Truck, Car, and Pedestrian in the Udacity data set. When the ratio of training, validation, and test sets is 7:1.5:1.5, the optimal model parameters of YOLOv4 give the highest detection performance, with mAP50 reaching 80.89%, mAP75 reaching 47.08%, and detection speed reaching 10.56 FPS.
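
A split at the 7:1.5:1.5 ratio reported above might look like the sketch below. The abstract does not say whether the split was random or how rounding was handled, so the shuffling, the seed, and the function name are assumptions.

```python
import random

def split_dataset(items, ratios=(7, 1.5, 1.5), seed=0):
    """Shuffle `items` and split them into train/validation/test subsets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    total = sum(ratios)
    n_train = int(len(shuffled) * ratios[0] / total)
    n_val = int(len(shuffled) * ratios[1] / total)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, so no item is dropped
    return train, val, test

# Hypothetical image identifiers standing in for the Udacity frames.
train, val, test = split_dataset(range(1000))
```

Passing a different `ratios` tuple reproduces the other proportions that such a comparison would need.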

An Energy Efficient Unequal Clustering Algorithm for Wireless Sensor Networks (무선 센서 네트워크에서의 에너지 효율적인 불균형 클러스터링 알고리즘)

  • Lee, Sung-Ju;Kim, Sung-Chun
    • The KIPS Transactions:PartC / v.16C no.6 / pp.783-790 / 2009
  • The necessity of wireless sensor networks has been increasing in recent years, and much research has been conducted on them. Clustering algorithms provide an effective way to prolong the lifetime of wireless sensor networks. The one-hop routing of the LEACH algorithm is inefficient in terms of cluster-head energy consumption because it transmits data to the BS (Base Station) in a single hop. Other clustering algorithms instead transmit data to the BS over multiple hops, which is more effective, but multi-hop routing suffers from a data bottleneck problem. Unequal clustering algorithms address the bottleneck by increasing the routing paths. Most unequal clustering algorithms partition the nodes into clusters of unequal size, with clusters closer to the BS being smaller than those farther away. However, the energy consumption of cluster-heads in unequal clustering algorithms is higher than in other clustering algorithms. In this thesis, I propose an energy-efficient unequal clustering algorithm that decreases cluster-head energy consumption and solves the data bottleneck problem. The basic idea has three parts. First, an appropriate cluster-head is elected. Next, the cluster size is decided by considering the distance from the BS, the energy state of the node, and the number of neighboring nodes. Finally, an assistant node is elected to take over the transmit function from the cluster-head. As a result, the energy consumption of the cluster-head and of the total network is minimized.

A System of Audio Data Analysis and Masking Personal Information Using Audio Partitioning and Artificial Intelligence API (오디오 데이터 내 개인 신상 정보 검출과 마스킹을 위한 인공지능 API의 활용 및 음성 분할 방법의 연구)

  • Kim, TaeYoung;Hong, Ji Won;Kim, Do Hee;Kim, Hyung-Jong
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.5 / pp.895-907 / 2020
  • With the recent growth in the influence of multimedia content beyond text-based content, services that help process the information in such content bring great convenience. Representative features of these services are searching for and masking sensitive data. Solutions that provide searching and masking for text and images are not hard to find. However, although the need to search and mask parts of audio data is recognized, such solutions are hard to find because the technology is difficult. In this study, we propose a web application that provides searching and masking functions for audio data using an audio partitioning method. In pursuing this goal, we evaluated several speech-to-text conversion APIs to choose a suitable one and developed regular expressions for finding sensitive information. Lastly, we evaluated the accuracy of the developed searching and masking feature. The contribution of this work is the design and implementation of searching and masking of sensitive information in audio data, validated through experiments on its various functions.
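
The regular-expression step described above can be sketched as follows, assuming the audio has already been partitioned and transcribed. The segment format `(start_sec, end_sec, text)`, the phone-number pattern, and both function names are hypothetical; the abstract does not give the authors' actual expressions or the output format of the speech-to-text API they chose.

```python
import re

# Illustrative pattern for hyphenated phone numbers such as 010-1234-5678.
PHONE = re.compile(r"\b\d{3}-\d{3,4}-\d{4}\b")

def segments_to_mask(segments):
    """Return the (start, end) times of segments whose text matches PHONE."""
    return [(start, end) for start, end, text in segments if PHONE.search(text)]

def mask_text(text):
    """Replace every matched number in a transcript with a placeholder."""
    return PHONE.sub("[MASKED]", text)

# Hypothetical transcript of two audio partitions.
segments = [
    (0.0, 2.5, "hello my number is"),
    (2.5, 5.0, "010-1234-5678 thanks"),
]
intervals = segments_to_mask(segments)
```

The returned intervals are the stretches of audio that a masking step would then silence or overwrite.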

Study on the Chemical Management - 1. Chemical Characteristics and Occupational Exposure Limits under Occupational Safety and Health Act of Korea (화학물질 관리 연구-1. 산업안전보건법상 관리 화학물질의 특성과 노출기준 비교)

  • Park, Jihoon;Ham, Seunghon;Kim, Sunju;Lee, Kwonseob;Ha, Kwonchul;Park, Donguk;Yoon, Chungsik
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.25 no.1 / pp.45-57 / 2015
  • Objectives: This study aims to compare the physicochemical characteristics and toxicological data of chemicals with their Occupational Exposure Limits (OELs) under the Occupational Safety and Health Act (OSHA) regulated by the Ministry of Employment and Labor of Korea. Methods: Information on the physicochemical characteristics and toxicological data of chemicals that have OELs was collected in 2014 using Material Safety Data Sheets (MSDS) from the Korea Occupational Safety and Health Agency (KOSHA) and the Korea Information System for Chemical Safety Management (KISChem). Statistical analyses including correlation and simple regression were performed to compare the OELs with chemical characteristics including molecular weight, boiling point, odor threshold, vapor pressure, vapor density, solubility, and octanol-water partition coefficient (OWPC), and with toxicological data such as median lethal dose ($LD_{50}$) and median lethal concentration ($LC_{50}$). Results: A total of 656 chemicals have OELs under OSHA in Korea. The numbers of chemicals that have an eight-hour time weighted average (TWA) and a short term exposure limit (STEL) are 618 and 190, respectively. Among the physicochemical characteristics, TWA was significantly correlated with boiling point, and STEL was correlated only with vapor pressure. Solubility and OWPC did not differ significantly between "skin" and "no skin" substances, a designation that indicates skin penetration. Both $LD_{50}$ and $LC_{50}$ were correlated with TWA, while $LC_{50}$ was not correlated with STEL. As health indicators, the health rating and the Emergency Response Planning Guidelines (ERPG) rating, as recommended by the National Fire Protection Association (NFPA) and the American Industrial Hygiene Association (AIHA), were associated with OELs and reflect chemical hazards. Conclusions: We found relationships between OELs and chemical information including physicochemical characteristics and toxicological data. The study is important for understanding the present regulatory OELs.

A Load Balancing Method using Partition Tuning for Pipelined Multi-way Hash Join (다중 해시 조인의 파이프라인 처리에서 분할 조율을 통한 부하 균형 유지 방법)

  • Mun, Jin-Gyu;Jin, Seong-Il;Jo, Seong-Hyeon
    • Journal of KIISE:Databases / v.29 no.3 / pp.180-192 / 2002
  • We investigate the effect of data skew in join attributes on the performance of a pipelined multi-way hash join method, and propose two new hash join methods for the shared-nothing multiprocessor environment. The first proposed method allocates buckets statically in round-robin fashion, and the second allocates buckets dynamically via a frequency distribution. Using hash-based joins, multiple joins can be pipelined so that the early results from a join, before the whole join is completed, are sent to the next join without being staged on disk. Shared-nothing multiprocessor architecture is known to be more scalable in supporting very large databases. However, this hardware structure is very sensitive to data skew. Unless the pipelined execution of multiple hash joins includes some dynamic load balancing mechanism, the skew effect can severely deteriorate system performance. In this paper, we derive an execution model of the pipeline segment and a cost model, and develop a simulator for the study. Our simulation over a wide range of parameters shows that join selectivities and relation sizes degrade system performance more as the degree of data skew increases. However, the proposed method, using a large number of buckets and a tuning technique, offers substantial robustness against a wide range of skew conditions.
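
The difference between the two bucket allocation strategies can be illustrated with a toy load calculation. The round-robin function follows the description directly. The frequency-based function is only a stand-in: it uses a greedy largest-first heuristic, which is an assumption and not necessarily the tuning technique the paper proposes. Bucket sizes are invented.

```python
import heapq

def allocate_round_robin(bucket_sizes, n_procs):
    """Static allocation: bucket b goes to processor b mod n_procs."""
    loads = [0] * n_procs
    for b, size in enumerate(bucket_sizes):
        loads[b % n_procs] += size
    return loads

def allocate_by_frequency(bucket_sizes, n_procs):
    """Greedy allocation: largest bucket first, to the least-loaded processor."""
    heap = [(0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    loads = [0] * n_procs
    for size in sorted(bucket_sizes, reverse=True):
        load, p = heapq.heappop(heap)
        loads[p] = load + size
        heapq.heappush(heap, (loads[p], p))
    return loads

# Skewed toy bucket sizes (tuples per bucket), invented for illustration.
buckets = [90, 10, 80, 10, 70, 10, 10, 20]
static_loads = allocate_round_robin(buckets, 2)
tuned_loads = allocate_by_frequency(buckets, 2)
```

With skewed sizes like these, round-robin can pile the heavy buckets onto one processor, while an allocation informed by bucket frequencies evens out the load; using many small buckets gives the allocator more room to balance.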

Spherical Pyramid-Technique : An Efficient Indexing Technique for Similarity Search in High-Dimensional Data (구형 피라미드 기법 : 고차원 데이터의 유사성 검색을 위한 효율적인 색인 기법)

  • Lee, Dong-Ho;Jeong, Jin-Wan;Kim, Hyeong-Ju
    • Journal of KIISE:Software and Applications / v.26 no.11 / pp.1270-1281 / 1999
  • The Pyramid-Technique [1] was proposed as a new indexing method for high-dimensional data spaces using a special partitioning strategy that divides d-dimensional space into 2d pyramids. It is efficient for hypercube range queries, but not for the hypersphere range queries that are frequently used in similarity search. In this paper, we propose the Spherical Pyramid-Technique, an efficient indexing method for similarity search in high-dimensional space. The Spherical Pyramid-Technique is based on a special partitioning strategy that first divides the d-dimensional data space into 2d spherical pyramids and then cuts each spherical pyramid into several spherical slices. Like the Pyramid-Technique, this partition transforms d-dimensional space into 1-dimensional space, so a B+-tree can be used to manage the transformed 1-dimensional data. We also propose an algorithm for processing hypersphere range queries on the space partitioned by this strategy. Finally, through various experiments using synthetic and real data, we show that the Spherical Pyramid-Technique clearly outperforms the Pyramid-Technique in processing hypersphere range queries.

Mitochondrial DNA Sequence Variation of the Tiny Dragonfly, Nannophya pygmaea(Odonata: Libellulidae)

  • Kim, Ki-Gyoung;Jang, Sang-Kyun;Park, Dong-Woo;Hong, Mee-Yeon;Oh, Kyoung-Hee;Kim, Kee-Young;Hwang, Jae-Sam;Han, Yeon-Soo;Kim, Ik-Soo
    • International Journal of Industrial Entomology and Biomaterials / v.15 no.1 / pp.47-58 / 2007
  • The tiny dragonfly, Nannophya pygmaea (Odonata: Libellulidae), is one of the smallest dragonflies in the world and is listed as a second-degree endangered wild animal and plant in Korea. For the long-term conservation of such endangered species, a nation-wide investigation of the magnitude and nature of genetic diversity is required as part of a conservation strategy. We therefore sequenced a portion of the mitochondrial COI gene corresponding to the "DNA Barcode" region (658 bp) from 68 N. pygmaea individuals collected from six habitats in Korea. The sequence data were used to investigate genetic diversity within populations and species, geographic variation within species, phylogeographic relationships among populations, and phylogenetic relationships among haplotypes. Phylogenetic analysis and uncorrected pairwise distance estimates showed overall low genetic diversity within the species. Regionally, populations in southern localities such as Gangjin and Gokseong in Jeollanamdo Province showed somewhat higher genetic diversity estimates than those of the remaining regions of the Korean peninsula. Although the geographic populations of N. pygmaea were subdivided into two groups, no distance- or region-based geographic partition was observed.
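
The uncorrected pairwise distance mentioned above is the proportion of aligned sites at which two sequences differ. A minimal sketch, assuming equal-length, gap-free aligned sequences; the short sequences here are invented and are not N. pygmaea data.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected pairwise distance: fraction of differing aligned sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)

# Two toy 8-bp haplotypes differing at one site.
distance = p_distance("ACGTACGT", "ACGTACGA")
```

For a 658-bp barcode region, a single substitution between two haplotypes corresponds to a distance of 1/658, or about 0.15%.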