• Title/Abstract/Keywords: De-identification Algorithms

Search results: 13 items (processing time: 0.031 s)

A non-destructive method for elliptical cracks identification in shafts based on wave propagation signals and genetic algorithms

  • Munoz-Abella, Belen;Rubio, Lourdes;Rubio, Patricia
    • Smart Structures and Systems / Vol. 10, No. 1 / pp.47-65 / 2012
  • The presence of crack-like defects in mechanical and structural elements produces failures during their service life that in some cases can be catastrophic. Early detection of fatigue cracks is therefore particularly important because they grow rapidly, with a propagation velocity that increases exponentially, and may lead to long out-of-service periods, heavy damage to machines and severe economic consequences. In this work, a non-destructive method for the detection and identification of elliptical cracks in shafts based on stress wave propagation is proposed. The propagation of a stress wave in a cracked shaft has been numerically analyzed, and the numerical results have been used to detect and identify the crack through the genetic algorithm optimization method. The results obtained in this work allow the development of an on-line method for damage detection and identification in cracked shaft-like components using an easy and portable dynamic testing device.
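
The identification step described in this abstract (searching for crack parameters whose simulated response best matches a measured one) can be illustrated with a minimal genetic algorithm sketch. Everything here is an assumption made for illustration: `simulated_response` is a hypothetical stand-in for the authors' numerical wave-propagation model, the two parameters `depth` and `position` are normalized to [0, 1], and the operators (tournament selection, blend crossover, Gaussian mutation, elitism) are generic choices, not the ones reported in the paper.

```python
import math
import random

def simulated_response(depth, position, n=32):
    """Hypothetical forward model: NOT the paper's wave-propagation simulation."""
    return [depth * math.sin(2 * math.pi * (1 + 4 * position) * i / n) for i in range(n)]

def identify_crack(measured, pop_size=40, generations=60, seed=1):
    """Minimize the squared misfit between simulated and measured signals."""
    rng = random.Random(seed)

    def misfit(individual):
        sim = simulated_response(*individual)
        return sum((s - m) ** 2 for s, m in zip(sim, measured))

    population = [(rng.random(), rng.random()) for _ in range(pop_size)]
    history = []  # best misfit per generation
    for _ in range(generations):
        population.sort(key=misfit)
        history.append(misfit(population[0]))
        children = [population[0]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            # Tournament selection of two parents.
            a = min(rng.sample(population, 3), key=misfit)
            b = min(rng.sample(population, 3), key=misfit)
            w = rng.random()
            # Blend crossover plus Gaussian mutation, clipped to [0, 1].
            child = tuple(
                min(1.0, max(0.0, w * x + (1 - w) * y + rng.gauss(0.0, 0.05)))
                for x, y in zip(a, b)
            )
            children.append(child)
        population = children
    best = min(population, key=misfit)
    history.append(misfit(best))
    return best, history

# Synthetic "measurement" generated from known parameters.
measured = simulated_response(0.3, 0.7)
best, history = identify_crack(measured)
```

Because of elitism the best misfit never increases from one generation to the next; how closely `best` approaches the true parameters depends on the forward model and on the GA settings.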

관계형 데이터베이스에서 데이터 그룹화를 이용한 익명화 처리 기법 (The De-identification Technique Using Data Grouping in Relational Database)

  • 박준범;진승헌;최대선
    • 정보보호학회논문지 / Vol. 25, No. 3 / pp.493-500 / 2015
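  • With the sharing and opening of public information under Government 3.0, the growth of social network services, and the increase in data shared among users, more and more personal information is exposed on the Internet. In response, de-identification (anonymization) algorithms for protecting privacy have emerged; for relational databases they began with k-anonymity and developed into $\ell$-diversity and t-closeness. More efficient methods for improving the performance of these algorithms continue to be proposed, but companies and public institutions need an overall de-identification procedure more than they need improved algorithm performance. This paper specifies the process of handling the k-anonymity, $\ell$-diversity, and t-closeness algorithms by grouping data in a relational database.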
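
The grouping idea in this abstract can be sketched as follows: rows are grouped by their quasi-identifier values, and k-anonymity and (distinct) $\ell$-diversity are then checked per group. This is a minimal sketch under stated assumptions, not the authors' procedure; the column names (`age`, `zip`, `disease`), the sample records, and the 10-year age generalization are all hypothetical, and t-closeness is omitted.

```python
from collections import defaultdict

def group_by_quasi_identifiers(rows, quasi_ids):
    """Group rows that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in quasi_ids)].append(row)
    return groups

def is_k_anonymous(groups, k):
    """Every group must contain at least k rows."""
    return all(len(group) >= k for group in groups.values())

def is_l_diverse(groups, sensitive, l):
    """Every group must contain at least l distinct sensitive values."""
    return all(len({row[sensitive] for row in group}) >= l for group in groups.values())

def generalize_age(row, width=10):
    """Replace an exact age with a range such as '20-29' (hypothetical generalization)."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}

# Hypothetical sample records.
records = [
    {"age": 23, "zip": "130**", "disease": "flu"},
    {"age": 27, "zip": "130**", "disease": "flu"},
    {"age": 25, "zip": "130**", "disease": "cold"},
    {"age": 34, "zip": "130**", "disease": "cancer"},
    {"age": 38, "zip": "130**", "disease": "flu"},
    {"age": 31, "zip": "130**", "disease": "cold"},
]

raw_groups = group_by_quasi_identifiers(records, ["age", "zip"])
generalized = [generalize_age(r) for r in records]
gen_groups = group_by_quasi_identifiers(generalized, ["age", "zip"])
```

In this toy data the raw table has one row per group, while the generalized table forms two groups of three rows each, so the checks change from failing to passing for k = 3 and $\ell$ = 2.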

A pilot study of an automated personal identification process: Applying machine learning to panoramic radiographs

  • Ortiz, Adrielly Garcia;Soares, Gustavo Hermes;da Rosa, Gabriela Cauduro;Biazevic, Maria Gabriela Haye;Michel-Crosato, Edgard
    • Imaging Science in Dentistry / Vol. 51, No. 2 / pp.187-193 / 2021
  • Purpose: This study aimed to assess the usefulness of machine learning and automation techniques to match pairs of panoramic radiographs for personal identification. Materials and Methods: Two hundred panoramic radiographs from 100 patients (50 males and 50 females) were randomly selected from a private radiological service database. Initially, 14 linear and angular measurements of the radiographs were made by an expert. Eight ratio indices derived from the original measurements were applied to a statistical algorithm to match radiographs from the same patients, simulating a semi-automated personal identification process. Subsequently, measurements were automatically generated using a deep neural network for image recognition, simulating a fully automated personal identification process. Results: Approximately 85% of the radiographs were correctly matched by the automated personal identification process. In a limited number of cases, the image recognition algorithm identified 2 potential matches for the same individual. No statistically significant differences were found between measurements performed by the expert on panoramic radiographs from the same patients. Conclusion: Personal identification might be performed with the aid of image recognition algorithms and machine learning techniques. This approach will likely facilitate the complex task of personal identification by performing an initial screening of radiographs and matching ante-mortem and post-mortem images from the same individuals.

의료 비정형 텍스트 비식별화 및 속성기반 유용도 측정 기법 (De-identifying Unstructured Medical Text and Attribute-based Utility Measurement)

  • 노건;전종훈
    • 한국전자거래학회지 / Vol. 24, No. 1 / pp.121-137 / 2019
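  • De-identification is a method of removing personal information from a dataset so that individuals cannot be identified; it is used to lower the risk of personal information exposure that can arise while information is collected, processed, stored, and distributed. Much research has addressed de-identification from the perspective of algorithms and models, but most of it is limited to structured data, and unstructured data has received relatively little attention. In the medical field in particular, where unstructured text is used frequently, personally identifying information is simply removed, which lowers the exposure risk at the accepted cost of reduced data utility. This study targets unstructured text among medical data, a field where privacy protection is the most important issue and de-identification is therefore actively studied. It aims to present a de-identification method that applies the k-anonymity protection model and to propose a new utility measurement technique for the de-identified result, so that data usability can be judged intuitively. We expect that if the results of this study are applied not only in the medical field but in every industry that uses unstructured text, the usability of unstructured text containing personally identifying information can be improved.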

IDENTIFICATION OF FALSIFIED DRUGS USING NEAR-INFRARED SPECTROSCOPY

  • Scafi, Sergio H.F.;Pasquini, Celio
    • 한국근적외분광분석학회: Conference Proceedings / 한국근적외분광분석학회 2001 (NIR-2001) / pp.3112-3112 / 2001
  • Near-infrared spectroscopy (NIRS) was investigated for the identification of falsified drugs. The identification is based on comparing the NIR spectrum of a sample with typical spectra of an authentic drug using multivariate modelling and classification algorithms (PCA/SIMCA). Two spectrophotometers (Brimrose - Luminar 2000 and 2030), based on acousto-optic tunable filter (AOTF) technology and sharing the same controlling computer, software (Brimrose - Snap 2.03) and data acquisition electronics, were employed. The Luminar 2000 scans the range 850-1800 nm and was employed for transmittance/absorbance measurements of liquids with a transflectance optical bundle probe with a total optical path of 5 mm and a circular area of 0.5 $\textrm{cm}^2$. The Luminar 2030 scans the range 1100-2400 nm and was employed for reflectance measurements of solid drugs. For each sample, 300 spectra, acquired in about 20 s, were averaged. Chemometric treatment of the spectral data, modelling and classification were performed using the Unscrambler 7.5 software (CAMO Norway). This package provides the Principal Component Analysis (PCA) and SIMCA algorithms, used for modelling and classification, respectively. Initially, NIRS was evaluated for spectrum acquisition of various drugs, selected to cover the diversity of physico-chemical characteristics found among commercial products. Parameters that could affect the spectra of a given drug (especially if presented as solid tablets) were investigated, and the results showed that the first derivative can minimize spectral changes associated with tablet geometry, physical differences in their faces and position in relation to the probe beam. The effects of ambient humidity and temperature were also investigated. The first factor needs to be controlled for model construction because ambient humidity can cause spectral alterations that could lead to the misclassification of an authentic drug if the factor is not considered by the model.
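
Two ideas from this abstract, first-derivative preprocessing and class modelling by PCA residuals (the core of SIMCA), can be sketched with NumPy. This is a simplified illustration on synthetic spectra, not the Unscrambler workflow the authors used: the Gaussian `band` shape, the band positions, and the one-component model are all assumptions, and a real SIMCA model also applies statistical acceptance limits that are left out here.

```python
import numpy as np

def first_derivative(spectra, wavelengths):
    """Derivative along the wavelength axis; a constant baseline offset vanishes."""
    return np.gradient(spectra, wavelengths, axis=-1)

def fit_pca(X, n_components):
    """Return (mean, loadings) of a PCA model fitted by SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def residual_distance(model, x):
    """Norm of the part of x that the PCA model cannot reconstruct."""
    mean, loadings = model
    centered = x - mean
    reconstructed = loadings.T @ (loadings @ centered)
    return float(np.linalg.norm(centered - reconstructed))

def band(wavelengths, center, width=40.0):
    """Synthetic Gaussian absorption band (hypothetical spectrum shape)."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

rng = np.random.default_rng(0)
wl = np.linspace(1100.0, 2400.0, 261)  # 5 nm steps over the 1100-2400 nm range

# Synthetic "authentic" spectra: same band, varying intensity and baseline offset.
authentic = np.array([
    rng.uniform(0.8, 1.2) * band(wl, 1500.0) + rng.uniform(-0.2, 0.2)
    for _ in range(20)
])
model = fit_pca(first_derivative(authentic, wl), n_components=1)

genuine = band(wl, 1500.0) + 0.15   # authentic band with a baseline shift
suspect = band(wl, 1700.0) + 0.15   # band at a different position
d_genuine = residual_distance(model, first_derivative(genuine, wl))
d_suspect = residual_distance(model, first_derivative(suspect, wl))
```

The derivative removes the baseline shift, so the genuine sample fits the authentic-class model almost exactly, while the suspect sample leaves a large residual.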

A Study on Efficient Data De-Identification Method for Blockchain DID

  • Min, Youn-A
    • International Journal of Internet, Broadcasting and Communication / Vol. 13, No. 2 / pp.60-66 / 2021
  • Blockchain is a technology that enables trust-based consensus and verification based on a decentralized network. Decentralized ID (DID) is based on a decentralized structure, and users have the right to manage their own ID. Recently, interest in self-sovereign identity authentication has been increasing. In this paper, as a method for transparent and safe sovereignty management of data, various methods for data encryption processing are examined among the data pseudonymization techniques for blockchain use. The public key technique (homomorphic encryption) has high flexibility and security because different algorithms are applied to the entire sentence for encryption and decryption; as a result, its computational efficiency decreases. The hash function method (MD5) can maintain flexibility and offers higher security than the two-way encryption method, but there is a threat of collision. Zero-knowledge proof is based on public key encryption and a mutual proof method, and complex formulas are applied to processes such as personal identification, key distribution, and digital signature. It requires a consensus and verification process, so the operation efficiency is lowered to the level of $O(\log_e N)$ to $O(N^2)$. In this paper, data encryption processing for blockchain DID based on zero-knowledge proof is proposed, together with a one-way encryption method that considers the range and frequency of data use. Based on the content presented in the paper, it is possible to process corrected zero-knowledge proof and to process data efficiently.
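
The one-way (hash-based) pseudonymization mentioned in this abstract can be sketched with Python's standard library. This is not the paper's scheme: because the abstract itself notes the collision threat of MD5, the sketch swaps in keyed SHA-256 (HMAC), and the names `pseudonymize` and `secret_key` as well as the sample identifier are hypothetical.

```python
import hashlib
import hmac

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a fixed-length token that cannot be reversed without a search."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

# Hypothetical key and identifier, for illustration only.
key = b"example-key-not-for-production"
token = pseudonymize("did:example:alice", key)
```

The same identifier and key always yield the same token, so records can still be linked, while a different key yields unrelated tokens for the same identifier.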

Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data

  • Do, Jin Hwan;Choi, Dong-Kug
    • Molecules and Cells / Vol. 25, No. 2 / pp.279-288 / 2008
  • The analysis of microarray data is essential for handling large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as the desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps to complex clustering algorithms like fuzzy clustering.
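
As an illustration of the crisp clustering methods this review surveys, the following is a minimal K-means sketch (Lloyd's algorithm) in NumPy. The expression profiles are made-up toy values, and the Euclidean metric, the number of clusters `k`, and the random initialization by `seed` are exactly the kind of user-selectable choices the abstract warns can change the result.

```python
import numpy as np

def kmeans(profiles, k, n_iter=50, seed=0):
    """Cluster rows of `profiles` into k groups using Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Initial centers: k distinct rows chosen at random.
    centers = profiles[rng.choice(len(profiles), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(profiles[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            profiles[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy expression profiles (genes x conditions): three rising, three falling.
expression = np.array([
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.1, 2.0, 3.1],
    [0.0, 0.9, 2.1, 2.9],
    [3.0, 2.0, 1.0, 0.0],
    [3.1, 2.0, 0.9, 0.1],
    [2.9, 2.1, 1.0, 0.0],
])
labels, centers = kmeans(expression, k=2)
```

On data this well separated the rising and falling profiles end up in different clusters; on real microarray data the outcome is far more sensitive to the initialization and to the distance measure.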

Uncertainty in Operational Modal Analysis of Hydraulic Turbine Components

  • Gagnon, Martin;Tahan, S.-Antoine;Coutu, Andre
    • International Journal of Fluid Machinery and Systems / Vol. 2, No. 4 / pp.278-285 / 2009
  • Operational modal analysis (OMA) allows modal parameters, such as natural frequencies and damping, to be estimated solely from data collected during operation. However, a main shortcoming of these methods resides in the evaluation of the accuracy of the results. This paper will explore the uncertainty and possible variations in the estimates of modal parameters for different operating conditions. Two algorithms based on the Least Square Complex Exponential (LSCE) method will be used to estimate the modal parameters. The uncertainties will be calculated using a Monte-Carlo approach with the hypothesis of constant modal parameters at a given operating condition. In collaboration with Andritz-Hydro Ltd, data collected on two different stay vanes from an Andritz-Hydro Ltd Francis turbine will be used. This paper will present an overview of the procedure and the results obtained.
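
The Monte-Carlo idea in this abstract (repeating the estimation on many noisy realizations while the true modal parameters stay constant) can be sketched as follows. This is only an illustration under assumptions: the LSCE method is replaced by plain FFT peak picking, the signal is a synthetic single-mode decay, and all numerical values (`f_true`, `zeta`, `fs`, `noise_std`) are hypothetical.

```python
import numpy as np

def estimate_frequency(signal, fs):
    """Peak-picking estimate of the dominant frequency (stand-in for LSCE)."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[spectrum[1:].argmax() + 1]  # skip the DC bin

def monte_carlo_frequency(f_true=12.0, zeta=0.01, fs=256.0, n=1024,
                          noise_std=0.2, trials=200, seed=0):
    """Mean and standard deviation of the frequency estimate over noisy trials."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) / fs
    omega = 2 * np.pi * f_true
    # Constant modal parameters: the same damped response in every trial.
    clean = np.exp(-zeta * omega * t) * np.sin(omega * t)
    estimates = np.array([
        estimate_frequency(clean + rng.normal(0.0, noise_std, n), fs)
        for _ in range(trials)
    ])
    return estimates.mean(), estimates.std()

mean_freq, std_freq = monte_carlo_frequency()
```

The spread of the estimates gives an empirical uncertainty for the chosen estimator; with peak picking it is limited by the frequency resolution `fs / n`, which is one reason a parametric method such as LSCE is used in practice.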

NOISE SOURCE IDENTIFICATION WITH INCREASED SPATIAL RESOLUTION

  • Gade, Svend;Hald, Jorgen;Ginn, Bernard
    • 한국소음진동공학회: Conference Proceedings / 한국소음진동공학회 2012 Autumn Conference Proceedings / pp.636-642 / 2012
  • Delay-and-sum (DAS) planar beamforming has been a widely used noise source identification technique for the last decade. It is a quick one-shot measurement technique that can map sources larger than the array itself. The spatial resolution (the smallest resolvable source separation) is proportional to the distance between array and source and to the wavelength, so the resolution is only good at medium to high frequencies. Improved algorithms using iterative de-convolution techniques offer up to ten times better resolution. The principle behind these techniques is described in this paper, and measurement examples from the automotive industry are presented.
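
The delay-and-sum principle named in this abstract can be sketched in its narrowband (frequency-domain) form, where each delay becomes a phase shift. The sketch assumes a uniform line array and a single far-field plane-wave source, which is simpler than the planar arrays and de-convolution methods the paper discusses; the array geometry, the frequency, and the function names are hypothetical.

```python
import numpy as np

def plane_wave(mic_x, freq, source_deg, c=343.0):
    """Complex pressures on a line array for a far-field source at `source_deg`."""
    k = 2 * np.pi * freq / c
    return np.exp(-1j * k * mic_x * np.sin(np.deg2rad(source_deg)))

def das_beam_pattern(mic_x, pressures, freq, scan_deg, c=343.0):
    """Narrowband delay-and-sum output for each scan angle."""
    k = 2 * np.pi * freq / c
    theta = np.deg2rad(scan_deg)
    # Phase shifts that undo the propagation delay for each candidate angle.
    steering = np.exp(1j * k * np.outer(np.sin(theta), mic_x))
    return np.abs(steering @ pressures) / len(mic_x)

mic_x = np.arange(16) * 0.05             # 16 microphones, 5 cm spacing
scan = np.linspace(-90.0, 90.0, 181)     # 1 degree steps
pattern = das_beam_pattern(mic_x, plane_wave(mic_x, 2000.0, 20.0), 2000.0, scan)
estimated_angle = scan[pattern.argmax()]
```

The output peaks where the steering angle matches the source direction. The main lobe around that peak widens as the wavelength grows, which is the resolution limit at low frequencies that the de-convolution algorithms aim to improve.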

High-Throughput Screening Technique for Microbiome using MALDI-TOF Mass Spectrometry: A Review

  • Mojumdar, Abhik;Yoo, Hee-Jin;Kim, Duck-Hyun;Cho, Kun
    • Mass Spectrometry Letters / Vol. 13, No. 4 / pp.106-114 / 2022
  • A rapid and reliable approach to the identification of microorganisms is a critical requirement for large-scale culturomics analysis. MALDI-TOF MS is a suitable technique that can be a better alternative to conventional biochemical and gene sequencing methods, as it is economical in terms of both cost and labor. In this review, the applications of MALDI-TOF MS for the comprehensive identification of microorganisms and bacterial strain typing in culturomics-based approaches for various environmental studies, including bioremediation, plant sciences, agriculture and food microbiology, are explored. However, the technique is limited by insufficient coverage of the mass spectral database. To improve its application to the identification of novel isolates, the spectral database should be updated with the peptide mass fingerprints (PMFs) of type strains, covering not only microbes with clinical relevance but also those from various environmental sources. Further, the development of enhanced sample processing methods and new algorithms for automation and de-replication of isolates will increase its application in microbial ecology studies.