• Title/Summary/Keyword: Data de-identification


Modern vistas of process control

  • Georgakis, Christos
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집)
    • /
    • 1996.10a
    • /
    • pp.18-18
    • /
    • 1996
  • This paper reviews some of the most prominent and promising areas of chemical process control in relation to both batch and continuous processes. These areas include the modeling, optimization, control, and monitoring of chemical processes and entire plants. Most of these areas explicitly utilize a model of the process. For this purpose, the types of models used are examined in some detail. These models are categorized into knowledge-driven and data-driven classes. In the areas of modeling and optimization, attention is paid to batch reactors using the Tendency Modeling approach. These Tendency models consist of data- and knowledge-driven components and are often called Gray or Hybrid models. In the case of continuous processes, emphasis is placed on the closed-loop identification of state space models and their use in Model Predictive Control of nonlinear processes, such as the Fluidized Catalytic Cracking process. The effective monitoring of multivariate processes is examined through the use of statistical charts obtained by Principal Component Analysis (PCA). Static and dynamic charts account for the cross- and auto-correlation of the substantial number of variables measured on-line. Centralized and decentralized charts also aim at isolating the source of process disturbances so that they can be eliminated. Even though significant progress has been made during the last decade, the challenges for the next ten years are substantial. Present progress is strongly influenced by the economic benefits industry is deriving from the use of these advanced techniques. Future progress will be further catalyzed by the harmonious collaboration of university and industrial researchers.


Participation in Decision-making and Expertise of Staff Nurses (일부종합병원 일반간호사의 의사결정 참여와 전문성)

  • Cho, Mee-Kyung;Jeong, Hyun-Sook
    • Research in Community and Public Health Nursing
    • /
    • v.10 no.2
    • /
    • pp.537-548
    • /
    • 1999
  • The purpose of this study was to analyze the relationship between participation in decision-making and the expertise of staff nurses. The study population was registered nurses (N=342) working in Chungnam and Chungbuk. The data were collected from April 26 to May 26, 1999. The survey instruments were the Participation in Decision Activities Questionnaire and Expertise scale developed by Anthony (1995), and the Job Expertise scale of Van de Ven and Ferry. The results were as follows: 1) There was a significant difference among identification, design, and selection in the process of participation in decision-making. 2) There was a significant difference between participation in caregiving decisions and participation in condition of work decisions. 3) (1) For caregiving decisions, there was a significant difference between the expertise indicators and variables such as education level and the experience of being told that one is an expert. (2) For condition of work decisions, there was a significant difference between the expertise indicators and variables such as career, the time spent keeping current per week, and self-rating of expertise.


A de-identification technique using generalization and insert a salt data (일반화와 데이터 삽입을 이용한 익명화 처리 기법)

  • Park, Jun-Bum;Cho, Jin-Man;Choi, Dae-Seon;Jin, Seung-Hun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2015.04a
    • /
    • pp.351-353
    • /
    • 2015
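  • The amount of users' personal information exposed on the Internet is increasing because of the sharing and opening of public information, the growth of social network services, and the increase in data shared among users. Personal information exposed on the Internet can lead to privacy violations through linkage attacks and background knowledge attacks. To prevent this, anonymization models for relational databases have been proposed, starting with k-anonymity and followed by l-diversity and t-closeness, and the performance of anonymization algorithms continues to improve. However, satisfying the conditions of the k-anonymity, l-diversity, and t-closeness models requires generalization of the quasi-identifiers, and this process has the drawback that the value of the quasi-identifiers is lost. To minimize the information loss of the quasi-identifiers, this paper proposes an anonymization method that uses generalization together with data insertion in the process of satisfying the k-anonymity model.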
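
The idea in the abstract above, combining generalization with inserted ("salt") records to reach k-anonymity, can be illustrated with a minimal Python sketch. The abstract does not describe the authors' actual procedure, so everything here is an assumption made for illustration: the field names (`age`, `zip`, `disease`), the bucket widths, the `salt` flag, and the helper functions `generalize`, `is_k_anonymous`, and `pad_with_salt` are all hypothetical.

```python
from collections import Counter

def generalize(record, age_width=10, zip_digits=3):
    """Coarsen the quasi-identifiers: bucket the age and mask the ZIP suffix."""
    low = (record["age"] // age_width) * age_width
    masked_zip = record["zip"][:zip_digits] + "*" * (len(record["zip"]) - zip_digits)
    return (f"{low}-{low + age_width - 1}", masked_zip)

def is_k_anonymous(records, k):
    """True if every generalized quasi-identifier combination occurs at least k times."""
    counts = Counter(generalize(r) for r in records)
    return all(c >= k for c in counts.values())

def pad_with_salt(records, k):
    """Hypothetical salting step: instead of generalizing further, add dummy
    rows (flagged "salt") to every equivalence class smaller than k."""
    groups = {}
    for r in records:
        groups.setdefault(generalize(r), []).append(r)
    padded = list(records)
    for members in groups.values():
        for i in range(max(0, k - len(members))):
            template = members[i % len(members)]
            padded.append({**template, "salt": True})
    return padded

# Illustrative data only, not taken from the paper.
records = [
    {"age": 23, "zip": "13053", "disease": "flu"},
    {"age": 27, "zip": "13068", "disease": "cold"},
    {"age": 35, "zip": "14850", "disease": "flu"},
    {"age": 38, "zip": "14853", "disease": "asthma"},
    {"age": 52, "zip": "13053", "disease": "cold"},
]

padded = pad_with_salt(records, k=2)
print(is_k_anonymous(records, k=2), is_k_anonymous(padded, k=2))
```

In this sketch the lone record in the 50-59 age bucket would otherwise force a coarser generalization of every record; adding one dummy row keeps the finer buckets at the cost of introducing synthetic data, which is the kind of trade-off the abstract points to.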

Implementation of efficient L-diversity de-identification for large data (대용량 데이터에 대한 효율적인 L-diversity 비식별화 구현)

  • Jeon, Min-Hyuk;Temuujin, Odsuren;Ahn, Jinhyun;Im, Dong-Hyuk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2019.10a
    • /
    • pp.465-467
    • /
    • 2019
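  • Many organizations and companies now require diverse and massive data, and it is increasingly common to collect such data directly or to purchase it through national public data sources or data brokers. However, personal information cannot be transferred to others without the individual's consent, which makes research on such data difficult. For this reason, de-identification techniques that prevent a specific individual from being inferred are being studied. The degree of de-identification can be expressed by a model, and the k-anonymity and l-diversity models are currently widely used. Of these, l-diversity includes the conditions of k-anonymity and therefore provides a stronger degree of de-identification. Algorithms that satisfy the l-diversity model include The Hardness and Approximation and Anatomy; this paper studies the implementation of Anatomy, which offers high utility because it does not go through a generalization process. In addition, because de-identification must consider the characteristics of the entire data set, the actual processing load becomes enormous as the data grows. To address this problem, we implemented a system that uses Spark to handle growing data sizes as stably as possible.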
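
As a rough illustration of the Anatomy approach named in the abstract above, the following Python sketch groups records so that each group contains at least l distinct sensitive values, then publishes the quasi-identifiers and the sensitive values in two separate tables linked only by a group ID. It is a simplified single-machine sketch, not the paper's Spark implementation; the function name `anatomize`, the field names, and the sample data are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def anatomize(rows, sensitive, l):
    """Split rows into groups that each hold at least l distinct sensitive values,
    then return a quasi-identifier table (QIT) and a sensitive table (ST)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[sensitive]].append(row)

    groups = []
    # Group creation: draw one row from each of the l currently largest buckets.
    while sum(1 for b in buckets.values() if b) >= l:
        largest = sorted((v for v in buckets if buckets[v]),
                         key=lambda v: len(buckets[v]), reverse=True)[:l]
        groups.append([buckets[v].pop() for v in largest])

    # Residue assignment: put each leftover row in a group lacking its sensitive value.
    for value, bucket in buckets.items():
        for row in bucket:
            target = next((g for g in groups
                           if all(r[sensitive] != value for r in g)), None)
            if target is None:
                raise ValueError("data cannot satisfy the requested l")
            target.append(row)

    # QIT keeps exact quasi-identifiers (no generalization) plus the group ID.
    qit = [{**{k: v for k, v in r.items() if k != sensitive}, "group": gid}
           for gid, g in enumerate(groups) for r in g]
    # ST lists, per group, each sensitive value and how often it occurs.
    st = [{"group": gid, sensitive: value, "count": n}
          for gid, g in enumerate(groups)
          for value, n in sorted(Counter(r[sensitive] for r in g).items())]
    return qit, st

# Illustrative data only, not taken from the paper.
rows = [
    {"age": 23, "zip": "11000", "disease": "flu"},
    {"age": 27, "zip": "13000", "disease": "flu"},
    {"age": 35, "zip": "59000", "disease": "cold"},
    {"age": 59, "zip": "12000", "disease": "cold"},
    {"age": 61, "zip": "54000", "disease": "asthma"},
    {"age": 65, "zip": "25000", "disease": "flu"},
]

qit, st = anatomize(rows, sensitive="disease", l=2)
for entry in st:
    print(entry)
```

Because the quasi-identifier table keeps the exact ages and ZIP codes, aggregate queries on those attributes lose no precision, which matches the abstract's remark that Anatomy preserves utility by avoiding generalization. The grouping step needs a global view of the bucket sizes, which is one reason a distributed engine such as Spark becomes attractive for large inputs.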

Epstein-Barr Virus-Positive Diffuse Large B-Cell Lymphoma: is it different between Over and Under 50 Years of Age?

  • Monabati, Ahmad;Vahedi, Amir;Safaei, Akbar;Noori, Sadat;Mokhtari, Maral;Vahedi, Leila;Zamani, Mehdi
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.17 no.4
    • /
    • pp.2285-2289
    • /
    • 2016
  • Background: Epstein-Barr virus (EBV) positive diffuse large B-cell lymphoma (DLBCL) of the elderly is an entity introduced in the latest WHO classification of lymphoid tumors and defined in patients older than 50 years without prior lymphoma or immunodeficiency. However, recently it has also been seen in patients under 50. There is thus debate as to whether these are separate entities. Materials and Methods: In this retrospective study, we analyzed de novo DLBCL cases admitted to our institute over a period of two years. Clinical data included age, sex, and nodal and extranodal presentation. The results of an immunohistochemistry (IHC) panel were also reviewed. IHC findings were mainly used to sub-classify DLBCL as germinal center vs. non-germinal center types. IHC for identification of LMP-1 (latent membrane protein) and in situ hybridization for detection of EBV-encoded RNA (EBER) were performed. EBV prevalence, clinical data, and IHC findings were compared between patients under and over 50 years of age. Results: Out of 95 DLBCL cases, 11.6% were EBV positive (7.5% and 14.5% in the young and old groups). We did not find any significant differences in IHC subclasses and clinical data between EBV positive DLBCL (EBV+DLBCL) of young and old groups. Conclusions: EBV+DLBCL are not exclusive to patients older than 50 years. With regard to clinical data as well as IHC subclasses, no differences were evident between EBV+DLBCL of young and old groups. Our suggestion is to eliminate any cutoff age for EBV+DLBCL.

Unstable Approach Mitigation Based on Flight Data Analysis (비행 데이터 분석 기반의 불안정 접근 경감방안)

  • Kim, Hyeon Deok
    • Journal of Advanced Navigation Technology
    • /
    • v.25 no.1
    • /
    • pp.52-59
    • /
    • 2021
  • According to the International Air Transport Association (IATA), 61% of accidents occurred during the approach and landing phase of flight, and 16% of accidents were caused by unstable approaches of commercial aircraft. Unstable approaches and poor corrective handling by pilots were found to lead to accidents when the approach was continued without a go-around maneuver. The causes of unstable approaches vary and include airport approach procedures, pilot error, misplanning, workload, and ATC (Air Traffic Control) congestion. In this study, we use a flight data analysis system to select domestic case airports and aircraft types where unstable approach events occur repeatedly. The flight data analysis covers main events, airport approach procedures, and pilot operations, as well as environmental factors such as weather and geographical conditions at the airport. The study aims to identify and eliminate the tendencies, causes, and risks of unstable approach events in order to derive implications for mitigating them and for developing navigation safety measures.

A Study on the Medical Application and Personal Information Protection of Generative AI (생성형 AI의 의료적 활용과 개인정보보호)

  • Lee, Sookyoung
    • The Korean Society of Law and Medicine
    • /
    • v.24 no.4
    • /
    • pp.67-101
    • /
    • 2023
  • The utilization of generative AI in the medical field is also being rapidly researched. Access to vast data sets reduces the time and energy spent in selecting information. However, as the effort put into content creation decreases, there is a greater likelihood of associated issues arising. For example, with generative AI, users must discern the accuracy of results themselves, as these AIs learn from data within a set period and generate outcomes. While the answers may appear plausible, their sources are often unclear, making it challenging to determine their veracity. Additionally, the possibility of presenting results from a biased or distorted perspective cannot be discounted at present on ethical grounds. Despite these concerns, the field of generative AI is continually advancing, with an increasing number of users leveraging it in various sectors, including biomedical and life sciences. This raises important legal considerations regarding who bears responsibility and to what extent for any damages caused by these high-performance AI algorithms. A general overview of issues with generative AI includes those discussed above, but another perspective arises from its fundamental nature as a large-scale language model ('LLM') AI. There is a civil law concern regarding "the memorization of training data within artificial neural networks and its subsequent reproduction". Medical data, by nature, often reflects personal characteristics of patients, potentially leading to issues such as the regeneration of personal information. The extensive application of generative AI in scenarios beyond traditional AI brings forth the possibility of legal challenges that cannot be ignored. 
Upon examining the technical characteristics of generative AI and focusing on legal issues, especially concerning the protection of personal information, it is evident that current laws on personal information protection, particularly in the context of health and medical data utilization, are inadequate. These laws provide processes for anonymizing and de-identifying specific personal information but fall short when generative AI is applied as software in medical devices. To address the functionalities of generative AI in clinical software, a reevaluation and adjustment of existing laws for the protection of personal information are imperative.

Identification of Alternative Splicing and Fusion Transcripts in Non-Small Cell Lung Cancer by RNA Sequencing

  • Hong, Yoonki;Kim, Woo Jin;Bang, Chi Young;Lee, Jae Cheol;Oh, Yeon-Mok
    • Tuberculosis and Respiratory Diseases
    • /
    • v.79 no.2
    • /
    • pp.85-90
    • /
    • 2016
  • Background: Lung cancer is the most common cause of cancer-related death. Alterations in gene sequence, structure, and expression have an important role in the pathogenesis of lung cancer. Fusion genes and alternative splicing of cancer-related genes have the potential to be oncogenic. In the current study, we performed RNA-sequencing (RNA-seq) to investigate potential fusion genes and alternative splicing in non-small cell lung cancer. Methods: RNA was isolated from lung tissues obtained from 86 subjects with lung cancer. The RNA samples from lung cancer and normal tissues were processed with RNA-seq using the HiSeq 2000 system. Fusion genes were evaluated using deFuse and ChimeraScan. Candidate fusion transcripts were validated by Sanger sequencing. Alternative splicing was analyzed using multivariate analysis of transcript splicing and validated using quantitative real-time polymerase chain reaction. Results: RNA-seq data identified oncogenic fusion genes EML4-ALK and SLC34A2-ROS1 in three of 86 normal-cancer paired samples. Nine distinct fusion transcripts were selected using deFuse and ChimeraScan, of which four were validated by Sanger sequencing. In 33 squamous cell carcinomas, 29 tumor-specific skipped exon events and six mutually exclusive exon events were identified. ITGB4 and PYCR1 were the top genes that showed significant tumor-specific splice variants. Conclusion: RNA-seq data identified novel potential fusion transcripts and splice variants. Further evaluation of their functional significance in the pathogenesis of lung cancer is required.

De novo genome assembly and single nucleotide variations for Soybean yellow common mosaic virus using soybean flower bud transcriptome data

  • Jo, Yeonhwa;Choi, Hoseong;Kim, Sang-Min;Lee, Bong Choon;Cho, Won Kyong
    • Journal of Applied Biological Chemistry
    • /
    • v.63 no.3
    • /
    • pp.189-195
    • /
    • 2020
  • The soybean (Glycine max L.), also known as the soya bean, is an economically important legume species. Pathogens are always major threats to soybean cultivation, and several pathogens negatively affect soybean production. The soybean is also known as a susceptible host to many viruses. Recently, we carried out systematic analyses to identify viruses infecting soybeans using soybean transcriptome data. Our screening results showed that only a few soybean transcriptomes contained virus-associated sequences. In this study, we further carried out bioinformatics analyses using a soybean flower bud transcriptome for virus identification, genome assembly, and detection of single nucleotide variations (SNVs). We assembled the genome of Soybean yellow common mosaic virus (SYCMV) isolate China and revealed two SNVs. Phylogenetic analyses using three viral proteins suggested that SYCMV isolate China is closely related to SYCMV isolates from South Korea. Furthermore, we found that the replication and mutation of SYCMV are relatively low, which might be associated with flower bud tissue. The most interesting finding was that SYCMV was not detected in the cytoplasmic male sterility (CMS) line derived from the non-CMS line that was severely infected by SYCMV. In summary, in silico analyses identified SYCMV from the soybean flower bud transcriptome, and a nearly complete genome of SYCMV was successfully assembled. Our results suggest that the low level of virus replication and mutation for SYCMV might be associated with plant tissues. Moreover, we provide the first evidence that male sterility might be used to eliminate viruses in crop plants.

A Study on the Principle of Application of Privacy by Design According to the Life Cycle of Pseudonymization Information (가명정보 생명주기에 따른 개인정보보호 중심 설계 적용 원칙에 관한 연구)

  • Kim, Dong-hyun
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.2
    • /
    • pp.329-339
    • /
    • 2022
  • Recently, as personal information has come to be used as data, various new industries have emerged, but cases of personal information leakage and misuse have occurred one after another because systematic management systems have not been sufficiently established. In addition, services that use personal information in pseudonymized and anonymized form have emerged since the enforcement of the Data 3 Act in August 2020, but personal information issues have arisen due to insufficient pseudonymization, insufficient safety measures for processing pseudonymized information, and hate speech. Therefore, this study proposes new PbD principles that can be applied to the pseudonymized information life cycle, based on the Privacy by Design (PbD) principles proposed by Ann Cavoukian [1] of Canada, in order to utilize personal information safely. In addition, the significance of the proposed method was confirmed through a survey of 30 experts in personal information protection.