Comparison of Parameter Estimation Methods for the 3-Parameter Kappa Distribution (3-모수 카파분포의 모수 추정 방법들의 비교)

  • Jeon, Yu-Na;Kim, Yeon-Woo;Hwang, Yeong-A;Park, Jeong-Su
    • Korean Data and Information Science Society: Conference Proceedings
    • 2003.10a
    • pp.139-145
    • 2003
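  • This paper examines parameter estimation methods for the 3-parameter kappa distribution (KD3), which is used to predict precipitation data, and compares the performance of the methods by simulation. To estimate the parameters $\alpha,\;\beta,\;\mu$ of this distribution, we applied method of moments estimation (MME), L-moments estimation (LME), and maximum likelihood estimation (MLE). The estimators were compared by simulation for large samples as well as small samples. For MME and LME, a modified one-dimensional Newton-Raphson method under constraints was used. With MSE as the criterion, the simulation results lead us to recommend LME for KD3 parameter estimation when the sample size is smaller than 100 and MLE when the sample size is 100 or larger.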
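
To make the L-moments estimation (LME) step more concrete, the sketch below computes the first three sample L-moments from the unbiased probability-weighted moments. It is a generic illustration rather than the authors' code: the function name `sample_l_moments` is hypothetical, and matching these statistics to the KD3 parameters (the constrained Newton-Raphson step) is not shown.

```python
def sample_l_moments(data):
    """Return the first three sample L-moments (l1, l2, l3)."""
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    # Unbiased probability-weighted moments b0, b1, b2 (i is the 0-based rank).
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    # L-moments are fixed linear combinations of the weighted moments.
    l1 = b0                      # location (the sample mean)
    l2 = 2 * b1 - b0             # scale
    l3 = 6 * b2 - 6 * b1 + b0    # third L-moment (zero for symmetric samples)
    return l1, l2, l3


if __name__ == "__main__":
    # Made-up sample, for illustration only.
    print(sample_l_moments([5, 1, 4, 2, 3]))
```

An L-moments estimator would then choose the distribution parameters so that the theoretical L-moments equal these sample values.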

Problems under Current Law with the "Guidelines for the Use of Health Data" ('보건의료 데이터 활용 가이드라인'의 현행법상 문제점)

  • Lee, Seok-Bae
    • The Korean Society of Law and Medicine
    • v.22 no.4
    • pp.3-35
    • 2021
  • Amid the flood of private and public information, the enormous volume of information is regarded as a key resource in the era of the Fourth Industrial Revolution, represented by big data. Interest in it is growing worldwide. There is active discussion about how to secure and accumulate data and how to use the collected data safely and effectively. Health data in particular are regarded as the most valuable resource to which big data technology is applied. To use health data meaningfully, distributed health data must be integrated and made available to users in a form that can be used for research or inspection. While major countries compete to build or lead the data economy, South Korea also amended the so-called "3 Data Laws", which include the Data Protection Act (DSG), in August 2020. The DSG introduced the concept of pseudonymous information and established a legal basis for its use. As a follow-up measure, the Personal Information Protection Commission (PIPC) announced the "Guidelines for the Handling of Pseudonymous Information" and the Ministry of Health and Welfare announced the "Guidelines for the Use of Health Data". Health data relate directly to human life and the human body and therefore contain a great deal of sensitive data. The system can therefore be used, from a more cautious and conservative perspective, only on the condition that personal data are protected more securely. To analyze the main contents of the "Guidelines for the Use of Health Data", we first reviewed the main contents of the revised DSG. Then, by analyzing the essential contents of the "Guidelines for the Use of Health Data", we examined problems such as conflicts with other laws, along with improvement measures.

Classifying Cancer Using Partially Correlated Genes Selected by Forward Selection Method (전진선택법에 의해 선택된 부분 상관관계의 유전자들을 이용한 암 분류)

  • Yoo, Si-Ho;Cho, Sung-Bae
    • Journal of the Institute of Electronics Engineers of Korea SP
    • v.41 no.3
    • pp.83-92
    • 2004
  • A gene expression profile is numerical data on the gene expression levels of an organism, measured on a microarray. In general, related genes show different expression levels in different tissues, so cancer can be classified from a gene expression profile. Because not all genes are relevant to classification, the relevant genes must be selected, a step called feature selection. This paper proposes a new gene selection method that uses the forward selection method from regression analysis. The method reduces redundant information among the selected genes, which makes classification more efficient. We used k-nearest neighbor as the classifier and tested on a colon cancer dataset. Compared with the Pearson and Spearman coefficient methods, the proposed method performed better, reaching 90.3% classification accuracy. The method was also applied successfully to a lymphoma dataset.
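
The idea of forward selection removing redundant genes can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function name `forward_select`, the numeric target, and the toy data are hypothetical. At each step it picks the feature with the largest drop in residual sum of squares (equivalently, the largest partial correlation with the target given the features already chosen), then removes the chosen direction from the target and from the remaining features.

```python
def forward_select(columns, target, n_select):
    """Greedy forward selection; returns indices of the chosen columns."""
    def center(v):
        m = sum(v) / len(v)
        return [x - m for x in v]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    resid_cols = [center(c) for c in columns]  # features, residualized in place
    resid_y = center(target)                   # target, residualized in place
    chosen = []
    for _ in range(n_select):
        best, best_score = None, 0.0
        for j, c in enumerate(resid_cols):
            if j in chosen:
                continue
            cc = dot(c, c)
            if cc < 1e-12:
                continue  # nothing left: column is explained by chosen ones
            score = dot(c, resid_y) ** 2 / cc  # drop in residual sum of squares
            if score > best_score:
                best, best_score = j, score
        if best is None:
            break
        chosen.append(best)
        b = resid_cols[best]
        bb = dot(b, b)
        # Remove the chosen direction from the target and the other columns.
        coef = dot(resid_y, b) / bb
        resid_y = [y - coef * x for y, x in zip(resid_y, b)]
        for j, c in enumerate(resid_cols):
            if j not in chosen:
                coef = dot(c, b) / bb
                resid_cols[j] = [x - coef * z for x, z in zip(c, b)]
    return chosen


if __name__ == "__main__":
    g0 = [1, 2, 3, 4, 5, 6]
    g1 = [2, 4, 6, 8, 10, 12]    # exact multiple of g0, so redundant
    g2 = [1, -1, 1, -1, 1, -1]
    y = [a + 2 * b for a, b in zip(g0, g2)]
    print(forward_select([g0, g1, g2], y, 3))
```

In the toy data, only one of the two redundant features `g0` and `g1` is selected, because once one is chosen the other has no residual information left.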

Bin Packing-Exchange Algorithm for 3-Partition Problem (3-분할 문제의 상자 채우기-교환 알고리즘)

  • Lee, Sang-Un
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • /
    • /
  • This paper proposes a linear-time algorithm for the three-partition problem (TPP), which is NP-complete and for which no polynomial-time algorithm is known. It proposes a backtracking method that addresses the failure to obtain a solution with the MM method, a previously known polynomial algorithm that uses the sum of max-min values and a third number. It also improves on the problems of MM with backtracking applied. The proposed algorithm partitions the set S, sorted in descending order, into three parts and assigns them by the forward, backward, and best-fit allocation methods with maximum margin; in the initial allocation phase it found an optimal solution for 5 of 10 data sets (50.00%). For the remaining five data sets, it found the optimal solution by exchanging numbers between surplus boxes and shortage boxes at least once and at most seven times. The proposed algorithm performs simple allocation and exchange optimization with O(k) linear time complexity, where k is less than the m=n/3 of the three-partition data, and the results suggest that a polynomial-time algorithm could exist, in which case TPP would be in P rather than NP-complete.
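
For readers unfamiliar with the problem, the sketch below states the three-partition problem in code and solves small instances by plain exhaustive backtracking. It is not the allocation-and-exchange algorithm described in the abstract, and it takes exponential time in the worst case; the function name `three_partition` and the restriction to integer inputs are assumptions made for illustration.

```python
def three_partition(numbers):
    """Split 3m integers into m triples with equal sums, or return None."""
    n = len(numbers)
    if n == 0 or n % 3 != 0:
        return None
    m = n // 3
    total = sum(numbers)
    if total % m != 0:
        return None
    target = total // m
    items = sorted(numbers, reverse=True)
    used = [False] * n
    triples = []

    def solve():
        # The largest unused item must go into some triple; try all partners.
        try:
            i = used.index(False)
        except ValueError:
            return True  # every item is placed
        used[i] = True
        for j in range(i + 1, n):
            if used[j]:
                continue
            used[j] = True
            need = target - items[i] - items[j]
            for k in range(j + 1, n):
                if not used[k] and items[k] == need:
                    used[k] = True
                    triples.append((items[i], items[j], items[k]))
                    if solve():
                        return True
                    triples.pop()
                    used[k] = False
                    break  # equal values at other positions are equivalent
            used[j] = False
        used[i] = False
        return False

    return triples if solve() else None


if __name__ == "__main__":
    data = [20, 23, 25, 30, 49, 45, 27, 30, 30, 40, 22, 19]
    print(three_partition(data))
```

Each returned triple sums to the common target (90 for the sample data), which is the feasibility condition any proposed TPP algorithm has to meet.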

An Analysis of the 3D Spatial Distribution of Flow rate and Water Quality Convergence Monitoring Results in Rivers (하천에서의 수리·수질 복합 모니터링 결과의 3차원 공간분포 해석연구)

  • Lee, Chang Hyun;Kim, Kyung Dong;Ryu, Si Wan;Kim, Dong Su;Kim, Young Do
    • Proceedings of the Korea Water Resources Association Conference
    • 2022.05a
    • pp.18-18
    • 2022
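  • Analyzing the mixing pattern of water bodies at a river confluence requires high-resolution data. However, because of water-quality environmental problems and because existing monitoring systems rely on fixed-point measurement, the information available for the river as a whole is of low resolution. Moreover, although many aquatic environmental problems span one to three dimensions, most observation systems remain one-dimensional. Solving these problems requires more advanced observation and measurement. Obtaining the corresponding high-resolution measurements places a heavy burden on the person measuring, and the measurable area and time are limited. To acquire data over a wide area at a lower resolution, an appropriate interpolation method must be selected. A review of related papers showed that most work dealt with two-dimensional cross-sectional distributions of measurement results, and that studies on obtaining hydraulic information through 3D mapping and 3D analysis were lacking. Studies of three-dimensional river water-quality concentrations in particular were insufficient. Accordingly, the overall hydraulic and water-quality information of the river was presented by visualizing predictions and interpolations based on low-resolution measurement results. The IDW, Natural Neighbor, and Kriging methods were applied to river mapping and compared, and the precision of the river mapping was improved through the visualized data and quantitative evaluation. The mixing pattern at the river confluence was then analyzed using the three-dimensional spatially interpolated data. Three-dimensional data of this kind serve as important data for measurement and monitoring technology, and as basic data for hazardous-substance reduction technology and for evaluation and prediction technology. They can be used in various aquatic health diagnostic technologies, such as estimating hazardous chemicals and hierarchical analysis of high-risk algal groups in lakes.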
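
Of the interpolation methods named above, IDW (inverse distance weighting) is the simplest to illustrate. The sketch below estimates a value at an unsampled 3D location from scattered measurements. It is a generic illustration, not the authors' workflow: the function name `idw`, the default power of 2, and the sample coordinates and concentrations are assumptions.

```python
import math


def idw(points, values, query, power=2.0):
    """Inverse-distance-weighted estimate at `query` from scattered samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(points, values):
        d = math.dist(point, query)
        if d == 0.0:
            return value  # query coincides with a measured location
        w = 1.0 / d ** power  # nearer samples get larger weights
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical (x, y, depth) sample locations and concentrations.
    samples = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
    concentrations = [0.0, 10.0, 4.0]
    print(idw(samples, concentrations, (1.0, 0.0, 0.0)))
```

A larger `power` makes the estimate more local. Unlike Kriging, IDW does not model spatial correlation, which is one reason comparing the methods quantitatively is useful.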

Cancer Classification with Gene Expression Profiles using Forward Selection Method (전진 선택법을 이용한 유전자 발현정보 기반의 암 분류)

  • Yoo, Si-Ho;Cho, Sung-Bae
    • Annual Conference of KIPS
    • 2003.05a
    • pp.293-296
    • 2003
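  • Gene expression data are obtained by measuring, on a microarray, samples taken from a specific tissue of an organism; they represent the expression levels of genes as numerical values. In general, the expression levels of related genes differ between normal and abnormal tissue, so cancer can be classified using gene expression data. However, not all genes are involved in classification, so feature selection, the task of selecting only the relevant genes, is needed. This paper proposes a method that selects genes using the forward selection method, one of the variable selection methods in regression analysis, and then classifies with them. Colon cancer data were used as the experimental data, and KNN was used as the classifier. Compared with the Pearson and Spearman correlation coefficient methods, which are feature selection methods based on correlation coefficients, feature selection by forward selection was confirmed to select genes more effectively for cancer classification. The experiments showed a high recognition rate of 90.3%.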
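
The KNN classification step can be sketched as below. This is a generic k-nearest-neighbor vote with Euclidean distance, not the authors' experimental setup: the function name `knn_predict`, the choice k=3, and the two-gene toy expression values are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical expression levels of two selected genes per sample.
    train_x = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4),
               (5.0, 5.2), (4.8, 5.1), (5.3, 4.9)]
    train_y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
    print(knn_predict(train_x, train_y, (0.5, 0.2)))
```

Because the vote depends only on distances in the selected-gene space, the quality of the gene selection step directly affects the classifier.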

A Study on Management Method of Point and Line Data Using Mobile GIS (모바일 GIS를 이용한 Point 및 Line형 데이터 갱신 방법에 관한 연구)

  • Jeon, Jae-Yong;Cho, Gi-Sung
    • Journal of Korean Society for Geospatial Information Science
    • /
    • /
    • /
  • As information and communication technology matures, GIS is evolving from wired-communication GIS to mobile GIS, because mobile GIS satisfies the requirements of mobility, field work, speed, and time. Mobile GIS is also well suited to investigating, confirming, entering, and modifying spatial and attribute data in the field. We consider mobile GIS the best approach, because it allows the various kinds of facilities in a city to be managed effectively. In this study, we consider several methods by which the person in charge can manage point and line data more easily and effectively. The management methods for point-type data are the free method, the offset method, and the two-point method. The management methods for line-type data are the free method, the point connection method, the point and line connection method, and the minimum distance connection method between a point and a line.

Study on Potential Topics of the MyData and Data Transactions Using LDA Topic Modeling (국내 마이데이터 태동과 데이터 거래에 관한 잠재적 주제 분석)

  • Cho, Ji Yeon;Lee, Bong Gyou
    • Journal of Digital Convergence
    • /
    • /
    • /
  • With the recent full-scale launch of MyData services, interest in the use of personal data is increasing. However, studies on MyData are still at an early stage and focus on legal and institutional discussions; studies from a comprehensive perspective are insufficient. Therefore, this study aimed to find the potential topics formed by social discussion by analyzing news data from 2018 to the present. News data were analyzed using LDA topic modeling, and six potential topics were derived: digital transformation in finance, the scope of MyData business licenses, amendments to data-related laws, safe use of big data, data economy promotion policy, and the strategy of the financial industry. This study is significant in that it takes a comprehensive view of the issues that emerged with MyData and identifies gaps in previous discussion. Future research is expected to identify changes after the launch of MyData services and to provide specific implications through industry-specific studies.

Activation of Health Care Big Data (헬스케어 분야에서의 빅데이터 활용 활성화 방안)

  • Moon, Ja-hwa
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • 2021.05a
    • pp.483-486
    • 2021
  • With the explosive increase in data, the 'big data era' has arrived, with a focus on deriving new value and insights from data. As data analysis technology develops, the importance of data analysis and utilization is growing in diagnosis and treatment as well as in prevention, and the use of big data is emerging in the healthcare field. Moreover, because the three data-related laws (Personal Information Protection Act, Information and Communication Network Act, and Credit Information Act) were passed in January 2020, it became possible to use a wide range of big data through pseudonymous information. However, the use of healthcare big data is still held back by various policies and regulations, inconsistent data quality, and a lack of specialized personnel. Therefore, this study examines the current state of big data use in the healthcare field and analyzes the challenges, overseas cases, plans, and expected effects of activating healthcare big data.

Reconstruction of 3D Volume of Talairach Brain Atlas (Talairach 뇌지도의 3차원 볼륨 재구성)

  • 백철화;김태우
    • Journal of Biomedical Engineering Research
    • v.20 no.4
    • pp.409-417
    • 1999
  • The Talairach atlas consists of three orthogonal sets of coronal, sagittal, and axial slices. The atlas has recently taken on an important role as a standard brain atlas for diagnosing diseases related to brain function and analyzing the causes of brain disease. The 3D digital volume data set reconstructed from the atlas is widely applied to the visualization and quantitative analysis of results processed on a digital computer. This paper presents a way of applying the bi-linear interpolation technique and proposes a tri-planar interpolation algorithm for reconstructing 3D volume data from the Talairach atlas. We also implemented a Talairach atlas editor and discuss problems in volume reconstruction of the atlas. The bi-linear method was applied to only one set of slices and considered only the intensity value in the interpolation process. The tri-planar technique concurrently uses the three orthogonal sets of slices, which carry the same information about brain structures. The Talairach atlas editor visualizes the three sets of atlas slices in the same coordinate system and provides editing functions. Using the atlas editor, we illustrate problems in volume reconstruction by showing inconsistencies in brain structures among the three sets of atlas slices.
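
As a reference point for the bi-linear method mentioned above, the sketch below interpolates an intensity value at a fractional position inside a single 2D slice. It is a generic illustration, not the authors' editor or their tri-planar algorithm: the function name `bilinear` and the `grid[row][col]` layout are assumptions.

```python
import math


def bilinear(grid, x, y):
    """Interpolate grid[row][col] at fractional column x and row y."""
    rows, cols = len(grid), len(grid[0])
    if not (0 <= x <= cols - 1 and 0 <= y <= rows - 1):
        raise ValueError("query point lies outside the grid")
    x0, y0 = math.floor(x), math.floor(y)
    # Clamp the upper neighbors so points on the last row/column stay in range.
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    tx, ty = x - x0, y - y0
    # Interpolate along x on the two bracketing rows, then along y.
    top = (1 - tx) * grid[y0][x0] + tx * grid[y0][x1]
    bottom = (1 - tx) * grid[y1][x0] + tx * grid[y1][x1]
    return (1 - ty) * top + ty * bottom


if __name__ == "__main__":
    # Made-up 2x2 patch of intensity values.
    patch = [[0, 10],
             [20, 30]]
    print(bilinear(patch, 0.5, 0.5))
```

Because this uses intensities from one slice set only, it cannot reconcile structures that appear differently in the coronal, sagittal, and axial sets, which is the gap the tri-planar approach targets.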

