• Title/Abstract/Keywords: Active Mining

149 search results (processing time: 0.026 s)

Advanced Improvement for Frequent Pattern Mining using Bit-Clustering

  • 김의찬;김계현;이철용;박은지
    • 한국공간정보시스템학회 논문지 / Vol. 9, No. 1 / pp. 105-115 / 2007
  • Data mining is the task of finding meaningful information in the large volume of general information stored in databases. Among the many data mining techniques, clustering and association rules are studied most often. For clustering, many techniques that handle spatial data or attribute (non-spatial) data are being studied, and for association rules, research on finding frequent patterns is also actively under way. One existing approach to improving the apriori association rule algorithm uses bit clustering. We examined FP-Growth, which performs better than apriori, identified its problems, and studied whether bit clustering can solve them. This paper proposes partitioning the whole database into several clusters through bit clustering and using those clusters in the FP-Growth method. Doing so can give better performance than the existing FP-Growth method, and we carried out experiments to demonstrate this. The experiments used the chess data set commonly used in pattern mining research and generated FP-Trees under varying minimum support. When the minimum support was high, the results were similar to those of the existing method, but in the other cases the proposed method gave better results. As its main conclusion, the paper finds that the bit clustering based method is a comparatively superior data mining method, and it also discusses a methodology for applying bit clustering to GML data.

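The abstract above does not spell out how bit clustering or the FP-Tree construction is done, so the following is only a minimal sketch under stated assumptions: transactions are packed into integer bitmasks, grouped greedily by a Hamming-distance threshold (a hypothetical stand-in for the paper's bit clustering), and item supports are then counted per cluster, which corresponds to the first scan FP-Growth makes before building a tree. The names `encode`, `bit_cluster`, `frequent_items`, and the sample data are illustrative and do not come from the paper.

```python
from collections import Counter


def encode(transaction, item_index):
    """Pack a transaction (iterable of items) into an integer bitmask."""
    mask = 0
    for item in transaction:
        mask |= 1 << item_index[item]
    return mask


def hamming(a, b):
    """Number of item positions in which two bitmasks differ."""
    return bin(a ^ b).count("1")


def bit_cluster(masks, max_distance):
    """Greedy grouping (an assumption, not the paper's method): place each
    mask in the first cluster whose first member is within max_distance."""
    clusters = []
    for m in masks:
        for cluster in clusters:
            if hamming(cluster[0], m) <= max_distance:
                cluster.append(m)
                break
        else:
            clusters.append([m])
    return clusters


def frequent_items(masks, n_items, min_support):
    """Return {item position: count} for items whose count >= min_support."""
    counts = Counter()
    for m in masks:
        for pos in range(n_items):
            if (m >> pos) & 1:
                counts[pos] += 1
    return {pos: c for pos, c in counts.items() if c >= min_support}


# Hypothetical toy transactions, for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "d"},
    {"e", "f"},
    {"e", "f", "g"},
]
items = sorted(set().union(*transactions))
item_index = {item: i for i, item in enumerate(items)}
masks = [encode(t, item_index) for t in transactions]

clusters = bit_cluster(masks, max_distance=2)
per_cluster = [frequent_items(c, len(items), min_support=2) for c in clusters]
```

Each entry of `per_cluster` could then seed a separate, smaller FP-Tree; whether that matches the paper's procedure is not confirmed by the abstract.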

A proof-of-concept study of extracting patient histories for rare/intractable diseases from social media

  • Yamaguchi, Atsuko;Queralt-Rosinach, Nuria
    • Genomics & Informatics / Vol. 18, No. 2 / pp. 17.1-17.4 / 2020
  • The amount of content on social media platforms such as Twitter is expanding rapidly. Simultaneously, the lack of patient information seriously hinders the diagnosis and treatment of rare/intractable diseases. However, these patient communities are especially active on social media. Data from social media could serve as a source of patient-centric knowledge for these diseases complementary to the information collected in clinical settings and patient registries, and may also have potential for research use. To explore this question, we attempted to extract patient-centric knowledge from social media as a task for the 3-day Biomedical Linked Annotation Hackathon 6 (BLAH6). We selected amyotrophic lateral sclerosis and multiple sclerosis as use cases of rare and intractable diseases, respectively, and we extracted patient histories related to these health conditions from Twitter. Four diagnosed patients for each disease were selected. From the user timelines of these eight patients, we extracted tweets that might be related to health conditions. Based on our experiment, we show that our approach has considerable potential, although we identified problems that should be addressed in future attempts to mine information about rare/intractable diseases from Twitter.

Trajectory Search Algorithm for Spatio-temporal Similarity of Moving Objects on Road Network

  • 김영창;라빈드라 비스타;장재우
    • 한국공간정보시스템학회 논문지 / Vol. 9, No. 1 / pp. 59-77 / 2007
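  • With the spread of mobile environments and advances in the underlying technologies, representing and analyzing moving objects effectively has become an important problem. In this setting, similarity search over moving-object trajectories is an important research area within trajectory data mining. This paper proposes a spatio-temporal similar trajectory search algorithm for moving-object trajectories on road networks. To this end, we define a spatio-temporal distance between two moving-object trajectories on a road network and, based on it, propose a method for measuring the spatio-temporal similarity between trajectories. For efficient retrieval, the similar trajectory algorithm searches trajectories using a signature file technique. Finally, we implement the proposed spatio-temporal similar trajectory search algorithm and demonstrate its efficiency through performance analysis.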

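The abstract above mentions a signature file technique without describing its design, so the sketch below only illustrates the general idea of signature-based filtering (superimposed coding), not the paper's algorithm or its spatio-temporal distance. It assumes a trajectory is a list of road-segment ids and that a query asks for trajectories containing all given segments; `SIGNATURE_BITS`, `segment_bits`, `signature`, `candidates`, and the sample data are hypothetical.

```python
import hashlib

SIGNATURE_BITS = 64  # assumed signature width


def segment_bits(segment_id, k=2):
    """Map one road-segment id to k bit positions via a stable hash."""
    digest = hashlib.sha256(segment_id.encode("utf-8")).digest()
    return [digest[i] % SIGNATURE_BITS for i in range(k)]


def signature(segments):
    """Superimpose (OR together) the bits of every segment in a trajectory."""
    sig = 0
    for seg in segments:
        for bit in segment_bits(seg):
            sig |= 1 << bit
    return sig


def candidates(query_segments, trajectories, signatures):
    """Filter by signature first, then verify against the stored trajectory
    to discard false drops (signatures can match by coincidence)."""
    q = signature(query_segments)
    hits = []
    for name, sig in signatures.items():
        if (sig & q) == q and set(query_segments) <= set(trajectories[name]):
            hits.append(name)
    return sorted(hits)


# Hypothetical trajectories over road segments e1..e8.
trajectories = {
    "t1": ["e1", "e2", "e3"],
    "t2": ["e2", "e3", "e4"],
    "t3": ["e7", "e8"],
}
signatures = {name: signature(segs) for name, segs in trajectories.items()}
result = candidates(["e2", "e3"], trajectories, signatures)
```

The signature test never rejects a true match, because every bit set by the query's segments is also set in any trajectory that contains them; it only lets the search skip the more expensive comparison for most non-matching trajectories.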

Development of a Technique for Removing As(III) and As(V) from Groundwater Using Sand Coated with Nano-Sized Hematite Particles

  • 고일원;이철효;이상우;김주용;김경웅
    • 한국지하수토양환경학회: Conference Proceedings / 한국지하수토양환경학회 2003 Fall Conference / pp. 78-82 / 2003
  • Hematite-coated sand was developed and evaluated for use in a PRB (permeable reactive barrier) in the arsenic-contaminated subsurface of metal mining areas. The removal efficiency of As(III) and As(V), the effect of anion competition, and the capability of arsenic removal in a flow system were investigated through experiments on adsorption isotherms, arsenic removal kinetics under anion competition, and column removal. Hematite-coated sand followed a linear adsorption isotherm with high adsorption capacity at low arsenic concentrations (< 1.0 mg/l). When As(III) and As(V) underwent adsorption reactions in the presence of anions (sulfate, nitrate and bicarbonate), sulfate caused strong inhibition of arsenic removal, and bicarbonate and nitrate caused weak inhibition due to specific and nonspecific adsorption onto hematite, respectively. In the column experiments, a high content of hematite-coated sand enhanced arsenic removal, but the amount of arsenic removed decreased due to the higher affinity of As(V) than As(III) and reduced adsorption kinetics in the flow system. Therefore, the amount of hematite-coated sand, the adsorption affinity of the arsenic species, and the removal kinetics determined the removal efficiency of arsenic in the flow system. Keywords: arsenic, hematite-coated sand, permeable reactive barrier, anion competition, adsorption.


An Analysis of Geological Research Activities in North Korea

  • 김성용;윤성택;허철호
    • 자원환경지질 / Vol. 35, No. 4 / pp. 373-378 / 2002
  • Among the science and engineering fields in the North Korean Academy of Sciences, geology accounts for about 10 percent of the total number of departments. An analysis of the major geologic research fields in North Korea, based on the number of authors of 2000-2001 publications in a representative journal, "Geology and Geography", shows the following proportions: mineralogy and petrology (31.0%), stratigraphy and paleontology (12.3%), economic geology and geochemistry (11.6%), geophysics and structural geology (14.2%), and applied geology (31.0%). These proportions are similar to those in South Korea in the 1960s and show that geologic research activity in North Korea is concentrated on mineral resources exploration. Academic collaboration between South and North Korea in the near future should include research on reconstructing the geologic history of the Korean peninsula and Northeast Asia and on environmental restoration from mining-related pollution in North Korea. For active academic interchange between South and North Korea, efforts to overcome the academic gap are requisite. Frequent joint symposia, an interchange programme for post-doctoral fellows, and cooperative research on specific topics are recommended for this effort.

Effective Vitalization Plan of Electronic Cash using Bitcoin

  • 이준형;이성훈;이도은;김우철;김민수
    • 융합보안논문지 / Vol. 16, No. 4 / pp. 79-90 / 2016
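  • The 'electronic cash' in circulation today merely digitizes the physical currency that was already in use and has not moved beyond physically circulated money. Bitcoin in particular has no issuing authority and is issued only through the act of 'mining', and transactions between individuals take place peer-to-peer (P2P) and are verified through the 'blockchain (BlockChain)'. It has been recognized as currency in some countries and circulates fairly actively, but its characteristics give rise to various problems. This study therefore proposes alternatives for the policy, management, and technical problems that affect the vitalization of Bitcoin.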

Analysis of Infertility Keywords in the Largest Domestic Mom Cafe Bulletin Board in Korea Using Text Mining

  • Sangmin Lee
    • 인터넷정보학회논문지 / Vol. 24, No. 4 / pp. 137-144 / 2023
  • The purpose of this study is to examine consumers' perceptions of domestic infertility support policies based on infertility-related keywords and the trends of their changes. To this end, Momsholic, a mom cafe which has the most active infertility-related bulletin boards on Naver, was selected as the analysis target, and 'infertility' was selected as a keyword for data search. The data was collected for three months. In addition, network analysis and visualization were performed using R for data collection and analysis, and cross-validation was attempted using the NetDraw function of 'textom 1.0' and the UCINET6 program. As a result of the analysis, the main keywords were cost, artificial insemination, in vitro fertilization, freezing, harvest, ovulation, and how much. Next, looking at the central value of the degree of connection, it was found that the degree of connection between the words cost, cost, how much, problem, public health center, and artificial insemination was high. According to the results of this study, women who visit mom cafes due to infertility in Korea are more interested in the cost. It is believed to be closely related to infertility treatment as well as in vitro fertilization and egg freezing. Therefore, by examining keywords related to infertility, this study has academic significance in that it makes it possible to identify major factors that end users are interested in. Furthermore, it is possible to redefine the guidelines for domestic infertility support policies by presenting infertility support policies that reflect the factors of interest of end consumers.

Analysis of Shipping and Logistics News Articles using Topic Modeling

  • 윤희영;곽일엽
    • 무역학회지 / Vol. 46, No. 4 / pp. 61-76 / 2021
  • This study focuses on three logistics-related news (Logistics Newspaper, Korea Shipping Gadget, and Korea Shipping Newspaper) in order to present changes in logistics issues, centering on Corona 19, which has recently had the greatest impact in the world. For data collection, two-year news articles in 2019 and 2020 (title, article, content, date, article classification, article URL) were collected through web crawling (using Python's BeautifulSoup, requests module) on the homepages of three representative logistics-related media companies. As for the data analysis methods, fundamental statistical analysis, Latent Dirichlet Allocation (LDA) for topic modeling, and Scattertext were performed. The analysis results were as follows. First, among the three news media related to logistics, the Korea Shipping Newspaper was carrying out the most active media activities. Second, through topic modeling with LDA, eight logistics-related topics were identified, and keywords and significant issues of each topic were presented. Third, the keywords were visually expressed through Scattertext. This is the first study to present changes in the logistics field, focusing on articles from representative logistics-related media in 2019 and 2020. In particular, 2019 and 2020 can be divided into before and after the outbreak of Corona 19, which has had a great impact not only on the logistics field but also on our lives as a whole. For future work, a multi-faceted approach is required, such as comparative studies of logistics issues between countries or presenting implications based on long-term time-series articles.

A Study on the Research Trends in Int'l Trade Using Topic modeling

  • 이지훈;김정숙
    • 무역학회지 / Vol. 45, No. 3 / pp. 55-69 / 2020
  • This study examines the research trends and knowledge structure of international trade studies using topic modeling, which is one of the main methodologies of text mining. We collected and analyzed the English abstracts of 1,868 papers from three major Korean journals in the area of international trade from 2003 to 2019. We used Latent Dirichlet Allocation (LDA), an unsupervised machine learning algorithm, to extract the latent topics from the large quantity of research abstracts. 20 topics were identified without any prior human judgement. The topics reveal topographical maps of research in international trade and are representative and meaningful in the sense that most of them correspond to previously established sub-topics in trade studies. We then conducted a regression analysis on the document-topic distributions generated by LDA to identify hot and cold topics. We discovered 2 hot topics (internationalization capacity and performance of export companies, economic effect of trade) and 2 cold topics (exchange rate and current account, trade finance). Trade studies are characterized as an interdisciplinary study of three agendas (i.e., international economy, international business, trade practice), and the 20 topics identified can be grouped into these 3 agendas. From the estimated results of the study, we find that the Korean government's active pursuit of FTAs and the consequent necessity of capacity building in Korean export firms lie behind the popularity of topic selection by Korean researchers in the area of int'l trade.
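The hot/cold topic step described in the abstract above can be illustrated with a small sketch. The abstract does not give the regression specification, so this assumes the simplest version: average each topic's share of the documents per year, fit an ordinary least-squares line against the year, and label the topic by the sign and size of the slope. The `threshold` cutoff, the function names, and the yearly shares are hypothetical; a real analysis would more likely test the slope for statistical significance.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def classify_topics(years, yearly_share, threshold):
    """Label a topic 'hot' if its share rises faster than threshold per year,
    'cold' if it falls faster than threshold, and 'flat' otherwise."""
    labels = {}
    for topic, shares in yearly_share.items():
        b = slope(years, shares)
        if b > threshold:
            labels[topic] = "hot"
        elif b < -threshold:
            labels[topic] = "cold"
        else:
            labels[topic] = "flat"
    return labels


# Hypothetical mean topic shares per year (not data from the paper).
years = [2015, 2016, 2017, 2018, 2019]
yearly_share = {
    "topic_a": [0.02, 0.04, 0.06, 0.08, 0.10],
    "topic_b": [0.10, 0.08, 0.06, 0.04, 0.02],
    "topic_c": [0.05, 0.05, 0.05, 0.05, 0.05],
}
labels = classify_topics(years, yearly_share, threshold=0.005)
```

In practice the yearly shares would come from averaging the rows of the LDA document-topic matrix over the documents published in each year.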

Data Mining and Construction of Database Concerning Effects of Vitis Genus

  • 김민아;조윤주;신지영;신민규;배현수;홍무창;김양석
    • 동의생리병리학회지 / Vol. 26, No. 4 / pp. 551-556 / 2012
  • Databases for oriental medicine formerly existed as printed documents and, in the information society, have developed into databases that support random access. However, such databases are not very diverse, and databases of herbal biomaterials exist mainly as broad, dictionary-style collections. Databases that treat individual raw herbal medicines in depth are insufficient in both quantity and quality. Korean wild grape is a deciduous plant in the family Vitaceae, and experiments have found that it has various medicinal effects. It has recently become one of the medicinal materials with high potential for academic study and commercialization, because it is likely to be applicable in diverse industrial fields, including medical products for health, food, and beauty. With this potential in mind, we set up a cooperative system among the Muju cluster business group for Korean mountain wild grapes, the Physiology Laboratory in Kyung Hee University Oriental Medicine, and the Medical Classics Laboratory in Kyung Hee University Oriental Medicine, and built a database for Korean wild grapes as a touchstone for establishing in-depth databases of single medicinal biomaterials. First, to cover the Eastern and Western research records and writings related to Korean wild grapes, the Northeast Asian literature was categorized into classical literature (Korean literature published by government organizations, Korean classical literature, Chinese classical literature, and classical literature for Korean and Chinese oriental medicine) and modern literature (modern literature for oriental medicine, and modern literature for domestic and foreign herbal medicine), and the text mining was performed in cooperation with the Medical Classics Laboratory in Kyung Hee University Oriental Medicine.
Next, experimental and theoretical data on Korean wild grape were collected from the Medline database, which is maintained by the U.S. National Library of Medicine, in order to organize the domestic and foreign papers on Korean wild grapes, and hyperlink and download functions were added so that the collected data can be searched and viewed directly. The paper search provides various auxiliary functions and supports diverse queries, such as the name of a subspecies of Korean wild grape or the logical intersection of active ingredients, efficacy, and elements. It was built to make experiment design easier for researchers planning studies of Korean wild grape. In addition, patent data for Korean wild grape were collected from the European Patent Office in view of its commercialization potential, and a system for searching and viewing them was established along the same lines. Perl was used for query programming and MS-SQL for establishing and managing the database. The data are currently available for free use at the following address: http://163.180.41.43:8011/index.html