• Title/Summary/Keyword: data mining processes

An Algorithm Study to Detect Mass Flow Controller Error in Plasma Deposition Equipment Using Artificial Immune System (인공면역체계를 이용한 플라즈마 증착 장비의 유량조절기 오류 검출 실험 연구)

  • You, Young Min;Jeong, Ji Yoon;Ch, Na Hyeon;Park, So Eun;Hong, Sang Jeen
    • Journal of the Semiconductor & Display Technology / v.20 no.4 / pp.161-166 / 2021
  • Errors in semiconductor processes usually arise when the state of the equipment changes or when its component parts are defective. In this investigation, we hypothesized that aging of the mass flow controller causes a minute flow rate shift in the plasma-enhanced chemical vapor deposition (PECVD) of SiO2 thin films. Fourier transform infrared (FTIR) film quality analysis of the deposited thin films was used to characterize normal and abnormal processes in seven cases. The plasma condition was monitored using optical emission spectroscopy (OES) data as the flow rate changed during the process. The collected OES data were preprocessed and then applied to an artificial immune system algorithm for process diagnosis. Classification accuracy of the learning algorithm was compared across datasets to improve the method. The results confirm that the artificial immune system data mining method can discriminate between data from the normal process and data from abnormal processes with differing flow rates.
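
The abstract does not say which artificial immune system variant was used. As a rough illustration only, the sketch below assumes negative selection, one classic AIS algorithm: random detectors are kept only if they match no normal sample, and a new sample is flagged as abnormal when any detector matches it. The feature vectors, radius, and function names are hypothetical stand-ins for preprocessed OES features scaled to [0, 1].

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def train_detectors(normal_samples, n_detectors, radius, dim, seed=0, max_tries=100_000):
    """Negative selection: keep random candidates that match no normal sample."""
    rng = random.Random(seed)
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = [rng.random() for _ in range(dim)]
        # A candidate "matches" a sample when it lies within `radius` of it.
        if all(euclidean(candidate, s) > radius for s in normal_samples):
            detectors.append(candidate)
    return detectors


def is_abnormal(sample, detectors, radius):
    """A sample is flagged when at least one detector matches it."""
    return any(euclidean(sample, d) <= radius for d in detectors)


# Hypothetical 2-D features for the normal process (not data from the paper).
normal = [[0.50, 0.50], [0.52, 0.48], [0.48, 0.51]]
RADIUS = 0.15
detectors = train_detectors(normal, n_detectors=200, radius=RADIUS, dim=2)
shifted = [0.20, 0.80]  # hypothetical sample from a shifted flow rate
print(is_abnormal(normal[0], detectors, RADIUS), is_abnormal(shifted, detectors, RADIUS))
```

By construction, no detector lies within the radius of a training sample, so the training samples themselves are never flagged. Coverage of abnormal regions depends on the number of detectors and the radius, which would need tuning on real data.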

Interactive Visualization for Patient-to-Patient Comparison

  • Nguyen, Quang Vinh;Nelmes, Guy;Huang, Mao Lin;Simoff, Simeon;Catchpoole, Daniel
    • Genomics & Informatics / v.12 no.1 / pp.21-34 / 2014
  • A visual analysis approach and the developed supporting technology provide a comprehensive solution for analyzing large and complex integrated genomic and biomedical data. This paper presents a methodology that is implemented as an interactive visual analysis technology for extracting knowledge from complex genetic and clinical data and then visualizing it in a meaningful and interpretable way. By integrating domain knowledge into the development and analysis processes, we have developed a comprehensive tool that supports seamless patient-to-patient analysis, from an overview of the patient population in the similarity space to detailed views of genes. The system consists of multiple components enabling the complete analysis process, including data mining, interactive visualization, analytical views, and gene comparison. We demonstrate our approach through a case study of childhood cancer patients, showing how medical scientists use the tool to confirm existing hypotheses and to discover new scientific insights.

PACRIM SCIENCE APPLICATIONS: A DECADE WITH AIRSAR

  • Milne, A.K.;Tapley, I.J.
    • Proceedings of the KSRS Conference / 2002.10a / pp.428-428 / 2002
  • The scientific objectives of PACRIM (Pacific Rim) are to advance the understanding of polarimetric and interferometric radar and to promote its application in environmental research designed to detect and quantify changes in both the physical and human-dominated ecosystems on the earth's surface. The information derived is used to more readily identify environments at risk and to improve environmental decision making and the management of resources, thereby leading to more effective and sustainable land use practices. PACRIM is a collaborative research project organized by NASA's Mission to Planet Earth, Airborne Sciences Program; the Jet Propulsion Laboratory; CSIRO-COSSA; and the Centre for Remote Sensing and GIS at the University of New South Wales. A decade of working with AIRSAR data (1993-2003) in the Australia-Asian-Pacific region has provided the opportunity for more than 400 investigators from 20 countries to collect, analyse, interpret and apply state-of-the-art radar data to earth-science studies. This has been achieved by scientists working within seven broad research themes: forestry and vegetation; geology and tectonic processes; interferometry; disaster management; coastal analysis; agriculture; and urban and regional development. This paper presents an overview of the three data acquisition missions (1993, 1996 and 2000) and the science research outcomes achieved from analyzing high quality radar data.

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.95-110 / 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneering researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is a typical example of unstructured text data. Every day, a large volume of new content is created, digitized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information.
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to determine the sentiment polarity of words, of sentences in a document, and of the whole document. However, most traditional approaches share a common limitation: they do not consider the flexibility of sentiment polarity. That is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we argue that sentiment polarity should be assigned not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we present an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies the sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general-purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model.
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
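
The dictionary-based pipeline described above (tag sentiment words, classify each article, predict the index direction) can be sketched as follows. This is a minimal illustration, not the authors' model: the word lists, polarity values, majority-vote rule, and all function names are assumptions made for the example.

```python
def score_article(tokens, sentiment_dict):
    """Sum the sentiment values of the tokens found in the dictionary."""
    return sum(sentiment_dict.get(token, 0.0) for token in tokens)


def classify_article(tokens, sentiment_dict):
    """Label an article by the sign of its total sentiment score."""
    return "positive" if score_article(tokens, sentiment_dict) >= 0 else "negative"


def predict_direction(articles, sentiment_dict):
    """Hypothetical rule: predict 'up' when most articles are positive."""
    positives = sum(
        1 for tokens in articles if classify_article(tokens, sentiment_dict) == "positive"
    )
    return "up" if positives * 2 > len(articles) else "down"


# Made-up dictionaries: the same word can carry a different polarity
# in a general-purpose dictionary and in a stock-market-specific one.
GENERAL_DICT = {"cut": -1.0, "surge": 1.0, "fall": -1.0}
DOMAIN_DICT = {"cut": 1.0, "surge": 1.0, "fall": -1.0}  # e.g. "rate cut" read as good news

article = ["central", "bank", "rate", "cut"]
print(classify_article(article, GENERAL_DICT))  # negative
print(classify_article(article, DOMAIN_DICT))   # positive
```

The only difference between the two runs is the dictionary, which is the point of a domain-specific dictionary: polarity is assigned by context rather than by a word's fixed general meaning.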

Product Recommendation System on VLDB using k-means Clustering and Sequential Pattern Technique (k-means 클러스터링과 순차 패턴 기법을 이용한 VLDB 기반의 상품 추천시스템)

  • Shim, Jang-Sup;Woo, Seon-Mi;Lee, Dong-Ha;Kim, Yong-Sung;Chung, Soon-Key
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.1027-1038 / 2006
  • Recommendation systems based on a very large database (VLDB) pose many technical problems, so it is necessary to study the system structure and the data mining techniques suitable for a large-scale Internet shopping mall. We therefore design and implement a product recommendation system, using the k-means clustering algorithm and a sequential pattern technique, that can be used in a large-scale Internet shopping mall. The system processes user information in batches, defines the various categories in a hierarchical structure, and uses a sequential pattern mining technique for the search engine. For predictive modeling and experiments, we use real data (users' interests and preferences for given categories) extracted from 30 days of log files of a major Internet shopping mall in Korea. For evaluation, we define PRP (Predictive Recommend Precision), PRR (Predictive Recommend Recall), and PF1 (Predictive Factor One-measure). In the experiments, the best recommendation time and the best learning time of our system are on the order of O(N), and the evaluation measures show very good values.
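
As a rough illustration of the k-means step mentioned above, the sketch below clusters small user preference vectors. It is plain Lloyd-style k-means with a deterministic initialization, not the authors' VLDB implementation; the two-dimensional "interest" vectors and all names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Basic k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


# Hypothetical per-user interest scores for two product categories.
users = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, assignments = kmeans(users, k=2)
print(assignments)  # [0, 1, 0, 0, 1, 1]
```

Each pass over the data costs O(N * k) distance computations, so for a fixed k and a fixed number of iterations the running time grows linearly with the number of users.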

Ginsenoside Rg3, a promising agent for NSCLC patients in the pandemic: a large-scale data mining and systemic biological analysis

  • Zhenjie Zhuang;Qianying Chen;Xiaoying Zhong;Huiqi Chen;Runjia Yu;Ying Tang
    • Journal of Ginseng Research / v.47 no.2 / pp.291-301 / 2023
  • Introduction: Non-small cell lung cancer (NSCLC) patients are particularly vulnerable to the Coronavirus Disease-2019 (COVID-19). Currently, no anti-NSCLC/COVID-19 treatment options are available. As ginsenoside Rg3 is beneficial to NSCLC patients and has been identified as an entry inhibitor of the virus, this study aims to explore underlying pharmacological mechanisms of ginsenoside Rg3 for the treatment of NSCLC patients with COVID-19. Methods: Based on a large-scale data mining and systemic biological analysis, this study investigated target genes, biological processes, pharmacological mechanisms, and underlying immune implications of ginsenoside Rg3 for NSCLC patients with COVID-19. Results: An important gene set containing 26 target genes was built. Target genes with significant prognostic value were identified, including baculoviral IAP repeat containing 5 (BIRC5), carbonic anhydrase 9 (CA9), endothelin receptor type B (EDNRB), glucagon receptor (GCGR), interleukin 2 (IL2), peptidyl arginine deiminase 4 (PADI4), and solute carrier organic anion transporter family member 1B1 (SLCO1B1). The expression of target genes was significantly correlated with the infiltration level of macrophages, eosinophils, natural killer cells, and T lymphocytes. Ginsenoside Rg3 may benefit NSCLC patients with COVID-19 by regulating signaling pathways primarily involved in anti-inflammation, immunomodulation, cell cycle, cell fate, carcinogenesis, and hemodynamics. Conclusions: This study provided a comprehensive strategy for drug discovery in NSCLC and COVID-19 based on systemic biology approaches. Ginsenoside Rg3 may be a prospective drug for NSCLC patients with COVID-19. Future studies are needed to determine the value of ginsenoside Rg3 for NSCLC patients with COVID-19.

Metaverse Platform Customer Review Analysis Using Text Mining Techniques (텍스트 마이닝 기법을 활용한 메타버스 플랫폼 고객 리뷰 분석)

  • Hye Jin Kim;Jung Seung Lee;Soo Kyung Kim
    • Journal of Information Technology Applications and Management / v.31 no.1 / pp.113-122 / 2024
  • This comprehensive study delves into the analysis of user review data across various metaverse platforms, employing advanced text mining techniques such as TF-IDF and Word2Vec to gain insights into user perceptions. The primary objective is to uncover the factors that contribute to user satisfaction and dissatisfaction, thereby providing a nuanced understanding of user experiences in the metaverse. Through TF-IDF analysis, the research identifies key words and phrases frequently mentioned in user reviews, highlighting aspects that resonate positively with users, such as the ability to engage in creative activities and social interactions within these virtual environments. Word2Vec analysis further enriches this understanding by revealing the contextual relationships between words, offering a deeper insight into user sentiments and the specific features that enhance their engagement with the platforms. A significant finding of this study is the identification of common grievances among users, particularly related to the processes of refunds and login, which point to broader issues within payment systems and user interface designs across platforms. These insights are critical for developers and operators of metaverse platforms, suggesting a focused approach towards enhancing user experiences by amplifying positive aspects. The research underscores the importance of continuous improvement in user interface design and the transparency of payment systems to foster a loyal user base. By providing a comprehensive analysis of user reviews, this study offers valuable guidance for the strategic development and optimization of metaverse platforms, ensuring they remain responsive to user needs and continue to evolve as vibrant, engaging virtual environments.
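
To make the TF-IDF step concrete, here is a minimal sketch of one common weighting (relative term frequency times the log of inverse document frequency). The study does not state which TF-IDF variant it used, and the tokenized reviews below are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Invented, pre-tokenized reviews (not data from the study).
reviews = [
    ["refund", "slow", "login"],
    ["creative", "fun", "login"],
    ["refund", "denied", "login"],
]
weights = tf_idf(reviews)
print(weights[0])
```

A term that appears in every review (here "login") gets a weight of zero under this variant, while terms concentrated in few reviews are weighted up. This is why TF-IDF surfaces distinctive complaints rather than words common to all reviews.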

Wine Quality Assessment Using a Decision Tree with the Features Recommended by the Sequential Forward Selection

  • Lee, Seunghan;Kang, Kyungtae;Noh, Dong Kun
    • Journal of the Korea Society of Computer and Information / v.22 no.2 / pp.81-87 / 2017
  • Nowadays, wine is increasingly enjoyed by a wider range of consumers, and wine certification and quality assessment are key elements in supporting the wine industry to develop new technologies for both wine making and selling processes. There have been many attempts to construct a more methodical approach to the assessment of wines, but most of them rely on objective decision rather than subjective judgement. In this paper, we propose a data mining approach to predict human wine taste preferences that is based on easily available analytical tests at the certification step. We used sequential forward selection and a decision tree for this purpose. Experiments with the wine quality dataset from the UC Irvine Machine Learning Repository demonstrate accuracies of 76.7% and 78.7% for red and white wines, respectively.
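
Sequential forward selection can be sketched as a greedy loop: start with no features, repeatedly add the single feature that most improves a score, and stop when no addition helps. In the sketch below, `toy_score` is a hypothetical stand-in for the validation accuracy of a decision tree trained on the chosen features; its numbers are invented and are not results from the paper.

```python
def sequential_forward_selection(features, score, max_features=None):
    """Greedily add the feature that most improves `score`; stop when none does."""
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Evaluate every one-feature extension of the current subset.
        top_score, top_feature = max(
            ((score(selected + [f]), f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if top_score <= best_score:
            break  # no candidate improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


def toy_score(subset):
    """Invented score: per-feature gains minus a small penalty per feature."""
    gains = {"alcohol": 0.30, "volatile acidity": 0.20, "sulphates": 0.10, "density": 0.0}
    return 0.40 + sum(gains[f] for f in subset) - 0.02 * len(subset)


columns = ["density", "sulphates", "alcohol", "volatile acidity"]
selected, best = sequential_forward_selection(columns, toy_score)
print(selected)  # ['alcohol', 'volatile acidity', 'sulphates']
```

In practice the score function would train and evaluate the classifier for each candidate subset, so the number of model fits grows roughly quadratically with the number of features.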

A Novelty Detection Algorithm for Multiple Normal Classes : Application to TFT-LCD Processes (다중 정상 하에서 단일 클래스 분류기법을 이용한 이상치 탐지 : TFT-LCD 공정 사례)

  • Joo, Tae Woo;Kim, Seoung Bum
    • Journal of Korean Institute of Industrial Engineers / v.39 no.2 / pp.82-89 / 2013
  • Novelty detection (ND) is an effective technique that can be used to determine whether a future observation is normal or not. In the present study we propose a novelty detection algorithm that can handle a situation where the distributions of target (normal) observations are inhomogeneous. A simulation study and a real case with the TFT-LCD process demonstrated the effectiveness and usefulness of the proposed algorithm.

Comparison of Feature Selection Processes for Image Retrieval Applications

  • Choi, Young-Mee;Choo, Moon-Won
    • Journal of Korea Multimedia Society / v.14 no.12 / pp.1544-1548 / 2011
  • Feature selection, the process of choosing a subset of the original features, is considered a crucial preprocessing step for image processing applications. A large pool of techniques has already been developed in the machine learning and data mining fields. In this paper, two approaches, classification without feature selection and classification with feature selection, are investigated to compare their predictive effectiveness. Color co-occurrence features are used to define image features. The standard Sequential Forward Selection algorithm is used to identify relevant features and the redundancy among them. Four color spaces, RGB, YCbCr, HSV, and Gaussian space, are considered for computing color co-occurrence features. Gray-level image features are also considered for performance comparison. The experimental results are presented.
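
The abstract does not define its color co-occurrence feature precisely. As a generic illustration, the sketch below counts how often one quantized color index appears next to another at a fixed pixel offset, analogous to a gray-level co-occurrence matrix. The tiny image, the offset, and the function name are assumptions for the example.

```python
from collections import Counter


def color_cooccurrence(image, dx=1, dy=0):
    """Count (color, neighbor color) pairs at offset (dx, dy).

    `image` is a list of rows, each a list of quantized color indices.
    """
    counts = Counter()
    height = len(image)
    for y in range(height):
        for x in range(len(image[y])):
            ny, nx = y + dy, x + dx
            # Skip pairs whose neighbor falls outside the image.
            if 0 <= ny < height and 0 <= nx < len(image[ny]):
                counts[(image[y][x], image[ny][nx])] += 1
    return counts


# Hypothetical 2x3 image with two quantized colors (0 and 1).
image = [
    [0, 0, 1],
    [1, 1, 0],
]
counts = color_cooccurrence(image)
print(sorted(counts.items()))
```

The resulting counts, usually normalized, form the feature vector from which a selection method such as Sequential Forward Selection would then pick a subset.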