• Title/Summary/Keyword: 최근접 탐색 (nearest-neighbor search)

Search Result 51, Processing Time 0.022 seconds

Cancer Diagnosis System using Genetic Algorithm and Multi-boosting Classifier (Genetic Algorithm과 다중부스팅 Classifier를 이용한 암진단 시스템)

  • Ohn, Syng-Yup;Chi, Seung-Do
    • Journal of the Korea Society for Simulation / v.20 no.2 / pp.77-85 / 2011
  • It is believed that anomalies and diseases of human organs can be identified by pattern analysis. This paper proposes a new classification technique for identifying cancer from the proteome patterns obtained by two-dimensional polyacrylamide gel electrophoresis (2-D PAGE). In the new method, three classifiers, namely the support vector machine (SVM), the multi-layer perceptron (MLP), and k-nearest neighbor (k-NN), are extended by the multi-boosting method into an array of subclassifiers, and the results of the subclassifiers are merged by an ensemble method. A genetic algorithm was applied to obtain the optimal feature set for each subclassifier. We applied our method to an empirical data set from cancer research, where it showed better accuracy and more stable performance than a single classifier.

Exploring the Performance of Synthetic Minority Over-sampling Technique (SMOTE) to Predict Good Borrowers in P2P Lending (P2P 대부 우수 대출자 예측을 위한 합성 소수집단 오버샘플링 기법 성과에 관한 탐색적 연구)

  • Costello, Francis Joseph;Lee, Kun Chang
    • Journal of Digital Convergence / v.17 no.9 / pp.71-78 / 2019
  • This study aims to identify good borrowers in the context of P2P lending. P2P lending is a growing platform that allows individuals to lend money to and borrow money from each other. Borrower credit risk is inherent in any loan and needs to be considered before lending. In the context of P2P lending specifically, traditional models fall short, so this study aimed to address that shortcoming and to explore the class imbalance seen in credit risk data sets. This study implemented an over-sampling technique known as the Synthetic Minority Over-sampling Technique (SMOTE). To test our approach, we implemented five benchmark classifiers: support vector machine, logistic regression, k-nearest neighbor, random forest, and deep neural network. The data sample was retrieved from the publicly available LendingClub dataset. The proposed SMOTE approach yielded significantly improved results in comparison with the benchmark classifiers. These results should help actors engaged in P2P lending make better-informed decisions when selecting potential borrowers, reducing the higher risks present in P2P lending.
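
The over-sampling step named in this abstract can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic point is placed on the line segment between a minority sample and one of its k nearest minority neighbors. The function name `smote`, its parameters, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # The new point lies on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical toy data: four minority-class points in two dimensions.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point is an interpolation between two existing minority samples, it stays inside the region the minority class already occupies, which is the property that distinguishes SMOTE from simple duplication.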

Current Status and Tasks of the Home Economics Curriculum in Korea (한국 가정과 교육과정의 현황과 과제)

  • 윤인경
    • Proceedings of the KHEEA Conference / 2002.08a / pp.5-19 / 2002
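  • In Korea, the first nationally established home economics curriculum was introduced in 1995. Since then, the Korean home economics curriculum has undergone seven rounds of revision and reform. In the process, the curriculum has been continually renewed: the name of the subject has changed, its status as a required or elective course has changed, its class hours have been reduced, and it has been integrated with technology. In addition, home economics has become a subject taken by both male and female students, in line with the needs of social development. In the Korean curriculum, grades 1 to 10 form the national common basic education stage and grades 11 to 12 form the elective education stage. Accordingly, the home economics curriculum is arranged as follows: "Practical Arts" in elementary school (grades 5~6), "Technology and Home Economics" in middle and high school (grades 7~10), and "Home Science" in high school (grades 11~12). The class-hour allocation to be implemented from 2003 is as follows: 2 class hours in each elementary grade; 2, 3, 3, and 3 class hours in grades 7~12; and 6 units in grades 11~12. Recently, social problems such as youth problems, the educational environment, loss of humanity, family breakdown, overconsumption, and child abuse have all been closely connected with family life, so home economics education should be given weight in education. In practice, however, this is not the case. Neither teachers nor parents, the main agents of education, recognize this reality. Home economics scholars and teachers therefore need to study home economics education actively. In particular, the stable development of home economics education should be explored through exchange and cooperation with Asian countries such as China and Japan. Suggestions for the future direction of home economics education are as follows: 1) "Home Economics" as a school subject is an independent field of study within the discipline of home economics, which takes the home as its object of study and aims to improve the quality of family life. Naming the subject "Technology and Home Economics", "General Home Economics", or "Housework" is therefore unreasonable; it should be named "Home Economics". 2) Home economics education should start from the perspective of changing gender roles and changing perceptions of occupations, should be offered as required and elective courses from elementary school through high school, and should be taken by both male and female students. 3) Class hours for home economics are gradually being reduced. This is an unavoidable consequence of the reduction of the curriculum as a whole. However, because home economics is a practical and experimental subject, a minimum number of class hours should be guaranteed; at the very least, the current class hours must be maintained. 4) Traditionally, the Korean home economics curriculum has taken the discipline of home economics as its basic philosophical background and ideology, and has promptly reflected the background and ideology of the national curriculum and developments in home economics; that is, it emphasized how to teach. After repeated reforms, however, the emphasis has recently shifted to which abilities and values students should develop. Home economics education therefore sets its goals on how to improve the quality of family life, how to balance family life and working life, and how to develop values for rationally solving the problems of family life and putting the solutions into practice. 5) Recently, home economics education has set its direction and goals on cultivating students' ability to make their way through life as independent individuals, as family members, and as members of society. Home economics education is therefore centered on human life. It addresses the problems students encounter while growing up as well as those encountered in family and social life. That is, it aims to cultivate the comprehensive ability to solve all the problems encountered in family life. 6) In teaching methods and assessment, home economics should use experiments, practical training, observation, and similar approaches, and should involve hands-on, practical experience. To this end, at least basic facilities for experiments and practical training should be provided. 7) When the curriculum structure is determined, the participation of general education scholars should be increased, and home economics education scholars should take an active part in the various education committees that set education policy, contributing what they can and actively offering constructive opinions during policy making. 8) Korea, China, and Japan should establish a cooperative body on the home economics curriculum to contribute to the sound development of the curricula of the three countries.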

Semantic Similarity Search using the Signature Tree (시그니처 트리를 사용한 의미적 유사성 검색 기법)

  • Kim, Ki-Sung;Im, Dong-Hyuk;Kim, Cheol-Han;Kim, Hyoung-Joo
    • Journal of KIISE:Databases / v.34 no.6 / pp.546-553 / 2007
  • As ontologies come into wide use, interest in semantic similarity search is also increasing. In this paper, we suggest a query evaluation scheme for the k-nearest neighbor query, which retrieves the k objects most similar to the query object. We use the best match method to calculate the semantic similarity between objects and use the signature tree to index the annotation information of the objects in the database. The signature tree is usually used for set similarity search. Using the signature tree in similarity search requires predicting the upper bound of similarity for a node, that is, the highest similarity value that can be found by traversing into the node. We therefore suggest a prediction function for the best match similarity function and prove the correctness of the prediction. We also modify the original signature tree structure so that identical signatures are not stored redundantly. This improved structure not only reduces the size of the signature tree but also increases the efficiency of query evaluation. We use the Gene Ontology (GO) for our experiments, as it provides large ontologies and a large amount of annotation data. Using GO, we show that the proposed method improves query efficiency and present several experimental results obtained by varying the page size and using several node-splitting methods.
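
The role of the per-node upper bound can be sketched as follows. This is not the paper's method: the sketch swaps in plain Jaccard similarity for the best match similarity and a flat list of groups for the signature tree, and every name in it (`jaccard`, `upper_bound`, `knn_search`, the toy data) is hypothetical. It only shows how a valid upper bound lets a k-nearest neighbor query skip nodes that cannot improve the current result.

```python
import heapq


def jaccard(a, b):
    """Jaccard similarity between two sets."""
    return len(a & b) / len(a | b) if a | b else 1.0


def upper_bound(query, node_union):
    # Any set S stored in the node is a subset of node_union, so
    # |Q & S| <= |Q & node_union| and |Q | S| >= |Q|.
    return len(query & node_union) / len(query) if query else 1.0


def knn_search(query, nodes, k):
    """Return the k best (similarity, item) pairs and the number of nodes opened."""
    # Visit nodes in order of decreasing upper bound.
    order = sorted(nodes, key=lambda n: upper_bound(query, set().union(*n)),
                   reverse=True)
    best = []  # min-heap holding the k best (similarity, sorted item) entries
    visited = 0
    for node in order:
        bound = upper_bound(query, set().union(*node))
        if len(best) == k and bound <= best[0][0]:
            break  # no remaining node can beat the current k-th best
        visited += 1
        for item in node:
            entry = (jaccard(query, item), sorted(item))
            if len(best) < k:
                heapq.heappush(best, entry)
            elif entry > best[0]:
                heapq.heapreplace(best, entry)
    return sorted(best, reverse=True), visited


# Hypothetical annotation sets grouped into three leaf nodes.
nodes = [
    [{"a", "b"}, {"a", "b", "c"}],
    [{"x", "y"}, {"y", "z"}],
    [{"a", "d"}, {"d", "e"}],
]
query = {"a", "b"}
result, visited = knn_search(query, nodes, k=2)
```

In this toy run only the first group needs to be opened, because the bounds of the other two groups do not exceed the second-best similarity already found.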

A Learning Agent for Automatic Bookmark Classification (북 마크 자동 분류를 위한 학습 에이전트)

  • Kim, In-Cheol;Cho, Soo-Sun
    • The KIPS Transactions:PartB / v.8B no.5 / pp.455-462 / 2001
  • The World Wide Web has become one of the major services provided through the Internet. When searching the vast web space, users rely on bookmarking facilities to record sites of interest encountered during navigation. A typical problem with bookmarking is that the list of bookmarks loses its coherent organization when it becomes too long, and thus ceases to function as a practical finding aid. To keep the bookmark file efficient and organized, the user has to classify every newly added bookmark and update the folders. This paper introduces our learning agent, BClassifier, which automatically classifies bookmarks by analyzing the contents of the corresponding web documents. The chief source of training examples is the set of bookmarks that the user has already classified into bookmark folders by subject. In addition, web pages found under the top categories of the Yahoo site are collected and included in the training examples, both to diversify the subject categories represented and to supply training examples for those categories. Our agent employs the naive Bayesian learning method, a well-tested, probability-based categorization technique. The outcome of some experiments is also outlined and evaluated, and the naive Bayesian learning method is compared with other learning methods such as k-nearest neighbor and TFIDF.
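
A naive Bayesian text classifier of the kind this abstract describes can be sketched in a few lines. This is a generic multinomial naive Bayes with add-one smoothing, not the BClassifier implementation; the function names, folder labels, and example texts are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (folder, text) pairs. Returns the count tables."""
    folder_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for folder, text in examples:
        words = text.lower().split()
        folder_counts[folder] += 1
        word_counts[folder].update(words)
        vocabulary.update(words)
    return folder_counts, word_counts, vocabulary


def classify(model, text):
    """Return the folder with the highest posterior log-probability."""
    folder_counts, word_counts, vocabulary = model
    total_docs = sum(folder_counts.values())
    scores = {}
    for folder, n_docs in folder_counts.items():
        total_words = sum(word_counts[folder].values())
        score = math.log(n_docs / total_docs)  # log prior of the folder
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[folder][word] + 1)
                              / (total_words + len(vocabulary)))
        scores[folder] = score
    return max(scores, key=scores.get)


# Hypothetical bookmark folders with page text already extracted.
model = train([
    ("sports", "football match score goal"),
    ("sports", "tennis match player"),
    ("cooking", "recipe oven bake bread"),
    ("cooking", "pasta recipe sauce"),
])
folder = classify(model, "bake a recipe")
```

With these toy examples, a page whose text is "bake a recipe" is assigned to the "cooking" folder, since both known words are more frequent there.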

A Concordance Study of the Preprocessing Orders in Microarray Data (마이크로어레이 자료의 사전 처리 순서에 따른 검색의 일치도 분석)

  • Kim, Sang-Cheol;Lee, Jae-Hwi;Kim, Byung-Soo
    • The Korean Journal of Applied Statistics / v.22 no.3 / pp.585-594 / 2009
  • In microarray experiments, researchers convert processed images of raw data into data suitable for statistical analysis; this step is called preprocessing. Microarray preprocessing consists of image filtering, imputation, and normalization. Several methods of normalization and imputation have been studied, but the order in which the procedures should be applied, that is, whether normalization or imputation should come first, has not. This study examines how the order of the preprocessing steps affects the identification of differentially expressed genes (DEG), using two-dye cDNA microarray data from colon cancer and gastric cancer. That is, we compare which combinations of imputation and normalization steps can detect the DEG. We used two imputation methods (k-nearest neighbor and Bayesian principal component analysis) and three normalization methods (global, within-print-tip group, and variance stabilization), which gives 12 preprocessing procedures once the two possible orders are taken into account. We measured the concordance of the DEG identified from the datasets to which the 12 different preprocessing orders were applied. When variance stabilization was used as the normalization method, there was a small variation in the sensitivity of DEG detection.
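
The count of 12 preprocessing procedures can be reproduced by enumeration, under the assumption that it arises from two imputation methods, three normalization methods, and two possible orders of application. The labels below are shorthand chosen for this sketch.

```python
from itertools import product

imputations = ["k-nearest neighbor", "Bayesian PCA"]
normalizations = ["global", "within-print-tip group", "variance stabilization"]
orders = ["imputation first", "normalization first"]

# Every combination of one imputation method, one normalization method,
# and one order of application is a distinct preprocessing procedure.
procedures = list(product(imputations, normalizations, orders))

for imputation, normalization, order in procedures:
    print(f"{imputation} + {normalization} ({order})")
```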

Lesion Detection in Chest X-ray Images based on Coreset of Patch Feature (패치 특징 코어세트 기반의 흉부 X-Ray 영상에서의 병변 유무 감지)

  • Kim, Hyun-bin;Chun, Jun-Chul
    • Journal of Internet Computing and Services / v.23 no.3 / pp.35-45 / 2022
  • Even in recent years, treatment of emergency patients is still often delayed by a shortage of medical resources in marginalized areas. Research on automating the analysis of medical data is ongoing, with the aim of addressing poor access to medical services and the shortage of medical personnel. Computer vision-based automation of medical examination incurs high costs for collecting and labeling training data. These problems are most pronounced in tasks that classify rare lesions, or pathological features and pathogenesis that are difficult to define clearly in visual terms. Anomaly detection is attracting attention as a method that can significantly reduce the cost of data collection by adopting an unsupervised learning strategy. In this paper, building on existing anomaly detection techniques, we propose the following method for detecting abnormal chest X-ray images. (1) Normalize the brightness range of medical images resampled to the optimal resolution. (2) Select feature vectors with high representative power from the set of intermediate-level patch features extracted from lesion-free images. (3) Measure the difference from the selected feature vectors of lesion-free data using a nearest neighbor search algorithm. The proposed system can perform anomaly classification and localization simultaneously for each image. We measure the anomaly detection performance of the proposed system on chest X-ray images in PA projection and report it under detailed conditions. We demonstrate the effectiveness of anomaly detection for medical images by showing a classification AUROC of 0.705 on a random subset extracted from the PadChest dataset. The proposed system can help improve the clinical diagnosis workflow of medical institutions and can effectively support early diagnosis in medically underserved areas.
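
Steps (2) and (3) can be sketched roughly as follows. The abstract does not state how representative features are chosen or how patch scores are aggregated, so this sketch assumes a greedy farthest-point (k-center) selection for the coreset and takes the maximum patch score as the image-level score; both choices, along with all names and the toy feature vectors, are assumptions made for illustration.

```python
import math


def greedy_coreset(features, size):
    """Pick `size` features by repeatedly taking the one farthest from the picks."""
    size = min(size, len(features))
    selected = [features[0]]
    # Distance from each feature to its closest selected feature so far.
    min_dist = [math.dist(f, selected[0]) for f in features]
    while len(selected) < size:
        idx = max(range(len(features)), key=min_dist.__getitem__)
        selected.append(features[idx])
        min_dist = [min(d, math.dist(f, features[idx]))
                    for d, f in zip(min_dist, features)]
    return selected


def score_image(patches, memory_bank):
    """Score each patch by its distance to the nearest lesion-free feature."""
    patch_scores = [min(math.dist(p, ref) for ref in memory_bank)
                    for p in patches]
    # Assumed aggregation: the image is as anomalous as its worst patch.
    return max(patch_scores), patch_scores


# Hypothetical 2-D patch features from lesion-free images.
normal_features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0]]
memory_bank = greedy_coreset(normal_features, size=2)

# Patches of a test image; the last one is far from all normal features.
image_score, patch_scores = score_image(
    [[0.05, 0.05], [1.0, 1.0], [3.0, 3.0]], memory_bank)
```

The per-patch scores give the localization (which patch looks abnormal), while the aggregated score gives the image-level classification, which is how a single nearest neighbor search can serve both purposes.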

A Research on the Calligraphic Critique of Seokjeong Jeong-Jik Lee - Based on 'Wongyo-Jinjeok' of Wongyo Gwang-Sa Lee (석정 이정직의 서예비평 연구 - 원교 이광사의 『원교진적』을 중심으로 -)

  • Gu, Sa Whae
    • (The)Study of the Eastern Classic / no.32 / pp.29-50 / 2008
  • This thesis introduces and critiques the recently released 'Wongyo-Jinjeok' (원교진적). 'Wongyo-Jinjeok' contains the critique by Seokjeong Jeong-Jik Lee (석정 이정직, 1841-1910), a practical scientist and writer of the last years of the Korean Empire, of the calligraphy of Wongyo Gwang-Sa Lee (원교 이광사, 1705-1777). Whether Seokjeong follows the current of Donggukjinche (동국진체) remains for specialists in the field to determine, but this thesis takes the view that Seokjeong was influenced by Donggukjinche. The academic value of 'Wongyo-Jinjeok' lies in Seokjeong's preface and epilogue, which critique Wongyo's writing. 'Wongyo-Jinjeok' is a collection of calligraphic specimens of the 18 pieces of Chinese poetry that Wongyo wrote before and after June 1756, the year after he was banished to Booryung. In the preface and epilogue, Seokjeong critiqued Wongyo's writing from the perspective of calligraphic history. Seokjeong viewed positively Wongyo's emulation of the pre-Wangheejee calligraphic style. At the same time, he thought that Wongyo's creative ability was limited by the public morals of the time. This view can be interpreted as an evaluation that Wongyo's calligraphy was outwardly stern but failed to pass from the realm of mastery to the realm of creation.

Pre-service mathematics teachers' noticing competency: Focusing on teaching for robust understanding of mathematics (예비 수학교사의 수학적 사고 중심 수업에 관한 노티싱 역량 탐색)

  • Kim, Hee-jeong
    • The Mathematical Education / v.61 no.2 / pp.339-357 / 2022
  • This study explores the noticing competency of pre-service secondary mathematics teachers (PSTs). Seventeen PSTs participated in the study as part of a mathematics teaching methods class. The essays that individual PSTs discussed and wrote at the beginning of the course on the question 'What would effective mathematics teaching be?' were collected as the first data set. The PSTs' written analyses of an expert teacher's teaching video, fellow PSTs' demo-teaching videos, and their own demo-teaching videos were also collected and analyzed. The findings showed that most PSTs' noticing level improved as the class progressed and followed a pattern of focusing on each key aspect of the Teaching for Robust Understanding of Mathematics (TRU Math) framework, although their reasoning strategies varied somewhat. This suggests that the TRU Math framework can help PSTs improve the 'what to attend to' component of noticing. In addition, the instructional reasoning strategies imply that the PSTs' noticing reasoning strategy was mostly related to their interpretation of the noticing components, which should also be emphasized in teacher education programs.

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of the KNN base classifiers and the feature subsets selected for them play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and the feature subsets of the base classifiers in the ensemble. This study proposed a new ensemble method that improves the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of the base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve its prediction accuracy. The proposed model was applied to a bankruptcy prediction problem using a real dataset from Korean companies. The research data comprised 1,800 externally non-audited firms, of which 900 filed for bankruptcy and 900 did not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent-sample t-test, with each financial ratio as the input variable and bankruptcy or non-bankruptcy as the output variable. Of these, 24 financial ratios were selected using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for guarding against overfitting. Prediction accuracy on the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performance of the proposed model with that of other models. To evaluate the effectiveness of the proposed model, its classification accuracy was compared with that of the other models. The Q-statistic values and average classification accuracies of the base classifiers were also investigated.
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
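
The structure of a KNN random subspace ensemble can be sketched as below. Each member is a KNN classifier with its own feature subset and its own k, and the members vote. In the paper these two choices are optimized by a genetic algorithm; this sketch simply draws them at random, so it shows the ensemble being optimized rather than the optimization itself. All function names, parameters, and data values are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Majority label among the k nearest neighbors, using only `features`."""
    def dist(a, b):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def build_ensemble(n_features, n_members, subspace_size, k_choices, seed=0):
    """Give each member a random feature subset and a k (random stand-in for a GA)."""
    rng = random.Random(seed)
    return [(rng.sample(range(n_features), subspace_size), rng.choice(k_choices))
            for _ in range(n_members)]


def ensemble_predict(ensemble, train_X, train_y, x):
    """Combine the members' predictions by majority vote."""
    votes = Counter(knn_predict(train_X, train_y, x, k, feats)
                    for feats, k in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical data: four financial ratios per firm, label 1 = bankrupt.
train_X = [[0.1, 0.2, 0.0, 0.1], [0.0, 0.1, 0.2, 0.0], [0.2, 0.0, 0.1, 0.2],
           [5.0, 5.1, 4.9, 5.0], [5.2, 4.8, 5.0, 5.1], [4.9, 5.0, 5.1, 4.8]]
train_y = [0, 0, 0, 1, 1, 1]

ensemble = build_ensemble(n_features=4, n_members=5, subspace_size=2,
                          k_choices=[1, 3])
prediction = ensemble_predict(ensemble, train_X, train_y, [4.8, 5.1, 5.0, 4.9])
```

A genetic algorithm in this setting would treat the list of (feature subset, k) pairs as the chromosome and use held-out prediction accuracy as the fitness value, as the abstract describes.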