Search | Korea Science

A Research on Enhancement of Text Categorization Performance by using Okapi BM25 Word Weight Method (Okapi BM25 단어 가중치법 적용을 통한 문서 범주화의 성능 향상)

Lee, Yong-Hun;Lee, Sang-Bum
- Journal of the Korea Academia-Industrial cooperation Society / v.11 no.12 / pp.5089-5096 / 2010
Text categorization, which classifies documents according to given criteria, is an important feature of information retrieval systems. The usual approach extracts important index terms from the target documents and assigns weights to them, so the performance and correctness of categorization depend heavily on the weighting algorithm. This paper introduces an enhanced text categorization method based on an improved word weighting technique. Okapi BM25 has proven effective in several information retrieval engines; we applied it to categorization and found that it performs well. It is compared with other word weighting methods: TF-IDF, TF-ICF, and TF-ISF. The experiments use the Reuters-21578 collection with SVM and KNN classifiers. A modified Okapi BM25 shows the best performance.
https://doi.org/10.5762/KAIS.2010.11.12.5089
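As a rough illustration of BM25 used as a term weight, the sketch below computes the standard Okapi BM25 score of each term in each document. It is not the modified variant the abstract reports, whose details are not given here. The function name `bm25_weights`, the defaults `k1=1.2` and `b=0.75`, and the non-negative IDF form are assumptions made for this example.

```python
import math
from collections import Counter

def bm25_weights(docs, k1=1.2, b=0.75):
    """Return one {term: weight} dict per tokenized document.

    Standard Okapi BM25 with a non-negative IDF; k1 and b are
    conventional defaults, assumed here rather than taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    weights = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation: longer-than-average documents are damped.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        weights.append({
            t: math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
               * f * (k1 + 1) / (f + norm)
            for t, f in tf.items()
        })
    return weights

# Toy corpus (hypothetical tokens, not Reuters-21578 data).
docs = [["oil", "price", "oil"], ["grain", "price"], ["grain", "export"]]
weights = bm25_weights(docs)
```

The resulting dictionaries can serve as sparse feature vectors for a classifier such as SVM or KNN, in place of TF-IDF weights.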

Seismic Fragility Analysis of Substation Systems by Using the Fault Tree Method (고장수목을 이용한 변전소의 지진취약도 분석)

Kim, Min-Kyu;Choun, Young-Sun;Choi, In-Kil;Oh, Keum-Ho
- Journal of the Earthquake Engineering Society of Korea / v.13 no.2 / pp.47-58 / 2009
In this study, a seismic fragility analysis was performed for substation systems in Korea. To evaluate the seismic fragility function of the substation systems, a fragility analysis of the individual equipment and facilities was performed first, and the results were then combined into a system-level fragility analysis using a fault tree method. For this research, the status of substation systems in Korea was investigated in order to classify them. The substation systems were classified as 765kV, 345kV, and 154kV systems. Following the classification, target equipment was selected based on records of damage in past earthquakes; transformers and bushings were chosen. The failure modes and criteria for transformers and bushings were decided, and a fragility analysis was performed. Finally, the fragility functions of the substation systems were evaluated using the fault tree method according to damage status.
https://doi.org/10.5000/EESK.2009.13.2.047
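The way component fragilities combine through a fault tree can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: component fragility curves are assumed lognormal in peak ground acceleration (a common convention), component failures are assumed independent, and the two-component OR tree and all medians and dispersions are hypothetical.

```python
import math

def lognormal_fragility(pga, median, beta):
    """P(failure | PGA) for an assumed lognormal fragility curve."""
    z = math.log(pga / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def or_gate(probs):
    """Top event occurs if any input fails (independence assumed)."""
    p_survive = 1.0
    for p in probs:
        p_survive *= 1.0 - p
    return 1.0 - p_survive

def and_gate(probs):
    """Top event occurs only if all inputs fail (independence assumed)."""
    p_fail = 1.0
    for p in probs:
        p_fail *= p
    return p_fail

# Hypothetical tree: the substation fails if the transformer OR the bushing fails.
pga = 0.3  # peak ground acceleration in g (illustrative value)
p_transformer = lognormal_fragility(pga, median=0.6, beta=0.4)
p_bushing = lognormal_fragility(pga, median=0.4, beta=0.5)
p_system = or_gate([p_transformer, p_bushing])
```

Evaluating `p_system` over a range of `pga` values traces out a system-level fragility curve.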

On the Tree Model grown by one-sided purity (단측 순수성에 의한 나무모형의 성장에 대하여)

김용대;최대우
- Journal of Intelligence and Information Systems / v.7 no.1 / pp.17-25 / 2001
The tree model is the most popular classification algorithm in data mining because its results are easy to interpret. In CART (Breiman et al., 1984) and C4.5 (Quinlan, 1993), which are representative tree algorithms, splitting proceeds so as to obtain terminal nodes that are homogeneous in the composition of target variable levels. However, in churn prediction modeling for CRM (Customer Relationship Management), for instance, the churn rate is generally very low even though the churners are the group of interest. It is therefore difficult to obtain accurate prediction models from trees based on traditional split rules such as Gini or deviance. Buja and Lee (1999) introduced a new split rule, one-sided purity, for classifying a minority group of interest. In this paper, we compare one-sided purity with a traditional split rule, deviance, by analyzing churning vs. non-churning data from an ISP company. We also review the results of tree models based on one-sided purity on simulated data and discuss problems and open research topics.
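The two traditional split rules named in the abstract, Gini and deviance, can be sketched as below. The function names and the toy class counts are assumptions for illustration; the one-sided purity criterion of Buja and Lee (1999) is not reproduced here because the abstract does not define it.

```python
import math

def gini(counts):
    """Gini impurity of a node given its class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

def deviance(counts):
    """Node deviance: -2 times the multinomial log-likelihood."""
    n = sum(counts)
    return -2.0 * sum(c * math.log(c / n) for c in counts if c > 0)

def gini_gain(parent, children):
    """Impurity decrease: parent Gini minus size-weighted child Gini."""
    n = sum(parent)
    return gini(parent) - sum(sum(ch) / n * gini(ch) for ch in children)

def deviance_gain(parent, children):
    """Deviance decrease achieved by a split."""
    return deviance(parent) - sum(deviance(ch) for ch in children)

# Hypothetical imbalanced node: [non-churners, churners].
parent = [950, 50]
split = [[930, 20], [20, 30]]
gain_g = gini_gain(parent, split)
gain_d = deviance_gain(parent, split)
```

Both rules average over the two children, which is why a split that isolates a small but interesting group can score poorly when the class of interest is rare.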

Construction Scheme of Training Data using Automated Exploring of Boundary Categories (경계범주 자동탐색에 의한 확장된 학습체계 구성방법)

Choi, Yun-Jeong;Jee, Jeong-Gyu;Park, Seung-Soo
- The KIPS Transactions:PartB / v.16B no.6 / pp.479-488 / 2009
This paper presents a reinforced scheme for constructing training data that improves text classification through automatic search of boundary categories. Documents lying in a boundary area are often misclassified because they contain multiple topics and features, and this is the main factor we focus on. Building on previous research, we propose a methodology that automatically explores the optimal boundary category. We treat the boundary area between target categories as a new category that requires training, and it is then added semantically to the target categories. In the experiments, we applied our method to complex documents by intentionally introducing errors into the training process. The experimental results show that our system achieves high accuracy and reliability in a noisy environment.
https://doi.org/10.3745/KIPSTB.2009.16B.6.479

A Substitute Model Learning Method Using Data Augmentation with a Decay Factor and Adversarial Data Generation Using Substitute Model (감쇠 요소가 적용된 데이터 어그멘테이션을 이용한 대체 모델 학습과 적대적 데이터 생성 방법)

Min, Jungki;Moon, Jong-sub
- Journal of the Korea Institute of Information Security & Cryptology / v.29 no.6 / pp.1383-1392 / 2019
An adversarial attack, which generates adversarial data to make a target model misclassify its input, can confuse real-life applications of classification models and cause severe damage to the classification system. A black-box adversarial attack trains a substitute model whose decision boundary is similar to that of the target model, and then generates adversarial data with the substitute model. Jacobian-based data augmentation is used to synthesize the training data for the substitute, but it has the drawback that the synthesized data become increasingly distorted as the training loop proceeds. We suggest data augmentation with a 'decay factor' to alleviate this problem. The results show that the attack success rate of our method is around 8.5% higher than that of the existing method.
https://doi.org/10.13089/JKIISC.2019.29.6.1383
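One way to picture Jacobian-based augmentation with a decay factor is the sketch below. The abstract does not say where the decay is applied, so this example assumes it shrinks the perturbation step after every round. The names `jacobian_augment`, `oracle`, and `grad` are hypothetical, and retraining the substitute between rounds is omitted for brevity.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def jacobian_augment(data, oracle, grad, lam=0.1, decay=0.5, rounds=3):
    """Grow a training set by stepping along the sign of the Jacobian.

    data   : list of feature vectors (lists of floats)
    oracle : hypothetical callable returning the target model's label for x
    grad   : hypothetical callable returning d(substitute output for that
             label)/dx at x, as a list of floats
    decay  : assumed to multiply the step size after each round
    """
    step = lam
    for _ in range(rounds):
        new_points = []
        for x in data:
            g = grad(x, oracle(x))
            new_points.append([xi + step * sign(gi) for xi, gi in zip(x, g)])
        # A full attack would retrain the substitute on data + new_points here.
        data = data + new_points
        step *= decay  # smaller steps in later rounds limit distortion
    return data

# Toy usage with a constant gradient (illustrative only).
out = jacobian_augment([[0.0, 0.0]], oracle=lambda x: 0,
                       grad=lambda x, y: [1.0, -1.0], rounds=2)
```

With `decay=1.0` the step stays constant across rounds, which corresponds to the undamped augmentation the abstract describes as distorting the data more and more.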

Credit Card Bad Debt Prediction Model based on Support Vector Machine (신용카드 대손회원 예측을 위한 SVM 모형)

Kim, Jin Woo;Jhee, Won Chul
- Journal of Information Technology Services / v.11 no.4 / pp.233-250 / 2012
In this paper, credit card delinquency means the possibility that bad debt will occur in the near future on normal accounts that currently have no debt, and the problem is to predict, on a monthly basis, the occurrence of delinquency 3 months in advance. This is a typical binary classification problem, but it suffers from data imbalance: instances of the target class are very few. To predict bad debt occurrence effectively, a Support Vector Machine (SVM) with the kernel trick is adopted, using credit card usage and payment patterns as its inputs. SVM is widely accepted in the data mining community for its prediction accuracy and resistance to overfitting, but it is known to be limited in its ability to process large-scale data. To resolve the difficulties of applying SVM to bad debt prediction, two-stage clustering is suggested as an effective data reduction method, and ensembles of SVM models are adopted to mitigate the data imbalance intrinsic to the problem. In experiments with real-world data from one of the major domestic credit card companies, the suggested approach shows better prediction accuracy than traditional data mining approaches that use neural networks, decision trees, or logistic regression. The SVM ensemble model learned from the T2 training set shows the best prediction results among the alternatives considered, and it is noteworthy that neural networks with T2 perform better than SVM with T1. These results show that the suggested approach is effective both for SVM training and for classification under data imbalance.
https://doi.org/10.9716/KITS.2012.11.4.233

Adversarial Example Detection Based on Symbolic Representation of Image (이미지의 Symbolic Representation 기반 적대적 예제 탐지 방법)

Park, Sohee;Kim, Seungjoo;Yoon, Hayeon;Choi, Daeseon
- Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.975-986 / 2022
Deep learning is attracting great attention for its excellent performance in image processing, but it is vulnerable to adversarial attacks that cause a model to misclassify by perturbing the input data. Adversarial examples generated by such attacks are perturbed so slightly that the changes are difficult to identify, so the visual features of the images generally do not change. Unlike deep learning models, people are not fooled by adversarial examples, because they classify images based on these visual features. This paper proposes an adversarial attack detection method using Symbolic Representation, which consists of visual and symbolic features of an image such as color and shape. We detect adversarial examples by comparing the Symbolic Representation converted from the classification result for the input image with the Symbolic Representation extracted from the input image itself. When performance was measured on adversarial examples from various attack methods, detection rates differed depending on attack targets and methods, but reached up to 99.02% for a specific targeted attack.
https://doi.org/10.13089/JKIISC.2022.32.5.975

Current Status of the Diagnosis and Management of Pancreatic Neuroendocrine Tumors in Japan

Tetsuhide Ito;Masami Miki;Keijiro Ueda;Lingaku Lee;Ken Kawabe;Hisato Igarashi;Nao Fujimori;Kazuhiko Nakamura;Kohei Yasunaga;Robert T. Jensen;Takao Ohtsuka;Yoshihiro Ogawa
- Journal of Digestive Cancer Research / v.4 no.2 / pp.51-57 / 2016
The epidemiology of pancreatic neuroendocrine neoplasms (PNENs) in Asia has been clarified through epidemiological studies, including one conducted in Japan, and subsequently another in South Korea. As endoscopic ultrasonography (EUS) has become more widely accessible, endoscopic ultrasound-fine needle aspiration (EUS-FNA) has been performed in pancreatic tumors for which the clinical course was only monitored previously. This has enabled accurate diagnosis of pancreatic tumors based on the 2010 WHO classification; as a result, the number of patients with an accurate diagnosis has increased. Although surgery has been the standard therapy for PNENs, new treatment options have become available in Japan for the treatment of advanced or inoperable PNENs; of particular note is the recent introduction of molecular target drugs (such as everolimus and sunitinib) and streptozocin. Treatment for progressive PNENs needs to be selected for each patient with consideration of the performance status, degree of tumor differentiation, tumor mass, and proliferation rate. Somatostatin receptor (SSTR)-2 is expressed in many patients with neuroendocrine tumor. Somatostatin receptor scintigraphy (SRS), which can visualize SSTR-2 expression, has been approved in Japan. The SRS will be a useful diagnostic tool for locating neuroendocrine neoplasms, detecting distant metastasis, and evaluating therapy outcomes. In this manuscript, we review the latest diagnostic methods and treatments for PNENs.

Real-Time Implementation of Active Classification Using Cumulative Processing (누적처리기법을 이용한 능동표적식별 시스템의 실시간 구현)

Park, Gyu-Tae;Bae, Eun-Hyon;Lee, Kyun-Kyung
- The Journal of the Acoustical Society of Korea / v.26 no.2 / pp.87-94 / 2007
In an active sonar system, the aspect angle and length of a target can be estimated by calculating the cross-correlation between the left and right split beams of an LFM (linear frequency modulated) signal. Estimating this information for a remote target, however, requires high resolution in bearing and range. Because this calls for a sampling frequency higher than the Nyquist sampling frequency, over-sampling by interpolation is required. Real-time split-beam processing of the over-sampled outputs on a COTS (commercial off-the-shelf) DSP platform is then limited by the available throughput and memory capacity. This paper proposes a cumulative processing algorithm for split-beam processing to solve these problems. The performance of the proposed method was verified through simulation tests, and the method was implemented as a real-time system using an ADSP-TS101.
https://doi.org/10.7776/ASK.2007.26.2.087

Warning Classification Method Based On Artificial Neural Network Using Topics of Source Code (소스코드 주제를 이용한 인공신경망 기반 경고 분류 방법)

Lee, Jung-Been
- KIPS Transactions on Computer and Communication Systems / v.9 no.11 / pp.273-280 / 2020
Automatic static analysis tools help developers quickly find potential defects in source code with less effort. However, the tools report a large number of false positive warnings that do not need to be fixed. In this study, we propose an artificial neural network-based warning classification method using topic models of source code blocks. We collect bug-fixing revisions from a software change management (SCM) system and extract the code blocks modified by developers. In the deep learning stage, the topic distribution values of the code blocks and binary data indicating whether warnings were removed from the blocks are used as the input and target data, respectively, of a simple artificial neural network. In our experimental results, the neural network-based warning classification model shows very high performance in predicting whether a warning is a true or false positive.
https://doi.org/10.3745/KTCCS.2020.9.11.273
