• Title/Abstract/Keywords: Classification Algorithms

Search results: 1,198 items (processing time: 0.029 seconds)

Incremental SVM for Online Product Review Spam Detection

  • 지쳉장;장진홍;강대기
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2014 Spring Conference / pp.89-93 / 2014
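  • Product reviews are very important to the purchasing decisions of potential customers. Manufacturers also use product reviews to find problems with their own products and to gather business information about competitors. However, some people write fake reviews, which lead potential customers and manufacturers to make wrong choices. Detecting fake reviews is therefore one of the major problems for e-commerce sites. The support vector machine (SVM) is an important text classification algorithm that performs well. In this paper, we present an incremental algorithm for detecting online review spam based on weights, an extension of the Karush-Kuhn-Tucker (KKT) conditions, and the convex hull. Finally, we analyze the performance of the proposed algorithm theoretically.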

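The abstract above names the Karush-Kuhn-Tucker (KKT) conditions as one ingredient of the incremental algorithm but does not give the extended form. As a minimal sketch, assuming a linear SVM with a hypothetical weight vector `w` and bias `b`, the textbook KKT check below flags newly arrived samples whose margin `y * f(x)` is below 1. A common incremental strategy retrains only on such samples. The function names and data layout are illustrative assumptions and are not taken from the paper.

```python
def decision_value(w, b, x):
    """Linear SVM decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def kkt_violators(w, b, samples):
    """Return the (x, y) pairs that violate the textbook KKT condition.

    A sample not yet in the model has multiplier alpha = 0, so the KKT
    conditions require y * f(x) >= 1. Samples with y * f(x) < 1 carry
    information the current model lacks and are candidates for retraining.
    Labels y are assumed to be +1 or -1.
    """
    return [(x, y) for x, y in samples if y * decision_value(w, b, x) < 1.0]
```

Under these assumptions, calling `kkt_violators(w, b, new_batch)` on each incoming batch yields the subset worth adding to the training set, while samples that already satisfy the margin can be discarded.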

Estimating Media Environments of Fashion Contents through Semantic Network Analysis from Social Network Service of Global SPA Brands

  • 전여선
    • 한국의류학회지 / Vol. 43, No. 3 / pp.427-439 / 2019
  • This study investigated the semantic network of the fashion images and SNS text used by global SPA brands over the last seven years, in terms of the quantity and quality of data generated by fast-changing fashion trends and the fashion content-based media environment. The research method examined frequency, density, and repeated keywords, visualized the results using the UCINET 6.347 program, and classified the overall text related to fashion images on the social networks used by global SPA brands. The conclusions of the study are as follows. A common aspect of global SPA brands, judging from the text extracted from SNS, is that exposure through product images is considered important for sales. The brands differ in the following respects. First, ZARA consistently exposes marketing that uses a variety of professions and nationalities on SNS. Second, the correlations for UNIQLO show that it exposes its collaboration promotions on SNS while steadily exposing basic items. Third, H&M differed from the other brands in the connectivity of its cluster categories, which showed remarkably independent results.

Design and Implementation of Web Crawler utilizing Unstructured data

  • Tanvir, Ahmed Md.;Chung, Mokdong
    • 한국멀티미디어학회논문지 / Vol. 22, No. 3 / pp.374-385 / 2019
  • A web crawler is a program commonly used by search engines to find new content on the internet. The use of crawlers has made the web easier for users. In this paper, we structure unstructured data in order to collect data from web pages. Our system is able to choose words near our keyword in more than one document in an unstructured way. Neighboring data for the keyword were collected through word2vec. The system's goal is filtering at the data acquisition level and for a large taxonomy. The main problem in text taxonomy is how to improve classification accuracy. To improve the accuracy, we propose a new TF-IDF weighting method. In this paper, we modified the TF algorithm to calculate the accuracy of unstructured data. Finally, our system proposes a competent web page search crawling algorithm, derived from TF-IDF and the RL Web search algorithm, to enhance the efficiency of searching for relevant information. In this paper, an attempt has been made to research and examine the working nature of crawlers and crawling algorithms in search engines for efficient information retrieval.
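The abstract above proposes a modified TF-IDF weighting but does not state the formula. For orientation only, the sketch below computes the standard baseline that such a method would modify: term frequency `count / document length` multiplied by inverse document frequency `log(N / df)`. The function name `tf_idf` and the token-list input format are assumptions made for this illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute baseline TF-IDF weights for a list of tokenized documents.

    docs: list of documents, each a list of string tokens.
    Returns one dict per document mapping term -> weight.
    """
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

With this baseline, a term that occurs in every document receives weight 0, which is one reason alternative weighting schemes are proposed for keyword-focused crawling.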

Evolutionary Neural Network based on Quantum Elephant Herding Algorithm for Modulation Recognition in Impulse Noise

  • Gao, Hongyuan;Wang, Shihao;Su, Yumeng;Sun, Helin;Zhang, Zhiwei
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 7 / pp.2356-2376 / 2021
  • In this paper, we propose a novel modulation recognition method based on a neural network evolved by the quantum elephant herding algorithm (QEHA) in an impulse noise environment. We use an adaptive weight myriad filter to preprocess the received digital modulation signals that pass through the impulsive noise channel; the instantaneous characteristics and high-order cumulant features of the digital modulation signals are then extracted as the classification feature set; finally, a BP neural network (BPNN) model serves as the classifier for automatic digital modulation recognition. In addition, based on the elephant herding optimization (EHO) algorithm and the quantum computing mechanism, we design the QEHA to optimize the initial thresholds and weights of the BPNN, which addresses the tendency of the traditional BPNN to fall into local minima and its poor robustness. The experimental results show that the adaptive weight myriad filter removes the impulsive noise effectively, and that the proposed QEHA-BPNN classifier has better recognition performance than other conventional pattern recognition classifiers. Compared with other global optimization algorithms, the QEHA designed in this paper has a faster convergence speed and higher convergence accuracy. Furthermore, the effect of symbol shape has been considered, which can satisfy the needs of engineering applications.
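The abstract above lists high-order cumulants among the classification features without saying which ones are used. As a hedged illustration, the sketch below estimates two fourth-order cumulants that are widely used for modulation recognition, `C40` and `C42`, using their standard definitions for a zero-mean complex baseband signal. The function name and the choice of these two cumulants are assumptions, not details from the paper.

```python
def fourth_order_cumulants(signal):
    """Estimate C40 and C42 of a zero-mean (complex or real) sample sequence.

    Standard definitions, with sample averages in place of expectations:
        C20 = E[x^2],  C21 = E[|x|^2]
        C40 = E[x^4] - 3 * C20^2
        C42 = E[|x|^4] - |C20|^2 - 2 * C21^2
    """
    n = len(signal)
    if n == 0:
        raise ValueError("signal must contain at least one sample")

    c20 = sum(x * x for x in signal) / n
    c21 = sum(abs(x) ** 2 for x in signal) / n
    m40 = sum(x ** 4 for x in signal) / n
    m42 = sum(abs(x) ** 4 for x in signal) / n

    c40 = m40 - 3 * c20 ** 2
    c42 = m42 - abs(c20) ** 2 - 2 * c21 ** 2
    return c40, c42
```

For noise-free unit-power symbols, BPSK (`+1`, `-1`) and QPSK (`1`, `j`, `-1`, `-j`) give different `C40` and `C42` values, which is why such features can separate modulation types before a classifier is applied.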

Improved Feature Selection Techniques for Image Retrieval based on Metaheuristic Optimization

  • Johari, Punit Kumar;Gupta, Rajendra Kumar
    • International Journal of Computer Science & Network Security / Vol. 21, No. 1 / pp.40-48 / 2021
  • Retrieving relevant images from a huge database according to user perception is a challenging task in which a Content-Based Image Retrieval (CBIR) system plays a vital role. Images are represented by a combination of low-level features describing their visual content, which together form a feature vector. To reduce the search time in a large database while retrieving images, a novel image retrieval technique based on feature dimensionality reduction is proposed, exploiting metaheuristic optimization techniques based on the Genetic Algorithm (GA), Extended Binary Cuckoo Search (EBCS), and Whale Optimization Algorithm (WOA). Each image in the database is indexed using a feature vector comprising a fuzzified color histogram descriptor for color and a median binary pattern derived in the HSI color space for texture. The results are compared in terms of precision, recall, F-measure, accuracy, and error rate with benchmark classification algorithms (linear discriminant analysis, CatBoost, Extra Trees, Random Forest, Naive Bayes, light gradient boosting, extreme gradient boosting, k-NN, and Ridge) to validate the efficiency of the proposed approach. Finally, a TOPSIS ranking of the techniques is used to choose the best feature selection technique based on different model parameters.
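The evaluation measures named in the abstract above have standard definitions in terms of confusion-matrix counts. The sketch below shows those definitions for the binary case. The function name, the dictionary keys, and the convention of returning 0.0 when a denominator is zero are assumptions made for this illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion counts.

    tp, fp, fn, tn: true positives, false positives, false negatives,
    true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    f_measure = ratio(2 * precision * recall, precision + recall)
    accuracy = ratio(tp + tn, tp + fp + fn + tn)

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
    }
```

For multi-class retrieval results, these quantities are typically computed per class and then averaged; the abstract does not say which averaging the authors used.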

A Study of Machine Learning Model for Prediction of Swelling Waves Occurrence on East Sea

  • 강동훈;오세종
    • 한국정보기술학회논문지 / Vol. 17, No. 9 / pp.11-17 / 2019
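  • Recently, losses caused by swell waves have occurred frequently on the east coast. Swell waves are difficult to predict because they result from a combination of various factors. In this study, we propose a machine-learning-based model that predicts the occurrence of swell waves on the east coast. To develop the model, we collected cargo-handling suspension data from Pohang New Port and meteorological data from the vicinity of the port, such as air pressure, wind speed, wind direction, and water temperature. From the collected data we selected the variables that significantly influence swell occurrence, and we tested various machine learning prediction algorithms. As a result, tide level, water temperature, and air pressure were identified as the main variables for predicting swell occurrence, and the Random Forest model showed the best performance, with a prediction accuracy of 88.6%.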

Development of AI-Based Condition Monitoring System for Failure Diagnosis of Excavator's Travel Device

  • 백희승;신종호;김성준
    • 드라이브 ㆍ 컨트롤 / Vol. 18, No. 1 / pp.24-30 / 2021
  • There is increasing interest in condition-based maintenance for preventing economic loss due to failure, and extensive research is being carried out on related technologies in the field of construction machinery. In particular, data-based failure diagnosis methods that employ AI (machine learning and deep learning) algorithms are in the spotlight. In this study, we focus on failure diagnosis and mode classification for the reduction gear of an excavator's travel device using an AI algorithm. In addition, we have developed a remote monitoring system that can monitor the status of the reduction gear using the developed diagnosis algorithm. The failure diagnosis algorithm was developed through data acquisition for normal and abnormal states under various operating conditions, data processing and analysis by the wavelet transformation, and learning. The developed algorithm was verified under three evaluation conditions. Finally, we have built a system that can check the status of the reduction gear of travel devices on the web using the Edge platform, in which the failure diagnosis algorithm is embedded, and the cloud.
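The abstract above mentions data processing by the wavelet transformation but does not name the wavelet or the decomposition depth. As a minimal sketch, assuming the Haar wavelet and a single decomposition level, the function below splits a sampled signal into approximation (low-pass) and detail (high-pass) coefficients. The function name and the even-length requirement are choices made for this illustration only.

```python
import math


def haar_step(signal):
    """Perform one level of the orthonormal Haar wavelet transform.

    Each adjacent pair (a, b) produces
        approximation = (a + b) / sqrt(2)
        detail        = (a - b) / sqrt(2)
    The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")

    scale = math.sqrt(2.0)
    approx = []
    detail = []
    for i in range(0, len(signal), 2):
        a, b = signal[i], signal[i + 1]
        approx.append((a + b) / scale)
        detail.append((a - b) / scale)
    return approx, detail
```

In a vibration-diagnosis setting, the detail coefficients tend to capture abrupt changes, so statistics computed from them (for example, energy per level) are a plausible input to a classifier. Applying `haar_step` repeatedly to the approximation output gives a multi-level decomposition.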

Development of vision system for quality inspection of automotive parts and comparison of machine learning models

  • 박영민;정동일
    • 문화기술의 융합 / Vol. 8, No. 1 / pp.409-415 / 2022
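  • Computer vision acquires an image of the measurement target using a camera and detects the desired feature values, vectors, regions, and so on by applying algorithms and library functions. The detected data are computed and analyzed in various forms depending on the purpose. Computer vision is used in many areas, especially for automatically recognizing automotive parts or measuring their quality. In industry, computer vision is referred to as machine vision, and it is combined with artificial intelligence to judge product quality or predict results. In this study, we built a vision system for judging the quality of automotive parts and applied five machine learning classification models to the generated data to compare the results.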

Classification Model and Crime Occurrence City Forecasting Based on Random Forest Algorithm

  • KANG, Sea-Am;CHOI, Jeong-Hyun;KANG, Min-soo
    • 한국인공지능학회지 / Vol. 10, No. 1 / pp.21-25 / 2022
  • Korea has relatively less crime than other countries. However, the crime rate is steadily increasing. Many people think the crime rate is decreasing, but the crime arrest rate has increased. The goal is to check the relationship between CCTV and the crime rate as a way to lower the crime rate, and to identify the correlation between areas with CCTV and areas without CCTV. Because crime can happen at any time, we consider a random forest algorithm appropriate. We also plan to use machine learning random forest algorithms to reduce the risk of overfitting, reduce the required training time, and verify high-level accuracy. The goal is to identify the relationship between CCTV and crime occurrence by creating a crime prevention algorithm using machine learning random forest techniques. Assuming that no crime occurs without CCTV, the study compares the crime rate between the areas where the most crimes occur and the areas where there are no crimes, and predicts areas where there are many crimes. The impact of CCTV on crime prevention and arrest can be interpreted in part as a comprehensive effect, and the purpose is to identify the areas and frequency of frequent crimes by comparing the time and time without CCTV.

Analysis of Patents Related to Oil Diagnosis of Construction Equipment

  • 홍성호;장범석
    • Tribology and Lubricants / Vol. 38, No. 4 / pp.143-151 / 2022
  • This study analyzes patents related to oil diagnosis of construction equipment. Oil diagnosis is extremely important for maintaining construction equipment properly. Through the evaluation of existing patents, a patent strategy for the future construction equipment market is presented. The related patents are classified and selected in several steps. Finally, 16 valid patents are selected and analyzed in detail. In the classification process, patents are classified by country, year, and company. A market analysis shows that the top 10 companies have a market share of more than 50%. In addition to patents related to the oil analysis of construction equipment, patents related to automobile oil analysis and development of oil sensors are investigated to identify the contents of patents in other fields that can be applied to oil diagnosis technology for construction equipment. Moreover, not only the contents of research articles of two Korean construction companies, but also the research trends in the literature in this field are used in the analysis. The related patents of the two Korean companies are few. Companies with a high market share, including Caterpillar, hold many patents, and patents for diagnosis algorithms using such technologies as artificial intelligence and artificial neural networks, along with oil sensor-based condition monitoring technology, are gradually expanding.