• Title/Summary/Keyword: Distributed Data Mining


Big Data Analysis of News on Purchasing Second-hand Clothing and Second-hand Luxury Goods: Identification of Social Perception and Current Situation Using Text Mining (중고의류와 중고명품 구매 관련 언론 보도 빅데이터 분석: 텍스트마이닝을 활용한 사회적 인식과 현황 파악)

  • Hwa-Sook Yoo
    • Human Ecology Research / v.61 no.4 / pp.687-707 / 2023
  • This study sought useful information for the development of the future second-hand fashion market by examining the current situation through unstructured text data distributed as news articles related to 'purchase of second-hand clothing' and 'purchase of second-hand luxury goods'. Text-based unstructured data was collected daily from Naver news from January 1 to December 31, 2022, using 'purchase of second-hand clothing' and 'purchase of second-hand luxury goods' as collection keywords. The data was analyzed using text mining, with the following results. First, in terms of frequency, the volume of data collected on the purchase of second-hand luxury goods was almost four times that on the purchase of second-hand clothing, indicating that the purchase of second-hand luxury goods receives more social attention. Second, the data obtained with the two collection keywords shared some common words, but each also contained distinct words. For second-hand clothing, words related to donations, sharing, and compensation sales were mainly mentioned, indicating that the purchase of second-hand clothing tends to be recognized as an eco-friendly transaction. For second-hand luxury goods, resale, controversy over genuineness in second-hand luxury transactions, second-hand trading platforms, and luxury brands were frequently mentioned. Third, clustering divided the data on the purchase of second-hand clothing into five groups and the data on the purchase of second-hand luxury goods into six groups.

Sectoral Banking Credit Facilities and Non-Oil Economic Growth in Saudi Arabia: Application of the Autoregressive Distributed Lag (ARDL)

  • ALZYADAT, Jumah Ahmad
    • The Journal of Asian Finance, Economics and Business / v.8 no.2 / pp.809-820 / 2021
  • The study investigates the impact of sectoral bank credit facilities provided by commercial banks on non-oil economic growth in Saudi Arabia. Bank credit facilities are given for nine economic sectors: agriculture, manufacturing, mining, electricity and water, health services, construction, wholesale and retail trade, transportation and communications, services, and finance sector. The study uses annual data from 1970 to 2019 and employs the Autoregressive Distributed Lag (ARDL) approach to identify the long-run and short-run dynamic relationships among the variables. The main results reveal that total bank credit has a significant and positive effect on non-oil economic growth in the KSA. The effect of bank credit on non-oil GDP growth was uneven across sectors in both the short and the long run. All sectors have a positive and significant impact in the long run, except for the agricultural and mining sectors. Likewise, all sectors have a positive and significant impact in the short run, except for construction, finance, services, and transportation and communications. As a result, bank credit facilities in different sectors have played an important role in enhancing non-oil economic growth in the KSA.

The Distributed Management System of Moving Objects for LBS

  • Jang, In-Sung;Cho, Dae-Soo;Park, Jong-Hyun
    • Proceedings of the KSRS Conference / 2002.10a / pp.163-167 / 2002
  • Recently, owing to advances in telecommunication technology, the growing number of wireless Internet subscribers, and the spread of wireless devices, interest in LBS (Location Based Service), which uses a user's location to deliver location-relevant information, has been increasing rapidly. A MOMS (Moving Object Management System) that manages users' location information is therefore essential for providing location-based services. Early LBS, such as friend-finder services, need only the current location, but providing high-quality services in connection with data mining and CRM also requires managing past location information. In this paper, we design a distributed management system for inserting and searching large numbers of moving objects. It consists of CLIM (Current Location Information Manager), PLIM (Past Location Information Manager), and DLIM (Distributed Location Information Manager). CLIM and PLIM improve search performance by using a spatiotemporal index. DLIM distributes the enormous amount of location data across multiple databases. It thereby balances load, limits overload, and manages a huge volume of location information efficiently.
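
The distribution step described in the abstract above can be illustrated with a minimal sketch that routes location records to one of several databases by hashing the object ID. The abstract does not say how DLIM assigns records, so the hashing scheme, the names `shard_for` and `distribute`, the constant `NUM_SHARDS`, and the record layout are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical number of backing databases


def shard_for(object_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a moving-object ID to a shard index using a stable hash."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(records, num_shards: int = NUM_SHARDS):
    """Group (object_id, timestamp, x, y) records by shard index."""
    shards = defaultdict(list)
    for record in records:
        shards[shard_for(record[0], num_shards)].append(record)
    return shards
```

Hashing on the object ID keeps every record of one object on the same database, which suits per-object history queries; a real system might instead partition by region or time.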

Support vector machines for big data analysis (빅 데이터 분석을 위한 지지벡터기계)

  • Choi, Hosik;Park, Hye Won;Park, Changyi
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.989-998 / 2013
  • Big data, which has recently attracted attention in industry and academia, cannot be analyzed by the batch processing algorithms developed in data mining because big data, by definition, cannot be loaded and processed in the memory of a single system. A pressing issue is therefore to develop learning algorithms that can be applied to big data. In this paper, we review various algorithms for support vector machines in the literature. In particular, we introduce online and parallel processing algorithms that are expected to be useful in big data classification and compare the strengths, weaknesses, and performance of those algorithms through simulations for linear classification.
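
To make the idea of an online algorithm concrete, here is a minimal sketch of a linear SVM trained one example at a time with stochastic subgradient steps on the regularized hinge loss (a Pegasos-style update). The abstract does not name the algorithms the paper reviews, so this is an illustrative assumption rather than the authors' method. The function names, the regularization constant `lam`, and the omission of a bias term are choices made for brevity.

```python
def train_online_svm(stream, dim, lam=0.01):
    """Train a linear SVM from a stream of (x, y) pairs, y in {-1, +1}.

    Each example is seen once and then discarded, so memory use does
    not grow with the size of the data set.
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Shrink the weights (gradient of the L2 regularizer).
        w = [(1.0 - eta * lam) * wi for wi in w]
        # Add the hinge-loss subgradient only if the margin is violated.
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```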

Analyzing RDF Data in Linked Open Data Cloud using Formal Concept Analysis

  • Hwang, Suk-Hyung;Cho, Dong-Heon
    • Journal of the Korea Society of Computer and Information / v.22 no.6 / pp.57-68 / 2017
  • The Linked Open Data (LOD) cloud is quickly becoming one of the largest collections of interlinked datasets and the de facto standard for publishing, sharing, and connecting pieces of data on the Web. Data publishers from diverse domains publish their data using the Resource Description Framework (RDF) data model and provide SPARQL endpoints for querying their data, which creates a global, distributed, and interconnected dataspace on the LOD cloud. Although structured data can be extracted as query results using SPARQL, users have very limited support for analyzing and visualizing RDF data from SPARQL query results. To tackle this issue, we propose a novel approach, based on Formal Concept Analysis, for analyzing and visualizing useful information from the LOD cloud. The RDF data analysis and visualization technique proposed in this paper can be used in semantic web data mining by extracting and analyzing the information and knowledge inherent in LOD and supporting classification and visualization.
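
As a small illustration of Formal Concept Analysis, the sketch below enumerates the formal concepts of a tiny context in which objects are RDF subjects and attributes are, for example, their classes. It is a brute-force enumeration suitable only for toy inputs and is not the procedure proposed in the paper; the function name `formal_concepts` and the sample context are assumptions.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all (extent, intent) pairs of a formal context.

    context maps each object to the set of attributes it has.
    Brute force over attribute subsets: exponential, for toy inputs only.
    """
    objects = set(context)
    attributes = set().union(*context.values()) if context else set()

    def extent(attrs):
        # Objects that have every attribute in attrs.
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # Attributes shared by every object in objs.
        return frozenset(a for a in attributes
                         if all(a in context[o] for o in objs))

    concepts = set()
    for r in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), r):
            e = extent(set(attrs))
            concepts.add((e, intent(e)))
    return concepts


# Hypothetical context built from SPARQL results: subject -> classes.
sample = {"s1": {"Person"}, "s2": {"Person", "Author"}}
```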

Design of a Sentiment Analysis System to Prevent School Violence and Student's Suicide (학교폭력과 자살사고를 예방하기 위한 감성분석 시스템의 설계)

  • Kim, YoungTaek
    • The Journal of Korean Association of Computer Education / v.17 no.6 / pp.115-122 / 2014
  • One of the problems facing today's youth is the increasing rate of violence and suicide in school life. This study aims to design a sentiment analysis system that helps prevent suicide by using big data processing. The main design concerns are economical implementation and easy, fast processing for users, so the open source Hadoop system with the MapReduce algorithm is used on HDFS (Hadoop Distributed File System) for the experiments. In terms of text mining, this study performs sentiment analysis with a word count method on informal data from SNS communications containing various kinds of violent words, avoiding expensive and complex statistical analysis methods.
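
The word count approach mentioned in the abstract above can be sketched as a map step that emits a count for each flagged word and a reduce step that sums the counts. The sketch runs in a single Python process rather than on Hadoop, and the lexicon `VIOLENT_WORDS` and the function names are hypothetical; the abstract does not list the words or the job structure actually used.

```python
from collections import Counter
from itertools import chain

VIOLENT_WORDS = {"hit", "hurt", "kill"}  # hypothetical lexicon


def map_phase(message):
    """Emit (word, 1) for each lexicon word found in one message."""
    for token in message.lower().split():
        word = token.strip(".,!?\"'")
        if word in VIOLENT_WORDS:
            yield word, 1


def reduce_phase(pairs):
    """Sum the emitted counts per word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def count_flagged_words(messages):
    """Run the map step over all messages, then reduce the results."""
    return reduce_phase(chain.from_iterable(map_phase(m) for m in messages))
```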

Distributed and Scalable Intrusion Detection System Based on Agents and Intelligent Techniques

  • El-Semary, Aly M.;Mostafa, Mostafa Gadal-Haqq M.
    • Journal of Information Processing Systems / v.6 no.4 / pp.481-500 / 2010
  • The Internet explosion and the increase in crucial web applications such as e-banking and e-commerce make network security tools essential. One such tool is an intrusion detection system, which can be classified by detection approach as signature-based or anomaly-based. Even though intrusion detection systems are well defined, their cooperation with each other to detect attacks needs to be addressed. Consequently, a new architecture that allows them to cooperate in detecting attacks is proposed. The architecture uses software agents to provide scalability and distributability. It works in two modes: learning and detection. During learning mode, it generates a profile for each individual system using a fuzzy data mining algorithm. During detection mode, each system uses FuzzyJess to match network traffic against its profile. The architecture was tested against a standard data set produced by MIT's Lincoln Laboratory, and the primary results show its efficiency and capability to detect attacks. Finally, two new methods, the memory-window and memoryless-window, were developed for extracting useful parameters from raw packets. The parameters are used as detection metrics.

An Extended Dynamic Web Page Recommendation Algorithm Based on Mining Frequent Traversal Patterns (빈발 순회패턴 탐사에 기반한 확장된 동적 웹페이지 추천 알고리즘)

  • Lee KeunSoo;Lee Chang Hoon;Yoon Sun-Hee;Lee Sang Moon;Seo Jeong Min
    • Journal of Korea Multimedia Society / v.8 no.9 / pp.1163-1176 / 2005
  • The Web is the largest distributed information space, but an individual's capacity to read and digest content is essentially fixed. In this environment, mining traversal patterns is an important problem in Web mining, with a host of application domains including system design and information services. Conventional traversal pattern mining systems use the inter-page associations in sessions with only a very restricted mechanism (based on a vector or matrix) for generating frequent K-Pagesets. We extend a family of novel algorithms (termed WebPR, for Web Page Recommend) for mining frequent traversal patterns and then the pagesets to recommend. We add a WebPR(A) algorithm to the family of WebPR algorithms and propose a new winWebPR(T) algorithm that introduces a window concept on WebPR(T). Our experiments with the two extended algorithms on two real data sets, the LadyAsiana and KBS media server sites, clearly validate that our method outperforms conventional methods.
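
For readers unfamiliar with traversal pattern mining, the sketch below counts contiguous runs of k pages within each session using a sliding window and keeps those that reach a minimum support. It is a generic illustration and not the WebPR or winWebPR(T) algorithm, whose details the abstract does not give; the function name, the parameters `k` and `min_support`, and the per-session counting rule are assumptions.

```python
from collections import Counter


def frequent_traversals(sessions, k, min_support):
    """Return contiguous k-page patterns seen in at least min_support sessions.

    sessions is a list of page-ID lists in visit order.
    """
    counts = Counter()
    for session in sessions:
        seen = set()
        # Slide a window of width k over the session.
        for i in range(len(session) - k + 1):
            seen.add(tuple(session[i:i + k]))
        counts.update(seen)  # count each pattern once per session
    return {p: c for p, c in counts.items() if c >= min_support}
```

A recommender could then suggest, for a user whose last pages match the start of a frequent pattern, the page that completes it.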

DNP3 Protocol Security and Attack Detection Method (DNP3 프로토콜 보안 현황 및 공격 탐지 방안)

  • Kwon, Sung-Moon;Yoo, Hyung-Uk;Lee, Sang-Ha;Shon, Tae-Shik
    • Journal of Advanced Navigation Technology / v.18 no.4 / pp.353-358 / 2014
  • In the past, the security of control systems was guaranteed by isolating control system networks from external networks. However, as control system devices became more varied and interaction between devices became necessary, effective management systems for such networks emerged, which led to connections between control system networks and external networks. This made control systems easier to manage but also exposed them to various cyber attack threats. Therefore, research on adding security measures to each protocol is in progress. This paper focuses on the DNP3 (Distributed Network Protocol 3) protocol, which is used for communication between control centers and substations. It describes the characteristics of DNP3 and research on adding security elements to the protocol. It also analyzes known vulnerabilities of DNP3 and proposes a data mining methodology for detecting such vulnerabilities.

HBase based Business Process Event Log Schema Design of Hadoop Framework

  • Ham, Seonghun;Ahn, Hyun;Kim, Kwanghoon Pio
    • Journal of Internet Computing and Services / v.20 no.5 / pp.49-55 / 2019
  • Organizations design and operate business process models to achieve their goals efficiently and systematically. With the advancement of IT, the number of items in which computer systems can participate has grown, and processes have become large and complicated. This has created more complex and subdivided business process flows. The process instances, which contain workcases and events, are larger and hold more data. This data is an essential resource for process mining and is used directly in model discovery, analysis, and improvement of processes. The event log is growing larger and broader, which leads to problems such as capacity management and I/O load when it is managed by existing row-level programs or through a relational database. In this paper, we identify the management limits of the existing original file or relational database approach as the event log becomes big data. We design and apply schemes to archive and analyze large event logs through Hadoop, an open source distributed file system, and HBase, a NoSQL database system.
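
One design decision in any HBase event-log schema is the row key, because HBase stores rows sorted lexicographically by key. The sketch below composes a key so that the events of one process instance sit together in time order. The abstract does not describe the schema the authors designed, so the key layout, the field names, and the function `make_row_key` are assumptions for illustration, and no HBase client API is used.

```python
def make_row_key(process_id: str, case_id: str,
                 timestamp_ms: int, event_id: str) -> bytes:
    """Compose a row key that sorts the events of one case chronologically.

    The timestamp is zero-padded to 13 digits so that lexicographic
    order of the key matches numeric order of the timestamp.
    """
    key = f"{process_id}|{case_id}|{timestamp_ms:013d}|{event_id}"
    return key.encode("utf-8")


# Hypothetical events of one workcase, deliberately out of order.
keys = [
    make_row_key("order", "case42", 1700000002000, "e3"),
    make_row_key("order", "case42", 1700000000000, "e1"),
    make_row_key("order", "case42", 999, "e0"),
]
ordered = sorted(keys)  # the order in which HBase would store these rows
```

Leading with the process and case identifiers lets a prefix scan fetch a whole case; leading with the timestamp instead would concentrate writes on one region.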