• Title/Summary/Keyword: unstructured analysis (비정형분석)


Telemetering Service in OpenStack (오픈스택 텔레메터링 서비스(Ceilometer))

  • Baek, D.M.;Lee, B.C.
    • Electronics and Telecommunications Trends / v.29 no.6 / pp.102-112 / 2014
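  • A telemetering service that monitors and meters the individual components of an OpenStack cloud for billing, benchmarking, scalability, and statistical purposes was recently added as an official project under the code name Ceilometer. It initially monitored only the elements essential for billing, but it is evolving into a multipurpose service that watches resource state to support orchestration functions such as autoscaling of cloud resources. In particular, it provides important hints for data analysis such as big data analytics. Based on an analysis of the source code, this paper introduces Ceilometer's data collection architecture, the core keywords of Ceilometer monitoring, MongoDB as its unstructured-data database, and the API (Application Interface) and CLI (Command Line Interface) commands that serve as its external interfaces. The conclusion gives the authors' own assessment of Ceilometer's overall architecture.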


A Study on the Polarity of Apartment Price News Using Big Data Analysis Method (빅데이터 분석기법을 활용한 아파트 가격 관련 뉴스 기사의 극성 분석)

  • Cho, Sang-Yeon;Hong, Eun-Pyo
    • Journal of Digital Convergence / v.17 no.9 / pp.47-54 / 2019
  • This study examines the polarity of news articles on apartment prices using opinion mining, a technique widely used in big data analysis. The analysis used internet news articles posted on Naver for two years, 2012 and 2018. We propose a sentiment analysis model and a method for constructing a topic-oriented sentiment dictionary. Applying the model showed that media companies differed, according to their leanings, in which social issues they selected when apartment prices were rising. We also found more positive articles from media companies whose sentiment aligned with that of the government in power. The proposed sentiment analysis model can be used in the real estate field, and we used it to analyze the polarity of unstructured data related to real estate. Extending the approach to other fields will require building sentiment dictionaries by theme and collecting varied unstructured data over longer periods.
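As a rough illustration of the dictionary-based polarity scoring that the abstract above describes, here is a minimal Python sketch. The dictionary entries, their scores, and the `polarity` function are hypothetical; the abstract does not give the paper's actual dictionary or model, and real Korean news text would also need morphological analysis, which is omitted here.

```python
# Hypothetical topic-oriented sentiment dictionary (word -> polarity score).
# The entries are illustrative assumptions, not taken from the paper.
SENTIMENT_DICT = {
    "surge": 1, "recovery": 1, "boom": 1,
    "slump": -1, "plunge": -1, "burden": -1,
}

def polarity(article):
    """Sum dictionary scores over the tokens and map the total to a label."""
    tokens = (t.strip(".,!?") for t in article.lower().split())
    score = sum(SENTIMENT_DICT.get(t, 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("Apartment prices surge amid recovery."))
print(polarity("Sales plunge, tax burden grows"))
```

Swapping in a different dictionary per theme is the only change needed to reuse the same scoring loop in another field, which is the kind of reuse the abstract points to.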

Similar Patent Search Service System using Latent Dirichlet Allocation (잠재 의미 분석을 적용한 유사 특허 검색 서비스 시스템)

  • Lim, HyunKeun;Kim, Jaeyoon;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.8 / pp.1049-1054 / 2018
  • Keyword search has long been used to find similar patents, and automated classification by machine learning has come into use more recently. Keyword search analyzes data that has been structured through data refinement. Its accuracy is high for short text, but for long text made up of many words, such as a document, it cannot analyze the meaning contained in the sentences. At the semantic analysis level, automatic classification is used to classify sentences composed of several words through unstructured data analysis. There have been attempts to find similar documents by combining the two methods. However, because the methods analyze data in different ways, algorithms that use unstructured and structured data simultaneously run into problems. In this paper, we study a method that extracts the keywords implied in a document and uses LDA (Latent Dirichlet Allocation) to classify documents efficiently without human intervention and to find similar patents.

Security tendency analysis techniques through machine learning algorithms applications in big data environments (빅데이터 환경에서 기계학습 알고리즘 응용을 통한 보안 성향 분석 기법)

  • Choi, Do-Hyeon;Park, Jung-Oh
    • Journal of Digital Convergence / v.13 no.9 / pp.269-276 / 2015
  • Recently, as industries related to big data have grown, global security companies have expanded their scope from structured to unstructured data for intelligent security threat monitoring and prevention, and they increasingly use user tendency analysis for security prevention. This is because the information that can be derived from analyzing existing structured data (quantified, currently available data) is limited. This study applies machine learning algorithms (Naïve Bayes, Decision Tree, K-nearest neighbor, Apriori) in a big data environment to analyze security tendency (classification of items by purpose, positive/negative judgment, and analysis of key keyword relevance). The capability analysis confirmed that security items and specific indexes for determining security tendency could be extracted from structured and unstructured data.

Analysis of News Regarding New Southeastern Airport Using Text Mining Techniques (텍스트 마이닝 기법을 활용한 동남권 신공항 신문기사 분석)

  • Han, Mu Moung Cho;Kim, Yang Sok;Lee, Choong Kwon
    • Smart Media Journal / v.6 no.1 / pp.47-53 / 2017
  • Social issues are important factors in deciding government policy, and newspapers are critical channels that reflect them. Analyzing news articles can contribute to understanding social issues, but it is very difficult to analyze large volumes of unstructured news data manually. Therefore, this study aims to analyze the different views among stakeholders of a specific social issue by using text analysis, word cloud analysis, and associative analysis methods, which systematically transform unstructured news data into structured data. We analyzed a total of 115 news articles and 6,772 comments, collected from the selected newspapers (Chosun-Il-bo, Joongang-Il-bo, Donga-Il-bo, Maeil Newspaper, Busan-Il-bo) over two weeks. We found significant differences in tone between newspapers. While nationwide daily newspapers focus on political relations with local areas, local daily newspapers tend to write articles that represent local governments' interests.

Convergence Analysis on Policy Decision Making Factor of Local Construction Planning Phase by Using Unstructured Data in point of the Technology and Culture (비정형 데이터 분석을 통한 기술과 문화의 융합적 관점의 지역 건설기획단계 정책의사결정 영향요인 분석)

  • Park, Eun Soo;Kim, Ji Eun
    • Korea Science and Art Forum / v.23 / pp.149-162 / 2016
  • This paper presents the background, method, scope, and main content of the research. As interest in construction has recently grown across complex and diverse areas, construction has become locally tied to human life, much like the coexistence of technology and culture. Local development should not consist of fragmentary construction if it is to improve local recycling ability. Local society should be carried forward from a modern cultural perspective through diverse local cultures and their coexistence. Effective decision-making analysis is needed to build a livable area in combination with high-tech industry. For this reason, this paper uses unstructured data analysis to study policy decision making at the construction planning stage from the perspective of the fusion of technology and culture. The conclusions are as follows. The local construction planning stage involves diverse tangible and intangible factors as policy factors. Technology factors include various qualitative and quantitative factors in the construction field. Understanding decision making at the construction planning stage means considering not only visible 'technology factors' such as structure, method, and shape, but also invisible 'culture factors' such as the spirit of the age, religion, learning, and lifestyle reflected in the formation of space, as well as intellectual insight into art.

On the Design of a Big Data based Real-Time Network Traffic Analysis Platform (빅데이터 기반의 실시간 네트워크 트래픽 분석 플랫폼 설계)

  • Lee, Donghwan;Park, Jeong Chan;Yu, Changon;Yun, Hosang
    • Journal of the Korea Institute of Information Security & Cryptology / v.23 no.4 / pp.721-728 / 2013
  • Big data is one of the most prominent technological trends today, enabling new methods to handle huge volumes of complicated data for a broad range of applications. Real-time network traffic analysis essentially deals with big data, which comprises different types of log data from various sensors. To tackle this problem, we devise a big data based platform, RENTAP, to detect and analyze malicious network traffic. Focusing on military network environments such as closed networks for C4I systems, we evaluate leading big data based solutions to determine which combination is the best design for a network traffic analysis platform. Based on the selected solutions, we provide a detailed functional design of the proposed platform.

A Machine Learning Based Facility Error Pattern Extraction Framework for Smart Manufacturing (스마트제조를 위한 머신러닝 기반의 설비 오류 발생 패턴 도출 프레임워크)

  • Yun, Joonseo;An, Hyeontae;Choi, Yerim
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.97-110 / 2018
  • With the advent of the fourth industrial revolution, manufacturing companies are increasingly interested in realizing smart manufacturing by utilizing their accumulated facility data. However, most previous research dealt with structured data such as sensor signals, and little has focused on unstructured data such as text, which actually comprises a large portion of the accumulated data. Therefore, we propose an association rule mining based framework for extracting facility error patterns, in which text data written by operators are analyzed. Specifically, phrases were extracted and used as the unit of text analysis, since a word, the usual unit of text analysis, cannot convey the technical meaning of facility errors. The performance of the proposed framework was evaluated on a real-world case, and adopting it is expected to enhance the productivity of manufacturing companies.
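The idea of mining association rules over phrases instead of single words can be sketched as follows. This is only a minimal illustration limited to phrase pairs: the sample records, the thresholds, and the `mine_rules` helper are hypothetical, and the abstract does not specify the mining algorithm or parameters the authors used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical operator notes, each already reduced to a set of extracted phrases.
records = [
    {"motor overheating", "belt slip", "line stop"},
    {"motor overheating", "line stop"},
    {"belt slip", "sensor fault"},
    {"motor overheating", "belt slip", "line stop"},
]

def mine_rules(records, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for phrase pairs."""
    n = len(records)
    item_counts = Counter(p for r in records for p in r)
    pair_counts = Counter(
        frozenset(pair) for r in records for pair in combinations(sorted(r), 2)
    )
    rules = []
    for pair, count in pair_counts.items():
        support = count / n  # share of records containing both phrases
        if support < min_support:
            continue
        for antecedent in pair:
            (consequent,) = pair - {antecedent}
            # confidence = P(consequent | antecedent)
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)

rules = mine_rules(records)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} (support={support}, confidence={confidence})")
```

Treating "motor overheating" as one item, not as the separate words "motor" and "overheating", is what lets a rule carry the technical meaning of an error.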

A Study on Data Cleansing Techniques for Word Cloud Analysis of Text Data (텍스트 데이터 워드클라우드 분석을 위한 데이터 정제기법에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.745-750 / 2021
  • In big data visualization analysis of unstructured text data, the raw data is usually large, and analysis techniques cannot be applied to it without cleansing. Therefore, unnecessary data is removed from the collected raw data in a first, heuristic cleansing process, and stopwords are removed in a second, machine cleansing process. The frequency of each word is then calculated and visualized using the word cloud technique, key issues are extracted and turned into information, and the results are analyzed. In this study, we propose a new stopword cleansing technique that uses an external stopword set (DB) with Python word cloud, and we identify the problems and effectiveness of this technique through a practical case analysis. Based on this verification, we present the practical utility of word cloud analysis using the proposed cleansing technique.
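The two-pass cleansing and frequency count described in the abstract above might look roughly like this in Python. It is a minimal sketch using only the standard library: the stopword set, the cleansing rules, and the function names are assumptions, and the word cloud rendering step is left out because the abstract does not say which library the authors used.

```python
import re
from collections import Counter

# Hypothetical external stopword set; the paper's actual stopword DB is not
# given in the abstract.
EXTERNAL_STOPWORDS = {"the", "a", "an", "of", "and", "is", "to", "in"}

def cleanse(raw_text):
    """First pass: heuristic cleansing that drops URLs, digits, and punctuation."""
    text = re.sub(r"https?://\S+", " ", raw_text.lower())
    return re.sub(r"[^a-z\s]", " ", text)

def word_frequencies(raw_text, stopwords=EXTERNAL_STOPWORDS):
    """Second pass: remove stopwords, then count the remaining vocabulary."""
    tokens = cleanse(raw_text).split()
    return Counter(t for t in tokens if t not in stopwords)

sample = "The price of data is rising: data and analysis of the data! See https://example.com/1"
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

The resulting word-to-count mapping is the usual input to a word cloud renderer, so keeping the stopword set external lets it be updated without touching the counting code.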

Investigating the Impact of Corporate Social Responsibility on Firm's Short- and Long-Term Performance with Online Text Analytics (온라인 텍스트 분석을 통해 추정한 기업의 사회적책임 성과가 기업의 단기적 장기적 성과에 미치는 영향 분석)

  • Lee, Heesung;Jin, Yunseon;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.13-31 / 2016
  • Despite expectations of short- or long-term positive effects of corporate social responsibility (CSR) on firm performance, the results of existing research into this relationship are inconsistent, partly due to a lack of clarity about subordinate CSR concepts. In this study, keywords related to CSR concepts are extracted from atypical sources, such as newspapers, using text mining techniques to examine the relationship between CSR and firm performance. The analysis is based on data from the New York Times, a major news publication, and Google Scholar. We used text analytics to process unstructured data collected from open online documents to explore the effects of CSR on short- and long-term firm performance. The results suggest that the CSR index computed using the proposed text analytics on online media predicts long-term performance very well compared to short-term performance, in the absence of any internal firm reports or CSR institute reports. Our study demonstrates that text analytics is useful for evaluating CSR performance in terms of convenience and cost effectiveness.