• Title/Summary/Keyword: Data Analysis Methodology (데이터 분석론)

Search Results: 1,370 (processing time: 0.034 seconds)

Design of Customized Research Information Service Based on Prescriptive Analytics (처방적 분석 기반의 연구자 맞춤형 연구정보 서비스 설계)

  • Lee, Jeong-Won;Oh, Yong-Sun
    • Journal of Internet of Things and Convergence / v.8 no.3 / pp.69-74 / 2022
  • Among big-data analysis techniques, prescriptive analytics improves the performance of passive learning models by using active learning to secure high-quality training data. Prescriptive analytics maximizes performance by enhancing machine learning models and optimizing the system through active learning, and it constructs expensive categorical data efficiently. To expand the value of data collected on research fields, research propensity, and research activities, this study designs a customized research-information service for researchers based on prescriptive analysis: after data pre-processing, it predicts the situation at execution time, derives feasible alternatives, and examines the validity of each alternative as the situation changes.
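The abstract's core mechanism is active learning used to secure high-quality training data. As a rough illustration only (not the paper's system), the pure-numpy sketch below runs uncertainty sampling: a logistic regression is refit each round and the unlabeled pool point with the most uncertain prediction is queried. All data and parameters here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pool: two Gaussian blobs (hypothetical stand-in for researcher features).
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

labeled = list(range(0, 100, 25))        # start with a few labeled points
pool = [i for i in range(100) if i not in labeled]

def fit_logreg(X, y, steps=200, lr=0.5):
    """Plain gradient-descent logistic regression (no external ML library)."""
    w = np.zeros(X.shape[1] + 1)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1 / (1 + np.exp(-Xb @ w))

for _ in range(10):                      # 10 active-learning rounds
    w = fit_logreg(X[labeled], y[labeled])
    p = predict_proba(w, X[pool])
    query = pool[int(np.argmin(np.abs(p - 0.5)))]   # most uncertain point
    labeled.append(query)                # the "oracle" labels it
    pool.remove(query)

acc = np.mean((predict_proba(w, X) > 0.5) == y)
```

The design point is the query rule: labeling budget goes to the points the current model is least sure about, rather than to random samples.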

A study on rethinking EDA in digital transformation era (DX 전환 환경에서 EDA에 대한 재고찰)

  • Seoung-gon Ko
    • The Korean Journal of Applied Statistics / v.37 no.1 / pp.87-102 / 2024
  • Digital transformation refers to the process by which a company or organization changes or innovates its existing business model or sales activities using digital technology. It requires various digital technologies - cloud computing, IoT, artificial intelligence, and so on - to strengthen competitiveness in the market, improve the customer experience, and discover new businesses. In addition, to derive knowledge and insight about the market, customers, and the production environment, it is necessary to select the right data, preprocess it into an analyzable state, and establish a systematic analysis process suited to the purpose. The usefulness of such digital data therefore depends on proper pre-processing and on the correct application of exploratory data analysis (EDA), which supports information and hypothesis exploration and the visualization of knowledge and insights. In this paper, we reexamine the philosophy and basic concepts of EDA and, for effective visualization, discuss key visualization information, information-expression methods based on the grammar of graphics, and the ACCENT principle, a final review standard for visualizations.

How to Identify Customer Needs Based on Big Data and Netnography Analysis (빅데이터와 네트노그라피 분석을 통합한 온라인 커뮤니티 고객 욕구 도출 방안: 천기저귀 온라인 커뮤니티 사례를 중심으로)

  • Soonhwa Park;Sanghyeok Park;Seunghee Oh
    • Information Systems Review / v.21 no.4 / pp.175-195 / 2019
  • This study combined big data and netnography analysis to analyze the needs and behaviors of an online consumer community. Big data analysis readily identifies correlations but has difficulty establishing causality. To overcome this limitation, we used netnography analysis as well. The netnography methodology excels at grasping context, but analyzing a large amount of data accumulated over a long period is time-consuming and costly. Therefore, in this study, we searched for patterns in the overall data through big data analysis, discovered the outliers that require netnography analysis, and then performed netnography analysis only on the periods immediately before and after those outliers. As a result, the causes of the phenomena revealed by big data analysis could be explained through netnography analysis. We could also identify internal structural changes in the community that big data analysis does not easily reveal. This study was thus able to explain much online consumer behavior that had been difficult to understand, as well as the contextual semantics in unstructured data that big data analysis misses. The big data-netnography integrated model proposed in this study can serve as a useful tool for discovering new consumer needs in the online environment.
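The selection step the abstract describes (find outliers in the aggregate pattern, then apply netnography only to windows around them) can be sketched with a simple z-score rule. The monthly post counts below are invented for illustration; the paper's actual detection method is not specified in the abstract.

```python
import numpy as np

# Hypothetical monthly post counts from an online community (illustrative only).
posts = np.array([120, 118, 125, 130, 122, 127, 310, 128, 121, 119, 124, 126])

z = (posts - posts.mean()) / posts.std()
outliers = np.where(np.abs(z) > 2)[0]          # months that warrant netnography

# Netnography is then applied only to a window around each outlier month.
windows = [(max(0, i - 1), min(len(posts) - 1, i + 1)) for i in outliers]
```

Here the qualitative (expensive) analysis is focused on one three-month window instead of the full twelve months, which is the cost saving the integrated model aims for.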

Dependency parsing applying reinforced dominance-dependency constraint rule: Combination of deep learning and linguistic knowledge (강화된 지배소-의존소 제약규칙을 적용한 의존구문분석 모델 : 심층학습과 언어지식의 결합)

  • JoongMin Shin;Sanghyun Cho;Seunglyul Park;Seongki Choi;Minho Kim;Miyeon Kim;Hyuk-Chul Kwon
    • Annual Conference on Human and Language Technology / 2022.10a / pp.289-294 / 2022
  • Dependency parsing is a syntactic analysis method that analyzes a sentence into dependency relations (dependent-head). Deep learning with transfer learning from pre-trained models currently shows good performance and is widely studied, but it is dataset-dependent, which leads to data-scarcity and overfitting problems. This paper proposes a model that combines deep learning with a reinforced head-dependent constraint-rule edge algorithm based on linguistic knowledge. Evaluated on the Modu Corpus (모두의 말뭉치) built according to the TTAS standard guidelines, the model achieved up to 96.28 UAS and 93.19 LAS, improvements of 2.21% UAS and 1.84% LAS over prior work. Moreover, despite training on a small dataset, it improved UAS by 0.95% over a model trained on a dataset eight times larger, with eleven times faster training. This confirms that combining deep learning with linguistic knowledge can address the weaknesses of deep learning.

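The paper's actual rule set is not given in the abstract; the sketch below only illustrates the general idea of pruning candidate head-dependent arcs with hand-written linguistic constraints (here, a head-final rule and a toy POS rule, both invented for the example) before or after a neural scorer proposes them.

```python
# Each candidate arc is (dependent_index, head_index, dependent_POS, head_POS).
# The rules below are simplified illustrations, not the paper's actual rule set.

def satisfies_constraints(dep_idx, head_idx, dep_pos, head_pos):
    if head_idx == dep_idx:
        return False                 # no self-dependency
    if head_idx < dep_idx:
        return False                 # head-final language: heads follow dependents
    if dep_pos == "VERB" and head_pos == "NOUN":
        return False                 # toy rule: a verb does not take a noun head
    return True

candidates = [
    (0, 2, "NOUN", "VERB"),   # kept
    (1, 1, "NOUN", "NOUN"),   # dropped: self-loop
    (2, 0, "VERB", "NOUN"),   # dropped: head precedes dependent
]
kept = [c for c in candidates if satisfies_constraints(*c)]
```

Filtering the neural model's candidate edges this way is one plausible reading of how linguistic knowledge constrains a learned parser without retraining it.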

A Study on practice of customer lifetime value models in the telecommunication industry (고객평생가치 모델의 통신산업 응용에 관한 연구)

  • Hwang, Seong-Seop;Jo, Seong-Jun
    • Proceedings of the Korean Operations and Management Science Society Conference / 2005.05a / pp.268-274 / 2005
  • Customer management is the process of analyzing and integrating customer data and information to plan, support, and evaluate marketing activities based on the characteristics of individual customers. It is one of the most important corporate strategies, and its importance has been emphasized since the 1990s. Customer management begins with accurately measuring customer value, and much research has been conducted on estimating customer lifetime value. We analyzed the theoretical background and methodology for applying customer lifetime value models to the Korean mobile telecommunication industry. This paper also identifies the causes of the problems that arise in this process and suggests solutions. These studies and methodologies can be applied directly to real marketing strategies.

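The abstract does not state the paper's exact model, but the textbook customer-lifetime-value formula such work builds on is a retention- and discount-weighted sum of per-period margins. The subscriber figures below are hypothetical.

```python
def customer_lifetime_value(margin, retention, discount, horizon):
    """Discounted sum of expected per-period margins:
    CLV = sum_{t=1..T} margin * retention**t / (1 + discount)**t
    """
    return sum(margin * retention**t / (1 + discount)**t
               for t in range(1, horizon + 1))

# Hypothetical mobile subscriber: 30,000 KRW monthly margin,
# 95% monthly retention, 1% monthly discount rate, 36-month horizon.
clv = customer_lifetime_value(30_000, 0.95, 0.01, 36)
```

Raising retention increases CLV directly, which is why churn-sensitive industries such as mobile telecommunications anchor marketing spend to this quantity.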

Web Log Analysis Technique using Fuzzy C-Means Clustering (Fuzzy C-Means클러스터링을 이용한 웹 로그 분석기법)

  • 김미라;곽미라;조동섭
    • Proceedings of the Korean Information Science Society Conference / 2002.04b / pp.550-552 / 2002
  • Clustering is a methodology for establishing relationships among patterns by dividing a given data set into groups with similar properties; many algorithms have been developed for this purpose and are widely applied in engineering fields such as pattern recognition and image processing. The FCM (Fuzzy C-Means) algorithm is based on the iterative optimization of an objective function that applies fuzzy theory to a least-squares criterion function. Whereas conventional hard-partition clustering takes a winner-take-all approach, FCM assigns each pattern a degree of membership in each cluster, which yields more accurate information. In this paper, we analyze web logs using the FCM technique.

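A minimal numpy version of the FCM iteration the abstract describes, alternating the membership and centroid updates for fuzzifier m; the 2-D "session" features are invented stand-ins for real web-log features.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Minimal FCM: alternate centroid and membership updates (fuzzifier m)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    p = 2.0 / (m - 1.0)
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # u_ik = d_ik^(-p) / sum_j d_ij^(-p)
        U = (d ** -p) / (d ** -p).sum(axis=1, keepdims=True)
    return centers, U

# Two obvious groups of "sessions" (hypothetical 2-D web-log features).
X = np.array([[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
              [5.0, 5.1], [5.1, 5.0], [4.9, 5.2]])
centers, U = fuzzy_c_means(X, c=2)
```

Unlike hard k-means, each row of `U` is a full membership distribution, so borderline sessions are visible as memberships near 0.5 rather than being forced into one cluster.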

Development of Topic Trend Analysis Model for Industrial Intelligence using Public Data (텍스트마이닝을 활용한 공개데이터 기반 기업 및 산업 토픽추이분석 모델 제안)

  • Park, Sunyoung;Lee, Gene Moo;Kim, You-Eil;Seo, Jinny
    • Journal of Technology Innovation / v.26 no.4 / pp.199-232 / 2018
  • There is an increasing need to understand the business-management environment through big data analysis at the industry and firm levels. Research using company disclosure information, which comprehensively covers a company's business performance and future plans, is attracting attention. However, because such disclosure data is unstructured, there has been limited research on applicable analytical models that leverage it. This study proposes a text-mining-based analytical model for industry- and firm-level analyses using publicly available company disclosure data. Specifically, we apply the LDA topic model and the word2vec word-embedding model to U.S. SEC filings of publicly listed firms and analyze the trends of business topics at the industry and firm levels. Using LDA topic modeling on SEC EDGAR 10-K documents, industry-wide management topics are identified. To compare topic-trend patterns across industries, the software and hardware industries are compared over the last 20 years. Changes of management subjects at the firm level are also observed by comparing two companies in the software industry. These topic trends provide a lens for identifying declining and growing management subjects at the industry and firm levels. By mapping companies and products (or services) through dimension reduction - applying the word2vec embedding model and principal component analysis to firm-level 10-K documents in the software industry - companies and products (services) with similar management subjects are identified, along with their changes over the decades. By suggesting a methodology for building analysis models on public management data at the industry and firm levels, this study lays practical groundwork for identifying changes in management subjects.
However, further research is needed on microscopic analytical models relating technology-management strategy to management performance, particularly for firms whose management topics change frequently or gain momentum, and on competitive-context analysis models that compare product (service) portfolios between firms.
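Of the pipeline the abstract describes (LDA topics, word2vec embeddings, PCA mapping), the final dimension-reduction mapping can be sketched with plain numpy: toy term counts stand in for 10-K text, and PCA via SVD places firms with similar topics near each other. The vocabulary, firms, and counts are all invented.

```python
import numpy as np

# Hypothetical bag-of-words counts for four 10-K excerpts over a tiny vocabulary
# ["cloud", "license", "hardware", "chip"] (stand-in for real filings).
counts = np.array([
    [9, 1, 0, 0],   # software firm A
    [8, 2, 1, 0],   # software firm B
    [0, 1, 9, 8],   # hardware firm C
    [1, 0, 8, 9],   # hardware firm D
], dtype=float)

tf = counts / counts.sum(axis=1, keepdims=True)       # normalize document length
centered = tf - tf.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ Vt[:2].T                          # PCA: project onto 2 PCs
```

Firms with similar management subjects land close together on the first principal component, which is the "mapping companies and products" view the study uses to track portfolio drift.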

Determining on Model-based Clusters of Time Series Data (시계열데이터의 모델기반 클러스터 결정)

  • Jeon, Jin-Ho;Lee, Gye-Sung
    • The Journal of the Korea Contents Association
    • /
    • v.7 no.6
    • /
    • pp.22-30
    • /
    • 2007
  • Most real-world systems, such as the world economy, stock markets, and medical applications, contain a series of dynamic and complex phenomena. One common way to understand these systems is to build a model and analyze the system's behavior. In this paper, we investigate methods for finding the best clustering of time series data. As a first step, the BIC (Bayesian Information Criterion) approximation is used to determine the number of clusters. A search technique to improve clustering efficiency is also suggested by analyzing the relationship between data size and BIC values. For clustering, two methods, model-based and similarity-based, are analyzed and compared. A number of experiments on real data (stock prices) have been performed to check validity. The experiments confirm that the BIC approximation suggests the best number of clusters, provided that the number of data points is relatively large. They also confirm that model-based clustering produces more reliable results than similarity-based clustering.
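A self-contained sketch of the abstract's first step, choosing the number of clusters by BIC: a 1-D Gaussian mixture is fit by EM for each candidate k, and the k with the lowest BIC wins. The three-regime series below is synthetic, not the paper's stock-price data.

```python
import numpy as np

def em_gmm(x, k, iters=300):
    """Fit a 1-D Gaussian mixture by EM; return (weights, means, variances)."""
    n = len(x)
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)   # spread initial means out
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)   # E-step: responsibilities
        nk = r.sum(axis=0)                           # M-step: re-estimate params
        w, mu = nk / n, (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9
    return w, mu, var

def bic(x, k):
    w, mu, var = em_gmm(x, k)
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    loglik = np.log(dens.sum(axis=1)).sum()
    n_params = 3 * k - 1                 # k means, k variances, k-1 free weights
    return -2 * loglik + n_params * np.log(len(x))

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(m, 0.5, 50) for m in (0.0, 5.0, 10.0)])  # 3 regimes
scores = {k: bic(x, k) for k in range(1, 6)}
best_k = min(scores, key=scores.get)
```

The BIC penalty term grows with both k and the data size n, so extra components must buy a real likelihood gain, which is how the criterion resists over-segmenting the series.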

Novel Deep Learning-Based Profiling Side-Channel Analysis on the Different-Device (이종 디바이스 환경에 효과적인 신규 딥러닝 기반 프로파일링 부채널 분석)

  • Woo, Ji-Eun;Han, Dong-Guk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.5
    • /
    • pp.987-995
    • /
    • 2022
  • Many deep learning-based profiling side-channel analyses have been proposed. Deep learning-based profiling analysis trains a neural network on the relationship between side-channel information and intermediate values, then finds the secret key of the attack device using the trained network. Recently, cross-device profiling side-channel analysis was proposed to reflect realistic attack scenarios. However, its attack performance degrades when the profiling device and the attack device do not use the same chip. In this paper, we define an environment in which the profiling device and the attack device use different chips as the different-device setting, and we propose a novel deep learning-based profiling side-channel analysis for it. An MCNN is also used to extract the characteristics of each data source effectively. We experimented with six different boards to verify the attack performance of the proposed method; as a result, the minimum number of attack traces was reduced by up to a factor of 25 compared with the baseline.
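Independent of the MCNN architecture, the key-recovery step common to profiling attacks can be illustrated with the usual scoring rule: for each key guess, sum over traces the log-probability the trained model assigns to the intermediate value that guess implies. The 4-bit toy cipher and the stand-in "model output" below are fabricated for illustration; a real attack would run a trained network on measured traces.

```python
import numpy as np

rng = np.random.default_rng(0)
K_TRUE = 11                                  # hypothetical 4-bit secret key
plaintexts = rng.integers(0, 16, 50)
intermediates = plaintexts ^ K_TRUE          # toy leakage target: p XOR k

# Stand-in for a trained profiling model: a probability vector per trace that
# favors the true intermediate value (a real model would consume trace samples).
probs = np.full((50, 16), 0.02)
probs[np.arange(50), intermediates] = 0.70
probs /= probs.sum(axis=1, keepdims=True)

# Key recovery: for each key guess, sum log-probabilities of the implied
# intermediate values across traces; the true key should score highest.
scores = np.array([
    np.log(probs[np.arange(50), plaintexts ^ k]).sum() for k in range(16)
])
recovered = int(np.argmax(scores))
```

"Minimum number of attack traces," the metric the abstract reports, is the smallest trace count for which this ranking first places the true key at position one.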

A Study on the Naming Rules of Metadata based on Ontology (온톨로지 기반 메타데이터 명명 규칙에 관한 연구)

  • Ko, Young-Man;Seo, Tae-Sul
    • Journal of the Korean Society for information Management
    • /
    • v.22 no.4 s.58
    • /
    • pp.97-109
    • /
    • 2005
  • To build consistency among different metadata systems and to increase their interoperability even across domains, naming rules and glossaries for data elements are necessary. This study discusses the naming and identification of the data element concept, data element, conceptual domain, and value domain, together with their meta model. It also describes example naming conventions based on an ontology derived from combining the object, properties, and representation of data elements. The naming principles and rules described in this study use I-R analysis, the DC metadata set, and SHOE 1.0 as examples for scientific documents. This study can serve as a guideline for building ontology-based metadata naming rules in various domains.
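The object/property/representation naming convention the abstract mentions (terminology usually associated with ISO/IEC 11179-style data elements) lends itself to mechanical checking. The specific underscore-delimited UpperCamelCase pattern below is an illustrative assumption, not the paper's actual rule.

```python
import re

# Illustrative data-element name shape: ObjectClass_Property_RepresentationTerm,
# each part in UpperCamelCase (an assumption, not the paper's rule).
NAME_RULE = re.compile(r"^[A-Z][a-zA-Z]*_[A-Z][a-zA-Z]*_[A-Z][a-zA-Z]*$")

def valid_name(name):
    """True if a candidate data-element name follows the three-part pattern."""
    return bool(NAME_RULE.match(name))

ok = valid_name("Document_Title_Text")     # object, property, representation
bad = valid_name("document title")         # unstructured, rejected
```

Encoding the convention as a rule lets a registry reject non-conforming element names automatically, which is the consistency goal the study describes.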