• Title/Abstract/Keywords: Large data

Search results: 14,214 items (processing time: 0.04 s)

Real-time and Parallel Semantic Translation Technique for Large-Scale Streaming Sensor Data in an IoT Environment

  • 권순현;박동환;방효찬;박영택
    • 정보과학회 논문지 / Vol. 42, No. 1 / pp. 54-67 / 2015
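  • Recently, research on combining IoT environments with semantic web technology has been actively pursued to increase the value and interoperability of the sensor data they generate. To this end, semantic conversion of sensor data is essential for fusing sensor data with service-domain knowledge. However, existing semantic translation techniques convert static metadata into semantic data (RDF) and cannot adequately handle the real-time, large-volume nature of IoT environments. This paper therefore presents a technique that converts large-scale streaming sensor data generated in IoT environments into semantic data through real-time parallel processing. The technique defines translation rules for semantic conversion and, using those rules together with an ontology-based sensor model, translates sensor data in real time and in parallel and stores the result in a semantic repository. To improve performance, each translation task is processed in parallel using Apache Storm, a real-time big data analysis framework. A system implementing the technique is built, and its performance is evaluated on Korea Meteorological Administration AWS observation data, a large-scale streaming sensor data set, to validate the proposed technique.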
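
The rule-based translation described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Reading` fields, the `http://example.org/sensor/` namespace, and the two rules are hypothetical, and a thread pool stands in for Apache Storm's parallel workers. It is not the paper's rule language or ontology.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical namespace; the paper's ontology-based sensor model is not given in the abstract.
EX = "http://example.org/sensor/"
XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Reading:
    station_id: str
    timestamp: str        # ISO 8601 string
    temperature_c: float


# Each illustrative rule maps a reading to (predicate name, N-Triples object literal).
RULES = [
    ("observedAt", lambda r: f'"{r.timestamp}"^^<{XSD}dateTime>'),
    ("temperature", lambda r: f'"{r.temperature_c}"^^<{XSD}double>'),
]


def to_ntriples(r: Reading) -> list[str]:
    """Apply every rule to one reading and return N-Triples lines."""
    subject = f"<{EX}observation/{r.station_id}/{r.timestamp}>"
    triples = [f"{subject} <{EX}station> <{EX}station/{r.station_id}> ."]
    for predicate, make_object in RULES:
        triples.append(f"{subject} <{EX}{predicate}> {make_object(r)} .")
    return triples


def translate_stream(readings, workers: int = 4) -> list[str]:
    """Translate readings in parallel; a thread pool is a stand-in for Storm here."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [t for triples in pool.map(to_ntriples, readings) for t in triples]
```

Because each reading is translated independently of the others, the work can be spread across workers without coordination, which is the property a streaming framework would exploit.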

Customer Classification Method for Household Appliances Industries with a Large Number of Incomplete Data

  • 장영순;서종현
    • 산업공학 / Vol. 19, No. 1 / pp. 86-96 / 2006
  • Customer data in some manufacturing industries are largely incomplete because customers purchase infrequently and sales representatives gather only limited customer profile data. As a result, most sophisticated data analysis methods cannot be applied directly. This paper proposes a heuristic data analysis method to classify customers in the household appliance industry. The proposed PD (percent of difference) method can be used for discriminant analysis of incomplete customer data with simple mathematical calculations. The method consists of a variable distribution estimation step, PD measure and cluster score evaluation steps, a variable impact construction step, and a segment assignment step. A real example is also presented.

Query Optimization on Large Scale Nested Data with Service Tree and Frequent Trajectory

  • Wang, Li;Wang, Guodong
    • Journal of Information Processing Systems / Vol. 17, No. 1 / pp. 37-50 / 2021
  • Query applications over nested data, the most commonly used form of data representation on the web, are becoming more widely used, precise queries in particular. MapReduce, a distributed architecture with parallel computing power, provides a good solution for big data processing. In practice, however, query requests are usually concurrent, which causes bottlenecks in server processing. To solve this problem, this paper first combines a column storage structure and an inverted index to build an index for nested data on MapReduce. On this basis, it puts forward an optimization strategy that combines a query execution service tree with frequent sub-query trajectories to reduce the response time of frequent queries and further improve the efficiency of multi-user concurrent queries on large-scale nested data. Experiments show that this method greatly improves the efficiency of nested data queries.
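
As a rough illustration of the indexing idea in the abstract above, the sketch below flattens nested records into column-like dotted paths and builds an inverted index from (path, value) pairs to record ids. It runs in a single process and does not use MapReduce, and it omits the service tree and frequent-trajectory optimizations. The function names and record layout are hypothetical.

```python
from collections import defaultdict
from typing import Any, Iterator


def flatten(node: Any, prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (dotted_path, leaf_value) pairs; list positions are dropped so
    repeated fields share one column-like path."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for child in node:
            yield from flatten(child, prefix)
    else:
        yield prefix, node


def build_index(records: dict[int, Any]) -> dict[tuple[str, Any], set[int]]:
    """Map each (path, value) pair to the ids of the records containing it."""
    index: dict[tuple[str, Any], set[int]] = defaultdict(set)
    for record_id, record in records.items():
        for path, value in flatten(record):
            index[(path, value)].add(record_id)
    return index


def precise_query(index, conditions: dict[str, Any]) -> set[int]:
    """Return ids of records that satisfy every path == value condition."""
    postings = [index.get((path, value), set()) for path, value in conditions.items()]
    return set.intersection(*postings) if postings else set()
```

For example, with `records = {1: {"user": {"name": "li", "tags": ["a", "b"]}}}`, the query `precise_query(build_index(records), {"user.tags": "b"})` looks up one posting list instead of scanning every record.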

The Characteristics of Korea Stock Market using Variance Ratio: Focusing on a Comparison Before and After the Subprime Crisis

  • 서상구;박종해
    • 경영과정보연구 / Vol. 26 / pp. 293-309 / 2008
  • This study examines the efficiency of the Korean stock market by comparing the variance ratios (VR) of stock groups sorted by market capitalization. We compute variance ratios for KOSPI large-, medium-, and small-capitalization stocks over 546 trading days from 2006/01/02 to 2008/04/15, using high-frequency intraday 1-minute data. The variance ratios of the groups behave as follows. From the 1-minute to the 5-minute interval, the variance ratios of all three groups increase, moving far from zero. As the time interval lengthens, the variance ratios decrease, but only that of the large-capitalization group converges to around zero. This means that the large-capitalization market is more efficient than the other stock groups. The sample period can be divided into two sub-periods because the subprime crisis that originated in the U.S.A. affected the Korean stock market. Before the subprime crisis, the VRs of mid caps and small caps, unlike that of large caps, do not converge to around zero even as the time interval lengthens. After the subprime crisis, the VRs of all three groups decrease as the time interval lengthens, but only that of large caps converges to around zero. We conclude that large caps are more efficient than the other stock groups in the Korean stock market.
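
The abstract does not state the exact variance ratio definition used. The sketch below assumes the common form, the variance of q-period log returns divided by q times the variance of 1-period log returns, with one subtracted so that a random walk gives a value near zero, which matches the abstract's "converge to around zero" wording. The function name and the use of overlapping q-period returns are illustrative choices, not the authors' procedure.

```python
import math
from statistics import pvariance


def variance_ratio_minus_one(prices: list[float], q: int) -> float:
    """Return VR(q) - 1 for a price series sampled at a fixed interval.

    VR(q) = Var(q-period log return) / (q * Var(1-period log return)).
    Values near 0 are consistent with a random walk; negative values
    suggest mean reversion and positive values suggest momentum.
    """
    if q < 1 or len(prices) < q + 2:
        raise ValueError("need q >= 1 and at least q + 2 prices")
    log_p = [math.log(p) for p in prices]
    one_period = [b - a for a, b in zip(log_p, log_p[1:])]
    # Overlapping q-period returns (an assumption; non-overlapping also works).
    q_period = [log_p[i + q] - log_p[i] for i in range(len(log_p) - q)]
    return pvariance(q_period) / (q * pvariance(one_period)) - 1.0
```

With 1-minute prices, calling the function for q = 1 through 5 and beyond would trace how the ratio changes as the interval lengthens, which is the comparison the abstract describes across capitalization groups.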

Too Big to Fail: Succession Challenge in Large Family Businesses

  • NG, Hadi Cahyadi;TAN, Jacob Donald;SUGIARTO, Sugiarto;WIDJAJA, Anton Wachidin;PRAMONO, Rudy
    • The Journal of Asian Finance, Economics and Business / Vol. 8, No. 1 / pp. 199-206 / 2021
  • This study investigated the main concerns and strategies of large Indonesian family businesses in undertaking intergenerational succession effectively. The research data were obtained to shed light on the incumbents' mindsets, key preferences, and experiences during the succession process. Access to incumbents of large family businesses that are conglomerates is scant. A preliminary survey was conducted to become sensitized to the intricacy of the intergenerational succession process in large family businesses. It was followed by interpretative phenomenological analysis of qualitative data from interviews, observations, and field notes, gathered by approaching family members in five conglomerate groups that have major impacts on the economy. The findings explicate the incumbents' preferred criteria for choosing their successors as well as their perceived concerns about the appointment. Additionally, the study elicits the incumbents' succession approaches during the intergenerational succession process: apprentice learning by successors, adaptability to external forces by successors, nurturing the entrepreneurial spirit in successors, governance establishment in the firms, business interest stimulation in successors, role modeling by incumbents, and collaboration between family and key non-family members. The study concludes with noteworthy implications for incumbents and successors in large family businesses, in particular explicit criteria and strategies for appointing suitable successors, and suggests potential avenues for future research.

Design of Linguistic Contents of Speech Corpora for Speech Recognition and Synthesis for Common Use

  • 김연화;김형주;김봉완;이용주
    • 대한음성학회지:말소리 / No. 43 / pp. 89-99 / 2002
  • Research into ways of improving large-vocabulary continuous speech recognition and speech synthesis has recently intensified as the field of speech information technology progresses rapidly. In speech recognition, stochastic methods such as HMM require large amounts of speech data for training. In speech synthesis, recent practice shows that better quality can be obtained by selecting and concatenating variable-size units of speech data from a large amount of speech data. In this paper we design and discuss linguistic contents for speech corpora for speech recognition and synthesis to be shared in common.

LARGE EDDY SIMULATION OF TURBULENT CHANNEL FLOW USING ALGEBRAIC WALL MODEL

  • MALLIK, MUHAMMAD SAIFUL ISLAM;UDDIN, MD. ASHRAF
    • Journal of the Korean Society for Industrial and Applied Mathematics / Vol. 20, No. 1 / pp. 37-50 / 2016
  • A large eddy simulation (LES) of a turbulent channel flow is performed using the third-order low-storage Runge-Kutta method in time and a second-order finite difference formulation in space on a staggered grid, at a Reynolds number $Re_{\tau}=590$ based on the channel half width $\delta$ and the wall shear velocity $u_{\tau}$. To reduce the computational cost of LES, an algebraic wall model (AWM) is applied to approximate the near-wall region. The computation is performed in a domain of $2\pi\delta \times 2\delta \times \pi\delta$ with $32 \times 20 \times 32$ grid points. The standard Smagorinsky model is used for subgrid-scale (SGS) modeling. Essential turbulence statistics of the flow field are computed and compared with Direct Numerical Simulation (DNS) data and with LES data obtained without a wall model. Agreements as well as discrepancies are discussed. The flow structures in the computed flow field are also discussed and compared with LES data obtained without a wall model.
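
For reference, the quantities named in the abstract above have the following standard textbook definitions. The symbols $\nu$ (kinematic viscosity), $\tau_w$ (wall shear stress), $\rho$ (density), $C_s$ (Smagorinsky constant), and $\Delta$ (filter width) do not appear in the abstract and are introduced here as assumptions. The abstract does not state the value of $C_s$ or the form of the algebraic wall model.

$$Re_{\tau} = \frac{u_{\tau}\,\delta}{\nu}, \qquad u_{\tau} = \sqrt{\frac{\tau_w}{\rho}}$$

$$\nu_t = (C_s \Delta)^2\,|\bar{S}|, \qquad |\bar{S}| = \sqrt{2\,\bar{S}_{ij}\bar{S}_{ij}}, \qquad \bar{S}_{ij} = \frac{1}{2}\left(\frac{\partial \bar{u}_i}{\partial x_j} + \frac{\partial \bar{u}_j}{\partial x_i}\right)$$

Here $\nu_t$ is the subgrid-scale eddy viscosity of the standard Smagorinsky model and $\bar{S}_{ij}$ is the resolved strain-rate tensor.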

DATA MINING AND PREDICTION OF SAI TYPE MATRIX PRECONDITIONER

  • Kim, Sang-Bae;Xu, Shuting;Zhang, Jun
    • Journal of applied mathematics & informatics / Vol. 28, No. 1_2 / pp. 351-361 / 2010
  • The solution of large sparse linear systems is one of the most important problems in large-scale scientific computing. Among the many methods developed, the preconditioned Krylov subspace methods are considered the preferred methods. Selecting a suitable preconditioner with appropriate parameters for a specific sparse linear system is a challenging task for many application scientists and engineers who have little knowledge of preconditioned iterative methods. The prediction of ILU-type preconditioners was considered in [27], where the support vector machine (SVM), a data mining technique, is used to classify large sparse linear systems and predict the best preconditioners. In this paper, we apply the data mining approach to sparse approximate inverse (SAI) type preconditioners to find parameters with which the preconditioned Krylov subspace method shows the best performance on the linear systems.

Sampled-Data Observer-Based Decentralized Fuzzy Control for Nonlinear Large-Scale Systems

  • Koo, Geun Bum;Park, Jin Bae;Joo, Young Hoon
    • Journal of Electrical Engineering and Technology / Vol. 11, No. 3 / pp. 724-732 / 2016
  • In this paper, a sampled-data observer-based decentralized fuzzy control technique is proposed for a class of nonlinear large-scale systems that can be represented by a Takagi-Sugeno fuzzy system. The premise variable is assumed to be measurable for the design of the observer-based fuzzy controller, and the closed-loop system is obtained. Based on an exact discretized model of the closed-loop system, a stability condition is derived and then converted into linear matrix inequality (LMI) format. Finally, an example is provided to verify the effectiveness of the proposed technique.

Age and Gender in Reddit Commenting and Success

  • Finlay, S. Craig
    • Journal of Information Science Theory and Practice / Vol. 2, No. 3 / pp. 18-28 / 2014
  • Reddit is a large user-generated content (USG) website in which users form common interest groups and submit links to external content or text posts of user-created content. The website operates on a voting system whereby registered users can assign positive or negative ratings to both submitted content and comments made on submitted content. While Reddit is a pseudonymous site, with users creating usernames but providing no biographical data, an informal survey posted to a large shared-interest community yielded 734 responses that included the age and gender of users. This provided a large amount of contextual biographical data with which to analyse user profiles at the first level of Computer Mediated Discourse Analysis (CMDA), as articulated by Susan Herring. The results indicate that older Reddit users both produce more complex writing and enjoy more success when rated by other users. Gender data was incomplete, and as such only tentative results could be proposed in that regard.