• Title/Summary/Keyword: Unstructed Data

An Extraction Method of Sentiment Infromation from Unstructed Big Data on SNS (SNS상의 비정형 빅데이터로부터 감성정보 추출 기법)

  • Back, Bong-Hyun;Ha, Ilkyu;Ahn, ByoungChul
    • Journal of Korea Multimedia Society / v.17 no.6 / pp.671-680 / 2014
  • With the recent rapid growth of social network services (SNS), it has become necessary to extract information of interest from the large volume of SNS data that reflects individual opinions and preferences. Sentiment information can be applied in many fields, such as politics, public opinion, economics, personal services, and entertainment. Extracting it requires processing techniques that store large amounts of SNS data, extract meaningful data from them, and search for sentiment information. This paper proposes an efficient method for extracting sentiment information from various unstructured big data on social networks using the HDFS (Hadoop Distributed File System) platform and MapReduce functions. In experiments, the proposed method collects and stores data steadily as the amount of data increases. When the proposed functions are applied to sentiment analysis, the system maintains load balancing and the analysis results are very close to those of manual analysis.

A Parallel HDFS and MapReduce Functions for Emotion Analysis (감성분석을 위한 병렬적 HDFS와 맵리듀스 함수)

  • Back, BongHyun;Ryoo, Yun-Kyoo
    • Journal of the Korea society of information convergence / v.7 no.2 / pp.49-57 / 2014
  • Opinion mining has recently been introduced to extract useful information from SNS data and to evaluate the true intentions of users. It requires efficient techniques to collect and analyze large amounts of SNS data and to extract meaningful data from them. In this paper, we therefore propose a parallel HDFS (Hadoop Distributed File System) and MapReduce-based emotion functions to extract users' emotional information from various unstructured big data on social networks. Experimental results verify that the proposed system and functions perform faster than O(n) in data gathering time and loading time, and maintain stable load balancing of memory and CPU resources.

GNUnet improvenemt for anonymity supporing in large multimedia file (대형 멀티미디어 파일의 익명성 지원을 위한 수정 GNUnet)

  • Lee Myoung-Hoon;Park Byung-Yeon;Jo In-June
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2006.05a / pp.81-90 / 2006
  • GNUnet supports file anonymity by encoding files in 1 KB blocks, distributing the encoded blocks to peers over an unstructured model, and decoding the original data by searching for the encoded blocks. However, applying this encoding and block distribution method to large multimedia files of 600 to 700 MB exposes several problems. First, it requires additional R blocks and I blocks, which consume about 4% of storage resources. Second, the unstructured model adds network load because blocks are distributed by broadcast. Third, the keyword search function has limitations. This paper proposes a variable encoding block size and a structured model for block distribution. The proposed encoding method reduces supplementary block generation from 4% to 1%, and the proposed structured model reduces network load by sending answers through a dedicated peer that distributes the blocks. We also define content-based keywords and identifiers for shared files.

Multidimensional Analysis of Unstructured Data and Trends in Architectural Review Opinions of Small and Medium-Sized Apartment Projects (다차원 분석방법을 활용한 중소규모 공동주택 건축심의 의견의 경향과 비정형 데이터로서의 특성분석)

  • Kim, Jinhee;Hwang, Taeeon;Kim, Jae-Sik;Huh, Youngki
    • Korean Journal of Construction Engineering and Management / v.24 no.6 / pp.74-80 / 2023
  • This study examines the characteristics of architectural review opinions as unstructured data, focusing on the most challenging risk for developers of small and medium-sized apartment projects in response to the increasing number of single-person households in Korea. Using multidimensional analysis methods, the study analyzes the review opinions of 25 projects in B City. Correspondence analysis and MDS (multidimensional scaling) analysis show that, consistent with prior research, keywords related to 'structure' and 'planning' dominate architectural review opinions in B City. While the MDS model's stress is very poor at 34.4%, correspondence analysis reveals that this is due to the characteristics of unstructured data in architectural reviews. In addition, the unstructured data analyzed in this study, such as architectural review opinions, exhibited a probability distribution with low kurtosis and high skewness, because they involved various combinations and occurrences of data depending on the discretion of the review committee members and the specific formats of different local governments. This often led to the emergence of keywords that differed significantly from commonly mentioned terms. Although the study has some limitations, it provides a foundation for future detailed analysis by identifying the characteristics of architectural review opinions as unstructured data.