• Title/Summary/Keyword: Big data Processing

A Study on Radar Video Fusion Systems for Pedestrian and Vehicle Detection (보행자 및 차량 검지를 위한 레이더 영상 융복합 시스템 연구)

  • Sung-Youn Cho;Yeo-Hwan Yoon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.24 no.1 / pp.197-205 / 2024
  • At a time when securing driving safety is the most important issue in the development and commercialization of autonomous vehicles, AI and big data-based algorithms are being studied to advance and optimize the recognition and detection of static and dynamic vehicles in front of and around the vehicle. Many studies recognize the same vehicle by combining the respective advantages of radar and camera, but they either do not use deep learning image processing or identify the same target only at short range because of radar performance limitations. A convergence-based vehicle recognition method is therefore needed that builds a dataset from radar and camera equipment, calculates the error of the dataset, and recognizes detections as the same target. Because whether detections are judged to be the same object depends on where the radar and CCTV (video) are installed, which causes data errors, this paper aims to develop a technology that links location information according to the installation location.

A Study of the Definition and Components of Data Literacy for K-12 AI Education (초·중등 AI 교육을 위한 데이터 리터러시 정의 및 구성 요소 연구)

  • Kim, Seulki;Kim, Taeyoung
    • Journal of The Korean Association of Information Education / v.25 no.5 / pp.691-704 / 2021
  • The development of AI technology has brought about a big change in our lives. The importance of AI and data education is also growing as AI's influence spreads from daily life to society and the economy. In response, the OECD Education Research Report and various domestic information and curriculum studies deal with data literacy and present it as an essential competency. However, the definition of data literacy and the content and scope of its components vary among researchers. To present an objective and comprehensive definition and set of components, we therefore analyze the definitions given in key data literacy studies and the frequency of words used in their components, together with the semantic similarity of words obtained through Word2Vec, a deep learning natural language processing method. The result was revised and supplemented by expert review, and we defined data literacy as the 'basic ability of knowledge construction and communication to collect, analyze, and use data and process it as information for problem solving'. Furthermore, we propose components for each of the categories of knowledge, skills, and values and attitudes. We hope that the definition and components of data literacy derived from this study will serve as a good foundation for the systematization of, and education research on, AI education related to students' future competencies.
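The abstract above does not say how semantic similarity between Word2Vec vectors was measured. A common choice is cosine similarity, so the following is only a minimal sketch under that assumption. The three-dimensional vectors and the words attached to them are hypothetical values invented for illustration, not output from any trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional embeddings, invented for illustration only.
embeddings = {
    "collect": [0.9, 0.1, 0.0],
    "gather": [0.8, 0.2, 0.1],
    "ethics": [0.0, 0.2, 0.9],
}

sim_close = cosine_similarity(embeddings["collect"], embeddings["gather"])
sim_far = cosine_similarity(embeddings["collect"], embeddings["ethics"])
print(f"collect~gather: {sim_close:.3f}, collect~ethics: {sim_far:.3f}")
```

With real embeddings, words that appear in similar contexts across the surveyed definitions would score closer to 1, which is one way overlapping terms could be grouped into a shared component.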

Efficient Privacy-Preserving Duplicate Elimination in Edge Computing Environment Based on Trusted Execution Environment (신뢰실행환경기반 엣지컴퓨팅 환경에서의 암호문에 대한 효율적 프라이버시 보존 데이터 중복제거)

  • Koo, Dongyoung
    • KIPS Transactions on Computer and Communication Systems / v.11 no.9 / pp.305-316 / 2022
  • With the flood of digital data owing to the Internet of Things and big data, cloud service providers that process and store vast amounts of data from multiple users can apply duplicate data elimination techniques for efficient data management. The edge computing paradigm, introduced as an extension of cloud computing, can improve the user experience by alleviating problems such as network congestion toward a central cloud server and reduced computational efficiency. However, adding a new edge device that is not entirely reliable may increase the computational complexity, because additional cryptographic operations are needed to preserve data privacy during duplicate identification and elimination. In this paper, we propose a more efficient duplicate data elimination protocol that preserves data privacy, using an optimized user-edge-cloud communication framework built on a trusted execution environment. Direct sharing of secret information between the user and the central cloud server minimizes the computational complexity at edge devices and enables the use of efficient encryption algorithms on the cloud service provider's side. Offloading data to edge devices also improves the user experience, since duplicate elimination can proceed while the user acts independently. Experiments show the efficiency of the proposed scheme, including up to 78x improvement in computation during the data outsourcing process compared with a previous study that does not exploit a trusted execution environment in the edge computing architecture.
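To make the basic idea of duplicate elimination concrete, here is a minimal sketch of a content-addressed store that keeps each distinct payload once. It is not the protocol proposed in the paper: it omits encryption, the edge device, and the trusted execution environment entirely, and the class name `DedupStore` is a hypothetical name chosen for illustration.

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical payloads are stored once."""

    def __init__(self):
        self.blobs = {}  # digest -> payload
        self.refs = {}   # digest -> number of uploads that reference it

    def put(self, payload: bytes) -> str:
        # Identify the payload by its SHA-256 digest.
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = payload  # first copy: actually store it
        self.refs[digest] = self.refs.get(digest, 0) + 1
        return digest

store = DedupStore()
a = store.put(b"sensor log 1")
b = store.put(b"sensor log 1")  # duplicate upload, no new blob stored
c = store.put(b"sensor log 2")
print(len(store.blobs), store.refs[a])
```

Identifying plaintext by its hash is exactly what becomes difficult once data must stay private from an untrusted party, which is the gap that privacy-preserving schemes such as the one described above aim to close.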

IRFP-tree: Intersection Rule Based FP-tree (IRFP-tree(Intersection Rule Based FP-tree): 메모리 효율성을 향상시키기 위해 교집합 규칙 기반의 패러다임을 적용한 FP-tree)

  • Lee, Jung-Hun
    • KIPS Transactions on Software and Data Engineering / v.5 no.3 / pp.155-164 / 2016
  • For frequent pattern analysis of large databases, tree-based algorithms that compensate for the disadvantages of the Apriori method have been widely studied. In a frequent pattern tree, the number of nodes determines memory allocation and also affects memory consumption and the processing speed of tree growth. Reducing the number of nodes in the tree is therefore very important in frequent pattern mining. However, the absolute criteria used to order transaction items when constructing the tree lower the compression ratio of the tree nodes, and most frequency-based tree construction methods adopt such absolute criteria. The FP-tree is a typical frequent pattern tree: an extended prefix-tree structure that stores compressed, crucial information about frequent patterns. To construct the tree, all frequent items in each transaction are sorted according to an absolute criterion, frequency-descending order. CanTree also needs an absolute criterion, canonical order, to construct the tree. In this paper, we propose the IRFP-tree (Intersection Rule based FP-tree) algorithm, a novel frequent pattern tree construction method that does not use absolute criteria. The IRFP-tree is built on a new paradigm, the intersection rule, instead. It increases the compression ratio of the tree nodes and reduces the tree construction time. Our method has the additional advantage of supporting incremental mining. The reported test results demonstrate the applicability and effectiveness of the proposed approach.
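As background for the abstract above, the following is a minimal sketch of the baseline it describes: a plain FP-tree whose transactions are sorted by the absolute criterion of frequency-descending order before insertion. It does not implement the IRFP-tree or its intersection rule, which the abstract does not specify. The header table and node links of a full FP-tree are omitted, and the names `Node`, `build_fp_tree`, and `count_nodes` are illustrative.

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, its count, and child nodes keyed by item."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_support=1):
    # First pass: count in how many transactions each item occurs.
    freq = Counter(item for t in transactions for item in set(t))
    root = Node(None)
    # Second pass: insert each transaction as a path from the root.
    for t in transactions:
        items = [i for i in set(t) if freq[i] >= min_support]
        # Absolute criterion: frequency descending (ties broken by name).
        items.sort(key=lambda i: (-freq[i], i))
        node = root
        for i in items:
            if i not in node.children:
                node.children[i] = Node(i)
            node = node.children[i]
            node.count += 1
    return root

def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())

transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"]]
tree = build_fp_tree(transactions)
print(count_nodes(tree))  # shared prefixes keep the node count low
```

Because every transaction here starts with the most frequent item, all four paths share the same first node. The node count is the quantity that the abstract says a different ordering rule can reduce further.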

An Analysis of IT Trends Using Tweet Data (트윗 데이터를 활용한 IT 트렌드 분석)

  • Yi, Jin Baek;Lee, Choong Kwon;Cha, Kyung Jin
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.143-159 / 2015
  • Predicting IT trends has long been an important subject in information systems research. IT trend prediction makes it possible to recognize emerging eras of innovation and to allocate budgets in preparation for rapidly changing technological trends. Toward the end of each year, various domestic and global organizations predict and announce IT trends for the following year. For example, Gartner predicts the top 10 IT trends for the coming year, and these predictions shape the basic assumptions of IT and industry leaders and organizations about technology and the future of IT, but the accuracy of such reports is difficult to verify. Social media data can be a useful tool for verifying it. As social media services have gained popularity, they are used in a variety of ways, from posting about personal daily life to keeping up to date with news and trends. In recent years, rates of social media activity in Korea have reached unprecedented levels. Hundreds of millions of users now participate in online social networks and share their opinions and thoughts with colleagues and friends. In particular, Twitter is currently the major microblog service; its key function, the 'tweet', lets users report their current thoughts and actions, comment on news, and engage in discussions. For the analysis of IT trends, we chose tweet data because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading on technology. Previous studies found that tweet data provide useful information and detect societal trends effectively, and that Twitter can track issues faster than other media such as newspapers. Therefore, this study investigates how frequently the IT trends predicted for the following year by public organizations are mentioned on social network services like Twitter.
IT trend predictions for 2013, announced near the end of 2012 by two domestic organizations, the National IT Industry Promotion Agency (NIPA) and the National Information Society Agency (NIA), were used as the basis for this research. The present study compares Twitter data generated in Seoul (Korea) with the predictions of the two organizations to analyze the differences. Twitter data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, to process the varied, unrefined forms of unstructured data. To overcome these challenges, we used SAS IRS (Information Retrieval Studio), developed by SAS, to capture trends while processing big streaming datasets from Twitter in real time. The system offers a framework for crawling, normalizing, analyzing, indexing, and searching tweet data. As a result, we crawled the entire Twitter sphere in the Seoul area and obtained 21,589 tweets in 2013 to review how frequently the IT trend topics announced by the two organizations were mentioned by people in Seoul. The results show that most IT trends predicted by NIPA and NIA were frequently mentioned on Twitter, except for some topics such as 'new types of security threat', 'green IT', and 'next generation semiconductor', because these topics are non-generalized compound words and so may be mentioned on Twitter using other words. To determine whether IT trend tweets from Korea are related to the following year's real-world IT trends, we compared Twitter's trending topics with those in Nara Market, Korea's nationwide web-based e-procurement system, which handles the whole procurement process of all public organizations in Korea. The correlation analysis shows that tweet frequencies on the IT trending topics predicted by NIPA and NIA are significantly correlated with the frequencies of IT topics mentioned in project announcements on Nara Market in 2012 and 2013.
The main contributions of our research are as follows: i) the IT topic predictions announced by NIPA and NIA can provide an effective guideline for IT professionals and researchers in Korea who are looking for verified IT topic trends for the following year, and ii) researchers can use Twitter to obtain useful ideas for detecting and predicting dynamic trends in technological and social issues.
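The abstract reports a correlation analysis between tweet frequencies and procurement-announcement frequencies but does not state which coefficient was used. The following is a minimal sketch assuming the Pearson correlation coefficient. The per-topic counts are made-up illustrative numbers, not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical counts per IT topic (illustrative values only).
tweet_counts = [120, 45, 80, 10]
announcement_counts = [60, 20, 45, 8]

r = pearson(tweet_counts, announcement_counts)
print(f"r = {r:.3f}")
```

A value of r near 1 would indicate that topics mentioned often on Twitter also appear often in project announcements. Judging statistical significance, as the study does, would additionally require a hypothesis test that is not shown here.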

Development of Joint-Based Motion Prediction Model for Home Co-Robot Using SVM (SVM을 이용한 가정용 협력 로봇의 조인트 위치 기반 실행동작 예측 모델 개발)

  • Yoo, Sungyeob;Yoo, Dong-Yeon;Park, Ye-Seul;Lee, Jung-Won
    • KIPS Transactions on Software and Data Engineering / v.8 no.12 / pp.491-498 / 2019
  • A digital twin is a technology that virtualizes physical objects of the real world on a computer. It works by collecting sensor data through IoT and using the collected data to connect physical and virtual objects in both directions. Its advantages are that risk can be minimized by tuning the operation of the virtual model through simulation, and that varying environments can be handled by running experiments in advance. As artificial intelligence and machine learning technologies have attracted attention, there is a growing tendency to virtualize the behavior of physical objects, observe the virtual models, and apply various scenarios. In particular, recognizing each robot's motion is necessary to build a digital twin for the co-robot, which is at the heart of Industry 4.0 factory automation. Compared with modeling-based research on recognizing co-robot motion, there have been few attempts to predict motion from sensor data. In this paper, we therefore build an experimental environment for collecting current and inertial data from a co-robot to detect its motion, and propose a motion prediction model based on the collected sensor data. The proposed method classifies the co-robot's motion commands into 9 types based on joint position and uses current and inertial sensor values to predict them through accumulated learning. The data used for accumulated learning are the sensor values collected when the co-robot operates with a margin applied to the input parameters of the motion commands. In this way, the model is constructed to predict not only the nine movements along the same path but also movements along similar paths. After training with an SVM, the accuracy, precision, and recall of the model were each evaluated at 97% on average.
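The abstract reports accuracy, precision, and recall "on average" for a multi-class classifier without saying how the per-class values were averaged. The following sketch assumes macro averaging, meaning an unweighted mean over classes. The labels are invented for illustration and use three classes instead of nine to keep the example short.

```python
def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for class labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls = [], []
    for lab in labels:
        tp = sum(1 for t, p in pairs if t == lab and p == lab)
        fp = sum(1 for t, p in pairs if t != lab and p == lab)
        fn = sum(1 for t, p in pairs if t == lab and p != lab)
        # A class that is never predicted (or never occurs) scores 0 here.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return (accuracy,
            sum(precisions) / len(labels),
            sum(recalls) / len(labels))

# Hypothetical motion-class labels (illustrative values only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

acc, prec, rec = macro_scores(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

In this toy case one class-1 sample is predicted as class 2. That lowers the recall of class 1 and the precision of class 2, and shows why the three averages can differ from one another.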

Collection and Utilization of Unstructured Environmental Disaster by Using Disaster Information Standardization (재난정보 표준화를 통한 환경 재난정보 수집 및 활용)

  • Lee, Dong Seop;Kim, Byung Sik
    • Ecology and Resilient Infrastructure / v.6 no.4 / pp.236-242 / 2019
  • In this study, we developed a system that collects environmental disaster data, stores it in a database, and uses it for environmental disaster management by converting structured and unstructured documents, such as images, into electronic documents. In the 4th Industrial Revolution, various intelligent technologies have been developed in many fields. Environmental disaster information is an important element of the disaster cycle, and environmental disaster information management refers to managing and processing electronic data about the disaster cycle. However, this information is mainly managed in the form of structured and unstructured reports, so unstructured data must also be managed as disaster information. In this paper, an intelligent generation approach is used to convert handouts into electronic documents. The converted disaster data are then organized as disaster information under the disaster code system and stored in the disaster database system. The converted structured data are managed in a standardized disaster information form connected to the disaster code system, which allows structured information to be stored and retrieved across the entire disaster cycle. This research is expected to be applicable to smart environmental disaster management and decision making by combining artificial intelligence technologies with historical big data.

A Study on the Direction of Funeral service focused on Thick Data Analysis (Thick데이터 분석에 기반한 장례서비스 방향성 연구)

  • Ahn, Jinho;Lee, Jeungsun
    • Journal of Service Research and Studies / v.10 no.1 / pp.85-96 / 2020
  • In Asia, where the aging population is growing rapidly, the funeral service industry is developing and its market is growing, and the economic value of and interest in funeral services are increasing. However, Korea's funeral services are developing in a biased direction, focusing only on the funeral itself after death. By contrast, advanced funeral services in the United States, the United Kingdom, and Japan are developing not only the funeral but also care for the family and acquaintances of the deceased. This study examines qualitative data on the future direction of funeral services from the perspective of service science, focusing on the resident (family) and the guest, who are the principals of the service. It is generally difficult to derive meaningful results on this subject from collecting, processing, and interpreting big data, so a data analysis method based on ethnography and user experience (UX) is appropriate in this case. For this purpose, ethnography and user experience data from actual residents and visitors were collected and analyzed, and detailed personas for ten years from now were derived. In addition, the future direction of funeral services centered on residents and visitors was presented.

Different Heterogeneous IoT Data Management Techniques for IoT Cloud Environments (IoT 클라우드 환경을 위한 서로 다른 이기종의 IoT 데이터 관리 기법)

  • Cho, Sung-Nam;Jeong, Yoon-Su
    • Journal of Convergence for Information Technology / v.10 no.12 / pp.15-21 / 2020
  • Although IoT systems are used in a variety of heterogeneous environments as cloud environments develop, not all IoT devices are provided with reliable protocols and services. This paper proposes an IoT data management technique that extends the IoT cloud environment to an n-layer multi-level structure so that information collected from different heterogeneous IoT devices can be efficiently sorted and processed. The proposed technique classifies and processes IoT information by transmitting routing information and weight information over the wireless data link along with the data collected from heterogeneous IoT devices. It not only delivers the information classified from IoT devices to the corresponding routing path but also improves the efficiency of IoT data processing by assigning priority according to the weight information. The IoT devices in the proposed technique use each other's reliable protocols, and queries to other IoT devices are handled locally through a hierarchically structured local cloud, which ensures scalability because a certain cost is maintained.
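The abstract describes assigning processing priority to IoT data according to weight information and delivering it along a routing path, but gives no concrete data structures. The following is a minimal sketch of weight-based prioritization using a binary heap. The class name `WeightedQueue`, the route strings, and the weights are hypothetical and do not come from the paper.

```python
import heapq
from itertools import count

class WeightedQueue:
    """Pops the message with the highest weight first (FIFO among ties)."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # insertion counter used as a tie-breaker

    def push(self, route, weight, payload):
        # heapq is a min-heap, so the weight is negated for max-first order.
        heapq.heappush(self._heap, (-weight, next(self._seq), route, payload))

    def pop(self):
        neg_weight, _, route, payload = heapq.heappop(self._heap)
        return route, -neg_weight, payload

    def __len__(self):
        return len(self._heap)

queue = WeightedQueue()
queue.push("layer1/temp", 1, "21.5C")
queue.push("layer2/alarm", 9, "smoke detected")
queue.push("layer1/humidity", 1, "40%")

order = [queue.pop() for _ in range(len(queue))]
print(order)
```

The insertion counter keeps messages of equal weight in arrival order and prevents the heap from ever having to compare payloads directly.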

Study on the Possibility of Estimating Surface Soil Moisture Using Sentinel-1 SAR Satellite Imagery Based on Google Earth Engine (Google Earth Engine 기반 Sentinel-1 SAR 위성영상을 이용한 지표 토양수분량 산정 가능성에 관한 연구)

  • Younghyun Cho
    • Korean Journal of Remote Sensing / v.40 no.2 / pp.229-241 / 2024
  • With the advancement of big data processing technology using cloud platforms, access, processing, and analysis of large-volume data such as satellite imagery have recently been significantly improved. In this study, the Change Detection Method, a relatively simple technique for retrieving soil moisture, was applied to the backscattering coefficient values of pre-processed Sentinel-1 synthetic aperture radar (SAR) satellite imagery product based on Google Earth Engine (GEE), one of those platforms, to estimate the surface soil moisture for six observatories within the Yongdam Dam watershed in South Korea for the period of 2015 to 2023, as well as the watershed average. Subsequently, a correlation analysis was conducted between the estimated values and actual measurements, along with an examination of the applicability of GEE. The results revealed that the surface soil moisture estimated for small areas within the soil moisture observatories of the watershed exhibited low correlations ranging from 0.1 to 0.3 for both VH and VV polarizations, likely due to the inherent measurement accuracy of the SAR satellite imagery and variations in data characteristics. However, the surface soil moisture average, which was derived by extracting the average SAR backscattering coefficient values for the entire watershed area and applying moving averages to mitigate data uncertainties and variability, exhibited significantly improved results at the level of 0.5. The results obtained from estimating soil moisture using GEE demonstrate its utility despite limitations in directly conducting desired analyses due to preprocessed SAR data. However, the efficient processing of extensive satellite imagery data allows for the estimation and evaluation of soil moisture over broad ranges, such as long-term watershed averages. This highlights the effectiveness of GEE in handling vast satellite imagery datasets to assess soil moisture. 
Based on this, it is anticipated that GEE can be effectively utilized to assess long-term variations of soil moisture average in major dam watersheds, in conjunction with soil moisture observation data from various locations across the country in the future.
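The abstract names the Change Detection Method and the use of moving averages but gives no formulas. A commonly used form of change detection scales each backscattering coefficient between the driest and wettest reference values of the time series, giving a relative soil moisture index between 0 and 1. The sketch below assumes that form and uses the series minimum and maximum as the dry and wet references. The backscatter values (in dB) are invented for illustration, and none of this uses the Google Earth Engine API.

```python
def relative_soil_moisture(sigma0_db):
    """Scale backscatter (dB) to [0, 1] between the series min and max."""
    dry, wet = min(sigma0_db), max(sigma0_db)
    if wet == dry:
        raise ValueError("series has no dynamic range")
    return [(s - dry) / (wet - dry) for s in sigma0_db]

def moving_average(values, window):
    """Simple moving average over full windows only."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Hypothetical watershed-average VV backscatter time series in dB.
sigma0_vv = [-15.0, -12.0, -9.0, -12.0, -15.0]

index = relative_soil_moisture(sigma0_vv)
smoothed = moving_average(index, window=3)
print(index, smoothed)
```

Smoothing shortens the series by `window - 1` points and damps acquisition-to-acquisition noise, in line with the abstract's observation that averaged, smoothed values correlate better with measurements than single-site estimates.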