• Title/Summary/Keyword: Distributed data collection


Presentation Planning for Distributed VoD Systems (분산 VoD 시스템을 위한 프리젠테이션 플래닝)

  • Hwang, In-Jun;Byeon, Gwang-Jun
    • The Transactions of the Korea Information Processing Society / v.7 no.2S / pp.577-593 / 2000
  • A distributed video-on-demand (VoD) system is one in which a collection of video data is located at dispersed sites across a computer network. In a single-site environment, a local video server retrieves video data from its local storage device. However, in a distributed VoD system, when a customer requests a movie from the local server, the server may need to interact with other servers located across the network. In this paper, we present three types of presentation plans that a local server must construct in order to satisfy the customer request. Informally speaking, a presentation plan is a temporally synchronized, detailed sequence of steps that the local server must perform to present the requested movie to the customer. This involves obtaining commitments from other video servers, obtaining commitments from the network service provider, and committing local resources, within the limitations of available bandwidth, available buffer, and customer data consumption rates. Furthermore, to evaluate the quality of a presentation plan, we introduce two measures of optimality: minimizing the wait time for a customer and minimizing the access bandwidth used. We develop algorithms to compute optimal presentation plans for all three types and carry out extensive experiments to compare their performance. We have also mathematically proved certain results for presentation plans that had previously been verified experimentally in the literature.


Design and Implemention of Real-time web Crawling distributed monitoring system (실시간 웹 크롤링 분산 모니터링 시스템 설계 및 구현)

  • Kim, Yeong-A;Kim, Gea-Hee;Kim, Hyun-Ju;Kim, Chang-Geun
    • Journal of Convergence for Information Technology / v.9 no.1 / pp.45-53 / 2019
  • In this rapidly changing information era, websites serve an excess of information. Little of it is useful, and much time is spent selecting the information that is needed. Many websites, including search engines, use web crawling to keep their data up to date. Web crawling is usually used to generate copies of all the pages of visited sites, which search engines then index for faster searching. For collecting data such as wholesale and order information that changes in real time, keyword-oriented web data collection is not adequate, and no alternative for selective real-time collection of web information has been suggested. In this paper, we propose a method that collects information from a restricted set of websites using a real-time web crawling distributed monitoring system (R-WCMS), estimates collection time through detailed analysis of the data, and stores the data in a parallel system. Experimental results show that applying the proposed model to website information retrieval reduces the time by 15-17%.

Alien Hitchhiker Insect Species Detected from International Vessels Entering Korea in 2022

  • Tae Hwa Kang;Sang Woong Kim;Deuk-Soo Choi
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.5 no.2 / pp.60-67 / 2024
  • Hitchhiker insect species on international vessels entering Korea in 2022 were monitored. A total of 947 hitchhiker insect samples were collected by hand using a simple collection method. Among them, 856 individuals were classified into 374 species of 86 families in 10 orders through integrative analysis combining DNA barcoding and morphological examination. The remaining 91 individuals were identified only to the family level. Examination of the distribution of the 374 species (856 individuals) confirmed 38 species (71 individuals) as not distributed in Korea, including six species (11 individuals) listed as 'regulated species' by the Korean Animal and Plant Quarantine Agency. Of the 38 non-distributed species, 10 were detected multiple times (at least twice). Accordingly, it is necessary to strengthen monitoring of the areas around ports of entry, along with continuous surveillance, to prevent invasion by species detected multiple times. This study provides detection information and biological data for monitoring alien hitchhiker insect species.

Lessons from constructing and operating the national ecological observatory network

  • Christopher McKay
    • Journal of Ecology and Environment / v.47 no.4 / pp.187-192 / 2023
  • The United States (US) National Science Foundation's (NSF's) National Ecological Observatory Network (NEON) is a continental-scale observation facility, constructed and operated by Battelle, that collects long-term ecological data to better understand and forecast how US ecosystems are changing. All data and samples are collected using standardized methods at 81 field sites across the US and are freely and openly available through the NEON data portal, application programming interface (API), and the NEON Biorepository. NSF led a decade-long design process with the research community, including numerous workshops to inform the key features of NEON, culminating in a formal final design review with an expert panel in 2009. The NEON construction phase began in 2012 and was completed in May 2019, when the observatory began the full operations phase. Full operations are defined as all 81 NEON sites completely built and fully operational, with data being collected using instrumented and observational methods. The intent of the NSF is for NEON operations to continue over a 30-year period. Each challenge encountered, problem solved, and risk realized on NEON offers up lessons learned for constructing and operating distributed ecological data collection infrastructure and data networks. NEON's construction phase included offices, labs, towers, aquatic instrumentation, terrestrial sampling plots, permits, development and testing of the instrumentation and associated cyberinfrastructure, and the development of community-supported collection plans. Although colocation of some sites with existing research sites and use of mostly "off the shelf" instrumentation was part of the design, successful completion of the construction phase required the development of new technologies and software for collecting and processing the hundreds of samples and 5.6 billion data records a day produced across NEON. 
Continued operation of NEON involves reexamining the decisions made in the past and using the input of the scientific community to evolve, upgrade, and improve data collection and resiliency at the field sites. Successes to date include improvements in flexibility and resilience for aquatic infrastructure designs, improved engagement with the scientific community that uses NEON data, and enhanced methods to deal with obsolescence of the instrumentation and infrastructure across the observatory.

Efficient distributed consensus optimization based on patterns and groups for federated learning (연합학습을 위한 패턴 및 그룹 기반 효율적인 분산 합의 최적화)

  • Kang, Seung Ju;Chun, Ji Young;Noh, Geontae;Jeong, Ik Rae
    • Journal of Internet Computing and Services / v.23 no.4 / pp.73-85 / 2022
  • In the era of the 4th industrial revolution, in which automation and connectivity are maximized through artificial intelligence, data collection and utilization for model updates are becoming increasingly important. Creating a model with artificial intelligence technology usually requires gathering data in one place so that the model can be updated, but this can infringe on users' privacy. In this paper, we introduce federated learning, a distributed machine learning method in which participants cooperate to update models without directly sharing their distributed stored data, and review work on optimizing distributed consensus among participants without a server. In addition, we propose a pattern- and group-based distributed consensus optimization algorithm that generates patterns and groups based on the Kirkman Triple System and performs parallel updates and communication. This algorithm guarantees more privacy than the existing distributed consensus optimization algorithm and reduces the communication time until the model converges.
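
As a rough illustration of the grouping idea only (not the authors' algorithm, whose details the abstract does not give), the sketch below builds a Kirkman triple system for 9 participants from the lines of the affine plane over Z_3. Each "round" partitions the participants into three groups of three, and every pair of participants shares a group exactly once across the four rounds. The function name `kirkman_rounds_order9` and the interpretation of rounds as communication rounds are assumptions made for this example.

```python
from itertools import combinations


def kirkman_rounds_order9():
    """Return 4 rounds, each a partition of participants 0..8 into 3 triples.

    Construction (standard, not taken from the paper): treat participant
    3*x + y as the point (x, y) of the affine plane over Z_3. Each family of
    parallel lines is one round.
    """
    def pid(x, y):
        return 3 * x + y

    rounds = []
    # Lines y = m*x + b (mod 3): one round per slope m, one triple per intercept b.
    for m in range(3):
        rounds.append([
            tuple(sorted(pid(x, (m * x + b) % 3) for x in range(3)))
            for b in range(3)
        ])
    # Vertical lines x = c form the fourth round.
    rounds.append([tuple(pid(c, y) for y in range(3)) for c in range(3)])
    return rounds


def pairs_covered(rounds):
    """List every pair of participants that shares a group, with repetition."""
    return [pair for rnd in rounds for triple in rnd
            for pair in combinations(triple, 2)]


if __name__ == "__main__":
    for i, rnd in enumerate(kirkman_rounds_order9(), start=1):
        print(f"round {i}: {rnd}")
```

Under this hypothetical reading, the groups within a round could update in parallel, while the rounds together ensure that every pair of participants eventually exchanges information.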

Structural Design Optimization using Distributed Structural Analysis (분산구조해석을 이용한 구조설계최적화)

  • 박종희;정진덕;전한규;황진하
    • Proceedings of the Computational Structural Engineering Institute Conference / 2000.10a / pp.124-132 / 2000
  • This study presents a distributed processing approach to structural optimization, implemented on a network of personal computers. The validity and efficiency of the approach are demonstrated and verified on a truss test model. The repeated structural analysis algorithms, which account for much of the overall structural optimization process, are based on a substructuring scheme with domain-wise parallelism and are adapted to the hardware and software environments. The design information data are modularized and assigned to each computer in order to minimize the communication cost. Communication between nodes is limited to static condensation and constraint-related data collection.


Big Data Analysis of News on Purchasing Second-hand Clothing and Second-hand Luxury Goods: Identification of Social Perception and Current Situation Using Text Mining (중고의류와 중고명품 구매 관련 언론 보도 빅데이터 분석: 텍스트마이닝을 활용한 사회적 인식과 현황 파악)

  • Hwa-Sook Yoo
    • Human Ecology Research / v.61 no.4 / pp.687-707 / 2023
  • This study was conducted to obtain information useful for developing the future second-hand fashion market by examining the current situation through unstructured text data distributed as news articles related to 'purchase of second-hand clothing' and 'purchase of second-hand luxury goods'. Text-based unstructured data were collected daily from Naver news from January 1st to December 31st, 2022, using 'purchase of second-hand clothing' and 'purchase of second-hand luxury goods' as collection keywords. The data were analyzed using text mining, with the following results. First, in terms of frequency, the volume of collected data related to the purchase of second-hand luxury goods was almost four times that related to the purchase of second-hand clothing, indicating that the purchase of second-hand luxury goods is receiving more social attention. Second, the data obtained with the two collection keywords shared common words, but each also contained distinct words. For second-hand clothing, words related to donations, sharing, and compensation sales were mainly mentioned, indicating that the purchase of second-hand clothing tends to be recognized as an eco-friendly transaction. For second-hand luxury goods, resale and authenticity controversies related to transactions of second-hand luxury goods, second-hand trading platforms, and luxury brands were frequently mentioned. Third, clustering divided the data related to the purchase of second-hand clothing into five groups and the data related to the purchase of second-hand luxury goods into six groups.

An Individual Information Management Method on a Distributed Geographic Information System

  • Yutaka-Ohsawa;Kim, Kyongwol
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1998.06b / pp.105-110 / 1998
  • This paper proposes a method for managing individual information on large-scale distributed geographic information systems. On such systems, ordinary users usually cannot alter the contents of the server. The proposed method makes it possible to alter the contents of such non-write-permitted data sets or to add individual data to them. We call the method GDSF, 'geographic differential script file'. In this method, a client user creates a GDSF containing the private information to be added to the served data and keeps the file on a local disk. Afterwards, when the user uses the data, the differential data sequence is applied to the downloaded data to restore the information. The GDSF is a collection of picture commands that specify insertion, deletion, and modification operations on pictures. The GDSF can also contain modifications of the attribute information of geographic entities. The method is also applicable to modifying data on a ROM device, for example a CD-ROM or DVD-ROM. This paper describes the method and experimental results.
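
The differential-script idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the GDSF format itself: the abstract does not specify the command syntax, so the `Command` record, the dictionary representation of served data, and the function name `apply_script` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Command:
    """One hypothetical differential command kept on the client's local disk."""
    op: str                                   # "insert", "delete", or "modify"
    entity_id: str
    attributes: Optional[Dict[str, Any]] = None


def apply_script(served: Dict[str, Dict[str, Any]],
                 script: List[Command]) -> Dict[str, Dict[str, Any]]:
    """Apply the commands in order to a copy of the downloaded data.

    The served data is never modified, mirroring the read-only server
    (or ROM device) described in the abstract.
    """
    view = {eid: dict(attrs) for eid, attrs in served.items()}
    for cmd in script:
        if cmd.op == "insert":
            view[cmd.entity_id] = dict(cmd.attributes or {})
        elif cmd.op == "delete":
            view.pop(cmd.entity_id, None)
        elif cmd.op == "modify":
            if cmd.entity_id in view:
                view[cmd.entity_id].update(cmd.attributes or {})
        else:
            raise ValueError(f"unknown operation: {cmd.op!r}")
    return view


if __name__ == "__main__":
    served = {"road-1": {"name": "Main St"}, "park-7": {"name": "Old Park"}}
    script = [
        Command("insert", "note-1", {"label": "my house"}),
        Command("modify", "road-1", {"name": "Main Street"}),
        Command("delete", "park-7"),
    ]
    print(apply_script(served, script))
```

Because the script is replayed against each fresh download, the user's private additions survive without any write access to the server.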


Analysis of Energy Consumption and Processing Delay of Wireless Sensor Networks according to the Characteristic of Applications (응용프로그램의 특성에 따른 무선센서 네트워크의 에너지 소모와 처리 지연 분석)

  • Park, Chong Myung;Han, Young Tak;Jeon, Soobin;Jung, Inbum
    • Journal of KIISE / v.42 no.3 / pp.399-407 / 2015
  • Wireless sensor networks are used to collect and process data from the surrounding environment for various applications. Because wireless sensor nodes operate with low computing power, limited battery capacity, and low network bandwidth, their architecture model greatly affects application performance. If applications have high computational complexity or require real-time processing, the centralized architecture incurs a delay in data processing. Conversely, if applications only perform simple data collection over a long period, the distributed architecture wastes battery energy in the wireless sensors. In this paper, energy consumption and processing delay are analyzed in centralized and distributed sensor networks. In addition, we propose a new hybrid architecture for wireless sensor networks. The proposed method uses the optimal number of wireless sensors in the network according to the characteristics of the application.

Design and Implementation of Big Data Cluster for Indoor Environment Monitering (실내 환경 모니터링을 위한 빅데이터 클러스터 설계 및 구현)

  • Jeon, Byoungchan;Go, Mingu
    • Journal of Korea Society of Digital Industry and Information Management / v.13 no.2 / pp.77-85 / 2017
  • As living space has expanded with population growth and lifestyle changes, most people spend their time indoors except when traveling. Changes in the indoor environment are therefore very important, affecting people's health and resource economy. However, most people do not recognize the importance of the indoor environment. A monitoring system for systematically sustaining and managing the indoor environment is thus needed, and big data clusters should be used to store and manage the large volume of sensor data collected from many spaces. In this paper, we design a big data cluster for indoor environment monitoring that stores sensor data and monitors the units of a large building, and we implement a big data cluster-based system for analysis, building a distributed file system with Hadoop and HBase for big data processing. Various collected sensor data are stored, and effective indoor environment management and health enhancement through monitoring are expected.