Construction of Spatiotemporal Big Data Using Environmental Impact Assessment Information

  • Cho, Namwook (Invited Research Fellow, Environmental Assessment Group, Korea Environment Institute) ;
  • Kim, Yunjee (Researcher, Environmental Assessment Group, Korea Environment Institute) ;
  • Lee, Moung-Jin (Research Fellow, Center for Environmental Data Strategy, Korea Environment Institute)
  • Received : 2020.08.13
  • Accepted : 2020.08.18
  • Published : 2020.08.31

Abstract

In this study, information from environmental impact statements was converted into spatial data, because environmental data from development sites are collected during the environmental impact assessment (EIA) process. Spatiotemporal big data were built from environmental spatial data for each environmental medium at 2,235 development sites during 2007-2018 and made available through public data portals. For air quality, 33,863 measurement points were constructed, approximately 75 times the 452 points in Air Korea's real-time measurement network. In total, 2,677,260 records of EIA spatiotemporal big data were constructed. In the future, such data might be used not only for EIAs but also for various spatial plans.

1. Introduction

An environmental impact assessment (EIA) collects information on development sites pertaining to natural environmental factors, such as air quality and soil quality; ecological factors, such as the current state of animals and plants; and socio-economic factors, such as population and housing status. Although previously collected environmental, statistical, and spatial data can be used (Sung et al., 2019; Cho et al., 2017; Song et al., 2015), it is also necessary to collect information directly via field surveys to acquire detailed data on the target site (Kim et al., 2017; Yoo et al., 2011). Preparing and reviewing an environmental impact statement (EIS) therefore requires considerable time and cost.

Since an EIS is used as reference material for a specific project, it contains very detailed information about the surrounding area. Accordingly, information on past projects in the same area can be used as reference data when preparing an EIA, although the practical use of such information requires institutional improvement (Cho et al., 2019). 

The existing EIA information system is based on the Environmental Impact Assessment Support System, which is an information system operated by the Ministry of Environment of Korea to collect and disclose information generated during the EIA process. This system not only discloses information according to administrative procedure, such as the original text of the EIS and its annexes, but also supports the preparation of the EIS by providing basic spatial data necessary for the EIA (Yoo, 2018; Lee et al., 2018).

To establish a system that uses more detailed EIA information, we constructed spatiotemporal big data of spatial and attribute information using the measured values for each environmental medium in the projects and adjacent areas subject to EIA during 2007-2018 (Lee, 2018; Ahn et al., 2013). These big data are based on location, measurements, and other administrative information that the developer must provide when submitting the EIS. Under the current system, when a new project subject to EIA is promoted, the relevant data for the area surrounding the target site can be derived from a literature survey. Allowing the use of information obtained in other projects makes it possible to reduce the time and cost of the EIA process and improve the quality of the assessment. This paper also introduces and discusses ways in which the data can be used in existing EIA practice and in other fields such as spatial planning (Kim et al., 2016).

2. Method of EIA spatial big data construction

EIA spatiotemporal big data were constructed from the original EIS texts of 2,235 EIA projects and 541 EIA follow-up reports collected from the Environmental Impact Assessment Support System for 2007 to 2018. The data construction process is divided into three steps. First, to extract attribute information, the environmental quality measurement information included in the EIS is classified and standardized for each environmental medium, item, and substance and constructed as data. To this end, the original EIS data in PDF files are converted using optical character recognition (OCR). Then, to extract spatial information, the standard spatial big data are defined and geocoded through standardization of the coordinates or address data extracted in the first step. Finally, the big data are stored as an open database (DB) to facilitate processing and utilization. The attribute and spatial information extracted in this process is integrated, refined into an open DB, and stored in a form that can be used in conjunction with OpenAPI and CSV format (Ahn et al., 2009). These processes are summarized in Fig. 1.

Fig. 1. Flow Chart of data processing.
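The final integration step can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the field names (`project_id`, `point_id`, `medium`, `item`, `value`, `unit`), the sample values, and the join key are all hypothetical and serve only to show how attribute records and geocoded locations might be combined and written out as CSV.

```python
import csv
import io

# Hypothetical attribute records (step 1): one row per measured value.
attributes = [
    {"project_id": "P001", "point_id": "A-1", "medium": "air",
     "item": "PM10", "value": "45", "unit": "ug/m3"},
    {"project_id": "P001", "point_id": "A-2", "medium": "air",
     "item": "PM10", "value": "38", "unit": "ug/m3"},
]

# Hypothetical geocoded locations (step 2), keyed by (project_id, point_id).
locations = {
    ("P001", "A-1"): (127.43, 36.71),
    ("P001", "A-2"): (127.45, 36.72),
}


def integrate(attributes, locations):
    """Join each attribute record with its geocoded location (step 3)."""
    rows = []
    for rec in attributes:
        key = (rec["project_id"], rec["point_id"])
        if key not in locations:
            continue  # a record without a location cannot be mapped
        lon, lat = locations[key]
        rows.append({**rec, "lon": lon, "lat": lat})
    return rows


def to_csv(rows):
    """Serialize the integrated rows as CSV text."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = integrate(attributes, locations)
csv_text = to_csv(rows)
```

Keeping the attribute and location tables separate until the last step means that a coordinate correction only has to be made once per measurement point.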

1) Extraction of attribute information

To extract attribute information from an EIS, the form of the attribute information is first identified. When the attribute information is composed entirely of text layers, the original PDF file is defined as a Text PDF, and the text is extracted by OCR. However, when the attribute information includes only some or no text layers, the text is extracted using recognition options such as alphabet (letter) + number or Korean + Chinese. The text extracted under these two categories is then built into an attribute DB after verification processes, such as checks of data attribute values, address typos, and null values (Table 1).

Table 1. Example of Environmental quality measurement data in EIS (Cheongju Ochang Technopolis General Industrial Complex Development)

* N/D : Non-Detection
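The attribute value and null value checks described above might look like the following Python sketch. The rules are assumptions made for illustration only: the function name `parse_measurement`, the status labels, the treatment of an empty cell or `-` as missing, and the rejection of negative values are not taken from the study.

```python
import math


def parse_measurement(raw):
    """Return (value, status) for one OCR-extracted table cell.

    status is one of "ok", "not_detected", "missing", or "invalid".
    All rules here are illustrative assumptions.
    """
    if raw is None:
        return None, "missing"
    text = raw.strip()
    if text in ("", "-"):
        return None, "missing"          # null value check
    if text.upper().replace(" ", "") in ("N/D", "ND"):
        return None, "not_detected"     # N/D : Non-Detection
    try:
        value = float(text.replace(",", ""))
    except ValueError:
        return None, "invalid"          # e.g. OCR read the letter O for 0
    if not math.isfinite(value) or value < 0:
        return None, "invalid"          # assumed: measured values are non-negative
    return value, "ok"
```

For example, `parse_measurement("N/D")` yields `(None, "not_detected")`, whereas a misread cell such as `"O.O12"` is flagged as `"invalid"` so that it can be reviewed manually.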

2) Extraction of spatial information

First, the data must be cleaned to extract accurate spatial information. This proceeds in the following order: coordinate verification, address verification, address cleaning, and checking the shapefile format. To determine the exact location of spatial information, location data are constructed after verifying the coordinates and checking the accuracy of the addresses written in the original text. Coordinate verification is a process of checking whether the coordinate system is correct, such as longitude and latitude or transverse Mercator. This is the most important process when amassing spatial information because geocoding cannot be performed when there are coordinate errors. When transverse Mercator coordinates are used, it is necessary to identify the origin point and check the map index information. After coordinate verification, standardized spatial information is constructed by unifying the coordinate system of all spatial data (Table 2).
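As a rough illustration of coordinate verification, the Python sketch below guesses whether a coordinate pair is in longitude/latitude or in a projected system such as transverse Mercator. The bounding box for Korea, the 1,000 m threshold, and the function name `classify_coordinates` are assumptions chosen for this example; an actual workflow would additionally confirm the origin point of the projected system before converting it.

```python
# Approximate bounding box of South Korea in degrees (an illustrative assumption).
LON_RANGE = (124.0, 132.0)
LAT_RANGE = (33.0, 39.0)


def _in_box(lon, lat):
    """True if (lon, lat) falls inside the assumed bounding box."""
    return (LON_RANGE[0] <= lon <= LON_RANGE[1]
            and LAT_RANGE[0] <= lat <= LAT_RANGE[1])


def classify_coordinates(x, y):
    """Guess the coordinate system of a pair (x, y).

    Returns "lonlat", "latlon_swapped", "projected", or "invalid".
    """
    if _in_box(x, y):
        return "lonlat"
    if _in_box(y, x):
        return "latlon_swapped"   # latitude was written first
    if abs(x) > 1000 and abs(y) > 1000:
        # Large values suggest metres in a projected system (e.g. TM);
        # the origin point still has to be identified separately.
        return "projected"
    return "invalid"
```

Pairs classified as `"invalid"` cannot be geocoded and would need to be corrected against the address written in the original text.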

Table 2. Example of measuring location information in EIS (Cheongju Ochang Technopolis General Industrial Complex Development)


3. Results of constructing EIA spatial big data

After building the big data, 2,677,260 records were included in a DB by integrating attribute and spatial information for 15 assessment items, such as air quality. The result was deemed national key data and made available as an open DB through public data portals (https://www.data.go.kr/).

The open DB is provided via OpenAPI or in CSV format and can be mapped as shown in Fig. 2. Details of the EIA spatial big data are shown in Table 3.

Fig. 2. (a) Development Project Area and (b) Air Quality Measurement Point in EIA Spatial Big Data, (c) Case study of 「Cheongju Ochang Technopolis General Industrial Complex Development」 Project.
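Mapping the CSV form of the open DB can be sketched as follows. The column names (`point_id`, `item`, `value`, `lon`, `lat`) and the sample rows are hypothetical and do not reflect the actual schema published on the portal; the sketch only shows how point records might be converted to a GeoJSON FeatureCollection that common GIS tools can display.

```python
import csv
import io
import json

# Hypothetical CSV excerpt; the real column names may differ.
sample_csv = """point_id,item,value,lon,lat
A-1,PM10,45,127.43,36.71
A-2,PM10,38,127.45,36.72
"""


def csv_to_geojson(text):
    """Convert CSV rows with lon/lat columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        lon = float(row.pop("lon"))
        lat = float(row.pop("lat"))
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,  # remaining columns become attributes
        })
    return {"type": "FeatureCollection", "features": features}


collection = csv_to_geojson(sample_csv)
geojson_text = json.dumps(collection)
```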

Table 3. Example of EIA spatial Big Data


The EIA spatial big data built here can be characterized as follows. First, the measurement outcome for each environmental medium in the EIS includes the address and coordinate data of the measurement location. Since spatial information can be created and used based on this, continuous updates can be made as EIA projects are implemented. Second, the big data contain detailed measurements for a specific area; the environmental quality measured during the EIA of a project covers the project site and surrounding areas. For example, in terms of the air environment in Fig. 2(c), the air quality data provided by the national monitoring network, i.e., the Air Korea real-time measurement network, are measured at 452 locations as of August 2020 (https://www.airkorea.or.kr/), while the time-series EIA DB currently provides data measured at 33,863 locations and thus contains denser spatiotemporal data. Third, the big data contain the outcomes of various environmental quality measurements at a measurement point. Since the existing environmental spatial information is established separately according to the environmental medium, much time and cost were involved in collecting and pre-processing the data before using the information, owing to differences in the resolution and precision of the data and in spatial information standards. In comparison, the EIA spatial big data include data for various media for the same region; when analyzing various environmental quality measurements, detailed data for each environmental substance can be obtained (Lee, 2018). This has the advantage of allowing insight via preliminary predictions of the EIA and saving additional measurement costs (Cho et al., 2019).
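One way such data could support a new assessment is to look up measurement points from earlier projects near a planned site. The Python sketch below assumes hypothetical point records with `point_id`, `lon`, and `lat` fields and uses the haversine great-circle distance; it is an illustration of the idea, not a procedure described in this study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_within(site, points, radius_km):
    """Return (distance_km, point_id) pairs within radius_km, nearest first."""
    lon0, lat0 = site
    hits = []
    for p in points:
        d = haversine_km(lon0, lat0, p["lon"], p["lat"])
        if d <= radius_km:
            hits.append((d, p["point_id"]))
    return sorted(hits)


# Hypothetical measurement points from earlier projects.
points = [
    {"point_id": "A", "lon": 127.43, "lat": 36.71},
    {"point_id": "B", "lon": 127.45, "lat": 36.72},
    {"point_id": "C", "lon": 128.50, "lat": 37.50},
]
hits = points_within((127.43, 36.71), points, radius_km=5.0)
```

A linear scan is adequate for a sketch; with tens of thousands of points, a spatial index would normally be used instead.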

4. Conclusion and Discussion

This study examined the advantages of the spatiotemporal accumulation of environmental information recorded in environmental impact assessments and used it to construct spatiotemporal big data. For the 12 years from 2007 to 2018, big data comprising 2,677,260 records, including 160,663 on air quality, 163,338 on water quality, and 73,685 on soil quality, were established. In the construction process, existing data in document form were extracted using OCR and implemented as spatial information based on coordinates. The results were stored as an open DB to increase the data usability.

An EIA requires efficient analysis of accumulated environmental impacts and damage, and the EIA spatial big data established here can support this process. These data provide more measurement information for each environmental medium than is currently provided by the public sector. To develop this further, it is necessary to identify use cases by applying the data to actual EIAs and to make institutional improvements that increase the usability of the data for EIA.

Acknowledgements

This research was conducted at Korea Environment Institute (KEI) with support from Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07041203), and R&D Program of Responding Technology to Climate Disaster such as Heat wave (20010017) funded by the Ministry of Interior and Safety (MOIS, Korea).

References

  1. Ahn, J.S., H.T. Kim, H.W. Kim, and Y.H. Lim, 2009. A Study on the Implementation Method for Distributing Public Sector Real Estate Information based on OpenAPI using FOS GIS, The Geographical Journal of Korea, 43(2): 173-185 (in Korean with English abstract).
  2. Ahn, J.W., M.S. Lee, and D.B. Shin, 2013. Study for Spatial Big Data Concept and System Building, Spatial Information Research, 21(5): 43-51 (in Korean with English abstract). https://doi.org/10.12672/ksis.2013.21.5.043
  3. Cho, N.W., M.J. Lee, and J.G. Choi, 2019. Evaluation and Improvement of EIA Information Disclosure System - Focused on the Aarhus Convention -, Journal of Environmental Impact Assessment, 35(4): 400-412 (in Korean with English abstract).
  4. Cho, N.W., J.H. Maeng, and M.J. Lee, 2017. Use of Environmental Geospatial Information to Support Environmental Impact Assessment Follow-Up Management, Korean Journal of Remote Sensing, 33(5): 799-807 (in Korean with English abstract). https://doi.org/10.7780/kjrs.2017.33.5.3.4
  5. Kim, G.H., C.M. Jun, H.C. Jung, and J.H. Yoon, 2016. Providing Service Model Based on Concept and Requirements of Spatial Big Data, Journal of the Korean Society for Geospatial Information Science, 24(4): 89-96 (in Korean with English abstract).
  6. Kim, H.J., S.H. Han, S.J. Kim, H.M. Yun, S.C. Jun, and Y. Son, 2017. Spatio-Temporal Monitoring of Soil CO2 Fluxes and Concentrations after Artificial CO2 Release, Journal of Environmental Impact Assessment, 26(2): 93-104 (in Korean with English abstract). https://doi.org/10.14249/eia.2017.26.2.93
  7. Lee, M.J., 2018. Opening of environmental assessment monitoring DB for providing environmental information, Ministry of the Interior and Safety, Sejong, Korea.
  8. Lee, M.J., J.H. Maeng, Y.J. Lee, J.H. Yoon, J.H. Lee, S.M. Lee, and N.W. Cho, 2018. Establishment of Spatial Information Application System for Advanced Environmental Impact Assessment, Korea Environment Institute, Sejong, Korea.
  9. Song, D.H., J.W. Ryu, and E.H. Jung, 2015. A Study on Application of Open Platform of Spatial Information for Improvement of Environment Impact Assessment Supporting System, Journal of the Korean Association of Geographic Information Studies, 18(1): 105-119 (in Korean with English abstract). https://doi.org/10.11108/kagis.2015.18.1.105
  10. Sung, H.C., Y.Y. Zhu, and S.W. Jeon, 2019. Study on Application Plan of Forest Spatial Information Based on Unmanned Aerial Vehicle to Improve Environmental Impact Assessment, Journal of the Korean Society of Environmental Restoration Technology, 22(6): 63-76 (in Korean with English abstract).
  11. Yoo, H.S., 2018. Operation of Environmental impact assessment support system 2018, Ministry of Environment, Sejong, Korea.
  12. Yoo, J.W., C.S. Kim, H.I. Jung, Y.W. Lee, M.W. Lee, C.G. Lee, S.J. Jin, J.H. Maeng, and J.S. Hong, 2011. A Knowledge-based Approach for the Estimation of Effective Sampling Station Frequencies in Benthic Ecological Assessments, The Sea, 16(3): 147-154 (in Korean with English abstract). https://doi.org/10.7850/jkso.2011.16.3.147