• Title/Summary/Keyword: Processing Parameters


Construction of Event Networks from Large News Data Using Text Mining Techniques (텍스트 마이닝 기법을 적용한 뉴스 데이터에서의 사건 네트워크 구축)

  • Lee, Minchul;Kim, Hea-Jin
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.183-203 / 2018
  • News articles are the most suitable medium for examining events occurring at home and abroad. In particular, as the development of information and communication technology has produced many kinds of online news media, news about events occurring in society has increased greatly. Automatically summarizing key events from massive amounts of news data would therefore help users survey many events at a glance, and building an event network based on the relevance of events would greatly help readers understand current affairs. In this study, we propose a method for extracting event networks from large news text data. To this end, we first collected Korean political and social articles from March 2016 to March 2017 and, through preprocessing with NPMI and Word2Vec, kept only meaningful words and integrated synonyms. Latent Dirichlet allocation (LDA) topic modeling was used to calculate the topic distribution by date, find the peaks of each topic distribution, and detect events. A total of 32 topics were extracted from the topic modeling, and the occurrence time of each event was inferred from the points at which each topic distribution surged. As a result, a total of 85 events were detected, and a final set of 16 events was filtered and presented using Gaussian smoothing. We then calculated relevance scores between the detected events to construct the event network: using the cosine coefficient between co-occurring events, we computed the relevance between events and connected related events. Finally, we set up the event network by assigning each event to a vertex and the relevance score between events to the edge connecting those vertices (see the code sketch after this abstract). The event network constructed with our method made it possible to arrange the major political and social events in Korea over the past year in chronological order and, at the same time, to identify which events are related to which. Our approach differs from existing event detection methods in that LDA topic modeling makes it easy to analyze large amounts of data and to identify relations between events that were difficult to detect with existing event detection. In text preprocessing, we applied various text mining techniques together with Word2Vec to improve the accuracy of extracting proper nouns and compound nouns, which have been difficult to handle in existing Korean text analysis. The event detection and network construction techniques in this study have the following advantages in practical application. First, LDA topic modeling, which is unsupervised learning, can easily extract topics, topic words, and their distributions from huge amounts of data; by using the date information of the collected news articles, the distribution of each topic can also be expressed as a time series. Second, by calculating relevance scores from the co-occurrence of topics, which is difficult to capture with existing event detection, and constructing an event network, the connections between events can be presented in a concise, summarized form. This is supported by the fact that the inter-event relevance-based event network proposed in this study was in fact laid out in order of occurrence time. The event network also makes it possible to identify which event served as the starting point for a series of events.
The limitation of this study is that LDA topic modeling yields different results depending on the initial parameters and the number of topics, and that the topic and event names in the analysis results must be assigned by the researcher's subjective judgment. Also, since each topic is assumed to be exclusive and independent, the relevance between topics is not taken into account. Subsequent studies need to calculate the relevance between events not covered in this study, or between events that belong to the same topic.
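
A minimal sketch of the pipeline described above: per-date LDA topic proportions are Gaussian-smoothed and scanned for surge points (candidate events), and detected events are then linked by the cosine coefficient of their date-wise activity vectors. This is not the authors' implementation; the array shapes, the use of `networkx`, and parameters such as `sigma`, `min_height`, and `threshold` are illustrative assumptions.

```python
import numpy as np
import networkx as nx
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

def detect_events(topic_by_date, sigma=2.0, min_height=0.05):
    """Find candidate events as peaks in each topic's smoothed daily proportion.

    topic_by_date: (n_topics, n_dates) array of daily topic proportions from LDA.
    Returns {topic_id: array of peak date indices}.
    """
    events = {}
    for k, series in enumerate(topic_by_date):
        smoothed = gaussian_filter1d(series, sigma=sigma)   # Gaussian smoothing
        peaks, _ = find_peaks(smoothed, height=min_height)  # surge points
        if len(peaks) > 0:
            events[k] = peaks
    return events

def build_event_network(activity, labels, threshold=0.3):
    """Link events whose date-wise activity vectors have a high cosine coefficient."""
    norms = np.linalg.norm(activity, axis=1, keepdims=True)
    unit = activity / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                                     # pairwise cosine scores
    graph = nx.Graph()
    graph.add_nodes_from(labels)
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if sim[i, j] >= threshold:
                # Each event is a vertex; the relevance score weights the edge.
                graph.add_edge(labels[i], labels[j], weight=float(sim[i, j]))
    return graph

# Toy example: three events observed over five dates.
activity = np.array([[1, 1, 0, 0, 0],
                     [0, 1, 1, 0, 0],
                     [0, 0, 0, 1, 1]], dtype=float)
net = build_event_network(activity, ["event_A", "event_B", "event_C"])
print(net.edges(data=True))  # event_A -- event_B with weight 0.5
```

In this toy run, only event_A and event_B co-occur (cosine coefficient 0.5), so the resulting network contains a single weighted edge between them.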

Quantitative Differences between X-Ray CT-Based and $^{137}Cs$-Based Attenuation Correction in Philips Gemini PET/CT (GEMINI PET/CT의 X-ray CT, $^{137}Cs$ 기반 511 keV 광자 감쇠계수의 정량적 차이)

  • Kim, Jin-Su;Lee, Jae-Sung;Lee, Dong-Soo;Park, Eun-Kyung;Kim, Jong-Hyo;Kim, Jae-Il;Lee, Hong-Jae;Chung, June-Key;Lee, Myung-Chul
    • The Korean Journal of Nuclear Medicine / v.39 no.3 / pp.182-190 / 2005
  • Purpose: There are differences between the Standardized Uptake Values (SUV) of CT-based attenuation-corrected PET and those of $^{137}Cs$-based correction. Since various factors can lead to differences in SUV, it is important to identify their cause. Because only X-ray CT and $^{137}Cs$ transmission data are used for attenuation correction in the Philips GEMINI PET/CT scanner, the proper transformation of these data into usable attenuation coefficients for 511 keV photons has to be ascertained. The aim of this study was to evaluate the accuracy of the CT measurement and to compare CT- and $^{137}Cs$-based attenuation correction in this scanner. Methods: For all experiments, CT was set to 40 keV (120 kVp) and 50 mAs. To evaluate the accuracy of the CT measurement, a CT performance phantom was scanned and the Hounsfield units (HU) of its regions were compared to the true values. For the comparison of CT- and $^{137}Cs$-based attenuation correction, transmission scans of an elliptical lung-spine-body phantom and an electron density CT phantom composed of various components, such as water, bone, brain, and adipose, were performed using CT and $^{137}Cs$. The attenuation coefficients transformed from these data were compared to each other and to the true 511 keV attenuation coefficients acquired using $^{68}Ge$ and an ECAT EXACT 47 scanner. In addition, CT- and $^{137}Cs$-derived attenuation coefficients and $^{18}F$-FDG SUV values measured from regions with normal and pathological uptake in patient data were also compared. Results: The HU of all regions in the CT performance phantom measured using GEMINI PET/CT were equivalent to the known true values. CT-based attenuation coefficients were about 10% lower than those of $^{68}Ge$ in the bony region of the NEMA ECT phantom. Attenuation coefficients derived from $^{137}Cs$ data were slightly higher than those from CT data, also in the images of the electron density CT phantom and of the patients' bodies. However, the SUV values in images attenuation-corrected using $^{137}Cs$ were lower than in images corrected using CT; the percent difference between SUV values was about 15%. Conclusion: Although the HU measured using this scanner were accurate, the accuracy of the conversion from CT data into 511 keV attenuation coefficients was limited in the bony region. The discrepancy in transformed attenuation coefficients and SUV values between CT- and $^{137}Cs$-based data shown in this study suggests that further optimization of various data acquisition and processing parameters would be necessary for this scanner.
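
The CT-based correction above hinges on converting Hounsfield units measured at CT energies into linear attenuation coefficients for 511 keV annihilation photons. A common approach, though not necessarily the exact transformation implemented in the GEMINI software, is a bilinear scaling that uses a reduced slope for bone-like voxels, because bone attenuates disproportionately at CT energies. A minimal sketch, with illustrative slope and breakpoint values:

```python
import numpy as np

# Approximate linear attenuation coefficient of water at 511 keV (1/cm).
MU_WATER_511 = 0.096

def hu_to_mu511(hu, breakpoint=0.0, bone_slope=0.5):
    """Bilinear conversion of CT Hounsfield units to 511 keV attenuation coefficients.

    Below the breakpoint (air, lung, soft tissue) mu scales linearly with HU relative
    to water; above it (bone-like voxels) a reduced slope compensates for the larger
    photoelectric contribution at CT energies. Slope and breakpoint are placeholders.
    """
    hu = np.asarray(hu, dtype=float)
    mu = np.where(
        hu <= breakpoint,
        MU_WATER_511 * (1.0 + hu / 1000.0),               # air / lung / soft tissue
        MU_WATER_511 * (1.0 + bone_slope * hu / 1000.0),  # bone
    )
    return np.clip(mu, 0.0, None)

print(hu_to_mu511([-1000, 0, 1000]))  # ~[0.0, 0.096, 0.144]
```

An inaccurate bone slope in such a mapping is one plausible source of the roughly 10% bony-region discrepancy reported above.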

Application and Analysis of Ocean Remote-Sensing Reflectance Quality Assurance Algorithm for GOCI-II (천리안해양위성 2호(GOCI-II) 원격반사도 품질 검증 시스템 적용 및 결과)

  • Sujung Bae;Eunkyung Lee;Jianwei Wei;Kyeong-sang Lee;Minsang Kim;Jong-kuk Choi;Jae Hyun Ahn
    • Korean Journal of Remote Sensing / v.39 no.6_2 / pp.1565-1576 / 2023
  • An atmospheric correction algorithm based on a radiative transfer model is required to obtain remote-sensing reflectance (Rrs) from the top-of-atmosphere radiance observed by the Geostationary Ocean Color Imager-II (GOCI-II). The Rrs derived from atmospheric correction is used to estimate various marine environmental parameters such as chlorophyll-a concentration, total suspended material concentration, and the absorption of dissolved organic matter. Atmospheric correction is therefore a fundamental algorithm, as it significantly impacts the reliability of all other ocean color products. In clear waters, however, the atmospheric path radiance can be more than ten times higher than the water-leaving radiance in the blue wavelengths. This makes atmospheric correction a highly error-sensitive process, in which a 1% error in estimating the atmospheric radiance can cause more than a 10% error in Rrs. Therefore, quality assessment of Rrs after atmospheric correction is essential for reliable analysis of the ocean environment using ocean color satellite data. In this study, a Quality Assurance (QA) algorithm based on in-situ Rrs data archived in the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) Bio-optical Archive and Storage System (SeaBASS) was applied and modified to account for the spectral characteristics of GOCI-II. This method is officially employed in the ocean color satellite data processing system of the National Oceanic and Atmospheric Administration (NOAA). It provides quality scores for Rrs ranging from 0 to 1 and classifies the water into 23 types. When the QA algorithm was applied to initial-phase GOCI-II data with less mature calibration, the scores showed their highest frequency at a relatively low value of 0.625. When the algorithm was applied to the improved GOCI-II atmospheric correction results with updated calibration, the highest frequency shifted to a higher score of 0.875. The water-type analysis using the QA algorithm indicated that parts of the East Sea, the South Sea, and the Northwest Pacific Ocean are primarily characterized as relatively clear case-I waters, while the coastal areas of the Yellow Sea and the East China Sea are mainly classified as highly turbid case-II waters. We expect the QA algorithm to support GOCI-II users not only in statistically identifying Rrs with significant errors but also in performing more reliable calibration with quality-assured data. The algorithm will be included in the level-2 flag data provided with the GOCI-II atmospheric correction.
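
The general idea behind this kind of spectrum-shape QA scoring can be pictured with a short sketch: each Rrs spectrum is normalized so that only its shape matters, assigned to the most similar reference water type, and scored by the fraction of bands that fall within that type's tolerance bounds (for example, with eight scored bands, 0.625 and 0.875 correspond to five and seven bands passing). The reference spectra, bounds, and band count below are placeholders, not the actual SeaBASS-derived tables used in the NOAA or GOCI-II processing systems.

```python
import numpy as np

def qa_score(rrs, ref_spectra, ref_lower, ref_upper):
    """Assign a water type and a 0-1 quality score to one Rrs spectrum.

    rrs:                  (n_bands,) measured remote-sensing reflectance.
    ref_spectra:          (n_types, n_bands) unit-normalized reference spectra.
    ref_lower, ref_upper: (n_types, n_bands) tolerance bounds around each reference.
    """
    # Normalize so that only the spectral shape is compared.
    nrrs = np.asarray(rrs, dtype=float)
    nrrs = nrrs / np.linalg.norm(nrrs)

    # Water type = reference spectrum with the highest cosine similarity.
    ref_unit = ref_spectra / np.linalg.norm(ref_spectra, axis=1, keepdims=True)
    wtype = int(np.argmax(ref_unit @ nrrs))

    # Score = fraction of bands whose normalized Rrs lies inside the bounds.
    inside = (nrrs >= ref_lower[wtype]) & (nrrs <= ref_upper[wtype])
    return wtype, float(inside.mean())

# Placeholder example with 2 water types and 4 bands (values are illustrative).
ref = np.array([[0.8, 0.5, 0.3, 0.1],
                [0.2, 0.4, 0.6, 0.7]], dtype=float)
ref = ref / np.linalg.norm(ref, axis=1, keepdims=True)  # store references unit-normalized
lower, upper = ref * 0.7, ref * 1.3                     # +/-30% tolerance bands
wtype, score = qa_score([0.010, 0.006, 0.004, 0.001], ref, lower, upper)
print(wtype, score)  # 0 1.0
```

The modification for GOCI-II mentioned above presumably involves adapting such reference tables to the sensor's own band set.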