• Title/Summary/Keyword: software algorithms


Latent topics-based product reputation mining (잠재 토픽 기반의 제품 평판 마이닝)

  • Park, Sang-Min;On, Byung-Won
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.39-70 / 2017
  • Data-driven analytics techniques have recently been applied to public surveys. Instead of simply gathering survey results or expert opinions to research the preference for a recently launched product, enterprises need a way to collect and analyze various types of online data and then accurately figure out customer preferences. In the main concept of existing data-based survey methods, the sentiment lexicon for a particular domain is first constructed by domain experts, who judge the positive, neutral, or negative meanings of the frequently used words in the collected text documents. In order to research the preference for a particular product, the existing approach (1) collects review posts related to the product from several product review web sites; (2) extracts sentences (or phrases) from the collection after pre-processing steps such as stemming and stop-word removal; (3) classifies the polarity (positive or negative) of each sentence (or phrase) based on the sentiment lexicon; and (4) estimates the positive and negative ratios of the product by dividing the numbers of positive and negative sentences (or phrases) by the total number of sentences (or phrases) in the collection. Furthermore, the existing approach automatically finds important sentences (or phrases) that carry positive or negative meaning toward the product. As a motivating example, given a product like the Sonata made by Hyundai Motors, customers often want to see a summary note of the positive points in the 'car design' aspect as well as the negative points in the same aspect. They also want to gain more useful information regarding other aspects such as 'car quality', 'car performance', and 'car service'. Such information will enable customers to make good choices when they purchase brand-new vehicles. In addition, automobile makers will be able to figure out the preferences and positive/negative points for new models on the market, and the weak points of these models can then be improved based on the sentiment analysis. For this, the existing approach computes the sentiment score of each sentence (or phrase) and then selects the top-k sentences (or phrases) with the highest positive and negative scores. However, the existing approach has several shortcomings and is difficult to apply in real applications. Its main disadvantages are as follows: (1) The main aspects (e.g., car design, quality, performance, and service) of a product (e.g., Hyundai Sonata) are not considered. As a result, sentiment analysis without aspects reports to customers and car makers only a summary note containing the positive and negative ratios of the product and the top-k sentences (or phrases) with the highest sentiment scores over the entire corpus. This is not enough; the main aspects of the target product need to be considered in the sentiment analysis. (2) In general, since the same word has different meanings across domains, a sentiment lexicon appropriate to each domain needs to be constructed. An efficient way to construct the sentiment lexicon per domain is required because lexicon construction is labor-intensive and time-consuming.
To address the above problems, in this article, we propose a novel product reputation mining algorithm that (1) extracts topics hidden in review documents written by customers; (2) mines main aspects based on the extracted topics; (3) measures the positive and negative ratios of the product using the aspects; and (4) presents a digest in which a few important sentences with positive and negative meanings are listed for each aspect. Unlike the existing approach, using hidden topics allows experts to construct the sentiment lexicon easily and quickly. Furthermore, by reinforcing topic semantics, we can improve the accuracy of product reputation mining well beyond that of the existing approach. In the experiments, we collected a large set of review documents for domestic vehicles such as the K5, SM5, and Avante; measured the positive and negative ratios of the three cars; showed the top-k positive and negative summaries per aspect; and conducted statistical analysis. Our experimental results clearly show the effectiveness of the proposed method compared with the existing method.
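
As a rough illustration of steps (3) and (4) of an aspect-based pipeline like the one described above, the sketch below counts positive and negative sentences per aspect using a toy sentiment lexicon and hand-picked aspect keywords; the lexicon, keyword lists, and scoring rule are placeholder assumptions, not the paper's topic-based method.

```python
from collections import defaultdict

# Toy stand-ins for the paper's topic-derived aspects and domain sentiment lexicon.
ASPECT_KEYWORDS = {"design": {"design", "look", "style"},
                   "performance": {"engine", "acceleration", "fuel"}}
SENTIMENT_LEXICON = {"great": 1, "sleek": 1, "smooth": 1,
                     "noisy": -1, "sluggish": -1, "ugly": -1}

def aspect_sentiment_ratios(sentences):
    """Return {aspect: (positive_ratio, negative_ratio)} over a list of sentences."""
    counts = defaultdict(lambda: [0, 0])              # aspect -> [positive, negative]
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        score = sum(SENTIMENT_LEXICON.get(t, 0) for t in tokens)
        if score == 0:
            continue                                  # neutral sentence, skipped
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if tokens & keywords:
                counts[aspect][0 if score > 0 else 1] += 1
    return {a: (p / (p + n), n / (p + n)) for a, (p, n) in counts.items()}

# Example: aspect_sentiment_ratios(["The design looks sleek", "The engine is noisy"])
```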

Effect of Attenuation Correction, Scatter Correction and Resolution Recovery on Diagnostic Performance of Quantitative Myocardial SPECT for Coronary Artery Disease (감쇠보정, 산란보정 및 해상도복원이 정량적 심근 SPECT의 관상동맥질환 진단성능에 미치는 효과)

  • Hwang, Kyung-Hoon;Lee, Dong-Soo;Paeng, Jin-Chul;Lee, Myoung-Mook;Chung, June-Key;Lee, Myung-Chul
    • The Korean Journal of Nuclear Medicine / v.36 no.5 / pp.288-297 / 2002
  • Purpose: Soft tissue attenuation and scattering are major methodological limitations of myocardial perfusion SPECT. To overcome these limitations, algorithms for attenuation correction, scatter correction, and resolution recovery (ASCRR) are being developed, while quantitative myocardial SPECT has also become available. In this study, we investigated the efficacy of an ASCRR-corrected quantitative myocardial SPECT method for the diagnosis of coronary artery disease (CAD). Materials and Methods: Seventy-five patients (M:F=51:24, $61.0{\pm}8.9$ years old) suspected of CAD who underwent coronary angiography (CAG) within $7{\pm}12$ days of SPECT (Group-I) and 20 subjects (M:F=10:10, age $40.6{\pm}9.4$) with a low likelihood of coronary artery disease (Group-II) were enrolled. Tl-201 rest / dipyridamole-stress Tc-99m-MIBI gated myocardial SPECT was performed. ASCRR correction was performed using a Gd-153 line source and automatic software (Vantage-Pro; ADAC Labs, USA). Using a 20-segment model, segmental perfusion was automatically quantified on both the ASCRR-corrected and uncorrected images with automatic quantification software (AutoQUANT; ADAC Labs). Using these quantified values, CAD was diagnosed in each of the 3 coronary arterial territories, and the diagnostic performance of ASCRR-corrected SPECT was compared with that of non-corrected SPECT. Results: Among the 75 patients of Group-I, 9 patients had normal CAG while the remaining 66 patients had 155 arterial lesions: 61 left anterior descending (LAD), 48 left circumflex (LCX), and 46 right coronary (RCA) arterial lesions. For the LAD and LCX lesions, there was no significant difference in diagnostic performance. However, for RCA lesions, specificity improved significantly but sensitivity worsened significantly with ASCRR correction (both p<0.05); overall accuracy was the same. In Group-II patients, the overall normalcy rate improved, but the improvement was not statistically significant (p=0.07). Conclusion: ASCRR correction did not improve diagnostic performance significantly, although the diagnostic specificity for RCA lesions improved on quantitative myocardial SPECT. The clinical application of ASCRR correction requires more discretion regarding cost and efficacy.

Topographic Factors Computation in Island: A Comparison of Different Open Source GIS Programs (오픈소스 GIS 프로그램의 지형인자 계산 비교: 도서지역 경사도와 지형습윤지수 중심으로)

  • Lee, Bora;Lee, Ho-Sang;Lee, Gwang-Soo
    • Korean Journal of Remote Sensing / v.37 no.5_1 / pp.903-916 / 2021
  • An area's topography refers to the shape of the earth's surface, described by its elevation, slope, and aspect, among other features. Topographical conditions determine the energy flows that move water and energy from higher to lower elevations, such as how much solar energy a site receives and how much wind or rain affects it. Another common factor, the topographic wetness index (TWI), is calculated from digital elevation models as the tendency of a site to accumulate water given its slope and upslope contributing area; it is one of the most widely referenced hydrologic topographic factors and helps explain the location of forest vegetation. Topographical factors can be calculated using a geographic information system (GIS) program based on digital elevation model (DEM) data. Recently, a large number of free open source software (FOSS) GIS programs have become available and are being developed for researchers, industries, and governments. FOSS GIS programs provide opportunities for flexible algorithms customized for specific user needs. The majority of biodiversity in island areas exists at about 20% higher elevations than in land ecosystems, playing an important role in ecological processes and therefore being of high ecological value. However, island areas are vulnerable to disturbances and damage, such as climate change, environmental pollution, development, and human intervention, and lack systematic investigation due to geographical limitations (e.g. remoteness and difficulty of access). More than 4,000 of Korea's islands are within a few hours of its coast, and 88% are uninhabited, with 52% of them forested. The forest ecosystems of islands have fewer encounters with human interaction than those on land, and therefore most of the topographical conditions are formed naturally and are affected more directly by weather conditions or the environment. The analysis of forest topography in island areas can therefore be done more precisely than for their land counterparts, and has become a major focus of attention in Korea. This study focuses on comparing how different FOSS GIS programs compute topographic factors. The test area comprises island forests in Korea's south, and the DEM of the target area was processed with GRASS GIS and SAGA GIS. The final slope and TWI maps were produced to compare the differences between the topographic factor calculations of the respective FOSS GIS programs. Finally, the merits of each FOSS GIS program used to calculate the topographic factors are discussed.
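
For concreteness, the slope and TWI referred to above are commonly computed as slope = arctan(|∇z|) and TWI = ln(a / tan β), where a is the specific catchment area and β the local slope. The sketch below, assuming a regular-grid DEM and a precomputed flow-accumulation raster (the function name and epsilon guard are illustrative, not taken from either GIS package), shows the arithmetic that GRASS GIS and SAGA GIS each implement in their own way.

```python
import numpy as np

def slope_and_twi(dem, cellsize, flow_acc):
    """Compute slope (radians) and topographic wetness index on a regular grid.

    dem      : 2-D array of elevations (metres)
    cellsize : grid resolution (metres)
    flow_acc : 2-D array of upslope contributing cells from a flow-routing step
    """
    # finite-difference gradients of elevation along rows (y) and columns (x)
    dzdy, dzdx = np.gradient(dem, cellsize)
    slope = np.arctan(np.hypot(dzdx, dzdy))

    # specific catchment area: upslope area per unit contour width
    a = (flow_acc + 1.0) * cellsize  # +1 counts the cell itself

    # TWI = ln(a / tan(beta)); a small epsilon keeps flat cells finite
    twi = np.log(a / (np.tan(slope) + 1e-6))
    return slope, twi
```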

A Study on the Medical Application and Personal Information Protection of Generative AI (생성형 AI의 의료적 활용과 개인정보보호)

  • Lee, Sookyoung
    • The Korean Society of Law and Medicine / v.24 no.4 / pp.67-101 / 2023
  • The utilization of generative AI in the medical field is also being rapidly researched. Access to vast data sets reduces the time and energy spent selecting information. However, as the effort put into content creation decreases, there is a greater likelihood of associated issues arising. For example, with generative AI, users must discern the accuracy of results themselves, as these AIs learn from data within a set period and generate outcomes. While the answers may appear plausible, their sources are often unclear, making it challenging to determine their veracity. Additionally, the possibility of presenting results from a biased or distorted perspective cannot be discounted at present, which raises ethical concerns. Despite these concerns, the field of generative AI is continually advancing, with an increasing number of users leveraging it in various sectors, including the biomedical and life sciences. This raises important legal considerations regarding who bears responsibility, and to what extent, for any damages caused by these high-performance AI algorithms. A general overview of the issues with generative AI includes those discussed above, but another perspective arises from its fundamental nature as a large-scale language model ('LLM') AI. There is a civil law concern regarding "the memorization of training data within artificial neural networks and its subsequent reproduction". Medical data, by nature, often reflects personal characteristics of patients, potentially leading to issues such as the regeneration of personal information. The extensive application of generative AI in scenarios beyond traditional AI brings forth the possibility of legal challenges that cannot be ignored. Upon examining the technical characteristics of generative AI and focusing on legal issues, especially concerning the protection of personal information, it is evident that current laws regarding personal information protection, particularly in the context of health and medical data utilization, are inadequate. These laws provide processes for anonymizing and de-identifying specific personal information, but fall short when generative AI is applied as software in medical devices. To address the functionalities of generative AI in clinical software, a reevaluation and adjustment of existing laws for the protection of personal information are imperative.

$T_2$-weighted Half Fourier Echo Planar Imaging

  • 김치영;김휴정;안창범
    • Investigative Magnetic Resonance Imaging / v.5 no.1 / pp.57-65 / 2001
  • Purpose: A $T_2$-weighted half Fourier Echo Planar Imaging (T2HEPI) method is proposed to reduce the measurement time of existing EPI by a factor of 2. In addition, high $T_2$ contrast is obtained for clinical applications. High resolution single-shot EPI images with $T_2$ contrast and a $128{\times}128$ matrix size are obtained by the proposed method. Materials and Methods: In order to reduce the measurement time in EPI, half of Fourier space (k-space) is measured, and the remaining half is obtained by conjugate symmetric filling. Thus a high resolution single-shot EPI image with a $128{\times}128$ matrix size is obtained from 64 echoes. By the arrangement of the phase encoding gradients, highly $T_2$-weighted images are obtained. The acquired k-space data are shifted if a residual gradient field due to eddy currents exists along the phase encoding direction, which causes serious artifacts in the reconstructed image. The residual field is estimated from the correlation coefficient between the DC echo signal and the corresponding reference data acquired during the pre-scan. Once the residual gradient field is properly estimated, it can be removed by adjusting the initial phase encoding gradient applied between the $90^{\circ}$ and $180^{\circ}$ RF pulses. Results: The suggested T2HEPI was implemented on a 1.0 Tesla whole-body MRI system. Experiments were done with effective echo times of 72 ms and 96 ms using single-shot acquisitions. High resolution ($128{\times}128$) volunteer head images with high $T_2$ contrast were obtained in a single scan by the proposed method. Conclusion: Using the half Fourier technique, higher resolution EPI images are obtained with a matrix size of $128{\times}128$ in a single scan. Furthermore, the $T_2$ contrast is controlled by the effective echo time. Since the suggested method can be implemented by software alone (pulse sequence and the corresponding tuning and reconstruction algorithms) without additional special hardware, it can be widely used in existing MRI systems.
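
The conjugate symmetric filling step relies on the fact that the Fourier transform of a real-valued image satisfies S(-k) = S*(k). A minimal sketch of that filling is shown below, assuming data stored in standard DFT order with the DC sample at index [0, 0] and slightly more than half of the phase-encode rows acquired; unlike a practical half-Fourier reconstruction, no phase correction is included.

```python
import numpy as np

def conjugate_fill(kspace_partial, full_rows):
    """Fill unacquired k-space rows by conjugate symmetry and reconstruct.

    kspace_partial : complex array (acquired_rows, cols) in standard DFT order,
                     i.e. the DC sample sits at index [0, 0]
    full_rows      : number of phase-encode rows in the full matrix, e.g. 128
    """
    rows, cols = kspace_partial.shape
    k = np.zeros((full_rows, cols), dtype=complex)
    k[:rows] = kspace_partial

    # For a real image, S(-kx, -ky) = conj(S(kx, ky)); each missing row is the
    # point-reflected, conjugated copy of an acquired row.
    for r in range(rows, full_rows):
        src = k[(full_rows - r) % full_rows]
        k[r] = np.conj(np.roll(src[::-1], 1))

    return np.abs(np.fft.ifft2(k))  # magnitude image
```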


T-Cache: a Fast Cache Manager for Pipeline Time-Series Data (T-Cache: 시계열 배관 데이타를 위한 고성능 캐시 관리자)

  • Shin, Je-Yong;Lee, Jin-Soo;Kim, Won-Sik;Kim, Seon-Hyo;Yoon, Min-A;Han, Wook-Shin;Jung, Soon-Ki;Park, Se-Young
    • Journal of KIISE:Computing Practices and Letters / v.13 no.5 / pp.293-299 / 2007
  • Intelligent pipeline inspection gauges (PIGs) are inspection vehicles that move along within a (gas or oil) pipeline and acquire signals (also called sensor data) from their surrounding rings of sensors. By analyzing the signals captured by intelligent PIGs, we can detect pipeline defects, such as holes, curvature, and other potential causes of gas explosions. There are two major data access patterns apparent when an analyst accesses the pipeline signal data. The first is the sequential pattern, where an analyst reads the sensor data only once in a sequential fashion. The second is the repetitive pattern, where an analyst repeatedly reads the signal data within a fixed range; this is the dominant pattern in analyzing the signal data. The existing PIG software reads signal data directly from the server at every user's request, incurring network transfer and disk access costs. It works well only for the sequential pattern, not for the more dominant repetitive pattern. This problem becomes very serious in a client/server environment where several analysts analyze the signal data concurrently. To tackle this problem, we devise a fast in-memory cache manager, called T-Cache, by treating pipeline sensor data as multiple time-series and efficiently caching these time-series in T-Cache. To the best of the authors' knowledge, this is the first research on caching pipeline signals on the client side. We propose the new concept of the signal cache line as the caching unit, which is a set of time-series signal data for a fixed distance. We also provide the various data structures, including smart cursors, and the algorithms used in T-Cache. Experimental results show that T-Cache performs much better for the repetitive pattern in terms of disk I/Os and elapsed time. Even with the sequential pattern, T-Cache shows almost the same performance as a system that does not use any caching, indicating that the caching overhead in T-Cache is negligible.
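
To make the signal-cache-line idea concrete, here is a small illustrative sketch, not the paper's actual design: the line length, LRU eviction policy, and fetch_fn callback are assumptions. The caching unit is one sensor's samples over a fixed pipeline distance, so repeated reads of the same range are served from memory.

```python
from collections import OrderedDict

class SignalLineCache:
    """Illustrative client-side cache keyed by "signal cache lines".

    A cache line holds one sensor's samples over a fixed distance interval, so
    repeated reads of the same range avoid another round trip to the server.
    """
    def __init__(self, fetch_fn, line_length_m=10.0, capacity=256):
        self.fetch_fn = fetch_fn            # fetch_fn(sensor_id, start_m, end_m) -> list of samples
        self.line_length = line_length_m    # metres of pipeline covered by one cache line
        self.capacity = capacity            # maximum number of cached lines
        self._lines = OrderedDict()         # (sensor_id, line_index) -> samples

    def read(self, sensor_id, start_m, end_m):
        """Return concatenated samples of all cache lines covering [start_m, end_m)."""
        first = int(start_m // self.line_length)
        last = int((end_m - 1e-9) // self.line_length)
        out = []
        for idx in range(first, last + 1):
            key = (sensor_id, idx)
            if key in self._lines:
                self._lines.move_to_end(key)                 # cache hit, refresh LRU order
            else:
                lo = idx * self.line_length                  # cache miss, fetch the whole line
                self._lines[key] = self.fetch_fn(sensor_id, lo, lo + self.line_length)
                if len(self._lines) > self.capacity:
                    self._lines.popitem(last=False)          # evict least recently used line
            out.extend(self._lines[key])
        return out
```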

A Hardware Implementation of Image Scaler Based on Area Coverage Ratio (면적 점유비를 이용한 영상 스케일러의 설계)

  • 성시문;이진언;김춘호;김이섭
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.3 / pp.43-53 / 2003
  • Unlike in analog display devices, the physical screen resolution of digital display devices is fixed at manufacturing time, which is a weak point of digital devices, while the resolution of the images to be displayed varies. Thus, interpolation or decimation is needed to make the number of input pixels equal to the screen resolution; this process is called image scaling. Much research has been devoted to reducing the hardware cost and image distortion of image scaling algorithms. In this paper, we propose the Winscale algorithm, which recasts scale up/down in the continuous domain as scale up/down in the discrete domain and is therefore well suited to digital display devices. A hardware implementation of the image scaler is performed using Verilog XL, and the chip is fabricated in a $0.5{\mu}m$ Samsung SOG technology. The hardware cost as well as the scalability are compared with those of the conventional image scaling algorithms used in other software. The Winscale algorithm is shown to be more scalable than other image scaling algorithms of similar hardware cost. This image scaling algorithm can be used in various digital display devices that need image scaling.
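
The core idea of area-coverage scaling can be illustrated with a short sketch: each output pixel is treated as a rectangle laid over the input grid, and contributing input pixels are weighted by the fraction of that rectangle they cover. This is a generic area-weighted resampler written for clarity, assuming a grayscale image; the paper's Winscale formulation and its hardware-oriented simplifications may differ.

```python
import numpy as np

def area_coverage_scale(img, out_h, out_w):
    """Scale a 2-D grayscale image by area-coverage weighting (up or down)."""
    in_h, in_w = img.shape
    sy, sx = in_h / out_h, in_w / out_w            # input pixels per output pixel
    out = np.zeros((out_h, out_w), dtype=float)

    for oy in range(out_h):
        y0, y1 = oy * sy, (oy + 1) * sy            # footprint of the output pixel (rows)
        for ox in range(out_w):
            x0, x1 = ox * sx, (ox + 1) * sx        # footprint (columns)
            acc = area = 0.0
            for iy in range(int(np.floor(y0)), min(int(np.ceil(y1)), in_h)):
                wy = min(y1, iy + 1) - max(y0, iy)          # vertical overlap with row iy
                for ix in range(int(np.floor(x0)), min(int(np.ceil(x1)), in_w)):
                    wx = min(x1, ix + 1) - max(x0, ix)      # horizontal overlap with column ix
                    acc += img[iy, ix] * wy * wx
                    area += wy * wx
            out[oy, ox] = acc / area               # area-weighted average
    return out
```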

Implementation of a Self Controlled Mobile Robot with Intelligence to Recognize Obstacles (장애물 인식 지능을 갖춘 자율 이동로봇의 구현)

  • 류한성;최중경
    • Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.5 / pp.312-321 / 2003
  • In this paper, we implement a robot with the ability to recognize obstacles and move automatically to a destination. We present two results: a hardware implementation of an image processing board and a software implementation of a visual feedback algorithm for a self-controlled robot. In the first part, the mobile robot depends on commands from a control board that performs the image processing. We have studied this self-controlled mobile robot system, equipped with a CCD camera, for a long time. The robot system consists of an image processing board implemented with DSPs, a stepping motor, and a CCD camera. We propose an algorithm in which commands are delivered for the robot to move along the planned path. The distance that the robot is supposed to move is calculated on the basis of the absolute coordinate and the coordinate of the target spot. The image signal acquired by the CCD camera mounted on the robot is captured at every sampling time so that the robot can automatically avoid obstacles and finally reach the destination. The image processing board consists of a DSP (TMS320VC33), ADV611, SAA7111, ADV7176A, CPLD (EPM7256ATC144), and SRAM memories. In the second part, the visual feedback control uses two types of vision algorithms: obstacle avoidance and path planning. The first algorithm operates on cells, parts of the image obtained by blob analysis. Image preprocessing is performed to improve the input image; it consists of filtering, edge detection, NOR converting, and thresholding. The main image processing includes labeling, segmentation, and pixel density calculation. In the second algorithm, after an image frame goes through preprocessing (edge detection, converting, thresholding), the histogram is measured vertically (in the y-axis direction). The binary histogram of the image then shows waveforms with only black and white variations. Here we use the fact that, since obstacles appear as wall-like cross sections, the histogram shows little variation over them. The line histogram intensities are measured vertically at intervals of 20 pixels, so we can find uniform and non-uniform regions of the waveforms and define a run of uniform waveforms as an obstacle region. We can see that the algorithm is very useful for the robot to move while avoiding obstacles.
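
A rough sketch of the column-histogram idea described above is given below; the 20-pixel sampling interval comes from the abstract, while the flatness threshold and function names are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def obstacle_regions(binary_edges, column_step=20, max_variation=2):
    """Find obstacle spans from the vertical histogram of a binary edge image.

    binary_edges : 2-D array of 0/1 pixels after preprocessing (filtering,
                   edge detection, thresholding).
    Columns are sampled every `column_step` pixels; consecutive columns whose
    edge counts barely change form a "uniform" run, reported as an obstacle
    region (wall-like obstacles give a flat histogram).
    """
    cols = np.arange(0, binary_edges.shape[1], column_step)
    counts = binary_edges[:, cols].sum(axis=0)       # vertical edge count per sampled column

    regions, start = [], None
    for i in range(1, len(counts)):
        flat = abs(int(counts[i]) - int(counts[i - 1])) <= max_variation
        if flat and start is None:
            start = cols[i - 1]                              # uniform run begins
        elif not flat and start is not None:
            regions.append((int(start), int(cols[i - 1])))   # uniform run ends
            start = None
    if start is not None:
        regions.append((int(start), int(cols[-1])))
    return regions
```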

The Effects of Discrepancy in Reconstruction Algorithm between Patient Data and Normal Database in AutoQuant Evaluation: Focusing on Half-Time Scan Algorithm in Myocardial SPECT (심근 관류 스펙트에서 Half-Time Scan과 새로운 재구성법이 적용된 정상군 데이터를 기반으로 한 정량적 분석 결과의 차이 비교)

  • Lee, Hyung-Jin;Do, Yong-Ho;Cho, Seong-Wook;Kim, Jin-Eui
    • The Korean Journal of Nuclear Medicine Technology / v.18 no.1 / pp.122-126 / 2014
  • Purpose: The new reconstruction algorithms (NRA) provided by vendors aim to shorten the acquisition scan time. However, depending on the installed version, the AutoQuant program used for quantitative analysis of myocardial SPECT did not contain normal databases to which an NRA had been applied. Thus, the purpose of this paper is to compare the results obtained with different AutoQuant versions for myocardial SPECT acquired with an NRA and a half-time (HT) scan. Materials and Methods: Rest Tl and stress MIBI data of 80 patients in total (40 men, 40 women) were gathered. The data were acquired with the HT protocol and the ASTONISH (Philips) software, which is an NRA. A modified AutoQuant from SNUH and the old version of AutoQuant (full-time scan) provided by the vendor were compared. The comparison groups were classified as coronary artery disease (CAD), 24-hour delay, and nearly normal patients with simple pain. The perfusion distribution pattern, summed stress score (SSS), summed rest score (SRS), extent, and total perfusion deficit (TPD) of 25 patients in each group were compared and evaluated. Results: In the case of CAD, when using the re-edited AutoQuant (HT), SSS and SRS showed about a 30% reduction (P<0.0001), extent showed about a 38% reduction, and TPD showed about a 30% reduction (P<0.0001). Among the perfusion scores, the infero-medium, infero-apical, lateral-medium, and lateral-apical regions showed the biggest changes. In the case of the 24-hour delay patients, SRS (P=0.042), extent (P=0.018), and TPD (P=0.0024) showed about a 13-18% reduction. In the case of the simple pain patients, the four measures showed about a 5-7% reduction. Conclusion: This study was started based on the expectation that results could be affected by the normal patient database, which can vary with race and gender. It was shown that the combination of a new reconstruction algorithm for reducing scan time and an analysis program whose normal database matches the scan protocol also affects the results. The clinical usefulness of gated myocardial SPECT can be increased if each hospital properly collects normal patient data for its own acquisition protocol.


An Investigation on the Periodical Transition of News related to North Korea using Text Mining (텍스트마이닝을 활용한 북한 관련 뉴스의 기간별 변화과정 고찰)

  • Park, Chul-Soo
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.63-88 / 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of how North Korea is represented in the South Korean mass media. Based on this data, we analyze the status of text mining research, using text mining techniques to find topics, methods, and trends. We also investigate the characteristics and analysis methods of the text mining techniques, as confirmed by analysis of the data. In this study, the R program was used to apply the text mining techniques; R is free software for statistical computing and graphics. Text mining methods make it possible to highlight the most frequently used keywords in a body of text, from which one can create a word cloud, also referred to as a text cloud or tag cloud. This study proposes a procedure to find meaningful tendencies based on a combination of word clouds and co-occurrence networks. It aims to explore more objectively the images of North Korea represented in South Korean newspapers by quantitatively reviewing the patterns of language use related to North Korea in newspaper big data from November 1, 2016 to May 23, 2019. Considering recent inter-Korean relations, we divided the data into three periods. The period before January 1, 2018 was set as the Before Phase of Peace Building. From January 1, 2018 to February 24, 2019, we set the Peace Building Phase, during which the New Year's message of Kim Jong-un and the PyeongChang Olympics formed an atmosphere of peace on the Korean peninsula. After the Hanoi peace summit, the third period was marked by silence in the relationship between North Korea and the United States and was therefore called the Depression Phase of Peace Building. This study analyzes news articles related to North Korea from the Korea Press Foundation database (www.bigkinds.or.kr) through text mining, in order to investigate the characteristics of the Kim Jong-un regime's South Korea policy and unification discourse. The main results show that trends in the North Korean national policy agenda can be discovered based on clustering and visualization algorithms. In particular, the study examines changes in international circumstances, domestic conflicts, the living conditions of North Korea, the South's aid projects for the North, inter-Korean conflicts, the North Korean nuclear issue, and the North Korean refugee problem through co-occurrence word analysis. It also offers an analysis of the South Korean mentality toward North Korea in terms of semantic prosody. In the Before Phase of Peace Building, the analysis showed the order 'Missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', and 'South-North Korean'. In the Peace Building Phase, the extracted order was 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. In the Depression Phase of Peace Building, the order was 'North Korea Nuclear', 'North and South Korea', 'Missile', 'State Department', and 'International'. Sixteen words were adopted in all three periods, in the following order: 'Missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', 'North and South Korea', 'Military', 'Kaesong Industrial Complex', 'Defense', 'Sanctions', 'Denuclearization', 'Peace', 'Exchange and Cooperation', and 'South Korea'. We expect that the results of this study will contribute to analyzing trends in news content about North Korea associated with North Korea's provocations. Future research on North Korean trends will be conducted based on these results. We will continue to study the development of a North Korea risk measurement model that can anticipate and respond to North Korea's behavior in advance. We expect that the text mining analysis method and scientific data analysis techniques will be applied to North Korea and unification research. Through such academic studies, we hope to see many works that make important contributions to the nation.
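
For readers unfamiliar with the word cloud and co-occurrence network step, the sketch below shows the underlying counting in Python; the paper performs this kind of analysis in R on BigKinds articles, and the tokenization and top_n cutoff here are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents, top_n=30):
    """Count term frequencies and pairwise co-occurrence within articles.

    documents : list of token lists, one per news article (already tokenized,
                stop words removed).
    Returns the top_n most frequent terms and pair co-occurrence counts, the
    raw material for a word cloud and a co-occurrence network.
    """
    term_freq = Counter(t for doc in documents for t in set(doc))
    vocab = {t for t, _ in term_freq.most_common(top_n)}

    pair_freq = Counter()
    for doc in documents:
        present = sorted(vocab & set(doc))
        pair_freq.update(combinations(present, 2))   # each unordered pair once per article
    return term_freq.most_common(top_n), pair_freq
```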