• Title/Summary/Keyword: automatic modeling

Search Results: 650

Subject-Balanced Intelligent Text Summarization Scheme (주제 균형 지능형 텍스트 요약 기법)

  • Yun, Yeoil;Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.141-166
    • /
    • 2019
  • Recently, channels such as social media and SNS generate enormous amounts of data, and the portion of unstructured data represented as text has grown geometrically. Because it is impractical to read all of this text, it is important to access it rapidly and grasp its key points. To meet this need for efficient understanding, many studies on text summarization for handling tremendous amounts of text data have been proposed. In particular, many summarization methods using machine learning and artificial intelligence algorithms, collectively called "automatic summarization", have recently been proposed to generate summaries objectively and effectively. However, most text summarization methods proposed to date construct summaries based on the frequency of content in the original documents, so they tend to omit low-weight subjects that are mentioned less often. If a summary covers only the major subjects, it becomes biased and loses information, making it hard to ascertain every subject the documents contain. To avoid this bias, it is possible to summarize with balance between the topics of a document so that all subjects can be ascertained, but an unbalanced distribution between subjects may still remain. To retain subject balance in a summary, it is necessary to consider the proportion of every subject the documents originally have and to allocate portions of the summary so that even sentences on minor subjects are sufficiently included. In this study, we propose a "subject-balanced" text summarization method that preserves balance among all subjects and minimizes the omission of low-frequency subjects. For subject-balanced summarization, we use two summary evaluation criteria, "completeness" and "succinctness": completeness means that a summary should fully include the contents of the original documents, and succinctness means that a summary should contain minimal internal duplication. The proposed method consists of three phases. The first phase constructs subject term dictionaries. Topic modeling is used to calculate topic-term weights, which indicate how strongly each term is related to each topic; from these weights, highly related terms can be identified for every topic, and the subjects of the documents can be found from topics composed of terms with similar meanings. A few terms that represent each subject well, called "seed terms", are then selected. Because the seed terms alone are too few to describe each subject sufficiently, similar terms are needed for a well-constructed subject dictionary. Word2Vec is used for this word expansion: after training, cosine similarity between word vectors identifies terms similar to the seed terms (the higher the cosine similarity between two terms, the stronger their relationship). Terms with high similarity to the seed terms of each subject are selected, and after filtering these expanded terms, the subject dictionary is constructed. The second phase allocates a subject to every sentence in the original documents. To grasp the content of each sentence, frequency analysis is first conducted on the terms composing the subject dictionaries. TF-IDF weights for each subject are then calculated, making it possible to measure how much each sentence discusses each subject. Because TF-IDF weights can grow without bound, the weights for every subject in each sentence are normalized to values between 0 and 1. Each sentence is then assigned the subject with its maximum TF-IDF weight, finally forming a sentence group for each subject. The last phase is summary generation. Sen2Vec is used to compute the similarity between the sentences of each subject, forming a similarity matrix, and by iterative sentence selection a summary is generated that fully covers the contents of the original documents while minimizing internal duplication. For evaluation, 50,000 TripAdvisor reviews were used to construct the subject dictionaries and 23,087 reviews were used to generate summaries. A comparison between summaries from the proposed method and frequency-based summaries verified that the proposed method better retains the balance of all subjects that the documents originally have.
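
The second phase described above (normalizing the per-subject TF-IDF weights of each sentence to the range 0 to 1 and assigning each sentence to its maximum-weight subject) can be sketched in a few lines. The subject names, sentences, and raw weights below are hypothetical illustrations, not the paper's data, and the max-normalization is one plausible reading of the normalization step:

```python
# Sketch of phase 2: allocate each sentence to the subject with the
# highest normalized TF-IDF-style weight (toy data, illustrative only).

def normalize(weights):
    """Scale one sentence's raw subject weights into the range [0, 1]."""
    hi = max(weights.values())
    return {s: (w / hi if hi > 0 else 0.0) for s, w in weights.items()}

def allocate_subjects(sentence_weights):
    """Assign every sentence to its maximum-weight subject,
    building one sentence group per subject."""
    groups = {}
    for sent, weights in sentence_weights.items():
        norm = normalize(weights)
        best = max(norm, key=norm.get)
        groups.setdefault(best, []).append(sent)
    return groups

# Hypothetical weights: sentence -> {subject: raw TF-IDF-like score}.
sentence_weights = {
    "The room was spotless.":         {"cleanliness": 3.2, "service": 0.4},
    "Staff were friendly and fast.":  {"cleanliness": 0.1, "service": 2.7},
    "Housekeeping came twice a day.": {"cleanliness": 1.9, "service": 1.1},
}

groups = allocate_subjects(sentence_weights)
print(groups["cleanliness"])  # sentences allocated to 'cleanliness'
```

Grouping by argmax subject in this way is what produces one sentence pool per subject for the balanced selection in the final phase.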

The evaluation for the usability of the Varian Standard Couch modeling using Treatment Planning System (치료계획 시스템을 이용한 Varian Standard Couch 모델링의 유용성 평가)

  • Yang, Yong Mo;Song, Yong Min;Kim, Jin Man;Choi, Ji Min;Choi, Byeung Gi
    • The Journal of Korean Society for Radiation Therapy
    • /
    • v.28 no.1
    • /
    • pp.77-86
    • /
    • 2016
  • Purpose : During radiation treatment, the beam is attenuated by the carbon fiber couch. In this study, we evaluated the usability of the Varian Standard Couch (VSC) by modeling it with a Treatment Planning System (TPS). Materials and Methods : The VSC was scanned by the CBCT (Cone Beam Computed Tomography) of the linac (Clinac iX, VARIAN, USA) under three conditions: Side Rail Out/Grid (SROG), Side Rail In/Grid (SRIG), and Side Rail In/Out, Spine Down Bar (SRIOS). After scanning, the data were transferred to the TPS and the Side Rail, Side Bar Upper, Side Bar Lower, and Spine Down Bar were modeled by automatic contouring. We scanned a Cheese Phantom (Middelton, USA) with Computed Tomography (LightSpeed RT 16, GE, USA), transferred the data to the TPS, and applied the previously modeled VSC to it. Dose was measured at the isocenter with an ion chamber (A1SL, Standard Imaging, USA) in the Cheese Phantom using 4 and 10 MV beams at every 5° of gantry angle for two field sizes (3×3 cm², 10×10 cm²) with fixed MU (100), and then the calculated and measured doses were compared. We also included the dose at 127° in SRIG to compare the attenuation by the Side Bar Upper. Results : The density of the VSC determined from CBCT in the TPS was 0.9 g/cm³, and 0.7 g/cm³ for the Spine Down Bar. The radiation was attenuated by 17.49%, 16.49%, 8.54%, and 7.59% at the Side Rail, Side Bar Upper, Side Bar Lower, and Spine Down Bar, respectively. To check the accuracy of the modeling, calculated and measured doses were compared: the average error was 1.13%, and the maximum error was 1.98% for the 170° beam crossing the Spine Down Bar. Conclusion : In evaluating the usability of the VSC modeled in the TPS, the maximum error between calculated and measured dose was 1.98%. We found that VSC modeling helped predict the dose, so we expect it to be helpful for more accurate treatment.
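
The accuracy check in the Results section (comparing the TPS-calculated dose against ion-chamber measurements per gantry angle, then reporting average and maximum error) amounts to a percent-error calculation. A minimal sketch with hypothetical dose values, not the paper's measurements:

```python
# Percent error between TPS-calculated and measured dose per gantry angle.
# All dose values here are hypothetical, for illustration only.

def percent_error(calculated, measured):
    return abs(calculated - measured) / measured * 100.0

# angle (degrees) -> (calculated dose, measured dose), illustrative values
doses = {0: (100.2, 100.0), 170: (98.0, 100.0), 185: (99.1, 99.5)}

errors = {ang: percent_error(c, m) for ang, (c, m) in doses.items()}
avg_error = sum(errors.values()) / len(errors)
max_angle = max(errors, key=errors.get)  # angle with the largest error
print(max_angle, round(errors[max_angle], 2))
```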


Topic Modeling Insomnia Social Media Corpus using BERTopic and Building Automatic Deep Learning Classification Model (BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축)

  • Ko, Young Soo;Lee, Soobin;Cha, Minjung;Kim, Seongdeok;Lee, Juhee;Han, Ji Yeong;Song, Min
    • Journal of the Korean Society for Information Management
    • /
    • v.39 no.2
    • /
    • pp.111-129
    • /
    • 2022
  • Insomnia is a chronic disease in modern society, with the number of new patients increasing by more than 20% over the last five years. Insomnia is a serious disease that requires diagnosis and treatment, because the individual and social problems that arise from lack of sleep are serious and the triggers of insomnia are complex. This study collected 5,699 documents from 'insomnia', a community on Reddit, a social media platform where opinions are freely expressed. Based on the International Classification of Sleep Disorders (ICSD-3) standard and guidelines prepared with the help of experts, an insomnia corpus was constructed by tagging each document as insomnia-tendency or non-insomnia-tendency. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed insomnia corpus as training data. In the performance evaluation, RoBERTa showed the highest performance, with an accuracy of 81.33%. For an in-depth analysis of the insomnia social data, topic modeling was performed using BERTopic, a recently introduced method that supplements the weaknesses of the widely used LDA. The analysis identified eight topic groups: 'Negative emotions', 'Advice, help, and gratitude', 'Insomnia-related diseases', 'Sleeping pills', 'Exercise and eating habits', 'Physical characteristics', 'Activity characteristics', and 'Environmental characteristics'. Users expressed negative emotions and sought help and advice from the Reddit insomnia community. They also mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. Among insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and the heart; activity characteristics such as feeling like a zombie, hypnic jerks, and grogginess; and environmental characteristics such as sunlight, blankets, temperature, and naps.
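
The corpus labels (insomnia-tendency vs. non-insomnia-tendency) and the accuracy metric used to rank the five language models reduce to a simple proportion of correct predictions. A minimal sketch; the gold labels and predictions below are hypothetical, not RoBERTa's actual output:

```python
# Accuracy of a binary insomnia-tendency classifier over tagged documents
# (hypothetical gold labels and predictions, illustrative only).

def accuracy(gold, pred):
    """Fraction of documents whose predicted tag matches the gold tag."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)

gold = ["insomnia", "non-insomnia", "insomnia", "insomnia", "non-insomnia"]
pred = ["insomnia", "non-insomnia", "non-insomnia", "insomnia", "non-insomnia"]

print(f"accuracy = {accuracy(gold, pred):.2%}")  # 4 of 5 tags correct
```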

RPC Correction of KOMPSAT-3A Satellite Image through Automatic Matching Point Extraction Using Unmanned Aerial Vehicle Imagery (무인항공기 영상 활용 자동 정합점 추출을 통한 KOMPSAT-3A 위성영상의 RPC 보정)

  • Park, Jueon;Kim, Taeheon;Lee, Changhui;Han, Youkyung
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.5_1
    • /
    • pp.1135-1147
    • /
    • 2021
  • In order to geometrically correct high-resolution satellite imagery, a sensor modeling process that restores the geometric relationship between the satellite sensor and the ground surface at image acquisition time is required. In general, high-resolution satellites provide RPC (Rational Polynomial Coefficient) information, but the vendor-provided RPC includes geometric distortion caused by the position and orientation of the satellite sensor. GCPs (Ground Control Points) are generally used to correct the RPC errors. The representative method of acquiring GCPs is a field survey to obtain accurate ground coordinates. However, it can be difficult to locate GCPs in the satellite image due to image quality, land cover change, relief displacement, etc. By using image maps acquired from various sensors as reference data, the collection of GCPs can be automated through image matching algorithms. In this study, the RPC of a KOMPSAT-3A satellite image was corrected using matching points extracted from UAV (Unmanned Aerial Vehicle) imagery. We propose a pre-processing method for the extraction of matching points between the UAV imagery and the KOMPSAT-3A satellite image. To this end, we compared the characteristics of matching points extracted by independently applying SURF (Speeded-Up Robust Features) and phase correlation, which are representative feature-based and area-based matching methods, respectively. The RPC adjustment parameters were calculated using the matching points extracted by each algorithm. To verify the performance and usability of the proposed method, it was compared with the GCP-based RPC correction result. The GCP-based method improved correction accuracy by 2.14 pixels for the sample and 5.43 pixels for the line compared to the vendor-provided RPC. In the proposed method using SURF and phase correlation, the sample accuracy was improved by 0.83 and 1.49 pixels, and the line accuracy by 4.81 and 5.19 pixels, respectively, compared to the vendor-provided RPC. The experimental results show that the proposed method using UAV imagery is a possible alternative to the GCP-based method for RPC correction.
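
A common, simple form of the RPC adjustment described above is an image-space bias (shift) model: a constant (sample, line) offset estimated from the matching points by least squares. The sketch below assumes a translation-only adjustment (the paper's adjustment parameters may be richer) and uses hypothetical matching-point coordinates:

```python
# Translation-only RPC bias compensation: estimate a constant
# (sample, line) offset from matching points by least squares,
# then apply it to RPC-projected image coordinates.
# Matching-point coordinates are hypothetical, not the paper's data.

def estimate_shift(matches):
    """matches: list of ((proj_sample, proj_line), (ref_sample, ref_line)).
    For a pure translation, the least-squares solution is the mean residual."""
    n = len(matches)
    ds = sum(r[0] - p[0] for p, r in matches) / n
    dl = sum(r[1] - p[1] for p, r in matches) / n
    return ds, dl

matches = [((100.0, 200.0), (102.1, 195.2)),
           ((400.0, 800.0), (402.0, 794.9)),
           ((900.0, 150.0), (901.9, 145.1))]

ds, dl = estimate_shift(matches)
corrected = [(p[0] + ds, p[1] + dl) for p, _ in matches]
print(round(ds, 2), round(dl, 2))
```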

Modeling and mapping fuel moisture content using equilibrium moisture content computed from weather data of the automatic mountain meteorology observation system (AMOS) (산악기상자료와 목재평형함수율에 기반한 산림연료습도 추정식 개발)

  • Lee, HoonTaek;Won, Myoung-Soo;Yoon, Suk-Hee;Jang, Keun-Chang
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.22 no.3
    • /
    • pp.21-36
    • /
    • 2019
  • Dead fuel moisture content is a key variable in fire danger rating, as it affects fire ignition and behavior. This study evaluates simple regression models estimating the moisture content of a standardized 10-h fuel stick (10-h FMC) at three sites with different characteristics (urban, and outside/inside the forest). Equilibrium moisture content (EMC) was used as the independent variable, and in-situ measured 10-h FMC was used as the dependent variable and validation data. 10-h FMC spatial distribution maps were created for the dates with the most frequent fire occurrence during 2013-2018, and the 10-h FMC values on those dates were analyzed to investigate under which 10-h FMC conditions forest fire is likely to occur. As a result, the fitted equations could explain a considerable part of the variance in 10-h FMC (62-78%). Against the validation data, the models performed well, with R² ranging from 0.53 to 0.68, root mean squared error (RMSE) from 2.52% to 3.43%, and bias from -0.41% to 1.10%. When the 10-h FMC model fitted for one site was applied to the other sites, R² remained the same while RMSE and bias increased up to 5.13% and 3.68%, respectively. The major deficiency of the 10-h FMC model was that it poorly captured the difference between 10-h FMC and EMC in the drying process after rainfall. The analysis of 10-h FMC on the dates fires occurred showed that more than 70% of the fires occurred under a 10-h FMC condition of less than 10.5%. Overall, the present study suggests a simple model estimating 10-h FMC with acceptable performance. Applying the 10-h FMC model to the automatic mountain meteorology observation system (AMOS) was successfully tested to produce a national-scale 10-h FMC spatial distribution map. These data will be fundamental information for forest fire research and will support policy makers.
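
The simple regression model above (10-h FMC as a linear function of EMC) can be fit with ordinary least squares. A sketch with hypothetical paired observations, not the AMOS data:

```python
# Ordinary least squares fit of 10-h fuel moisture content (FMC, %)
# against equilibrium moisture content (EMC, %).
# The paired observations are hypothetical, for illustration only.

def ols_fit(x, y):
    """Return (slope, intercept) minimizing sum of squared residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

emc = [5.0, 8.0, 11.0, 14.0, 17.0]   # independent variable (EMC, %)
fmc = [6.1, 8.9, 12.2, 14.8, 18.0]   # dependent variable (measured 10-h FMC, %)

slope, intercept = ols_fit(emc, fmc)
predict = lambda e: slope * e + intercept  # 10-h FMC estimate from EMC
print(round(slope, 3), round(intercept, 3))
```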

Efficient 3D Geometric Structure Inference and Modeling for Tensor Voting based Region Segmentation (효과적인 3차원 기하학적 구조 추정 및 모델링을 위한 텐서 보팅 기반 영역 분할)

  • Kim, Sang-Kyoon;Park, Soon-Young;Park, Jong-Hyun
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.49 no.3
    • /
    • pp.10-17
    • /
    • 2012
  • Image-based 3D scenes can now be found in many popular vision systems, computer games, and virtual reality tours. In this paper, we propose a method for creating 3D virtual scenes from a 2D image that is completely automatic and requires only a single scene as input data. The proposed method is similar to the creation of a pop-up illustration in a children's book. In particular, to estimate geometric structure information for a 3D scene from a single outdoor image, we apply tensor voting to image segmentation. Tensor voting exploits the fact that tokens belonging to a homogeneous image region usually lie close together on a smooth region, so the tokens corresponding to the centers of these regions have high saliency values. Our algorithm then labels regions of the input image into coarse categories: "ground", "sky", and "vertical". These labels are used to "cut and fold" the image into a pop-up model using a set of simple assumptions. The experimental results show that our method successfully segments coarse regions in many complex natural scene images and can create a 3D pop-up model that infers structure information from the segmented regions.
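
The saliency intuition above (tokens near the center of a dense homogeneous region accumulate the most support from their neighbors) can be illustrated with a drastically simplified "ball voting" sketch. This is a toy stand-in for full tensor voting, not the paper's implementation; the tokens and voting kernel are hypothetical:

```python
import math

# Simplified "ball voting" saliency: each token casts a distance-decayed
# vote for every other token, so tokens near the center of a dense
# homogeneous region accumulate the highest saliency (a toy stand-in
# for full tensor voting).

def saliency(tokens, sigma=2.0):
    scores = []
    for i, (xi, yi) in enumerate(tokens):
        s = 0.0
        for j, (xj, yj) in enumerate(tokens):
            if i != j:
                d2 = (xi - xj) ** 2 + (yi - yj) ** 2
                s += math.exp(-d2 / (sigma ** 2))  # decaying vote
        scores.append(s)
    return scores

# A dense cluster (one homogeneous region) plus a far-away outlier token.
tokens = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (10, 10)]
scores = saliency(tokens)
center = tokens[scores.index(max(scores))]
print(center)  # the cluster's central token receives the highest saliency
```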

Impact Assessment of Spatial Resolution of Radar Rainfall and a Distributed Hydrologic Model on Parameter Estimation (레이더 강우 및 분포형 수문모형의 공간해상도가 매개변수 추정에 미치는 영향 평가)

  • Noh, Seong Jin;Choi, Shin Woo;Choi, Yun Seok;Kim, Kyung Tak
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.34 no.5
    • /
    • pp.1443-1454
    • /
    • 2014
  • In this study, we assess the impact of the spatial resolution of radar rainfall and of a distributed hydrologic model on parameter estimation and rainfall-runoff response. Radar data measured in 2012 by the S-band polarimetric radar located at Mt. Bisl are used for the comparative study. Although rainfall estimates such as R-KDP, R-Z, and R-ZDR all show good agreement with ground rainfall, R-KDP is applied for rainfall-runoff modeling due to its relatively high accuracy in terms of catchment-averaged and gauging-point rainfall. GRM (grid-based rainfall-runoff model) is implemented for flood simulations in the Geumho River catchment at spatial resolutions of 200 m, 500 m, and 1000 m. Automatic calibration is performed with PEST (model-independent parameter estimation tool) to find suitable parameters for each spatial resolution. At 200 m resolution, the multipliers of overland flow and soil hydraulic conductivity are estimated within stable ranges, while high variations are found in the results at 500 m and 1000 m resolution. No tendency is found in the estimated initial soil moisture. When parameters estimated for one spatial resolution are applied at other resolutions, the 200 m resolution model shows higher sensitivity than the 1000 m resolution model.
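
Automatic calibration of the kind PEST performs minimizes an objective function (e.g., the sum of squared errors between simulated and observed discharge) over parameter multipliers. A minimal grid-search sketch with a hypothetical one-parameter toy model, standing in for the GRM/PEST setup:

```python
# Grid search over a hydraulic-conductivity multiplier to minimize
# sum-of-squared-errors against observed discharge.
# The toy model, rainfall series, and observations are hypothetical.

def simulate(multiplier, rainfall):
    """Toy rainfall-runoff model: runoff shrinks as the conductivity
    multiplier (hence infiltration) grows."""
    return [r / (1.0 + multiplier) for r in rainfall]

def sse(sim, obs):
    """Objective function: sum of squared errors."""
    return sum((s - o) ** 2 for s, o in zip(sim, obs))

rainfall = [10.0, 20.0, 15.0]
observed = [5.0, 10.0, 7.5]   # consistent with multiplier = 1.0

candidates = [0.5 + 0.1 * i for i in range(11)]   # multipliers 0.5 .. 1.5
best = min(candidates, key=lambda m: sse(simulate(m, rainfall), observed))
print(round(best, 1))
```

Gradient-based estimators like PEST replace this exhaustive search with iterative Jacobian-based updates, but the objective being minimized is the same kind of misfit.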

Development of Rule-Set Definition for Architectural Design Code Checking based on BIM - for Act on the Promotion and Guarantee of Access for the Disabled, the Aged, and Pregnant Women to Facilities and Information - (BIM 기반의 건축법규검토를 위한 룰셋 정의서 개발 - 장애인,노인,임산부 등의 편의증진 보장에 관한 법률 대상으로 -)

  • Kim, Yuri;Lee, Sang-Hya;Park, Sang-Hyuk
    • Korean Journal of Construction Engineering and Management
    • /
    • v.13 no.6
    • /
    • pp.143-152
    • /
    • 2012
  • As the Public Procurement Service has announced that BIM adoption will be compulsory in every public construction project from 2016, the importance of BIM is increasing. In addition, automatic code checking is significant for the quality control of BIM-based design. In this study, rule-sets were defined for the Act on the Promotion and Guarantee of Access for the Disabled, the Aged, and Pregnant Women to Facilities and Information. Three analytic steps were used to shortlist the objective clauses from the entire code: a frequency analysis using project reviews for architectural code compliance, a clause analysis on quantifiability, and an analysis of model-checking possibilities. The shortlisted clauses were transformed into a machine-readable rule-set definition. A case study was conducted to verify the adaptiveness and consistency of the rule-set definitions. In future work, the methodology for selecting objective clauses needs to be specified and its indicators quantified. Case studies should also be performed to determine the preconditions for modeling and to check interoperability issues and other possible errors in models.
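
A machine-readable rule from such a rule-set definition ultimately reduces to a predicate evaluated over model elements. A minimal sketch checking a hypothetical accessibility clause (minimum clear door width) against BIM-like element records; the threshold, field names, and element data are illustrative, not the statute's actual requirements:

```python
# Hypothetical rule check: every door on an accessible route must have
# a clear width of at least 900 mm (illustrative threshold only, not
# the actual clause value from the Act).

MIN_CLEAR_WIDTH_MM = 900

def check_door_widths(doors):
    """Return the ids of door elements violating the rule."""
    return [d["id"] for d in doors
            if d["accessible_route"] and d["clear_width_mm"] < MIN_CLEAR_WIDTH_MM]

# BIM-like element records (hypothetical).
doors = [
    {"id": "D01", "clear_width_mm": 950, "accessible_route": True},
    {"id": "D02", "clear_width_mm": 850, "accessible_route": True},
    {"id": "D03", "clear_width_mm": 700, "accessible_route": False},
]

violations = check_door_widths(doors)
print(violations)  # doors failing the clause
```

In practice such predicates are authored in a model checker's rule language rather than ad-hoc code, but the shortlisting analysis in the paper is precisely about deciding which clauses can be expressed this way.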

A Study on the Characteristics Analysis of Hybrid Choke Coil with Reduced Parasitic Capacitance suitable for LED-TV SMPS (LED-TV용(用) 전원장치에 적합한 기생 커패시턴스 저감형 Hybrid 초크 코일의 특성 해석에 관한 연구)

  • Lee, Jong-Hyeon;Kim, Gu-Yong;Kim, Jong-Hae
    • Journal of IKEEE
    • /
    • v.22 no.1
    • /
    • pp.185-188
    • /
    • 2018
  • This paper describes parasitic capacitance modeling according to the coil structure, section bobbin, and winding method for a hybrid choke coil with reduced parasitic capacitance, capable of EMI attenuation over broad bands from low to high frequency, as applied in the EMI attenuation filter of an LED-TV SMPS. In particular, the hybrid choke coil proposed in this paper reduces the parasitic capacitance (Cp) by adopting a rectangular copper wire winding method, compared to the conventional common-mode choke coil with an automatic winding method. The first resonant frequency of the proposed hybrid choke coil tends to increase as the parasitic capacitance decreases, and its impedance characteristics, especially in the high frequency bands, improve as the first resonant frequency increases. The proposed hybrid choke coil with reduced parasitic capacitance can be utilized not only in LED-TV SMPS but also in various applications such as LED lighting and notebook PC adapters.
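
The stated tendency (first resonant frequency rising as parasitic capacitance falls) follows from the LC self-resonance formula f = 1/(2π√(L·Cp)). A quick numeric check; the inductance and capacitance values are hypothetical, not measurements from the paper:

```python
import math

# First (self-)resonant frequency of a choke: f = 1 / (2*pi*sqrt(L*Cp)).
# L and Cp values below are illustrative only.

def resonant_frequency(L, Cp):
    return 1.0 / (2.0 * math.pi * math.sqrt(L * Cp))

L = 10e-3                                        # 10 mH choke inductance
f_conventional = resonant_frequency(L, 20e-12)   # Cp = 20 pF
f_reduced      = resonant_frequency(L, 5e-12)    # Cp =  5 pF

# Since f scales as 1/sqrt(Cp), quartering Cp doubles the resonant frequency.
print(f_reduced / f_conventional)
```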

Parameter estimations to improve urban planning area runoff prediction accuracy using Stormwater Management Model (SWMM) (SWMM을 이용한 도시계획지역 유출량 예측 정확도 향상을 위한 매개변수 산정)

  • Koo, Young Min;Seo, Dongil
    • Journal of Korea Water Resources Association
    • /
    • v.50 no.5
    • /
    • pp.303-313
    • /
    • 2017
  • In environmental impact assessments for large urban development projects, the Korean government requires analysis of stormwater runoff before, during, and after the projects. Though hydrological models are widely used to analyze and prepare for surface runoff during storm events, the accuracy of the predicted results has been in question due to the limited amount of field data for model calibration. Intensive field measurements were made for storm events between July 2015 and July 2016 at a sub-basin of the Gwanpyung-cheon, Daejeon, Republic of Korea, using an automatic monitoring system supplemented by manual measurements. The continuous precipitation and surface runoff data were used with the SWMM model to predict surface runoff during storm events with improved accuracy. Optimal values of Manning's roughness coefficient and depression storage were estimated for pervious and impervious surfaces using three representative infiltration methods: the Curve Number method, the Horton method, and the Green-Ampt method. The results of this research are expected to be used for more efficient urban development projects in Korea.
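
Of the three infiltration/runoff approaches mentioned, the Curve Number method has a compact closed form: S = 25400/CN − 254 (mm) and Q = (P − 0.2S)² / (P + 0.8S) for P > 0.2S. A sketch with hypothetical CN and rainfall values, not the study's calibrated parameters:

```python
# SCS Curve Number direct-runoff depth (SI units, mm).
# The CN and rainfall depth below are illustrative values only.

def scs_runoff(P_mm, CN):
    """Runoff depth Q for rainfall P and curve number CN, assuming the
    conventional initial abstraction Ia = 0.2*S."""
    S = 25400.0 / CN - 254.0          # potential maximum retention, mm
    Ia = 0.2 * S                      # initial abstraction, mm
    if P_mm <= Ia:
        return 0.0                    # all rainfall abstracted, no runoff
    return (P_mm - Ia) ** 2 / (P_mm - Ia + S)   # same as (P-0.2S)^2/(P+0.8S)

q = scs_runoff(P_mm=80.0, CN=85)      # e.g. 80 mm storm on a CN=85 surface
print(round(q, 2))
```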