• Title/Summary/Keyword: and Pre-Processing


KorPatELECTRA: A Pre-trained Language Model for Korean Patent Literature to Improve Performance in the Field of Natural Language Processing (Korean Patent ELECTRA)

  • Jang, Ji-Mo;Min, Jae-Ok;Noh, Han-Sung
    • Journal of the Korea Society of Computer and Information / v.27 no.2 / pp.15-23 / 2022
  • Natural language processing (NLP) of patent documents is challenging because of the linguistic specificity of patent literature, so a language model optimized for Korean patent literature is urgently needed. Recent NLP research has repeatedly built pre-trained language models for specific domains to improve performance on tasks in those domains. Among such models, ELECTRA is a pre-trained language model proposed by Google after BERT that uses a new method called RTD (Replaced Token Detection) to increase training efficiency. This paper proposes KorPatELECTRA, which is pre-trained on a large amount of Korean patent literature. Pre-training was optimized by preprocessing the training corpus according to the characteristics of patent literature and by applying a patent-specific vocabulary and tokenizer. To confirm its performance, KorPatELECTRA was tested on NER (Named Entity Recognition), MRC (Machine Reading Comprehension), and patent classification tasks using actual patent data, and it achieved the best performance on all three tasks compared with general-purpose language models.
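
The RTD objective mentioned in this abstract can be illustrated with a minimal sketch. It is not KorPatELECTRA's code: in ELECTRA a small generator network proposes the replacement tokens, whereas here a random sampler stands in for it, and the names `corrupt_tokens`, `rtd_labels`, the toy vocabulary, and the 15% replacement rate are all assumptions made for illustration. The discriminator's training target is simply one binary label per token.

```python
import random


def corrupt_tokens(tokens, vocab, replace_prob=0.15, seed=0):
    """Replace a random subset of tokens with tokens sampled from `vocab`.

    Hypothetical stand-in for ELECTRA's generator, which would sample
    plausible replacements from a small masked language model instead.
    """
    rng = random.Random(seed)
    corrupted = []
    for tok in tokens:
        if rng.random() < replace_prob:
            corrupted.append(rng.choice(vocab))
        else:
            corrupted.append(tok)
    return corrupted


def rtd_labels(original, corrupted):
    """Return 1 where the token was replaced and 0 where it is original.

    A sampled token that happens to equal the original counts as
    original, so the label depends only on the token comparison.
    """
    return [int(o != c) for o, c in zip(original, corrupted)]


if __name__ == "__main__":
    sentence = ["the", "claim", "describes", "a", "semiconductor", "device"]
    toy_vocab = ["the", "a", "device", "method", "signal", "layer"]
    noisy = corrupt_tokens(sentence, toy_vocab, replace_prob=0.3, seed=1)
    print(noisy)
    print(rtd_labels(sentence, noisy))
```

Because every position yields a label, the discriminator receives a training signal from all tokens rather than only the masked ones, which is the source of the efficiency gain the abstract refers to.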

Study on the efficiency improvement of wind turbine load analysis by using automatic generation for wind load condition data (풍황 하중조건 데이터 자동생성화를 이용한 풍력터빈 하중해석의 효율 향상에 관한 연구)

  • Ahn, Kyoung-Min;Lim, Dong-Soo;Lee, Hyun-Joo;Choi, Won-Ho;Lee, Seung-Kuh
    • Proceedings of the Korean Society for New and Renewable Energy Conference / 2006.11a / pp.269-272 / 2006
  • Load analysis software makes it possible to design wind turbines effectively and accurately. In this paper, the Bladed software developed by Garrad Hassan and Partners is used for load analysis. When using Bladed, a great deal of time is required to input data, a step called pre-processing. In this paper, pre-processing is therefore automated with in-house software (BX). With BX, the total time for pre-processing can be reduced by about 90%.


Dual deep neural network-based classifiers to detect experimental seizures

  • Jang, Hyun-Jong;Cho, Kyung-Ok
    • The Korean Journal of Physiology and Pharmacology / v.23 no.2 / pp.131-139 / 2019
  • Manually reviewing electroencephalograms (EEGs) is labor-intensive and demands automated seizure detection systems. To construct an efficient and robust event detector for experimental seizures from continuous EEG monitoring, we combined spectral analysis and deep neural networks. A deep neural network was trained to discriminate periodograms of 5-sec EEG segments from annotated convulsive seizures and the pre- and post-EEG segments. To use the entire EEG for training, a second network was trained with non-seizure EEGs that were misclassified as seizures by the first network. By sequentially applying the dual deep neural networks and simple pre- and post-processing, our autodetector identified all seizure events in 4,272 h of test EEG traces, with only 6 false positive events, corresponding to 100% sensitivity and 98% positive predictive value. Moreover, with pre-processing to reduce the computational burden, scanning and classifying 8,977 h of training and test EEG datasets took only 2.28 h with a personal computer. These results demonstrate that combining a basic feature extractor with dual deep neural networks and rule-based pre- and post-processing can detect convulsive seizures with great accuracy and low computational burden, highlighting the feasibility of our automated seizure detection algorithm.
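
The spectral feature step described in this abstract (periodograms of 5-sec EEG segments) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the sampling rate, the mean removal, the scaling by `fs * n`, and the function names `split_segments` and `periodogram` are assumptions, and a direct DFT is used only to keep the example dependency-free (a real implementation would use an FFT routine).

```python
import cmath
import math


def split_segments(signal, fs, seconds=5.0):
    """Cut a 1-D signal into consecutive, non-overlapping segments.

    `fs` is the sampling rate in Hz (an assumed parameter); a trailing
    remainder shorter than one segment is dropped.
    """
    n = int(fs * seconds)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def periodogram(segment, fs):
    """Return (frequencies, power) for one segment via a direct DFT.

    The segment mean is removed first so the DC offset does not
    dominate; power is |X_k|^2 / (fs * n) for the non-negative bins.
    """
    n = len(segment)
    mean = sum(segment) / n
    centered = [x - mean for x in segment]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        freqs.append(k * fs / n)
        power.append(abs(acc) ** 2 / (fs * n))
    return freqs, power


if __name__ == "__main__":
    fs = 50  # hypothetical sampling rate in Hz
    trace = [math.sin(2 * math.pi * 10 * t / fs) for t in range(10 * fs)]
    for seg in split_segments(trace, fs):
        freqs, power = periodogram(seg, fs)
        peak = max(range(len(power)), key=power.__getitem__)
        print("dominant frequency (Hz):", freqs[peak])
```

Each resulting power vector would then serve as the input to a classifier such as the neural networks described in the abstract.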

An Intelligent Framework for Feature Detection and Health Recommendation System of Diseases

  • Mavaluru, Dinesh
    • International Journal of Computer Science & Network Security / v.21 no.3 / pp.177-184 / 2021
  • All over the world, people are affected by many chronic diseases, and medical practitioners are working hard to identify the symptoms of and remedies for these diseases. Many researchers focus on feature detection for diseases in an effort to build better health recommendation systems. Features must be detected automatically to provide the most relevant solution for a disease, because detecting and monitoring diseases manually is very difficult, requires expertise in the field, and is therefore time-consuming. This research presents a framework for a Health Recommendation System (HRS) that identifies relevant and non-redundant features in a dataset for the prediction and recommendation of diseases. The system consists of three phases: pre-processing, feature selection, and performance evaluation. It handles missing and noisy data using the proposed Imputation of Missing Data and Noise Detection based Pre-processing (IMDNDP) algorithm. Features are selected from the pre-processed dataset by the proposed ensemble-based feature selection using an expert's knowledge (EFS-EK). Finally, prediction and recommendation are performed using a Support Vector Machine (SVM) and rule-based approaches.

Ginsenoside Changes in Red Ginseng Manufactured by Acid Impregnation Treatment

  • Kim, Mi-Hyun;Hong, Hee-Do;Kim, Young-Chan;Rhee, Young-Kyoung;Kim, Kyung-Tack;Rho, Jeong-Hae
    • Journal of Ginseng Research / v.34 no.2 / pp.93-97 / 2010
  • To enhance the functionalities of ginseng, an acid impregnation pre-treatment was applied during red ginseng processing. Acetic, ascorbic, citric, malic, lactic, and oxalic acid were used for the acid impregnation treatment, and total and crude saponin concentrations and ginsenoside patterns were evaluated. Total and crude saponin contents of red ginseng pre-treated by acetic, ascorbic, and citric acid were similar to those of red ginseng without pre-treatment, whereas lactic, malic, and oxalic acid pre-treatment caused a reduction of total and crude saponin in red ginseng. From the high performance liquid chromatography analysis of ginsenosides, increased $Rg_3$ density was shown in red ginseng pre-treated by acetic, ascorbic, and citric acid impregnation. In the case of lactic, malic, and oxalic acid pre-treatment, increased $Rg_1$ density was observed in red ginseng. Increased $Rg_1$ and $Rg_3$ contents due to acid impregnation during red ginseng processing may contribute to improving bioactive functionalities of red ginseng.

A VLSI Design for High-speed Data Processing of Differential Phase Detectors with Decision Feedback (결정 궤환 구조를 갖는 차동 위상 검출기의 고속 데이터 처리를 위한 VLSI 설계)

  • Kim, Chang-Gon;Jeong, Jeong-Hwa
    • Journal of the Institute of Electronics Engineers of Korea SD / v.39 no.5 / pp.74-86 / 2002
  • This paper proposes a VLSI architecture for high-speed data processing in differential phase detectors with decision feedback. To improve the BER performance of conventional differential phase detection, DF-DPD, DPD-RGPR, and DFDPD-SA have been proposed. These detection methods feed back the detected phase to reduce the noise in the previous symbol, which serves as the phase reference. However, feeding back the detected phase results in a lower data processing speed than that of conventional differential phase detection. The proposed architecture uses a pre-calculation method, which computes the results of the 'N'th step in advance at the 'M-1'th step, and a pre-decision feedback method, which feeds back the predicted phases in advance at the 'M-1'th step. The architecture was implemented at the RTL level using VHDL. The simulation results show that the proposed architecture achieves high-speed data processing.
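
For orientation, the conventional differential phase detection that this abstract takes as its baseline can be sketched in a few lines. This is only the baseline, not the DF-DPD, DPD-RGPR, or DFDPD-SA schemes and not the proposed VLSI architecture, whose details the abstract does not give. The function name `differential_phase_detect`, the choice of M = 4 phase levels, and the symbol-to-phase mapping are assumptions for illustration.

```python
import math


def differential_phase_detect(phases, m=4):
    """Detect M-ary differentially encoded symbols from received phases.

    Each symbol is decided from the phase difference between the current
    sample and the previous one, which acts as the phase reference.
    Assumes symbol s was sent as a phase increment of s * 2*pi / m.
    """
    step = 2 * math.pi / m
    symbols = []
    for prev, cur in zip(phases, phases[1:]):
        delta = (cur - prev) % (2 * math.pi)  # wrap into [0, 2*pi)
        symbols.append(round(delta / step) % m)  # nearest phase level
    return symbols


if __name__ == "__main__":
    sent = [1, 3, 0, 2]
    phases = [0.3]  # arbitrary unknown starting phase
    for s in sent:
        phases.append(phases[-1] + s * math.pi / 2)
    print(differential_phase_detect(phases))
```

Because the previous received sample is itself noisy, the reference carries noise into every decision; the decision-feedback variants named in the abstract address that at the cost of a feedback loop, which is the speed bottleneck the paper targets.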

Edge Enhanced Error Diffusion based on Gradient Shaping of Original image (원영상의 기울기 성형을 이용한 경계강조 오차확산법)

  • 강태하;황병원
    • Proceedings of the IEEK Conference / 2000.06d / pp.70-73 / 2000
  • Error diffusion is well suited to reproducing a continuous-tone image as a binary image. However, power spectrum analysis of the display error shows that its reproduction of edge characteristics is weak. This paper proposes an edge-enhanced error diffusion method that includes a pre-processing algorithm for enhancing edge characteristics. The pre-processing algorithm consists of horizontal and vertical second-order differential values and the weighting function of a pre-filter. The improved error diffusion using the pre-filter produces visually good results with enhanced edge characteristics. The performance of the proposed algorithm is compared with that of conventional edge-enhanced error diffusion by measuring the RAPSD of the display error, the edge correlation, and the local average accordance.
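
As background for this abstract, plain error diffusion can be sketched as below. The abstract does not say which diffusion kernel the authors use, so the Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16) are an assumption, and the proposed edge-enhancing pre-filter is not included. The function name `error_diffusion` and the 0 to 255 gray range are also illustrative choices.

```python
def error_diffusion(image, threshold=128):
    """Binarize a grayscale image (rows of values in 0..255).

    Each pixel is thresholded in raster order, and its quantization
    error is pushed onto unprocessed neighbours using Floyd-Steinberg
    weights (an assumed kernel, not taken from the paper).
    """
    h, w = len(image), len(image[0])
    work = [[float(v) for v in row] for row in image]  # mutable copy
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 255 if old >= threshold else 0
            out[y][x] = new
            err = old - new
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16      # right
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16  # below left
                work[y + 1][x] += err * 5 / 16          # below
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16  # below right
    return out


if __name__ == "__main__":
    gray = [[64] * 8 for _ in range(8)]  # uniform dark-gray test patch
    for row in error_diffusion(gray):
        print("".join("#" if v else "." for v in row))
```

An edge-enhancing variant along the lines of the abstract would modify the input, for example by adding a weighted second-order difference of the original image, before this loop runs.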


Development of Vehicle Dynamics Program AutoDyn7(II) - Pre-Processor and Post-Processor (차량동역학 해석 프로그램 AutoDyn7의 개발(II) - 전처리 및 후처리 프로그램)

  • 한종규;김두현;김성수;유완석;김상섭
    • Transactions of the Korean Society of Automotive Engineers / v.8 no.3 / pp.190-197 / 2000
  • A graphical vehicle modeling pre-processing program and a visualization post-processing program have been developed for AutoDyn7, a special-purpose program for vehicle dynamics. Rapid-App, a GUI (Graphical User Interface) builder, and Open Inventor, a 3D graphics library, were used to develop these programs on a Silicon Graphics workstation. A graphical user interface program integrates the vehicle modeling pre-processor, the AutoDyn7 analysis processor, and the visualization post-processor. In the vehicle modeling pre-processor, vehicle hard point data for a suspension model are automatically converted into multibody vehicle system data. Interactive graphics capabilities provide suspension modeling aids for verifying user input data interactively. In the visualization post-processor, vehicle virtual test simulation results are animated in virtual testing environments.


Korean Machine Reading Comprehension for Patent Consultation Using BERT (BERT를 이용한 한국어 특허상담 기계독해)

  • Min, Jae-Ok;Park, Jin-Woo;Jo, Yu-Jeong;Lee, Bong-Gun
    • KIPS Transactions on Software and Data Engineering / v.9 no.4 / pp.145-152 / 2020
  • MRC (Machine Reading Comprehension) is an AI NLP task that predicts the answer to a user's query by understanding the relevant document, and it can be used in automated consultation services such as chatbots. The BERT (Pre-training of Deep Bidirectional Transformers for Language Understanding) model, which has recently shown high performance in various fields of natural language processing, has two phases. The first phase is pre-training on large-scale data from each domain. The second phase is fine-tuning the model to solve each NLP task as a prediction problem. In this paper, we build a patent MRC dataset and show how to construct patent consultation training data for the MRC task. We also propose a method to improve MRC performance using a Patent-BERT model pre-trained on a patent consultation corpus, together with a language processing algorithm suited to machine learning on patent counseling data. Experimental results show that the proposed method improves performance in answering patent counseling queries.

Study on the Changes of Cellulose Molecular Weight and α-Cellulose Content by the Extrusion Conditions of Cellulose-NMMO Hydrate Solution (셀룰로오스-NMMO 수화물 용액의 압출가공 조건에 따른 셀룰로오스 분자량과 알파 셀룰로오스 함량 변화에 대한 연구)

  • Kim, Dong-Bok
    • Polymer(Korea) / v.37 no.3 / pp.362-372 / 2013
  • During extrusion processing to manufacture cellulose fiber and film from a cellulose-NMMO pre-dope produced by a new method, the molecular weight and ${\alpha}$-cellulose content of the cellulose appear to change as a result of thermal and mechanical degradation. In an extruder producing cellulose solutions from the pre-dope obtained with a high-speed mixer, the molecular weight and ${\alpha}$-cellulose content changed with variations in processing temperature, cellulose concentration, and residence time. Both decreased with decreasing cellulose concentration and increasing processing temperature. At 15% concentration and in the short residence time region, the change in ${\alpha}$-cellulose content was large because of high shear as the temperature increased. Under these processing conditions, the variations in ${\alpha}$-cellulose content and molecular weight showed different behaviors, and the processing conditions for making the cellulose solution were found to be important factors.