• Title/Summary/Keyword: 기본오류 (basic error)


A Survey on the Latest Research Trends in Retrieval-Augmented Generation (검색 증강 생성(RAG) 기술의 최신 연구 동향에 대한 조사)

  • Eunbin Lee;Ho Bae
    • The Transactions of the Korea Information Processing Society / v.13 no.9 / pp.429-436 / 2024
  • As Large Language Models (LLMs) continue to advance, effectively harnessing their potential has become increasingly important. LLMs, trained on vast datasets, are capable of generating text across a wide range of topics, making them useful in applications such as content creation, machine translation, and chatbots. However, they often face challenges in generalization due to gaps in specific or specialized knowledge, and updating these models with the latest information post-training remains a significant hurdle. To address these issues, Retrieval-Augmented Generation (RAG) models have been introduced. These models enhance response generation by retrieving information from continuously updated external databases, thereby reducing the hallucination phenomenon often seen in LLMs while improving efficiency and accuracy. This paper presents the foundational architecture of RAG, reviews recent research trends aimed at enhancing the retrieval capabilities of LLMs through RAG, and discusses evaluation techniques. Additionally, it explores performance optimization and real-world applications of RAG in various industries. Through this analysis, the paper aims to propose future research directions for the continued development of RAG models.
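
As a rough illustration of the retrieve-then-generate flow this abstract describes, the sketch below ranks a small in-memory document list against a query with bag-of-words cosine similarity and then assembles an augmented prompt. The corpus, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), and the scoring method are all assumptions chosen for illustration; the survey does not prescribe them, and the call to an actual LLM is deliberately left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query, passages):
    """Prepend retrieved passages to the question (the augmentation step)."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical external knowledge base; a real system would keep it updated.
DOCUMENTS = [
    "RAG retrieves passages from an external database before generating.",
    "Convolutional networks are widely used for image classification.",
    "Retrieval can reduce hallucination by grounding answers in documents.",
]

if __name__ == "__main__":
    question = "How does retrieval reduce hallucination?"
    # The resulting prompt would be passed to a language model of your choice.
    print(build_prompt(question, retrieve(question, DOCUMENTS)))
```

Production systems typically replace the bag-of-words scoring with dense embeddings and a vector index, but the two-step shape (retrieve, then condition generation on what was retrieved) stays the same.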

Utilizing Housing-Sector Administrative Data in the Population and Housing Census (주거부문 행정자료의 인구주택총조사 활용방안)

  • Lee, Geon;Byeon, Mi-Ri;Lee, Myeong-Jin;Seo, U-Seok
    • Proceedings of the Korean Statistical Society Conference / 2005.11a / pp.117-120 / 2005
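  • The Population and Housing Census produces the most fundamental data for national statistics, and nearly every country has conducted it regularly as a complete enumeration. In recent years, however, the survey environment has deteriorated in some countries, particularly developed ones, as refusals to respond increase and respondents become harder to reach. Survey costs are also rising sharply. National statistical offices therefore regard this situation as a 'fundamental challenge' to the population census (Jensen, 2000). Germany and the Netherlands have even suspended their population censuses since the 1990s because of the worsening survey environment (Bierau, 2000). A deteriorating survey environment raises problems for the coverage and reliability of the census. As in developed countries, the survey environment in Korea is deteriorating rapidly. Moreover, because the functions of eup, myeon, and dong offices have been scaled back, the administrative support that aided fieldwork in past censuses is no longer available, so the difficulties are expected to grow. In response to this changing environment, developed countries are exploring various forms of population census. Many countries are looking to use administrative data in the census rather than adopting a rolling census, and some, such as Denmark and Finland, already produce most census statistics from administrative data without conducting any survey at all (Harala, 1996; Gaasemyr, 1999; Laihonen, 1999). There are further reasons why many countries prefer a census based on administrative data. In terms of data, administrative records make it possible to produce census statistics every year. Denmark and Finland in fact currently produce census-equivalent statistics annually. Such data also allow an immediate response to demand for regional statistics. Furthermore, because these statistics constitute panel data on the entire population, their range of statistical uses is vast. In particular, producing data that track how social actors such as individuals, households, and businesses change over time enables a variety of causal statistical analyses. These features provide a foundation for formulating national education, labor, welfare, and other policies on the basis of accurate data (Gaasemyr, 1999). A census based on administrative data also has the advantage of low cost. Denmark and Finland, for example, produce all census data from administrative records at roughly 1/20 of the cost incurred when the data were collected by survey. In particular, now that administrative records are being converted into databases through information and communication technology, and Internet-based computer networks have developed, linking and integrating the data that each ministry has accumulated for administrative purposes provides a basis for producing census data cheaply, without enormous survey costs. Despite these many advantages, not every country can immediately replace its census with administrative data. Producing census statistics from administrative data requires an administrative system, spanning the whole of government and society, that allows the records used by each administrative department to be linked and integrated. In particular, basic information on every individual, and on each dwelling unit in which individuals reside, must be registered with the government and well maintained, and the formats of this information must be standardized so that the records can be linked. In addition, to produce the economic activity statistics that the census currently generates through a supplementary sample survey, all businesses must be registered so that the business each individual belongs to can be identified, and information on individuals' economic activity must be well recorded and maintained by those businesses. Producing census statistics from administrative data therefore entails not just statistical work but the enormous task of reorganizing administrative bodies and systems, overhauling the registration systems for individuals and businesses, and cleaning up and standardizing the information that businesses hold on individuals. 
For this reason, most countries aim to produce census statistics from administrative data in the future while, for now, concentrating on using administrative data as a supplementary tool in the census. Korea has several important foundations for using administrative data in the Population and Housing Census. First, there is the resident registration system, in operation since 1962. It assigns a resident registration number that identifies every individual citizen, and, having been institutionalized for more than 40 years, it contains very few errors. Second, Korea's level of informatization, which ranks within the world's top ten, and the e-government project begun in 2000 provide a solid basis for linking and integrating administrative data. By contrast, the administrative data that identify dwelling (living) units and businesses are highly incomplete. Most notably, the address system is not organized at a level that can identify households, a key unit of census statistics, and many businesses, especially small ones, are unregistered or registered with errors. Many other administrative records, such as tax registers and land registers, are also still incomplete, making them difficult to link directly. Linking administrative data requires that all records be cleaned, standardized, and put to use in actual administration, which demands considerable administrative effort and time. For now, therefore, administrative data should be used in the census starting with the easiest areas, and a long-term plan should be prepared for cleaning and standardizing administrative data as that use proceeds.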

An Oceanic Current Map of the East Sea for Science Textbooks Based on Scientific Knowledge Acquired from Oceanic Measurements (해양관측을 통해 획득된 과학적 지식에 기반한 과학교과서 동해 해류도)

  • Park, Kyung-Ae;Park, Ji-Eun;Choi, Byoung-Ju;Byun, Do-Seong;Lee, Eun-Il
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.18 no.4 / pp.234-265 / 2013
  • Oceanic current maps in secondary school science and earth science textbooks have played an important role in piquing students' curiosity about and interest in the ocean. Such maps can provide students with important opportunities to learn about oceanic currents relevant to abrupt climate change and global energy balance issues. Nevertheless, serious and diverse errors in these secondary school oceanic current maps have been discovered upon comparison with up-to-date scientific knowledge concerning oceanic currents. This study presents the fundamental methods and strategies for constructing such maps error-free, through the unification of the diverse current maps currently in the textbooks. To do so, we analyzed the maps found in 27 different textbooks, compared them with up-to-date maps found in scientific journals, and developed a mapping technique for extracting digitized quantitative information on warm and cold currents in the East Sea. We devised analysis items for the current visualization in relation to the branching features of the Tsushima Warm Current (TWC) in the Korea Strait. These analysis items include its nearshore and offshore branches, the northern limit and distance from the coast of the East Korea Warm Current, outflow features of the TWC near the Tsugaru and Soya Straits and their returning currents, and flow patterns of the Liman Cold Current and the North Korea Cold Current. The first draft of the current map was constructed from the scientific knowledge and input of oceanographers based on oceanic in-situ measurements, and was corrected with the help of a questionnaire survey of the members of an oceanographic society. 
In addition, diverse comments were collected at a special session of the 2013 spring meeting of the Korean Oceanographic Society to assist in the construction of an accurate current map of the East Sea, which was corrected repeatedly through in-depth discussions with oceanographers. Finally, we obtained constructive comments and evaluations of the interim version of the current map from several well-known ocean current experts and incorporated their input to complete the map's final version. To avoid errors in the production of oceanic current maps in future textbooks, we provide the geolocation information (latitude and longitude) of the currents by digitizing the map. This study is expected to be the first step towards the completion of an oceanographic current map suitable for secondary school textbooks, and to encourage oceanographers to take more interest in oceanic education.

Comparative Study on the Ability of Instruments to Maintain Original Canal Curvature of Continuous rotary System and Single File System (Continuous rotary system과 single file system의 만곡 근관 형태 유지능에 대한 비교 연구)

  • Park, Sang-Hee;Kim, Deok-Joong;Song, Yong-Beom;Lee, Hye-Yun;Kim, Hyoung-Sun;Lee, Kwang-Won;Yu, Mi-Kyung
    • Journal of Dental Rehabilitation and Applied Science / v.28 no.4 / pp.371-383 / 2012
  • Shaping the root canal system while maintaining the original canal curvature is essential to clinical success in endodontic treatment. Whereas most root canals are curved, endodontic instruments are made from straight metal blanks. They tend to straighten the root canal during preparation, which frequently results in procedural errors. A new treatment method that maintains the original canal curvature during shaping has been introduced to prevent procedural errors. The aim of this study was to compare the ability of a continuous rotary system and a single file system to maintain the original canal curvature. Thirty ISO 15, 0.02 taper Endo Training Blocks (Dentsply Maillefer) were used. Specimens were assigned to 1 of 3 groups for shaping: specimens in group 1 were shaped with ProFile #20/.06 at the WL, specimens in group 2 were shaped with Mtwo #35/.04 at the WL, and specimens in group 3 were shaped with WaveOne Primary reciprocating files at the WL after the glide path was achieved with PathFile. Pre- and postinstrumentation digital images were superimposed and processed with MATLAB R2010b (The MathWorks Inc, Natick, MA) software to analyze the curvature-radius ratio (CRr), representing canal curvature modification. Data comparing the ability to maintain the original canal curvature by Ni-Ti file were analyzed with 1-way ANOVA (P<.05). Data comparing the ability to maintain the original canal curvature by Ni-Ti file system were analyzed with an independent t-test (P<.05). A statistically significant difference (P<0.05) was noted among the Ni-Ti files. ProFile and WaveOne instrumentation maintained the original canal curvature significantly better (P<0.05) than the Mtwo file. There was no significant difference (P>0.05) between the continuous rotary system and the single file system. 
Under the conditions of this study, ProFile and WaveOne instruments maintained the original curvature significantly better than the Mtwo file and produced less modification of the canal curvature. There was no significant difference between the continuous rotary system and the single file system in the shaping of simulated canals. For clinical practitioners, it may be advantageous to use a hybrid approach when shaping root canals, depending on the design and usage of the Ni-Ti files.

Building the Process for Reducing Whole Body Bone Scan Errors and its Effect (전신 뼈 스캔의 오류 감소를 위한 프로세스 구축과 적용 효과)

  • Kim, Dong Seok;Park, Jang Won;Choi, Jae Min;Shim, Dong Oh;Kim, Ho Seong;Lee, Yeong Hee
    • The Korean Journal of Nuclear Medicine Technology / v.21 no.1 / pp.76-82 / 2017
  • Purpose: The whole body bone scan is one of the most frequently performed examinations in nuclear medicine. Normally, the anterior and posterior views are acquired simultaneously. Occasionally, it is difficult to distinguish a lesion from the anterior and posterior views alone. In such cases, accurately locating the lesion through SPECT/CT or additional static images is important. Various improvement activities have therefore been carried out to enhance the work capacity of technologists. In this study, we investigate the effect of technologist training and standardized work processes on bone scan error reduction. Materials and Methods: Several systems were introduced in sequence to apply the new processes. The first was education and testing with physicians; the second was a pre-filtration system that classifies patients who are expected to need further scanning so that technologists can check them in advance; and finally, a communication system called NMQA was applied. We examined whole body bone scan patients who visited the Department of Nuclear Medicine, Asan Medical Center, Seoul, Korea, from January 2014 to December 2016. Results: We investigated errors based on the Bone Scan NMQA messages sent from January 2014 to December 2016. The number of examinations for which an NMQA was sent was calculated as a percentage of all bone scans during the survey period. The annual counts were 141 cases in 2014, 88 cases in 2015, and 86 cases in 2016. The NMQA rate decreased from 0.88% in 2014 to 0.53% in 2015 and 0.45% in 2016. Conclusion: The incidence of NMQA has decreased since 2014, when the new process was applied. However, we believe that data will need to be accumulated continuously, because the data are not yet sufficient to confirm its usefulness statistically. 
This study confirmed the necessity of standardized work and education to improve the quality of bone scan images, and continued updates, research, and attention are thought to be needed in the future.

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among those architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because they have produced popular applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. Using this architecture makes convolutional networks fast to train. This, in turn, helps us train deep, multi-layer networks, which are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling. 
By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each of the local receptive fields. This means that all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers. Pooling layers are usually used immediately after convolutional layers. What the pooling layers do is simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Several years ago, training deep learning networks took weeks, but thanks to progress in GPUs and algorithmic improvements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network that allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, which includes vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through the layers. This makes learning in early layers extremely slow. The problem actually gets worse in RNNs, since gradients aren't just propagated backward through layers; they're propagated backward through time. If the network runs for a long time, that can make the gradient extremely unstable and hard to learn from. 
It has been possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
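
The three convolutional ideas named in this abstract (local receptive fields, shared weights, and pooling) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the 5x5 input, the 2x2 kernel values, and the function names `conv2d` and `max_pool` are hypothetical, and the activation function and training step are omitted.

```python
def conv2d(image, kernel, bias=0.0):
    """Slide one shared kernel over the image (valid positions only).

    Each output value depends on a small local receptive field of the input,
    and every position reuses the same weights and bias.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def max_pool(feature_map, size=2):
    """Summarize each non-overlapping size x size block by its maximum."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            block = [feature_map[i + di][j + dj]
                     for di in range(size) for dj in range(size)]
            row.append(max(block))
        pooled.append(row)
    return pooled


# Hypothetical 5x5 input with a vertical edge between columns 1 and 2.
IMAGE = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hypothetical 2x2 kernel that responds to a left-to-right increase.
KERNEL = [
    [-1, 1],
    [-1, 1],
]

if __name__ == "__main__":
    features = conv2d(IMAGE, KERNEL)  # 4x4 feature map
    print(features)
    print(max_pool(features))         # 2x2 summary after pooling
```

Because the kernel is shared, the edge is detected wherever it appears in the input, and pooling then keeps only whether the feature was present in each 2x2 region.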

Unidentified Flying Objectivity: The Rhetoric of Pseudo-Science in Four Major Newspapers in Korea (미확인비행물체(UFO)에 대한 우리나라 신문 보도의 특징: 과학저널리즘의 관점에서)

  • Shin, Soon-Chul
    • Korean journal of communication and information / v.62 / pp.244-263 / 2013
  • There have been enormous social impacts on many areas, including science journalism, since the so-called "Hwang Woo Suk" incident. Although widespread demand for better science journalism has arisen since then, it is hard to find evidence that this goal has been reached. This study examines how major Korean newspapers report on Unidentified Flying Objects in order to test whether the level of science journalism has improved. The results show that there is still a long way to go: most reports were taken from international news agencies or from witnesses rather than from scientific research and analysis; the terminology used in the stories was ambiguous; follow-up stories were rare; the sources were usually pseudo-scientific; and careless errors in basic facts and coherence, among other problems, were found. The study suggests that dependence on supplied news should be reduced, that journalists who understand both science and journalism are needed, that internal guidelines on science reporting should be established, that accurate quotation and fact-checking should be ensured, and that fairness should be maintained within the boundaries of normal science.

Development of a Standard Vector Data Model for Interoperability of River-Geospatial Information (하천공간정보의 상호운용성을 위한 표준벡터데이터 모델 개발)

  • Shin, Hyung-Jin;Chae, Hyo-Sok;Lee, Eul-Rae
    • Journal of the Korean Association of Geographic Information Studies / v.17 no.2 / pp.44-58 / 2014
  • In this study, a standard vector data model was developed for the interoperability of river-geospatial information, and, for verification, its applicability was evaluated by applying the model to RIMGIS vector data for the Changnyeong-Hapcheon and Gangjung-Goryeong irrigation watersheds. The standards from ISO and OGC were analyzed, and the river geospatial data model standard was established by applying them. The ERD was designed based on an analysis of data characteristics and relationships. The RIMGIS vector data used to develop the Geospatial Data Model (GDM), comprising points, lines, and polygons, were verified by comparing the data layer by layer. The comparison covered the basic spatial data and attribute data of each record and the vertices of the spatial information. The conversion error was 0%, indicating no problem with the model. The Geospatial Data Model presented in this study provides a new and consistent format for the storage and retrieval of river geospatial data from connected databases. It is designed to facilitate integrated analysis of large data sets collected by multiple institutes.
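
The record-by-record check described in this abstract (attributes and vertices compared between the source data and the converted data, with the mismatch share reported as a percentage) could look roughly like the sketch below. The abstract does not give the actual procedure, so everything here is an assumption: the feature layout with `id`, `vertices`, and `attributes` keys is hypothetical and is not the RIMGIS schema, and the function names and tolerance are illustrative only.

```python
def records_match(source, converted, tol=1e-9):
    """Compare one feature's attributes and every vertex coordinate."""
    if source["attributes"] != converted["attributes"]:
        return False
    if len(source["vertices"]) != len(converted["vertices"]):
        return False
    for (x1, y1), (x2, y2) in zip(source["vertices"], converted["vertices"]):
        if abs(x1 - x2) > tol or abs(y1 - y2) > tol:
            return False
    return True


def conversion_error_percent(source_layer, converted_layer):
    """Percentage of source records that are missing or differ after conversion."""
    if not source_layer:
        return 0.0
    converted_by_id = {f["id"]: f for f in converted_layer}
    mismatches = 0
    for feature in source_layer:
        other = converted_by_id.get(feature["id"])
        if other is None or not records_match(feature, other):
            mismatches += 1
    return 100.0 * mismatches / len(source_layer)


# Hypothetical layer: one line feature and one point feature.
SOURCE = [
    {"id": 1, "vertices": [(0.0, 0.0), (1.0, 1.0)], "attributes": {"name": "levee"}},
    {"id": 2, "vertices": [(2.0, 2.0)], "attributes": {"name": "gauge"}},
]
# Stand-in for the output of a conversion step (here simply a copy).
CONVERTED = [dict(f) for f in SOURCE]

if __name__ == "__main__":
    print(conversion_error_percent(SOURCE, CONVERTED))
```

Running the same function once per layer (points, lines, polygons) would mirror the layer-by-layer comparison the abstract mentions.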

Analysis of University Information Disclosure Services in the Co-operative Universities for Operating the Information Disclosure System (대학정보 사전공개서비스 운영분석 - 대학정보공시 운영협력대학을 중심으로 -)

  • Koo, Joung Hwa;Cho, Chanyang
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.29 no.2 / pp.169-197 / 2018
  • The research aims to analyze current university information disclosure services from the perspectives of both university records management and services, and to recommend ways to improve the current university information disclosure systems and services. The research collects and analyzes various raw data, such as laws, guidelines, and manuals of university information disclosure services, the portal site of 'Higher Education in Korea', also known as 'dae-hak-al-ri-mi', and data on the homepages of the 40 cooperative universities selected as the research sample. As a result, the research identified several limitations in the current operation of university information disclosure services. First, the information posted on the university disclosure information system is mostly focused on administrative information rather than information related to research or education within universities. Second, there is a high rate of errors and frequent modification in the information posted on the disclosure information system. Third, the menus on both the information disclosure system and the homepages of each cooperative university are unusable, or their contents are empty. The research suggests some solutions to these problems: the current legal framework for university information disclosure services needs to be supplemented, and all organizations and universities related to these services need to cooperate under a unified system and set of rules. It is also crucial to attach metadata to the disclosed information when posting it to the university disclosure information systems. Finally, each university needs to employ archivists, not only to develop qualified university records that maintain the unique roles and value of universities, but also to disclose reliable and authentic information to users and to manage the university information disclosure systems effectively and efficiently.

Proposition Empirical Equations and Application of Artificial Neural Network to the Estimation of Compression Index (압축지수의 추정을 위한 인공신경망 적용과 경험식 제안)

  • 김병탁;김영수;배상근
    • Journal of the Korean Geotechnical Society / v.17 no.6 / pp.25-36 / 2001
  • The purpose of this paper is to discuss the effects of soil properties such as liquid limit and water content on the compression index, to propose empirical equations for the compression index of regional clay, and to verify the applicability of a Back Propagation Neural Network (BPNN). The compression index values obtained from laboratory tests range from 0.01 to 3.06 for clay soils sampled in eleven regions. Comparing the laboratory test results with the compression index values predicted by the proposed empirical equations shows that equations with a single soil parameter may overestimate the index. The results of empirical equations with multiple soil parameters were closer to the measured values than those of single-parameter equations, but the standard error relative to the measured values was larger than 0.05. For these reasons, empirical equations with single or multiple soil parameters are proposed based on the laboratory test results, with a coefficient of determination of up to 0.89. The BPNN results show a correlation coefficient larger than 0.925 and a standard error smaller than 0.0196 between the test and neural network results, indicating high correlation. In particular, a neural network using only three parameters (natural water content, dry unit weight, and in-situ void ratio) can estimate the compression index, with a correlation coefficient of 0.974. This result confirms that, with a BPNN, the compression index can be predicted from parameters obtained from simple field tests.
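
The two agreement measures quoted in this abstract, a correlation coefficient and a standard error between measured and predicted compression index values, can be computed as in the sketch below. It is a rough illustration only: the abstract does not state how the standard error was defined, so root-mean-square error is used here as an assumed stand-in, and the `MEASURED` and `PREDICTED` values are made-up numbers, not data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rms_error(measured, predicted):
    """Root-mean-square difference (assumed stand-in for the standard error)."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


# Made-up compression index values for illustration only.
MEASURED = [0.21, 0.35, 0.48, 0.62, 0.90]
PREDICTED = [0.23, 0.33, 0.50, 0.60, 0.88]

if __name__ == "__main__":
    print("correlation coefficient:", pearson_r(MEASURED, PREDICTED))
    print("RMS error:", rms_error(MEASURED, PREDICTED))
```

A correlation close to 1 together with a small error, as reported for the BPNN, means the predictions both follow the trend of the measurements and stay close to them in absolute terms.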
