• Title/Summary/Keyword: similarity calculation

Search Results: 208 (processing time: 0.027 seconds)

A Federated Multi-Task Learning Model Based on Adaptive Distributed Data Latent Correlation Analysis

  • Wu, Shengbin; Wang, Yibai
    • Journal of Information Processing Systems / v.17 no.3 / pp.441-452 / 2021
  • Federated learning provides an efficient integrated model for distributed data, allowing different data to be trained locally. Meanwhile, the goal of multi-task learning is to build models for multiple related tasks simultaneously and to uncover their shared underlying structure. However, traditional federated multi-task learning models not only place strict requirements on the data distribution but also demand large amounts of computation and converge slowly, which has hindered their adoption in many fields. In our work, we apply a rank constraint to the weight vectors of the multi-task learning model to adaptively adjust the learning of task similarity according to the distribution of data at the federated nodes. The proposed model has a general framework for finding optimal solutions and can handle various data types. Experiments show that our model achieves the best results on different datasets. Notably, it still obtains stable results on datasets with large distribution differences. In addition, compared with traditional federated multi-task learning models, our algorithm converges to a locally optimal solution within a limited number of training iterations.

Developing a New Algorithm for Conversational Agent to Detect Recognition Error and Neologism Meaning: Utilizing Korean Syllable-based Word Similarity (대화형 에이전트 인식오류 및 신조어 탐지를 위한 알고리즘 개발: 한글 음절 분리 기반의 단어 유사도 활용)

  • Jung-Won Lee; Il Im
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.267-286 / 2023
  • Conversational agents such as AI speakers use voice conversation for human-computer interaction, and voice recognition errors often occur in conversational situations. Recognition errors in user utterance records fall into two types. The first is misrecognition errors, where the agent fails to recognize the user's speech at all. The second is misinterpretation errors, where the user's speech is recognized and a service is provided, but the interpretation differs from the user's intention. Of these, misinterpretation errors require separate detection because they are recorded as successful service interactions. In this study, various text separation methods were applied to detect misinterpretation. For each separation method, the similarity of consecutive utterance pairs was calculated using word embedding and document embedding techniques, which convert words and documents into vectors. This approach goes beyond simple word-based similarity calculation to explore a new way of detecting misinterpretation errors. Real user utterance records were used to train and develop a detection model based on the patterns underlying misinterpretation-error causes. The results showed that initial consonant extraction yielded the most significant performance for detecting misinterpretation errors caused by unregistered neologisms, and comparison with the other separation methods revealed different error types. This study has two main implications. First, for misinterpretation errors that are difficult to detect because they are not recognized as errors, it proposed diverse text separation methods and identified a novel one that improved performance remarkably. Second, applying the approach to conversational agents or voice recognition services that must detect neologisms makes it possible to characterize error patterns arising from the voice recognition stage. The study also proposed and verified that, even for utterances not categorized as errors, services can be provided according to the results users actually desire.
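The abstract reports that initial consonant (choseong) extraction worked best for neologism-driven misinterpretation errors, but the paper's own implementation is not shown. As a minimal illustrative sketch (not the authors' code), choseong can be pulled out of Hangul syllable blocks with plain Unicode arithmetic, since each syllable code point encodes its jamo positionally:

```python
# Hangul syllables occupy U+AC00..U+D7A3; each code point encodes
# (initial, medial, final) jamo as 0xAC00 + (cho*21 + jung)*28 + jong.
CHOSEONG = [
    'ㄱ', 'ㄲ', 'ㄴ', 'ㄷ', 'ㄸ', 'ㄹ', 'ㅁ', 'ㅂ', 'ㅃ', 'ㅅ',
    'ㅆ', 'ㅇ', 'ㅈ', 'ㅉ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ',
]

def initial_consonants(word: str) -> str:
    """Extract the initial consonant (choseong) of each Hangul syllable."""
    out = []
    for ch in word:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3:
            out.append(CHOSEONG[(code - 0xAC00) // (21 * 28)])
        else:
            out.append(ch)  # pass non-Hangul characters through unchanged
    return ''.join(out)

print(initial_consonants('신조어'))  # → 'ㅅㅈㅇ'
```

Comparing the choseong strings of consecutive utterances (for instance, with an edit distance or the embedding similarities the paper uses) can then flag pairs whose surface forms diverge.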

A Study on the Intelligent Quick Response System for Fast Fashion(IQRS-FF) (패스트 패션을 위한 지능형 신속대응시스템(IQRS-FF)에 관한 연구)

  • Park, Hyun-Sung; Park, Kwang-Ho
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.163-179 / 2010
  • Recently, the concept of fast fashion has drawn attention as customer needs diversify and supply lead times shorten in the fashion industry. As competition has intensified, how quickly and efficiently customer needs can be satisfied has become one of the industry's critical success factors. Because fast fashion is inherently trend-sensitive, it is very important for fashion retailers to make quick decisions about which items to launch, what quantity to produce based on demand prediction, and when to respond; these planning decisions must then be executed in real time through the procurement, production, and logistics processes. To adapt to this trend, the fashion industry urgently needs support from an intelligent quick response (QR) system, but the traditional functions of QR systems have not fully satisfied these demands. This paper proposes an intelligent quick response system for fast fashion (IQRS-FF), presenting models for the QR process, QR principles and execution, and QR quantity and timing computation. The IQRS-FF models support decision makers by providing useful information through automated, rule-based algorithms: when the predefined conditions of a rule are satisfied, the actions defined in the rule are taken automatically or reported to the decision makers. In IQRS-FF, QR decisions are made in two stages: pre-season and in-season. In pre-season, master demand prediction is first performed based on macro-level analysis of the local and global economy, fashion trends, and competitors, and then proceeds to master production and procurement planning. Checking the availability and delivery of materials for production, decision makers must make reservations or request procurement; for outsourced materials, they must check the availability and capacity of partners. With these master plans in place, QR performance during the in-season is greatly enhanced, and QR items are selected with full consideration of material availability in the warehouse as well as partners' capacity. During the in-season, decision makers must find the right time for QR as actual sales occur in stores. They then decide which items to QR based not only on qualitative criteria, such as the opinions of sales staff, but also on quantitative criteria, such as sales volume, the recent sales trend, inventory level, the remaining period, the forecast for the remaining period, and competitors' performance. To calculate QR quantity, IQRS-FF provides two methods: QR Index-based calculation and attribute-similarity-based calculation using demographic clusters. In the early period of a new season, when there are not enough historical sales data, the attribute-similarity-based calculation is preferable: QR quantity is computed by analyzing the sales trends of categories or items with similar attributes. When there is enough information for trend analysis or forecasting, the QR Index-based method can be used instead. Having defined the decision models for QR, we designed KPIs (Key Performance Indicators) to test their reliability in critical decisions: the difference in sales volume between QR and non-QR items, the accuracy rate of QR, and the lead time spent on QR decision-making. To verify the effectiveness and practicality of the proposed models, a case study was performed at a representative fashion company that recently developed and launched the IQRS-FF. The case study shows that the average sales rate of QR items increased by 15%, the difference in sales rate between QR and non-QR items increased by 10%, QR accuracy was 70%, and the lead time for QR decreased dramatically from 120 hours to 8 hours.

Evaluation of Non-point source Vulnerable Areas In West Nakdong River Watershed Using TOPSIS (TOPSIS를 이용한 서낙동강 유역 비점오염 취약지역 평가 연구)

  • KAL, Byung-Seok; PARK, Jae-Beom; KIM, Ye-Jin
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.1 / pp.26-39 / 2021
  • This study investigated the characteristics of the watershed and its pollutants in the Seonakdong River basin, in the lower reaches of the Nakdong River system, and evaluated areas vulnerable to non-point pollution by subwatershed using the TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) method. The selection procedure consists of choosing evaluation factors, calculating weights, and selecting vulnerable areas from the evaluation factors and weights. The entropy method was used to calculate the weights, and TOPSIS, a multi-criteria decision making (MCDM) method, was used for the evaluation. Indicator data were collected as of 2018, drawing on national pollution source survey data and national statistics. Most of the vulnerable subwatersheds were highly urbanized, had large resident populations, and were evaluated as having high proportions of industrial facilities and site area. This study indicates that a variety of weighting methodologies should be explored to assess non-point pollution vulnerability with high reliability, along with scientific analysis of the factors affecting non-point pollution sources and consideration of their effects.
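The abstract names entropy weighting plus TOPSIS but gives no formulas. The following is a generic sketch of that pipeline using the textbook definitions (the matrix values and criterion orientations below are invented for illustration, not the study's data):

```python
import numpy as np

def entropy_weights(x):
    """Objective criterion weights by the entropy method: criteria whose
    values vary more across alternatives receive more weight."""
    p = x / x.sum(axis=0)                        # column-wise proportions
    m = x.shape[0]
    e = -(p * np.log(p, where=p > 0, out=np.zeros_like(p))).sum(axis=0) / np.log(m)
    d = 1.0 - e                                  # degree of diversification
    return d / d.sum()

def topsis(x, w, benefit):
    """Closeness coefficient of each alternative to the ideal solution.
    benefit[j] is True if larger values of criterion j are better."""
    v = (x / np.linalg.norm(x, axis=0)) * w      # vector-normalize, then weight
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    worst = np.where(benefit, v.min(axis=0), v.max(axis=0))
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - worst, axis=1)
    return d_neg / (d_pos + d_neg)               # in [0, 1]; 1 = at the ideal

# Hypothetical indicators per subwatershed: urban ratio, population density,
# forest ratio (a cost criterion here, i.e. larger is better for water quality).
x = np.array([[0.2, 30.0, 5.0],
              [0.8, 80.0, 2.0],
              [0.5, 50.0, 3.0]])
scores = topsis(x, entropy_weights(x), np.array([True, True, False]))
```

With vulnerability indicators treated as benefit criteria, the subwatershed with the highest closeness coefficient would be ranked most vulnerable.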

Development and Verification of Approximate Methods for In-Structure Response Spectrum (ISRS) Scaling (구조물내응답스펙트럼 스케일링 근사 방법 개발 및 검증)

  • Shinyoung Kwag; Chaeyeon Go; Seunghyun Eem; Jaewook Jung; In-Kil Choi
    • Journal of the Computational Structural Engineering Institute of Korea / v.37 no.2 / pp.111-118 / 2024
  • An in-structure response spectrum (ISRS) is required to evaluate the seismic performance of a nuclear power plant (NPP). However, when a new ISRS is needed because the site-specific spectrum of the NPP has changed, considerable costs, such as seismic response re-analyses, are incurred. This study provides several approaches for approximate ISRS scaling that do not require seismic response re-analyses. The ISRSs derived with these approaches are compared to the original ISRS, and the effect of an approximate ISRS on the seismic response and seismic performance of one of the main systems of an NPP is analyzed. The ISRS scaling approximation methods presented in this study produce ISRSs that are relatively similar to the original at low frequencies, but the similarity decreases at high frequencies. How strongly an approximation method affects the accuracy of the computed seismic response and seismic performance of the system depends on how similar its ISRS is when calculating the system's essential mode responses.

Community Characteristics of Benthic Macroinvertebrates according to Growth Environment at Rural Palustrine Wetland (농촌지역 소택형습지의 생육환경에 따른 저서성대형무척추동물 군집 특성)

  • Son, Jin-Kwan; Kim, Nam-Choon; Kim, Mi-Heui; Kang, Banghun
    • Journal of the Korean Society of Environmental Restoration Technology / v.15 no.5 / pp.129-144 / 2012
  • This study was conducted to understand the community characteristics of benthic macroinvertebrates according to the growth environment at six palustrine wetlands in a rural area. Size, water depth, water inlet and outlet, land use, and water environment were analyzed as growth environment factors. Quantitative collection of benthic macroinvertebrates was carried out over two years, followed by community analysis, ESB index calculation, TWINSPAN, MDS, and correlation analysis. The collected benthic macroinvertebrates comprised 1,254 individuals in 3 phyla, 6 classes, 14 orders, 35 families, 52 genera, and 61 species; Odonata, and within it Coenagrionidae, had the most species and individuals. The Dominance Index was 0.252~0.698, the Diversity Index 1.661~2.902, the Evenness Index 0.414~0.724, and the Species Richness Index 1.990~6.224. Correlation analysis of the community indices showed that the Dominance Index varied inversely with the Diversity and Evenness Indices, consistent with previous studies. When the ESB index was calculated, Grade 2 (polluted) contained the most species, with 48 species (78.7%). The environmental quality and saprobity evaluation results based on the ESB index suggest that the environmental evaluation system needs to be revised in more detail. MDS analysis showed that sites A and D had the highest similarity, and sites E and D a relatively high similarity. Land use appears to be the environmental factor most closely related to species diversity, while the number of individuals appears most closely related to the water inlet, which can be considered a characteristic of palustrine wetlands. This study is expected to inform various habitat types, including ecological ponds, and to support increasing species diversity in rural areas.
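The four community indices reported above are conventional in benthic studies. Assuming the standard formulations (Shannon diversity H', Pielou evenness J', Margalef richness R, and McNaughton dominance DI; the abstract does not state which variants the authors used), they can be computed from per-species counts as:

```python
from math import log

def community_indices(counts):
    """Conventional community indices from per-species individual counts:
    Shannon diversity H', Pielou evenness J', Margalef richness R,
    and McNaughton dominance DI (share of the two most abundant species)."""
    n = sum(counts)                                        # total individuals
    s = len(counts)                                        # number of species
    h = -sum((c / n) * log(c / n) for c in counts if c > 0)  # Shannon H'
    j = h / log(s)                                         # Pielou J' = H'/ln S
    r = (s - 1) / log(n)                                   # Margalef R
    di = sum(sorted(counts, reverse=True)[:2]) / n         # McNaughton DI
    return h, j, r, di
```

For a perfectly even four-species community, H' equals ln 4 ≈ 1.386 and J' equals 1, while DI falls to 0.5.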

2D/3D image Conversion Method using Simplification of Level and Reduction of Noise for Optical Flow and Information of Edge (Optical flow의 레벨 간소화 및 노이즈 제거와 에지 정보를 이용한 2D/3D 변환 기법)

  • Han, Hyeon-Ho; Lee, Gang-Seong; Lee, Sang-Hun
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.2 / pp.827-833 / 2012
  • In this paper, we propose an improved optical flow algorithm that reduces both computational complexity and noise. The algorithm shortens computation time by applying a level simplification technique and removes noise by using the eigenvectors of objects. Optical flow is one of the more accurate algorithms for generating depth information from two image frames, using vectors that track the motion of pixels; however, its pixel-based calculation makes it very slow and can introduce noise. Here, the level simplification technique reduces the computation time, and noise is removed by applying optical flow only to regions that have an eigenvector, while the edge image is used to generate depth information for the background. Three-dimensional images were then created from two-dimensional images by generating the depth information first and converting it with the DIBR (Depth Image Based Rendering) technique. The error rate was measured using the SSIM (Structural SIMilarity) index.
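SSIM serves here only as the error metric. As a reference point (not the paper's code), the standard SSIM formula with the usual constants C1 = (0.01·L)² and C2 = (0.03·L)² can be sketched over a single window; the full metric averages this quantity over local windows of the image:

```python
def ssim_global(x, y, data_range=255.0):
    """Single-window SSIM between two equal-length pixel sequences,
    using the standard constants C1 = (0.01*L)^2 and C2 = (0.03*L)^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n                       # means
    vx = sum((a - mx) ** 2 for a in x) / n                # variances
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical inputs score 1.0; the further a converted view drifts from the reference, the lower the score.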

A Fast Normalized Cross-Correlation Computation for WSOLA-based Speech Time-Scale Modification (WSOLA 기반의 음성 시간축 변환을 위한 고속의 정규상호상관도 계산)

  • Lim, Sangjun; Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea / v.31 no.7 / pp.427-434 / 2012
  • The waveform similarity overlap-add (WSOLA) method is known to be an efficient, high-quality algorithm for time-scaling speech signals. The computational load of WSOLA is concentrated in the repeated normalized cross-correlation (NCC) calculations used to evaluate the similarity between two signal waveforms. To reduce this complexity, this paper proposes a fast NCC computation method in which the NCC is obtained from pre-calculated sum tables, eliminating the redundancy of repeated NCC calculations in adjacent regions. While the denominator of the NCC has much redundancy regardless of the time-scale factor, the numerator has less, and its amount depends on both the time-scale factor and the optimal shift value, requiring a more sophisticated algorithm for fast computation. Simulation results show that the proposed method reduces the WSOLA execution time by about 40%, 47%, and 52% for time-scale compression and for 2× and 3× time-scale expansion, respectively, while maintaining exactly the same speech quality as conventional WSOLA.
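The paper's sum-table derivation is not reproduced in the abstract. The sketch below illustrates only the denominator idea under that scheme: a cumulative sum of squared samples turns every candidate window's energy (the signal-side term of the NCC denominator) into two table lookups instead of a full pass over the window (names and shapes here are illustrative, not the paper's):

```python
import numpy as np

def window_energies(signal, win):
    """Sum table of squared samples: energy of every length-`win` window
    in O(1) per window, removing the redundancy of repeated denominators."""
    csum = np.concatenate(([0.0], np.cumsum(np.asarray(signal, float) ** 2)))
    return csum[win:] - csum[:-win]

def ncc_at(template, signal, offset, energies):
    """Normalized cross-correlation of `template` with the window at `offset`,
    reusing the precomputed window energies for the denominator."""
    seg = signal[offset:offset + len(template)]
    denom = np.sqrt(np.dot(template, template) * energies[offset])
    return float(np.dot(template, seg) / denom) if denom > 0 else 0.0
```

In a WSOLA-style search, the offset maximizing the NCC over the tolerance region is chosen as the splice point; only the numerator still needs per-offset work.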

A Historical Analysis on Trigonometric Functions (삼각함수 개념의 역사적 분석)

  • Yoo, Jae Geun
    • Journal of Educational Research in Mathematics / v.24 no.4 / pp.607-622 / 2014
  • The purpose of this paper is to analyze the historical development of the concept of trigonometric functions and to discuss some didactical implications. The results of the study are as follows. First, the concept of trigonometric functions developed from line segments measuring ratios into numbers representing those ratios, a process in which geometry, arithmetic, algebra, and analysis were integrated. Second, as the subject developed from practical calculation into a theory of functions, periodicity was formalized, but the 'trigonometric' origin was overlooked. Third, trigonometry must be taught relationally and structurally, based on the principle of similarity. Fourth, the conceptual generalization of trigonometric functions must be recognized as an epistemological obstacle, and teaching should be improved to emphasize the integration revealed in history. These results provide useful suggestions for the teaching and learning of trigonometry.

A New Semantic Distance Measurement Method using TF-IDF in Linked Open Data (링크드 오픈 데이터에서 TF-IDF를 이용한 새로운 시맨틱 거리 측정 기법)

  • Cho, Jung-Gil
    • Journal of the Korea Convergence Society / v.11 no.10 / pp.89-96 / 2020
  • Linked Data allows structured data to be published in a standard way so that datasets from various domains can be interlinked. With the rapid evolution of Linked Open Data (LOD), researchers are exploiting it to solve problems such as semantic similarity assessment. In this paper, we propose a method, built on the basic concept of Linked Data Semantic Distance (LDSD), for calculating the semantic distance between resources that can be used in an LOD-based recommender system. The proposed measurement model combines the LOD-based semantic distance with a new link weight using TF-IDF, which is well known in the field of information retrieval. To verify the effectiveness of this approach, performance was evaluated in the context of an LOD-based recommendation system using mixed data from DBpedia and MovieLens. Experimental results show that the proposed method achieves higher accuracy than other similar methods, and it improved the accuracy of the recommender system by expanding the range of semantic distance calculation.
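The exact TF-IDF link weighting is not specified in the abstract. The sketch below is a hypothetical LDSD-flavored variant (the property and resource names are invented for illustration) in which shared outgoing links shrink the distance between two resources in proportion to an idf-style rarity weight, so ubiquitous links such as a common `rdf:type` count for less than rare ones:

```python
from math import log

def idf(n_resources, df):
    """Idf-style rarity weight for a link shared by `df` of `n_resources`."""
    return log(n_resources / (1 + df))

def weighted_link_distance(links_a, links_b, n_resources, link_df):
    """Hypothetical LDSD-flavored distance between two LOD resources:
    each shared (property, target) link lowers the distance by its idf,
    so rare shared links bring resources closer than common ones."""
    shared = set(links_a) & set(links_b)
    w = sum(idf(n_resources, link_df.get(l, 0)) for l in shared)
    return 1.0 / (1.0 + w)  # in (0, 1]; 1.0 means nothing shared
```

Under this weighting, two films sharing a rare director link end up closer than two films that share only a generic type link, which is the behavior a TF-IDF term contributes to an LDSD-style measure.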