• Title/Summary/Keyword: nearest neighbor rule

Exact BER Analysis of Physical Layer Network Coding for Two-Way Relay Channels (물리 계층 네트워크 코딩을 이용한 양방향 중계 채널에서의 정확한 BER 분석)

  • Park, Moon-Seo;Choi, Il-Hwan;Ahn, Min-Ki;Lee, In-Kyu
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37 no.5A
    • /
    • pp.317-324
    • /
    • 2012
  • Physical layer network coding (PNC) was first introduced by Zhang et al. for two-way relay channels (TWRCs). By utilizing PNC, two-way communication can be completed within two time slots, instead of the three time slots required in non-PNC systems. Recently, upper and lower bounds on the bit error rate (BER) of PNC have been analyzed for fading channels. In this paper, we derive an exact BER of PNC for the TWRC over fading channels. We determine decision regions based on the nearest neighbor rule and partition them into several wedge areas in order to apply Craig's polar coordinate form for computing the BER. We confirm that our derived analysis accurately matches the simulation results.
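
For reference, and not stated in the abstract itself: Craig's polar-coordinate form of the Gaussian Q-function, on which such wedge decompositions rely, is

    Q(x) = \frac{1}{\pi} \int_0^{\pi/2} \exp\!\left( -\frac{x^2}{2\sin^2\theta} \right) d\theta, \qquad x \ge 0,

whose finite integration range is what makes the error probability over each wedge-shaped decision region numerically tractable.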

Spatial Distribution Pattern of Patches of Erythronium japonicum at Mt. Geumjeong in Korea (한국 금정산에 분포하고 있는 얼레지의 공간적 분포 양상과 집단 구조)

  • Man Kyu Huh
    • Journal of Life Science
    • /
    • v.33 no.3
    • /
    • pp.227-233
    • /
    • 2023
  • The purpose of this paper was to provide a statistical analysis of the spatial distribution, in terms of geographical distances, of Erythronium japonicum at Mt. Geumjeong in Korea. The spatial pattern of E. japonicum was analyzed according to the nearest neighbor rule, population aggregation under different plot sizes by dispersion indices, and spatial autocorrelation. Most natural plots of E. japonicum were uniformly distributed in the forest community. Disturbed plots were aggregately distributed within 5 m × 5 m of one another. Neighboring patches of E. japonicum were predominantly 7.5~10 m apart on average. When natural populations of E. japonicum were disturbed by human activities, aggregation occurred at distances shorter than the 7.5~10 m scale. The Morisita index (IM), which is related to the patchiness index (PAI), showed that the 2.5 m × 5 m plots had an overly steep slope in the west and south areas when the area was smaller than 5 m × 5 m. When the patch size was one 2.5 m × 5 m quadrat in the western distribution area of Mt. Geumjeong, clustering was determined by both species characteristics and environmental factors. The comparison of Moran's I values to a logistic regression indicated that the distribution of individuals in E. japonicum populations at Mt. Geumjeong could be explained by isolation by distance.
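
The abstract reports pattern classes without formulas; as a point of reference, a minimal sketch of the Clark-Evans nearest-neighbor index commonly used for exactly this uniform/random/aggregated classification might look as follows (the function name and toy usage are illustrative, not from the paper):

    import numpy as np

    def clark_evans_index(points, area):
        # R < 1: aggregated, R ~ 1: random (Poisson), R > 1: uniform
        pts = np.asarray(points, dtype=float)
        n = len(pts)
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)                 # a point is not its own neighbor
        observed = d.min(axis=1).mean()             # observed mean nearest-neighbor distance
        expected = 1.0 / (2.0 * np.sqrt(n / area))  # expectation under complete randomness
        return observed / expected

    # toy usage: 50 points scattered over a 25 m x 25 m plot
    rng = np.random.default_rng(0)
    print(clark_evans_index(rng.uniform(0.0, 25.0, size=(50, 2)), area=25.0 * 25.0))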

Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

  • Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.1-16
    • /
    • 2015
  • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to statistics of the World Health Organization, approximately 1.24 million deaths occurred on the world's roads in 2010. In order to reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, driver training programs, and so on. Records on traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationships between traffic accidents and related factors, including vehicle design, road design, weather, driver behavior, etc. Insights derived from such analyses can be used for accident prevention. Traffic accident data mining is an activity to find useful knowledge about such relationships that is not yet well known and that users may be interested in. Many studies on mining accident data have been reported over the past two decades. Most studies mainly focused on predicting accident risk using accident-related factors. Supervised learning methods like decision trees, logistic regression, k-nearest neighbor, and neural networks are used for such prediction. However, the prediction models derived by these algorithms are too complex for humans to understand, because the main purpose of these algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups themselves are still not easy for humans to understand, so additional analytic work is necessary. Rule-based learning methods are adequate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent relationships between the target feature and other features. Rules are fairly easy for humans to understand, and can therefore help provide insight and comprehensible results. Association rule learning methods and subgroup discovery methods are representative rule-based learning methods for descriptive tasks. These two families of algorithms have been used in a wide range of areas, from transaction analysis and accident data analysis to the detection of statistically significant patient risk groups and the discovery of key persons in social communities. We use both the association rule learning method and the subgroup discovery method to discover useful patterns from a traffic accident dataset consisting of many features, including driver profile, accident location, accident type, vehicle information, regulation violations, and so on. The association rule learning method, which is an unsupervised learning method, searches for frequent item sets in the data and translates them into rules. In contrast, the subgroup discovery method is a kind of supervised learning method that discovers rules for user-specified concepts satisfying a certain degree of generality and unusualness. Depending on what aspect of the data we focus our attention on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, some postprocessing steps are taken to make the ruleset more compact and easier to understand by removing uninteresting or redundant rules.
We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. Experiments with the traffic accident data reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features. Under a supervised learning setting with a combinatorial target feature, however, the subgroup discovery method finds good rules much more easily than the association rule learning method, which requires a lot of effort to tune its parameters.
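
To make the "generality and unusualness" criterion concrete, the sketch below scores one candidate rule with weighted relative accuracy (WRAcc), a standard subgroup-discovery quality measure: coverage times the lift of the target rate inside the subgroup. The toy accident records and all names are hypothetical, not the paper's dataset:

    def wracc(data, cond, target):
        # coverage * (target rate inside the subgroup - overall target rate)
        covered = [row for row in data if cond(row)]
        if not covered:
            return 0.0
        coverage = len(covered) / len(data)
        p_overall = sum(target(r) for r in data) / len(data)
        p_subgroup = sum(target(r) for r in covered) / len(covered)
        return coverage * (p_subgroup - p_overall)

    # toy records: (weather, road_shape, severe_accident?)
    records = [("rain", "curve", True), ("rain", "curve", True),
               ("clear", "straight", False), ("rain", "straight", False),
               ("clear", "curve", False)]
    # score the rule "IF weather = rain AND road = curve THEN severe"
    print(wracc(records, lambda r: r[0] == "rain" and r[1] == "curve",
                lambda r: r[2]))  # 0.4 * (1.0 - 0.4) = 0.24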

A Study on the Earthwork Volume Computation and Topographic Analysis using DTM Interpolations (DTM 보간기법별 토공량 산정과 지형분석에 관한 연구)

  • Park, Woon-Yong;Kim, Chun-Young;Lee, Hyun-Woo
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.9 no.1 s.17
    • /
    • pp.39-47
    • /
    • 2001
  • The Digital Terrain Model (DTM) plays a key role in a great number of fields of construction engineering. One of the most important applications is volume determination, since total construction expenses are usually calculated from it. It is therefore necessary to study how to improve the precision of volume determination using a DTM, in order to save time and cost. In this study, 1:5000 topographic maps issued by the NGI for 15 districts of Pusan city were first digitized to generate a DTM. The volume, along with the readjusted area and height, was then determined in order to estimate future topographic changes caused by cut-and-fill, and a model for this calculation was provided as a result. In addition, interpolation methods such as the Inverse Distance Method and Nearest Neighbor were compared to examine the differences in the volumes estimated by each interpolation and to find the most suitable method. As a result, the former yielded the largest values of area and volume, while the latter gave the smallest ones. Moreover, the values estimated in this study closely matched those obtained by the government of Pusan.
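
For illustration only, and under assumed names and grid settings rather than the paper's actual procedure, the two interpolators and a grid-based volume estimate could be sketched like this:

    import numpy as np

    def idw(x, y, pts, z, power=2.0, eps=1e-12):
        # inverse-distance weighting: nearer samples get larger weights
        d = np.hypot(pts[:, 0] - x, pts[:, 1] - y) + eps
        w = 1.0 / d**power
        return np.sum(w * z) / np.sum(w)

    def nearest(x, y, pts, z):
        # nearest-neighbor: copy the height of the closest sample
        return z[np.argmin(np.hypot(pts[:, 0] - x, pts[:, 1] - y))]

    def cut_volume(interp, pts, z, ref, nx=50, ny=50, extent=(0, 100, 0, 100)):
        # sum interpolated heights above a reference plane times the cell area
        x0, x1, y0, y1 = extent
        xs, ys = np.linspace(x0, x1, nx), np.linspace(y0, y1, ny)
        cell = ((x1 - x0) / (nx - 1)) * ((y1 - y0) / (ny - 1))
        h = np.array([[interp(x, y, pts, z) - ref for x in xs] for y in ys])
        return h.clip(min=0).sum() * cell

    # comparing cut_volume(idw, ...) with cut_volume(nearest, ...) on the same
    # samples reproduces the kind of method-to-method difference reported above.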

Efficient Methods for Combining User and Article Models for Collaborative Recommendation (협력적 추천을 위한 사용자와 항목 모델의 효율적인 통합 방법)

  • 도영아;김종수;류정우;김명원
    • Journal of KIISE:Software and Applications
    • /
    • v.30 no.5_6
    • /
    • pp.540-549
    • /
    • 2003
  • In collaborative recommendation, two models are generally used: the user model and the article model. A user model learns correlations between users' preferences and recommends an article based on other users' preferences for that article. Similarly, an article model learns correlations between preferences for articles and recommends an article based on the target user's preferences for other articles. In this paper, we investigate various methods of combining the user model and the article model for better recommendation performance. They include simple sequential and parallel methods, the perceptron, the multi-layer perceptron, fuzzy rules, and BKS. We adopt the multi-layer perceptron for training each of the user and article models. The multi-layer perceptron has several advantages over other methods such as the nearest neighbor method and the association rule method: it can learn weights between correlated items, and it can easily handle both symbolic and numeric data. The combined models outperform any of the basic models, and our experiments show that the multi-layer perceptron is the most efficient combination method among them.
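
A minimal sketch of the parallel-combination idea, with a plain linear combiner standing in for the perceptron-style combiners named above (all names are hypothetical, and each base model is assumed to already output a rating prediction):

    import numpy as np

    def train_combiner(pred_user, pred_item, ratings, lr=0.01, epochs=500):
        # learn [bias, w_user, w_item] mixing the two models' predictions
        X = np.column_stack([np.ones_like(pred_user), pred_user, pred_item])
        w = np.zeros(3)
        for _ in range(epochs):
            grad = X.T @ (X @ w - ratings) / len(ratings)  # gradient of MSE/2
            w -= lr * grad
        return w

    # toy usage: two models' predictions for five articles, plus true ratings
    pu = np.array([4.1, 2.0, 3.5, 4.8, 1.2])
    pi = np.array([3.9, 2.5, 3.0, 4.5, 1.8])
    w = train_combiner(pu, pi, np.array([4.0, 2.0, 3.2, 5.0, 1.0]))
    combined = np.column_stack([np.ones_like(pu), pu, pi]) @ w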

Optimal Associative Neighborhood Mining using Representative Attribute (대표 속성을 이용한 최적 연관 이웃 마이닝)

  • Jung Kyung-Yong
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.43 no.4 s.310
    • /
    • pp.50-57
    • /
    • 2006
  • In electronic commerce, most recent personalized recommender systems have applied the collaborative filtering technique. This method calculates similarity weights among users who have similar preferences in order to predict and recommend items that match a user's propensity. The Pearson correlation coefficient is commonly used for this purpose. However, this method can calculate a correlation only when there are items for which both users have expressed a preference in common; accordingly, prediction accuracy falls. The similarity weight affects not only the prediction of items that match a user's propensity but also the performance of the personalized recommender system. In this study, we verify the improvement in prediction accuracy through experiments, after examining similarity weighting schemes based on vector similarity, entropy, inverse user frequency, and default voting from the information retrieval field. The results show that the method combining the entropy-based similarity weight with default voting achieves the best performance.
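
As a sketch of how default voting extends the Pearson weight beyond co-rated items, the following compares two users over the union of their rated items, filling the gaps with a default score (the default value and all names are illustrative assumptions):

    import numpy as np

    def pearson_default_voting(ra, rb, default=2.5):
        # default voting: correlate over the union of rated items,
        # not just the co-rated intersection, using a neutral default score
        items = sorted(set(ra) | set(rb))
        a = np.array([ra.get(i, default) for i in items], dtype=float)
        b = np.array([rb.get(i, default) for i in items], dtype=float)
        a, b = a - a.mean(), b - b.mean()
        denom = np.sqrt((a @ a) * (b @ b))
        return float(a @ b / denom) if denom else 0.0

    # only one co-rated item ("y"), yet a similarity weight is still computable
    print(pearson_default_voting({"x": 5, "y": 4}, {"y": 5, "z": 1}))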

Design of knowledge search algorithm for PHR based personalized health information system (PHR 기반 개인 맞춤형 건강정보 탐사 알고리즘 설계)

  • SHIN, Moon-Sun
    • Journal of Digital Convergence
    • /
    • v.15 no.4
    • /
    • pp.191-198
    • /
    • 2017
  • A PHR-based personal health care service platform needs to support intelligent, customized health information services for user convenience. In this paper, we specify an ontology-based health data model for such a platform. We also design a knowledge search algorithm that can be used to identify similar health records by applying machine learning and data mining techniques. The proposed axis-based mining algorithm operates on axis attributes in order to improve the relevance of knowledge exploration and to provide efficient search times by reducing the size of the candidate item set. The K-nearest neighbor algorithm is then used to group users according to the similarity of their user profiles. These algorithms improve the efficiency of customized information exploration according to the user's disease and health condition. Applying the proposed algorithms to the inference process of the personal health care service platform makes it possible to recommend customized health information to the user, which is useful for managing smart health care in an aging society.
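
A minimal sketch of the K-nearest-neighbor grouping step, assuming user profiles are already encoded as numeric vectors (the names, features, and distance choice are illustrative, not from the paper):

    import numpy as np

    def k_nearest_profiles(profiles, query, k=3):
        # indices of the k profiles most similar to the query (Euclidean distance)
        P = np.asarray(profiles, dtype=float)
        d = np.linalg.norm(P - np.asarray(query, dtype=float), axis=1)
        return np.argsort(d)[:k]

    # toy profiles: [age, bmi, systolic_bp], all hypothetical
    users = [[34, 22.1, 118], [61, 27.4, 142], [58, 26.8, 139], [25, 20.3, 110]]
    print(k_nearest_profiles(users, query=[60, 27.0, 140], k=2))  # -> [2 1]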

Cable anomaly detection driven by spatiotemporal correlation dissimilarity measurements of bridge grouped cable forces

  • Yang, Dong-Hui;Gu, Hai-Lun;Yi, Ting-Hua;Wu, Zhan-Jun
    • Smart Structures and Systems
    • /
    • v.30 no.6
    • /
    • pp.661-671
    • /
    • 2022
  • Stayed cables are the key components for transmitting loads in cable-stayed bridges. Therefore, it is very important to evaluate the cable force condition to ensure bridge safety. An online condition assessment and anomaly localization method is proposed for cables based on the spatiotemporal correlation of grouped cable forces. First, an anomaly-sensitive feature index is obtained based on the distribution characteristics of grouped cable forces. Second, an adaptive anomaly detection method based on the k-nearest neighbor rule is used to perform dissimilarity measurements on the extracted feature index, and such a method can effectively remove the interference of environmental factors and vehicle loads on online condition assessment of the grouped cable forces. Furthermore, an online anomaly isolation and localization method for stay cables is established, and the complete decomposition contributions method is used to decompose the feature matrix of the grouped cable forces and build an anomaly isolation index. Finally, case studies were carried out to validate the proposed method using an in-service cable-stayed bridge equipped with a structural health monitoring system. The results show that the proposed approach is sensitive to the abnormal distribution of grouped cable forces and is robust to the influence of interference factors. In addition, the proposed approach can also localize the cables with abnormal cable forces online, and it can be successfully applied to the field monitoring of cables for cable-stayed bridges.
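
A minimal sketch of the k-nearest-neighbor dissimilarity idea, under the assumption that feature indices from a healthy reference period are available as vectors (names and the threshold choice are illustrative, not the paper's implementation):

    import numpy as np

    def knn_dissimilarity(reference, samples, k=5):
        # score = mean distance to the k nearest reference feature vectors;
        # large scores flag distributions unlike the healthy baseline
        R = np.asarray(reference, dtype=float)
        scores = []
        for s in np.asarray(samples, dtype=float):
            d = np.sort(np.linalg.norm(R - s, axis=1))
            scores.append(d[:k].mean())
        return np.array(scores)

    # a sample would be declared anomalous if its score exceeds a baseline
    # quantile, e.g. the 99th percentile of knn_dissimilarity(reference, reference).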

An effective automated ontology construction based on the agriculture domain

  • Deepa, Rajendran;Vigneshwari, Srinivasan
    • ETRI Journal
    • /
    • v.44 no.4
    • /
    • pp.573-587
    • /
    • 2022
  • The agricultural sector is completely different from other sectors, since it relies entirely on various natural and climatic factors. Climate change has many effects on land and agriculture, including a lack of annual rainfall, pests, heat waves, changes in sea level, and global ozone/atmospheric CO2 fluctuations; it also affects the environment. Based on these factors, farmers choose their crops to increase productivity in their fields. Many existing agricultural ontologies are either domain-specific or have been created with minimal vocabulary, and no proper evaluation framework has been implemented. A new agricultural ontology focused on subdomains is designed to assist farmers using a Jaccard relative extractor (JRE) and the Naïve Bayes algorithm. The JRE is used to find the similarity between two sentences and words in the agricultural documents, and the relationship between two terms is identified via the Naïve Bayes algorithm. In the proposed method, preprocessing of the data is carried out through natural language processing techniques, and the dimension-reduced tags are subjected to rule-based formal concept analysis and mapping. The subdomain ontologies of weather, pest, and soil are built separately, and the overall agricultural ontology is built around them. The gold standard for the lexical layer is used to evaluate the proposed technique, and its performance is analyzed by comparing it with different state-of-the-art systems. Precision, recall, F-measure, Matthews correlation coefficient, receiver operating characteristic curve area, and precision-recall curve area are the performance metrics used. The proposed methodology gives a precision score of 94.40%, compared with the decision tree (83.94%) and the K-nearest neighbor algorithm (86.89%), for agricultural ontology construction.
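
For reference, the Jaccard similarity that a Jaccard relative extractor builds on compares two token sets by intersection over union; a minimal sketch follows (the example sentences are made up, not from the paper's corpus):

    def jaccard(tokens_a, tokens_b):
        # |A ∩ B| / |A ∪ B| over the two token sets
        a, b = set(tokens_a), set(tokens_b)
        return len(a & b) / len(a | b) if a | b else 0.0

    print(jaccard("heavy rain damages paddy crops".split(),
                  "rain and pests damage crops".split()))  # 2 / 8 = 0.25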

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.21-44
    • /
    • 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data, produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization, a problem that has drawn the interest of many researchers. Managing this information requires methods capable of classifying it, and hence text classification is introduced. Text classification is a challenging task in modern data analysis: it assigns a text document to one or more predefined categories or classes. In the text classification field, different kinds of techniques are available, such as K-Nearest Neighbor, the Naïve Bayes algorithm, Support Vector Machines, Decision Trees, and Artificial Neural Networks. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. Depending on the type of words used in the corpus and the type of features created for classification, the performance of a text classification model can vary. Most previous attempts have proposed a new algorithm or modified an existing one; this line of research can be said to have reached its limits for further improvement. In this study, instead of proposing or modifying an algorithm, we focus on finding a way to modify the use of the data. It is widely known that classifier performance is influenced by the quality of the training data upon which the classifier is built. Real-world datasets most of the time contain noise, which can affect the decisions made by classifiers built from such data. In this study, we consider that data from different domains, i.e., heterogeneous data, may have noise-like characteristics that can be utilized in the classification process. To build a classifier, machine learning algorithms are applied under the assumption that the characteristics of the training data and the target data are the same or very similar. However, in the case of unstructured data such as text, the features are determined by the vocabulary included in the documents. If the viewpoints of the training data and the target data differ, the features may appear different between the two. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. Data coming from various kinds of sources are likely formatted differently, which causes difficulties for traditional machine learning algorithms because they are not designed to recognize different types of data representation at once and to combine them into the same generalization. Therefore, in order to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning in our study. However, unlabeled data may degrade the performance of the document classifier.
Therefore, we further propose a method called the Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA) to select only the documents that contribute to the accuracy improvement of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision making. In this paper, three different types of real-world data sources were used, namely news, Twitter, and blogs.
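
The following is a minimal co-training-style sketch of the view-agreement idea behind RSESLA: an unlabeled document receives a pseudo-label only when classifiers trained on different feature views agree on it with high confidence. This is a simplification under assumed inputs, not the paper's actual rule-selection procedure:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def select_pseudo_labels(views_labeled, y, views_unlabeled, threshold=0.9):
        # train one classifier per feature view, then keep only the unlabeled
        # documents on which all views agree with high average confidence
        probs, labels = [], []
        for X_lab, X_unlab in zip(views_labeled, views_unlabeled):
            clf = LogisticRegression(max_iter=1000).fit(X_lab, y)
            p = clf.predict_proba(X_unlab)
            probs.append(p)
            labels.append(p.argmax(axis=1))
        agree = np.all([l == labels[0] for l in labels], axis=0)
        confident = np.mean(probs, axis=0).max(axis=1) >= threshold
        keep = agree & confident
        return np.where(keep)[0], labels[0][keep]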