• Title/Summary/Keyword: Package software

Search Result 715, Processing Time 0.022 seconds

Research Trends on Soil Erosion Control Engineering in North Korea (북한의 사방공학 분야 연구동향 분석)

  • Kim, Kidae;Kang, Minjeng;Kim, Dongyeob;Lee, Changwoo;Woo, Choongshik;Seo, Junpyo
    • Journal of Korean Society of Forest Science
    • /
    • v.108 no.4
    • /
    • pp.469-483
    • /
    • 2019
  • North Korea has experienced floods and sediment-related disasters annually since the 1970s due to deforestation. It is of paramount importance that technologies and trends related to forest restoration and soil erosion control engineering be properly understood in a bid to reduce damage from sediment-related disasters in North Korea, and to effect national territorial management following unification. This paper presents a literature review and bibliometric analysis including 146 related articles published in North Korea. First, we analyzed the textual characteristics of the articles. We then employed the VOSviewer software package to classify the research topic and analyzed this topic based on the time change. The results showed that articles on the topic have consistently increased since the 1990s. In addition, research related to soil erosion control engineering has been classified into four subjects in North Korea: (i) assessment of hazard area on soil erosion and soil loss, sediment related-disasters; (ii) hydraulic and hydrologic understanding of forests; (iii) reasonable construction of soil erosion control structures; and (iv) effects and management plan of soil erosion control works. The proportion of research related to the (ii) hydraulic and hydrologic understanding of forests had been significant during the reign of Kim Ilsung. However, the proportion of research related to the (i) assessment of hazard area on soil erosion and soil loss, sediment-related disasters, increased during the reign of Kim Jongil and Kim Jongun. Using these results, our analysis indicated that an interest in and need for soil erosion control engineering in North Korea has continually increased. The results of this study are expected to serve as a basis for preparing forestry cooperation between North and South Korea, and to serve as essential data for better understanding soil erosion control engineering in North Korea.

Optimal Selection of Classifier Ensemble Using Genetic Algorithms (유전자 알고리즘을 이용한 분류자 앙상블의 최적 선택)

  • Kim, Myung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.99-112
    • /
    • 2010
  • Ensemble learning is a method for improving the performance of classification and prediction algorithms. It is a method for finding a highly accurateclassifier on the training set by constructing and combining an ensemble of weak classifiers, each of which needs only to be moderately accurate on the training set. Ensemble learning has received considerable attention from machine learning and artificial intelligence fields because of its remarkable performance improvement and flexible integration with the traditional learning algorithms such as decision tree (DT), neural networks (NN), and SVM, etc. In those researches, all of DT ensemble studies have demonstrated impressive improvements in the generalization behavior of DT, while NN and SVM ensemble studies have not shown remarkable performance as shown in DT ensembles. Recently, several works have reported that the performance of ensemble can be degraded where multiple classifiers of an ensemble are highly correlated with, and thereby result in multicollinearity problem, which leads to performance degradation of the ensemble. They have also proposed the differentiated learning strategies to cope with performance degradation problem. Hansen and Salamon (1990) insisted that it is necessary and sufficient for the performance enhancement of an ensemble that the ensemble should contain diverse classifiers. Breiman (1996) explored that ensemble learning can increase the performance of unstable learning algorithms, but does not show remarkable performance improvement on stable learning algorithms. Unstable learning algorithms such as decision tree learners are sensitive to the change of the training data, and thus small changes in the training data can yield large changes in the generated classifiers. Therefore, ensemble with unstable learning algorithms can guarantee some diversity among the classifiers. To the contrary, stable learning algorithms such as NN and SVM generate similar classifiers in spite of small changes of the training data, and thus the correlation among the resulting classifiers is very high. This high correlation results in multicollinearity problem, which leads to performance degradation of the ensemble. Kim,s work (2009) showedthe performance comparison in bankruptcy prediction on Korea firms using tradition prediction algorithms such as NN, DT, and SVM. It reports that stable learning algorithms such as NN and SVM have higher predictability than the unstable DT. Meanwhile, with respect to their ensemble learning, DT ensemble shows the more improved performance than NN and SVM ensemble. Further analysis with variance inflation factor (VIF) analysis empirically proves that performance degradation of ensemble is due to multicollinearity problem. It also proposes that optimization of ensemble is needed to cope with such a problem. This paper proposes a hybrid system for coverage optimization of NN ensemble (CO-NN) in order to improve the performance of NN ensemble. Coverage optimization is a technique of choosing a sub-ensemble from an original ensemble to guarantee the diversity of classifiers in coverage optimization process. CO-NN uses GA which has been widely used for various optimization problems to deal with the coverage optimization problem. The GA chromosomes for the coverage optimization are encoded into binary strings, each bit of which indicates individual classifier. The fitness function is defined as maximization of error reduction and a constraint of variance inflation factor (VIF), which is one of the generally used methods to measure multicollinearity, is added to insure the diversity of classifiers by removing high correlation among the classifiers. We use Microsoft Excel and the GAs software package called Evolver. Experiments on company failure prediction have shown that CO-NN is effectively applied in the stable performance enhancement of NNensembles through the choice of classifiers by considering the correlations of the ensemble. The classifiers which have the potential multicollinearity problem are removed by the coverage optimization process of CO-NN and thereby CO-NN has shown higher performance than a single NN classifier and NN ensemble at 1% significance level, and DT ensemble at 5% significance level. However, there remain further research issues. First, decision optimization process to find optimal combination function should be considered in further research. Secondly, various learning strategies to deal with data noise should be introduced in more advanced further researches in the future.

An Efficient Heuristic for Storage Location Assignment and Reallocation for Products of Different Brands at Internet Shopping Malls for Clothing (의류 인터넷 쇼핑몰에서 브랜드를 고려한 상품 입고 및 재배치 방법 연구)

  • Song, Yong-Uk;Ahn, Byung-Hyuk
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.2
    • /
    • pp.129-141
    • /
    • 2010
  • An Internet shopping mall for clothing operates a warehouse for packing and shipping products to fulfill its orders. All the products in the warehouse are put into the boxes of same brands and the boxes are stored in a row on shelves equiped in the warehouse. To make picking and managing easy, boxes of the same brands are located side by side on the shelves. When new products arrive to the warehouse for storage, the products of a brand are put into boxes and those boxes are located adjacent to the boxes of the same brand. If there is not enough space for the new coming boxes, however, some boxes of other brands should be moved away and then the new coming boxes are located adjacent in the resultant vacant spaces. We want to minimize the movement of the existing boxes of other brands to another places on the shelves during the warehousing of new coming boxes, while all the boxes of the same brand are kept side by side on the shelves. Firstly, we define the adjacency of boxes by looking the shelves as an one dimensional series of spaces to store boxes, i.e. cells, tagging the series of cells by a series of numbers starting from one, and considering any two boxes stored in the cells to be adjacent to each other if their cell numbers are continuous from one number to the other number. After that, we tried to formulate the problem into an integer programming model to obtain an optimal solution. An integer programming formulation and Branch-and-Bound technique for this problem may not be tractable because it would take too long time to solve the problem considering the number of the cells or boxes in the warehouse and the computing power of the Internet shopping mall. As an alternative approach, we designed a fast heuristic method for this reallocation problem by focusing on just the unused spaces-empty cells-on the shelves, which results in an assignment problem model. In this approach, the new coming boxes are assigned to each empty cells and then those boxes are reorganized so that the boxes of a brand are adjacent to each other. The objective of this new approach is to minimize the movement of the boxes during the reorganization process while keeping the boxes of a brand adjacent to each other. The approach, however, does not ensure the optimality of the solution in terms of the original problem, that is, the problem to minimize the movement of existing boxes while keeping boxes of the same brands adjacent to each other. Even though this heuristic method may produce a suboptimal solution, we could obtain a satisfactory solution within a satisfactory time, which are acceptable by real world experts. In order to justify the quality of the solution by the heuristic approach, we generate 100 problems randomly, in which the number of cells spans from 2,000 to 4,000, solve the problems by both of our heuristic approach and the original integer programming approach using a commercial optimization software package, and then compare the heuristic solutions with their corresponding optimal solutions in terms of solution time and the number of movement of boxes. We also implement our heuristic approach into a storage location assignment system for the Internet shopping mall.

Performance Optimization of Numerical Ocean Modeling on Cloud Systems (클라우드 시스템에서 해양수치모델 성능 최적화)

  • JUNG, KWANGWOOG;CHO, YANG-KI;TAK, YONG-JIN
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.27 no.3
    • /
    • pp.127-143
    • /
    • 2022
  • Recently, many attempts to run numerical ocean models in cloud computing environments have been tried actively. A cloud computing environment can be an effective means to implement numerical ocean models requiring a large-scale resource or quickly preparing modeling environment for global or large-scale grids. Many commercial and private cloud computing systems provide technologies such as virtualization, high-performance CPUs and instances, ether-net based high-performance-networking, and remote direct memory access for High Performance Computing (HPC). These new features facilitate ocean modeling experimentation on commercial cloud computing systems. Many scientists and engineers expect cloud computing to become mainstream in the near future. Analysis of the performance and features of commercial cloud services for numerical modeling is essential in order to select appropriate systems as this can help to minimize execution time and the amount of resources utilized. The effect of cache memory is large in the processing structure of the ocean numerical model, which processes input/output of data in a multidimensional array structure, and the speed of the network is important due to the communication characteristics through which a large amount of data moves. In this study, the performance of the Regional Ocean Modeling System (ROMS), the High Performance Linpack (HPL) benchmarking software package, and STREAM, the memory benchmark were evaluated and compared on commercial cloud systems to provide information for the transition of other ocean models into cloud computing. Through analysis of actual performance data and configuration settings obtained from virtualization-based commercial clouds, we evaluated the efficiency of the computer resources for the various model grid sizes in the virtualization-based cloud systems. We found that cache hierarchy and capacity are crucial in the performance of ROMS using huge memory. The memory latency time is also important in the performance. Increasing the number of cores to reduce the running time for numerical modeling is more effective with large grid sizes than with small grid sizes. Our analysis results will be helpful as a reference for constructing the best computing system in the cloud to minimize time and cost for numerical ocean modeling.

A Study on Differences of Contents and Tones of Arguments among Newspapers Using Text Mining Analysis (텍스트 마이닝을 활용한 신문사에 따른 내용 및 논조 차이점 분석)

  • Kam, Miah;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.53-77
    • /
    • 2012
  • This study analyses the difference of contents and tones of arguments among three Korean major newspapers, the Kyunghyang Shinmoon, the HanKyoreh, and the Dong-A Ilbo. It is commonly accepted that newspapers in Korea explicitly deliver their own tone of arguments when they talk about some sensitive issues and topics. It could be controversial if readers of newspapers read the news without being aware of the type of tones of arguments because the contents and the tones of arguments can affect readers easily. Thus it is very desirable to have a new tool that can inform the readers of what tone of argument a newspaper has. This study presents the results of clustering and classification techniques as part of text mining analysis. We focus on six main subjects such as Culture, Politics, International, Editorial-opinion, Eco-business and National issues in newspapers, and attempt to identify differences and similarities among the newspapers. The basic unit of text mining analysis is a paragraph of news articles. This study uses a keyword-network analysis tool and visualizes relationships among keywords to make it easier to see the differences. Newspaper articles were gathered from KINDS, the Korean integrated news database system. KINDS preserves news articles of the Kyunghyang Shinmun, the HanKyoreh and the Dong-A Ilbo and these are open to the public. This study used these three Korean major newspapers from KINDS. About 3,030 articles from 2008 to 2012 were used. International, national issues and politics sections were gathered with some specific issues. The International section was collected with the keyword of 'Nuclear weapon of North Korea.' The National issues section was collected with the keyword of '4-major-river.' The Politics section was collected with the keyword of 'Tonghap-Jinbo Dang.' All of the articles from April 2012 to May 2012 of Eco-business, Culture and Editorial-opinion sections were also collected. All of the collected data were handled and edited into paragraphs. We got rid of stop-words using the Lucene Korean Module. We calculated keyword co-occurrence counts from the paired co-occurrence list of keywords in a paragraph. We made a co-occurrence matrix from the list. Once the co-occurrence matrix was built, we used the Cosine coefficient matrix as input for PFNet(Pathfinder Network). In order to analyze these three newspapers and find out the significant keywords in each paper, we analyzed the list of 10 highest frequency keywords and keyword-networks of 20 highest ranking frequency keywords to closely examine the relationships and show the detailed network map among keywords. We used NodeXL software to visualize the PFNet. After drawing all the networks, we compared the results with the classification results. Classification was firstly handled to identify how the tone of argument of a newspaper is different from others. Then, to analyze tones of arguments, all the paragraphs were divided into two types of tones, Positive tone and Negative tone. To identify and classify all of the tones of paragraphs and articles we had collected, supervised learning technique was used. The Na$\ddot{i}$ve Bayesian classifier algorithm provided in the MALLET package was used to classify all the paragraphs in articles. After classification, Precision, Recall and F-value were used to evaluate the results of classification. Based on the results of this study, three subjects such as Culture, Eco-business and Politics showed some differences in contents and tones of arguments among these three newspapers. In addition, for the National issues, tones of arguments on 4-major-rivers project were different from each other. It seems three newspapers have their own specific tone of argument in those sections. And keyword-networks showed different shapes with each other in the same period in the same section. It means that frequently appeared keywords in articles are different and their contents are comprised with different keywords. And the Positive-Negative classification showed the possibility of classifying newspapers' tones of arguments compared to others. These results indicate that the approach in this study is promising to be extended as a new tool to identify the different tones of arguments of newspapers.