• Title/Summary/Keyword: Learning Repository

Case Study on Software Education using Social Coding Sites (소셜 코딩 사이트를 활용한 소프트웨어 교육 사례 연구)

  • Kang, Hwan-Soo;Cho, Jin-Hyung;Kim, Hee-Chern
    • Journal of Digital Convergence / v.15 no.5 / pp.37-48 / 2017
  • The importance of software education is growing because the computational thinking it develops is recognized as a key means of future economic development. The people who will lead the Fourth Industrial Revolution need convergence skills and creativity, and computational thinking grounded in critical thinking, communication, and collaborative learning is known to be effective in creativity education. Software education also needs to reflect current practices such as collaboration among developers who share interests and open source development methods. GitHub is a leading social coding site that facilitates collaborative work among developers and supports community activities in open software development. In this study, we present operational cases of using GitHub for basic learning about social coding sites, as a storage server for lecture sources and outputs, and for open collaborative learning. We also propose an educational model consisting of four stages: Introduction to GitHub, Using Repository, Applying Social Coding, and Making Personal Portfolio and Assessment. The proposed model is effective for software education because it attracts students' interest and gives them a sense of pride.

Improvements in Patch-Based Machine Learning for Analyzing Three-Dimensional Seismic Sequence Data (3차원 탄성파자료의 층서구분을 위한 패치기반 기계학습 방법의 개선)

  • Lee, Donguk;Moon, Hye-Jin;Kim, Chung-Ho;Moon, Seonghoon;Lee, Su Hwan;Jou, Hyeong-Tae
    • Geophysics and Geophysical Exploration / v.25 no.2 / pp.59-70 / 2022
  • Recent studies show that machine learning is increasingly used in seismic interpretation. Many convolutional neural networks have been developed for seismic sequence identification, which is an important part of seismic interpretation. However, cost and time constraints mean that there is not enough data to train supervised machine learning models to identify seismic sequences. In this study, patch division and data augmentation are applied to mitigate this lack of data. Furthermore, to retain spatial information that could be lost during patch division, an artificial channel indicating depth is added to the original data. Seismic sequence identification is performed using a U-Net network and the Netherlands F3 block dataset from the dGB Open Seismic Repository, which offers datasets for machine learning, and the predicted results are evaluated. The results show that patch-based U-Net seismic sequence identification is improved by data augmentation and the addition of the artificial channel.
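The patch division and depth channel described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a single 2D section rather than a 3D volume, the function names (`add_depth_channel`, `split_into_patches`, `flip_horizontal`) are hypothetical, and the horizontal flip is only an assumed example of data augmentation, since the abstract does not say which augmentations were used.

```python
from typing import List, Tuple

Grid = List[List[float]]  # rows = depth samples, columns = traces


def add_depth_channel(section: Grid) -> List[List[Tuple[float, float]]]:
    """Pair every amplitude with a normalized depth (0.0 at the top row, 1.0 at the bottom).

    The normalization is an assumption; the abstract only says that a channel
    indicating depth is added.
    """
    scale = max(len(section) - 1, 1)
    return [[(amp, row / scale) for amp in line] for row, line in enumerate(section)]


def split_into_patches(data, patch: int, stride: int):
    """Cut a 2D grid into square patches, stepping by `stride` along both axes."""
    n_rows, n_cols = len(data), len(data[0])
    patches = []
    for top in range(0, n_rows - patch + 1, stride):
        for left in range(0, n_cols - patch + 1, stride):
            patches.append([line[left:left + patch] for line in data[top:top + patch]])
    return patches


def flip_horizontal(patch_data):
    """Assumed augmentation: mirror a patch along the trace axis."""
    return [list(reversed(line)) for line in patch_data]


if __name__ == "__main__":
    # Toy 4x4 section; real data would be amplitudes from a seismic volume.
    section = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    patches = split_into_patches(add_depth_channel(section), patch=2, stride=2)
    print(len(patches))  # 4
```

Because the depth value is attached before the section is cut, every patch keeps a record of where it came from vertically, which is the information the abstract says would otherwise be lost.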

Increasing Splicing Site Prediction by Training Gene Set Based on Species

  • Ahn, Beunguk;Abbas, Elbashir;Park, Jin-Ah;Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.11 / pp.2784-2799 / 2012
  • Biological data have increased exponentially in recent years, and analyzing these data with data mining tools has become one of the major issues in the bioinformatics research community. This paper focuses on the protein construction process in higher organisms, in which the deoxyribonucleic acid (DNA) sequence is filtered. In this process, "unmeaningful" DNA sub-sequences (called introns) are removed, and their meaningful counterparts (called exons) are retained. Accurate recognition of the boundaries between these two classes of sub-sequences, however, is known to be a difficult problem. Conventional approaches to recognizing these boundaries have focused solely on enhancing machine learning techniques, while the inherent nature of the data themselves has been overlooked. In this paper, we present an approach that uses the data attributes inherent to each species to increase the accuracy of boundary recognition. For the experiments, we took data sets for four different species from the University of California Santa Cruz (UCSC) data repository, divided the data sets by species type, and then trained neural network (NN)-based and support vector machine (SVM)-based classifiers on a preprocessed version of the data sets. We observed that each species has its own specific features related to the splice sites, which implies that there are related distances among species. We conclude that dividing the training data set by species increases the accuracy of splice junction prediction and offers new insight for biological research.

Propositionalized Attribute Taxonomy Guided Naive Bayes Learning Algorithm (명제화된 어트리뷰트 택소노미를 이용하는 나이브 베이스 학습 알고리즘)

  • Kang, Dae-Ki;Cha, Kyung-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.12 / pp.2357-2364 / 2008
  • In this paper, we consider the problem of exploiting a taxonomy of propositionalized attributes in order to generate compact and robust classifiers. We introduce the Propositionalized Attribute Taxonomy guided Naive Bayes Learner (PAT-NBL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact and accurate classifiers. PAT-NBL uses top-down and bottom-up search to find a locally optimal cut that corresponds to the instance space from the propositionalized attribute taxonomy and the data. Our experimental results on data sets from the University of California, Irvine (UCI) repository show that the proposed algorithm can generate a classifier that is sometimes comparable in compactness and accuracy to those produced by standard Naive Bayes learners.

Distributed Genetic Algorithm using Automatic Migration Control (분산 유전 알고리즘에서 자동 마이그레이션 조절방법)

  • Lee, Hyun-Jung;Na, Yong-Chan;Yang, Ji-Hoon
    • The KIPS Transactions:PartB / v.17B no.2 / pp.157-162 / 2010
  • We present a new distributed genetic algorithm that can be used to extract useful information from large data distributed over a network. The main idea of the proposed algorithm is to adaptively determine how many and which individuals move between the subpopulations at each site. In addition, we present a method that helps individuals arriving from other subpopulations adapt to the new subpopulation instead of being weeded out. We used six data sets from the UCI Machine Learning Repository to compare the performance of our approach with that of a single, centralized genetic algorithm. The proposed algorithm outperformed the single genetic algorithm in terms of classification accuracy with the selected feature subsets.

The Design of Granular-based Radial Basis Function Neural Network by Context-based Clustering (Context-based 클러스터링에 의한 Granular-based RBF NN의 설계)

  • Park, Ho-Sung;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.6 / pp.1230-1237 / 2009
  • In this paper, we develop a design methodology for Granular-based Radial Basis Function Neural Networks (GRBFNN) using context-based clustering. In contrast with the plethora of existing approaches, we promote a development strategy in which the topology of the network is predominantly based on a collection of information granules formed from the available experimental data. The output space is granulated using K-Means clustering, while the input space is clustered with the aid of so-called context-based fuzzy clustering. The number of information granules produced for each context is adjusted to satisfy a reconstructability criterion, which helps minimize the error between the original data and the data reconstructed from the cluster prototypes and the corresponding membership values. In contrast to "standard" Radial Basis Function neural networks, the output neuron of the network has a functional nature: its connections are realized as local linear functions whose location is determined by the values of the context and the prototypes in the input space. The other parameters of these local functions are subject to further parametric optimization. Numeric examples involve some low-dimensional synthetic data and selected data from the Machine Learning repository.

Value Weighted Regularized Logistic Regression Model (속성값 기반의 정규화된 로지스틱 회귀분석 모델)

  • Lee, Chang-Hwan;Jung, Mina
    • Journal of KIISE / v.43 no.11 / pp.1270-1274 / 2016
  • Logistic regression is widely used for predicting and estimating the relationship among variables. We propose a new logistic regression model, the value weighted logistic regression, which uses a fine-grained weighting method that assigns an adapted weight to each feature value. The optimal weights of the feature values are obtained with a gradient approach. Experiments were conducted on several data sets from the UCI Machine Learning Repository, and the results show that the proposed method achieves a meaningful improvement in prediction accuracy.

THE USE OF NUMERICAL MODELS IN SUPPORT OF SITE CHARACTERIZATION AND PERFORMANCE ASSESSMENT STUDIES FOR GEOLOGICAL REPOSITORIES

  • Neerdael, Bernard;Finsterle, Stefan
    • Nuclear Engineering and Technology / v.42 no.2 / pp.145-150 / 2010
  • This paper describes work carried out within a 5-year IAEA Coordinated Research Programme (CRP) that started in late 2005. Participants gained knowledge of modelling methodologies and experience in the development and use of rather sophisticated simulation tools in support of site characterization and performance assessment calculations. These goals were achieved through a coordinated effort in which the advantages and limitations of numerical models were examined and demonstrated through a comparative analysis of simplified, illustrative test cases. This knowledge and experience should help the participants address these issues in their own countries' nuclear waste programs. Coordination efforts during the first three years of the project aimed at enabling this transfer of expertise and maximizing the learning experience of the participants as a group. This was accomplished by identifying common interests of the participants (i.e., Process Modelling and Total System Performance Assessment methodology) and by defining complementary tasks to be solved by the members. A synthesis of all available results by comparative assessment is planned for the coming months. The project will be completed at the end of 2010. This paper summarizes activities up to November 2009.

Identifying Classes for Classification of Potential Liver Disorder Patients by Unsupervised Learning with K-means Clustering (K-means 클러스터링을 이용한 자율학습을 통한 잠재적간 질환 환자의 분류를 위한 계층 정의)

  • Kim, Jun-Beom;Oh, Kyo-Joong;Oh, Keun-Whee;Choi, Ho-Jin
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.195-197 / 2011
  • This research deals with an issue of preventive medicine in bioinformatics. Liver conditions can be diagnosed reasonably well, and liver cirrhosis prevented, by classifying liver disorder patients into fatty liver and high risk groups. The classification proceeds in two steps. Classification rules are first built by clustering five attributes (MCV, ALP, ALT, ASP, and GGT) of the blood test dataset provided by the UCI Repository. The clusters are formed with the K-means method, which analyzes multi-dimensional attributes. We analyze the properties of each cluster, dividing the clusters into fatty liver, high risk, and normal classes, and generate the classification rules from this analysis. In this paper, we suggest a method to diagnose and predict the liver condition of alcoholic patients according to risk level by applying the classification rules to new blood test results. The K-means classifier was found to be more accurate on blood test results and provides the risk of fatty liver relative to normal liver conditions.
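The clustering step in this abstract can be sketched with a plain K-means (Lloyd's algorithm) over five-dimensional attribute vectors. This is a generic illustration, not the authors' code: the deterministic initialization from the first `k` points, the function names, and the nearest-centroid rule for assigning a new blood test are all assumptions made here, and the toy numbers are not taken from the UCI data.

```python
import math
from typing import List, Sequence, Tuple


def nearest_cluster(point: Sequence[float], centroids: List[List[float]]) -> int:
    """Index of the centroid closest to `point` in Euclidean distance."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans(points: List[Sequence[float]], k: int,
           max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm with an assumed deterministic start (first k points)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest_cluster(p, centroids) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Toy 5-attribute vectors standing in for blood test records.
    records = [(0,) * 5, (10,) * 5, (20,) * 5, (1,) * 5, (11,) * 5, (21,) * 5]
    labels, centroids = kmeans(records, k=3)
    print(labels)
    print(nearest_cluster((10.2,) * 5, centroids))
```

In practice the attributes would usually be rescaled before clustering, since blood test measurements have very different ranges; the abstract does not state whether this was done.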

Improved marine predators algorithm for feature selection and SVM optimization

  • Jia, Heming;Sun, Kangjian;Li, Yao;Cao, Ning
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.4 / pp.1128-1145 / 2022
  • Owing to the rapid development of information science, data analysis based on machine learning has become an interdisciplinary and strategic area. The marine predators algorithm (MPA) is a novel metaheuristic algorithm inspired by the foraging strategies of marine organisms. Considering the randomness of these strategies, an improved algorithm called the co-evolutionary cultural mechanism-based marine predators algorithm (CECMPA) is proposed. Through this mechanism, search agents in different spaces can share knowledge and experience to improve the performance of the native algorithm. More specifically, CECMPA has a higher probability of avoiding local optima and can find the global optimum quickly. This paper is the first to use CECMPA to perform feature subset selection and optimize the hyperparameters of a support vector machine (SVM) simultaneously. To evaluate its performance, the proposed method is tested on twelve datasets from the University of California, Irvine (UCI) repository. Moreover, because coronavirus disease 2019 (COVID-19) is spreading in many countries and offers a real-world application, CECMPA is also applied to a COVID-19 dataset. The experimental results and statistical analysis demonstrate that CECMPA is superior to the compared methods from the literature in terms of several evaluation metrics. The proposed method is highly competitive and has promising prospects.
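Selecting features and tuning SVM hyperparameters "simultaneously" is commonly done by letting one search agent encode both. The sketch below shows one such encoding under stated assumptions; the abstract does not give the authors' actual scheme. The function name `decode_agent`, the 0.5 threshold, the log-scale mapping, and the ranges for `C` and `gamma` are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def decode_agent(position: Sequence[float], n_features: int,
                 c_range: Tuple[float, float] = (1e-2, 1e2),
                 gamma_range: Tuple[float, float] = (1e-4, 1e0)
                 ) -> Tuple[List[bool], float, float]:
    """Split one agent position in [0, 1]^(n_features + 2) into a feature mask,
    an SVM penalty C, and an RBF kernel width gamma (all assumed conventions)."""
    if len(position) != n_features + 2:
        raise ValueError("position must hold n_features + 2 values")

    # First n_features entries: a feature is kept when its value exceeds 0.5.
    mask = [x > 0.5 for x in position[:n_features]]

    def log_scale(u: float, lo: float, hi: float) -> float:
        """Map u in [0, 1] onto [lo, hi] on a logarithmic scale."""
        u = min(max(u, 0.0), 1.0)
        return lo * (hi / lo) ** u

    # Last two entries: the two SVM hyperparameters.
    c_value = log_scale(position[n_features], *c_range)
    gamma = log_scale(position[n_features + 1], *gamma_range)
    return mask, c_value, gamma


if __name__ == "__main__":
    mask, c_value, gamma = decode_agent([0.9, 0.1, 0.7, 0.5, 0.5], n_features=3)
    print(mask, c_value, gamma)
```

A metaheuristic such as MPA would then score each decoded agent, for example by the cross-validated accuracy of an SVM trained on the selected features, and move the agents toward better scores; that fitness function is omitted here because the abstract does not describe it.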