• Title/Summary/Keyword: dataset records

A Study on the Functional Requirements of Migration Tool for Dataset Records (데이터세트 기록의 이관도구 기능요건 연구)

  • Yim, Jin-Hee;Cho, Eun Hee
    • Proceedings of the Korean Society for Information Management Conference / 2010.08a / pp.155-162 / 2010
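  • For dataset records, simply writing a migration module or providing an API is not enough to carry out migration effectively and efficiently. The migration process differs from that of standardized, electronic-document-centered records: the scope of an individual record item in a dataset is not clearly defined, the data need correction and quality improvement during migration, and metadata must be actively collected and supplemented. Migrating dataset records therefore requires a migration tool that follows a detailed migration procedure and helps the operator check the state of the data to be migrated and take the necessary actions. This paper presents the functional requirements for a tool that supports extracting dataset records from administrative information systems and migrating them to an archive system.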

The effect of extended lactation on parameters of Wood's model of lactation curve in dairy Simmental cows

  • Kopec, Tomas;Chladek, Gustav;Falta, Daniel;Kucera, Josef;Vecera, Milan;Hanus, Oto
    • Animal Bioscience / v.34 no.6 / pp.949-956 / 2021
  • Objective: This study focused on estimating the parameters of Wood's model and describing the lactation curve using cows whose first lactation extended over 24 months. Methods: The database included 1,333 pure-bred dairy Simmental primiparous cows which lactated for 24 months (732 days). The initial dataset used to assess the parameters of Wood's function included 35,826 milk yield records. Milk yield was recorded throughout lactation, with the earliest record taken on day 6 and the latest on day 1,348 of lactation. This dataset was used to assess parameters a, b and c of Wood's model using a non-linear statistical procedure. These parameters were estimated for different lengths of lactation. The assessed parameters were used to calculate some characteristics of the lactation curves. Results: The lowest value of parameter a of Wood's model (15.2317) was found in lactations up to 305 days long, whereas parameters b and c were highest in those lactations (0.1029 and 0.0015, respectively). The maximum value of parameter a (17.4329) was found in lactations up to 640 days long, whereas parameters b and c were lowest in those lactations (0.0603 and 0.0010, respectively). Conclusion: The parameters of Wood's model and the shape of the lactation curve change as the number of milk yield records grows. The assessed parameters also revealed a significant milk production potential after 305 days of lactation.
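
The abstract above quotes parameter values but not the curve itself. As a minimal sketch, assuming the usual form of Wood's model, y(t) = a · t^b · exp(−c·t), the two quoted parameter sets can be compared as follows. The function names are illustrative and not from the paper, and the yield units are whatever the original records used.

```python
import math

def wood_yield(t, a, b, c):
    """Daily milk yield on day t under Wood's model: a * t**b * exp(-c * t)."""
    return a * t ** b * math.exp(-c * t)

def peak_day(b, c):
    """Day of peak yield, where the derivative of Wood's curve is zero (t = b / c)."""
    return b / c

# Parameter sets quoted in the abstract above.
PARAMS_305 = {"a": 15.2317, "b": 0.1029, "c": 0.0015}  # lactations up to 305 days
PARAMS_640 = {"a": 17.4329, "b": 0.0603, "c": 0.0010}  # lactations up to 640 days

for label, p in (("up to 305 days", PARAMS_305), ("up to 640 days", PARAMS_640)):
    t_peak = peak_day(p["b"], p["c"])
    print(
        f"{label}: peak near day {t_peak:.1f}, "
        f"yield at peak {wood_yield(t_peak, **p):.2f}, "
        f"yield on day 305 {wood_yield(305, **p):.2f}"
    )
```

Under this form, a smaller c gives a slower decline after the peak, which is one way to read the reported drop in c for longer lactations.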

Empirical Verification of Conversion and Restoration of Preservation Format for Dataset: Application of Dataset with Disaster Safety Information to SIARD (데이터세트 보존포맷 검증방안에 관한 연구: 재난안전정보 데이터세트의 SIARD 적용을 통해)

  • Han, Hui-Jeong;Yoon, Sung-Ho;Oh, Hyo-Jung;Yang, Dongmin
    • Journal of the Korean Society for Information Management / v.37 no.2 / pp.251-284 / 2020
  • As the use of information has emerged as the core of national competitiveness, major developed countries and the Korean government have realized the importance of data. They have pursued technical research and standard establishment for long-term preservation and have continuously strived for systematic management and preservation of data. However, although the law specifies various types of data for the purpose of records management, there is no specific method for collecting, managing and preserving them, except for standard electronic documents. In particular, the management and preservation of huge datasets from administrative information systems have been strongly demanded, yet no proper guidelines for datasets have been provided. A framework for selecting a preservation format must be prepared before the system can be supplemented and built. The framework should be specified more concretely with the characteristics of datasets in mind, and the conversion and restoration of the dataset preservation format derived from the selection criteria must be verified empirically. Therefore, this study proposes a method for long-term preservation through empirical verification of the preservation format, after deriving an evaluation framework for preservation format selection criteria that considers the characteristics of datasets.

Transaction Mining for Fraud Detection in ERP Systems

  • Khan, Roheena;Corney, Malcolm;Clark, Andrew;Mohay, George
    • Industrial Engineering and Management Systems / v.9 no.2 / pp.141-156 / 2010
  • Despite all attempts to prevent fraud, it continues to be a major threat to industry and government. Traditionally, organizations have focused on fraud prevention rather than detection, to combat fraud. In this paper we present a role mining inspired approach to represent user behaviour in Enterprise Resource Planning (ERP) systems, primarily aimed at detecting opportunities to commit fraud or potentially suspicious activities. We have adapted an approach which uses set theory to create transaction profiles based on analysis of user activity records. Based on these transaction profiles, we propose a set of (1) anomaly types to detect potentially suspicious user behaviour, and (2) scenarios to identify inadequate segregation of duties in an ERP environment. In addition, we present two algorithms to construct a directed acyclic graph to represent relationships between transaction profiles. Experiments were conducted using a real dataset obtained from a teaching environment and a demonstration dataset, both using SAP R/3, presently the predominant ERP system. The results of this empirical research demonstrate the effectiveness of the proposed approach.
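
The abstract above does not give the profile construction or the anomaly rules, so the following is only a rough sketch of the general idea, assuming a transaction profile is the set of transaction codes each user has executed and that a segregation-of-duties rule is a pair of codes no single user should hold together. The transaction codes, user names and function names are all hypothetical.

```python
from collections import defaultdict

def build_profiles(activity_log):
    """Map each user to the set of transaction codes seen in their activity records."""
    profiles = defaultdict(set)
    for user, transaction in activity_log:
        profiles[user].add(transaction)
    return dict(profiles)

def sod_violations(profiles, conflicting_pairs):
    """Return (user, pair) for every user whose profile contains both codes of a pair."""
    violations = []
    for user, transactions in sorted(profiles.items()):
        for pair in conflicting_pairs:
            if set(pair) <= transactions:  # subset test: both codes present
                violations.append((user, pair))
    return violations

# Hypothetical activity records and one hypothetical conflict rule.
log = [
    ("alice", "CREATE_VENDOR"),
    ("alice", "POST_PAYMENT"),
    ("bob", "POST_PAYMENT"),
]
rules = [("CREATE_VENDOR", "POST_PAYMENT")]
print(sod_violations(build_profiles(log), rules))
```

Representing profiles as sets keeps the conflict check to a single subset test per rule; the paper's own anomaly types and graph construction are not reproduced here.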

A Novel Classification Model for Employees Turnover Using Neural Network for Enhancing Job Satisfaction in Organizations

  • Tarig Mohamed Ahmed
    • International Journal of Computer Science & Network Security / v.23 no.7 / pp.71-78 / 2023
  • Employee turnover is one of the most important challenges facing modern organizations. It causes the loss of job experience and skills, such as those of distinguished faculty members in universities, doctors in rare specialties, innovative engineers, and senior administrators. HR analytics has advanced data analytics to the extent that institutions can identify their employees' characteristics, where inaccuracy leads to incorrect decision making. This paper aims to develop a novel model that can help decision-makers classify employee turnover. Using two feature selection methods, Information Gain and Chi-Square, the four most important features were extracted from the dataset: overtime, job level, salary, and years in the organization. One important result of this research is that these features should be planned carefully so that organizations keep their employees as valuable assets. The proposed model is based on machine learning algorithms. Classification algorithms such as Decision Tree, SVM, Random Forest, Neural Network, and Naive Bayes were used to implement the model. The model was trained and tested on a dataset of 1470 records and 25 features. Many experiments were conducted to find the best model. Based on the implementation results, the Neural Network algorithm was selected as the best, with an accuracy of 84 percent and an AUC (ROC) of 74 percent. According to the validation mechanism, the model is acceptable and reliable for helping organization decision-makers manage their employees well.

Efficient Similarity Joins by Adaptive Prefix Filtering (맞춤 접두 필터링을 이용한 효율적인 유사도 조인)

  • Park, Jong Soo
    • KIPS Transactions on Software and Data Engineering / v.2 no.4 / pp.267-272 / 2013
  • The similarity join, which finds all pairs of records in a dataset whose similarity is above a given threshold, is an important operation with many applications such as data cleaning and duplicate detection, and it remains a challenging problem. We propose a new algorithm that uses the prefix filtering principle as a strong constraint on the generation of candidate pairs for fast similarity joins. A candidate pair is generated only when the current prefix token of a probing record matches a prefix token of an indexing record within the prefix tokens constrained by the principle. This generation method does not need to compute an upper bound on the overlap between two records, which reduces execution time. Experimental results show that our algorithm significantly outperforms previous prefix filtering-based algorithms on real datasets.
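
The abstract above builds on the prefix filtering principle without spelling it out. The sketch below shows the generic principle for a Jaccard self-join, not the paper's adaptive variant: tokens are put in one global order (rare tokens first), only a short prefix of each record is indexed, and a pair becomes a candidate only if the two prefixes share a token. The function names and the sample records are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

def jaccard(a, b):
    """Jaccard similarity of two non-empty token sets."""
    return len(a & b) / len(a | b)

def prefix_filter_join(records, threshold):
    """Return (i, j, similarity) for record pairs with Jaccard >= threshold."""
    # Global token order: rare tokens first, ties broken alphabetically.
    freq = Counter(tok for rec in records for tok in set(rec))
    ordered = [sorted(set(rec), key=lambda t: (freq[t], t)) for rec in records]

    index = defaultdict(list)  # prefix token -> ids of records indexed so far
    results = []
    for rid, toks in enumerate(ordered):
        if not toks:
            continue
        # Standard prefix length for Jaccard: |x| - ceil(threshold * |x|) + 1.
        # The small epsilon guards against floating-point round-up in the product.
        prefix_len = len(toks) - math.ceil(threshold * len(toks) - 1e-9) + 1
        prefix = toks[:prefix_len]

        # Candidates are earlier records whose prefix shares a token with this one.
        candidates = set()
        for tok in prefix:
            candidates.update(index[tok])

        # Verify each candidate with the exact similarity.
        for cid in sorted(candidates):
            sim = jaccard(set(toks), set(ordered[cid]))
            if sim >= threshold:
                results.append((cid, rid, sim))

        for tok in prefix:
            index[tok].append(rid)
    return results

records = [{"a", "b", "c", "d"}, {"a", "b", "c", "e"}, {"x", "y", "z"}]
print(prefix_filter_join(records, 0.6))
```

Pairs whose prefixes share no token are never compared, which is where the saving over an all-pairs comparison comes from.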

Biodiversity and Enzyme Activity of Marine Fungi with 28 New Records from the Tropical Coastal Ecosystems in Vietnam

  • Pham, Thu Thuy;Dinh, Khuong V.;Nguyen, Van Duy
    • Mycobiology / v.49 no.6 / pp.559-581 / 2021
  • The coastal marine ecosystems of Vietnam are one of the global biodiversity hotspots, but the biodiversity of their marine fungi is not well known. To fill this major knowledge gap, we assessed the genetic diversity (ITS sequence) of 75 fungal strains isolated from 11 surface coastal marine and deeper waters in Nha Trang Bay and Van Phong Bay using a culture-dependent approach, and 5 OTUs (Operational Taxonomic Units) of fungi in three representative sampling sites using next-generation sequencing. The results from both approaches agreed on the most abundant phylum (Ascomycota), genera (Candida and Aspergillus) and species (Candida blankii) but differed for less common taxa. The culturable fungal strains in this study belong to 3 phyla, 5 subdivisions, 7 classes, 12 orders, 17 families, 22 genera and at least 40 species, of which 29 species have been identified and several are likely novel. Among the identified species, 12 and 28 are new records in global and Vietnamese marine areas, respectively. The analysis of enzyme activity and the checklist of trophic mode and guild assignment provided valuable additional biological information and suggested the ecological function of planktonic fungi in the marine food web. This is the largest dataset of marine fungal biodiversity covering morphology, phylogeny and enzyme activity in the tropical coastal ecosystems of Vietnam and Southeast Asia. Biogeographic aspects, ecological factors and human impact may structure mycoplankton communities in such aquatic habitats.

A Study on the Records Management for the National Assembly Members (국회의원 기록관리 방안 연구)

  • Kim, Jang-hwan
    • The Korean Journal of Archival Studies / no.55 / pp.39-71 / 2018
  • The purpose of this study is to examine the reality of the records management of the National Assembly members and suggest a desirable alternative. Until the Public Records Management Act was enacted in 1999, the level of the records management in the National Assembly was not beyond that of the document management in both the administration and the legislature. Rather, the National Assembly has maintained a records management tradition that systematically manages the minutes and bills since the Constitutional Assembly. After the Act was legislated in 2000, the National Assembly Records Management Regulation was enacted and enforced, and the Archives was established in the form of a subsidiary organ of the Secretariat of the National Assembly, even though its establishment is not obligatory. In addition, for the first time in Korea, an archivist was assigned as a records and archives researcher, whose role is to respond quickly in accordance with the records schedule of the National Assembly, making its service faster than that of the administration. However, the power of the records management of the National Assembly Archives at the time of the Secretariat of the National Assembly was greatly reduced, so the revision of the regulations in accordance with the revised Act in 2007 was not completed until 2011. In the case of the National Assembly, the direct influence of the executive branch was insignificant. As the National Assembly had little direct influence on the administration, it had little positive influence on records management innovation under the Roh Moo-Hyun Administration. Even within the National Assembly, the records management observed by its members is insignificant both in practice and in theory. As the National Assembly members are excluded from the Act, there is no legal basis to enforce a records management method upon them. In this study, we analyze the records management problem of the National Assembly members, which mainly concerns the National Assembly records management plan established in the National Archives. Moreover, this study proposes three kinds of records management methods for the National Assembly members, namely, the legislation and revision of regulations, the records management consulting of the National Assembly members, and the transfer of the dataset of administrative information systems and websites.

Actor-Critic Algorithm with Transition Cost Estimation

  • Sergey, Denisov;Lee, Jee-Hyong
    • International Journal of Fuzzy Logic and Intelligent Systems / v.16 no.4 / pp.270-275 / 2016
  • We present an approach for accelerating the actor-critic algorithm for reinforcement learning with a continuous action space. The actor-critic algorithm has already proved robust to infinitely large action spaces in various high-dimensional environments. Despite that success, its main problem remains the same: the speed of convergence to the optimal policy. In high-dimensional state and action spaces, searching for the correct action in each state takes an enormously long time. Therefore, in this paper we suggest a search-accelerating function that speeds up convergence so that the algorithm reaches the optimal policy faster. In our method, we assume that actions may have their own distribution of preference that is independent of the state. Since the agent acts randomly in the environment at the beginning of learning, it would be more efficient if actions were taken according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using the Educational Process Mining dataset with records of students' course learning processes and their grades.

Development of nationwide amplification map of response spectrum for Japan based on station correction factors

  • Maruyama, Yoshihisa;Sakemoto, Masaki
    • Earthquakes and Structures / v.13 no.1 / pp.17-27 / 2017
  • In this study, the characteristics of site amplification at seismic observation stations in Japan were estimated using the attenuation relationship of each station's response spectrum. Ground motion records observed after 32 earthquakes were employed to construct the attenuation relationship. The station correction factor at each KiK-net station was compared to the transfer functions between the base rock and the surface. For each station, the plot of the station correction factor versus the period was similar in shape to the graphs of the transfer function (amplitude ratio versus period). Therefore, the station correction factors are effective for evaluating site amplifications considering the period of ground shaking. In addition, the station correction factors were evaluated with respect to the average shear wave velocities using a geographic information system (GIS) dataset. Lastly, the site amplifications for specific periods were estimated throughout Japan.