• Title/Summary/Keyword: apriori

Search Results: 144

Analysis and Prediction of Energy Consumption Using Supervised Machine Learning Techniques: A Study of Libyan Electricity Company Data

  • Ashraf Mohammed Abusida;Aybaba Hancerliogullari
    • International Journal of Computer Science & Network Security / v.23 no.3 / pp.10-16 / 2023
  • The ever-increasing amount of data generated by various industries and systems has led to the development of data mining techniques as a means to extract valuable insights and knowledge from such data. The electrical energy industry is no exception, given the large amounts of data generated by SCADA systems. This study focuses on the analysis of historical data recorded in the SCADA database of the Libyan Electricity Company. The database, spanning January 1st, 2013 to December 31st, 2022, contains records of the date and hour, energy production, temperature, humidity, wind speed, and energy consumption levels. The data was pre-processed and analyzed using the WEKA tool and the Apriori algorithm, a supervised machine learning technique. The aim of the study was to extract association rules that would assist decision-makers in making informed decisions with greater efficiency and reduced costs. The results were evaluated in terms of accuracy and production time, and the study concludes that they are promising and encouraging for future use in the Libyan Electricity Company. The study highlights the importance of data mining and the benefits of utilizing machine learning technology in decision-making processes.
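The Apriori workflow this abstract describes (mining frequent itemsets, then deriving association rules from them) can be sketched in a few lines of plain Python. The discretized "SCADA" transactions, labels, and thresholds below are illustrative assumptions, not the paper's data or the WEKA implementation.

```python
def apriori(transactions, min_support):
    """Return all itemsets whose support meets min_support (Apriori pruning)."""
    n = len(transactions)
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    level = {s for s in items
             if sum(s <= t for t in transactions) / n >= min_support}
    k = 1
    while level:
        for s in level:
            frequent[s] = sum(s <= t for t in transactions) / n
        # candidate generation: join frequent k-itemsets into (k+1)-itemsets
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        level = {c for c in candidates
                 if sum(c <= t for t in transactions) / n >= min_support}
        k += 1
    return frequent

# toy "discretized SCADA" transactions (hypothetical labels)
data = [frozenset(t) for t in (
    {"temp_high", "load_high"}, {"temp_high", "load_high", "wind_low"},
    {"temp_low", "load_low"}, {"temp_high", "load_high"},
)]
freq = apriori(data, min_support=0.5)
# confidence of the rule temp_high -> load_high
conf = freq[frozenset({"temp_high", "load_high"})] / freq[frozenset({"temp_high"})]
```

A rule such as `temp_high -> load_high` is then reported when its confidence also clears a chosen threshold, which is the kind of output the abstract says decision-makers would consume.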

When is the best time to run SNS AD per topic?: through conversation data analysis (SNS 대화 분석을 통한 주제별 적합 광고 시간대 도출)

  • Lee, Jimin;Jeon, Yerim;Lee, Jisun;Woo, Jiyoung
    • Proceedings of the Korean Society of Computer Information Conference / 2022.01a / pp.335-336 / 2022
  • This paper presents a method for predicting suitable SNS advertising time slots per category using time of day and conversation topics. This analysis makes it possible to suggest appropriate advertising times to advertisers. The association rule mining algorithm Apriori was used. The topics analyzed were narrowed down to commerce (shopping), beauty and health, current affairs/education, food and beverage, and leisure. The association analysis showed that beauty and health conversations were most active at 18:00, 17:00, and 16:00; commerce (shopping) at 14:00, 16:00, and 17:00; current affairs/education at 15:00, 17:00, and 16:00; and food and beverage at 18:00, 17:00, and 19:00. Finally, leisure was most active at 22:00, 23:00, and 21:00, confirming that the most active time slots differ by conversation topic. As a result, consumers can be recommended suitable advertisements at appropriate times.
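The per-topic time-slot ranking described above reduces to counting (topic, hour) co-occurrences in conversation records and sorting hours by frequency within each topic. A minimal sketch, with made-up records rather than the paper's dataset:

```python
from collections import Counter, defaultdict

def top_hours_by_topic(records, k=3):
    """records: iterable of (topic, hour) pairs.
    Return each topic's k most frequent hours, best first."""
    counts = defaultdict(Counter)
    for topic, hour in records:
        counts[topic][hour] += 1
    return {topic: [h for h, _ in c.most_common(k)] for topic, c in counts.items()}

# illustrative conversation records (hypothetical, not the paper's data)
records = ([("beauty", 18)] * 5 + [("beauty", 17)] * 4 + [("beauty", 16)] * 3
           + [("shopping", 14)] * 6 + [("shopping", 16)] * 2)
ranking = top_hours_by_topic(records)
```

The paper derives the same kind of ranking via Apriori association rules (topic -> hour) rather than raw counts, but the ordering criterion is analogous.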


Trend-based Sequential Pattern Discovery from Time-Series Data (시계열 데이터로부터의 경향성 기반 순차패턴 탐색)

  • 오용생;이동하;남도원;이전영
    • Journal of Intelligence and Information Systems / v.7 no.1 / pp.27-45 / 2001
  • Sequential pattern discovery from time-series data has mainly been concerned with events or item sets. Recently, the research has started to be applied to numerical data. An example is sensor information generated by checking a machine's state. Numerical data hardly ever repeat the same values, which makes pattern formation difficult. So, it is important to extract a suitable number of pattern features, which can be transformed into events or item sets and applied to sequential pattern mining tasks. Popular methods for extracting such patterns are sliding windows and clustering. The results of these methods are sensitive to the window size or clustering parameters, which forces users to apply the data mining task repeatedly and interpret the results each time. This paper suggests a method to retrieve pattern features by transforming numerical data into vectors of an angle and a magnitude. The pattern features retrieved with this method make the results easy to understand and sequential pattern finding fast. We define an inclusion relation among pattern features using the angles and magnitudes of the vectors. Using this relation, we can find sequential patterns faster than other methods, which use all the data, by reducing the data size.
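The feature extraction sketched in this abstract, turning a numeric series into an (angle, magnitude) vector per segment, can be illustrated as below. The fixed window length and the use of `atan2`/`hypot` over the segment's endpoints are assumptions for the sketch, not the paper's exact definition.

```python
import math

def trend_features(series, window=2):
    """Slide over the series and describe each segment as (angle, magnitude)
    of the vector from its first point to its last point."""
    feats = []
    for i in range(len(series) - window + 1):
        dx = window - 1                          # horizontal span of the segment
        dy = series[i + window - 1] - series[i]  # value change over the segment
        feats.append((math.atan2(dy, dx), math.hypot(dx, dy)))
    return feats

feats = trend_features([0.0, 1.0, 1.0, 0.0])
```

On these features, one feature can be said to "include" another when their angles are close and its magnitude is at least as large, which is the kind of inclusion relation the abstract uses to prune the search.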


Research on a handwritten character recognition algorithm based on an extended nonlinear kernel residual network

  • Rao, Zheheng;Zeng, Chunyan;Wu, Minghu;Wang, Zhifeng;Zhao, Nan;Liu, Min;Wan, Xiangkui
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.1 / pp.413-435 / 2018
  • Although the accuracy of handwritten character recognition based on deep networks has been shown to be superior to that of traditional methods, the use of an overly deep network significantly increases time consumption during parameter training. For this reason, this paper takes both training time and recognition accuracy into consideration and proposes a novel handwritten character recognition algorithm with a newly designed network structure, based on an extended nonlinear kernel residual network. This network is not an extremely deep network, and its main design is as follows: (1) design of an unsupervised apriori algorithm for intra-class clustering, making the subsequent network training more pertinent; (2) presentation of an intermediate convolution model with a pre-processed width level of 2; (3) presentation of a composite residual structure with a multi-level quick link; and (4) addition of a Dropout layer after the parameter optimization. The algorithm shows superior results on MNIST and SVHN, two benchmark character recognition datasets, and achieves better recognition accuracy and higher recognition efficiency than other deep structures with the same number of layers.
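The core idea behind the composite residual structure mentioned in point (3), quick links that add earlier activations back into the output, can be sketched without a deep-learning framework. The linear layers, shapes, and activation below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def composite_residual_block(x, w1, w2):
    """Two transform levels with multi-level quick links: the output sums the
    second-level transformation, the intermediate activation, and the input."""
    h = relu(x @ w1)   # first level
    y = relu(h @ w2)   # second level
    return y + h + x   # quick links from both earlier levels

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4))
out = composite_residual_block(x, np.eye(4), np.eye(4))
```

Because the skip terms pass gradients around the transforms, such blocks train faster than plain stacks of the same depth, which matches the abstract's motivation of cutting training time.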

A Neural Network and Kalman Filter Hybrid Approach for GPS/INS Integration

  • Wang, Jianguo Jack;Wang, Jinling;Sinclair, David;Watts, Leo
    • Proceedings of the Korean Institute of Navigation and Port Research Conference / v.1 / pp.277-282 / 2006
  • It is well known that Kalman filtering is an optimal real-time data fusion method for GPS/INS integration. However, it has some limitations in terms of stability, adaptability and observability. A Kalman filter can perform optimally only when its dynamic model is correctly defined and the noise statistics for the measurement and process are completely known. Estimated Kalman filter states can be influenced by several factors that are difficult to model, including vehicle dynamic variations, filter tuning results, and environment changes. Neural networks can map input-output relationships without a priori knowledge about them; hence a properly designed neural network is capable of learning and extracting these complex relationships with enough training. This paper presents a GPS/INS integrated system that combines Kalman filtering and neural network algorithms to improve navigation solutions during GPS outages. An Extended Kalman filter estimates INS measurement errors, plus position, velocity and attitude errors, as Kalman filter states, and gives precise navigation solutions while GPS signals are available. At the same time, a multi-layer neural network is trained to map the vehicle dynamics to the corresponding Kalman filter states, at the same rate as the measurement update. Once the output of the neural network meets a similarity threshold, it can be used to correct INS measurements when no GPS measurements are available. Selecting suitable inputs and outputs for the neural network is critical for this hybrid method. Detailed analysis reveals that some Kalman filter states are highly correlated with vehicle dynamic variations. The filter states that most heavily impact the system's navigation solutions are selected as the neural network outputs. The principle of this hybrid method and the neural network design are presented. Field test data are processed to evaluate the performance of the proposed method.
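The Kalman-filter half of this hybrid can be illustrated with a minimal one-dimensional predict/update cycle. The constant-state model and the noise variances `q` and `r` below are assumptions for the sketch, not the paper's INS error model (which is a multi-state Extended Kalman filter).

```python
def kalman_step(x, p, z, q=0.01, r=0.5):
    """One predict/update cycle of a 1-D Kalman filter.
    x: state estimate, p: estimate variance, z: new measurement,
    q: process noise variance, r: measurement noise variance."""
    # predict (constant-state model: the state is assumed unchanged)
    p = p + q
    # update: blend prediction and measurement by the Kalman gain
    k = p / (p + r)
    x = x + k * (z - x)
    p = (1.0 - k) * p
    return x, p

x, p = 0.0, 1.0                      # initial guess and its variance
for z in (1.2, 0.9, 1.1, 1.0):       # noisy measurements near 1.0
    x, p = kalman_step(x, p, z)
```

Each update shrinks the estimate variance `p`; the paper's neural network learns to reproduce selected filter states like `x` so that corrections remain available when GPS measurements drop out.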


Mathematical Cognition as the Construction of Concepts in Kant's Critique of Pure Reason ("순수이성비판"에 나타난 수학적 인식의 특성: 개념의 구성)

  • Yim, Jae-Hoon
    • Journal of Elementary Mathematics Education in Korea / v.16 no.1 / pp.1-19 / 2012
  • Kant defines mathematical cognition as cognition by reason from the construction of concepts. In this paper, I inquire into the meaning and the characteristics of the construction of concepts based on Kant's theory of the sensibility and the understanding. To construct a concept is to exhibit or represent the object which corresponds to the concept in pure intuition a priori. The construction of a mathematical concept includes a dynamic synthesis of the pure imagination to produce a schema of the concept rather than its image. Kant's transcendental explanation of the sensibility and the understanding can be regarded as an epistemological theory that supports the necessity of arithmetic and geometry as a common core in human education. His views on mathematical cognition imply that we should pay more attention to how to have students gain a deeper understanding of a mathematical concept through its construction, beyond mere abstraction from sensible experience, and how to guide students to cultivate the habit of mind of referring to given figures or symbols as schemata of mathematical concepts rather than as mere images of them.


Negative Selection Algorithm based Multi-Level Anomaly Intrusion Detection for False-Positive Reduction (과탐지 감소를 위한 NSA 기반의 다중 레벨 이상 침입 탐지)

  • Kim, Mi-Sun;Park, Kyung-Woo;Seo, Jae-Hyun
    • Journal of the Korea Institute of Information Security & Cryptology / v.16 no.6 / pp.111-121 / 2006
  • As the Internet grows rapidly, network attack techniques are being transformed and new attack types are appearing. Existing network-based intrusion detection systems detect well-known attacks, but their false-positive and false-negative rates against unknown attacks are high. In addition, existing network-based intrusion detection systems have difficulty detecting large volumes of network packet data in real time and recognizing and responding to new attack types. Therefore, a method is required that raises the detection rate on various large datasets and reduces false positives. In this paper, we propose a method to reduce false positives using a multi-level detection algorithm that combines the multidimensional Apriori algorithm and a modified Negative Selection algorithm. We apply this algorithm to intrusion detection and confirm that it performs well.
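The Negative Selection idea referenced above, generating random detectors, discarding any that match a "self" (normal) sample, then flagging anything a surviving detector matches, can be sketched as follows. The bit-string encoding and the matching threshold are illustrative assumptions, not the paper's modified algorithm.

```python
import random

def matches(a, b, threshold):
    """Two equal-length bit strings match if they agree in >= threshold positions."""
    return sum(x == y for x, y in zip(a, b)) >= threshold

def train_detectors(self_set, n_detectors, length, threshold, seed=42):
    """Generate random detectors, discarding any that match a self sample."""
    rng = random.Random(seed)
    detectors = []
    while len(detectors) < n_detectors:
        d = tuple(rng.randint(0, 1) for _ in range(length))
        if not any(matches(d, s, threshold) for s in self_set):
            detectors.append(d)
    return detectors

def is_anomalous(sample, detectors, threshold):
    """A sample is anomalous if any detector matches it."""
    return any(matches(sample, d, threshold) for d in detectors)

self_set = [(0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 1)]   # "normal" traffic patterns
dets = train_detectors(self_set, n_detectors=5, length=6, threshold=5)
```

By construction, no detector can fire on the self set, which is what keeps the false-positive rate down; the paper layers a multidimensional Apriori stage on top to further filter alerts.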

Performance analysis of Frequent Itemset Mining Technique based on Transaction Weight Constraints (트랜잭션 가중치 기반의 빈발 아이템셋 마이닝 기법의 성능분석)

  • Yun, Unil;Pyun, Gwangbum
    • Journal of Internet Computing and Services / v.16 no.1 / pp.67-74 / 2015
  • In recent years, frequent itemset mining that considers the importance of each item has been intensively studied as one of the important issues in the data mining field. According to the strategies used to exploit item importance, itemset mining approaches based on item importance are classified as follows: weighted frequent itemset mining, frequent itemset mining using transactional weights, and utility itemset mining. In this paper, we perform an empirical analysis of frequent itemset mining algorithms based on transactional weights. These algorithms compute transactional weights from the weight of each item in large databases and discover weighted frequent itemsets on the basis of item frequency and the weight of each transaction. Consequently, we can see the importance of a certain transaction through database analysis, because a transaction's weight is higher if it contains many items with high weights. We not only analyze the advantages and disadvantages but also compare the performance of the best-known algorithms in the field of frequent itemset mining based on transactional weights. As a representative of frequent itemset mining using transactional weights, WIS introduces the concept and strategies of transactional weights. In addition, there are various other state-of-the-art algorithms, WIT-FWIs, WIT-FWIs-MODIFY, and WIT-FWIs-DIFF, for extracting itemsets with weight information. To efficiently mine weighted frequent itemsets, these three algorithms use a special lattice-like data structure called the WIT-tree. The algorithms do not need an additional database scan after the construction of the WIT-tree is finished, since each node of the WIT-tree carries item information such as the item and its transaction IDs.
In particular, the traditional algorithms conduct a number of database scans to mine weighted itemsets, whereas the algorithms based on the WIT-tree avoid this overhead by reading the database only once. Additionally, the algorithms generate each new itemset of length N+1 from two different itemsets of length N. To discover new weighted itemsets, WIT-FWIs performs the itemset combination process using the information of the transactions that contain both itemsets. WIT-FWIs-MODIFY has a unique feature that decreases the number of operations for calculating the frequency of the new itemset. WIT-FWIs-DIFF utilizes a technique based on the difference of two itemsets. To compare and analyze the performance of the algorithms in various environments, we use real datasets of two types (i.e., dense and sparse) and measure runtime and maximum memory usage. Moreover, a scalability test is conducted to evaluate the stability of each algorithm when the size of the database changes. As a result, WIT-FWIs and WIT-FWIs-MODIFY show the best performance on the dense dataset, and on the sparse dataset, WIT-FWIs-DIFF has better mining efficiency than the other algorithms. Compared to the algorithms using the WIT-tree, WIS, which is based on the Apriori technique, has the worst efficiency because it requires many more computations than the others on average.
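The transactional weight described above, typically the mean of the item weights in a transaction, and the weighted support of an itemset, the sum of the weights of the transactions containing it, can be computed as a short sketch. The item weights and data are illustrative assumptions, not the paper's benchmark datasets.

```python
def transaction_weight(transaction, item_weights):
    """Weight of a transaction = mean weight of its items."""
    return sum(item_weights[i] for i in transaction) / len(transaction)

def weighted_support(itemset, transactions, item_weights):
    """Sum of the weights of the transactions that contain the itemset."""
    return sum(transaction_weight(t, item_weights)
               for t in transactions if itemset <= t)

item_weights = {"a": 0.9, "b": 0.5, "c": 0.1}   # hypothetical item importances
transactions = [frozenset("ab"), frozenset("abc"), frozenset("bc")]
ws = weighted_support(frozenset("ab"), transactions, item_weights)
```

An itemset is then "weighted frequent" when this value clears a chosen threshold; the WIT-tree variants compute the same quantity while avoiding repeated database scans.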

Development of Intelligent Job Classification System based on Job Posting on Job Sites (구인구직사이트의 구인정보 기반 지능형 직무분류체계의 구축)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.123-139 / 2019
  • The job classification systems of major job sites differ from site to site and also differ from the job classification system of the SQF (Sectoral Qualifications Framework) proposed for the SW field. Therefore, a new job classification system is needed that SW companies, SW job seekers, and job sites can all understand. The purpose of this study is to establish a standard job classification system that reflects market demand by analyzing the SQF based on the job posting information of major job sites and the NCS (National Competency Standards). For this purpose, an association analysis between the occupations of major job sites is conducted, and association rules between the SQF and occupations are derived. Using these association rules, we propose an intelligent job classification system based on data that maps the job classification systems of major job sites to the SQF. First, major job sites are selected to obtain information on the job classification systems in the SW market. Then we identify ways to collect job information from each site and collect the data through open APIs. Focusing on the relationships in the data, only job postings published on multiple job sites at the same time are kept; other postings are deleted. Next, we map the job classification systems between job sites using the association rules derived from the association analysis. We complete the mapping between these market segments, discuss the result with experts, further map it to the SQF, and finally propose a new job classification system. As a result, more than 30,000 job postings were collected in XML format using the open APIs of 'WORKNET', 'JOBKOREA', and 'saramin', the main job sites in Korea. After filtering down to about 900 job postings simultaneously published on multiple job sites, 800 association rules were derived by applying the Apriori algorithm, a frequent pattern mining algorithm.
Based on the 800 derived rules, the job classification systems of WORKNET, JOBKOREA, and saramin and the SQF job classification system were mapped and organized into first through fourth classification levels. In the new job taxonomy, the first primary class, covering IT consulting, computer systems, networks, and security-related jobs, consisted of three secondary, five tertiary, and five quaternary classifications. The second primary class, covering databases and system operation-related jobs, consisted of three secondary, three tertiary, and four quaternary classifications. The third primary class, covering web planning, web programming, web design, and games, was composed of four secondary, nine tertiary, and two quaternary classifications. The last primary class, covering jobs related to ICT management and computer and communication engineering technology, consisted of three secondary and six tertiary classifications. In particular, the new job classification system has a relatively flexible classification depth, unlike existing classification systems: WORKNET divides jobs into three levels, JOBKOREA divides jobs into two levels and subdivides them by keyword, and saramin likewise divides jobs into two levels and subdivides them in keyword form. The newly proposed standard job classification system accepts some keyword-based jobs and treats some product names as jobs. In this classification system, some jobs stop at the second level, while others are subdivided down to the fourth level. This reflects the idea that not all jobs can be broken down into the same number of steps. We also combined the rules derived from the collected market data and association analysis with experts' opinions.
Therefore, the newly proposed job classification system can be regarded as a data-based intelligent job classification system that reflects market demand, unlike existing job classification systems. This study is meaningful in that it suggests a new job classification system reflecting market demand by mapping between occupations based on data through association analysis, rather than relying on the intuition of a few experts. However, this study has the limitation that it cannot fully reflect market demand as it changes over time, because the data were collected at a single point in time. As market demand changes over time, including seasonal factors and the timing of major corporate public recruitment, continuous data monitoring and repeated experiments are needed to achieve more accurate matching. The results of this study can be used to suggest directions for improving the SQF in the SW industry, and the approach is expected to transfer to other industries given its success in the SW industry.

Study on the Organic Relations among Hado(河圖), Lakseo(洛書), A Priori Eight Trigrams, and A Posteriori Eight Trigrams (하도(河圖)·락서(洛書)·선천팔괘(先天八卦)·후천팔괘(後天八卦)의 상호 유기적 관계 연구)

  • Kim, Byoung-Soo
    • Journal of Physiology & Pathology in Korean Medicine / v.21 no.2 / pp.379-386 / 2007
  • Hado(河圖) and Lakseo(洛書) are diagrams composed of the symbols of the numbers from one to ten. The eight trigrams, P'al-gwoe, divide into two types: one is the a priori eight trigrams (先天八卦), or Bok-Hui's eight trigrams (伏羲八卦); the other is the a posteriori eight trigrams (後天八卦), or King Mun's eight trigrams (文王八卦). Relating the two diagrams of Hado and Lakseo with the two types of eight trigrams, they are referred to together as Ha-Lak-Hui-Mun (HLHM). Each of HLHM represents the process of creating and changing 'Heaven and Earth' and every being by the symbols of numbers and trigrams. In other words, each of HLHM symbolizes the origin and structure of the universe as well as the birth of every life, as represented in the diagram of theosophany (福智學) or Kabbalah. HLHM are also regarded as the origin of the I-Ching, or Book of Changes. Hado produces Lakseo through the principle of yin-yang (陰陽). Lakseo produces the a priori eight trigrams through zigzag shapes, which mean that Heaven and Earth are mutually responding. And the a priori eight trigrams produce the a posteriori eight trigrams through the triangle principle of connecting Heaven and Earth. In this process, Hado and the a priori eight trigrams are respectively prior to Lakseo and the a posteriori eight trigrams. HLHM represent a fractal shape resembling the symbol of five at the center of Hado, or Hado itself. In the dynamic process of HLHM, a diagram of Circle, Quadrangle, and Triangle (CQT) is produced as follows: Circle, the symbol of 'infinity' or Heaven, represents the origin of life or birth; Hado is the symbol of creation. Quadrangle, the symbol of Earth, represents that Lakseo is scattered into the four directions of front, back, left, and right; Quadrangle, which is immovable, represents materiality. Triangle, described from the eight trigrams, means the movements of the process of 'mutual inclusion' of Circle and Quadrangle. Triangle also means the process of harmonizing human beings with natural law.