• Title/Summary/Keyword: Big Data Log Analysis System

Search Result 38, Processing Time 0.023 seconds

New Authentication Methods based on User's Behavior Big Data Analysis on Cloud (클라우드 환경에서 빅데이터 분석을 통한 새로운 사용자 인증방법에 관한 연구)

  • Hong, Sunghyuck
    • Journal of Convergence Society for SMB
    • /
    • v.6 no.4
    • /
    • pp.31-36
    • /
    • 2016
  • User authentication is the first step to network security. There are lots of authentication types, and more than one authentication method works together for user's authentication in the network. Except for biometric authentication, most authentication methods can be copied, or someone else can adopt and abuse someone else's credential method. Thus, more than one authentication method must be used for user authentication. However, more credential makes system degrade and inefficient as they log on the system. Therefore, without tradeoff performance with efficiency, this research proposed user's behavior based authentication for secure communication, and it will improve to establish a secure and efficient communication.

A Study on an Automatical BKLS Measurement By Programming Technology

  • Shin, YeounOuk;Kim, KiBum
    • International journal of advanced smart convergence
    • /
    • v.7 no.3
    • /
    • pp.73-78
    • /
    • 2018
  • This study focuses on presenting the IT program module provided by BKLS measure in order to solve the problem of capital cost due to information asymmetry of external investors and corporate executives. Barron at al(1998) set up a BKLS measure to guide the market by intermediate analysts. The BKLS measure was measured by using the changes in the analyst forecast dispersion and analyst mean forecast error squared. This study suggests a model of the algorithm that the BKLS measure can be provided to all investors immediately by IT program in order to deliver the meaningful value in the domestic capital market as measured. This is a method of generating and analyzing real-time or non-real-time prediction models by transferring the predicted estimates delivered to the Big Data Log Analysis System through the statistical DB to the statistical forecasting engine. Because BKLS measure is not carried out in a concrete method, it is practically very difficult to estimate the BKLS measure. It is expected that the BKLS measure of Barron at al(1998) introduced in this study and the model of IT module provided in real time will be the starting point for the follow-up study for the introduction and realization of IT technology in the future.

Intelligent Brand Positioning Visualization System Based on Web Search Traffic Information : Focusing on Tablet PC (웹검색 트래픽 정보를 활용한 지능형 브랜드 포지셔닝 시스템 : 태블릿 PC 사례를 중심으로)

  • Jun, Seung-Pyo;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.93-111
    • /
    • 2013
  • As Internet and information technology (IT) continues to develop and evolve, the issue of big data has emerged at the foreground of scholarly and industrial attention. Big data is generally defined as data that exceed the range that can be collected, stored, managed and analyzed by existing conventional information systems and it also refers to the new technologies designed to effectively extract values from such data. With the widespread dissemination of IT systems, continual efforts have been made in various fields of industry such as R&D, manufacturing, and finance to collect and analyze immense quantities of data in order to extract meaningful information and to use this information to solve various problems. Since IT has converged with various industries in many aspects, digital data are now being generated at a remarkably accelerating rate while developments in state-of-the-art technology have led to continual enhancements in system performance. The types of big data that are currently receiving the most attention include information available within companies, such as information on consumer characteristics, information on purchase records, logistics information and log information indicating the usage of products and services by consumers, as well as information accumulated outside companies, such as information on the web search traffic of online users, social network information, and patent information. Among these various types of big data, web searches performed by online users constitute one of the most effective and important sources of information for marketing purposes because consumers search for information on the internet in order to make efficient and rational choices. Recently, Google has provided public access to its information on the web search traffic of online users through a service named Google Trends. Research that uses this web search traffic information to analyze the information search behavior of online users is now receiving much attention in academia and in fields of industry. Studies using web search traffic information can be broadly classified into two fields. The first field consists of empirical demonstrations that show how web search information can be used to forecast social phenomena, the purchasing power of consumers, the outcomes of political elections, etc. The other field focuses on using web search traffic information to observe consumer behavior, identifying the attributes of a product that consumers regard as important or tracking changes on consumers' expectations, for example, but relatively less research has been completed in this field. In particular, to the extent of our knowledge, hardly any studies related to brands have yet attempted to use web search traffic information to analyze the factors that influence consumers' purchasing activities. This study aims to demonstrate that consumers' web search traffic information can be used to derive the relations among brands and the relations between an individual brand and product attributes. When consumers input their search words on the web, they may use a single keyword for the search, but they also often input multiple keywords to seek related information (this is referred to as simultaneous searching). A consumer performs a simultaneous search either to simultaneously compare two product brands to obtain information on their similarities and differences, or to acquire more in-depth information about a specific attribute in a specific brand. Web search traffic information shows that the quantity of simultaneous searches using certain keywords increases when the relation is closer in the consumer's mind and it will be possible to derive the relations between each of the keywords by collecting this relational data and subjecting it to network analysis. Accordingly, this study proposes a method of analyzing how brands are positioned by consumers and what relationships exist between product attributes and an individual brand, using simultaneous search traffic information. It also presents case studies demonstrating the actual application of this method, with a focus on tablets, belonging to innovative product groups.

Development of Integrated Security Control Service Model based on Artificial Intelligence Technology (인공지능 기술기반의 통합보안관제 서비스모델 개발방안)

  • Oh, Young-Tack;Jo, In-June
    • The Journal of the Korea Contents Association
    • /
    • v.19 no.1
    • /
    • pp.108-116
    • /
    • 2019
  • In this paper, we propose a method to apply artificial intelligence technology efficiently to integrated security control technology. In other words, by applying machine learning learning to artificial intelligence based on big data collected in integrated security control system, cyber attacks are detected and appropriately responded. As technology develops, many large capacity Is limited to analyzing individual logs. The analysis method should also be applied to the integrated security control more quickly because it needs to correlate the logs of various heterogeneous security devices rather than one log. We have newly proposed an integrated security service model based on artificial intelligence, which analyzes and responds to these behaviors gradually evolves and matures through effective learning methods. We sought a solution to the key problems expected in the proposed model. And we developed a learning method based on normal behavior based learning model to strengthen the response ability against unidentified abnormal behavior threat. In addition, future research directions for security management that can efficiently support analysis and correspondence of security personnel through proposed security service model are suggested.

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.143-163
    • /
    • 2016
  • The demographics of Internet users are the most basic and important sources for target marketing or personalized advertisements on the digital marketing channels which include email, mobile, and social media. However, it gradually has become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although the marketing department is able to get the demographics using online or offline surveys, these approaches are very expensive, long processes, and likely to include false statements. Clickstream data is the recording an Internet user leaves behind while visiting websites. As the user clicks anywhere in the webpage, the activity is logged in semi-structured website log files. Such data allows us to see what pages users visited, how long they stayed there, how often they visited, when they usually visited, which site they prefer, what keywords they used to find the site, whether they purchased any, and so forth. For such a reason, some researchers tried to guess the demographics of Internet users by using their clickstream data. They derived various independent variables likely to be correlated to the demographics. The variables include search keyword, frequency and intensity for time, day and month, variety of websites visited, text information for web pages visited, etc. The demographic attributes to predict are also diverse according to the paper, and cover gender, age, job, location, income, education, marital status, presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, were used for prediction model building. However, this research has not yet identified which data mining method is appropriate to predict each demographic variable. Moreover, it is required to review independent variables studied so far and combine them as needed, and evaluate them for building the best prediction model. The objective of this study is to choose clickstream attributes mostly likely to be correlated to the demographics from the results of previous research, and then to identify which data mining method is fitting to predict each demographic attribute. Among the demographic attributes, this paper focus on predicting gender, age, marital status, residence, and job. And from the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is compose of 4 steps. In the first step, we create user profiles which include 64 clickstream attributes and 5 demographic attributes. The second step performs the dimension reduction of clickstream variables to solve the curse of dimensionality and overfitting problem. We utilize three approaches which are based on decision tree, PCA, and cluster analysis. We build alternative predictive models for each demographic variable in the third step. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in view of model accuracy and selects the best model. For the experiments, we used clickstream data which represents 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and the 5-fold cross validation was conducted to enhance the reliability of our experiments. As the experimental results, we can verify that there are a specific data mining method well-suited for each demographic variable. For example, age prediction is best performed when using the decision tree based dimension reduction and neural network whereas the prediction of gender and marital status is the most accurate by applying SVM without dimension reduction. We conclude that the online behaviors of the Internet users, captured from the clickstream data analysis, could be well used to predict their demographics, thereby being utilized to the digital marketing.

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.95-110
    • /
    • 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news repots. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scrapping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.

Identifying Roadway Sections Influenced by Speed Humps Using Survival Analysis (생존분석을 활용한 과속방지턱 영향구간 분석)

  • YOON, Gyugeun;JANG, Youlim;KHO, Seung-Young;LEE, Chungwon
    • Journal of Korean Society of Transportation
    • /
    • v.35 no.4
    • /
    • pp.261-277
    • /
    • 2017
  • This study defines influencing sections as the part of the road section where passing vehicles are traveling with the lower speed compared to speed limit due to speed humps. The influencing section was divided into 3 parts; influencing section before the speed hump, interval section, and influencing section after the speed hump. This analysis focused on the changes of each part depending on installation types, vehicle types, and daytime or nighttime. For the interval section, especially, the ratio of distance traveled with lower speed than speed limit to interval section is defined as effective influencing section ratio to be analyzed. Vehicle speed profiles were collected with a speed gun to extract influencing section lengths. The survival analysis was applied and estimated survival functions are compared with each other by several statistical tests. As a consequence, the average length of influencing section on the 50m sequential speed humps was 75.3% longer during the deceleration than that of isolated speed hump, and 18.9% during the acceleration. The effective influencing section ratio for the 30m and 50m sequential speed humps had a small difference of 81.0% and 76.0% while the absolute values of the section that passing speed were less than the speed limit were longer on 50m sequential speed humps, each being 24.3m and 38.0m. Using the log rank test, it was evident that sequential speed humps were more effective to increase the length of influencing sections compared to the isolated speed hump. Vehicle type was the strong factor for influencing section length on the isolated speed hump, but daytime or nighttime was not the effective one. This research result can be used for improving the efficiency selecting the installation point of speed humps for road safety and estimating the standard of the distance between sequential speed humps.

Open Digital Textbook for Smart Education (스마트교육을 위한 오픈 디지털교과서)

  • Koo, Young-Il;Park, Choong-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.177-189
    • /
    • 2013
  • In Smart Education, the roles of digital textbook is very important as face-to-face media to learners. The standardization of digital textbook will promote the industrialization of digital textbook for contents providers and distributers as well as learner and instructors. In this study, the following three objectives-oriented digital textbooks are looking for ways to standardize. (1) digital textbooks should undertake the role of the media for blended learning which supports on-off classes, should be operating on common EPUB viewer without special dedicated viewer, should utilize the existing framework of the e-learning learning contents and learning management. The reason to consider the EPUB as the standard for digital textbooks is that digital textbooks don't need to specify antoher standard for the form of books, and can take advantage od industrial base with EPUB standards-rich content and distribution structure (2) digital textbooks should provide a low-cost open market service that are currently available as the standard open software (3) To provide appropriate learning feedback information to students, digital textbooks should provide a foundation which accumulates and manages all the learning activity information according to standard infrastructure for educational Big Data processing. In this study, the digital textbook in a smart education environment was referred to open digital textbook. The components of open digital textbooks service framework are (1) digital textbook terminals such as smart pad, smart TVs, smart phones, PC, etc., (2) digital textbooks platform to show and perform digital contents on digital textbook terminals, (3) learning contents repository, which exist on the cloud, maintains accredited learning, (4) App Store providing and distributing secondary learning contents and learning tools by learning contents developing companies, and (5) LMS as a learning support/management tool which on-site class teacher use for creating classroom instruction materials. In addition, locating all of the hardware and software implement a smart education service within the cloud must have take advantage of the cloud computing for efficient management and reducing expense. The open digital textbooks of smart education is consdered as providing e-book style interface of LMS to learners. In open digital textbooks, the representation of text, image, audio, video, equations, etc. is basic function. But painting, writing, problem solving, etc are beyond the capabilities of a simple e-book. The Communication of teacher-to-student, learner-to-learnert, tems-to-team is required by using the open digital textbook. To represent student demographics, portfolio information, and class information, the standard used in e-learning is desirable. To process learner tracking information about the activities of the learner for LMS(Learning Management System), open digital textbook must have the recording function and the commnincating function with LMS. DRM is a function for protecting various copyright. Currently DRMs of e-boook are controlled by the corresponding book viewer. If open digital textbook admitt DRM that is used in a variety of different DRM standards of various e-book viewer, the implementation of redundant features can be avoided. Security/privacy functions are required to protect information about the study or instruction from a third party UDL (Universal Design for Learning) is learning support function for those with disabilities have difficulty in learning courses. The open digital textbook, which is based on E-book standard EPUB 3.0, must (1) record the learning activity log information, and (2) communicate with the server to support the learning activity. While the recording function and the communication function, which is not determined on current standards, is implemented as a JavaScript and is utilized in the current EPUB 3.0 viewer, ths strategy of proposing such recording and communication functions as the next generation of e-book standard, or special standard (EPUB 3.0 for education) is needed. Future research in this study will implement open source program with the proposed open digital textbook standard and present a new educational services including Big Data analysis.