• Title/Summary/Keyword: Real-time prediction

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • Since the emergence of the Internet, social media built on highly interactive Web 2.0 applications has given consumers and companies a user-friendly means of communicating with each other. Users routinely publish content expressing their opinions and interests on social media such as blogs, forums, chat rooms, and discussion boards, and that content is released on the Internet in real time. For that reason, many researchers and marketers regard social media content as a source of information for business analytics, and many studies have reported results on mining business intelligence from social media content. In particular, opinion mining and sentiment analysis, techniques to extract, classify, understand, and assess the opinions implicit in text, are frequently applied to social media content analysis because they emphasize determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques, and tools have been presented by these researchers. However, we found that their methods are often technically complicated and not sufficiently user-friendly to support business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to conducting opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using social media content, from the initial data gathering stage to the final presentation session. Our proposed approach consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts have to choose the target social media. Each target medium requires a different means of access, such as open APIs, search tools, DB2DB interfaces, or purchasing content. The second phase is pre-processing to generate useful material for meaningful analysis. 
If we do not remove garbage data, the results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase, where the cleansed social media content set is analyzed. The qualified data set includes not only user-generated content but also content identification information such as creation date, author name, user ID, content ID, hit counts, reviews or replies, favorites, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trend analysis, while sentiment analysis is used for reputation analysis. There are also various applications, such as stock prediction, product recommendation, and sales forecasting. The last phase is visualization and presentation of the analysis results. The major purpose of this phase is to explain the results of the analysis and help users comprehend their meaning. Therefore, to the extent possible, deliverables from this phase should be simple, clear, and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the leading company, NS Food, with a 66.5% market share; the firm has held the No. 1 position in the Korean "Ramen" business for several decades. We collected a total of 11,869 pieces of content, including blogs, forum posts, and news articles. After collecting the social media content data, we generated language resources specific to the instant noodle business for data manipulation and analysis using natural language processing. In addition, we tried to classify the content into more detailed categories such as marketing features, environment, reputation, etc. 
In those phases, we used freeware packages of the R project such as tm, KoNLP, ggplot2, and plyr. As a result, we presented several useful visualization outputs, such as domain-specific lexicons, volume and sentiment graphs, topic word clouds, heat maps, valence tree maps, and other visualized images, to provide vivid, full-color examples using open library software packages of the R project. Business actors can detect at a glance which areas are weak, strong, positive, negative, quiet, or loud. A heat map can explain movements of sentiment or volume in a category-by-time matrix, which shows color density over time periods. The valence tree map, one of the most comprehensive and holistic visualization models, should help analysts and decision makers quickly understand the "big picture" of the business situation with a hierarchical structure, since a tree map can present buzz volume and sentiment for a given period in a single visualized result. This case study offers real-world business insights from market sensing, demonstrating to practical-minded business users how they can use these types of results for timely decision making in response to ongoing changes in the market. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.

Issue tracking and voting rate prediction for 19th Korean president election candidates (댓글 분석을 통한 19대 한국 대선 후보 이슈 파악 및 득표율 예측)

  • Seo, Dae-Ho;Kim, Ji-Ho;Kim, Chang-Ki
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.199-219 / 2018
  • With the everyday use of the Internet and the spread of various smart devices, users have become able to communicate in real time, and existing communication styles have changed. As the Internet changed who produces information, data grew massive, giving rise to the very large body of information called big data. Big data is seen as a new opportunity to understand social issues. In particular, text mining explores patterns in unstructured text data to find meaningful information. Since text data exists in many places such as newspapers, books, and the web, it is diverse and abundant, which makes it suitable for understanding social reality. In recent years, there have been a growing number of attempts to analyze text from the web, such as SNS and blogs, where the public can communicate freely. This is recognized as a useful way to grasp public opinion immediately, so it can be used for research on political, social, and cultural issues. Text mining has received much attention as a way to investigate the public's view of candidates and to predict the voting rate in place of polling. This is because many people question the credibility of surveys, and people tend to refuse to respond or to conceal their real intentions when polled. This study collected comments from the largest Internet portal site in Korea and conducted research on the 19th Korean presidential election in 2017. We collected 226,447 comments from April 29, 2017 to May 7, 2017, which includes the period just before the presidential election day in which public opinion polls are prohibited. We analyzed frequencies, associative emotional words, topic emotions, and candidate voting rates. Through frequency analysis, we identified the words representing the most important issues each day. In particular, following each presidential debate, the candidate who had become an issue appeared at the top of the frequency analysis. 
By analyzing associative emotional words, we were able to identify the issues most relevant to each candidate. Topic emotion analysis was used to identify each candidate's topics and to express the public's emotions about those topics. Finally, we estimated the voting rate by combining the volume of comments and the sentiment score. In this way, we explored the issues for each candidate and predicted the voting rate. The analysis showed that news comments are an effective tool for tracking the issues of presidential candidates and for predicting the voting rate. In particular, this study presented issues per day and a quantitative index for sentiment. It also predicted the voting rate for each candidate and precisely matched the ranking of the top five candidates. Each candidate will be able to grasp public opinion objectively and reflect it in election strategy. Candidates can use positive issues more actively in their election strategies and try to correct negative issues. In particular, candidates should be aware that their reputation can be severely damaged if they face a moral problem. Voters can look objectively at issues and public opinion about each candidate and make more informed decisions when voting. If they refer to the results of this study before voting, they will be able to see public opinion from the big data and vote for a candidate from a more objective perspective. If candidates campaign with reference to big data analysis, the public will be more active on the web, recognizing that their wants are being reflected. Political views can be expressed in various places on the web, which can contribute to political participation by the people.
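
The abstract says the voting rate was estimated by combining comment volume with a sentiment score, but it does not give the formula. The following is only a minimal sketch of one plausible combination, assuming each candidate's comment count is weighted by the share of positive comments and the results are normalized to sum to one. The function name, the scoring rule, and all numbers are hypothetical and are not taken from the study.

```python
def estimate_vote_share(stats):
    """Estimate vote shares from comment statistics.

    stats maps candidate -> (comment_count, positive_ratio in [0, 1]).
    """
    # Hypothetical scoring rule: comment volume weighted by positive sentiment.
    scores = {name: count * pos for name, (count, pos) in stats.items()}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no positive-weighted comments to normalize")
    # Normalize so the estimated shares sum to 1.
    return {name: score / total for name, score in scores.items()}


# Made-up numbers for illustration only (not from the study).
example = {"A": (1000, 0.6), "B": (800, 0.5), "C": (400, 0.25)}
shares = estimate_vote_share(example)
ranking = sorted(shares, key=shares.get, reverse=True)
```

Comparing `ranking` with the actual order of candidates is the kind of check the abstract reports for the top five candidates.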

Development of a TBM Advance Rate Model and Its Field Application Based on Full-Scale Shield TBM Tunneling Tests in 70 MPa of Artificial Rock Mass (70 MPa급 인공암반 내 실대형 쉴드TBM 굴진실험을 통한 굴진율 모델 및 활용방안 제안)

  • Kim, Jungjoo;Kim, Kyoungyul;Ryu, Heehwan;Hwan, Jung Ju;Hong, Sungyun;Jo, Seonah;Bae, Dusan
    • KEPCO Journal on Electric Power and Energy / v.6 no.3 / pp.305-313 / 2020
  • The use of cable tunnels for electric power transmission as well as their construction in difficult conditions such as in subsea terrains and large overburden areas has increased. So, in order to efficiently operate the small diameter shield TBM (Tunnel Boring Machine), the estimation of advance rate and development of a design model is necessary. However, due to limited scope of survey and face mapping, it is very difficult to match the rock mass characteristics and TBM operational data in order to achieve their mutual relationships and to develop an advance rate model. Also, the working mechanism of previously utilized linear cutting machine is slightly different than the real excavation mechanism owing to the penetration of a number of disc cutters taking place at the same time in the rock mass in conjunction with rotation of the cutterhead. So, in order to suggest the advance rate and machine design models for small diameter TBMs, an EPB (Earth Pressure Balance) shield TBM having 3.54 m diameter cutterhead was manufactured and 19 cases of full-scale tunneling tests were performed each in 87.5 ㎥ volume of artificial rock mass. The relationships between advance rate and machine data were effectively analyzed by performing the tests in homogeneous rock mass with 70 MPa uniaxial compressive strength according to the TBM operational parameters such as thrust force and RPM of cutterhead. The utilization of the recorded penetration depth and torque values in the development of models is more accurate and realistic since they were derived through real excavation mechanism. The relationships between normal force on single disc cutter and penetration depth as well as between normal force and rolling force were suggested in this study. The prediction of advance rate and design of TBM can be performed in rock mass having 70 MPa strength using these relationships. 
An effort was made to improve the application of the developed model by applying the FPI (Field Penetration Index) concept which can overcome the limitation of 100% RQD (Rock Quality Designation) in artificial rock mass.
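
The fitted relationships from the tunneling tests are not given in the abstract, so the sketch below uses only the commonly used definitions: FPI as the normal force per cutter divided by the penetration per cutterhead revolution, and advance rate as penetration per revolution multiplied by cutterhead RPM. The function names and all input values are illustrative assumptions, not results from the paper.

```python
def field_penetration_index(normal_force_kn, penetration_mm_per_rev):
    """FPI in kN per cutter per (mm/rev): normal force over penetration depth."""
    if penetration_mm_per_rev <= 0:
        raise ValueError("penetration must be positive")
    return normal_force_kn / penetration_mm_per_rev


def advance_rate_mm_per_min(penetration_mm_per_rev, cutterhead_rpm):
    """Advance rate as penetration per revolution times revolutions per minute."""
    return penetration_mm_per_rev * cutterhead_rpm


# Illustrative values only (not measurements from the study).
fpi = field_penetration_index(normal_force_kn=100.0, penetration_mm_per_rev=4.0)

# Given an FPI, a planned cutter load implies a penetration depth,
# which in turn implies an advance rate at a chosen RPM.
predicted_penetration = 120.0 / fpi
rate = advance_rate_mm_per_min(predicted_penetration, cutterhead_rpm=5.0)
```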

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not through surveys. Whether the posts on a website are positive or negative is reflected in sales, so companies try to identify this information. However, the many reviews on a website are not always good and are difficult to identify. Earlier studies in this area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, etc. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify sentiment polarity into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set. 
First, for text classification related to sentiment analysis, popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost are adopted as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider sequential data attributes. RNN handles ordered data well because it takes the time information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve this problem. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to understand how, and how well, the models work for sentiment analysis. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can extract features for classification automatically by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be opened and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. 
Furthermore, when LSTM is used in CNN's pooling layer, it has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. In combination with CNN-LSTM, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can improve the weakness of each model, and there is an advantage of improving the learning by layer using the end-to-end structure of LSTM. Based on these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
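
The gating behavior described in this abstract (input, forget, and output gates controlling a memory cell) can be illustrated with a single scalar LSTM step in its standard textbook formulation. This is a sketch for intuition only, not the paper's CNN-LSTM model; the weights below are hypothetical and chosen so that the forget gate stays open and the input gate stays closed, which makes the cell hold its memory across steps.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell (standard formulation).

    params maps each of "i", "f", "o", "g" to a tuple (w_x, w_h, bias).
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))    # input gate: how much new content to write
    f = sigmoid(pre("f"))    # forget gate: how much old memory to keep
    o = sigmoid(pre("o"))    # output gate: how much memory to expose
    g = math.tanh(pre("g"))  # candidate memory content
    c = f * c_prev + i * g   # updated cell state
    h = o * math.tanh(c)     # updated hidden state
    return h, c


# Hypothetical weights: forget and output gates open, input gate closed.
params = {"i": (0.0, 0.0, -20.0), "f": (0.0, 0.0, 20.0),
          "o": (0.0, 0.0, 20.0), "g": (1.0, 0.0, 0.0)}
h, c = 0.0, 1.0
for x in [0.5, -0.3, 0.8, 0.1]:
    h, c = lstm_step(x, h, c, params)
```

With these settings the cell state `c` stays close to its initial value regardless of the inputs, which is the mechanism that lets an LSTM carry information over long sequences.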

A Study on Mixed-Mode Survey which Combine the Landline and Mobile Telephone Interviews: The Case of Special Election for the Mayor of Seoul (유.무선전화 병행조사에 대한 연구: 2011년 서울시장 보궐선거 여론조사 사례)

  • Lee, Kyoung-Taeg;Lee, Hwa-Jeong;Hyun, Kyung-Bo
    • Survey Research / v.13 no.1 / pp.135-158 / 2012
  • Korean telephone surveys have been based on the landline telephone directory or the RDD (Random Digit Dialing) method. These days, however, more households have no landline, or have one but are unwilling to be listed in the directory. Moreover, it is hard to contact young people and office workers, who are usually away from home in the daytime. Because of these issues, the predictive power of election polls has weakened. In particular, low accessibility to those who are away from home when a poll is conducted results in predictions biased toward conservatism. One solution is to contact respondents using both mobile and landline phones: by landline for those who are at home and by mobile phone for those who are away from home in the daytime (Mixed Mode Survey, hereafter MMS). To conduct an MMS, 1) we need to obtain sampling frames for the landline and mobile surveys, and 2) we need to decide the proportion of the sample assigned to each. In this paper, we propose a heuristic method for conducting MMS. The method uses RDD for the landline phone survey and an access panel list for the mobile phone survey. The proportion of sample sizes between landline and mobile phones is determined based on the 'Lifestyle and Time Use Study' conducted by Statistics Korea. As a case study, 4 election polls were conducted during the period of the special election for the mayor of Seoul on Oct 26th, 2011. The initial 3 polls appropriately captured reactions and responses to the issues raised during the survey period, and the final poll came very close to the actual election result.
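
The abstract does not spell out how the time-use data is turned into sample sizes. One simple reading, shown here purely as an assumption, is a proportional split: the landline sample is sized by the share of people at home during survey hours and the remainder goes to the mobile sample. The function name and the 0.4 share are hypothetical.

```python
def allocate_sample(total_n, at_home_share):
    """Split a total sample between landline and mobile interviews.

    at_home_share is the assumed fraction of the population reachable
    at home (by landline) during survey hours.
    """
    if not 0.0 <= at_home_share <= 1.0:
        raise ValueError("at_home_share must be between 0 and 1")
    landline = round(total_n * at_home_share)
    # The remainder is assigned to mobile so the two parts sum to total_n.
    return {"landline": landline, "mobile": total_n - landline}


# Illustrative share only (not a figure from Statistics Korea).
allocation = allocate_sample(1000, 0.4)
```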

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.95-110 / 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in an increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneering researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. 
In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. 
For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
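
The dictionary-based pipeline described above (tag sentiment words, classify each article, then predict the index direction) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' model: the lexicon entries, the majority-vote rule, and the tie-breaking toward "down" are all hypothetical. It shows how the same news can yield a different prediction when a word's polarity differs between a general-purpose and a domain-specific dictionary.

```python
def article_score(tokens, lexicon):
    """Sum the sentiment values of the tokens found in the lexicon."""
    return sum(lexicon.get(tok, 0.0) for tok in tokens)


def predict_direction(articles, lexicon):
    """Return "up" if positive articles outnumber negative ones, else "down".

    Ties resolve to "down"; that choice is arbitrary in this sketch.
    """
    positive = sum(1 for a in articles if article_score(a, lexicon) > 0)
    negative = sum(1 for a in articles if article_score(a, lexicon) < 0)
    return "up" if positive > negative else "down"


# Hypothetical entries: "cut" is negative in general use, but a
# stock-market lexicon might treat a rate cut as good news.
general_lexicon = {"fall": -1.0, "surge": 1.0, "cut": -1.0}
stock_lexicon = {"fall": -1.0, "surge": 1.0, "cut": 1.0}

# Made-up, pre-tokenized headlines.
news = [["central", "bank", "rate", "cut"],
        ["exports", "surge"],
        ["shares", "fall"]]

general_prediction = predict_direction(news, general_lexicon)
domain_prediction = predict_direction(news, stock_lexicon)
```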

Relationship among porcine lncRNA TCONS_00010987, miR-323, and leptin receptor based on dual luciferase reporter gene assays and expression patterns

  • Ding, Yueyun;Qian, Li;Wang, Li;Wu, Chaodong;Li, DengTao;Zhang, Xiaodong;Yin, Zongjun;Wang, Yuanlang;Zhang, Wei;Wu, Xudong;Ding, Jian;Yang, Min;Zhang, Liang;Shang, Jinnan;Wang, Chonglong;Gao, Yafei
    • Asian-Australasian Journal of Animal Sciences / v.33 no.2 / pp.219-229 / 2020
  • Objective: Considering the physiological and clinical importance of leptin receptor (LEPR) in regulating obesity and the fact that porcine LEPR expression is not known to be controlled by lncRNAs and miRNAs, we aim to characterize this gene as a potential target of SSC-miR-323 and the lncRNA TCONS_00010987. Methods: Bioinformatics analyses revealed that lncRNA TCONS_00010987 and LEPR have SSC-miR-323-binding sites and that LEPR might be a target of lncRNA TCONS_00010987 based on cis prediction. Wild-type and mutant TCONS_00010987-target sequence fragments and wild-type and mutant LEPR 3'-UTR fragments were generated and cloned into pmiRRB-REPORTTM-Control vectors to construct respective recombinant plasmids. HEK293T cells were co-transfected with the SSC-miR-323 mimics or a negative control with constructs harboring the corresponding binding sites and relative luciferase activities were determined. Tissue expression patterns of lncRNA TCONS_00010987, SSC-miR-323, and LEPR in Anqing six-end-white (AQ, the obese breed) and Large White (LW, the lean breed) pigs were detected by real-time quantitative polymerase chain reaction; backfat expression of LEPR protein was detected by western blotting. Results: Target gene fragments were successfully cloned, and the four recombinant vectors were constructed. Compared to the negative control, SSC-miR-323 mimics significantly inhibited luciferase activity from the wild-type TCONS_00010987-target sequence and wild-type LEPR-3'-UTR (p<0.01 for both) but not from the mutant TCONS_00010987-target sequence and mutant LEPR-3'-UTR (p>0.05 for both). Backfat expression levels of TCONS_00010987 and LEPR in AQ pigs were significantly higher than those in LW pigs (p<0.01), whereas levels of SSC-miR-323 in AQ pigs were significantly lower than those in LW pigs (p<0.05). LEPR protein levels in the backfat tissues of AQ pigs were markedly higher than those in LW pigs (p<0.01). 
Conclusion: LEPR is a potential target of SSC-miR-323, and TCONS_00010987 might act as a sponge for SSC-miR-323 to regulate LEPR expression.

Microarray Analysis of Long Non-coding RNA Expression Profile Associated with 5-Fluorouracil-Based Chemoradiation Resistance in Colorectal Cancer Cells

  • Xiong, Wei;Jiang, Yong-Xin;Ai, Yi-Qin;Liu, Shan;Wu, Xing-Rao;Cui, Jian-Guo;Qin, Ji-Yong;Liu, Yan;Xia, Yao-Xiong;Ju, Yun-He;He, Wen-Jie;Wang, Yong;Li, Yun-Fen;Hou, Yu;Wang, Li;Li, Wen-Hui
    • Asian Pacific Journal of Cancer Prevention / v.16 no.8 / pp.3395-3402 / 2015
  • Background: Preoperative 5-fluorouracil (5-FU)-based chemoradiotherapy is a standard treatment for locally advanced colorectal cancer (CRC). However, CRC cells often develop chemoradiation resistance (CRR). Recent studies have shown that long non-coding RNA (lncRNA) plays critical roles in a myriad of biological processes and human diseases, as well as chemotherapy resistance. Since the roles of lncRNAs in 5-FU-based CRR in human CRC cells remain unknown, they were investigated in this study. Materials and Methods: A 5-FU-based concurrent CRR cell model was established using human CRC cell line HCT116. Microarray expression profiling of lncRNAs and mRNAs was undertaken in parental HCT116 and 5-FU-based CRR cell lines. Results: In total, 2,662 differentially expressed lncRNAs and 2,398 mRNAs were identified in 5-FU-based CRR HCT116 cells when compared with those in parental HCT116. Moreover, 6 lncRNAs and 6 mRNAs found to be differentially expressed were validated by quantitative real time PCR (qRT-PCR). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis for the differentially expressed mRNAs indicated involvement of many pathways, such as the Jak-STAT, PI3K-Akt and NF-kappa B signaling pathways. To better understand the molecular basis of 5-FU-based CRR in CRC cells, correlated expression networks were constructed based on 8 intergenic lncRNAs and their nearby coding genes. Conclusions: Changes in lncRNA expression are involved in 5-FU-based CRR in CRC cells. These findings may provide novel insight for the prognosis and prediction of response to therapy in CRC patients.

Prediction of infectious diseases using multiple web data and LSTM (다중 웹 데이터와 LSTM을 사용한 전염병 예측)

  • Kim, Yeongha;Kim, Inhwan;Jang, Beakcheol
    • Journal of Internet Computing and Services / v.21 no.5 / pp.139-148 / 2020
  • Infectious diseases have long plagued mankind, and predicting and preventing them has been a major challenge. For this reason, various studies have been conducted to predict infectious diseases. Most early studies relied on epidemiological data from the Centers for Disease Control and Prevention (CDC); the problem was that the CDC data was updated only once a week, making it difficult to predict the number of disease outbreaks in real time. However, with the emergence of various Internet media due to the recent development of IT, studies have been conducted to predict the occurrence of infectious diseases from web data, and most of the studies we reviewed used a single source of web data to predict diseases. Disease forecasting from a single web data source, however, has the disadvantage that it is difficult to collect large amounts of training data and to make accurate predictions for recent outbreaks such as "COVID-19". Thus, we aim to demonstrate through experiments that LSTM models that use multiple web data sources to predict the occurrence of infectious diseases are more accurate than those that use a single web data source, and to suggest models suitable for predicting infectious diseases. In this experiment, we predicted the occurrence of "Malaria" and "Epidemic-parotitis" using single web data models and the model we propose. A total of 104 weeks of news, SNS, and search query data were collected, of which 75 weeks were used as training data and 29 weeks as verification data. When we predicted the verification data using our proposed model and the single web data models, the Pearson correlation coefficient for the predictions of our proposed model showed the highest similarity at 0.94 and 0.86, and its RMSE was also the lowest at 0.19 and 0.07.
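
The two evaluation metrics named in this abstract, the Pearson correlation coefficient and RMSE, together with the 75/29-week split, can be sketched in plain Python. The chronological ordering of the split and the sample values are assumptions made for illustration; the abstract only gives the sizes of the two sets.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# 104 weekly observations split into 75 training and 29 verification weeks
# (chronological order is an assumption of this sketch).
weeks = list(range(104))
train_weeks, verify_weeks = weeks[:75], weeks[75:]

# Made-up values: predictions that track the trend with a constant offset.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
r = pearson(actual, predicted)
error = rmse(actual, predicted)
```

The toy data shows why both metrics are reported: a constant offset leaves the correlation at its maximum while the RMSE still registers the error.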

Analysis and Prediction of Sewage Components of Urban Wastewater Treatment Plant Using Neural Network (대도시 하수종말처리장 유입 하수의 성상 평가와 인공신경망을 이용한 구성성분 농도 예측)

  • Jeong, Hyeong-Seok;Lee, Sang-Hyung;Shin, Hang-Sik;Song, Eui-Yeol
    • Journal of Korean Society of Environmental Engineers / v.28 no.3 / pp.308-315 / 2006
  • Since sewage characteristics are the most important factors that can affect the biological reactions in wastewater treatment plants, a detailed understanding of the characteristics of the influent sewage and of on-line measurement techniques would play an important role in determining appropriate control strategies. In this study, samples were taken at two-hour intervals for 51 days, from 1st October to 21st November 2005, from the influent gate of a sewage treatment plant. Then the characteristics of the sewage were investigated. It was found that the daily values of flow rate and concentrations of sewage components showed a defined profile. The highest and lowest peak values were observed during 11:00~13:00 and 05:00~07:00, respectively. Also, it was shown that the concentrations of sewage components were strongly correlated with the UV absorbance measured at 300 nm. Therefore, the objective of the paper is to develop an on-line estimation technique for the concentration of each component in the sewage using accumulated profiles of sewage, absorbance, and flow rate, which can be measured in real time. As a first step, regression analysis was performed using the absorbance and component concentration data. Then a neural network trained with the input of influent flow rate, absorbance, and inflow duration was used. Both methods showed remarkable accuracy in predicting the resulting concentrations of the individual components of the sewage. In case of using the neural network, the predicted value md of the measurement were 19.3 and 14.4 for TSS, 26.7 and 25.1 for TCOD, 5.4 and 4.1 for TN, and for TP, 0.45 to 0.39, respectively.
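
The first step described in this abstract, regressing a component concentration on UV absorbance, can be sketched as a one-variable least-squares fit. The form of the regression actually used in the paper is not stated, so a simple straight line is assumed here, and the data points are synthetic values generated for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return slope, intercept


# Synthetic calibration data (not measurements from the study):
# absorbance at 300 nm versus TSS concentration.
absorbance = [0.10, 0.20, 0.30, 0.40]
tss = [30.0, 50.0, 70.0, 90.0]

slope, intercept = fit_line(absorbance, tss)

# Estimate the concentration for a new on-line absorbance reading.
estimate = slope * 0.25 + intercept
```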