Search | Korea Science

Clustering Method based on Genre Interest for Cold-Start Problem in Movie Recommendation (영화 추천 시스템의 초기 사용자 문제를 위한 장르 선호 기반의 클러스터링 기법)

You, Tithrottanak;Rosli, Ahmad Nurzid;Ha, Inay;Jo, Geun-Sik
- Journal of Intelligence and Information Systems / v.19 no.1 / pp.57-77 / 2013
Social media has become one of the most popular media in web and mobile applications. According to a 2011 study by the Nielsen Company, social networks and blogs remained the top destination of online users: nearly four in five active Internet users visited them, and these sites accounted for 23 percent of the time Americans spent online. Facebook is the social network on which U.S. Internet users spend the most time, ahead of other services such as Yahoo, Google, AOL Media Network, Twitter, and LinkedIn. Recently, most companies have promoted their products on Facebook by creating a "Facebook Page" for a specific product. The "Like" option lets users subscribe to a page and receive updates on what interests them. Film makers around the world also market and promote their films through Facebook Pages. In addition, a great number of streaming service providers let users subscribe to watch movies and TV programs instantly over the Internet on PCs, Macs, and TVs. Netflix alone, the world's leading subscription service, has more than 30 million streaming members in the United States, Latin America, the United Kingdom, and the Nordics. In fact, a million movies and TV programs of different genres are offered to subscribers. As a consequence, users must spend a lot of time finding movies that match their genre interests. In recent years, many researchers have proposed methods to improve the prediction of ratings or preferences so that the most relevant items, such as books, music, or movies, can be recommended to a target user or to a group of users who share an interest in particular items.
One of the most popular methods for building a recommendation system is traditional Collaborative Filtering (CF). The method computes the similarity between the target user and other users, and clusters users with shared interests according to the items they have rated. It then predicts other items from the same group of users to recommend to that group. Many kinds of items can be recommended, such as books, music, movies, news, and videos; in this paper we focus only on movies. CF faces several challenges. The first is the "sparsity problem", which occurs when there is not enough information about user preferences; recommendation accuracy is then lower than for neighbors with a large number of ratings. The second is the "cold-start problem", which occurs whenever new users or items, each with no ratings or only a few, are added to the system. For instance, no personalized predictions can be made for a new user without any ratings on record. In this research we propose a clustering method based on users' genre interest extracted from a social network service (SNS) and on users' movie rating information to solve the cold-start problem. The proposed method clusters the target user together with other users by combining user genre interest and rating information. A huge amount of interesting and useful user information is available from the Facebook Graph, and we extract it from the Facebook Pages that users have "Liked". Moreover, we use the Internet Movie Database (IMDb) as the main dataset. IMDb is an online database containing a large amount of information about movies, TV programs, and actors.
This dataset is used not only to provide movie information in our Movie Rating System, but also as a resource for the genre information of movies extracted from the Facebook Pages. Users must first log in to the Movie Rating System with their Facebook account; at the same time, our system collects their genre interest from the Facebook Pages. We conducted experiments comparing our method with other methods. First, we evaluated the proposed method in the normal recommendation case to see how it improves the recommendation result. Then we evaluated it in the cold-start case. The experiments show that the proposed method outperforms the other methods in both cases.
https://doi.org/10.13088/jiis.2013.19.1.057
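The cold-start idea in the abstract above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes genre interest is a simple count of liked pages per genre, picks neighbors by cosine similarity over those counts, and predicts a rating as a similarity-weighted average. The function names, the `k` parameter, and the toy data are all hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_for_new_user(target_genres, users, movie, k=2):
    """Predict a rating for a user who has no ratings yet.

    Neighbors are chosen by genre-interest similarity only; their ratings
    for `movie` are averaged, weighted by that similarity.
    """
    scored = []
    for profile in users.values():
        if movie in profile["ratings"]:
            sim = cosine(target_genres, profile["genres"])
            if sim > 0:
                scored.append((sim, profile["ratings"][movie]))
    top = sorted(scored, reverse=True)[:k]  # k most similar raters
    if not top:
        return None  # no rater shares any genre interest with the target
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical toy data: genre counts from liked pages, plus movie ratings.
users = {
    "u1": {"genres": {"action": 3, "sci-fi": 2}, "ratings": {"m1": 5, "m2": 2}},
    "u2": {"genres": {"romance": 4, "drama": 1}, "ratings": {"m1": 1, "m2": 5}},
    "u3": {"genres": {"action": 1, "sci-fi": 4}, "ratings": {"m1": 4}},
}
new_user_genres = {"action": 2, "sci-fi": 2}
prediction = predict_for_new_user(new_user_genres, users, "m1")
```

Because the new user shares genres only with `u1` and `u3`, the prediction for `m1` falls between their ratings, while `u2`'s low rating is ignored. A fuller treatment would also blend in rating similarity once the user has rated a few movies, as the abstract describes.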

The Child Sexual Assaults by Kin -The Experience of YoungNam District Sunflower Center for Prevention of Child Sexual Assaults- (친족에 의한 아동 성폭력 실태 - 영남권역 해바라기 아동센터의 경험 -)

Seo, Sun-Ki;Lee, Sang-Han
- Journal of forensic and investigative science / v.2 no.2 / pp.21-29 / 2007
Media reports of sexual assaults on children committed by their natural fathers no longer attract social attention. The number of crimes related to Child Sexual Assault (CSA) is increasing every year in spite of the "Special Act on Prevention of Sexual Assault in Korea". The YoungNam District Sunflower Center for Prevention of Child Sexual Assaults (SC-CSA) was established in Daegu in June 2005. It simultaneously provides forensic evaluation of physical evidence and medical and psychological treatment for victims of sexual assault under 13 years of age. This study examined the 36 cases of CSA by kin among the 180 cases in total reported to the YoungNam District SC-CSA from its opening until December 2006. Most of the victims were girls (32 cases). Twenty-eight cases (78%) were indecent assaults and 8 cases (22%) were rapes. The assailants were overwhelmingly male (35 cases). In 21 cases (58.3%) the assailant was identified as the victim's natural father. Most incidents took place at the victim's residence (31 cases, 86.1%), and in many cases the victims had been sexually assaulted regularly for many years (25 cases, 69.4%). From these findings, we conclude that CSA committed by kin has specific characteristics: it is not a one-time incident but a persistently recurring crime. However, in 22 cases (61.1%), the victim's guardian did not want to report the assault or have the assailant punished. Because the assailants were natural fathers or relatives of the victims, the other family members probably thought that revealing the wrongdoing would be shameful and would damage the family's reputation. The SC-CSA provides counseling and medical treatment to victims with the consent of the parents. Owing to the guardians' misjudgment, incidents are sometimes not reported to the police. When an incident is not reported, the assailant is free to commit further crimes, which increases the number of victims.
Legislation supporting the operation of the SC-CSA is not yet in place, so the stability of the centers is not guaranteed. Even though reporting incidents to the police is obligatory, some cases still go unreported. Currently, there are three SC-CSA centers: in Seoul, Daegu, and Gwangju. More centers need to be established to reduce CSA cases in Korea.

Attitudes to Safety of Genetically Modified Foods in Korea -Focus on Consumers- (유전자재조합 식품의 안전성에 대한 기본인식 조사 -일반 소비자를 중심으로 _)

김영찬;박경진;김성조;강은영;김동연
- Journal of Food Hygiene and Safety / v.16 no.1 / pp.66-75 / 2001
A survey was conducted from December 1999 to April 2000 to investigate consumers' attitudes toward foods developed by gene recombination techniques. Questionnaires were mailed to 1,500 people, and 1,101 responded. The consumers were asked about knowledge, acceptance, purchase intention, and labeling information. Although the proportion of consumers who knew of genetically modified foods (GMF) (88.8%) was lower than that of the food expert group (98.7%), many consumers had some knowledge of GMF, which may have been influenced by news from the mass media. Seventy-nine percent of the consumers responded that gene recombination technology is necessary in food production, which is similar to the findings of the survey of the expert group. The proportion of consumers responding that these foods are potentially hazardous was 88.1%, somewhat higher than in the expert group (80.9%). Consumers with greater knowledge were less worried about a potential hazard of gene recombinant foods (p<0.01). Although 62.9% of the consumers said they were willing to purchase those foods, only 16.2% of them said they would purchase the foods unconditionally, which is lower than in the expert group (23.5%). There was no statistically significant relationship between knowledge and purchase intention. Of the consumers, 90.3% wanted information on gene recombination to be labeled on the foods. The data from this survey suggest that consumers' knowledge of GMF is not accurate, so a proper strategy for consumer education may need to be developed. In addition, it is necessary to improve the safety assessment system and analytical techniques for GMF and to build a pre- and post-market surveillance system for efficient implementation of GMF labeling.

A Comparative Study of Information Delivery Method in Networks According to Off-line Communication (오프라인 커뮤니케이션 유무에 따른 네트워크 별 정보전달 방법 비교 분석)

Park, Won-Kuk;Choi, Chan;Moon, Hyun-Sil;Choi, Il-Young;Kim, Jae-Kyeong
- Journal of Intelligence and Information Systems / v.17 no.4 / pp.131-142 / 2011
A Social Network Service is defined as a web-based service that allows individuals to construct a public or semi-public profile within a bounded system, articulate a list of other users with whom they share a connection, and traverse their list of connections. Facebook and Twitter are representative Social Network Service sites, and they are attracting great attention worldwide. Many people use Social Network Services to form and maintain social relationships, and the number of users has recently increased dramatically. Accordingly, many organizations have become interested in Social Network Services as a means of marketing, media, and communication with their customers, because these services can offer a variety of benefits to organizations such as companies and associations. In other words, organizations can use Social Network Services to respond rapidly to users' various behaviors because the services make communication between users easier and faster. The marketing cost of a Social Network Service is also lower than that of existing tools such as broadcasts, newspapers, and direct mail. In addition, Social Network Services are growing in the marketplace, so organizations can acquire potential future customers. However, organizations communicate with users through Social Network Services uniformly, without considering the characteristics of the networks, although different networks have different effects on information delivery. For example, members' cohesion is higher in offline communication than in online communication because the members of an offline community are very close; that is, a network based on offline communication has strong ties. Accordingly, information delivery is fast in such a network.
In this study, we compose two networks on Twitter that have different communication characteristics. The first network is constructed from data based on offline communication, such as friends, family, and seniors and juniors in school. The second network is constructed from randomly selected users who want to associate with friends online. Each network consists of 250 people divided into three groups. The first group is the ego, the person at the center of the network. The second group is the ego's followers. The last group is composed of the followers of the ego's followers. We compare the networks through social network analysis and follower reaction analysis. We investigate density and centrality to analyze the characteristics of each network, and we analyze follower reactions such as replies and retweets to find differences in information delivery between the networks. Our experimental results indicate that the density and centrality of the offline communication-based network are higher than those of the online-based network. In addition, the number of replies is larger than the number of retweets in the offline communication-based network, whereas the number of retweets is larger than the number of replies in the online-based network. Through these experiments, we identified that the effect of information delivery in the offline communication-based network differs from that in the online communication-based network. Therefore, anyone who wants to use a social network as an effective marketing tool should configure the appropriate network type in light of the characteristics of the network.
https://doi.org/10.13088/jiis.2011.17.4.131
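The two network measures named in the abstract above, density and centrality, can be sketched for a small follower graph. The abstract does not say which centrality measure the authors used; in-degree centrality is assumed here as one common choice, and the node names and edges are hypothetical.

```python
def density(nodes, edges):
    """Directed density: observed edges divided by possible ordered pairs."""
    n = len(nodes)
    return len(edges) / (n * (n - 1)) if n > 1 else 0.0

def in_degree_centrality(nodes, edges):
    """Share of the other nodes that point at (follow) each node."""
    n = len(nodes)
    counts = {node: 0 for node in nodes}
    for _, target in edges:
        counts[target] += 1
    if n < 2:
        return {node: 0.0 for node in nodes}
    return {node: count / (n - 1) for node, count in counts.items()}

# Hypothetical ego network: an edge (a, b) means "a follows b".
nodes = ["ego", "f1", "f2", "ff1"]
edges = {("f1", "ego"), ("f2", "ego"), ("ff1", "f1"), ("f1", "f2")}

d = density(nodes, edges)                        # 4 of 12 possible edges
centrality = in_degree_centrality(nodes, edges)  # ego is followed by 2 of 3
```

Comparing `d` and the centrality values between an offline-based and an online-based network of equal size is the kind of comparison the abstract reports; a graph library would normally be used for networks of 250 nodes.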

Effects of an Educational Program for the High Risk Group of Cardio-cerebrovascular Disease: Awareness of the Warning Signs and Symptoms of Acute Myocardial Infarction and Stroke in the Aged at Senior Centers (심뇌혈관질환 고위험군 대상 교육프로그램의 효과: 경로당노인의 심근경색과 뇌졸중에 대한 경고증상 인지도)

Song, Jung-Kook;Park, Hyeung-Keun;Hong, Seong Chul
- Journal of agricultural medicine and community health / v.40 no.3 / pp.126-136 / 2015
Objectives: This study was performed to investigate the effects of a health education program for the aged on knowledge about the warning signs and symptoms of acute myocardial infarction and stroke. Methods: Data from 337 elderly people (159 participants and 178 non-participants) at senior centers in Jeju-si were collected by one-to-one interview from January to March 2012, one year after the education program was provided. The study was performed in two stages: a cross-sectional, case-control study of the level of knowledge about the warning signs and symptoms, and multivariate logistic regression to find predictors of optimal awareness. Results: No significant difference in knowledge level was found between the case and control groups. One year later, both groups showed a knowledge level as high as the surge that had been observed one month after the education was provided. The factors affecting the optimal level of knowledge were education (odds ratio 3.01; confidence interval 1.72-5.26; P<0.001) and watching TV news 7 days per week (2.97; 1.68-5.23; P<0.001). However, participation in the health education was not significant (1.60; 0.98-2.61; P=0.059). Conclusions: The effects of a targeted program for groups at high risk of cardio-cerebrovascular disease are guaranteed only when reinforced by a population-based mass-media education campaign.
https://doi.org/10.5393/JAMCH.2015.40.3.126

Success Factor in the K-Pop Music Industry: focusing on the mediated effect of Internet Memes (대중음악 흥행 요인에 대한 연구: 인터넷 밈(Internet Meme)의 매개효과를 중심으로)

YuJeong Sim;Minsoo Shin
- Journal of Service Research and Studies / v.13 no.1 / pp.48-62 / 2023
As the recent K-pop craze shows, the size and influence of the Korean music industry are growing. At least 6,000 songs are released each year in the Korean music market, but few can be called successful. Many studies and attempts are being made to identify the factors that make music a hit. Commercial factors such as media exposure and promotion, as well as the quality of the music, play an important role in its commercial success. Recently, there have been many marketing campaigns using Internet memes in the pop music industry. Internet memes are activities or trends that spread among people as cultural units in various forms, such as images and videos. Depending on the Internet environment and the characteristics of digital communication, content is expanded and reproduced in the form of various memes, which draws a greater response from consumers. Internet memes used to arise naturally, but artists who are aware of their marketing effect have recently begun using them as an element of marketing. In this paper, the mediated effect of Internet memes in relation to the success factors of popular music was analyzed, and a prediction model reflecting it was proposed. The analysis showed that the same factors had mediated effects through the 'cover effect' and the 'challenge effect'. Among the internal success factors, there were mediated effects for "Singer Recognition" and for the genres "POP, Dance, Ballad, Trot and Electronica"; among the external success factors, there were mediated effects for "Planning Company Capacity," "The Number of Music Broadcasting Programs," and "The Number of News Articles." Predictive models reflecting the cover effect and the challenge effect achieved F1-scores of 0.6889 and 0.7692, respectively.
This study is meaningful in that it collected and analyzed actual chart data, presented commercial directions that can be used in practice, and identified many success factors of popular music along with the mediating effects of Internet memes.
https://doi.org/10.18807/jsrs.2023.13.1.048
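For readers unfamiliar with the F1-score quoted in the abstract above, the following minimal sketch shows how it is computed as the harmonic mean of precision and recall. The labels are hypothetical (1 = hit song, 0 = not a hit) and have no relation to the study's chart data.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative.
actual    = [1, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 1, 0, 1]
score = f1_score(actual, predicted)
```

Here precision and recall are both 3/4, so the F1-score is also 0.75.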

The Analysis of the Current Status of Medical Accidents and Disputes Researched in the Korean Web Sites (인터넷 사이트를 통해 살펴본 의료사고 및 의료분쟁의 현황에 관한 분석)

Cha, Yu-Rim;Kwon, Jeong-Seung;Choi, Jong-Hoon;Kim, Chong-Youl
- Journal of Oral Medicine and Pain / v.31 no.4 / pp.297-316 / 2006
The increase in medical disputes is one of the remarkable social phenomena of our time. In particular, we must not overlook the fact that the production and circulation of information related to medical accidents is increasing rapidly through the Internet. In this research, we searched for web sites providing information related to medical accidents using the keyword "medical accidents" in March 2006, and classified the 28 web sites found according to the type of establisher. We also analyzed the contents of the sites and compared their current status and the problems that need to be improved. Finally, we suggested possible solutions to prevent medical accidents. The detailed results are listed below.
1. Medical practitioners, the general public, and lawyers were all familiar with, and mainly preferred, the term "medical accidents".
2. Among the sites found by the keyword "medical accidents", lawyers had the most sites and medical practitioners had the fewest.
3. Many sites run by the general public and by lawyers had their own medical record analysts, but there were few professional analysts for dentistry.
4. The general public was more interested in the prevention of medical accidents, whereas lawyers were more interested in the process after medical accidents. The sites run by medical practitioners dealt least with remedies for medical accidents compared with the other sites.
5. The general public wanted third-party intervention in disputes, such as by the government, including a medical dispute arbitration law and/or the establishment of an independent medical dispute judgment institution.
6. In the comparison among the establishers of web sites, medical practitioners presented the fewest examples of medical accidents.
7. Dental accident cases presented in counseling articles received less attention than their actual importance warrants.
8. Whereas there were many articles about domestic cases related to bloody dental treatment, a large number of the open counseling articles concerned dental treatment not covered by insurance.
9. Comparing the information on medical accidents offered by each type of establisher, the general public mainly offered terminology, lawyers offered related laws, and medical practitioners offered medical knowledge.
10. All of them cited news reported by the media to present the current status of domestic medical accidents. Among the web sites run by the general public, NGOs in particular provided plentiful statistical data related to medical accidents.
11. Only two web sites collected medical accident cases.
As a result of our research, we found that, in the flood of information, medical disputes can be caused by wrong information from third parties, and that medical practitioners have the most passive attitude toward medical accidents. Thus, it is crucial for lawyers, patients, and medical practitioners to exchange information with one another, so that accidents and disputes can be resolved more positively and actively on the basis of clear mutual understanding.

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
- Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive data generation, thereby greatly influencing society. This phenomenon is unmatched in history, and we now live in the Age of Big Data. SNS data qualifies as Big Data because it satisfies the conditions of data amount (volume), data input and output speed (velocity), and variety of data types (variety). Trends of issues discovered in SNS Big Data can serve as an important new source for creating value because this information covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) it provides the topic keyword set that corresponds to the daily ranking; (2) it visualizes the daily time-series graph of a topic over a month; (3) it presents the importance of a topic through a treemap based on a scoring system and frequency; and (4) it visualizes the daily time-series graph of a keyword retrieved by keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop word removal and noun extraction, to process various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to rapidly process a large amount of real-time data, such as the Hadoop distributed system or NoSQL, an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data because Hadoop is designed to scale up from a single node to thousands of machines.
Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no fixed schema or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is attractive to the Big Data community because it helps analysts examine such data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind arbitrary data to the Document Object Model (DOM); interaction with the data is easy, which is useful for managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, which consists of pre-configured plug-in style sheets and JavaScript libraries, to build the web system. The TITS Graphical User Interface (GUI) is designed using these libraries, and it is capable of detecting issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS). Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system and make it available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets collected in Korea during March 2013.
https://doi.org/10.13088/jiis.2014.20.2.109

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

Park, Ho-yeon;Kim, Kyoung-jae
- Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions in terms of accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research areas in natural language processing and is widely studied in text mining. When real online reviews are openly available to others, it is not only easy to collect information, but the reviews also affect business. In marketing, real-world information from customers is gathered on websites, not through surveys. Whether a website's posts are positive or negative is reflected in sales, so companies try to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, accuracy remains limited because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set.
First, as comparative models, the study adopts popular machine learning algorithms for text classification related to sentiment analysis, such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider sequential data attributes. RNN handles ordered data well because it takes the time information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve the problem of long-term dependency. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination. We also tried to understand how well, and in what way, the models work for sentiment analysis. This study proposes an integrated CNN and LSTM algorithm to extract the positive and negative features of text. The reasons for combining these two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the input, output, and forget gates of the LSTM can be opened and closed at the desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem.
Furthermore, when LSTM is used at the CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be handled simultaneously. The combined CNN-LSTM achieved 90.33% accuracy. It is slower than CNN but faster than LSTM. The presented model was more accurate than the other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weaknesses of each model, and the end-to-end structure of LSTM has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
https://doi.org/10.13088/jiis.2019.25.4.141
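The division of labor described in the abstract above, where convolution extracts local features and LSTM gates carry information along the sequence, can be shown in miniature. This is a toy sketch under strong assumptions and not the authors' model: every value is a scalar, there is no pooling or embedding layer, and the weights are hand-picked and untrained. All names and numbers are hypothetical.

```python
from math import exp, tanh

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def conv1d(seq, kernel):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(seq[i + j] * kernel[j] for j in range(k)))
        for i in range(len(seq) - k + 1)
    ]

def lstm_step(x, h, c, weights):
    """One scalar LSTM step; weights maps gate name -> (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, bias = weights[name]
        return w_x * x + w_h * h + bias
    i = sigmoid(pre("input"))    # how much new information to write
    f = sigmoid(pre("forget"))   # how much of the old cell state to keep
    o = sigmoid(pre("output"))   # how much of the cell state to expose
    g = tanh(pre("candidate"))   # candidate value to write
    c_new = f * c + i * g
    h_new = o * tanh(c_new)
    return h_new, c_new

def cnn_lstm_score(seq, kernel, weights):
    """Run convolution features through the LSTM; squash the last state."""
    h, c = 0.0, 0.0
    for feature in conv1d(seq, kernel):
        h, c = lstm_step(feature, h, c, weights)
    return sigmoid(h)  # untrained polarity-like score in (0, 1)

# Hypothetical, untrained weights and a made-up embedded review.
weights = {
    "input": (0.5, 0.1, 0.0),
    "forget": (0.3, 0.2, 1.0),
    "output": (0.4, 0.1, 0.0),
    "candidate": (0.6, 0.2, 0.0),
}
embedded_review = [0.2, 0.9, 0.4, 0.7, 0.1]
score = cnn_lstm_score(embedded_review, kernel=[0.5, 0.5], weights=weights)
```

The convolution outputs can be computed independently of one another, which is why CNN parallelizes well, whereas each `lstm_step` depends on the previous state, which is why LSTM does not. A practical model would use a deep learning framework with trained, vector-valued layers.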
