• Title/Summary/Keyword: Reddit

Age and Gender in Reddit Commenting and Success

  • Finlay, S. Craig
    • Journal of Information Science Theory and Practice / v.2 no.3 / pp.18-28 / 2014
  • Reddit is a large user-generated content (UGC) website in which users form common-interest groups and submit links to external content or text posts of user-created content. The website operates on a voting system whereby registered users can assign positive or negative ratings to both submitted content and comments on that content. While Reddit is a pseudonymous site, with users creating usernames but providing no biographical data, an informal survey posted to a large shared-interest community yielded 734 responses that included the age and gender of users. This provided a large amount of contextual biographical data with which to analyse user profiles at the first level of Computer Mediated Discourse Analysis (CMDA), as articulated by Susan Herring. The results indicate that older Reddit users both produce more complex writing and enjoy more success when rated by other users. Gender data was incomplete, and as such only tentative results could be proposed in that regard.

Analysis of Users' Sentiments and Needs for ChatGPT through Social Media on Reddit (Reddit 소셜미디어를 활용한 ChatGPT에 대한 사용자의 감정 및 요구 분석)

  • Hye-In Na;Byeong-Hee Lee
    • Journal of Internet Computing and Services / v.25 no.2 / pp.79-92 / 2024
  • ChatGPT, a representative chatbot built on generative artificial intelligence technology, is valuable not only in scientific and technological domains but also across diverse sectors such as society, economy, industry, and culture. This study conducts an exploratory analysis of user sentiments and needs for ChatGPT by examining global social media discourse on Reddit. We collected 10,796 comments on Reddit from December 2022 to August 2023 and then employed keyword analysis, sentiment analysis, and need-mining-based topic modeling to derive insights. The analysis reveals several key findings. The most frequently mentioned term in ChatGPT-related comments is "time," indicative of users' emphasis on prompt responses, time efficiency, and enhanced productivity. Users express trust and anticipation toward ChatGPT, yet simultaneously voice concerns and frustrations regarding its societal impact, including fear and anger. In addition, the topic modeling analysis identifies 14 topics, shedding light on potential user needs. Notably, users exhibit a keen interest in the educational applications of ChatGPT and its societal implications. Moreover, our investigation uncovers various user-driven topics related to ChatGPT, encompassing language models, jobs, information retrieval, healthcare applications, services, gaming, regulations, energy, and ethical concerns. In conclusion, this analysis provides insights into user perspectives, emphasizing the significance of understanding and addressing user needs. The identified application directions offer valuable guidance for enhancing existing products and services or planning the development of new service platforms.

In the Log Cabin with My Favorite Player: Appreciating Traditional American Masculinity Through Homoerotic Language in Baseball Fandom

  • Shin, Hyerin;Jie, Sue Hyun
    • American Studies / v.42 no.1 / pp.133-159 / 2019
  • r/NYYankees, a sub-forum ("subreddit") of the website Reddit, is devoted to the Major League Baseball team the New York Yankees, and its predominantly male users show their appreciation for baseball heroes by expressing erotic desires towards the players. When a player performs well, the subreddit fills with expressions of admiration and of a desire to become the player's intimate lover, explicitly voiced by "male" fans. This paper explains the phenomenon as young male fans' desire for the now-lost model of traditional masculinity of domination and control, displayed in the context of baseball players' dominant performances. The discrepancy between a fan's non-homosexual real-world self and his homoerotic language on the subreddit is explained using the "performative fandom" theory, developed by Osborne and Coombs by borrowing Butler's notion of performativity. This paper suggests how this desire for traditional masculinity serves as recognition of the collapse of masculinity in modern American society.

Comparison of Industrial Mathematics Issues between Korea and the US Using Topic Modeling (토픽모델링을 활용한 한국과 미국의 산업수학 이슈 비교)

  • Kim, Sung-Yeun
    • The Journal of the Korea Contents Association / v.22 no.7 / pp.30-45 / 2022
  • This study used text mining to explore the issues of industrial mathematics in online news articles and online forums in Korea and the US, and compared the results. For Korea, text data about industrial mathematics were collected from news articles on Naver, a major portal site, and from postings and replies on Clien; for the US, they were collected from news articles by the New York Times and CNN and from postings and replies on Reddit. Structural topic modeling analyses were performed, and the major results were as follows. First, news articles in Korea mainly dealt with the necessity of industrial mathematics and government support. In contrast, news articles in the US focused more on the various fields in which industrial mathematics was applied. Second, in Korea, the same number of issues, on different topics, were discussed in news articles and online forums, whereas in the US more issues were covered in news articles than in online forums. The study suggests academic implications for researchers and practical implications for the government for establishing industrial mathematics in Korea.

Correlation Analysis Between Online Public Opinion and Stock Price (SNS 여론과 주가지수의 상관관계 분석)

  • Hyun-Ji Kim;Sung-Ju Oh
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.394-395 / 2023
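  • Conventional economic theory, which assumes a "rational, ideal, and reasonable human," is known not to always match real-world situations. Behavioral economics, which emerged as an alternative, holds that psychology, cognition, emotion, and sociocultural background influence human economic decision-making. Drawing on behavioral economics, this study used a big-data model to analyze whether individual emotions and experiences affect economic decision-making. Reddit was selected as the source of SNS public opinion and the S&P 500 as the stock index. The collected text data were preprocessed and passed through sentiment analysis to produce the independent variable, and the direction of the index's rise or fall was used as the dependent variable in a logistic model. The analysis confirmed a positive correlation between public sentiment and market sentiment. In addition, models that set a lag achieved higher accuracy, a result at odds with the efficient-market hypothesis (EMH) of conventional economics. However, follow-up research based on more extensive data is needed to determine the optimal lag.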

Research on Transformer-Based Approaches for MBTI Classification Using Social Network Service Data (트랜스포머 기반 MBTI 성격 유형 분류 연구 : 소셜 네트워크 서비스 데이터를 중심으로)

  • Jae-Joon Jung;Heui-Seok Lim
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.529-532 / 2023
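  • This paper is the first study in Korea to classify users' MBTI personality types by applying transfer learning with Transformer-family language models to the text data of social network users. Using the Kaggle MBTI Dataset, transfer learning with pretrained models such as RoBERTa Distill and DeBERTa-V3 yielded an average accuracy of 87.9181 and an average F1 score of 87.58 across the four MBTI dimensions E/I, N/S, T/F, and J/P. Compared with the state of the art in overseas research, the standard deviation of the F1 score across the four dimensions was reduced by 50.1%, indicating more even classification performance across dimensions. In addition, the study classified text data from global social network services such as Twitter and Reddit, extending the Transformer-based MBTI classification methodology.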

Topic Modeling Insomnia Social Media Corpus using BERTopic and Building Automatic Deep Learning Classification Model (BERTopic을 활용한 불면증 소셜 데이터 토픽 모델링 및 불면증 경향 문헌 딥러닝 자동분류 모델 구축)

  • Ko, Young Soo;Lee, Soobin;Cha, Minjung;Kim, Seongdeok;Lee, Juhee;Han, Ji Yeong;Song, Min
    • Journal of the Korean Society for Information Management / v.39 no.2 / pp.111-129 / 2022
  • Insomnia is a chronic disease in modern society, with the number of new patients increasing by more than 20% in the last 5 years. Insomnia is a serious disease that requires diagnosis and treatment because the individual and social problems caused by a lack of sleep are serious and the triggers of insomnia are complex. This study collected 5,699 documents from 'insomnia', a community on Reddit, a social media platform where users freely express opinions. Following the International Classification of Sleep Disorders (ICSD-3) standard and guidelines, and with the help of experts, the insomnia corpus was constructed by tagging the documents as insomnia-tendency or non-insomnia-tendency documents. Five deep learning language models (BERT, RoBERTa, ALBERT, ELECTRA, XLNet) were trained using the constructed insomnia corpus as training data. In the performance evaluation, RoBERTa showed the highest performance, with an accuracy of 81.33%. For an in-depth analysis of the insomnia social data, topic modeling was performed using the recently introduced BERTopic method, which addresses weaknesses of the widely used LDA. The analysis identified 8 subject groups ('Negative emotions', 'Advice and help and gratitude', 'Insomnia-related diseases', 'Sleeping pills', 'Exercise and eating habits', 'Physical characteristics', 'Activity characteristics', 'Environmental characteristics'). Users expressed negative emotions and sought help and advice from the Reddit insomnia community. In addition, they mentioned diseases related to insomnia, shared discourse on the use of sleeping pills, and expressed interest in exercise and eating habits. As insomnia-related characteristics, we found physical characteristics such as breathing, pregnancy, and heart; activity characteristics such as zombies, hypnic jerk, and groggy; and environmental characteristics such as sunlight, blankets, temperature, and naps.

Sentiment Analysis and Issue Mining on All-Solid-State Battery Using Social Media Data (소셜미디어 분석을 통한 전고체 배터리 감성분석과 이슈 탐색)

  • Lee, Ji Yeon;Lee, Byeong-Hee
    • The Journal of the Korea Contents Association / v.22 no.10 / pp.11-21 / 2022
  • All-solid-state batteries are one of the promising candidates for next-generation batteries and are drawing attention as a key component that will lead the future electric vehicle industry. This study analyzes 10,280 comments posted from 2016 to 2021 on Reddit, a global social media platform, in order to identify policy issues and public interest related to all-solid-state batteries. Text mining techniques such as frequency analysis, association rule analysis, and topic modeling, together with sentiment analysis, are applied to the collected global data to grasp global trends, compare them with the South Korean government's all-solid-state battery development strategy, and suggest policy directions for its national research and development. The overall sentiment toward all-solid-state battery issues was positive, with 50.5% positive and 39.5% negative comments. In addition, an analysis of detailed emotions found that the public had trust in and expectations for all-solid-state batteries. However, feelings of concern about unresolved problems coexisted. This study makes an academic and practical contribution in that it presents a text mining analysis method for deriving key issues related to all-solid-state batteries, and a more comprehensive trend analysis that employs both a top-down approach based on government policy analysis and a bottom-up approach that analyzes public perception.

Social Media Mining Toolkit (SMMT)

  • Tekumalla, Ramya;Banda, Juan M.
    • Genomics & Informatics / v.18 no.2 / pp.16.1-16.5 / 2020
  • There has been a dramatic increase in the popularity of utilizing social media data for research purposes within the biomedical community. In PubMed alone, there have been nearly 2,500 publication entries since 2014 that deal with analyzing social media data from Twitter and Reddit. However, the vast majority of those works do not share their code or data for replicating their studies. With minimal exceptions, the few that do place the burden on the researcher to figure out how to fetch the data, how to best format their data, and how to create automatic and manual annotations on the acquired data. In order to address this pressing issue, we introduce the Social Media Mining Toolkit (SMMT), a suite of tools intended to encapsulate the cumbersome details of acquiring, preprocessing, annotating and standardizing social media data. The purpose of our toolkit is to let researchers focus on answering research questions rather than on the technical aspects of using social media data. By using a standard toolkit, researchers will be able to acquire, use, and release data in a consistent way that is transparent for everybody using the toolkit, thereby simplifying research reproducibility and accessibility in the social media domain.

Technology Mining and Sentiment Analysis on Hydrogen Fuel Cell Using National R&D and Social Data (국가R&D와 소셜 데이터를 활용한 수소연료전지 기술마이닝과 감성분석)

  • Lee, Byeong-Hee;Choi, Jung-Woo;Kim, Tae-Hyun
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.341-343 / 2022
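  • As greenhouse gas emissions have emerged as a global issue, the hydrogen economy, which uses hydrogen as an energy source, is attracting attention. The hydrogen fuel cell is one component of the hydrogen economy; it uses hydrogen to produce heat and electricity and has the advantage of high energy conversion efficiency. This study analyzed social data related to hydrogen fuel cells, collected from Reddit, a global online community, using text mining and sentiment analysis techniques. The analysis classified 9,211 comments into four topic groups using LDA (Latent Dirichlet Allocation). From these, the group most closely related to hydrogen fuel cells was selected, and STM (Structural Topic Model) analysis extracted 10 topics, among which three topics related to the climate and environment, the hydrogen industry, and hydrogen vehicles were identified. Based on these results, the study aims to grasp real-world global developments in hydrogen fuel cells quickly and effectively, to make forecasts about hydrogen fuel cells, and to suggest policy directions for Korea's national R&D related to hydrogen fuel cells.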