• Title/Summary/Keyword: Paired dataset


A Study on Generation Quality Comparison of Concrete Damage Image Using Stable Diffusion Base Models (Stable diffusion의 기저 모델에 따른 콘크리트 손상 영상의 생성 품질 비교 연구)

  • Seung-Bo Shim
    • Journal of the Korea institute for structural maintenance and inspection / v.28 no.4 / pp.55-61 / 2024
  • The number of aging concrete structures is steadily increasing as many of them reach their expected lifespan. Such structures require accurate inspection and persistent maintenance; otherwise, their original function and performance may degrade, potentially leading to safety accidents. Research on objective inspection technologies using deep learning and computer vision is therefore actively under way. High-resolution images make it possible to observe not only micro cracks but also spalling and exposed rebar, and deep learning enables automated detection. High detection performance, however, depends on large and diverse training datasets, and concrete surface damage is not commonly captured in images, which leaves training data scarce. To overcome this limitation, this study proposes a method for generating images of concrete surface damage, including cracks, spalling, and exposed rebar, using Stable Diffusion. The method synthesizes new damage images from paired text and image data. A training dataset of 678 images was secured, and fine-tuning was performed through low-rank adaptation. The quality of the generated images was compared across three Stable Diffusion base models. As a result, a method that synthesizes the most diverse and highest-quality concrete damage images was developed. This research is expected to address data scarcity and to contribute to improving the accuracy of deep learning-based damage detection algorithms.
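
The low-rank adaptation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the rank, the scaling factor, or which layers were adapted, so the function names (`lora_forward`, `lora_param_count`), the `alpha` parameter, and all dimensions below are assumptions chosen for illustration, not the author's configuration. The idea shown is only the general one: the base weight matrix stays frozen and a small low-rank product is added to it.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, w, a, b, alpha):
    """Compute W x + (alpha / r) * B (A x).

    w: frozen base weights, shape d_out x d_in
    a: trainable down-projection, shape r x d_in
    b: trainable up-projection, shape d_out x r
    alpha: scaling hyperparameter (an assumed value, not from the abstract)
    """
    rank = len(a)
    base = matvec(w, x)                 # output of the frozen base layer
    update = matvec(b, matvec(a, x))    # low-rank correction
    scale = alpha / rank
    return [h + scale * u for h, u in zip(base, update)]


def lora_param_count(d_out, d_in, rank):
    """Number of trainable parameters in the two low-rank factors."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size; compare trainable parameters with full fine-tuning.
    d_out, d_in, rank = 320, 320, 4
    print("full fine-tuning:", d_out * d_in)
    print("low-rank factors:", lora_param_count(d_out, d_in, rank))
```

When `b` is initialized to zeros, the adapted layer reproduces the base layer exactly, which is why fine-tuning can start from the unchanged base model.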

Identification of key genes and carcinogenic pathways in hepatitis B virus-associated hepatocellular carcinoma through bioinformatics analysis

  • Sang-Hoon Kim;Shin Hwang;Gi-Won Song;Dong-Hwan Jung;Deok-Bog Moon;Jae Do Yang;Hee Chul Yu
    • Annals of Hepato-Biliary-Pancreatic Surgery / v.26 no.1 / pp.58-68 / 2022
  • Backgrounds/Aims: Mechanisms for the development of hepatocellular carcinoma (HCC) in hepatitis B virus (HBV)-infected patients remain unclear. The aim of the present study was to identify genes and pathways involved in the development of HBV-associated HCC. Methods: The GSE121248 gene dataset, which included 70 HCCs and 37 adjacent liver tissues, was downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) in HCCs and adjacent liver tissues were identified. Gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were then performed. Results: Of 134 DEGs identified, 34 were up-regulated and 100 were down-regulated in HCCs. The 34 up-regulated DEGs were mainly involved in nuclear division, organelle fission, spindle and midbody formation, histone kinase activity, and the p53 signaling pathway, whereas the 100 down-regulated DEGs were involved in steroid and hormone metabolism, collagen-coated extracellular matrix, oxidoreductase activity, and activity on paired donors, including incorporation or reduction of molecular oxygen, monooxygenase activity, and retinol metabolism. Analyses of protein-protein interaction networks with a high degree of connectivity identified significant modules containing 14 hub genes, including ANLN, ASPM, BUB1B, CCNB1, CDK1, CDKN3, ECT2, HMMR, NEK2, PBK, PRC1, RACGAP1, RRM2, and TOP2A, which were mainly associated with nuclear division, organelle fission, spindle formation, protein serine/threonine kinase activity, the p53 signaling pathway, and the cell cycle. Conclusions: This study identified key genes and carcinogenic pathways that play essential roles in the development of HBV-associated HCC. This may provide important information for the development of diagnostic and therapeutic targets for HCC.

Social Network-based Hybrid Collaborative Filtering using Genetic Algorithms (유전자 알고리즘을 활용한 소셜네트워크 기반 하이브리드 협업필터링)

  • Noh, Heeryong;Choi, Seulbi;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.19-38 / 2017
  • Collaborative filtering (CF) algorithms have been widely used to implement recommender systems, and many prior studies have sought to improve the accuracy of CF. Among them, some recent studies adopt a 'hybrid recommendation approach', which enhances the performance of conventional CF by using additional information. In this research, we propose a new hybrid recommender system that fuses CF with the results of social network analysis on trust and distrust relationship networks among users to enhance prediction accuracy. The proposed algorithm is based on memory-based CF. However, when calculating the similarity between users, it considers not only the correlation of the users' numeric rating patterns but also the users' in-degree centrality values derived from the trust and distrust relationship networks. Specifically, it amplifies the similarity between a target user and a neighbor when the neighbor has higher in-degree centrality in the trust relationship network, and it attenuates that similarity when the neighbor has higher in-degree centrality in the distrust relationship network. The proposed algorithm considers four types of user relationships in total: direct trust, indirect trust, direct distrust, and indirect distrust. It uses four adjusting coefficients, which set the level of amplification or attenuation for the in-degree centrality values derived from the direct and indirect trust and distrust relationship networks. To determine the optimal adjusting coefficients, a genetic algorithm (GA) was adopted. Accordingly, we named the proposed algorithm SNACF-GA (Social Network Analysis-based CF using GA). To validate the performance of SNACF-GA, we used a real-world data set, the 'Extended Epinions dataset' provided by 'trustlet.org'.
The data set contains user responses (rating scores and reviews) after purchasing specific items (e.g., cars, movies, music, books) as well as trust / distrust relationship information indicating whom each user trusts or distrusts. The experimental system was developed mainly in Microsoft Visual Basic for Applications (VBA), but we also used UCINET 6 to calculate the in-degree centrality of the trust / distrust relationship networks. In addition, we used Palisade Software's Evolver, a commercial software package that implements genetic algorithms. To examine the effectiveness of the proposed system more precisely, we adopted two comparison models. The first is conventional CF, which uses only users' explicit numeric ratings when calculating the similarities between users; that is, it does not consider trust / distrust relationships between users at all. The second is SNACF (Social Network Analysis-based CF). SNACF differs from the proposed SNACF-GA in that it considers only direct trust / distrust relationships and does not use GA optimization. The performance of the proposed algorithm and the comparison models was evaluated using the average MAE (mean absolute error). The experimental results showed that the optimal adjusting coefficients for direct trust, indirect trust, direct distrust, and indirect distrust were 0, 1.4287, 1.5, and 0.4615, respectively. This implies that distrust relationships between users are more important than trust relationships in recommender systems. In terms of recommendation accuracy, SNACF-GA (Avg. MAE = 0.111943), the proposed algorithm that reflects both direct and indirect trust / distrust relationship information, was found to outperform conventional CF (Avg. MAE = 0.112638). It also showed better recommendation accuracy than SNACF (Avg. MAE = 0.112209). To confirm whether these differences are statistically significant, we applied a paired-samples t-test.
The paired-samples t-test showed that the difference between SNACF-GA and conventional CF was statistically significant at the 1% significance level, and the difference between SNACF-GA and SNACF was statistically significant at the 5% level. Our study found that trust / distrust relationships can be important information for improving the performance of recommendation algorithms. In particular, distrust relationship information was found to have a greater impact on the performance improvement of CF. This implies that more attention should be paid to distrust (negative) relationships than to trust (positive) ones when tracking and managing social relationships between users.
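
The evaluation described in this abstract, comparing models by mean absolute error and then checking the differences with a paired-samples t-test, can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and the abstract does not say what the paired observations were (for example per-fold or per-user MAE values), so that pairing is an assumption.

```python
import math
import statistics


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a paired-samples t-test.

    sample_a and sample_b hold matched observations, e.g. the MAE of two
    models measured on the same folds (an assumed pairing).
    """
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)        # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Made-up error values for illustration only; not the paper's data.
    baseline_mae = [0.1130, 0.1124, 0.1129, 0.1125]
    proposed_mae = [0.1121, 0.1118, 0.1120, 0.1119]
    t, df = paired_t_statistic(baseline_mae, proposed_mae)
    print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t value would then be compared against the t distribution with the given degrees of freedom to obtain the significance level.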

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce gives customers many advantageous purchase opportunities. In this situation, customers who do not have enough knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect a user's preferences and provide a recommendation list to the user. Thus, the product recommender system in an online shopping store has become known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect the user's preferences cause disappointment and waste the user's time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting the user's preferences precisely. The research data are collected from a real-world online shopping store that deals in products from famous art galleries and museums in Korea. The data initially contain 5759 transactions, of which 3167 remain after deletion of null data. We transform the categorical variables into dummy variables and exclude outliers. The proposed model consists of two steps. The first step predicts customers who have a high likelihood of purchasing products in the online shopping store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. We perform these data mining techniques using SAS E-Miner software. We partition the data into two sets, modeling and validation, for logistic regression and decision trees, and into three sets, training, test, and validation, for the artificial neural network model. The validation dataset is the same for all experiments.
Then we combine the results of each predictor using multi-model ensemble techniques such as bagging and bumping. Bagging is an abbreviation of "Bootstrap Aggregation"; it combines the outputs of several machine learning techniques to raise the performance and stability of prediction or classification. This technique is a special form of the averaging method. Bumping is an abbreviation of "Bootstrap Umbrella of Model Parameter," and it considers only the model with the lowest error value. The results show that bumping outperforms bagging and the other predictors except for the "Poster" product group, for which the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support an association at 5%, the maximum number of items in an association at 4, and the minimum confidence for rule generation at 10%. We also exclude extracted association rules with a lift value below 1. After excluding duplicate rules, we finally obtain fifteen association rules. Among the fifteen association rules, eleven contain associations between products in the "Office Supplies" product group, one includes an association between the "Office Supplies" and "Fashion" product groups, and the other three contain associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we construct the prototype system using ASP, JavaScript, and Microsoft Access.
In addition, we survey user satisfaction with the recommended product list from the proposed system and with randomly selected product lists. The survey participants are 173 users of MSN Messenger, Daum Café, and P2P services. We evaluate user satisfaction using a five-point Likert scale. This study also performs a paired-samples t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level, meaning that users were significantly more satisfied with the recommended product list. The results also show that the proposed system may be useful in real-world online shopping stores.
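
The rule-filtering step in this abstract (minimum support of 5%, minimum confidence of 10%, and exclusion of rules with lift below 1) can be sketched with the standard definitions of the three measures. The function names and the toy transactions below are hypothetical; the study itself used SAS E-Miner, so this is an illustration of the measures under those stated thresholds rather than a reproduction of the authors' procedure.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    a, c = set(antecedent), set(consequent)
    n = len(baskets)
    count_a = sum(1 for basket in baskets if a <= basket)
    count_c = sum(1 for basket in baskets if c <= basket)
    count_both = sum(1 for basket in baskets if (a | c) <= basket)
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


def keep_rule(support, confidence, lift,
              min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds quoted in the abstract (5%, 10%, lift >= 1)."""
    return (support >= min_support
            and confidence >= min_confidence
            and lift >= min_lift)


if __name__ == "__main__":
    # Toy baskets for illustration only; not the study's transaction data.
    baskets = [
        {"pen", "notebook"},
        {"pen", "notebook"},
        {"pen"},
        {"scarf"},
    ]
    metrics = rule_metrics(baskets, {"pen"}, {"notebook"})
    print(metrics, keep_rule(*metrics))
```

A lift above 1 indicates that the two item sets are bought together more often than would be expected if they were purchased independently, which is why rules below that value are discarded.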