• Title/Summary/Keyword: Cross-feature Analysis

Surface-Engineered Graphene surface-enhanced Raman scattering Platform with Machine-learning Enabled Classification of Mixed Analytes

  • Jae Hee Cho;Garam Bae;Ki-Seok An
    • Journal of Sensor Science and Technology / v.33 no.3 / pp.139-146 / 2024
  • Surface-enhanced Raman scattering (SERS) enables the detection of various types of π-conjugated biological and chemical molecules owing to its exceptional sensitivity in obtaining unique spectra, offering nondestructive classification capabilities for target analytes. Herein, we demonstrate a strategy for building machine learning (ML)-enabled predictive SERS platforms through surface-engineered graphene via complementary hybridization with Au nanoparticles (NPs). The hybridized Au NPs/graphene SERS platforms showed exceptional sensitivity (10⁻⁷ M) owing to the strong cooperative interplay between the localized electromagnetic effect and the enhanced chemical bonding reactivity. The chemical and physical properties of the demonstrated SERS platform were systematically investigated using microscopy and spectroscopic analysis. Furthermore, a strategy employing ML is proposed to predict various analytes based on a featured Raman spectral database. Using a customized data-preprocessing algorithm, the feature data for ML were extracted from Raman peak characteristics, such as intensity, position, and width, in the SERS spectrum data. Additionally, various types of ML classification models were evaluated using k-fold cross-validation (k = 5), showing 99% prediction accuracy.

Research on Objects Tracking System using HOG Algorithm and CNN (HOG 알고리즘과 CNN을 이용한 객체 검출 시스템에 관한 연구)

  • Park Byungjoon;Kim Hyunsik
    • Journal of Korea Society of Digital Industry and Information Management / v.20 no.3 / pp.13-23 / 2024
  • Detecting and tracking objects in continuous video is essential in self-driving cars, security and surveillance systems, sports analytics, medical image processing, and more. Correlation tracking methods such as Normalized Cross Correlation (NCC) and Sum of Absolute Differences (SAD) are used as an effective way to measure the similarity between two images. NCC, a representative correlation tracking method, has been useful in real-time environments because it is relatively simple to compute and effective. However, correlation tracking methods are sensitive to rotation and size changes of objects, making them difficult to apply to videos that change in real time. To overcome these limitations, this paper proposes an object tracking method that combines the Histogram of Oriented Gradients (HOG) feature, which effectively captures object data, with the Convolutional Neural Network (CNN) algorithm. By using the two algorithms, the shape and structure of the object can be effectively represented and learned, resulting in more reliable and accurate object tracking. The performance of the proposed method is verified through experiments, and its superiority is demonstrated.
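
The sensitivity of correlation measures to rotation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `sad` and `ncc`, the zero-mean form of NCC, and the 2x2 patches are assumptions chosen for illustration.

```python
import math

def sad(a, b):
    """Sum of Absolute Differences between two equal-sized grayscale patches."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def ncc(a, b):
    """Zero-mean Normalized Cross Correlation; the score lies in [-1, 1]."""
    fa = [x for row in a for x in row]
    fb = [y for row in b for y in row]
    ma = sum(fa) / len(fa)
    mb = sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb))
    return num / den if den else 0.0  # flat patches carry no correlation signal

# Hypothetical 2x2 patches, for illustration only.
template = [[10, 20], [30, 40]]
brighter = [[60, 70], [80, 90]]  # same pattern, every pixel +50
rotated = [[30, 10], [40, 20]]   # template rotated 90 degrees clockwise

score_brighter = ncc(template, brighter)
score_rotated = ncc(template, rotated)
```

In this toy case a uniform brightness shift leaves the NCC score at its maximum while SAD grows, and a 90-degree rotation of the same patch drops the NCC score to zero, which is the weakness the abstract attributes to correlation tracking.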

Characteristics of Mass Transport Depending on the Feature of Tidal Creek at Han River Estuary, Gyeong-gi Bay, South Korea (경기만 염하수로에서의 비정규 격자 수치모델링을 통한 조간대 조수로의 고려에 따른 Mass Transport 특성)

  • Kim, Minha;Woo, Seung-Buhm
    • Journal of Korean Society of Coastal and Ocean Engineers / v.25 no.2 / pp.41-51 / 2013
  • The tidal-creek-dependent mass transport characteristics in Gyeong-Gi Bay (west coast of Korea) were studied using field-measured data and a numerical model. Gyeong-Gi Bay consists of three main tidal channels and contains a vast, well-developed tidal flat. This region is known for its large tidal range and strong currents. We aim to study the effect of the tidal creeks in the tidal flat on the mass exchange between the estuary and the ocean. For the numerical application, an unstructured grid is essential, since the tidal creeks have complicated shapes. For this purpose, FVCOM is applied to the study area and simulations are performed for two cases. In case A, the geographic characteristics of the tidal creeks are ignored in the numerical grid; in case B, the tidal creeks are resolved using the unstructured grid. These two cases are compared with the field-measured cross-channel mass transport data. The cross-channel mass transport at the Yeomha waterway mouth and Incheon harbor was measured on June 9-10 (spring tide) and June 17-18 (neap tide), 2009. CTD casts and ADCP cross-channel transects were conducted 13 times in one tidal cycle. The analysis of the observation data showed that mass transport is ebb-dominant along Line 1 (Yeomha waterway mouth), whereas it is flood-dominant along Line 2 (Incheon harbor front). By comparing the numerical model (cases A and B) with the observation data, we found that the case B results show much better agreement with the measurements than case A. This shows that the geographic features of tidal creeks should be considered in the grid design of a numerical model in order to understand the mass transport characteristics over a large tidal flat area.

Multi-camera image feature analysis for virtual space convergence (가상공간 융합을 위한 다중 카메라 영상 특징 분석)

  • Yun, Jong-Ho;Choi, Myung-Ryul;Lee, Sang-Sun
    • Journal of the Korea Convergence Society / v.8 no.5 / pp.19-28 / 2017
  • In this paper, we propose a method to reduce the differences in image characteristics that arise when images are captured with multiple cameras for virtual space production. Sixty-four images were obtained by cross-mounting eight camera bodies and eight lenses. The image analysis compares the standard deviation of the histogram and the pixel distribution values. The analysis shows that image characteristics differ depending on the lens or image sensor, even among cameras of the same model. To compensate for this difference, we adjusted the distribution of the overall brightness values of the image. As a result, whereas the average deviation had reached a maximum of 6.89 indoors and 24.23 outdoors, we obtained images with almost no deviation (maximum 0.42 indoors and 2.73 outdoors). In the future, we will study and apply image analysis methods that are more accurate than the image brightness distribution.
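
One simple way to adjust a brightness distribution toward a reference camera is a linear mean and standard deviation match. The Python sketch below shows that idea only. The abstract does not specify the authors' adjustment procedure, so the function `match_brightness`, the clipping to an assumed 8-bit range, and the sample pixel values are all assumptions.

```python
from statistics import mean, pstdev

def match_brightness(pixels, ref_mean, ref_std):
    """Linearly rescale brightness values so their mean and standard
    deviation match a reference camera (hypothetical helper)."""
    m, s = mean(pixels), pstdev(pixels)
    scale = ref_std / s if s else 1.0  # a flat image is only shifted, not scaled
    # Clip to the 8-bit range assumed for this sketch.
    return [min(255.0, max(0.0, (p - m) * scale + ref_mean)) for p in pixels]

# Hypothetical flattened brightness values from two cameras of the same model.
cam_a = [100, 110, 120, 130]  # reference camera
cam_b = [140, 160, 180, 200]  # brighter, higher-contrast camera

adjusted = match_brightness(cam_b, mean(cam_a), pstdev(cam_a))
```

After the adjustment, the second camera's values share the reference camera's mean and spread, so the deviation between the two brightness distributions shrinks.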

Prediction of Key Variables Affecting NBA Playoffs Advancement: Focusing on 3 Points and Turnover Features (미국 프로농구(NBA)의 플레이오프 진출에 영향을 미치는 주요 변수 예측: 3점과 턴오버 속성을 중심으로)

  • An, Sehwan;Kim, Youngmin
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.263-286 / 2022
  • This study acquires NBA statistical information for a total of 32 years, from 1990 to 2022, using web crawling, observes the variables of interest through exploratory data analysis, and generates related derived variables. Unused variables were removed through a cleansing process on the input data, and correlation analysis, t-tests, and ANOVA were performed on the remaining variables. For the variables of interest, the difference in means between the teams that advanced to the playoffs and those that did not was tested, and, to complement this, the mean difference among three groups (upper/middle/lower) based on ranking was then reconfirmed. Of the input data, only the current season's data was used as the test set, and 5-fold cross-validation was performed by dividing the remainder into training and validation sets for model training. Overfitting was ruled out by comparing the cross-validation results with the final results on the test set and confirming that there was no difference in the performance metrics. Because the quality of the raw data is high and the statistical assumptions are satisfied, most of the models showed good results despite the small data set. This study not only predicts NBA game results or classifies whether a team advances to the playoffs using machine learning, but also examines, through the importance of the input attributes, whether the variables of interest are among the most important variables. Visualizing SHAP values made it possible to overcome the limited interpretability of feature importance results alone and to compensate for the lack of consistency in the importance calculation when variables are added or removed. A number of variables related to three-pointers and turnovers, the subjects of interest in this study, were found to be among the major variables affecting playoff advancement in the NBA. Although this study resembles existing work in sports data analysis in that it covers topics such as match results, playoffs, and championship predictions and comparatively analyzes several machine learning models, it differs in that the features of interest are set in advance and statistically verified so that they can be compared with the machine learning results. It is also differentiated from existing studies by presenting explanatory visualizations using SHAP, one of the XAI methods.
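
The 5-fold cross-validation split mentioned in this abstract can be sketched in plain Python. This is a generic illustration, not the study's code: the function `k_fold_indices` and the sample count of 12 are hypothetical, and the folds are contiguous and unshuffled for clarity, whereas real pipelines usually shuffle first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous, non-overlapping folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val
        start = stop

# Hypothetical training pool of 12 samples; the held-out test season
# would be kept outside this pool entirely.
folds = list(k_fold_indices(12, k=5))
```

Each sample lands in exactly one validation fold, so averaging the five validation scores gives an estimate that can then be compared against the score on the separately held-out test set.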

Segmentation and Visualization of Human Anatomy using Medical Imagery (의료영상을 이용한 인체장기의 분할 및 시각화)

  • Lee, Joon-Ku;Kim, Yang-Mo;Kim, Do-Yeon
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.1 / pp.191-197 / 2013
  • Conventional CT and MRI scans produce cross-sectional slices of the body that are viewed sequentially by radiologists, who must imagine or extrapolate from these views what the three-dimensional anatomy should be. By using sophisticated algorithms and high-performance computing, these cross-sections may be rendered as direct 3D representations of human anatomy. 2D medical image analysis is forced to rely on time-consuming, subjective, error-prone manual techniques, such as slice tracing and region painting, for extracting regions of interest. To overcome these drawbacks, 3D visualization combined with medical image processing is essential for extracting anatomical structures and making measurements. We used gray-level thresholding, region growing, contour following, and a deformable model to segment human organs, and used feature vectors from texture analysis to detect cancer. We used perspective projection and the marching cubes algorithm to render surfaces from volumetric MR and CT image data. The 3D visualization of human anatomy and segmented organs provides valuable benefits for radiation treatment planning, surgical planning, surgery simulation, image-guided surgery, and interventional imaging applications.
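
Of the segmentation techniques listed in this abstract, region growing is the easiest to show compactly. The Python sketch below is a textbook version under stated assumptions, not the authors' implementation: the function `region_grow`, the 4-connectivity, the fixed tolerance against the seed value, and the tiny sample slice are all hypothetical.

```python
from collections import deque

def region_grow(image, seed, tol):
    """Return the set of 4-connected pixels whose gray level is within
    `tol` of the seed pixel's gray level."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Hypothetical 3x4 slice: a dark structure on the left, bright tissue in
# the middle, and a second dark patch on the right that is not connected.
slice_2d = [
    [10, 12, 90, 91],
    [11, 13, 92, 11],
    [12, 95, 93, 10],
]
organ = region_grow(slice_2d, seed=(0, 0), tol=5)
```

Unlike plain gray-level thresholding, which would also select the dark pixels on the right edge, region growing keeps only the pixels connected to the seed.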

Reverse Design for Composite Rotor Blade of BO-105 Helicopter (BO-105 헬리콥터 복합재 로터 블레이드 역설계)

  • Lee, Chang-Bae;Jang, KiJoo;Im, Byeong-Uk;Shin, SangJoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.49 no.7 / pp.539-547 / 2021
  • A helicopter rotor blade must be designed by considering the interacting effects among aerodynamics, flexibility, and controllability. Reverse design allows the structural components to have common characteristics by using the configuration numerics and experimental results. This paper aims to design a composite rotor blade that shares common characteristics with that of the BO-105. The present engineering design procedure divides the rotor blade into a few sections and lays out composite laminates across each cross section. For each section, the variational asymptotic beam sectional analysis (VABS) program is used to evaluate its flapwise, lagwise, and torsional stiffnesses so that the discrepancy is smaller than a certain tolerance. Finally, CAMRAD II is used to predict the stress acting on the rotor blade during a specific flight condition and to check whether the present design is structurally valid.

The pattern of use by gender and age of the discourse markers 'a', 'eo', and 'eum' (담화표지 '아', '어', '음'의 성별과 연령별 사용 양상)

  • Song, Youngsook;Shim, Jisu;Oh, Jeahyuk
    • Phonetics and Speech Sciences / v.12 no.4 / pp.37-45 / 2020
  • This paper quantitatively measured the frequency and duration of the discourse markers 'a', 'eo', and 'eum' using the Seoul Corpus, a spontaneous speech corpus. The sound durations were confirmed with Praat, the Seoul Corpus was analyzed with EmEditor, and the results were presented through statistical analysis with R. Based on the corpus analysis, the study investigated whether a particular factor is preferred by speakers of particular categories. The most prominent feature of the corpus is that female speakers produced longer sound durations than male speakers when using the discourse marker 'eum' in final position. Among the age-related variables, teenagers uttered 'a' more than 'eo' in initial position when compared to people in their 40s. This study is significant because it quantitatively analyzes the discourse markers 'a', 'eo', and 'eum' by gender and age. To continue the discussion, more precise research that considers the context should be conducted. In addition, similarities can be found with 'e' and 'ma' in Japanese (Watanabe & Ishi, 2000) and 'uh' and 'um' in English (Gries, 2013). A future cross-linguistic analysis of discourse could identify these commonalities and differences.

A Study of Segmental and Syllabic Intervals of Canonical Babbling and Early Speech

  • Chen, Xiaoxiang;Xiao, Yunnan
    • Cross-Cultural Studies / v.28 / pp.115-139 / 2012
  • The interval, or duration, of segments, syllables, words, and phrases is an important acoustic feature that influences the naturalness of speech. A number of cross-sectional studies of the acoustic characteristics of children's speech development found that these intervals tend to change with age. One hypothesis holds that decreases in intervals are greater when children are younger and smaller when they are older (Thelen, 1991); it has been supported by a number of cross-sectional studies (Tingley & Allen, 1975; Kent & Forner, 1980; Chermak & Schneiderman, 1986). The other hypothesis predicts that decreases in intervals are smaller when children are younger and greater when they are older (Smith, Kenney & Hussain, 1996). Researchers have thus arrived at conflicting postulations and inconsistent results about how intervals of segments, syllables, words, and phrases change, leaving the issue unresolved. Most acoustic investigations of children's speech production have used cross-sectional designs, which involve studying several groups of children; so far, there are only a few longitudinal studies. The issue needs more longitudinal investigation; moreover, acoustic measures of the intervals of child speech are hardly available. All former studies focus on the word stages and exclude the babbling stages, especially the canonical babbling stage, but we need to find out when concrete changes in intervals begin to occur and what causes them. Therefore, we conducted an acoustic study of the interval characteristics of segments and words in canonical babbling (CB) and early speech in an infant aged 0;9 to 2;4 acquiring Mandarin Chinese. The research addresses two questions: (1) Are decreases in intervals greater when children are younger and smaller when they are older, or vice versa? (2) Do the acoustic features of intervals in child speech drift in the direction of the language the child is exposed to? The subject, a female infant living in Changsha whose L1 was Southern Mandarin, was a normally developing monolingual child of well-educated parents. She was audio- and video-taped at home by the investigators under natural observation for about one hour, almost weekly, from age 0;9 to 2;4. The recordings were digitized, parts of the digitized material were labeled, and all repetitions were excluded. The utterances were extracted from 44 sessions ranging from 30 minutes to one hour and spanning the transition from babble to speech; they were divided into segments and syllable-sized units, transcribed in narrow IPA, and coded for analysis. Babble was coded from age 0;9 to 1;0 and words from 1;0 to 2;4, and the data were checked by two professionally trained persons who majored in phonetics. The present investigation is a longitudinal analysis of some temporal characteristics of the child's speech during the age periods 0;9-1;0, 1;1-1;5, 1;6-2;0, and 2;1-2;4. The answer to Research Question 1 is that our results agree with neither the hypothesis that decreases in intervals are greater at younger ages (Thelen, 1991) nor the hypothesis that they are greater at older ages (Smith, Kenney & Hussain, 1996). On the whole, segmental and syllabic durations tend to decrease with age, but the changes are not drastic or abrupt. For example, /a/ after /k/ in Table 1 shows a greater decrease during 1;1-1;5, while /a/ after /p/, /t/, and /w/ shows a greater decrease during 2;1-2;4; /ka/ shows a greater decrease during 1;1-1;5, while /ta/ and /na/ show a greater decrease during 2;1-2;4. Across the age periods, the intervals fluctuate continually. The answer to Research Question 2 is yes. The babbling stage is a period in which the acoustic features of the child's intervals of segments, syllables, words, and phrases shift in the direction of the language to be learned; babbling and the emergence of children's speech are greatly influenced by the ambient language. The phonetic changes in duration continue until as late as 10-12 years of age before reaching adult-like levels. With increasing exposure to the ambient language, the variation becomes smaller and smaller until children attain adult-like competence. According to the analysis with SPSS 15.0, the decrease in segmental and syllabic intervals across the four age periods is not statistically significant (p > 0.05). This means that the change in segmental and syllabic intervals is continuous, and it reveals that the process of child speech development is gradual and cumulative.

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review / v.16 no.3 / pp.161-177 / 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive, since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM), have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model and apply it to corporate credit rating prediction in order to enhance accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies such as Lorena and de Carvalho (2008) and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results of studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest the genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate biological evolution. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. In particular, the mutation operator helps prevent GA from falling into local optima, so that a globally optimal or near-optimal solution can be found. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as the optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, together with their credit ratings. Using various statistical methods including one-way ANOVA and stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e., credit rating, was labeled with four classes: 1 (A1), 2 (A2), 3 (A3), and 4 (B and C). Eighty percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross-validation to our dataset. To examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN, and MSVM. For MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that enables GA. The other comparative models were run using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
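
The McNemar test used in this abstract to compare classifiers can be sketched in a few lines of Python. This is a generic version with the common continuity correction, not the study's code. The function name `mcnemar`, the use of `math.erfc` for the one-degree-of-freedom chi-square tail, and the toy predictions are assumptions, and the numbers are unrelated to the paper's results.

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """McNemar's chi-square (with continuity correction) and p-value for two
    classifiers evaluated on the same samples."""
    triples = list(zip(y_true, pred_a, pred_b))
    b = sum(1 for t, a, p in triples if a == t and p != t)  # only A correct
    c = sum(1 for t, a, p in triples if a != t and p == t)  # only B correct
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical predictions on 20 samples that all belong to class 1.
y_true = [1] * 20
model_a = [1] * 18 + [0] * 2   # wrong only on the last two samples
model_b = [0] * 12 + [1] * 8   # wrong on the first twelve samples

chi2, p_value = mcnemar(y_true, model_a, model_b)
```

Only the samples on which the two models disagree enter the statistic, which is why the test suits paired comparisons of models evaluated on the same validation data.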