• Title/Abstract/Keywords: Page Clone

Web Page Similarity Based on Size and Frequency of Tokens

  • Lee, Eun-Joo;Jung, Woo-Sung
    • Journal of Information Technology Services / v.11 no.4 / pp.263-275 / 2012
  • Web applications are becoming hard to maintain because of the high complexity and duplication of web pages. However, most research on code clones focuses on code hunks and is limited to a specific language. We therefore propose GSIM, a language-independent statistical approach that detects similar pages based on the scarcity and frequency of customized tokens. The tokens, obtained by splitting pages on a given set of separators, are defined as the atomic elements for calculating the similarity between two pages. This paper gives the domain definition for web applications and the algorithms for collecting tokens, making matrics, and calculating similarity. We also evaluated the approach experimentally on open-source code with our GSIM tool. The results show the applicability of the proposed method and the effects of parameters such as threshold, toughness, and token length on quality and performance.
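
The token-based comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the GSIM algorithm itself: the abstract does not give the separator set, the weighting, or the similarity formula, so the `SEPARATORS` pattern, the `min_len` parameter, and the length-weighted Jaccard measure below are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical separator set; the abstract does not list the separators GSIM uses.
SEPARATORS = r"[\s<>=\"'/;(){}]+"


def tokenize(page: str, min_len: int = 2) -> Counter:
    """Split a page on the separators and count tokens of at least min_len characters."""
    tokens = [t for t in re.split(SEPARATORS, page) if len(t) >= min_len]
    return Counter(tokens)


def similarity(a: Counter, b: Counter) -> float:
    """Length-weighted Jaccard similarity of two token multisets, in [0, 1].

    Assumption: longer tokens count more (weight = token length) and
    repeated tokens count by their frequency. This stands in for the
    weighting of the original method, which the abstract does not specify.
    """
    shared = sum(len(t) * min(a[t], b[t]) for t in a.keys() & b.keys())
    total = sum(len(t) * max(a[t], b[t]) for t in a.keys() | b.keys())
    return shared / total if total else 1.0


def is_clone(page_a: str, page_b: str, threshold: float = 0.8) -> bool:
    """Flag two pages as similar when their score reaches the (assumed) threshold."""
    return similarity(tokenize(page_a), tokenize(page_b)) >= threshold
```

In this sketch a higher `threshold` reports fewer page pairs and a larger `min_len` ignores short tokens, which loosely mirrors the threshold and token-length parameters the abstract mentions.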

Gene Cloning of Cellulose Degradation Enzyme of Bacillus subtilis LYH201 Strain

  • Lee, Young-Han;Park, Sang-Ryeol
    • Korean Journal of Soil Science and Fertilizer / v.34 no.5 / pp.333-341 / 2001
  • A compost-decomposing bacterium was isolated from livestock compost containing sawdust. The isolate was identified as Bacillus subtilis LYH201 by fatty acid composition analysis with the MIDI system and by Bergey's manual. The CMCase-encoding gene was cloned by the shotgun method. Clone pLK100, which produced a yellow activity ring on CMC medium, carried a 2.2 kb DNA insert in the pBluescript II $SK^+$ vector; the gene was named BglC. BglC was similar to Pectobacterium carotovorum Gun_CLOAB (P15704), with 57% identity and 71% homology over 508 aa. The molecular weight of BglC was measured as 56 kDa by CMC-SDS-PAGE. The optimum conditions for the cellulase activity of Bacillus subtilis LYH201 were a temperature of $50^{\circ}C$ and pH 7.

Characterization of Leuconostoc mesenteroides B-742CB Dextransucrase Expressed in Escherichia coli

  • Park, Mi-Ran;Ryu, Hwa-Ja;Kim, Do-Man;Choe, Jun-Yong;Robyt, John F.
    • Journal of Microbiology and Biotechnology / v.11 no.4 / pp.628-635 / 2001
  • Recombinant E. coli DH5$\alpha$ harboring a dextransucrase gene (dsrB742) produced an extracellular dextransucrase in a 2% sucrose medium. The enzyme was purified to near homogeneity by DEAE-Sepharose and Phenyl-Sepharose column chromatography, with a 142.97-fold purification and an 11.11% recovery. The enzyme had a calculated molecular mass of 168.6 kDa, which was in good agreement with the activity band of 170 kDa on a nondenaturing SDS-PAGE. An expression plasmid was constructed by inserting dsrB742 into a pRSET expression vector. The activity after expression in E. coli BL21(DE3)pLysS increased about 6.7-fold compared to the extracellular dextransucrase from L. mesenteroides B-742CB. The expressed and purified enzyme from the clone showed biochemical properties (acceptor reaction, size of active dextransucrase, optimum pH, and temperature) similar to those of B-742CB dextransucrase; however, its ability to synthesize ${\alpha}$-(1$\rightarrow$3) branching was lower than that of L. mesenteroides B-742CB dextransucrase.

Cloning and Sequencing of the ${\alpha}-1{\rightarrow}6$ Dextransucrase Gene from Leuconostoc mesenteroides B-742CB

  • Kim, Ho-Sang;Kim, Do-Man;Ryu, Hwa-Ja;Robyt, John F.
    • Journal of Microbiology and Biotechnology / v.10 no.4 / pp.559-563 / 2000
  • A dextransucrase gene (dsrB742), which expresses a dextransucrase that synthesizes mostly ${\alpha}-1{\rightarrow}6$ linked dextran with a low amount (3-5%) of ${\alpha}-1{\rightarrow}3$ branching, was cloned and sequenced from Leuconostoc mesenteroides B-742CB. The 6.1-kb PstI fragments were ligated with pGEM-3Zf(-) and transformed into E. coli $DH5{\alpha}$. The recombinant clone (pDSRB742) synthesized dextran on an agar plate containing 2% (w/v) sucrose. The synthesized dextran was hydrolyzed with Penicillium endo-dextranase. The hydrolyzate was composed of glucose, isomaltose, isomaltotriose, and branched pentasaccharide. The nucleotide sequence of dsrB742 showed one open reading frame (ORF) of 4,524 bp encoding dextransucrase. The deduced amino acid sequence gave a calculated molecular mass of 168.6 kDa. The enzyme also showed an activity band of 184 kDa on a non-denaturing SDS-PAGE (10%). The amino acid sequence of DSRB742 exhibited a 50% similarity with DSRA from L. mesenteroides B-1299, a 70% similarity with DSRS from L. mesenteroides B-512 (F, FMCM), and a 45-56% similarity with streptococcal GTFs.

Expression in Escherichia coli of a Cloned Bacillus thuringiensis subsp. kurstaki HD1 Insecticidal Protein Gene

  • 황성희;차성철;유관희;이형환
    • Microbiology and Biotechnology Letters / v.26 no.6 / pp.497-506 / 1998
  • The expression in Escherichia coli of a cloned insecticidal crystal protein (ICP) gene from Bacillus thuringiensis var. kurstaki HD1 in the pHLN1-80(+) and pHLN2-80(-) plasmids was investigated through deletions in the promoters, transcription start point, and termination region. Six recombinant plasmids were constructed to analyze the overexpression of the ICP in relation to its gene structure. The amounts of ICP produced by the recombinants were measured by SDS-PAGE and confirmed by Western blot analysis. One clone, which had only the -80 bp part of the ICP gene promoter (containing the BtI promoter, without the Plac promoter), the right-oriented ICP gene, and the termination region, did not overexpress the ICP. Removal of 350 bp from the upstream region of the Plac of the clone pHLN2-80(-) resulted in overexpression of the ICP. One clone, consisting of the -72 bp part of the ICP promoter without the transcription start point and the transcriptional termination region, and carrying the right-oriented ICP gene sequence, did not overexpress the ICP. One clone consisting of the inverted ICP gene sequence and the -72 bp ICP gene promoter, without the termination region, overexpressed the ICP. One clone consisting of the inverted ICP gene, the -72 bp ICP gene promoter, and the termination sequence also overexpressed the ICP. These results indicate that the Plac promoter, the transcription termination region, the inverted ICP gene insertion, and the -80 bp or -72 bp part of the ICP gene promoter were involved in the overexpression of the ICP gene in the recombinant plasmids, and that the overexpression mechanism might result from the disruption of transcription-suppressing regions in the promoter regions.

Cloning and Expression of Bacillus thuringiensis cry1Aa1 Type Gene

  • 이형환;황성희;권혁한;안준호;김혜연;안성규;박수일
    • Microbiology and Biotechnology Letters / v.32 no.2 / pp.110-116 / 2004
  • The over-expression in E. coli of the pHLN1-80(+) and pHLN2-80(-) plasmids, which carry a cloned insecticidal crystal protein (ICP) gene (cry1Aa1 type) from Bacillus thuringiensis var. kurstaki HD1, was investigated in part through deletion of the -80 bp promoter and a change of the cloning vector system. Two recombinant plasmids possessing only the -14 bp region [the Shine-Dalgarno (SD) sequence of the -80 bp promoter] were constructed to analyze the over-expression of the ICP in relation to its gene structure. Another two recombinant plasmids similarly carried the icp gene cloned in a different vector system. The amounts of ICP produced by the recombinants were measured by SDS-PAGE and confirmed by Western blot analysis. Expression was more evident in the pHLRBS1-14 clone, which carried only the SD sequence with the icp gene in the inverted orientation, than in the pHLRBS2-14 clone, which carried only the -14 bp SD sequence with the right-oriented icp gene. The pHLN2-80(-) clone produced more ICP than the pHLRBS1-14 clone. Of the two clones in the new vector, pHLNUC1-80 (right-oriented icp gene) and pHLNUC2-80 (inverted icp gene), pHLNUC2-80 produced more ICP in the E. coli system. These results indicate that the Plac promoter, the inverted icp gene insertion, and the -80 bp promoter (-66 bp part of the icp gene promoter) were involved in the expression of the icp gene in the recombinant plasmids. In addition, the expression mechanism might result from the disruption of transcription-suppressing regions in the promoter regions.

Characterization of antimicrobial proteins produced by Bacillus sp. N32

  • Lee, Mi-Hye;Park, In-Cheol;Yeo, Yun-Soo;Kim, Soo-Jin;Yoon, Sang-Hong;Lee, Suk-Chan;Chung, Tae-Young;Koo, Bon-Sung
    • The Korean Journal of Pesticide Science / v.10 no.1 / pp.56-65 / 2006
  • An antagonistic bacterial isolate that inhibits the growth of plant pathogens was selected and identified from 5,000 isolates screened from the rhizosphere of various crop plants. The isolate, Bacillus sp. N32, tested against Colletotrichum gloeosporioides, which causes anthracnose disease in hot pepper, produced both a heat-resistant antifungal protein and a heat-sensitive antifungal protein. The heat-resistant protein was partially purified by ammonium sulfate fractionation and gel filtration chromatography. Bioautography showed that the proteins possessed high antifungal activity. The biosynthetic gene cluster responsible for the heat-resistant antifungal protein was cloned from a cosmid library using a DNA probe obtained from a PCR product amplified with primers targeting the conserved nucleotide sequence of previously reported synthetic genes. Most of the clones obtained showed high homology to the previously reported fengycin antibiotic synthetic gene family. The heat-sensitive protein, on the other hand, was isolated by SDS-PAGE and electroblotting to determine its N-terminal amino acid sequence. The heat-sensitive antifungal protein gene was cloned from the ${\lambda}$-ZAP libraries using a DNA probe based on this N-terminal amino acid sequence. We plan to clone and sequence the whole gene cluster encoding the heat-sensitive protein for further analysis.

Cloning and Characterization of Cellulase Gene (cel5B) from Cow Rumen Metagenome

  • Kang, Tae-Ho;Kim, Min-Keun;Barman, Dhirendra Nath;Kim, Jung-Ho;Kim, Hoon;Yun, Han-Dae
    • Journal of agriculture & life science / v.46 no.2 / pp.129-137 / 2012
  • A carboxymethyl cellulase gene, cel5B, was cloned, sequenced, and expressed in Escherichia coli. Clone pRCS20 in E. coli was identified from a metagenomic cosmid library of cow rumen by its cellulase activity on carboxymethyl cellulose agar plates. The cosmid clone (RCS20) was partially digested with Sau3AI, ligated into the BamHI site of the pBluescript II SK+ vector, and transformed into E. coli $DH5{\alpha}$. A 1.3 kb DNA insert with CMC-hydrolyzing activity was obtained and designated cel5B. The cel5B gene had an open reading frame (ORF) of 1,059 bp encoding 352 amino acids with a signal peptide of 48 amino acids, and its conserved region, VIYEIYNEPL, belongs to glycosyl hydrolase family 5. The molecular mass of the Cel5B protein expressed in E. coli $DH5{\alpha}$ was estimated to be about 34 kDa by CMC-SDS-PAGE. The optimal pH for its enzymatic activity was 8.0, and the optimal temperature was about $50^{\circ}C$.

Matrix-Assisted Laser Desorption/Ionization Time-of-Flight (MALDI-TOF)-Based Cloning of Enolase, ENO1, from Cryphonectria parasitica

  • Kim, Myoung-Ju;Chung, Hea-Jong;Park, Seung-Moon;Park, Sung-Goo;Chung, Dae-Kyun;Yang, Moon-Sik;Kim, Dae-Hyuk
    • Journal of Microbiology and Biotechnology / v.14 no.3 / pp.620-627 / 2004
  • With the growing foundation of genome sequence databases and protein analyses, cloning a gene based on a peptide analysis is becoming more feasible and effective for identifying a specific gene and its protein product of interest. Accordingly, the current study conducted a protein analysis using 2-D PAGE followed by MALDI-TOF and ESI-MS to identify a highly expressed gene product of C. parasitica. A distinctive and highly expressed protein spot with a molecular size of 47.2 kDa was randomly selected, and MALDI-TOF MS analysis was conducted. A homology search indicated that the protein appeared to be a fungal enolase (eno1). Meanwhile, multiple alignments of fungal enolases revealed a conserved amino acid sequence, from which degenerate primers were designed. A screening of the genomic $\lambda$ library of C. parasitica, using the PCR amplicon as a probe, was conducted to obtain the full-length gene, while RT-PCR was performed for the cDNA. The E. coli-expressed eno1 exhibited enolase enzymatic activity, indicating that the cloned gene encoded the C. parasitica enolase. Moreover, ESI-MS of two of the separated peptides resolved from the protein spot on 2-D PAGE revealed sequences identical to the deduced sequences, suggesting that the cloned gene indeed encoded the resolved protein spot. Northern blot analysis indicated a consistent accumulation of the eno1 transcript during cultivation.

Cloning and Expression of a Farnesyl Diphosphate Synthase in Centella asiatica (L.) Urban

  • Kim, Ok Tae;Ahn, Jun Cheul;Hwang, Sung Jin;Hwang, Baik
    • Molecules and Cells / v.19 no.2 / pp.294-299 / 2005
  • A cDNA encoding farnesyl diphosphate synthase (FPS; EC 2.5.1.1/EC 2.5.1.10) was isolated from Centella asiatica (L.) Urban, using degenerate primers based on two highly conserved domains. A full-length cDNA clone was subsequently isolated by rapid amplification of cDNA ends (RACE) PCR. The sequence of the CaFPS (C. asiatica farnesyl diphosphate synthase) cDNA contains an open reading frame of 1029 nucleotides encoding 343 amino acids with a molecular mass of 39.6 kDa. The deduced CaFPS amino acid sequence exhibits 84, 79, and 72% identity to the FPSs of Artemisia annua, Arabidopsis thaliana, and Oryza sativa, respectively. Southern blot analysis suggested that the C. asiatica genome contains only one FPS gene. An artificially expressed soluble form of CaFPS was identified by SDS-PAGE. It had high specific activity and produced farnesyl diphosphate as the major isoprenoid.