• Title/Summary/Keyword: URL

Search Result 445, Processing Time 0.028 seconds

SHRT : New Method of URL Shortening including Relative Word of Target URL (SHRT : 유사 단어를 활용한 URL 단축 기법)

  • Yoon, Soojin;Park, Jeongeun;Choi, Changkuk;Kim, Seungjoo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38B no.6
    • /
    • pp.473-484
    • /
    • 2013
  • Shorten URL service is the method of using short URL instead of long URL, it redirect short url to long URL. While the users of microblog increased rapidly, as the creating and usage of shorten URL is convenient, shorten url became common under the limited length of writing on microblog. E-mail, SMS and books use shorten URL well, because of its simplicity. But, there is no relativeness between the most of shorten URLs and their target URLs, user can not expect the target URL. To cover this problem, there is attempts such as changing the shorten URL service name, inserting the information of website into shorten URL, and the usage of shortcode of physical address. However, each ones has the limits, so these are the trouble of automation, relatively long address, and the narrowness of applicable targets. SHRT is complementary to the attempts, as getting the idea from the writing system of Arabic. Though the writing system of Arabic has no vowel alphabet, Arabs have no difficult to understand their writing. This paper proposes SHRT, new method of URL Shortening. SHRT makes user guess the target URL using Relative word of the lowest domain of target URL without vowels.

URL Signatures for Improving URL Normalization (URL 정규화 향상을 위한 URL 서명)

  • Soon, Lay-Ki;Lee, Sang-Ho
    • Journal of KIISE:Databases
    • /
    • v.36 no.2
    • /
    • pp.139-149
    • /
    • 2009
  • In the standard URL normalization mechanism, URLs are normalized syntactically by a set of predefined steps. In this paper, we propose to complement the standard URL normalization by incorporating the semantically meaningful metadata of the web pages. The metadata taken into consideration are the body texts and the page size of the web pages, which can be extracted during HTML parsing. The results from our first exploratory experiment indicate that the body texts are effective in identifying equivalent URLs. Hence, given a URL which has undergone the standard normalization, we construct its URL signature by hashing the body text of the associated web page using Message-Digest algorithm 5 in the second experiment. URLs which share identical signatures are considered to be equivalent in our scheme. The results in the second experiment show that our proposed URL signatures were able to further reduce redundant URLs by 32.94% in comparison with the standard URL normalization.

Fast URL Lookup Using URL Prefix Hash Tree (URL Prefix 해시 트리를 이용한 URL 목록 검색 속도 향상)

  • Park, Chang-Wook;Hwang, Sun-Young
    • Journal of KIISE:Information Networking
    • /
    • v.35 no.1
    • /
    • pp.67-75
    • /
    • 2008
  • In this paper, we propose an efficient URL lookup algorithm for URL list-based web contents filtering systems. Converting a URL list into URL prefix form and building a hash tree representation of them, the proposed algorithm performs tree searches for URL lookups. It eliminates redundant searches of hash table method. Experimental results show that proposed algorithm is $62%{\sim}210%$ faster, depending on the number of segment, than conventional hash table method.

URL Normalization for Web Applications (웹 어플리케이션을 위한 URL 정규화)

  • Hong, Seok-Hoo;Kim, Sung-Jin;Lee, Sang-Ho
    • Journal of KIISE:Information Networking
    • /
    • v.32 no.6
    • /
    • pp.716-722
    • /
    • 2005
  • In the m, syntactically different URLs could represent the same resource. The URL normalization is a process that transform a URL, syntactically different and represent the same resource, into canonical form. There are on-going efforts to define standard URL normalization. The standard URL normalization designed to minimize false negative while strictly avoiding false positive. This paper considers the four URL normalization issues beyond ones specified in the standard URL normalization. The idea behind our work is that in the URL normalization we want to minimize false negatives further while allowing false positives in a limited level. Two metrics are defined to analyze the effect of each step in the URL normalization. Over 170 million URLs that were collected in the real web pages, we did an experiment, and interesting statistical results are reported in this paper.

Evaluating Site-based URL Normalization (사이트 기반의 URL 정규화 평가)

  • Jeong, Hyo-Sook;Kim, Sung-Jin;Lee, Sang-Ho
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2005.11b
    • /
    • pp.28-30
    • /
    • 2005
  • URL 정규화는 다양하게 표현된 동일 URL들을 하나의 통일된(cannonical) 형태의 URL로 변환하는 과정이다. 동일문서에 대한 중복된 URL 표현은 URL 정규화를 통하여 제거된다. 표준 정규화는 잘못된 긍정(동일하지 않는 URL들을 동일 문자열로 변환)이 없도록 개발되었다. 그러나 표준 정규화는 많은 잘못된 부정이 발생하게 되므로, 잘못된 긍정을 일부 허용하면서 잘못된 부정을 현격히 줄일 수 있는 확장 정규화가 제기되고 연구되어 왔다. 본 논문에서는 동일 사이트 내의 URL들에 대한 확장 정규화의 적용 결과가 유사한 정도를 보임으로써, 한 사이트 내의 URL에 대한 임의의 확장 정규화 결과 정보가 동일 사이트 내의 다른 URL들의 정규화에 효과적으로 사용될 수 있음을 보인다. 이를 위하여, 한 사이트의 확장 정규화 결과 동일성 척도와 사이트 기반의 확장 정규화 평가 척도를 제안한다. 20,000만개의 실제 국내 웹 사이트에서 추출된 25만개의 URL에 대해 6가지 확장 정규화가 평가된다.

  • PDF

Development of a Malicious URL Machine Learning Detection Model Reflecting the Main Feature of URLs (URL 주요특징을 고려한 악성URL 머신러닝 탐지모델 개발)

  • Kim, Youngjun;Lee, Jaewoo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.12
    • /
    • pp.1786-1793
    • /
    • 2022
  • Cyber-attacks such as smishing and hacking mail exploiting COVID-19, political and social issues, have recently been continuous. Machine learning and deep learning technology research are conducted to prevent any damage due to cyber-attacks inducing malicious links to breach personal data. It has been concluded as a lack of basis to judge the attacks to be malicious in previous studies since the features of data set were excessively simple. In this paper, nine main features of three types, "URL Days", "URL Word", and "URL Abnormal", were proposed in addition to lexical features of URL which have been reflected in previous research. F1-Score and accuracy index were measured through four different types of machine learning algorithms. An improvement of 0.9% in a result and the highest value, 98.5%, were examined in F1-Score and accuracy through comparatively analyzing an existing research. These outcomes proved the main features contribute to elevating the values in both accuracy and performance.

Effects and Evaluations of URL Normalization (URL정규화의 적용 효과 및 평가)

  • Jeong, Hyo-Sook;Kim, Sung-Jin;Lee, Sang-Ho
    • Journal of KIISE:Databases
    • /
    • v.33 no.5
    • /
    • pp.486-494
    • /
    • 2006
  • A web page can be represented by syntactically different URLs. URL normalization is a process of transforming URL strings into canonical form. Through this process, duplicate URL representations for a web page can be reduced significantly. A number of normalization methods have been heuristically developed and used, and there has been no study on analyzing the normalization methods systematically. In this paper, we give a way to evaluate normalization methods in terms of efficiency and effectiveness of web applications, and give users guidelines for selecting appropriate methods. To this end, we examine all the effects that can take place when a normalization method is adopted to web applications, and describe seven metrics for evaluating normalization methods. Lastly, the evaluation results on 12 normalization methods with the 25 million actual URLs are reported.

A Study proposal for URL anomaly detection model based on classification algorithm (분류 알고리즘 기반 URL 이상 탐지 모델 연구 제안)

  • Hyeon Wuu Kim;Hong-Ki Kim;DongHwi Lee
    • Convergence Security Journal
    • /
    • v.23 no.5
    • /
    • pp.101-106
    • /
    • 2023
  • Recently, cyberattacks are increasing in social engineering attacks using intelligent and continuous phishing sites and hacking techniques using malicious code. As personal security becomes important, there is a need for a method and a solution for determining whether a malicious URL exists using a web application. In this paper, we would like to find out each feature and limitation by comparing highly accurate techniques for detecting malicious URLs. Compared to classification algorithm models using features such as web flat panel DB and based URL detection sites, we propose an efficient URL anomaly detection technique.

Design and Implementation of a Forwarding Server for Using the Logical URL (논리적 URL 사용을 위한 포워딩 서버의 설계 및 구현)

  • 양희재
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.2
    • /
    • pp.239-249
    • /
    • 2003
  • A lot many WWW sites are come into the world more and more as Web is considered as the unified Internet information tool. The location of each site or resource is usually specified by a physical URL, which is often too long to remember and tends to raise difficulty to show the aim of the site intuitively by seeing it. Since any person or organization can get his/her own domain name easily, it is more desirable to use a logical URL with the domain name which can be chosen more compact to remember and meaningful to represent the ultimate intention of the site. This paper presents an implementation of a URL forwarding server which forwards a URL to another, so that a WWW site can use a logical URL instead of a physical one. The server consists of a domain mapper which uses the redirection transaction of the HTTP protocol, and a name server based on the HIND. The paper shows how the interaction between the domain mapper and the name sever can make forwarding possible and describes its implementation in detail. Experimental results shows that the overhead incurred by URL forwarding is negligible compared to the typical delay of current Internet traffic condition.

OTACUS: Parameter-Tampering Prevention Techniques using Clean URL (OTACUS: 간편URL기법을 이용한 파라미터변조 공격 방지기법)

  • Kim, Guiseok;Kim, Seungjoo
    • Journal of Internet Computing and Services
    • /
    • v.15 no.6
    • /
    • pp.55-64
    • /
    • 2014
  • In a Web application, you can pass without restrictions special network security devices such as IPS and F/W, URL parameter, which is an important element of communication between the client and the server, is forwarded to the Web server. Parameters are modulated by an attacker requests a URL, disclose confidential information or through e-commerce, can take financial gain. Vulnerability parameter manipulation thereof cannot be able to determine whether to operate in only determined logical application, blocked with Web Application Firewall. In this paper, I will present a technique OTACUS(One-Time Access Control URL System) to complement the shortcomings of the measures existing approaches. OTACUS can be effectively blocked the modulation of the POST or GET method parameters passed to the server by preventing the exposure of the URL to the attacker by using clean URL technique simplifies complex URL that contains the parameter. Performance test results of the actual implementation OTACUS proves that it is possible to show a stable operation of less than 3% increase in the load.