URL Classification

URL classification is based on URLs that real users actively visit, as opposed to bot traffic. A crowd-sourced approach supplies a constant stream of URLs to analyze.

The continuous stream of URLs actively visited by 500 million end users is the primary in-house source for the threat corpus. It comes from a global network of customers across several markets. This combined approach allows us to continuously enhance, optimize and tune malicious URL detection in an ever-changing threat landscape.


How URL classification works

WebTitan URL Classification uses an integrated, multi-vector approach to in-house analysis that combines the following methods:

  •   Link analysis
  •   Content analysis
  •   Static profile and heuristic analysis
  •   Behavioral analysis (sandboxing)
  •   Third-party industry feeds
  •   Honeypot infrastructure
  •   Bot detection infrastructure
  •   In-house and third-party tools
  •   Human-supervised and validated machine learning
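
The article does not say how the individual methods are combined into a single verdict. As a rough illustration only, the sketch below assumes each method yields a risk score between 0 and 1 and merges them with a weighted average. The analyzer functions, their placeholder heuristics, and the weights are all hypothetical and are not WebTitan's actual logic.

```python
from typing import Callable, Dict

# Hypothetical analyzer signature: takes a URL, returns a risk score in [0.0, 1.0].
Analyzer = Callable[[str], float]


def link_analysis(url: str) -> float:
    """Placeholder heuristic: hosts with many dot-separated labels score higher."""
    host = url.split("//", 1)[-1].split("/", 1)[0]
    return min(host.count(".") / 5.0, 1.0)


def static_heuristics(url: str) -> float:
    """Placeholder heuristic: flag a few keywords often abused in lure URLs."""
    keywords = ("login", "verify", "update")
    return 1.0 if any(k in url.lower() for k in keywords) else 0.0


def combine(url: str, analyzers: Dict[str, Analyzer], weights: Dict[str, float]) -> float:
    """Weighted average of the per-analyzer scores (an assumed combination rule)."""
    total_weight = sum(weights[name] for name in analyzers)
    weighted_sum = sum(weights[name] * fn(url) for name, fn in analyzers.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    analyzers = {"link": link_analysis, "static": static_heuristics}
    weights = {"link": 1.0, "static": 1.0}  # illustrative weights only
    print(combine("http://a.b.c.example.com/login", analyzers, weights))
```

A real system would likely weight the methods very differently, or use a trained model rather than a fixed average. The point of the sketch is only that several independent signals feed one decision.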

Try WebTitan URL filtering solution for free today - sign up for our Free Trial

Try for Free


Human-supervised and validated Machine Learning

Malicious detections are continuously sampled so that they can be profiled, tested and validated. The results of this sampling are then used to train the supervised Machine Learning systems and to tune the efficiency, accuracy and overall effectiveness of detection against internal key performance indicators.
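
To make the sampling loop concrete, here is a minimal sketch that draws a random sample of detections for human review and computes one possible KPI, precision (the share of sampled detections that reviewers confirm as malicious). The function names and the choice of precision as the KPI are assumptions; the article does not name the internal indicators it uses.

```python
import random
from typing import List, Tuple


def sample_detections(detections: List[str], k: int, seed: int = 0) -> List[str]:
    """Draw up to k detections at random for human review (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(detections, min(k, len(detections)))


def precision(reviewed: List[Tuple[str, bool]]) -> float:
    """reviewed holds (url, confirmed_malicious) pairs produced by human reviewers."""
    if not reviewed:
        return 0.0
    confirmed = sum(1 for _, is_malicious in reviewed if is_malicious)
    return confirmed / len(reviewed)


if __name__ == "__main__":
    detections = [f"http://site{i}.example/path" for i in range(100)]
    batch = sample_detections(detections, k=4)
    # In practice the labels come from analysts; here they are made up.
    labels = [True, False, True, True]
    print(precision(list(zip(batch, labels))))
```

The reviewed pairs could then serve both as a KPI input and as labeled training data for the supervised models.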


URL Domain and Path Coverage

One of the critical features of our URL classification is deep analysis through full path detection. In a nutshell, page- and path-level reporting lends analytical credibility to what is marked as malicious. The majority of malicious URLs in the databases are detailed down to the path level. For non-IP-based URLs, 88.35% are marked as malicious down to the path level. For IP-based URLs, the figure is significantly higher, with 99.70% of URLs identified as having a path. This is extremely important because DNS-based systems typically work at the domain level only.
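
The difference between domain-level and path-level matching can be sketched in a few lines. This is an illustrative example under assumptions: the blocklist entry, the host name `shared-host.example`, and the helper names are invented, and a real database lookup would be more involved.

```python
from urllib.parse import urlsplit


def domain_key(url: str) -> str:
    """Key used by a domain-level system: the host name only."""
    return urlsplit(url).hostname or ""


def path_key(url: str) -> str:
    """Key used by a path-level system: host name plus path."""
    parts = urlsplit(url)
    return (parts.hostname or "") + (parts.path or "/")


# Hypothetical path-level blocklist with a single malicious entry.
MALICIOUS_PATHS = {"shared-host.example/users/eve/payload.exe"}
# The same data collapsed to what a domain-only system can see.
MALICIOUS_DOMAINS = {entry.split("/", 1)[0] for entry in MALICIOUS_PATHS}


def is_malicious_path_level(url: str) -> bool:
    return path_key(url) in MALICIOUS_PATHS


def is_malicious_domain_level(url: str) -> bool:
    return domain_key(url) in MALICIOUS_DOMAINS


if __name__ == "__main__":
    bad = "http://shared-host.example/users/eve/payload.exe"
    benign = "http://shared-host.example/users/alice/index.html"
    print(is_malicious_path_level(bad), is_malicious_path_level(benign))
    print(is_malicious_domain_level(bad), is_malicious_domain_level(benign))
```

With path-level keys, only the infected page is flagged. With domain-level keys, every page on the shared host is flagged, including the benign one.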


Malicious URL Revisit Process

Due to the variable life cycle of malicious URLs, it is imperative to be able to inspect and detect URLs quickly and to confirm that they are still malicious. The Malicious Detection Service includes an automated revisit process in which malicious URLs are revisited on a set schedule. Each day, 300,000 malicious URLs are revisited to see whether they are still infected or are now clean. Because our malicious detection service obtains the full path, it can revisit that exact URL and obtain granular, highly accurate results.
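
A revisit process like the one described can be pictured as a priority queue of (next visit time, URL) pairs. The sketch below is an assumption-laden illustration: the queue layout, the `still_malicious` checker, the fixed revisit interval, and the per-run budget are all hypothetical, and the article does not describe the real scheduler.

```python
import heapq
from typing import Callable, Dict, List, Tuple


def run_revisits(
    queue: List[Tuple[float, str]],
    now: float,
    still_malicious: Callable[[str], bool],
    interval: float,
    budget: int,
) -> Dict[str, bool]:
    """Revisit every due URL (up to budget) and reschedule those still malicious.

    queue must already be a heap of (next_visit_time, url) pairs.
    interval must be positive so rescheduled URLs are not due again in this run.
    """
    results: Dict[str, bool] = {}
    while queue and queue[0][0] <= now and len(results) < budget:
        _, url = heapq.heappop(queue)
        verdict = still_malicious(url)  # hypothetical re-scan of the exact URL
        results[url] = verdict
        if verdict:
            # Still infected: schedule another visit. Clean URLs drop out.
            heapq.heappush(queue, (now + interval, url))
    return results


if __name__ == "__main__":
    queue = [(0.0, "http://a.example/x"), (5.0, "http://b.example/y"), (100.0, "http://c.example/z")]
    heapq.heapify(queue)
    # Stand-in checker: pretend only paths ending in /x are still infected.
    outcome = run_revisits(queue, now=10.0, still_malicious=lambda u: u.endswith("/x"),
                           interval=24.0, budget=300_000)
    print(outcome)
    print(queue)
```

Because each queue entry carries the full URL rather than just a domain, the re-scan targets the exact page that was flagged.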

The detection systems utilize the following nine types of Malicious Categories:



Ad Fraud

Sites used to commit fraudulent online display advertising transactions through ad impression boosting techniques including, but not limited to, ad stacking, iframe stuffing, and hidden ads. Also includes sites with high non-human web traffic and rapid, large, unexplained changes in traffic.



Bots

Bots are compromised machines running software that hackers use to send spam and to launch phishing and denial-of-service attacks.


Command and Control Centers

Internet servers used to send commands to infected machines called bots.


Compromised & Links To Malware

Compromised web pages are pages that appear to be legitimate but house malicious code or link to malicious websites hosting malware. These sites have been compromised by someone other than the site owner. If Firefox blocks a site as malicious, use this category. Examples include pages that have been defaced or marked "hacked by".


Malware Call-Home

When viruses and spyware report information back to a particular URL or check a URL for updates, this is considered a malware call-home address.


Malware Distribution Point

Web pages that host viruses, exploits, and other malware are considered Malware Distribution Points. Web Analysts may use this category if their anti-virus program triggers on a particular website.



Phishing & Fraud

Web pages that impersonate other web pages, usually with the intent of stealing passwords, credit card numbers, or other information. Also includes web pages that are part of scams such as a "419" scam, where a person is convinced to hand over money with the expectation of a big payback that never comes. Examples include cons, hoaxes, and scams.


Spam URLs

URLs that frequently occur in spam messages.


Spyware & Questionable Software

Software that reports information back to a central server, such as spyware or keystroke loggers. Also includes software that may have legitimate purposes but that some people may object to having on their system.

As you can see, our classification system uses significant and continually optimized intelligence. Cybercriminals are constantly finding new ways to operate, hide online and exploit vulnerabilities. All of our solutions constantly adapt to meet these new modes of operation, and we continually learn from new data to impede the cybercriminal at source.

To learn more about WebTitan, why not get in touch and take a closer look at the solution in action?

Talk to Our Email and DNS Security Team

Call us on USA +1 813 304 2544 or IRL +353 91 545555