Web Content Classification: A powerful new policy tool for the PEF Firewall

Last week, we released ArubaOS 6.4.2 for Mobility Controllers. This release adds a significant new feature - Web Content Classification - to the already powerful set of controls included as part of the Policy Enforcement Firewall. Let's take a look at what it does, and how to use it alongside our application visibility and control, or "AppRF".

Web Content Classification extracts URLs and hostnames from all traffic as it traverses the Aruba Networks Mobility Controller. These URLs are then looked up in a local cache. If there is no entry locally, a query is made to the cloud. This allows customers to analyze and control the web browsing behavior of users based on category and risk. There are 82 different categories, and five different levels of risk.
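To make the lookup order concrete, here is a minimal Python sketch of the cache-then-cloud pattern described above. It is an illustration only, not the controller's implementation: the `ClassificationCache` class, the `fake_cloud_lookup` function, the sample hostname, the "Uncategorized" label, and the score are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (categories, reputation score) for a hostname; this shape is an assumption.
Classification = Tuple[List[str], int]


class ClassificationCache:
    """Local cache that falls back to a remote lookup on a miss."""

    def __init__(self, cloud_lookup: Callable[[str], Classification]):
        self._cloud_lookup = cloud_lookup  # hypothetical stand-in for the cloud query
        self._entries: Dict[str, Classification] = {}
        self.cloud_queries = 0  # counts how often the cloud was actually asked

    def classify(self, hostname: str) -> Classification:
        key = hostname.lower()
        if key not in self._entries:
            # Cache miss: query the cloud once and remember the answer.
            self.cloud_queries += 1
            self._entries[key] = self._cloud_lookup(key)
        return self._entries[key]


def fake_cloud_lookup(hostname: str) -> Classification:
    """Made-up data for illustration; not real classification results."""
    sample = {"video.example.com": (["Streaming Media"], 85)}
    return sample.get(hostname, (["Uncategorized"], 50))


cache = ClassificationCache(fake_cloud_lookup)
first = cache.classify("video.example.com")   # miss: goes to the "cloud"
second = cache.classify("VIDEO.example.com")  # hit: served from the local cache
```

The second call never reaches the cloud function, which is the point of keeping a local cache on the controller.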

These two pieces of information - category and risk - can then be used in the Policy Enforcement Firewall to control traffic, just like you can use IP addresses, ports, and application information. This means that you can block entire categories, rate limit them, and raise or lower the quality of service for each. And just like with applications, these controls can be applied via the global policy, where they will affect all users, or they can be applied only to certain roles.

Web Categories
Here's a look at the categories we use for classification. They are designed to allow you to enforce acceptable use policies and block or rate limit traffic based on these policies. It's important to note that a website can be a member of more than one category. So if you make a rule based on a category, it will be enforced if any of the categories returned for the site matches.
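The "any category matches" behavior can be sketched in a few lines of Python. The function name `rule_matches` and the sample site are hypothetical, chosen only to illustrate the matching logic.

```python
from typing import Iterable


def rule_matches(rule_category: str, site_categories: Iterable[str]) -> bool:
    """A category rule applies if ANY of the site's categories equals it."""
    wanted = rule_category.casefold()
    return any(c.casefold() == wanted for c in site_categories)


# Hypothetical site that was returned with two categories.
site = ["Streaming Media", "Peer to Peer"]

blocked = rule_matches("Peer to Peer", site)  # one of the two categories matches
unaffected = rule_matches("Gambling", site)   # neither category matches
```

In other words, a site that belongs to several categories is caught by a rule on any one of them.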

There are a couple of very powerful applications of these categories I'd like to point out.

First of all, there are five categories directly related to internet security, including Malware Sites, Phishing and Other Frauds, Spyware and Adware, and Spam URLs. All of these categories are examples of places that you just don't want to go. These destinations have been found to exist strictly for these malicious purposes. Therefore, it would be an excellent idea to create rules blocking these sites, so that users of your network don't visit them and become infected with malware or fall victim to a scam.

Secondly, there are a few categories that can help you control the bandwidth used on your wireless network. Specifically, CDNs, Peer to Peer, and Streaming Media are categories that, if abused, could lead to quite a bit of traffic on your Wi-Fi network. Depending on your business, these might be candidates for rate limiting.

One of the questions I am often asked about Web Content is how it differs from AppRF. AppRF relies on deep packet inspection: looking for signatures, following protocol finite state machines, examining SSL certificates and metadata, and applying advanced heuristics on packet flows. These techniques classify about 1500 applications into 21 different categories. These numbers are very good, better than what most Wi-Fi vendors offer. However, there are obviously millions of websites and web "applications", with new ones being created every hour of every day. That's where Web Classification comes in, with a database of over 20 million sites and growing.

A small number of web content categories line up with AppRF categories. Peer-to-Peer and Streaming Media are the most obvious. If you're trying to rate limit, apply QoS to, or block these types of apps, the web content classification feature will cover a far wider set of websites and web applications, and that list will change quickly over time as new sites become available. So if you are comfortable with a 'wide net' when enforcing policies, web content is the way to go. If you'd rather just block a finite, known set of applications, then application categories are the best way.

Note that you can get a list of applications that are part of an application category by using the CLI command "show dpi application category ". There is no way to get a list of all the sites that are part of a web content category.

Web Reputation
Another powerful feature of Web Classification is Web Reputation.

Web Reputation is a value from 0-100 based on how dangerous a website is. We simplify this into five different levels for configuration purposes. You can create policies to block websites based on their reputation, independent of what category a given website may fall under. This reputation is based on criteria such as:

Does the website have links to malware sites?
Does it have malware on it, or has it had malware in the recent past?
How long has the website existed? Many malware sites have a very short lifespan.

It would generally be a good idea to make a rule to block all traffic that is classified as "suspicious" or "high risk". These risk levels typically include no sites that are worthy of your users' attention, and you should see less malicious activity on your network if you do so.
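As a rough illustration of how a 0-100 score might collapse into five levels, here is a Python sketch. The band boundaries (equal bands of roughly 20 points) and the level names other than "suspicious" and "high risk" are assumptions made for this example; they are not taken from the product documentation.

```python
# Upper bound of each band (inclusive) and its level name.
# Boundaries and the last three names are assumptions for illustration.
ASSUMED_BANDS = [
    (20, "high risk"),
    (40, "suspicious"),
    (60, "moderate risk"),
    (80, "low risk"),
    (100, "trustworthy"),
]

BLOCKED_LEVELS = {"high risk", "suspicious"}


def risk_level(score: int) -> str:
    """Map a 0-100 reputation score to one of five assumed risk levels."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for upper_bound, name in ASSUMED_BANDS:
        if score <= upper_bound:
            return name
    raise AssertionError("unreachable: the last band ends at 100")


def should_block(score: int) -> bool:
    """Block anything that lands in the two riskiest levels."""
    return risk_level(score) in BLOCKED_LEVELS
```

With these assumed bands, `should_block(35)` is true and `should_block(90)` is false; the real thresholds may differ.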

Exceptions
Another frequently asked question is "how do I override a categorization I don't agree with?" Since Web Content is seamlessly integrated with PEF, this is a very simple thing to do. If you are trying to block all gambling sites, for example, but your CEO loves playing poker for free on pokerstars.com, all you need to do is make a "net destination alias" for that site. Put a rule that permits the alias above the rule that blocks the category. Since the permit rule will hit first, it will allow traffic to pokerstars.com while blocking the rest of the gambling category. Check the GUI example below:
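Separately from the GUI, the first-match ordering behind this exception can be sketched in Python. The `Rule` class, the `evaluate` function, and the trailing catch-all permit rule are hypothetical and exist only to show why the permit has to sit above the category block.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Rule:
    action: str                     # "permit" or "deny"
    hostname: Optional[str] = None  # matches one destination (the alias)
    category: Optional[str] = None  # matches a web content category

    def matches(self, hostname: str, categories: Set[str]) -> bool:
        if self.hostname is not None:
            return hostname == self.hostname
        if self.category is not None:
            return self.category in categories
        return True  # a rule with no condition matches everything


def evaluate(rules: List[Rule], hostname: str, categories: Set[str]) -> str:
    """Return the action of the first matching rule (first match wins)."""
    for rule in rules:
        if rule.matches(hostname, categories):
            return rule.action
    return "deny"  # assumed fallback when no rule matches


policy = [
    Rule("permit", hostname="pokerstars.com"),  # the exception, listed first
    Rule("deny", category="Gambling"),          # the category block
    Rule("permit"),                             # assumed catch-all for other traffic
]

ceo_site = evaluate(policy, "pokerstars.com", {"Gambling"})
other_site = evaluate(policy, "casino.example.com", {"Gambling"})
# Reversing the first two rules shows why order matters.
wrong_order = evaluate([policy[1], policy[0], policy[2]], "pokerstars.com", {"Gambling"})
```

With the exception first, pokerstars.com is permitted while other gambling sites are denied. With the order reversed, the category rule hits first and the exception is never reached.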

System Requirements
Web Content Classification is included with the 6.4.2 build, and requires a PEF license. It is supported on all 7000 and 7200 series controllers. Note that Aruba Instant has a similar feature in its 4.1 release, which is also available now.