Why AI for profiling IoT and other clients makes sense

By Min-Yi Shen, Data Science Manager

Anyone who has tried to identify the mysterious clients connected to a home Wi-Fi router knows how hard it is. What is a kewo? (It’s a solar pump inverter, by the way, something we’ll probably start seeing more of in the future.) Now imagine what an IT admin in a mission-critical setting must deal with when analyst firms such as IDC and Gartner are predicting that we’ll see over 55 billion connected clients by 2025.

Understanding the types of clients that join a Wi-Fi or wired network is important for the visibility and security of the enterprise. So is knowing how many clients come and go throughout the day. And if you’re in retail or located on a busy street, it also matters how many of those clients are just smartphones in the pockets of people walking past the building. All of this requires accurate, real-time client profiling.

This is where artificial intelligence (AI) and machine learning (ML) make a difference.

Aruba Client Insights is a unified, AI-powered solution built into Aruba Central cloud for this exact purpose. Although other platforms provide some basic client profiling functions, Client Insights stands out for several reasons, including:

  • It does not require the deployment of specialized appliances. It just needs the telemetry your Aruba infrastructure is set up to send to our AI cloud engine.
  • Instead of using static rules for a few networking attributes (most commonly HTTP user agents), Client Insights makes use of several ML algorithms to maximize the efficacy of our client profiling.

Here’s something else. Not only are Aruba access points (APs) easy to set up, they are also designed to use more classifiers than other vendors’ APs. This gives Client Insights more data to work with, and more data means better efficacy. With APs running the latest Aruba OS (AOS) versions, profiling efficacy can reach up to 99%, making this a practical solution for security and bandwidth planning purposes.

The MAC Range Classifier

One of the machine learning solutions we use to enhance client profiling efficacy is our MAC (Media Access Control) Range classifier. As you know, clients connect to the network using a MAC address as their OSI Layer 2 hardware address. Hardware manufacturers typically subdivide the address space under each of their OUIs (Organizationally Unique Identifiers, the first three bytes of the MAC address) into blocks or ranges and assign those ranges to different device types. (See Figure 1.)

Figure 1: Apple MAC Ranges learned from Aruba Central cloud

Because Aruba Central cloud has unprecedented visibility into 100 million clients (and counting), we can learn these ranges from the data we gather. To model the MAC Ranges, we employ a Gaussian process classifier that treats the MAC address space much like a time series. The Gaussian process learns the structure of the client types across that space (Figure 1), and our data science team translates the resulting probability distribution into MAC Ranges. The model can then confidently classify a client from the OUI within its MAC address and the range the address falls into, which helps to accurately learn the client type by manufacturer.
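
To make the idea of MAC Ranges concrete, here is a minimal Python sketch. It is not the Gaussian process classifier described above. As a deliberately simple stand-in, it merges neighboring labeled addresses that share a device type into ranges and then looks up new addresses against those ranges. The function names, the training data, and the 02:00:00 prefix (a locally administered prefix, not a real vendor OUI) are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def parse_mac(mac):
    """Split a MAC address into its OUI (first 3 bytes) and a 24-bit suffix."""
    hex_digits = mac.replace(":", "").replace("-", "").lower()
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    int(hex_digits, 16)  # raises ValueError if any character is not hex
    return hex_digits[:6], int(hex_digits[6:], 16)


def learn_ranges(labeled_macs):
    """Build {oui: [(start, end, label), ...]} from (mac, label) pairs.

    Simplification: consecutive observations (sorted by suffix) with the
    same label are merged into one range. A real model would also have to
    cope with noisy labels and express uncertainty between ranges.
    """
    by_oui = defaultdict(list)
    for mac, label in labeled_macs:
        oui, suffix = parse_mac(mac)
        by_oui[oui].append((suffix, label))

    ranges = {}
    for oui, points in by_oui.items():
        runs = []
        for suffix, label in sorted(points):
            if runs and runs[-1][2] == label:
                runs[-1] = (runs[-1][0], suffix, label)  # extend current run
            else:
                runs.append((suffix, suffix, label))     # start a new run
        ranges[oui] = runs
    return ranges


def classify(ranges, mac):
    """Return the label of the range containing this MAC, or None."""
    oui, suffix = parse_mac(mac)
    runs = ranges.get(oui, [])
    i = bisect_right([start for start, _, _ in runs], suffix) - 1
    if i >= 0 and runs[i][0] <= suffix <= runs[i][1]:
        return runs[i][2]
    return None


# Hypothetical observations under a made-up OUI.
TRAINING = [
    ("02:00:00:00:10:00", "phone"),
    ("02:00:00:00:2f:ff", "phone"),
    ("02:00:00:40:00:00", "laptop"),
    ("02:00:00:4f:00:00", "laptop"),
]
RANGES = learn_ranges(TRAINING)

print(classify(RANGES, "02:00:00:00:20:00"))  # inside the learned phone range
print(classify(RANGES, "02:00:00:30:00:00"))  # in the gap between ranges
```

An address that falls in the gap between two learned ranges returns None here. A probabilistic model is useful precisely in such gaps, because it can report how confident it is instead of giving a hard yes or no.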

The Chained Classifier

The other way we use ML for Client Insights is our Chained Classifier. The main purpose of the Chained Classifier is to mimic the human decision-making process. For example, when you see HTTP user agent data like the following, we can start to make profiling decisions:

Mozilla/5.0 (Linux; Android 5.0.2; SAMSUNG SM-A500FU Build/LRX22G) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/3.3 Chrome/38.0.2125.102 Mobile Safari/537.36

When you see the keyword “Android,” you know it is an Android client, and not an Apple iOS client. As you read further, you see “SAMSUNG” so you know it is a Samsung client, not an HTC. Then “SM-A500” tells you it is a Samsung Galaxy. This reasoning process closely mirrors how decision tree and random forest classifiers work, where a classification is reached by descending from the root of the tree to a leaf node through a chain of decisions.

These decision trees are learned from the rich client telemetry available to us in Aruba Central cloud. As the data gets richer, both the number and the quality of the Chains we can discover will increase.
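
The chain of decisions described above can be sketched in a few lines of Python. This is only an illustration of the root-to-leaf descent: the tree below is hand-written and hypothetical, whereas the Chains in Client Insights are learned from telemetry, and the names CHAIN and classify_user_agent are assumptions made for this example.

```python
import re

# Hypothetical, hand-written decision chain. Each node has a regex to test
# against the user agent, a label, and child nodes that refine the decision.
CHAIN = {
    "label": "root",
    "children": [
        {"pattern": r"Android", "label": "Android", "children": [
            {"pattern": r"SAMSUNG", "label": "Samsung", "children": [
                {"pattern": r"SM-A500", "label": "Samsung Galaxy", "children": []},
            ]},
            {"pattern": r"HTC", "label": "HTC", "children": []},
        ]},
        {"pattern": r"iPhone|iPad", "label": "Apple iOS", "children": []},
    ],
}


def classify_user_agent(user_agent, tree=CHAIN):
    """Descend from the root, following the first matching child each time.

    Returns the list of labels along the path, from coarse to specific.
    An empty list means no rule matched at the first level.
    """
    node = tree
    path = []
    while True:
        for child in node["children"]:
            if re.search(child["pattern"], user_agent):
                path.append(child["label"])
                node = child
                break
        else:
            # No child matched (or we reached a leaf): stop descending.
            return path


SAMPLE_UA = (
    "Mozilla/5.0 (Linux; Android 5.0.2; SAMSUNG SM-A500FU Build/LRX22G) "
    "AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/3.3 "
    "Chrome/38.0.2125.102 Mobile Safari/537.36"
)

print(classify_user_agent(SAMPLE_UA))  # ['Android', 'Samsung', 'Samsung Galaxy']
```

Note that the sample user agent also contains “AppleWebKit” and “Safari,” so a naive keyword match on “Apple” would misclassify it. The order and specificity of the decisions matter, which is one reason learning the chains from data scales better than writing them by hand.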

Fueled by massive datasets whose value has yet to be fully explored, I believe we are entering an age where we can extend the value of AI/ML much further. The only scalable way to offer great insights in this growing, data-centric world will be AI- and ML-powered solutions. This is where Aruba and the industry are heading as a means to improve customer experiences.