Tuesday, 2 March 2021

Machine Reasoning is the new AI/ML technology that will save you time and facilitate offsite NetOps

Cisco Prep, Cisco Tutorials and Material, Cisco Career, Cisco Preparation, Cisco Guides

Machine reasoning is a new category of AI/ML technologies that can enable a computer to work through complex processes that would normally require a human. Common applications for machine reasoning are detail-driven workflows that are extremely time-consuming and tedious, like optimizing your tax returns by selecting the best deductions based on the many available options. Another example is the execution of workflows that require immediate attention and precise detail, like the shut-off protocols in a refinery following a fire alarm. What both examples have in common is that executing each process requires a clear understanding of the relationship between the variables, including order, location, timing, and rules, because each decision in a workflow can alter the steps that follow.

So how can we program a computer to perform these complex workflows? Let’s start by understanding how the process of human reasoning works. A good example in everyday life is the front door to a coffee shop. As you approach the door, your brain goes into reasoning mode and looks for clues that tell you how to open the door. A vertical handle usually means pull, while a horizontal bar could mean push. If the building is older and the door has a knob, you might need to twist the knob and then push or pull, depending on which side of the threshold the door is mounted. Your brain does all of this reasoning in an instant, because it’s quite simple and based on having opened thousands of doors. We could program a computer to react to each of these variables in order, based on incoming data, and step through this same process.
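To make the door example concrete, here is a minimal Python sketch of stepping through those clues in order. The `Door` type, its field names, and the `how_to_open` function are made up for illustration; this is only a toy rule chain, not how any real reasoning engine is built.

```python
from dataclasses import dataclass


@dataclass
class Door:
    # Both fields are hypothetical names chosen for this illustration.
    hardware: str          # "vertical_handle", "horizontal_bar", or "knob"
    opens_toward_us: bool  # True if the door swings toward the person


def how_to_open(door: Door) -> list:
    """Return the ordered actions suggested by the visible clues."""
    steps = []
    if door.hardware == "vertical_handle":
        steps.append("pull")
    elif door.hardware == "horizontal_bar":
        steps.append("push")
    elif door.hardware == "knob":
        # The knob adds a step, and the next step depends on the mounting.
        steps.append("twist knob")
        steps.append("pull" if door.opens_toward_us else "push")
    else:
        steps.append("look for more clues")
    return steps


print(how_to_open(Door(hardware="knob", opens_toward_us=False)))
```

Note how the knob case changes the steps that follow it, which is the property that makes workflows harder than simple lookups.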

Now let’s apply these concepts to networking. A common task in most companies is compliance checking, where each network device (switch, access point, wireless controller, and router) is checked for software version, security patches, and consistent configuration. In small networks, this is a full day of work; larger companies might have an IT administrator dedicated to this process full-time. A cloud-connected machine reasoning engine (MRE) can keep tabs on your device manufacturer’s online software updates and security patches in real time. It can also identify identical configurations for device models and organize them in groups, so as to verify consistency for all devices in a group. In this example, the MRE is automating a very tedious and time-consuming process that is critical to network performance and security, but a task that nobody really enjoys doing.
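The grouping idea can be sketched in a few lines of Python. This is an assumption-laden illustration: the inventory records, the field names (`name`, `model`, `version`), and the rule "the most common version in a group is the baseline" are all invented here and are not taken from Cisco DNA Center.

```python
from collections import Counter, defaultdict


def find_version_outliers(devices):
    """Group devices by model and flag any device whose software version
    differs from the most common version in its group (assumed baseline)."""
    groups = defaultdict(list)
    for device in devices:
        groups[device["model"]].append(device)

    outliers = []
    for model, members in groups.items():
        baseline, _count = Counter(m["version"] for m in members).most_common(1)[0]
        for m in members:
            if m["version"] != baseline:
                outliers.append((m["name"], model, m["version"], baseline))
    return outliers


# Invented inventory, for illustration only.
inventory = [
    {"name": "access-sw-1", "model": "model-A", "version": "1.2"},
    {"name": "access-sw-2", "model": "model-A", "version": "1.2"},
    {"name": "access-sw-3", "model": "model-A", "version": "1.1"},
    {"name": "edge-rtr-1", "model": "model-B", "version": "4.0"},
]

for name, model, found, expected in find_version_outliers(inventory):
    print(f"{name} ({model}): running {found}, group baseline is {expected}")
```

A real compliance check would compare against the vendor's published versions and patches rather than a majority vote; the majority rule is used here only to keep the example self-contained.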

Another good real-world example is troubleshooting an STP data loop in your network. Spanning Tree Protocol (STP) loops often appear after upgrades or additions to a layer-2 access network and can cause data storms that result in severe performance degradation. The process for diagnosing, locating, and resolving an STP loop can be time-consuming and stressful. It also requires a certain level of networking knowledge that newer IT staff members might not yet have. An AI-powered machine reasoning engine can scan your network, locate the source of the loop, and recommend the appropriate action in minutes.
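As a rough illustration of what "locating a loop" means, the sketch below treats the links that are currently forwarding for one VLAN as an undirected graph and uses a union-find structure to report any link that closes a cycle. The function name and the switch names are hypothetical, and this is a generic graph technique, not a description of how the MRE actually traces loops.

```python
def find_loop_links(forwarding_links):
    """Return the links that close a cycle among forwarding links.

    forwarding_links: iterable of (switch_a, switch_b) pairs for one VLAN.
    """
    parent = {}

    def find(node):
        # Find the representative of the node's set, compressing the path.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    loop_links = []
    for a, b in forwarding_links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected, so this link forms a loop.
            loop_links.append((a, b))
        else:
            parent[root_a] = root_b
    return loop_links


# Hypothetical topology: sw1-sw2-sw3 form a triangle, sw4 hangs off sw3.
links = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw1"), ("sw3", "sw4")]
print(find_loop_links(links))
```

In a healthy spanning tree the forwarding links never form a cycle, so an empty result is the expected outcome.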

Cisco DNA Center delivers some incredible machine reasoning workflows with the addition of a powerful cloud-connected Machine Reasoning Engine (MRE). The solution offers two ways to experience the usefulness of this new MRE. The first way is something many of you are already aware of, because it’s been part of our AI/ML insights in Cisco DNA Center for a while now: proactive insights. When Cisco DNA Center’s assurance engine flags an issue, it may decide to send this issue to the MRE for automated troubleshooting. If there is an MRE workflow to resolve this issue, you will be presented with a run button to execute that workflow and resolve the issue. Since we’ve already mentioned STP loops, let’s take a look at how that would work.

When a broadcast storm is detected, AI/ML can look at the IP addresses and determine that it’s a good candidate for STP troubleshooting. You’ll get the following window when you click on the alert:

Image 1: Broadcast storm detected

When you click on the button “Start Automate Troubleshooting,” you spin up the machine reasoning engine and it traces the host flaps. If it detects STP loops, you’ll see this window:

Image 2: STP Loops Detected

Image 3: STP loops identified by device and VLAN

Now click on view details and the MRE will present you with the specifics for the related VLANs as well as a logical map of the loop with the names of the relevant devices and the VLAN number. All you need to do now is prune your VLANs in those switches, and you’ve solved a complex issue in just a couple of minutes. The ease with which this problem is resolved shows how the MRE can bridge the skill gap and enable less experienced IT team members to proactively resolve network issues. It also demonstrates that machines can discover, investigate, and resolve network issues much faster than a human can. Eliminating human latency in issue resolution can greatly improve user experience on your network.

Another example of a proactive workflow is the “PSIRT alert,” which flags Cisco devices that have advisories for bug or vulnerability software patches. You will see this alert automatically, anytime Cisco has released a PSIRT advisory that is relevant to one of your devices. Simply click the PSIRT alert and the software patch will be displayed and ready to load. The Cisco DNA Center team is working hard to create more proactive MRE workflows, so you’ll see more of these automated troubleshooting solutions in future upgrades.

The second way to experience machine reasoning in Cisco DNA Center is in the new “Network Reasoner Dashboard,” which is located in the “Tools” menu. There you will find five new buttons that execute automated workflows through the MRE.

Image 4: Network Reasoner Dashboard

1. CPU Utilization: There are a number of reasons that the CPU in a networking device would be experiencing high utilization. If you have ever had to troubleshoot this, you know that the remediation list is quite long and that the tasks involved are time-consuming and require a seasoned IT engineer to perform. This button works through numerous checks, such as IOS processes, packets-per-second flow, broadcast storms, etc. It then returns a result with specific guided remediation to resolve the issue.

2. Interface Down: Understanding the reasons for an interface that doesn’t come up requires deep knowledge of virtual routing and forwarding (VRF). This means that your less experienced team members will likely escalate this issue to a higher-level engineer to be resolved. Furthermore, unless your switch has the capability of advanced telemetry, you would need to have physical access to the switch in order to rule out a Layer-1 problem such as an SFP, cables, connectors, patch panel, etc. This button compares the interface link parameters at each end and runs a loopback, ping, traceroute, and other tests before returning a result for the most likely cause.

3. Power Supply: Cisco Catalyst switches can detect power issues related to inconsistent voltage, fluctuating input, no connection, etc. This is generally done on site with visual inspection of the interface and LEDs. The MRE workflow uses sensors and logic reasoning to determine the probable cause. So, press this button if you want to skip a trip to the switch site.

4. Ping Device: I know what you’re thinking: it’s so simple to ping a device. But it does take time to open a CLI window, and it’s a distraction from the window you have open. Now all you need to do is push a button and enter the target IP address.

5. Fabric Data Collection: Moving to a software-defined network with a fully layered fabric and micro-segmentation has tremendous benefits, but it does take some training to master. This button will collect show command outputs from network devices for complete visibility of your overlay (virtual) network. Having clear visibility can help troubleshoot issues in your fabric network.
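One of the checks mentioned in the list above, comparing the link parameters at each end of an interface, is easy to picture in code. The Python sketch below is purely illustrative: the parameter names (`speed`, `duplex`, `mtu`) and the dictionary layout are assumptions, not the data model the Interface Down workflow uses.

```python
def compare_link_ends(local, remote, keys=("speed", "duplex", "mtu")):
    """Return {parameter: (local_value, remote_value)} for every mismatch."""
    return {
        key: (local.get(key), remote.get(key))
        for key in keys
        if local.get(key) != remote.get(key)
    }


# Hypothetical settings read from the two ends of one link.
local_end = {"speed": "1000", "duplex": "full", "mtu": 1500}
remote_end = {"speed": "1000", "duplex": "half", "mtu": 1500}

mismatches = compare_link_ends(local_end, remote_end)
print(mismatches or "link parameters match")
```

A mismatch found this way is only a candidate cause; the workflow described above also runs loopback, ping, and traceroute tests before it reports the most likely one.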

Now that you know what machine reasoning is, and what it can offer your team, let’s take a look at how it works. It all starts with Cisco subject matter experts who have created a knowledge base of the processes required to achieve certain outcomes, based on best practices, defect signatures, PSIRTs, and other data. Using a “workflow editor,” these processes are encapsulated into a central knowledge base, located in the Cisco cloud. When the AI/ML assurance engine in Cisco DNA Center sees an issue, it will send this issue to the MRE, which then uses inferences to select a relevant workflow from the knowledge base in the cloud. Cisco DNA Center can then present remediation or execute a complete workflow to resolve the issue. In the case of the workflows on demand in the Network Reasoner Dashboard, the MRE simply selects the workflow from the knowledge base and executes it.

Figure 1: MRE architecture
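The selection step described above can be pictured as matching the facts of an issue against the conditions attached to each workflow. The following Python sketch is a deliberately simplified stand-in: the `Workflow` type, the condition keys, and the first-match rule are all assumptions made for illustration, and the actual MRE inference is not documented here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Workflow:
    # Hypothetical structure for an entry in the knowledge base.
    name: str
    conditions: dict
    steps: list = field(default_factory=list)


def select_workflow(issue: dict, knowledge_base: list) -> Optional[Workflow]:
    """Return the first workflow whose conditions all match the issue."""
    for workflow in knowledge_base:
        if all(issue.get(k) == v for k, v in workflow.conditions.items()):
            return workflow
    return None


# Invented knowledge base entries, for illustration only.
knowledge_base = [
    Workflow("stp-loop-troubleshooting",
             {"symptom": "broadcast_storm", "layer": 2},
             ["trace host flaps", "map loop by device and VLAN"]),
    Workflow("high-cpu-troubleshooting",
             {"symptom": "high_cpu"},
             ["check processes", "check packet rate"]),
]

issue = {"symptom": "broadcast_storm", "layer": 2, "device": "access-sw-1"}
chosen = select_workflow(issue, knowledge_base)
print(chosen.name if chosen else "no matching workflow")
```

An on-demand button in the dashboard would skip the matching step and look a workflow up by name instead.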

If you’re following my description of the process on the image above, you’ll notice I left out a few icons in the diagram: Community, Partners, and Governance. Cisco is inviting our DevNet community and fabulous Cisco Partners to create and publish MRE workflows. In conjunction with Cisco CX, we have developed a governance process, which works inside of our software Early Field Trials (EFT) program. This allows us to grow the library of workflows in the Network Reasoner window with industry-specific as well as other interesting and time-saving workflows. What tedious networking tasks would you like to automate? Let me know in the comments below!

If you haven’t yet installed the latest Cisco DNA Center software (version 2.1.2.x), the newly expanded machine reasoning engine is a great reason to do it. Look for continued development in our AI/ML machine reasoning engine in the coming releases with features for compliance verification (HIPAA and PCI DSS), network consistency checks (DNS, DHCP, IPAM, and AAA), security vulnerabilities (PSIRTs), and more.

Source: cisco.com
