Sunday, 26 April 2020

Architecting Work-from-Home Solutions that Scale

The need for people to work from home challenges us to come up with new solutions that scale. One of these solutions involved the enablement of static IPs for VPN connections for individual users.

But why would anyone want to do that?

The Problem


It turns out that for years, India has had a strict law prohibiting VoIP (Voice over IP) over VPN. This law only came to light when the head honcho, Chuck Robbins, encouraged all Cisco employees to work from home. In the past, our call center employees were not allowed to work from home because they were not allowed to use VoIP over VPN, but desperate times call for desperate measures.

With Cisco at the head, a group of tech companies sat down with India’s government and came up with an exception allowing VoIP over VPN, as long as each employee was given the same IP address on every connection and the DoT (Department of Telecommunications) was provided with a record of each employee’s work address. With the exception in place, it was left to us to come up with a solution.

The Solution


Conventionally, IP addresses are assigned through an IP address pool, which gives users the first available IP address from a range.

Take, for example, what happens when I select a site within AnyConnect, e.g. “Headquarters”, and click “Connect”. Within the AnyConnect client is an XML file that maps the site “Headquarters” directly to the URL “headquarters.cisco.com/default”. This URL can be broken down into two parts: the device address, “headquarters.cisco.com”, and the path, “default”, which maps to the tunnel-group “Default_Tunnel_Group”. Within the VPN headend configuration is a line that says the address pool for the tunnel-group “Default_Tunnel_Group” is 10.0.0.1-10.0.0.254, the usable addresses of a “/24”. I am then assigned the first unallocated IP address in that range, in this case let’s say “10.0.0.101”, which becomes my IP address within the Cisco network. However, if I disconnect and then reconnect, I am again assigned whichever address in that range is first available at that moment, which will most likely be a different one.
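The first-available behavior described above can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: the class `AddressPool` and its methods are invented for illustration, and a real headend's allocation logic may differ in its details.

```python
import ipaddress


class AddressPool:
    """Toy model of a first-available address pool (not ASA code)."""

    def __init__(self, first: str, last: str) -> None:
        start = int(ipaddress.IPv4Address(first))
        end = int(ipaddress.IPv4Address(last))
        # Every address in the inclusive range, in ascending order.
        self._addresses = [ipaddress.IPv4Address(n) for n in range(start, end + 1)]
        self._leases = {}  # address -> username currently holding it

    def connect(self, username: str) -> ipaddress.IPv4Address:
        # Hand out the first address that nobody currently holds.
        for address in self._addresses:
            if address not in self._leases:
                self._leases[address] = username
                return address
        raise RuntimeError("address pool exhausted")

    def disconnect(self, address: ipaddress.IPv4Address) -> None:
        # Return the address to the pool.
        self._leases.pop(address, None)


pool = AddressPool("10.0.0.1", "10.0.0.254")
mine = pool.connect("drew")
pool.disconnect(mine)
other = pool.connect("damien")     # takes the address I just released
mine_again = pool.connect("drew")  # gets the next free address instead
print(mine, other, mine_again)
```

Because another user connected between my two sessions, my second session lands on a different address, which is exactly what the DoT exception does not allow.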


The size of the IP address pool, the number of users connecting to a site, and the number of VPN headend devices in a cluster for a site (each with a unique address pool) all make it extremely unlikely that a user will be assigned the same IP address on every connection.

Example configuration of an IP address pool and tunnel group:

ip local pool DEFAULT_EMPLOYEE_POOL 10.0.0.1-10.0.0.254 mask 255.255.255.0

tunnel-group Default_Tunnel_Group type remote-access 
tunnel-group Default_Tunnel_Group general-attributes 
 address-pool DEFAULT_EMPLOYEE_POOL 
 default-group-policy Default_Group_Policy 
tunnel-group Default_Tunnel_Group webvpn-attributes 
 group-url https://headquarters.cisco.com/default enable

Our first approach to assigning static IPs was a solution that came up in forums from years past: create a local user account on the ASA and statically assign an IP to that specific user. However, this would require a static password stored on the ASA, and although it would be encrypted, we knew our friends in InfoSec would have an absolute fit over that one. As a long shot, we attempted to authenticate a local user account with no static password against our AAA servers, but this attempt ultimately failed.

Our second attempt was to look at how we could use ISE (Identity Services Engine) in this scenario. ISE handles all of our authorization requests in the corporate network, whether remote or on-site, and it made sense to use it given that we were mapping static IPs to users. With ISE we encountered two problems. First, ISE does not proxy all information given by RADIUS servers back to the VPN headends, so it was not a viable solution in our partner network, where we rely on RADIUS groups to handle ACLs. Second, there were concerns over how to complete this at scale: manually creating over 7,000 policies in ISE would take a serious effort in both people and time, and we would be sailing uncharted waters since ISE had never been tested for this type of scenario.

Our third approach was to use Active Directory in place of ISE for the IP address mapping. However, we once again faced the issue of resourcing to create 7,000 entries manually as well as the unknown strain we would be putting on the system.

Sometimes the best solution is the simplest. After hours of trying fancy group manipulations with ISE and attempting to get it to pass RADIUS group information, we settled on one of the first ideas that came up while brainstorming, and one we knew should work: a unique tunnel group and an address pool of one IP for each user.


The solution can best be summarized by taking me, username “drew”, as an example of a user that needs a statically assigned IP address. By taking the “/24” from before with the IP range of 10.0.0.1-10.0.0.254, we designate the IP address 10.0.0.201 to be my statically assigned IP address. We declare an address pool of just this one IP address, which is now a “/32”. We assign this address pool to the tunnel group “drew”, with the URL “headquarters.cisco.com/drew”.

Example configuration of a static IP address pool and tunnel group:

ip local pool drew 10.0.0.201 mask 255.255.255.255 

tunnel-group drew type remote-access 
tunnel-group drew general-attributes 
 address-pool drew 
 default-group-policy Default_Group_Policy 
tunnel-group drew webvpn-attributes 
 group-url https://headquarters.cisco.com/drew enable 

After the successful testing and implementation of the above configuration (which used automation detailed below), questions spread throughout our team like wildfire (and to the credit of our customers, they have had similar questions). The solution seems hacky, to say the least. What are the security implications and, very importantly, will it scale? We’re talking about a solution that has to work for thousands of Cisco call center employees in India (a number which has approached 7,000 as of today).

Here are some of the most notable questions:

1. How many tunnel groups (and thus users) can you have on each VPN headend?

Cisco ASA documentation states that the number of tunnel groups that can be configured is equivalent to the maximum number of VPN connections it can support. In our case we are using ASA 5585-SSP60s, which support 10,000 connections and thus can be configured with 10,000 tunnel groups.

2. Does the addition of such a large number of tunnel groups increase overhead on the ASA and thus decrease performance?

The ASA uses a hash map for its tunnel groups, which gives constant-time lookup, so finding a user’s tunnel group does not get slower as more are added. The additional tunnel groups do use memory, but that memory is a fixed cost once they are configured and pales in comparison to the memory used for an ASA’s normal duties of encrypting and decrypting traffic.

Security


With our nerves slightly calmed about the number of tunnel groups we had just deployed to the VPN headend, we had some homework left to do. Because we’re Cisco, a solution is not complete without security, and DAP (dynamic access policies) on VPN headends is one of our core lines of defense. By keeping all tunnel groups under the same blanket group policy, we were able to maintain our standard DAP checks, such as verifying AnyConnect client and operating system versions, as well as more obscure policies such as AnyConnect session timeouts and FQDN split tunneling.

The last item was ensuring that the static IP tunnel groups we had just created were used exclusively by the employees for whom they were intended, and that employees who were supposed to be using these static IPs were not connecting to our regular corporate VPN headends and getting dynamically assigned IPs. To ensure that only the intended employee could connect successfully to a given tunnel group, we applied a Lua script through DAP to the blanket group policy.


EVAL(cisco.aaa.username, "EQ", cisco.aaa.tunnelgroup) 

Essentially, this checks that the authenticating username is the same as the name of the tunnel group, which is purposely set to the user’s username. This prevents the user “damien” from connecting to my tunnel group, “drew”, and using my static IP of 10.0.0.201. To ensure that call center employees connected exclusively to their assigned static IP tunnels, we used ISE to block them from our corporate VPN headends by denying authorization to those sites for users in a designated Active Directory (AD) group.
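Taken together, the two rules amount to a small decision table. The Python sketch below models it for illustration only: it is not how DAP or ISE evaluate policy, the `ConnectionAttempt` fields and the "static"/"corporate" labels are invented here, and it assumes the name comparison is an exact, case-sensitive match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionAttempt:
    username: str
    tunnel_group: str
    headend: str                # "static" or "corporate" (labels invented here)
    in_call_center_group: bool  # member of the blocked AD group


def is_allowed(attempt: ConnectionAttempt) -> bool:
    if attempt.headend == "static":
        # DAP rule: the username must match the tunnel group name exactly.
        return attempt.username == attempt.tunnel_group
    # ISE rule: call center employees are denied on corporate headends.
    return not attempt.in_call_center_group


print(is_allowed(ConnectionAttempt("drew", "drew", "static", True)))
print(is_allowed(ConnectionAttempt("damien", "drew", "static", True)))
print(is_allowed(ConnectionAttempt("drew", "Default_Tunnel_Group", "corporate", True)))
```

The first attempt is accepted, while the second (wrong tunnel group) and third (call center employee on a corporate headend) are rejected.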

Automation and Management


You can find the code in DevNet Code Exchange that we used to generate the ASA configuration for the thousands of tunnel groups and static IPs we needed. It uses simple string interpolation along with a text file of users. In addition to generating the tunnel groups, the functions provided also help you tear down these tunnel groups for easier clean up.
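To give a feel for the approach, here is a minimal Python sketch of string-interpolated configuration generation. It is not the code published on DevNet Code Exchange. The function names, the "username ip" input format, and the constants are assumptions made for illustration, and the teardown commands in particular are my best guess at the removal syntax and should be checked against the ASA command reference before use.

```python
import ipaddress

HEADEND = "headquarters.cisco.com"     # device address from the example above
GROUP_POLICY = "Default_Group_Policy"  # the blanket group policy


def parse_users(lines):
    """Parse 'username ip' lines (a hypothetical file format) into tuples."""
    users = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        username, ip = line.split()
        ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
        users.append((username, ip))
    return users


def build_config(username, ip):
    """Return the pool and tunnel-group configuration for one user."""
    return "\n".join([
        f"ip local pool {username} {ip} mask 255.255.255.255",
        f"tunnel-group {username} type remote-access",
        f"tunnel-group {username} general-attributes",
        f" address-pool {username}",
        f" default-group-policy {GROUP_POLICY}",
        f"tunnel-group {username} webvpn-attributes",
        f" group-url https://{HEADEND}/{username} enable",
    ])


def teardown_config(username, ip):
    """Return removal commands (assumed syntax; verify before use)."""
    return "\n".join([
        f"clear configure tunnel-group {username}",
        f"no ip local pool {username} {ip} mask 255.255.255.255",
    ])


for user, address in parse_users(["drew 10.0.0.201", "damien 10.0.0.202"]):
    print(build_config(user, address))
```

In practice the list passed to `parse_users` would come from reading the text file of users, and the generated text would be handed to whichever deployment tool pushes configuration to the headend.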

These functions are not meant to be a full-blown solution, but to provide you with the building blocks to make one.

The solution we created was not as elegant as we would have liked; however, with automation we can change that. Using these configuration generation functions along with our favorite network configuration tools, such as NSO (Network Service Orchestrator), Ansible, or Paramiko, we can create templates to automate the deployment of this configuration to the VPN headend.

Taking things a step further, we can build on top of these network configuration tools with an application that manages these tunnel groups, paired with a database of the users and their statically assigned IPs. Thus, when you want to add or remove a user, the application handles the allocation or deallocation of IPs for you, without you having to trawl through thousands of lines of configuration.
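As a rough illustration of what the database side of such an application might look like, the Python sketch below keeps user-to-IP assignments in SQLite. The class `StaticIpRegistry`, its schema, and the address range are hypothetical, and a production version would also need locking and a link to the deployment tooling.

```python
import ipaddress
import sqlite3


class StaticIpRegistry:
    """Tracks which static IP belongs to which user (illustrative only)."""

    def __init__(self, first, last, db_path=":memory:"):
        self._first = int(ipaddress.IPv4Address(first))
        self._last = int(ipaddress.IPv4Address(last))
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS assignments ("
            "username TEXT PRIMARY KEY, ip INTEGER UNIQUE NOT NULL)"
        )

    def allocate(self, username):
        """Return the user's IP, assigning the lowest free one if needed."""
        row = self._db.execute(
            "SELECT ip FROM assignments WHERE username = ?", (username,)
        ).fetchone()
        if row:
            return str(ipaddress.IPv4Address(row[0]))
        used = {r[0] for r in self._db.execute("SELECT ip FROM assignments")}
        for candidate in range(self._first, self._last + 1):
            if candidate not in used:
                with self._db:  # commits on success
                    self._db.execute(
                        "INSERT INTO assignments VALUES (?, ?)",
                        (username, candidate),
                    )
                return str(ipaddress.IPv4Address(candidate))
        raise RuntimeError("no free static IPs left in range")

    def release(self, username):
        """Free the user's IP; return True if an assignment was removed."""
        with self._db:
            cursor = self._db.execute(
                "DELETE FROM assignments WHERE username = ?", (username,)
            )
        return cursor.rowcount > 0


registry = StaticIpRegistry("10.0.0.201", "10.0.0.203")
drew_ip = registry.allocate("drew")
damien_ip = registry.allocate("damien")
print(drew_ip, damien_ip)
```

Calling `allocate` again for an existing user returns the same address, which is the property the whole design depends on, and a released address becomes available for the next new user.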


In terms of the Walk-Run-Fly journey, we see our solution as being in the “Run” state. We welcome and encourage use and enhancements from the community to achieve “Fly” status.

Closing Thoughts


It has now been over a month since we deployed our static IP solution for call center employees, and for the most part, things have been relatively smooth for such an ad hoc implementation. This is not to say we have not faced issues since then. However, we have continued to work on improvements, such as adding redundancy to our call center VPN headend using an active failover configuration.

With all that being said, I cannot stress enough how much automation saved us in this hacky situation and continues to make things simple through the management of these static IP tunnels.
