Managing Network Services Configuration with Ansible

From Building Network Automation Solutions

Tools like Puppet or Chef are great at managing desired device state: they make sure the actual state of a device (for example, the list of VLANs configured on server-facing ports) eventually matches the specifications in a manifest (Puppet) or recipe (Chef).

Both of these tools manage only the parts of the device state supported by Puppet or Chef agents running on the device (or on a proxy node). While some vendors support a plethora of parameters (see, for example, the Nexus OS Puppet agent documentation), others support the minimum viable subset that gets them a tick-in-the-RFP (interfaces, port channels and VLANs).

Device State Management with Ansible

You can manage any component of a network device state with Ansible. Ideally you’d manage the whole device configuration with Ansible and replace the old configuration with a new one on every change, making it exceedingly simple to add or remove services or other components of device state. The “only” component required on the device is configuration replace functionality, which is available in numerous network operating systems including Arista EOS, Cisco IOS, Cisco IOS-XR, Cumulus Linux and Junos.
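
A minimal sketch of the replace-on-every-change idea, assuming the napalm_install_config module from the napalm-ansible package is installed. The template name device.j2, the configs directory (which must already exist) and the dev_os, username and password variables are hypothetical names chosen for this illustration:

```yaml
---
- hosts: all
  connection: local
  gather_facts: no
  tasks:
  # Build the complete device configuration from the data model
  # (device.j2 is a hypothetical template)
  - name: Render full device configuration
    template:
      src: device.j2
      dest: "configs/{{ inventory_hostname }}.cfg"

  # replace_config swaps the whole running configuration instead of merging,
  # so anything missing from the rendered file disappears from the device
  - name: Replace running configuration
    napalm_install_config:
      hostname: "{{ ansible_host }}"
      username: "{{ username }}"
      password: "{{ password }}"
      dev_os: "{{ dev_os }}"
      config_file: "configs/{{ inventory_hostname }}.cfg"
      replace_config: true
      commit_changes: true
      diff_file: "configs/{{ inventory_hostname }}.diff"
```

Setting commit_changes to false turns the same play into a dry run that only writes the diff file, which is a reasonable first step before trusting the rendered configuration.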

Unfortunately, most brownfield environments cannot use the replace-configuration-on-every-change approach, either due to existing processes, inconsistent device configuration management (changes being implemented manually), or non-unified device configurations.

The best one could do in such environments is to manage individual service state (VLANs, VXLAN, BGP neighbors, VRF…) with Ansible.

Service State Management with Ansible

Simplistic implementations of service state management use a service (or device) data model to create device configurations and push them to the devices regardless of what’s already configured on the device and what potential conflicts might be.

More sophisticated solutions usually implement an algorithm similar to the mechanisms used by Puppet or Chef agents:

  • Check the current device state (assert your assumptions);
  • Execute commands that modify or add device state using tools like NAPALM, vendor_config Ansible core modules, or service-specific modules like nxos_vlan;
  • Check the new device state and verify it matches the desired state.

For more information about Ansible networking modules and numerous hands-on examples watch the Ansible for Networking Engineers webinar.
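
The three steps could be sketched for a single VLAN as follows. This is an illustration under several assumptions, not a tested implementation: the one-VLAN variable vlan is hypothetical, the text parsing assumes that every VLAN row in the show vlan brief printout starts with the VLAN ID followed by a space and the VLAN name, and end_state_vlans_list is assumed to be returned by the same nxos_vlan version that returns existing_vlans_list.

```yaml
---
- hosts: all
  vars:
    # Hypothetical one-VLAN data model used only in this sketch
    vlan: { id: "100", name: "mgmt" }
  tasks:
  # Step 1: read the current state
  - nxos_command:
      provider: "{{cli}}"
      commands:
      - show vlan brief
    register: before
  # Keep only the printout row that starts with our VLAN ID (assumed layout)
  - set_fact:
      vlan_row: "{{ before.stdout_lines[0] | select('match', vlan.id ~ ' ') | list }}"
  # Assumption to assert: the VLAN ID is unused or already carries our name
  - assert:
      that:
      - vlan_row | length == 0 or vlan_row[0].split()[1] == vlan.name

  # Step 2: change the device state
  - nxos_vlan:
      provider: "{{cli}}"
      vlan_id: "{{vlan.id}}"
      name: "{{vlan.name}}"
      state: present
    register: after

  # Step 3: verify that the VLAN exists after the change
  - assert:
      that:
      - vlan.id in after.end_state_vlans_list
```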

The crucial element in these approaches is the answer to the question: what happens to other services that are not listed in the desired state? There are at least four possible answers:

  • Desired state as specified in a data model describes only the services that have to be added. The above three steps are good enough to implement this approach;
  • Desired state explicitly lists new/modified and obsolete (to be removed) services, similar to how you delete files with Ansible using the state: absent parameter of the file module. The above three steps need to be extended with extra configuration elements that delete obsolete services;
  • Desired state is authoritative but not enforced. Services not listed in the desired state should not be configured on the box, but they are not removed. An Ansible playbook could report them, or fail when encountering them;
  • Desired state is authoritative and enforced. Extra services configured on the devices are automatically removed.

Examples

Assuming the following VLAN services data model…

vlans:
  - { id: "1", name: "default" }
  - { id: "100", name: "mgmt", subnet: "172.16.1.0/24"}
  - { id: "101", name: "web",  subnet: "192.168.201.0/24"}
  - { id: "110", name: "db",   subnet: "192.168.202.0/24"}

… this simple playbook configures the desired VLANs on a Nexus OS switch (approach #1):

---
- hosts: all
  tasks:
  - nxos_vlan:
      provider: "{{cli}}"
      vlan_id: "{{item.id}}" 
      state:   "{{item.state | default('present') }}"
      admin_state: "{{ item.admin | default('up') }}"
      name:    "{{item.name}}"
    with_items: "{{vlans}}"

The nxos_vlan module already supports the state parameter. Deleting a VLAN is as simple as adding a state key to a VLAN dictionary (approach #2):

vlans:
  - { id: "1", name: "default" }
  - { id: "100", name: "mgmt", subnet: "172.16.1.0/24"}
  - { id: "101", name: "web",  subnet: "192.168.201.0/24", state: absent}
  - { id: "110", name: "db",   subnet: "192.168.202.0/24"}

If you want to identify extraneous VLANs configured on a switch, use the existing_vlans_list list returned by the nxos_vlan module and compute the set difference between the existing VLAN list and the desired VLAN list. This playbook uses that approach and fails when it finds extra VLANs (approach #3):

---
- hosts: all
  name:  configure VLANs
  tasks:
  - nxos_vlan:
      provider: "{{cli}}"
      vlan_id: "{{item.id}}" 
      state:   "{{item.state | default('present') }}"
      admin_state: "{{ item.admin | default('up') }}"
      name:    "{{item.name}}"
    with_items: "{{vlans}}"
    register: vlan_state
  - set_fact: vlans_list="{{vlan_state.results[0].existing_vlans_list}}"
  - set_fact: target_list="{{ vlans|map(attribute='id')|list }}"
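  # Fail only when the switch has VLANs that are not in the data model.
  # The condition is a bare expression; wrapping it in {{ }} triggers
  # a warning in newer Ansible releases.
  - fail:
      msg: >
        Extra VLANs configured on {{inventory_hostname}} -
        {{vlans_list | difference(target_list)}}
    when: vlans_list | difference(target_list) | length > 0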
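
If you'd rather report the extra VLANs than stop the play, a debug task could take the place of the fail task. This is a minimal sketch that reuses the vlans_list and target_list facts set in the playbook above:

```yaml
  # Drop-in replacement for the fail task: report extra VLANs and carry on
  - debug:
      msg: >
        Extra VLANs configured on {{inventory_hostname}} -
        {{vlans_list | difference(target_list)}}
    when: vlans_list | difference(target_list) | length > 0
```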

Finally, you can remove extraneous VLANs with another nxos_vlan call (approach #4):

---
- hosts: all
  name:  configure VLANs
  tags:  VLAN
  tasks:
  - nxos_vlan:
      provider: "{{cli}}"
      vlan_id: "{{item.id}}" 
      state:   "{{item.state | default('present') }}"
      admin_state: "{{ item.admin | default('up') }}"
      name:    "{{item.name}}"
    with_items: "{{vlans}}"
    register: vlan_state
#
# Merging the three set_fact calls into a single convoluted expression
# is left as an exercise for the reader
#
  - set_fact: vlans_list="{{vlan_state.results[0].existing_vlans_list}}"
  - set_fact: target_list="{{ vlans|map(attribute='id')|list }}"
  - set_fact: extra_vlans="{{ vlans_list|difference(target_list) }}"
  - nxos_vlan: 
      provider: "{{cli}}"
      vlan_id: "{{item }}" 
      state:   absent
    with_items: "{{extra_vlans}}"
    when: extra_vlans | length > 0

For a more comprehensive solution explore my VLAN service GitHub repository.

Overlapping Services

In some cases special considerations should be given to replacing already-configured services, for example:

  • Configuring VRF on an interface belonging to a different VRF;
  • Configuring access-vlan on an interface belonging to another VLAN.

In both cases one should ask “is this a valid service replacement request or a potentially crippling typo?” and act accordingly. The answer to this dilemma depends solely on the business logic and service decommissioning/provisioning processes.

If, however, the service management process requires an explicit decommissioning of old services before a new service can be configured on the same set of resources (for example, interfaces), then the Ansible playbooks implementing the service provisioning process SHOULD check the resource state (for example: is there a VRF configured on this interface?) before changing the network device configuration.
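
Such a resource-state check could look like the following sketch for a Nexus OS interface. The ifname and vrf variables are hypothetical, and the check assumes that the interface configuration printout contains a "vrf member NAME" line whenever the interface belongs to a VRF:

```yaml
---
- hosts: all
  vars:
    # Hypothetical service data: assign one interface to one VRF
    ifname: "Ethernet1/10"
    vrf: "tenant_a"
  tasks:
  # Read the current interface configuration
  - nxos_command:
      provider: "{{cli}}"
      commands:
      - "show running-config interface {{ifname}}"
    register: intf

  # Collect the "vrf member ..." lines, stripped of indentation
  - set_fact:
      vrf_lines: "{{ intf.stdout_lines[0] | map('trim') | select('match', 'vrf member ') | list }}"

  # Refuse to overwrite an existing assignment to a different VRF
  - fail:
      msg: "{{ifname}} on {{inventory_hostname}} already belongs to another VRF: {{vrf_lines}}"
    when:
    - vrf_lines | length > 0
    - ('vrf member ' ~ vrf) not in vrf_lines

  # Configure the VRF only on interfaces that have none
  - nxos_config:
      provider: "{{cli}}"
      parents:
      - "interface {{ifname}}"
      lines:
      - "vrf member {{vrf}}"
    when: vrf_lines | length == 0
```

Whether the play should fail, report, or silently replace the old VRF is exactly the business-logic decision described above; the sketch takes the most conservative option.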

Useful Links

Finally, if you're more interested in the big picture than in individual modules used in an Ansible playbook, I'm sure you'll find it in the Building Network Automation Solutions online course.