NetDevOps: An Overview

I. Problem Statements

Problem 1 - Monitoring

An engineer would like his monitoring stack to start monitoring new devices automatically every time he adds them to the network.

Problem 2 - Configuration

An engineer wants to add new commands to all 300 devices. Currently she tests the commands on a few devices, then manually logs in to each remaining device to add them one by one.

Problem 3 - Auditing

A network manager would like to ensure that every active interface on her 300 switches is in legitimate use.

II. Infrastructure as Code

Values borrowed from DevOps concepts

  • Change is a fact of life
  • Small and frequent changes
  • Continuous integration and deployment
  • Test often
  • Continuous feedback and improvement

III. Tools

Single Source of Truth

  • Represents the desired state of a network versus its operational state
  • Designated authority over a specific data domain
    • Where to find the “correct” value for a piece of data
  • Typically includes:
    • IP address management (IPAM) - IP networks and addresses, VRFs, and VLANs
    • Equipment racks - Organized by group and site
    • Devices - Types of devices and where they are installed
    • Connections - Network, console, and power connections among devices
    • Virtualization - Virtual machines and clusters
    • Data circuits
  • Common tools: NetBox or Nautobot

      Labs

      Nowadays a network lab no longer requires resources big enough to power a NASA space program.
      • VM: GNS3 or EVE-NG (Emulated Virtual Environment Next Generation)
      • Docker: ContainerLab
      • Cloud: Arista, Cisco DevNet and Juniper Networks

      Develop

      Coding, scripting, and programming.
      • Editors: VSCode, Vim and Sublime Text
      • Languages: Python and Go
      • Libraries: Netmiko, NAPALM, Nornir and Scrapli

      Review

      Version control, peer review, and repository.
      • GitLab
      • GitHub

      Test

      pyATS, Batfish and Scrapli

      Release

      Automated deployment, configuration management, and provisioning.
      • Nornir
      • Ansible
      • Chef
      • SaltStack

      Observe

      Service metrics, streaming telemetry, and the monitoring and observability stack.
      • Data schemas/models: vendor-native YANG models and OpenConfig YANG models
      • Transports/protocols: SNMP, NETCONF, gNMI, RESTCONF
      • Collectors: Telegraf
      • Monitoring tools/TSDB: Prometheus, Elasticsearch and InfluxDB
      • Logging: Graylog and Grafana Loki
      • Visualization: Grafana and Kibana

      Alert

      Integration of alerts and notifications with chat apps via webhooks or APIs.
      • Prometheus Alertmanager
      • Grafana Alerting
      • Python
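
As a rough illustration of the "Python" option, the sketch below turns an Alertmanager-style webhook payload into a chat message and prepares a Telegram Bot API `sendMessage` request. It assumes the payload carries an `alerts` list with `status`, `labels` and `annotations`, and that `alertname`, `instance` and `summary` are the labels and annotations in use. The token and chat ID are placeholders, and the function names are made up for this example.

```python
import json
import urllib.request


def format_alerts(payload: dict) -> str:
    """Turn an Alertmanager-style webhook payload into one line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        status = alert.get("status", "unknown").upper()
        alertname = labels.get("alertname", "alert")
        instance = labels.get("instance", "unknown")
        line = "[{}] {} on {}".format(status, alertname, instance)
        summary = alert.get("annotations", {}).get("summary", "")
        if summary:
            line += ": " + summary
        lines.append(line)
    return "\n".join(lines)


def build_telegram_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Prepare (but do not send) a Telegram Bot API sendMessage request."""
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        "https://api.telegram.org/bot{}/sendMessage".format(token),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    sample = {
        "alerts": [
            {
                "status": "firing",
                "labels": {"alertname": "InterfaceDown", "instance": "sw1"},
                "annotations": {"summary": "Gi0/1 is down"},
            }
        ]
    }
    print(format_alerts(sample))
```

Passing the prepared request to `urllib.request.urlopen` would actually deliver the message. Keeping formatting and sending separate makes the formatting easy to test without network access.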

      IV. Back to the Problems, with Solutions

      Problem 1 - Monitoring

      An engineer would like his monitoring stack to start monitoring new devices automatically every time he adds them to the network.
      Rough idea:
      Manually add the new device to the source of truth, Nautobot. Nautobot then calls a webhook served by a custom script, which generates a new Telegraf config to monitor the device and sends API update requests to Prometheus and Grafana.
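
The custom script in this idea could look roughly like the sketch below. It assumes the webhook body is JSON with `event`, `model` and `data` keys, and that `data` holds the device `name` and a `primary_ip4.address` in CIDR form; check the payload your Nautobot version actually sends. The SNMP community, the single uptime OID and the function names are illustrative choices only. Writing the file, reloading Telegraf and updating Prometheus and Grafana are left out.

```python
from typing import Optional, Tuple

# Minimal Telegraf SNMP input; a real config would poll far more than uptime.
SNMP_TEMPLATE = """\
[[inputs.snmp]]
  agents = ["udp://{address}:161"]
  version = 2
  community = "{community}"

  [[inputs.snmp.field]]
    name = "uptime"
    oid = ".1.3.6.1.2.1.1.3.0"

  [inputs.snmp.tags]
    device = "{name}"
"""


def render_snmp_input(name: str, address: str, community: str = "public") -> str:
    """Render one Telegraf [[inputs.snmp]] block for a device."""
    return SNMP_TEMPLATE.format(name=name, address=address, community=community)


def handle_device_webhook(payload: dict) -> Optional[Tuple[str, str]]:
    """Return (filename, config text) for a newly created device, else None."""
    if payload.get("model") != "device" or payload.get("event") != "created":
        return None
    data = payload.get("data", {})
    primary = data.get("primary_ip4")
    if not primary or not data.get("name"):
        return None  # nothing to poll until the device has a name and an IP
    address = primary["address"].split("/")[0]  # strip the prefix length
    return data["name"] + ".conf", render_snmp_input(data["name"], address)


if __name__ == "__main__":
    example = {
        "event": "created",
        "model": "device",
        "data": {"name": "sw1", "primary_ip4": {"address": "10.0.0.1/24"}},
    }
    filename, config = handle_device_webhook(example)
    print(filename)
    print(config)
```

One file per device keeps removals simple: deleting a device from the source of truth can map to deleting a single config file.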

      Problem 2 - Configuration

      An engineer wants to add new commands to all 300 devices. Currently she tests the commands on a few devices, then manually logs in to each remaining device to add them one by one.
      Rough idea:
      Ensure the source of truth, Nautobot, is up to date with all 300 devices. Use the Nornir framework with a dynamic inventory from Nautobot to send the commands via Scrapli to all devices at once.
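
To make the "test on a few, then push to all at once" flow concrete, here is a standard-library sketch of a staged, parallel rollout. It does not use the Nornir or Scrapli APIs. `send` is a hypothetical callable standing in for whatever actually pushes the commands to a device, and the host list would come from the source of truth. In a real setup Nornir's runner and a Scrapli task would take over these roles.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical sender signature: send(host, commands) raises on failure.
Sender = Callable[[str, List[str]], object]


def push_to_hosts(hosts: List[str], commands: List[str], send: Sender,
                  max_workers: int = 20) -> Dict[str, str]:
    """Run send() against every host in parallel and record the outcome."""
    def attempt(host: str) -> str:
        try:
            send(host, commands)
            return "ok"
        except Exception as exc:  # one bad device must not stop the others
            return "failed: {}".format(exc)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(hosts, pool.map(attempt, hosts)))


def staged_rollout(hosts: List[str], commands: List[str], send: Sender,
                   canary_count: int = 3) -> Dict[str, str]:
    """Push to a few canary devices first; continue only if all succeed."""
    canaries, rest = hosts[:canary_count], hosts[canary_count:]
    results = push_to_hosts(canaries, commands, send)
    if any(status != "ok" for status in results.values()):
        return results  # stop here; the remaining devices are untouched
    results.update(push_to_hosts(rest, commands, send))
    return results


if __name__ == "__main__":
    def fake_send(host: str, commands: List[str]) -> None:
        print("would send {} command(s) to {}".format(len(commands), host))

    inventory = ["sw{}".format(i) for i in range(1, 7)]
    print(staged_rollout(inventory, ["ntp server 10.0.0.1"], fake_send))
```

The canary stage mirrors the manual habit of testing on a few devices first, so a bad command is caught before it reaches the rest of the network.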

      Problem 3 - Auditing

      A network manager would like to ensure that every active interface on her 300 switches is in legitimate use.
      Rough idea:
      Ensure the source of truth, Nautobot, holds the correct desired state. Use Batfish with a dynamic inventory from Nautobot to validate the interface configuration of every device. If a device's config differs from the source of truth, Batfish will display the diff.
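
The comparison at the heart of this audit can be sketched in plain Python. This is not the Batfish API. It assumes two inputs gathered elsewhere: the interfaces the source of truth approves as active, and the interfaces actually found active, each as a mapping of device name to interface names. `Finding` and `audit_interfaces` are names made up for this example.

```python
from typing import Dict, List, NamedTuple, Set


class Finding(NamedTuple):
    device: str
    interface: str
    reason: str


def audit_interfaces(approved: Dict[str, Set[str]],
                     active: Dict[str, Set[str]]) -> List[Finding]:
    """Report every interface whose real state disagrees with the source of truth."""
    findings = []
    for device in sorted(set(approved) | set(active)):
        want = approved.get(device, set())
        have = active.get(device, set())
        for ifname in sorted(have - want):
            findings.append(Finding(device, ifname, "active but not approved"))
        for ifname in sorted(want - have):
            findings.append(Finding(device, ifname, "approved but not active"))
    return findings


if __name__ == "__main__":
    approved = {"sw1": {"Gi0/1", "Gi0/2"}}
    active = {"sw1": {"Gi0/1", "Gi0/3"}, "sw2": {"Gi0/1"}}
    for finding in audit_interfaces(approved, active):
        print(finding.device, finding.interface, finding.reason)
```

Because the audit only reads state, it is safe to run repeatedly, for example on a schedule, with any findings forwarded to the alerting stack.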

      V. So Where Do We Start?

      We must learn to crawl before we can walk. 

      - From an expert somewhere

      One possible path

      • Get a GitHub account
      • Understand some basic data structures and algorithms
      • Learn Python syntax and fundamentals
      • Understand the concepts of APIs and distributed system architecture
      • Try a simple library such as Netmiko
      • Try a framework such as Nornir with a static inventory
      • Upgrade Nornir to a dynamic inventory from NetBox
      Try non-destructive tasks such as monitoring and auditing before making changes.

       

      Resources & who to follow

      NetDevOps_KH

      A Cambodia-based network engineering community Telegram group: t.me/NetDevOps_KH

      Source: 
      by Vireak Ouk
      some kind of engineer

      25 Sep 2021 

