Network Troubleshooting with NSX Traceflow

February 1, 2016

One of the biggest challenges with troubleshooting a network is knowing just what’s going on behind the scenes. That’s why we recommend troubleshooting with NSX Traceflow.

Traceflow is one of the new features VMware added in NSX for vSphere 6.2. I thought I’d take a moment to briefly show Traceflow in action. I won’t go too in depth with this post (I plan to write a follow-up article later!), but I did want to cover the basics.

You can access Traceflow by launching the vSphere Web Client.

Click on ‘Networking & Security’ and, under the ‘Tools’ category in the navigator, click on ‘Traceflow’. It’ll let you select a source VM and vNIC. This tells NSX where to inject its test packet, so you’re going to want to select something connected to one of your logical switches. Then select a destination. In the sample below I’m just having it send the traffic to a VM attached to another logical switch. The ‘Advanced Options’ available to you let you set things like protocol and frame size. If you’re using TCP or UDP you can also set the source and destination ports, and for TCP you can set specific flags.
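To make those options a little more concrete, here is a minimal Python sketch of the inputs a trace needs and which ones depend on the protocol. This is not the NSX API: the `TraceRequest` class, its field names, and the sample vNIC name and IP address are all hypothetical, and the only rules it encodes are the ones described above (ports apply to TCP or UDP, flags apply to TCP).

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class TraceRequest:
    """Hypothetical container for Traceflow inputs (not an NSX type)."""
    source_vnic: str                   # where the test packet is injected
    destination_ip: str                # where the packet should end up
    protocol: str = "ICMP"             # assumed choices: ICMP, TCP, UDP
    frame_size: Optional[int] = None   # None means "leave at the default"
    src_port: Optional[int] = None     # TCP/UDP only
    dst_port: Optional[int] = None     # TCP/UDP only
    tcp_flags: Set[str] = field(default_factory=set)  # TCP only

    def validate(self) -> None:
        """Raise ValueError if an option does not fit the chosen protocol."""
        if self.protocol not in {"ICMP", "TCP", "UDP"}:
            raise ValueError(f"unsupported protocol: {self.protocol}")
        has_ports = self.src_port is not None or self.dst_port is not None
        if has_ports and self.protocol not in {"TCP", "UDP"}:
            raise ValueError("ports only apply to TCP or UDP traces")
        if self.tcp_flags and self.protocol != "TCP":
            raise ValueError("TCP flags only apply to TCP traces")


# Example: a TCP SYN trace to port 443 on a made-up destination.
request = TraceRequest(
    source_vnic="web01-vnic0",     # hypothetical vNIC name
    destination_ip="10.0.2.10",    # hypothetical address
    protocol="TCP",
    dst_port=443,
    tcp_flags={"SYN"},
)
request.validate()  # passes silently; an ICMP trace with flags would raise
```

Thinking of a trace this way can help when deciding what to test: the source vNIC and destination are always required, and everything else just shapes the injected packet.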

Once you’ve set everything up, hit the Trace button and wait for the results. I’m doing a unicast trace here, which is basically going to simulate a ping from my source vNIC to the destination IP address. It takes a bit of time to complete, but it returns a lot of useful information.

Once it’s complete we’ll see everything that happened to the injected packet. It leaves the vNIC and gets processed by the firewall. In this case my firewall policy is configured to allow the packet, so the firewall forwards it to the logical switch. I’m using a distributed logical router here to route traffic between logical switches, so the router forwards the packet to the ‘routed-vnet2’ logical switch. From there the packet needs to get from ‘hq-l-esxi12.kpsc.lan’ to ‘hq-l-esxi11.kpsc.lan’. This is where VXLAN comes into play (more on this in a later post!). Once the packet arrives on the second host, the firewall processes it again and finally passes it to the destination VM’s vNIC.
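The path above can be sketched as an ordered list of observations, which is a handy way to think about reading a trace: walk the hops in order and look for the first component that dropped the packet. The `Observation` type, the component labels, and the action names below are my own illustration rather than Traceflow's actual output format, and the VXLAN tunnel endpoint hops are an assumption about how the packet crosses between hosts. The hostnames and the ‘routed-vnet2’ switch come from the trace described above.

```python
from typing import List, NamedTuple, Optional


class Observation(NamedTuple):
    """One step in the packet's path (hypothetical structure)."""
    host: str
    component: str
    action: str  # "injected", "forwarded", "received", "delivered", "dropped"


SRC_HOST = "hq-l-esxi12.kpsc.lan"
DST_HOST = "hq-l-esxi11.kpsc.lan"

# The successful trace from this post, hop by hop.
SAMPLE_TRACE: List[Observation] = [
    Observation(SRC_HOST, "vNIC", "injected"),
    Observation(SRC_HOST, "firewall", "forwarded"),
    Observation(SRC_HOST, "logical switch", "forwarded"),
    Observation(SRC_HOST, "distributed logical router", "forwarded"),
    Observation(SRC_HOST, "logical switch routed-vnet2", "forwarded"),
    Observation(SRC_HOST, "VXLAN tunnel endpoint", "forwarded"),  # assumed hop
    Observation(DST_HOST, "VXLAN tunnel endpoint", "received"),   # assumed hop
    Observation(DST_HOST, "firewall", "forwarded"),
    Observation(DST_HOST, "vNIC", "delivered"),
]


def first_drop(trace: List[Observation]) -> Optional[Observation]:
    """Return the first observation where the packet was dropped, if any."""
    for obs in trace:
        if obs.action == "dropped":
            return obs
    return None


def summarize(trace: List[Observation]) -> str:
    """Give a one-line verdict for a trace."""
    drop = first_drop(trace)
    if drop is not None:
        return f"dropped by {drop.component} on {drop.host}"
    if trace and trace[-1].action == "delivered":
        return f"delivered after {len(trace)} observations"
    return "incomplete: no drop seen, but the packet was never delivered"


print(summarize(SAMPLE_TRACE))  # delivered after 9 observations

# If the source-side firewall had blocked the packet, the trace would end there.
blocked = SAMPLE_TRACE[:1] + [Observation(SRC_HOST, "firewall", "dropped")]
print(summarize(blocked))  # dropped by firewall on hq-l-esxi12.kpsc.lan
```

The useful habit here is the same one the Traceflow results encourage: read the hops in order, and the first place the packet stops is where to start looking.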

In a later post I’ll go into a lot more detail about how all of this works, and maybe run through a couple of specific scenarios for troubleshooting with NSX Traceflow.