Comments and logging and exceptions … OH MY!

April 25, 2019

By Steve Kaplan, Kovarus, SDDC & Cloud Management

A friend recently asked me to run through a code review of some vRealize Orchestrator (vRO) workflows to sanity check that he was on the right track for his use cases. This led to a conversation about some of the standards I tend to employ in my coding practices in general, and some that I specifically like to use when building code for vRO workflows, actions, and so on. An all too common pattern I’ve observed when reviewing automation artifacts is a lack of solid comments, logging, and error handling built in. It doesn’t seem to be specific to a scripting language or a particular platform … this is just something we see all over the place.

There can be a lot of contributing factors, but a few of the most common are (and these are based on real life conversations I’ve had in just the past 6 months or so):

  1. You’re learning through examples from community-maintained resources and/or online platforms such as Pluralsight, where these sorts of areas aren’t really covered because they tend to be perceived as more style than substance.
  2. Developing the base capability, whether that be a script or workflow or a full-on cloud management platform, tends to be deemed a lot more important than making code “operationalized.”
  3. A mindset that your developed capability is flawless and won’t ever break, and even if it does … you’ll be there to fix it, right?!

Excuse me while I take a deep breath …. A few quick thoughts, as I think there’s a lot of myth busting that needs to happen here. Let’s take this point by point to get it out of the way!

On the first point, you will establish bad habits that are really, really, really hard to break once they start to take root. It’s unfortunate, but it’s also human nature. If you’re still in the early stages of learning how to leverage a new scripting language and your coding style is evolving, the best thing you can do is find somebody smarter than you on the particular language / tool chain and ask questions! We all had to learn once upon a time, and the great thing about the internet is that it allows communities to form from people of different backgrounds and experience levels; there’s a high probability that whatever you’re learning already has a community out there, even for things most of us would think are obscure or arcane. Online communities usually have some of the smartest and most passionate advocates for the technology, and many of the best try to foster and grow their communities organically. You might even make some new friends (I certainly have)!

The second and third points are both relatively self-explanatory, and I think the most concise way to respond to them is simple: Don’t be that person. We tend to use automation to make our lives easier by taking tasks that are often thought of as mundane, routine, or repeatable and turning something that might take 20 minutes by hand into something that takes a minute, but then we fall into the trap of “There’s never time to do things right, but there’s always time to redo it!” Believe me when I say that these are not the droids you’re looking for! Nobody wants to be digging in six months later, trying to remember what they did and untangle the mess left in place because of short-sighted thinking. Even worse than the “Oh man, what did I do here?!” scenario: what happens if you take another job at a different company and are no longer available to provide assistance? It’s all bad news.

I implore you, dear reader, do not be that person!

Now that I’ve gotten that off my chest, let’s talk about some of my best practices around coding, and more importantly, why I try to adhere to them and why I ask all of our automation teams to do the same! I’m going to walk through a few examples in each of the areas below to illustrate how I think things should be done; this has mostly been informed by years of incrementally doing things better and learning from folks who know better, particularly in the PowerShell space!

Functions — Use them liberally and without restraint!

I’m a huge fan of using functions inside of automation artifacts because they provide an excellent mechanism for breaking up code into more consumable pieces that are generally, in my experience, easier to troubleshoot. What’s the value of utilizing functions as opposed to just a block of commands, you ask? My three principal reasons for doing things this way are:

  1. Using functions, even during initial mocking of code, makes your life easier from an implementation standpoint because you’re developing a functional unit of code that can simply be plugged into a larger automation artifact.
  2. Taking the previous point a step further, it makes it a lot easier to add or remove functionality in an automation artifact by being pluggable. By that, I mean you can develop the functionality independently of the larger body by mocking up any dependencies in an interactive session, which makes integrating the new functionality mostly a matter of making sure all inputs are properly provided to the function, not of whether the function itself works. Likewise, assuming any dependency conflicts are sorted out, since the unit of functionality is self-contained, you can easily rip out a piece of code and either use it somewhere else that’s more appropriate or just let it go off into the ether.
  3. It makes code a lot easier to troubleshoot, because all of the code is contained and only run if the function itself is invoked, so you can isolate things much more readily.

How does this actually look in practice, you ask? Well, here are two real-life examples where we’ve leveraged these principles!

We start with a PowerShell script that I developed to help initialize new vCenter instances and get the base data center and supporting vSphere resources ready to add hosts:
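
The full script is too long to reproduce here, but a trimmed-down structural sketch of it might look something like the following; parameter values are hypothetical, most function bodies are abbreviated, and VMware PowerCLI with an existing vCenter connection is assumed:

    # Structural sketch only; not the full script
    function Initialize-Datacenter {
        param ([string]$Name)
        # Create the data center in the root folder if it doesn't already exist
        if (-not (Get-Datacenter -Name $Name -ErrorAction SilentlyContinue)) {
            New-Datacenter -Name $Name -Location (Get-Folder -NoRecursion) | Out-Null
        }
    }

    function Initialize-Cluster {
        param ([string]$Name, [string]$Datacenter)
        # Create the cluster inside the data center (body abbreviated)
    }

    function Initialize-Networking {
        param ([string]$Datacenter)
        # Provision the distributed vSwitch and port groups (body abbreviated)
    }

    function Initialize-Licenses {
        # Apply licenses to the relevant resources (body abbreviated)
    }

    # Each unit of functionality is invoked here; commenting out individual lines
    # makes it easy to isolate a misbehaving function while troubleshooting
    Initialize-Datacenter -Name 'RegionA01'
    Initialize-Cluster -Name 'Compute01' -Datacenter 'RegionA01'
    Initialize-Networking -Datacenter 'RegionA01'
    Initialize-Licenses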

As you can see from reviewing the script, we’re principally doing four things with this script:

  1. Provisioning a data center in the root folder of the vCenter inventory using Initialize-Datacenter
  2. Provisioning a cluster inside of that data center using Initialize-Cluster
  3. Provisioning a distributed vSwitch and port groups inside the data center using Initialize-Networking
  4. Applying licenses to the relevant resources using Initialize-Licenses

The great part about how this script is set up is that each of those actions is a self-contained set of commands making up one unit of functionality. As we were building and validating all of this, splitting things up let us build incrementally as we defined standards around configuration, and made it easier to trace down when functionality broke as new capabilities were added within each of those use cases. Another nice thing about how this is structured is that, if one of the functions is misbehaving, we can comment out the other function calls, if necessary, and focus on figuring out what’s going on; in the full script, those calls sit together between lines 104–107.

In our second example, let’s take a look at some JavaScript that was developed to support automation use cases around creating an NSX-V Service Group, implemented as a vRO action:

As you can see from the code, we’re doing a lot of things, but I want to focus specifically on lines 3–6, 24–49, and 52–67 … and why things are the way they are in there:

  1. Starting with lines 3–6: what I discovered fairly early on with vRO is that invoking an action within a scripting body can become very tiresome and ugly, since you have to remember the entire module path and the proper syntax. So what I like to do is take actions that may be utilized multiple times inside of a script and wrap them inside a localized function to make invoking that code easier (our logging function comes to mind here, see line 3). You can see this throughout the code anywhere the logging function gets called, rather than System.getModule(“com.kovarus.general”).logging(). There’s a trimmed-down sketch of this pattern after the next paragraph.
  2. The function defined between lines 24–49, newServiceGroupBody, has a name that seems self-explanatory: it builds the XML body that gets used to create the new service group.
  3. The newServiceGroup function defined on lines 52–67 is what actually builds the full API request that gets sent to the NSX Manager and ultimately invokes the request, returning the response to the action. (NOTE: there is a ton of other “shared” code that gets executed here, like our standard REST request action, which isn’t covered here.)

The reason I generally build up these sorts of API interactions this way is that I can isolate the act of building a request body from the act of creating and invoking the request … Why is this important? Well, if you dig into the newServiceGroupBody function, we have opted to dynamically generate the body that the NSX-V API is looking for with an XML object, rather than dropping in a raw XML body and just passing in values. In a more complicated API interaction, where the body being generated is completely predicated on the inputs coming in, being able to independently validate that the body is being generated properly before worrying about the rest of the request is a big win in my book!
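
The original action isn’t reproduced here, but a trimmed-down sketch that mirrors the structure described above (local wrapper, input validation, body builder, request function) might look like this; the input names, validation messages, XML schema, endpoint path, and the signature of our shared newRESTRequest action are all illustrative assumptions:

    // Sketch only; not the actual action. Inputs such as restHost, scopeId,
    // serviceGroupName, and description are assumed action parameters.

    // Wrap the shared logging action in a local function so the full module path
    // only has to be spelled out once
    function log(message) {
        System.getModule("com.kovarus.general").logging(message);
    }

    // Validate inputs up front; these checks double as logging when an upstream
    // workflow forgets to pass something in
    if (!restHost) { throw "newServiceGroup: required input 'restHost' was not provided"; }
    if (!scopeId) { throw "newServiceGroup: required input 'scopeId' was not provided"; }
    if (!serviceGroupName) { throw "newServiceGroup: required input 'serviceGroupName' was not provided"; }

    // Build the XML body for the new service group independently of the request,
    // so it can be validated on its own before any call is made
    function newServiceGroupBody(name, description) {
        var body = "<applicationGroup>";
        body += "<name>" + name + "</name>";
        body += "<description>" + (description || "") + "</description>";
        body += "</applicationGroup>";
        return body;
    }

    // Create and invoke the request against the NSX Manager, returning the response;
    // the endpoint path and the shared newRESTRequest action are placeholders here
    function newServiceGroup(name, description) {
        var body = newServiceGroupBody(name, description);
        log("Creating NSX-V service group '" + name + "'");
        return System.getModule("com.kovarus.general").newRESTRequest(restHost, "POST",
            "/api/2.0/services/applicationgroup/" + scopeId, body);
    }

    return newServiceGroup(serviceGroupName, description);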

Logging, Comments, & Help

Now that we’ve talked about functions and how they can help tidy up your automation artifacts, let’s shift gears and talk about two things that I think, largely, go hand in hand: Logging and comments. If you use both of these things in conjunction, you can do a lot of what I think of as self-documentation. What do I mean by that? Well, let’s re-visit the previous examples!

  1. In the first example, I opted not to include comments atop each of the functions because, well, I felt like the names were self-explanatory and didn’t require additional explanation. However, if you look inside the functions, I take a pretty liberal view of both comments and an interactive logging mechanism PowerShell has called Write-Progress; I like using Write-Progress when I’m building scripts that are run interactively from a shell by a user as a standalone task, as opposed to a backend function that is used as part of a larger automation tool chain (see the sketch after this list for what that can look like). The bar I’ve set for myself with things like this is “Could somebody else figure out what I did?,” so I let a few of our advanced services engineers take a look, and the results were pretty good (notwithstanding the debate over tabs vs spaces :)). One thing I didn’t really focus on here was verbose logging, as it wasn’t something I felt was necessary, but if you’re looking for examples of how to integrate more granular output for troubleshooting, you can check out a module I built for automating deployments of certain appliances into vSphere. The project page is here: https://github.com/stvkpln/appliance-framework.
  2. In the second example (vRO), you’ll notice there seems to be only one logging statement in the entire 70 lines of code and wonder “What the heck?!” My response there is “Well, it’s complicated” … There are a few important things going on that I want to point out:
    • There’s a lot of shared code being used here, and a lot of the logging mechanics around REST requests gets built into the newRESTRequest action, so that we aren’t reinventing the wheel. We want consistent logging for these interactions, so it made sense to centralize it.
    • The validation checks on lines 9–21 are, in a way, logging for when something goes wrong …. Since vRO doesn’t have a mechanism for marking an action’s input parameters as mandatory, having this capability baked into your actions before anything else runs becomes helpful in troubleshooting actions and their upstream workflows.
  3. Lastly, if you do go wandering through the repo I linked to in point #1, you’ll notice that all of the exported functions (i.e. the functions that deploy appliances) make use of a great built-in feature that PowerShell offers called comment-based help. This lets you document all of the relevant information for your PowerShell-based code and make it much more accessible to users by providing hooks that work with Get-Help. Likewise, with any other tool or coding language, when you have the opportunity to leverage these sorts of things, it’s always good to do it! In vRO, pretty much every element has fields for descriptions, from workflows to actions to configuration elements to scriptable tasks inside of workflows, as well as any parameters, whether they be inputs, outputs, or in-workflow attributes. Use them liberally and often!
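
To make that a bit more concrete, here’s a hedged sketch (not the actual implementation) of what comment-based help, Write-Progress, and verbose output might look like on one of the Initialize-* functions from the first example, again assuming VMware PowerCLI:

    function Initialize-Cluster {
        <#
        .SYNOPSIS
            Creates a cluster inside an existing data center.
        .PARAMETER Name
            Name of the cluster to create.
        .PARAMETER Datacenter
            Name of the data center the cluster should be created in.
        .EXAMPLE
            Initialize-Cluster -Name 'Compute01' -Datacenter 'RegionA01' -Verbose
        #>
        [CmdletBinding()]
        param (
            [Parameter(Mandatory)][string]$Name,
            [Parameter(Mandatory)][string]$Datacenter
        )

        # Interactive feedback for somebody running the script from a shell
        Write-Progress -Activity 'Initializing vCenter' -Status "Creating cluster '$Name'"

        if (-not (Get-Cluster -Name $Name -ErrorAction SilentlyContinue)) {
            New-Cluster -Name $Name -Location (Get-Datacenter -Name $Datacenter) -DrsEnabled | Out-Null
        }

        # Granular output that only shows up when -Verbose is passed
        Write-Verbose "Cluster '$Name' is present in data center '$Datacenter'"
    }

Once the help block is in place, Get-Help Initialize-Cluster -Examples surfaces that documentation the same way it does for built-in cmdlets.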

Error Handling & Exceptions

In this section, I’m going to predominantly focus on vRO because, well … if you want some good generalities on this, my recommendation is to do a Google search for the language you’re working with and add “try catch finally” to the end of that search. I’d be surprised if there weren’t a ton of examples out there, even for some pretty arcane languages!
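
For reference, the basic pattern in vRO’s JavaScript looks like the sketch below; the shared newRESTRequest action and its inputs are illustrative assumptions, but the try/catch/finally mechanics are standard:

    try {
        // Attempt the step that might fail (a hypothetical shared action here)
        var response = System.getModule("com.kovarus.general").newRESTRequest(restHost, "GET",
            "/api/2.0/services/applicationgroup/" + scopeId, null);
        System.log("Request completed");
    } catch (e) {
        // Capture the error, then re-throw so the workflow's exception handling can act on it
        System.error("Request failed: " + e);
        throw e;
    } finally {
        // Runs whether or not the request succeeded
        System.log("Request step complete");
    }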

Why am I choosing to focus on vRO here? Because I find this is one of the least understood, and by extension least used (or most often incorrectly used), pieces of functionality in workflow development. Let’s take a look at an example of a vRO workflow with some error handling built into it!

The previous workflow is one we use in pretty much any of our engagements; it effectively goes through an Event Broker Subscription properties object and extracts a number of regularly used values, as well as retrieving the IaaS entities and the vCenter virtual machine object if necessary (these are governed by booleans you can specify in an upstream wrapper workflow). The important thing to note is the overall flow. On the scriptable tasks (the icons that look like sheets of paper), you’ll notice both a blue and a red line coming out of each, with the blue line going on to the next step in the workflow, whereas the red goes to an ending marked with a red exclamation, rather than the traditional target sign indicating all things are good. Just as important, the bottom row is the starting path for a default error exception, meaning what should happen if there’s an error but the element the error occurs on doesn’t have a defined error path to take.

How does this work? Principally, there are a few things you need to consider as part of your workflow development here:

  1. Check what you’re putting on the canvas and see if there is an Exceptions tab; if there is, make sure to bind it to a workflow attribute you define for storing error information.
  2. Make sure you define that default exception branch of your workflow unless you’re going to define a unique branch for every artifact. If you have artifacts (like, say, decisions or switch elements) that do not support exception binding, well … having a way to catch that error properly is probably a good thing, right?

Recap!

So, why worry about and take the time to do any of this? Well, for one thing, we’re all human beings and we’re fallible. I’ve found it invaluable to build these good habits, which ultimately leads to a better overall outcome and a much more sustainable end product for everybody involved!