The Challenge of DevOps

Disclaimer: I am not an expert in this area. This is just a blog about the difficulties and conflicts created in the search for DevOps. I will go through some fallacies that will make each group look bad, so stick with it if you can. I really would like to get conversations started where there weren’t any.

The Phoenix Project, one of the greatest books of this IT age, gives a good concept of how Infrastructure and Development work with each other. However, there is a plethora of definitions for “DevOps”. How do you take a company that has existed just fine with two (if not more) warring parties and get them not only to work with each other, but to help each other?

Fallacy 1: DevOps is Infrastructure working for Developers

This was my first collision with the term. The idea here, again, is basic: Infrastructure and its core individuals exist to give a foundation and support to the developers, and nothing more. A good example of this is a simple ticket request for a new server with specific hardware. Any infrastructure worker worth their salt would look at those requirements and ask “Why?”, but this adoption of DevOps says “Why not?”. Obviously I’m taking things to an extreme here, but I have seen too many server requests with extreme hardware requirements (really… a virtual machine with 64 GB of RAM?) and the inevitable fight that ensues in response to that three-letter question, “Why?” There are so many other examples I’ve seen, from ridiculous resource adjustments to the blatant opening of security holes. Obviously this varies by environment, but the idea is the same.

This expands past Infrastructure into Operations. For example, when there is a “priority one” ticket and everyone jumps on the call, the first phrase is, “If Operations did their job and fixed this, we wouldn’t have to deal with this. I don’t even know why we are here.” I’ve heard this stated multiple times for multiple reasons.

Like all fallacies, there is truth here. Developers are the reason companies make money. Unless you work in the past, Devs work daily to create and stabilize the performance that generates revenue for the rest of IT. The greatest example of this is Uber. Of course the top taxi company in the nation is an app, so without it there is no real company at all.

For IT workers, that leads to a mindset of denial. No, they’re not in denial; they deny, which brings us to the next fallacy.

Fallacy 2: Developers should come to heel for Infrastructure

So OK, here we go. This is so prevalent in my career that I can’t even single out that many specific instances, because it’s almost daily that I hear something along these lines.

A developer asks to be able to monitor or help maintain resources on their machine: “No.” A developer asks to be able to create snaps for a code roll: “No.” A developer asks to be local admin on a box: “No.” A developer asks for a new dev environment that mirrors production: “No.” And so on.

It’s a marvel to me that people can just shut down Developers for minimal reasons, and yet this is the normal situation I find myself involved in. I see a ticket, and instead of any investigation or adjustment, the immediate response I hear is “No”, “We can’t do that”, or “That’s not our problem”.

These are just a few examples, and I’m sure everyone could think of their own. One thing that has really stirred this up and made my brain hurt is containers. Developers ask for a container solution from Infrastructure, and they get a big fat, “Why? It’s in the OS, so you figure it out.” This is a very frustrating stalemate.

This pushes to an extreme called “Shadow IT”. Shadow IT is basically this: if Infrastructure won’t grant the needed support or help getting things off the ground, Dev will use their own budget and spin up an entire instance in AWS, GCP, Azure, or a basic private cloud. Just a license for VMware Workstation can create Shadow IT; it’s that easy (don’t get any ideas). I heard a developer talking about the public cloud say, “It makes me happy knowing no infrastructure people are touching my boxes.”

This fallacy again comes from a smidge of truth. Development doesn’t know everything about infrastructure. Infrastructure spends all their time doing resource management, monitoring, and adjustments trying to keep the infrastructure running in the best way possible, so they should be the main contributors to the how/what/why of infrastructure.

Fallacy 3: Security CONTROLS ALL

With ransomware and WannaCry still being buzzwords in our time, this is definitely a big deal. Security is badly needed in this day and age, from blocking things like DNS floods to patching vulnerabilities. There are countless reasons to keep security up to date. Where does this come in for Infrastructure or Development? Well, Security is the uber-deny group. Almost every security individual I’ve met has stated something along the lines of, “My job is to make sure we don’t do something stupid.” I feel like the answer to this is both “Yes” and “No”. I’m sure people do stupid stuff all the time that has nothing to do with security. The proper statement would be something like, “My position is to rectify security vulnerabilities throughout the stack.”

Security will try to block where networks are, how they are set up, where infrastructure is placed, and the list goes on. Are there legitimate reasons? Of course! Security is the necessary part of IT that helps keep things in line and away from prying eyes and malicious intent.

The Common Factor

All of these fallacies are connected to an amount of truth. They all start when the truth is twisted and exploited by individuals to make their position greater than it is.

It’s like the problem with DevOps is even more internal than IT. The problem with DevOps is us.

[Image: IMG_0427]

People

I love this quote so much, because at its core it grasps the internal hierarchy of DevOps. It all starts with People.

Obviously each group mentioned is run by People. These people are all shaped by their own past experiences and troubles. Like iron, these experiences and troubles have made us stronger and sharper, but like the fingers of a string musician we become calloused, and those calluses shape our solutions in both good and bad ways.

There is also the problem of Ego. My favorite statement at the start of a meeting is, “Everyone check your ego at the door. It has no place here.” If only we were able to do that and bring only the strengths of each department, not the callous overreaching, where would we go?

Finally, for us, there is the problem of listening and understanding.

Meetings

I know this picture strikes a chord for all of us. How many of us have listened to what seem like the dumbest ideas and kept our mouths shut? DevOps starts with listening, and by listening, I mean the whole management stack. Is this easy to do? Absolutely not. In fact, it’s extremely hard to listen for nuggets of truth in a whirlwind of ideas. However, that’s what we need to do. To start, the greatest thing we can do is just listen, and if you don’t understand, ask the dumb questions that everyone is too proud to ask until you do. You are probably not the only one with that question.

Finally: People… Again

End users… the concept is lost by all groups now and again, but everyone in IT works for end users. The hardest concept to grasp, and the easiest road to DevOps, is how to create a better solution for them. Developing a new patch, a more secure Dev infrastructure, or a new storage solution all has a DIRECT impact on an end user, and each group works individually to create a solution for them. Now how do we mix them all together? A good start is a CI/CD pipeline: securing an automated solution for Developers to run continuous delivery. It exists on infrastructure, it has to run with the best HA stability, and it has to be secured. This is a great solution that involves all groups together. There is so, so much more, but that’s the journey we are all on.

Call me “Optimistic”, or “Crazy”, or “Nuts”, or “Dangerous”, but I believe this is the future of our industry. With Kubernetes, containers, and CI/CD as the buzzphrases, and the dominance of public cloud, the old standards need to be replaced with the promise of DevOps. Now, how will you do it?

Things I learned this week!

  1. Error 500 in vCenter when deploying an OVF: verify that your vCenter certificate is trusted. On a Mac, trust it by selecting “Get Info” on the certificate and setting it to “Always Trust”. I needed this for Lifecycle Manager and NSX deployment.
  2. Installing PowerCLI on the Mac (includes Homebrew and PowerShell Core).
  3. Initial setup of vLCM

A Good Adjustment

I’m busy working on the homelab, trying my best to imitate a real homelabber and failing miserably. More information will be coming on that later.

For now I found a great KB that needs some sharing! VMware has been known for some great pointers to fix issues. This one fell into my lap from an issue I was seeing.

The Problem

Every vRealize Automation environment is different, so let me be straight: this change will only help vRO extensibility actions and automations. For me it was a good improvement for the vRO XaaS workflows that I had published.

We were seeing timeouts and errors showing “Form not found” when trying to open workflows that had actions to pull specific information (AD, vSphere, etc.). Because of this, the workflows were in the tank, and sometimes even IaaS deployments would return an error 400.

The KB can be found here: https://kb.vmware.com/s/article/2147109

The Steps:

In Embedded vRealize Orchestrator Server:
  1. Open the /usr/lib/vco/app-server/bin/setenv.sh file using a text editor.
  2. Modify the memory by setting the Xmx and Xms values to the MB value required. For example:

    2.5 GB of memory is allocated to each of Xmx and Xms (this is the default setting):

    JVM_OPTS="$JVM_OPTS -Xmx2560m -Xms2560m -Xmn896m -XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=1024m -Xss256k"

  3. Edit the /etc/vr/memory-custom file using a text editor.
  4. Add this entry: add_service_mem vco-server *NUMBER*

    Note: The *NUMBER* is equal to the sum of -Xmx and -MetaspaceSize as configured in step #2. The memory is in MB.

  5. Stop the vRealize Appliance and increase/decrease the memory to match the increased/decreased memory of the vRealize Orchestrator.
  6. Start the vRealize Appliance.
  7. Repeat steps #1 to #6 on the rest of the nodes in the cluster.
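
If you don’t feel like doing the math for *NUMBER* by hand, here is a minimal sketch of it in plain JavaScript. It assumes the note in step 4 means the -Xmx value plus the -XX:MetaspaceSize value; if the KB actually intends MaxMetaspaceSize, swap the second pattern. The function names are my own and are not part of any VMware tooling.

```javascript
// Sketch: derive the add_service_mem value from a JVM_OPTS string.
// Assumption: NUMBER = Xmx + MetaspaceSize, both in MB, as the note reads.

// Pull a single numeric MB value out of the options string.
function parseMb(jvmOpts, pattern) {
  var match = jvmOpts.match(pattern);
  if (!match) {
    throw new Error("Option not found: " + pattern);
  }
  return parseInt(match[1], 10);
}

// Hypothetical helper name; adds the two values from the note.
function vcoServerMemoryMb(jvmOpts) {
  var xmx = parseMb(jvmOpts, /-Xmx(\d+)m/);
  var metaspace = parseMb(jvmOpts, /-XX:MetaspaceSize=(\d+)m/);
  return xmx + metaspace;
}

// The default options shown in step 2.
var defaults = "-Xmx2560m -Xms2560m -Xmn896m -XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=1024m -Xss256k";
console.log("add_service_mem vco-server " + vcoServerMemoryMb(defaults));
```

With the default settings this comes out to 2560 + 512 = 3072, so re-run it with whatever values you set in setenv.sh.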

Just a short, quick blog for today, but this was a very good change for me, and I saw a marked improvement in response time from my embedded vRO.

I hope it helps someone out there. Hopefully some hilarity from homelabbing is coming too. Some highlights:

  1. ISP MAC LOCKING!
  2. WAN +2LAN2? or WAN+LAN2??
  3. MODEM TO WHAT ON THE WAAAAAT???
  4. “It’s just making them talk to each other right?”

vRealize Deployments: Part One, Active Directory Policy

This is a blog series on how to set up and create quick deployments for self-service users in vRealize Automation. It is mostly built-in automation with minimal custom creations. This is pretty basic, but I wanted to start there and grow. I guess in that light I need to go over licensing and the architecture of building the automation as well, but for now, let’s just say that if you have advanced licensing (at least) and you’re running a minimal to medium enterprise deployment, this should work for you.

Should be fun, and maybe it will help someone out there.

I’m going to assume the following:

  1. Endpoint agents are configured
  2. Resources are granted to specific business groups
  3. Entitlements are granted to said groups for appropriate services/catalog

That being said, here we go.

Deployment Step 1. AD Endpoint

The first thing you will want to configure is your Active Directory endpoint. For this, go to Administration -> vRO Configuration -> Endpoints. From here click “New”, and for the plugin select Active Directory policy.

[Image: adendpoint1]

From here, input the Name and Description of the endpoint, then the following details for the server. For now, we’ll use an LDAP connection. Input the host/IP at the top, then the base DN, default domain, and username/password. Choose failover, round-robin, or single-server from the drop-down and add the DCs to the array below. Finally, add the name for vRO; the final two options rarely need adjustment. Below is an example of how it should look.

[Image: adendpoint2]

Active-Directory Policy

This is specifically for users who need the computer object to drop into a specific OU. To find this in vRA, go to Administration -> Active Directory Policies (if the option does not exist, you may be missing some roles on your vRA account). From there you can click “New”, which will open up your settings. Here you select the ID (remember it for later), endpoint, domain, and the OU where you want the policy to place the computer.

[Image: adpolicy2]

Now you have an endpoint, and a policy. How do you add it to the blueprint?

The custom property ext.policy.activedirectory.id is your go-to there. Below is a screenshot that shows how to associate the policy with the blueprint.

[Image: adpolicyblueprint]

This will create the computer object before the deployment starts and will remove it upon destruction. Nice Self-Service.

Now how do I verify the computer name doesn’t exist?

Create an Action in vRO to find it!

Action in vRO for Checking Computer Name

Set your vRO client to Design, and on the cog, you can create an action.

Return type = String

Inputs

  1. strComputer – string
  2. defaultADServer – AD:AdHost

Code:
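var computers = ActiveDirectory.getComputerADRecursively(strComputer, defaultADServer);
// No match found (null or empty result): the name is free to use.
if (computers == null || computers.length == 0 || computers[0] == null) {
    return "This name is available";
}
else {
    return "This name is unavailable";
}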

Place this as an external action on the blueprint against the “Hostname” custom property (basically the name that the VM will take). This returns “This name is unavailable” if it finds anything close to the name. However, this does not stop the request from going through. For that, you can add a match check against another text field set to “This name is available”, which should only let the request go through if the action returns “This name is available”.
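
Since the lookup flags anything close to the name, you may prefer to block only exact matches. Here is a rough sketch of that idea in plain JavaScript so it can be run outside vRO. The findComputers function and the directory array are hypothetical stand-ins for the AD search, not the plug-in’s API; inside vRO you would compare against the names of the objects the real call returns.

```javascript
// Hypothetical stand-in for the AD search: returns every entry whose
// name contains the search text, mimicking the "anything close" behavior.
function findComputers(name, directory) {
  var needle = name.toLowerCase();
  return directory.filter(function (computer) {
    return computer.name.toLowerCase().indexOf(needle) !== -1;
  });
}

// Only report the name as taken when one result matches it exactly
// (case-insensitive), instead of on any partial hit.
function checkComputerName(name, directory) {
  var matches = findComputers(name, directory);
  var exact = matches.some(function (computer) {
    return computer.name.toLowerCase() === name.toLowerCase();
  });
  return exact ? "This name is unavailable" : "This name is available";
}

// Made-up sample data for illustration only.
var directory = [{ name: "WEB01" }, { name: "WEB010" }];
console.log(checkComputerName("web01", directory)); // exact hit
console.log(checkComputerName("WEB0", directory));  // only partial hits
```

The strings returned are the same two the action uses, so the match check on the form keeps working either way.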

Hope this helps someone get things rolling! I’ll do a blog on Network Profiles next.

The One About Tagging..

So with the future of datacenter segmentation looking like tagging, we have seen a major push towards VM tagging around the cubes. Well, at least I have.

Let’s not mince words. I hated tags. It was basically like putting a sticky note on a machine, with no management to make sure things ARE tagged and no way to easily do tag assignment in bulk. Sound familiar… maybe like a naming convention? Here was the RUB for me. A naming convention is right in front of everyone’s face, and it keeps the devs in line. Tagging, however, is 100% on the infrastructure team (or virtual team, depending on your organization’s size). Well, by the end of this post I should think better of tags… and maybe you will too?

PowerCLI

Get PowerCLI

Let’s be real. If you’re not using PowerCLI for automated management of your vCenter, you’re doing yourself a disservice. For one, it’s a really good resource for pulling information across the entire vCenter and dumping it in front of you in different formats (.csv, .xml, etc.). For another, PowerCLI allows assignment and adjustment across multiple VMs. So for tags, of course, PowerCLI would be your go-to for assignment, adjustment, and removal.

There are multiple code sources out there for tag assignment. Here are a couple of excerpts. NOTE: always remember to connect to the proper vCenter for the tags. Connecting to vCenter B to assign a tag to a VM in vCenter A will not work, even if the IDs are identical.

Connect-VIServer -Server vcenter1 -User vcenteruser -Password vcenterpassword

This connects your client machine to the needed vCenter. Though the tag ID is now the same across linked vCenters, PowerCLI needs you to connect to the VM’s vCenter to assign the tag. We’ll get more into this later.

From here you can run your gets, removals, and assignments of tags by NAME:

Get-Tag -Name 'tagname'

This can be set to a variable, like $tag = Get-Tag -Name 'tagname', which can then be assigned to a VM. So let’s look at a simple VM get and tag assignment.

$vms = Get-VM vmname*
$tag = Get-Tag -name "TheCoolestTag"
$vms | New-TagAssignment -Tag $tag

NOTE: the asterisk after “vmname” is a wildcard; it actually pulls a group of VMs whose names start with “vmname”. If you want to do one at a time (and why would you), remove the * and put in the full VM name.

That’s just a simple get and assign of tags through PowerCLI. Now let’s look at the same thing via vRO.

vRealize Orchestrator

Now in vRA deployments you want to tag all VMs properly so that they have the proper tags needed for management. The built in library has several tag based workflows out of the box, but first you need to run through some setup.

First, create a vAPI endpoint to your vCenter. The workflow is found in the library -> VAPI -> “Import VAPI Metamodel” (the vAPI endpoint will be added as well).

[Image: metamodel]

You want to use a secure protocol connection so that the endpoint can be used for future orchestration. Input the name of the vCenter plus /api, so https://vcenterlink.com/api. Input a username/password combination that will not change (a service account if possible), and select the option to add the vAPI endpoint.

This will create the connection for tagging. Now, let’s talk about tag assignments and gets; this is where it can get a little tricky. The library for tagging is found in library -> VAPI -> Examples -> Tags. This includes creating categories, creating tags, and assigning tags. In the “Examples” folder you will find some “Get” workflows, but if you run them you get a CSV string of all the tag IDs. I don’t know about you, but I don’t remember tags by IDs.

So, how do we do a pull by name? Well, there is an action in vRO, findTagByName in com.vmware.vapi.tags. It takes as inputs the vAPI endpoint (the metamodel is needed, so it should be there if you followed the steps above), the name, and whether you want the match to be case sensitive (boolean). Now, you can put this action in a workflow and run a System.log after it for the needed information. Here is what my workflow looks like:
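
To make the name lookup and the case-sensitive flag concrete, here is a small sketch in plain JavaScript. This is not the built-in findTagByName action or its source; findTagInList, the sample tags, and their IDs are all made up for illustration, and the real action works against the vAPI endpoint rather than an in-memory list.

```javascript
// Sketch only: look up a tag by name in an in-memory list.
// caseSensitive mirrors the boolean input described for the vRO action.
function findTagInList(tags, name, caseSensitive) {
  for (var i = 0; i < tags.length; i++) {
    var candidate = tags[i].name;
    var same = caseSensitive
      ? candidate === name
      : candidate.toLowerCase() === name.toLowerCase();
    if (same) {
      return tags[i];
    }
  }
  return null; // nothing matched
}

// Hypothetical sample data; real tag IDs come from vCenter.
var sampleTags = [
  { id: "urn:example:tag-1", name: "TheCoolestTag" },
  { id: "urn:example:tag-2", name: "Production" }
];

var found = findTagInList(sampleTags, "thecoolesttag", false);
console.log(found ? found.id : "no tag found");
```

The ID that comes back is the piece you hand to the tag assignment step, which is why logging it after the action is handy.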

[Image: tagworkflow]

This should return the information you need to tag VMs with the specific tag. You should be all set using the built-in workflow “Associate vSphere tag to VM”. This workflow needs the vAPI endpoint, the ID of the tag (tagId), and the VM:

[Image: associatetag]

But let’s make a quick change to that workflow’s “Scriptable task”. Currently the built-in workflow (as of 7.5) shows this:

if (vapiEndpoint == null) {
    throw "'endpoint' parameter should not be null";
}
if (tagId == null) {
    throw "'tagId' parameter should not be null";
}
var i = 0;
while (i<5) {

    try {
        var client = vapiEndpoint.client();
        var tagging = new com_vmware_cis_tagging_tag__association(client);
        var enumerationId = new com_vmware_vapi_std_dynamic__ID() ;
        enumerationId.id = vcVm.id;
        enumerationId.type = vcVm.vimType;
        tagging.attach(tagId, enumerationId);
        System.debug("Tag ID " + tagId + " assigned to VC VM " + vcVm.name);
        i=5;

    } catch(e) {
        System.debug("Associating " + tagId + " failed. Retrying " + i + " of 5 attempts");
        i++;
        if (i=4) { System.error(e.message); }
    }
}


There are some opportunities for improvement in this workflow. First, if you use it out of the box and put in an incorrect tag, it will cycle forever. Second, if you fix the cycle, it will never fail. So here is the code with my adjustments to ensure it only retries a limited number of times, fails on the last attempt, and throws the exception.

if (vapiEndpoint == null) {
    throw "'endpoint' parameter should not be null";
}
if (tagId == null) {
    throw "'tagId' parameter should not be null";
}

var i = 0;
while (i<6) {

    try {
        var client = vapiEndpoint.client();
        var tagging = new com_vmware_cis_tagging_tag__association(client);
        var enumerationId = new com_vmware_vapi_std_dynamic__ID() ;
        enumerationId.id = vcVm.id;
        enumerationId.type = vcVm.vimType;
        tagging.attach(tagId, enumerationId);
        System.debug("Tag ID " + tagId + " assigned to VC VM " + vcVm.name);
        i=6;

    } catch(e) {
        System.debug("Associating " + tagId + " failed. Retrying " + i + " of 5 attempts");
        i++;
        if (i==6) {
            System.error(e.message);
            throw e.message;
        }
    }
}


Let’s go through the changes:

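  1. To try “5 out of 5”, the check in the catch block should compare against 6 with ==, not assign 4 with = (the assignment is what caused the endless cycle).
  2. Change the “while” clause to 6 so that the catch check runs at 6 and the workflow doesn’t just end successfully. Note that because i starts at 0, this allows up to six attempts; use 5 in both places if you want exactly five.
  3. Finally, “throw e.message” will make the workflow actually fail. If you just want the log but want the workflow to continue, you can remove this.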
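
If you want the retry logic separated from the tagging call, here is one way it could look as a small helper in plain JavaScript. This is my own sketch, not part of the built-in workflow; the retry function and the flakyAttach stand-in are hypothetical names, and in vRO you would pass in a function that wraps the tagging.attach(tagId, enumerationId) call and a logger such as System.debug.

```javascript
// Sketch: run an operation up to maxAttempts times, then rethrow the
// last error so the caller (the workflow) actually fails.
function retry(operation, maxAttempts, log) {
  if (maxAttempts < 1) {
    throw new Error("maxAttempts must be at least 1");
  }
  var lastError = null;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation(attempt); // success: stop retrying
    } catch (e) {
      lastError = e;
      log("Attempt " + attempt + " of " + maxAttempts + " failed: " + e.message);
    }
  }
  throw lastError; // every attempt failed
}

// Hypothetical stand-in for the attach call: fails twice, then succeeds.
var calls = 0;
function flakyAttach() {
  calls++;
  if (calls < 3) {
    throw new Error("tag service not ready");
  }
  return "attached";
}

console.log(retry(flakyAttach, 5, console.log));
```

Counting attempts from 1 up to maxAttempts avoids the off-by-one juggling with the loop counter, and the final throw keeps the “fail the workflow” behavior from change 3.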

**NOTE** You can attach multiple tags this way, just duplicate the workflow and add attributes for each tag ID, adjusting the script to use the proper attribute variables one at a time, per vAPI.
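
As an alternative to duplicating the workflow, the same one-at-a-time idea can be sketched as a loop over an array of tag IDs. This is an assumption on my part rather than how the built-in workflow is written; attachAll and the sample tag IDs are made-up names, and attachOne stands in for the tagging.attach(tagId, enumerationId) call.

```javascript
// Sketch: attach several tags one at a time and collect any failures
// instead of stopping at the first bad tag.
// attachOne is a hypothetical stand-in for the real attach call.
function attachAll(tagIds, attachOne) {
  var failures = [];
  tagIds.forEach(function (tagId) {
    try {
      attachOne(tagId);
    } catch (e) {
      failures.push({ tagId: tagId, message: e.message });
    }
  });
  return failures;
}

// Made-up IDs for illustration; "tag-bad" simulates an incorrect tag.
var attached = [];
var failures = attachAll(["tag-a", "tag-bad", "tag-c"], function (tagId) {
  if (tagId === "tag-bad") {
    throw new Error("unknown tag");
  }
  attached.push(tagId);
});

console.log("attached:", attached);
console.log("failed:", failures);
```

Whether you collect failures like this or throw on the first one depends on whether a missing tag should fail the whole deployment.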

Now, with this workflow and inputs you should be able to add tags to your VMs, with property assignment, and subscriptions. That’s for another time I suppose.

Template management 101

So what does vRealize really utilize for automation? Well, in 7.4 (I believe) things got a lot easier with the guest agent. The agent page on the appliance now shows an actual PowerShell or bash command for installing the agent (pretty awesome).

Here is the thing. Though the automated installer will remove the already installed guest agent, it doesn’t do it very cleanly. So some old commands are still very useful.

Goal of this blog:

Template management is key for success, especially when running multiple vRA instances across the US and abroad. Making sure the template is pointed to the proper appliance (or VIP for HA) and IaaS Manager (or VIP for HA) is CRUCIAL. Without the proper setup you will probably run into timeouts or issues with your deployments and the “customization”. If you’ve worked with vRealize you know how annoying the “Error during customization” message can be.

Solution

(The removal solution is pulled from Jim Griffiths at http://itsgettingcloudy.blogspot.com/ and he’s a great blogger with some good stuff, so check him out!)

Here are the steps to uninstall it:

From a command prompt on the machine which has the agent installed, type:
net stop vcacguestagentservice
then
sc delete vcacguestagentservice

If you do not want to install the vRA/vCAC agent after this, you can use the command below to remove the folder:
rmdir c:\vrmguestagent /s /q

I suggest running all three commands before installing the agent again.

Steps to run the installation of the gugent agent

  1. Download the gugent agent from your primary vRA appliance (if HA, this can be any appliance). [Images: download guest agent1, download guest agent2]
  2. Move the .zip to the machine that will be templatized.
  3. Extract the folder. There will be a folder within the extracted folder. I hate that, so I normally just move the inner folder to a c:\temp directory.
    1. Now open a command prompt as Administrator and run PowerShell -NoProfile -ExecutionPolicy Bypass -Command C:\temp\**FOLDERNAME**\prepare_vra_template.ps1 (if you’re like me and put it in c:\temp). **FOLDERNAME** is what vRA names the folder, or whatever you renamed it to (the default is currently prepare_vra_template_windows). *NOTE* To me this is the cleanest install, but for those that prefer .bat files…
    2. You can also run this by simply opening the prepare_vra_template_windows folder and running prepare_vra_template.bat as Administrator.
    3. Both of these solutions will get you a prompt asking for additional information.
  4. First it will ask for the FQDN/IP of the vRA appliance (VIP if HA). *I find that if you’re using multiple domains, the IP is better than the FQDN so DNS doesn’t bite you in the butt. However, if you didn’t put the IP in the appliance certificate (and shame on you if you did), this will not function properly.* [Image: download guest agent3]
  5. It should find the appliance and pull the certificate if the template can reach it. Type “Yes” if the SHA1 matches (if you really want to check that).
  6. Next is the manager. Type in the FQDN/IP of the vRA IaaS Manager (VIP if HA). *I find that if you’re using multiple domains, the IP is better than the FQDN so DNS doesn’t bite you in the butt. However, if you didn’t put the IP in the appliance certificate (and shame on you if you did), this will not work.* [Image: download guest agent4]
  7. The rest is dependent on how you’re setting up the agent, where the template is (EC2, vSphere, etc.), and the account you want to use on the machine for administration (local or domain). [Image: download guest agent5]
  8. After all the information is entered, the install will run, and when complete you should see a message that states “INSTALL COMPLETE Ready for shutdown” (or something like that…). Don’t shut it down yet…
  9. Here is when you run your sdelete, or whatever cleanup solution you have (don’t forget to delete the agent install files!)
  10. Now shut it down…

The GuestAgent log should now show proper calls, and responses including the full payload when the template is deployed. Pretty cool stuff…

Now never touch the template again…
