There are so many management systems out there to choose from. Open-source or proprietary, the numbers are staggering. Which one do you use? What do you look for?
Let’s get the obvious out of the way: I want to know what is running and what is not, I want to be notified, and I want to run historical reports on various metrics (e.g. CPU usage, disk usage, bandwidth). But not all management systems are equal. Let’s pick apart the two pieces that I look at most:
- Agent vs Agentless Management
- Synthetic Transactional Management
Agent vs Agentless
Many of the network management systems (NMSs) out there are agent-based. This means they rely on software installed on the remote machine, which they talk to in order to gather various data. While this is an important piece of management, it is not the only (and sometimes not the best) way to gather data.
First, let’s go over the advantages. Some data can only be gathered by an agent: CPU load, disk usage, memory usage, and so on. Many systems ship their own agent to run on the remote system and gather this data. Most commonly, though, it is an SNMP agent that is used. My favorite of these is Net-SNMP, but that is a topic for another post.
However, there are times when an agent can give you misleading information. For example, suppose you want to test whether SMTP is running on the system. Easy: you just execute an SNMP query to check whether that process is in the process table. If it’s there, it’s running; if not, trigger an event. Don’t see the problem yet? How many of you have seen a program that is running but not really responding? I’m sure everyone raised their hands, and that is why not everything should be based on agents.
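To make the agent-based check concrete, here is a minimal sketch of it in Python. It assumes the Net-SNMP `snmpwalk` command is installed, that the remote agent speaks SNMP v2c, and that it exposes the HOST-RESOURCES-MIB process table. The community string `public` and the function names are placeholders of mine, not something any particular NMS uses.

```python
import subprocess

# HOST-RESOURCES-MIB::hrSWRunName, the column holding each process name.
HR_SW_RUN_NAME = "1.3.6.1.2.1.25.4.2.1.2"


def parse_names(snmpwalk_output):
    """Turn `snmpwalk -Oqv` output (one value per line) into a set of names."""
    names = set()
    for line in snmpwalk_output.splitlines():
        value = line.strip().strip('"')  # string values are usually quoted
        if value:
            names.add(value)
    return names


def running_process_names(host, community="public", timeout_s=10):
    """Walk the agent's process table. The community string is an assumption."""
    result = subprocess.run(
        ["snmpwalk", "-v2c", "-c", community, "-Oqv", host, HR_SW_RUN_NAME],
        capture_output=True, text=True, timeout=timeout_s, check=True,
    )
    return parse_names(result.stdout)


def process_listed(names, wanted):
    """True if the process appears in the table. Says nothing about its health."""
    return wanted in names
```

Something like `process_listed(running_process_names("mail.example.com"), "sendmail")` answers exactly one question: is the process in the table? A hung mail daemon still passes this check.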
Now, you are probably saying to yourselves, “Well yes, but that’s why you check to see if that TCP port is open.” And right you are! Moving away from agent-based checks gets us closer to accurate information, but it still leaves room for error. If the application freezes, the port is often left open in an inconsistent state. That means your port check will pass, but nothing is responding. The application is frozen, remember? Or maybe there is a configuration or system error that prevents the application from fully doing its job. The application is still running, but it doesn’t do what it’s supposed to. I want to know that.
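A port check is easy to sketch, and so is its blind spot. The snippet below uses only Python's standard `socket` module. `port_is_open` and `silent_listener` are names I made up for illustration: the second one stands in for a frozen application by listening on a port and never answering.

```python
import socket


def port_is_open(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def silent_listener():
    """Stand-in for a frozen service: the port is open, nobody ever replies."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    sock.listen(1)               # the kernel completes handshakes on its own
    return sock
```

Pointing `port_is_open` at the port of a `silent_listener()` still succeeds, because the operating system completes the TCP handshake even though the "application" never calls `accept()` or sends a byte. That is the false positive described above.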
Synthetic Transactional Management
Starting from my last example, how do we monitor broken applications, the ones that “seem” to be running but aren’t doing what they are supposed to? This is where synthetic transactions come in.
These are programs that run against the service just as a remote user or application would use it. Taking our SMTP example, we would actually establish an SMTP session and send SMTP commands, expecting certain results. Or we might send a real DNS query and expect a certain answer back. The same could be done for HTTP, RADIUS, and so on.
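For the SMTP case, a synthetic transaction could look roughly like this sketch built on Python's standard `smtplib`. It waits for the server's greeting, sends EHLO and NOOP, and only reports success if both come back with a 250 reply. The function name, the timeout, and the `helo_name` value are assumptions of mine, and a real check might go further and submit a test message.

```python
import smtplib


def smtp_transaction_ok(host, port=25, timeout_s=5.0, helo_name="monitor.example"):
    """Return True only if the server completes a minimal SMTP dialogue."""
    try:
        # The constructor connects and fails unless it reads a 220 greeting.
        with smtplib.SMTP(host, port, local_hostname=helo_name,
                          timeout=timeout_s) as smtp:
            code, _ = smtp.ehlo()   # identify ourselves, expect 250
            if code != 250:
                return False
            code, _ = smtp.noop()   # harmless command, expect 250
            return code == 250
    except (smtplib.SMTPException, OSError):
        # Timeouts, refused connections, and protocol errors all count as down.
        return False
```

Unlike a check that only opens the port, this one fails when the application is frozen: the greeting never arrives, the read times out, and the function returns `False`.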
Not only do these transactions give you a more realistic view of how the user sees the service, they also let you track response times. Network performance can drop drastically if DNS queries that once took under 10 ms are now taking 250 ms or more.
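Tracking response time only takes a small wrapper around whatever probe you already run. The sketch below is one way to do it in Python. The 250 ms threshold is borrowed from the DNS example above, and the function names and the "ok / slow / down" labels are my own. Note that `dns_probe` goes through the system resolver, so a local cache can make lookups appear faster than they really are.

```python
import socket
import time


def timed_probe(probe):
    """Run a zero-argument probe; return (succeeded, elapsed milliseconds)."""
    start = time.perf_counter()
    ok = bool(probe())
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return ok, elapsed_ms


def classify(ok, elapsed_ms, slow_ms=250.0):
    """Map a probe result to a status; the slow_ms threshold is an assumption."""
    if not ok:
        return "down"
    return "slow" if elapsed_ms >= slow_ms else "ok"


def dns_probe(name):
    """Resolve a name through the system resolver; True if any address returns."""
    try:
        return len(socket.getaddrinfo(name, None)) > 0
    except socket.gaierror:
        return False
```

A call such as `classify(*timed_probe(lambda: dns_probe("example.com")))` gives you a status, and storing `elapsed_ms` from every run gives you the trend line that shows the service degrading long before it goes down.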
There are many other items that I look for in an NMS, like escalations, trend reporting, and integration with third-party ticketing systems. While I admit that most network management systems use both agent-based and agentless checks, it seems many have not yet moved to synthetic transactions, which, to me, are a key component of really knowing how your “network services” are responding.