
Nagios – Monitoring our Business Service (1/3)
Well, well… I wanted to start this kind of post, so let's see how far we can go with our Nagios! In a series of 3 documents we will see how to measure the SLA offered by the different layers of our organization. We will not only use Nagios to monitor our Infrastructure Layer; building on that foundation, we can scale up and monitor our company's Operational Services. This is achieved by relating the infrastructure services we already monitor to departmental needs, all the way up to monitoring the Business Service our company provides.
This is interesting not only for obtaining ISO quality-of-service certifications or implementing ITIL, but also, for example, for extending our monitoring and giving access to new roles. That is, why can't our CEO see how the business is doing? If we analyze all the dependencies and requirements that our departments have (to make them functional) on the infrastructure services we already monitor, we will be able to show them on personalized maps that they can navigate to see the status of the Business as well as its dependencies. For each Service we will be able to measure the Service Level Agreement we offer by viewing graphs and, as usual, if it is affected, it will alert us in Warning or Critical mode!
It is difficult to explain in a few lines, but little by little, taking it step by step, you will see that in the end it all becomes clear! I'm sure many of you know what I'm talking about and, without being an expert in it, I am going to try to clarify it with an example. Of course, everyone could define it differently or in more detail, but I'm going to try to keep it short. Let's imagine we are a nut factory, a classic! We are a small company that produces nuts and sells them online. We have one department that manages customer orders and another for end-customer service.
Well, chatter aside, to provide service we have a tiny data center, where the infrastructure is supported by a virtual environment with vSphere, a pair of ESXi hosts and a storage array; Windows virtual servers that offer, e.g., printers, shares, databases, ERP and CRM; a pair of Exchange servers for mail; a couple of firewalls; and a couple of routers from 2 different ISPs… Users have thin clients to connect to and work on a centralized Citrix environment where they open their desktops/apps… And not forgetting the website, which is outsourced to a hosting provider! We are not going to talk about the Factory part; let's assume it is a network of PLCs that we already monitor.
Therefore, the business will be affected if any of the services offered by each area cannot work: whether the website cannot offer the sales service, orders cannot be managed, customers cannot be answered, or nuts cannot be produced.
Now we have to break this down and see what each department needs in order to deliver its service, see what it needs from the resources we have, and relate those resources to each other with conditions.
Business Services
We begin by defining the Services on which our business depends for everything to function properly; these are the so-called Business Services. In my example they will be:
- Online Sales Service: anything that could prevent products from being sold in our business.
- Customer Service: covers all communication with the end customer.
- Logistics and Order Distribution Service: everything from package preparation to shipment to the end customer.
- Production Service: anything that could prevent nuts from being manufactured.
Operational Services
These are the services that allow a Business Service to function properly; here we define everything we need in order to operate. In this document we will define only one of them, using CUSTOMER SERVICE as an example. Well, what do we need to be able to serve customers? Let's say we have professionals who communicate with customers through emails or phone calls, manage everything with an ERP, and track satisfaction issues in a CRM. To make the CUSTOMER SERVICE Business Service functional, we must provide users with the following Operational Services:
- ERP Service: everything necessary for our ERP to be functional.
- CRM Service: just like the previous one, this service lets users work with the CRM.
- Mail Service: allows users to communicate with customers by email.
- Telephony Service: everything related to phones working well and calls being made or received.
- Internet Service: lets users make the occasional queries they need in order to do their jobs.
If any Operational Service is not operational, it will affect the CUSTOMER SERVICE, so I will mark the relationship with an AND-type condition; you will see later what these conditions are for 🙂
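To make the AND condition a bit more tangible, here is a minimal Python sketch. It is not how Nagios implements it; it only assumes that each Operational Service reports one of the usual Nagios-style states (0 OK, 1 WARNING, 2 CRITICAL; UNKNOWN is left out to keep it short) and that an AND condition takes the worst state of its children. The names `State` and `and_condition` and the sample states are made up for the illustration.

```python
from enum import IntEnum


class State(IntEnum):
    # Same numbering as Nagios plugin return codes (UNKNOWN omitted on purpose)
    OK = 0
    WARNING = 1
    CRITICAL = 2


def and_condition(states):
    """AND: every child must be OK, so the worst child state wins."""
    return max(states, default=State.OK)


# Hypothetical snapshot of the Operational Services behind CUSTOMER SERVICE
operational_services = {
    "ERP Service": State.OK,
    "CRM Service": State.OK,
    "Mail Service": State.CRITICAL,
    "Telephony Service": State.OK,
    "Internet Service": State.OK,
}

customer_service = and_condition(operational_services.values())
print(customer_service.name)  # CRITICAL
```

With this rule, a single failing Operational Service (here the Mail Service) is enough to mark the CUSTOMER SERVICE as affected.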
Next, we will break down each Operational Service to see what makes it work.
Infrastructure Services
Infrastructure Services are the lowest-level services, the ones that support the Operational Services. In this document we will look only at the needs of the MAIL SERVICE as an example. We will define which requirements must be met for our mail to work well; in this organization they would be the following:
- Exchange Service: the email system on which the organization's mail is based.
- Active Directory Service: without this core service, users' computers, among other things, would not be functional, so it is advisable to monitor its status.
- DNS Service: if DNS resolution isn't working in our organization, Outlook clients wouldn't even be able to connect.
- Virtualization Service: if the virtual platform is affected, obviously any service we offer from virtual machines will be affected.
- Internal Communications Service: the items that allow communication between client PCs and servers.
OK? Shall we continue?
For example, I would define the organization's Active Directory Service by relating the services that make it work and keep it operational, right? What does the service offered by our AD need in order to be operational, or what does it depend on? In the example I have 2 DCs (Domain Controllers) called SRVDC01 and SRVDC02, with the understanding that if one of them fails nothing happens, since both offer the service and the AD keeps serving resources/users. Therefore, I define the services that make it up:
- AD SRVDC01 Service
- AD SRVDC02 Service
What is AD SRVDC01 Service? It is the set of services that server offers so that AD works on it, for example:
- SRVDC01 Service
- Active Directory Status
- Services – Active Directory
- Port 389tcp – LDAP
- Port 636tcp – LDAPS
The last 4 services are already starting to ring a bell, aren't they? These are the items we already monitor from our Nagios, right? We already monitor TCP ports and the required Windows Services, and we know that with NRPE we can monitor anything with scripts, such as a DCDIAG… If you don't have all this yet, you'll find it if you search the blog :-). What remains to clarify is: what is SRVDC01 Service? Well, it is the set of checks that tell us the server itself works properly, the basic resources we normally monitor, such as its CPU, RAM, Disk C or Ping. Sometimes the same server serves different functions, which is why it is good to define it once with its base configuration and then reuse it in other Infrastructure Services.
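The Active Directory Service described above can be pictured as a small tree: an OR between the two DCs, and an AND inside each DC. The following Python sketch is only an illustration under assumptions of mine: the `Node` class, the `domain_controller` helper and the simulated failure are hypothetical, AND is taken as "worst child state" and OR as "best child state". It is not Nagios configuration.

```python
from dataclasses import dataclass, field

OK, WARNING, CRITICAL = 0, 1, 2


@dataclass
class Node:
    name: str
    condition: str = "AND"                  # "AND" or "OR", used when there are children
    children: list = field(default_factory=list)
    state: int = OK                         # only meaningful for leaves (real checks)

    def evaluate(self):
        if not self.children:
            return self.state
        results = [child.evaluate() for child in self.children]
        # AND: the worst child decides. OR: one healthy child is enough.
        return max(results) if self.condition == "AND" else min(results)


def domain_controller(host, failed=()):
    """Build 'AD <host> Service'; checks named in `failed` are simulated as CRITICAL."""
    def leaf(name):
        return Node(name, state=CRITICAL if name in failed else OK)

    base = Node(f"{host} Service", "AND",
                [leaf("CPU"), leaf("RAM"), leaf("Disk C"), leaf("Ping")])
    return Node(f"AD {host} Service", "AND", [
        base,
        leaf("Active Directory Status"),
        leaf("Services – Active Directory"),
        leaf("Port 389tcp – LDAP"),
        leaf("Port 636tcp – LDAPS"),
    ])


ad_service = Node("Active Directory Service", "OR", [
    domain_controller("SRVDC01", failed={"Port 389tcp – LDAP"}),
    domain_controller("SRVDC02"),
])
print(ad_service.evaluate())  # 0: SRVDC01 is down, but SRVDC02 still serves AD
```

Defining the base `SRVDC01 Service` node once is what makes it reusable: the same node could be attached to another Infrastructure Service if that server also played a second role.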
Ready? Well, this is how we will have defined our first Infrastructure Service! Now all the others remain, the ones each organization has; define them as you need… What is left is to think about the interrelationships we need and to create the missing checks directly in Nagios, since there are surely many things the business needs that we have not taken into account. So take out pen and paper and start relating what you have monitored to the function it plays! Have a good time!
Here are some examples/screenshots of a few services; of course, in each environment they will be different, or we may want to define them differently! When we document them, we will indicate whether the relationship between them is OR or AND, that is, whether we require all the conditions to be met or whether it is enough that one of them is OK. As you read on, I hope it becomes clearer!
DNS Service
This would be an example of what the DNS Service depends on to be functional. We have 2 DNS services running on 2 servers on which, apart from monitoring the basics, Nagios also makes DNS queries and verifies that the Windows Service is up. Look carefully at the conditions!
Virtualization Service
This Virtualization Service defines the dependencies that make it operational: apart from using Nagios to check that the ESXi hosts, the datastores and the SAN array have no problems, we also depend on the internal networking working well and on having an electricity supply, which we verify through the UPSs.
Exchange Service
And here we define everything that will affect our Exchange Service. We know we have 2 highly available servers running Microsoft Exchange Server, and they both offer the same services, both CAS (Client Access) and DAG for database balancing. We also take into account everything else that could bring it down, such as the SSL certificate we use, not being on SPAM lists, or having Internet access, among other things.
Internal Communications Service
One final example would be the Internal Communications Service, where in this case I define everything that must work so that there is connectivity between the computers and the servers, whether through switches or Wi-Fi access points.
…Etc., etc. I'll leave you with these examples to start with, OK? Once we define all the Operational Services and their Infrastructure Services, we will have mapped all the dependencies our business needs in order to work properly and not be affected. We will analyze our critical points and see how, for example, the expiration of a certificate can bring the business to a halt. Our goal is for business leaders to have another view of what our IT department brings to the business, with information that shows our needs, such as why we sometimes have to buy another storage array, etc. It is a very interesting way to justify and demonstrate our work! Together with growth reports, we can create impressive reports!
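As a rough idea of what analyzing critical points could look like once the dependencies are written down, here is a small Python sketch. Everything in it is an assumption for illustration: the tree is a simplified version of the Exchange Service described earlier (the leaf names are invented), AND means "worst child state" and OR means "best child state", and `critical_points` is a hypothetical helper that lists the items whose failure alone brings the service down.

```python
OK, CRITICAL = 0, 2


def evaluate(node, failed):
    """A node is either a leaf name (str) or a tuple (condition, [children])."""
    if isinstance(node, str):
        return CRITICAL if node in failed else OK
    condition, children = node
    states = [evaluate(child, failed) for child in children]
    return max(states) if condition == "AND" else min(states)


def leaves(node):
    """Flatten the tree into the list of its leaf names."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1] for leaf in leaves(child)]


def critical_points(root):
    """Leaves whose failure, on its own, is enough to make the root CRITICAL."""
    return [leaf for leaf in leaves(root) if evaluate(root, {leaf}) == CRITICAL]


# Simplified, hypothetical Exchange Service: two redundant servers (OR)
# plus shared items that have no redundancy (AND).
exchange_service = ("AND", [
    ("OR", ["Exchange server 1", "Exchange server 2"]),
    "SSL certificate",
    "SPAM list check",
    "Internet",
])

print(critical_points(exchange_service))
# ['SSL certificate', 'SPAM list check', 'Internet']
```

Neither Exchange server shows up in the result because the other one covers for it, while an expired certificate takes the whole service down by itself, which is exactly the kind of finding worth showing to the business.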