HPE Software is now Micro Focus

Operations Bridge: Setting a node in adhoc outage - in just a click!

Mgoyal

Micro Focus Operations Bridge hybrid IT monitoring software (OMi) has a powerful feature known as "Downtime" for defining planned and unplanned outages for monitored CIs. This feature lets you configure how events and CI status should be handled when the related CIs are in downtime. However, many customers, especially our long-time Operations Management (OM) customers, have been asking: how do I just right-click a CI or event in OMi and start or end an outage on it, as and when needed?

The expected result of putting a node in outage is that events coming from that node CI are automatically closed and do not appear in the active event browser. Most of these customers do not need the advanced capabilities of the Downtime feature, such as service health reflecting the outage status or tracking of downtimes.

If you are also looking for such an alternative, this article provides a workflow that uses only standard OMi features.

Approach:

Use a custom tool to add the node to a special node group. A user group assignment rule, using a view filter on that node group, then automatically assigns all events coming from the node to a special user group. An EPI script processes the events assigned to that user group and closes them. A second tool ends the outage by removing the node from the special node group.
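To make the moving parts easier to follow, here is a minimal Python sketch of the grouping and assignment logic only. It is a toy model, not OMi code: the class `OutageModel`, its methods, and the sample node `web01` with its related CIs are made up for illustration. The group names are the examples used in steps 1 and 3.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

OUTAGE_NODE_GROUP = "nodes-in-outage"            # node group name from step 1
OUTAGE_USER_GROUP = "disabled-nodes-user-group"  # user group name from step 3


@dataclass
class OutageModel:
    """Toy stand-in for the node group, the view and the assignment rule."""
    # node name -> names of related CIs (running software, IP addresses, ...)
    related_cis: Dict[str, Set[str]]
    node_groups: Dict[str, Set[str]] = field(default_factory=dict)

    def start_outage(self, node: str) -> None:
        # Step 1: the "start outage" tool adds the node to the node group.
        self.node_groups.setdefault(OUTAGE_NODE_GROUP, set()).add(node)

    def end_outage(self, node: str) -> None:
        # Step 6: the "end outage" tool removes the node again.
        self.node_groups.get(OUTAGE_NODE_GROUP, set()).discard(node)

    def view_cis(self) -> Set[str]:
        # Step 2: the view holds the grouped nodes plus their related CIs.
        cis: Set[str] = set()
        for node in self.node_groups.get(OUTAGE_NODE_GROUP, set()):
            cis.add(node)
            cis |= self.related_cis.get(node, set())
        return cis

    def assigned_group(self, event_ci: str) -> Optional[str]:
        # Step 4: events whose related CI is in the view go to the user group.
        return OUTAGE_USER_GROUP if event_ci in self.view_cis() else None


model = OutageModel(related_cis={"web01": {"web01:apache", "10.0.0.5"}})
model.start_outage("web01")
print(model.assigned_group("web01:apache"))  # disabled-nodes-user-group
model.end_outage("web01")
print(model.assigned_group("web01:apache"))  # None
```

The point of the model is that the node group is the only state the operator ever changes. Everything else follows from the view and the assignment rule.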

Detailed Steps:

Here is a custom workflow that can be used and further customized to implement the simple outage functionality in OMi.

Step 1: Add the node to a specific node group e.g. nodes-in-outage.

There are many ways to make this happen:

a. Using Monitored Nodes UI, create the node group and add the node to it
b. Run the opr-node CLI to do so
c. Create a tool (using opr-node CLI) that appears in the right click context menu or action tab of a node CI or event

In this article, I use option c and create a tool as shown below: 

 

tool-definition.jpg

 

Step 2: Create a view that contains all the node CIs in the above node group, plus any other CIs of interest that are affected by the node outage, such as running software and IP addresses.

 

view-definition.jpg

 

 

Step 3: Create a user group e.g. disabled-nodes-user-group. There is no real need to add any users to this group as its use is limited to the automatic assignment of outage events to the group so that we can easily filter such events later in the EPI.

 

user-group-definition.jpg

 

Step 4: Create a user group assignment rule that uses the view created in step 2 to filter the events matching the CIs in the view, and assigns those events to the user group created in step 3.

 

 user-group-assignment.jpg

 

Step 5: Create an EPI script at the "before storing the event" step. The script uses an event filter that selects only the events assigned to our special user group, and closes them. Alternatively, you can add a custom attribute to such events, or drop the events instead of closing them. What the script does depends on your use case.

 

 EPI-definition.jpg

 The event filter used in the EPI script:

EPI-filter.jpg
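The three options mentioned in step 5 (close, drop, or mark with a custom attribute) can be sketched as follows. This is plain Python written to show the decision logic only. It is not the EPI script from the screenshot and does not use the OMi scripting API: the `Event` class, the `process` function, the `mode` parameter and the custom attribute name `outage` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional

OUTAGE_USER_GROUP = "disabled-nodes-user-group"  # group created in step 3


@dataclass
class Event:
    """Hypothetical, simplified event; not the OMi scripting API."""
    title: str
    assigned_group: Optional[str] = None
    state: str = "open"
    custom_attributes: Dict[str, str] = field(default_factory=dict)


def process(events: Iterable[Event], mode: str = "close") -> List[Event]:
    """Return the events that should be stored.

    mode="close": store outage events as closed (what the article does).
    mode="drop":  do not store outage events at all.
    mode="tag":   store them open, marked with a custom attribute.
    """
    if mode not in ("close", "drop", "tag"):
        raise ValueError(f"unknown mode: {mode}")
    kept: List[Event] = []
    for event in events:
        # Equivalent of the event filter: only events assigned to the group.
        if event.assigned_group != OUTAGE_USER_GROUP:
            kept.append(event)
            continue
        if mode == "drop":
            continue
        if mode == "close":
            event.state = "closed"
        else:
            event.custom_attributes["outage"] = "true"  # hypothetical name
        kept.append(event)
    return kept


events = [Event("disk full", OUTAGE_USER_GROUP), Event("cpu high")]
for e in process(events):
    print(e.title, e.state)
# disk full closed
# cpu high open
```

Closing keeps the outage events in the event history, while dropping means they are never stored, so pick the mode according to whether you need to look at them afterwards.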

 

 Step 6: Remove the node from the special node group when the outage is over

 Again, as in step 1, there are many ways to do this:

a. Using Monitored Nodes UI, remove the node from the node group
b. Run the opr-node CLI to do so
c. Create a tool (using opr-node CLI) that appears in the right click context menu or action tab of a node CI or event

Here, I use option c and create a tool as shown below: 

tool-definition 2.jpg

 

Some Points to Consider:
----------------------------------

1. For non-admin users, make sure the user has appropriate permissions to execute tools and to run the opr-node command, as documented in the OMi Online help

2. I used OMi version 10.61 for this workflow. In some cases the opr-node command returns HTTP error 500 even though the command has worked successfully. This is a known defect and has been fixed in version 10.62.

3. I have created a content pack that contains the rules, filter and EPI script definitions. If you would like to make use of that, leave a comment and I will share the same with you.

 

 I hope you find this useful. Your feedback/comments/questions are welcome.

 

Events

To get more information on this release and how customers are using Operations Bridge, you can register for the following events:

Vivit event – Operations Special Interest Group Webinar – 11th October 2017 – Register here
ITOM Customer Forum, Copenhagen – October 3rd 2017 - Register here
ITOM Customer Forum, Stockholm – October 4th 2017 - Register here
ITOM Customer Forum, London – October 5th 2017 - Register here
ITOM Customer Forum, Brussels – December 7th 2017 -  coming soon

Read all our news at the OpsBridge blog

Comments
Respected Contributor.

So how do I add a specific node to that particular outage group? And how is this different from using the downtime management option, where we can select a specific CI and start a downtime?

[Mamta] - As mentioned in the article, you can use any of the following approaches to add a node to the outage node group:

a. Using Monitored Nodes UI, create the node group and add the node to it
b. Run the opr-node CLI to do so
c. Create a tool (using opr-node CLI) that appears in the right click context menu or action tab of a node CI or event

It's different from Downtime in that it lets you select an event or a node CI in an explorer view and start or end the outage by just right-clicking on it. With Downtime, you have to go to the Downtime UI, create a downtime with a start time and end time, and add CIs to it. Downtime also lets you reflect the downtime status in service health views, while in this case there is no effect on the CI status in service health.

Outstanding Contributor.

Hi,

very nice posting. Thank you!

I am just a little curious: why do you use the "assigned group" attribute rather than a CMA with a descriptive name (even with multiple possible values, maybe matching the downtime category)?

Is checking for a CMA value from within an event filter so much slower compared to checking for a standard message attribute?

Best regards,

Wodisch

 

Hi Wodisch, I'm using the assigned group because group assignment can be done automatically with OMi group assignment rules, which let me use a view filter to select the nodes that are in outage. For a CMA, I would have to write an EPI script to find out which nodes are in outage and then add the CMA, and I wanted to avoid that.

I hope it clarifies.

Thanks,

Mamta

PS: sorry for the typos

Micro Focus Expert

Seems a great approach, thanks for sharing. I would be interested in the content pack.

Best regards

Marc

Thanks Marc, will be sending you the content pack shortly.

Regards,

Mamta

Outstanding Contributor.

Hello Mgoyal, I would like to have a look at that Content Pack. Thank you.

 

Thanks Raymond, will be sharing the content pack shortly.

Honored Contributor.

How does it work when the alert is related to a disk (drive letter in the related CIs)? Will the same approach work for disk alerts?

Yes, the same approach will work for disk-related alerts, but you need to modify the view definition in step 2 to include the disk CI type.

Honored Contributor.

Will you be sharing it as a content pack on the Marketplace?

Hi Ashish,

Yes, that's the plan...

Thanks & Regards,

Mamta

Valued Contributor.

Hi Mgoyal.

Many thanks for the post. This is what I'm looking for. Could you kindly share the content pack?

 

Kind regards

Mohan