Configuring SCOM 2012

For this post I thought I’d run through a project I did to set up System Center Operations Manager for a client. It was a steep learning curve at first, but once you get to grips with it, SCOM 2012 is one of the most powerful monitoring systems out there.

Install Requirements

SCOM is a very demanding application, so ideally you should give it as many resources as you can spare. You will also need Windows Server 2012 R2 or Windows Server 2016, and SQL Server 2012, 2014 or 2016 to host its databases.

How much disk space you need depends on how much data you plan to collect. Collecting too much data can also cause the SCOM SQL database to grow so large that it affects performance.
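As a rough way to reason about disk size, the sketch below multiplies the number of agents by an assumed daily data volume and a retention period. The figures used (200 agents, 5 MB per agent per day, 7 days) are made-up inputs for illustration only, not SCOM defaults or official sizing guidance, and the function name is my own.

```python
def estimate_db_size_gb(agents, mb_per_agent_per_day, retention_days):
    """Rough steady-state database size in GB.

    Simply agents x daily volume x retention; it ignores indexes,
    overhead and growth, so treat the result as a lower bound.
    """
    total_mb = agents * mb_per_agent_per_day * retention_days
    return total_mb / 1024


# Hypothetical inputs: 200 agents, 5 MB per agent per day, 7 days kept.
print(round(estimate_db_size_gb(200, 5.0, 7), 2))
```

Doubling either the number of agents or the retention period doubles the estimate, which is why trimming unwanted data collection early pays off.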

I will now give you step-by-step instructions that I have drawn up to get the system operational and fully capable of monitoring a large enterprise environment.  We will assume that you have already installed SCOM 2012 and have just opened the console.

Adding client machines

Open the Operations Manager console and go to Administration>Device Management>Configure computers and devices to manage

Select Windows computers and click Next

Click Advanced discovery, select Servers Only and click Next

Select Scan Active Directory and click Configure, type the name of the server you want to install the agent on, then click Next

Select Use the selected Management Server Action Account and click Discover

Select the device and select Agent then click Next

Leave these settings at their defaults and click Finish. The agent will now be pushed out to the server remotely.

Updates to SCOM 2012 agents are rolled out through WSUS.

Adding Network Devices

In the Operations Console click Administration>Network Management>Discovery Rules

Select Networkdiscoveryrule1 and click Properties

Click Devices and then click Add. Then enter the details of the new network device. If the device supports SNMP, select ICMP and SNMP. If you only want ICMP monitoring, select ICMP instead. When finished click OK.

Then save the rule

Now right click on the rule and click Run

Notifications Channels

Open the Operations Manager console and go to Administration>Notifications>Channels

Right click on the existing SMTP channel and click Properties

Click Settings and fill in the fields as below:

Email Subject:

Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$ Resolution state: $Data[Default='Not Present']/Context/DataItem/ResolutionStateName$

Email message:

Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$

Source: $Data[Default='Not Present']/Context/DataItem/ManagedEntityDisplayName$

Path: $Data[Default='Not Present']/Context/DataItem/ManagedEntityPath$

Last modified by: $Data[Default='Not Present']/Context/DataItem/LastModifiedBy$

Last modified time: $Data[Default='Not Present']/Context/DataItem/LastModifiedLocal$

Alert description: $Data[Default='Not Present']/Context/DataItem/AlertDescription$

Alert view link: "$Target/Property[Type="Notification!Microsoft.SystemCenter.AlertNotificationSubscriptionServer"]/WebConsoleUrl$?DisplayMode=Pivot&AlertID=$UrlEncodeData/Context/DataItem/AlertId$"

Notification subscription ID generating this message: $MPElement$
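To show what the $Data placeholders do, here is a small Python sketch that mimics the substitution: each placeholder is replaced by the matching alert field, or by its Default text when the field is missing. This is only an illustration of the idea, not how SCOM implements it. The render function and the sample alert are my own inventions, and the sketch assumes the placeholders are typed with plain straight quotes.

```python
import re

# Matches $Data[Default='...']/Context/DataItem/FieldName$ (Default optional).
PLACEHOLDER = re.compile(
    r"\$Data(?:\[Default='([^']*)'\])?/Context/DataItem/(\w+)\$"
)


def render(template, data_item):
    """Replace each placeholder with the field value or its default text."""
    def substitute(match):
        default, field = match.group(1), match.group(2)
        fallback = default if default is not None else ""
        return str(data_item.get(field, fallback))
    return PLACEHOLDER.sub(substitute, template)


subject = (
    "Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$ "
    "Resolution state: "
    "$Data[Default='Not Present']/Context/DataItem/ResolutionStateName$"
)

# Hypothetical alert with no resolution state set.
print(render(subject, {"AlertName": "Drive degraded"}))
# prints: Alert: Drive degraded Resolution state: Not Present
```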

Notifications Subscribers

Again under Notifications, right click on System Center Notifications and click Properties.

Now click Schedule and set to Always send notifications

Then click Addresses and fill in as below:

Notifications Subscriptions

Click Notifications>Subscriptions and right click on ORG System Center Critical Notifications and click Properties. Fill in as below:

Click Subscribers and add as below:

Click Channels and set up as below:

Email subject:

Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$ Resolution state: $Data[Default='Not Present']/Context/DataItem/ResolutionStateName$

Email message:

Alert: $Data[Default='Not Present']/Context/DataItem/AlertName$

Source: $Data[Default='Not Present']/Context/DataItem/ManagedEntityDisplayName$

Path: $Data[Default='Not Present']/Context/DataItem/ManagedEntityPath$

Last modified by: $Data[Default='Not Present']/Context/DataItem/LastModifiedBy$

Last modified time: $Data[Default='Not Present']/Context/DataItem/LastModifiedLocal$

Alert description: $Data[Default='Not Present']/Context/DataItem/AlertDescription$

Alert view link: "$Target/Property[Type="Notification!Microsoft.SystemCenter.AlertNotificationSubscriptionServer"]/WebConsoleUrl$?DisplayMode=Pivot&AlertID=$UrlEncodeData/Context/DataItem/AlertId$"

Notification subscription ID generating this message: $MPElement$

Then click Summary and then Finish.

Adding SNMP devices

Some devices automatically transmit SNMP data to SCOM 2012 if they are recognised. Other devices need their SNMP data programmed into SCOM. To collect SNMP data from these devices we need to set up SNMP probes. To do this we also need to create a Unit Monitor (a monitor that tracks a single reading from the device).

1. In SCOM create a unit monitor. To do this open the console and go to Authoring>Management Pack Objects>Monitors, then right click on Monitors and click Create a Monitor>Unit Monitor

2. Expand SNMP>Probe Based Detection>SNMP Probe Monitor

3. Select the PMA SNMP Traps Management Pack and click Next.

4. Give the monitor a name, e.g. PMA High Temperature Alert

5. Click Select next to Monitor Target, select View All Targets, select Node and click OK, then click Next.

6. Enter the OID (object identifier) of the first object you want to monitor. You can take this from the documentation for your device or from a tool like GetIf, which lets you browse an object’s MIB (management information base) attributes. Next to each MIB entry is the OID, which is what you need for System Center.

So for this example we would take .1.3.6.1.4.1.17373.2.2.1.5.1 as our OID as we know this represents temperature.

7. Put this value in the Object Identifier section

8. Click Next, for the first expression click Insert> Expression and put the following in Parameter Name:

/DataItem/SnmpVarBinds/SnmpVarBind[1]/Value

9. Set the operator to whatever you need. We don’t want a temperature higher than 26, so we’ll use Greater than. Then set the value to 26.

10. For the second SNMP probe set the Object Identifier to have the same value as the first i.e. .1.3.6.1.4.1.17373.2.2.1.5.1

11. Then click Insert> Expression.

12. Then set the Parameter Name to the same /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value and set the Operator to Less than. You can then set the Value to your ‘safe’ setting, i.e. we now want the system to tell us everything is OK once the temperature drops below 27. For whole-number readings this is the exact opposite of the first expression.

13. Next configure how you want the severity to appear. Critical is a good choice for the first probe if the reading is important, and Healthy for the second.

14. On the Alerts window check the box that says Generate alerts for this monitor. Give the alert a name and then put in a description. In the description we want to tell the user what is happening, e.g. the temperature is too high. We might also want to include the value that triggered the alert, for example that the current temperature is 28 degrees Celsius. To do this simply use the following in the Alert description:

$Data/Context/SnmpVarBinds/SnmpVarBind[1]/Value$

If you have multiple expressions you could also have multiple variables in your description i.e.

$Data/Context/SnmpVarBinds/SnmpVarBind[1]/Value$

$Data/Context/SnmpVarBinds/SnmpVarBind[2]/Value$

When finished click Create and the monitor will go live instantly.
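The logic of the two probes can be sketched in a few lines of Python. This is only a model of the two-state monitor described in the steps above, not SCOM code: the function names are mine, and checking the Critical expression first is an assumption I have made for the sketch.

```python
import re

# A dotted string of numbers, with an optional leading dot.
OID_PATTERN = re.compile(r"\.?\d+(\.\d+)*")


def is_valid_oid(oid):
    """True if the text looks like a numeric OID such as .1.3.6.1.4.1"""
    return OID_PATTERN.fullmatch(oid) is not None


def monitor_state(value, critical_above=26, healthy_below=27):
    """Evaluate the two expressions from the steps above."""
    if value > critical_above:      # first probe: Greater than 26
        return "Critical"
    if value < healthy_below:       # second probe: Less than 27
        return "Healthy"
    return None                     # neither expression matched


def alert_description(value):
    """Mimics putting SnmpVarBind[1]/Value into the alert description."""
    return "Temperature is too high: current reading is %s" % value


print(is_valid_oid(".1.3.6.1.4.1.17373.2.2.1.5.1"))  # True
print(monitor_state(28), monitor_state(26))          # Critical Healthy
```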

Maintenance Mode

Why use Maintenance Mode?

System Center Operations Manager has a Maintenance Mode that we can set on its agents. When agent machines are in Maintenance Mode this is classed as scheduled downtime and is therefore not reflected in our SLAs/reports. This is very useful if we want to show statistics on the level of uptime/downtime a machine or application has had, as this data would otherwise be contaminated by scheduled maintenance. More importantly, while a machine is in Maintenance Mode you will not get floods of alerts warning you about something that was scheduled and that you knew would happen.
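To see why this matters for SLA figures, the sketch below works out availability for one day with and without a maintenance window. The dates, outages and function names are all made up for illustration, and it assumes outages fall inside the reporting period and maintenance windows do not overlap each other. It is not how SCOM calculates its reports.

```python
from datetime import datetime, timedelta


def overlap(a_start, a_end, b_start, b_end):
    """Length of time two intervals share (zero if they do not overlap)."""
    return max(min(a_end, b_end) - max(a_start, b_start), timedelta(0))


def availability(period_start, period_end, outages, maintenance_windows):
    """Percentage uptime, ignoring outage time inside maintenance windows."""
    unplanned = timedelta(0)
    for o_start, o_end in outages:
        planned = sum(
            (overlap(o_start, o_end, m_start, m_end)
             for m_start, m_end in maintenance_windows),
            timedelta(0),
        )
        unplanned += (o_end - o_start) - planned
    return 100.0 * (1 - unplanned / (period_end - period_start))


day = datetime(2015, 10, 5)
outages = [
    (day.replace(hour=9), day.replace(hour=9, minute=12)),    # unplanned
    (day.replace(hour=18), day.replace(hour=18, minute=30)),  # during backups
]
maintenance = [(day.replace(hour=17, minute=50), day.replace(hour=19))]

with_mm = availability(day, day + timedelta(days=1), outages, maintenance)
without_mm = availability(day, day + timedelta(days=1), outages, [])
print(round(with_mm, 2), round(without_mm, 2))
```

With the maintenance window in place only the 12 unplanned minutes count against the server, whereas without it the 30 minutes of backup-related downtime drag the figure down as well.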

 

Scheduled Server Maintenance

System Center 2012 as it stands does not let you schedule Maintenance Mode. Fortunately there is still a way of doing it using a custom management pack and scripts.

1. First you need to import the management pack from Sample.Agent.MaintenanceMode.1.0.0.2\Sample.Agent.MaintenanceMode.xml into SCOM 2012.

2. Now connect to a server that needs Maintenance Mode scheduled and copy the entire Sample.Agent.MaintenanceMode.1.0.0.2\MM folder onto the C:\ drive.

3. Now all you need to do is set up a scheduled task to run C:\MM\Trigger_SCOM_Maintenance_Mode.bat at the required time. For example, when backing up the file server, Maintenance Mode is scheduled for 5:50PM. The server is then ready for the 6PM backups, which generate lots of DFS errors if Maintenance Mode is not enabled.
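The timing in step 3 can be sketched as follows: take the start time of the job, subtract a lead time, and build the scheduled task command from the result. The 10 minute lead matches the 5:50PM/6PM example above. The task name, the date and the function names are hypothetical, and the sketch only builds the schtasks command line rather than running it.

```python
from datetime import datetime, timedelta


def maintenance_start(job_start, lead_minutes=10):
    """Start Maintenance Mode a little before the noisy job begins."""
    return job_start - timedelta(minutes=lead_minutes)


def schtasks_command(task_name, script_path, start):
    """Arguments for a daily Windows scheduled task at the given time."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,
        "/TR", script_path,
        "/SC", "DAILY",
        "/ST", start.strftime("%H:%M"),
    ]


backup = datetime(2015, 10, 5, 18, 0)  # hypothetical 6PM backup
start = maintenance_start(backup)
command = schtasks_command(
    "SCOM Maintenance Mode",           # hypothetical task name
    r"C:\MM\Trigger_SCOM_Maintenance_Mode.bat",
    start,
)
print(" ".join(command))
```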

Manual Maintenance Mode

Maintenance Mode should always be scripted and added to scheduled tasks for recurring reboots, service restarts etc. If Maintenance Mode needs to be triggered manually for a one-off reboot, you can simply run the C:\MM\OpsMgrMM.ps1 PowerShell file. You then select the duration and reason, add a comment if necessary and click Start Maintenance.

Filtering unwanted monitors

SCOM 2012 is a very comprehensive monitoring system, in fact too comprehensive in its out-of-the-box state for most organisations. Once it is enabled and you have added your first Windows servers, the influx of information can be overwhelming.

There is no template or set of monitors that makes this process any easier to set up. The only way to get what you want out of the system is to start filtering unwanted information as it comes in.

We have already set up notifications, so you will now be emailed whenever an important alert comes in. If you receive an alert that you do not want, immediately go to Monitoring>Active Alerts. Here you will see the alert that has just been generated and any other alerts that have come in recently.

To permanently stop receiving these alerts right click on the alert and go to Overrides>Disable the Monitor>For the object: <Object>

This will disable the monitor for this one particular object. If you want to disable it for an entire class of objects, e.g. all Windows servers, select For all objects of class.

You will now be presented with the Override Properties screen. From here you need to make sure you set the Override Value to False, change the destination Management Pack to ORG Custom Monitors and click OK.
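The difference between the two override scopes can be modelled roughly as below: an override for a specific object wins over one for the whole class, which in turn wins over the monitor's default. This is a simplified model under my own assumptions. Real SCOM precedence also takes groups and enforced overrides into account, and all the names here are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Override:
    monitor: str
    enabled: bool
    target_class: str
    target_object: Optional[str] = None  # None = all objects of the class


def is_monitor_enabled(monitor, obj_name, obj_class, overrides, default=True):
    """Most specific override wins: object, then class, then default."""
    for o in overrides:
        if o.monitor == monitor and o.target_object == obj_name:
            return o.enabled
    for o in overrides:
        if (o.monitor == monitor and o.target_object is None
                and o.target_class == obj_class):
            return o.enabled
    return default


overrides = [
    # Disabled for every logical disk...
    Override("Defrag Analysis", False, "Logical Disk"),
    # ...but re-enabled for one hypothetical disk.
    Override("Defrag Analysis", True, "Logical Disk", "FILESERVER C:"),
]
print(is_monitor_enabled("Defrag Analysis", "FILESERVER C:", "Logical Disk", overrides))
print(is_monitor_enabled("Defrag Analysis", "SQLSERVER D:", "Logical Disk", overrides))
```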

Custom Monitors

When we make changes to existing Management Packs by changing Monitors, Groups or Rules we do not want to save these to the original Management Pack. The reason is that if we damage or corrupt that Management Pack we may want to revert to the original. Therefore every time we save a change, to a Monitor for example, we select the custom Management Pack we created to save it in. This acts almost like a log file of all our changes to the system.

To see all the amendments made to Monitors in your custom Management Pack go to Administration>Management Packs>Right click on ORG Custom Monitors>Click Export Management Pack.

This will export your management pack as ORG.Custom.Monitors.xml which you can then view in your favourite text editor. You can then see all the edits you have made and even edit this file directly to make changes. This line for example ‘OverrideForMonitorMicrosoftWindowsServer2008LogicalDiskDefragAnalysis’ shows that we have overridden a Monitor that informs us about a Defragmentation analysis.
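If you would rather list the overrides than read the XML by eye, a few lines of Python will do it. The sample below is a heavily trimmed, hand-written stand-in for an exported pack: only the override ID comes from this post, while the Context and Monitor values are placeholders I made up. A real export contains much more, so treat this as a sketch.

```python
import xml.etree.ElementTree as ET

# Trimmed, hand-written sample; Context and Monitor values are made up.
SAMPLE = """
<ManagementPack>
  <Monitoring>
    <Overrides>
      <MonitorPropertyOverride
          ID="OverrideForMonitorMicrosoftWindowsServer2008LogicalDiskDefragAnalysis"
          Context="Example.Context.Class"
          Monitor="Example.Defrag.Monitor"
          Property="Enabled">
        <Value>false</Value>
      </MonitorPropertyOverride>
    </Overrides>
  </Monitoring>
</ManagementPack>
"""


def list_monitor_overrides(xml_text):
    """Return one dictionary per monitor override found in the XML."""
    root = ET.fromstring(xml_text)
    results = []
    for node in root.iter("MonitorPropertyOverride"):
        results.append({
            "id": node.get("ID"),
            "monitor": node.get("Monitor"),
            "property": node.get("Property"),
            "value": node.findtext("Value"),
        })
    return results


for item in list_monitor_overrides(SAMPLE):
    print(item["id"], item["property"], "=", item["value"])
```

To try it against a real export you could read the file first, for example with open("ORG.Custom.Monitors.xml").read(), and pass the text to the same function.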

Groups

To see any custom groups created go to Authoring>Groups>Click find and type ORG.

Here you will see any custom groups created. The main reason for creating them is to group devices unknown to SCOM so that you can assign monitors to multiple objects with one assignment. For example, the Windows 2012 servers group already exists, so setting a monitor for it is quite easy. If you have just added three UPS units, though, you can create a custom group here and assign any custom monitors to it.

To create a new group right click on Groups and click Create New Group. You can then give the group a name and select the ORG Custom Groups Management Pack.

You can then choose to add either Explicit or Dynamic members. Explicit members are the simplest option, but if you want more complex logic-based rules you can have members added automatically based on a set of criteria.
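The difference between the two membership types can be sketched like this: explicit members are a fixed list of names, while dynamic members are whatever currently matches a rule. The inventory, device names and the "starts with UPS-" rule are all hypothetical, and this is only a model of the idea rather than SCOM's group calculation.

```python
import re


def resolve_group(inventory, explicit_names=(), rule=None):
    """Names of devices that are listed explicitly or match the rule."""
    explicit = set(explicit_names)
    return [
        device["name"]
        for device in inventory
        if device["name"] in explicit or (rule is not None and rule(device))
    ]


# Hypothetical inventory.
inventory = [
    {"name": "UPS-01"},
    {"name": "UPS-02"},
    {"name": "UPS-03"},
    {"name": "FILESERVER"},
]

# Explicit: you pick the members by hand.
print(resolve_group(inventory, explicit_names=["UPS-01", "UPS-02"]))

# Dynamic: anything whose name starts with UPS- is added automatically.
print(resolve_group(inventory, rule=lambda d: re.match(r"UPS-", d["name"]) is not None))
```

The dynamic version picks up UPS-03 without anyone having to edit the group, which is the main attraction once you have more than a handful of devices.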

Management Packs

We have already mentioned custom management packs but the majority of the data we receive from SCOM comes from the pre-defined Microsoft Management Packs at http://social.technet.microsoft.com/wiki/contents/articles/16174.microsoft-management-packs.aspx

 

Each Management Pack specialises in generating monitoring information for particular Microsoft Products. It is best not to download too many packs at first as the amount of information generated by each one takes a toll on your server’s resources.

We can install the following Management Packs:

Microsoft Exchange

Microsoft SQL Server

Microsoft Windows Active Directory

Microsoft Cluster

Microsoft Windows DFS Replication

Microsoft Windows Group Policy

Microsoft Windows Internet Information Services

Microsoft Windows Print Server

Microsoft Windows Remote Desktop Services

Microsoft Windows Server

Microsoft Windows Server DNS

Windows Server Update Services

To install the pack simply run the install file on your SCOM server. This will place the files in \\<Your Server>\c$\Program Files (x86)\System Center Management Packs

Then in the Operations Manager console go to Administration>Right click on Management Packs>Click Import Management Packs and then click Add, then Add from disk. Select No when asked if you want to search the online catalog for dependencies.

Once imported you should then be able to see the new Management Pack in the Monitoring view.

To then see specific information about your chosen management pack you can expand any of the fields. For example to see a historical display of memory usage on the file server select the Microsoft Windows Server Management Pack. Expand it and select Operating System Performance. Then select the PercentMemoryUsed under the name of the file server.

A graph will then appear that shows the percentage memory used. To edit the time range right click on the graph and click Select Time Range.

You can then specify the From and To dates.

Of course the whole point of Operations Manager is that you don’t have to sit there trawling through pages of system data. Every management pack has rules set up that raise alerts, and so trigger email notifications, when any of the data crosses certain thresholds. For example if a system’s CPU is above 80% usage or the disk queue is above 4 you will receive email alerts.
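In spirit, a threshold rule is no more than a comparison like the one below. The two limits are the ones quoted above. The metric names, the sample readings and the function are made up for the sketch and are not taken from any management pack.

```python
# Limits quoted above; the metric names are hypothetical.
THRESHOLDS = {"cpu_percent": 80, "disk_queue_length": 4}


def breached(sample):
    """Describe every metric in the sample that is above its threshold."""
    return [
        "%s is %s (threshold %s)" % (metric, sample[metric], limit)
        for metric, limit in THRESHOLDS.items()
        if sample.get(metric, 0) > limit
    ]


# Hypothetical reading: busy CPU, healthy disk queue.
for line in breached({"cpu_percent": 92, "disk_queue_length": 2}):
    print(line)
# prints: cpu_percent is 92 (threshold 80)
```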

HP SIM Integration

HP Systems Insight Manager should be installed in your environment. This is the system that allows for detailed monitoring of all aspects of HP ProLiant server hardware. Fortunately there is an HP Management Pack that allows HP SIM to integrate with SCOM 2012.

There is a wealth of information then transmitted from HP SIM to SCOM 2012 that should have you alerted immediately in the event of a hardware problem. To see how many monitors are involved expand the HP Systems Management Pack and click Windows Computers. Select a server, then right click and choose Health Explorer. Once the Health Explorer opens close the ‘Scope is only unhealthy child monitors’ filter. This will then show you all the available monitors for a Management Pack.

Day-to-Day Monitoring

For day-to-day monitoring you need something that you can quickly and easily use to spot potential problems. To do this you can create custom dashboards that just show the items you want to see. First of all create a Custom Dashboard Management Pack. Then right click on the Custom Dashboard Management Pack and click New>Dashboard view.

First of all you need to specify how your Dashboard will be laid out. Give the Dashboard a name and then choose a layout. I opted for two columns but you can set this to whatever you need.

Once your layout is complete, click the Settings cog above each of your columns and click Configure. Give the column a name and click Next. The Scope is where we specify the monitored items that we want to view. I separated my columns into Servers/Network devices and Applications.

Servers

Network Devices – you can see here that there was no built-in group that covered all ORG network devices so I created my own.

Applications – you can see that I have added a custom group here for ORG Websites. I will discuss adding a Website monitor below.

To add a Web Application Availability Monitor click Authoring and expand Management Pack Templates. Then right click on Web Application Availability Monitoring and click Add Monitoring Wizard. Click Web Application Availability Monitoring then click Next. Give the Monitor a name and select the ORG Custom Monitor Management Pack.

Add the ORG website addresses and click Next then Finish.

Investigating Problems

Here is how a typical server-based problem will unfold.

1. You will get an email from System Center as below:

Alert: HP Windows (SNMP) Drive Array Physical Drive degraded. Resolution state: New

Alert: HP Windows (SNMP) Drive Array Physical Drive degraded.

Source: Server Storage

Path: SERVER.ORG.CO.UK;SERVER..ORG.CO.UK

Last modified by: System

Last modified time: 10/5/2015 1:57:25 PM

Alert description: Drive Array Physical Drive Status Change. The physical drive in Slot 0, Port 2I Box 2 Bay 6 with serial number “KPGdfdJD4PF”, has a new status of 3.

(Drive status values: 1=other, 2=ok, 3=failed, 4=predictiveFailure, 5=erasing, 6=eraseDone, 7=eraseQueued, 8=ssdWearOut, 9=notAuthenticated)

[SNMP TRAP: 3046 in CPQIDA.MIB]

Alert view link: “http://SERVER/OperationsManager?DisplayMode=Pivot&AlertID=%7b72ceac17-9f03-4b92-90bb-5062119dc1b9%7d”

Notification subscription ID generating this message: {5E3FE64E-71D3-FD66-10B7-D189C4571D01}

2. You go to System Center and see the following:

3. You right click on the red/failed device and click Health Explorer.

4. System Center informs you that in this case we have an HP based hardware issue.

5. We then log in to HP SIM and indeed see that there is an HP hardware based problem with the server.

6. We visit the HP System Management homepage for more information on the problem.

7. We can see straight away that we have a failed drive in bay 6.
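Two details in the sample email are worth decoding: the numeric drive status and the URL-encoded alert ID in the link. The Python sketch below does both, using the status table and the link from the email in step 1. The function names are my own and the parsing is deliberately simple, so it is an illustration rather than a robust mail parser.

```python
import re
from urllib.parse import urlparse, parse_qs

# Status table copied from the alert email above.
DRIVE_STATUS = {
    1: "other", 2: "ok", 3: "failed", 4: "predictiveFailure", 5: "erasing",
    6: "eraseDone", 7: "eraseQueued", 8: "ssdWearOut", 9: "notAuthenticated",
}


def drive_status(description):
    """Translate 'new status of N' in the description into its meaning."""
    match = re.search(r"new status of (\d+)", description)
    if match is None:
        return None
    return DRIVE_STATUS.get(int(match.group(1)), "unknown")


def alert_id(link):
    """Pull the alert GUID out of the alert view link."""
    query = parse_qs(urlparse(link.strip('"“”')).query)
    return query["AlertID"][0].strip("{}")  # %7b and %7d decode to { and }


description = "The physical drive in Slot 0 has a new status of 3."
link = ("http://SERVER/OperationsManager?DisplayMode=Pivot"
        "&AlertID=%7b72ceac17-9f03-4b92-90bb-5062119dc1b9%7d")

print(drive_status(description))  # failed
print(alert_id(link))             # 72ceac17-9f03-4b92-90bb-5062119dc1b9
```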

 
