Showing posts with label Agents. Show all posts
Showing posts with label Agents. Show all posts

Friday, July 17, 2020

SCOM agent not connecting. Events 20000,21016,20070



The OpsMgr Connector connected to MS.contoso.com, but the connection was closed immediately after authentication occurred.  The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration.  Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect

OpsMgr was unable to set up a communications channel to MS.contoso.com and there are no failover hosts.  Communication will resume when MS.contoso.com is available and communication from this computer is allowed.

After all actions have failed like repair, reinstall, delete,approve again etc. The following steps should work. Use at you own risk and don't forget to backup your database.

Uninstall the agent from the computer completely first. Then do the following actions in sequence.

1. Delete pending agent if any

exec p_AgentPendingActionDeleteByAgentName ‘agentname.domain.com’

2. Delete
USE [OperationsManager]
UPDATE dbo.[BaseManagedEntity]
SET
[IsManaged] = 0,
[IsDeleted] = 1,
[LastModified] = getutcdate()
WHERE FullName like ‘%computername%’

3. Grooming

DECLARE @GroomingThresholdUTC datetime

SET @GroomingThresholdUTC = DATEADD(d,-2,GETUTCDATE())
UPDATE BaseManagedEntity
SET LastModified = @GroomingThresholdUTC
WHERE [IsDeleted] = 1
UPDATE Relationship
SET LastModified = @GroomingThresholdUTC
WHERE [IsDeleted] = 1
UPDATE TypedManagedEntity
SET LastModified = @GroomingThresholdUTC
WHERE [IsDeleted] = 1

EXEC p_DataPurging

4.  Groom all partition tables.

/*-------------------------------*/

declare @counter int  set @counter = 0  while @counter < 122

begin

exec p_PartitioningAndGrooming

set @counter = @counter + 1

print 'The counter is ' + cast(@counter as char)

end

/*-----------------------------*/

Tuesday, May 2, 2017

Putting SCOM agents in maintenance mode during Configmgr SCCM patching using a management pack.

A while ago I was tasked to suppress the alerts from SCOM for servers which were being patched and rebooted. There is a checkbox in SCCM which allows you to suppress the SCOM alerts but it did not work in my case and we got bombarded with alerts during a patching window.

This management pack in your environment should help in catching those alerts and suppressing your servers when there is a reboot for patching. Copy the code below and create your own xml file.
Name it Contoso.Patching.Maintenance.xml before you import it or you can rename it to the company or organization you work for.
But make sure that you replace Contoso everywhere.

This management pack will monitor the agents for an event 1074. Which is the Configmgr initiated reboot.
Then run a power shell script which will put those agents in maintenance mode.


<?xml version="1.0" encoding="utf-8"?><ManagementPack ContentReadable="true" SchemaVersion="2.0" OriginalSchemaVersion="1.1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <Manifest>
    <Identity>
      <ID>Contoso.Patching.Maintenance</ID>
      <Version>1.0.0.0</Version>
    </Identity>
    <Name>Contoso Patching Maintenance</Name>
    <References>
      <Reference Alias="MicrosoftWindowsLibrary7585010">
        <ID>Microsoft.Windows.Library</ID>
        <Version>7.5.8501.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="SystemLibrary7585010">
        <ID>System.Library</ID>
        <Version>7.5.8501.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="SystemCenter">
        <ID>Microsoft.SystemCenter.Library</ID>
        <Version>7.0.8433.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="Health">
        <ID>System.Health.Library</ID>
        <Version>7.0.8433.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
  <Monitoring>
    <Rules>
      <Rule ID="Contoso.ConfigmgrInitiated.Reboot.Alert.Rule" Enabled="true" Target="MicrosoftWindowsLibrary7585010!Microsoft.Windows.Server.Computer" ConfirmDelivery="true" Remotable="true" Priority="Normal" DiscardLevel="100">
        <Category>Alert</Category>
        <DataSources>
          <DataSource ID="DS" TypeID="MicrosoftWindowsLibrary7585010!Microsoft.Windows.EventProvider">
            <ComputerName>$Target/Property[Type="MicrosoftWindowsLibrary7585010!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
            <LogName>System</LogName>
            <Expression>
              <And>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <XPathQuery Type="UnsignedInteger">EventDisplayNumber</XPathQuery>
                    </ValueExpression>
                    <Operator>Equal</Operator>
                    <ValueExpression>
                      <Value Type="UnsignedInteger">1074</Value>
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
                <Expression>
                  <SimpleExpression>
                    <ValueExpression>
                      <XPathQuery Type="String">PublisherName</XPathQuery>
                    </ValueExpression>
                    <Operator>Equal</Operator>
                    <ValueExpression>
                      <Value Type="String">User32</Value>
                    </ValueExpression>
                  </SimpleExpression>
                </Expression>
                <Expression>
                  <RegExExpression>
                    <ValueExpression>
                      <XPathQuery Type="String">EventDescription</XPathQuery>
                    </ValueExpression>
                    <Operator>ContainsSubstring</Operator>
                    <Pattern>The process C:\Windows\CCM\Ccmexec.exe</Pattern>
                  </RegExExpression>
                </Expression>
                <Expression>
                  <RegExExpression>
                    <ValueExpression>
                      <XPathQuery Type="String">EventDescription</XPathQuery>
                    </ValueExpression>
                    <Operator>ContainsSubstring</Operator>
                    <Pattern>has initiated the restart of computer</Pattern>
                  </RegExExpression>
                </Expression>
              </And>
            </Expression>
          </DataSource>
        </DataSources>
        <WriteActions>
          <WriteAction ID="Alert" TypeID="Health!System.Health.GenerateAlert">
            <Priority>1</Priority>
            <Severity>0</Severity>
            <AlertName />
            <AlertDescription />
            <AlertOwner />
            <AlertMessageId>$MPElement[Name="Contoso.ConfigmgrInitiated.Reboot.Alert.Rule.AlertMessage"]$</AlertMessageId>
            <AlertParameters>
              <AlertParameter1>$Data/LoggingComputer$</AlertParameter1>
              <AlertParameter2>$Data/EventSourceName$</AlertParameter2>
              <AlertParameter3>$Data/EventNumber$</AlertParameter3>
              <AlertParameter4>$Data[Default='']/EventDescription$</AlertParameter4>
            </AlertParameters>
            <Suppression />
            <Custom1 />
            <Custom2 />
            <Custom3 />
            <Custom4 />
            <Custom5 />
            <Custom6 />
            <Custom7 />
            <Custom8 />
            <Custom9 />
            <Custom10 />
          </WriteAction>
        </WriteActions>
      </Rule>
      <Rule ID="Contoso.ConfigMgrInitiated.Reboot.Script.Rule" Enabled="true" Target="SystemCenter!Microsoft.SystemCenter.AllManagementServersPool" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
        <Category>Custom</Category>
        <DataSources>
          <DataSource ID="Scheduler" TypeID="SystemLibrary7585010!System.Scheduler">
            <Scheduler>
              <SimpleReccuringSchedule>
                <Interval Unit="Minutes">2</Interval>
              </SimpleReccuringSchedule>
              <ExcludeDates />
            </Scheduler>
          </DataSource>
        </DataSources>
        <WriteActions>
          <WriteAction ID="ExecuteScript" TypeID="MicrosoftWindowsLibrary7585010!Microsoft.Windows.PowerShellWriteAction">
            <ScriptName>SuppressPatchedServers.ps1</ScriptName>
            <ScriptBody>
            ## This script will suppress the patched servers
 Add-PSSnapin "Microsoft.EnterpriseManagement.OperationsManager.Client" -ErrorVariable errSnapin;
 Function WriteEvent ($Messages,$ID) {
 $Messages = $Messages + "`t" + $(get-date).ToString()
 Write-EventLog -LogName Application -Source PatchingSuppress -EventId $ID -Message $Messages
 }


#WriteEvent "Starting patching script in Contoso.Patching.Maintenance.xml mp" "1234"

$Date = Get-Date
$path = $date.Month.ToString() + $date.Day.ToString() + $date.Year.ToString() + ".log"
$minutes= 15
$comment= "Rebooted by Configuration Manager"
$reason="PlannedOther"
$startTime = [System.DateTime]::Now
$endTime = $startTime.AddMinutes($minutes)
$Agents = Get-SCOMAgent
$Alerts = Get-ScomAlert -Criteria {Name like 'Contoso Reboot Initiated Alert' and ResolutionState = 0}


foreach($Alert in $Alerts)
{
$agent = $Agents | Where-Object {$_.DisplayName -eq $alert.MonitoringObjectName}
if(($clusters = $agent.GetRemotelyManagedComputers()))
   {
      $clusterNodeClass = Get-MonitoringClass -Name Microsoft.Windows.Cluster.Node
       foreach($cluster in $clusters)
         {
           $clusterObj = Get-MonitoringClass -Name Microsoft.Windows.Cluster | Get-MonitoringObject -Criteria "Name='$($cluster.ComputerName)'"
            if($clusterObj)
             {
               $clusterObj.ScheduleMaintenanceMode($startTime,$endTime,$reason,$Comment,"Recursive")
               $nodes = $clusterObj.GetRelatedMonitoringObjects($clusterNodeClass)
                if($nodes)
                   {
                   foreach($node in $nodes)
                     {
                       $message = $node.ToString()
                       WriteEvent "Putting node $message into maintenance mode by patching script in Contoso.Patching.Maintenance.xml mp." "1235"
                       }
                   }
              }
              $message = $($cluster.Computer).ToString()
              WriteEvent "Putting cluster computer $message into maintenance mode by patching script in Contoso.Patching.Maintenance.xml mp." "1236"
              New-MaintenanceWindow -StartTime $startTime -EndTime $endTime -MonitoringObject $cluster.Computer -Reason $reason -Comment $comment
          }
    }
   
    else
   {
     $message = $($agent.HostComputer.DisplayName).ToString()
     WriteEvent "Putting server $message into maintenance mode by patching script in Contoso.Patching.Maintenance.xml mp." "1237"
     New-MaintenanceWindow -StartTime $startTime -EndTime $endTime -MonitoringObject $agent.HostComputer -Reason $reason -Comment $comment
   }

   $Alert | Set-SCOMAlert -ResolutionState 255 -CustomField8 "Patching maintenance completed"
 }
 #WriteEvent "Ending patching script in Contoso.Patching.Maintenance.xml mp" "1238"

            </ScriptBody>
            <TimeoutSeconds>60</TimeoutSeconds>
          </WriteAction>
        </WriteActions>
      </Rule>
    </Rules>
  </Monitoring>
  <Presentation>
    <Folders>
      <Folder ID="Folder_5e3e9391b6394ab288bd1c95f83e90cd" Accessibility="Public" ParentFolder="SystemCenter!Microsoft.SystemCenter.Monitoring.ViewFolder.Root" />
    </Folders>
    <StringResources>
      <StringResource ID="Contoso.ConfigmgrInitiated.Reboot.Alert.Rule.AlertMessage" />
    </StringResources>
  </Presentation>
  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="false">
      <DisplayStrings>
        <DisplayString ElementID="Contoso.Patching.Maintenance">
          <Name>Contoso Patching Maintenance</Name>
          <Description>Author: Parag Waghmare
Reason: Created to suppress the servers during reboots initiated by patching</Description>
        </DisplayString>
        <DisplayString ElementID="Folder_5e3e9391b6394ab288bd1c95f83e90cd">
          <Name>Contoso Patching Maintenance</Name>
        </DisplayString>
        <DisplayString ElementID="Contoso.ConfigmgrInitiated.Reboot.Alert.Rule">
          <Name>Contoso Reboot Initiated Alert</Name>
          <Description />
        </DisplayString>
        <DisplayString ElementID="Contoso.ConfigmgrInitiated.Reboot.Alert.Rule.AlertMessage">
          <Name>Contoso Reboot Initiated Alert</Name>
          <Description>Computer:{0}
EventSource:{1}
EventID:{2}
Event Description: {3}
</Description>
        </DisplayString>
        <DisplayString ElementID="Contoso.ConfigmgrInitiated.Reboot.Alert.Rule" SubElementID="DS">
          <Name>DS</Name>
        </DisplayString>
        <DisplayString ElementID="Contoso.ConfigmgrInitiated.Reboot.Alert.Rule" SubElementID="Alert">
          <Name>Alert</Name>
        </DisplayString>
        <DisplayString ElementID="Contoso.ConfigMgrInitiated.Reboot.Script.Rule">
          <Name>Contoso suppress patching server script</Name>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>

Thursday, April 27, 2017

Putting a SCOM agent (non clustered) in maintenance mode using powershell

## Putting scom agent in maintenance mode. This does not yet put agents which are part ## of a cluster in maintenance mode.




$rootMS = "rootms"

$agentName = "agentname"

$minutes= 10

$comment= "Planned Reboot"

$reason="PlannedOther"

$startTime = [System.DateTime]::Now

$endTime = $startTime.AddMinutes($minutes)

Add-PSSnapin "Microsoft.EnterpriseManagement.OperationsManager.Client" -ErrorVariable errSnapin;

New-ManagementGroupConnection -ConnectionString:$rootMS

set-location "OperationsManagerMonitoring::";

$agent = Get-Agent | ?{$_.computername -match $agentName}

$agent

New-MaintenanceWindow -StartTime $startTime -EndTime $endTime -MonitoringObject $agent.HostComputer -Reason PlannedOther -Comment $Comment

Wednesday, October 5, 2016

Put Scom Agents and Cluster Servers in Maintenance mode and reboot

Here is a script to put your scom agents including clusters in maintenance mode and rebooting them at a specified time. First you will have to create a task and a management pack in SCOM for rebooting the servers. Link on how to create the management pack is here. https://technet.microsoft.com/en-us/library/hh563486%28v=sc.12%29.aspx?f=255&MSPPError=-2147217396.
Works on System Center Operations Manager 2012 R2.

Create the input files and output files on the locations that you prefer. I have created them in C:\Reboots folder. Add your server names in the input file. Then the script given below can be put in a scheduled task and run at specified times or on demand. You can also run the script manually. Make sure that you modify the maintenance window.

I have not yet added any code to verify and alert if the rebooted servers are not back up.

Will probably do so at a later time. Till then feel free to use this one and modify any way that you like.

#################Script Start ####################################

$RootMS = "RMSName"
$Minutes =  90
$Comment = "Unknown"
#Setting up SCOM connection

Import-Module OperationsManager
$null = New-ScomManagementGroupConnection -ComputerName $RootMS
Add-PSSnapin "Microsoft.EnterpriseManagement.OperationsManager.Client" -ErrorAction SilentlyContinue

$eventLog = New-Object System.Diagnostics.EventLog("Operations Manager")
$eventLog.Source = "Maintenance Mode"

$Servers = GC "C:\Reboots\input.txt"
Function Out($output)
{
Out-File -Filepath "C:\Reboots\output.txt" -InputObject $output -Append
}

Out "###############  Starting Script at $(Get-Date)  ##########################"
foreach($Server in $Servers)
{

$output = "Starting for $Server on $(Get-date)"
Out $output

###Putting the agent and cluster in maintenance mode###########
$agent = Get-ScomAgent | Where-Object { $_.DisplayName –eq $Server -or $_.ComputerName -eq $Server -or $_.PrincipalName -eq $Server }
if(!$agent) { Write-Host "ERROR: $Server is not a monitored system in SCOM." -ForeGroundColor Red; Set-Location $originalPath; return }
$Server = $agent.PrincipalName
$startTime = (Get-Date).ToUniversalTime()
$endTime = $startTime.AddMinutes($Minutes)
if(($clusters = $agent.GetRemotelyManagedComputers())) {
$clusterNodeClass = Get-SCOMClass -Name Microsoft.Windows.Cluster.Node
foreach($cluster in $clusters) {
#$clusterObj = Get-SCOMClass -Name Microsoft.Windows.Cluster | Get-ScomMonitoringObject -Criteria "Name='$($cluster.ComputerName)'"
$clusterobj = Get-SCOMClass -Name Microsoft.Windows.Cluster | Get-SCOMClassInstance | ?{$_.displayname -eq $cluster.ComputerName}
if($clusterObj) {
$clusterObj.ScheduleMaintenanceMode($startTime,$endTime,"PlannedOther",$Comment,"Recursive")
$nodes = $clusterObj.GetRelatedMonitoringObjects($clusterNodeClass)
if($nodes) {
foreach($node in $nodes) {
Out "Putting $node into maintenance mode."
$eventLog.MachineName = $node.Name
$eventLog.WriteEntry("The server entered into maintenance mode $(if($Server -notcontains $node.Name){"on behalf of $Server"}).`r`n`r`nDuration:`t$Minutes minutes`r`nReason:`t$Comment","Information",42)
}
}
}
Out "Putting $($cluster.Computer) into maintenance mode."
New-MaintenanceWindow -StartTime $startTime -EndTime $endTime -MonitoringObject $cluster.Computer -Reason PlannedOther -Comment $Comment
}
}
else {
Out "Putting $Server into maintenance mode."
$eventLog.WriteEntry("The server entered into maintenance mode.`r`n`r`nDuration:`t$Minutes minutes`r`nReason:`t$Comment","Information",42)
New-MaintenanceWindow -StartTime $startTime -EndTime $endTime -MonitoringObject $agent.HostComputer -Reason PlannedOther -Comment $Comment
}
#####Sending commands to the server###########

$Task = Get-SCOMTask -DisplayName "Reboot Computer"
$Overrides = @{Arguments = '"$Target/Property[Type="MicrosoftWindowsLibrary7585010!Microsoft.Windows.Computer"]/PrincipalName$" "true"'}
$Instance = Get-SCOMClassInstance | ?{$_.Name -eq $agent.name}
Out "Starting task to reboot $($Instance.Displayname) at $(Get-Date)"
Start-SCOMTask -Task $Task -Override $Overrides -Instance $Instance

}

Out "###############  Script Complete at $(Get-Date) #########################"

#################Script End ####################################


I don't take credit for all the scripts that I written. Whenever I can credit the original creators, I do.  If I don't at times please do inform me so that I can include them in the page.