MemcachedConnectError
|
error
critical
|
Application
|
Message Text:
${HOSTNAME}: Memcached server is in error
OR
Memcached server is in error : <with exception>
Description: Generated if attempting to connect to or write to the memcached server causes an exception.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Memcached server is operational
Description: Generated when a connection or write to the memcached server succeeds.
|
ApplicationStartError
|
alert
|
Application
|
Message Text: ${HOSTNAME}: Feature %s is unable to start. Error - %s
Description: Generated if an installed feature cannot start.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Feature %s is Running
Description: Generated if an installed feature successfully started.
|
License Usage Threshold Exceeded
|
critical, error,
notice, warning
(Configurable)
|
Application
|
Message Text: ${HOSTNAME}: Session Count License Usage at: xxx%, threshold is:xxx%
Description: The number of sessions on the system has exceeded the configured threshold of sessions allowed by the current license.
The threshold value and severity of this alarm are configurable in Policy Builder: click Fault List in the navigation
pane, then create a new fault list or edit the existing fault list. By default, the threshold is set to 90%.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Session Count License Usage at: xxx%, threshold is:xxx%
Description: The number of sessions on the system is below the configured threshold of sessions allowed by the current license.
|
LicensedSessionCreation
|
critical
|
Application
|
Message Text: ${HOSTNAME}: Session creation is not allowed
Description: A predefined threshold of sessions covered by licensing has been passed. This is a warning and should be reported. License
limits may need to be increased soon. This message can be generated by an invalid license, but the AdditionalInfo portion
of the notification shows root cause.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Session creation is allowed
Description: The number of sessions is below the predefined threshold of sessions covered by licensing.
|
InvalidLicense
|
emergency
|
Application
|
Message Text: ${HOSTNAME}: xxx license has not been verified yet
Description: The system license currently installed is not valid. This prevents system operation until resolved. This is possible if no
license is installed or if the current license does not designate values. This may also occur if the MAC address of any VM
changes.
|
emergency
|
Application
|
Message Text: ${HOSTNAME}: xxx license is Invalid. %s
Description: License is invalid. For example, if RADIUS feature is installed and the license for the same is not installed, then this
alarm is generated.
Note
|
RADIUS-based policy control is no longer supported in CPS 14.0.0 and later releases as 3GPP Gx Diameter interface has become
the industry-standard policy control interface.
|
|
critical
|
Application
|
Message Text: ${HOSTNAME}: xxx license is Expired. %s
Description: License has expired.
|
error
|
Application
|
Message Text: ${HOSTNAME}: xxx license will Expire Soon. %s
Description: License is going to expire soon.
|
critical
|
Application
|
Message Text: ${HOSTNAME}: xxx license has exceeded the allowed parameters. %s
Description: License has exceeded the allowed parameters.
|
error
|
Application
|
Message Text: ${HOSTNAME}: xxx license is nearing the allowed parameters. %s
Description: License is nearing the allowed parameters.
Note
|
RADIUS-based policy control is no longer supported in CPS 14.0.0 and later releases as 3GPP Gx Diameter interface has become
the industry-standard policy control interface.
|
|
clear
|
Application
|
Message Text: ${HOSTNAME}: license is Valid
Description: License is valid.
|
PolicyConfiguration
|
error
|
Application
|
Message Text: ${HOSTNAME}: Last policy configuration failed with the following message: xxx
Description: A change to system policy structure has failed. The AdditionalInfo portion of the notification contains more information.
The system typically remains in a proper state and continues core operations. Either make note of this message or investigate
more fully.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Last policy configuration was successful
Description: A change to system policy structure has passed.
|
PoliciesNotConfigured
|
emergency
|
Application
|
Message Text: ${HOSTNAME}: 1001:Policies not configured
Description: The policy engine cannot find any policies to apply while starting up. This may occur on a new system, but requires immediate
resolution for any system services to operate.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 1001:Policies successfully configured
Description: The policy engine has successfully configured all the policies while starting up.
|
DiameterPeerDown
|
error
|
Application
|
Message Text:
${HOSTNAME}: 3001:Host: %s Realm: %s is down
OR
${HOSTNAME}: 3001:Host: %s Realm: %s PeerIP: %s is down
OR
${HOSTNAME}: 3001:Host: %s Realm: %s PeerIP: %s Interface: %s is down
Description: Diameter peer is down.
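The three message variants above differ only in the optional PeerIP and Interface fields, so a single pattern can cover them. The following is a minimal sketch of such a parser, not part of CPS; the function name, the field names in the returned dictionary, and the assumption that host, realm, peer IP, and interface values contain no whitespace are all illustrative.

```python
import re
from typing import Optional

# Pattern for the three documented DiameterPeerDown variants.
# Assumption: every substituted value (%s) is free of whitespace.
PEER_DOWN_RE = re.compile(
    r"^(?P<hostname>\S+): 3001:Host: (?P<host>\S+) Realm: (?P<realm>\S+)"
    r"(?: PeerIP: (?P<peer_ip>\S+))?"
    r"(?: Interface: (?P<interface>\S+))?"
    r" is down$"
)


def parse_peer_down(message: str) -> Optional[dict]:
    """Return the fields of a DiameterPeerDown message, or None if it does not match."""
    match = PEER_DOWN_RE.match(message)
    return match.groupdict() if match else None
```

Fields that are absent from a given variant come back as None, so a caller can tell which of the three forms was received.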
|
clear
|
Application
|
Message Text:
${HOSTNAME}: 3001:Host: %s Realm: %s is back up
OR
${HOSTNAME}: 3001:Host: %s Realm: %s PeerIP: %s is back up
OR
${HOSTNAME}: 3001:Host: %s Realm: %s PeerIP: %s Interface: %s is back up
Description: Diameter peer is up.
|
DiameterAllPeersDown
|
critical
|
Application
|
Message Text: ${HOSTNAME}: 3002:Realm: %s:applicationId: %s:all peers are down
Description: All Diameter peer connections configured in a given realm are DOWN (i.e. connection lost). The alarm identifies which realm
is down. The alarm is cleared when at least one of the peers in that realm is available.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 3002:Realm: %s:applicationId: %s:peers are up
Description: The Diameter peer connections configured in a given realm are up.
|
DiameterStackNotStarted
|
critical
|
Application
|
Message Text: ${HOSTNAME}: 3004:Error starting diameter stack: <stack uri>. Reason: <error message>
Description: This alarm is generated when the Diameter stack cannot start on a particular policy director (load balancer) because of
configuration issues.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 3004:Stack <stack uri> is running
Description: The Diameter stack has started successfully.
|
All DB Member of replica set Down
|
critical
|
Application
|
Message Text: "${HOSTNAME}: All DB members of replica set ${SET_NAME}-SET$Loop are down"
Description: Not able to connect to any member of the replica set.
|
All DB Member of replica set Up
|
clear
|
Application
|
Message Text: "${HOSTNAME}: All DB members of replica set ${SET_NAME}-SET$Loop are up"
Description: Able to connect to all members of the replica set.
|
No Primary DB Member Found
|
critical
|
Application
|
Message Text: "${HOSTNAME}: Unable to find primary member for Replica-set ${SET_NAME}-SET$Loop"
Description: Unable to find primary member for the replica-set.
|
Primary DB Member Found
|
clear
|
Application
|
Message Text: "${HOSTNAME}: Found primary member $member for Replica-set ${SET_NAME}-SET$Loop"
Description: Found primary member for the replica-set.
|
DB Member Down
|
critical
|
Application
|
Message Text:
"${HOSTNAME}: DB_Member $member of SET $SET is down"
OR
"${HOSTNAME}: DB_Member $member_ip:$mem_port ($mem_hostname) of SET $SET is down"
Description: A secondary member of the replica set is down.
|
DB Member Up
|
clear
|
Application
|
Message Text:
"${HOSTNAME}: DB_Member $member of SET $SET is up"
OR
"${HOSTNAME}: DB_Member $member_ip:$mem_port ($mem_hostname) of SET $SET is up"
Description: A secondary member of the replica set has come back up.
|
Arbiter Down
|
critical
|
Application
|
Message Text:
"${HOSTNAME}: Arbiter $member of SET $SET is down"
OR
"${HOSTNAME}: Arbiter $member_ip:$mem_port ($mem_hostname) of SET $SET is down"
Description: The arbiter member of the replica set is not reachable.
|
Arbiter Up
|
clear
|
Application
|
Message Text:
"${HOSTNAME}: Arbiter $member of SET $SET is up"
OR
"${HOSTNAME}: Arbiter $member_ip:$mem_port ($mem_hostname) of SET $SET is up"
Description: The arbiter member of the replica set is functional.
|
DB Resync is needed
|
critical
|
Application
|
Message Text: "${HOSTNAME}: Resync is needed for secondary member $setRepl:$SET_NAME:$DB_MEMBER, this member is lagging behind by $SLAVE_BEHIND_SECS
seconds from the primary"
Description: The alarm is generated whenever a manual resynchronization of a database is required to recover from a failure.
|
DB Resync is not needed
|
clear
|
Application
|
Message Text:
"${HOSTNAME}: Resync is not needed for member $setRepl:$SET_NAME:$DB_MEMBER"
OR
"${HOSTNAME}: Resync is not needed for secondary member $setRepl:$SET_NAME:$DB_MEMBER"
Description: The alarm is cleared whenever a database changes from the 'Resync is needed' state to the 'Good' state, which indicates that
the database's resynchronization has completed.
|
Config Server Down
|
critical
|
Application
|
Message Text:
"${HOSTNAME}: Config_Server $member of SET $SET is down"
OR
"${HOSTNAME}: Config_Server $member_ip:$mem_port ($mem_hostname) of SET $SET is down"
Description: The configuration server for the replica set is unreachable. Not valid for non-sharded replica sets.
|
Config Server Up
|
clear
|
Application
|
Message Text:
"${HOSTNAME}: Config_Server $member of SET $SET is up"
OR
"${HOSTNAME}: Config_Server $member_ip:$mem_port ($mem_hostname) of SET $SET is up"
Description: The configuration server for the replica set is reachable. Not valid for non-sharded replica sets.
|
VM Down
|
critical
|
Application
|
Message Text: "${HOSTNAME}: unable to connect $member_ip ($member) VM. It is not reachable"
Description: The administrator is not able to ping the VM.
|
VM Up
|
clear
|
Application
|
Message Text: "${HOSTNAME}: Connected $member_ip ($member) VM. It is reachable"
Description: The administrator is able to ping the VM.
|
QNS Process Down
|
critical
|
Application
|
Message Text: "${HOSTNAME}: $server (<qns instance id>) server on $VM_HOSTNAME vm is down"
Description: The Policy Server (qns-<instance_id>) Java process on a particular QNS instance is down.
|
QNS Process Up
|
clear
|
Application
|
Message Text: "${HOSTNAME}: $server (<qns instance id>) server on $VM_HOSTNAME vm is up"
Description: The Policy Server (qns-<instance_id>) Java process on a particular QNS instance is up.
|
DeveloperMode
|
error
|
Application
|
Message Text: ${HOSTNAME}: Using Developer mode(100 session limit).To use a license file, remove -Dcom.broadhop.developer.mode from /etc/broadhop/qns.conf
Description: The alarm is generated if developer mode is configured in qns.conf file.
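A quick way to check for this condition ahead of time is to scan the configuration text for the flag named in the message. The sketch below is illustrative only: the function name is hypothetical, it assumes lines starting with `#` are comments, and it treats any uncommented occurrence of the flag as "configured" without inspecting its value.

```python
def developer_mode_configured(qns_conf_text: str) -> bool:
    """Return True if an uncommented line mentions -Dcom.broadhop.developer.mode."""
    for line in qns_conf_text.splitlines():
        stripped = line.strip()
        # Assumption: '#' starts a comment line in qns.conf.
        if stripped.startswith("#"):
            continue
        if "-Dcom.broadhop.developer.mode" in stripped:
            return True
    return False
```

In practice the text would be read from /etc/broadhop/qns.conf, the path given in the message text.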
|
clear
|
Application
|
Message Text: ${HOSTNAME}: -Dcom.broadhop.developer.mode is disabled
Description: The alarm is cleared if developer mode is removed in qns.conf file.
|
ZeroMQConnectionError
|
error
|
Application
|
Message Text: ${HOSTNAME}: ZMQ Connection Down for %s
Description: Internal services cannot connect to a required Java ZeroMQ queue. Although retry logic and recovery is available, and core
system functions should continue, investigate and remedy the root cause.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: ZMQ Connection Up for %s
Description: Internal services can connect to a required Java ZeroMQ queue.
|
VirtualInterface Down
|
alert
|
Application
|
Message Text: "${HOSTNAME}: unable to connect ${member}. Not reachable"
Description: Not able to ping the virtual interface. This alarm is generated for external VIPs. For example, lbvip01.
|
VirtualInterface Up
|
clear
|
Application
|
Message Text: "${HOSTNAME}: ${member} is up"
Description: Able to ping the virtual interface. This alarm is cleared for external VIPs. For example, lbvip01.
|
VirtualInterfaceDown
|
alert
|
Application
|
Message Text: "unable to connect ${member}. Not reachable"
Description: Not able to ping the internal VIPs.
|
VirtualInterfaceUp
|
clear
|
Application
|
Message Text: "${member} is up"
Description: Able to ping internal VIPs.
|
Site Down
|
alert
|
Application
|
Message Text: "${HOSTNAME}: Site $site is down"
Description: Site is down. This alarm is related to GR deployments.
|
Site Up
|
clear
|
Application
|
Message Text: "${HOSTNAME}: Site $site is up"
Description: Site is Up. This alarm is related to GR deployments.
|
LDAPAllPeersDown
|
error
|
Application
|
Message Text: ${HOSTNAME}: 1201:<LocalHostname>:LDAP connection down
Description: All LDAP peers are down.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 1201:<LocalHostname>:LDAP connection up
Description: LDAP connection is up.
|
LDAPPeerDown
|
error
|
Application
|
Message Text: ${HOSTNAME}: 1202:<IP Address of the LDAP server>:LDAP connection down
Description: LDAP peer identified by the IP address is down.
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 1202:<IP Address of the LDAP server>:LDAP connection up
Description: LDAP peer identified by the IP address is up.
|
Percentage of LDAP retry threshold Exceeded
|
critical
|
Application
|
Message Text: ${HOSTNAME}: Percentage of LDAP retries compared to total LDAP Queries exceeded to $CURRENT_LEVEL% on $HOST VM
Description: This alarm is generated for LDAP search queries when LDAP retries compared to total LDAP queries exceeds the threshold value
on qnsXX VM.
Default Threshold: 10%
For threshold parameter configuration, refer to:
Note
|
The LDAP server Retry Count parameter must be set to a value greater than 1 for this alarm to be generated. In Policy Builder
navigate to Plugin Configuration > LDAP Configuration > LDAP Server Configuration > Retry Count.
|
|
Percentage of LDAP retry threshold Normal
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Percentage of LDAP retries compared to total LDAP Queries normal to $CURRENT_LEVEL% on $HOST VM
Description: This alarm is cleared for LDAP search queries when LDAP retries compared to total LDAP queries is normal or has fallen below
the threshold value on qnsXX VM.
|
LDAP Requests as percentage of CCR-I Dropped
|
critical
|
Application
|
Message Text: ${HOSTNAME}: LDAP Requests as percentage of CCR-I dropped to $CURRENT_LEVEL% on $HOST VM
Description: This alarm is generated for LDAP operations when LDAP requests as percentage of CCR-I (Gx messages) drops below threshold
value on qnsXX VM.
Default Threshold: 25%
For threshold parameter configuration, refer to:
|
LDAP Requests as percentage of CCR-I Normal
|
clear
|
Application
|
Message Text: ${HOSTNAME}: LDAP Requests as percentage of CCR-I normal to $CURRENT_LEVEL% on $HOST VM
Description: This alarm is cleared for LDAP operations when LDAP requests as a percentage of CCR-I messages is normal or above the threshold
value on qnsXX VM.
|
LDAP Requests Dropped
|
critical
|
Application
|
Message Text: ${HOSTNAME}: LDAP Requests dropped to $CURRENT_LEVEL on $HOST VM
Description: This alarm is generated for LDAP operations when LDAP requests drop below threshold value on lbXX VM.
Default Threshold: 0
For threshold parameter configuration, refer to:
|
LDAP Requests Normal
|
clear
|
Application
|
Message Text: ${HOSTNAME}: LDAP Requests normal to $CURRENT_LEVEL on $HOST VM
Description: This alarm is cleared when LDAP requests are normal on lbXX VM for LDAP operations.
|
LDAP Query Result Dropped
|
critical
|
Application
|
Message Text: ${HOSTNAME}: LDAP Query Result dropped to $CURRENT_LEVEL on $HOST VM
Description: This alarm is generated when LDAP result is less than or equal to the threshold value on qnsXX VM.
Default Threshold: 0
For threshold parameter configuration, refer to:
|
LDAP Query Result Normal
|
clear
|
Application
|
Message Text: ${HOSTNAME}: LDAP Query Result normal to $CURRENT_LEVEL on $HOST VM
Description: This alarm is cleared when LDAP Query Result goes above the threshold value on qnsXX VM.
|
Gx Message processing Dropped
|
critical
|
Application
|
Message Text: ${HOSTNAME}: Gx Message $MSG_TYPE dropped to $CURRENT_LEVEL% on $HOST_VM VM
Description: This alarm is generated for Gx Message CCR-I, CCR-U and CCR-T when processing of messages drops below 95% on qnsXX VM.
The 95% refers to the percentage of responses to the requests within a 60 second period of time.
For example, in 60 sec if you receive 100 requests and send 95 responses then your percentage would be 95%.
Default threshold: 95%
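The arithmetic in the example above can be sketched as follows. The function names and the choice to treat a period with no requests as 100% are assumptions for illustration; only the 95% default and the responses-to-requests ratio come from the description.

```python
def gx_processing_percent(requests: int, responses: int) -> float:
    """Percentage of requests answered within one measurement period."""
    if requests == 0:
        # Assumption: a period with no traffic is treated as fully processed.
        return 100.0
    return 100.0 * responses / requests


def gx_alarm_raised(requests: int, responses: int,
                    threshold_percent: float = 95.0) -> bool:
    """True when processing has dropped below the threshold (default 95%)."""
    return gx_processing_percent(requests, responses) < threshold_percent
```

With 100 requests and 95 responses the result is exactly 95%, which is not below the threshold, so no alarm is raised; 94 responses would raise it.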
|
Gx Message processing Normal
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Gx Message $MSG_TYPE normal to $CURRENT_LEVEL% on $HOST_VM VM
Description: This alarm is cleared when the processing of messages is equal to or above 95% on qnsXX VM for Gx Message CCR-I, CCR-U and CCR-T.
|
Gx Average Message processing Dropped
|
critical
|
Application
|
Message Text: ${HOSTNAME}: Gx average Message $MSG_TYPE processing increased to ${CURRENT_LEVEL}ms on $HOST_VM VM
Description: This alarm is generated for Gx Message CCR-I/CCR-U/CCR-T when average message processing exceeds the threshold value on qnsXX
VM.
Default Threshold: 20 ms
For threshold parameter configuration, refer to:
|
Gx Average Message processing Normal
|
clear
|
Application
|
Message Text: ${HOSTNAME}: Gx average Message $MSG_TYPE processing normal to ${CURRENT_LEVEL}ms on $HOST_VM VM
Description: This alarm is cleared when average message processing is equal to or below the threshold value on qnsXX VM for Gx Message CCR-I/CCR-U/CCR-T.
|
All SMSC server
connections are down
|
critical
|
Application
|
Message Text: ${HOSTNAME}: 5002:<VMName>:All SMSC servers not reachable
Description: None of the SMSC servers configured are reachable. This Critical Alarm is generated when the SMSC Server endpoints are not
available to submit SMS messages thereby blocking SMS from being sent from CPS.
|
Atleast one SMSC
server connection is up
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 5002:<VMName>:Atleast one SMSC server is reachable
Description: This alarm is cleared when at least one configured SMSC endpoint server is reachable after a state where none were reachable
from the configured list of server endpoints.
|
SMSC server
connection down
|
error
|
Application
|
Message Text: ${HOSTNAME}: 5001:<SMSCServer Address>:<SMSC Port>:SMSC Server not reachable
Description: SMSC Server is not reachable. This alarm is generated when any one of the configured active SMSC server endpoints is not
reachable and CPS will not be able to deliver an SMS via that SMSC server.
|
SMSC server
connection up
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 5001:<SMSCServer Address>:<SMSC Port>:SMSC server reachable
Description: This alarm is cleared when an earlier unreachable SMSC endpoint is now reachable.
|
All Email servers
not reachable
|
critical
|
Application
|
Message Text: ${HOSTNAME}: 5004:<VMName>:All Email Servers not reachable
Description: No email server is reachable. This alarm (Critical) is generated when all configured Email Server Endpoints are not reachable,
blocking e-mails from being sent from CPS.
|
At least one Email
server is reachable
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 5004:<VMName>:At least one Email server is reachable
Description: At least one email server is reachable.
|
Email server is
not reachable
|
error
|
Application
|
Message Text: ${HOSTNAME}: 5003:<Mail Server Address>:<SMTP Port>Email Server not reachable
Description: Email server is not reachable. This alarm is generated when any of the configured Email Server Endpoints are not reachable.
CPS is not able to use the server to send e-mails.
|
Email server is
reachable
|
clear
|
Application
|
Message Text: ${HOSTNAME}: 5003:<Mail Server Address>:<SMTP Port>Email Server reachable
Description: Email server is reachable. This alarm is cleared when an earlier unreachable Email server endpoint is now reachable.
|
Binding Not Available
at Policy DRA
|
Critical, Error,
Notice, Warning
|
Application
|
Message Text: Binding DB not accessible or Binding Db not reachable at Policy DRA
Description: This alarm is generated when IPv6 binding for sessions is not found at Policy DRA. Only one notification is sent out whenever
this condition is detected.
This is a configurable notification. You can configure whether to send or not to send the notification. For more information,
refer to PolicyDRA Health Check under Diameter Configuration in CPS Mobile Configuration Guide.
|
clear
|
Application
|
Message Text: Binding DB Available at Policy DRA or Binding Db reachable at Policy DRA
Description: The alarm is cleared after the duration of Alarm Clearance Interval (configured under in Policy Builder) when the above alarm was generated.
|
SPR_DB_ALARM
|
error
|
Application
|
Message Text: 6101:Remote SPR DB:Error adding remote spr db
Description: This alarm indicates there is an issue in establishing connection to the Remote SPR Databases configured under during CPS policy server (qns) process initialization.
OR
Message Text: 6101:Remote SPR DB:Primary member is down
Description: The alarm is generated whenever Policy Server (QNS) node cannot connect to primary member of SPR replica set.
|
clear
|
Application
|
Message Text: 6101:Remote SPR DB: Cleared alarm Error adding remote spr db
Description: The issue of establishing connection to the Remote SPR database has been resolved.
Message Text: 6101:Remote SPR DB:Cleared alarm for remote spr db primary
Description: The alarms are cleared after starting Policy Server (qns) services.
|
DiameterQnsWarmupError
|
error
|
Application
|
Message Text: Diameter QNS warmup didn't start since QNS node num/SITE_ID not parsed. QNS will accept messages but call-loss expected.
Description: The alarm is raised when the warmup feature is enabled (qns.node.warmup set to true in qns.conf file) and there is a problem retrieving the QNS node number or site ID. Make sure qns.node.warmup.hostname.substring and
GeoSiteName (for a GR setup) are configured correctly in the qns.conf file.
Message Text: Diameter QNS warmup did not start due to exception. QNS will accept the messages but the call loss is expected.
Description: The alarm is generated when the warmup feature is enabled and there is an exception while parsing the warmup dictionaries
or scenario file.
|
clear
|
Application
|
Message Text: Diameter QNS warmup alarms are cleared.
Description: When warmup feature is enabled, the alarms are cleared when restarting the qns nodes.
|
SPRNodeNotAvailable
|
Error
|
Application
|
Message Text: SPR Node not available
Description: This alarm is generated when all the members of SPR replica-set configured under are down and a master node is not available for that given replica-set.
|
clear
|
Application
|
Message Text: SPR node is available
Description: The alarm is cleared when at least one SPR replica-set member becomes available.
|
GC State
|
error
|
Application
|
Message Text: {hostname}: Full GC event occurred <GC_ALARM_TRIGGER_COUNT> times on <qns_instance>(<pid>) process in last <GC_ALARM_TRIGGER_INTERVAL>
seconds interval
Description: This alarm is generated when garbage collection on the qns Java process occurs three or more times (configurable) within a
10-minute interval (configurable).
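The trigger condition amounts to counting events inside a sliding time window. The sketch below shows one way to express that check; it is not the CPS implementation. The class name is hypothetical, the defaults mirror the documented values (three events, 10 minutes), and timestamps are assumed to be seconds supplied in non-decreasing order.

```python
from collections import deque


class FullGcWindow:
    """Track Full GC timestamps and report when the trigger count is reached."""

    def __init__(self, trigger_count: int = 3, interval_seconds: float = 600.0):
        self.trigger_count = trigger_count        # cf. GC_ALARM_TRIGGER_COUNT
        self.interval_seconds = interval_seconds  # cf. GC_ALARM_TRIGGER_INTERVAL
        self._events = deque()

    def record(self, timestamp: float) -> bool:
        """Record one Full GC event; return True if the alarm condition is met."""
        self._events.append(timestamp)
        # Discard events older than the window, measured from the newest event.
        while self._events and timestamp - self._events[0] > self.interval_seconds:
            self._events.popleft()
        return len(self._events) >= self.trigger_count
```

Three events at 0 s, 200 s, and 400 s meet the condition on the third call, while events at 0 s, 500 s, and 1000 s never do, because no 10-minute window contains three of them.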
|
clear
|
Application
|
Message Text: {hostname}: No Full GC event occurred in <GC_CLEAR_TRIGGER_INTERVAL> seconds on <qns_instance>(<pid>) process
Description: This alarm is cleared when Garbage collection does not occur for GC_CLEAR_TRIGGER_INTERVAL seconds (15 mins).
|
OldGen State
|
error
|
Application
|
Message Text: {hostname}: Oldgen% is more than <OLD_GEN_ALARM_TRIGGER_THR> for <OLD_GEN_ALARM_TRIGGER_CONT_GC_COUNT> continuous Full GC
event occurred on <qns_instance>(<pid>) process in last <GC_ALARM_TRIGGER_INTERVAL> seconds interval
Description: This alarm is generated if Oldgen% is more than configured threshold (OLD_GEN_ALARM_TRIGGER_THR) for more than 2 (OLD_GEN_ALARM_TRIGGER_CONT_GC_COUNT)
GC.
|
clear
|
Application
|
Message Text: {hostname}: Oldgen%(<oldgen_per>) is less than <OLD_GEN_CLEAR_TRIGGER_THR> for last Full GC event occurred on <qns_instance>(<pid>)
process
Description: This alarm is cleared when Oldgen% is less than configured threshold (OLD_GEN_CLEAR_TRIGGER_THR) after last GC event.
|
SessionLimitOverload
ProtectionNotSet
|
warning
|
Application
|
Message Text: Session Limit Overload protection cannot be zero or negative. Change to recommended value in Policy Builder before DB crashes
Description: If configured to 0 (default), CPS can handle an unlimited number of sessions, but this can affect the database and can lead to
an application crash.
Warning
|
You must change the value as per your requirements.
|
|
clear
|
Application
|
Message Text: Session Limit Overload protection value set to recommended value in Policy Builder
Description: The alarm is cleared when the recommended value is set and published.
|
SessionLimitOverload
ProtectionExceeded
|
critical
|
Application
|
Message Text: Current Session count exceeded Session Limit Overload Protection. Session creation not allowed to avoid DB crashes
Description: The alarm is generated when the current session count of the system exceeds the value configured for Session Limit Overload
protection.
|
clear
|
Application
|
Message Text: Current Session count is less than Session Limit Overload protection
Description: The alarm is cleared within 30 seconds when the current session count of the system is less than the value configured for
Session Limit Overload protection.
|
SESSION_SHARD_
UNREACHABLE
|
Error
|
Application
|
Message Text: 6501: Session DB: Shards are not reachable
Description: This alarm is generated when a session manager VM other than primary member is unreachable.
Important
|
This alarm is generated only when -DskipUnreachableShards and -DskipDbOperOnUnreachableShards parameters are set to true in qns.conf file.
For more information on qns.conf file parameters, contact your Cisco Account representative.
|
|
clear
|
Application
|
Message Text: 6501: Session DB: Shards are reachable
Description: This alarm is cleared when a secondary session manager VM becomes reachable.
|
ADMIN_DB_MISSING_
SHARD_ENTRIES
|
Critical
|
Application
|
Message Text: 6502: Admin DB: Missing shard entires in (SK/Session) db
Description: This alarm is generated when there are no shards present in the ADMIN replica-skip set > sharding database > shards/sk_shards.
Important
|
This alarm is generated only when -DskipUnreachableShards and -DskipDbOperOnUnreachableShards parameters are set to true in qns.conf file.
For more information on qns.conf file parameters, contact your Cisco Account representative.
|
|
clear
|
Application
|
Message Text: 6502:Admin DB: At least one shard entry exists in (SK/Session) db
Description: This alarm is cleared when shards are present in sharding database, shards/sk_shards collections.
|
MISSING_SESSION_
INDEXES
|
Error
|
Application
|
Message Text: 6503: Session DB: Required Indexes missing on Session Collection <collectionName>
where, <collectionName> can be any one of the collection name in session database.
Description: This alarm is generated when the session database/session collection does not have the required indexes for the normal functioning
of the application.
Important
|
This alarm is generated only when -DskipUnreachableShards and -DskipDbOperOnUnreachableShards parameters are set to true in qns.conf file.
For more information on qns.conf file parameters, contact your Cisco Account representative.
|
|
clear
|
Application
|
Message Text: 6503: Session DB: Required Indexes created on Session Collection <collectionName>
where, <collectionName> can be any one of the collection name in session database.
Description: The alarm is cleared when the session database/session collection has the required indexes for the normal functioning of
the application.
|
MISSING_SPR_
INDEXES
|
Error
|
Application
|
Message Text: 6504: SPR DB: Required Indexes missing on SPR Collection <collectionName>
where, <collectionName> can be any one of the mongo collection names in SPR database.
Description: This alarm is generated when the SPR database/subscriber collections do not have the required indexes for the normal functioning
of the application.
Important
|
This alarm is generated only when -DskipUnreachableShards and -DskipDbOperOnUnreachableShards parameters are set to true in qns.conf file.
For more information on qns.conf file parameters, contact your Cisco Account representative.
|
|
Clear
|
Application
|
Message Text: 6504: SPR DB: Required Indexes created on SPR Collection <collectionName>
where, <collectionName> can be any one of the mongo collection names in SPR database.
Description: This alarm is cleared when the SPR database/subscriber collections have the required indexes for the normal functioning of
the application.
|
Database Operation
|
Critical
|
Application
|
Message Text: <QNS_VM_HOSTNAME> is not able to connect MongoPrimaryDB_<set_name>
Description: This alarm is generated when the Policy Server (QNS) VM is not able to connect to primary MongoDB replica-set member.
Note
|
This alarm is generated only when autoheal_qns_enabled parameter is set to TRUE in Configuration.csv for VMware environment and YAML file for OpenStack Environment.
For more information, refer to CPS Installation Guide for VMware and CPS Installation Guide for OpenStack.
|
|
Clear
|
Application
|
Message Text: <QNS_VM_HOSTNAME> is able to connect MongoPrimaryDB_<set_name>
Description: This alarm is cleared when the Policy Server (QNS) VM is able to connect to primary MongoDB replica-set member.
|
SVN is not in sync
|
Critical
|
Application
|
Message Text: SVN is not in sync since pcrfclient01 revision value <Revision values 1> is not equal to pcrfclient02 revision value <Revision
values 2>
Description: This alarm is generated when SVN is not in sync between pcrfclient VMs.
|
SVN is in sync
|
Clear
|
Application
|
Message Text: SVN is in sync with pcrfclient01 revision value <Revision values 1> is equal to pcrfclient02 revision value <Revision values
2>
Description: This alarm is cleared when SVN is in sync between pcrfclient VMs.
|
MongoPrimaryDB fragmentation exceeded the threshold value
|
Warning
|
Application
|
Message Text: MongoPrimaryDB fragmentation exceeded the threshold value, CURRENT_FRAG = 53%, THRESHOLD = 40% at <hostName>:<port> for <dbName>
of <replicaSetName>
Description: The alarm is generated when the fragmentation percentage exceeds the threshold value. If no threshold value is configured, the default value is used.
|
PrimaryDB fragmentation percent conforms to threshold
|
Clear
|
Application
|
Message Text: MongoPrimaryDB fragmentation conforms to the threshold value, CURRENT_FRAG = 10%, THRESHOLD = 40% at <hostName>:<port> for
<dbName> of <replicaSetName>
Description: The alarm is cleared when the fragmentation percentage is less than the threshold value. If no threshold value is configured, the default value is used.
|
Realtime Notification server is not reachable
|
error
|
Application
|
Message Text: Realtime Notification server <VMName> is not accessible
Description: This alarm is generated when the configured realtime notification server is not reachable, which blocks realtime notifications
from being sent from CPS.
|
Realtime notification server is reachable
|
clear
|
Application
|
Message Text: Realtime Notification server <VMName> is accessible now
Description: Realtime server is reachable. This alarm is cleared when an earlier unreachable Realtime server endpoint is now reachable.
|
Stateless Alarms: Alarms that provide information about an event occurring on the system. These alarms do not have any state, so there is
no clear alarm for these notifications.
|
HA Failover
|
info
|
Application
|
Message Text: "${HOSTNAME}: HA Failover done from $previous_member to $PRIMARYNODE of ${SET_NAME}-SET$Loop"
Description: The primary role of the replica set has been failed over to another member.
|
GR Failover
|
info
|
Application
|
Message Text: "${HOSTNAME}: Geo Failover done from $previous_member to $PRIMARYNODE of ${SET_NAME}-SET$Loop"
Description: The primary role of the replica set has been failed over to another member.
|
Admin User Logged in
|
info
|
Application
|
Message Text: "${HOSTNAME}: root user logged in on `hostname` terminal $terminal from machine $from_system at $dt"
Description: root user logged in on %hostname terminal.
|
ProcessRestarted
|
info
|
Application
|
Message Text: $PROCESS process is restarted on $HOSTNAME. Old_PID:$OLD_PID Current_PID:$CURRENT_PID
Description: This is an informational event, so no clear event is generated for it and no clearing procedure is needed.
|