Cisco CMX Alerts

Cisco CMX alerts can have different severity levels. Critical alerts have an immediate impact on Cisco CMX, and you should take the necessary steps to resolve them. Otherwise, you risk losing data. For example, if a controller is down, you cannot retrieve data for any floor or access point that the controller manages.

As a customer, you can resolve only the obvious alerts, such as a controller that is not working. Most other alerts indicate either an undersized Cisco CMX or a critical failure in Cisco CMX. Both cases require intervention from Cisco CMX technical experts. You can use some of the cmxos and cmxctl commands to fix these critical failures, but we recommend that you seek Cisco CMX technical help for troubleshooting.

Cisco CMX Alert

Description

Possible Solution

CPU_USAGE

Displayed when CPU usage exceeds 80% on a Cisco CMX box.

Upgrade to a bigger Cisco CMX box.

MEMORY_USAGE

This alert is displayed when the memory usage is high.

Reduce the load on Cisco CMX. You probably need a larger Cisco CMX box; Cisco support can help you determine that.

SERVICE_STATUS

Displayed when a Cisco CMX service has crashed.

We recommend that you contact Cisco support.

DATA_PROCESSING_STATUS

Displayed when the Analytics service is slowing down.

Reduce the load on Cisco CMX.

NMSP_CONNECTION_STATUS

Displayed when the controller goes down.

Troubleshoot a probable networking issue.

OUT_OF_MEMORY

Not used in Cisco CMX.

NA

QUEUE_FULL

Not used in Cisco CMX.

NA

ARRAY_INDEX_OUT_OF_BOUND

Not used in Cisco CMX

NA

BEACON_STATUS

Not supported

NA

BEACON_MOVEMENT

Not supported

NA

DISK_USAGE

Displayed when the hard drive is getting full.

Run the CMX cleanup tool or remove unnecessary data from the hard drive.

AWIPS_LICENSE

Not used in Cisco CMX

NA

NMSP_MSG_RATE_EXCEEDED

Displayed when the system is getting too many NMSP messages for its box type.

We recommend that you either move to a larger Cisco CMX box or clear unwanted clients by removing a controller or a map.

LOCATION_OVERLOADED

Critical alert that is not expected to happen.

NA

EVAL_LICENSE_EXPIRY

Displayed when the built-in evaluation license expires after 120 days.

We recommend that you buy and activate a new Cisco CMX license.

AP_CONTROLLER_FETCH_STATUS

Displayed if SNMP information from the controller cannot be fetched.

Provide Cisco CMX with valid SNMP credentials.

SSID_CONTROLLER_FETCH_STATUS

Same as AP_CONTROLLER_FETCH_STATUS.

NA

MAP_IMPORT_ERROR

Displayed if maps are not imported successfully during the import process from Cisco Prime Infrastructure.

We recommend that you contact support to re-import maps from Cisco Prime Infrastructure.

ANALYTICS_MISMATCH

Displayed if the Analytics sanity test fails.

We recommend that you contact Cisco support.

HETERARCHY_SIZE_LIMIT_EXCEEDED

Displayed if the number of maps, APs, or zones exceeds the limit for the corresponding Cisco CMX service type.

This might affect Cisco CMX performance. We recommend that you either reduce the number of elements or move them to a larger Cisco CMX box.

mem_usage

Displayed once the memory usage is above 80%. This is a critical error.

Consider upgrading hardware or VM specs.

SERVER_STATUS

Displayed after High Availability is successfully disabled. The primary server is no longer syncing with the secondary server.

This is an informational alert; no action is required.

SERVER_STATUS

Displayed when attempting to fail back from the secondary server to the primary server (for example, 192.168.99.110).

This is an informational alert; no action is required.

UNIQUE_DEVICE_EXCEEDED

Two alerts are generated on Cisco CMX. The first, a warning alert, is generated when the number of unique devices seen in a particular day reaches 90% of the allowed limit for that Cisco CMX. The second, a critical alert, is generated when the number of unique devices seen in that day exceeds the allowed limit for that Cisco CMX.

This alert indicates that Cisco CMX is handling a heavier load than is allowed in a day, which could lead to performance issues on Cisco CMX. One possible solution is to lower the traffic by using filtering parameters such as Disable Probing Clients, or to split the traffic among multiple Cisco CMX instances.
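The two UNIQUE_DEVICE_EXCEEDED thresholds can be illustrated with a short sketch. Only the 90% warning level and the exceeds-the-limit critical level come from the table above. The function name, its parameters, and the limit values used in any call are hypothetical, because the actual daily limit depends on your Cisco CMX and is not stated here.

```python
from typing import Optional

WARNING_PERCENT = 90  # from the table: warning at 90% of the allowed limit


def unique_device_alert_level(devices_seen: int, allowed_limit: int) -> Optional[str]:
    """Classify one day's unique-device count as "critical", "warning", or None.

    Hypothetical helper for illustration only; it is not part of Cisco CMX.
    """
    if allowed_limit <= 0:
        raise ValueError("allowed_limit must be positive")
    # Critical: the count exceeds the allowed limit.
    if devices_seen > allowed_limit:
        return "critical"
    # Warning: the count has reached 90% of the limit (integer math avoids rounding).
    if devices_seen * 100 >= allowed_limit * WARNING_PERCENT:
        return "warning"
    return None
```

With an assumed limit of 1000 devices, a count of 900 would be classified as a warning and a count of 1001 as critical.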

Monit Email

Customer Action

1m Load avg. above 3

No action required.

1m Load avg. recovered

No action required.

5m Load avg. above 3

No action required.

5m Load avg. recovered

No action required.

15m Load avg. above 2

No action required.

15m Load avg. recovered

No action required.

Adminui service is down

Run the cmxos adminui start command.

Agent service is down

Run the cmxctl agent start command.

Analytics service is down

Run the cmxctl analytics start command.

Analytics service recovered

No action required.

cache_6378 service is down

Run the cmxctl cache_6378 start command.

cache_6378 service recovered

No action required.

cache_6379 service is down

Run the cmxctl cache_6379 start command.

cache_6379 service recovered

No action required.

cache_6380 service is down

Run the cmxctl cache_6380 start command.

cache_6380 service recovered

No action required.

cache_6381 service is down

Run the cmxctl cache_6381 start command.

cache_6381 service recovered

No action required.

cache_6382 service is down

Run the cmxctl cache_6382 start command.

cache_6382 service recovered

No action required.

cache_6383 service is down

Run the cmxctl cache_6383 start command.

cache_6383 service recovered

No action required.

cache_6385 service is down

Run the cmxctl cache_6385 start command.

cache_6385 service recovered

No action required.

cassandra service is down

Run the cmxctl cassandra start command.

cassandra service recovered

No action required.

Collectd service is down

No action required.

Collectd service is up

No action required.

Confd service is down

Run the cmxctl confd start command.

Confd service is up

No action required.

configuration service is down

Run the cmxctl configuration start command.

configuration service recovered

No action required.

Consul Service is down

Run the cmxctl consul start command.

Disk usage is above 80%

Remove files. Add storage.

Disk usage recovered

No action required.

DNSMasq service is down

No action required.

File Descriptors are above bounds

No action required.

File Descriptors recovered

No action required.

File system

HAProxy service is down

Run the cmxctl haproxy start command.

HAProxy service is up

No action required.

hyperlocation service is down

Run the cmxctl hyperlocation start command.

hyperlocation service recovered

No action required.

Influxdb service is down

Run the cmxctl influxdb start command.

Influxdb service is up

No action required.

Inode usage is above 80%

Remove files.

Inode usage recovered

No action required.

Load

Suggested actions to lessen the load:

  • Create fewer notifications

  • Run fewer reports

  • Remove some WLCs

  • Upgrade system.

location service is down

Run the cmxctl location start command.

location service recovered

No action required.

matlabengine service is down

Run the cmxctl matlabengine start command.

matlabengine service recovered

No action required.

Memory usage is above 80%

Restart the system during a quiet period. Upgrade system.

Memory usage recovered

No action required.

Monit instance changed

None. Informational.

nmsplb service is down

Run the cmxctl nmsplb start command.

nmsplb service recovered

No action required.

Port 5432 is not responding

Run the cmxctl database stop and cmxctl database start commands.

Port 5432 is responding

No action required.

Port 6378 is not responding

Run the cmxctl cache_6378 stop and cmxctl cache_6378 start commands.

Port 6378 responding

No action required.

Port 6379 is not responding

Run the cmxctl cache_6379 stop and cmxctl cache_6379 start commands.

Port 6379 responding

No action required.

Port 6380 is not responding

Run the cmxctl cache_6380 stop and cmxctl cache_6380 start commands.

Port 6380 responding

No action required.

Port 6381 is not responding

Run the cmxctl cache_6381 stop and cmxctl cache_6381 start commands.

Port 6381 responding

No action required.

Port 6382 is not responding

Run the cmxctl cache_6382 stop and cmxctl cache_6382 start commands.

Port 6382 responding

No action required.

Port 6383 is not responding

Run the cmxctl cache_6383 stop and cmxctl cache_6383 start commands.

Port 6383 responding

No action required.

Port 6385 is not responding

Run the cmxctl cache_6385 stop and cmxctl cache_6385 start commands.

Port 6385 responding

No action required.

Port 6511 is not responding

Run the cmxctl hyperlocation stop and cmxctl hyperlocation start commands.

Port 6512 responding

No action required.

Port 6531 is not responding

Run the cmxctl location stop and cmxctl location start commands.

Port 6531 responding

No action required.

Port 6532 is not responding

Run the cmxctl location stop and cmxctl location start commands.

Port 6532 responding

No action required.

Port 6541 is not responding

Run the cmxctl analytics stop and cmxctl analytics start commands.

Port 6541 responding

No action required.

Port 6542 is not responding

Run the cmxctl analytics stop and cmxctl analytics start commands.

Port 6542 responding

No action required.

Port 6551 is not responding

Run the cmxctl configuration stop and cmxctl configuration start commands.

Port 6551 responding

No action required.

Port 6552 is not responding

Run the cmxctl configuration stop and cmxctl configuration start commands.

Port 6552 responding

No action required.

Port 6571 is not responding

Run the cmxctl nmsplb stop and cmxctl nmsplb start commands.

Port 6571 responding

No action required.

Port 6572 is not responding

Run the cmxctl nmsplb stop and cmxctl nmsplb start commands.

Port 6572 responding

No action required.

Port 6581 is not responding

Run the cmxctl matlabengine stop and cmxctl matlabengine start commands.

Port 6581 is responding

No action required.

Port 6582 is not responding

Run the cmxctl matlabengine stop and cmxctl matlabengine start commands.

Port 6582 is responding

No action required.

Port 9042 is not responding

Run the cmxctl cassandra stop and cmxctl cassandra start commands.

Port 9042 is responding

No action required.

postgres service is down

Run the cmxctl database start command.

postgres service is up

No action required.

qlesspy service is down

Run the cmxctl qlesspy start command.

qlesspy service recovered

No action required.

Socket 5432 is not responding

Run the cmxctl database stop and cmxctl database start commands.

Socket 5432 is responding

No action required.

Swap usage is above 80%

Increase swap space or reduce memory usage.

Swap usage recovered

No action required.

SYS CPU usage is above 60%

No action required.

SYS CPU usage recovered

No action required.

The analytics service is not reporting health

Run the cmxctl analytics stop and cmxctl analytics start commands.

The analytics service reporting health

No action required.

The configuration service is not reporting health

Run the cmxctl configuration stop and cmxctl configuration start commands.

The configuration service reporting health

No action required.

The hyperlocation service is not reporting health

Run the cmxctl hyperlocation stop and cmxctl hyperlocation start commands.

The hyperlocation service reporting health

No action required.

The location service is not reporting health

Run the cmxctl location stop and cmxctl location start commands.

The location service reporting health

No action required.

The matlabengine service is not reporting health

Run the cmxctl matlabengine stop and cmxctl matlabengine start commands.

The matlabengine service reporting health

No action required.

The nmsplb service is not reporting health

Run the cmxctl nmsplb stop and cmxctl nmsplb start commands.

The nmsplb service reporting health

No action required.

USR CPU usage is above 80%

No action required.

USR CPU usage recovered

No action required.

WAIT CPU usage is above 60%

No action required.

WAIT CPU usage recovered

No action required.
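The numeric thresholds named in the Monit email subjects above can be collected into a small lookup, as a rough sketch. The threshold values are copied from the table. The metric names and the function are hypothetical, and the sketch makes no claim about how Monit itself samples or evaluates these values.

```python
# Thresholds copied from the Monit email subjects in the table above.
# The dictionary keys are hypothetical names chosen for this sketch.
MONIT_THRESHOLDS = {
    "load_avg_1m": 3.0,
    "load_avg_5m": 3.0,
    "load_avg_15m": 2.0,
    "disk_usage_pct": 80.0,
    "inode_usage_pct": 80.0,
    "memory_usage_pct": 80.0,
    "swap_usage_pct": 80.0,
    "sys_cpu_pct": 60.0,
    "usr_cpu_pct": 80.0,
    "wait_cpu_pct": 60.0,
}


def exceeded_thresholds(metrics):
    """Return the sorted names of metrics whose value is above its threshold.

    Metric names that do not appear in MONIT_THRESHOLDS are ignored.
    """
    return sorted(
        name
        for name, value in metrics.items()
        if name in MONIT_THRESHOLDS and value > MONIT_THRESHOLDS[name]
    )
```

For example, passing a one-minute load average of 3.4 and a disk usage of 72% would report only the load average as exceeded.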

Memory usage is above 80%

Restart the system during a quiet period.

Upgrade system.

Memory usage recovered

No action required.

Swap usage is above 80%

Increase swap space or reduce memory usage.

File system

Disk usage is above 80%

Remove files.

Add storage.

Disk usage recovered

No action required.

Inode usage is above 80%

Remove files.

Inode usage recovered

No action required.

File Descriptors are above bounds

Restart the system.

File Descriptors recovered

No action required.

location service is down

Run the cmxctl location start command.

location service recovered

No action required.

Port 6531 is not responding

Run the cmxctl location stop and cmxctl location start commands.

Port 6531 responding

No action required.

Port 6532 is not responding

Run the cmxctl location stop and cmxctl location start commands.

Port 6532 responding

No action required.

The location service is not reporting health

Run the cmxctl location stop and cmxctl location start commands.

The location service reporting health

No action required.

matlabengine service is down

Run the cmxctl matlabengine start command.

matlabengine service recovered

No action required.

Port 6581 is not responding

Run the cmxctl matlabengine stop and cmxctl matlabengine start commands.

Port 6581 responding

No action required.

Port 6582 is not responding

Run the cmxctl matlabengine stop and cmxctl matlabengine start commands.

Port 6582 responding

No action required.

The matlabengine service is not reporting health

Run the cmxctl matlabengine stop and cmxctl matlabengine start commands.

The matlabengine service reporting health

No action required.

nmsplb service is down

Run the cmxctl nmsplb start command.

nmsplb service recovered

No action required.

Port 6571 is not responding

Run the cmxctl nmsplb stop and cmxctl nmsplb start commands.

Port 6572 responding

No action required.

The nmsplb service is not reporting health

Run the cmxctl nmsplb stop and cmxctl nmsplb start commands.

The nmsplb service reporting health

No action required.

postgres service is down

Run the cmxctl database start command.

postgres service is up

No action required.

Socket 5432 is not responding

Run the cmxctl database stop and cmxctl database start commands.

Socket 5432 is responding

No action required.

Port 5432 is not responding

Run the cmxctl database stop and cmxctl database start commands.

Port 5432 is responding

No action required.

qlesspy service is down

Run the cmxctl qlesspy start command.

qlesspy service recovered

No action required.

cache_6378 service is down

Run the cmxctl cache_6378 start command.

cache_6378 service recovered

No action required.

Port 6378 is not responding

Run the cmxctl cache_6378 stop and cmxctl cache_6378 start commands.

Port 6378 responding

No action required.

cache_6379 service is down

Run the cmxctl cache_6379 start command.

cache_6379 service recovered

No action required.

Port 6379 is not responding

Run the cmxctl cache_6379 stop and cmxctl cache_6379 start commands.

Port 6379 responding

No action required.

cache_6380 service is down

Run the cmxctl cache_6380 start command.

cache_6380 service recovered

No action required.

Port 6380 is not responding

Run the cmxctl cache_6380 stop and cmxctl cache_6380 start commands.

Port 6380 responding

No action required.

cache_6381 service is down

Run the cmxctl cache_6381 start command.

cache_6381 service recovered

No action required.

Port 6381 is not responding

Run the cmxctl cache_6381 stop and cmxctl cache_6381 start commands.

Port 6381 responding

No action required.

cache_6382 service is down

Run the cmxctl cache_6382 start command.

cache_6382 service recovered

No action required.

Port 6382 is not responding

Run the cmxctl cache_6382 stop and cmxctl cache_6382 start commands.

Port 6382 responding

No action required.

cache_6383 service is down

Run the cmxctl cache_6383 start command.

cache_6383 service recovered

No action required.

Port 6383 is not responding

Run the cmxctl cache_6383 stop and cmxctl cache_6383 start commands.

Port 6383 responding

No action required.

cache_6385 service is down

Run the cmxctl cache_6385 start command.

cache_6385 service recovered

No action required.

Port 6385 is not responding

Run the cmxctl cache_6385 stop and cmxctl cache_6385 start commands.

Port 6385 responding

No action required.
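The Monit rows above follow a few recurring patterns: a "service is down" email maps to a start command, while a "Port N is not responding" or "not reporting health" email maps to a stop followed by a start. The sketch below encodes those patterns as a lookup that returns the suggested command strings. It does not run any command. The function and variable names are hypothetical; the port-to-service mapping and the cmxctl and cmxos command strings are copied from the rows above, and any email subject that is not listed there yields an empty list.

```python
import re
from typing import List

CACHE_PORTS = (6378, 6379, 6380, 6381, 6382, 6383, 6385)

# Port-to-service mapping copied from the "Port N is not responding" rows.
PORT_TO_SERVICE = {
    5432: "database", 6511: "hyperlocation",
    6531: "location", 6532: "location",
    6541: "analytics", 6542: "analytics",
    6551: "configuration", 6552: "configuration",
    6571: "nmsplb", 6572: "nmsplb",
    6581: "matlabengine", 6582: "matlabengine",
    9042: "cassandra",
}
PORT_TO_SERVICE.update({port: f"cache_{port}" for port in CACHE_PORTS})

# Services whose "is down" row suggests "cmxctl <service> start".
CMXCTL_SERVICES = {
    "agent", "analytics", "cassandra", "confd", "configuration", "consul",
    "database", "haproxy", "hyperlocation", "influxdb", "location",
    "matlabengine", "nmsplb", "qlesspy",
} | {f"cache_{port}" for port in CACHE_PORTS}

# Services that have a "not reporting health" row.
HEALTH_SERVICES = {"analytics", "configuration", "hyperlocation",
                   "location", "matlabengine", "nmsplb"}

# The postgres row points at the "database" service name.
SERVICE_ALIASES = {"postgres": "database"}


def _restart(service: str) -> List[str]:
    return [f"cmxctl {service} stop", f"cmxctl {service} start"]


def suggested_commands(subject: str) -> List[str]:
    """Return the commands the table suggests for a Monit email subject.

    Hypothetical helper: returns strings only and an empty list when the
    table lists no action (or does not list the subject at all).
    """
    text = subject.strip()

    match = re.fullmatch(r"(Port|Socket) (\d+) is not responding", text)
    if match:
        kind, port = match.group(1), int(match.group(2))
        if kind == "Socket" and port != 5432:
            return []  # the table has a socket row only for 5432
        service = PORT_TO_SERVICE.get(port)
        return _restart(service) if service else []

    match = re.fullmatch(r"The (\w+) service is not reporting health", text)
    if match:
        service = match.group(1).lower()
        return _restart(service) if service in HEALTH_SERVICES else []

    match = re.fullmatch(r"(\w+) service is down", text, flags=re.IGNORECASE)
    if match:
        service = match.group(1).lower()
        if service == "adminui":
            return ["cmxos adminui start"]  # the only cmxos row in the table
        service = SERVICE_ALIASES.get(service, service)
        if service in CMXCTL_SERVICES:
            return [f"cmxctl {service} start"]

    return []
```

Returning strings rather than executing them keeps the sketch safe to try on any machine; whether to run a suggested command remains an operator decision.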