Northbound Event Notification

RackHD supports event notification via both web hook and AMQP.

A web hook allows applications to subscribe certain RackHD published events by configured URL, when one of the subscribed events is triggered, RackHD will send a POST request with event payload to configured URL.

RackHD also publishes defined events over AMQP, so subscribers to RackHD’s instance of AMQP don’t need to register a webhook URL to get events. The AMQP events can be prolific, so we recommend that consumers filter events as they are received to what is desired.

Events Payloads

All published external events’ payload formats are common, the event attributes are as below:

Attribute Type Description
version String Event payload format version.
type String It could be one of the values: heartbeat, node, polleralert, graph.
action String a verb or a composition of component and verb which indicates what happened, it’s associated with the type attribute.
severity String Event severity, it could be one of the values: critical, warning, information.
typeId String It’s associated with the type attribute. It could be graph ‘Id’ for graph type, poller ‘Id’ for polleralert type, <fqdn>.<service name> for heartbeat event, node ‘Id’ for node type. Please see table for more details .
createdAt String The time event happened.
nodeId String The node Id, it’s null for ‘heartbeat’ event.
data Object Detail information are included in this attribute.

The table of type, typeId, action and severity for all external events

type typeId action severity Description
heartbeat <fqdn>.<service name> updated information Each running RackHD service will publish a periodic heartbeat event message to notify that service is running.
polleralert the ‘Id’ of poller sel.updated related to sel rules, it could be one of the values: critical, warning, information Triggered when condition rules of sel alert defined in SKU PACK is matched
sdr.updated information Triggered when sdr information is updated.
fabricservice.updated information Triggered when fabricservice information is updated.
pdupower.updated information Triggered when pdu power state information is changed.
chassispower.updated information Triggered when chassis power state information is changed.
snmp.updated related to snmp rules, it could be one of the values: critical, warning, information Triggered when condition rules of snmp alert defined in SKU PACK is matched
graph the ‘Id’ of graph started information Triggered when graph started.
finished information Triggered when graph finished.
progress.updated information Triggered when long task’s progress information is updated.
node the ‘Id’ of node discovered information

Triggered in node’s discovery process,it has two cases:

  • Automatic discovery
  • Passive discovery by post a node by REST API
added information Triggered when a rack node is added to database by REST API
removed information Triggered when node is deleted by REST API
sku.assigned information Triggered when node’s sku field is assigned.
sku.unassigned information Triggered when node’s sku field is unassigned.
sku.updated information Triggered when node’s sku field is updated.
obms.assigned information Triggered when node’s obms field is assigned.
obms.unassigned information Triggered when node’s obms field is unassigned.
obms.updated information Triggered when node’s obms field is updated.
accessible information Triggered when node telemetry OBM service (IPMI or SNMP) is accessible
inaccessible information Triggered when node telemetry OBM service (IPMI or SNMP) is inaccessible
alerts could be one: information, warning, or critical Triggered when rackHD receives a redfish alert

Example of heartbeat event payload:

{
    "version": "1.0",
    "type": "heartbeat",
    "action": "updated",
    "typeId": "kickseed.example.com.on-taskgraph",
    "severity": "information",
    "createdAt": "2016-07-13T14:23:45.627Z",
    "nodeId": "null",
    "data": {
        "name": "on-taskgraph",
        "title": "node",
        "pid": 6086,
        "uid": 0,
        "platform": "linux",
        "release": {
            "name": "node",
            "lts": "Argon",
            "sourceUrl": "https://nodejs.org/download/release/v4.7.2/node-v4.7.2.tar.gz",
            "headersUrl": "https://nodejs.org/download/release/v4.7.2/node-v4.7.2-headers.tar.gz"
        },
        "versions": {
            "http_parser": "2.7.0",
            "node": "4.7.2",
            "v8": "4.5.103.43",
            "uv": "1.9.1",
            "zlib": "1.2.8",
            "ares": "1.10.1-DEV",
            "icu": "56.1",
            "modules": "46",
            "openssl": "1.0.2j"
        },
        "memoryUsage": {
            "rss": 116531200,
            "heapTotal": 84715104,
            "heapUsed": 81638904
        },
        "currentTime": "2017-01-24T07:18:49.236Z",
        "nextUpdate": "2017-01-24T07:18:59.236Z",
        "lastUpdate": "2017-01-24T07:18:39.236Z",
        "cpuUsage": "NA"
    }
}

Example of node discovered event payload:

{
    "type": "node",
    "action": "discovered",
    "typeId": "58aa8e54ef2b49ed6a6cdd4c",
    "nodeId": "58aa8e54ef2b49ed6a6cdd4c",
    "severity": "information",
    "data": {
        "ipMacAddresses": [
            {
                "ipAddress": "172.31.128.2",
                "macAddress": "2c:60:0c:ad:d5:ba"
            },
            {
                "macAddress": "90:e2:ba:91:1b:e4"
            },
            {
                "macAddress": "90:e2:ba:91:1b:e5"
            },
            {
                "macAddress": "2c:60:0c:c0:a8:ce"
            }
        ],
        "nodeId": "58aa8e54ef2b49ed6a6cdd4c",
        "nodeType": "compute"
    },
    "version": "1.0",
    "createdAt": "2017-02-20T06:37:23.775Z"
}

Events via AMQP

AMQP Exchange and Routing Key

The change of resources managed by RackHD could be retrieved from AMQP messages.

  • Exchange: on.events
  • Routing Key <type>.<action>.<severity>.<typeId>.<nodeId>

ALl the fields in routing key exists in the common event payloads event_payload.

Examples of routing key:

Heartbeat event routing key of on-tftp service:

heartbeat.updated.information.kickseed.example.com.on-tftp

Polleralert sel event routing key:

polleralert.sel.updated.critical.44b15c51450be454180fabc.57b15c51450be454180fa460

Node discovered event routing key:

node.discovered.information.57b15c51450be454180fa460.57b15c51450be454180fa460

Graph event routing key:

graph.started.information.35b15c51450be454180fabd.57b15c51450be454180fa460

AMQP Routing Key Filter

All the events could be filtered by routing keys, for example:

All services’ heartbeat events:

$ sudo node sniff.js "on.events" "heartbeat.#"

All nodes’ discovered events:

$ sudo node sniff.js "on.events" "#.discovered.#"

‘sniff.js’ is a tool located at https://github.com/RackHD/on-tools/blob/master/dev_tools/README.md

Events via Hook

Register Web Hooks

The web hooks used for subscribing event notification could be registered by POST <server>/api/current/hooks API as below

curl -H "Content-Type: application/json" -X POST -d @payload.json <server>api/current/hooks

The payload.json attributes in the example above are as below:

Attribute Type Flags Description
url String required The hook url that events are notified to. Both http and https urls are supported. url must be unique.
name String optional Any name user specified for the hook.
filters Array optional An array of conditions that decides which events should be notified to hook url.

When a hook is registered and eligible events happened, RackHD will send a POST request to the hook url. POST request’s Content-Type will be application/json, and the request body be the event payload.

An example of payload.json with minimal attributes:

{
    "url": "http://www.abc.com/def"
}

When multiple hooks are registered, a single event can be sent to multiple hook urls if it meets hooks’ filtering conditions.

Event Filter Rules

The conditions of which events should be notified could be specified in the filters attribute in the hook_payload, when filters attribute is not specified, or it’s empty, all the events will be notified to the hook url.

The filters attribute is an array, so multiple filters could be specified. The event will be sent as long as any filter condition is satisfied, even if the conditions may have overlaps.

The filter attributes are type, typeId, action, severity and nodeId listed in event_payload. Filtering by data is not supported currently. Filtering expression of hook filters is based on javascript regular expression, below table describes some base operations for hook filters:

Description Example Eligible Events
Attribute equals some value {“action”: “^discovered$”} Events with action equals discovered
Attribute can be any of specified value. {“action”: “discovered|updated”} Events with action equals either discovered or updated
Attribute can not be any of specified value. {“action”: “[^(discovered|updated)]”} Events with action equals neither discovered nor updated
Multiple attributes must meet specified values. {“action”: “[^(discovered|updated)]”, “type”: “node”} Events with type equals node while action equals neither discovered nor updated

An example of multiple filters:

{
    "name": "event sets",
    "url": "http://www.abc.com/def",
    "filters": [
        {
            "type": "node",
            "nodeId": "57b15c51450be454180fa460"
        },
        {
            "type": "node",
            "action": "discovered|updated",
        }
    ]
}

Web Hook APIs

Create a new hook

POST /api/2.0/hooks
{
    "url": "http://www.abc.com/def"
}

Delete an existing hook

DELETE /api/2.0/hooks/:id

Get a list of hooks

GET /api/2.0/hooks

Get details of a single hook

GET /api/2.0/hooks/:id

Update an existing hook

PATCH /api/2.0/hooks/:id
{
    "name": "New Hook"
}

Redfish Alert Notification

Description

RackHD is enabled to receive redfish based notifications. It is possible to configure a redfish endpoint to send alerts to RackHD. When RackHD receives an alert, it determines which node issued the alert and then it adds some additional context such as nodeId, service tag, etc. Lastly, RackHD publishes the alert to AMQP and Web Hook.

Configuring the Redfish endpoint

If the endpoint is redfish enabled and supports the Resfish EventService, it is possible to configure the endpoint to send the alerts to RackHD. Please note that the “Destination” property in the example below should be a reference to RackHD.

POST /redfish/v1/EventService/Subscriptions
    {
            "Context": "context string",
            "Description": "Event Subscription Details",
            "Destination": "https://10.240.19.226:8443/api/2.0/notification/alerts",
            "EventTypes": [
            "ResourceAdded",
            "StatusChange",
                "Alert"
            ],
            "Id": "id",
            "Name": "name",
            "Protocol": "Redfish"
    }

If the node is a Dell node, it is possible to post the Graph.Dell.Configure.Redfish.Alerting workflow. The workflow will:

1- Enable Alerts for the Dell node. Equivalent to running “set iDRAC.IPMILan.AlertEnable 1” racadam command.

2- Enable redfish alerts. Equivalent to running “eventfilters set -c idrac.alert.all -a none -n redfish-events” racadam command.

3- Disable the “Audit” info alerts. Equivalent to running “eventfilters set -c idrac.alert.audit.info -a none -n none” racadam command.

The workflow will run the default values if the node’s obm is set and the “rackhdPublicIp” property is set in the rackHD config.json file. Below is an example the default settings:

{
  "@odata.context": "/redfish/v1/$metadata#EventDestination.EventDestination",
  "@odata.id": "/redfish/v1/EventService/Subscriptions/b50106d4-32c6-11e7-8b05-64006ac35232",
  "@odata.type": "#EventDestination.v1_0_2.EventDestination",
  "Context": "RackhHD Subscription",
  "Description": "Event Subscription Details",
  "Destination": "https://10.1.1.1:8443/api/2.0/notification/alerts",
  "EventTypes": [
    "ResourceAdded",
    "StatusChange",
    "Alert"
  ],
  "EventTypes@odata.count": 3,
  "Id": "b50106d4-32c6-11e7-8b05-64006ac35232",
  "Name": "EventSubscription b50106d4-32c6-11e7-8b05-64006ac35232",
  "Protocol": "Redfish"
}

It is possible to overwrite any of the values by adding it to payload when posting the Graph.Configure.Redfish.Alerting workflow. Here is an instance of the payload:

{
    "options": {
            "redfish-subscribtion": {
                    "url": "https://10.240.19.130/redfish/v1/EventService/Subscriptions",
                    "credential": {
                            "username": "root",
                            "password": "1234567"
                    },
                    "data": {
                            "Context": "context string",
                            "Description": "Event Subscription Details",
                            "Destination": "https://1.1.1.1:8443/api/2.0/notification/alerts",
                            "EventTypes": [
                                    "StatusChange",
                                    "Alert"
                            ],
                            "Id": "id",
                            "Name": "name",
                            "Protocol": "Redfish"
                    }

            }
    }
}

Alert message

In addition to the redfish alert message, RackHD adds the following properties: “sourceIpAddress” (of the BMC), “nodeId”,”macAddress” (of the BMC), “ChassisName”, “ServiceTag”, “SN”.

{
        "type": "node",
        "action": "alerts",
        "data": {
                "Context": "context string",
                "EventId": "8689",
                "EventTimestamp": "2017-04-03T10:07:32-0500",
                "EventType": "Alert",
                "MemberId": "7e675c8e-127a-11e7-9fc8-64006ac35232",
                "Message": "The coin cell battery in CMC 1 is not working.",
                "MessageArgs": ["1"],
                "MessageArgs@odata.count": 1,
                "MessageId": "CMC8572",
                "Severity": "Critical",
                "sourceIpAddress": "10.240.19.130",
                "nodeId": "58d94cec316779d4126be134",
                "sourceMacAddress   ": "64:00:6a:c3:52:32",
                "ChassisName": "PowerEdge R630",
                "ServiceTag": "4666482",
                "SN": "CN747515A80855"
        },
        "severity": "critical",
        "typeId": "58d94cec316779d4126be134",
        "version": "1.0",
        "createdAt": "2017-04-03T14:11:46.245Z"
}

AMQP

The messages are pulished to:

  • Exchange: on.events
  • Routing Key: node.alerts.<severity>.<typeId>.<nodeId>