Network Monitors
Overview
Network Monitors let you define recurring checks against IP addresses, hostnames, ports, HTTP endpoints, and DNS records. Breeze dispatches each check through a connected agent in the same organization via WebSocket, records the result, and updates the monitor’s status. Monitors can optionally be linked to a discovered asset so that availability data appears alongside the asset’s SNMP and discovery information.
All monitor operations require authentication and one of the organization, partner, or system scopes. Organization-scoped users are automatically restricted to monitors within their own organization.
Monitor Types
Breeze supports four monitor types, stored in the monitor_type PostgreSQL enum:
| Type | DB Value | What It Checks |
|---|---|---|
| ICMP Ping | icmp_ping | Sends ICMP echo requests to a target host. Measures round-trip time. |
| TCP Port | tcp_port | Opens a TCP connection to a specific port. Optionally checks for a banner string. |
| HTTP/Endpoint | http_check | Makes an HTTP request to a URL. Validates status code, response body, SSL, and redirects. |
| DNS | dns_check | Resolves a hostname via DNS. Optionally checks record type, expected value, and nameserver. |
ICMP Ping Configuration
| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| count | integer | 1 to 20 | none | Number of echo requests to send. |
| packetSize | integer | 16 to 65,535 | none | ICMP packet payload size in bytes. |
TCP Port Configuration
| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| port | integer | 1 to 65,535 | required | TCP port number to connect to. |
| expectBanner | string | optional | none | String the server banner must contain for a successful check. |
HTTP/Endpoint Configuration
| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| url | string | valid URL | none | Full URL to request. If omitted, the monitor’s target field is used. |
| method | enum | GET, HEAD, POST, PUT, OPTIONS | none | HTTP method. |
| expectedStatus | integer | 100 to 599 | none | Required HTTP status code for a passing check. |
| expectedBody | string | optional | none | Substring that must appear in the response body. |
| headers | object | key-value pairs | none | Custom HTTP headers to include in the request. |
| followRedirects | boolean | optional | none | Whether to follow HTTP redirects. |
| verifySsl | boolean | optional | none | Whether to verify the TLS certificate. |
DNS Check Configuration
| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| hostname | string | min 1 char | none | Hostname to resolve. If omitted, the monitor’s target field is used. |
| recordType | enum | A, AAAA, MX, CNAME, TXT, NS | none | DNS record type to query. |
| expectedValue | string | optional | none | Expected value in the DNS response. |
| nameserver | string | optional | none | Custom DNS nameserver to query. |
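The constraints in the four configuration tables above can be expressed as a small client-side validator. This is a minimal sketch, not the server's actual validation: the function names are hypothetical, and treating "valid URL" as "an http or https URL with a host" is an assumption.

```python
from urllib.parse import urlparse

HTTP_METHODS = {"GET", "HEAD", "POST", "PUT", "OPTIONS"}
DNS_RECORD_TYPES = {"A", "AAAA", "MX", "CNAME", "TXT", "NS"}


def _int_in_range(config, field, low, high, errors, required=False):
    """Record an error unless config[field] is an integer within [low, high]."""
    if field not in config:
        if required:
            errors.append(f"{field} is required")
        return
    value = config[field]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
        errors.append(f"{field} must be an integer from {low} to {high}")


def validate_config(monitor_type, config):
    """Return a list of error strings; an empty list means the config passed."""
    errors = []
    if monitor_type == "icmp_ping":
        _int_in_range(config, "count", 1, 20, errors)
        _int_in_range(config, "packetSize", 16, 65535, errors)
    elif monitor_type == "tcp_port":
        _int_in_range(config, "port", 1, 65535, errors, required=True)
    elif monitor_type == "http_check":
        if "url" in config:
            parsed = urlparse(config["url"])
            # Assumption: only http(s) URLs with a host count as valid.
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                errors.append("url must be a valid http(s) URL")
        if "method" in config and config["method"] not in HTTP_METHODS:
            errors.append("method must be one of GET, HEAD, POST, PUT, OPTIONS")
        _int_in_range(config, "expectedStatus", 100, 599, errors)
    elif monitor_type == "dns_check":
        if "hostname" in config and len(config["hostname"]) < 1:
            errors.append("hostname must be at least 1 character")
        if "recordType" in config and config["recordType"] not in DNS_RECORD_TYPES:
            errors.append("recordType must be one of A, AAAA, MX, CNAME, TXT, NS")
    else:
        errors.append(f"unknown monitor type: {monitor_type}")
    return errors
```

Running a check like this before sending a request can surface range errors early, but the API remains the source of truth for what it accepts.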
Creating a Monitor
1. Choose a target. Identify the IP address, hostname, or URL you want to monitor. Optionally associate the monitor with a discovered asset by providing its assetId.
2. Select a monitor type. Pick one of icmp_ping, tcp_port, http_check, or dns_check.
3. Send a POST request.

   ```bash
   curl -X POST /api/v1/monitors \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
       "name": "Web Server HTTPS",
       "monitorType": "http_check",
       "target": "https://app.example.com",
       "config": {
         "method": "GET",
         "expectedStatus": 200,
         "verifySsl": true
       },
       "pollingInterval": 120,
       "timeout": 10
     }'
   ```

4. Verify the response. A 201 status with the full monitor object confirms creation. The monitor starts with status unknown and isActive: true.
Polling Intervals and Timeouts
| Parameter | Column | Min | Max | Default | Unit |
|---|---|---|---|---|---|
| pollingInterval | polling_interval | 10 | 86,400 | 60 | seconds |
| timeout | timeout | 1 | 300 | 5 | seconds |
The polling interval controls how frequently the BullMQ scheduler enqueues a check job for the monitor. The timeout is passed to the agent as the maximum time allowed for the check to complete before it is considered failed.
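As an illustration of the bounds and defaults in the table, the following sketch resolves the two settings for a new monitor. The function name resolve_schedule is hypothetical, and rejecting out-of-range values (rather than clamping them) is an assumption about how the API behaves.

```python
POLLING_MIN, POLLING_MAX, POLLING_DEFAULT = 10, 86_400, 60
TIMEOUT_MIN, TIMEOUT_MAX, TIMEOUT_DEFAULT = 1, 300, 5


def resolve_schedule(polling_interval=None, timeout=None):
    """Apply the documented defaults, then reject out-of-range values."""
    polling = POLLING_DEFAULT if polling_interval is None else polling_interval
    limit = TIMEOUT_DEFAULT if timeout is None else timeout
    if not POLLING_MIN <= polling <= POLLING_MAX:
        raise ValueError(
            f"pollingInterval must be {POLLING_MIN} to {POLLING_MAX} seconds"
        )
    if not TIMEOUT_MIN <= limit <= TIMEOUT_MAX:
        raise ValueError(f"timeout must be {TIMEOUT_MIN} to {TIMEOUT_MAX} seconds")
    return {"pollingInterval": polling, "timeout": limit}
```

Calling resolve_schedule() with no arguments yields the defaults of 60 and 5 seconds, while resolve_schedule(polling_interval=5) raises a ValueError.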
Monitor Status Lifecycle
Each monitor tracks a last_status field with one of four values:
| Status | Meaning |
|---|---|
| unknown | Initial state. No check has completed yet. |
| online | The most recent check succeeded. |
| degraded | The check succeeded but with concerning metrics (e.g., high response time). |
| offline | The most recent check failed. |
Consecutive Failures
The consecutive_failures counter increments by 1 each time a check result has status offline. It resets to 0 on any non-offline result (online or degraded). This counter is used by alert rules with the consecutive_failures_gt condition.
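The counter rule can be written as a one-line function. This is a sketch of the rule as described here, with a hypothetical function name, not code taken from the worker.

```python
def next_consecutive_failures(current, result_status):
    """Increment on an offline result; reset to 0 on any other status."""
    return current + 1 if result_status == "offline" else 0


# Walk a sequence of check results through the counter.
counter = 0
history = []
for status in ["offline", "offline", "degraded", "offline"]:
    counter = next_consecutive_failures(counter, status)
    history.append(counter)
```

Note that a degraded result resets the counter even though the check was not fully healthy, so a monitor that alternates between offline and degraded never accumulates more than one consecutive failure.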
State Update Transaction
When a check result is processed, the worker inserts the network_monitor_results row and updates the parent network_monitors row inside a single database transaction, so the result history and the monitor's current state cannot diverge. The fields updated on the monitor are:
- last_checked: timestamp of the check
- last_status: the result status
- last_response_ms: response time in milliseconds
- last_error: error message (or NULL on success)
- consecutive_failures: incremented or reset
- updated_at: current timestamp
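The insert-plus-update pattern can be sketched with Python's built-in sqlite3 module. This is an illustration only: Breeze uses PostgreSQL, the schema below is a reduced stand-in (text and integer keys instead of UUIDs, fewer columns), and process_check_result is a hypothetical name.

```python
import sqlite3


def process_check_result(conn, monitor_id, result):
    """Insert the result row and update the parent monitor in one transaction."""
    failed = 1 if result["status"] == "offline" else 0
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if any statement raises.
    with conn:
        conn.execute(
            "INSERT INTO network_monitor_results"
            " (monitor_id, status, response_ms, error, timestamp)"
            " VALUES (?, ?, ?, ?, ?)",
            (monitor_id, result["status"], result["response_ms"],
             result["error"], result["timestamp"]),
        )
        conn.execute(
            "UPDATE network_monitors SET"
            " last_checked = ?, last_status = ?, last_response_ms = ?,"
            " last_error = ?,"
            " consecutive_failures ="
            "   CASE WHEN ? = 1 THEN consecutive_failures + 1 ELSE 0 END,"
            " updated_at = CURRENT_TIMESTAMP"
            " WHERE id = ?",
            (result["timestamp"], result["status"], result["response_ms"],
             result["error"], failed, monitor_id),
        )


# Reduced stand-in schema; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE network_monitors (id TEXT PRIMARY KEY, last_checked TEXT,"
    " last_status TEXT DEFAULT 'unknown', last_response_ms REAL,"
    " last_error TEXT, consecutive_failures INTEGER DEFAULT 0, updated_at TEXT)"
)
conn.execute(
    "CREATE TABLE network_monitor_results (id INTEGER PRIMARY KEY,"
    " monitor_id TEXT, status TEXT, response_ms REAL, error TEXT, timestamp TEXT)"
)
conn.execute("INSERT INTO network_monitors (id) VALUES ('m1')")
conn.commit()

process_check_result(conn, "m1", {
    "status": "offline", "response_ms": None,
    "error": "connection timed out", "timestamp": "2026-02-18T12:00:00Z",
})
```

Because both statements share one transaction, a failure in the update leaves no orphaned result row behind.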
Results and Response Times
Every completed check writes a row to the network_monitor_results table:
| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key. |
| monitor_id | UUID | Foreign key to network_monitors. Cascades on delete. |
| status | enum | online, offline, degraded, or unknown. |
| response_ms | real | Response time in milliseconds. |
| status_code | integer | HTTP status code (HTTP checks only). |
| error | text | Error message if the check failed. |
| details | JSONB | Additional check-specific data returned by the agent. |
| timestamp | timestamp | When the check was performed. |
Querying Results
Fetch historical results with optional time-range filtering:
```bash
curl "/api/v1/monitors/<monitorId>/results?start=2026-02-01T00:00:00Z&end=2026-02-18T00:00:00Z&limit=500" \
  -H "Authorization: Bearer <token>"
```

| Query Parameter | Type | Constraints | Default | Description |
|---|---|---|---|---|
| start | string | ISO 8601 datetime | none | Include results on or after this timestamp. |
| end | string | ISO 8601 datetime | none | Include results on or before this timestamp. |
| limit | integer | 1 to 1,000 | 100 | Maximum number of results to return. |
Results are returned in descending order by timestamp (newest first).
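When building this request from a script, it helps to format the timestamps and validate the limit before sending. The sketch below only constructs the URL; the helper name results_url and the base URL https://breeze.example.com are placeholders, not part of Breeze.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def _iso_utc(moment):
    """Format a timezone-aware datetime as an ISO 8601 UTC string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def results_url(base_url, monitor_id, start=None, end=None, limit=100):
    """Build the results query URL, enforcing the documented limit range."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be from 1 to 1,000")
    params = {}
    if start is not None:
        params["start"] = _iso_utc(start)
    if end is not None:
        params["end"] = _iso_utc(end)
    params["limit"] = limit
    # safe=":" keeps the colons in the timestamps readable.
    query = urlencode(params, safe=":")
    return f"{base_url}/api/v1/monitors/{monitor_id}/results?{query}"


url = results_url(
    "https://breeze.example.com",  # placeholder host
    "abc123",                      # placeholder monitor ID
    start=datetime(2026, 2, 1, tzinfo=timezone.utc),
    end=datetime(2026, 2, 18, tzinfo=timezone.utc),
    limit=500,
)
```

The resulting URL can be passed to any HTTP client together with the Authorization header shown in the curl example.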
Dashboard Stats
The GET /api/v1/monitors/dashboard endpoint returns aggregate counts for the authenticated organization:
```json
{
  "data": {
    "total": 12,
    "status": {
      "online": 8,
      "offline": 2,
      "degraded": 1,
      "unknown": 1
    },
    "types": {
      "icmp_ping": 4,
      "tcp_port": 3,
      "http_check": 4,
      "dns_check": 1
    }
  }
}
```

The response includes:

- total: total number of monitors in the organization.
- status: count of monitors grouped by their current last_status.
- types: count of monitors grouped by monitor_type.
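The aggregation behind this response amounts to two group-by counts. The following sketch reproduces the response shape from an in-memory list of monitors; the function name is hypothetical, and reporting statuses and types with a count of zero is an assumption about the real endpoint.

```python
from collections import Counter

STATUSES = ("online", "offline", "degraded", "unknown")
MONITOR_TYPES = ("icmp_ping", "tcp_port", "http_check", "dns_check")


def dashboard_stats(monitors):
    """Count monitors by last_status and by monitor_type."""
    status_counts = Counter(m["last_status"] for m in monitors)
    type_counts = Counter(m["monitor_type"] for m in monitors)
    return {
        "data": {
            "total": len(monitors),
            # A Counter returns 0 for keys it has not seen.
            "status": {s: status_counts[s] for s in STATUSES},
            "types": {t: type_counts[t] for t in MONITOR_TYPES},
        }
    }


stats = dashboard_stats([
    {"last_status": "online", "monitor_type": "icmp_ping"},
    {"last_status": "online", "monitor_type": "http_check"},
    {"last_status": "offline", "monitor_type": "http_check"},
])
```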
Alert Rules on Monitors
Each monitor can have one or more alert rules. Alert rules are evaluated against check results to trigger alerts at the configured severity level.
Alert Rule Conditions
| Condition | Description | Threshold |
|---|---|---|
| offline | Fires when the monitor status is offline. | Not used. |
| degraded | Fires when the monitor status is degraded. | Not used. |
| response_time_gt | Fires when response time exceeds the threshold. | Milliseconds (string). |
| consecutive_failures_gt | Fires when consecutive failures exceed the threshold. | Failure count (string). |
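The four conditions can be sketched as a single predicate over a rule and the monitor's current state. This is an illustration of the semantics in the table, not the actual evaluator: the function name is hypothetical, and raising an error when a threshold is missing is an assumption.

```python
def rule_fires(rule, monitor):
    """Return True if the alert rule matches the monitor's current state."""
    if not rule.get("isActive", True):
        return False
    condition = rule["condition"]
    if condition in ("offline", "degraded"):
        # Status conditions ignore the threshold.
        return monitor["last_status"] == condition
    if condition not in ("response_time_gt", "consecutive_failures_gt"):
        raise ValueError(f"unknown condition: {condition}")
    if rule.get("threshold") is None:
        raise ValueError(f"{condition} requires a threshold")
    # Thresholds are stored as strings, so parse before comparing.
    threshold = float(rule["threshold"])
    if condition == "response_time_gt":
        response_ms = monitor["last_response_ms"]
        return response_ms is not None and response_ms > threshold
    return monitor["consecutive_failures"] > threshold
```

Both threshold conditions are strict comparisons, so a consecutive_failures_gt rule with threshold "3" first fires on the fourth consecutive failure.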
Alert Rule Severities
Rules use the shared alert_severity enum: critical, high, medium, low, info.
Creating an Alert Rule
```bash
curl -X POST /api/v1/monitors/alerts \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "monitorId": "<monitorId>",
    "condition": "consecutive_failures_gt",
    "threshold": "3",
    "severity": "critical",
    "message": "Web server has failed more than 3 consecutive checks"
  }'
```

Alert Rule Schema
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| monitorId | UUID | Yes | n/a | The monitor this rule applies to. |
| condition | enum | Yes | n/a | One of offline, degraded, response_time_gt, consecutive_failures_gt. |
| threshold | string | No | null | Threshold value for response_time_gt and consecutive_failures_gt. |
| severity | enum | Yes | n/a | Alert severity: critical, high, medium, low, info. |
| message | string | No | null | Custom message included in the alert. |
| isActive | boolean | No | true | Whether this rule is currently active. |
Testing Monitors
The test endpoint sends a one-off check command to a connected agent without going through the BullMQ queue. It finds an online agent in the monitor’s organization and dispatches the command directly via WebSocket.
```bash
curl -X POST /api/v1/monitors/<monitorId>/test \
  -H "Authorization: Bearer <token>"
```

Response (success):

```json
{
  "data": {
    "monitorId": "<monitorId>",
    "status": "queued",
    "testedAt": "2026-02-18T12:00:00.000Z"
  }
}
```

Response (no agent available):

```json
{
  "data": {
    "monitorId": "<monitorId>",
    "status": "failed",
    "error": "No online agent available",
    "testedAt": "2026-02-18T12:00:00.000Z"
  }
}
```

On-Demand Checks via BullMQ
The check endpoint enqueues a monitor check job through BullMQ rather than dispatching directly. This is the same mechanism the scheduler uses for recurring checks, but triggered manually.
```bash
curl -X POST /api/v1/monitors/<monitorId>/check \
  -H "Authorization: Bearer <token>"
```

Response:

```json
{
  "data": {
    "monitorId": "<monitorId>",
    "status": "queued",
    "message": "Check request queued"
  }
}
```

How the Job Queue Works
The monitor system uses a BullMQ queue named monitors with three job types:
| Job Type | Name | Description |
|---|---|---|
| check-monitor | Check Monitor | Looks up the monitor, finds an online agent in the org, and dispatches the check command via WebSocket. |
| process-check-result | Process Check Result | Writes the result to network_monitor_results and updates the monitor’s state in a transaction. |
| monitor-scheduler | Monitor Scheduler | Repeatable job (every 30 seconds) that finds all active monitors due for a check and enqueues check-monitor jobs. |
The worker processes jobs with a concurrency of 10. For debugging, it retains up to 100 completed jobs and up to 200 failed jobs.
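One pass of the monitor-scheduler job can be sketched as a filter over the monitor list. This is a simplified model rather than the worker's code: the function names and the job data shape are hypothetical, and treating a monitor that has never been checked as immediately due is an assumption.

```python
from datetime import datetime, timedelta, timezone


def is_due(monitor, now):
    """A monitor is due once polling_interval seconds have passed since last_checked."""
    if monitor["last_checked"] is None:
        return True  # assumption: never-checked monitors run on the next tick
    elapsed = now - monitor["last_checked"]
    return elapsed >= timedelta(seconds=monitor["polling_interval"])


def scheduler_tick(monitors, now):
    """Return one check-monitor job for every active monitor that is due."""
    return [
        {"name": "check-monitor", "data": {"monitorId": m["id"]}}
        for m in monitors
        if m["is_active"] and is_due(m, now)
    ]


now = datetime(2026, 2, 18, 12, 0, 0, tzinfo=timezone.utc)
monitors = [
    {"id": "a", "is_active": True, "polling_interval": 60,
     "last_checked": now - timedelta(seconds=90)},
    {"id": "b", "is_active": True, "polling_interval": 120,
     "last_checked": now - timedelta(seconds=90)},
    {"id": "c", "is_active": False, "polling_interval": 60, "last_checked": None},
    {"id": "d", "is_active": True, "polling_interval": 60, "last_checked": None},
]
jobs = scheduler_tick(monitors, now)
```

Under this model, monitors a and d are enqueued, b is skipped because its interval has not elapsed, and c is skipped because it is inactive. If the real scheduler works the same way, a polling interval shorter than the 30-second tick would effectively be rounded up to the tick period.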
Agent Command Mapping
Each monitor type maps to a specific agent command type:
| Monitor Type | Agent Command |
|---|---|
| icmp_ping | network_ping |
| tcp_port | network_tcp_check |
| http_check | network_http_check |
| dns_check | network_dns_check |
The command payload includes the monitor’s target, timeout, and all fields from the monitor’s config object. Command IDs follow the pattern mon-<monitorId>-<timestamp>.
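The mapping and payload rules can be sketched as a small builder function. Several details here are assumptions rather than documented behavior: the envelope keys (id, type, payload), the use of epoch milliseconds for the timestamp, and letting target and timeout take precedence over same-named config fields.

```python
import time

AGENT_COMMANDS = {
    "icmp_ping": "network_ping",
    "tcp_port": "network_tcp_check",
    "http_check": "network_http_check",
    "dns_check": "network_dns_check",
}


def build_command(monitor, now_ms=None):
    """Build the agent command for a monitor (envelope shape is assumed)."""
    # Assumption: the timestamp in the command ID is epoch milliseconds.
    timestamp = int(time.time() * 1000) if now_ms is None else now_ms
    payload = {
        **monitor["config"],          # all fields from the config object
        "target": monitor["target"],
        "timeout": monitor["timeout"],
    }
    return {
        "id": f"mon-{monitor['id']}-{timestamp}",
        "type": AGENT_COMMANDS[monitor["monitor_type"]],
        "payload": payload,
    }


command = build_command(
    {
        "id": "abc123",  # placeholder monitor ID
        "monitor_type": "tcp_port",
        "target": "10.0.0.5",
        "timeout": 5,
        "config": {"port": 22, "expectBanner": "SSH"},
    },
    now_ms=1771416000000,
)
```

Because the command ID embeds the monitor ID, a result coming back from the agent can be matched to its monitor by parsing the ID prefix.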
API Reference
All endpoints are mounted at /api/v1/monitors and require authentication.
Monitors
| Method | Path | Description |
|---|---|---|
| GET | /api/v1/monitors | List monitors with optional filters. |
| POST | /api/v1/monitors | Create a new monitor. |
| GET | /api/v1/monitors/dashboard | Aggregate status and type counts. |
| GET | /api/v1/monitors/:id | Get a single monitor with recent results and alert rules. |
| PATCH | /api/v1/monitors/:id | Update monitor fields. |
| DELETE | /api/v1/monitors/:id | Delete a monitor and all associated results/rules. |
| POST | /api/v1/monitors/:id/check | Enqueue an on-demand check via BullMQ. |
| POST | /api/v1/monitors/:id/test | Send a direct test check to an online agent. |
| GET | /api/v1/monitors/:id/results | Query historical check results. |
Alert Rules
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/monitors/alerts | Create an alert rule for a monitor. |
| GET | /api/v1/monitors/:monitorId/alerts | List alert rules for a monitor. |
| PATCH | /api/v1/monitors/alerts/:id | Update an alert rule. |
| DELETE | /api/v1/monitors/alerts/:id | Delete an alert rule. |
List Monitors Query Parameters
| Parameter | Type | Description |
|---|---|---|
| orgId | UUID | Filter by organization. Required for partner and system scopes. |
| assetId | UUID | Filter by discovered asset. Also infers orgId from the asset. |
| monitorType | enum | Filter by type: icmp_ping, tcp_port, http_check, dns_check. |
| status | enum | Filter by current status: online, offline, degraded, unknown. |
| search | string | Search by monitor name or target (substring match). |
Monitoring Assets Integration
Network monitors also surface through the /api/v1/monitoring/assets endpoint, which returns discovered assets enriched with their monitoring configuration. Each asset includes a network object showing:
```json
{
  "network": {
    "configured": true,
    "totalCount": 3,
    "activeCount": 2
  }
}
```

The DELETE /api/v1/monitoring/assets/:id endpoint disables all active network monitors (and SNMP devices) for a given asset by setting isActive to false.
Troubleshooting
Monitor stuck in “unknown” status
The monitor has never been checked. Verify that:
- The monitor has isActive: true.
- At least one agent in the same organization is online and connected via WebSocket.
- Redis is running and the monitor worker has been initialized (look for "[MonitorWorker] Monitor worker initialized" in API logs).
- The scheduler repeatable job is registered (look for "[MonitorWorker] Scheduled repeatable monitor scheduler (every 30s)" in logs).
Check dispatched but no result recorded
The agent received the command but did not return a result. Check:
- Agent logs for errors processing the network_ping, network_tcp_check, network_http_check, or network_dns_check command.
- WebSocket connectivity between the agent and the API.
- That the process-check-result job type is being processed (check BullMQ failed jobs for errors).
“Check service unavailable” (503)
The /check endpoint requires Redis for BullMQ job queuing. Ensure Redis is running and the REDIS_URL environment variable is correctly configured.
“No online agent available” on test
The /test endpoint looks for a device with status: 'online' in the monitor’s organization and verifies it has an active WebSocket connection. Ensure at least one agent is enrolled, heartbeating, and connected.
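The agent lookup described above can be modeled as a two-part check: the device record must be online, and the device must hold a live WebSocket connection. In this sketch the function names, the device fields, and the connected_ids set are hypothetical stand-ins for the API's internal state; the response bodies follow the documented /test responses.

```python
def pick_agent(devices, org_id, connected_ids):
    """Return the first online device in the org with a live WebSocket, or None."""
    for device in devices:
        if (
            device["org_id"] == org_id
            and device["status"] == "online"
            and device["id"] in connected_ids
        ):
            return device
    return None


def run_test(monitor, devices, connected_ids, tested_at):
    """Mimic the /test response body for the success and no-agent cases."""
    agent = pick_agent(devices, monitor["org_id"], connected_ids)
    data = {"monitorId": monitor["id"], "testedAt": tested_at}
    if agent is None:
        data.update(status="failed", error="No online agent available")
    else:
        data.update(status="queued")
    return {"data": data}


devices = [
    {"id": "dev-1", "org_id": "org-1", "status": "online"},   # online, no socket
    {"id": "dev-2", "org_id": "org-1", "status": "offline"},
    {"id": "dev-3", "org_id": "org-2", "status": "online"},   # different org
]
monitor = {"id": "mon-1", "org_id": "org-1"}
no_socket = run_test(monitor, devices, {"dev-3"}, "2026-02-18T12:00:00.000Z")
with_socket = run_test(monitor, devices, {"dev-1", "dev-3"}, "2026-02-18T12:00:00.000Z")
```

The first call fails even though dev-1 is marked online, because it has no active connection. That is the situation to look for when a device appears online in the UI but the test still reports no agent.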
Monitors not running on schedule
The scheduler is a BullMQ repeatable job that runs every 30 seconds. If monitors are not being checked:
- Verify the worker is running: check for [MonitorWorker] log lines.
- Inspect the monitors queue in Redis for stuck or failed jobs.
- Confirm the monitor’s polling_interval has elapsed since last_checked.
Alert rules not firing
Alert rules are stored in the network_monitor_alert_rules table. Verify:
- The rule has isActive: true.
- The condition and threshold match the current monitor state (e.g., consecutive_failures_gt with threshold "3" requires consecutive_failures > 3).
- The monitor’s check results are being recorded; query the /results endpoint to confirm.