Alarms

An Alarm is an alert or notification that you establish, based on the state of a Delivery. A state
represents the health of a Delivery.

Alarms work in conjunction with Alarm Groups, Integrations, and Linked Integrations. You must
understand how these objects relate to one another before you configure Alarms.

Integrations are established at the Organization level and allow you to designate Contact
information, which includes the names of persons who must be contacted when there are specific
Delivery issues. These persons may be contacted via email or the PagerDuty notification service.

Next, you must establish Alarm Groups, which are Alarm categories. The Delivery team usually builds
Alarms and Alarm Groups based on customer Service Level Agreements (SLAs). These specifications
dictate what conditions constitute an INFO, WARNING, ERROR, or CRITICAL state.

It is a best practice to create one Alarm Group for each state. For example, all CRITICAL Alarms
might reside in the CRITICAL Alarm Group, and all ERROR Alarms might reside in the ERROR Alarm
Group. As you establish Alarms, you can associate them with previously defined Alarm Groups.

Finally, you must connect your Alarm Groups with Integrations via Linked Integrations. This forms
the relationship between Alarms and Alarm Groups and the persons who must be notified when the
conditions that trigger these alerts are satisfied. While internal team members might be notified
of Delivery issues, some customers require this notification as well.
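
The relationship among Alarms, Alarm Groups, Integrations, and Linked Integrations can be pictured
as a small data model. The sketch below is illustrative only: the class names, field names, the
recipients_for helper, and the sample addresses are assumptions made for this example, not the
product's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Integration:
    """Organization-level contact details (hypothetical fields)."""
    name: str
    channel: str          # "email" or "pagerduty"
    contacts: list[str]


@dataclass
class AlarmGroup:
    """A category of Alarms, for example one group per state."""
    name: str
    alarm_names: list[str] = field(default_factory=list)


@dataclass
class LinkedIntegration:
    """Connects one Alarm Group to one Integration."""
    group: AlarmGroup
    integration: Integration
    enabled: bool = True


def recipients_for(alarm_name: str, links: list[LinkedIntegration]) -> list[str]:
    """Return the contacts reachable from an Alarm through enabled links."""
    found: list[str] = []
    for link in links:
        if link.enabled and alarm_name in link.group.alarm_names:
            for contact in link.integration.contacts:
                if contact not in found:
                    found.append(contact)
    return found


critical = AlarmGroup("CRITICAL", ["snapshot-failed-to-start"])
on_call = Integration("On-call", "pagerduty", ["ops@example.com"])
customer = Integration("Customer", "email", ["customer@example.com"])
links = [
    LinkedIntegration(critical, on_call),
    LinkedIntegration(critical, customer, enabled=False),
]
print(recipients_for("snapshot-failed-to-start", links))
```

In this sketch, only the on-call contact is returned, because the customer link is disabled.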

Any Alarm that is triggered in any Alarm Group sets the state of the Delivery, and the most severe
triggered Alarm determines that state. A Delivery might transition from one state to another, then
another. You might have several Alarms triggered in the INFO or WARNING state, for example, with
only one triggered in the CRITICAL state. In this case, the Delivery state is CRITICAL, because it
is the most severe state triggered. This designation appears in the Alarm State field in the right
pane on the Deliveries page (or Delivery Dashboard).
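
The "most severe Alarm wins" rule can be expressed in a few lines. This is a minimal sketch, not
the product's implementation; the SEVERITY_ORDER list and the delivery_state function are names
invented for the example, and the ordering is taken from the states described in this section.

```python
# Severity order, lowest to highest, as described in this section.
SEVERITY_ORDER = ["OK", "INFO", "WARNING", "ERROR", "CRITICAL"]


def delivery_state(triggered_states: list[str]) -> str:
    """Return the most severe state among the triggered Alarms.

    A Delivery with no triggered Alarms stays in the OK state.
    """
    return max(triggered_states, key=SEVERITY_ORDER.index, default="OK")


print(delivery_state(["INFO", "INFO", "WARNING", "CRITICAL"]))  # CRITICAL
print(delivery_state([]))  # OK
```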

There are five Alarm states. The table below uses an Error metric to explain each. In this
example, the Error metric ranges from .25 to 1.0 percent, and a threshold has been defined for
each state. One metric is tracked and escalated across each Alarm Group:

Alarm State   Description

OK            System-assigned state attached to a Delivery with no triggered Alarms. If you
              establish Alarm Groups and no Alarms within the groups satisfy their triggering
              criteria, the Delivery remains in this state. If you did not establish any Alarms,
              the Delivery also remains in an OK state, even when errors occur, because no
              Alarms were created.

INFO          This state is aligned with an imperfect Delivery. An Error metric of .25 percent,
              for example, might represent an INFO state; however, if this value increases, so
              might the severity level.

WARNING       If the Error metric reaches .50 percent (the threshold defined for the WARNING
              state), the previous INFO state escalates to WARNING.

ERROR         If the Error metric reaches .75 percent (the threshold defined for the ERROR
              state), the previous WARNING state escalates to ERROR.

CRITICAL      If the Error metric reaches 1.0 percent (the threshold defined for the CRITICAL
              state), the previous ERROR state escalates to CRITICAL. This is the most severe
              state, which commonly involves a Delivery failure. A PagerDuty Integration is
              commonly used at this state; you can establish this service via a mobile app or
              the PagerDuty website. Integrations are needed at this level, as communication
              is essential. Most failures are critical, such as "Snapshot failed to start" or
              "Snapshot did not push".
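
The escalation in the table can be traced with a short sketch. The threshold values are the sample
values from the table, not product defaults, and the sketch assumes that "reaches" means greater
than or equal to. The EXAMPLE_THRESHOLDS list and the state_for_error_rate function are
hypothetical names used only for illustration.

```python
# Sample thresholds from the table above, in percent, most severe first.
EXAMPLE_THRESHOLDS = [
    (1.00, "CRITICAL"),
    (0.75, "ERROR"),
    (0.50, "WARNING"),
    (0.25, "INFO"),
]


def state_for_error_rate(error_pct: float) -> str:
    """Map an error percentage to the most severe state whose threshold it reaches."""
    for threshold, state in EXAMPLE_THRESHOLDS:
        if error_pct >= threshold:
            return state
    return "OK"  # no threshold reached, so no Alarm is triggered


for rate in (0.10, 0.25, 0.60, 0.75, 1.20):
    print(rate, state_for_error_rate(rate))
```

Checking the most severe threshold first means a single pass returns the highest state reached.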

As you establish Alarms, you determine the criteria that trigger an alert. You must select a Metric
and a Comparison Operator, and then associate these entries with an Alarm State.

If you establish validation checks (via the Checks page, for example) but do not create Alarms,
Snapshots can fail with no notification.

When creating an Alarm, you need to specify its triggering criteria (conditional metric value) and
potential Alarm State. If the criteria are met, the Alarm is triggered and a notification is sent
to any enabled Linked Integration.

Add an Alarm

To add an Alarm:

  1. From the left navigation pane, click Alarms.

  2. From the top right of the Alarms page, click the Add an Alarm icon or plus (+) symbol.

  3. From the New Alarm page that now appears, enter text in the Name field.

  4. Use the arrow at the end of the field to select a Severity level.

    The Severity levels include CRITICAL, ERROR, WARNING, and INFO.

  5. Use the arrow at the end of the field to select a Metric.

    A list of health metrics appears. You use these metrics to define the conditions that trigger Alarms.

    Metric             Description

    pages              # of pages
    rows               # of rows
    dataPct            % Data
    noDataPct          % No Data (excludes 404/410)
    notFoundPct        % No Data Available (404/410)
    blockedPct         % Blocked
    errorPct           % Error
    noScreenshotPct    % Missing screenshot
    noHtmlPct          % Missing HTML
    rowsPerPage        Rows per page with data
    avgAttempts        Average attempts
    verrorsP99         99% rows have less than X validation errors
    verrorsP95         95% rows have less than X validation errors
    verrorsP75         75% rows have less than X validation errors
    verrorsP50         50% rows have less than X validation errors
    dupePct            % Duplicates
    filteredPct        % Filtered
    status             A specific Snapshot status
    processed          Snapshots processed
    total              Snapshot total

  6. Use the arrow at the end of the field to select a Comparison Operator.

    Comparison Operator         Description

    GREATER THAN                >
    GREATER THAN OR EQUAL TO    >=
    LESS THAN                   <
    LESS THAN OR EQUAL TO       <=
    NOT EQUAL TO                !=

    Example: HTML Extracted: (noHtmlPct – sum)>=99%

    The Alarm metrics are checked every 15 minutes.

  7. Enter text in the Threshold field.

  8. Examine and modify (as necessary) the ALARM README section.

    As you enter text into the fields on this page, the ALARM README section populates automatically
    based on your entries. This information is sent via email along with the Runbook, which functions
    as a Troubleshooting Guide to help resolve issues specific to Deliveries in a CRITICAL state.
    The Runbook provides the protocol or blueprint for issue resolution. The Metrics, Comparison
    Operators, Thresholds, and Alarm configuration are populated automatically in the Runbook; the
    Description and Alarm Group protocol are entered manually. The Runbook is delivered to the email
    address designated in the Linked Integration.

  9. To store content, click Save. To disregard, click Cancel.
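
The Metric, Comparison Operator, and Threshold chosen in steps 5 through 7 combine into a single
condition. The sketch below shows one way to picture that evaluation. The Alarm class, its field
names, and the is_triggered method are assumptions for illustration and do not describe how the
product stores or evaluates Alarms; any aggregation applied to the metric (such as the sum in the
step 6 example) is not modeled.

```python
import operator
from dataclasses import dataclass

# Comparison Operators from step 6, mapped to Python functions.
OPERATORS = {
    "GREATER THAN": operator.gt,
    "GREATER THAN OR EQUAL TO": operator.ge,
    "LESS THAN": operator.lt,
    "LESS THAN OR EQUAL TO": operator.le,
    "NOT EQUAL TO": operator.ne,
}


@dataclass
class Alarm:
    """Hypothetical in-memory form of the New Alarm fields."""
    name: str
    severity: str
    metric: str
    comparison: str
    threshold: float

    def is_triggered(self, metrics: dict[str, float]) -> bool:
        """Compare the current metric value against the threshold."""
        value = metrics[self.metric]
        return OPERATORS[self.comparison](value, self.threshold)


# Loosely based on the step 6 example: noHtmlPct >= 99%.
alarm = Alarm("HTML missing", "CRITICAL", "noHtmlPct", "GREATER THAN OR EQUAL TO", 99.0)
print(alarm.is_triggered({"noHtmlPct": 99.5}))  # True
print(alarm.is_triggered({"noHtmlPct": 12.0}))  # False
```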

Five Common Metrics:

  1. Delivery Health – Block: The blocked count, or data that could not be imported. This might be
    triggered, for example, by CAPTCHA problems, proxy pools, or backend issues.

  2. Delivery Health – Error: This might be a system error such as an unavailable proxy or bad code
    in an Extractor.

  3. Delivery Health – HTML: This metric is specific to providing actual pages along with parsed data.
    A problem exists if the page content is only a skeleton or the page cannot be found.

  4. Delivery Speed – Input: This metric is determined by the Collection window speed. An example is
    1M inputs in one hour that do not finish and time out. If there are too many inputs (based on the
    designated Maximum Inputs), a failure will occur.

  5. Delivery Speed – Snapshot: This metric is indicative of the Delivery window speed.

Delete an Alarm

You cannot delete Alarms.