Observability, Integration health, Status models, Operational trust

Integration Data Health and Observability

Designing trust and observability into integration data flows so clients and internal teams could detect, understand, and act on sync failures.

Figure: Public-safe grayscale data health dashboard concept showing connectors, health status cards, event volume, success rate, and average latency metrics.

1. Executive Summary

Integration Data Health addressed a trust problem: clients and internal teams needed clearer ways to understand whether integration data was actively syncing and when action was needed.

This work connects directly to observability, monitoring, system status, and operational workflow design. The public version should be treated as a supporting strategy case study until shipped scope and metrics are confirmed.

2. System Context

The core user question was:

Is my integration working?

The system needed to connect raw technical signals to understandable health states across many vendors, features, expected sync frequencies, internal admin surfaces, and client-facing integration surfaces.

Conceptual progression:

Signal collection -> Status visibility -> Internal alerting -> Client alerting

Visual Artifacts

Data domains low-fi

Low-fidelity concept for domain-level health, showing data domains, health states, and attention indicators across user events, campaigns, journeys, and attributes.

3. The Real Problems

Clients had limited visibility into whether an integration was actively syncing data. Silent failures could affect campaigns, shopper data, and business outcomes before anyone noticed.

Internal teams also needed better ways to reason about integration health without relying only on engineering investigation.

Health definitions varied by feature and vendor, so the model needed to be vendor-agnostic while remaining specific enough to guide action.

4. Design Principles

  • Start with the user's trust question.
  • Status should describe system behavior in human terms.
  • Internal visibility can be deeper than client-facing communication.
  • Health states need clear thresholds and next actions.
  • Observability UX should reduce ambiguity without creating false certainty.
  • Roadmap vision should translate into operational milestones.

5. Major Decisions

Define A Vendor-Agnostic Status Model

The model needed to work across many integrations and feature-specific signals. The design direction focused on overall and per-feature health, last successful sync, expected frequency, and understandable degraded or broken states.
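
As an illustration only, the status model described above can be sketched in a few lines of Python. This is not the shipped implementation: the names (Health, FeatureSync, feature_health, overall_health), the threshold multipliers, and the rule that overall health equals the worst feature state are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import IntEnum
from typing import Iterable, Optional


class Health(IntEnum):
    # Ordered so that max() returns the worst state.
    HEALTHY = 0
    DEGRADED = 1
    BROKEN = 2


@dataclass(frozen=True)
class FeatureSync:
    """One feature of one integration (hypothetical shape)."""
    name: str
    expected_interval: timedelta               # how often a sync is expected
    last_successful_sync: Optional[datetime]   # None means it never synced


def feature_health(
    feature: FeatureSync,
    now: datetime,
    degraded_after: float = 1.5,  # assumed: late by 1.5x the expected interval
    broken_after: float = 3.0,    # assumed: late by 3x the expected interval
) -> Health:
    """Classify a feature by how overdue its last successful sync is."""
    if feature.last_successful_sync is None:
        return Health.BROKEN
    # timedelta / timedelta yields a plain float ratio.
    lateness = (now - feature.last_successful_sync) / feature.expected_interval
    if lateness > broken_after:
        return Health.BROKEN
    if lateness > degraded_after:
        return Health.DEGRADED
    return Health.HEALTHY


def overall_health(features: Iterable[FeatureSync], now: datetime) -> Health:
    """Assumed roll-up rule: the integration is as healthy as its worst feature."""
    return max((feature_health(f, now) for f in features), default=Health.HEALTHY)
```

Because the classification depends only on expected frequency and last successful sync, nothing in the sketch refers to a specific vendor. Vendor- or feature-specific signals would feed the inputs rather than change the states.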

Separate Internal And Client-Facing Visibility

Internal admin tools could expose deeper diagnostic context. Client-facing marketplace surfaces needed simpler, action-oriented status that helped users understand whether something was healthy, degraded, disabled, or broken.
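
One way to picture this split is a single health record projected into two views. The sketch below is hypothetical: the field names, the error code, and the client-facing copy are invented for illustration and do not reflect the actual product.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical client-facing copy: a plain-language label plus a next action
# for each of the four states named in this case study.
CLIENT_COPY = {
    "healthy": ("Syncing normally", "No action needed."),
    "degraded": ("Sync delayed", "Data may be behind; check again shortly."),
    "disabled": ("Integration turned off", "Re-enable the integration to resume syncing."),
    "broken": ("Not syncing", "Review your connection settings or contact support."),
}


@dataclass(frozen=True)
class HealthRecord:
    """Raw health data for one integration (assumed shape)."""
    integration: str
    status: str                     # one of the CLIENT_COPY keys
    last_error_code: Optional[str]  # diagnostic detail, internal only
    failed_events: int
    total_events: int


def internal_view(record: HealthRecord) -> dict:
    """Deeper diagnostic context for internal admin tools."""
    rate = record.failed_events / record.total_events if record.total_events else 0.0
    return {
        "integration": record.integration,
        "status": record.status,
        "last_error_code": record.last_error_code,
        "failed_events": record.failed_events,
        "total_events": record.total_events,
        "failure_rate": rate,
    }


def client_view(record: HealthRecord) -> dict:
    """Simpler, action-oriented status; diagnostic fields are left out."""
    label, next_action = CLIENT_COPY[record.status]
    return {
        "integration": record.integration,
        "status": record.status,
        "label": label,
        "next_action": next_action,
    }
```

Both views read from the same record, so the client-facing status cannot drift from what internal teams see. Only the level of detail differs.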

Treat Alerting As A Progression

Rather than jumping directly to client alerting, the model progressed from signal collection and visibility toward internal alerting and then client-facing notifications.
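
The progression can be read as a rollout gate: who gets alerted depends on how far the rollout has matured. The following is a minimal sketch under that assumption. The Stage enum, the ALERTABLE set, and audiences_to_alert are illustrative names, not part of the source system.

```python
from enum import IntEnum
from typing import List


class Stage(IntEnum):
    # Rollout stages in the order described above.
    SIGNAL_COLLECTION = 1
    STATUS_VISIBILITY = 2
    INTERNAL_ALERTING = 3
    CLIENT_ALERTING = 4


# Assumption: only these states are worth alerting on.
ALERTABLE = {"degraded", "broken"}


def audiences_to_alert(stage: Stage, status: str) -> List[str]:
    """Return who should be alerted for a status at a given rollout stage."""
    if status not in ALERTABLE:
        return []
    audiences = []
    # Internal teams are alerted first, so noisy signals are caught
    # before they ever reach clients.
    if stage >= Stage.INTERNAL_ALERTING:
        audiences.append("internal")
    if stage >= Stage.CLIENT_ALERTING:
        audiences.append("client")
    return audiences
```

For example, a broken integration at the status-visibility stage alerts no one, at the internal-alerting stage alerts only internal teams, and at the client-alerting stage alerts both.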

6. System Evolution

This began as future-state Data Health and Observability vision work, then branched into roadmap-aligned milestones such as Data Health, Integration Health, and Product Catalog Health.

Public-safe roadmap model:

Future-state vision
-> Integration Health MVP
-> Internal visibility
-> Client-facing status
-> Alerting and actionability

7. Collaboration Model

This work required close engineering partnership because the spec and signals were technical. The design role was to translate data-flow behavior into status models, marketplace concepts, and internal admin concepts, and to support clearer PM and Engineering decisions.

The source material notes ongoing PM and engineering syncs for operationalizing Integration Health. Public details should stay generalized until shipped scope is confirmed.

8. Results

Product outcomes:

  • Defined public-safe models for integration health, status, sync visibility, and internal/client-facing observability.
  • Helped translate future-state Data Health and Observability direction into roadmap-aligned Integration Health work.

Metrics:

  • Public support-ticket metric: Metric needed.
  • Time to detect integration failure: Metric needed.
  • Time to resolve sync issues: Metric needed.
  • Client self-service fix rate: Metric needed.
  • Number of integrations covered by health definitions: Metric needed.

9. Reflection

The key lesson is that trust in platform systems depends on visibility. Users do not only need a connection to exist; they need to know whether it is working, when it last worked, and what action is appropriate when it degrades.

Good observability UX is not just about exposing backend signals. It is about turning signals into confidence, prioritization, and next steps.

10. Reusable Patterns

  • Signal -> status -> alert models.
  • Internal vs client-facing visibility.
  • Healthy / degraded / broken states.
  • Last successful sync context.
  • Expected frequency models.
  • Feature-level health.
  • Trust-centered status design.
  • Vision-to-roadmap translation.

11. Artifact Gallery

Public-safe visuals should include:

  • "Is my integration working?" journey.
  • Data flow visibility model.
  • Signal -> status -> alert progression.
  • Internal vs external observability layers.
  • Healthy / degraded / broken state model.
  • Redacted marketplace status examples.

Some visuals have been recreated or simplified to protect confidential product details.

12. Lessons That Carried Forward

This work is a continuation of the integration platform foundation and EDS. First, the platform needed reusable setup and configuration patterns. Then, as integrations became operational infrastructure, users needed health, status, troubleshooting, and trust.