Live Streaming

Real-time Monitoring

Watch your audit happen live. WebSocket streaming gives you instant visibility into every URL crawled, every link found.

Overview

What Is Real-time Monitoring?

Real-time monitoring means you get instant updates as your website audit progresses: no waiting, no guessing, no black box. Using Socket.IO, which runs over WebSocket where available, our system streams live progress data directly to your browser as URLs are crawled.

The moment a page fetch starts, the URL appears in the "FETCHING" list. When the fetch completes, counters update immediately: URLs fetched increments, newly discovered links are added to URLs found, and external links are counted. All of this happens in real time, with sub-second latency.
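To make that flow concrete, here is a minimal sketch of how a browser-side view could react to fetch events. The event shapes (`fetch_start`, `fetch_end`) and field names are hypothetical, chosen only for illustration; the real payload may carry running totals instead of per-fetch deltas.

```typescript
// Hypothetical event shapes, assumed for this sketch only.
type CrawlEvent =
  | { type: "fetch_start"; url: string }
  | { type: "fetch_end"; url: string; internalLinksFound: number; externalLinksFound: number };

interface CrawlView {
  fetching: string[]; // URLs currently in flight (the "FETCHING" list)
  urlsFetched: number;
  urlsFound: number;
  externalUrls: number;
}

const emptyView: CrawlView = { fetching: [], urlsFetched: 0, urlsFound: 0, externalUrls: 0 };

// Pure update function: returns a new view, never mutates the old one.
function applyCrawlEvent(view: CrawlView, event: CrawlEvent): CrawlView {
  switch (event.type) {
    case "fetch_start":
      // The URL shows up in the FETCHING list as soon as the request begins.
      return { ...view, fetching: [...view.fetching, event.url] };
    case "fetch_end":
      // The URL leaves the list and the counters move together.
      return {
        fetching: view.fetching.filter((u) => u !== event.url),
        urlsFetched: view.urlsFetched + 1,
        urlsFound: view.urlsFound + event.internalLinksFound,
        externalUrls: view.externalUrls + event.externalLinksFound,
      };
  }
}
```

Keeping the update pure makes it easy to replay a stream of events after a reconnect and arrive at the same view.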

This isn't just a progress bar that estimates completion. This is actual, live data from the crawling engine. You see the exact URLs being processed, the current depth level, memory usage, elapsed time, and sitemap validation results—all updating continuously.

Real-time monitoring transforms the audit experience from "submit and wait" to "watch and understand." You can debug issues on the fly, cancel audits if they're going wrong, and share live progress with team members or clients.

See It In Action

Create an audit and watch the report page. You'll see live updates streaming in: URLs being fetched appear in real time, progress bars move smoothly, and the report completes automatically with no page refresh needed.

Capabilities

Monitoring Features

Comprehensive real-time visibility into audit progress

WebSocket Streaming

Live bi-directional connection using Socket.IO. Get instant updates without polling or page refreshes.

Live URL Tracking

See exactly which URLs are being fetched right now. Watch concurrent crawls happening in parallel.

Progress Metrics

Real-time counters for URLs fetched, URLs found, external links blocked, and current depth level.

Sitemap Validation Updates

Track crawlable vs non-crawlable sitemap URLs as they're discovered. No waiting until completion.

Elapsed Time Counter

Live timer showing exactly how long the audit has been running. Updates every second.

Progress Bar

Visual completion indicator based on page limits and depth. See how close you are to finishing.
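As a rough illustration of the timer and progress bar, the sketch below formats elapsed seconds as HH:MM:SS and estimates completion from the page limit and depth. The formula (take whichever limit is closer to being reached) and the parameter names `pageLimit` and `maxDepth` are assumptions; the actual calculation is not specified here.

```typescript
// Format a number of seconds as HH:MM:SS for the elapsed-time display.
function formatElapsed(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const pad = (n: number): string => String(n).padStart(2, "0");
  return `${pad(Math.floor(s / 3600))}:${pad(Math.floor((s % 3600) / 60))}:${pad(s % 60)}`;
}

// Hypothetical inputs: the limits configured for the audit plus live counters.
interface ProgressInputs {
  urlsFetched: number;
  pageLimit: number;
  depth: number;
  maxDepth: number;
}

// Assumed heuristic: the audit ends when either limit is hit, so report
// whichever ratio is further along, clamped to the 0..100 range.
function estimateProgress(p: ProgressInputs): number {
  const byPages = p.pageLimit > 0 ? p.urlsFetched / p.pageLimit : 0;
  const byDepth = p.maxDepth > 0 ? p.depth / p.maxDepth : 0;
  const ratio = Math.min(1, Math.max(0, byPages, byDepth));
  return Math.round(ratio * 100);
}
```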

Data Points

What You Can Monitor

Every metric tracked and updated in real time

OUTPUT_FILE

Report filename being generated

Updates: Once at start

Example: audit_1731536789.txt

SITEMAP_AUDIT

Crawlable vs non-crawlable sitemap URLs

Updates: After each fetch

Example: CAN_CRAWL(T): 245 / CAN_CRAWL(F): 12

URLS_FETCHED

Total pages successfully crawled

Updates: After each fetch

Example: 142

URLS_FOUND

Pages discovered but not yet crawled

Updates: After each fetch

Example: 387

EXTERNAL_URLS

External links found and blocked

Updates: After each fetch

Example: 52

DEPTH_COUNT

Current crawl depth level

Updates: When depth changes

Example: Level 3

FETCHING

URLs being fetched right now (concurrent)

Updates: Every fetch start/end

Example: 5 concurrent requests

MEMORY

Heap used/total and RSS in MB

Updates: After each fetch

Example: 45.2 MB / 512 MB heap, 72 MB RSS

ELAPSED_TIME

Time since audit started

Updates: Every second

Example: 00:03:47

Technology

How It Works Under the Hood

The technical architecture enabling real-time streaming

Socket.IO

WebSocket communication layer

Reliable real-time messaging with automatic reconnection and fallback to polling

vs. HTTP polling (inefficient)
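A client might wire up the connection along these lines. This is a sketch, not our frontend code: the `audit:progress` event name is hypothetical, and the small `EventSourceLike` interface stands in for the socket returned by `io()` from `socket.io-client`, so the snippet runs without that dependency. The `connect` and `disconnect` event names are the standard Socket.IO client events.

```typescript
// Minimal shape of anything with an on(event, handler) method.
// A Socket.IO client socket offers this method, as do simple test doubles.
interface EventSourceLike {
  on(event: string, handler: (payload?: unknown) => void): unknown;
}

type ConnectionStatus = "connecting" | "live" | "disconnected";

interface Monitor {
  status: ConnectionStatus;
  updates: unknown[]; // raw progress payloads, in arrival order
}

// progressEvent defaults to a hypothetical event name.
function attachMonitor(socket: EventSourceLike, progressEvent = "audit:progress"): Monitor {
  const monitor: Monitor = { status: "connecting", updates: [] };
  socket.on("connect", () => {
    monitor.status = "live";
  });
  socket.on("disconnect", () => {
    // Socket.IO normally retries on its own; the UI only needs to reflect the gap.
    monitor.status = "disconnected";
  });
  socket.on(progressEvent, (payload) => {
    monitor.updates.push(payload);
  });
  return monitor;
}
```

Because Socket.IO fires `connect` again after a successful reconnection, the status returns to "live" without extra code.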

JSON Streaming

Structured progress data format

Machine-readable updates with consistent schema. Easy to parse and display.

vs. Plain text parsing (error-prone)
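For illustration, a consumer could type and validate each update as shown below. The field names are the ones listed under Data Points; the value types are assumptions, and the shapes of SITEMAP_AUDIT and MEMORY are left as `unknown` because they are not specified here.

```typescript
// Field names come from the Data Points list; the types are assumed.
interface ProgressUpdate {
  OUTPUT_FILE?: string;
  SITEMAP_AUDIT?: unknown;
  URLS_FETCHED?: number;
  URLS_FOUND?: number;
  EXTERNAL_URLS?: number;
  DEPTH_COUNT?: number;
  FETCHING?: string[];
  MEMORY?: unknown;
  ELAPSED_TIME?: string;
}

const NUMERIC_FIELDS = ["URLS_FETCHED", "URLS_FOUND", "EXTERNAL_URLS", "DEPTH_COUNT"] as const;

// Returns the parsed update, or null when the line is not a usable JSON object.
function parseProgressLine(line: string): ProgressUpdate | null {
  let value: unknown;
  try {
    value = JSON.parse(line);
  } catch {
    return null; // malformed JSON: skip the line instead of crashing the UI
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) return null;
  const record = value as Record<string, unknown>;
  for (const field of NUMERIC_FIELDS) {
    // Counters that are present must be numbers.
    if (field in record && typeof record[field] !== "number") return null;
  }
  return record as ProgressUpdate;
}
```

Rejecting a malformed line rather than guessing at its contents is what makes a fixed schema less error-prone than scraping plain text.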

stdout Unbuffering

Immediate output from CLI process

Using stdbuf -o0 disables buffering so updates stream instantly

vs. Buffered output (delayed updates)
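The sketch below shows one way a server could launch a CLI under `stdbuf -o0` and turn its output into whole lines. Even with unbuffered output, chunks read from a pipe need not end on a line boundary, so the reader reassembles lines itself. The one-JSON-object-per-line framing and the `audit-cli.js` name in the usage note are assumptions.

```typescript
// Prefix a command with GNU coreutils' stdbuf so its stdout is unbuffered (-o0).
function unbufferedCommand(command: string, args: string[]): { command: string; args: string[] } {
  return { command: "stdbuf", args: ["-o0", command, ...args] };
}

// Returns a function that accepts raw text chunks and calls onLine once per
// complete, non-empty line. A partial trailing line waits for the next chunk.
function createLineSplitter(onLine: (line: string) => void): (chunk: string) => void {
  let pending = "";
  return (chunk: string): void => {
    pending += chunk;
    let newline = pending.indexOf("\n");
    while (newline !== -1) {
      const line = pending.slice(0, newline).trim();
      if (line.length > 0) onLine(line);
      pending = pending.slice(newline + 1);
      newline = pending.indexOf("\n");
    }
  };
}
```

With Node's `child_process.spawn`, the pieces would connect roughly as `const { command, args } = unbufferedCommand("node", ["audit-cli.js"])`, then `spawn(command, args)`, with the child's stdout set to UTF-8 and each `data` chunk passed to the splitter.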

Persistent State

Job survives disconnects

SQLite database maintains state. Reconnect anytime to resume watching.

vs. Memory-only (lost on disconnect)

Comparison

Why Real-time Beats Other Approaches

See the difference that WebSocket streaming makes

No Monitoring
Submit form, wait, pray it works

User Experience

Black box. No visibility into progress.

Problems

Uncertainty, anxiety, can't debug

Polling
Frontend requests status every N seconds

User Experience

Delayed updates. Server strain from constant requests.

Problems

Inefficient, not real-time

WebSocket (Ours) - Recommended
Live stream of events as they happen

User Experience

Instant updates. Watch every URL being crawled.

Problems

None. Best approach.

Benefits

Why It Matters

The business value of real-time visibility

Transparency

No black box waiting. You see exactly what's happening at every moment during the crawl.

100% visibility

Debugging

If something goes wrong, real-time logs reveal the issue immediately. Fix problems faster.

Instant feedback

Trust

Clients and stakeholders can watch audits in progress. Builds confidence in the process.

User confidence

Control

Cancel long-running audits if you see they're going off-track. No wasting resources.

Stop anytime

Technical Achievement

Building real-time streaming required solving multiple technical challenges:

  • Stdout buffering: Node.js buffers CLI output. We wrap the CLI in stdbuf -o0 to disable buffering.
  • JSON parsing: We switched from console.log() to process.stdout.write() for reliable JSON formatting.
  • Field naming: JSON output matches the field names the frontend expects (UPPERCASE format).
  • Reconnection: Socket.IO reconnects automatically if the WebSocket connection drops.
  • State persistence: A SQLite database ensures audit state survives disconnects and server restarts.

✅ All systems operational (Tested 2025-11-07)
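The JSON formatting and field naming points can be sketched together. The helper below converts camelCase keys to the UPPERCASE names the frontend expects and writes one JSON object per line through an injected writer. The function names, the camelCase input, and the newline-delimited framing are assumptions made for illustration.

```typescript
// Any function that accepts text. In a Node.js CLI this could be
// (text) => process.stdout.write(text).
type Writer = (text: string) => void;

// "urlsFetched" -> "URLS_FETCHED"
function toUpperSnake(key: string): string {
  return key.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toUpperCase();
}

// Serialize one progress update as a single line of JSON.
function emitProgress(fields: Record<string, unknown>, write: Writer): void {
  const payload: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    payload[toUpperSnake(key)] = value;
  }
  // Exactly one object followed by one newline, so the reader can split on "\n".
  write(JSON.stringify(payload) + "\n");
}
```

Passing the writer in, rather than calling `process.stdout.write()` directly, keeps the formatting logic testable without spawning a process.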

Experience Real-time Monitoring

See your audit come to life. Watch every URL being crawled and every metric updating in real time.

Free plan includes real-time monitoring • No credit card required