All input is normalized to a consistent format before Whisper runs. This is where quality is locked in — Whisper gets clean, standardized input regardless of how the original was recorded.
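In practice, normalization for Whisper typically means extracting the audio track and resampling it to 16 kHz mono PCM, the input Whisper models expect. A minimal sketch of that step (a hypothetical helper, not Signal Loom's actual pipeline) that builds the ffmpeg command:

```python
# Hypothetical normalization helper: build an ffmpeg command that converts
# any audio/video input into the 16 kHz mono WAV Whisper models expect.

def ffmpeg_normalize_cmd(src: str, dst: str) -> list[str]:
    """Return an ffmpeg invocation that extracts and normalizes audio from src."""
    return [
        "ffmpeg",
        "-i", src,           # any container/codec ffmpeg can read
        "-vn",               # drop any video stream, keep audio only
        "-ac", "1",          # downmix to mono
        "-ar", "16000",      # resample to 16 kHz
        "-c:a", "pcm_s16le", # 16-bit PCM WAV
        dst,
    ]

cmd = ffmpeg_normalize_cmd("talk.mp4", "talk.wav")
# Execute with: subprocess.run(cmd, check=True)
```

Running this once up front is what lets the rest of the pipeline treat every input identically, whether it started life as a phone recording or a screen capture.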
mlx-community/whisper-large-v3-turbo · Local inference
Whisper runs entirely on your hardware via Apple's MLX framework. No cloud. No API calls. No per-minute costs. The audio never leaves your infrastructure. Privacy by architecture.
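A minimal local-transcription sketch, assuming the `mlx_whisper` Python package is installed on Apple Silicon (the exact integration inside Signal Loom may differ):

```python
def transcribe_local(audio_path: str) -> dict:
    """Run Whisper entirely on-device via MLX; no network calls are made
    after the model weights have been cached locally."""
    # Lazy import: requires Apple Silicon and `pip install mlx-whisper`.
    import mlx_whisper

    return mlx_whisper.transcribe(
        audio_path,
        path_or_hf_repo="mlx-community/whisper-large-v3-turbo",
    )

# result = transcribe_local("talk.wav")
# result["language"] and result["segments"] hold the detected language
# and the timed segments shown in the console output below.
```

Because inference is a local function call rather than an API request, cost scales with your hardware, not with minutes of audio.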
Console output
Detected language: English
[00:00.480 --> 00:05.640]
Hello, this is Travis Brady with AIM-T Pulse...
[00:06.460 --> 00:11.160]
More extemporaneous narrative through the slides today.
[00:12.980 --> 00:14.900]
Excited to tell you about our company.
[00:16.340 --> 00:16.880]
Slide two.
Segments: 4 · Chars: 186 · Time: 4.28s
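The bracketed console lines above follow a simple `[mm:ss.mmm --> mm:ss.mmm]` pattern over each segment's start and end seconds. A small formatter (a sketch, not the project's actual logging code) reproduces it:

```python
def ts(seconds: float) -> str:
    """Format seconds as mm:ss.mmm, matching the console output above."""
    minutes = int(seconds // 60)
    return f"{minutes:02d}:{seconds % 60:06.3f}"

def console_line(start: float, end: float) -> str:
    """Render one segment's time range as a bracketed console marker."""
    return f"[{ts(start)} --> {ts(end)}]"

print(console_line(0.48, 5.64))  # [00:00.480 --> 00:05.640]
```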
Structured Output — Four Formats, One Pass
JSON · SRT · VTT · TXT
Every run produces all four output formats simultaneously. JSON for AI systems, SRT and VTT for subtitles and captions, TXT for simple archival. Timestamps on every word in every format.
JSON
SRT
VTT
TXT
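To make the subtitle formats concrete: SRT and VTT are near-identical cue lists that differ mainly in header and millisecond separator. A sketch of writers for both (assuming segments as `start`/`end`/`text` dicts; field names are illustrative, not the project's actual schema):

```python
def _stamp(seconds: float, ms_sep: str) -> str:
    """HH:MM:SS + milliseconds; SRT separates with ',', WebVTT with '.'."""
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}{ms_sep}{ms:03d}"

def to_srt(segments) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        cues.append(
            f"{i}\n{_stamp(seg['start'], ',')} --> {_stamp(seg['end'], ',')}\n{seg['text']}\n"
        )
    return "\n".join(cues)

def to_vtt(segments) -> str:
    """WebVTT: a WEBVTT header, no cue numbers required, dot separator."""
    cues = [
        f"{_stamp(seg['start'], '.')} --> {_stamp(seg['end'], '.')}\n{seg['text']}\n"
        for seg in segments
    ]
    return "WEBVTT\n\n" + "\n".join(cues)
```

TXT is just the concatenated segment text, and JSON is the structured form shown next, so one pass over the segment list is enough to emit all four.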
Structured JSON output
{
"title": "93e2e7cb...test2",
"source_kind": "local_av",
"media_kind": "audio",
"language": "en",
"model": "mlx-community/whisper-large-v3-turbo",
"duration_seconds": 20.0,
"segments": [
{
"segment_id": "S1",
"start_seconds": 0.48,
"end_seconds": 5.64,
"start_time": "00:00:00",
"end_time": "00:00:06",
"text": "Hello, this is Travis Brady with AIM-T Pulse and AIM Elemental Health Solutions."
},
{
"segment_id": "S2",
"start_seconds": 6.46,
"end_seconds": 11.16,
"text": "More extemporaneous narrative through the slides today."
}
]
}
This is the difference that matters.
A transcript is for humans. Structured, timestamped JSON is for machines.
Every word has a timestamp. Every segment has a time boundary. Every output carries the metadata that makes it useful to AI systems — not just readable by people. That's Signal Loom.
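For instance, a downstream system can parse the JSON above and seek straight to the segment covering any moment in the recording. A sketch, assuming the field names shown in the example:

```python
import json

def segment_at(transcript: dict, t: float):
    """Return the segment whose time boundary contains t seconds, if any."""
    for seg in transcript["segments"]:
        if seg["start_seconds"] <= t <= seg["end_seconds"]:
            return seg
    return None

# Abbreviated form of the structured output shown above.
doc = json.loads("""{
  "segments": [
    {"segment_id": "S1", "start_seconds": 0.48, "end_seconds": 5.64,
     "text": "Hello, this is Travis Brady with AIM-T Pulse..."},
    {"segment_id": "S2", "start_seconds": 6.46, "end_seconds": 11.16,
     "text": "More extemporaneous narrative through the slides today."}
  ]
}""")

hit = segment_at(doc, 8.0)
print(hit["segment_id"])  # S2
```

No transcript parsing, no guesswork: the machine-readable boundaries answer "what was said at second 8?" directly.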