TVR E2E Scenarios — Reference

Sources: TVR v2.3 §2.9.6 (Tables 2 & 3) · TVR v2.3 Appendix 1 · GTL v3 (Service level tests sheet) · STM

This document covers two related sources:

  1. TVR §2.9.6 — E2E performance tests confirmed as carried out in Phase 2 (Tables 2 and 3), assigning specific test beds to each scenario.
  2. TVR Appendix 1 — Complete catalogue of E2E service scenarios with detailed definitions, grouped as performable (A.1) and non-performable (A.2) in Phase 2.

For the full list of 16 service scenarios (Table 1), see Service Scenarios.


TVR §2.9.6 — Phase 2 E2E Performance Tests

Table 2: Strand 3–6 E2E Performance Tests in Phase 2

In addition to the Strand 2 tests listed in Table 3 below, E2E performance tests are carried out across Strand 3 (DTH — SES, EBU), Strand 4, Strand 5 (Maritime), and Strand 6 (Arctic Space / DTE). These include:

  • WP 3.1.5 (SES DTH): Startup delay comparison across CDN/Edge/satellite methods; buffering frequency under different delivery methods
  • WP 3.x (EBU DTH): Equivalent startup and buffering tests for bidirectional satellite reception
  • Strand 5 (Maritime): Buffering and resilience tests under maritime connectivity conditions
  • Strand 6 / Arctic Space: Latency by region (C.1); scalability and error rate tests

Full strand-specific test plans are documented in the respective strand/WP technical notes.

Table 3: Strand 2 (WP 2.1) Tests Confirmed for Phase 2

These are the Strand 2 tests (defined in TVR §2.9.6, Table 3) that will be carried out in Phase 2, alongside tests specific to each test bed. These tests depend on the ability of far/near-edge components to collect statistics from the user player in order to evaluate QoE and QoS; an illustrative sketch of such a statistics record is given after the table.

Scenario | Use Case | Test Bed(s) | Test Title | KPI
B | DTE | RAI | Assess if watch time drops with degraded quality | Average Watch Time per User
E | DTE | RAI | Test correlation between delivery method and playback continuity | Average Session Duration
G | DTE | RAI | Test if completions are higher in resilient methods (less buffering) | Content Completion Rate
I | DTE | Arctic Space | Quantify wasted resources if streams are abandoned early | Content Completion Rate
J | DTE | RAI | Identify abandonment spikes after buffering events | Content Abandonment Points
K | DTE | RAI | Test if users drop at predictable quality loss points | Content Abandonment Points
M | DTE, DTH | SES, EBU, Maritime, Arctic Space | Directly test startup times distribution across methods | Start-up Delay
O | DTE | Maritime | Stress-test buffering under poor networks vs resilient distribution | Buffering Ratio / Rebuffering Events
P | DTH, DTE | SES, EBU, Maritime, RAI | Compare buffering frequency under different methods | Buffering Ratio / Rebuffering Events
R | DTH, DTE | SES, EBU, Arctic Space | Track frequency of resolution switches under different delivery | Buffering Ratio / Rebuffering Events
T | DTE | Maritime | Test retries under network drops or overloaded servers | Error Rate (Failed Attempts)
V | DTE | Maritime, Arctic Space | Measure load effect on failure rates under stress tests | Error Rate (Failed Attempts)
W | DTE | Maritime | Quantify cost of wasted connection attempts | Error Rate (Failed Attempts)
A.2 | DTH, DTE | SES, EBU, Maritime, RAI | Track downtime as resilience indicator | Availability (Uptime %)
C.1 | DTE | Arctic Space | Measure response times across different geographies | Latency by Region

Note: A number of other Strand 2 scenarios assume large-scale deployment and are not feasible in Phase 2 — these are detailed in Appendix 1, Section A.2.
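
For illustration only, a minimal sketch of the kind of per-session player-statistics record a far/near-edge collector might ingest. The field names (startup_delay_s, rebuffering_events, etc.) are hypothetical and do not come from the TVR or the test-bed specifications.

    from dataclasses import dataclass

    # Illustrative only: these field names are hypothetical, not the TVR telemetry schema.
    @dataclass
    class PlayerStatsRecord:
        session_id: str
        delivery_method: str            # e.g. "cdn", "edge", "satellite"
        startup_delay_s: float          # request initiation to first rendered frame
        playback_time_s: float
        buffering_time_s: float
        rebuffering_events: int
        bitrate_switches_down: int
        failed_requests: int
        abandoned_at_s: float | None = None  # None if the stream was played to the end

    # Example record as a far/near-edge collector might receive it from a player.
    record = PlayerStatsRecord("s-001", "edge", 1.2, 540.0, 3.5, 1, 2, 0, None)
    print(record)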



A.1 — Performable Tests (Confirmed Feasible in Phase 2)

The following scenarios are confirmed as feasible in the Phase 2 test beds and must be listed as test cases in the respective strand/WP test documents.

Scenario | Title | KVI | KPI | Test Beds
B | Assess if watch time drops with degraded quality | Quality | Average Watch Time per User | DTE
E | Test correlation between delivery method and playback continuity | Quality | Average Session Duration | DTE
G | Test if completions are higher in resilient methods (less buffering) | Resilience | Content Completion Rate | DTE
I | Quantify wasted resources if streams are abandoned early | Cost Efficiency | Content Completion Rate | DTE
J | Identify abandonment spikes after buffering events | Resilience | Content Abandonment Points | DTE
K | Test if users drop at predictable quality loss points | Quality | Content Abandonment Points | DTE
M | Directly test startup times distribution across methods | Quality | Start-up Delay | DTE & DTH
O | Stress-test buffering under poor networks vs resilient distribution | Resilience | Buffering Ratio / Rebuffering Events | DTE
P | Compare buffering frequency under different methods | Quality | Buffering Ratio / Rebuffering Events | DTE
R | Track frequency of resolution switches under different delivery | Quality | Bitrate Stability / QoE | DTE & DTH
T | Test retries under network drops or overloaded servers | Resilience | Error / Failure Rate | DTE
V | Measure load effect on failure rates under stress tests | Scalability | Error / Failure Rate | DTE
W | Quantify cost of wasted connection attempts | Cost Efficiency | Error / Failure Rate | DTE
A.2 | Track downtime as resilience indicator | Resilience | Uptime / Availability | DTE & DTH
C.1 | Measure response times across different geographies | Quality | Latency by Region | DTE

Scenario Descriptions

Scenario B — Assess if watch time drops with degraded quality

  • KVI: Quality · KPI: Average Watch Time per User · Test Beds: DTE
  • Input: Playback session logs with quality indicators (bitrate, resolution, buffering events); controlled quality degradation conditions (reduced bitrate, simulated packet loss, induced buffering); user engagement data under optimal and degraded conditions.
  • Output: Average watch time (min/user) segmented by distribution method and quality level; correlation analysis between quality degradation and watch time reduction.
  • Expectation: Watch time decreases as quality deteriorates; adaptive methods (CDN with ABR) mitigate the reduction.
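
A minimal sketch of the watch-time analysis, assuming hypothetical log fields (rebuffer_ratio, watch_time_min) and an illustrative 5% degradation threshold not taken from the TVR:

    from collections import defaultdict
    from statistics import correlation, mean  # statistics.correlation needs Python 3.10+

    # Hypothetical per-session records; real values come from test-bed player telemetry.
    sessions = [
        {"user": "u1", "rebuffer_ratio": 0.00, "watch_time_min": 42.0},
        {"user": "u1", "rebuffer_ratio": 0.08, "watch_time_min": 18.5},
        {"user": "u2", "rebuffer_ratio": 0.02, "watch_time_min": 35.0},
        {"user": "u2", "rebuffer_ratio": 0.12, "watch_time_min": 9.0},
    ]

    # Average watch time per user, split into "optimal" vs "degraded" sessions
    # (the 5% rebuffering threshold is an illustrative choice, not a TVR value).
    by_segment = defaultdict(lambda: defaultdict(list))
    for s in sessions:
        segment = "degraded" if s["rebuffer_ratio"] > 0.05 else "optimal"
        by_segment[segment][s["user"]].append(s["watch_time_min"])

    for segment, users in by_segment.items():
        per_user_avg = [mean(times) for times in users.values()]
        print(segment, "avg watch time per user:", round(mean(per_user_avg), 1), "min")

    # Simple correlation between degradation and watch time (expected to be negative).
    r = correlation([s["rebuffer_ratio"] for s in sessions],
                    [s["watch_time_min"] for s in sessions])
    print("Pearson r (rebuffer ratio vs watch time):", round(r, 2))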

Scenario E — Test correlation between delivery method and playback continuity

  • KVI: Quality · KPI: Average Session Duration · Test Beds: DTE
  • Input: Playback session logs with timestamps (start, pauses, interruptions, end); distribution method identifiers; quality indicators (bitrate shifts, buffering events, dropped frames).
  • Output: Average session duration mapped against percentage of uninterrupted playback time; correlation metrics between delivery method and continuity.
  • Expectation: Methods with adaptive streaming and localised infrastructure show higher playback continuity and longer sessions.

Scenario G — Test if completions are higher in resilient methods (less buffering)

  • KVI: Resilience · KPI: Content Completion Rate · Test Beds: DTE
  • Input: Playback session logs showing start/end points per stream; event data on buffering frequency, duration, severity; network condition parameters.
  • Output: Content completion rates (% of streams played to end) per distribution method; correlation between buffering events and early abandonment.
  • Expectation: Resilient distribution methods (adaptive bitrate via CDN or edge caching) support higher completion rates by mitigating buffering.
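
A minimal sketch of the completion-rate calculation, using hypothetical (method, completed) stream records rather than real test-bed data:

    from collections import defaultdict

    # Hypothetical stream records: (delivery_method, played_to_end).
    streams = [
        ("cdn_abr", True), ("cdn_abr", True), ("cdn_abr", False),
        ("direct", True), ("direct", False), ("direct", False),
    ]

    totals, completed = defaultdict(int), defaultdict(int)
    for method, done in streams:
        totals[method] += 1
        completed[method] += int(done)

    for method in totals:
        rate = 100.0 * completed[method] / totals[method]
        print(f"{method}: completion rate {rate:.1f}% ({completed[method]}/{totals[method]})")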

Scenario I — Quantify wasted resources if streams are abandoned early

  • KVI: Cost Efficiency · KPI: Content Completion Rate · Test Beds: DTE (Arctic Space)
  • Input: Playback session logs showing stream start/end points; resource consumption data (bandwidth, compute) per stream; abandonment timestamps and causes.
  • Output: Resource waste ratio (resources consumed by abandoned streams vs. completed streams); cost model for wasted delivery per distribution method.
  • Expectation: Methods with higher completion rates waste fewer resources; early abandonment in centralised delivery is costlier due to longer data paths.
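
A minimal sketch of the resource-waste ratio, assuming hypothetical per-stream delivery volumes; the figures and method names are illustrative only:

    # Hypothetical per-stream records; bandwidth figures are illustrative, not measured values.
    streams = [
        {"method": "cdn_edge",    "completed": True,  "delivered_mb": 900},
        {"method": "cdn_edge",    "completed": False, "delivered_mb": 250},
        {"method": "centralised", "completed": True,  "delivered_mb": 950},
        {"method": "centralised", "completed": False, "delivered_mb": 600},
    ]

    for method in sorted({s["method"] for s in streams}):
        wasted = sum(s["delivered_mb"] for s in streams
                     if s["method"] == method and not s["completed"])
        useful = sum(s["delivered_mb"] for s in streams
                     if s["method"] == method and s["completed"])
        ratio = wasted / useful if useful else float("inf")
        print(f"{method}: waste ratio {ratio:.2f} (wasted {wasted} MB vs completed {useful} MB)")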

Scenario J — Identify abandonment spikes after buffering events

  • KVI: Resilience · KPI: Content Abandonment Points · Test Beds: DTE
  • Input: Playback logs showing exact abandonment timestamps; event data on buffering (frequency, duration, severity) preceding abandonment.
  • Output: Correlation between buffering events and abandonment points; identification of thresholds where buffering causes significant user drop-off.
  • Expectation: Abandonment spikes cluster immediately after prolonged or repeated buffering; resilient methods show fewer abandonment spikes.
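
A minimal sketch of how abandonment events could be attributed to preceding buffering, assuming a hypothetical 30-second attribution window that is not a TVR-defined threshold:

    # Hypothetical event timelines per session (seconds from playback start).
    WINDOW_S = 30  # illustrative "shortly after buffering" window, not a TVR value
    sessions = [
        {"buffering_at": [120, 305], "abandoned_at": 310},   # abandoned 5 s after a rebuffer
        {"buffering_at": [60],       "abandoned_at": 900},   # abandoned long after buffering
        {"buffering_at": [],         "abandoned_at": None},  # played to the end
    ]

    abandoned = [s for s in sessions if s["abandoned_at"] is not None]
    after_buffer = [
        s for s in abandoned
        if any(0 <= s["abandoned_at"] - t <= WINDOW_S for t in s["buffering_at"])
    ]
    share = 100.0 * len(after_buffer) / len(abandoned) if abandoned else 0.0
    print(f"{share:.0f}% of abandonments occurred within {WINDOW_S}s of a buffering event")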

Scenario K — Test if users drop at predictable quality loss points

  • KVI: Quality · KPI: Content Abandonment Points · Test Beds: DTE
  • Input: Playback logs recording quality degradation events (bitrate reductions, resolution drops); session abandonment logs aligned with those events.
  • Output: Patterns showing correlation between quality loss and session drop-offs; comparative data across distribution methods.
  • Expectation: Predictable abandonment points emerge at moments of severe quality loss; methods with adaptive bitrate may prevent sharp drops.

Scenario M — Directly test startup times distribution across methods

  • KVI: Quality · KPI: Start-up Delay · Test Beds: DTE & DTH
  • Input: Time measurements from user request initiation to video playback start across CDN, Edge, etc.; controlled tests under consistent network conditions.
  • Output: Startup delay (seconds) for each method, reported as average, min, max, p95, and p99.9, with histograms; comparative dataset of responsiveness across methods.
  • Expectation: Edge or adaptive methods achieve lower average startup delays; centralised methods may show longer delays depending on server distance.
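
A minimal sketch of the startup-delay statistics, using the standard-library quantiles function on hypothetical timing samples (a meaningful p99.9 figure would require far more samples than shown here):

    from statistics import mean, quantiles

    # Hypothetical startup delays (seconds) per delivery method from controlled runs.
    startup = {
        "edge":        [0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 2.1, 0.9, 1.2, 1.0],
        "centralised": [2.4, 3.1, 2.8, 4.0, 2.6, 3.3, 5.2, 2.9, 3.5, 3.0],
    }

    for method, delays in startup.items():
        pct = quantiles(delays, n=100)  # 99 cut points: index 94 ~ p95, index 98 ~ p99
        print(f"{method}: avg {mean(delays):.2f}s  min {min(delays):.2f}s  "
              f"max {max(delays):.2f}s  p95 {pct[94]:.2f}s  p99 {pct[98]:.2f}s")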

Scenario O — Stress-test buffering under poor networks vs resilient distribution

  • KVI: Resilience · KPI: Buffering Ratio / Rebuffering Events · Test Beds: DTE
  • Input: Playback tests under controlled poor network conditions (reduced bandwidth, induced packet loss); session logs capturing buffering time and total playback time.
  • Output: Buffering ratios across methods under identical degraded conditions; comparative resilience scores.
  • Expectation: Resilient methods maintain lower buffering ratios under stress; non-adaptive methods degrade more sharply under poor conditions.
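
A minimal sketch of the buffering-ratio comparison under identical induced impairment; the ratio here is buffering time over total session time, and the exact definition should follow the test-bed KPI specification:

    # Hypothetical session logs captured under the same induced network impairment.
    sessions = [
        {"method": "adaptive_cdn", "buffering_s": 4.0,  "playback_s": 600.0},
        {"method": "adaptive_cdn", "buffering_s": 6.5,  "playback_s": 540.0},
        {"method": "non_adaptive", "buffering_s": 38.0, "playback_s": 600.0},
        {"method": "non_adaptive", "buffering_s": 51.0, "playback_s": 480.0},
    ]

    for method in sorted({s["method"] for s in sessions}):
        buf = sum(s["buffering_s"] for s in sessions if s["method"] == method)
        play = sum(s["playback_s"] for s in sessions if s["method"] == method)
        print(f"{method}: buffering ratio {100.0 * buf / (buf + play):.1f}%")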

Scenario P — Compare buffering frequency under different methods

  • KVI: Quality · KPI: Buffering Ratio / Rebuffering Events · Test Beds: DTE
  • Input: Playback logs recording frequency of buffering events per stream; method identifiers (CDN, Edge, etc.).
  • Output: Mean, distribution, and peak buffering frequency per method; comparative dataset highlighting which methods deliver smoother playback.

Scenario R — Track frequency of resolution switches under different delivery

  • KVI: Quality · KPI: Bitrate Stability / QoE · Test Beds: DTE & DTH
  • Input: Playback event logs recording bitrate/resolution switches; timestamps and network conditions during switches.
  • Output: Resolution switch frequency per method; distribution of switch magnitude (up vs down); QoE impact correlation.
  • Expectation: More stable delivery methods show fewer downward resolution switches under variable conditions.
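
A minimal sketch of counting upward and downward bitrate/resolution switches from hypothetical per-session bitrate traces:

    from collections import Counter

    # Hypothetical bitrate traces (kbps over time) per delivery method.
    ladders = {
        "stable_delivery":   [5000, 5000, 5000, 3500, 5000, 5000],
        "variable_delivery": [5000, 3500, 2000, 3500, 2000, 5000],
    }

    for method, bitrates in ladders.items():
        switches = Counter()
        for prev, cur in zip(bitrates, bitrates[1:]):
            if cur < prev:
                switches["down"] += 1
            elif cur > prev:
                switches["up"] += 1
        print(f"{method}: {sum(switches.values())} switches "
              f"({switches['down']} down, {switches['up']} up)")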

Scenario T — Test retries under network drops or overloaded servers

  • KVI: Resilience · KPI: Error / Failure Rate · Test Beds: DTE
  • Input: Request logs under simulated network drops and server overload conditions; method identifiers.
  • Output: Retry rates and failure rates per method; time-to-recovery measurements.
  • Expectation: Resilient methods with redundancy (CDN, edge failover) show lower failure rates and faster recovery.
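
A minimal sketch of retry, failure-rate, and time-to-recovery aggregation over hypothetical request logs; method labels are illustrative, not test-bed identifiers:

    from statistics import mean

    # Hypothetical request logs recorded while a network drop is being injected.
    requests = [
        {"method": "cdn_failover",  "retries": 1, "succeeded": True,  "recovery_s": 2.0},
        {"method": "cdn_failover",  "retries": 0, "succeeded": True,  "recovery_s": 0.0},
        {"method": "single_origin", "retries": 3, "succeeded": False, "recovery_s": None},
        {"method": "single_origin", "retries": 2, "succeeded": True,  "recovery_s": 9.5},
    ]

    for method in sorted({r["method"] for r in requests}):
        rows = [r for r in requests if r["method"] == method]
        failure_rate = 100.0 * sum(not r["succeeded"] for r in rows) / len(rows)
        avg_retries = mean(r["retries"] for r in rows)
        recoveries = [r["recovery_s"] for r in rows if r["recovery_s"] is not None]
        print(f"{method}: failure rate {failure_rate:.0f}%, avg retries {avg_retries:.1f}, "
              f"avg time-to-recovery {mean(recoveries):.1f}s")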

Scenario V — Measure load effect on failure rates under stress tests

  • KVI: Scalability · KPI: Error / Failure Rate · Test Beds: DTE
  • Input: Controlled load tests (scaling concurrent viewers); error and failure event logs.
  • Output: Failure rate as a function of load per distribution method; identification of capacity thresholds.
  • Expectation: Distributed methods (CDN, edge) maintain lower failure rates as load increases compared to centralised approaches.
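
A minimal sketch of failure rate as a function of load, flagging the first concurrency step that exceeds an illustrative 1% error budget (not a TVR threshold):

    # Hypothetical stress-test results: failures observed at each concurrency step.
    load_steps = [
        {"viewers": 100,  "requests": 1000,  "failures": 2},
        {"viewers": 500,  "requests": 5000,  "failures": 15},
        {"viewers": 1000, "requests": 10000, "failures": 120},
    ]

    # The "capacity threshold" here is simply the first step whose failure rate
    # exceeds an illustrative 1% budget; the real threshold comes from the test plan.
    for step in load_steps:
        rate = 100.0 * step["failures"] / step["requests"]
        flag = "  <-- exceeds 1% budget" if rate > 1.0 else ""
        print(f"{step['viewers']} concurrent viewers: failure rate {rate:.2f}%{flag}")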

Scenario W — Quantify cost of wasted connection attempts

  • KVI: Cost Efficiency · KPI: Error / Failure Rate · Test Beds: DTE
  • Input: Connection attempt logs including failed attempts and associated resource consumption.
  • Output: Proportion of connection attempts that result in wasted resources; cost model for connection failures per method.
  • Expectation: Methods with lower error rates generate less wasted compute/bandwidth resource per viewer.
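
A minimal sketch of the wasted-attempt calculation; the per-attempt cost constant is an illustrative placeholder, not a measured or production cost figure:

    # Hypothetical connection-attempt counts for one delivery method.
    COST_PER_FAILED_ATTEMPT_EUR = 0.0004  # placeholder value, not a real cost
    attempts = {"succeeded": 9_600, "failed": 400}

    total = sum(attempts.values())
    wasted_share = 100.0 * attempts["failed"] / total
    wasted_cost = attempts["failed"] * COST_PER_FAILED_ATTEMPT_EUR
    print(f"{wasted_share:.1f}% of attempts wasted, ~EUR {wasted_cost:.2f} in this sample")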

Scenario A.2 — Track downtime as resilience indicator

  • KVI: Resilience · KPI: Uptime / Availability · Test Beds: DTE & DTH
  • Input: Uptime monitoring logs across all delivery endpoints; incident and failover event records.
  • Output: Availability percentage per method over the test period; correlation between architecture type and uptime.
  • Expectation: Distributed methods with redundancy achieve higher availability; centralised methods are more vulnerable to single points of failure.
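
A minimal sketch of the availability calculation from hypothetical outage records over a fixed observation window:

    from datetime import timedelta

    # Hypothetical outage records per delivery method over a fixed observation window.
    WINDOW = timedelta(days=7)
    outages = {
        "distributed": [timedelta(minutes=4), timedelta(minutes=11)],
        "centralised": [timedelta(hours=2), timedelta(minutes=35)],
    }

    for method, downtimes in outages.items():
        down = sum(downtimes, timedelta())
        availability = 100.0 * (WINDOW - down) / WINDOW
        print(f"{method}: availability {availability:.3f}% over {WINDOW.days} days")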

Scenario C.1 — Measure response times across different geographies

  • KVI: Quality · KPI: Latency by Region · Test Beds: DTE (Arctic Space)
  • Input: Request-response timing data collected across geographically distributed endpoints; network path information; delivery method identifiers.
  • Output: Latency measurements (average, p95, p99) per region and delivery method; geographic latency heat maps; correlation between delivery architecture and regional performance.
  • Expectation: Edge and CDN methods achieve lower latency in remote regions compared to centralised approaches; satellite-augmented delivery narrows the latency gap for underserved geographies.
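
A minimal sketch of the per-region latency aggregation, using hypothetical timing samples and illustrative region/method labels:

    from collections import defaultdict
    from statistics import mean, quantiles

    # Hypothetical request-response timings (ms); region and method labels are illustrative.
    samples = [
        ("arctic", "edge", 38), ("arctic", "edge", 45), ("arctic", "edge", 52),
        ("arctic", "centralised", 180), ("arctic", "centralised", 210), ("arctic", "centralised", 260),
    ]

    grouped = defaultdict(list)
    for region, method, ms in samples:
        grouped[(region, method)].append(ms)

    for (region, method), values in sorted(grouped.items()):
        p95 = quantiles(values, n=100)[94] if len(values) > 1 else values[0]
        print(f"{region}/{method}: avg {mean(values):.0f} ms, p95 {p95:.0f} ms")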

A.2 — Non-Performable Tests (Not Feasible in Phase 2)

The following scenarios cannot be executed in Phase 2 due to scale limitations, missing real-world user populations, or infrastructure constraints. They are documented for completeness and potential Phase 3 consideration.

Scenario | Title | KVI | KPI | Reason Not Performable
A | Compare watch times across regions/distribution methods to test accessibility | Reach, Scalability | Average Watch Time per User | Requires multi-region real-user populations at scale
C | Evaluate trade-off between longer engagement vs. energy consumption | Cost Efficiency | Average Watch Time / Energy per Viewing Hour | Real-user longitudinal engagement data unavailable
D | Measure whether different methods (CDN vs Edge) support longer sessions globally | Reach, Quality | Average Session Duration | Global deployment at scale out of scope for Phase 2
F | Compare energy per minute of viewing across methods | Cost Efficiency | Average Session Duration / Energy | Cross-method energy measurement requires live deployments at scale
H | Measure impact of delivery on full content enjoyment | Quality | Content Completion Rate | Requires real broadcast events with large viewer pools
L | Calculate energy/data wasted due to mid-stream exits | Cost Efficiency | Content Abandonment Points | Large-scale energy metering not available in test beds
N | Estimate resource cost of longer startup overheads | Cost Efficiency | Start-up Delay | Cost modelling requires full production cost data
Q | Quantify energy wasted during idle buffering | Cost Efficiency | Buffering Ratio / Rebuffering Events | Requires energy instrumentation at viewer device scale
S | Test link between bitrate volatility and energy inefficiency | Cost Efficiency | Bitrate Stability / QoE | Energy measurement at the bitrate-event level not available
U | Evaluate impact of failures on playback experience | Quality | Error / Failure Rate | Requires real viewer feedback at scale
X | Record app crashes under varying load/network conditions | Resilience | Error / Failure Rate | Requires real mobile app deployments at scale
Y | Test if crashes correlate with poor QoE | Quality | Error / Failure Rate | Requires cross-correlation of crash and QoE data at scale
Z | Quantify costs linked to re-initialising playback sessions | Cost Efficiency | Error / Failure Rate | Production cost data not available
A.1 | Verify geographic reach by uptime reporting across regions | Reach | Uptime / Availability | Multi-region deployment at real geographic scale not available