TVR E2E Scenarios — Reference
Sources: TVR v2.3 §2.9.6 (Tables 2 & 3) · TVR v2.3 Appendix 1 · GTL v3 (Service level tests sheet) · STM
This document covers two related sources:
- TVR §2.9.6 — E2E performance tests confirmed as carried out in Phase 2 (Tables 2 and 3), assigning specific test beds to each scenario.
- TVR Appendix 1 — Complete catalogue of E2E service scenarios with detailed definitions, grouped as performable (A.1) and non-performable (A.2) in Phase 2.
For the full list of 16 service scenarios (Table 1), see Service Scenarios.
TVR §2.9.6 — Phase 2 E2E Performance Tests
Table 2: Strand 3–6 E2E Performance Tests in Phase 2
E2E performance tests are also carried out across Strand 3 (DTH — SES, EBU), Strand 4, Strand 5 (Maritime), and Strand 6 (Arctic Space / DTE). These include:
- WP 3.1.5 (SES DTH): Startup delay comparison across CDN/Edge/satellite methods; buffering frequency under different delivery methods
- WP 3.x (EBU DTH): Equivalent startup and buffering tests for bidirectional satellite reception
- Strand 5 (Maritime): Buffering and resilience tests under maritime connectivity conditions
- Strand 6 / Arctic Space: Latency by region (C.1); scalability and error rate tests
Full strand-specific test plans are documented in the respective strand/WP technical notes.
Table 3: Strand 2 (WP 2.1) Tests Confirmed for Phase 2
These are the Strand 2 tests (defined in TVR §2.9.6, Table 3) that will be carried out in Phase 2, alongside tests specific to each test bed. All of them depend on the ability of far/near-edge components to collect statistics from the user player in order to evaluate QoE and QoS; a sketch of the kind of player statistics involved is given after the table.
| Scenario | Use Case | Test Bed(s) | Test Title | KPI |
|---|---|---|---|---|
| B | DTE | RAI | Assess if watch time drops with degraded quality | Average Watch Time per User |
| E | DTE | RAI | Test correlation between delivery method and playback continuity | Average Session Duration |
| G | DTE | RAI | Test if completions are higher in resilient methods (less buffering) | Content Completion Rate |
| I | DTE | Arctic Space | Quantify wasted resources if streams are abandoned early | Content Completion Rate |
| J | DTE | RAI | Identify abandonment spikes after buffering events | Content Abandonment Points |
| K | DTE | RAI | Test if users drop at predictable quality loss points | Content Abandonment Points |
| M | DTE, DTH | SES, EBU, Maritime, Arctic Space | Directly test startup times distribution across methods | Start-up Delay |
| O | DTE | Maritime | Stress-test buffering under poor networks vs resilient distribution | Buffering Ratio / Rebuffering Events |
| P | DTH, DTE | SES, EBU, Maritime, RAI | Compare buffering frequency under different methods | Buffering Ratio / Rebuffering Events |
| R | DTH, DTE | SES, EBU, Arctic Space | Track frequency of resolution switches under different delivery | Bitrate Stability / QoE |
| T | DTE | Maritime | Test retries under network drops or overloaded servers | Error Rate (Failed Attempts) |
| V | DTE | Maritime, Arctic Space | Measure load effect on failure rates under stress tests | Error Rate (Failed Attempts) |
| W | DTE | Maritime | Quantify cost of wasted connection attempts | Error Rate (Failed Attempts) |
| A.2 | DTH, DTE | SES, EBU, Maritime, RAI | Track downtime as resilience indicator | Availability (Uptime %) |
| C.1 | DTE | Arctic Space | Measure response times across different geographies | Latency by Region |
Note: A number of other Strand 2 scenarios assume large-scale deployment and are not feasible in Phase 2 — these are detailed in Appendix 1, Section A.2.
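The dependency on player-side statistics noted above can be made concrete with a small sketch. The record below shows the kind of per-session QoE/QoS fields a far/near-edge collector might receive from the user player; all field names are hypothetical and not defined by the TVR.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PlayerSessionStats:
    """Hypothetical per-session record reported by the player to a far/near-edge
    collector; field names are illustrative only, not mandated by the TVR."""
    session_id: str
    delivery_method: str               # e.g. "cdn", "edge", "satellite"
    startup_delay_s: float             # request initiation -> first frame
    watch_time_s: float                # total time spent watching
    buffering_events: int              # number of rebuffering events
    buffering_time_s: float            # total time spent rebuffering
    bitrate_switches: int              # up/down resolution changes
    completed: bool                    # stream played through to the end
    abandonment_position_s: Optional[float] = None   # where the user left, if abandoned
    failed_attempts: List[str] = field(default_factory=list)  # errors / retries
```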
A.1 — Performable Tests (Confirmed Feasible in Phase 2)
The following scenarios are confirmed as feasible in the Phase 2 test beds and must be listed as test cases in the respective strand/WP test documents.
| Scenario | Title | KVI | KPI | Test Beds |
|---|---|---|---|---|
| B | Assess if watch time drops with degraded quality | Quality | Average Watch Time per User | DTE |
| E | Test correlation between delivery method and playback continuity | Quality | Average Session Duration | DTE |
| G | Test if completions are higher in resilient methods (less buffering) | Resilience | Content Completion Rate | DTE |
| I | Quantify wasted resources if streams are abandoned early | Cost Efficiency | Content Completion Rate | DTE |
| J | Identify abandonment spikes after buffering events | Resilience | Content Abandonment Points | DTE |
| K | Test if users drop at predictable quality loss points | Quality | Content Abandonment Points | DTE |
| M | Directly test startup times distribution across methods | Quality | Start-up Delay | DTE & DTH |
| O | Stress-test buffering under poor networks vs resilient distribution | Resilience | Buffering Ratio / Rebuffering Events | DTE |
| P | Compare buffering frequency under different methods | Quality | Buffering Ratio / Rebuffering Events | DTE |
| R | Track frequency of resolution switches under different delivery | Quality | Bitrate Stability / QoE | DTE & DTH |
| T | Test retries under network drops or overloaded servers | Resilience | Error / Failure Rate | DTE |
| V | Measure load effect on failure rates under stress tests | Scalability | Error / Failure Rate | DTE |
| W | Quantify cost of wasted connection attempts | Cost Efficiency | Error / Failure Rate | DTE |
| A.2 | Track downtime as resilience indicator | Resilience | Uptime / Availability | DTE & DTH |
| C.1 | Measure response times across different geographies | Quality | Latency by Region | DTE |
Scenario Descriptions
Scenario B — Assess if watch time drops with degraded quality
- KVI: Quality · KPI: Average Watch Time per User · Test Beds: DTE
- Input: Playback session logs with quality indicators (bitrate, resolution, buffering events); controlled quality degradation conditions (reduced bitrate, simulated packet loss, induced buffering); user engagement data under optimal and degraded conditions.
- Output: Average watch time (min/user) segmented by distribution method and quality level; correlation analysis between quality degradation and watch time reduction.
- Expectation: Watch time decreases as quality deteriorates; adaptive methods (CDN with ABR) mitigate the reduction.
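A minimal analysis sketch for this scenario, in Python with pandas; the column names and figures are illustrative assumptions, not prescribed by the TVR.

```python
import pandas as pd

# Hypothetical session log: one row per session, quality_level 1.0 = optimal.
sessions = pd.DataFrame({
    "delivery_method": ["cdn", "cdn", "edge", "edge", "satellite", "satellite"],
    "quality_level":   [1.0,   0.6,   1.0,    0.6,    1.0,         0.6],
    "watch_time_min":  [42.0,  30.5,  45.2,   38.1,   39.0,        22.4],
})

# Average watch time segmented by distribution method and quality level.
print(sessions.groupby(["delivery_method", "quality_level"])["watch_time_min"].mean())

# Pearson correlation between quality level and watch time; a positive value
# means lower quality goes with shorter watch time, as the expectation states.
print("corr:", sessions["quality_level"].corr(sessions["watch_time_min"]))
```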
Scenario E — Test correlation between delivery method and playback continuity
- KVI: Quality · KPI: Average Session Duration · Test Beds: DTE
- Input: Playback session logs with timestamps (start, pauses, interruptions, end); distribution method identifiers; quality indicators (bitrate shifts, buffering events, dropped frames).
- Output: Average session duration mapped against percentage of uninterrupted playback time; correlation metrics between delivery method and continuity.
- Expectation: Methods with adaptive streaming and localised infrastructure show higher playback continuity and longer sessions.
Scenario G — Test if completions are higher in resilient methods (less buffering)
- KVI: Resilience · KPI: Content Completion Rate · Test Beds: DTE
- Input: Playback session logs showing start/end points per stream; event data on buffering frequency, duration, severity; network condition parameters.
- Output: Content completion rates (% of streams played to end) per distribution method; correlation between buffering events and early abandonment.
- Expectation: Resilient distribution methods (adaptive bitrate via CDN or edge caching) support higher completion rates by mitigating buffering.
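A small sketch of the completion-rate calculation and the buffering/abandonment link, assuming hypothetical per-stream records:

```python
from collections import defaultdict

# Hypothetical per-stream records: (delivery_method, buffering_events, completed).
streams = [
    ("cdn", 0, True), ("cdn", 3, False), ("edge", 1, True),
    ("edge", 0, True), ("satellite", 5, False), ("satellite", 2, True),
]

totals, completed = defaultdict(int), defaultdict(int)
for method, _buffering, done in streams:
    totals[method] += 1
    completed[method] += done

# Content completion rate: percentage of streams played to the end, per method.
for method in totals:
    print(f"{method}: {100 * completed[method] / totals[method]:.0f}% completed")

# Buffering/abandonment link: mean buffering events for abandoned vs completed streams.
abandoned = [b for _, b, done in streams if not done]
finished = [b for _, b, done in streams if done]
print("mean buffering events (abandoned):", sum(abandoned) / len(abandoned))
print("mean buffering events (completed):", sum(finished) / len(finished))
```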
Scenario I — Quantify wasted resources if streams are abandoned early
- KVI: Cost Efficiency · KPI: Content Completion Rate · Test Beds: DTE (Arctic Space)
- Input: Playback session logs showing stream start/end points; resource consumption data (bandwidth, compute) per stream; abandonment timestamps and causes.
- Output: Resource waste ratio (resources consumed by abandoned streams vs. completed streams); cost model for wasted delivery per distribution method.
- Expectation: Methods with higher completion rates waste fewer resources; early abandonment in centralised delivery is costlier due to longer data paths.
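The resource waste ratio described above reduces to a simple quotient; a sketch with hypothetical per-stream bandwidth figures and an illustrative unit cost:

```python
# Hypothetical per-stream records: (delivered_megabytes, completed).
streams = [(850, True), (320, False), (910, True), (150, False), (780, True)]

wasted = sum(mb for mb, done in streams if not done)   # consumed by abandoned streams
useful = sum(mb for mb, done in streams if done)       # consumed by completed streams

# Resource waste ratio: abandoned-stream consumption relative to completed streams.
print(f"waste ratio: {wasted / useful:.2f}")

# A per-method cost model would multiply the wasted volume by a unit delivery
# cost, which differs between centralised and edge/satellite paths.
cost_per_gb = 0.05  # illustrative figure only, EUR/GB
print(f"wasted delivery cost: {wasted / 1024 * cost_per_gb:.3f} EUR")
```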
Scenario J — Identify abandonment spikes after buffering events
- KVI: Resilience · KPI: Content Abandonment Points · Test Beds: DTE
- Input: Playback logs showing exact abandonment timestamps; event data on buffering (frequency, duration, severity) preceding abandonment.
- Output: Correlation between buffering events and abandonment points; identification of thresholds where buffering causes significant user drop-off.
- Expectation: Abandonment spikes cluster immediately after prolonged or repeated buffering; resilient methods show fewer abandonment spikes.
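One way to attribute abandonments to preceding buffering is a fixed look-back window; the window length and event times below are illustrative assumptions:

```python
# Hypothetical event times, in seconds from session start.
buffering_events = [120.0, 450.0, 452.5, 900.0]
abandonments = [125.0, 455.0, 1800.0]

WINDOW_S = 30.0  # attribute an abandonment to buffering if it follows within 30 s

def follows_buffering(t_abandon, buffer_times, window=WINDOW_S):
    """True if the abandonment occurs within `window` seconds after any buffering event."""
    return any(0 <= t_abandon - t_buf <= window for t_buf in buffer_times)

attributed = sum(follows_buffering(t, buffering_events) for t in abandonments)
print(f"{attributed}/{len(abandonments)} abandonments within {WINDOW_S:.0f} s of buffering")
```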
Scenario K — Test if users drop at predictable quality loss points
- KVI: Quality · KPI: Content Abandonment Points · Test Beds: DTE
- Input: Playback logs recording quality degradation events (bitrate reductions, resolution drops); session abandonment logs aligned with those events.
- Output: Patterns showing correlation between quality loss and session drop-offs; comparative data across distribution methods.
- Expectation: Predictable abandonment points emerge at moments of severe quality loss; methods with adaptive bitrate may prevent sharp drops.
Scenario M — Directly test startup times distribution across methods
- KVI: Quality · KPI: Start-up Delay · Test Beds: DTE & DTH
- Input: Time measurements from user request initiation to video playback start across CDN, Edge, etc.; controlled tests under consistent network conditions.
- Output: Startup delay (seconds) per method (average, min, max, p95, p99.9), with histograms; comparative dataset of responsiveness across methods.
- Expectation: Edge or adaptive methods achieve lower average startup delays; centralised methods may show longer delays depending on server distance.
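A sketch of the per-method startup-delay statistics using pandas; the sample values are illustrative, and p99.9 would need far more measurements than shown here to be meaningful:

```python
import pandas as pd

# Hypothetical startup-delay measurements (seconds), per delivery method.
measurements = pd.DataFrame({
    "method": ["cdn"] * 4 + ["edge"] * 4 + ["satellite"] * 4,
    "startup_delay_s": [1.2, 1.4, 1.1, 2.0, 0.8, 0.9, 1.0, 1.3, 2.5, 3.1, 2.8, 4.0],
})

def p95(s): return s.quantile(0.95)
def p999(s): return s.quantile(0.999)

# Average, min, max and high percentiles per method; histograms would be
# plotted from the same grouped series.
print(measurements.groupby("method")["startup_delay_s"].agg(["mean", "min", "max", p95, p999]))
```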
Scenario O — Stress-test buffering under poor networks vs resilient distribution
- KVI: Resilience · KPI: Buffering Ratio / Rebuffering Events · Test Beds: DTE
- Input: Playback tests under controlled poor network conditions (reduced bandwidth, induced packet loss); session logs capturing buffering time and total playback time.
- Output: Buffering ratios across methods under identical degraded conditions; comparative resilience scores.
- Expectation: Resilient methods maintain lower buffering ratios under stress; non-adaptive methods degrade more sharply under poor conditions.
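Buffering ratio here reduces to rebuffering time over total playback time; a minimal sketch with hypothetical sessions measured under the same degraded conditions:

```python
# Hypothetical sessions: (delivery_method, buffering_time_s, total_playback_s).
sessions = [
    ("cdn-abr", 4.0, 600.0), ("cdn-abr", 6.5, 540.0),
    ("direct", 38.0, 600.0), ("direct", 52.0, 480.0),
]

totals = {}
for method, buf, total in sessions:
    b, t = totals.get(method, (0.0, 0.0))
    totals[method] = (b + buf, t + total)

# Buffering ratio = total rebuffering time / total playback time, per method.
for method, (buf, total) in totals.items():
    print(f"{method}: buffering ratio {buf / total:.1%}")
```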
Scenario P — Compare buffering frequency under different methods
- KVI: Quality · KPI: Buffering Ratio / Rebuffering Events · Test Beds: DTE
- Input: Playback logs recording frequency of buffering events per stream; method identifiers (CDN, Edge, etc.).
- Output: Mean, distribution, and peak buffering frequency per method; comparative dataset highlighting which methods deliver smoother playback.
Scenario R — Track frequency of resolution switches under different delivery
- KVI: Quality · KPI: Bitrate Stability / QoE · Test Beds: DTE & DTH
- Input: Playback event logs recording bitrate/resolution switches; timestamps and network conditions during switches.
- Output: Resolution switch frequency per method; distribution of switch magnitude (up vs down); QoE impact correlation.
- Expectation: More stable delivery methods show fewer downward resolution switches under variable conditions.
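A sketch that counts switch frequency and separates downward from upward switches; the event format is an assumption:

```python
from collections import Counter

# Hypothetical switch events: (delivery_method, from_height_px, to_height_px).
switches = [
    ("cdn", 1080, 720), ("cdn", 720, 1080), ("cdn", 1080, 480),
    ("edge", 1080, 720), ("edge", 720, 1080),
]

freq = Counter(method for method, _src, _dst in switches)
downward = Counter(m for m, src, dst in switches if dst < src)

for method in freq:
    print(f"{method}: {freq[method]} switches, {downward[method]} downward")

# A QoE impact score could weight downward switches more heavily than upward
# ones, e.g. by the number of resolution steps lost per switch.
```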
Scenario T — Test retries under network drops or overloaded servers
- KVI: Resilience · KPI: Error / Failure Rate · Test Beds: DTE
- Input: Request logs under simulated network drops and server overload conditions; method identifiers.
- Output: Retry rates and failure rates per method; time-to-recovery measurements.
- Expectation: Resilient methods with redundancy (CDN, edge failover) show lower failure rates and faster recovery.
Scenario V — Measure load effect on failure rates under stress tests
- KVI: Scalability · KPI: Error / Failure Rate · Test Beds: DTE
- Input: Controlled load tests (scaling concurrent viewers); error and failure event logs.
- Output: Failure rate as a function of load per distribution method; identification of capacity thresholds.
- Expectation: Distributed methods (CDN, edge) maintain lower failure rates as load increases compared to centralised approaches.
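A sketch of failure rate as a function of load and a rough capacity threshold; the 1% threshold and the figures are illustrative assumptions:

```python
# Hypothetical stress-test results: (concurrent_viewers, requests, failures).
results = [
    (100, 10_000, 12), (500, 50_000, 90),
    (1_000, 100_000, 450), (2_000, 200_000, 6_200),
]

FAILURE_THRESHOLD = 0.01  # failure rate treated as the capacity limit (assumption)

capacity = None
for viewers, requests, failures in results:
    rate = failures / requests
    print(f"{viewers} viewers: failure rate {rate:.2%}")
    if capacity is None and rate > FAILURE_THRESHOLD:
        capacity = viewers  # first load level that breaches the threshold

print("estimated capacity threshold:", capacity, "concurrent viewers")
```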
Scenario W — Quantify cost of wasted connection attempts
- KVI: Cost Efficiency · KPI: Error / Failure Rate · Test Beds: DTE
- Input: Connection attempt logs including failed attempts and associated resource consumption.
- Output: Proportion of connection attempts that result in wasted resources; cost model for connection failures per method.
- Expectation: Methods with lower error rates generate less wasted compute/bandwidth resource per viewer.
Scenario A.2 — Track downtime as resilience indicator
- KVI: Resilience · KPI: Uptime / Availability · Test Beds: DTE & DTH
- Input: Uptime monitoring logs across all delivery endpoints; incident and failover event records.
- Output: Availability percentage per method over the test period; correlation between architecture type and uptime.
- Expectation: Distributed methods with redundancy achieve higher availability; centralised methods are more vulnerable to single points of failure.
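Availability over the test period follows directly from the incident records; a minimal sketch assuming per-method downtime totals over an illustrative 30-day window:

```python
TEST_PERIOD_H = 24 * 30  # 30-day observation window (assumption)

# Hypothetical total downtime (hours) per delivery method over the period.
downtime_h = {"cdn": 0.5, "edge": 1.2, "centralised": 6.0}

# Availability (uptime %) = (period - downtime) / period.
for method, down in downtime_h.items():
    print(f"{method}: {100 * (TEST_PERIOD_H - down) / TEST_PERIOD_H:.3f}% availability")
```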
Scenario C.1 — Measure response times across different geographies
- KVI: Quality · KPI: Latency by Region · Test Beds: DTE (Arctic Space)
- Input: Request-response timing data collected across geographically distributed endpoints; network path information; delivery method identifiers.
- Output: Latency measurements (average, p95, p99) per region and delivery method; geographic latency heat maps; correlation between delivery architecture and regional performance.
- Expectation: Edge and CDN methods achieve lower latency in remote regions compared to centralised approaches; satellite-augmented delivery narrows the latency gap for underserved geographies.
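A sketch of the per-region, per-method latency summary; regions, methods, and timings are illustrative assumptions:

```python
import pandas as pd

# Hypothetical request-response timings in milliseconds.
samples = pd.DataFrame({
    "region":     ["nordics", "nordics", "arctic", "arctic", "arctic", "central_eu"],
    "method":     ["cdn", "satellite", "cdn", "satellite", "satellite", "cdn"],
    "latency_ms": [35, 620, 180, 640, 655, 28],
})

def p95(s): return s.quantile(0.95)
def p99(s): return s.quantile(0.99)

# Average, p95 and p99 latency per region and delivery method; a heat map
# would be plotted from the same grouped table.
print(samples.groupby(["region", "method"])["latency_ms"].agg(["mean", p95, p99]))
```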
A.2 — Non-Performable Tests (Not Feasible in Phase 2)
The following scenarios cannot be executed in Phase 2 due to scale limitations, missing real-world user populations, or infrastructure constraints. They are documented for completeness and potential Phase 3 consideration.
| Scenario | Title | KVI | KPI | Reason Not Performable |
|---|---|---|---|---|
| A | Compare watch times across regions/distribution methods to test accessibility | Reach, Scalability | Average Watch Time per User | Requires multi-region real-user populations at scale |
| C | Evaluate trade-off between longer engagement vs. energy consumption | Cost Efficiency | Average Watch Time / Energy per Viewing Hour | Real-user longitudinal engagement data unavailable |
| D | Measure whether different methods (CDN vs Edge) support longer sessions globally | Reach, Quality | Average Session Duration | Global deployment at scale out of scope for Phase 2 |
| F | Compare energy per minute of viewing across methods | Cost Efficiency | Average Session Duration / Energy | Cross-method energy measurement requires live deployments at scale |
| H | Measure impact of delivery on full content enjoyment | Quality | Content Completion Rate | Requires real broadcast events with large viewer pools |
| L | Calculate energy/data wasted due to mid-stream exits | Cost Efficiency | Content Abandonment Points | Large-scale energy metering not available in test beds |
| N | Estimate resource cost of longer startup overheads | Cost Efficiency | Start-up Delay | Cost modelling requires full production cost data |
| Q | Quantify energy wasted during idle buffering | Cost Efficiency | Buffering Ratio / Rebuffering Events | Requires energy instrumentation at viewer device scale |
| S | Test link between bitrate volatility and energy inefficiency | Cost Efficiency | Bitrate Stability / QoE | Energy measurement at the bitrate-event level not available |
| U | Evaluate impact of failures on playback experience | Quality | Error / Failure Rate | Requires real viewer feedback at scale |
| X | Record app crashes under varying load/network conditions | Resilience | Error / Failure Rate | Requires real mobile app deployments at scale |
| Y | Test if crashes correlate with poor QoE | Quality | Error / Failure Rate | Requires cross-correlation of crash and QoE data at scale |
| Z | Costs linked to re-initialising playback sessions | Cost Efficiency | Error / Failure Rate | Production cost data not available |
| A.1 | Verify geographic reach by uptime reporting across regions | Reach | Uptime / Availability | Multi-region deployment at real geographic scale not available |