Part Two: Our Live Streaming Platform
This is the second post in a series about the progress and achievements of our video delivery platform. It focuses on the problems we solved on our live video streaming platform. Our first post was Part One: Our On-Demand Video Platform.
When we first started broadcasting live streaming events at The New York Times, Flash was still a thing. We used proprietary protocols and components from the signal reception to the delivery. At the end of 2015, we decided to remove Flash components from our video player and switch to HTTP Live Streaming (HLS) as our main protocol.
During the 2016 presidential election cycle, the newsroom expressed interest in doing more live events, including the coverage of live debates on the homepage and live blogs. With a mindset of making live events easier and more affordable for the company in the long run, the video technology department decided to invest more in the infrastructure and bring the signal reception and streaming packaging in-house.
We upgraded our on-premises video recording and streaming appliances to a multichannel GPU-accelerated server. With this physical all-in-one solution in place we had more flexibility to set up and broadcast live events, including streaming and serving our content to partners such as YouTube and Facebook.
Our Live Infrastructure: How We Receive and Deliver our Streams
Most of our live content is produced by third parties on location and sent to us at broadcast quality. For example, if we want to stream a press conference from the White House, we make use of our subscription to the network television pool feed. Depending on the event, the feed may be a single camera angle or a more polished “line cut” switching between multiple angles.
The feed is delivered via a point of presence local to the event, which we can route to New York City. This can be expensive, since it is a dedicated circuit transmitting uncompressed HD-SDI signals, but it gives us the highest quality and most flexibility. This way, we avoid the delay and image degradation from extra compression/decompression passes and the potential IP packet loss of transmission over the public internet. The signal is then delivered to our main building via one of our dedicated video fiber lines.
Once in our building, the signal is sent to our live encoding server. Our current solution can simultaneously encode up to eight HD-SDI feeds and distribute to any format needed.
For the live streaming events that we broadcast on nytimes.com, we use our live encoding server to generate six different HTTP Live Streaming (HLS) outputs from the incoming feed. The outputs are composed of 3-second H.264/MPEG-TS segments in a range of bitrates and resolutions. This allows our video player to adapt and select the best output based on a user’s current connection and device capabilities. We also set the manifest (M3U8) file of each output to be created in appending mode, where the manifest aggregates all video segments from the beginning to the end of the transmission. This is preferable to rolling mode, where older segments drop out of the manifest as new ones arrive, since it lets us switch the video asset from live to video on-demand (VoD) extremely quickly once the event is over.
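To make the appending-mode idea concrete, here is a minimal sketch of a media playlist that only ever grows. It assumes that appending mode corresponds to the HLS `EVENT` playlist type, and the class name, segment file names, and fixed 3-second target duration are hypothetical choices for illustration, not our encoder's actual configuration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    uri: str
    duration: float  # seconds


@dataclass
class AppendingPlaylist:
    """Media playlist that keeps every segment from the start of the event."""

    target_duration: int = 3  # assumed: matches the 3-second segments
    segments: List[Segment] = field(default_factory=list)
    ended: bool = False

    def append(self, segment: Segment) -> None:
        if self.ended:
            raise ValueError("cannot append to an ended playlist")
        self.segments.append(segment)

    def end(self) -> None:
        # Nothing is rewritten or removed; the playlist is only closed.
        self.ended = True

    def render(self) -> str:
        lines = [
            "#EXTM3U",
            "#EXT-X-VERSION:3",
            f"#EXT-X-TARGETDURATION:{self.target_duration}",
            "#EXT-X-MEDIA-SEQUENCE:0",
            "#EXT-X-PLAYLIST-TYPE:EVENT",
        ]
        for seg in self.segments:
            lines.append(f"#EXTINF:{seg.duration:.3f},")
            lines.append(seg.uri)
        if self.ended:
            # The end-of-list tag tells players that no more segments will arrive.
            lines.append("#EXT-X-ENDLIST")
        return "\n".join(lines) + "\n"
```

In this sketch, ending the event only adds one line to the manifest, which is why the switch from live to VoD can be so fast: every segment is already listed and already stored.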
From the same input used on the live streaming events, we also generate an Apple ProRes QuickTime output on our shared Storage Area Network (SAN), which our video editors use for chase editing during the live event or for cuts after.
The Problems We Faced Delivering and Managing Live Events
As shown in the picture below, each MPEG-TS segment generated by our live encoding server was sent individually over the open internet to our CDN. Each segment was sent using HTTP, and our NetStorage/CDN was responsible for hosting, caching and serving the assets.
The transmissions of the election debates and related events went well, but the number of requests to the CDN proved to be a problem: some players were getting stuck buffering as the M3U8 manifests were taking too long to be updated with new segments. We investigated the cause, digging into the live encoding server logs, and began to notice a pattern of errors when pushing segments over the open internet via HTTP PUT. Automatic retries did not fix the problem: because the encoder shared the same office internet connection used for operations and development, the live feed was competing for bandwidth with other network activities. We needed a more robust approach if we wanted to continue streaming live events.
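For illustration, a retry loop around an HTTP PUT might look like the sketch below. It is not the encoder's actual retry logic, which we did not write; the function names, attempt count, and delays are hypothetical.

```python
import time
import urllib.request
from typing import Callable


def http_put(url: str, body: bytes, timeout: float = 5.0) -> int:
    """Upload one segment with HTTP PUT and return the response status."""
    req = urllib.request.Request(url, data=body, method="PUT")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def push_with_retries(
    send: Callable[[], object],
    attempts: int = 4,
    base_delay: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds; return the number of attempts used.

    Network failures raised by urllib (URLError, timeouts, resets) are all
    subclasses of OSError, so that is the only exception retried here.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            send()
            return attempt
        except OSError:
            if attempt == attempts:
                raise
            # Exponential backoff: 0.25s, 0.5s, 1.0s, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A call such as `push_with_retries(lambda: http_put(url, data))` shows the limitation: with 3-second segments, the whole retry budget has to fit inside the time it takes to produce the next segment, so retries cannot compensate for a link that is congested for longer than that.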
Another major pain point was the need to have technical staff in-house during live events setting up the feeds and making sure we were live streaming to the right endpoint and saving the Apple ProRes QuickTime version on the right path. The process was mostly manual and very stressful. The truth was, we needed a simpler and more resilient way to create, start, and monitor live events. It had to be easy enough that someone with very little technical knowledge could intuitively manage our events.
How We Solved Our Delivery and Managing Challenges
The delivery challenge we faced was due to the very nature of the HTTP protocol over the open internet. After discussing possible solutions with our networking team, we realized our best bet was to avoid the internet altogether for delivery to our CDN. Instead, we decided to leverage our Direct Connect with Amazon Web Services, solving our bandwidth and latency issues.
Direct Connect is essentially a dedicated network connection that allows for consistent performance between our building and Amazon Web Services. Unfortunately, we didn’t have it enabled for S3 (where we wanted to store our video segments), but we took an approach of proxying via EC2 (which was enabled for Direct Connect). Since our connection to our EC2 proxy was dedicated, and the same for EC2 to S3, we eliminated the chances of packet loss over HTTP. As a bonus, we open-sourced the service that we created and called it the s3-upload-proxy.
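The proxying idea can be sketched in a few lines. This is not the s3-upload-proxy code; it is a simplified stand-in with hypothetical names, in which an HTTP server accepts PUT requests and hands each body to a storage callback. An in-memory dictionary plays the role of S3 here; a real deployment would make the callback write to S3 instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable

# Hypothetical storage callback: receives the object key and the bytes.
Store = Callable[[str, bytes], None]


def make_handler(store: Store):
    class UploadHandler(BaseHTTPRequestHandler):
        def do_PUT(self) -> None:
            length = int(self.headers.get("Content-Length", "0"))
            body = self.rfile.read(length)
            key = self.path.lstrip("/")
            try:
                store(key, body)
                status = 200
            except Exception:
                # Report an upstream failure so the sender can retry.
                status = 502
            self.send_response(status)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, format, *args) -> None:
            pass  # keep the sketch quiet

    return UploadHandler


def start_proxy(store: Store, host: str = "127.0.0.1", port: int = 0):
    """Start the proxy in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), make_handler(store))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def put_segment(host: str, port: int, key: str, body: bytes) -> int:
    """Client side: what the encoder's HTTP PUT output amounts to."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("PUT", "/" + key, body=body)
        return conn.getresponse().status
    finally:
        conn.close()
```

The point of the design is that the encoder keeps speaking plain HTTP PUT, so nothing changes on the encoding side except the destination address.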
After the segments were available on S3, we configured a caching layer for spreading the content around the world.
Our Live Streaming Manager: A User-Friendly Tool to Manage Live Events
Our second challenge was the need for technical staff to be on-site and available during each of our live events. In order to avoid this, and reduce the number of error-prone manual steps involved, we decided to implement a Live Streaming Manager. It consists of a web application that is responsible for talking to our live encoding server and coordinating with other components through REST API calls. Although the entire application hasn’t been open-sourced, we did make available our elemental-live-client, which allows for easy interactions with the encoding server we use.
The actual design of the application was meant to be user-friendly and simple. It consists of three different screens that allow a user to create, operate, and end their live streaming event. Below is a short breakdown of each of these screens:
- Event Setup Screen: It allows anyone to preview the inputs, decide whether to send the stream to partners, and schedule or start the event.
- Operation Screen: It allows the person in charge of the event to preview how the event is showing up for the audience on our internal player, tweak the volume of the input, and stop the event.
- Trim Screen: It shows up right after finishing the event. This allows us to remove pre-roll and post-roll that we don’t want to keep for the on-demand version. The interface makes it possible for editors to mark the start and end of the event and republish the video asset quickly without re-encoding the entire program. This way, users that view the video after the event has completed are able to watch from the beginning without having to skip past uninteresting setup and delays.
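One way to trim without re-encoding is to republish a playlist that lists only the segments overlapping the marked range. The sketch below assumes trimming at segment granularity, which is our simplification for illustration; the function names and segment file names are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[str, float]  # (uri, duration in seconds)


def trim_segments(segments: List[Segment], start: float, end: float) -> List[Segment]:
    """Keep every segment that overlaps the [start, end) range, in seconds."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    kept: List[Segment] = []
    position = 0.0  # running start time of the current segment
    for uri, duration in segments:
        seg_start, seg_end = position, position + duration
        if seg_end > start and seg_start < end:
            kept.append((uri, duration))
        position = seg_end
    return kept


def render_vod(segments: List[Segment], target_duration: int = 3) -> str:
    """Render the kept segments as a finished (VoD) media playlist."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

Because the segment files themselves are untouched, republishing is only a matter of writing a new, shorter manifest. The cost of this simplification is precision: the cut points snap to segment boundaries, so with 3-second segments up to 3 seconds of extra video can remain at each end.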
This project has been a huge success, and so far we’ve successfully managed over 100 live events using it.
Future Improvements to Our Live Streaming Systems
We would like to support DVR actions in our video player during live events, much as we do for VoD content. Since we are already generating the HLS level playlists in appending mode, as mentioned earlier, we are just one step away. All media playback engines we use on our players are able to seek, so we essentially just need to expose the functionality for the users.
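As a rough illustration of why an appending playlist makes seeking straightforward, the sketch below maps a seek position to the segment that contains it, using only the segment durations listed in the manifest. The function name is hypothetical, and real playback engines handle this internally.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Tuple


def locate(durations: List[float], position: float) -> Tuple[int, float]:
    """Return (segment index, offset inside that segment) for a seek position."""
    ends = list(accumulate(durations))  # end time of each segment
    if not ends or position < 0 or position >= ends[-1]:
        raise ValueError("position is outside the seekable range")
    index = bisect_right(ends, position)
    seg_start = ends[index - 1] if index > 0 else 0.0
    return index, position - seg_start
```

During a live event the seekable range in this sketch simply runs from zero to the sum of all durations so far, and it grows by one segment every time the manifest is updated.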
For our Live Streaming Manager, our team requested additional features, such as integrations with more partners and the possibility to clip straight to social media networks.
Coming Up Next
For our last post in the Improving Our Video Experience series, we will explain how we added Closed Captioning support for our videos, including on our web and native players. We’ll also cover how we added accessibility support for our player’s controls.