Understanding the HTML Audio Element Buffered Regions

The HTML audio element exposes several properties and methods that are useful for building a reactive web music player. In addition to the usual play/pause controls, the <audio> element has currentTime, duration, and buffered properties, which can be read from JavaScript to track the state of the playing media in real time.

currentTime and duration

The currentTime and duration attributes are fairly straightforward. currentTime returns the playback position, measured from the start of the audio file, in seconds, as a float. Similarly, duration is a float giving the total media length in seconds; it is NaN until the browser has loaded the media metadata. Both can be converted to a more natural time format with a bit of formatting code.

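export function formatDuration(floatSeconds) {
  // duration is NaN before metadata loads and Infinity for unbounded streams;
  // show a placeholder instead of "NaN:NaN".
  if (!Number.isFinite(floatSeconds) || floatSeconds < 0) {
    return '--:--';
  }
  const timeInSeconds = Math.floor(floatSeconds);
  const minutes = Math.floor(timeInSeconds / 60);
  const seconds = timeInSeconds % 60;
  // Pad to two digits so 65 seconds reads 1:05 rather than 1:5.
  const formattedSeconds = String(seconds).padStart(2, '0');
  return `${minutes}:${formattedSeconds}`;
}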

buffered regions

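The HTML audio element internally keeps multiple buffered regions, which live independently. They are exposed through the buffered property, a TimeRanges object whose length gives the number of regions. Each region's start and end times, in seconds, can be read with the following JavaScript calls.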

audioelement.buffered.start(<buffered_region_id>)
audioelement.buffered.end(<buffered_region_id>)

Initially, the browser creates a first buffered region (numbered 0) which is enough to play the first few seconds. As the playback approaches the end of that region, a new part of the media file is automatically requested.

If the user seeks to an unbuffered part of the media, the browser creates a new buffered region starting at that point. After a forward seek, for example, the region around the playhead is region #1, so that is the one our application now has to show in gray. When the user then rewinds to a time inside the first buffered region, region 0 is reused. Note that the browser keeps these ranges sorted by start time and merges them once they touch, so a region's index can change at any moment and hardcoded indices will become an issue.

Here is a simple JavaScript/Vue.js snippet which finds the buffered region containing the current playback position and updates the “buffered but unplayed” portion of the seek bar in our application.

updateSeekBar() {
  const { currentTime, duration, buffered } = this.$refs.audiotag;

  // Stays at 0 when nothing around the playhead is buffered yet
  // (calling buffered.end(0) on an empty TimeRanges would throw).
  let bufferedAhead = 0;
  // Range indices shift as the browser merges regions, so search every time.
  for (let range = 0; range < buffered.length; range++) {
    const rangeStartTime = buffered.start(range);
    const rangeEndTime = buffered.end(range);
    if (currentTime >= rangeStartTime && currentTime < rangeEndTime) {
      bufferedAhead = rangeEndTime - currentTime;
      break;
    }
  }
  // duration is NaN until metadata has loaded; avoid pushing NaN to the UI.
  this.bufferedPercentage = duration > 0 ? (bufferedAhead / duration) * 100 : 0;
},

Handling Partial Content Requests (HTTP 206) for audio streaming

Web content streaming requires sending partial files between a server and a client, most often a web browser. In practice, this is required for seeking within audio played through an HTML audio tag. While a simple HTTP 200 response containing the whole file is enough to play it from start to finish, it typically does not let the browser jump to a part of the file it has not downloaded yet, which is to be expected of a respectable music streaming app.

Request

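First, the browser requests part of the file by including the Range header, which gives the requested positions in bytes. The first word is the range unit, bytes in our case. Both offsets are inclusive, and the end offset may be left out (bytes=<start_offset>-) to ask for everything from the start offset onward.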

Range: bytes=<start_offset>-<end_offset>
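
To make the header format concrete, here is a minimal Python sketch of how a server might turn a Range value into a pair of inclusive byte offsets. The function name parse_range_header and its return convention (None for anything unusable) are illustrative assumptions, not part of any framework. It only handles a single byte range, including the open-ended form and the "last N bytes" suffix form.

```python
import re

# Matches a single range such as "bytes=0-499", "bytes=500-" or "bytes=-200".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range_header(value, total_size):
    """Return inclusive (start, end) offsets, or None if the range is unusable.

    Hypothetical helper for illustration; multi-range requests are rejected.
    """
    match = _RANGE_RE.match(value.strip())
    if match is None or total_size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the last N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        return max(total_size - length, 0), total_size - 1
    start = int(first)
    # An omitted end offset means "up to the last byte of the file".
    end = int(last) if last else total_size - 1
    if start >= total_size or start > end:
        return None
    # Clamp an end offset that runs past the end of the file.
    return start, min(end, total_size - 1)
```

A server would typically answer a None result with either the full file or a 416 (Range Not Satisfiable) status, depending on why the range was rejected.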

Response

Next, the server's reply should state that:

  1. it accepts file ranges, using the Accept-Ranges: bytes header;
  2. it is sending an incomplete file, using the 206 HTTP status code, containing a specific number of bytes with the Content-Length header;
  3. and that it is sending the requested byte range, in the Content-Range header.

The Content-Range header should contain the index of the first byte, the index of the last byte (both inclusive, counting from 0), and the total file size.

Content-Range: bytes <start>-<end>/<total>

Note that the last fragment’s “end” offset should be exactly one less than the total file size.
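
The three response headers described above can be derived from the inclusive offsets in a few lines. The following Python sketch shows one way to do it; partial_content_headers is a hypothetical helper name, and it assumes the offsets have already been validated against the file size.

```python
def partial_content_headers(start, end, total_size):
    """Headers for a 206 response carrying bytes start..end (both inclusive).

    Hypothetical helper for illustration only.
    """
    if not 0 <= start <= end < total_size:
        raise ValueError("range does not fit inside the file")
    return {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{total_size}",
        # Both offsets are inclusive, hence the + 1.
        "Content-Length": str(end - start + 1),
    }


# Last fragment of a 1000-byte file: the end offset is total_size - 1.
last_fragment = partial_content_headers(800, 999, 1000)
```

Here last_fragment["Content-Range"] is "bytes 800-999/1000" and the body is 200 bytes long, which matches the rule that the final "end" offset is one less than the total size.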

Example

Below is a quick and dirty example of sending a partial file in Python. The request and Response objects come from whichever web framework is in use.

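import os

CHUNK_SIZE = 2000000  # bytes served when the client leaves the end offset open

# Assumes request.headers behaves like a dict; .get avoids a KeyError
# when the client sent no Range header at all.
http_range = request.headers.get('HTTP_RANGE')
if http_range:
    total_size = os.path.getsize(filename)
    # Expected form: "bytes=<start>-<end>", where <end> may be empty.
    # Suffix ("bytes=-500") and multi-range requests are not handled here.
    first, last = http_range.split("=")[1].split("-")
    start = int(first)
    end = int(last) if last else start + CHUNK_SIZE - 1

    # Offsets are inclusive, so the last valid index is total_size - 1.
    if end >= total_size:
        end = total_size - 1

    with open(filename, 'rb') as f:
        f.seek(start)
        body = f.read(end - start + 1)  # + 1 because end is inclusive
    return Response(206, {"Content-Type": "application/octet-stream",
                          "Accept-Ranges": "bytes",
                          "Content-Length": str(len(body)),
                          "Content-Range": "bytes {}-{}/{}".format(start, end, total_size)},
                    body)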