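Live Video Streaming with Python on a MacBook

This post records a proof of concept I built while learning about live video streaming. It uses the MacBook's built-in camera as the video source and streams it live over the HTTP Live Streaming (HLS) protocol. Although the stream is "live", HLS inherently introduces latency. The experiment was only a partial success: at first the video could be watched only in desktop Safari, but after a correction from a more experienced reader it now works in the mobile version too.

What is HTTP Live Streaming (HLS)?

HLS is a media streaming protocol defined by Apple and built on top of HTTP. It supports both video-on-demand (VOD) and live modes. The other main HTTP-based streaming protocols are: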

Adobe HTTP Dynamic Streaming (HDS)
Microsoft Smooth Streaming (MSS)
MPEG-DASH

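I chose HLS for this post simply because it is the one I know best.

Test environment

Host: MacBook Pro
OS: OS X 10.10
CPU: x86-64
Python: 2.7.10

Workflow

The program has three parts: the publisher, the streamer, and the player. The player is the Safari browser that ships with OS X, so only the publisher and the streamer need to be written.

The basic flow is as follows:

The publisher captures frames from the camera in real time, encodes them into the codec and container format that streaming requires (MPEG-2 TS), and then notifies the streamer that a new segment is available;
The streamer dynamically generates the stream metadata file (the playlist) from the segments it has been told about;
The player uses the metadata file to decide which segments to play.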

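Streamer

In a production system the streamer role would be handled by dedicated server software such as Nginx. Since this is only a proof of concept, I use Gevent¹ + Bottle² as the streamer's technology stack.

To do its job in HLS streaming, the streamer has to serve two kinds of data to the player:

Stream metadata

: HLS metadata is written in the m3u8 format and served with the content type application/x-mpegURL.

Media segment data

: HLS segments must be stored in MPEG-2 TS format, one file per segment, usually with the .ts extension, and served with the content type video/mp2t.

The stream metadata is explained briefly below. First, a sample of real output: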

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:3
#EXT-X-MEDIA-SEQUENCE:28
#EXTINF:2.250000,
http://127.0.0.1:8080/live/out028.ts
#EXTINF:1.500000,
http://127.0.0.1:8080/live/out029.ts

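where:

#EXTM3U

: Tells the player that the metadata is written in the extended M3U format.

#EXT-X-VERSION:3

: Specifies the version of the metadata format. A player that does not support this version cannot interpret the file.

#EXT-X-TARGETDURATION:3

: Specifies the maximum duration, in seconds, of any segment in the playlist. Each segment in this post is close to 2 seconds long, so 3 is used here.

#EXT-X-MEDIA-SEQUENCE:28

: Specifies the sequence number, within the whole stream, of the first segment listed in the metadata. It defaults to 0 when the tag is absent. Because this is a live stream, segments keep being produced. Keeping every past segment in the playlist would waste bandwidth and processing time and would eventually bring the program down, so only the most recent segments are kept, using a rolling window.

#EXTINF:1.500000,

: Every segment must be preceded by this tag, where 1.500000 is the duration of the segment in seconds. The line right after the tag must be the URL of the segment file, which tells the player how and where to fetch the segment data.

#EXT-X-ENDLIST

: For live streaming, the tags above are all you need to know. For video on demand, this tag is required so that the player knows the metadata is complete. In other words, as long as this tag is absent, the player assumes that more segments may still arrive, as in a live stream.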
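
To make the tags above concrete, here is a minimal sketch of how a player might read such a playlist. The function names `parse_media_playlist` and `new_segments` are hypothetical, only the tags described here are handled, and the sketch assumes that #EXT-X-MEDIA-SEQUENCE appears before the first segment. A real player does considerably more.

```python
def parse_media_playlist(text):
    """Return (first_sequence, segments, ended) for a simple media playlist.

    segments is a list of (sequence, duration_in_seconds, uri) tuples.
    """
    first_sequence = 0  # default when #EXT-X-MEDIA-SEQUENCE is absent
    ended = False
    segments = []
    pending_duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith('#EXT-X-MEDIA-SEQUENCE:'):
            first_sequence = int(line.split(':', 1)[1])
        elif line.startswith('#EXTINF:'):
            # Format: "#EXTINF:<duration>,[<title>]"
            value = line.split(':', 1)[1]
            pending_duration = float(value.split(',', 1)[0])
        elif line == '#EXT-X-ENDLIST':
            ended = True
        elif not line.startswith('#') and pending_duration is not None:
            # A URI line belongs to the #EXTINF tag that precedes it.
            sequence = first_sequence + len(segments)
            segments.append((sequence, pending_duration, line))
            pending_duration = None
    return first_sequence, segments, ended


def new_segments(segments, last_played):
    """Keep only the segments whose sequence number is above last_played."""
    return [s for s in segments if s[0] > last_played]
```

A live player would reload the playlist periodically and pass the sequence number of the last segment it played to `new_segments`, so that segments which have already rolled out of the window are never requested.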

For more information about HLS, see the specification.

Below is the main part of the streamer code (abridged):

# -*- coding: utf-8 -*-
from __future__ import absolute_import, print_function

import os
from collections import deque

from gevent import monkey; monkey.patch_all()
from bottle import route, run, static_file, request, response, hook

WEBROOT = os.path.abspath('./webroot')

LIVE_MEDIA_FOLDER = os.path.join(WEBROOT, 'live')

ROLLING_WINDOW = 10
playlist = deque(maxlen=ROLLING_WINDOW)

published_segments = 0


@hook('after_request')
def no_cache():
    response.set_header('Cache-Control', 'no-cache, no-store, must-revalidate')
    response.set_header('Pragma', 'no-cache')
    response.set_header('Expires', '0')


@route('/stream.m3u8')
def live_stream_meta():
    # Build the live playlist from the segments currently in the rolling
    # window. The globals are only read here, so no `global` is needed.
    print("Serve playlist")
    response.content_type = 'application/x-mpegURL'

    result = list()
    result.append('#EXTM3U\n')
    result.append('#EXT-X-VERSION:3\n')
    # result.append('#EXT-X-PLAYLIST-TYPE:EVENT\n')
    result.append('#EXT-X-TARGETDURATION:3\n')

    if len(playlist) == 0:
        result.append('#EXT-X-MEDIA-SEQUENCE:0\n')
    else:
        sequence = playlist[0][2]
        result.append('#EXT-X-MEDIA-SEQUENCE:%d\n' % sequence)

    for name, duration, sequence in playlist:
        result.append('#EXTINF:%s,\n' % duration)
        result.append('/live/%s\n' % name)
    # result.append('#EXT-X-ENDLIST')
    print(result)
    return result


@route('/live/<filename>')
def live_stream_data(filename):
    print("Serve stream data:", filename)
    response.content_type = 'video/mp2t'
    in_file = os.path.join(LIVE_MEDIA_FOLDER, filename)
    # Segments are binary MPEG-2 TS data, so open the file in binary mode.
    with open(in_file, 'rb') as f:
        return f.read()


@route('/publish/<filename:path>/<duration>')
def publish(filename, duration):
    # Called by the publisher for every new segment. The deque drops the
    # oldest entry by itself once ROLLING_WINDOW entries are stored.
    global published_segments
    playlist.append((filename, duration, published_segments))
    print("Published segment:(%s, %s)" % (filename, duration))
    published_segments += 1



def main():
    run(host='0.0.0.0', port=8080, server='gevent')


if __name__ == '__main__':
    main()

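where:

live_stream_meta

: Serves the stream metadata that the player needs.

live_stream_data

: Serves the media segment data to the player.

publish

: Lets the publisher announce that a new segment has been produced. The publisher must supply the file name and the duration of the segment.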
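
The playlist generation inside live_stream_meta can also be written as a pure function, which makes the rolling window easier to test without a running server. The sketch below is one possible way to do it, not the code used in this post: the name `render_live_playlist` and its parameters are hypothetical, and it assumes durations are stored as floats. It also checks each segment against the declared target duration, since a segment's duration rounded to the nearest integer should not exceed it.

```python
from collections import deque


def render_live_playlist(window, target_duration=3, version=3):
    """Render (name, duration, sequence) tuples as a live HLS playlist."""
    entries = list(window)
    for name, duration, _ in entries:
        # Round half up, then compare with the declared target duration.
        if int(duration + 0.5) > target_duration:
            raise ValueError('segment %s is too long: %.3fs' % (name, duration))
    # Sequence number of the oldest segment still inside the window.
    first_sequence = entries[0][2] if entries else 0
    lines = [
        '#EXTM3U',
        '#EXT-X-VERSION:%d' % version,
        '#EXT-X-TARGETDURATION:%d' % target_duration,
        '#EXT-X-MEDIA-SEQUENCE:%d' % first_sequence,
    ]
    for name, duration, _ in entries:
        lines.append('#EXTINF:%.6f,' % duration)
        lines.append('/live/%s' % name)
    # No #EXT-X-ENDLIST is written, so the player keeps reloading the list.
    return '\n'.join(lines) + '\n'


# Usage: a window of two segments after three have been published.
window = deque(maxlen=2)
for sequence, (name, duration) in enumerate(
        [('seg-0.ts', 2.0), ('seg-1.ts', 2.25), ('seg-2.ts', 1.5)]):
    window.append((name, duration, sequence))
playlist_text = render_live_playlist(window)
```

With a window of two, the oldest segment has already been dropped, so the rendered playlist starts at media sequence 1 and lists only seg-1.ts and seg-2.ts.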

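Publisher

From an implementation point of view the publisher is the harder part. Because I wanted a live video source, the MacBook's built-in camera was the natural choice. The frames also have to be encoded into a format that HLS streaming accepts (MPEG-2 TS), and at first I really did not know where to start.

For transcoding I evaluated GStreamer (since Kivy seems to use it), but I had no idea how to assemble a pipeline, so I skipped it. The other obvious candidate for video transcoding is the well-known ffmpeg. The question then is which Python binding to use. I will skip the details: in the end I chose the PyAV³ library. If you know of a better option, please let me know.

Next comes image capture. At first I was still wondering how to do it with GStreamer or OpenCV, and then I found that ffmpeg already supports it. Fortunately, PyAV also provides a relevant example: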

source = av.open(format='avfoundation', file='0')

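Here `av` is the PyAV package name, and the file argument '0' is passed to ffmpeg's avfoundation input, which selects a capture device by index. This only works on OS X, of course.

Below is the publisher code: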

# -*- coding: utf-8 -*-
from __future__ import absolute_import, print_function

import os
import time
import urllib2
import av
import threading
from Queue import Queue

OUTPUT_FOLDER = os.path.abspath('./webroot/live')

interrupted = False


class SegmentSubmitter(threading.Thread):

    def __init__(self):
        super(SegmentSubmitter, self).__init__()
        self.queue = Queue()
        # Daemon thread: it must not keep the process alive on exit.
        self.daemon = True

    def put_item(self, item):
        self.queue.put_nowait(item)

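    def run(self):
        print("Submitter started.")
        while True:
            item = self.queue.get()
            # An empty item is the signal to stop the thread.
            if len(item) == 0:
                break

            print("Submitting %s" % item[0])
            # item is (filename, duration); both go into the publish URL.
            url = 'http://127.0.0.1:8080/publish/%s/%f' % item
            try:
                urllib2.urlopen(url=url).read()
            except urllib2.URLError as e:
                # Report the failure instead of dropping it silently.
                print("Failed to submit %s: %s" % (item[0], e))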


def gen_segment(filename, source, bit_rate=1000000, vcodec='h264',
                pix_fmt='yuv420p', frame_rate=20, duration=2):
    global interrupted

    out_filename = os.path.join(OUTPUT_FOLDER, filename)
    output = av.open(out_filename, 'w')

    outs = output.add_stream(vcodec, str(frame_rate))
    outs.bit_rate = bit_rate
    outs.pix_fmt = pix_fmt
    outs.width = 640
    outs.height = 480
    secs_per_frame = 1.0 / frame_rate
    frame_count = 0
    segment_start_time = time.time()

    while True:
        start_time = time.time()
        packet = next(source)

        for frame in packet.decode():
            frame.pts = None
            out_packet = outs.encode(frame)
            frame_count += 1
            if out_packet:
                output.mux(out_packet)

        if (time.time() - segment_start_time) > duration:
            break

        time_to_wait = start_time + secs_per_frame - time.time()
        if time_to_wait > 0:
            try:
                time.sleep(time_to_wait)
            except KeyboardInterrupt:
                interrupted = True
                break

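    # Flush the frames still buffered in the encoder. Every frame was
    # already counted when it was submitted, so frame_count is not
    # incremented again here.
    while True:
        out_packet = outs.encode()
        if out_packet:
            output.mux(out_packet)
        else:
            break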

    output.close()

    segment_duration = time.time() - segment_start_time
    return segment_duration, frame_count


def publish(source):
    global interrupted

    num_segments = 0
    submitter = SegmentSubmitter()
    submitter.start()

    stream = next(s for s in source.streams if s.type == 'video')
    it = source.demux(stream)

    while not interrupted:
        filename = 'seg-%d.ts' % num_segments
        print("Generating segment: %s" % filename)
        num_segments += 1
        duration, frame_count = gen_segment(filename, it)
        print("Segment generated: (%s, %f, %d)" %
              (filename, duration, frame_count))
        submitter.put_item((filename, duration))


def main():
    source = av.open(format='avfoundation', file='0')
    # Alternative source for testing without a camera:
    # source = av.open('movie.mp4', 'r')

    print("Number of streams in source: %d" % len(source.streams))

    publish(source)

if __name__ == '__main__':
    main()

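Two threads are at work: one captures frames and produces the segment files, and the other notifies the streamer that a new segment is available.
Whether because the program is inefficient or for some other reason, the frame rate from the source tops out at about 20 frames per second.
The code adjusts the capture interval according to the frame rate to keep the video from playing back too fast. This improved the result, but it still seems to need more work.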
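
One possible refinement of the pacing, offered only as an untested idea and not as a confirmed fix for the choppiness: instead of computing each sleep from the start of the current loop iteration, schedule frames on a fixed grid of deadlines so that time spent outside the measured part of the loop (for example while a segment file is being closed and the next one opened) does not accumulate. The class name `FramePacer` is hypothetical, and the `clock` and `sleep` parameters exist only so the logic can be exercised without a camera.

```python
import time


class FramePacer(object):
    """Pace a loop on a fixed grid: start, start + 1/fps, start + 2/fps..."""

    def __init__(self, frame_rate, clock=time.time, sleep=time.sleep):
        self.interval = 1.0 / frame_rate
        self.clock = clock
        self.sleep = sleep
        self.next_deadline = None

    def wait(self):
        """Sleep until the next frame slot and return the seconds slept."""
        now = self.clock()
        if self.next_deadline is None:
            # The first call defines the origin of the grid.
            self.next_deadline = now
        delay = max(0.0, self.next_deadline - now)
        if delay > 0:
            self.sleep(delay)
        # Move to the next slot on the grid.
        self.next_deadline += self.interval
        if self.next_deadline < now:
            # More than one frame late: restart the grid from now rather
            # than rushing through a burst of frames to catch up.
            self.next_deadline = now + self.interval
        return delay
```

In gen_segment, a single pacer created once in publish and shared across segments could replace the `time_to_wait` computation, with `pacer.wait()` called before each packet is read. Whether this actually helps with the camera is something I have not measured.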

Using Flowplayer to watch HLS streams in other browsers

Apart from Apple's own Safari, other browsers do not yet fully support HLS and need special handling. Below is an example that uses Flowplayer:

<!doctype html>

<head>
  <link rel="stylesheet" href="player/skin/functional.css" />

  <!-- site specific styling -->
  <style>
    body {
      font:
        12px "Myriad Pro",
        "Lucida Grande",
        sans-serif;
      text-align: center;
      padding-top: 5%;
    }
    .flowplayer {
      width: 80%;
    }
  </style>

  <!-- for video tag based installs flowplayer depends on jQuery 1.7.2+ -->
  <script src="https://code.jquery.com/jquery-1.11.2.min.js"></script>

  <!-- include flowplayer -->
  <script src="player/flowplayer.min.js"></script>
</head>

<body>
  <!-- the player -->
  <div class="flowplayer" data-swf="/player/flowplayer.swf" data-ratio="0.4167">
    <video>
      <source
        type="application/x-mpegurl"
        src="http://127.0.0.1:8080/stream.m3u8"
      />
    </video>
  </div>
</body>

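In practice playback is very choppy. Since Safari is also a little choppy, the problem is probably in my program.

Conclusion

The examples in this post still have plenty of pitfalls, and I would really welcome guidance from anyone more experienced. The biggest takeaway along the way turned out to be discovering that Nginx (through plugins) can already support several streaming protocols!

References

¹ Gevent: http://www.gevent.org/
² Bottle: http://bottlepy.org/docs/dev/index.html
³ PyAV: https://github.com/mikeboers/PyAV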

郭信義 (Sam Kuo)