16. API - mmalobj

This module provides an object-oriented interface to libmmal, the library underlying picamera, raspistill, and raspivid. It is provided to ease the use of libmmal for Python coders unfamiliar with C, and it also works around some of libmmal's idiosyncrasies.

Warning

This part of the API is still experimental and subject to change in future versions. Backwards compatibility is not (yet) guaranteed.

16.1. The MMAL Tour

MMAL operates on the principle of pipelines:

  • A pipeline consists of one or more MMAL components (MMALBaseComponent and derivatives) connected together in series.
  • A MMALBaseComponent has one or more ports.
  • A port (MMALControlPort and derivatives) is either a control port, an input port or an output port (there are also clock ports but you generally don’t need to deal with these as MMAL sets them up automatically):
    • Control ports are used to send and receive commands, configuration parameters, and error messages. All MMAL components have a control port, but in picamera they’re only used for component configuration.
    • Input ports receive data from upstream components.
    • Output ports send data onto downstream components (if they’re connected), or to callback routines in the user’s program (if they’re not connected).
    • Input and output ports can be audio, video or sub-picture (subtitle) ports, but picamera only deals with video ports.
    • Ports have a format which (in the case of video ports) dictates the format of image/frame accepted or generated by the port (YUV, RGB, JPEG, H.264, etc.)
    • Video ports have a framerate which specifies the number of images expected to be received or sent per second.
    • Video ports also have a framesize which specifies the resolution of images/frames accepted or generated by the port.
    • Finally, all ports (control, input and output) have params which affect their operation.
  • An output port can have a MMALConnection to an input port. Connections ensure the two ports use compatible formats, and handle transferring data from output ports to input ports in an orderly fashion. A port cannot have more than one connection from/to it.
  • Data is written to / read from ports via instances of MMALBuffer.
    • Buffers belong to a port and can’t be passed arbitrarily between ports.
    • The size of a buffer is dictated by the format and frame-size of the port that owns it. The memory allocated for a buffer (readable from size) cannot be altered once the port is enabled, but the buffer can contain any amount of data up to its allocation size. The actual length of data in a buffer is stored in length.
    • Likewise, the number of buffers belonging to a port is fixed and cannot be altered without disabling the port, reconfiguring it and re-enabling it. The more buffers a port has, the less likely it is that the pipeline will have to drop frames because a component has overrun, but the more GPU memory is required.
    • Buffers also have flags which specify information about the data they contain (e.g. start of frame, end of frame, key frame, etc.)
    • When a connection exists between two ports, the connection continually requests a buffer from the output port, requests another buffer from the input port, copies the output buffer’s data to the input buffer’s data, then returns the buffers to their respective ports (this is a simplification; various tricks are pulled under the covers to minimize the amount of data copying that actually occurs, but as a mental model of what’s going on it’s reasonable).
    • Components take buffers from their input port(s), process them, and write the result into a buffer from the output port(s).

16.1.1. Components

Now that we’ve got a mental model of what an MMAL pipeline consists of, let’s build one. For the rest of the tour I strongly recommend using a Pi with a screen (so you can see the preview output) but controlling it via an SSH session (so the preview doesn’t cover your command line). Follow along, typing the examples into your remote Python session. And feel free to deviate from the examples if you’re curious about things!

We’ll start by importing the mmalobj module with a convenient alias, then construct a MMALCamera component, and a MMALRenderer component.

>>> from picamera import mmal, mmalobj as mo
>>> camera = mo.MMALCamera()
>>> preview = mo.MMALRenderer()

16.1.2. Ports

Before going any further, let’s have a look at the various ports on these components.

>>> len(camera.inputs)
0
>>> len(camera.outputs)
3
>>> len(preview.inputs)
1
>>> len(preview.outputs)
0

The fact that the camera has three outputs should come as little surprise to those who have read the Camera Hardware chapter (if you haven’t already, you might want to skim it now). Let’s examine the first output port of the camera and the input port of the renderer:

>>> camera.outputs[0]
<MMALVideoPort "vc.ril.camera:out:0": format=MMAL_FOURCC('I420')
buffers=1x7680 frames=320x240@0fps>
>>> preview.inputs[0]
<MMALVideoPort "vc.ril.video_render:in:0" format=MMAL_FOURCC('I420')
buffers=2x15360 frames=160x64@0fps>

Several things to note here:

  • We can tell from the port name what sort of component it belongs to, what its index is, and whether it’s an input or an output port.
  • Both ports are currently configured for the I420 format; this is MMAL’s name for YUV420 (full resolution Y, quarter resolution UV).
  • The ports have different frame-sizes (320x240 vs 160x64), buffer counts (1 vs 2), and buffer sizes (7680 vs 15360 bytes).
  • The buffer sizes look unrealistic. For example, 7680 bytes is nowhere near enough to hold a 320x240 frame: YUV420 requires 1.5 bytes per pixel, so a full frame would need 320 * 240 * 1.5 = 115200 bytes.

Now we’ll configure the camera’s output port with a slightly higher resolution, and give it a frame-rate:

>>> camera.outputs[0].framesize = (640, 480)
>>> camera.outputs[0].framerate = 30
>>> camera.outputs[0].commit()
>>> camera.outputs[0]
<MMALVideoPort "vc.ril.camera:out:0(I420)": format=MMAL_FOURCC('I420')
buffers=1x460800 frames=640x480@30fps>

The changes to the configuration don’t actually take effect until the commit() call. Once the port is committed, the buffer size looks much more plausible: 640 * 480 * 1.5 = 460800.

16.1.3. Connections

Now we’ll try connecting the renderer’s input to the camera’s output. Don’t worry about the fact that the port configurations are different. One of the nice things about MMAL (and the mmalobj layer) is that connections try very hard to auto-configure things so that they “just work”. Usually, auto-configuration is based upon the output port being connected so it’s important to get that configuration right, but you don’t generally need to worry about the input port.

The renderer is what mmalobj terms a “downstream component”. This is a component with a single input that typically sits downstream from some feeder component (like a camera). All such components have the connect() method which can be used to connect the sole input to a specified output:

>>> preview.connect(camera)
<MMALConnection "vc.ril.camera:out:0/vc.ril.video_render:in:0">
>>> preview.connection.enable()

Note that we’ve been quite lazy in the snippet above by simply calling connect() with the camera component. In this case, a connection will be attempted between the first input port of the owner (preview) and the first unconnected output of the argument (camera). However, this is not always what’s wanted, so you can specify the exact ports you wish to connect. In this case the example was equivalent to calling:

>>> preview.inputs[0].connect(camera.outputs[0])
<MMALConnection "vc.ril.camera:out:0/vc.ril.video_render:in:0">
>>> preview.inputs[0].connection.enable()

Note that the connect() method returns the connection that was constructed but you can also retrieve this by querying the port’s connection attribute later.

As soon as the connection is enabled you should see the camera preview appear on the Pi’s screen. Let’s query the port configurations now:

>>> camera.outputs[0]
<MMALVideoPort "vc.ril.camera:out:0(OPQV)": format=MMAL_FOURCC('OPQV')
buffers=10x128 frames=640x480@30fps>
>>> preview.inputs[0]
<MMALVideoPort "vc.ril.video_render:in:0(OPQV)": format=MMAL_FOURCC('OPQV')
buffers=10x128 frames=640x480@30fps>

Note that the connection has implicitly reconfigured the camera’s output port to use the OPAQUE (“OPQV”) format. This is a special format used internally by the camera firmware which avoids passing complete frame data around, instead passing pointers to frame data around (this explains the tiny buffer size of 128 bytes as very little data is actually being shuttled between the components). Further, note that the connection has automatically copied the port format, frame size and frame-rate to the preview’s input port.

[diagram: the preview pipeline]

16.1.4. Opaque Format

At this point it is worth exploring the differences between the camera’s three output ports:

  • Output 0 is the “preview” output. On this port, the OPAQUE format contains a pointer to a complete frame of data.
  • Output 1 is the “video recording” output. On this port, the OPAQUE format contains a pointer to two complete frames of data. The dual-frame format enables the H.264 video encoder to perform motion estimation without having to keep copies of prior frames itself (it can do this when something other than OPAQUE format is used, but dual-image OPAQUE is much more efficient).
  • Output 2 is the “still image” output. On this port, the OPAQUE format contains a pointer to a strip of an image. The “strips” format is used by the JPEG encoder (not to be confused with the MJPEG encoder) to deal with high resolution images efficiently.

Generally, you don’t need to worry about these differences. The mmalobj layer knows about them and negotiates the most efficient format it can for connections. However, they’re worth bearing in mind if you’re aiming to get the most out of the firmware or if you’re confused about why a particular format has been selected for a connection.
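
As a hedged aside (not part of the tour), connections negotiate their format from a list of permitted encodings, and connect() forwards keyword arguments to the underlying connection constructor. The formats keyword below is based on the MMALConnection class documented later in this chapter; treat its exact spelling as an assumption to verify there:

# Sketch: force a connection to negotiate I420 rather than OPAQUE
preview.disconnect()
conn = preview.inputs[0].connect(
    camera.outputs[0], formats=[mmal.MMAL_ENCODING_I420])
conn.enable()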

16.1.5. Component Configuration

So far we’ve seen how to construct components, configure their ports, and connect them together in rudimentary pipelines. Now, let’s see how to configure components via control port parameters:

>>> camera.control.params[mmal.MMAL_PARAMETER_SYSTEM_TIME]
177572014208
>>> camera.control.params[mmal.MMAL_PARAMETER_SYSTEM_TIME]
177574350658
>>> camera.control.params[mmal.MMAL_PARAMETER_BRIGHTNESS]
Fraction(1, 2)
>>> camera.control.params[mmal.MMAL_PARAMETER_BRIGHTNESS] = 0.75
>>> camera.control.params[mmal.MMAL_PARAMETER_BRIGHTNESS]
Fraction(3, 4)
>>> fx = camera.control.params[mmal.MMAL_PARAMETER_IMAGE_EFFECT]
>>> fx
<picamera.mmal.MMAL_PARAMETER_IMAGEFX_T object at 0x765b8440>
>>> dir(fx)
['__class__', '__ctypes_from_outparam__', '__delattr__', '__dict__',
'__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',
'__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__',
'__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__',
'__weakref__', '_b_base_', '_b_needsfree_', '_fields_', '_objects', 'hdr',
'value']
>>> fx.value
0
>>> mmal.MMAL_PARAM_IMAGEFX_NONE
0
>>> fx.value = mmal.MMAL_PARAM_IMAGEFX_EMBOSS
>>> camera.control.params[mmal.MMAL_PARAMETER_IMAGE_EFFECT] = fx
>>> camera.control.params[mmal.MMAL_PARAMETER_BRIGHTNESS] = 1/2
>>> camera.control.params[mmal.MMAL_PARAMETER_IMAGE_EFFECT] = mmal.MMAL_PARAM_IMAGEFX_NONE
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pi/picamera/picamera/mmalobj.py", line 1109, in __setitem__
    assert mp.hdr.id == key
AttributeError: 'int' object has no attribute 'hdr'
>>> fx.value = mmal.MMAL_PARAM_IMAGEFX_NONE
>>> camera.control.params[mmal.MMAL_PARAMETER_IMAGE_EFFECT] = fx
>>> preview.disconnect()

Things to note:

  • The parameter dictates the type of the value returned (and accepted, if the parameter is read-write).
  • Many parameters accept a multitude of simple types like int, float, Fraction, str, etc. However, some parameters use ctypes structures and such parameters only accept the relevant structure.
  • The easiest way to use such “structured” parameters is to query them first, modify the resulting structure, then write it back to the parameter.

To find out what parameters are available for use with the camera component, have a look at the source for the PiCamera class, especially property getters and setters.
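
To recap the idiom for “structured” parameters without the exploratory detours above, the whole round-trip looks like this (a sketch using the sketch-effect constant; any of the MMAL_PARAM_IMAGEFX_* constants should work the same way):

>>> fx = camera.control.params[mmal.MMAL_PARAMETER_IMAGE_EFFECT]  # query
>>> fx.value = mmal.MMAL_PARAM_IMAGEFX_SKETCH                     # modify
>>> camera.control.params[mmal.MMAL_PARAMETER_IMAGE_EFFECT] = fx  # write back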

16.1.6. File Output (RGB capture)

Let’s see how we can produce some file output from the camera. First we’ll perform a straight unencoded RGB capture from the still port (2). As this is unencoded output we don’t need to construct anything else. All we need to do is configure the port for RGB encoding, select an appropriate resolution, then activate the output port:

>>> camera.outputs[2].format = mmal.MMAL_ENCODING_RGB24
>>> camera.outputs[2].framesize = (640, 480)
>>> camera.outputs[2].commit()
>>> camera.outputs[2]
<MMALVideoPort "vc.ril.camera:out:2(RGB3)": format=MMAL_FOURCC('RGB3')
buffers=1x921600 frames=640x480@0fps>
>>> camera.outputs[2].enable()

Unfortunately, that didn’t seem to do much! An output port that is participating in a connection needs nothing more: it knows where its data is going. However, an output port without a connection requires a callback function to be assigned so that something can be done with the buffers of data it produces.

The callback will be given two parameters: the MMALPort responsible for producing the data, and the MMALBuffer containing the data. It is expected to return a bool: False if further data is expected, True if not. Once True is returned, the callback will not be executed again. In our case we’re going to write data out to a file we’ll open before-hand, and we should return True when we see a buffer with the “frame end” flag set:

>>> camera.outputs[2].disable()
>>> import io
>>> output = io.open('image.data', 'wb')
>>> def image_callback(port, buf):
...     output.write(buf.data)
...     return bool(buf.flags & mmal.MMAL_BUFFER_HEADER_FLAG_FRAME_END)
...
>>> camera.outputs[2].enable(image_callback)
>>> output.tell()
0

At this stage you may note that while the file exists, nothing’s been written to it. This is because output ports 1 and 2 (the video and still ports) won’t produce any buffers until their “capture” parameter is enabled:

>>> camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = True
>>> camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = False
>>> output.tell()
921600
>>> camera.outputs[2].disable()
>>> output.close()
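
The resulting file is nothing more than raw pixel data. If you happen to have numpy installed, here’s a quick sketch for loading it (the shape follows from 640 x 480 pixels at 3 bytes per pixel, matching the 921600 bytes reported above):

>>> import numpy as np
>>> rgb = np.fromfile('image.data', dtype=np.uint8).reshape((480, 640, 3))
>>> rgb.shape
(480, 640, 3)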

Congratulations! You’ve just captured your first image with the MMAL layer. Given that we disconnected the preview above, the current state of the system looks something like this:

[diagram: the RGB capture pipeline]

16.1.7. File Output (JPEG capture)

Whilst RGB is a useful format for processing, we’d generally prefer something like JPEG for output. So, next we’ll construct an MMAL JPEG encoder and use it to compress our RGB capture. Note that we’re not going to connect the JPEG encoder to the camera yet; we’re just going to construct it standalone and feed it data from our capture file, writing the output to another file:

>>> encoder = mo.MMALImageEncoder()
>>> encoder.inputs
(<MMALVideoPort "vc.ril.image_encode:in:0": format=MMAL_FOURCC('RGB2')
buffers=1x15360 frames=96x80@0fps>,)
>>> encoder.outputs
(<MMALVideoPort "vc.ril.image_encode:out:0": format=MMAL_FOURCC('GIF ')
buffers=1x81920 frames=0x0@0fps>,)
>>> encoder.inputs[0].format = mmal.MMAL_ENCODING_RGB24
>>> encoder.inputs[0].framesize = (640, 480)
>>> encoder.inputs[0].commit()
>>> encoder.outputs[0].copy_from(encoder.inputs[0])
>>> encoder.outputs[0]
<MMALVideoPort "vc.ril.image_encode:out:0": format=MMAL_FOURCC('RGB3')
buffers=1x81920 frames=640x480@0fps>
>>> encoder.outputs[0].format = mmal.MMAL_ENCODING_JPEG
>>> encoder.outputs[0].commit()
>>> encoder.outputs[0]
<MMALVideoPort "vc.ril.image_encode:out:0(JPEG)": format=MMAL_FOURCC('JPEG')
buffers=1x307200 frames=0x0@0fps>
>>> encoder.outputs[0].params[mmal.MMAL_PARAMETER_JPEG_Q_FACTOR] = 90

Just pausing for a moment, let’s recap what we’ve got: an image encoder constructed, configured for 640x480 RGB input and JPEG output with a quality factor of 90 (i.e. “very good”; don’t try to read much more than that into JPEG quality settings!). Note that MMAL has picked a buffer size it thinks will be typical for the output. As JPEG is a lossy format this won’t be precise, and it’s entirely possible that we may receive multiple callbacks for a single frame (if the compression overruns the expected buffer size).

Let’s continue:

>>> rgb_data = io.open('image.data', 'rb')
>>> jpg_data = io.open('image.jpg', 'wb')
>>> def image_callback(port, buf):
...     jpg_data.write(buf.data)
...     return bool(buf.flags & mmal.MMAL_BUFFER_HEADER_FLAG_FRAME_END)
...
>>> encoder.outputs[0].enable(image_callback)

16.1.8. File Input (JPEG encoding)

How do we feed data to a component without a connection? We enable its input port with a dummy callback (we don’t need to “do” anything on data input). Then we request buffers from its input port, fill them with data and send them back to the input port:

>>> encoder.inputs[0].enable(lambda port, buf: True)
>>> buf = encoder.inputs[0].get_buffer()
>>> buf.data = rgb_data.read()
>>> encoder.inputs[0].send_buffer(buf)
>>> jpg_data.tell()
87830
>>> encoder.outputs[0].disable()
>>> encoder.inputs[0].disable()
>>> jpg_data.close()
>>> rgb_data.close()

Congratulations again! You’ve just produced a hardware-accelerated JPEG encoding. The following illustrates the state of the system at this point (note that the camera and renderer still exist; they’re just not connected to anything at the moment):

[diagram: the standalone JPEG encoding pipeline]

Now let’s repeat the process but with the encoder attached to the still port on the camera directly. We can re-use our image_callback routine from earlier and just assign a different output file to jpg_data:

>>> encoder.connect(camera.outputs[2])
<MMALConnection "vc.ril.camera:out:2/vc.ril.image_encode:in:0">
>>> encoder.connection.enable()
>>> encoder.inputs[0]
<MMALVideoPort "vc.ril.image_encode:in:0(OPQV)": format=MMAL_FOURCC('OPQV')
buffers=10x128 frames=640x480@0fps>
>>> jpg_data = io.open('direct.jpg', 'wb')
>>> encoder.outputs[0].enable(image_callback)
>>> camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = True
>>> camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = False
>>> jpg_data.tell()
99328
>>> encoder.connection.disable()
>>> jpg_data.close()

Now the state of our system looks like this:

[diagram: the direct JPEG capture pipeline]

16.1.9. Threads & Synchronization

The one issue you may have noticed is that image_callback runs in a background thread. If we were running our capture extremely fast, our main thread might disable the capture before our callback had run. Ideally we want to activate capture, wait on some signal indicating that the callback has completed a single frame successfully, then disable capture. We can do this with the communication primitives from the standard threading module:

>>> from threading import Event
>>> finished = Event()
>>> def image_callback(port, buf):
...     jpg_data.write(buf.data)
...     if buf.flags & mmal.MMAL_BUFFER_HEADER_FLAG_FRAME_END:
...         finished.set()
...         return True
...     return False
...
>>> def do_capture(filename='direct.jpg'):
...     global jpg_data
...     jpg_data = io.open(filename, 'wb')
...     finished.clear()
...     encoder.outputs[0].enable(image_callback)
...     camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = True
...     if not finished.wait(10):
...         raise Exception('capture timed out')
...     camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = False
...     encoder.outputs[0].disable()
...     jpg_data.close()
...
>>> do_capture()

The above example has several rough edges: globals, no proper clean-up in the case of an exception, etc., but by now you should be getting a pretty good idea of how picamera operates under the hood.

The major difference between picamera and a “typical” MMAL setup is that upon construction, the PiCamera class constructs both a MMALCamera (accessible as PiCamera._camera) and a MMALSplitter (accessible as PiCamera._splitter). The splitter remains permanently attached to the camera’s video port (output port 1). Furthermore, there’s always something connected to the camera’s preview port; by default it’s a MMALNullSink component which is switched with a MMALRenderer when the preview is started.
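
A quick sketch of peeking at these internals yourself, in a fresh Python session (the firmware only permits one client of the camera component at a time, so any MMALCamera constructed above must be released first). These are private attributes, so treat them as subject to change:

import picamera

cam = picamera.PiCamera()
print(cam._camera)    # the underlying MMALCamera
print(cam._splitter)  # permanently attached to the camera's video port
cam.close()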

Encoders are constructed and destroyed as required by calls to capture(), start_recording(), etc. The following illustrates a typical picamera pipeline whilst video recording without a preview:

[diagram: a typical picamera pipeline whilst recording video without a preview]

16.1.10. Debugging Facilities

Before we move on to the pure Python components it’s worth mentioning the debugging capabilities built into mmalobj. Firstly, most objects have useful repr() outputs (in particular, it can be instructive to simply evaluate a MMALBuffer to see what flags it’s got and how much data is stored in it). Also, there’s the print_pipeline() function. Give this a port and it’ll dump a human-readable version of your pipeline leading up to that port:

>>> preview.inputs[0].enable(lambda port, buf: True)
>>> buf = preview.inputs[0].get_buffer()
>>> buf
<MMALBuffer object: flags=_____ length=0>
>>> buf.flags = mmal.MMAL_BUFFER_HEADER_FLAG_FRAME_END
>>> buf
<MMALBuffer object: flags=E____ length=0>
>>> buf.release()
>>> preview.inputs[0].disable()
>>> mo.print_pipeline(encoder.outputs[0])
 vc.ril.camera [2]                           [0] vc.ril.image_encode [0]
   encoding    OPQV-strips    -->    OPQV-strips      encoding       JPEG
      buf      10x128                     10x128         buf         1x307200
    bitrate    0bps                         0bps       bitrate       0bps
     frame     640x480@0fps         640x480@0fps        frame        0x0@0fps

16.1.11. Python Components

So far all the components we’ve looked at have been “real” MMAL components which is to say that they’re implemented in C, and all talk to bits of the firmware running on the GPU. However, a frequent request has been to be able to modify frames from the camera before they reach the image or video encoder. The Python components are an attempt to make this request relatively simple to achieve from within Python.

The means by which this is achieved are inefficient (to say the least), so don’t expect this to work at high resolutions or framerates. The mmalobj layer in picamera includes the concept of a “Python MMAL” component. To the user these components look a lot like the MMAL components you’ve been playing with above (MMALCamera, MMALImageEncoder, etc.). They are instantiated in a similar manner, they have the same sort of ports, and they’re connected using the same means as ordinary MMAL components.

Let’s try this out by placing a transformation between the camera and the preview which will draw a cross over the frames going to the preview. For this we’ll subclass picamera.array.PiArrayTransform. This derives from MMALPythonComponent and adds the useful capability of presenting the source and target buffers as numpy arrays containing RGB data:

>>> from picamera import array
>>> class Crosshair(array.PiArrayTransform):
...     def transform(self, source, target):
...         with source as sdata, target as tdata:
...             tdata[...] = sdata
...             tdata[240, :, :] = 0xff
...             tdata[:, 320, :] = 0xff
...         return False
...
>>> transform = Crosshair()

That’s all there is to constructing a transform! This one is a bit crude inasmuch as the coordinates are hard-coded, and it’s very simplistic, but it should illustrate the principle nicely. Let’s connect it up between the camera and the renderer:

>>> transform.connect(camera)
<MMALPythonConnection "vc.ril.camera.out:0(RGB3)/py.component:in:0">
>>> preview.connect(transform)
<MMALPythonConnection "py.component:out:0/vc.ril.video_render:in:0(RGB3)">
>>> transform.connection.enable()
>>> preview.connection.enable()
>>> transform.enable()

At this point we should take a look at the pipeline to see what’s been configured automatically:

>>> mo.print_pipeline(preview.inputs[0])
 vc.ril.camera [0]                             [0] py.transform [0]                             [0] vc.ril.video_render
   encoding    RGB3            -->            RGB3   encoding   RGB3            -->            RGB3      encoding
      buf      1x921600                   2x921600     buf      2x921600                   2x921600         buf
     frame     640x480@30fps         640x480@30fps    frame     640x480@30fps         640x480@30fps        frame

Apparently the MMAL camera component is outputting RGB data (which is extremely large) to a “py.transform” component, which draws our cross-hair on the buffer and passes it on to the renderer, again as RGB. This is part of the inefficiency alluded to above: RGB is a very large format (compared to I420 which is half its size, or OPQV which is tiny) so we’re shuttling a lot of data around here. Expect this to drop frames at higher resolutions or framerates.

The other source of inefficiency isn’t obvious from the debug output above, which gives the impression that the “py.transform” component is actually part of the MMAL pipeline. In fact, this is a lie. Under the covers mmalobj installs an output callback on the camera’s output port to feed data to the “py.transform” input port, uses a background thread to run the transform, then copies the results into buffers obtained from the preview’s input port. In other words, there are really two (very short) MMAL pipelines with a hunk of Python running in between them. If mmalobj does its job properly you shouldn’t need to worry about this implementation detail, but it’s worth bearing in mind from the perspective of performance.

16.1.12. Performance Hints

Generally you want your frame handlers to be fast. To avoid dropping frames they’ve got to run in less than a frame’s time (e.g. 33ms at 30fps). Bear in mind that a significant amount of time is going to be spent shuttling the huge RGB frames around, so you’ve actually got much less than 33ms available to you (how much will depend on the speed of your Pi, what resolution you’re using, the framerate, etc.).

Sometimes, performance can mean making unintuitive choices. For example, the Pillow library (the main imaging library in Python these days) can construct images which share buffer memory (see Image.frombuffer), but only for the indexed (grayscale) and RGBA formats, not RGB. Hence, it can make sense to use RGBA (a format even larger than RGB) if only because it allows you to avoid copying any data when performing a composite.
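
For example, here’s a minimal sketch of wrapping an RGBA frame without copying, assuming data is a writeable buffer holding a single 640x480 RGBA frame (as obtained inside a buffer’s “with” block, shown in the example below):

from PIL import Image

# Wrap the existing buffer; no pixel data is copied
img = Image.frombuffer('RGBA', (640, 480), data, 'raw', 'RGBA', 0, 1)
img.readonly = False  # we intend to write through to the shared buffer
# paste() / ImageDraw operations now modify the frame data in place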

Another trick is to realize that although YUV420 has different sized planes, it’s often enough to manipulate the Y plane only. In that case you can treat the front of the buffer as an indexed image (remember that Pillow can share buffer memory with such images) and manipulate that directly. With tricks like these it’s possible to perform multiple composites in realtime at 720p30 on a Pi3.

Here’s a (heavily commented) variant of the cross-hair example above that uses the lower level MMALPythonComponent class instead, and the Pillow library to perform compositing on YUV420 in the manner just described:

from picamera import mmal, mmalobj as mo, PiCameraPortDisabled
from PIL import Image, ImageDraw
from signal import pause


class Crosshair(mo.MMALPythonComponent):
    def __init__(self):
        super(Crosshair, self).__init__(name='py.crosshair')
        self._crosshair = None
        self.inputs[0].supported_formats = mmal.MMAL_ENCODING_I420

    def _handle_frame(self, port, buf):
        # If we haven't drawn the crosshair yet, do it now and cache the
        # result so we don't bother doing it again
        if self._crosshair is None:
            self._crosshair = Image.new('L', port.framesize)
            draw = ImageDraw.Draw(self._crosshair)
            draw.line([
                (port.framesize.width // 2, 0),
                (port.framesize.width // 2, port.framesize.height)],
                fill=(255,), width=1)
            draw.line([
                (0, port.framesize.height // 2),
                (port.framesize.width, port.framesize.height // 2)],
                fill=(255,), width=1)
        # buf is the buffer containing the frame from our input port. First
        # we try and grab a buffer from our output port
        try:
            out = self.outputs[0].get_buffer(False)
        except PiCameraPortDisabled:
            # The port was disabled; that probably means we're shutting down so
            # return True to indicate we're all done and the component should
            # be disabled
            return True
        else:
            if out:
                # We've got a buffer (if we don't get a buffer here it most
                # likely means things are going too slow downstream so we'll
                # just have to skip this frame); copy the input buffer to the
                # output buffer
                out.copy_from(buf)
                # now grab a locked reference to the buffer's data by using
                # "with"
                with out as data:
                    # Construct a PIL Image over the Y plane at the front of
                    # the data and tell PIL the buffer is writeable
                    img = Image.frombuffer('L', port.framesize, data, 'raw', 'L', 0, 1)
                    img.readonly = False
                    img.paste(self._crosshair, (0, 0), mask=self._crosshair)
                # Send the output buffer back to the output port so it can
                # continue onward to whatever's downstream
                try:
                    self.outputs[0].send_buffer(out)
                except PiCameraPortDisabled:
                    # The port was disabled; same as before this probably means
                    # we're shutting down so return True to indicate we're done
                    return True
            # Return False to indicate that we want to continue processing
            # frames. If we returned True here, the component would be
            # disabled and no further buffers would be processed
            return False


camera = mo.MMALCamera()
preview = mo.MMALRenderer()
transform = Crosshair()

camera.outputs[0].framesize = '720p'
camera.outputs[0].framerate = 30
camera.outputs[0].commit()

transform.connect(camera)
preview.connect(transform)

transform.connection.enable()
preview.connection.enable()

preview.enable()
transform.enable()
camera.enable()

pause()

It’s a sensible idea to perform any overlay rendering you want to do in a separate thread and then just handle compositing your overlay onto the frame in the MMALPythonComponent._handle_frame() method. Anything you can do to avoid buffer copying is a bonus here.

Here’s a final (rather large) demonstration that puts all these things together to construct a MMALPythonComponent derivative with two purposes:

  1. Render a partially transparent analogue clock in the top left of the frame.
  2. Produce two equivalent I420 outputs: one to feed a preview renderer, and another to feed an encoder (we could use a proper MMAL splitter for this, but it demonstrates that Python components can have multiple outputs too).

import datetime as dt
from threading import Thread, Lock
from collections import namedtuple
from math import sin, cos, pi
from time import sleep

from picamera import mmal, mmalobj as mo, PiCameraPortDisabled
from PIL import Image, ImageDraw


class Coord(namedtuple('Coord', ('x', 'y'))):
    @classmethod
    def clock_arm(cls, radians):
        return Coord(sin(radians), -cos(radians))

    def __add__(self, other):
        try:
            return Coord(self.x + other[0], self.y + other[1])
        except TypeError:
            return Coord(self.x + other, self.y + other)

    def __sub__(self, other):
        try:
            return Coord(self.x - other[0], self.y - other[1])
        except TypeError:
            return Coord(self.x - other, self.y - other)

    def __mul__(self, other):
        try:
            return Coord(self.x * other[0], self.y * other[1])
        except TypeError:
            return Coord(self.x * other, self.y * other)

    def __floordiv__(self, other):
        try:
            return Coord(self.x // other[0], self.y // other[1])
        except TypeError:
            return Coord(self.x // other, self.y // other)

    # yeah, I could do the rest (truediv, radd, rsub, etc.) but there's no
    # need here...


class ClockSplitter(mo.MMALPythonComponent):
    def __init__(self):
        super(ClockSplitter, self).__init__(name='py.clock', outputs=2)
        self.inputs[0].supported_formats = {mmal.MMAL_ENCODING_I420}
        self._lock = Lock()
        self._clock_image = None
        self._clock_thread = None

    def enable(self):
        super(ClockSplitter, self).enable()
        self._clock_thread = Thread(target=self._clock_run)
        self._clock_thread.daemon = True
        self._clock_thread.start()

    def disable(self):
        super(ClockSplitter, self).disable()
        if self._clock_thread:
            self._clock_thread.join()
            self._clock_thread = None
            with self._lock:
                self._clock_image = None

    def _clock_run(self):
        # draw the clock face up front (no sense drawing that every time)
        origin = Coord(0, 0)
        size = Coord(100, 100)
        center = size // 2
        face = Image.new('L', size)
        draw = ImageDraw.Draw(face)
        draw.ellipse([origin, size - 1], outline=(255,))
        while self.enabled:
            # loop round rendering the clock hands on a copy of the face
            img = face.copy()
            draw = ImageDraw.Draw(img)
            now = dt.datetime.now()
            midnight = now.replace(
                hour=0, minute=0, second=0, microsecond=0)
            timestamp = (now - midnight).total_seconds()
            hour_pos = center + Coord.clock_arm(2 * pi * (timestamp % 43200 / 43200)) * 30
            minute_pos = center + Coord.clock_arm(2 * pi * (timestamp % 3600 / 3600)) * 45
            second_pos = center + Coord.clock_arm(2 * pi * (timestamp % 60 / 60)) * 45
            draw.line([center, hour_pos], fill=(200,), width=2)
            draw.line([center, minute_pos], fill=(200,), width=2)
            draw.line([center, second_pos], fill=(200,), width=1)
            # assign the rendered image to the internal variable
            with self._lock:
                self._clock_image = img
            sleep(0.2)

    def _handle_frame(self, port, buf):
        try:
            out1 = self.outputs[0].get_buffer(False)
            out2 = self.outputs[1].get_buffer(False)
        except PiCameraPortDisabled:
            return True
        if out1:
            # copy the input frame to the first output buffer
            out1.copy_from(buf)
            with out1 as data:
                # construct an Image using the Y plane of the output
                # buffer's data and tell PIL we can write to the buffer
                img = Image.frombuffer('L', port.framesize, data, 'raw', 'L', 0, 1)
                img.readonly = False
                with self._lock:
                    if self._clock_image:
                        img.paste(self._clock_image, (10, 10), self._clock_image)
            # if we've got a second output buffer replicate the first
            # buffer into it (note the difference between replicate and
            # copy_from)
            if out2:
                out2.replicate(out1)
            try:
                self.outputs[0].send_buffer(out1)
            except PiCameraPortDisabled:
                return True
        if out2:
            try:
                self.outputs[1].send_buffer(out2)
            except PiCameraPortDisabled:
                return True
        return False


def main(output_filename):
    camera = mo.MMALCamera()
    preview = mo.MMALRenderer()
    encoder = mo.MMALVideoEncoder()
    clock = ClockSplitter()
    target = mo.MMALPythonTarget(output_filename)

    # Configure camera output 0
    camera.outputs[0].framesize = (640, 480)
    camera.outputs[0].framerate = 24
    camera.outputs[0].commit()

    # Configure H.264 encoder
    encoder.outputs[0].format = mmal.MMAL_ENCODING_H264
    encoder.outputs[0].bitrate = 2000000
    encoder.outputs[0].commit()
    p = encoder.outputs[0].params[mmal.MMAL_PARAMETER_PROFILE]
    p.profile[0].profile = mmal.MMAL_VIDEO_PROFILE_H264_HIGH
    p.profile[0].level = mmal.MMAL_VIDEO_LEVEL_H264_41
    encoder.outputs[0].params[mmal.MMAL_PARAMETER_PROFILE] = p
    encoder.outputs[0].params[mmal.MMAL_PARAMETER_VIDEO_ENCODE_INLINE_HEADER] = True
    encoder.outputs[0].params[mmal.MMAL_PARAMETER_INTRAPERIOD] = 30
    encoder.outputs[0].params[mmal.MMAL_PARAMETER_VIDEO_ENCODE_INITIAL_QUANT] = 22
    encoder.outputs[0].params[mmal.MMAL_PARAMETER_VIDEO_ENCODE_MAX_QUANT] = 22
    encoder.outputs[0].params[mmal.MMAL_PARAMETER_VIDEO_ENCODE_MIN_QUANT] = 22

    # Connect everything up and enable everything (no need to enable capture on
    # camera port 0)
    clock.inputs[0].connect(camera.outputs[0])
    preview.inputs[0].connect(clock.outputs[0])
    encoder.inputs[0].connect(clock.outputs[1])
    target.inputs[0].connect(encoder.outputs[0])
    target.connection.enable()
    encoder.connection.enable()
    preview.connection.enable()
    clock.connection.enable()
    target.enable()
    encoder.enable()
    preview.enable()
    clock.enable()
    try:
        sleep(10)
    finally:
        # Disable everything and tear down the pipeline
        target.disable()
        encoder.disable()
        preview.disable()
        clock.disable()
        target.inputs[0].disconnect()
        encoder.inputs[0].disconnect()
        preview.inputs[0].disconnect()
        clock.inputs[0].disconnect()


if __name__ == '__main__':
    main('output.h264')

16.1.13. IO Classes

The Python MMAL components include a couple of useful IO classes: MMALPythonSource and MMALPythonTarget (the latter appeared in the demonstration above). We could have used these instead of messing around with output callbacks in the sections above, but it was worth exploring how those callbacks operate first (in order to comprehend how Python transforms operate).
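
As a hedged sketch, the direct JPEG capture from earlier could have used MMALPythonTarget in place of the hand-rolled image_callback (this glosses over the frame-end synchronization discussed in the threading section):

# Assumes the camera and encoder are connected as in the direct JPEG
# capture section above
jpg_target = mo.MMALPythonTarget('direct.jpg')
jpg_target.inputs[0].connect(encoder.outputs[0])
jpg_target.connection.enable()
jpg_target.enable()
camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = True
camera.outputs[2].params[mmal.MMAL_PARAMETER_CAPTURE] = False
jpg_target.disable()
jpg_target.inputs[0].disconnect()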

16.2. Components

16.3. Ports

16.4. Connections

16.5. Buffers

16.6. Python Extensions

16.7. Debugging

The following functions are useful for quickly dumping the state of a given MMAL pipeline:

Note

It is also worth noting that most classes, in particular MMALVideoPort and MMALBuffer, have informative repr() outputs which can be extremely useful with simple print() calls for debugging.

16.8. Utility Functions

The following functions are provided to ease certain common operations in the picamera library. Users of mmalobj may find them handy in various situations: