🗄️Gyroflow protobuf
The Gyroflow protobuf format is designed to hold even the most advanced and detailed data about the video capture pipeline that is useful for post-stabilization.
Download the .proto definition
You can download the latest protobuf definition from telemetry-parser's git repository.
Supported features
Camera and lens metadata: Brand, model, focal length, f-number, focus distance etc.
Lens distortion model and coefficients
Frame readout time for rolling shutter correction
Frame capture metadata - ISO, shutter speed, white balance, etc.
Raw IMU samples - gyroscope, accelerometer, magnetometer readings
Quaternions - final camera orientation after sensor fusion
Lens OIS data - detailed information about lens OIS movements, so we can support stabilization when lens OIS was enabled
IBIS data - detailed information about the in-body image stabilization, so we can support stabilization when IBIS was enabled
EIS data - if the camera contains any form of electronic stabilization, the protobuf can contain what exactly it did to the image so we can account for it.
Technical details
This page provides detailed documentation for integrating the Gyroflow Protobuf format natively into camera firmware. By embedding this standardized telemetry directly into video files, cameras can achieve pixel-perfect software stabilization, distortion correction, and rolling shutter compensation in Gyroflow and supported NLEs.
1. Binary Protobuf Embedding in ISO Base Media File Format (MP4/MOV)
Transport format
The Gyroflow Protobuf data should be stored as binary data in a separate MP4 track of the video file. This makes it easy to read and write and it's the standard way to embed additional data in video files.
The data should be stored per frame, so that each video frame has a corresponding sample in the metadata MP4 track.
Track Setup:
Create a dedicated track with hdlr (Handler Reference Box) set to 'meta' (Metadata).
The stsd (Sample Description Box) should indicate a binary gyroflow format 'gyrf'.
Maintain a 1:1 Sample-to-Frame relationship. For every encoded video frame in the video track, there must be exactly one corresponding sample in the metadata track.
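As a rough illustration of the first step, the sketch below serializes a Handler Reference Box with the 'meta' handler type, following the generic ISO Base Media File Format layout for a version 0 FullBox. The function name `build_hdlr_box` and the track name string are made up for this example; a real muxer would also have to write the surrounding trak/mdia boxes and the 'gyrf' sample entry.

```python
import struct


def build_hdlr_box(handler_type: bytes = b"meta",
                   name: str = "Gyroflow metadata") -> bytes:
    """Serialize an ISO BMFF 'hdlr' box (FullBox, version 0, flags 0).

    The track name is a hypothetical placeholder, not a value required
    by the Gyroflow documentation.
    """
    if len(handler_type) != 4:
        raise ValueError("handler_type must be a four-character code")
    payload = (
        struct.pack(">I", 0)                 # version (1 byte) + flags (3 bytes)
        + struct.pack(">I", 0)               # pre_defined
        + handler_type                       # handler type, here b"meta"
        + bytes(12)                          # reserved (3 x uint32)
        + name.encode("utf-8") + b"\x00"     # null-terminated name
    )
    # Box size covers the 8-byte box header plus the payload.
    return struct.pack(">I", 8 + len(payload)) + b"hdlr" + payload


box = build_hdlr_box()
print(len(box), box[4:8], box[16:20])
```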
Sample Structure
First Sample: The very first metadata sample must contain the Main message initialized with the Header block (camera metadata, clip metadata) AND the first FrameMetadata.
Subsequent Samples: All subsequent samples should contain the Main message but omit the Header block, including only the FrameMetadata for that specific frame.
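The sample layout described above can be sketched as follows. The dataclasses are simplified stand-ins for the classes that would be generated from the .proto definition, and the field names `header` and `frame` on `Main` are assumptions made for this example; check the actual definition before relying on them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Header:              # stand-in for the real Header message
    camera_model: str


@dataclass
class FrameMetadata:       # stand-in for the real FrameMetadata message
    start_timestamp_us: int
    end_timestamp_us: int


@dataclass
class Main:                # stand-in for the real Main message
    frame: FrameMetadata
    header: Optional[Header] = None


def build_samples(header: Header, frames: List[FrameMetadata]) -> List[Main]:
    """Return one Main per video frame; only the first carries the Header."""
    return [
        Main(frame=frame, header=header if index == 0 else None)
        for index, frame in enumerate(frames)
    ]


frames = [FrameMetadata(0, 8000), FrameMetadata(33333, 41333), FrameMetadata(66666, 74666)]
samples = build_samples(Header(camera_model="ExampleCam"), frames)
```

Each element of `samples` would then be serialized and written as exactly one sample of the metadata track, keeping the 1:1 relationship with the video frames.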
MP4 Track Diagram
2. General overview of the included fields
The Gyroflow Protobuf schema is designed to encapsulate all the necessary metadata required for advanced video stabilization and lens correction.
The Header (Static Metadata)
The Header message acts as the foundational context for the entire video clip. It contains information that generally remains constant throughout the recording and is divided into two sub-messages:
CameraMetadata: This defines the physical hardware used to capture the footage.
Identification: Fields like camera_brand, camera_model, lens_brand, and lens_model help identify the exact gear setup.
Sensor & Optics: Fields like sensor_pixel_width, sensor_pixel_height, and pixel_pitch_nm define the physical characteristics of the sensor. The lens_profile field can be used to embed the lens profile, or the lens distortion metadata can be stored directly using fields in the LensData.
Orientation: imu_orientation and the optional imu_rotation / quats_rotation quaternions allow you to define base offsets for the sensor data, ensuring the software interprets the XYZ axes correctly.
ClipMetadata: This defines the specific parameters of the video file itself.
Dimensions & Timing: Standard video properties like frame_width, frame_height, and duration_us.
Framerates: It separates record_frame_rate, sensor_frame_rate, and file_frame_rate to accurately handle variable frame rate (VFR) and slow-motion recording scenarios.
FrameMetadata (Dynamic Data)
The FrameMetadata message contains the telemetry and camera settings that change continuously during the recording. Because multiple IMU samples usually occur within the span of a single video frame, this message is built to handle arrays of data.
Timing & Synchronization: start_timestamp_us and end_timestamp_us strictly define the capture window of the frame using the camera's internal clock.
It's crucial to include accurate timestamps for both the sensor readouts and IMU data samples. All the timestamps should come from the same internal monotonic camera clock. The clock doesn't have to be synchronized with wall time.
Per-Frame Camera Settings: Exposure and image properties can fluctuate, especially in auto-exposure modes. Fields like iso, exposure_time_us, and white_balance_kelvin track these shifts.
Dynamic cropping (crop_x, crop_width, digital_zoom_ratio) allows the metadata to reflect digital zooms or sensor punches that happen mid-recording.
LensData: This field captures dynamic optical changes, such as shifts in focal_length_mm or focus_distance_mm.
It also embeds mathematical distortion models and their corresponding distortion_coefficients and camera_intrinsic_matrix to accurately map lens warping at that specific moment in time.
Motion Data:
IMUData: This is the core raw telemetry. It contains arrays of gyroscope (rotation in degrees/sec) and accelerometer (acceleration in m/s²) readings, alongside precise sample_timestamp_us markers for each sample.
QuaternionData: If the camera performs its own sensor fusion, this field provides the calculated orientation (W, X, Y, Z components) mapped to a timestamp.
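Since the IMU fields expect degrees/sec and m/s², firmware whose IMU driver reports other units needs a conversion step before filling them. The sketch below assumes, purely for illustration, a driver that reports gyroscope values in rad/s and accelerometer values in g; the function name `to_protobuf_units` is hypothetical.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2 per 1 g


def to_protobuf_units(gyro_rad_s, accel_g):
    """Convert one IMU reading to the units the text above describes.

    gyro_rad_s: (x, y, z) angular rate in rad/s (assumed driver output)
    accel_g:    (x, y, z) acceleration in g      (assumed driver output)
    Returns (gyro in degrees/sec, accel in m/s^2).
    """
    gyro_deg_s = tuple(math.degrees(value) for value in gyro_rad_s)
    accel_m_s2 = tuple(value * STANDARD_GRAVITY for value in accel_g)
    return gyro_deg_s, accel_m_s2


gyro, accel = to_protobuf_units((math.pi, 0.0, 0.0), (0.0, 0.0, 1.0))
```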
3. Frame Readout Time and Rolling Shutter Correction
Rolling shutter occurs because CMOS sensors read pixels row-by-row rather than instantly. To correct this, Gyroflow maps every single pixel row to an exact IMU timestamp.
Timestamps & VSync
Gyroflow requires timestamps to be strictly linked to the internal camera clock, down to the microsecond (_us).
start_timestamp_us: The exact moment the first row of the crop area is exposed/read.
end_timestamp_us: The exact moment the last row of the crop area is exposed/read.
The frame_readout_time_us in ClipMetadata must represent the readout time of the captured crop (not the whole physical sensor, unless reading the whole sensor).
Readout & IMU Interpolation Diagram
Gyroflow does not use IMU data outside of the captured pixels. The IMUData array provided in FrameMetadata can include all the IMU samples (even those outside of the capture window), but you have to make sure the timestamps of the IMU samples and the sensor readouts are in sync, so Gyroflow can skip the samples it does not need.
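To make the row-to-timestamp relationship concrete, here is a minimal sketch that assumes rows are read out at a constant rate, so the time of each row is a linear interpolation between start_timestamp_us and end_timestamp_us. This is an illustration of the idea, not Gyroflow's actual implementation; the function names and the `crop_height` parameter are hypothetical.

```python
def row_timestamp_us(row, crop_height, start_timestamp_us, end_timestamp_us):
    """Estimate when a given row of the crop was read, assuming a constant row rate."""
    if crop_height < 2:
        return float(start_timestamp_us)
    fraction = row / (crop_height - 1)   # 0.0 for the first row, 1.0 for the last
    return start_timestamp_us + fraction * (end_timestamp_us - start_timestamp_us)


def samples_in_window(sample_timestamps_us, start_timestamp_us, end_timestamp_us):
    """Keep only the IMU sample timestamps that fall inside the capture window."""
    return [
        t for t in sample_timestamps_us
        if start_timestamp_us <= t <= end_timestamp_us
    ]


first = row_timestamp_us(0, 1080, 1000, 9000)
last = row_timestamp_us(1079, 1080, 1000, 9000)
kept = samples_in_window([500, 1000, 5000, 9500], 1000, 9000)
```

Both functions only work if the frame timestamps and the IMU timestamps come from the same clock, which is why the shared monotonic clock matters.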
4. Lens Data and Distortion Models
To accurately stabilize a video, Gyroflow needs to undistort the image before applying the rotation. The LensData message defines camera intrinsics and lens distortion parameters.
Standard Models
OpenCV Fisheye model: Default model used in Gyroflow. It also works for non-fisheye lenses. Uses 4 coefficients (p1, p2, p3, p4).
OpenCV Standard model: Classic polynomial radial/tangential models (k1, k2, p1, p2, k3, k4, k5, k6).
Poly3 / Poly5 / PTLens: Lensfun models.
Generic Polynomial
For manufacturers using complex custom glass mapping, the GenericPolynomial model calculates physical distortion offsets.
Math implementation:
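Let $(X, Y)$ be normalized image coordinates.
Radius: $r = \sqrt{X^2 + Y^2}$
Angle: $\theta = \arctan(r)$
Distortion calculation using polynomial coefficients ($k_0$ to $k_5$):
$\theta_d = \theta \cdot k_0 + \theta^2 \cdot k_1 + \theta^3 \cdot k_2 + \theta^4 \cdot k_3 + \theta^5 \cdot k_4 + \theta^6 \cdot k_5$
Scaling factor: $\text{scale} = \theta_d / r$ (if $r = 0$, scale is 1.0)
Apply post-scale parameters ($k_6$, $k_7$):
$X_{\text{distorted}} = X \cdot \text{scale} \cdot k_6$
$Y_{\text{distorted}} = Y \cdot \text{scale} \cdot k_7$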
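The steps above translate fairly directly into code. The following is a minimal sketch of the forward mapping for a single point, written from the formulas in this section rather than taken from Gyroflow's source; the function name and the flat 8-element coefficient list are assumptions of this example.

```python
import math


def generic_polynomial_distort(x, y, k):
    """Apply the GenericPolynomial model to normalized coordinates (x, y).

    k is a sequence of 8 coefficients: k[0]..k[5] are the polynomial terms,
    k[6] and k[7] are the post-scale factors for X and Y.
    """
    if len(k) != 8:
        raise ValueError("expected 8 coefficients (k0..k7)")
    r = math.hypot(x, y)                      # radius
    theta = math.atan(r)                      # angle
    # theta_d = theta*k0 + theta^2*k1 + ... + theta^6*k5
    theta_d = sum(k[i] * theta ** (i + 1) for i in range(6))
    scale = theta_d / r if r != 0.0 else 1.0  # avoid dividing by zero at the center
    return x * scale * k[6], y * scale * k[7]


identity_like = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0]
xd, yd = generic_polynomial_distort(1.0, 0.0, identity_like)
```

With only k0 = 1 and both post-scale factors set to 1, the point (1, 0) maps to (π/4, 0), because θ = arctan(1) = π/4 and the scale equals θ / r.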
Intrinsic Matrix:
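The protobuf requires a row-major 3x3 intrinsic matrix. It is usually defined in the conventional pinhole form, with the focal lengths $f_x$, $f_y$ on the diagonal and the principal point $(c_x, c_y)$ in the last column: $K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$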
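Row-major means the nine values are written one row after another. The sketch below assumes the conventional pinhole camera matrix (focal lengths fx, fy and principal point cx, cy, as used by OpenCV); whether the values are expected in pixels and at which resolution should be checked against the .proto definition. The function name is hypothetical.

```python
def intrinsic_matrix_row_major(fx, fy, cx, cy):
    """Flatten a conventional 3x3 pinhole intrinsic matrix in row-major order."""
    matrix = [
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ]
    # Row-major: all of row 0, then row 1, then row 2.
    return [value for row in matrix for value in row]


flat = intrinsic_matrix_row_major(1000.0, 1000.0, 960.0, 540.0)
```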
5. Lens OIS Data (Optical Image Stabilization)
Optical stabilization physically shifts a floating lens element to counteract camera shake. Because the glass moves independently of the camera body, the IMU (which is rigidly attached to the body) records rotation that the sensor didn't actually see.
By supplying LensOISData (the X/Y shift of the optical element in nanometers at specific timestamps), Gyroflow mathematically calculates the exact optical deviation angle. It then subtracts the OIS movement from the IMU quaternion data to establish the absolute trajectory of the optical path before applying its own digital stabilization.
The sampling rate of OIS data can be much lower than that of the IMU data: the lens element is a physical part and moves slowly. Gyroflow will interpolate any gaps. Typically 3-10 samples of OIS data per frame should be enough.
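To show why a handful of samples per frame can be enough, here is a minimal sketch of filling the gaps between sparse shift samples. It assumes plain linear interpolation and clamps to the nearest sample outside the recorded range; the interpolation Gyroflow actually uses may differ. The tuple layout and function name are made up for this example.

```python
from bisect import bisect_right


def interpolate_shift(samples, timestamp_us):
    """Linearly interpolate an (x_nm, y_nm) shift at timestamp_us.

    samples: list of (timestamp_us, x_nm, y_nm) tuples sorted by timestamp.
    Timestamps outside the sampled range are clamped to the nearest sample.
    """
    times = [sample[0] for sample in samples]
    i = bisect_right(times, timestamp_us)
    if i == 0:
        return samples[0][1:]
    if i == len(samples):
        return samples[-1][1:]
    t0, x0, y0 = samples[i - 1]
    t1, x1, y1 = samples[i]
    f = (timestamp_us - t0) / (t1 - t0)
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))


ois_samples = [(0, 0.0, 0.0), (1000, 100.0, -200.0)]
midpoint = interpolate_shift(ois_samples, 500)
```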
6. IBIS Data (In-Body Image Stabilization)
IBIS mechanically shifts and rolls the physical image sensor on its X, Y, and Roll axes.
How it works in the Protobuf:
IBISData requires the exact timestamp, shift_x, shift_y (in nanometers), and roll_angle_degrees.
Since the sensor is physically moving inside the camera body during exposure, Gyroflow needs this data for the same reason it needs OIS: the IMU records the camera body moving, but the pixels being recorded are shifting within the body.
The sampling rate of IBIS data can be much lower than that of the IMU data: the sensor is a physical part and moves slowly. Gyroflow will interpolate any gaps. Typically 3-10 samples of IBIS data per frame should be enough.
IBIS interaction with IMU
If IBIS is active, Gyroflow maps the mechanical sensor position at the exact row_readout_time.
Sensor X/Y Shift: Converted from nanometers to pixel offsets based on pixel_pitch_nm.
Sensor Roll: Added directly to the Z-axis rotation matrix before inverse-projection.
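The nanometer-to-pixel conversion is a division by the pixel pitch. The sketch below shows only that step, in sensor pixels, and ignores any scaling between the sensor crop and the encoded frame; the function name is hypothetical.

```python
def shift_nm_to_pixels(shift_x_nm, shift_y_nm, pixel_pitch_nm):
    """Convert a sensor shift in nanometers to an offset in sensor pixels."""
    if pixel_pitch_nm <= 0:
        raise ValueError("pixel_pitch_nm must be positive")
    return shift_x_nm / pixel_pitch_nm, shift_y_nm / pixel_pitch_nm


# A 4000 nm (4 micrometer) pixel pitch and an 8000 nm shift give a 2-pixel offset.
offset = shift_nm_to_pixels(8000.0, -4000.0, 4000.0)
```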
If IBIS completely countered a bump, Gyroflow relies on the IBISData to know that the frame is already stable at that specific timestamp, preventing Gyroflow from digitally correcting a bump that IBIS already fixed (preventing double-correction artifacts).
7. EIS Data (Electronic Image Stabilization)
When the camera applies internal digital stabilization (EIS), the pixels stored in the final MP4 are vastly different from the raw sensor readout. To stabilize this in post, Gyroflow needs to know exactly how the camera deformed the original sensor crop.
There are three options to choose from:
A. QUATERNION
The camera encodes a quaternion representing the internal rotation the camera applied to the frame.
Gyroflow applies the inverse of this quaternion to the frame to "un-stabilize" the footage back to raw sensor data, and then applies its own smoother quaternion path.
This method is used by GoPro (internal Hypersmooth).
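The "apply the inverse" step can be illustrated with basic quaternion arithmetic. The sketch below uses (w, x, y, z) tuples and the Hamilton product; it only demonstrates that composing a rotation with its inverse gives back the identity, and is not Gyroflow's implementation. All function names are hypothetical.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_inverse(q):
    """Inverse of a quaternion: conjugate divided by the squared norm."""
    w, x, y, z = q
    norm_sq = w * w + x * x + y * y + z * z
    if norm_sq == 0.0:
        raise ValueError("cannot invert a zero quaternion")
    return (w / norm_sq, -x / norm_sq, -y / norm_sq, -z / norm_sq)


# Example: the camera's EIS rotated the frame by 90 degrees around Z.
half = math.radians(90.0) / 2.0
eis_rotation = (math.cos(half), 0.0, 0.0, math.sin(half))
undone = quat_multiply(eis_rotation, quat_inverse(eis_rotation))
```

Here `undone` is the identity rotation (1, 0, 0, 0) up to floating-point error, which is the sense in which the inverse "un-stabilizes" the frame.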
B. MESH_WARP
The camera divides the video frame into a grid (grid_width x grid_height). The values array contains a list of floats representing how each grid intersection was displaced (X, Y) relative to the original sensor read.
This makes it possible to encode arbitrary internal frame distortions, including internal rolling shutter correction, lens distortion correction, and crop movements natively done by the camera. Gyroflow reads the displacement mesh, reverses the deformation per-pixel, and maps the original pixels to its own computed stable mesh.
This method is used by Sony (Active mode).
C. MATRIX_4X4
A standard 16-float row-major matrix representing an affine/perspective 3D transform applied by the camera to the frame. Works similarly to QUATERNION but allows the camera to pass internal scaling and translations directly.
8. Other
If your camera has any specific needs, we're open to extending the protobuf to include any additional fields or features.
If you have any questions, feel free to contact us at [email protected] or [email protected]