 2008/05/17
|
Last update 2000/05/19
The Labs - Design & Functionality For The NetRawVideoStudio: Specifications
Several formats and codecs are already available, yet we might
use our own codecs for special purposes (e.g. low-bitrate recording).
We definitely want to support at least MPEG2 and Quicktime alongside
our own codecs. Here are the RVS format specifications:
- RVS Format Proposal
- Order of Chunks
- Video Header
- Frame Header, Frame Chunk-Header & Frame Chunk
- Audio Header & Chunk
- Considerations
- Video Devices
- RawVideo API
- Codec Matrix
- Controllers
1. RVS Format Proposal
|
Since we primarily target interlacing video & audio, we split both streams
into chunks. To keep the design open, we propose:
- each chunk starts with an ASCII line terminated by "\n":
  - the first word is the chunk-name
  - followed by nothing or arbitrary ASCII content up to the "\n"
- the next line contains the length of the chunk, written in ASCII
  and terminated by "\n"
- the data that follows is binary
Comments start with "#", end with "\n", and should be ignored.
Sample:
   video 15,320,240
   0
   title 2001 Odyssey
   0
   author Stanley Kubrick
   0
   frame jpeg
   4048
   <binary-data-chunk of 4048 bytes follows>
   ...
Any chunk name that is not recognized should be ignored and the chunk skipped.
The following chunk-names are proposed:
| video | first chunk defining fps, width, height |
| title | title of stream |
| author | author information |
| copyright | copyright notice |
| frame | defining codec of frames |
| audio | defining codec of audio |
| framestart | frame start |
| framechunk | chunk of frame |
| audiochunk | chunk of audio |
| nextfile | pointer to the next file (2GB filesize problem) |
|
At first we only wanted to support JPEG (MJPEG) and PPM as picture formats
and PCM for audio, but then decided to leave it open and support
'plug-ins' for other compression formats. To begin with, we will
support just MJPEG & PPM, and PCM for the audio.
We may have to create a unique file-header, i.e. "# RawVideo-version\n" or similar,
so that applications can recognize the file.
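The chunk layout proposed above (a name line, a length line, then binary data) could be read along the following lines. This is only a sketch under assumptions: the names RVChunk, rv_read_line and rv_parse_chunk are hypothetical and not part of the proposal, the whole file is assumed to sit in memory, and comment lines are assumed to appear only between chunks.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory view of one chunk; not part of the proposal. */
typedef struct {
    char name[32];      /* chunk-name, e.g. "video" */
    char args[128];     /* rest of the first line, may be empty */
    long length;        /* number of binary bytes that follow */
    const char *data;   /* points into the caller's buffer */
} RVChunk;

/* Copy one "\n"-terminated line (without the "\n") into out.
   Returns a pointer just past the "\n", or NULL if no complete line is left. */
static const char *rv_read_line(const char *p, const char *end,
                                char *out, size_t outsz)
{
    const char *nl = memchr(p, '\n', (size_t)(end - p));
    size_t n;

    if (nl == NULL) return NULL;
    n = (size_t)(nl - p);
    if (n >= outsz) n = outsz - 1;      /* truncate overlong lines */
    memcpy(out, p, n);
    out[n] = '\0';
    return nl + 1;
}

/* Parse the chunk starting at p. Returns a pointer to the next chunk,
   or NULL at the end of the buffer or on malformed input. */
const char *rv_parse_chunk(const char *p, const char *end, RVChunk *c)
{
    char line[192];
    char *sp, *stop;

    do {                                /* skip "#" comment lines */
        p = rv_read_line(p, end, line, sizeof line);
        if (p == NULL) return NULL;
    } while (line[0] == '#');

    sp = strchr(line, ' ');             /* split name from arguments */
    if (sp != NULL) *sp++ = '\0';
    snprintf(c->name, sizeof c->name, "%s", line);
    snprintf(c->args, sizeof c->args, "%s", sp ? sp : "");

    p = rv_read_line(p, end, line, sizeof line);
    if (p == NULL) return NULL;
    c->length = strtol(line, &stop, 10);
    if (stop == line || c->length < 0 || c->length > end - p) return NULL;

    c->data = p;
    return p + c->length;               /* caller may ignore unknown names */
}
```

A reader would call rv_parse_chunk() in a loop and simply continue with the returned pointer whenever c.name is unknown, which gives the "ignore and skip" behaviour for free.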
2. Order of Chunks
|
| Interlaced: |
| video |
| frame |
| audio |
| framestart |
| framechunk |
| audiochunk |
| framechunk |
| audiochunk |
| framechunk |
| audiochunk |
| framechunk |
| ... |
| framestart |
| frame |
| audio |
| ... |
|
|
| Random: |
| video |
| frame |
| framestart |
| framechunk |
| framechunk |
| framechunk |
| framechunk |
| ... |
| audio |
| audiochunk |
| audiochunk |
| audiochunk |
| audiochunk |
| audiochunk |
| audiochunk |
| ... |
|
|
| Random: |
| video |
| frame |
| framestart |
| framestart |
| framestart |
| framestart |
| framestart |
| ... |
| audio |
| audiochunk |
|
|
There are two ways to "order" the chunks, either interlaced or random.
Interlaced is best for capturing and playing: we capture
frames and audio at the same time and interlace the stream, and the
same order is used when playing back.
Random is best when adding or removing frames or audio, or when
creating video from still pictures.
Since we propose both ways, each frame must be time-coded.
To play randomly ordered video-files, one has to index all frames and audio first;
needless to say, this only makes sense for small video-files.
3. Video Header
|
video fps,width,height,order
| fps | frames per second |
| width | frame width |
| height | frame height |
| order | either interlaced or random |
|
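The argument string of the video chunk could be decoded as sketched below. RVVideoInfo and rv_parse_video_args are hypothetical names, and treating a missing order field as "interlaced" is an assumption made here because the sample stream omits that field; the proposal itself does not define a default.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical decoded form of "video fps,width,height,order". */
typedef struct {
    int fps, width, height;
    int interlaced;             /* 1 = interlaced, 0 = random */
} RVVideoInfo;

/* Parse e.g. "15,320,240" or "25,640,480,random".
   Returns 0 on success, -1 on malformed input. */
int rv_parse_video_args(const char *args, RVVideoInfo *v)
{
    char order[16] = "";
    int n = sscanf(args, "%d,%d,%d,%15s",
                   &v->fps, &v->width, &v->height, order);

    if (n < 3 || v->fps <= 0 || v->width <= 0 || v->height <= 0)
        return -1;

    if (n == 3 || strcmp(order, "interlaced") == 0)
        v->interlaced = 1;      /* assumed default when order is absent */
    else if (strcmp(order, "random") == 0)
        v->interlaced = 0;
    else
        return -1;
    return 0;
}
```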
4. Frame Header, Frame Chunk-Header & Frame Chunk
|
Header:
| frame type | type: jpeg, rtjpeg, ppm, pgm, cmap, rgb555, rgb565, rgb24, yuv444, yuv422, yuv421, yuv420 |
|
Chunk Header:
| framestart | |
| length | length in bytes of the full frame |
| binary-data | binary-data, starting with the first frame-chunk |
|
Chunk:
| framechunk | |
| length | length in bytes of the chunk |
| binary-data | binary-data of the chunk |
|
|
So far the frame-type is based on 'plug-in' encoders & decoders and should remain open.
We will implement jpeg & ppm (rgb24) to begin with.
Codec Overview:
| Format: | Compression: | Comment: |
| jpeg | 20-5:1 | jpeg |
| rtjpeg | 20-10:1 | realtime jpeg codec (yuv 4:2:0) |
| ppm | 1:1 | simple format just RGB (3 bytes, 24bpp) |
| pgm | 3:1 | greyscale 8bpp |
| grey | 3:1 | greyscale 8bpp |
| rgb24 | 1:1 | RGB (3 bytes, 24bpp) |
| rgb555 | 3:2 | two bytes 15bpp (one bit undefined) |
| rgb565 | 3:2 | two bytes fully used 16bpp |
| rgb332 | 3:1 | one byte 8bpp |
| cmap | 3:1 | 2^n colormap (1 upto 8 bpp) arguments: (int size, int cmap[size]) ie. cmap(16,000000,00ff00,...) |
| yuv444 | 1:1 | |
| yuv422 | 3:2 | |
| yuv421 | 4:3 | |
| yuv420 | 2:1 | |
|
The jpeg codec is preferred for long recordings; all others
(except cmap and the yuv formats) are simple to implement.
|
Datarate (uncompressed, 3 bytes per pixel):
| Frames/sec | 160x120 | 320x240 | 640x480 |
| 10fps | 576KB/s | 2.3MB/s | 9MB/s |
| 15fps | 864KB/s | 3.4MB/s | 13.5MB/s |
| 20fps | 1.2MB/s | 4.6MB/s | 18MB/s |
| 25fps | 1.4MB/s | 5.7MB/s | 22.5MB/s |
| 30fps | 1.7MB/s | 6.9MB/s | 27MB/s |
| 50fps | 2.9MB/s | 11.5MB/s | 45MB/s |
| 60fps | 3.5MB/s | 13.8MB/s | 54MB/s |
|
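The data-rate figures follow from width x height x bytes per pixel x frames per second. As a small worked sketch, assuming 3 bytes per pixel (rgb24) and using the hypothetical helper names rv_video_rate and rv_video_size: 320x240 at 10fps gives 2,304,000 bytes/s, the 2.3MB/s listed for that case.

```c
/* Uncompressed video data rate in bytes per second. */
long long rv_video_rate(int width, int height, int bytes_per_pixel, int fps)
{
    return (long long)width * height * bytes_per_pixel * fps;
}

/* Storage needed for a recording of the given length in seconds. */
long long rv_video_size(long long bytes_per_second, long seconds)
{
    return bytes_per_second * seconds;
}
```

For a 20 minute recording at 320x240 @ 30fps, rv_video_size(rv_video_rate(320, 240, 3, 30), 1200) yields 8,294,400,000 bytes, which is the order of magnitude that makes compression attractive.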
|
JPEG Compression:
JPEG compression definitely helps us to record long videos, i.e. 20 minutes
of uncompressed 320x240 @ 30fps gives an 8.4GB file. Using LIBJPEG we
got the following table:
| Format: | Rate: | Compression: | Quality: |
| PPM 320x256 | 255KB/frame | 1:1 | perfect |
| JPEG 75% 320x256 | 17KB/frame | 1:15 | very good |
| JPEG 65% 320x256 | 14KB/frame | 1:18 | very good |
| JPEG 55% 320x256 | 12KB/frame | 1:21 | good |
| JPEG 45% 320x256 | 11KB/frame | 1:23 | good |
| JPEG 35% 320x256 | 9KB/frame | 1:28 | reasonable |
| JPEG 25% 320x256 | 8KB/frame | 1:31 | fairly acceptable |
| JPEG 15% 320x256 | 6KB/frame | 1:42 | blocks, not good |
| JPEG 10% 320x256 | 5KB/frame | 1:51 | blocks, not good |
| JPEG 5% 320x256 | 3KB/frame | 1:85 | not usable |
|
RTJpeg (RealTime JPEG) produces, according to Justin Schoeman's tests,
20:1 compression for 60fps @ 384x288; at 384x288 @ 12.5fps this gives 253KB/s
instead of 4MB/s (a ratio of about 15:1).
If we use the original LIBJPEG we have to check performance (the RTJpeg and LIBJPEG encoders differ).
In general we can say that we surely reach 20:1 compression, or at least 15:1, without
losing too much. This means a 20 minute recording at 320x240 @ 30fps gives a 420MB file at 20:1 (instead of 8.4GB),
which is a very reasonable reduction.
|
Other Compressions:
The other compressions with ratios of 3:2 or 3:1 don't gain us much,
except that the conversion is done quite fast, unlike JPEG compression.
For YUV compression we have to do a cumbersome color-transformation
for both recording and playing, so good coding is required! A colormap
isn't really usable for video-recording, but may be usable for
frame-based computer-generated videos (animated GIFs, for example).
|
RGB to YUV Conversion:
   Y  =  (0.257 * R) + (0.504 * G) + (0.098 * B) + 16
   Cr =  (0.439 * R) - (0.368 * G) - (0.071 * B) + 128
   Cb = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128

YUV to RGB Conversion:
   B = 1.164(Y - 16) + 2.018(Cb - 128)
   G = 1.164(Y - 16) - 0.813(Cr - 128) - 0.391(Cb - 128)
   R = 1.164(Y - 16) + 1.596(Cr - 128)
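The two sets of conversion formulas translate almost literally into code. The sketch below works per pixel in floating point for clarity; the function names are hypothetical, and rounding to the nearest integer with clamping to 0..255 is an assumption, since the formulas do not say how out-of-range results are handled. A real-time version would more likely use integer arithmetic or lookup tables.

```c
/* Round a floating-point sample and clamp it to the range 0..255. */
static unsigned char rv_clamp8(double v)
{
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (unsigned char)(v + 0.5);
}

/* RGB -> YCbCr for one pixel, using the coefficients given above. */
void rv_rgb_to_yuv(unsigned char r, unsigned char g, unsigned char b,
                   unsigned char *y, unsigned char *cb, unsigned char *cr)
{
    *y  = rv_clamp8( 0.257 * r + 0.504 * g + 0.098 * b +  16.0);
    *cr = rv_clamp8( 0.439 * r - 0.368 * g - 0.071 * b + 128.0);
    *cb = rv_clamp8(-0.148 * r - 0.291 * g + 0.439 * b + 128.0);
}

/* YCbCr -> RGB for one pixel, using the coefficients given above. */
void rv_yuv_to_rgb(unsigned char y, unsigned char cb, unsigned char cr,
                   unsigned char *r, unsigned char *g, unsigned char *b)
{
    double l = 1.164 * (y - 16);        /* scaled luma, shared by all three */

    *b = rv_clamp8(l + 2.018 * (cb - 128));
    *g = rv_clamp8(l - 0.813 * (cr - 128) - 0.391 * (cb - 128));
    *r = rv_clamp8(l + 1.596 * (cr - 128));
}
```

With these coefficients white (255,255,255) maps to Y=235, Cb=Cr=128 and black to Y=16, Cb=Cr=128, so Y does not use the full 0..255 range.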
|
5. Audio Header & Chunk
|
Header:
| audio type | type: pcm(speed,bits,channels), adpcm(bits), mp3(bitrate) |
|
Chunk:
| audiochunk | |
| length | length in bytes |
| binary-data | binary-data |
|
|
The audio-type is open and will be 'plug-in' based. pcm (Pulse Code Modulation) is surely the simplest but the most memory-intensive,
and mp3 (MPEG Layer 3) will be implemented if we get hold of a simple library.
Datarate:
| Format / Rate | 8bit Mono | 8bit Stereo / 16bit Mono | 16bit Stereo |
| 8000Hz | 8KB/s | 16KB/s | 32KB/s |
| 11025Hz | 11KB/s | 22KB/s | 44KB/s |
| 22050Hz | 22KB/s | 44KB/s | 88KB/s |
| 44100Hz | 44KB/s | 88KB/s | 176KB/s |
|
|
Audio Codecs with Arguments:
| Type: | Arguments: | Example: | Description: | Usage: |
| pcm | int speed, int bits, int channels | pcm(22050,8,2) | 22.050kHz, 8 bits, Stereo | 16bit @ 22kHz or 44.1kHz lossless CD-quality, no compression at all (1:1) |
| adpcm | int bits | adpcm(4) | if bits=4, then compression 1/4 | almost lossless compression (4:1) |
| mp3 | int bitrate | mp3(64) | 64kb/s (8KB/s) | 128kb/s near CD-quality, good compression (10:1) |
| lvocoder | int bands, int start, int end, int last | lvocoder(12,50,5000,100) | linear vocoder, 12 bands, (50Hz - 5kHz), 100 msec packets | voice only, very high compression (100:1 or more) |
|
To begin with, we will likely implement only pcm.
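A codec string with arguments, such as pcm(22050,8,2), could be split into a name and integer arguments as sketched here. RVCodecSpec, rv_parse_codec, rv_pcm_rate and the limit of 8 arguments are hypothetical; only decimal integers are handled, so the hexadecimal cmap arguments would need extra work.

```c
#include <stdlib.h>
#include <string.h>

#define RV_MAX_ARGS 8           /* hypothetical upper limit */

typedef struct {
    char name[16];              /* e.g. "pcm" */
    int  argc;                  /* number of arguments found */
    long argv[RV_MAX_ARGS];     /* decimal integer arguments */
} RVCodecSpec;

/* Parse "name" or "name(a,b,...)". Returns 0 on success, -1 if malformed. */
int rv_parse_codec(const char *s, RVCodecSpec *spec)
{
    const char *paren = strchr(s, '(');
    size_t n = paren ? (size_t)(paren - s) : strlen(s);
    char *stop;

    if (n == 0 || n >= sizeof spec->name) return -1;
    memcpy(spec->name, s, n);
    spec->name[n] = '\0';
    spec->argc = 0;
    if (paren == NULL) return 0;        /* no argument list, e.g. "jpeg" */

    s = paren + 1;
    for (;;) {
        if (spec->argc == RV_MAX_ARGS) return -1;
        spec->argv[spec->argc] = strtol(s, &stop, 10);
        if (stop == s) return -1;       /* no digits where a number belongs */
        spec->argc++;
        if (*stop == ')') return stop[1] == '\0' ? 0 : -1;
        if (*stop != ',') return -1;
        s = stop + 1;
    }
}

/* Data rate in bytes per second for pcm(speed,bits,channels), else -1. */
long rv_pcm_rate(const RVCodecSpec *spec)
{
    if (strcmp(spec->name, "pcm") != 0 || spec->argc != 3) return -1;
    return spec->argv[0] * spec->argv[1] / 8 * spec->argv[2];
}
```

For pcm(22050,8,2) the rate comes out as 44100 bytes/s, matching the 44KB/s listed for 8bit stereo at 22050Hz.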
|
6. Considerations
|
The ratio between frame and audio data is about 100:1 (i.e. 320x240 @ 20fps = 4.6MB/s vs 16bit-Mono @ 22kHz = 44KB/s, which is 104:1).
For that reason the frames need to be split further: each frame is split into chunks of e.g. 4KB.
Recording:
We have to check that disk-writing is fast enough to save all
frames; for audio it should be sufficient. Good timing-programming is required.
The 2GB-Filesize-Problem: allow sequential writing ("filepointer" to next file) and parallel writing (chunk1 -> filea, chunk2 -> fileb, chunk3 -> filea etc).
|
One of the reasons to define codec and arguments in ASCII is to avoid little/big endian problems.
All frame & audio code is, or should be, little/big endian independent.
7. Video Devices
|
At the moment drivers for Bt848, QuickCam and a few other cards are
under development. Check Video4Linux Resources
for updates.
8. RawVideo API
|
| Function: | Comment: |
| RVVideoHeader *rvreadfile(char *fname); | open video-stream |
| rvclosefile(RVVideoHeader *rvh); | close video-stream |
| rvframeadd(RVVideoHeader *rvh, char *type, int len, void *data); | add frame, type could be ie. "ppm"; all frames must be the same type |
| rvaudioadd(RVVideoHeader *rvh, char *type, int len, void *data); | add audio, type could be ie. "pcm(22050,8,2)"; all audio must be the same type |
| RVVideoHeader *rvrecord(char *fname, char *dev, int (*stop_func)(), int time, int fps, int w, int h, int bits, int speed, int channels); | filename and device name (/dev/video0); stop-button callback, or time in msec; frame info; sound info |
| RVFrame *rvframeread(char *fname) | read single frame; ppm, jpeg and gif will be supported |
| rvframefree(RVFrame *f) | free a frame |
| RVAudio *rvfetchaudio(RVVideoHeader *rvh) | playing: get audio-chunk |
| RVFrame *rvfetchframe(RVVideoHeader *rvh, int type) | playing: get video-frame, type is preferred video-type |
| type2itype(char *type) | converts string like "ppm" to video-type RV_VIDEO_PPM (used in rvlib.h) |
|
Conversion of different video-types: we are thinking of targeting rgb24 as the
internal standard format for MIT-XSHM playback; for that reason rvlib.h provides anytorgb24(type, src, dest),
where type is RV_VIDEO_*, and src and dest are void* or unsigned char* pointers which are incremented automatically.
For RV_VIDEO_JPEG we will add an appropriate function-call. For now anytorgb24() is a macro for the sake of speed (to avoid a function-call).
There is also rgb24toany(), which doesn't support many types yet.
PLEASE NOTE: this API is not final at all, it's a proposal and
subject to change at any time. Once the package is programmed we will
document the API fully.
9. Codec Matrix
|
We should be able to read and write any format we like; for
that purpose each supported codec must provide xxxtorgb24, rgb24toxxx or
more:
RGBxxx:
The following conversions will be provided:
rgb32torgb24(),
rgb555torgb24(),
rgb565torgb24(),
rgb24torgb32(),
rgb24torgb555(),
rgb24torgb565().
|
YUVxxx:
The following conversions will be provided:
yuv444torgb24(),
yuv422torgb24(),
yuv420torgb24(),
rgb24toyuv444(),
rgb24toyuv422(),
rgb24toyuv420().
|
RTJPEG:
Assuming we have a video-device providing YUV420: since the RTJPEG-codec
(for now) only supports YUV420, we do not convert into RGB24 before
compressing, but pass the entire frame to the compressor:
|
YUV420 -> RTJPEG-COMPRESSOR -> RTJPEG
|
In case we have another device which just has RGB24, then we use:
|
RGB24 -> YUV420 -> RTJPEG-COMPRESSOR -> RTJPEG
|
To play back a RTJPEG frame we use this pipe:
|
RTJPEG -> RTJPEG-UNCOMPRESSOR -> YUV420 -> RGB24
|
Supported conversion:
yuv420tortjpeg(),
rtjpegtoyuv420().
|
Based on this we get a codec-matrix:
| in-codec / out-codec | rgb24 | rgb32 | yuv444 | yuv422 | yuv420 | rtjpeg | jpeg | mjpeg |
| rgb24 | X | X | X | X | X | | X | X |
| rgb32 | X | X | | | | | | |
| yuv444 | X | | X | | | | | |
| yuv422 | X | | | X | | | | |
| yuv420 | X | | | | X | X | | |
| rtjpeg | | | | | X | X | | |
| jpeg | X | | | | | | X | |
| mjpeg | X | | | | | | | X |
|
This is just theoretical, and includes the current RTJPEG limitation of
not producing rgb24 directly in either direction. A small algorithm
should determine how to get from "any" input frame-codec to
"any" output frame-codec. In addition, each "X" (supported) should
carry a weight for the conversion cost, 1 for a fast conversion, 2 for one taking twice as long,
etc.; the algorithm then "walks" through the matrix looking for the
fastest conversion and constructs the conversion-pipe from it.
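One way to "walk" the matrix is a shortest-path search over it, for example Dijkstra's algorithm. The sketch below is an illustration under assumptions, not the planned implementation: the enum values, rv_cost and rv_find_pipe are hypothetical names, every supported conversion is given the placeholder weight 1, and a weight is treated as a cost where smaller means faster.

```c
#include <limits.h>

enum { RGB24, RGB32, YUV444, YUV422, YUV420, RTJPEG, JPEG, MJPEG, NCODEC };

/* rv_cost[in][out]: 0 = not supported, otherwise the conversion weight.
   All weights are 1 here; real weights would have to be measured. */
static const int rv_cost[NCODEC][NCODEC] = {
    /* rgb24  */ {1, 1, 1, 1, 1, 0, 1, 1},
    /* rgb32  */ {1, 1, 0, 0, 0, 0, 0, 0},
    /* yuv444 */ {1, 0, 1, 0, 0, 0, 0, 0},
    /* yuv422 */ {1, 0, 0, 1, 0, 0, 0, 0},
    /* yuv420 */ {1, 0, 0, 0, 1, 1, 0, 0},
    /* rtjpeg */ {0, 0, 0, 0, 1, 1, 0, 0},
    /* jpeg   */ {1, 0, 0, 0, 0, 0, 1, 0},
    /* mjpeg  */ {1, 0, 0, 0, 0, 0, 0, 1},
};

/* Find the cheapest conversion-pipe from codec "from" to codec "to".
   Fills path[] with the codecs visited, including both ends, and returns
   their count; returns 0 if no pipe exists. */
int rv_find_pipe(int from, int to, int path[NCODEC])
{
    int dist[NCODEC], prev[NCODEC], done[NCODEC] = {0};
    int i, n, u, len;

    for (i = 0; i < NCODEC; i++) { dist[i] = INT_MAX; prev[i] = -1; }
    dist[from] = 0;

    for (n = 0; n < NCODEC; n++) {
        u = -1;                 /* pick the closest unfinished codec */
        for (i = 0; i < NCODEC; i++)
            if (!done[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0) break;       /* everything reachable is finished */
        done[u] = 1;

        for (i = 0; i < NCODEC; i++)    /* relax all conversions out of u */
            if (i != u && rv_cost[u][i] > 0 &&
                dist[u] + rv_cost[u][i] < dist[i]) {
                dist[i] = dist[u] + rv_cost[u][i];
                prev[i] = u;
            }
    }
    if (dist[to] == INT_MAX) return 0;

    len = 0;                    /* count the steps, then fill from the back */
    for (i = to; i != -1; i = prev[i]) len++;
    n = len;
    for (i = to; i != -1; i = prev[i]) path[--n] = i;
    return len;
}
```

Called as rv_find_pipe(RGB24, RTJPEG, path), this returns 3 with the path RGB24, YUV420, RTJPEG, which is the same pipe as described for a device that only delivers RGB24.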
10. Controllers
|
Additionally, controllers (i.e. brightness, contrast, saturation, cropping) should
be considered "format-converters" too: they just don't convert but apply additional
effects and adjustments to the frame.
|
RTJPEG -> RTJPEG-UNCOMPRESSOR -> YUV420 -> BRIGHTNESS -> RGB24
|
List of "controllers":
- brightness (best using with YUV)
- saturation (best using with YUV)
- contrast (best using with YUV)
- cropping (cut-out or filled with color)
- border
- rescale
- etc.
More sophisticated effects should be avoided, because the
requirement for every conversion is: real-time, and we speak of
50fps here :-)
Since RGB24 is our preferred internal format, we will implement all controllers
for the RGB24 format, and for YUV444 too (which should not need many changes from RGB24).
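As an illustration of how cheap such a controller can be, here is a sketch of a brightness controller working in place on an RGB24 buffer. The name rv_brightness_rgb24 and the additive definition of brightness are assumptions made for this example; a 256-entry lookup table is used so that the per-byte work in the loop is a single table access.

```c
#include <stddef.h>

/* Add delta (may be negative) to every R, G and B byte of an RGB24 frame,
   clamping the result to 0..255. npixels is width * height. */
void rv_brightness_rgb24(unsigned char *buf, size_t npixels, int delta)
{
    unsigned char lut[256];
    size_t i, n = npixels * 3;
    int v;

    for (v = 0; v < 256; v++) {         /* build the table once per call */
        int t = v + delta;
        lut[v] = (unsigned char)(t < 0 ? 0 : t > 255 ? 255 : t);
    }
    for (i = 0; i < n; i++)             /* one lookup per byte */
        buf[i] = lut[buf[i]];
}
```

The same table-driven approach would apply to the Y plane of a YUV frame, where only a third of the bytes would need to be touched.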

All Rights Reserved - (C) 1997 - 2008 by The Labs.Com |