The 10Duke SDK media processing toolchain can be used for virtually any media processing and file conversion task.
The media processing toolchain defines a framework for operating on media and supports hooking up whatever tools the user prefers. Any command-line tool can be used by the toolchain; commonly used tools include FFmpeg, MEncoder, SoX and ImageMagick convert. The media that the toolchain can process is not restricted to audio-visual media (video, images, audio): technically, any file can be processed. With suitable tools hooked up, it could be used for creating PDF from input documents, extracting plain text from HTML input, transforming XML and so on.
Technically, the media processing toolchain is an engine that maps input files to a series of commands for producing output files. The commands for producing output can be selected by input file MIME type (file format) and metadata. An input file can be mapped to any number of output files. This allows, for instance, creating a web-compatible video and several thumbnail images for each user-uploaded video.
Media processing functionality is accessed via the MediaProcessingProvider interface. MediaProcessingProviderFactory instantiates MediaProcessingProvider objects and allows different MediaProcessingProvider configurations for different use cases. MediaProcessingProvider defines two specific use cases:
MediaProcessingProvider.PROVIDER_NAME_DEFAULT (null): Generic processing from one input to one output. Use this to process the input into the output format specified by the caller.
MediaProcessingProvider.PROVIDER_NAME_USER_GENERATED_CONTENT ("user"): Processing of user-generated content. For instance, in an application that allows users to upload video, this named MediaProcessingProvider would be configured to generate the output video in the format required by the application, thumbnail images and so on.
MediaProcessingProviderFactory allows configuring which concrete MediaProcessingProvider implementation to use. This is set by the media.mediaprocessingprovider.classname configuration parameter, which takes the fully qualified class name of the MediaProcessingProvider implementation. If this configuration parameter is not set, MediaProcessingToolchain is used. Developers should consider setting this parameter to StorageProviderMediaProcessingToolchain to support using StorageProvider in the toolchain. This can be configured by setting the following configuration parameter:
<entry key="media.mediaprocessingprovider.classname">com.tenduke.services.multimedia.StorageProviderMediaProcessingToolchain</entry>
Each MediaProcessingProvider is responsible for supporting different configurations for different named use cases.
MediaProcessingToolchain is the MediaProcessingProvider implementation included in the 10Duke SDK distribution. StorageProviderMediaProcessingToolchain extends MediaProcessingToolchain and adds support for using StorageProvider in the toolchain.
MediaProcessingToolchain and StorageProviderMediaProcessingToolchain parse the media processing configuration, select output commands, execute the commands and notify observers by dispatching MediaProcessingEvent events. Both synchronous and asynchronous execution are supported.
MediaProcessingUtils is a utility class whose methods wrap initializing MediaProcessingJob objects, selecting a MediaProcessingProvider and starting media processing. Here is a simple example of using MediaProcessingUtils:
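// Build file URIs for the input and output videos.
URI inputVideoUrl = UriUtils.constructFileUri("test-resources/test.mpg");
URI outputVideoUrl = UriUtils.constructFileUri("tmp/test.ogv");
// Transcode the input to the output using the default MediaProcessingProvider.
MediaProcessingUtils.processMedia(inputVideoUrl, outputVideoUrl);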
In this example, an .mpg input video is transcoded to an .ogv video using the default MediaProcessingProvider. For this example to actually work, the toolchain must be correctly configured.
Video and audio transcoding supports configuring an arbitrary toolchain for processing a file. The configuration supports toolchain selection by input and output file format (container format) as well as by the video codec and audio codec of the input file. Evaluation of the configuration starts with file format inspection of the input and output files. It then branches by the defined video and audio codec cases and allows a default case definition. The transcoding configuration can be thought of as a matrix with input formats as rows and output formats as columns, as this table illustrates:
| Input / Output | avi | 3gp | flv | mov | mp4 | wmv |
| --- | --- | --- | --- | --- | --- | --- |
| avi | toolchain for avi 2 avi | toolchain for avi 2 3gp | toolchain for avi 2 flv | toolchain for avi 2 mov | toolchain for avi 2 mp4 | toolchain for avi 2 wmv |
| 3gp | toolchain for 3gp 2 avi | toolchain for 3gp 2 3gp | toolchain for 3gp 2 flv | toolchain for 3gp 2 mov | toolchain for 3gp 2 mp4 | toolchain for 3gp 2 wmv |
| flv | toolchain for flv 2 avi | | ... | | | |
| mov | toolchain for mov 2 avi | | | ... | | |
| mp4 | toolchain for mp4 2 avi | | | | ... | |
| wmv | | | | | | ... |
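The matrix lookup with codec branching can be sketched in a few lines of Java. This is only an illustration of the idea, not the SDK's implementation: the class ToolchainMatrix, its Cell type and the placeholder toolchain descriptions are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the transcoding matrix; not the 10Duke SDK implementation. */
final class ToolchainMatrix {

    /** One matrix cell: codec-specific cases plus a default toolchain. */
    static final class Cell {
        // Key is "audioCodec/videoCodec"; value is the command sequence for that case.
        final Map<String, List<String>> cases = new HashMap<>();
        // Used when no codec case matches; null means no default is defined.
        List<String> defaultToolchain;
    }

    // Outer key: input container format; inner key: output container format.
    private final Map<String, Map<String, Cell>> cells = new HashMap<>();

    /** Returns the cell for an input/output pair, creating it on first use. */
    Cell cell(String input, String output) {
        return cells.computeIfAbsent(input, k -> new HashMap<>())
                    .computeIfAbsent(output, k -> new Cell());
    }

    /** Selects a toolchain, or returns null when the pair is not configured. */
    List<String> select(String input, String output, String audioCodec, String videoCodec) {
        Map<String, Cell> row = cells.get(input);
        Cell cell = (row == null) ? null : row.get(output);
        if (cell == null) {
            return null;
        }
        List<String> byCodec = cell.cases.get(audioCodec + "/" + videoCodec);
        return (byCodec != null) ? byCodec : cell.defaultToolchain;
    }

    /** Builds a tiny example matrix with a single mp4-to-flv cell. */
    static ToolchainMatrix demo() {
        ToolchainMatrix matrix = new ToolchainMatrix();
        Cell mp4ToFlv = matrix.cell("mp4", "flv");
        mp4ToFlv.cases.put("aac/mpeg4", Arrays.asList("ffmpeg (2M bitrate settings)"));
        mp4ToFlv.defaultToolchain = Arrays.asList("ffmpeg (256k bitrate settings)");
        return matrix;
    }

    public static void main(String[] args) {
        ToolchainMatrix matrix = demo();
        System.out.println(matrix.select("mp4", "flv", "aac", "mpeg4"));  // codec case
        System.out.println(matrix.select("mp4", "flv", "mp3", "h264"));   // default case
        System.out.println(matrix.select("wmv", "avi", "wmav2", "wmv2")); // not configured
    }
}
```

Keeping the codec cases inside each cell means that a cell with no cases degrades to a plain format-to-format mapping.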
Each toolchain in the configuration can be branched according to the video and audio codec of the input file. A toolchain is a sequence of command-line executables with their arguments. The configuration of executables and arguments supports a set of format specifiers for parametrizing input and output file names along with size, resolution, start time, end time and other transcoding variables. Examples of video and audio transcoding configuration are given below.
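As a rough illustration of how format specifiers could be expanded before a command is run, the following Java sketch replaces specifiers such as $INPUT_FILE and $OUTPUT_FILE (names taken from the example configuration in this document) in a list of arguments. The class ArgumentExpander and the rule that a specifier consists of a dollar sign followed by upper-case letters and underscores are assumptions made for this example; the SDK's actual parsing rules may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that expands $NAME specifiers in command-line arguments. */
final class ArgumentExpander {

    // Assumption: a specifier is '$' followed by upper-case letters and underscores.
    private static final Pattern SPECIFIER = Pattern.compile("\\$([A-Z_]+)");

    /** Returns a new list in which every known specifier is replaced by its value. */
    static List<String> expand(List<String> args, Map<String, String> values) {
        List<String> result = new ArrayList<>();
        for (String arg : args) {
            Matcher matcher = SPECIFIER.matcher(arg);
            StringBuffer expanded = new StringBuffer();
            while (matcher.find()) {
                String value = values.get(matcher.group(1));
                // Unknown specifiers are left untouched in this sketch.
                String replacement = (value != null) ? value : matcher.group();
                // quoteReplacement keeps '$' and '\' in the value literal.
                matcher.appendReplacement(expanded, Matcher.quoteReplacement(replacement));
            }
            matcher.appendTail(expanded);
            result.add(expanded.toString());
        }
        return result;
    }
}
```

Because each argument is expanded separately and never joined into a single shell string, file names that contain spaces stay intact as single arguments.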
The deployment task is to install the applications used by the toolchains defined in the transcoding configuration.
A deployed application that uses the transcoding and related processing services of the SDK must be able to access and execute the applications defined in the transcoding configuration. These applications are usually set up with operating system package managers, with installers or by building them from source code.
Applications defined in the transcoding configuration must be found in the PATH of the application using the SDK, or they must be configured with an absolute path. Transcoding is usually CPU, memory and I/O intensive, which must be considered in the overall architecture and deployment design (e.g. when choosing server hardware).
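A deployment can be checked up front with a small lookup like the one below, which resolves a tool name either as an absolute path or against the directories of a PATH-style string. The class ExecutableLocator is a hypothetical helper written for this example and is not part of the SDK. The sketch also does not handle Windows executable extensions such as .exe.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical deployment check: can a configured tool be found and executed? */
final class ExecutableLocator {

    /** Resolves name as an absolute path, or searches the directories in pathValue. */
    static Optional<Path> find(String name, String pathValue) {
        Path direct = Paths.get(name);
        if (direct.isAbsolute()) {
            return isRunnable(direct) ? Optional.of(direct) : Optional.empty();
        }
        if (pathValue == null) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            Path candidate = Paths.get(dir, name);
            if (isRunnable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    private static boolean isRunnable(Path path) {
        return Files.isRegularFile(path) && Files.isExecutable(path);
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        for (String tool : new String[] {"ffmpeg", "mencoder", "convert"}) {
            String location = find(tool, path).map(Path::toString).orElse("not found");
            System.out.println(tool + " -> " + location);
        }
    }
}
```

Running such a check at application start-up turns a missing tool into an immediate, readable error instead of a failure in the middle of a transcoding job.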
Examples of well-known transcoding engines and applications that can be used as part of toolchains are listed in the deployment guide.
Image operations include scaling, rotating, cropping and similar transformations. The SDK provides processing and commands for business entities that describe images. Currently the supported external image processing tool is ImageMagick, which is available on several platforms. The specific application used from the ImageMagick package is the executable named convert. ImageMagick can be installed with package managers on several systems, or by downloading an installer or the source code from http://www.imagemagick.org/
MediaProcessingToolchain and StorageProviderMediaProcessingToolchain read their configuration from the configuration parameter media.conversion.configuration.[mediaProcessingProviderName], where [mediaProcessingProviderName] is replaced with the name given to the provider. This name is the providerName passed to MediaProcessingProviderFactory.instance().getMediaProcessingProvider(String providerName) or to one of the processMedia(...) methods of MediaProcessingUtils. For instance, if media processing is started by calling MediaProcessingProviderFactory.instance().getMediaProcessingProvider(MediaProcessingProvider.PROVIDER_NAME_USER_GENERATED_CONTENT) or MediaProcessingUtils.processMedia(inputUrl, outputUrl, MediaProcessingProvider.PROVIDER_NAME_USER_GENERATED_CONTENT), the configuration is read from the media.conversion.configuration.user configuration parameter. For the default media processing provider (MediaProcessingProvider.PROVIDER_NAME_DEFAULT), the configuration key is just media.conversion.configuration. An example configuration parameter for the user (MediaProcessingProvider.PROVIDER_NAME_USER_GENERATED_CONTENT) media processing provider:
<entry key="media.conversion.configuration.user"><![CDATA[<?value url="mediaProcessingUser.xml" encoding="utf-8" cache="false"?>]]></entry>
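The naming rule for the configuration key can be summarized in a short Java sketch. The class ConversionConfigKey is a hypothetical helper written for this example; only the key names and the provider names (null for the default provider, "user" for user-generated content) come from this document.

```java
/** Hypothetical helper that derives the configuration key for a named provider. */
final class ConversionConfigKey {

    static final String BASE_KEY = "media.conversion.configuration";

    /** The default provider (null name) uses the base key; others append ".name". */
    static String forProvider(String providerName) {
        return (providerName == null) ? BASE_KEY : BASE_KEY + "." + providerName;
    }

    public static void main(String[] args) {
        System.out.println(forProvider(null));   // media.conversion.configuration
        System.out.println(forProvider("user")); // media.conversion.configuration.user
    }
}
```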
MediaProcessingToolchain and StorageProviderMediaProcessingToolchain configuration is given as XML and has the following DTD (Document Type Definition):
<!ELEMENT MediaProcessingToolchainConfiguration (InputFormat+)>
<!ELEMENT InputFormat (OutputFormat+)>
<!ELEMENT OutputFormat (Case*,Exe+)>
<!ELEMENT Case (Exe+)>
<!ELEMENT Exe (Arg+)>
<!ATTLIST InputFormat inputMime CDATA #REQUIRED>
<!ATTLIST InputFormat containerName CDATA #REQUIRED>
<!ATTLIST OutputFormat outputMime CDATA #REQUIRED>
<!ATTLIST OutputFormat containerName CDATA #REQUIRED>
<!ATTLIST Case audio CDATA #REQUIRED>
<!ATTLIST Case video CDATA #REQUIRED>
<!ATTLIST Exe name CDATA #REQUIRED>
<!ATTLIST Arg value CDATA #REQUIRED>
Element and attribute specification:
MediaProcessingToolchainConfiguration is the XML root element for the configuration document. It could be named differently and configuration would still work. The default name has been chosen to clarify what the configuration is for.
InputFormat is the name for elements that declare supported input file formats.
InputFormat/inputMime is the name of an attribute that defines the input file's type using content type (MIME type) notation.
InputFormat/containerName is the name of an attribute that defines the input file's type using a file format name. The value must not include the dot character that separates the format name from the rest of a file name (for example flv, not .flv).
OutputFormat is the name for elements that declare supported output file formats.
OutputFormat/outputMime is the name of an attribute that defines the output file's type using content type (MIME type) notation.
OutputFormat/containerName is the name of an attribute that defines the output file's type using a file format name. The value must not include the dot character that separates the format name from the rest of a file name (for example mp4, not .mp4).
Case is the name of an element that declares a conditional branch by the audio and video codec of the input file.
Case/audio is the name of an attribute that holds the audio codec name for which the case's condition evaluates to true.
Case/video is the name of an attribute that holds the video codec name for which the case's condition evaluates to true.
Exe is the name of an element that encapsulates the definition of one command-line tool and its arguments.
Exe/name is the name of an attribute that holds the path or name to call on the command line to execute the command-line application.
Arg is the name of an element that defines one command-line argument for the executable it is defined in.
Arg/value is the name of an attribute that holds the command-line argument.
This example configuration is suitable for use with the default media processing provider, which converts an input format to a specified output format.
<MediaProcessingToolchainConfiguration>
  <InputFormat inputMime="video/x-flv" containerName="flv">
    <OutputFormat outputMime="video/x-flv" containerName="flv">
      <Exe name="cp">
        <Arg value="$INPUT_FILE" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
      <Exe name="flvtool2">
        <Arg value="-UP" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
    </OutputFormat>
    <OutputFormat outputMime="video/mp4" containerName="mp4">
      <Exe name="/usr/local/bin/ffmpeg">
        <Arg value="-v" />
        <Arg value="-1.0" />
        <Arg value="-y" />
        <Arg value="-i" />
        <Arg value="$INPUT_FILE" />
        <Arg value="-acodec" />
        <Arg value="libfaac" />
        <Arg value="-ab" />
        <Arg value="128k" />
        <Arg value="-ac" />
        <Arg value="2" />
        <Arg value="-vcodec" />
        <Arg value="libx264" />
        <Arg value="-vpre" />
        <Arg value="normal" />
        <Arg value="-b" />
        <Arg value="256k" />
        <Arg value="-bt" />
        <Arg value="256k" />
        <Arg value="-threads" />
        <Arg value="0" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
      <Exe name="/usr/local/bin/MP4Box">
        <Arg value="-inter" />
        <Arg value="0.5" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
    </OutputFormat>
    <OutputFormat outputMime="*" containerName="*">
      <Exe name="/usr/local/bin/ffmpeg">
        <Arg value="-v" />
        <Arg value="-1.0" />
        <Arg value="-y" />
        <Arg value="-i" />
        <Arg value="$INPUT_FILE" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
    </OutputFormat>
  </InputFormat>
  <InputFormat inputMime="video/mp4" containerName="mp4">
    <OutputFormat outputMime="video/x-flv" containerName="flv">
      <Case audio="aac" video="mpeg4">
        <Exe name="/usr/local/bin/ffmpeg">
          <Arg value="-v" />
          <Arg value="-1.0" />
          <Arg value="-y" />
          <Arg value="-i" />
          <Arg value="$INPUT_FILE" />
          <Arg value="-acodec" />
          <Arg value="libfaac" />
          <Arg value="-ar" />
          <Arg value="44100" />
          <Arg value="-ab" />
          <Arg value="192k" />
          <Arg value="-ac" />
          <Arg value="2" />
          <Arg value="-r" />
          <Arg value="30000/1001" />
          <Arg value="-vcodec" />
          <Arg value="libx264" />
          <Arg value="-vpre" />
          <Arg value="normal" />
          <Arg value="-b" />
          <Arg value="2M" />
          <Arg value="-bt" />
          <Arg value="2M" />
          <Arg value="-threads" />
          <Arg value="0" />
          <Arg value="$OUTPUT_FILE" />
        </Exe>
      </Case>
      <Exe name="/usr/local/bin/ffmpeg">
        <Arg value="-v" />
        <Arg value="-1.0" />
        <Arg value="-y" />
        <Arg value="-i" />
        <Arg value="$INPUT_FILE" />
        <Arg value="-acodec" />
        <Arg value="libfaac" />
        <Arg value="-ab" />
        <Arg value="128k" />
        <Arg value="-ac" />
        <Arg value="2" />
        <Arg value="-vcodec" />
        <Arg value="libx264" />
        <Arg value="-vpre" />
        <Arg value="normal" />
        <Arg value="-b" />
        <Arg value="256k" />
        <Arg value="-bt" />
        <Arg value="256k" />
        <Arg value="-threads" />
        <Arg value="0" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
    </OutputFormat>
    <OutputFormat outputMime="video/mp4" containerName="mp4">
      <Case audio="aac" video="mpeg4">
        <Exe name="/usr/local/bin/ffmpeg">
          <Arg value="-v" />
          <Arg value="-1.0" />
          <Arg value="-y" />
          <Arg value="-i" />
          <Arg value="$INPUT_FILE" />
          <Arg value="-acodec" />
          <Arg value="libfaac" />
          <Arg value="-ar" />
          <Arg value="44100" />
          <Arg value="-ab" />
          <Arg value="192k" />
          <Arg value="-ac" />
          <Arg value="2" />
          <Arg value="-r" />
          <Arg value="30000/1001" />
          <Arg value="-vcodec" />
          <Arg value="libx264" />
          <Arg value="-vpre" />
          <Arg value="normal" />
          <Arg value="-b" />
          <Arg value="2M" />
          <Arg value="-bt" />
          <Arg value="2M" />
          <Arg value="-threads" />
          <Arg value="0" />
          <Arg value="$OUTPUT_FILE" />
        </Exe>
        <Exe name="/usr/local/bin/MP4Box">
          <Arg value="-inter" />
          <Arg value="0.5" />
          <Arg value="$OUTPUT_FILE" />
        </Exe>
      </Case>
      <Exe name="/usr/local/bin/ffmpeg">
        <Arg value="-v" />
        <Arg value="-1.0" />
        <Arg value="-y" />
        <Arg value="-i" />
        <Arg value="$INPUT_FILE" />
        <Arg value="-acodec" />
        <Arg value="libfaac" />
        <Arg value="-ab" />
        <Arg value="128k" />
        <Arg value="-ac" />
        <Arg value="2" />
        <Arg value="-vcodec" />
        <Arg value="libx264" />
        <Arg value="-vpre" />
        <Arg value="normal" />
        <Arg value="-b" />
        <Arg value="256k" />
        <Arg value="-bt" />
        <Arg value="256k" />
        <Arg value="-threads" />
        <Arg value="0" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
      <Exe name="/usr/local/bin/MP4Box">
        <Arg value="-inter" />
        <Arg value="0.5" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
    </OutputFormat>
    <OutputFormat outputMime="*" containerName="*">
      <Exe name="mencoder">
        <Arg value="-mc" />
        <Arg value="0" />
        <Arg value="-msglevel" />
        <Arg value="all=-1" />
        <Arg value="$INPUT_FILE" />
        <Arg value="-ss" />
        <Arg value="$START_TIME" />
        <Arg value="-endpos" />
        <Arg value="$DURATION" />
        <Arg value="-o" />
        <Arg value="$WORKING_FILE.mp4" />
        <Arg value="-oac" />
        <Arg value="mp3lame" />
        <Arg value="-ovc" />
        <Arg value="xvid" />
        <Arg value="-xvidencopts" />
        <Arg value="bitrate=800" />
      </Exe>
      <Exe name="ffmpeg">
        <Arg value="-v" />
        <Arg value="-1.0" />
        <Arg value="-y" />
        <Arg value="-i" />
        <Arg value="$WORKING_FILE.mp4" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
      <Exe name="rm">
        <Arg value="-f" />
        <Arg value="$WORKING_FILE.mp4" />
      </Exe>
    </OutputFormat>
  </InputFormat>
  <InputFormat inputMime="video/x-ms-wmv" containerName="wmv">
    <OutputFormat outputMime="video/mp4" containerName="mp4">
      <Exe name="/usr/local/bin/ffmpeg">
        <Arg value="-v" />
        <Arg value="-1.0" />
        <Arg value="-y" />
        <Arg value="-i" />
        <Arg value="$INPUT_FILE" />
        <Arg value="-acodec" />
        <Arg value="libfaac" />
        <Arg value="-ab" />
        <Arg value="128k" />
        <Arg value="-ac" />
        <Arg value="2" />
        <Arg value="-vcodec" />
        <Arg value="libx264" />
        <Arg value="-vpre" />
        <Arg value="normal" />
        <Arg value="-b" />
        <Arg value="256k" />
        <Arg value="-bt" />
        <Arg value="256k" />
        <Arg value="-threads" />
        <Arg value="0" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
      <Exe name="/usr/local/bin/MP4Box">
        <Arg value="-inter" />
        <Arg value="0.5" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
    </OutputFormat>
    <OutputFormat outputMime="*" containerName="*">
      <Exe name="/usr/local/bin/ffmpeg">
        <Arg value="-v" />
        <Arg value="-1.0" />
        <Arg value="-y" />
        <Arg value="-i" />
        <Arg value="$INPUT_FILE" />
        <Arg value="$OUTPUT_FILE" />
      </Exe>
    </OutputFormat>
  </InputFormat>
</MediaProcessingToolchainConfiguration>
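To inspect a configuration like the one above outside the SDK, it can be read with the JDK's standard DOM parser. The following Java sketch lists every Exe element as a flat command line together with the input and output MIME types it belongs to. The class ToolchainConfigReader, its describe method and the shortened SAMPLE document are hypothetical and written for this example; they do not show how the SDK itself parses the configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical reader that flattens a toolchain configuration for inspection. */
final class ToolchainConfigReader {

    // Shortened sample in the format described by the DTD in this document.
    static final String SAMPLE =
        "<MediaProcessingToolchainConfiguration>"
        + "<InputFormat inputMime=\"video/mp4\" containerName=\"mp4\">"
        + "<OutputFormat outputMime=\"video/x-flv\" containerName=\"flv\">"
        + "<Case audio=\"aac\" video=\"mpeg4\">"
        + "<Exe name=\"ffmpeg\"><Arg value=\"-i\"/><Arg value=\"$INPUT_FILE\"/>"
        + "<Arg value=\"$OUTPUT_FILE\"/></Exe>"
        + "</Case>"
        + "<Exe name=\"cp\"><Arg value=\"$INPUT_FILE\"/><Arg value=\"$OUTPUT_FILE\"/></Exe>"
        + "</OutputFormat></InputFormat></MediaProcessingToolchainConfiguration>";

    /** Returns one line per Exe element, in document order. */
    static List<String> describe(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Configuration could not be parsed", e);
        }
        List<String> lines = new ArrayList<>();
        NodeList exes = doc.getElementsByTagName("Exe");
        for (int i = 0; i < exes.getLength(); i++) {
            Element exe = (Element) exes.item(i);
            Element parent = (Element) exe.getParentNode();
            String label = "";
            if (parent.getTagName().equals("Case")) {
                // An Exe inside a Case applies only to the given input codecs.
                label = " [audio=" + parent.getAttribute("audio")
                        + ", video=" + parent.getAttribute("video") + "]";
                parent = (Element) parent.getParentNode();
            }
            Element output = parent;
            Element input = (Element) output.getParentNode();
            StringBuilder command = new StringBuilder(exe.getAttribute("name"));
            NodeList args = exe.getElementsByTagName("Arg");
            for (int j = 0; j < args.getLength(); j++) {
                command.append(' ').append(((Element) args.item(j)).getAttribute("value"));
            }
            lines.add(input.getAttribute("inputMime") + " -> "
                    + output.getAttribute("outputMime") + label + ": " + command);
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(SAMPLE).forEach(System.out::println);
    }
}
```

The flattened listing is only meant for reviewing a configuration by eye. Format specifiers such as $INPUT_FILE are printed as written and are not expanded.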