ContentFilterEncryptedMedia

danielweck edited this page Dec 18, 2014 · 13 revisions

Step-by-step instructions

For a quick turnaround when experimenting with encrypted media resources, you may modify the existing ContentFilter class called "PassThroughFilter" (which by default simply passes the raw data extracted from the zip archive, no byte transformation involved).

When you're done experimenting, you need to make your implementation more permanent: create a new file/class specifically for your ContentFilter, and copy/paste the modified PassThroughFilter. Remember to restore the original PassThroughFilter code!

1) First

Register the filter in the PopulateFilterManager() initialisation function, so that the filter gets included in the processing chain:

https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/initialization.cpp#L41

PassThroughFilter::Register();

2) Second

Make sure that the SniffPassThroughContent method correctly tests your resource. For example, you may need to decode only certain types of audio and video files:

https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/PassThroughFilter.cpp#L37

// "item" is a ManifestItem pointer
auto mediaType = item->MediaType();
return (mediaType == "audio/mp4" || mediaType == "audio/mpeg" || mediaType == "video/mp4" || mediaType == "video/mpeg");

The above condition only tests the media type declared in the manifest, which may not always be adequate. Here is a more sophisticated example that checks the item's metadata as well (note that the metadata namespace URIs and names are fictitious):

// "item" is a ManifestItem pointer
auto mediaType = item->MediaType();
if (mediaType == "application/xhtml+xml" || mediaType == "text/html")
{
    bool isCrypt = false;

    // Vocabulary prefix form: <meta property="example:encrypted" refines="#chap1">
    IRI iri1 = IRI("http://foo.com/example#encrypted");
    if (item->ContainsProperty(iri1, false))
    {
        PropertyPtr prop1 = item->PropertyMatching(iri1, false);
        if (prop1->Value() == "true")
            isCrypt = true;
    }

    // XML namespace form: <example:encrypted refines="#chap1">
    IRI iri2 = IRI("http://foo.com/example/encrypted");
    if (item->ContainsProperty(iri2, false))
    {
        PropertyPtr prop2 = item->PropertyMatching(iri2, false);
        if (prop2->Value() == "true")
            isCrypt = true;
    }

    // spine or manifest item 'properties' attribute values only accept predefined EPUB3 vocabulary!
    //string iri3 = "http://foo.com/example#encrypted";
    //if (item->HasProperty(iri3))
    //{
    //    isCrypt = true;
    //}

    if (isCrypt)
    {
        return true;
    }
}
// Neither metadata form marked this item as encrypted
return false;

As you can see, the above example demonstrates 3 different types of metadata. Here are the corresponding pieces in the OPF package XML:

UPDATE: spine or manifest item properties attribute values actually only accept predefined EPUB3 vocabulary, so this particular method does not work. The XML below still contains this attribute, just to make the point.

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf"
    prefix="example:http://foo.com/example#">
    <metadata xmlns:example="http://foo.com/example/">
        <meta property="example:encrypted" refines="#chap1">true</meta>
        <example:encrypted refines="#chap1">true</example:encrypted>
    </metadata> 
    <manifest>
        <item id="chap1" 
            href="chapter1.xhtml" 
            media-type="application/xhtml+xml"
            properties="example:encrypted"
        />
    </manifest>
</package>

3) Third

Implement the FilterData() decryption routine, which consumes raw bytes provided by the PassThroughContext instance, and transforms them into a decoded buffer:

https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/PassThroughFilter.cpp#L76
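
The overall shape of such a routine can be sketched with stand-in types. Everything below (DemoContext, DemoFilterData, the single-byte XOR) is hypothetical and only loosely mirrors the pattern of receiving raw bytes plus a context and handing back a decoded buffer with its length. Check the linked source for the actual signature:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a per-request filter context (hypothetical, not the SDK type).
// It owns the output buffer and the running offset, so that two concurrent
// requests never share state.
struct DemoContext
{
    std::size_t offset = 0;            // absolute position of the next raw byte
    std::vector<std::uint8_t> buffer;  // decoded bytes produced by the last call
};

// Stand-in transformation: XOR with one fixed byte (NOT real encryption).
// Decodes `len` raw bytes into the context's buffer, reports the decoded
// length through `outputLen`, and returns a pointer to the decoded bytes.
const std::uint8_t* DemoFilterData(DemoContext& ctx, const std::uint8_t* data,
                                   std::size_t len, std::size_t* outputLen)
{
    assert(data != nullptr && outputLen != nullptr);
    ctx.buffer.assign(data, data + len);
    for (std::uint8_t& b : ctx.buffer)
        b = static_cast<std::uint8_t>(b ^ 0x5A);
    ctx.offset += len;
    *outputLen = ctx.buffer.size();
    return ctx.buffer.data();
}
```

Because the returned pointer refers to memory owned by the context, it stays valid only until the next call made with the same context.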

Note that by default, the PassThroughFilter returns "OperatingMode::SupportsByteRanges" in the GetOperatingMode() method, which means that your implementation must support arbitrary HTTP byte range requests (i.e. query of bytes within the stream, from a given offset, for a given length).

https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/PassThroughFilter.h#L42

Should your decryption algorithm not support byte ranges, you may need to disable support for "OperatingMode::SupportsByteRanges", and instead use the "OperatingMode::RequiresCompleteData" variant (which implies that each encrypted media resource will be entirely loaded in memory).
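
As a rough illustration of what byte-range support demands, the sketch below uses a stand-in XOR "cipher" (an assumption for demonstration, not a real encryption scheme and not part of readium-sdk) in which each byte depends only on its absolute offset. The hypothetical DecodeRange() helper can therefore decode any (offset, length) window without touching the rest of the stream:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in cipher (illustration only): byte i is XORed with key[i % key.size()].
// Since byte i depends only on i, any range can be decoded independently.
std::vector<std::uint8_t> DecodeRange(const std::vector<std::uint8_t>& raw,
                                      std::size_t offset, std::size_t length,
                                      const std::vector<std::uint8_t>& key)
{
    assert(!key.empty());
    std::vector<std::uint8_t> out;
    if (offset >= raw.size())
        return out; // the requested range starts at or past EOF

    // Clamp the range so that it never reads past the end of the stream.
    const std::size_t end = offset + std::min(length, raw.size() - offset);
    out.reserve(end - offset);
    for (std::size_t i = offset; i < end; ++i)
        out.push_back(static_cast<std::uint8_t>(raw[i] ^ key[i % key.size()]));
    return out;
}
```

A scheme with this property can keep OperatingMode::SupportsByteRanges. Schemes in which a byte depends on everything before it (compression, for instance) do not have it, which is when OperatingMode::RequiresCompleteData becomes the simpler choice.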

4) Fourth

Write additional code to calculate the size of the decrypted media resource (by default, the BytesAvailable() method returns the length of the raw byte stream):

https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/PassThroughFilter.cpp#L71

This size is used by the HTTP layer to detect the end of file (EOF) when a resource is requested incrementally.
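
How the decrypted size is obtained depends entirely on your scheme. As one possibility, assume (hypothetically) that the stored resource begins with a 4-byte big-endian header holding the plaintext length. The ReadDecryptedSize() helper below is not part of readium-sdk; a BytesAvailable() override could return such a value instead of the raw length:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stored layout (an assumption, not the format used by the Gists):
//   [4-byte big-endian plaintext length][encrypted payload ...]
constexpr std::size_t kHeaderBytes = 4;

// Reads the plaintext length from the header into *sizeOut.
// Returns false when the stored resource is too short to hold a header.
bool ReadDecryptedSize(const std::vector<std::uint8_t>& stored,
                       std::size_t* sizeOut)
{
    assert(sizeOut != nullptr);
    if (stored.size() < kHeaderBytes)
        return false;
    *sizeOut = (static_cast<std::size_t>(stored[0]) << 24) |
               (static_cast<std::size_t>(stored[1]) << 16) |
               (static_cast<std::size_t>(stored[2]) << 8) |
                static_cast<std::size_t>(stored[3]);
    return true;
}
```

Storing the length up front avoids having to decrypt the whole resource just to answer a size query.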

Concrete example 1

Encryption

This Gist contains the contentFilterChainEncode.java code that implements a (very) naive encryption routine (for demonstration purposes only):

https://gist.github.com/danielweck/d04c5eb9fcb2693e0805

This Gist contains the contentFilterChainEncode.sh shell script that compiles and runs the above Java file:

https://gist.github.com/danielweck/b84c7da78670bb61bb21

Decryption

This Gist contains both the BytesAvailable() and the FilterData() methods, matching the above encryption algorithm (note that the global class variables in the Content Filter should really be migrated to the Filter Context, as this is where the state of an individual HTTP connection is maintained):

https://gist.github.com/danielweck/eab6bd6ac63aaeb211b6

Note that this example demonstrates an "encryption" scheme that increases the size of the resource (conversely, decryption results in a smaller byte output). This is not always the case. In fact, files would usually be compressed prior to encryption and then stored in the EPUB archive, resulting in decrypted files that are larger than the stored resource! See the next example below.

Concrete example 2

Encryption

Here, the single-file miniz.c library is used to simply zip (deflate / compress) the resource prior to storing it in the EPUB archive.

https://code.google.com/p/miniz/

This is effectively our "encryption" scheme :)

Decryption

Although the code linked below is very dirty, it is a working example that demonstrates how the "decryption" (inflate / decompress) results in a larger stream of bytes than the stored resource.

https://github.com/readium/readium-sdk/blob/da8e29131bf031e6b28708acabc94e582505fa62/ePub3/ePub/PassThroughFilter.cpp#L151

This works fine in the non-HTTP-byte-range scenario, because there is an intermediary "cache" buffer that stores data for later requests:

https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/filter_chain_byte_stream.cpp#L83
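
The idea behind such a cache can be sketched as follows. SurplusCache is a hypothetical helper written for this page, not the class used in filter_chain_byte_stream.cpp: decoded bytes that exceed the caller's request are retained and handed out on the next read:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical helper (not the readium-sdk implementation): keeps decoded
// bytes that did not fit into the caller's buffer until they are requested.
class SurplusCache
{
public:
    // Stores decoded bytes that the caller has not consumed yet.
    void Append(const std::vector<std::uint8_t>& decoded)
    {
        _pending.insert(_pending.end(), decoded.begin(), decoded.end());
    }

    // Copies up to `capacity` pending bytes into `dest`, oldest first,
    // removes them from the cache, and returns the number of bytes copied.
    std::size_t Read(std::uint8_t* dest, std::size_t capacity)
    {
        assert(dest != nullptr || capacity == 0);
        const std::size_t n = std::min(capacity, _pending.size());
        const auto last = _pending.begin() + static_cast<std::ptrdiff_t>(n);
        std::copy(_pending.begin(), last, dest);
        _pending.erase(_pending.begin(), last);
        return n;
    }

    // Number of decoded bytes still waiting to be read.
    std::size_t Pending() const { return _pending.size(); }

private:
    std::deque<std::uint8_t> _pending;
};
```

A reader would typically drain the cache first and only ask the filter for more data once Pending() reaches zero.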

However, there is a known caveat with HTTP byte-range requests, in which case the possibility of larger-than-requested data chunks is not handled properly yet (there is no equivalent intermediary "cache" byte buffer):

https://github.com/readium/readium-sdk/blob/develop/ePub3/ePub/filter_chain_byte_stream_range.cpp#L136