Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add opus custom support #18

Open
wants to merge 41 commits into
base: main
Choose a base branch
from
Open

Conversation

emlynmac
Copy link

I've been working on an implementation of Jamulus' audio protocol, which uses a custom Opus setting.

The PR contains changes to get the custom mode up and running.

@ydnar
Copy link
Contributor

ydnar commented Apr 19, 2022

Big PR, thanks! Some initial feedback that I think should be addressed before I can do a thorough review:

  1. I think there should be a CustomMode Swift type that wraps or is the underlying Opus C type.
  2. The constructors for Encoder and Decoder could accept a CustomMode, or implicitly handle it if the specified frame size isn’t standard.
  3. Please format according to the swiftformat spec in this repo. Changes with unformatted or differently formatted code will not be accepted.
  4. Please separate out any submodule changes into another PR (or remove them).

Thanks!

@emlynmac
Copy link
Author

Big PR, thanks! Some initial feedback that I think should be addressed before I can do a thorough review:

  1. I think there should be a CustomMode Swift type that wraps or is the underlying Opus C type.
  2. The constructors for Encoder and Decoder could accept a CustomMode, or implicitly handle it if the specified frame size isn’t standard.
  3. Please format according to the swiftformat spec in this repo. Changes with unformatted or differently formatted code will not be accepted.
  4. Please separate out any submodule changes into another PR (or remove them).

Thanks!

I forgot to update the submodule; done
Just ran swiftformat too

To your first points around breaking those out further, I originally started down this path but stopped as it didn't really help the module be more usable from an API perspective. The underlying calls are different enough that to wrapping them up together was clearer and didn't need to change the existing code at all.

@ydnar
Copy link
Contributor

ydnar commented Apr 20, 2022

To your first points around breaking those out further, I originally started down this path but stopped as it didn't really help the module be more usable from an API perspective. The underlying calls are different enough that to wrapping them up together was clearer and didn't need to change the existing code at all.

As I understand it, Opus custom encoders/decoders exist primarily (or only) to support nonstandard frame sizes. I don’t think that justifies a new, somewhat duplicative expansion of this package’s public API.

How about this: could you propose a minimal API change that enables nonstandard frame sizes? We can start right here in the PR comments.

Once we have a solid API proposal, then implement it.

@emlynmac
Copy link
Author

To your first points around breaking those out further, I originally started down this path but stopped as it didn't really help the module be more usable from an API perspective. The underlying calls are different enough that to wrapping them up together was clearer and didn't need to change the existing code at all.

As I understand it, Opus custom encoders/decoders exist primarily (or only) to support nonstandard frame sizes. I don’t think that justifies a new, somewhat duplicative expansion of this package’s public API.

How about this: could you propose a minimal API change that enables nonstandard frame sizes? We can start right here in the PR comments.

Once we have a solid API proposal, then implement it.

If there's a preferred API you'd like to make, I'm all up for that. I agree the almost duplication of some of the helper functions to expand the pointers is not ideal, but as I said - I did not want to disturb the other parts of the code here.
The crux of the issue is being able to handle the fame size and also to call opus_custom_encode rather than opus_encode. I've also had to ensure that the expected packet size and nils are handled by the decoder too.

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case.

The decode method in Opus.Decoder isn't able to handle that:

private func decode(_ input: UnsafeBufferPointer<UInt8>, to output: UnsafeMutableBufferPointer<Float32>) throws -> Int {
		let decodedCount = opus_decode_float(
			decoder,
			input.baseAddress!,
			Int32(input.count),
			output.baseAddress!,
			Int32(output.count),
			0
		)
		if decodedCount < 0 {
			throw Opus.Error(decodedCount)
		}
		return Int(decodedCount)
	}

@emlynmac
Copy link
Author

Ok, I found some time :)

I've merged the coding / decoding changes in with the existing classes.
See what you think of the changes!

} else {
decodedCount = opus_decode(
decoder,
input.isEmpty ? nil : input.baseAddress,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change means that if you pass an empty data packet, opus decoder gets the nil it needs to signify a dropped packet.

Copy link
Contributor

@ydnar ydnar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be simplified further. Key detail is hiding the existence and ownership of the OpusCustomMode within Encoder and Decoder.

@@ -0,0 +1,95 @@
import AVFoundation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What’s the purpose of this file? Can it be deleted?

(I think it should be, if Opus.Encoder and Opus.Decoder can support nonstandard frame sizes.

@@ -22,6 +23,27 @@ public extension Opus {
}
}

public init(customOpus: OpaquePointer,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the encoder and decoder typically exist on different machines, I think Decoder should own the OpusCustomMode and create it here.

It can be freed in deinit.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree - custom encoders and decoders need the custom object; you need to have a custom object in order to use either.
It maybe in your use case you have encoding on one machine and decoding on another, but you have to have the same custom instance for both if you're using them together.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constructor for an OpusCustomMode takes two arguments: a sample rate and a target frame size in number of samples, which must be 64-1024 plus some additional constraints on prime factors.

The sample rate can be derived from the AVAudioFormat in the constructor along with a customFrameSize parameter to the custom constructor for an Encoder and Decoder.

It’s probably useful to have common helper code to aid in the creation of a valid OpusCustomMode that throws an error, but isn’t part of this package’s public API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s probably useful to have common helper code to aid in the creation of a valid OpusCustomMode that throws an error, but isn’t part of this package’s public API.

That is the Opus.Custom.swift file.

Sources/Opus/Opus.Decoder.swift Outdated Show resolved Hide resolved
@@ -5,9 +5,10 @@ public extension Opus {
class Decoder {
let format: AVAudioFormat
let decoder: OpaquePointer
let customFrameSize: Int32?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably be a non-optional int that defaults to 0.

We’ll also need to add a let customMode: OpaquePointer here (see below).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no, this is a way to avoid having a duplicate un-needed pointer.
If you have a custom frame size, then your decoder is a custom one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s no function in the Opus library to extract the frame size used to initialize an OpusCustomMode, hence storing it here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow your point here; can you clarify?

decodedCount = opus_custom_decode(
decoder,
input.isEmpty ? nil : input.baseAddress,
size,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is packetSize passed as an argument here (and other decode methods) when it should just be equal to Int32(input.count)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packet size is needed for the custom decoder to know how much data it has.
In the case of a dropped packet, input is nil, but the encoder still needs to know how much data got dropped

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment at the top-level of this PR regarding dropped packets.

If the decoder was initialized with a custom frame size, then either use that, or have decodeDroppedPacket accept an (optional?) frame size argument.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Custom frame size and compressed packet size are not the same thing. Custom encode / decode need to know both in order to work.

packetSize: Int32?) throws -> Int
{
var decodedCount: Int32 = 0
if let size = packetSize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, this should branch on the presence of a non-nil customMode instance variable, which implies the decoder was initialized with a custom frame size.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented this a different way; using the optional presence of a custom frame size.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When decoding a stream of packets, is the frame size consistent, or does it vary? If it’s consistent, then this parameter shouldn’t exist.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The size can be multiples of the underlying frame size

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need the packetSize argument? Can that be inferred from input.count?

Copy link
Author

@emlynmac emlynmac Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, now I've had my coffee...

Compressed packet size != frame size.

Both are needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, great. Then can that be taken from the customFrameSize instance variable?

public init(format: AVAudioFormat, application _: Application = .audio) throws {
customFrameSize = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is made non-optional, this can be customFrameSize = 0, with customMode = nil.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then you need 2 new vars; customMode and customFrameSize.
Using an optional for customFrameSize means you don't need a separate variable.

@ydnar
Copy link
Contributor

ydnar commented Apr 21, 2022

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case.

The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR.

My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

@emlynmac
Copy link
Author

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case.
The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR.

My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

I've implemented a fix in the PR; no need to make a different method to handle the case of a pretty regular occurrence. Will make the API kinda strange to have to call a different decode API for nil.

@ydnar
Copy link
Contributor

ydnar commented Apr 21, 2022

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case.
The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR.
My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

I've implemented a fix in the PR; no need to make a different method to handle the case of a pretty regular occurrence. Will make the API kinda strange to have to call a different decode API for nil.

But from the caller’s perspective, a dropped packet isn’t nil. It just never arrived, hence can be handled with a more specific API. The goal of this package was to present a Swift-like interface to the Opus library, and not necessarily mirror its C API.

@emlynmac
Copy link
Author

Speaking of the existing code, it looks like it's not handling dropped packets. My understanding is that if you have a dropped packet you should feed nil into the decoder to cover that case.
The decode method in Opus.Decoder isn't able to handle that:

Good observation. I’d be game to add support for this in a separate PR.
My instinct says that a decodeDroppedPacket method (rather than interpreting nil) is probably the right approach. Thoughts?

I've implemented a fix in the PR; no need to make a different method to handle the case of a pretty regular occurrence. Will make the API kinda strange to have to call a different decode API for nil.

But from the caller’s perspective, a dropped packet isn’t nil. It just never arrived, hence can be handled with a more specific API. The goal of this package was to present a Swift-like interface to the Opus library, and not necessarily mirror its C API.

If you look at how you'd be using this, it would be from a jitter buffer. Typically that will just have the read method called periodically as the audio system requests more data.
The buffer would either return a compressed packet or a nil if no data. Simply feeding that into the encoder is somewhat cleaner than calling a separate method that ultimately calls the same opus method under the hood.

@ydnar
Copy link
Contributor

ydnar commented Apr 21, 2022

If you look at how you'd be using this, it would be from a jitter buffer. Typically that will just have the read method called periodically as the audio system requests more data. The buffer would either return a compressed packet or a nil if no data. Simply feeding that into the encoder is somewhat cleaner than calling a separate method that ultimately calls the same opus method under the hood.

Right now, there are two public decode methods. One accepts Data, and the other accepts UnsafeBufferPointer<UInt8>.

Adding support for nil packets to both of these methods doesn’t seem great, and adding two additional methods to decode a dropped packet also doesn’t seem great.

@emlynmac
Copy link
Author

If you look at how you'd be using this, it would be from a jitter buffer. Typically that will just have the read method called periodically as the audio system requests more data. The buffer would either return a compressed packet or a nil if no data. Simply feeding that into the encoder is somewhat cleaner than calling a separate method that ultimately calls the same opus method under the hood.

Right now, there are two public decode methods. One accepts Data, and the other accepts UnsafeBufferPointer<UInt8>.

Adding support for nil packets to both of these methods doesn’t seem great, and adding two additional methods to decode a dropped packet also doesn’t seem great.

Both of those methods already have support in my PR. No changes needed; just using an empty data signifies a nil, which is then passed into the decode method.
This has the added benefit of not crashing in the case where the bufferPtr is nil.

@ydnar
Copy link
Contributor

ydnar commented Apr 21, 2022

@emlynmac thanks for your efforts here, I appreciate it.

I think we should split out decoding dropped packets into a separate thread of work (probably a new PR), which is relevant for both custom and non-custom modes. I think this could help improve/inform the public API design for custom mode support.

@emlynmac
Copy link
Author

@emlynmac thanks for your efforts here, I appreciate it.

I think we should split out decoding dropped packets into a separate thread of work (probably a new PR), which is relevant for both custom and non-custom modes. I think this could help improve/inform the public API design for custom mode support.

You're welcome, I needed it to get jamulus working on swift, so I figured I'd pass it back.

If you want to refactor the dropping into another method, then that's def another PR. I need it to work for now, but that's why we have forks :)

@ydnar
Copy link
Contributor

ydnar commented Apr 21, 2022

I want this to land. Supporting dropped packets and supporting custom modes are two distinct and valuable features, and should be treated as such. Given that dropped packet support is relevant for all users of this package, I think it’s a good candidate to extract and land first. Then we can streamline and simplify the discussion around the design for support for custom modes.

@emlynmac
Copy link
Author

emlynmac commented Apr 21, 2022

I want this to land. Supporting dropped packets and supporting custom modes are two distinct and valuable features, and should be treated as such. Given that dropped packet support is relevant for all users of this package, I think it’s a good candidate to extract and land first. Then we can streamline and simplify the discussion around the design for support for custom modes.

From my perspective, the cleanest way would be to let the public decode methods take optionals.
That way, you can have a separate internal method to handle the dropped packet case, but the external API is unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants