Conversation

hiroshihorie
Member

Configuring AVAudioSession can fail, especially during a phone call. To handle this, propagate the configuration failure so that getUserMedia() will throw before continuing with the audio stack.

This PR already includes checks at the WebRTC level to prevent crashes. However, without this PR, failures will likely occur silently, leaving the user unaware. With this PR, the error will be thrown explicitly.
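To illustrate what this change means for callers, here is a hedged sketch of how an app could surface the new error. The `getUserMedia` function is passed in as a parameter so the logic is self-contained and testable; in a real app it would be `mediaDevices.getUserMedia` from react-native-webrtc, and `startMicrophone` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper showing how the propagated failure reaches JS callers.
type GetUserMedia = (constraints: { audio: boolean }) => Promise<unknown>;

export async function startMicrophone(
  getUserMedia: GetUserMedia
): Promise<unknown | null> {
  try {
    // With this PR, an AVAudioSession configuration failure (e.g. during a
    // phone call) rejects here instead of failing silently in the audio stack.
    return await getUserMedia({ audio: true });
  } catch {
    // Callers can now show a meaningful "microphone unavailable" state.
    return null;
  }
}
```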

@hiroshihorie hiroshihorie requested a review from davidliu October 3, 2025 13:55

// Activate the audio session
do {
try session.setActive(true)
Contributor

Does the session need to be activated here? The intention of this method was just for changing the configuration that the webrtc AVAudioSession will use, not necessarily activate it (there's also no corresponding way to deactivate it).

Member Author

Maybe startAudioSession() would be a better place?


I actually don't see a WebRTC startAudioSession() API; maybe it's our internal one?

From a quick look at the WebRTC RTCAudioSession setActive API, it seems to be the call that applies configuration changes. A quick Google search gave this:

> Yes, you must call setActive(true) after setting the AVAudioSession category for the new configuration to take effect. Setting the category defines your app's intended audio behavior, but activating the session enforces that behavior within the system.

I believe this means that if you change the configuration of an inactive audio session, you need to call setActive to apply the change; if the session is already active, it shouldn't be needed. The WebRTC code seems to handle both cases already:

- (BOOL)setActive:(BOOL)active error:(NSError **)outError {
  if (![self checkLock:outError]) {
    return NO;
  }
  int activationCount = _activationCount.load();
  if (!active && activationCount == 0) {
    RTCLogWarning(@"Attempting to deactivate without prior activation.");
  }
  [self notifyWillSetActive:active];
  BOOL success = YES;
  BOOL isActive = self.isActive;
  // Keep a local error so we can log it.
  NSError *error = nil;
  BOOL shouldSetActive =
      (active && !isActive) || (!active && isActive && activationCount == 1);
  // Attempt to activate if we're not active.
  // Attempt to deactivate if we're active and it's the last unbalanced call.
  if (shouldSetActive) {
    AVAudioSession *session = self.session;
    // AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation is used to ensure
    // that other audio sessions that were interrupted by our session can return
    // to their active state. It is recommended for VoIP apps to use this
    // option.
    AVAudioSessionSetActiveOptions options =
        active ? 0 : AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation;
    success = [session setActive:active withOptions:options error:&error];
    if (outError) {
      *outError = error;
    }
  } 
  if (success) {
    if (active) {
      if (shouldSetActive) {
        self.isActive = active;
        if (self.isInterrupted) {
          self.isInterrupted = NO;
          [self notifyDidEndInterruptionWithShouldResumeSession:YES];
        }
      }
      [self incrementActivationCount];
      [self notifyDidSetActive:active];
    }
  } else {
    RTCLogError(@"Failed to setActive:%d. Error: %@",
                active,
                error.localizedDescription); 
    [self notifyFailedToSetActive:active error:error];
  }
  // Set isActive and decrement activation count on deactivation
  // whether or not it succeeded.
  if (!active) {
    if (shouldSetActive) {
      self.isActive = active;
      [self notifyDidSetActive:active];
    }
    [self decrementActivationCount];
  }
  RTCLog(@"Number of current activations: %d", _activationCount.load());
  return success;
}
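To make the balancing in the Objective-C code above concrete, here is a small TypeScript model (illustrative only, not SDK code) of the `shouldSetActive` decision: the real AVAudioSession is only touched on the first activation and on the last matching deactivation.

```typescript
// Minimal model of RTCAudioSession's activation bookkeeping.
export interface SessionState {
  isActive: boolean;
  activationCount: number; // balanced count of activate/deactivate calls
}

export function shouldSetActive(state: SessionState, active: boolean): boolean {
  return (
    // Activate only if we're not already active.
    (active && !state.isActive) ||
    // Deactivate only on the last unbalanced call.
    (!active && state.isActive && state.activationCount === 1)
  );
}
```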

Contributor

The AVAudioSession automatically gets activated when the WebRTC call gets started:

https://github.com/webrtc-sdk/webrtc/blob/ebd5a9f966b31aaa723ac7d2520e56d7ae9101f7/sdk/objc/native/src/audio/audio_device_ios.mm#L943

https://github.com/webrtc-sdk/webrtc/blob/ebd5a9f966b31aaa723ac7d2520e56d7ae9101f7/sdk/objc/components/audio/RTCAudioSession.mm#L749

It's more that the intention of this API is just to set the configuration that will be used during the call. Users can either preset the configuration prior to the call or change it during the call as needed, but I don't think we'd want to activate the audio session while there's no active call (we'd be reaching beyond the scope of our SDK and potentially clashing with any other audio handling from the user).

Member Author
@hiroshihorie hiroshihorie Oct 4, 2025

I removed setActive from setAppleAudioConfiguration().

AudioUnit version of AudioDeviceModule:
I hadn’t noticed that it calls setActive internally 😅. In Swift/Flutter, we always explicitly called setActive. But it doesn’t seem to take into account the fact that setActive can fail when high-priority audio is active. In my opinion, without session activation the audio stack shouldn’t even attempt to start.

AudioEngine version of AudioDeviceModule:
This one doesn’t call setActive automatically. It only performs a category check but doesn’t actually modify anything. That means the SDK side is fully responsible for activation (except for interruption handling).

Side note: I’m not a fan of RTCAudioSession since it introduces extra states, and I see hacky workarounds to support AudioUnit that AVAudioEngine doesn’t require. It can also interfere with users or libraries that interact with AVAudioSession directly. My long-term goal is to avoid touching it at all. Currently, the AVAudioEngine ADM only uses it for interruption observation.

Contributor

Looks good; users should be calling our start/stop audio session APIs anyway.

}

do {
try session.setActive(true)
Member Author

Moved setActive(true) here. I haven't confirmed that this gets invoked before starting the internal AVAudioEngine; I assume it does, but I need to double-check.

}

do {
try session.setActive(false)
Member Author

Moved setActive(false) here.

session.unlockForConfiguration()

do {
try session.setConfiguration(config)
Member Author

Using RTCAudioSession's setConfiguration to simplify here.

@hiroshihorie hiroshihorie requested review from davidliu and removed request for davidliu October 4, 2025 03:21
@hiroshihorie
Member Author

hiroshihorie commented Oct 4, 2025

I think I need to update useIOSAudioManagement() as well to call setActive appropriately, but I'm afraid the timing isn't correct, since the LocalTrackPublished event is too late to activate the session. We need to activate before starting the audio engine.

@hiroshihorie
Member Author

7b60981 uses livekit/react-native-webrtc#62 to get audio engine events directly instead of having to count local/remote tracks.
Listening to room events is no longer required, and it becomes possible to stop the audio engine from continuing if AudioSession.setActive fails (e.g. during a phone call).

* @param preferSpeakerOutput
* @param onConfigureNativeAudio A custom method for determining options used.
*/
export function useIOSAudioManagement(
Contributor

This changes the behavior quite a bit. Better to create a separate method and deprecate the old one.

The use prefix is special in React and indicates a React Hook (something that only lives for the lifecycle of the containing component), whereas this looks like a one-and-done setup method. Something like setupIOSAudioManagement would work instead.
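One way the suggested split could look is sketched below. This is illustrative only: the name setupIOSAudioManagement comes from the comment above, and the option shape and teardown callback are placeholders, not the final API.

```typescript
// Placeholder options type; the real hook takes more parameters.
export interface IOSAudioManagementOptions {
  preferSpeakerOutput: boolean;
}

export function setupIOSAudioManagement(
  opts: IOSAudioManagementOptions
): () => void {
  // ...one-time native audio configuration would go here...
  void opts;
  // Return a teardown function so callers can undo the setup explicitly,
  // rather than relying on a component unmounting.
  return () => {
    // ...teardown...
  };
}

/** @deprecated The use prefix implies a React Hook; prefer setupIOSAudioManagement. */
export function useIOSAudioManagement(opts: IOSAudioManagementOptions): void {
  setupIOSAudioManagement(opts);
}
```

Keeping the old name as a thin deprecated wrapper preserves backwards compatibility while steering new code to the setup function.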

Member Author

You're right, I think deprecating it is better 👍

const tryConfigure = async (newState: AudioEngineConfigurationState, oldState: AudioEngineConfigurationState) => {
if ((!newState.isPlayoutEnabled && !newState.isRecordingEnabled) && (oldState.isPlayoutEnabled || oldState.isRecordingEnabled)) {
log.info("AudioSession deactivating...")
await AudioSession.stopAudioSession()
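The snippet above can be summarized as a pure transition decision (an illustrative TypeScript model, not SDK code). The deactivate branch mirrors the diff; the activate branch is an assumption about the symmetric case, which is not shown in the snippet.

```typescript
export interface EngineState {
  isPlayoutEnabled: boolean;
  isRecordingEnabled: boolean;
}

export type Transition = 'activate' | 'deactivate' | 'none';

// Deactivate only when both playout and recording turn off from an on state;
// activate (assumed) only when either turns on from a fully-off state.
export function audioSessionTransition(
  newState: EngineState,
  oldState: EngineState
): Transition {
  const newOn = newState.isPlayoutEnabled || newState.isRecordingEnabled;
  const oldOn = oldState.isPlayoutEnabled || oldState.isRecordingEnabled;
  if (!newOn && oldOn) return 'deactivate';
  if (newOn && !oldOn) return 'activate';
  return 'none';
}
```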
Contributor

Hmm, users will already need to call start/stopAudioSession() manually for Android; will this cause any issues with that?

Member Author

Ahh... yes in that case it could be funky...

@hiroshihorie
Member Author

I think I will revert 7b60981 for now.

@hiroshihorie hiroshihorie merged commit 4e0ad99 into main Oct 7, 2025
8 checks passed
@hiroshihorie hiroshihorie deleted the hiroshi/propagate-session-config-failure branch October 7, 2025 10:26