Experimental microphone support

SaracenOne requested to merge github/fork/SaracenOne/audio_mic into master

Okay, following @reduz’s advice, I decided to post a PR for this feature I have been working on, even though it is not complete and definitely not ready for merging. I knew next to nothing about audio encoding before starting this, so it has largely been a learning experience. Having run up against some walls while trying to make the feature work correctly, I feel I should put out what I have already done to get feedback, and perhaps support from people with more experience in this particular field. I'm led to understand that @marcelofg55 in particular might be able to help out with this.

In my discussion with @reduz, the idea behind this particular approach was to add microphone support directly through the AudioStream interface, which could then in theory go through AudioMixer for processing. This idea actually introduced somewhat more complexity than my original attempt at the interface, and necessitated the introduction of a kind of proxy interface between the audio driver, the server, and individual streams. This comes in the form of MicrophoneRecievers and MicrophoneDeviceOutputs. The MicrophoneDeviceOutputs are the primary endpoints that the actual drivers provide for each capture device. MicrophoneRecievers are invisibly assigned by the new AudioStreamMicrophone classes when playback is requested, which then allows these classes to decode the microphone buffers themselves and go through the standard audio mixing process. The presence of receivers also automatically controls whether a capture device should be active or not. These classes are also heavily virtualised, as they also support non-physical capture devices such as the ‘default’ endpoint, which can be changed while the engine is already running.
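To make the receiver/device relationship above a little more concrete, here is a minimal sketch of the idea that the presence of receivers decides whether a capture device is active. Only the class names MicrophoneReciever and MicrophoneDeviceOutput come from this PR; every method name, the FakeDeviceOutput stand-in, and the use of std::set are assumptions made for illustration and do not mirror the actual code in the branch. The virtual ‘default’ endpoint is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class MicrophoneReciever; // spelling follows the class name used in the PR

// One endpoint per capture device. Method names here are hypothetical.
class MicrophoneDeviceOutput {
public:
    virtual ~MicrophoneDeviceOutput() = default;

    // The first receiver to attach switches the capture device on.
    void add_reciever(MicrophoneReciever *r) {
        const bool was_idle = recievers.empty();
        recievers.insert(r);
        if (was_idle) {
            start_capture();
        }
    }

    // The last receiver to detach switches the capture device off.
    void remove_reciever(MicrophoneReciever *r) {
        if (recievers.erase(r) > 0 && recievers.empty()) {
            stop_capture();
        }
    }

    std::size_t reciever_count() const { return recievers.size(); }
    virtual bool is_capturing() const = 0;

protected:
    // A real driver would open and close the hardware device here.
    virtual void start_capture() = 0;
    virtual void stop_capture() = 0;

private:
    std::set<MicrophoneReciever *> recievers;
};

// Created on behalf of a stream when playback starts, destroyed when it stops.
class MicrophoneReciever {
public:
    explicit MicrophoneReciever(MicrophoneDeviceOutput *p_owner) : owner(p_owner) {
        assert(owner != nullptr);
        owner->add_reciever(this);
    }
    ~MicrophoneReciever() { owner->remove_reciever(this); }

    MicrophoneReciever(const MicrophoneReciever &) = delete;
    MicrophoneReciever &operator=(const MicrophoneReciever &) = delete;

private:
    MicrophoneDeviceOutput *owner;
};

// Hypothetical stand-in for a driver-backed device; it only tracks state.
class FakeDeviceOutput : public MicrophoneDeviceOutput {
public:
    bool is_capturing() const override { return capturing; }
    int start_calls = 0;

protected:
    void start_capture() override {
        capturing = true;
        ++start_calls;
    }
    void stop_capture() override { capturing = false; }

private:
    bool capturing = false;
};
```

In this sketch a stream never touches the driver directly: it only creates or destroys a receiver, and the device output decides when capture actually needs to run.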

I will stress further that this is NOT a fully functional implementation of microphones; it is merely a proof of concept which has severe audio clipping and which will actually crash 10 seconds after starting a microphone stream due to a buffer overflow. It is only meant to demonstrate the underlying design of the interface and only currently supports one audio driver. It also probably requires some further cleanup and refactoring.

While most of the issues can be addressed fairly easily, the main thing I am having trouble with right now is how to solve the audio clipping problem. The main issue lies in synchronising the audio capture with the audio stream processing; audio capture packets currently seem to come in at a bigger size than they get decoded at, meaning that the only way to keep them in sync currently has been to only process part of them in the AudioStream, clipping off the end, otherwise greater and greater latency gets introduced between the capture and playback. If anyone can figure out how to handle this issue, please let me know or even send me another pull request.
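For discussion, one common way to decouple capture packet size from mix chunk size is a ring buffer between the two: the capture side writes whole packets, the stream reads exactly as many samples as the mixer asks for, and whatever is left over stays queued for the next mix instead of being cut off. The sketch below is not what this PR does. CaptureRingBuffer and all of its methods are hypothetical, it handles mono float samples only, and it is not thread-safe, so a real version would need a lock or a single-producer/single-consumer design. Dropping the oldest samples when the buffer is full is just one possible policy for keeping latency bounded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the PR.
class CaptureRingBuffer {
public:
    explicit CaptureRingBuffer(std::size_t capacity) : data(capacity) {
        assert(capacity > 0);
    }

    // Capture side: append a packet of any size.
    // When full, the oldest samples are discarded so latency stays bounded.
    void write(const float *src, std::size_t n) {
        const std::size_t cap = data.size();
        if (n > cap) { // packet larger than the buffer: keep only its newest part
            src += n - cap;
            n = cap;
        }
        const std::size_t free_space = cap - count;
        if (n > free_space) { // make room by dropping the oldest samples
            const std::size_t drop = n - free_space;
            head = (head + drop) % cap;
            count -= drop;
        }
        const std::size_t tail = (head + count) % cap;
        for (std::size_t i = 0; i < n; ++i) {
            data[(tail + i) % cap] = src[i];
        }
        count += n;
    }

    // Mixer side: read exactly n samples into dst.
    // On underrun the remainder is filled with silence.
    // Returns how many real samples were delivered.
    std::size_t read(float *dst, std::size_t n) {
        const std::size_t cap = data.size();
        const std::size_t got = std::min(n, count);
        for (std::size_t i = 0; i < got; ++i) {
            dst[i] = data[(head + i) % cap];
        }
        std::fill(dst + got, dst + n, 0.0f);
        head = (head + got) % cap;
        count -= got;
        return got;
    }

    std::size_t available() const { return count; }

private:
    std::vector<float> data;
    std::size_t head = 0;  // index of the oldest queued sample
    std::size_t count = 0; // number of queued samples
};
```

With a buffer like this, a 6-sample capture packet followed by a 4-sample mix request leaves 2 samples queued rather than clipped. It does not by itself solve drift between the capture and output clocks, which is where the resampling item in the list below would come in.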

I’ll list a few of the things I feel still need to be addressed before this PR would be ready for merging:

  • Keep captured audio in sync with output without clipping
  • Default audio endpoints which can be changed while the engine is running
  • Allow opening of arbitrary microphone devices
  • AudioStreamMicrophone resampling
  • Fix editor crash on exit
  • WASAPI support
  • PulseAudio support
  • CoreAudio support

While there are many local applications for microphone support, the main motivation for this feature is to provide a way to support VOIP. This would likely be achieved by compressing audio packets with Opus, sending them through the networking interface, and then decoding them on the clients. While it would likely take the form of another interface sitting atop the basic microphone input interface, it would be nice to have a sample implementation of such a feature that developers could easily integrate into their own games to instantly have VOIP support.

Lastly, I have also included an extremely basic sample project to demonstrate this feature. Remember that it currently works only with the drivers listed as supported. godot_mic_test.zip
