I believe that there is a mistake in the definition of TUD_AUDIO_HEADSET_STEREO_DESCRIPTOR. The nitfs in the audio descriptor is always 3 regardless of ITF_NUM_TOTAL.
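To illustrate the distinction between the audio function's interface count and ITF_NUM_TOTAL, here is a minimal sketch. Only ITF_NUM_TOTAL is a name taken from the report. The other enumerators, including the extra non-audio interface, are hypothetical and chosen for illustration, not copied from the actual descriptor source.

```c
#include <assert.h>

/* Hypothetical interface numbering for a composite device.
 * Only ITF_NUM_TOTAL is named in the report; the rest are assumptions. */
enum {
    ITF_NUM_AUDIO_CONTROL = 0,    /* audio control interface             */
    ITF_NUM_AUDIO_STREAMING_SPK,  /* speaker streaming interface         */
    ITF_NUM_AUDIO_STREAMING_MIC,  /* microphone streaming interface      */
    ITF_NUM_EXTRA,                /* hypothetical non-audio interface    */
    ITF_NUM_TOTAL                 /* total interfaces in the device      */
};

/* Interfaces that belong to the audio function only. */
enum {
    AUDIO_FUNCTION_ITF_COUNT =
        ITF_NUM_AUDIO_STREAMING_MIC - ITF_NUM_AUDIO_CONTROL + 1
};

/* The audio function spans 3 interfaces, while the device total is 4. */
static_assert(AUDIO_FUNCTION_ITF_COUNT == 3, "audio function has 3 interfaces");
static_assert(ITF_NUM_TOTAL == 4, "device has 4 interfaces in this sketch");
```

In this sketch the two counts agree only when the device has no interfaces besides the audio ones, which is why the choice of value for nitfs matters once another interface is added.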
This example code creates a USB Audio 2.0 headset device.
The device has two audio interfaces. The first is a stereo speaker
with a 48 kHz stereo stream.
The second is a microphone with a 48 kHz mono stream.
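As a rough sizing aid for the two streams, the sketch below computes the nominal audio payload per 1 ms USB frame. The helper name iso_bytes_per_ms is hypothetical. The 16-bit sample size and the 1 ms full-speed frame are assumptions, since the description above gives only the sample rate and channel count.

```c
#include <assert.h>
#include <stdint.h>

/* Nominal bytes of audio per 1 ms USB frame at a constant sample rate.
 * Assumes the rate is a whole number of samples per millisecond. */
static uint32_t iso_bytes_per_ms(uint32_t sample_rate_hz,
                                 uint32_t channels,
                                 uint32_t bytes_per_sample)
{
    assert(sample_rate_hz % 1000u == 0u);
    return (sample_rate_hz / 1000u) * channels * bytes_per_sample;
}
```

Under these assumptions, iso_bytes_per_ms(48000, 2, 2) gives 192 bytes for the stereo speaker stream and iso_bytes_per_ms(48000, 1, 2) gives 96 bytes for the mono microphone stream. Real endpoint sizes are typically somewhat larger than the nominal value, to leave room for an occasional extra sample when the clock is adjusted.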
This example can be used as a starting point for an audio device.
It can also be used to verify ISO endpoints on a board.
The speaker uses an adaptive clock (bound to SOF).
The microphone currently uses an asynchronous clock.
Volume and mute controls are present, but they are not applied to the
data stream.