Add ReplayGain Feature #936

Draft
wants to merge 1 commit into
base: master

Contributor

@complexlogic complexlogic commented Jan 4, 2025

Problem

The loudness of audio files varies significantly. In USDX, I frequently need to adjust the volume to compensate for this, which is very inconvenient.

ReplayGain is a common solution to the variable loudness problem. A scanning program measures the loudness of a file and "tags" it with metadata. Then, the player can read the metadata tag and automatically adjust the volume of the file during playback, so all files play back at the same perceived volume.

USDX does not currently support ReplayGain (see #352, #638). The current workaround for loudness normalization is to encode the volume changes into the audio file itself. However, this is undesirable because it is a lossy operation, unlike ReplayGain, which is lossless.

Proposed Implementation

This PR adds ReplayGain functionality to USDX, using the existing Sound->Music Gain setting to minimize changes to the UI. The currently available options for this setting are: 'Soft', 'Medium', and 'Hard'. This PR adds 'ReplayGain' as another option.

Screenshot

The Soft/Medium/Hard presets are described in the code as "ReplayGain", but they are not actually ReplayGain. They apply a static adjustment to the playback stream, irrespective of the characteristics of the file (see code here). This feature is also exclusive to the BASS playback engine, so it is not functional for non-Windows users, despite being exposed in the Options UI.

The new ReplayGain feature uses the FFmpeg audio decoder to access FFmpeg's parsed dictionary of metadata. The program looks for the REPLAYGAIN_TRACK_GAIN tag, plus R128_TRACK_GAIN for Opus files. If found, the program will store the requested volume adjustment in a new property in TAudioPlaybackStream called RGAdjustment. Then, USDX will apply the requested volume adjustment during playback.
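The tag lookup described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Pascal code in this PR: the function name `track_gain_db`, the plain dictionary standing in for FFmpeg's metadata dictionary, the precedence of `REPLAYGAIN_TRACK_GAIN` over `R128_TRACK_GAIN`, and the +5 dB offset used to bring the R128 value (referenced to -23 LUFS) in line with the ReplayGain 2.0 reference level (-18 LUFS) are all assumptions made for the example.

```python
import re
from typing import Mapping, Optional

# Offset between the EBU R128 reference (-23 LUFS) and the ReplayGain 2.0
# reference (-18 LUFS). Applying it here is an assumption of this sketch.
R128_TO_REPLAYGAIN_DB = 5.0

# Matches values such as "-6.54 dB", "+1.5 dB", or "-6.54".
_GAIN_RE = re.compile(r"^\s*([+-]?\d+(?:\.\d+)?)\s*(?:dB)?\s*$", re.IGNORECASE)


def track_gain_db(metadata: Mapping[str, str]) -> Optional[float]:
    """Return the requested track gain in dB, or None if no usable tag exists."""
    # Compare keys case-insensitively, since casing can vary between containers.
    tags = {key.upper(): value for key, value in metadata.items()}

    text = tags.get("REPLAYGAIN_TRACK_GAIN")
    if text is not None:
        match = _GAIN_RE.match(text)
        if match:
            return float(match.group(1))

    text = tags.get("R128_TRACK_GAIN")
    if text is not None:
        try:
            # Q7.8 fixed point: the integer divided by 256 is the gain in dB.
            return int(text.strip()) / 256.0 + R128_TO_REPLAYGAIN_DB
        except ValueError:
            return None
    return None
```

For example, `track_gain_db({"REPLAYGAIN_TRACK_GAIN": "-6.54 dB"})` yields a gain of -6.54 dB, while a file without either tag yields `None` and would play back unadjusted.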

I chose FFmpeg's metadata implementation for two reasons:

  • It requires no new dependencies for the program, since FFmpeg is already bundled. It just requires exposing a few more fields in the FFmpeg bindings.
  • FFmpeg's metadata dictionary is a convenient abstraction that works across all audio formats, even though they have different underlying metadata structures. I've successfully tested the new ReplayGain feature with the following file formats:
    • MP3
    • FLAC
    • M4A/MP4
    • Ogg Vorbis
    • Opus

Summary of Code Changes

Here is a summary of the code changes:

  • Add 'ReplayGain' option to Options->Music Gain setting in Options UI
  • Use a compile time switch to limit the Soft/Medium/Hard options to supported platforms. As mentioned above, Soft/Medium/Hard options are non-functional when not using BASS as the playback engine. Having options that do nothing is confusing for users. So, for non-Windows users, the only options will be 'Off' and 'ReplayGain'.
  • As mentioned above, the BASS Soft/Medium/Hard effects do not implement ReplayGain, despite being described as such in the code. Therefore, rename the associated types to avoid confusion with the new ReplayGain implementation:
    • TReplayGain/FReplayGain->TAutoGain/FAutoGain
    • TReplayGainBass->TAutoGainBass
  • Add code to TFFmpegDecodeStream to parse ReplayGain value from file
  • Add RGAdjustment property to TAudioPlaybackStream, which is used to adjust the stream volume during playback. This adjustment is applied in addition to the volume setting. It does not replace the volume. This ensures that fading continues to work.
  • Add EnableReplayGain method to TAudioPlaybackStream. This ensures that ReplayGain adjustments are only applied to the music stream. For example, if any of the game assets such as the background track are inadvertently tagged with ReplayGain metadata, it will not be applied because it's not enabled for that particular stream.
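The interaction between the volume setting, the ReplayGain adjustment, and the per-stream enable switch can be illustrated with a small Python sketch. The class `PlaybackStream` and its members are hypothetical stand-ins for `TAudioPlaybackStream`, `RGAdjustment`, and `EnableReplayGain`; storing the adjustment as a linear factor rather than a dB value is an assumption of this example, not something taken from the PR code.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


class PlaybackStream:
    """Hypothetical stand-in for TAudioPlaybackStream; not the USDX class."""

    def __init__(self) -> None:
        self.volume = 1.0          # user/fade volume in the range 0.0..1.0
        self.rg_adjustment = 1.0   # linear ReplayGain factor, 1.0 = no change
        self.replaygain_enabled = False

    def enable_replaygain(self, enabled: bool) -> None:
        # Only streams that opt in (the music stream) get the adjustment.
        self.replaygain_enabled = enabled

    def effective_volume(self) -> float:
        # The adjustment multiplies the volume instead of replacing it,
        # so a fade that changes `volume` keeps working.
        if self.replaygain_enabled:
            return self.volume * self.rg_adjustment
        return self.volume
```

Because the two factors are multiplied, a fade that ramps `volume` from 1.0 to 0.0 still reaches silence regardless of the tag value, and a stream that never calls `enable_replaygain(True)` ignores any tag it happens to carry.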

Limitations

Windows Version

The ReplayGain parsing code is implemented in the FFmpeg implementation of TAudioDecodeStream. However, the Windows version uses the BASS audio decoder for most file formats, and the FFmpeg audio decoder is only used as a fallback for formats that BASS cannot handle.

I saw in #750 there was discussion from the developers about removing the BASS audio decoder in favor of the FFmpeg audio decoder, and using BASS for audio input/playback only. However, it seems this change was never implemented. If the BASS audio decoder is removed, then all audio files will open with FFmpeg on all platforms, and my current ReplayGain implementation will work.

If removing the BASS audio decoder is not acceptable, I can look into refactoring the code by moving the ReplayGain tag parsing code into TMediaCore_FFmpeg. However, this would require opening and parsing the file twice on Windows: first in BASS, then in FFmpeg. I think it would be best to harmonize around FFmpeg as discussed in #750 and avoid the duplicated effort.

When I disable the BASS audio decoder locally, everything works fine in the Windows version.

Positive Gains

Currently, the program will not apply any positive gains in the ReplayGain tag, because there is no headroom available: the app volume and the music stream volume are both set to 1 (0 dB).

Given that almost all commercially released pop music will have a negative gain, I don't see this as a huge issue. If positive gains must be supported, potential solutions could include turning down the app volume, or applying a constant negative preamp value to the music stream.
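The headroom constraint and the preamp idea can be expressed in a few lines. This Python sketch is illustrative only: the function `applied_gain_db` and its `preamp_db` parameter are hypothetical, and treating a positive tag value as 0 dB (rather than skipping the tag entirely) is an assumption about what "not applied" means.

```python
def applied_gain_db(tag_gain_db: float, preamp_db: float = 0.0) -> float:
    """Return the gain to apply, never exceeding 0 dB in total.

    With preamp_db = 0, any positive tag gain collapses to 0 dB (no boost).
    A constant negative preamp shifts every track down, which leaves room
    for positive tag gains up to the size of the preamp.
    """
    return min(tag_gain_db + preamp_db, 0.0)
```

With no preamp, a tag of +3 dB results in 0 dB and a tag of -7.5 dB is applied unchanged. With a preamp of -6 dB, the same +3 dB tag results in -3 dB, so the track keeps its loudness relative to the others at the cost of lower overall volume.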

FFmpeg Bindings

I only changed the bindings for FFmpeg 5, 6, and 7. If support is needed for older versions, let me know and I can add the required changes for those bindings.

Fixes #352
Fixes #638

@bohning
Collaborator

bohning commented Jan 4, 2025

Awesome, thanks a lot. I'd vote for removing BASS altogether and harmonizing the use of FFmpeg across all platforms. @basisbit @s09bQ5 Do any of you know/remember the reason behind using BASS for Windows? I can't.

Supporting FFmpeg 5 and higher should be more than enough.

@barbeque-squared
Member

Another vote for deleting anything older than FFmpeg 5 and getting rid of BASS (we're basically playing catch-up with 8+ years of no real maintenance...)

Ideally this PR only gets merged/looked at again once we're in such a state -- I don't want to introduce platform-dependent things, or have it depend on the FFmpeg version etc. At that point it should be pretty easy to implement proper ReplayGain.

@basisbit
Member

basisbit commented Jan 5, 2025

> Awesome, thanks a lot. I'd vote for removing BASS altogether and harmonizing the use of FFmpeg across all platforms. @basisbit @s09bQ5 Do any of you know/remember the reason behind using BASS for Windows? I can't.

For audio output: It was the only library which didn't have issues with VBR MP3 playback on Windows, and this was still the case last time I tested it a few years ago.
For audio input: We probably still need BASS for audio input unless the SDL2 devs have fixed all the multichannel audio-input bugs in the meantime. And then someone would have to implement support for audio input from SDL2, including dealing with the various events.

Regarding deprecating old FFmpeg versions, I'd be okay with getting rid of support for anything older than FFmpeg 4.0. That's the oldest version which I still regularly saw in the wild last year.
