isaacschemm posting in snailsharp

I started working on Looping Audio Converter back in 2015. Looping Audio Converter is designed to handle music files with seamless loops, and maintain those loops when converting from one format to another - usually when extracting music from one video game, with the intention of using it in another (this is why the default output format is the Wii's .brstm format and ADPCM codec). It's always been a little bit of a kludge, built from pieces that were floating around elsewhere: almost all input and output formats the program supports are handled by calling out to either a .NET library or a Windows executable to convert the input to 16-bit PCM, then again to convert that PCM data to the output format.

One important component of Looping Audio Converter is the included FFmpeg binary. FFmpeg is used to encode and decode certain formats (including FLAC, Ogg Vorbis, and AAC), but it's also used for most "effects" (like sample rate conversion and tempo and volume adjustment).

The normal mode of Looping Audio Converter converts all inputs to and then from 16-bit PCM, stored in this data structure (this code is from a newer branch than the current version 3.0):

public sealed class PCM16Audio : ILoopPoints {
    public short Channels { get; }
    public int SampleRate { get; }
    public short[] Samples { get; }
    public bool Looping { get; set; }
    public int LoopStart { get; set; }
    public int LoopEnd { get; set; }
}
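
As a small illustration of how these fields relate, here is a sketch that converts a loop point into a time offset. It is not code from Looping Audio Converter: LoopTiming and its methods are hypothetical names, and it assumes that Samples is interleaved and that LoopStart and LoopEnd count sample frames (one sample per channel), which the structure above doesn't state outright.

```csharp
using System;

// Hypothetical helper, not part of Looping Audio Converter.
// Assumes interleaved samples and loop points measured in sample frames.
public static class LoopTiming {
    // Number of frames (one sample per channel) in an interleaved buffer.
    public static int FrameCount(int totalSamples, int channels) {
        if (channels <= 0) throw new ArgumentOutOfRangeException(nameof(channels));
        return totalSamples / channels;
    }

    // Converts a position in frames to a time offset at the given sample rate.
    public static TimeSpan FramesToTime(int frames, int sampleRate) {
        if (sampleRate <= 0) throw new ArgumentOutOfRangeException(nameof(sampleRate));
        return TimeSpan.FromSeconds((double)frames / sampleRate);
    }
}
```

Under those assumptions, LoopTiming.FramesToTime(audio.LoopStart, audio.SampleRate) would give the point in the track where the loop begins.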

The main format that Looping Audio Converter does support natively is the standard 16-bit PCM Windows .wav format, which it uses to exchange data with its dependencies. Besides "fmt " and "data", it also looks for (or writes to) a "smpl" chunk. This chunk type was created for use in MIDI samplers, but like vgmstream, Looping Audio Converter treats it as a loop point indicator (which isn't that different, I suppose).

if (id == "smpl") {
    smpl* smpl = (smpl*)ptr2;
    if (smpl->sampleLoopCount > 1) {
        throw new WaveConverterException("Cannot read looping .wav file with more than one loop");
    } else if (smpl->sampleLoopCount == 1) {
        // There is one loop - we only care about start and end points
        smpl_loop* loop = (smpl_loop*)(smpl + 1);
        if (loop->type != 0) {
            throw new WaveConverterException("Cannot read looping .wav file with loop of type " + loop->type);
        }
        loopStart = loop->start;
        loopEnd = loop->end;
    }
}
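
For comparison, the same check can be written without unsafe pointers. This is only a sketch, not the program's actual code: SmplReader and TryReadLoop are hypothetical names, standard exception types stand in for WaveConverterException, and BinaryPrimitives needs a runtime (or the System.Memory package) that provides it. It assumes the usual "smpl" layout (nine little-endian 32-bit header fields, with the loop count as the eighth, followed by 24-byte loop records) and that `chunk` holds the chunk body without the 8-byte id/size header.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, not part of Looping Audio Converter.
public static class SmplReader {
    private const int HeaderSize = 36;      // nine uint32 fields
    private const int LoopCountOffset = 28; // eighth field: number of sample loops
    private const int LoopRecordSize = 24;  // six uint32 fields per loop

    // Returns true and sets start/end when the chunk describes exactly one loop of type 0.
    // Returns false when there are no loops. Throws for the cases the code above rejects.
    public static bool TryReadLoop(ReadOnlySpan<byte> chunk, out uint start, out uint end) {
        start = 0;
        end = 0;
        if (chunk.Length < HeaderSize) throw new FormatException("smpl chunk is too short");

        uint loopCount = BinaryPrimitives.ReadUInt32LittleEndian(chunk.Slice(LoopCountOffset, 4));
        if (loopCount == 0) return false;
        if (loopCount > 1) throw new NotSupportedException("More than one loop");
        if (chunk.Length < HeaderSize + LoopRecordSize) throw new FormatException("smpl loop record is truncated");

        // Loop record fields: cue point id, type, start, end, fraction, play count
        ReadOnlySpan<byte> loop = chunk.Slice(HeaderSize, LoopRecordSize);
        uint type = BinaryPrimitives.ReadUInt32LittleEndian(loop.Slice(4, 4));
        if (type != 0) throw new NotSupportedException("Loop of type " + type);

        start = BinaryPrimitives.ReadUInt32LittleEndian(loop.Slice(8, 4));
        end = BinaryPrimitives.ReadUInt32LittleEndian(loop.Slice(12, 4));
        return true;
    }
}
```

The trade-off is a few explicit offsets in exchange for bounds checks on every read, which matters when the .wav file comes from an untrusted or truncated source.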

The Converter module contains the main application workflow after the input files, output directory, and options are specified. It looks something like this:

  • Make sure FFmpeg exists at the path defined in the .config file
  • Select an appropriate audio exporter for the output format (see below)
  • Generate a list of audio importers to try
  • For each input file:
    • Try each audio importer, in order, until the file is successfully decoded to 16-bit PCM
    • If the loop point was changed via loop.txt, apply the new loop point
    • Apply any changes to number of channels, sample rate, volume, or pitch/tempo using FFmpeg
    • For each section/variant of the music (whole, pre-loop, loop, post-loop, final-lap) being exported:
      • If channels are to be split apart, or into pairs, do so here
      • Apply loop point override set in the GUI (if any)
        • This might include showing a loop point selection dialog (borrowed from BrawlCrate), if the user chose one of the "ask" options
      • Encode the file using the selected audio exporter, and write to the output directory
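
The section/variant step in the list above (whole, pre-loop, loop, post-loop) can be sketched roughly like this. It's an illustration under assumptions, not the program's implementation: LoopSections is a hypothetical name, the samples are assumed to be interleaved, and the loop points are assumed to be in sample frames with an exclusive loop end.

```csharp
using System;

// Hypothetical helper, not part of Looping Audio Converter.
// Assumes interleaved samples; frame positions count one sample per channel,
// and end positions are exclusive.
public static class LoopSections {
    // Copies the frames in [startFrame, endFrame) into a new interleaved buffer.
    public static short[] Slice(short[] samples, int channels, int startFrame, int endFrame) {
        if (channels <= 0) throw new ArgumentOutOfRangeException(nameof(channels));
        int totalFrames = samples.Length / channels;
        if (startFrame < 0 || endFrame < startFrame || endFrame > totalFrames) {
            throw new ArgumentOutOfRangeException(nameof(endFrame));
        }
        short[] result = new short[(endFrame - startFrame) * channels];
        Array.Copy(samples, startFrame * channels, result, 0, result.Length);
        return result;
    }

    // Everything before the loop starts.
    public static short[] PreLoop(short[] samples, int channels, int loopStart) =>
        Slice(samples, channels, 0, loopStart);

    // The looping part itself.
    public static short[] Loop(short[] samples, int channels, int loopStart, int loopEnd) =>
        Slice(samples, channels, loopStart, loopEnd);

    // Anything left after the loop end.
    public static short[] PostLoop(short[] samples, int channels, int loopEnd) =>
        Slice(samples, channels, loopEnd, samples.Length / channels);
}
```

Multiplying frame positions by the channel count keeps each slice aligned to whole frames, so channels never get swapped at a section boundary.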

The application contains several exporters (that you can pick from), and several importers (that it'll try in order, for each file, although some of them will get skipped if a file extension doesn't match). The importer interfaces look like this:

public interface IRenderingHints {
    int RenderingSampleRate { get; }
    TimeSpan? RequiredDecodingDuration { get; }
}

public interface IAudioImporter {
    bool SupportsExtension(string extension);
    IEnumerable<object> TryReadUncompressedAudioFromFile(string filename);
    Task<PCM16Audio> ReadFileAsync(string filename, IRenderingHints hints = null, IProgress<double> progress = null);
}

The function SupportsExtension will be called first; some importers only work with specific file types, so they can be skipped if the extension is incorrect. ReadFileAsync will decode the file to 16-bit PCM; often, this means calling an external program and then reading the resulting .wav file. TryReadUncompressedAudioFromFile is used in the rare case that "copy audio data only" is selected; importers that support this will return an object that contains just the existing audio data (like an AudioData from VGAudio), and this can hopefully be picked up by the exporter's TryWriteCompressedAudioToFile. Importers that don't have this functionality will just yield break here and return nothing. RenderingSampleRate is used for import from .vgm/.vgz files (the only supported format that needs to be rendered, not just decoded), and RequiredDecodingDuration is there to work around a weird case (reading an infinite-loop input file with FFmpeg) that probably won't come up.
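
The "try each importer in order" behaviour can be sketched as follows. To keep the example compilable on its own, it uses trimmed-down stand-ins (DecodedAudio, ISimpleImporter) rather than the real PCM16Audio and IAudioImporter, and ImporterChain is a hypothetical name. Whether the real SupportsExtension expects the extension with or without the leading dot isn't stated above; this sketch assumes no dot. Collecting the failures into an AggregateException is also my own choice for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Stand-ins for PCM16Audio and IAudioImporter, reduced to what the sketch needs.
public sealed class DecodedAudio {
    public short[] Samples = new short[0];
}

public interface ISimpleImporter {
    bool SupportsExtension(string extension);
    Task<DecodedAudio> ReadFileAsync(string filename);
}

// Hypothetical helper, not part of Looping Audio Converter.
public static class ImporterChain {
    // Tries each importer in order and returns the first successful decode.
    public static async Task<DecodedAudio> DecodeAsync(IEnumerable<ISimpleImporter> importers, string filename) {
        // Assumption: extensions are compared without the leading dot.
        string extension = Path.GetExtension(filename).TrimStart('.');
        var failures = new List<Exception>();

        foreach (ISimpleImporter importer in importers) {
            // An importer that doesn't handle this extension is skipped, not counted as a failure.
            if (!importer.SupportsExtension(extension)) continue;
            try {
                return await importer.ReadFileAsync(filename);
            } catch (Exception ex) {
                // Remember why this importer failed, then move on to the next one.
                failures.Add(ex);
            }
        }

        throw new AggregateException("No importer could decode " + filename, failures);
    }
}
```

Ordering the list from most specific to most general means a catch-all importer only sees files that nothing else could read.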

In the branch I'm working on, the importers are:

Name               ReadFileAsync   TryReadUncompressedAudioFromFile
WaveImporter
MP3Importer                        ✓ (MP3Audio)
VorbisImporter                     ✓ (VorbisAudio)
VGMImporter
MSU1Converter
MSFImporter                        ✓ (MP3Audio, sometimes)
VGMStreamImporter
VGAudioImporter                    ✓ (AudioData)
BrawlLibImporter                   ✓ (RSTMNode)
FFmpegEngine

The exporter type looks like this:

public interface IAudioExporter {
    Task WriteFileAsync(PCM16Audio lwav, string output_dir, string original_filename_no_ext, IProgress<double> progress = null);
    bool TryWriteCompressedAudioToFile(object audio, ILoopPoints loopPoints, string output_dir, string original_filename_no_ext);
}

The function WriteFileAsync will take an object containing 16-bit PCM data, convert it to the exporter's format (which could take a while), then write it to a file in output_dir, optionally reporting its progress through the IProgress<double> argument. TryWriteCompressedAudioToFile will take an object produced by an importer's TryReadUncompressedAudioFromFile, and return true if it is able to do something with it.
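
Here is a minimal sketch of how the "copy audio data only" handoff between the two Try methods could work. How the application actually wires them together isn't shown here, so treat this as an assumption: CopyOnly, TryCopy, and NothingToOffer are hypothetical names, and a plain delegate stands in for the exporter's TryWriteCompressedAudioToFile.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Looping Audio Converter.
public static class CopyOnly {
    // candidates: whatever an importer's TryReadUncompressedAudioFromFile yielded (possibly nothing).
    // tryWrite: stands in for an exporter's TryWriteCompressedAudioToFile.
    // Returns true if some candidate was written as-is; false means the caller
    // should fall back to the normal decode-to-PCM path.
    public static bool TryCopy(IEnumerable<object> candidates, Func<object, bool> tryWrite) {
        foreach (object candidate in candidates) {
            if (tryWrite(candidate)) return true;
        }
        return false;
    }

    // What an importer without this feature effectively returns: an empty sequence.
    public static IEnumerable<object> NothingToOffer() {
        yield break;
    }
}
```

Passing the audio around as a plain object keeps the importer and exporter interfaces independent of each library's types, and each exporter simply ignores objects it doesn't recognize.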

In the branch I'm working on, the exporters are:

Name                   WriteFileAsync   TryWriteCompressedAudioToFile   Includes loop data
VGAudioExporter                         ✓ (AudioData)
BrawlLibRSTMExporter                    ✓ (RSTMNode)
BrawlLibRWAVExporter
MSFExporter
MSU1Converter
FFmpegExporter
MP3Exporter                             ✓ (MP3Audio)
AACExporter
VorbisExporter                          ✓ (VorbisAudio)
WaveExporter
I'm thinking about adding a type for audio formats that already contain 16-bit PCM, and letting them pass through, but since it's already lossless when doing it the normal way, I'll likely decide against it (same with FLAC, which I do plan to add looping metadata support for, just like Vorbis).
