I started working on Looping
Audio Converter back in 2015. Looping Audio Converter is
designed to handle music files with seamless loops, and maintain
those loops when converting from one format to another - usually
when extracting music from one video game, with the intention of
using it in another (this is why the default output format is the
Wii's .brstm format and ADPCM codec). It's always been a little
bit of a kludge, built from pieces that were floating around
elsewhere: almost all input and output formats the program
supports are handled by calling out to either a .NET library or a
Windows executable to convert the input to 16-bit PCM, then again
to convert that PCM data to the output format.
One important component of Looping Audio Converter is the
included FFmpeg
binary. FFmpeg is used to encode and decode certain formats
(including FLAC, Ogg Vorbis, and AAC), but it's also used for most
"effects" (like sample rate conversion and tempo and volume
adjustment).
In its normal mode, Looping Audio Converter decodes every input to
16-bit PCM and then encodes the output from that PCM data, which is
stored in this data structure (this code is from a newer branch than
the current version 3.0):
```csharp
public sealed class PCM16Audio : ILoopPoints {
    public short Channels { get; }
    public int SampleRate { get; }
    public short[] Samples { get; }
    public bool Looping { get; set; }
    public int LoopStart { get; set; }
    public int LoopEnd { get; set; }
}
```
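To make the fields concrete, here is a minimal sketch of how loop length could be derived from them. It assumes (the post does not say) that Samples is interleaved across channels and that LoopStart and LoopEnd count sample frames; the constructor, the ILoopPoints members, and the LoopMath helper are hypothetical stand-ins, not the program's own code.

```csharp
using System;

// Assumed shape of the interface; the post only names it.
public interface ILoopPoints {
    bool Looping { get; }
    int LoopStart { get; }
    int LoopEnd { get; }
}

// Stand-in for the class above, with an assumed constructor.
public sealed class PCM16Audio : ILoopPoints {
    public short Channels { get; }
    public int SampleRate { get; }
    public short[] Samples { get; }
    public bool Looping { get; set; }
    public int LoopStart { get; set; }
    public int LoopEnd { get; set; }

    public PCM16Audio(short channels, int sampleRate, short[] samples) {
        Channels = channels;
        SampleRate = sampleRate;
        Samples = samples;
    }
}

// Hypothetical helper, for illustration only.
public static class LoopMath {
    // Frames per channel, assuming interleaved samples.
    public static int FrameCount(PCM16Audio audio) =>
        audio.Samples.Length / audio.Channels;

    // Length of the looping section, assuming loop points are in frames.
    public static TimeSpan LoopDuration(PCM16Audio audio) =>
        audio.Looping
            ? TimeSpan.FromSeconds((audio.LoopEnd - audio.LoopStart) / (double)audio.SampleRate)
            : TimeSpan.Zero;
}
```

Under those assumptions, two seconds of stereo audio at 44100 Hz holds 176400 samples (88200 frames), and a loop from frame 22050 to frame 66150 lasts exactly one second.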
The main format that Looping Audio Converter does support
natively is the standard 16-bit PCM Windows .wav format, which it
uses to exchange data with its dependencies. Besides "fmt " and
"data", it also looks for (or writes to) a "smpl" chunk. This
chunk type was created for use in MIDI samplers, but like
vgmstream,
Looping Audio Converter treats it as a loop point indicator (which
isn't that different, I suppose).
```csharp
if (id == "smpl") {
    smpl* smpl = (smpl*)ptr2;
    if (smpl->sampleLoopCount > 1) {
        throw new WaveConverterException("Cannot read looping .wav file with more than one loop");
    } else if (smpl->sampleLoopCount == 1) {
        // There is one loop - we only care about start and end points
        smpl_loop* loop = (smpl_loop*)(smpl + 1);
        if (loop->type != 0) {
            throw new WaveConverterException("Cannot read looping .wav file with loop of type " + loop->type);
        }
        loopStart = loop->start;
        loopEnd = loop->end;
    }
}
```
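For readers who want to try this without unsafe pointers, here is a rough sketch of the same check using BinaryReader. It relies on the standard "smpl" layout: nine 32-bit little-endian fields, followed by one 24-byte record per loop. SmplChunk, ReadLoop, and WriteLoop are hypothetical names, and WriteLoop zeroes every field it does not need, which a real writer probably would not do.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Looping Audio Converter.
public static class SmplChunk {
    // Parses the body of a "smpl" chunk (the bytes after the 8-byte chunk header).
    // Returns null when the chunk declares no loops.
    public static (uint Start, uint End)? ReadLoop(byte[] chunkBody) {
        using (var reader = new BinaryReader(new MemoryStream(chunkBody))) {
            // Skip manufacturer, product, sample period, MIDI unity note,
            // MIDI pitch fraction, SMPTE format, and SMPTE offset (7 x uint32).
            reader.BaseStream.Seek(7 * 4, SeekOrigin.Begin);
            uint loopCount = reader.ReadUInt32();
            reader.ReadUInt32(); // size of trailing sampler-specific data
            if (loopCount == 0) return null;
            if (loopCount > 1) throw new InvalidDataException("More than one loop");

            reader.ReadUInt32(); // cue point ID
            uint type = reader.ReadUInt32();
            if (type != 0) throw new InvalidDataException("Unsupported loop type " + type);
            uint start = reader.ReadUInt32();
            uint end = reader.ReadUInt32();
            return (start, end);
        }
    }

    // Builds a chunk body with a single forward loop; unused fields are left at zero.
    public static byte[] WriteLoop(uint start, uint end) {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream)) {
            for (int i = 0; i < 7; i++) writer.Write(0u); // header fields before the loop count
            writer.Write(1u);    // number of loops
            writer.Write(0u);    // sampler data size
            writer.Write(0u);    // cue point ID
            writer.Write(0u);    // type 0 = forward loop
            writer.Write(start);
            writer.Write(end);
            writer.Write(0u);    // fraction
            writer.Write(0u);    // play count (0 = loop forever)
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

A round trip such as `SmplChunk.ReadLoop(SmplChunk.WriteLoop(1000, 5000))` should hand back the same start and end values, and the body for one loop is 60 bytes long.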
The Converter module contains the main application workflow after
the input files, output directory, and options are specified. It
looks something like this:
- Make sure FFmpeg exists in the path defined in the .config file
- Select an appropriate audio exporter for the output format (see below)
- Generate a list of audio importers to try
- For each input file:
    - Try each audio importer, in order, until the file is successfully decoded to 16-bit PCM
    - If the loop point was changed via loop.txt, apply the new loop point
    - Apply any changes to number of channels, sample rate, volume, or pitch/tempo using FFmpeg
    - For each section/variant of the music (whole, pre-loop, loop, post-loop, final-lap) being exported:
        - If channels are to be split apart, or into pairs, do so here
        - Apply loop point override set in the GUI (if any)
            - This might include showing a loop point selection dialog (borrowed from BrawlCrate), if the user chose one of the "ask" options
        - Encode the file using the selected audio exporter, and write to the output directory
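The "try each importer, in order" step from the list above can be sketched roughly as follows. ISketchImporter is a cut-down, hypothetical stand-in for the real importer type (it returns a bare short[] instead of a PCM object), and FakeImporter exists only to exercise the loop. Whether the real code collects failures like this, or passes extensions with a leading dot, is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Simplified, hypothetical stand-in for the importer type.
public interface ISketchImporter {
    bool SupportsExtension(string extension);
    Task<short[]> ReadFileAsync(string filename);
}

// Tiny fake used only to exercise the loop below.
public sealed class FakeImporter : ISketchImporter {
    private readonly string extension;
    private readonly Func<short[]> decode;

    public FakeImporter(string extension, Func<short[]> decode) {
        this.extension = extension;
        this.decode = decode;
    }

    public bool SupportsExtension(string ext) =>
        string.Equals(ext, extension, StringComparison.OrdinalIgnoreCase);

    public Task<short[]> ReadFileAsync(string filename) => Task.Run(decode);
}

public static class ImportLoop {
    // Returns the PCM from the first importer that accepts the extension and decodes without throwing.
    public static async Task<short[]> DecodeWithFirstWorkingImporterAsync(
            IEnumerable<ISketchImporter> importers, string filename) {
        string extension = Path.GetExtension(filename); // includes the leading dot (assumed convention)
        var failures = new List<Exception>();
        foreach (ISketchImporter importer in importers) {
            if (!importer.SupportsExtension(extension)) continue; // skipped, not counted as a failure
            try {
                return await importer.ReadFileAsync(filename);
            } catch (Exception e) {
                failures.Add(e); // remember why it failed, then try the next importer
            }
        }
        throw new AggregateException("No importer could decode " + filename, failures);
    }
}
```

With an .mp3-only importer, a .wav importer that throws, and a second .wav importer that succeeds, a call for "song.wav" skips the first, records the second's failure, and returns the third's output.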
The application contains several exporters (you pick one) and
several importers (it tries them in order for each file, skipping
any whose supported file extensions don't match). The importer
types look like this:
```csharp
public interface IRenderingHints {
    int RenderingSampleRate { get; }
    TimeSpan? RequiredDecodingDuration { get; }
}

public interface IAudioImporter {
    bool SupportsExtension(string extension);
    IEnumerable<object> TryReadUncompressedAudioFromFile(string filename);
    Task<PCM16Audio> ReadFileAsync(string filename, IRenderingHints hints = null, IProgress<double> progress = null);
}
```
The function SupportsExtension will be called first;
some importers only work with specific file types, so they can be
skipped if the extension is incorrect. ReadFileAsync
will decode the file to 16-bit PCM; often, this means calling an
external program and then reading the resulting .wav file. TryReadUncompressedAudioFromFile
is used in the rare case that "copy audio data only" is selected;
importers that support this will return an object that contains
just the existing audio data (like an AudioData from VGAudio),
and this can hopefully be picked up by the exporter's TryWriteCompressedAudioToFile.
Importers that don't have this functionality will just yield
break here and return nothing. RenderingSampleRate
is used for import from .vgm/.vgz
files (the only supported format that needs to be rendered,
not just decoded), and RequiredDecodingDuration is there
to work around a weird case (reading an infinite-loop input file
with FFmpeg) that probably won't come up.
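To show how the three members fit together, here is a sketch of an importer for a made-up ".tone" format: a text file holding one frequency in Hz. It has to be rendered rather than decoded, so it takes its sample rate and duration from the hints. The format, the ToneImporter class, the PCM16Audio constructor, the 44100 Hz and one-second defaults, and the 0-to-1 progress scale are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Threading.Tasks;

// Stand-in with an assumed constructor; the post only shows the properties.
public sealed class PCM16Audio {
    public short Channels { get; }
    public int SampleRate { get; }
    public short[] Samples { get; }
    public PCM16Audio(short channels, int sampleRate, short[] samples) {
        Channels = channels;
        SampleRate = sampleRate;
        Samples = samples;
    }
}

public interface IRenderingHints {
    int RenderingSampleRate { get; }
    TimeSpan? RequiredDecodingDuration { get; }
}

public interface IAudioImporter {
    bool SupportsExtension(string extension);
    IEnumerable<object> TryReadUncompressedAudioFromFile(string filename);
    Task<PCM16Audio> ReadFileAsync(string filename, IRenderingHints hints = null, IProgress<double> progress = null);
}

// Hypothetical importer for a made-up format; not part of Looping Audio Converter.
public sealed class ToneImporter : IAudioImporter {
    // Accepts "tone" with or without a leading dot.
    public bool SupportsExtension(string extension) =>
        extension.TrimStart('.').Equals("tone", StringComparison.OrdinalIgnoreCase);

    // No "copy audio data only" support: the sequence is simply empty.
    public IEnumerable<object> TryReadUncompressedAudioFromFile(string filename) {
        yield break;
    }

    public Task<PCM16Audio> ReadFileAsync(string filename, IRenderingHints hints = null, IProgress<double> progress = null) {
        double hz = double.Parse(File.ReadAllText(filename).Trim(), CultureInfo.InvariantCulture);
        int rate = hints?.RenderingSampleRate ?? 44100;                                 // assumed default
        TimeSpan length = hints?.RequiredDecodingDuration ?? TimeSpan.FromSeconds(1);   // assumed default

        // Render a mono sine wave at the requested rate and duration.
        var samples = new short[(int)(rate * length.TotalSeconds)];
        for (int i = 0; i < samples.Length; i++) {
            samples[i] = (short)(Math.Sin(2 * Math.PI * hz * i / rate) * short.MaxValue);
        }
        progress?.Report(1.0); // assumes progress runs from 0 to 1
        return Task.FromResult(new PCM16Audio(1, rate, samples));
    }
}
```

Because TryReadUncompressedAudioFromFile is an iterator, a caller can enumerate every importer the same way and an unsupported one just contributes nothing.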
In the branch I'm working on, the importers are:
Name | ReadFileAsync | TryReadUncompressedAudioFromFile |
---|---|---|
WaveImporter | ✓ | |
MP3Importer | ✓ (MP3Audio) | |
VorbisImporter | ✓ (VorbisAudio) | |
VGMImporter | ✓ | |
MSU1Converter | ✓ | |
MSFImporter | ✓ (MP3Audio, sometimes) | |
VGMStreamImporter | ✓ | |
VGAudioImporter | ✓ | ✓ (AudioData) |
BrawlLibImporter | ✓ | ✓ (RSTMNode) |
FFmpegEngine | ✓ | |
The exporter type looks like this:
```csharp
public interface IAudioExporter {
    Task WriteFileAsync(PCM16Audio lwav, string output_dir, string original_filename_no_ext, IProgress<double> progress = null);
    bool TryWriteCompressedAudioToFile(object audio, ILoopPoints loopPoints, string output_dir, string original_filename_no_ext);
}
```
The function WriteFileAsync takes an object containing 16-bit PCM
data, converts it to the exporter's format (which could take a
while), then writes it to a file in output_dir, optionally
reporting its progress through the IProgress<double> argument.
TryWriteCompressedAudioToFile takes an object produced by an
importer's TryReadUncompressedAudioFromFile and returns true if it
is able to do something with it.
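The "copy audio data only" handoff between the two methods might look something like this. The interfaces are reduced to the pass-through members, and IPassThroughSource, IPassThroughSink, FixedSource, FakeMp3Audio, Mp3OnlySink, and CopyAudioDataOnly are hypothetical names. What the program does when no exporter accepts the object is not covered here, so the sketch just returns false.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the interface; the post only names it.
public interface ILoopPoints {
    bool Looping { get; }
    int LoopStart { get; }
    int LoopEnd { get; }
}

// Reduced, hypothetical views of the importer and exporter types.
public interface IPassThroughSource {
    IEnumerable<object> TryReadUncompressedAudioFromFile(string filename);
}

public interface IPassThroughSink {
    bool TryWriteCompressedAudioToFile(object audio, ILoopPoints loopPoints, string output_dir, string original_filename_no_ext);
}

// Fakes used only to exercise the handoff.
public sealed class FakeMp3Audio { }

public sealed class FixedSource : IPassThroughSource {
    private readonly object[] items;
    public FixedSource(params object[] items) { this.items = items; }
    public IEnumerable<object> TryReadUncompressedAudioFromFile(string filename) => items;
}

public sealed class Mp3OnlySink : IPassThroughSink {
    public object Written { get; private set; }

    public bool TryWriteCompressedAudioToFile(object audio, ILoopPoints loopPoints, string output_dir, string original_filename_no_ext) {
        if (!(audio is FakeMp3Audio)) return false; // not a type this exporter understands
        Written = audio; // a real exporter would write a file to output_dir here
        return true;
    }
}

public static class CopyAudioDataOnly {
    // Offers every object from every importer to the exporter until one is accepted.
    public static bool TryCopy(IEnumerable<IPassThroughSource> importers, IPassThroughSink exporter,
            string filename, ILoopPoints loopPoints, string outputDir, string nameNoExt) {
        foreach (IPassThroughSource importer in importers) {
            foreach (object audio in importer.TryReadUncompressedAudioFromFile(filename)) {
                if (exporter.TryWriteCompressedAudioToFile(audio, loopPoints, outputDir, nameNoExt)) {
                    return true;
                }
            }
        }
        return false; // nothing matched; what happens next is left to the caller
    }
}
```

Passing plain object around keeps the importer and exporter interfaces independent of any particular library's types. Each exporter only has to type-check for the objects it knows how to write.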
In the branch I'm working on, the exporters are:
Name | WriteFileAsync | TryWriteCompressedAudioToFile | Includes loop data |
---|---|---|---|
VGAudioExporter | ✓ | ✓ (AudioData) | ✓ |
BrawlLibRSTMExporter | ✓ | ✓ (RSTMNode) | ✓ |
BrawlLibRWAVExporter | ✓ | ✓ | |
MSFExporter | ✓ | ✓ | |
MSU1Converter | ✓ | ✓ | |
FFmpegExporter | ✓ | | |
MP3Exporter | ✓ | ✓ (MP3Audio) | |
AACExporter | ✓ | | |
VorbisExporter | ✓ | ✓ (VorbisAudio) | ✓ |
WaveExporter | ✓ | ✓ | |
I'm thinking about adding a type for audio formats that already
contain 16-bit PCM, and letting them pass through, but since it's
already lossless when doing it the normal way, I'll likely decide
against it (same with FLAC, which I do plan to add looping
metadata support for, just like Vorbis).