Thursday, 19 September 2024 | KDAB on Qt

Implementing an Audio Mixer, Part 2

Recap

In Part 1, we covered PCM audio and superimposing waveforms, and developed an algorithm to combine an arbitrary number of audio streams into one.

Now we need to use these ideas to finish a full implementation using Qt Multimedia.

Using Qt Multimedia for Audio Device Access

So what do we need? Well, we want to use a single QAudioOutput, which we pass an audio device and a supported audio format.

We can get those like this:

const QAudioDeviceInfo &device = QAudioDeviceInfo::defaultOutputDevice();
const QAudioFormat &format = device.preferredFormat();

Let’s construct our QAudioOutput object using the device and format:

static QAudioOutput audioOutput(device, format);

Now, to use it to write data, we have to call start on the audio output, passing in a QIODevice *.

Normally we would use the QIODevice subclass QBuffer for a single audio buffer. But in this case, we want our own subclass of QIODevice, so we can combine multiple buffers into one IO device.

We’ll call our subclass MixerStream. This is where we will do our bufferCombine, and keep our member list of streams mStreams.

We will also need another stream type for mStreams. For now let’s just call it DecodeStream, forward declare it, and worry about its implementation later.

One thing that’s good to know at this point is that DecodeStream objects will get the data buffers we need by decoding audio data from a file. Because of this, we’ll need to keep our audio format from above as a data member, mFormat. Then we can pass it to decoders when they need it.

Implementing `MixerStream`

Since we are subclassing QIODevice, we need to provide reimplementations for these two protected virtual functions:

virtual qint64 QIODevice::readData(char *data, qint64 maxSize);
virtual qint64 QIODevice::writeData(const char *data, qint64 maxSize);

We also want to provide a way to open new streams that we’ll add to mStreams, given a filename. We’ll call this function openStream. We can also allow looping a stream multiple times, so let’s add a parameter for that and give it a default value of 1.

Additionally, we’ll need a user-defined destructor to delete any pointers in the list that might remain if the MixerStream is abruptly destructed.

// mixerstream.h

#pragma once

#include <QAudioFormat>
#include <QAudioOutput>
#include <QIODevice>

class DecodeStream;

class MixerStream : public QIODevice
{
    Q_OBJECT

public:
    explicit MixerStream(const QAudioFormat &format);
    ~MixerStream();

    void openStream(const QString &fileName, int loops = 1);

protected:
    qint64 readData(char *data, qint64 maxSize) override;
    qint64 writeData(const char *data, qint64 maxSize) override;

private:
    QAudioFormat mFormat;
    QList<DecodeStream *> mStreams;
};

Notice that combineSamples isn’t in the header. It’s a pretty basic function that doesn’t require any members, so we can just implement it as a free function.

Let’s put it in a header mixer.h and wrap it in a namespace:

// mixer.h

#pragma once

#include <QtGlobal>

#include <limits>

namespace Mixer
{
inline qint16 combineSamples(qint32 samp1, qint32 samp2)
{
    const auto sum = samp1 + samp2;
    if (std::numeric_limits<qint16>::max() < sum)
        return std::numeric_limits<qint16>::max();

    if (std::numeric_limits<qint16>::min() > sum)
        return std::numeric_limits<qint16>::min();

    return sum;
}
} // namespace Mixer
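To try the clipping behavior outside a Qt project, here is a minimal sketch of the same idea using only the standard library. The name combineSamplesStd is hypothetical and not part of the post's code; it assumes 16-bit signed samples, like the function above.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical Qt-free twin of Mixer::combineSamples (assumes 16-bit signed PCM).
inline std::int16_t combineSamplesStd(std::int32_t samp1, std::int32_t samp2)
{
    // Add in 32 bits so two 16-bit samples cannot overflow the intermediate sum.
    const std::int32_t sum = samp1 + samp2;

    constexpr std::int32_t lowest = std::numeric_limits<std::int16_t>::min();
    constexpr std::int32_t highest = std::numeric_limits<std::int16_t>::max();

    // Clip to the 16-bit range instead of letting the value wrap around.
    return static_cast<std::int16_t>(std::clamp(sum, lowest, highest));
}
```

Two loud samples such as 30000 and 30000 come out clipped to the 16-bit maximum, while quieter pairs are simply added.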

There are some very basic things we can get out of the way quickly in the MixerStream cpp file. Recall that we must implement these member functions:

explicit MixerStream(const QAudioFormat &format);
~MixerStream();

void openStream(const QString &fileName, int loops = 1);

qint64 readData(char *data, qint64 maxSize) override;
qint64 writeData(const char *data, qint64 maxSize) override;

The constructor is very simple:

MixerStream::MixerStream(const QAudioFormat &format)
    : mFormat(format)
{
    setOpenMode(QIODevice::ReadOnly);
}

Here we use setOpenMode to automatically open our device in read-only mode, so we don’t have to call open() directly from outside the class.

Also, since it’s going to be read-only, our reimplementation of QIODevice::writeData will do nothing:

qint64 MixerStream::writeData([[maybe_unused]] const char *data,
                              [[maybe_unused]] qint64 maxSize)
{
    Q_ASSERT_X(false, "writeData", "not implemented");
    return 0;
}

The custom destructor we need is also quite simple:

MixerStream::~MixerStream()
{
    while (!mStreams.empty())
        delete mStreams.takeLast();
}

readData will be almost exactly the same as the implementation we did earlier, but returning qint64. The return value is meant to be the amount of data written, which in our case is just the maxSize argument given to it, as we write fixed-size buffers.

Additionally, we should call qAsConst (or std::as_const) on mStreams in the range-for to avoid detaching the Qt container. For more on qAsConst and range-based for loops, see Jesper Pedersen’s blog post on the topic.

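qint64 MixerStream::readData(char *data, qint64 maxSize)
{
    // Start from silence so each stream can be mixed on top of it.
    memset(data, 0, maxSize);

    // Use qint64 for the counts: a qint16 would overflow for large buffers.
    constexpr qint64 bytesPerSample = sizeof(qint16);
    const qint64 numSamples = maxSize / bytesPerSample;

    for (auto *stream : qAsConst(mStreams))
    {
        auto *cursor = reinterpret_cast<qint16 *>(data);
        qint16 sample;

        // Only mix when a whole sample was read; read() returns -1 on error.
        for (qint64 i = 0; i < numSamples; ++i, ++cursor)
            if (stream->read(reinterpret_cast<char *>(&sample), bytesPerSample) == bytesPerSample)
                *cursor = Mixer::combineSamples(sample, *cursor);
    }

    return maxSize;
}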
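The mixing loop can also be tried without any audio hardware. The following is a rough sketch under a few assumptions: plain std::vector objects stand in for the streams, samples are 16-bit signed, and the function name mixStreams is made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: mixes every source into a zeroed output of numSamples samples.
std::vector<std::int16_t> mixStreams(const std::vector<std::vector<std::int16_t>> &sources,
                                     std::size_t numSamples)
{
    constexpr std::int32_t lowest = std::numeric_limits<std::int16_t>::min();
    constexpr std::int32_t highest = std::numeric_limits<std::int16_t>::max();

    // Begin with silence, like the memset at the top of readData.
    std::vector<std::int16_t> out(numSamples, 0);

    for (const auto &source : sources) {
        // A source shorter than the output contributes silence past its end.
        const std::size_t count = std::min(numSamples, source.size());
        for (std::size_t i = 0; i < count; ++i) {
            const std::int32_t sum = std::int32_t{out[i]} + std::int32_t{source[i]};
            out[i] = static_cast<std::int16_t>(std::clamp(sum, lowest, highest));
        }
    }

    return out;
}
```

Each pass over a source adds it on top of what is already in the output, which is why the output has to be zeroed first.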

That only leaves us with openStream. This one will require us to discuss DecodeStream and its interface.

The function should construct a new DecodeStream on the heap, which will need a file name and format. DecodeStream, as implied by its name, needs to decode audio files to PCM data. We’ll use a QAudioDecoder within DecodeStream to accomplish this, and for that, we need to pass mFormat to the constructor. We also need to pass loops to the constructor, as each stream can have a different number of loops.

Now our constructor call will look like this:

DecodeStream(fileName, mFormat, loops);

We can then use operator<< to add it to mStreams.

Finally, we need to remove it from the list when it’s done. We’ll give it a Qt signal, finished, and connect it to a lambda expression that will remove the stream from the list and delete the pointer.

Our completed openStream function now looks like this:

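void MixerStream::openStream(const QString &fileName, int loops)
{
    auto *decodeStream = new DecodeStream(fileName, mFormat, loops);
    mStreams << decodeStream;

    // finished() is emitted from inside DecodeStream::readData, while
    // MixerStream::readData may still be iterating over mStreams, so queue
    // the removal until control returns to the event loop.
    connect(decodeStream, &DecodeStream::finished, this, [this, decodeStream]() {
        mStreams.removeAll(decodeStream);
        decodeStream->deleteLater();
    }, Qt::QueuedConnection);
}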
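The bookkeeping in openStream (add a stream to the list, then drop it once it reports that it is finished) can be sketched without Qt. Everything named here is hypothetical: FakeStream and StreamRegistry are stand-ins invented for this example, and a plain member function plays the role of the lambda connected to finished(). One difference to keep in mind: this sketch destroys the stream immediately, whereas deleteLater defers destruction to the event loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for DecodeStream; only its identity matters here.
struct FakeStream
{
    int id = 0;
};

// Hypothetical stand-in for the mStreams bookkeeping in MixerStream.
class StreamRegistry
{
public:
    // Mirrors openStream: create the stream and append it to the list.
    FakeStream *open(int id)
    {
        auto stream = std::make_unique<FakeStream>();
        stream->id = id;
        mStreams.push_back(std::move(stream));
        return mStreams.back().get();
    }

    // Mirrors the finished() handler: remove the stream from the list,
    // which also destroys it because the list owns it.
    void onFinished(const FakeStream *stream)
    {
        const auto isTarget = [stream](const std::unique_ptr<FakeStream> &candidate) {
            return candidate.get() == stream;
        };
        mStreams.erase(std::remove_if(mStreams.begin(), mStreams.end(), isTarget),
                       mStreams.end());
    }

    std::size_t count() const { return mStreams.size(); }

private:
    std::vector<std::unique_ptr<FakeStream>> mStreams;
};
```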

Recall from earlier that we call read on a stream, which takes a char * to which the read data will be copied and a qint64 representing the size of the data.

This is a QIODevice function, which will internally call readData. Thus, DecodeStream also needs to be a QIODevice.

Getting PCM Data for `DecodeStream`

In DecodeStream, we need readData to spit out PCM data, so we need to decode our audio file to get its contents in PCM format. In Qt Multimedia, we use a QAudioDecoder for this. We pass it an audio format to decode to, and a source device, in this case a QFile file handle for our audio file.

When a QAudioDecoder's start method is called, it will begin decoding the source file in a non-blocking manner, emitting a signal bufferReady when a full buffer of decoded PCM data is available.

On that signal, we can call the decoder’s read method, which gives us a QAudioBuffer. To store the decoded data in DecodeStream, we use a QByteArray data member, which we access through QBuffers to get a QIODevice interface for reading and writing. This is a convenient way to read and write buffers of bytes in Qt.

We’ll make two QBuffers: one for writing data to the QByteArray (we’ll call it mInputBuf), and one for reading from the QByteArray (we’ll call it mOutputBuf). The reason for using two buffers rather than one read/write buffer is so the read and write positions can be independent. Otherwise, we will encounter more stuttering.
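To see why independent positions matter, here is a small Qt-free sketch of one byte array with a separate read cursor. The class TwoCursorBuffer is hypothetical and only imitates the arrangement described above (one shared array, writes appended at the end, reads advancing on their own); it is not how QBuffer is implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one QByteArray shared by a write-only and a read-only buffer.
class TwoCursorBuffer
{
public:
    // Append at the write position (the end); the read position is untouched.
    void write(const char *src, std::size_t len)
    {
        mBytes.insert(mBytes.end(), src, src + len);
    }

    // Copy up to maxLen bytes from the read position and advance only that cursor.
    std::size_t read(char *dst, std::size_t maxLen)
    {
        const std::size_t count = std::min(maxLen, mBytes.size() - mReadPos);
        std::copy_n(mBytes.data() + mReadPos, count, dst);
        mReadPos += count;
        return count;
    }

    // True once the reader has consumed everything written so far.
    bool atEnd() const { return mReadPos == mBytes.size(); }

private:
    std::vector<char> mBytes;
    std::size_t mReadPos = 0;
};
```

Because writing never moves the read cursor, the decoder can keep appending data while playback continues from wherever it left off.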

So when we get the bufferReady signal, we’ll want to do something like this:

const QAudioBuffer buffer = mDecoder.read();
mInputBuf.write(buffer.data<char>(), buffer.byteCount());

We’ll also need to have some sort of state enum. The reason for this is that when we are finished with the stream and emit finished(), we remove and delete the stream from a connected lambda expression, but read might still be called before that has completed. Thus, we want to only read from the buffer when the state is Playing.

Let’s update mixer.h to put the enum in namespace Mixer:

#pragma once

#include <QtGlobal>

#include <limits>

namespace Mixer
{
enum State
{
    Playing,
    Stopped
};

inline qint16 combineSamples(qint32 samp1, qint32 samp2)
{
    const auto sum = samp1 + samp2;

    if (std::numeric_limits<qint16>::max() < sum)
        return std::numeric_limits<qint16>::max();

    if (std::numeric_limits<qint16>::min() > sum)
        return std::numeric_limits<qint16>::min();

    return sum;
}
} // namespace Mixer

Implementing `DecodeStream`

Now that we understand all the data members we need to use, let’s see what our header for DecodeStream looks like:

// decodestream.h

#pragma once

#include "mixer.h"

#include <QAudioDecoder>
#include <QBuffer>
#include <QFile>

class DecodeStream : public QIODevice
{
    Q_OBJECT

public:
    explicit DecodeStream(const QString &fileName, const QAudioFormat &format, int loops);

protected:
    qint64 readData(char *data, qint64 maxSize) override;
    qint64 writeData(const char *data, qint64 maxSize) override;

signals:
    void finished();

private:
    QFile mSourceFile;
    QByteArray mData;
    QBuffer mInputBuf;
    QBuffer mOutputBuf;
    QAudioDecoder mDecoder;
    QAudioFormat mFormat;
    Mixer::State mState;
    int mLoopsLeft;
};

In the constructor, we’ll initialize our private members, open the DecodeStream in read-only (like we did earlier), make sure we open the QFile and QBuffers successfully, and finally set up our QAudioDecoder.

DecodeStream::DecodeStream(const QString &fileName, const QAudioFormat &format, int loops)
    : mSourceFile(fileName)
    , mInputBuf(&mData)
    , mOutputBuf(&mData)
    , mFormat(format)
    , mState(Mixer::Playing)
    , mLoopsLeft(loops)
{
    setOpenMode(QIODevice::ReadOnly);

    const bool valid = mSourceFile.open(QIODevice::ReadOnly) &&
                       mOutputBuf.open(QIODevice::ReadOnly) &&
                       mInputBuf.open(QIODevice::WriteOnly);

    Q_ASSERT(valid);
    Q_UNUSED(valid); // avoids an unused-variable warning when Q_ASSERT compiles out

    // Append each decoded chunk to mData through the write-only buffer.
    connect(&mDecoder, &QAudioDecoder::bufferReady, this, [this]() {
        const QAudioBuffer buffer = mDecoder.read();
        mInputBuf.write(buffer.data<char>(), buffer.byteCount());
    });

    // Start decoding only after the handler is connected.
    mDecoder.setAudioFormat(mFormat);
    mDecoder.setSourceDevice(&mSourceFile);
    mDecoder.start();
}

Once again, our QIODevice subclass is read-only, so our writeData method looks like this:

qint64 DecodeStream::writeData([[maybe_unused]] const char *data,
                               [[maybe_unused]] qint64 maxSize)
{
    Q_ASSERT_X(false, "writeData", "not implemented");
    return 0;
}

Which leaves us with the last part of the implementation, DecodeStream's readData function.

We zero out the char * with memset to avoid any noise if there are areas that are not overwritten. Then we simply read from the QByteArray into the char * if mState is Mixer::Playing.

We check whether we have finished reading the buffered data with QBuffer::atEnd(), and if we have, we decrement the number of loops remaining. If it’s now zero, that was the last (or only) loop, so we set mState to Mixer::Stopped and emit finished(). Either way, we seek back to position 0, so if there are loops left, reading starts from the beginning again.

qint64 DecodeStream::readData(char *data, qint64 maxSize)
{
    memset(data, 0, maxSize);

    if (mState == Mixer::Playing)
    {
        mOutputBuf.read(data, maxSize);
        if (mOutputBuf.size() && mOutputBuf.atEnd())
        {
            if (--mLoopsLeft == 0)
            {
                mState = Mixer::Stopped;
                emit finished();
            }

            mOutputBuf.seek(0);
        }
    }

    return maxSize;
}
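The loop counting is easy to get wrong by one, so here is a small Qt-free sketch of the same control flow that can be stepped through in isolation. LoopingReader is a hypothetical stand-in for DecodeStream: it assumes the whole clip is already in memory and uses a bool instead of the Mixer::State enum and the finished() signal.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for DecodeStream's read logic over an in-memory clip.
class LoopingReader
{
public:
    LoopingReader(std::vector<char> data, int loops)
        : mData(std::move(data))
        , mLoopsLeft(loops)
    {
    }

    // Always reports maxLen bytes, padding with zeros (silence) where no data is copied.
    std::size_t read(char *dst, std::size_t maxLen)
    {
        std::fill_n(dst, maxLen, char{0});

        if (!mFinished) {
            const std::size_t count = std::min(maxLen, mData.size() - mPos);
            std::copy_n(mData.data() + mPos, count, dst);
            mPos += count;

            // Reached the end of the clip: use up one loop and rewind.
            if (!mData.empty() && mPos == mData.size()) {
                if (--mLoopsLeft == 0)
                    mFinished = true; // plays the role of Mixer::Stopped plus finished()
                mPos = 0;
            }
        }

        return maxLen;
    }

    bool finished() const { return mFinished; }

private:
    std::vector<char> mData;
    std::size_t mPos = 0;
    int mLoopsLeft;
    bool mFinished = false;
};
```

With a loop count of 2, the clip can be read through twice; every read after that yields only zeros.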

Now that we’ve implemented DecodeStream, we can actually use MixerStream to play two audio files at the same time!

Using `MixerStream`

Here’s an example snippet that shows how MixerStream can be used to route two simultaneous audio streams into one system mixer channel:

const auto &device = QAudioDeviceInfo::defaultOutputDevice();
const auto &format = device.preferredFormat();

auto mixerStream = std::make_unique<MixerStream>(format);

auto *audioOutput = new QAudioOutput(device, format);
audioOutput->setVolume(0.5);
audioOutput->start(mixerStream.get());

mixerStream->openStream(QStringLiteral("/path/to/some/sound.wav"));
mixerStream->openStream(QStringLiteral("/path/to/something/else.mp3"), 3);

Final Remarks

The code in this series of posts is largely a reimplementation of Lova Widmark’s project QtMixer. Huge thanks to her for a great and lightweight implementation. Check the project out if you want to use something like this for a GPL-compliant project (and don’t mind that it uses qmake).

The post Implementing an Audio Mixer, Part 2 appeared first on KDAB.
