Implementing an Audio Mixer, Part 2
Recap
In Part 1, we covered PCM audio and superimposing waveforms, and developed an algorithm to combine an arbitrary number of audio streams into one.
Now we need to use these ideas to finish a full implementation using Qt Multimedia.
Using Qt Multimedia for Audio Device Access
So what do we need? Well, we want to use a single QAudioOutput, which we pass an audio device and a supported audio format.
We can get those like this:
const QAudioDeviceInfo &device = QAudioDeviceInfo::defaultOutputDevice();
const QAudioFormat &format = device.preferredFormat();
Let’s construct our QAudioOutput object using the device and format:
static QAudioOutput audioOutput(device, format);
Now, to use it to write data, we have to call start on the audio output, passing in a QIODevice *.
Normally we would use the QIODevice subclass QBuffer for a single audio buffer. But in this case, we want our own subclass of QIODevice, so we can combine multiple buffers into one IO device.
We’ll call our subclass MixerStream. This is where we will do our bufferCombine, and keep our member list of streams mStreams.
We will also need another stream type for mStreams. For now let’s just call it DecodeStream, forward declare it, and worry about its implementation later.
One thing that’s good to know at this point is that DecodeStream objects will get the data buffers we need by decoding audio data from a file. Because of this, we’ll need to keep our audio format from above as a data member, mFormat. Then we can pass it to decoders when they need it.
Implementing MixerStream
Since we are subclassing QIODevice, we need to provide reimplementations for these two protected virtual functions:
virtual qint64 QIODevice::readData(char *data, qint64 maxSize);
virtual qint64 QIODevice::writeData(const char *data, qint64 maxSize);
We also want to provide a way to open new streams that we’ll add to mStreams, given a filename. We’ll call this function openStream. We can also allow looping a stream multiple times, so let’s add a parameter for that and give it a default value of 1.
Additionally, we’ll need a user-defined destructor to delete any pointers in the list that might remain if the MixerStream is abruptly destructed.
// mixerstream.h
#pragma once

#include <QAudioFormat>
#include <QAudioOutput>
#include <QIODevice>
#include <QList>
#include <QString>

class DecodeStream;

class MixerStream : public QIODevice
{
    Q_OBJECT

public:
    explicit MixerStream(const QAudioFormat &format);
    ~MixerStream();

    // Decodes fileName and mixes it in; loops is how many times to play it.
    void openStream(const QString &fileName, int loops = 1);

protected:
    qint64 readData(char *data, qint64 maxSize) override;
    qint64 writeData(const char *data, qint64 maxSize) override;

private:
    QAudioFormat mFormat;           // passed on to each decoder
    QList<DecodeStream *> mStreams; // streams currently being mixed
};
Notice that combineSamples isn’t in the header. It’s a pretty basic function that doesn’t require any members, so we can just implement it as a free function.
Let’s put it in a header mixer.h and wrap it in a namespace:
// mixer.h
#pragma once

#include <QtGlobal>
#include <limits>

namespace Mixer
{
// Adds two samples in 32 bits, then clamps the result to the 16-bit range.
inline qint16 combineSamples(qint32 samp1, qint32 samp2)
{
    const auto sum = samp1 + samp2;

    if (std::numeric_limits<qint16>::max() < sum)
        return std::numeric_limits<qint16>::max();
    if (std::numeric_limits<qint16>::min() > sum)
        return std::numeric_limits<qint16>::min();

    return static_cast<qint16>(sum);
}
} // namespace Mixer
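If you want to experiment with the clamping outside of Qt, here is a minimal sketch of the same idea using only the standard library. The name combineSamplesStd is made up for this example, and it assumes a C++17 compiler for std::clamp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical standard-library counterpart of Mixer::combineSamples.
// The sum of two 16-bit samples always fits in 32 bits, so we add first
// and clamp to the signed 16-bit range afterwards.
inline std::int16_t combineSamplesStd(std::int32_t samp1, std::int32_t samp2)
{
    const std::int32_t sum = samp1 + samp2;
    const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
    return static_cast<std::int16_t>(std::clamp(sum, lo, hi));
}
```

Two loud samples that would overflow are pinned to the largest representable value instead of wrapping around, which would otherwise be heard as a harsh click.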
There are some very basic things we can get out of the way quickly in the MixerStream cpp file. Recall that we must implement these member functions:
explicit MixerStream(const QAudioFormat &format);
~MixerStream();
void openStream(const QString &fileName, int loops = 1);
qint64 readData(char *data, qint64 maxSize) override;
qint64 writeData(const char *data, qint64 maxSize) override;
The constructor is very simple:
MixerStream::MixerStream(const QAudioFormat &format)
    : mFormat(format)
{
    setOpenMode(QIODevice::ReadOnly);
}
Here we use setOpenMode to automatically open our device in read-only mode, so we don’t have to call open() directly from outside the class.
Also, since it’s going to be read-only, our reimplementation of QIODevice::writeData will do nothing:
qint64 MixerStream::writeData([[maybe_unused]] const char *data,
                              [[maybe_unused]] qint64 maxSize)
{
    Q_ASSERT_X(false, "writeData", "not implemented");
    return 0;
}
The custom destructor we need is also quite simple:
MixerStream::~MixerStream()
{
    while (!mStreams.empty())
        delete mStreams.takeLast();
}
readData will be almost exactly the same as the implementation we did earlier, but returning qint64. The return value is meant to be the number of bytes written into data, which in our case is just the maxSize argument given to it, as we write fixed-size buffers.
Additionally, we should call qAsConst (or std::as_const) on mStreams in the range-for to avoid detaching the Qt container. For more on qAsConst and range-based for loops, see Jesper Pederson’s blog post on the topic.
qint64 MixerStream::readData(char *data, qint64 maxSize)
{
    // Start from silence so streams that have run dry contribute nothing.
    memset(data, 0, maxSize);

    constexpr qint64 sampleSize = sizeof(qint16); // bytes per 16-bit sample
    // qint64 rather than qint16: maxSize / 2 can easily exceed 32767.
    const qint64 numSamples = maxSize / sampleSize;

    for (auto *stream : qAsConst(mStreams))
    {
        auto *cursor = reinterpret_cast<qint16 *>(data);
        qint16 sample;

        for (qint64 i = 0; i < numSamples; ++i, ++cursor)
            if (stream->read(reinterpret_cast<char *>(&sample), sampleSize) == sampleSize)
                *cursor = Mixer::combineSamples(sample, *cursor);
    }

    return maxSize;
}
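To see the mixing loop in isolation, here is a rough sketch that uses plain vectors instead of QIODevice streams. Samples and mixSources are names invented for this example, and it assumes C++17. Every source is added onto a zeroed output, and a source that is shorter than the output simply contributes silence past its end.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Samples = std::vector<std::int16_t>; // hypothetical alias for this sketch

// Mixes every source into a zero-initialized output of numSamples samples.
inline Samples mixSources(const std::vector<Samples> &sources, std::size_t numSamples)
{
    Samples out(numSamples, 0); // plays the role of the memset in readData

    for (const Samples &source : sources) {
        const std::size_t count = std::min(numSamples, source.size());
        for (std::size_t i = 0; i < count; ++i) {
            // Add in 32 bits, then clamp back into the 16-bit range.
            const std::int32_t sum = std::int32_t{out[i]} + std::int32_t{source[i]};
            out[i] = static_cast<std::int16_t>(
                std::clamp<std::int32_t>(sum, INT16_MIN, INT16_MAX));
        }
    }
    return out;
}
```

Mixing {100, 200, 300} with {1000, 32767} into four samples gives {1100, 32767, 300, 0}: the second sample is clamped and the last one stays silent.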
That only leaves us with openStream. This one will require us to discuss DecodeStream and its interface.
The function should construct a new DecodeStream on the heap, which will need a file name and format. DecodeStream, as implied by its name, needs to decode audio files to PCM data. We’ll use a QAudioDecoder within DecodeStream to accomplish this, and for that, we need to pass mFormat to the constructor. We also need to pass loops to the constructor, as each stream can have a different number of loops.
Now our constructor call will look like this:
DecodeStream(fileName, mFormat, loops);
We can then use operator<< to add it to mStreams.
Finally, we need to remove it from the list when it’s done. We’ll give it a Qt signal, finished, and connect it to a lambda expression that will remove the stream from the list and delete the pointer.
Our completed openStream function now looks like this:
void MixerStream::openStream(const QString &fileName, int loops)
{
    auto *decodeStream = new DecodeStream(fileName, mFormat, loops);
    mStreams << decodeStream;

    connect(decodeStream, &DecodeStream::finished, this, [this, decodeStream]() {
        mStreams.removeAll(decodeStream);
        decodeStream->deleteLater();
    });
}
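The ownership rule here is that the mixer owns its streams: a stream is destroyed once it has finished, and whatever is left is destroyed together with the mixer. As a rough, Qt-free illustration of that rule, the sketch below swaps the raw pointers and deleteLater for std::unique_ptr. Stream and StreamList are hypothetical stand-ins, not part of the article’s code, and the removal is triggered from outside rather than from a signal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Stream { int id; }; // hypothetical stand-in for DecodeStream

class StreamList
{
public:
    // Creates a stream, takes ownership of it, and returns a non-owning handle.
    Stream *open(int id)
    {
        mStreams.push_back(std::make_unique<Stream>(Stream{id}));
        return mStreams.back().get();
    }

    // Drops a finished stream; erasing the unique_ptr destroys the object.
    void remove(const Stream *finished)
    {
        mStreams.erase(std::remove_if(mStreams.begin(), mStreams.end(),
                                      [finished](const std::unique_ptr<Stream> &s) {
                                          return s.get() == finished;
                                      }),
                       mStreams.end());
    }

    std::size_t size() const { return mStreams.size(); }

private:
    // Anything still in the list is destroyed automatically with the list.
    std::vector<std::unique_ptr<Stream>> mStreams;
};
```

In the Qt version, deleteLater is the safer choice because the removal runs inside a slot connected to the stream’s own signal, so the object should not be destroyed until control returns to the event loop.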
Recall from earlier that we call read on a stream, which takes a char * to which the read data will be copied and a qint64 representing the size of the data.
This is a QIODevice function, which will internally call readData. Thus, DecodeStream also needs to be a QIODevice.
Getting PCM Data for DecodeStream
In DecodeStream, we need readData to spit out PCM data, so we need to decode our audio file to get its contents in PCM format. In Qt Multimedia, we use a QAudioDecoder for this. We pass it an audio format to decode to, and a source device, in this case a QFile file handle for our audio file.
When a QAudioDecoder’s start method is called, it will begin decoding the source file in a non-blocking manner, emitting a signal bufferReady when a full buffer of decoded PCM data is available.
On that signal, we can call the decoder’s read method, which gives us a QAudioBuffer. To store the decoded data in a data member of DecodeStream, we use a QByteArray, which we can interact with using QBuffer objects to get a QIODevice interface for reading and writing. This is the ideal way to work with buffers of bytes to read or write in Qt.
We’ll make two QBuffer objects: one for writing data to the QByteArray (we’ll call it mInputBuf), and one for reading from the QByteArray (we’ll call it mOutputBuf). The reason for using two buffers rather than one read/write buffer is so the read and write positions can be independent. A single QBuffer has only one position, so every write would move the place we read from next, and we would encounter more stuttering.
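The idea of one byte array with two independent positions can be sketched without Qt. ByteStore is a made-up class for this illustration: writes always append at the end, while reads advance their own cursor, so the decoder can keep adding data without disturbing the point playback has reached.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a QByteArray shared by two QBuffer objects.
class ByteStore
{
public:
    // Write side: always appends, independent of the read cursor.
    void write(const char *src, std::size_t size)
    {
        mData.insert(mData.end(), src, src + size);
    }

    // Read side: copies up to maxSize unread bytes and advances its own cursor.
    std::size_t read(char *dst, std::size_t maxSize)
    {
        const std::size_t count = std::min(maxSize, mData.size() - mReadPos);
        if (count > 0)
            std::memcpy(dst, mData.data() + mReadPos, count);
        mReadPos += count;
        return count;
    }

private:
    std::vector<char> mData;
    std::size_t mReadPos = 0;
};
```

If "abcd" is written, two bytes are read, and "ef" is written afterwards, the next read continues with "cdef". The write in the middle did not move the read position.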
So when we get the bufferReady signal, we’ll want to do something like this:
const QAudioBuffer buffer = mDecoder.read();
mInputBuf.write(buffer.data<char>(), buffer.byteCount());
We’ll also need to have some sort of state enum. The reason for this is that when we are finished with the stream and emit finished(), we remove and delete the stream from a connected lambda expression, but read might still be called before that has completed. Thus, we want to only read from the buffer when the state is Playing.
Let’s update mixer.h to put the enum in namespace Mixer:
// mixer.h
#pragma once

#include <QtGlobal>
#include <limits>

namespace Mixer
{
enum State
{
    Playing,
    Stopped
};

// Adds two samples in 32 bits, then clamps the result to the 16-bit range.
inline qint16 combineSamples(qint32 samp1, qint32 samp2)
{
    const auto sum = samp1 + samp2;

    if (std::numeric_limits<qint16>::max() < sum)
        return std::numeric_limits<qint16>::max();
    if (std::numeric_limits<qint16>::min() > sum)
        return std::numeric_limits<qint16>::min();

    return static_cast<qint16>(sum);
}
} // namespace Mixer
Implementing DecodeStream
Now that we understand all the data members we need to use, let’s see what our header for DecodeStream looks like:
// decodestream.h
#pragma once

#include "mixer.h"

#include <QAudioDecoder>
#include <QBuffer>
#include <QFile>

class DecodeStream : public QIODevice
{
    Q_OBJECT

public:
    explicit DecodeStream(const QString &fileName, const QAudioFormat &format, int loops);

protected:
    qint64 readData(char *data, qint64 maxSize) override;
    qint64 writeData(const char *data, qint64 maxSize) override;

signals:
    void finished();

private:
    QFile mSourceFile;
    QByteArray mData;
    QBuffer mInputBuf;
    QBuffer mOutputBuf;
    QAudioDecoder mDecoder;
    QAudioFormat mFormat;
    Mixer::State mState;
    int mLoopsLeft;
};
In the constructor, we’ll initialize our private members, open the DecodeStream in read-only mode (like we did earlier), make sure we open the QFile and both QBuffer objects successfully, and finally set up our QAudioDecoder.
DecodeStream::DecodeStream(const QString &fileName, const QAudioFormat &format, int loops)
    : mSourceFile(fileName)
    , mInputBuf(&mData)
    , mOutputBuf(&mData)
    , mFormat(format)
    , mState(Mixer::Playing)
    , mLoopsLeft(loops)
{
    setOpenMode(QIODevice::ReadOnly);

    const bool valid = mSourceFile.open(QIODevice::ReadOnly) &&
                       mOutputBuf.open(QIODevice::ReadOnly) &&
                       mInputBuf.open(QIODevice::WriteOnly);
    Q_ASSERT(valid);

    mDecoder.setAudioFormat(mFormat);
    mDecoder.setSourceDevice(&mSourceFile);

    // Connect before starting so the handler is in place for the first buffer.
    connect(&mDecoder, &QAudioDecoder::bufferReady, this, [this]() {
        const QAudioBuffer buffer = mDecoder.read();
        mInputBuf.write(buffer.data<char>(), buffer.byteCount());
    });

    mDecoder.start();
}
Once again, our QIODevice subclass is read-only, so our writeData method looks like this:
qint64 DecodeStream::writeData([[maybe_unused]] const char *data,
                               [[maybe_unused]] qint64 maxSize)
{
    Q_ASSERT_X(false, "writeData", "not implemented");
    return 0;
}
Which leaves us with the last part of the implementation, DecodeStream’s readData function.
We zero out the char * with memset to avoid any noise if there are areas that are not overwritten. Then we simply read from the QByteArray into the char * if mState is Mixer::Playing.
We check to see if we finished reading the file with QBuffer::atEnd(), and if we have, we decrement the loops remaining. If it’s zero now, that was the last (or only) loop, so we set mState to Mixer::Stopped and emit finished(). Either way we seek back to position 0. Now if there are loops left, it starts reading from the beginning again.
qint64 DecodeStream::readData(char *data, qint64 maxSize)
{
    memset(data, 0, maxSize);

    if (mState == Mixer::Playing)
    {
        mOutputBuf.read(data, maxSize);
        if (mOutputBuf.size() && mOutputBuf.atEnd())
        {
            if (--mLoopsLeft == 0)
            {
                mState = Mixer::Stopped;
                emit finished();
            }

            mOutputBuf.seek(0);
        }
    }

    return maxSize;
}
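The loop bookkeeping is easy to get wrong, so here is a small Qt-free sketch of the same steps: zero the destination, copy what is available, and on reaching the end decrement the loop counter and rewind. LoopingSource is a name invented for this example. Unlike readData, it returns whether the source is still playing rather than a byte count, purely to make the state visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for DecodeStream with fully decoded data.
class LoopingSource
{
public:
    LoopingSource(std::vector<char> data, int loops)
        : mData(std::move(data)), mLoopsLeft(loops) {}

    // Zero-fills dst, copies the bytes that are available, and returns
    // true while the source is still playing after this read.
    bool read(char *dst, std::size_t maxSize)
    {
        std::fill(dst, dst + maxSize, 0);
        if (!mPlaying)
            return false;

        const std::size_t count = std::min(maxSize, mData.size() - mPos);
        std::copy_n(mData.data() + mPos, count, dst);
        mPos += count;

        if (!mData.empty() && mPos == mData.size()) {
            if (--mLoopsLeft == 0)
                mPlaying = false; // last loop done; the article emits finished() here
            mPos = 0;             // rewind either way
        }
        return mPlaying;
    }

private:
    std::vector<char> mData;
    std::size_t mPos = 0;
    int mLoopsLeft;
    bool mPlaying = true;
};
```

Note that a read which reaches the end does not wrap around within the same call. The rest of the buffer stays zeroed, and the next call starts again from the beginning if any loops are left.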
Now that we’ve implemented DecodeStream, we can actually use MixerStream to play two audio files at the same time!
Using MixerStream
Here’s an example snippet that shows how MixerStream can be used to route two simultaneous audio streams into one system mixer channel:
const auto &device = QAudioDeviceInfo::defaultOutputDevice();
const auto &format = device.preferredFormat();
auto mixerStream = std::make_unique<MixerStream>(format);
auto *audioOutput = new QAudioOutput(device, format);
audioOutput->setVolume(0.5);
audioOutput->start(mixerStream.get());
mixerStream->openStream(QStringLiteral("/path/to/some/sound.wav"));
mixerStream->openStream(QStringLiteral("/path/to/something/else.mp3"), 3);
Final Remarks
The code in this series of posts is largely a reimplementation of Lova Widmark’s project QtMixer. Huge thanks to her for a great and lightweight implementation. Check the project out if you want to use something like this for a GPL-compliant project (and don’t mind that it uses qmake).
The post Implementing an Audio Mixer, Part 2 appeared first on KDAB.