Countering string bloat

Saturday, 9 September 2023 | Volker Krause

A lot of our code deals with text strings in one way or another, and often not even with particularly complex ones. That seems simple, and yet Qt offers a myriad of different APIs for it. Here are two concrete examples of the impact of picking the right string types and using them correctly.

JSON and QLatin1String

Qt’s JSON API provides overloads for QString, QStringView and QLatin1String. The last of these is usually said to be the most efficient option, but it’s also the most restrictive one, basically limiting you to 7-bit ASCII string literals. For JSON keys that is often sufficient though.

For a real-world example let’s look at a pending MR for KWeatherCore, which changes much of its JSON code from using QStringLiteral to QLatin1String for JSON key constants.

In a size-optimized and stripped release build this reduces the library size by about 7%. This comes in part from a smaller read-only data section, due to string literals now using 8 bits instead of 16 bits per character, but also from a smaller text section (i.e. code), due to the simpler inline construction/destruction code of QLatin1String.

    FILE SIZE        VM SIZE
 --------------  --------------
  -0.2%     -12  -0.3%     -12    .hash
  -0.7%     -24  -0.7%     -24    .got.plt
  -0.7%     -48  -0.7%     -48    .plt
  -0.2%     -62  -0.2%     -62    .dynstr
  -0.4%     -72  -0.4%     -72    .dynsym
  -0.7%     -72  -0.7%     -72    .rela.plt
  -3.2%    -136  -3.2%    -136    .eh_frame_hdr
  -2.6%    -496  -2.6%    -496    .eh_frame
  -3.3% -2.11Ki  -3.4% -2.11Ki    .text
 -29.7% -8.12Ki -29.8% -8.12Ki    .rodata
  -7.1% -16.0Ki  -5.1% -11.1Ki    TOTAL
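
The .rodata part of that saving follows directly from the character width. As a minimal sketch in plain C++ without Qt (the key name is made up for illustration, and the two arrays merely stand in for the literal data behind QLatin1String and QStringLiteral), the same literal can be compared as 8-bit and 16-bit data:

```cpp
#include <cstddef>

// Hypothetical JSON key, stored once with 8-bit and once with 16-bit characters.
// These arrays only model the literal data, not the Qt string classes themselves.
constexpr char key8[] = "temperature";
constexpr char16_t key16[] = u"temperature";

// 11 characters plus the terminating null character in both cases.
constexpr std::size_t bytes8 = sizeof(key8);
constexpr std::size_t bytes16 = sizeof(key16);

// The 16-bit variant needs sizeof(char16_t) bytes per character,
// which is 2 on common platforms, so twice the read-only data.
static_assert(bytes8 == 12, "8-bit literal: one byte per character");
static_assert(bytes16 == 12 * sizeof(char16_t), "16-bit literal: one char16_t per character");
```

The text section savings come on top of this and are not visible in such a comparison.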

The runtime impact is harder to measure, as it drowns in the noise here. Less code to run and less data to load certainly doesn’t hurt though.

QLatin1String constants

As QLatin1String has constexpr constructors, it lends itself to centrally defined string constants, e.g. for repeatedly used JSON keys. Quotient makes use of this, for example.

constexpr QLatin1String my_magic_constant("42");

This works, but a look at the generated binary shows static construction code being emitted in every translation unit that includes this definition (independent of use even!). The subtle issue here is the missing inline keyword:

constexpr inline QLatin1String my_magic_constant("42");

Fixing that in Quotient (PR) decreased the library size by about 3%, the majority of that being contributed by just 10 constants defined in a frequently used header. The savings in this case mainly come from code, and more importantly from code run as part of static construction when the library is loaded, i.e. typically on startup.
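
The language rule behind this: a constexpr variable at namespace scope is implicitly const and therefore has internal linkage, so each translation unit including the header gets its own copy, while a C++17 inline variable is a single entity shared by all of them. A minimal sketch of such a header, using std::string_view as a stand-in for QLatin1String so that it compiles without Qt (the header name, namespace and key names are hypothetical):

```cpp
// jsonkeys.h (hypothetical header, assumed to be included from many .cpp files)
#ifndef JSONKEYS_H
#define JSONKEYS_H

#include <string_view>

namespace JsonKeys {
// One shared definition across all translation units (requires C++17).
inline constexpr std::string_view Temperature{"temperature"};
inline constexpr std::string_view Humidity{"humidity"};

// For comparison: without `inline` this has internal linkage, so every
// translation unit including this header gets its own separate copy.
constexpr std::string_view HumidityPerTu{"humidity"};
}

// Both variants remain usable in constant expressions.
static_assert(JsonKeys::Temperature.size() == 11, "length of \"temperature\"");
static_assert(JsonKeys::Humidity == JsonKeys::HumidityPerTu, "same content either way");

#endif // JSONKEYS_H
```

Call sites look the same for both variants, so adding the keyword is a change to the header only.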

Does this matter?

We are talking about quite small improvements here, in the range of a few percent only. But those are improvements that are basically free (limited effort, limited risk of breakage, code complexity/readability doesn’t change, etc), and they are very widely applicable.

And while a few percent smaller applications might hardly be felt by the user, they easily multiply to big numbers when looking at this from the infrastructure perspective.

So it’s worth paying attention to those seemingly minor details.