Improvements to Mozilla’s Searchfox Code Browser

Tuesday, 17 December 2024 | KDAB on Qt

Mozilla is the maker of the famous Firefox web browser and the birthplace of the likes of Rust and Servo (read more about Embedding the Servo Web Engine in Qt).

Firefox is a huge, multi-platform, multi-language project that already had 21 million lines of code back in 2020, according to Mozilla’s own blog post. Navigating a project of that size is always a challenge, especially at the cross-language boundaries and in platform-specific code.

To make the Firefox code base easier to work with, Mozilla hosts an online code browser tailored for Firefox called Searchfox. Searchfox analyzes C++, JavaScript, various IDLs (interface definition languages), and Rust source code and makes them all browsable from a single interface with full-text search, semantic search, code navigation, test coverage reports, and git blame support. It combines a number of projects working together, both internal to Mozilla (like their Clang plugin for C++ analysis) and external (such as rust-analyzer maintained by Ferrous Systems).

It takes a whole repository in and separately indexes C++, Rust, JavaScript and now Java and Kotlin source code. All those analyses are then merged together across platforms, before running a cross-reference step and building the final index used by the web front-end available at searchfox.org.
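The merge step can be pictured with a small sketch. It is not Searchfox's actual data model: the `Occurrence` record, the `Key` alias and the `mergeAcrossPlatforms` function are hypothetical names chosen for illustration. The idea is only that an occurrence seen by several per-platform analyses collapses into one entry that remembers every platform it came from.

```cpp
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record: one symbol occurrence reported by one per-platform analysis.
struct Occurrence {
    std::string file;
    int line;
    std::string symbol;
    std::string platform;
};

// File, line and symbol identify an occurrence regardless of platform.
using Key = std::tuple<std::string, int, std::string>;

// Collapse identical occurrences into a single entry and keep the set of
// platforms on which each one was seen.
std::map<Key, std::set<std::string>> mergeAcrossPlatforms(const std::vector<Occurrence>& all)
{
    std::map<Key, std::set<std::string>> merged;
    for (const Occurrence& occ : all) {
        merged[Key{occ.file, occ.line, occ.symbol}].insert(occ.platform);
    }
    return merged;
}
```

Feeding it the same `a.cpp` occurrence from a Linux and an Android analysis would yield one entry tagged with both platforms, which is the kind of information a later cross-reference step could build on.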

Mozilla asked KDAB to help add Java and Kotlin support to Searchfox in anticipation of the merge of Firefox for Android into the main mozilla-central repository, and to enhance its C++ support with macro expansions. Let’s dive into the details of those tasks.

Java/Kotlin Support

Mozilla merged the Firefox for Android source code into the main mozilla-central repository that Searchfox indexes. To add support for that new Java and Kotlin code to Searchfox, we reused open-source tooling built by Sourcegraph around the SemanticDB and SCIP code indexing formats. (Many thanks to them!)

Sourcegraph’s semanticdb-javac and semanticdb-kotlinc compiler plugins are integrated into Firefox’s CI system to export SemanticDB artifacts. The Searchfox indexer fetches those SemanticDB files and turns them into a SCIP index, using scip-semanticdb. That SCIP index is then consumed by the existing Searchfox-internal scip-indexer tool.

In the process, a couple of upstream contributions were made to rust-analyzer (which also emits SCIP data) and scip-semanticdb.

If you want to dive into more details, see the feature request on Bugzilla, the implementation and further discussion on GitHub, and the release announcement on the Mozilla dev-platform mailing list.

Java/C++ Cross-language Support

GeckoView is an Android wrapper around Gecko, the Firefox web engine. It extensively uses cross-language calls between Java and C++.

Searchfox already had support for cross-language interfaces, thanks to its IDL support. We built on top of that to support direct cross-language calls between Java and C++.

First, we identified the different ways the C++ and Java code interact and call each other. There are three ways Java methods marked with the native keyword call into C++:

  • Case A1: By default, the JVM will search for a matching C function to call based on its name. For instance, calling org.mozilla.gecko.mozglue.GeckoLoader.nativeRun from Java will call Java_org_mozilla_gecko_mozglue_GeckoLoader_nativeRun on the C++ side.
  • Case A2: This behavior can be overridden at runtime by calling the JNIEnv::RegisterNatives function on the C++ side to point at another function.
  • Case A3: GeckoView has a code generator that looks for Java items decorated with the @WrapForJNI and native annotations and generates a C++ class template meant to be used through the Curiously Recurring Template Pattern. This template provides an Init static member function that does the right JNIEnv::RegisterNatives calls to bind the Java methods to the implementing C++ class’s member functions.
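Case A3 can be sketched as follows. This is not GeckoView's generated code: `FakeNativeRegistry` is a stand-in for the JVM so that the example compiles without `jni.h`, and `SampleNatives`, `SampleImpl` and `sample.Jni.nativeDouble` are hypothetical names. For brevity the sketch binds a static member function, whereas real bindings can also target non-static ones.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Stand-in for the JVM's table of registered native methods. Real code would
// go through JNIEnv::RegisterNatives; this mock keeps the sketch self-contained.
struct FakeNativeRegistry {
    std::map<std::string, std::function<int(int)>> methods;

    void registerNative(const std::string& javaName, std::function<int(int)> fn)
    {
        methods[javaName] = std::move(fn);
    }
};

// What a generated binding template could look like (hypothetical).
// Impl is the hand-written class that provides the real native code.
template <class Impl>
struct SampleNatives {
    // Binds each Java native method to the matching member of Impl.
    static void Init(FakeNativeRegistry& registry)
    {
        registry.registerNative("sample.Jni.nativeDouble", &Impl::NativeDouble);
    }
};

// Hand-written implementation, plugged into the template through CRTP.
struct SampleImpl : SampleNatives<SampleImpl> {
    static int NativeDouble(int value) { return value * 2; }
};
```

Calling `SampleImpl::Init(registry)` once at startup registers the binding. The point for code navigation is that nothing in the name `NativeDouble` ties it to the Java method, which is why the binding has to be recovered from the C++ side.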

We also identified two ways the C++ code calls Java methods:

  • Case B1: directly with JNIEnv::Call… functions.
  • Case B2: GeckoView’s code generator also looks for Java methods marked with @WrapForJNI (without the native keyword this time) and generates a C++ wrapper class and member functions with the right JNIEnv::Call… calls.

Only the C++ side has the complete view of the bindings, so that’s where we decided to extract the information from, by extending Mozilla’s existing Clang plugin.

First, we defined two custom C++ annotations, bound_as and binding_to, which the Clang plugin transforms into the right format for the cross-reference analysis. This means we can manually set the binding information:

class __attribute__((annotate("binding_to", "jvm", "class", "S_jvm_sample/Jni#"))) CallingJavaFromCpp
{
    __attribute__((annotate("binding_to", "jvm", "method", "S_jvm_sample/Jni#javaStaticMethod().")))
    static void javaStaticMethod()
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "method", "S_jvm_sample/Jni#javaMethod().")))
    void javaMethod()
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "getter", "S_jvm_sample/Jni#javaField.")))
    int javaField()
    {
        // Wrapper code
        return 0;
    }

    __attribute__((annotate("binding_to", "jvm", "setter", "S_jvm_sample/Jni#javaField.")))
    void javaField(int)
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "const", "S_jvm_sample/Jni#javaConst.")))
    static constexpr int javaConst = 5;
};

class __attribute__((annotate("bound_as", "jvm", "class", "S_jvm_sample/Jni#"))) CallingCppFromJava
{
    __attribute__((annotate("bound_as", "jvm", "method", "S_jvm_sample/Jni#nativeStaticMethod().")))
    static void nativeStaticMethod()
    {
        // Real code
    }

    __attribute__((annotate("bound_as", "jvm", "method", "S_jvm_sample/Jni#nativeMethod().")))
    void nativeMethod()
    {
        // Real code
    }
};

(This example is, in fact, extracted from our test suite, jni.cpp vs Jni.java.)

Then, we wrote heuristics that try to identify cases A1 (C functions named Java_…), A3 and B2 (C++ code generated from @WrapForJNI decorators) and automatically generate these annotations. Cases A2 and B1 (manually calling JNIEnv::RegisterNatives or JNIEnv::Call… functions) are rare in the Firefox code base and impossible to recognize reliably, so we decided not to cover them for now. Developers who wish to declare such bindings can annotate them manually.
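To give an idea of what a case A1 heuristic involves, here is a minimal sketch that recovers the Java class and method from a Java_… function name. It is not the Clang plugin's implementation: `JavaBinding`, `parseJniFunctionName` and `toJvmMethodSymbol` are hypothetical names, only the `_1` escape of the JNI name-mangling rules is decoded (names using `_0xxxx`, `_2` or `_3` are rejected), and the symbol format is inferred from the annotation sample above rather than taken from a specification.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

struct JavaBinding {
    std::string className;   // e.g. "org.mozilla.gecko.mozglue.GeckoLoader"
    std::string methodName;  // e.g. "nativeRun"
};

// Decode a JNI-style C function name into its Java class and method.
// Returns std::nullopt when the name does not follow the supported subset.
std::optional<JavaBinding> parseJniFunctionName(const std::string& name)
{
    const std::string prefix = "Java_";
    if (name.compare(0, prefix.size(), prefix) != 0) {
        return std::nullopt;
    }

    std::vector<std::string> parts(1);
    for (std::size_t i = prefix.size(); i < name.size(); ++i) {
        if (name[i] != '_') {
            parts.back() += name[i];
            continue;
        }
        const char next = (i + 1 < name.size()) ? name[i + 1] : '\0';
        if (next == '1') {
            parts.back() += '_';  // "_1" encodes a literal underscore
            ++i;
        } else if (next == '_' && i + 2 < name.size() && name[i + 2] == '1') {
            parts.emplace_back();  // separator followed by an escaped underscore
        } else if (next == '_' || next == '\0') {
            break;  // "__" introduces an overload signature, ignored here
        } else if (next == '0' || next == '2' || next == '3') {
            return std::nullopt;  // escapes this sketch does not decode
        } else {
            parts.emplace_back();  // a plain "_" separates name components
        }
    }

    if (parts.size() < 2) {
        return std::nullopt;
    }
    for (const std::string& part : parts) {
        if (part.empty()) {
            return std::nullopt;
        }
    }

    JavaBinding binding;
    binding.methodName = parts.back();
    for (std::size_t i = 0; i + 1 < parts.size(); ++i) {
        if (i != 0) {
            binding.className += '.';
        }
        binding.className += parts[i];
    }
    return binding;
}

// Build a symbol shaped like the ones in the annotation sample, for instance
// "S_jvm_sample/Jni#nativeMethod()." (format assumed from that sample).
std::string toJvmMethodSymbol(const JavaBinding& binding)
{
    std::string path = binding.className;
    std::replace(path.begin(), path.end(), '.', '/');
    return "S_jvm_" + path + "#" + binding.methodName + "().";
}
```

With the function name from case A1, `parseJniFunctionName` yields the class `org.mozilla.gecko.mozglue.GeckoLoader` and the method `nativeRun`. A real heuristic would also need to handle overloads and nested classes, which this sketch leaves out.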

After this point, we used Searchfox’s existing analysis JSON format and mostly re-used what was already available from IDL support. When triggering the context menu for a binding wrapper or bound function, the definitions in both languages are made available, with “Go to” actions that jump over the generally irrelevant binding internals.

The search results also display both sides of the bridge.

If you want to dive into more details, see the feature request and detailed problem analysis on Bugzilla, the implementation and further discussion on GitHub, and the release announcement on the Mozilla dev-platform mailing list.

Displaying Interactive Macro Expansions

Aside from this Java/Kotlin-related work, we also added support for displaying and interacting with macro expansions. This was inspired by KDAB’s own codebrowser.dev, but improves on it to:

  • Display all expansion variants, if they differ across platforms or by definition:

[Figure: Per-platform expansions]

[Figure: Per-definition expansions]

  • Make macros fully indexed and interactive:

[Figure: In-macro context menu]

This work mainly happened in the Mozsearch Clang plugin to extract macro expansions during the pre-processing stage and index them with the rest of the top-level code.
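As a rough illustration of why one macro can have several expansion variants, consider the sketch below. The macro and function names are hypothetical and not taken from the Firefox code base, and the second half shows only one possible reading of "by definition": the same macro name being redefined between two uses.

```cpp
// Per-platform variant: the same macro expands differently depending on the
// platform the file is compiled for.
#if defined(_WIN32)
#  define SAMPLE_PATH_SEPARATOR '\\'
#else
#  define SAMPLE_PATH_SEPARATOR '/'
#endif

char pathSeparator() { return SAMPLE_PATH_SEPARATOR; }

// Per-definition variant: the macro is redefined between two uses, so the
// same spelling expands to different code.
#define SAMPLE_SCALE(x) ((x) * 2)
int scaleWithFirstDefinition(int value) { return SAMPLE_SCALE(value); }

#undef SAMPLE_SCALE
#define SAMPLE_SCALE(x) ((x) * 3)
int scaleWithSecondDefinition(int value) { return SAMPLE_SCALE(value); }
```

A single compilation only ever sees one expansion of `SAMPLE_PATH_SEPARATOR`, so showing every variant requires combining the results of several per-platform analyses.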

Again, if you want more details, the feature request is available on Bugzilla, and the implementation and further technical discussion are on GitHub.

Summary

Searchfox draws on many technologies: compiler plugins and code analyzers written in several languages, custom tooling and scripts in Rust, Python and Bash, and a web front-end written in the usual HTML/CSS/JS. That makes it a small but complex and really interesting project to work on. KDAB successfully added Java/Kotlin code indexing, including analysis of the C++ bindings, and is starting to improve Searchfox’s C++ support itself, first with fully indexed macro expansions and next with improved template support.

About KDAB

KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.