Improvements to Mozilla’s Searchfox Code Browser
Mozilla is the maker of the famous Firefox web browser and the birthplace of the likes of Rust and Servo (read more about Embedding the Servo Web Engine in Qt).
Firefox is a huge, multi-platform, multi-language project that already counted 21 million lines of code in 2020, according to their own blog post. Navigating a project of that size is always a challenge, especially at the cross-language boundaries and in platform-specific code.
To make working with the Firefox codebase easier, Mozilla hosts an online code browser tailored for Firefox called Searchfox. Searchfox analyzes C++, JavaScript, various IDLs (interface definition languages), and Rust source code and makes them all browsable from a single interface with full-text search, semantic search, code navigation, test coverage reports, and git blame support. It’s the combination of a number of projects working together, both internal to Mozilla (like their Clang plugin for C++ analysis) and external (such as rust-analyzer maintained by Ferrous Systems).
It takes a whole repository in and separately indexes C++, Rust, JavaScript and now Java and Kotlin source code. All those analyses are then merged together across platforms, before running a cross-reference step and building the final index used by the web front-end available at searchfox.org.
Mozilla asked KDAB to help them add Java and Kotlin support to Searchfox, in anticipation of the merge of Firefox for Android into the main mozilla-central repository, and to enhance its C++ support with macro expansions. Let’s dive into the details of those tasks.
Java/Kotlin Support
Mozilla merged the Firefox for Android source code into the main mozilla-central repository that Searchfox indexes. To add support for that new Java and Kotlin code to Searchfox, we reused open-source tooling built by Sourcegraph around the SemanticDB and SCIP code indexing formats. (Many thanks to them!)
Sourcegraph’s semanticdb-javac and semanticdb-kotlinc compiler plugins are integrated into Firefox’s CI system to export SemanticDB artifacts. The Searchfox indexer fetches those SemanticDB files and turns them into a SCIP index, using scip-semanticdb. That SCIP index is then consumed by the existing Searchfox-internal scip-indexer tool.
In the process, a couple of upstream contributions were made to rust-analyzer (which also emits SCIP data) and scip-semanticdb.
A few examples of Searchfox at work:
- Searching for the Java class: org.mozilla.geckoview.GeckoSession$PromptDelegate$AutocompleteRequest shows the definition, a superclass, some uses in Java source code and some uses in Kotlin tests.
- Searching for the Java interface method: org.mozilla.geckoview.Autocomplete$StorageDelegate$onAddressFetch shows the definition, a couple of users, and a couple of implementers across Java and Kotlin code.
- Querying the callers of a method with up to 2 indirections: calls-to:'org::mozilla::geckoview::Autofill::Session::getDefaultDimensions' depth:2
If you want to dive into more details, see the feature request on Bugzilla, the implementation and further discussion on GitHub and the release announcement on the mozilla dev-platform mailing list.
Java/C++ Cross-language Support
GeckoView is an Android wrapper around Gecko, the Firefox web engine. It extensively uses cross-language calls between Java and C++.
Searchfox already had support for cross-language interfaces, thanks to its IDL support. We built on top of that to support direct cross-language calls between Java and C++.
First, we identified the different ways the C++ and Java code interact and call each other. There are three ways Java methods marked with the native keyword call into C++:
- Case A1: By default, the JVM will search for a matching C function to call based on its name. For instance, calling org.mozilla.gecko.mozglue.GeckoLoader.nativeRun from Java will call Java_org_mozilla_gecko_mozglue_GeckoLoader_nativeRun on the C++ side.
- Case A2: This behavior can be overridden at runtime by calling the JNIEnv::RegisterNatives function on the C++ side to point at another function.
- Case A3: GeckoView has a code generator that looks for Java items marked with both the @WrapForJNI annotation and the native keyword, and generates a C++ class template meant to be used through the Curiously Recurring Template Pattern. This template provides an Init static member function that makes the right JNIEnv::RegisterNatives calls to bind the Java methods to the implementing C++ class’s member functions.
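To make case A1 concrete, here is a minimal sketch of the name mangling that the JNI specification defines for resolving native methods. The function names jniMangle and jniShortName are hypothetical helpers written for this illustration; they are not part of Searchfox or GeckoView. The sketch only handles ASCII input and ignores the extra signature suffix used for overloaded native methods.

```cpp
#include <cctype>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper: mangle a class or method name following the JNI
// escaping rules. Assumption: the input is plain ASCII.
std::string jniMangle(std::string_view name) {
    std::string out;
    for (unsigned char c : name) {
        if (c == '.' || c == '/') {
            out += '_';                 // package separator
        } else if (c == '_') {
            out += "_1";                // literal underscore
        } else if (c == ';') {
            out += "_2";                // only appears in signatures
        } else if (c == '[') {
            out += "_3";                // only appears in signatures
        } else if (std::isalnum(c)) {
            out += static_cast<char>(c);
        } else {
            // Any other character becomes _0xxxx (lower-case hex),
            // e.g. '$' in nested class names becomes _00024.
            char buf[8];
            std::snprintf(buf, sizeof buf, "_0%04x", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Hypothetical helper: build the C function name the JVM looks up by
// default for a non-overloaded native method.
std::string jniShortName(std::string_view className, std::string_view methodName) {
    return "Java_" + jniMangle(className) + "_" + jniMangle(methodName);
}
```

Calling jniShortName("org.mozilla.gecko.mozglue.GeckoLoader", "nativeRun") yields the function name quoted in case A1.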
We also identified two ways the C++ code calls Java methods:
- Case B1: directly with JNIEnv::Call… functions.
- Case B2: GeckoView’s code generator also looks for Java methods marked with @WrapForJNI (without the native keyword this time) and generates a C++ wrapper class and member functions with the right JNIEnv::Call… calls.
Only the C++ side has the complete view of the bindings; so that’s where we decided to extract the information from, by extending Mozilla’s existing Clang plugin.
First, we defined custom C++ annotations bound_as and binding_to that the Clang plugin transforms into the right format for the cross-reference analysis. This means we can manually set the binding information:
class __attribute__((annotate("binding_to", "jvm", "class", "S_jvm_sample/Jni#"))) CallingJavaFromCpp
{
    __attribute__((annotate("binding_to", "jvm", "method", "S_jvm_sample/Jni#javaStaticMethod().")))
    static void javaStaticMethod()
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "method", "S_jvm_sample/Jni#javaMethod().")))
    void javaMethod()
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "getter", "S_jvm_sample/Jni#javaField.")))
    int javaField()
    {
        // Wrapper code
        return 0;
    }

    __attribute__((annotate("binding_to", "jvm", "setter", "S_jvm_sample/Jni#javaField.")))
    void javaField(int)
    {
        // Wrapper code
    }

    __attribute__((annotate("binding_to", "jvm", "const", "S_jvm_sample/Jni#javaConst.")))
    static constexpr int javaConst = 5;
};

class __attribute__((annotate("bound_as", "jvm", "class", "S_jvm_sample/Jni#"))) CallingCppFromJava
{
    __attribute__((annotate("bound_as", "jvm", "method", "S_jvm_sample/Jni#nativeStaticMethod().")))
    static void nativeStaticMethod()
    {
        // Real code
    }

    __attribute__((annotate("bound_as", "jvm", "method", "S_jvm_sample/Jni#nativeMethod().")))
    void nativeMethod()
    {
        // Real code
    }
};
(This example is, in fact, extracted from our test suite, jni.cpp vs Jni.java.)
Then, we wrote some heuristics that try to identify cases A1 (C functions named Java_…), A3 and B2 (C++ code generated from @WrapForJNI decorators) and automatically generate these annotations. Cases A2 and B1 (manually calling JNIEnv::RegisterNatives or JNIEnv::Call… functions) are rare in the Firefox code base and impossible to recognize reliably, so we decided not to cover them for now. Developers who wish to declare such bindings can annotate them manually.
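As an illustration of what a heuristic for case A1 could look like, the sketch below turns a Java_… function name back into a JVM symbol string. It is not the code from the Mozsearch Clang plugin: the function name jvmSymbolFromNativeName is hypothetical, and the output format is only extrapolated from the S_jvm_sample/Jni#nativeMethod(). strings in the sample above. It decodes the _1 escape, drops the signature suffix of overloaded methods, and gives up on any other escape (such as nested classes) rather than guessing.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: derive a "S_jvm_<package>/<Class>#<method>()." symbol
// from a JNI function name, or return nullopt when the name does not look
// like a default JNI binding that this sketch can decode.
std::optional<std::string> jvmSymbolFromNativeName(std::string_view fn) {
    constexpr std::string_view prefix = "Java_";
    if (fn.substr(0, prefix.size()) != prefix) return std::nullopt;
    fn.remove_prefix(prefix.size());

    std::vector<std::string> parts(1);
    for (std::size_t i = 0; i < fn.size(); ++i) {
        const bool hasNext = i + 1 < fn.size();
        if (fn[i] != '_') {
            parts.back() += fn[i];
        } else if (hasNext && fn[i + 1] == '1') {
            parts.back() += '_';        // "_1" is an escaped underscore
            ++i;
        } else if (hasNext && fn[i + 1] >= '0' && fn[i + 1] <= '9') {
            return std::nullopt;        // other escapes: not handled here
        } else if (hasNext && fn[i + 1] == '_' &&
                   !(i + 2 < fn.size() && fn[i + 2] == '1')) {
            break;                      // "__" starts the argument signature
        } else {
            parts.emplace_back();       // plain "_" separates name components
        }
    }

    // At least a class and a method are needed, and no component may be empty.
    if (parts.size() < 2) return std::nullopt;
    for (const auto& part : parts) {
        if (part.empty()) return std::nullopt;
    }

    std::string symbol = "S_jvm_";
    for (std::size_t i = 0; i + 2 < parts.size(); ++i) {
        symbol += parts[i] + "/";       // package components
    }
    symbol += parts[parts.size() - 2] + "#" + parts.back() + "().";
    return symbol;
}
```

With this sketch, Java_sample_Jni_nativeMethod maps to S_jvm_sample/Jni#nativeMethod()., the same string used in the bound_as annotation of the sample, while a function that does not start with Java_ is simply skipped.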
After this point, we used Searchfox’s existing analysis JSON format and mostly re-used what was already available from IDL support. When triggering the context menu for a binding wrapper or bound function, the definitions in both languages are made available, with “Go to” actions that jump over the generally irrelevant binding internals.
The search results also display both sides of the bridge, for instance:
- Searching for the mozilla::widget::GeckoViewSupport::Open C++ member function links to its Java binding org.mozilla.geckoview.GeckoSession$Window.open.
- Searching for the org.mozilla.geckoview.GeckoSession.getCompositorFromNative Java method links to its generated C++ binding mozilla::java::GeckoSession::GetCompositor.
If you want to dive into more details, see the feature request and detailed problem analysis on Bugzilla, the implementation and further discussion on GitHub, and the release announcement on the Mozilla dev-platform mailing list.
Displaying Interactive Macro Expansions
Aside from this Java/Kotlin-related work, we also added support for displaying and interacting with macro expansions. This was inspired by KDAB’s own codebrowser.dev, but improves on it in two ways:
- It displays all expansion variants, if they differ across platforms or by definition.
- It makes macros fully indexed and interactive.
This work mainly happened in the Mozsearch Clang plugin to extract macro expansions during the pre-processing stage and index them with the rest of the top-level code.
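To see why a single macro use can have several expansion variants, consider the following sketch. The macros PLATFORM_NAME and LOG_PREFIX and the function startupMessage are hypothetical names made up for this illustration; they do not come from the Firefox code base.

```cpp
#include <string>

// Hypothetical macro whose definition depends on the target platform.
#if defined(_WIN32)
#  define PLATFORM_NAME "windows"
#elif defined(__ANDROID__)
#  define PLATFORM_NAME "android"
#else
#  define PLATFORM_NAME "other"
#endif

// Hypothetical function-like macro built on top of PLATFORM_NAME. Adjacent
// string literals are concatenated by the compiler.
#define LOG_PREFIX(tag) ("[" PLATFORM_NAME "] " tag)

// The single use of LOG_PREFIX below expands to a different string literal
// on each platform, so a per-platform analysis records a different
// expansion for the same source line.
std::string startupMessage() {
    return std::string(LOG_PREFIX("startup"));
}
```

Because the analyses of all platforms are merged before cross-referencing, a code browser that wants to show what LOG_PREFIX("startup") really means has to present each of those variants rather than a single expansion.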
Again, if you want more details, the feature request is available on Bugzilla, and the implementation and further technical discussion are on GitHub.
Summary
Searchfox is a small but complex and really interesting project to work on, because of the many technologies it brings together: compiler plugins and code analyzers written in several languages, custom tooling and scripts in Rust, Python and Bash, and a web front-end built with the usual HTML/CSS/JS. KDAB successfully added Java/Kotlin code indexing, including analysis of the C++ bindings, and is starting to improve Searchfox’s C++ support itself, first with fully indexed macro expansions and next with improved template support.
The post Improvements to Mozilla’s Searchfox Code Browser appeared first on KDAB.