Modern distributed systems need to process massive amounts of data efficiently while maintaining strict ordering guarantees. This is especially challenging when scaling horizontally across multiple nodes. How do we ensure messages from specific sources are processed in order while still taking advantage of parallelism and fault tolerance?
Elixir, with its robust concurrency model and distributed computing capabilities, is well-suited for solving this problem. In this article, we’ll build a scalable, distributed message pipeline that:
Distributes the message pipelines evenly across the Elixir cluster.
Gracefully handles failures and network partitions.
Many modern applications require processing large volumes of data while preserving message order from individual sources. Consider, for example, IoT systems where sensor readings must be processed in sequence, or multi-tenant applications where each tenant’s data requires sequential processing.
The solution we’ll build addresses these requirements by treating each RabbitMQ queue as an ordered data source.
Let’s explore how to design this system using Elixir’s distributed computing libraries: Broadway, Horde, and libcluster.
Architecture overview
The system consists of multiple Elixir nodes forming a distributed cluster. Each node runs one or more Broadway pipelines to process messages from RabbitMQ queues and forward them to Google Cloud PubSub. To maintain message ordering, each queue has exactly one pipeline instance running across the cluster at any time. If a node fails, the system must automatically redistribute its pipelines to other nodes, and if a new node joins the cluster, the existing pipelines should be redistributed to ensure a balanced load.
Elixir natively supports clustering multiple nodes together so that processes and distributed components within the cluster can communicate seamlessly. We will employ the libcluster library since it provides several strategies to automate cluster formation and healing.
For the data pipelines, the Broadway library provides a framework for multi-stage data processing that handles back-pressure, batching, and fault tolerance.
To correctly maintain the distribution of data pipelines across the Elixir nodes, the Horde library comes to the rescue by providing the building blocks we need: a distributed supervisor that we can use to distribute and maintain healthy pipelines on the nodes, and a distributed registry that we use directly to track which pipelines exist and on which nodes they are.
Finally, a PipelineManager component will take care of monitoring RabbitMQ for new queues and starting/stopping corresponding pipelines dynamically across the cluster.
Technical implementation
Let’s initiate a new Elixir app with a supervision tree.
mix new message_pipeline --sup
First, we’ll need to add our library dependencies in mix.exs and run mix deps.get:
defmodule MessagePipeline.MixProject do
  use Mix.Project

  def project do
    [
      app: :message_pipeline,
      version: "0.1.0",
      elixir: "~> 1.17",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  def application do
    [
      extra_applications: [:logger],
      mod: {MessagePipeline.Application, []}
    ]
  end

  defp generate_auth_token do
    with {:ok, %{token: token}} <- Goth.fetch(MessagePipeline.Goth) do
      {:ok, token}
    end
  end
end
Clustering with libcluster
We’ll use libcluster to establish communication between our Elixir nodes. Here’s an example configuration that uses the Gossip strategy to form a cluster between nodes:
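defmodule MessagePipeline.Application do
  use Application

  def start(_type, _args) do
    # Gossip-based topology; the :gossip key is an arbitrary label
    topologies = [
      gossip: [strategy: Cluster.Strategy.Gossip]
    ]

    children = [
      {Cluster.Supervisor, [topologies, [name: MessagePipeline.ClusterSupervisor]]}
      # Other children...
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end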
Distributed process management with Horde
We’ll use Horde to manage our Broadway pipelines across the cluster. Horde ensures that each pipeline runs on exactly one node and handles redistribution when nodes fail.
Let’s add Horde’s supervisor and registry to the application’s supervision tree.
The UniformQuorumDistribution distribution strategy distributes processes using a hash mechanism among all reachable nodes. In the event of a network partition, it enforces a quorum and will shut down all processes on a node if it is split from the rest of the cluster: the unreachable node is drained and the pipelines can be resumed on the other cluster nodes.
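To make the idea concrete, here is a minimal sketch of hash-based placement combined with a quorum check. It is a simplified stand-in written for illustration, not Horde's actual implementation: the module name DistributionSketch, its functions, and the node names are all hypothetical, and plain modulo hashing is assumed in place of whatever hashing scheme Horde uses internally.

```elixir
defmodule DistributionSketch do
  @moduledoc "Simplified stand-in for hash-based placement with a quorum check."

  # A partition has quorum when it can reach a strict majority of all members.
  def has_quorum?(reachable, all_members) do
    length(reachable) * 2 > length(all_members)
  end

  # Returns {:ok, node} for the node that should run `name`, or
  # {:error, :quorum_not_met} when this side of a partition is the minority.
  def choose_node(name, reachable, all_members) do
    if has_quorum?(reachable, all_members) do
      sorted = Enum.sort(reachable)
      # :erlang.phash2/2 maps any term to an integer in 0..range-1
      index = :erlang.phash2(name, length(sorted))
      {:ok, Enum.at(sorted, index)}
    else
      {:error, :quorum_not_met}
    end
  end
end

# Hypothetical node names, used only for this example
members = [:"a@host", :"b@host", :"c@host"]

# All members reachable: the pipeline name is hashed onto one of them
IO.inspect(DistributionSketch.choose_node(:"pipeline_test-queue-1", members, members))

# Only one of three members reachable: no quorum, so nothing should run here
IO.inspect(DistributionSketch.choose_node(:"pipeline_test-queue-1", [:"a@host"], members))
# prints {:error, :quorum_not_met}
```

Because every node evaluates the same deterministic function over the same member list, they agree on where a pipeline belongs without extra coordination, which is the property that matters for keeping one pipeline per queue.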
defmodule MessagePipeline.Application do use Application
    case Broadway.start_link(__MODULE__, pipeline_opts) do
      {:ok, pid} ->
        {:ok, pid}

      {:error, {:already_started, _pid}} ->
        :ignore
    end
  end

  def pipeline_name(queue_name) do
    String.to_atom("pipeline_#{queue_name}")
  end

  @impl true
  def handle_message(_, message, _) do
    message
    |> Message.update_data(&process_data/1)
  end

  @impl true
  def handle_batch(_, messages, _, _) do
    case publish_to_pubsub(messages) do
      {:ok, _message_ids} ->
        messages

      {:error, reason} ->
        # Mark messages as failed
        Enum.map(messages, &Message.failed(&1, reason))
    end
  end

  defp process_data(data) do
    # Transform message data as needed
    data
  end

  defp publish_to_pubsub(messages) do
    MessagePipeline.GooglePubsub.publish_messages(messages)
  end
end
Queue discovery and pipeline management
Finally, we need a process to monitor RabbitMQ queues and ensure pipelines are running for each one.
The Pipeline Manager periodically queries RabbitMQ for existing queues. If a new queue appears, it starts a Broadway pipeline only if one does not already exist in the cluster. If a queue is removed, the corresponding pipeline is shut down.
defmodule MessagePipeline.PipelineManager do
  use GenServer

  @timeout :timer.minutes(1)

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  def init(_opts) do
    state = %{managed_queues: MapSet.new()}

    {:ok, state, {:continue, :start}}
  end

  def handle_continue(:start, state) do
    state = manage_queues(state)

    {:noreply, state, @timeout}
  end

  def handle_info(:timeout, state) do
    state = manage_queues(state)

    {:noreply, state, @timeout}
  end
  def manage_queues(state) do
    {:ok, new_queues} = discover_queues()
    current_queues = state.managed_queues

    # Filter out system queues
    queues
    |> Enum.reject(fn %{name: name} ->
      String.starts_with?(name, "amq.") or String.starts_with?(name, "rabbit")
    end)
    |> Enum.map(& &1.name)
    |> MapSet.new()
  end
  defp start_pipeline(queue_name) do
    pipeline_name = MessagePipeline.Pipeline.pipeline_name(queue_name)

    case Horde.Registry.lookup(MessagePipeline.PipelineRegistry, pipeline_name) do
      [{_pid, _}] ->
        # A pipeline for this queue is already registered somewhere in the cluster
        {:error, :already_started}

      [] ->
        Horde.DynamicSupervisor.start_child(
          MessagePipeline.PipelineSupervisor,
          {MessagePipeline.Pipeline, queue_name: queue_name}
        )
    end
  end

  defp stop_pipeline(queue_name) do
    pipeline_name = MessagePipeline.Pipeline.pipeline_name(queue_name)

    case Horde.Registry.lookup(MessagePipeline.PipelineRegistry, pipeline_name) do
      [{pid, _}] ->
        Horde.DynamicSupervisor.terminate_child(MessagePipeline.PipelineSupervisor, pid)

      [] ->
        {:error, :not_found}
    end
  end
end
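The reconciliation step described above (start pipelines for new queues, stop them for removed ones) boils down to two set differences. The sketch below shows that logic in isolation, assuming queue names arrive as plain strings. The ReconcileSketch module and its function names are hypothetical and are not part of the article's PipelineManager; the system-queue filter mirrors the prefixes used in the discovery code.

```elixir
defmodule ReconcileSketch do
  # Same prefixes as the system-queue filter in the discovery code.
  def system_queue?(name) do
    String.starts_with?(name, "amq.") or String.starts_with?(name, "rabbit")
  end

  # Compares the queues currently managed with the queues just discovered
  # and returns what to start, what to stop, and the new managed set.
  def diff(managed, discovered) do
    discovered = discovered |> Enum.reject(&system_queue?/1) |> MapSet.new()

    %{
      to_start: MapSet.difference(discovered, managed),
      to_stop: MapSet.difference(managed, discovered),
      managed: discovered
    }
  end
end

managed = MapSet.new(["orders", "sensors"])
plan = ReconcileSketch.diff(managed, ["sensors", "billing", "amq.gen-x1"])

IO.inspect(MapSet.to_list(plan.to_start)) # prints ["billing"]
IO.inspect(MapSet.to_list(plan.to_stop))  # prints ["orders"]
```

A manager process could then call its start and stop functions for each name in `to_start` and `to_stop`, and keep `managed` as its new state.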
Let’s not forget to also add the pipeline manager to the application’s supervision tree.
defmodule MessagePipeline.Application do
  use Application

  def start(_type, _args) do
    children = [
      {MessagePipeline.PipelineManager, []}
      # Other children...
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end
Test the system
We should now have a working and reliable system. To quickly test it out, we can configure a local RabbitMQ broker, a Google Cloud PubSub topic, and finally a couple of Elixir nodes to verify that distributed pipelines are effectively run to forward messages between RabbitMQ queues and PubSub.
Let’s start by running RabbitMQ with the management plugin. RabbitMQ will listen for connections on the 5672 port, while also exposing the management interface at http://localhost:15672. The default credentials are guest/guest.
# Publish test messages
./rabbitmqadmin publish routing_key=test-queue-1 payload="Message 1 for queue 1"
./rabbitmqadmin publish routing_key=test-queue-1 payload="Message 2 for queue 1"
./rabbitmqadmin publish routing_key=test-queue-2 payload="Message 1 for queue 2"

# List queues and their message counts
./rabbitmqadmin list queues name messages_ready messages_unacknowledged

# Get messages (without consuming them)
./rabbitmqadmin get queue=test-queue-1 count=5 ackmode=reject_requeue_true
One can also use the RabbitMQ management interface at http://localhost:15672, authenticate with the guest/guest default credentials, go to the “Queues” tab, click “Add a new queue”, and create “test-queue-1” and “test-queue-2”.
After a minute, the Elixir nodes should automatically start some pipelines corresponding to the RabbitMQ queues.
# List all registered pipelines
Horde.Registry.select(MessagePipeline.PipelineRegistry, [{{:"$1", :"$2", :"$3"}, [], [:"$2"]}])

# Check specific pipeline
pipeline_name = :"pipeline_test-queue-1"
Horde.Registry.lookup(MessagePipeline.PipelineRegistry, pipeline_name)
Now, if we publish messages on the RabbitMQ queues, we should see them appear on the PubSub topic.
We can verify it from Google Cloud Console, or by creating a subscription, publishing some messages on RabbitMQ, and then pulling messages from the PubSub subscription.
If we stop one of the Elixir nodes (Ctrl+C twice in its IEx session) to simulate a failure, the pipelines should be redistributed to the remaining node:
# Check updated node list
Node.list()

# Check pipeline distribution
Horde.Registry.select(MessagePipeline.PipelineRegistry, [{{:"$1", :"$2", :"$3"}, [], [:"$2"]}])
Rebalancing pipelines on new nodes
With our current implementation, pipelines are automatically redistributed when a node fails, but they are not redistributed when a new node joins the cluster.
Fortunately, Horde supports precisely this functionality from v0.8+, and we don’t have to manually stop and restart our pipelines to have them land on other nodes.
All we need to do is enable the option process_distribution: :active on Horde’s supervisor to automatically rebalance processes on node joining / leaving. The option runs each child spec through the choose_node/2 function of the preferred distribution strategy, detects which processes should be running on other nodes considering the new cluster configuration, and specifically restarts those particular processes such that they run on the correct node.
defmodule MessagePipeline.Application do
  use Application

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end
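As a rough illustration of where the option goes, the child specs for Horde's registry and supervisor might look like the following. This is a sketch under assumptions: the names MessagePipeline.PipelineRegistry and MessagePipeline.PipelineSupervisor come from the pipeline manager code, while `keys: :unique` and `members: :auto` are assumed choices and the variable name horde_children is hypothetical.

```elixir
# Child specs only; they would be added to the `children` list in
# MessagePipeline.Application.start/2 alongside the other children.
horde_children = [
  {Horde.Registry,
   [name: MessagePipeline.PipelineRegistry, keys: :unique, members: :auto]},
  {Horde.DynamicSupervisor,
   [
     name: MessagePipeline.PipelineSupervisor,
     strategy: :one_for_one,
     members: :auto,
     distribution_strategy: Horde.UniformQuorumDistribution,
     # Rebalance running processes when nodes join or leave the cluster
     process_distribution: :active
   ]}
]

IO.inspect(horde_children)
```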
Conclusion
This architecture provides a robust solution for processing ordered message streams at scale. The combination of Elixir’s distributed capabilities, Broadway’s message processing features, and careful coordination across nodes enables us to build a system that can handle high throughput while maintaining message ordering guarantees.
To extend this solution for your specific needs, consider these enhancements:
Adopt a libcluster strategy suitable for a production environment, such as Kubernetes.
Tune queue discovery latency, configuring the polling interval based on how frequently new queues are created. Better yet, instead of polling RabbitMQ, consider setting up RabbitMQ event notifications to detect queue changes in real-time.
Declare AMQP queues as durable and make sure that publishers mark published messages as persisted, in order to survive broker restarts and improve delivery guarantees. Use publisher confirms to ensure messages are safely received by the broker. Deploy RabbitMQ in a cluster with queue mirroring or quorum queues for additional reliability.
Add monitoring, instrumenting Broadway and Horde with Telemetry metrics.
Enhance error handling and retry mechanisms. For example, retry message publication to PubSub N times before failing the messages, thus invalidating the (possibly costly) processing operation.
Unit & e2e testing. Consider that the gcloud CLI image (gcr.io/google.com/cloudsdktool/google-cloud-cli:emulators) contains a PubSub emulator that may come in handy, e.g. gcloud beta emulators pubsub start --project=test-project --host-port=0.0.0.0:8085
Leverage a HorizontalPodAutoscaler for automated scaling on Kubernetes environments based on resource demand.
Evaluate the use of Workload Identities if possible. For instance, you can provide your workloads with access to Google Cloud resources by using federated identities instead of a service account key. This approach frees you from the security concerns of manually managing service account credentials.
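The retry suggestion in the list above can be sketched with a small helper that calls a function until it succeeds or the attempts run out. RetrySketch, with_retries/2, and the flaky publisher below are hypothetical names used only for illustration; the sketch assumes the publishing function returns {:ok, result} or {:error, reason}, as publish_to_pubsub/1 appears to in the pipeline code.

```elixir
defmodule RetrySketch do
  # Calls `fun` until it returns {:ok, _} or `attempts` calls have been made.
  def with_retries(fun, attempts) when is_function(fun, 0) and attempts > 0 do
    case fun.() do
      {:ok, _result} = ok -> ok
      {:error, _reason} when attempts > 1 -> with_retries(fun, attempts - 1)
      {:error, _reason} = error -> error
    end
  end
end

# A stand-in publisher that fails twice before succeeding.
{:ok, calls} = Agent.start_link(fn -> 0 end)

flaky_publish = fn ->
  n = Agent.get_and_update(calls, fn count -> {count + 1, count + 1} end)
  if n < 3, do: {:error, :unavailable}, else: {:ok, n}
end

IO.inspect(RetrySketch.with_retries(flaky_publish, 5)) # prints {:ok, 3}
```

In a real pipeline a delay between attempts would probably be worth adding, so that a briefly unavailable service is not hit with immediate repeated requests.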
The second maintenance release of the 24.12 cycle is out with multiple bug fixes. Notable changes include fixes for crashes, UI resizing issues, effect stack behavior, proxy clip handling, and rendering progress display, along with improvements to Speech-to-text in Flatpak and macOS packages.
This week, I focused on integrating the Monte Carlo Tree Search (MCTS) algorithm into the MankalaEngine. The primary goal was to test the performance of the MCTS-based agent against various existing algorithms in the engine. Let's dive into what MCTS is, how it works, and what I discovered during the testing phase.
What is Monte Carlo Tree Search (MCTS)?
The Monte Carlo Tree Search (MCTS) is a heuristic search algorithm used for decision-making in sequential decision problems. It incrementally builds a search tree and simulates multiple random moves at each step to evaluate potential outcomes. These simulations help the algorithm determine the most promising move to make.
How Does MCTS Work?
MCTS operates through four key steps:
1. Selection
The algorithm starts at the root node (representing the current game state) and traverses down the tree to a leaf node (an unexplored state). During this process, the algorithm selects child nodes using a specific strategy.
A popular strategy for node selection is Upper Confidence Bounds for Trees (UCT). The UCT formula helps balance exploration and exploitation by selecting nodes based on the following equation:
UCT = mean + C × sqrt(ln(N) / n)
Where:
mean is the average reward (or outcome) of a node.
N is the total number of simulations performed for the parent node.
n is the number of simulations performed for the current child node.
C is a constant that controls the level of exploration.
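The formula can be tried out with a few lines of code. The sketch below is written in Elixir for illustration and is not the MankalaEngine's C++ code; the module UCTSketch, the node fields (move, mean, visits), and the sample numbers are all assumptions.

```elixir
defmodule UCTSketch do
  # Unvisited children get :infinity so they are tried first; in Erlang term
  # ordering any atom compares greater than any number.
  def score(_mean, _parent_visits, 0, _c), do: :infinity

  # UCT = mean + C * sqrt(ln(N) / n)
  def score(mean, parent_visits, visits, c) do
    mean + c * :math.sqrt(:math.log(parent_visits) / visits)
  end

  # Picks the child with the highest UCT score.
  def select(children, parent_visits, c) do
    Enum.max_by(children, fn child ->
      score(child.mean, parent_visits, child.visits, c)
    end)
  end
end

children = [
  %{move: :a, mean: 0.6, visits: 10},
  %{move: :b, mean: 0.5, visits: 1}
]

# With C = 1.41 the rarely visited move wins despite its lower mean
IO.inspect(UCTSketch.select(children, 11, 1.41).move) # prints :b
```

Setting C to 0 removes the exploration term, and the same call then returns move :a, the child with the higher average reward.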
2. Expansion
Once the algorithm reaches a leaf node, it expands the tree by adding one or more child nodes representing potential moves or decisions that can be made from the current state.
3. Simulation
The algorithm then performs a simulation or rollout from the newly added child node. During this phase, the algorithm randomly plays out a series of moves (typically following a simple strategy) until the game reaches a terminal state (i.e., win, loss, or draw).
This is where the Monte Carlo aspect of MCTS shines. By simulating many random games, the algorithm gains insights into the likely outcomes of different actions.
4. Backpropagation
After the simulation ends, the results are propagated back up the tree, updating the nodes with the outcome of the simulation. This allows the algorithm to adjust the expected rewards of the parent nodes based on the result of the child node’s simulation.
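A minimal sketch of this update step, again in Elixir rather than the engine's C++, is shown below. It assumes each node stores a visit count and a running mean reward, and it applies the same reward to every node on the path; a two-player implementation would typically flip the reward's perspective at alternating levels. The names BackpropSketch, update/2, and backpropagate/2 are hypothetical.

```elixir
defmodule BackpropSketch do
  # Incremental mean: new_mean = mean + (reward - mean) / new_visits
  def update(%{visits: visits, mean: mean} = node, reward) do
    new_visits = visits + 1
    %{node | visits: new_visits, mean: mean + (reward - mean) / new_visits}
  end

  # `path` lists the nodes from the root down to the simulated node.
  def backpropagate(path, reward), do: Enum.map(path, &update(&1, reward))
end

path = [%{visits: 3, mean: 1.0}, %{visits: 1, mean: 0.0}]

# Visit counts become 4 and 2; means become 1.0 and 0.5
IO.inspect(BackpropSketch.backpropagate(path, 1.0))
```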
With a solid understanding of the algorithm, I began implementing MCTS in C++. The initial step involved integrating the MCTS logic into the benchmark utility of the MankalaEngine. After resolving a series of issues and running multiple tests, the code was functioning as expected.
Testing Results
I compared the performance of the MCTS agent against other existing agents in the MankalaEngine, such as Minimax, MTDF, and Random agents. Here’s how the MCTS agent performed:
Random Agent (Player 1) vs. MCTS (Player 2)
MCTS won 80% of the time
MCTS (Player 1) vs. Random Agent (Player 2)
MCTS won 60% of the time
MCTS vs. Minimax & MTDF
Unfortunately, MCTS consistently lost against both Minimax and MTDF agents. 😞
Key Improvements for MCTS
While MCTS performed well against the Random Agent, there is still room for improvement, especially in its simulation phase. Currently, the algorithm uses a random policy for simulations, which can be inefficient. To improve performance, we can:
Use more efficient simulation policies that simulate only promising moves, rather than randomly selecting moves.
At the start of the Selection step, focus on moves that have historically been good opening strategies (this requires further research to identify these moves, especially in Pallanguli).
Fine-tune the exploration-exploitation balance to improve decision-making.
Upcoming Tasks
In the upcoming week, I plan to:
Write test cases for the Pallanguli implementation.
Four applications, four different ways of styling.
Last year during Akademy I gave a talk called Union: The Future of Styling in KDE?!. In this talk I presented a problem: We currently have four ways of styling our applications. Not only that, but some of these approaches are quite hard to work with, especially for designers who lack programming skills. This all leads to it being incredibly hard to make changes to our application styling currently, which is not only a problem for something like the Plasma Next Initiative, but even smaller changes take a lot of effort.
This problem is not new; we already identified it several years ago. Unfortunately, it also is not easy to solve. Some of the reasons it got to this state are simply inertia. Some things like Plasma's SVG styling were developed as a way to improve styling in an era where a lot of the technologies we currently use did not exist yet. The solutions developed in those days have now existed for a pretty long time so we cannot suddenly drop them. Other reasons are more technical in nature, such as completely different rendering stacks.
Introducing Union
Those different rendering stacks are actually one of the core issues that makes this hard to solve. It means that we cannot simply use the same rendering code for everything, but have to come up with a tricky compatibility layer to make that work. This is what we currently do, and while it works, it means we need to maintain said compatibility layer. It also means we are not utilizing the rendering stack to its full potential.
However, there is another option, which is to take a step back and realise that we actually may not even want to share the rendering code, given that they are quite different. Instead, we need a description of what the element should look like, and then we can have specific rendering code that implements how to render that in the best way for a certain technology stack.
This idea is at the core of a project I called Union, which is a styling system intended to unify all our separate approaches into a single unified styling engine that can support all the different technologies we use for styling our applications.
Figure: The three separate parts of Union.
Union consists of three parts: an input layer, an intermediate layer and an output layer. The input layer consists of plugins that can read and interpret some input file format containing a style description and turn it into a more abstract description of what to render. How to do that is defined by the intermediate layer, which is a library containing the description of the data model and a method of defining which elements to apply things to. Finally, the output layer consists of plugins that use the data from the intermediate layer and turn it into actual rendering commands, as needed for a specific rendering stack.
Implementing Things
This sounds nice on paper, but implementing it is easier said than done. For starters, everything depends on the intermediate layer being both flexible enough to handle varying use cases but at the same time rigid enough that it becomes hard to - intentionally or unintentionally - create dependencies between the input and output layers. Apart from that, replacing the entire styling stack is simply going to be a lot of work.
Figure: Plasma's SVG styling uses specially-marked SVG items for styling.
To allow us to focus more on the core we needed to break things down into more manageable parts. We chose to focus on the intermediate layer first, by using Plasma's SVG themes as an input format and a QtQuick Style as output. This means we are working with an input format that we already know how to deal with. It also means we have a clear picture of what the output should look like, as it should ultimately look just like how Plasma looks.
At this point, a lot of this work has been done. While Union does not yet implement a full QtQuick style, it implements most of the basic controls, enough to allow something such as Discover to run without looking completely alien. Focusing on the intermediate layer proved very useful: we encountered and managed to solve several pretty tricky technical issues that would have been even trickier if we did not know what things should look like.
Figure: Plasma Discover running using Union.
Union Needs You!
All that said, there is still a lot to be done. For starters, to be an actual unified styling system for KDE we need a QtWidgets implementation. Some work on that has started, but it is going to be a lot harder than the QtQuick implementation. We also need a different input format. While Plasma's SVG styling works, it is not ideal for developing new styles with. I would personally like to investigate using CSS as input format as it has most of what we need while also being familiar to a lot of people. Unfortunately, finding a good CSS parser library turns out to be quite hard.
However, we have now reached a point where multiple tasks can be done in parallel. This means it would be great to have more people developing code, as well as some initial testing and feedback on the system. If you are interested in helping out, the code can be found at invent.kde.org/plasma/union. There is also a Matrix channel for more realtime discussions.
FOSDEM 2025 is just behind us and it was a great event as always. Alexander and I had a chance to talk about the local authentication hub project. Our FOSDEM talk was “localkdc – a general local authentication hub”. You can watch it and come back here for more details.
But before going into details, let us provide a bit of a background. It is 2025 now and we should go almost three decades back (ugh!).
Local authentication localkdc
History dive
Authentication on Linux systems is interwoven with the identity of the users. Once a user has logged in, a process runs under a certain POSIX account identity. Many applications validate the presence of the account prior to the authentication itself. For example, the OpenSSH server checks the POSIX account and its properties and, if the user is not found, intentionally corrupts the password passed to the PAM authentication stack. The authentication request will fail, but the attempt will be recorded in the system journal.
This joint operation between authentication and identification sources in Linux makes it important to maintain a coherent information state. No wonder that in corporate environments it is often handled centrally: user and group identities stored at a central server and sourced from that one by a local software, such as SSSD. In order to consume these POSIX users and groups, SSSD needs to be registered with the centralized authority or, in other words, enrolled into the domain. Domain enrollment allows not only identity and authentication of users: both the central server and the enrolled client machine can mutually authenticate each other and be sure they talk to the right authority when authenticating the user.
FreeIPA provides a stable mechanism for building a centralized domain management system. Each user account has POSIX attributes associated with it and each user account is represented by the Kerberos principal. Kerberos authentication can be used to transfer the authentication state across multiple services and provides a chance for services to discover user identity information beyond POSIX. It also makes strong linking between the POSIX level identity and authentication structure possible: for example, a Kerberos service may introspect a Kerberos ticket presented by a user’s client application to see how this user was authenticated originally: with a password or some specific passwordless mechanism. Or, perhaps, that a client application performs operations on behalf of the user after claiming it was authenticated using a different (non-Kerberos) authentication.
The use of local user accounts lacks this experience. Each individual service needs to reauthenticate a user again and again. Local system login: authenticate. Elevating privileges through SUDO? Authenticate again, if not explicitly configured otherwise. Details of the user session state, like how long this particular session has been active, are not checked by the applications, which also makes it harder to limit access. There is no information on how this user was authenticated. Finally, the overall user experience differs between local (standalone) authentication and the domain-enrolled one, making it harder to adjust and educate users.
Local authentication is also typically password-based. This is not a bad thing in itself but, depending on applications and protocols, worse choices could be made, security-wise. For example, the contemporary SMB 3.1.1 protocol is quite secure if authenticated using Kerberos. For non-Kerberos usage, however, it is left to rely on the NTLM authentication protocol, which requires use of the RC4 stream cipher. There are multiple attacks known to break RC4-based encryption, yet it is still used in the majority of non-domain-joined communications using the SMB protocol simply because there has been no alternative so far. To be correct, there was always an alternative, the Kerberos protocol, but setting it up for individual isolated systems wasn’t practical.
The Kerberos protocol assumes the use of three different parties: a client, a service, and a key distribution center (KDC). In corporate environments a KDC is part of the domain controller system, a client and a service are both domain members, computers are enrolled in the domain. The client authenticates to KDC and obtains a Kerberos ticket granting ticket (TGT). It then requests a service ticket from the KDC by presenting its TGT and then presents this service ticket to the service. The service application, on its side, is able to decrypt the service ticket presented by the client and authenticate the request.
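The three-party exchange can be illustrated with a deliberately simplified toy model. It is not real Kerberos: encryption is modelled as a tagged tuple that only opens with the matching key, the password is checked directly instead of being used to derive a key, and there are no timestamps, session keys, or authenticators. The module KerberosToy, its functions, and the sample principals are all hypothetical.

```elixir
defmodule KerberosToy do
  # "Encryption" is modelled as a tagged tuple that only opens with the same key.
  def seal(key, payload), do: {:sealed, key, payload}
  def open({:sealed, key, payload}, key), do: {:ok, payload}
  def open({:sealed, _other_key, _payload}, _key), do: {:error, :wrong_key}

  # Step 1: the client authenticates to the KDC and receives a TGT,
  # sealed with a key only the KDC knows.
  def request_tgt(kdc, user, password) do
    case Map.fetch(kdc.users, user) do
      {:ok, ^password} -> {:ok, seal(kdc.tgs_key, %{user: user})}
      _ -> {:error, :preauth_failed}
    end
  end

  # Step 2: the client presents the TGT and asks for a service ticket,
  # which the KDC seals with the key it shares with that service.
  def request_service_ticket(kdc, tgt, service) do
    with {:ok, %{user: user}} <- open(tgt, kdc.tgs_key),
         {:ok, service_key} <- Map.fetch(kdc.services, service) do
      {:ok, seal(service_key, %{user: user, service: service})}
    end
  end

  # Step 3: the service opens the ticket with its own key to learn who is calling.
  def accept(ticket, service_key), do: open(ticket, service_key)
end

kdc = %{
  tgs_key: :kdc_secret,
  users: %{"alice" => "correct horse"},
  services: %{"cifs/fileserver" => :fileserver_secret}
}

{:ok, tgt} = KerberosToy.request_tgt(kdc, "alice", "correct horse")
{:ok, ticket} = KerberosToy.request_service_ticket(kdc, tgt, "cifs/fileserver")
IO.inspect(KerberosToy.accept(ticket, :fileserver_secret))
```

The point of the model is that the service never talks to the KDC during the request and never sees the user's password: possession of a ticket it can open with its own key is what authenticates the client.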
In the late 2000s Apple realised that for individual computers the number of user accounts is typically small and a KDC can be run as a service on the individual computer itself. When both the client and server are on the same computer, this works beautifully. The only problem is that when a user needs to authenticate to a different computer’s service, the client cannot reach the KDC hosted on the other computer because it is not exposed to the network directly. Luckily, MIT Kerberos folks had already thought about this problem a decade earlier: in 1997 a first idea was published for a Kerberos extension that allowed tunneling Kerberos requests over a different application protocol. This specification later became known as “Initial and Pass Through Authentication Using Kerberos V5 and the GSS-API” (IAKerb). An initial implementation for MIT Kerberos was done in 2009/2010, while Apple introduced it in 2007 to enable remote access to your own Mac across the internet. It came in MacOS X 10.5 as a “Back to My Mac” feature and even got specified in RFC 6281, only to be retired from MacOS in 2019.
Modern days
In the 2020s Microsoft continued to work on NTLM removal. In 2023 they announced that all Windows systems will have a local KDC as their local authentication source, accessible externally via selected applications through the IAKerb mechanism. By the end of 2024, we had only seen demos published by Microsoft engineers at various events, but this is a promising path forward. The presence of a local KDC in Windows raises an interoperability requirement: Linux systems will have to handle access to Windows machines in a standalone environment over the SMB protocol. Authentication is currently done with NTLM; since NTLM will eventually be removed, we need to support the IAKerb protocol extension.
The NTLM removal for Linux systems requires several changes. First, the Samba server will need to learn how to accept authentication with the IAKerb protocol extension. Then, Samba client code needs to be able to establish a client connection and advertise the IAKerb protocol extension. For kernel level access, the SMB filesystem driver needs to learn how to use IAKerb as well; this will also need to be implemented in the user space cifs-utils package. Finally, to be able to use the same feature in a pure Linux environment, we need to be able to deploy a Kerberos KDC locally and do it in an easy manner on each machine.
This is where we had an idea. If we are going to have a local KDC running on each system, maybe we should use it to handle all authentication and not just for the NTLM removal? This way we can make both the local and domain-enrolled user experience the same and provide access locally to a whole set of authentication methods we support for FreeIPA: passwords, smartcards, one-time passwords and remote RADIUS server authentication, use of FIDO2 tokens, and authentication against an external OAuth2 Identity Provider using a device authorization grant flow.
How “local” should a local KDC be?
On standalone systems it is often not desirable to run daemons continuously. Also, it is not desirable to expose these services to the connected network if they really don’t need to be exposed. A common approach to solve this problem is by providing a local inter-process communication (IPC) mechanism to communicate with the server components. We chose to expose a local KDC via UNIX domain sockets. A UNIX domain socket is a well-known mechanism and has known security properties. With the help of a systemd feature called socket activation, we also can start local KDC on demand, when a Kerberos client connects over the UNIX domain socket. Since on local systems actual authentication requests don’t happen often, this helps to reduce memory and CPU usage in the long run.
If a local KDC is only accessible over a UNIX domain socket, remote applications could not get access to it directly. This means they would need to have help from a server application that can utilize the IAKerb mechanism to pass-through the communication between a client and the KDC. It would enable us to authenticate as a local user remotely from a different machine. Due to how the IAKerb mechanism is designed and integrated into GSS-API, this only allows password-based authentication. Anything that requires passwordless methods cannot obtain initial Kerberos authentication over IAKerb, at least at this point.
Here is a small demo on Fedora, using our localkdc tool to start a local KDC and obtain a Kerberos ticket upon login. The tickets can then be used effortlessly to authenticate to local services such as SUDO or Samba. For remote access we rely on Samba support for IAKerb and authenticate with GSSAPI, but the local smbclient uses a password first to obtain the initial ticket over IAKerb. This is purely a limitation of the current patches we have to Samba.
Pause here and think about the implications. We have an initial Kerberos ticket from the local system. The Kerberos ticket embeds details of how this authentication happened. We might have used a password to authenticate, or a smartcard, or any other supported pre-authentication method. We could reuse the same methods FreeIPA already provides in the centralized environment.
The Kerberos ticket can also contain details about the user session, including current group membership. It does not currently have that in the local KDC case, but we aim to fix that. This ticket can be used to authenticate to any GSS-API or Kerberos-aware service on this machine. If a remote machine accepts Kerberos, it could theoretically accept a ticket presented by a client application running on the local machine as well. However, to do that it would need to communicate with our local KDC, which it cannot reach.
Trust management
Luckily, a local KDC deployment is a full-featured Kerberos realm and thus can establish cross-realm agreements with other Kerberos realms. If two “local” KDC realms have trust agreements between each other, they can issue cross-realm Kerberos tickets which applications can present over IAKerb to the remote “local” KDC. Then a Kerberos ticket to a service running on the target system can be requested and issued by the system’s local KDC.
Thus, we can achieve passwordless authentication locally on Linux systems and establish peer-to-peer agreements across multiple systems, so that authentication requests can flow and operate on commonly agreed credentials. The problem now moves to the management area: how can these peer-to-peer agreements and permissions be managed easily?
Systemd User/Group API support
The MIT Kerberos KDC implementation provides a flexible way to handle Kerberos principals’ information. A database backend (KDB) implementation can be dynamically loaded and replaced. This is already used by both FreeIPA and Samba AD to integrate MIT Kerberos KDC with their own database backends based on different LDAP server implementations. For a local KDC use case, running a full-featured LDAP server is neither required nor intended. However, it would be great if different applications could expose parts of the data needed by the KDB interfaces and cooperate. Then a single KDB driver implementation could be used to streamline and provide a uniform implementation of Kerberos-specific details in a local KDC.
One of the promising interfaces to achieve that is the User/Group record lookup API via varlink from systemd. Varlink allows applications to register themselves and listen on UNIX domain sockets for communication, similar to D-Bus but with much less implementation overhead. The User/Group API technically also allows merging data coming from different sources when an application requests the information. “Technically”, because the io.systemd.Multiplexer API endpoint currently does not support merging non-overlapping data representing the same account from multiple sources. Once that becomes possible, we could combine the data dynamically and interact with users on demand when the corresponding requests come in. Or we can implement our own blending service.
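As a rough illustration of what such a lookup looks like on the wire, the C++ sketch below builds a varlink call to the GetUserRecord method of the io.systemd.UserDatabase interface. Varlink messages are JSON objects terminated by a NUL byte; the helper names and the hand-rolled JSON escaping are assumptions of this sketch, and a real client would write the resulting bytes to a socket under /run/systemd/userdb/ and parse the JSON reply with a proper library.

```cpp
#include <string>

// Minimal JSON string escaping, enough for this sketch: handles quotes,
// backslashes and ASCII control characters.
std::string jsonEscape(const std::string &in)
{
    static const char hex[] = "0123456789abcdef";
    std::string out;
    for (unsigned char c : in) {
        if (c == '"' || c == '\\') {
            out += '\\';
            out += static_cast<char>(c);
        } else if (c < 0x20) {
            out += "\\u00";
            out += hex[c >> 4];
            out += hex[c & 0xf];
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Hypothetical helper: builds the request for looking up one user by name.
// "service" selects which backend should answer; io.systemd.Multiplexer asks
// systemd-userdbd to query all registered services.
std::string buildGetUserRecordCall(const std::string &userName,
                                   const std::string &service = "io.systemd.Multiplexer")
{
    std::string msg = "{\"method\":\"io.systemd.UserDatabase.GetUserRecord\","
                      "\"parameters\":{\"userName\":\"" + jsonEscape(userName)
                      + "\",\"service\":\"" + jsonEscape(service) + "\"}}";
    msg.push_back('\0'); // varlink frames every message with a trailing NUL byte
    return msg;
}
```

A KDB driver built on this API would issue such calls when the KDC needs principal data, instead of reading a local database file.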
Blending data requests from multiple sources within MIT KDC needs a specialized KDB driver. We certainly don’t want this driver to duplicate the code from other drivers, so making these drivers stackable would be a good option. Support for one level of stacking has been merged to MIT Kerberos through a quickly processed pull request and will be available in the next MIT Kerberos release. This allows us to have a single KDB driver that loads other drivers specialized in storing Kerberos principals and processing additional information like MS-PAC structure or applying additional authorization details.
Establishing trusts
If Alice and Bob are in the same network and want to exchange some files, they could do this using SMB and Samba. But for Alice to authenticate on Bob’s machine, they would need to establish a Kerberos cross-realm trust. With the current tooling this is a complex task. For users we need to make this more accessible. We want to allow users to request trust on demand and validate these requests interactively. We also want to allow trust to be present for a limited timeframe, automatically expiring or manually removed.
If we have a Kerberos principal lookup on demand through a curated varlink API endpoint, we can also have a user-facing service to initiate establishing the trust between two machines on demand. Imagine a user trying to access an SMB share on one desktop system, which triggers a pop-up to establish a trust relationship with the corresponding local KDC on the remote desktop system. Both owners of the systems would be able to confirm out of band that the provided information is correct and can be trusted. Once that is done, we can return the details of the specific Kerberos principal that represents this trust relationship. We can limit the lifetime of this agreement so that it disappears automatically in an hour, a day, or a week.
Current state of local authentication hub
We started with two individual implementation paths early in 2024:
Support IAKerb in MIT Kerberos and Samba
Enable MIT Kerberos to be used locally without network exposure
MIT Kerberos has supported the IAKerb protocol extension for more than a decade, but since Microsoft introduced some changes to the protocol, those changes needed to be integrated as well. This was completed during summer 2024, though no upstream release is available yet. MIT Kerberos typically releases new versions yearly in January, so we hope to get some updates in early 2025.
Samba integration with IAKerb is currently under implementation. Originally, Microsoft was planning to release Windows 11 and Windows Server 2025 with IAKerb support enabled during autumn 2024. However, the Windows engineering team faced some issues and IAKerb is still not enabled in the Windows Server 2025 and Windows 11 releases. We are looking forward to getting access to Windows builds that enable IAKerb support to ensure interoperability before merging Samba changes upstream. We also need to complete the Samba implementation so that it properly supports locally issued Kerberos tickets, not only password-based ticket acquisition.
Meanwhile, our cooperation with the MIT Kerberos development team led to advancements in the local KDC support. The MIT Kerberos KDC can now be run over a UNIX domain socket. On systemd-enabled systems we also allow socket activation, transforming the local KDC into an on-demand service. We will continue our work on a dynamic database for a local KDC, to allow on-demand combination of resources from multiple authoritative local sources (Samba, FreeIPA, SSSD, local KDC, future dynamic trust application).
For experiments and ease of deployment, a new configuration tool was developed: localkdc. The tool is available at localkdc, and a COPR repository can be used to try the whole solution on Fedora.
If you want to try this in a simple setup, you might be interested in a tool that we initially developed for FreeIPA: FreeIPA local tests. This tool lets you provision and run a complex test environment in podman containers. The video of the local KDC usage was actually generated automatically by the scripts from here.
This blog series is all about implementing drag-and-drop in the Qt model/view framework. In addition to complete code examples, you'll find checklists that you can go through to make sure that you did not forget anything in your own implementation, when something isn't working as expected.
At first, we are going to look at drag and drop within a single view, to change the order of the items. The view can be a list, a table or a tree; there is very little difference in what you have to do.
Figures: moving a row in a table view, steps 1 to 3.
The main question, however, is whether you are using QListView/QTableView/QTreeView on top of a custom item model, or QListWidget/QTableWidget/QTreeWidget with items in them. Let's explore each one in turn.
With Model/View separation
The code being discussed here is extracted from the example. That example features a flat model, while this example features a tree model. The checklist is the same for these two cases.
Setting up the view
☑ Call view->setDragDropMode(QAbstractItemView::InternalMove) to enable the mode where only moving within the same view is allowed
☑ When using QTableView, call view->setDragDropOverwriteMode(false) so that it inserts rows instead of replacing cells (the default is false for the other views anyway)
Adding drag-n-drop support to the model
Figures: a reorderable list view and a reorderable table view.
For a model being used in QListView or QTableView, all you need is something like this:
class CountryModel : public QAbstractTableModel
{
    ~~~
    Qt::ItemFlags flags(const QModelIndex &index) const override
    {
        if (!index.isValid())
            return Qt::ItemIsDropEnabled; // allow dropping between items
        return Qt::ItemIsEnabled | Qt::ItemIsSelectable | Qt::ItemIsDragEnabled;
    }

    // the default is "copy only", change it
    Qt::DropActions supportedDropActions() const override { return Qt::MoveAction; }

    // the default is "return supportedDropActions()", let's be explicit
    Qt::DropActions supportedDragActions() const override { return Qt::MoveAction; }

    QStringList mimeTypes() const override { return {QString::fromLatin1(s_mimeType)}; }

    bool moveRows(const QModelIndex &sourceParent, int sourceRow, int count, const QModelIndex &destinationParent, int destinationChild) override; // see below
};
The checklist for the changes you need to make in your model is therefore the following:
☑ Reimplement flags(). For a valid index, add Qt::ItemIsDragEnabled and make sure Qt::ItemIsDropEnabled is NOT set (except for tree models where we need to drop onto items in order to insert a first child).
☑ Reimplement mimeTypes() and make up a name for the mimetype (usually starting with application/x-)
☑ Reimplement supportedDragActions() to return Qt::MoveAction
☑ Reimplement supportedDropActions() to return Qt::MoveAction
☑ Reimplement moveRows()
Note that this approach is only valid when using QListView or, assuming Qt >= 6.8.0, QTableView - see the following sections for details.
In a model that encapsulates a QVector called m_data, the implementation of moveRows can look like this:
bool CountryModel::moveRows(const QModelIndex &sourceParent, int sourceRow, int count, const QModelIndex &destinationParent, int destinationChild)
{
    if (!beginMoveRows(sourceParent, sourceRow, sourceRow + count - 1, destinationParent, destinationChild))
        return false; // invalid move, e.g. no-op (move row 2 to row 2 or to row 3)

    // destinationChild is expressed in pre-move row numbers
    const bool movingUp = sourceRow > destinationChild;
    for (int i = 0; i < count; ++i) {
        if (movingUp)
            m_data.move(sourceRow + i, destinationChild + i); // source and target both advance by one per row
        else
            m_data.move(sourceRow, destinationChild - 1); // the next row to move has slid into sourceRow
    }

    endMoveRows();
    return true;
}
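The index arithmetic is easy to get wrong when several rows move at once, so it can help to check it outside of Qt. The following sketch uses a plain std::vector and std::rotate rather than QVector::move, and follows the same convention as beginMoveRows(): destinationChild is the row, counted before the move, in front of which the block is placed. The function name moveRowsInVector is made up for this illustration, and the bounds check is an addition of the sketch (Qt itself only rejects the no-op cases).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Moves `count` rows starting at `sourceRow` so that they end up just before
// the row that was at index `destinationChild` before the move.
// Returns false (and leaves the data untouched) for out-of-range arguments
// and for no-op moves, i.e. dropping a block onto or right after itself.
template <typename T>
bool moveRowsInVector(std::vector<T> &data, int sourceRow, int count, int destinationChild)
{
    const int size = static_cast<int>(data.size());
    if (count <= 0 || sourceRow < 0 || sourceRow + count > size
        || destinationChild < 0 || destinationChild > size)
        return false;
    if (destinationChild >= sourceRow && destinationChild <= sourceRow + count)
        return false; // e.g. move row 2 to row 2 or to row 3

    auto first = data.begin() + sourceRow;
    auto last = first + count;
    auto dest = data.begin() + destinationChild;
    if (destinationChild < sourceRow)
        std::rotate(dest, first, last); // moving up: the block becomes the start of [dest, last)
    else
        std::rotate(first, last, dest); // moving down: the block becomes the end of [first, dest)
    return true;
}
```

For example, with the rows A, B, C, D, E, moving two rows from row 0 to destination 4 gives C, D, A, B, E, and moving two rows from row 3 to destination 1 gives A, D, E, B, C.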
QTreeView does not call moveRows
Figures: a reorderable tree view, and a reorderable tree view with a tree model.
QTreeView does not (yet?) call moveRows in the model, so you need to:
☑ Reimplement mimeData() to encode row numbers for flat models, and node pointers for tree models
☑ Reimplement dropMimeData() to implement the move and return false (meaning: all done)
Note that this means a move is in fact an insertion and a deletion, so the selection isn't automatically updated to point to the moved row(s).
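To show what "an insertion and a deletion" means in terms of indices, here is a Qt-free sketch for a flat model. encodeRows() stands in for mimeData(), and dropRows() stands in for the body of dropMimeData(). All names are hypothetical, and the payload format (space-separated row numbers) is an assumption of this sketch. A real model would put the row numbers into a QMimeData object and wrap the two steps in beginInsertRows()/endInsertRows() and beginRemoveRows()/endRemoveRows().

```cpp
#include <sstream>
#include <string>
#include <vector>

// Stand-in for mimeData(): serialize the dragged row numbers.
std::string encodeRows(const std::vector<int> &rows)
{
    std::ostringstream out;
    for (int r : rows)
        out << r << ' ';
    return out.str();
}

// Counterpart used on the drop side: parse the row numbers back.
std::vector<int> decodeRows(const std::string &payload)
{
    std::istringstream in(payload);
    std::vector<int> rows;
    int r;
    while (in >> r)
        rows.push_back(r);
    return rows;
}

// Stand-in for dropMimeData(): insert copies of the dragged rows before
// destinationRow, then erase the originals.
// Assumes the rows are valid, distinct and sorted in ascending order.
template <typename T>
void dropRows(std::vector<T> &data, const std::string &payload, int destinationRow)
{
    const std::vector<int> rows = decodeRows(payload);

    // Step 1, insertion: copy the dragged items to the drop position.
    std::vector<T> moved;
    for (int r : rows)
        moved.push_back(data[r]);
    data.insert(data.begin() + destinationRow, moved.begin(), moved.end());

    // Step 2, deletion: erase the originals back to front, so that earlier
    // indices stay valid. Originals at or after the drop position were
    // shifted down by the number of inserted items.
    const int inserted = static_cast<int>(moved.size());
    for (auto it = rows.rbegin(); it != rows.rend(); ++it) {
        const int r = (*it >= destinationRow) ? *it + inserted : *it;
        data.erase(data.begin() + r);
    }
}
```

Because the view only sees new rows appearing and old rows disappearing, it has no way to know that they are the same items, which is why the selection does not follow the moved rows.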
QTableView in Qt < 6.8.0
I implemented moving of rows in QTableView itself for Qt 6.8.0, so that moving rows in a table view is simpler to implement (one method instead of two), more efficient, and so that selection is updated. If you're not yet using Qt >= 6.8.0 then you'll have to reimplement mimeData() and dropMimeData() in your model, as per the previous section.
This concludes the section on how to implement a reorderable view using a separate model class.
Using item widgets
The alternative to model/view separation is the use of the item widgets (QListWidget, QTableWidget or QTreeWidget) which you populate directly by creating items.
Reorderable QListWidget
Reorderable QTableWidget
Reorderable QTreeWidget
Here's what you need to do to allow users to reorder those items.
☑ Call tableWidget->setDragDropOverwriteMode(false) so that it inserts rows instead of replacing cells
☑ Call item->setFlags(item->flags() & ~Qt::ItemIsDropEnabled); on each item, to disable dropping onto items
Note: Before Qt 6.8.0, QTableWidget did not really support moving rows. It would instead move data into cells (like Excel). The example code shows a workaround, but since it calls code that inserts a row and deletes the old one, header data is lost in the process. My changes in Qt 6.8.0 implement support for moving rows in QTableWidget's internal model, so it's all fixed there. If you really need this feature in older versions of Qt, consider switching to QTableView.
☑ Call item->setFlags(item->flags() & ~Qt::ItemIsDropEnabled); on each item, to disable dropping onto items
Conclusion about reorderable item widgets
Of course, you'll also need to iterate over the items at the end to grab the new order, like the example code does. As usual, item widgets lead to less code to write, but the runtime performance is worse than when using model/view separation. So, only use item widgets when the number of items is small (and you don't need proxy models).
Improvements to Qt
While writing and testing these code examples, I improved the following things in Qt 6.8:
QTBUG-130045 - QTableView: fix dropping between items when precisely on the cell border
QTBUG-1656 - Implement full-row drop indicator when the selection behavior is SelectRows
Conclusion
I hope this checklist will be useful when you have to implement your own reordering of items in a model or an item-widget. Please post a comment if anything appears to be incorrect or missing.
In the next blog post of this series, you will learn how to move (or even copy) items from one view to another.