Skip to content

Wednesday, 4 October 2023

It's been a long, long time without posting anything. Not that I'm lazy (well, a bit), but I have been working on a lot of things related to KDE these past few years, and I was finally able to release and open-source Codevis. I know this post is as small as a tweet; I'm just checking if the integration is still working.

Tuesday, 3 October 2023

Last weekend I went to the Linux Days in Vorarlberg (Austria) to host a booth with Tobias and Kai. It was hosted at the Fachhochschule (a sort of university for applied sciences) in Dornbirn and it was my first time attending this event.

Me and Tobias in front of the LinuxDays poster at the entrance of the event

Our booth was well visited and we had a lot of interesting discussions. As always, we had various pieces of hardware at our booth: two laptops, a Steam Deck, a PinePhone, a graphics tablet running Krita and two Konqi amigurumis.

Our stand

Between booth duty, I still managed to watch one talk about open source deployment in public institutions in Baden-Württemberg (a region/state in Germany). After the Linux Days, we all went to a restaurant and mass-ordered Käsespätzle. Käsespätzle is a traditional dish from this region, made of cheese, Spätzle (noodles) and onions. It was excellent.

Käsespätzle

On Sunday, Tobias and I went to Golm with a local we met the day before. We took a gondola lift to reach a high-rope park in the mountains and then took an Alpine Coaster back down into the valley. It was a lot of fun.

The view from the gondola

Picture of the high-rope park

After our little adventure, we again went to eat in a traditional restaurant.

Fish on a plate with noodles and pumpkin

Here are a few more pictures of the trip:

Dornbirn market place

Castle

KStars v3.6.7 was released on 2023.10.03 for macOS & Linux. The Windows build is still pending and should hopefully be released by the 10th of October. It's a bi-monthly bugfix release with a couple of exciting features.

Image Overlay Component


Hy Murveit introduced a long-requested feature: Custom Image Overlays!

With this new feature, a user can add their own processed/completed astro-images, and the system will display them scaled and rotated appropriately on the Sky Map.

The feature is controlled from the KStars Settings menu, in a new tab labelled Image Overlays. First the user needs to add files into a directory called imageOverlays, parallel to the logs directory. Simply add the images there (typically JPEGs). Ideally these aren't massive files, for performance reasons; widths of 1000 or 2000 pixels are probably fine. I have been testing with larger files, which also work but use more system resources on slower CPUs.


The user then uses the Image Overlays tab in KStars Settings to (one-time) plate-solve the images and checks a box to enable the image display. Successful plate-solve info is stored in the user DB so that it doesn't have to be done again. The images should, from then on, appear in the Sky Map in the proper position. There is a way to easily navigate to the images without manipulating the Sky Map: select a row in the overlay table and click the "Show" button. You can move from one image to the next with the up/down arrow keys.

A user can adjust the plate-solve timeouts. As these are mostly blind solves (JPEGs won't have any header info, and as currently implemented no header info is used), the plate solving can be problematic. You can choose a default image scale (arcseconds per pixel) or leave it at 0.0 to not use scale. If there are files that won't solve, the user can add RA,DEC into the image's row in the table, which gets the solver to use the sky position as a constraint. The user can also add the scale that way. In fact, if the user knows all the info for the image, they can populate all the fields on the image's row and simply set the status field to OK, and plate-solving is no longer required.

Rotator Dialog Improvements



Toni Schriber continued simplifying the Rotator Dialog. A Rotator Flip Policy was introduced. This (global) policy is an answer to this question and to this wish. It's now possible to define how the rotator reacts after a flip, or when a solved reference image reports a different pier side than the actual mount pier side. Preserve Rotator Angle keeps the rotator position, and the camera is virtually rotated by 180°. Preserve Position Angle keeps the camera position angle.

With this policy, the rotator always turns the camera to the original position angle and the image shows the original star arrangement. The Flip Policy can be altered in the StellarSolver Options under Rotator Settings.

More File Placeholders


Due to popular demand, Wolfgang Reissenberger added support for file-name placeholders for camera temperature (%C), gain (%G), offset (%O) and pier side (%P).


This is applicable not only to locally captured images, but also to images captured on a remote INDI server.

Sunday, 1 October 2023

Learning a language is, to me, about grinding. Continuously exposing yourself.

Ich lerne Deutsch. Oder, ich versuche Deutsch zu lernen. 😉 (I'm learning German. Or rather, I'm trying to learn German.)

I try to expose myself to the language via YouTube (thx Nils for the tip about 7 gegen Wild), but also newspapers and just chatting with people. I'd say the biggest hurdle is that people find English easier than having me try to find and reorder the words, so practice at full speed is hard to find.

I guess I do the same with people trying to learn Swedish, and I really shouldn't.

If you have tips for how to expose myself more to German – spoken or written – please drop a comment here or join the conversation on Mastodon.

Friday, 29 September 2023

Let’s go for my web review for the week 2023-39.


I don’t want your data – Manu

Tags: tech, web, data, attention-economy

This is a good way to manage your website. I do the same regarding my blog, I don’t do any analytics etc.

https://manuelmoreale.com/i-don-t-want-your-data


Recent advances in computer science since 2010? - Theoretical Computer Science Stack Exchange

Tags: tech, science

Lots of good answers in there… It provides plenty of rabbit holes to follow.

https://cstheory.stackexchange.com/questions/53343/recent-advances-in-computer-science-since-2010


GPU.zip

Tags: tech, browser, gpu, security

Interesting new side-channel attack. A bit mind-boggling, to be honest. Only one browser seems affected so far (though since it's Chrome, most of its variants are probably affected as well).

https://www.hertzbleed.com/gpu.zip/


Platform that enables Windows driver development in Rust

Tags: tech, rust, microsoft, system

Another system where it becomes easier to make drivers in Rust.

https://github.com/microsoft/windows-drivers-rs


Dotfiles matter!

Tags: tech, settings, standard

Definitely this, use standard locations as much as possible. We can tame the mess of dotfiles in user homes.

https://dotfiles-matter.click/


Was Javascript really made in 10 days? • Buttondown

Tags: tech, javascript, history

Interesting light shed on JavaScript's early history.

https://buttondown.email/hillelwayne/archive/did-brendan-eich-really-make-javascript-in-10-days/


Python 3.12 Preview: Static Typing Improvements – Real Python

Tags: tech, python, type-systems

Nice improvements coming to the Python typing system. Especially interesting in the case of kwargs.

https://realpython.com/python312-typing/


Ditch That Else

Tags: tech, programming, craftsmanship

Since we still often see code in the wild with deep nesting due to edge-case handling, it looks like this advice is still very relevant.

https://preslav.me/2023/09/22/ditch-that-else/


Smooth Database Changes in Blue-Green Deployments · Django Beats

Tags: tech, django, databases

This is more manual work of course, but it is too often forgotten. This way you get easier database migrations in complex environments, though.

https://fly.io/django-beats/smooth-database-changes-in-blue-green-deployments/


Choose Postgres queue technology :: Adriano Caloiaro’s personal blog

Tags: tech, architecture, dependencies, complexity

Maybe you don’t need to pull even more dependencies. Think of the operational costs and the complexity.

https://adriano.fyi/posts/2023-09-24-choose-postgres-queue-technology/


Demystifying Database Transactions | Dinesh Gowda

Tags: tech, databases, postgresql, sql

Good primer about database transactions and the issues you might run into when using them.

https://dineshgowda.com/posts/demystifying-database-transcations/


GUIDs - How I messed up my RSS feed :: TheOrangeOne

Tags: tech, blog, rss

Interesting tidbit of the RSS standard. Probably worth putting such GUIDs early on.

https://theorangeone.net/posts/rss-guids/


Network health overview with mtr, ss, lsof and iperf3 | Medium

Tags: tech, tools, command-line, networking

Know your tools. These are useful for checking network usage.

https://raduzaharia.medium.com/network-health-overview-with-mtr-ss-lsof-and-iperf3-8d0d2d191781


Style with Stateful, Semantic Selectors | Ben Myers

Tags: tech, web, accessibility, frontend, css

This is an interesting use of the accessibility directive for better styling in web frontend code.

https://benmyers.dev/blog/semantic-selectors/


3x Explore, Expand, Extract • Kent Beck • YOW! 2018 - YouTube

Tags: tech, management, project-management, xp, agile

Lots of food for thought in here. I really appreciate how Kent Beck’s thinking keeps evolving. This Explore, Expand, Extract curve is indeed a good way to frame things. It is a good base to know what to put in place or not.

https://www.youtube.com/watch?v=WazqgfsO_kY


How Many Direct Reports Should a Manager Have? - The Engineering Manager

Tags: tech, management, tech-lead

This is indeed an interesting scale to keep in mind. Teams shouldn’t get too big, or too small.

https://www.theengineeringmanager.com/qa/how-many-direct-reports-should-a-manager-have/


What does a CTO actually do?

Tags: tech, career, cto, management

Ever wondered what the job of CTO encompasses? This article does a good job at it. It’s especially nice that it’s split based on company size. Indeed, the role can change dramatically depending on how big an organization is.

https://vadimkravcenko.com/shorts/what-cto-does/


How (not) to apply for a software job

Tags: hr, hiring, interviews

Plenty of sound advice for the written part of an application.

https://benhoyt.com/writings/how-to-apply/



Bye for now!

A few days ago Volker Krause posted this blog post about the Nextcloud conference – a very interesting read.

One of the topics is the VFS (Virtual Filesystem) API for the Linux desktop. Indeed that is a topic for us at ownCloud as well, and I would like to share our perspective on it, discussing it in the scope of the free desktop.

The topic is very important, as "syncing" of data from and to cloud storage has changed over time. From having all files mirrored between client and server, it has shifted to keeping all files in the cloud and having them as so-called placeholders on the desktop. That means that most files on the client appear with size zero to save space, but the complete filesystem structure is available.

If a user starts to interact with such a dehydrated file, the content of the file is downloaded transparently using the cloud system's client, for example ownCloud's desktop client. The same happens when an application accesses such a file. As a result, the placeholders look and behave like the normal filesystem we are used to.

On Windows and on macOS, the problem is more or less solved. Both have added APIs to their OS that can be used to implement access to data in the cloud.

On Linux, we do not have this kind of API yet. That means it is close to impossible to implement this user experience. Volker already said that desktop-environment-specific solutions probably do not scale, which I agree with.

At ownCloud we have looked into implementing a specific FUSE file system. That should certainly be possible, and is probably part of the solution, but it is considerable effort because of the asynchronous nature of the problem. Given that the market share of Linux desktop systems is pretty small, it is not attractive for companies to invest a lot into a Linux-only system. Here the power of community could make a difference again.

It would be best if we, as an open source community, came up with a shared solution as a freedesktop standard, possibly oriented on one of the existing APIs, maybe the macOS File Provider API: a library and small framework that the Linux desktop environments can work with, abstracting the VFS.

While collaborating on that, all data clouds could implement the bindings to their storage. With that, the extra implementation effort for the Linux solution hopefully wouldn't be dramatic any more.

Let’s call this system openVFS as a work title. How can we evolve it? I’d like to invite all interested parties to discuss in this temporary Github repo to collect ideas and opinions. There is also a little experimental code.

Thursday, 28 September 2023

At the beginning of 2022 I started helping Grupo EVM to shape their plans to ship digital products. They are a technology centric group with a strong data profile, specialised in designing and developing bleeding edge projects and services for public administrations, mostly in Spain, as well as EU funded R&D projects. Their HQ are … Continue reading Tamiz, a new and great experience

Wednesday, 27 September 2023

This is a longer one … it took long, too.

I would apologise, but it is one of those things where it hurt me more than it will hurt you to read this extremely abbreviated version.

Setting up sudoedit

There are no two ways about it: sudoedit is the smart way to edit text files as root.

In short what sudoedit does is to:

  1. sudo,
  2. copy the file you want to edit into a temporary location,
  3. let you edit that copy with $VISUAL (or, failing that, $EDITOR) as your normal user, and
  4. when you save and exit the editor, replace the original file with the temporary copy.

To set up Helix as the default text editor in the console, and KWrite when X11 or Wayland are available (both also for sudoedit), I did the following (in Fish shell):

set --universal --export EDITOR helix
set --universal --export VISUAL kwrite
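
With that in place, a typical invocation looks like this (the file is just an example); sudoedit opens the temporary copy in the configured editor and writes it back as root once you close it:

sudoedit /etc/fstab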

Borg backup

Before I started messing with the Btrfs RAID, I decided to set up the backups.

I decided to stick with Borg, but to simplify setting up and running, used the Borgmatic wrapper. I have to say I am very pleased with it.

First, I installed it with yay borgmatic.

I took my sweet time to set it up, but most of that was spent reading the documentation and the extensive (over 900 lines) config file template.

But ultimately my /etc/borgmatic/config.yaml consists of less than 30 actual lines and, including all the reading, I was done in an afternoon. The interesting bits I share below.
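
For context, a bare-bones Borgmatic config looks roughly like the sketch below; the repository URL, source directories and retention numbers are made-up examples (not my actual setup), and exact key names can differ between Borgmatic versions:

# Minimal sketch of /etc/borgmatic/config.yaml – values are examples only
source_directories:
    - /home
    - /etc

repositories:
    - path: ssh://backup@backup.example.org/./leza.borg

keep_daily: 7
keep_weekly: 4
keep_monthly: 6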

Excludes

Some things are just not worth backing up and only waste resources. So I added the following:

exclude_from:
    - /etc/borgmatic/excludes

exclude_caches: true

exclude_if_present:
    - .nobackup

Then I slapped .nobackup into directories in my home that I typically do not want backed up like downloads, video, games and some (non-hidden) temporary folders.

The exclude_caches bit relies on people following the Cache Directory Tagging Specification, which from what I can tell, is a pretty rare occasion.

In fact, locate CACHEDIR.TAG on my system shows only three such files – I hope this improves as I install more things:

/home/hook/.cache/borg/CACHEDIR.TAG
/home/hook/.cache/fontconfig/CACHEDIR.TAG
/home/hook/.cargo/registry/CACHEDIR.TAG
So, to filter out more caches, and for the rest, I created a /etc/borgmatic/excludes file that includes:
## Temporary and backup files

*~
*.tmp
*.swp

## Temporary and cache directories

**/tmp
**/temp
**/cache
**/Cache
**/.cache
**/.Cache
**.cache
**.Cache
**.ccache

## Big files

*.iso
*.img

## Top level directories to always exclude

/dev
/etc/mtab
/media
/mnt
/proc
/run
/sys
/tmp
/var/cache
/var/tmp

## Local applications to be excluded (for more use `.nobackup`)

/home/*/.local/share/akonadi
/home/*/.local/share/baloo
/home/*/.local/share/lutris
/home/*/.local/share/Trash
/home/*/.local/share/Steam
/home/*/.kde4/share/apps/ktorrent
/home/*/.config/chromium
/home/*/.wine

I have to agree with the author of the CacheDir spec: caches are aplenty and it is hard to regexp them all. If only everyone just put their cache in ~/.cache …
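
If you ever want to tag such a directory yourself, the CacheDir spec only requires a CACHEDIR.TAG file that starts with a fixed signature, so something like this should do (the target directory is just an example):

printf 'Signature: 8a477f597d28d172789f06886806bc55\n' > ~/tmp/CACHEDIR.TAG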

Relying on .nobackup is a new approach I am trying, so let us see how it works out for me. I am cautiously optimistic, as it is much easier to just touch .nobackup than it is to sudoedit /etc/borgmatic/config.yaml, enter the password and copy-paste the folder.

Checks

Backups are only worth anything if you can restore from them. Borg does offer checks, and Borgmatic offers a way to easily fine-tune which checks to run and how often.

With the below settings Borgmatic:

  • every day when it creates a backup, also checks the integrity of the repository and the archives
  • once a month, tries to extract the last backup
  • every three months checks the integrity of all data
checks:
    - name: repository
      frequency: always
    - name: archives
      frequency: always
    - name: extract
      frequency: 1 month
    - name: data
      frequency: 3 months
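
If I ever want to run the configured checks on demand instead of waiting for the timer, Borgmatic has a subcommand for that:

sudo borgmatic check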

Restrict access

Of course I encrypt my backups.

To further limit access, the backup server is only reachable with an SSH key. Furthermore, the ~/.ssh/authorized_keys on the server restricts access to the specific backup repository only and allows nothing but the borg serve command to be run:

command="cd {$path_to_borg_repos}; borg serve --restrict-to-path {$path_to_borg_repos}/{$repo_of_this_machine}",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-user-rc {$ssh_key}

I already had Borg running on the backup server, so I merely needed to add another line like the above to the server and set up Borgmatic on Leza.

Automate it all

At the end I ran systemctl enable --now borgmatic.timer to have systemd handle the “cronjob” on a daily basis.

Borgmatic does the heavy lifting of figuring out what exactly (if anything) it needs to do that day, so that was super simple.
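
To double-check that the timer is actually scheduled, systemd can tell you when the next run is due:

systemctl list-timers borgmatic.timer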

Btrfs RAID

Well, RAID-ifying my Btrfs was quite a ride … and I have no-one to blame but myself1 for the issues I had.

I also have to thank everyone on the #btrfs IRC channel (esp. balrog, dicot, kepstin, multicore, opty, specing, Zygo), dalto from the EndeavourOS forum, and above all TJ from KDE’s Matrix channel for helping me dig myself out of the mess I made.

Below, I will document things as they should have been done, and add what I did wrong and how we fixed it in an expandable sub-section.

Do note that “what I should have done” is just me applying some hindsight to my errors, it may still have holes. Also the “what I did” parts are partially re-created from memory and omit a lot of trial and error of trying to fix stuff.

Baloo fixed on Btrfs

The “Baloo reindexes everything after every reboot” issue is now fixed and will be part of the KDE Frameworks 5.111 release.

Add a new device

As I described in my base install blog post, the partitioning of my Goodram PX500 SSD was (roughly):

  • 1 GB ESP
  • 990 GB LUKS + Btrfs
  • 9 GB LUKS + swap

The Goodram PX500 has 953,86 GiB of space.

But the Samsung 970 Evo Plus has 931,51 GiB of space.

Although both drives are marketed as 1 TB (= 931,32 GiB), there was a difference of 20 GiB between them. And unfortunately in the wrong direction, so just the system partition on the Goodram SSD was larger than the whole Samsung SSD.

After plugging in the new SSD, what I should have done was just:

  • create a 1 GiB fat32 partition at the beginning,
  • (create a 10 GiB LUKS + swap partition at the end),
  • create a LUKS partition of what-ever remains on the newly added drive.

… and simply not care that the (to-be) Btrfs partition is not the same size as on the old one.

In fact, I could have also just skipped making a swap partition on the Samsung SSD. If/when one of the drives dies, I would replace it with a new one and could just create swap on the new drive anyway.

Was that what I did?

Of course bloody not! 🤦‍♂️

Expand if you are curious about how removing and re-creating the swap partition caused me over two days of fixing the mess.

While the Btrfs gurus assured me it would totally work if I just put two differently-sized Btrfs partitions into one RAID1 (as long as I did not care that some part of the larger partition would not be used), I still wanted to try to resize the partitions on the Goodram SSD.

Did I ignore this good advice?

Oh yes!

I am not a total maniac though, so just in case, before I did anything, I made backups and ran both btrfs scrub and btrfs balance (within Btrfs Assistant, so I would not mess anything up).

The big complication here is that this is a multi-step approach depending on many tools, as one needs to resize the LUKS partition as well as the Btrfs inside it, and there are several points where one can mess things up royally, ending up with a corrupt file system.

One option would be to go the manual way through CLI commands and risk messing up myself.

The other option would be to use a GUI to make things easier, but risk that the GUI did not anticipate such a complex task and will mess things up.

After much back and forth, I still decided to give KDE Partition Manager a go and see if I could simply resize a Btrfs partition within LUKS there. In the worst case, I already had backups.

… and here is where I messed things up.

Mea culpa!

Honestly, I would have messed things up the same way if I had done it in the CLI.

If anything, I am impressed how well KDE Partition Manager handled such a complex task in a very intuitive fashion.

What I did then was:

  1. Resized luks+btrfs (nvme0n1p2) on Goodram to be 20 GiB smaller – this is where I thought things would break, but KDE Partition Manager handled it fine. But now I had 20 GiB of unused disk space between nvme0n1p2 (btrfs) and nvme0n1p3 (swap).
  2. To fix this I decided to simply remove the swap (nvme0n1p3) and create a new one to fill the whole remaining space.
  3. (While I was at it, I added and changed a few partition labels, but that did not affect anything.)

So, I ended up with:

Goodram PX500:

partition     size          file system    mount point
unallocated   1,00 MiB      unallocated
nvme0n1p1     1.000,00 MiB  fat32          /boot/efi
nvme0n1p2     924,55 GiB    btrfs (luks)   /
nvme0n1p3     28,34 GiB     swap (luks)    swap

Samsung 970 Evo Plus:

partition     size          file system    mount point
nvme1n1p1     1.000,00 MiB  fat32          /mnt/backup_efi
nvme1n1p2     920,77 GiB    btrfs (luks)
nvme1n1p3     9,77 GiB      swap (luks)    swap

At first things were peachy fine.

… and then I rebooted and was greeted with several pages of Dracut essentially warning me that it could not open an encrypted partition.

Several lines of dracut-initqueue warnings about a hook timing out. Dracut gives up and offers logging into an emergency shell.

So what happened was that I forgot that since I removed and re-created nvme0n1p3 (swap), it now has a different UUID – which is why Dracut could not find it. 😅

After much trial and error and massive help from TJ, we managed to identify the problem and solution through the emergency shell. It would have been possible to do that – and probably faster – by booting from a LiveUSB too, but both TJ and I were already deeply invested and had (some kind of twisted) fun doing it in the emergency shell2. Luckily the Btrfs partition could be unlocked, so we could use chroot.

Long story short, this was the solution:

  1. Reboot and in GRUB edit the boot command to remove the non-existing swap partition from the kernel line.
  2. Wait during boot that systemd gives up on the non-existing swap partition.
  3. When in my normal system sudoedit /mnt/rootfs/etc/crypttab and sudoedit /mnt/rootfs/etc/fstab to change the UUID of the encrypted swap partition to the new partition’s UUID.
  4. sudoedit /etc/dracut.conf.d/calamares-luks.conf to change the swap partition’s UUID for the new one.
  5. sudo dracut-rebuild
  6. sudoedit /etc/default/grub – specifically the GRUB_CMDLINE_LINUX_DEFAULT line – to change the swap partition’s UUID for the new one, as well as make sure every LUKS-encrypted partition’s UUID has a rd.luks.uuid= entry there.
  7. sudo grub-install (just in case) and sudo grub-mkconfig.
  8. Reboot 😄

There was another self-caused issue that took me way too long to figure out, until someone on the #btrfs IRC channel pointed it out. I had forgotten the closing ' in the linux (a.k.a. “kernel”) line in GRUB, which is why grub-install would fail on me, complaining about GRUB_ENABLE_CRYPTODISK=y missing, while it was clearly there in /etc/default/grub. I just had to add that ' at the end of GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and GRUB was happy again.

That was essentially the big oops and how it got fixed.

As of now, my /etc/dracut.conf.d/* looks like:

# Configuration file automatically written by the Calamares system installer
# (This file is written once at install time and should be safe to edit.)
# Enables support for LUKS full disk encryption with single sign on from GRUB.

# force installing /etc/crypttab even if hostonly="no", install the keyfile
install_items+=" /etc/crypttab /crypto_keyfile.bin "
# enable automatic resume from swap
add_device+=" /dev/disk/by-uuid/2d90af35-7e6a-40f8-8353-f20433d0f994 "

omit_dracutmodules+=" network cifs nfs brltty "
compress="zstd"

force_drivers+=" amdgpu "
add_dracutmodules+=" plymouth "

add_dracutmodules+=" resume "

And /etc/default/grub:

# GRUB boot loader configuration

GRUB_DEFAULT='0'
GRUB_TIMEOUT='5'
GRUB_DISTRIBUTOR='EndeavourOS'
GRUB_CMDLINE_LINUX_DEFAULT='nowatchdog nvme_load=YES rd.luks.uuid=1a45a072-e9ed-4416-ac7e-04b69f11a9cc rd.luks.uuid=c82fca05-59d3-4595-969b-c1c4124d8559 rd.luks.uuid=2d90af35-7e6a-40f8-8353-f20433d0f994 rd.luks.uuid=2e91342f-3d19-4f75-a9a6-fc3f9798cb30 resume=/dev/mapper/luks-2d90af35-7e6a-40f8-8353-f20433d0f994 loglevel=3 splash quiet'
GRUB_CMDLINE_LINUX=""

# Preload both GPT and MBR modules so that they are not missed
GRUB_PRELOAD_MODULES="part_gpt part_msdos"

# Uncomment to enable booting from LUKS encrypted devices
GRUB_ENABLE_CRYPTODISK=y

# Set to 'countdown' or 'hidden' to change timeout behavior,
# press ESC key to display menu.
GRUB_TIMEOUT_STYLE=menu

# Uncomment to use basic console
GRUB_TERMINAL_INPUT=console

# Uncomment to disable graphical terminal
#GRUB_TERMINAL_OUTPUT=console

# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `videoinfo'
GRUB_GFXMODE=auto

# Uncomment to allow the kernel use the same resolution used by grub
GRUB_GFXPAYLOAD_LINUX=keep

# Uncomment if you want GRUB to pass to the Linux kernel the old parameter
# format "root=/dev/xxx" instead of "root=/dev/disk/by-uuid/xxx"
#GRUB_DISABLE_LINUX_UUID=true

# Uncomment to disable generation of recovery mode menu entries
GRUB_DISABLE_RECOVERY='true'

# Uncomment and set to the desired menu colors.  Used by normal and wallpaper
# modes only.  Entries specified as foreground/background.
#GRUB_COLOR_NORMAL="light-blue/black"
#GRUB_COLOR_HIGHLIGHT="light-cyan/blue"

# Uncomment one of them for the gfx desired, a image background or a gfxtheme
GRUB_BACKGROUND='/usr/share/endeavouros/splash.png'
#GRUB_THEME="/path/to/gfxtheme"

# Uncomment to get a beep at GRUB start
#GRUB_INIT_TUNE="480 440 1"

# Uncomment to make GRUB remember the last selection. This requires
# setting 'GRUB_DEFAULT=saved' above.
#GRUB_SAVEDEFAULT=true

# Uncomment to disable submenus in boot menu
GRUB_DISABLE_SUBMENU='false'

# Probing for other operating systems is disabled for security reasons. Read
# documentation on GRUB_DISABLE_OS_PROBER, if still want to enable this
# functionality install os-prober and uncomment to detect and include other
# operating systems.
#GRUB_DISABLE_OS_PROBER=false

Automate decryption

Having four LUKS-encrypted partitions also means needing to decrypt all of them.

To make things easier, I added the same key that nvme0n1p2 uses also to all the other three partitions:

cryptsetup luksAddKey /dev/nvme0n1p3 /crypto_keyfile.bin    # new swap @ Goodram
cryptsetup luksAddKey /dev/nvme1n1p2 /crypto_keyfile.bin    # btrfs @ Samsung
cryptsetup luksAddKey /dev/nvme1n1p3 /crypto_keyfile.bin    # swap @ Samsung
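
To sanity-check that the key file really ended up on each partition, the LUKS key slots can be inspected, e.g. for the new swap partition:

sudo cryptsetup luksDump /dev/nvme0n1p3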

And then I added them also to /etc/crypttab:

# <name>                                    <device>                                        <password>              <options>

### Goodram

## root in RAID1
luks-1a45a072-e9ed-4416-ac7e-04b69f11a9cc   UUID=1a45a072-e9ed-4416-ac7e-04b69f11a9cc       /crypto_keyfile.bin     luks

## swap
luks-2d90af35-7e6a-40f8-8353-f20433d0f994   UUID=2d90af35-7e6a-40f8-8353-f20433d0f994       /crypto_keyfile.bin     luks


### Samsung

## root in RAID1
luks-c82fca05-59d3-4595-969b-c1c4124d8559   UUID=c82fca05-59d3-4595-969b-c1c4124d8559       /crypto_keyfile.bin     luks

## swap
luks-2e91342f-3d19-4f75-a9a6-fc3f9798cb30   UUID=2e91342f-3d19-4f75-a9a6-fc3f9798cb30       /crypto_keyfile.bin     luks

Even after this, I still need to enter the LUKS password thrice:

  • before GRUB to unlock the Goodram SSD’s root partition,
  • before GRUB to unlock the Samsung SSD’s root partition,
  • during systemd to unlock all four partitions.

If I could shorten this down to just once, it would be even nicer. But that is as far as I managed to get so far. Happy to hear suggestions, of course!

Add new drive to Btrfs to make RAID1

On the new Samsung SSD the nvme1n1p2 partition is LUKS + Btrfs, but when adding a new device to Btrfs RAID with btrfs device add, it expects the partition to be without a file system.

This was a(nother self-inflicted) problem.

I could probably avoid this if I did it in CLI – and perhaps even in KDE Partition Manager, if I spent more time with it.

But now I had to deal with it.

Initially I planned to simply use --force with btrfs device add, but the Btrfs gurus quickly told me that there was a much safer option:

So I used wipefs to hide the file system:

wipefs --all /dev/disk/by-uuid/a19847bc-d137-4443-9cd5-9f311a5d8636

Then I had to add the device to the same Btrfs mount point:

btrfs device add /dev/mapper/luks-c82fca05-59d3-4595-969b-c1c4124d8559 /

And finally convert the two devices into Btrfs RAID13 with:

btrfs balance start -mconvert=raid1,soft /
btrfs balance start -dconvert=raid1,soft /
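
To confirm that both data and metadata really ended up with the RAID1 profile, Btrfs can report the per-profile usage:

sudo btrfs filesystem df /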

At the end of all this my /etc/fstab looks like:

# /etc/fstab: static file system information.
# Use 'blkid' to print the universally unique identifier for a device; this may
# be used with UUID= as a more robust way to name devices that works even if
# disks are added and removed. See fstab(5).
#
# <file system>                                        <mount point>    <type>  <options>                      <dump> <pass>

## ESP @ Goodram
UUID=B33A-4C29                                          /boot/efi        vfat   noatime                              0 2

## ESP backup @ Samsung
UUID=44D2-04AD                                          /mnt/backup_efi  vfat   noatime                              0 2

## btrfs @ Goodram (in RAID1 with Samsung)
/dev/mapper/luks-1a45a072-e9ed-4416-ac7e-04b69f11a9cc   /                btrfs  subvol=/@,noatime,compress=zstd      0 0
/dev/mapper/luks-1a45a072-e9ed-4416-ac7e-04b69f11a9cc   /home            btrfs  subvol=/@home,noatime,compress=zstd  0 0
/dev/mapper/luks-1a45a072-e9ed-4416-ac7e-04b69f11a9cc   /var/cache       btrfs  subvol=/@cache,noatime,compress=zstd 0 0
/dev/mapper/luks-1a45a072-e9ed-4416-ac7e-04b69f11a9cc   /var/log         btrfs  subvol=/@log,noatime,compress=zstd   0 0

## swap @ Goodram
/dev/mapper/luks-2d90af35-7e6a-40f8-8353-f20433d0f994   swap             swap   defaults                             0 0

## swap @ Samsung
/dev/mapper/luks-2e91342f-3d19-4f75-a9a6-fc3f9798cb30   swap             swap   defaults                             0 0

## tmpfs
tmpfs                                                   /tmp             tmpfs  noatime,mode=1777                    0 0

And with that my Btrfs RAID1 was basically done. 😌

There were some smart things to do still …

Automate Btrfs maintenance

According to the Btrfs documentation:

[Btrfs scrub is an] online filesystem checking tool. Reads all the data and metadata on the filesystem and uses checksums and the duplicate copies from RAID storage to identify and repair any corrupt data.

Which is one of the main reasons I embarked on this convoluted set-up adventure 😅

After consulting the Arch Wiki: Btrfs and the gurus on the #btrfs IRC channel, it turns out I only needed to run systemctl enable btrfs-scrub@-.timer.

The wiki says that @- equals the / mount point, @home equals /home mount point, etc., which suggests one should scrub each of the subvolumes / mount points.

But it turns out (at least the way I have things set up) that scrubbing / (i.e. @-) is perfectly enough, as it scrubs the whole device(s) anyway.
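
And to see when the last scrub ran and whether it found any errors:

sudo btrfs scrub status /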

Re-introduce the “reserve tank”

Since I was resizing the original Btrfs partition, I wanted to re-introduce the “reserve tank”.

Measure twice, cut once!

If you did not mess up things like I did, you probably just need to do it for the new device.

Check how much Device slack you have on each device, before you do this. And if you are low on Device unallocated, run btrfs balance first.

In my case I started with 0 Bytes of Device slack, as sudo btrfs filesystem usage / -T shows:

Overall:
    Device size:                   1.78TiB
    Device allocated:            120.06GiB
    Device unallocated:            1.67TiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                        114.63GiB
    Free (estimated):            864.45GiB      (min: 854.45GiB)
    Free (statfs, df):           862.56GiB
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              160.89MiB      (used: 0.00B)
    Multiple profiles:                  no

                                                         Data     Metadata System
Id Path                                                  RAID1    RAID1    RAID1    Unallocated Total     Slack
-- ----------------------------------------------------- -------- -------- -------- ----------- --------- -----
 1 /dev/mapper/luks-1a45a072-e9ed-4416-ac7e-04b69f11a9cc 58.00GiB  2.00GiB 32.00MiB   864.52GiB 924.55GiB 0.00B
 2 /dev/mapper/luks-c82fca05-59d3-4595-969b-c1c4124d8559 58.00GiB  2.00GiB 32.00MiB   860.74GiB 920.77GiB 0.00B
-- ----------------------------------------------------- -------- -------- -------- ----------- --------- -----
   Total                                                 58.00GiB  2.00GiB 32.00MiB     1.67TiB   1.78TiB 0.00B
   Used                                                  56.18GiB  1.14GiB 16.00KiB

To add some slack / “reserve tank” to the Btrfs file system, I had to run:

sudo btrfs filesystem resize 1:-10G /
sudo btrfs filesystem resize 2:-10G /

The first command reduced the file system on device ID 1 by 10 GiB, the second one reduced it on device ID 2.

As a result, I ended up with 20 GiB of Device slack, 10 GiB on each drive, as sudo btrfs filesystem usage / -T shows:

Overall:
    Device size:                   1.78TiB
    Device allocated:            120.06GiB
    Device unallocated:            1.67TiB
    Device missing:                  0.00B
    Device slack:                 20.00GiB
    Used:                        114.63GiB
    Free (estimated):            854.45GiB      (min: 854.45GiB)
    Free (statfs, df):           852.56GiB
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              160.89MiB      (used: 0.00B)
    Multiple profiles:                  no

                                                         Data     Metadata System
Id Path                                                  RAID1    RAID1    RAID1    Unallocated Total     Slack
-- ----------------------------------------------------- -------- -------- -------- ----------- --------- --------
 1 /dev/mapper/luks-1a45a072-e9ed-4416-ac7e-04b69f11a9cc 58.00GiB  2.00GiB 32.00MiB   854.52GiB 914.55GiB 10.00GiB
 2 /dev/mapper/luks-c82fca05-59d3-4595-969b-c1c4124d8559 58.00GiB  2.00GiB 32.00MiB   850.74GiB 910.77GiB 10.00GiB
-- ----------------------------------------------------- -------- -------- -------- ----------- --------- --------
   Total                                                 58.00GiB  2.00GiB 32.00MiB     1.67TiB   1.78TiB 20.00GiB
   Used                                                  56.18GiB  1.14GiB 16.00KiB

How to restore from a failed drive

This is more a note to future self.

When one of the drives dies:

  1. turn off laptop
  2. physically remove the faulty drive
  3. turn laptop back on and during boot mount the remaining drive as “degraded”
  4. buy new drive (use laptop normally in the meantime)
  5. when it arrives, turn off laptop
  6. put in replacement drive
  7. turn laptop back on and run btrfs replace

That is assuming you do not have a spare at hand. If you have it, just skip steps 3-5.

Replacing a dead drive in the Btrfs RAID

The internet seems full of messages that once a drive in Btrfs RAID dies, you can mount it as read-write only once and never again.

The Btrfs gurus on #btrfs IRC channel say that this was a bug and it was fixed several years ago (someone mentioned 6 years ago). Nowadays the btrfs replace command works as one would expect.
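
For future me, the commands would look roughly like this; the device names and the device ID are placeholders (check sudo btrfs filesystem show first), so treat this as a sketch rather than a recipe:

# mount the surviving half of the RAID1 read-write in degraded mode
mount -o degraded /dev/mapper/luks-SURVIVING-UUID /mnt
# once the replacement is partitioned, LUKS-formatted and opened:
btrfs replace start <missing-devid> /dev/mapper/luks-NEW-UUID /mnt
btrfs replace status /mnt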

Create fallback ESP

So, with that I should be well equipped for when one of the drives dies.

But wait! There is an important part missing!

I cannot boot if the ESP is also dead.

Remember the /mnt/backup_efi? Now it is time to make use of it.

Making sure the backup ESP includes everything takes just a simple:

rsync --archive --delete /boot/efi/ /mnt/backup_efi

And to make sure this happens regularly enough, I decided to create a systemd service that triggers rsync every time I reboot or shut down my computer.

For that I put into /etc/systemd/system/sync-efi.service the following:

[Unit]
Description=Sync EFI partitions
DefaultDependencies=no
Before=shutdown.target

[Service]
Type=oneshot
ExecStart=/usr/bin/rsync --archive --delete /boot/efi/ /mnt/backup_efi
TimeoutStartSec=0

[Install]
WantedBy=shutdown.target

Of course, the service unit should be enabled too:

systemctl enable sync-efi.service

hook out → well that was a rollercoaster ride


  1. … and perhaps GRUB for being a bit hard and weird to set up in such use cases. 

  2. The emergency shell was quite a pain, as it did not even have a text editor. So we had to be creative. 

  3. The Btrfs gurus suggested using soft in order to avoid re-balancing any block groups that already use the target profile. 

Recently, I’ve stumbled across some behavior of C++ lambda captures that has, initially, made absolutely no sense to me. Apparently, I wasn’t alone with this, because it has resulted in a memory leak in QtFuture::whenAll() and QtFuture::whenAny() (now fixed; more on that further down).

I find the corner cases of C++ quite interesting, so I wanted to share this. Luckily, we can discuss this without getting knee-deep into the internals of QtFuture. So, without further ado:

Time for an example

Consider this (godbolt):

#include <iostream>
#include <functional>
#include <memory>
#include <cassert>
#include <vector>

struct Job
{
    template<class T>
    Job(T &&func) : func(std::forward<T>(func)) {}

    void run() { func(); hasRun = true; }

    std::function<void()> func;
    bool hasRun = false;
};

std::vector<Job> jobs;

template<class T>
void enqueueJob(T &&func)
{
    jobs.emplace_back([func=std::forward<T>(func)]() mutable {
        std::cout << "Starting job..." << std::endl;
        // Move func to ensure that it is destroyed after running
        auto fn = std::move(func);
        fn();
        std::cout << "Job finished." << std::endl;
    });
}

int main()
{
    struct Data {};
    std::weak_ptr<Data> observer;
    {
        auto context = std::make_shared<Data>();
        observer = context;
        enqueueJob([context] {
            std::cout << "Running..." << std::endl;
        });
    }
    for (auto &job : jobs) {
        job.run();
    }
    assert((observer.use_count() == 0) 
                && "There's still shared data left!");
}

Output:

Starting job...
Running...
Job finished.

The code is fairly straightforward. There’s a list of jobs to which we can append with enqueueJob(). enqueueJob() wraps the passed callable with some debug output and ensures that it is destroyed after calling it. The Job objects themselves are kept around a little longer; we can imagine doing something with them, even though the jobs have already been run.
In main(), we enqueue a job that captures some shared state Data, run all jobs, and finally assert that the shared Data has been destroyed. So far, so good.

Now you might have some issues with the code. Apart from the structure, which, arguably, is a little forced, you might think “context is never modified, so it should be const!”. And you’re right, that would be better. So let’s change it (godbolt):

--- old
+++ new
@@ -34,7 +34,7 @@
     struct Data {};
     std::weak_ptr<Data> observer;
     {
-        auto context = std::make_shared<Data>();
+        const auto context = std::make_shared<Data>();
         observer = context;
         enqueueJob([context] {
             std::cout << "Running..." << std::endl;

Looks like a trivial change, right? But when we run it, the assertion fails now!

int main(): Assertion `(observer.use_count() == 0) && "There's still shared data left!"' failed.

How can this be? We’ve just declared a variable const that isn’t even used once! This does not seem to make any sense.
But it gets better: we can fix this by adding what looks like a no-op (godbolt):

--- old
+++ new
@@ -34,9 +34,9 @@
     struct Data {};
     std::weak_ptr<Data> observer;
     {
-        auto context = std::make_shared<Data>();
+        const auto context = std::make_shared<Data>();
         observer = context;
-        enqueueJob([context] {
+        enqueueJob([context=context] {
             std::cout << "Running..." << std::endl;
         });
     }

Wait, what? We just have to tell the compiler that we really want to capture context by the name context – and then it will correctly destroy the shared data? Would this be an application for the really keyword? Whatever it is, it works; you can check it on godbolt yourself.

When I first stumbled across this behavior, I just couldn’t wrap my head around it. I was about to think “compiler bug”, as unlikely as that may be. But GCC and Clang both behave like this, so it’s pretty much guaranteed not to be a compiler bug.

So, after combing through the interwebs, I’ve found this StackOverflow answer that gives the right hint: [context] is not the same as [context=context]! The latter drops cv qualifiers while the former does not! Quoting cppreference.com:

Those data members that correspond to captures without initializers are direct-initialized when the lambda-expression is evaluated. Those that correspond to captures with initializers are initialized as the initializer requires (could be copy- or direct-initialization). If an array is captured, array elements are direct-initialized in increasing index order. The order in which the data members are initialized is the order in which they are declared (which is unspecified).

https://en.cppreference.com/w/cpp/language/lambda

So [context] will direct-initialize the corresponding data member, whereas [context=context] (in this case) does copy-initialization! In terms of code this means:

  • [context] is equivalent to decltype(context) captured_context{context};, i.e. const std::shared_ptr<Data> captured_context{context};
  • [context=context] is equivalent to auto captured_context = context;, i.e. std::shared_ptr<Data> captured_context = context;

Good, so writing [context=context] actually drops the const qualifier on the captured variable! Thus, for the lambda, it is as if context had not been declared const in the first place and had been captured with direct-initialization.

But why does this even matter? Why do we leak references to the shared_ptr<Data> if the captured variable is const? We only ever std::move() or std::forward() the lambda, right up to the place where we invoke it. After that, it goes out of scope, and all captures should be destroyed as well. Right?

Nearly. Let’s think about what the compiler generates for us when we write a lambda. For the direct-initialization capture (i.e. [context]() {}), the compiler roughly generates something like this:

struct lambda
{
    const std::shared_ptr<Data> context;
    // ...
};

This is what we want to std::move() around. But it contains a const data member, and that cannot be moved from (it’s const after all)! So even with std::move(), there’s still a part of the lambda that lingers, keeping a reference to context. In the example above, the lingering part is in func, the capture of the wrapper lambda created in enqueueJob(). We move from func to ensure that all captures are destroyed when it goes out of scope. But for the const std::shared_ptr<Data> context, which is hidden inside func, this does not work. It keeps holding the reference. The wrapper lambda itself would have to be destroyed for the reference count to drop to zero.
However, we keep the already-finished jobs around, so this never happens. The assertion fails.
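
For comparison, with the init-capture [context=context] the member’s type is deduced as if by auto, so the closure the compiler generates would roughly look like this instead (again just a sketch of the generated code):

struct lambda_with_init_capture
{
    std::shared_ptr<Data> context; // no const: deduced via auto, so it can be moved from
    // ...
};

That non-const member is exactly why the std::move() in enqueueJob() can actually release the reference.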

How does this matter for Qt?

QtFuture::whenAll() and whenAny() create a shared_ptr to a Context struct and capture that in two lambdas used as continuations on a QFuture. Upon completion, the Context stores a reference to the QFuture. Similar to what we have seen above, continuations attached to QFuture are also wrapped by another lambda before being stored. When invoked, the “inner” lambda is supposed to be destroyed, while the outer (wrapper) one is kept alive.

In contrast to our example, the QFuture situation had created an actual memory leak, though (QTBUG-116731): The “inner” continuation references the Context, which references the QFuture, which again references the continuation lambda, referencing the Context. The “inner” continuation could not be std::move()d and destroyed after invocation, because the std::shared_ptr data member was const. This had created a reference cycle, leaking memory. I’ve also cooked this more complex case down to a small example (godbolt).

The patch for all of this is very small. As in the example, it simply consists of making the capture [context=context]. It’s included in the upcoming Qt 6.6.0.

Bottom line

I seriously didn’t expect there to be these differences in initialization of by-value lambda captures. Why doesn’t [context] alone also do direct- or copy-initialization, i.e. be exactly the same as [context=context]? That would be the sane thing to do, I think. I guess there is some reasoning for this; but I couldn’t find it (yet). It probably also doesn’t make a difference in the vast majority of cases.

In any case, I liked hunting this one down and getting to know another one of those dark corners of the C++ spec. So it’s not all bad 😉.

Today I was doing some experiments with qmllint hoping it would help us make QML code more robust.


I created a very simple test which is basically a single QML file that creates an instance of an object I've created from C++.


But when running qmllint via the all_qmllint target, it tells me


Warning: Main.qml:14:9: No type found for property "model". This may be due to a missing import statement or incomplete qmltypes files. [missing-type]
        model: null
        ^^^^^
Warning: Main.qml:14:16: Cannot assign literal of type null to QAbstractItemModel [incompatible-type]
        model: null
               ^^^^
 

Which is a relatively confusing error, since it first says that it doesn't know what the model property is, but then says "the model property is a QAbstractItemModel and you can't assign null to it".


Here is the full code https://bugreports.qt.io/secure/attachment/146411/untitled1.zip in case you want to fully reproduce it, but first some samples of what I think is important.


QML FILE

import QtQuick
import QtQuick.Window

import untitled1 // This is the name of my import

Window {
    // things     
    ObjectWithModel {
        model: null
    }
}
 

HEADER FILE (there's nothing interesting in the cpp file)

#pragma once

#include <QtQmlIntegration>
#include <QAbstractItemModel>
#include <QObject>

class ObjectWithModel : public QObject {
    Q_OBJECT
    QML_ELEMENT  
  
    Q_PROPERTY(QAbstractItemModel* model READ model WRITE setModel NOTIFY modelChanged)

public:
    explicit ObjectWithModel(QObject* parent = nullptr);  

    QAbstractItemModel* model() const;
    void setModel(QAbstractItemModel* model);

signals:
    void modelChanged();

private:
    QAbstractItemModel* mModel  = nullptr;
};

CMAKE FILE

cmake_minimum_required(VERSION 3.16)
project(untitled1 VERSION 0.1 LANGUAGES CXX)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
find_package(Qt6 6.4 REQUIRED COMPONENTS Quick)
qt_standard_project_setup()

qt_add_executable(appuntitled1 main.cpp)

qt_add_qml_module(appuntitled1
    URI untitled1 VERSION 1.0
    QML_FILES Main.qml
    SOURCES ObjectWithModel.h ObjectWithModel.cpp
)

target_link_libraries(appuntitled1 PRIVATE Qt6::Quick)  
 

As you can see it's quite simple and, as far as I know, using the recommended way of setting up a QML module when using a standalone app.

 

But maybe I am holding it wrong?