Friday, September 29, 2023

Same-node communication

For containers/VMs running on the same physical machine - including containers in the same Pod or in different Pods scheduled with affinity - it would be highly useful to use modern inter-process communication based on shared memory, DMA or virtio instead of copying bytes from buffer to kernel buffer to yet another buffer (3 copies is the best case - usually far more).

We have the tools - Istio CNI (and others) can inject abstract unix sockets, and there are CSI providers that can inject real unix sockets.

Unix sockets - just like Android Binder - can pass file descriptors and shared memory blocks to a trusted per-node component, which can pass them on to the destination after applying security policies.

I have been looking into this for some time - I worked for many years on Android, so I started in the wrong direction attempting to use Binder (which is now included in many kernels). But I realized Wayland is already there, and it's not a bad generic protocol if you ignore the display parts and the XML.

Both X11 and Wayland use shared buffers on the local machine - but X11 is a monster with an antiquated protocol focused on rendering on the client side, and browsers are doing this far better. Wayland was designed for local display and security - but underneath there is a very clean IPC protocol based on buffer passing.

What would it look like in Istio or other cloud meshes? Ztunnel (or another per-node daemon) would act as a CSI or CNI, injecting a unix socket into each Pod. It could use the Wayland binary protocol - but not implement any of the display protocols, just act as a proxy. If it receives a TCP connection, it can simply pass the file descriptor after reading the header - but it would mainly act as a proxy for messages containing file/buffer descriptors. Like Android, it can also pass open UDS file descriptors from one container to another after checking permissions - allowing direct communication.
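A rough sketch of the broker part of this design, with a hypothetical policy table and socketpairs standing in for the injected per-pod sockets (Linux only, Python 3.9+ for `send_fds`/`recv_fds`):

```python
import socket
import threading

# Hypothetical policy table: which source pod may reach which destination.
POLICY = {("pod-a", "pod-b")}

def broker(listener, dest_sockets):
    # Per-node daemon: read a small routing header, apply policy, then pass
    # the caller's connection fd to the destination (SCM_RIGHTS) so the two
    # workloads talk directly - the broker drops out of the data path.
    conn, _ = listener.accept()
    src, dst = conn.recv(64).decode().split(":")  # a real broker would frame this
    if (src, dst) in POLICY:
        socket.send_fds(dest_sockets[dst], [b"ok"], [conn.fileno()])
    conn.close()  # the queued SCM_RIGHTS message keeps the fd alive

# The abstract unix socket a CNI/CSI plugin would inject into the pod.
listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind("\0demo-ztunnel")
listener.listen(1)

pod_b_ctl, broker_to_b = socket.socketpair()
threading.Thread(target=broker, args=(listener, {"pod-b": broker_to_b})).start()

# "pod-a" connects to the per-node socket and names its destination.
pod_a = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
pod_a.connect("\0demo-ztunnel")
pod_a.sendall(b"pod-a:pod-b")

# "pod-b" receives a descriptor for pod-a's connection and uses it directly.
_, fds, _, _ = socket.recv_fds(pod_b_ctl, 16, 1)
direct = socket.socket(fileno=fds[0])
direct.sendall(b"direct hello")
reply = pod_a.recv(32).decode()
print(reply)
```

After the hand-off the broker holds no copy of the connection - the two workloads talk over the passed descriptor directly, much like Binder does with its handles.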

The nice thing is that even when using VMs instead of containers, there is now support for virtwl in the kernel and sommelier - and this would also work for adding stronger policies on a desktop or when communicating with a GPU.

Modern computers have a lot of cores and memory - running K8S clusters with fewer but larger nodes and taking advantage of affinity can allow co-location of the entire stack, avoiding slower network and TCP traffic for most communications while keeping 'least privilege' and isolation. Of course, a monolith can be slightly faster - but shared memory is far closer to it in speed than TCP is.

I've been looking at this for a few years in my spare time - most of the code and experiments are obsolete now, but I think using Wayland as a base (with a clean, display-independent proxy) is the right pragmatic solution. And simpler is better - I still like Binder and the Android model, and wish clouds would add it to their kernels...

Tuesday, September 26, 2023

Chrome and secrets on linux

Wasted a few good hours on this: if you want to move from gnome (and variants like cinnamon) to something else, like sway, and not have to re-enter all the passwords - ignore the man page and all the search results that suggest `--password-store=gnome`.

It is `--password-store=gnome-libsecret` instead.

The rest - installing/starting gnome-keyring - is still valid; verify with seahorse (i.e. the gnome password manager) that it is working.

And add a desktop entry with the right flag. `--enable-logging=stderr --v` helps with debugging - look for key_storage_linux.cc in the logs.
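For reference, a minimal desktop entry - the binary name and path are assumptions, adjust for your distro and Chrome channel:

```
[Desktop Entry]
Type=Application
Name=Google Chrome (libsecret)
Exec=/usr/bin/google-chrome-stable --password-store=gnome-libsecret %U
Terminal=false
```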

Saturday, September 23, 2023

Changing settings for Crostini in ChromeOS

Found: Mount Block Devices in ChromeOS

Apparently it is possible to change the LXC config and get access to the real VM, which otherwise appears to be read-only. Combined with moving devices into the VM this gives more control - but you are still limited by the small number of kernel modules in the VM.

I love the security model - the 'host' just handles display and a number of jailed services, with all the apps in a VM with LXC on top. The problem is that it's too restrictive - and the linux apps are still all in the same sandbox, with access to each other. Flatpak at least tries to isolate each app - but falls into the same trap that Java and early Android did: the apps ask for too many permissions.

I'm sticking with my less efficient setup - docker and pods with explicitly mounted volumes, syncthing and remote desktop, with one container per app or dev project - but I've been looking to move from ChromeOS to normal linux set up in a similar way.