ExtendedReality XR/VR portal #1165
8 comments · 25 replies
-
The only thing I can say is that we might prefer to see a set of XR devices (display + input + ...) as a single device. Some questions:
-
Speaking as a user of VR software (primarily SteamVR and Steam VR titles), this statement seems counter-intuitive. I fully consider the VR/XR compositor to be a user/session service: I would never run more than a single VR/XR compositor at a time, and I would expect all my running VR/XR applications to use the same compositor.

Also, AFAIK the compositor has stringent low-latency requirements to implement timewarp/reprojection/motion smoothing and other critical features that prevent VR sickness when an app's framerate drops or stutters. That requirement alone puts it in a different class from your typical desktop app IMHO, and closer to other latency-critical software such as the sound stack (e.g. JACK or PipeWire). As such, I don't immediately see the appeal of running my VR/XR compositor under Flatpak.

AFAIK one mission of Flatpak is to provide a permissions model that prevents apps from accessing or modifying anything on the system without the user's explicit consent. I would grant a VR/XR compositor low-level access to all sorts of hardware on my system in a heartbeat: my VR hardware, connected GPUs, and realtime scheduling priority on my CPU. Without those permissions, I'm either not running any VR titles or they are uncomfortable to experience. Any "security" provided by the Flatpak permissions model is utterly insignificant here IMHO.

Maybe someone with more technical knowledge can explain the benefit of running a VR/XR compositor in a Flatpak, rather than an end-user application (e.g. Blender)? Thanks ❤️.
-
My understanding from talking to @swick is that the process of leasing devices is not strictly bound to Wayland or X11, so in the grand scheme of things it's more of a desktop API than a compositor API. From this perspective, if I understood it correctly, it feels natural for this API to live in XDG desktop portal, and the security/permissions boundaries are secondary and somewhat opportunistic. I don't understand enough of the problem domain to be able to agree or disagree with this; it just sounds right to me. I also don't understand enough of XR to review the D-Bus interfaces in full detail; I can just comment on whether or not they follow the good practices of XDG portals.

What would be helpful is for someone to give a panorama of the current state of XR devices on Linux: what components are involved, how they communicate, what information they need to exchange, etc. From there, we can try to escalate that into a portal interface (or not, if we conclude that's not a good idea after all). Can someone do such a writeup?
-
Here's a rough overview of how OpenXR works, technologically, from a systems perspective:

Any process that wants to display its content in XR dynamically or statically links to the OpenXR runtime loader. On application initialization, the runtime loader searches standard paths (or environment-variable overrides) for an XR runtime manifest JSON as well as API layer JSON files. Each one contains metadata about which dynamic library to load. The OpenXR runtime loader dynamically loads those libraries, and the application uses xrGetInstanceProcAddr to get all the function pointers it needs.

Those dynamic libraries for XR runtimes and API layers could theoretically talk to devices directly, but in practice none do, and accounting for that possibility is not worth the security risk, complexity, and time. If a dynamic library talked to devices directly, only that one application could run (Monado and SteamVR both have overlays of different kinds, and Monado allows multiple XR apps with one focused). Also, when the XR app closed it would deinitialize the headset hardware, meaning no launcher could exist. Nobody would do that.

In practice, an XR service runs in the background (SteamVR or Monado for now, but I am working on my own) and exposes a unix domain socket in $XDG_RUNTIME_DIR, just as any display server does. The runtime library connects to that service and sends messages/dmabufs/fds/whatever. The only exception is Ultraleap's hand tracking API layer, which uses TCP loopback sockets instead of unix domain sockets. The service connects to devices directly and requires DRM leasing, hidraw access, and CAP_SYS_NICE (until other solutions are found), as well as possibly other permissions like Bluetooth. Combined with the fact that it just exposes a socket, this means it acts like a display server would: a system service.

The only practical difference is that it's not pre-installed, because XR isn't yet a dominant computer interface. We do have XR service runtimes already (not to be confused with XR runtime services), like Steam's Linux runtime (Sniper, I believe) as well as Envision for Monado. These allow an XR runtime service and its libraries to run on any Linux device without being installed to the system. Flatpak is not necessary, and given that no other system services are put in flatpaks, I'm not sure it would fit.

To summarize: for XR apps we need to find the right OpenXR libraries and expose loopback/unix domain sockets; XR runtimes are best left as system services.
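The manifest-discovery step described above can be sketched roughly as follows. `XR_RUNTIME_JSON` is the loader's real environment override and the manifest format shown is the real one, but the XDG search order here is a simplified approximation of the actual loader's logic, not a faithful reimplementation:

```python
# Sketch of how an OpenXR loader might locate the active runtime manifest.
# XR_RUNTIME_JSON is the real override env var; the XDG search below is a
# simplified approximation of the loader's real search order.
import json
import os


def load_manifest(path):
    # A runtime manifest names the shared library implementing the runtime, e.g.:
    # {"file_format_version": "1.0.0",
    #  "runtime": {"library_path": "libopenxr_monado.so"}}
    with open(path) as f:
        manifest = json.load(f)
    return manifest["runtime"]["library_path"]


def find_runtime_library():
    # Explicit override wins.
    override = os.environ.get("XR_RUNTIME_JSON")
    if override and os.path.isfile(override):
        return load_manifest(override)
    # Otherwise search XDG config dirs for active_runtime.json.
    config_dirs = [os.environ.get("XDG_CONFIG_HOME",
                                  os.path.expanduser("~/.config"))]
    config_dirs += os.environ.get("XDG_CONFIG_DIRS", "/etc/xdg").split(":")
    for d in config_dirs:
        candidate = os.path.join(d, "openxr", "1", "active_runtime.json")
        if os.path.isfile(candidate):
            return load_manifest(candidate)
    return None
```

The returned library path is what the loader would then `dlopen` to get at the vendor implementation.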
-
These discussions always get bogged down in specific implementation details. I'm not all too familiar with flatpak, but I'll dump some text here anyway. There are different scenarios with different requirements.

The first thing that should be said is that OpenXR works quite similarly to Vulkan, and ideally could be made to behave the same way:

- A Vulkan application links to the Khronos Vulkan loader. The Vulkan loader loads a JSON manifest (driver ICD) from some standard directory. With information from the JSON manifest, the Vulkan loader loads a shared library with the vendor implementation.
- An OpenXR application links to the Khronos OpenXR loader. The OpenXR loader loads an OpenXR runtime JSON manifest from some standard directory. With information from the JSON manifest, the OpenXR loader loads a shared library with the vendor implementation.

The difference is that with Vulkan that's usually the end of the line before going directly to a kernel driver, whereas the OpenXR backend is often another userspace application/service.

**Runtime architecture 1: Client-server based runtimes**

This is what runtimes most often do today.

*Scenario 1: OpenXR runtime service on the host, OpenXR application in flatpak*

The OpenXR application needs access to the OpenXR loader shared library (it may be statically linked into the app (unusual on Linux), shipped with the app, or the app may rely on it being available in the system; I believe it is not available in the Steam runtime as of now). The OpenXR loader shared library needs access to a runtime JSON manifest in the filesystem. It would be very inconvenient to have to ship these with the app, so ideally they would be loaded from the host. The same goes for the OpenXR runtime client-side library: ideally it would be loaded from the host. The client-side library then needs some connection to the runtime on the host. This is very runtime specific, but often it's a socket or so. There is no need for flatpak to handle DRM leasing or device permissions here; it all happens on the host side.

*Scenario 2: OpenXR runtime service in flatpak, OpenXR application in flatpak*

Yes, we don't particularly want to do this, but there are people out there who want to distribute and run the entire stack in docker, flatpak, etc. In this scenario, flatpak doesn't need to care about OpenXR loaders, runtime manifests, etc. But it does need to handle DRM leasing and device permissions.

*Scenario 3: OpenXR runtime service in flatpak, OpenXR application on the host*

I'm not sure if people want to do that. The OpenXR application needs access to the OpenXR loader shared library, which we can just pretend is a host system problem. The OpenXR loader shared library needs access to a runtime JSON manifest in the filesystem; flatpak would need to let the OpenXR runtime provide it to the host system. The same goes for the OpenXR runtime client-side library: flatpak would need to provide it to the host. The client-side library then needs some connection to the runtime in flatpak. Again, this is very runtime specific, often a socket etc. Here flatpak needs to handle DRM leasing and device permissions, and in addition handle quite some OpenXR-y stuff.

**Runtime architecture 2: All-in-one (in-process) runtimes**

Essentially this is the same as client-server based runtimes with both the OpenXR application and the OpenXR runtime in flatpak.

**What now**

The question is not which of those is the one true scenario that should be supported and nothing else. The question is: should the flatpak implementation / the protocol dictate which of those scenarios you can or cannot do? Imo, the more of those that can be enabled, the better.
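The "client-side library connects to the runtime service over a socket" pattern that recurs in these scenarios can be sketched minimally as below. The socket name and handshake messages are invented for illustration; real runtimes use their own socket names under `$XDG_RUNTIME_DIR` and their own wire protocols (often passing dmabufs and fds over the same socket):

```python
# Minimal sketch of the client/server IPC pattern described above: a runtime
# service listens on a unix domain socket, and the client-side library (loaded
# into the app) connects to it. All names and messages here are hypothetical.
import os
import socket
import tempfile
import threading


def run_service(srv):
    # Runtime-service side: accept one client and answer a handshake.
    conn, _ = srv.accept()
    if conn.recv(64) == b"CLIENT_HELLO":
        conn.sendall(b"RUNTIME_HELLO")
    conn.close()


def connect_client(path):
    # Client-side library: connect to the socket and perform the handshake.
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as c:
        c.connect(path)
        c.sendall(b"CLIENT_HELLO")
        return c.recv(64)


def demo():
    # Real runtimes place the socket under $XDG_RUNTIME_DIR; fall back to a
    # temp dir so this sketch is runnable anywhere.
    runtime_dir = os.environ.get("XDG_RUNTIME_DIR") or tempfile.mkdtemp()
    path = os.path.join(runtime_dir, "xr_runtime_ipc")  # hypothetical name
    if os.path.exists(path):
        os.unlink(path)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    t = threading.Thread(target=run_service, args=(srv,))
    t.start()
    reply = connect_client(path)
    t.join()
    srv.close()
    os.unlink(path)
    return reply
```

For a sandboxed app, "punching a hole" for this IPC essentially means making that socket path visible inside the sandbox.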
-
I believe it would be worthwhile to also add DRM leasing support to logind, both to help in implementing some parts of a VR portal (this way, we don't need the compositor to lease out resources, the portal can do so separately), and to help further some other parts of the Linux ecosystem (like running several compositors on one GPU's displays).
-
I'm going to close this issue. The fact that all the VR compositors right now have a client-server architecture means that no flatpak app has to be able to lease resources from the compositor. It also means that a Wayland protocol is a very bad choice, because Wayland protocols are exposed to all flatpak apps. A D-Bus service for leasing out resources from a compositor, such as displays and input devices (there is an MR for logind for hidraw devices already), would still be a better design, but XR compositors are not regular apps, so a portal is the wrong approach.

We'll also have to figure out how flatpak apps can access the XR runtimes. The OpenXR implementation in the flatpak runtime really has to be shared, and the way it punches a hole for the XR compositor IPC has to be standardized.
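To make the "D-Bus service for leasing out resources" idea concrete, a hedged sketch of what such an interface definition could look like follows. Every name here is invented for illustration; this is not an actual logind or compositor API, and fds would be passed using the D-Bus unix-fd (`h`) type:

```xml
<!-- Hypothetical interface; all names are invented for illustration only. -->
<node>
  <interface name="org.example.DeviceLease1">
    <!-- Lease a DRM connector; returns a DRM lease fd. -->
    <method name="LeaseConnector">
      <arg type="s" name="connector" direction="in"/>
      <arg type="h" name="lease_fd" direction="out"/>
    </method>
    <!-- Lease raw access to a hidraw device node. -->
    <method name="LeaseHidraw">
      <arg type="s" name="device" direction="in"/>
      <arg type="h" name="fd" direction="out"/>
    </method>
    <!-- Emitted when the lessor revokes a lease. -->
    <signal name="LeaseRevoked">
      <arg type="s" name="connector"/>
    </signal>
  </interface>
</node>
```

The key design point is that the service hands out file descriptors scoped to specific devices, rather than granting blanket device-node access.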
-
For those following along, I believe this is There are also some other related discussions:
-
This is a strawman proposal for adding an XR portal.
The long-term goal here is to make it possible to ship and run an XR compositor with flatpak. This means DRM leasing, access to sensors, input devices, etc.
For now the goal is to provide DRM leasing infrastructure, but to design the portal in such a way that adding access to the other types of devices remains possible.
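As a rough illustration of the shape such a portal could take, here is a hypothetical interface sketch following the usual xdg-desktop-portal request/response convention (the method returns a `Request` object path and results arrive via that object's `Response` signal). The interface name, method, and arguments are all invented for this strawman, not part of any actual proposal:

```xml
<!-- Hypothetical portal interface; every name below is invented. -->
<node>
  <interface name="org.freedesktop.portal.XRDevices">
    <!-- Ask the portal to lease a set of XR devices (displays, inputs, ...).
         Returns a Request object path; the lease fds would be delivered in
         the Request's Response signal, per portal convention. -->
    <method name="LeaseDevices">
      <arg type="s" name="parent_window" direction="in"/>
      <arg type="a{sv}" name="options" direction="in"/>
      <arg type="o" name="handle" direction="out"/>
    </method>
    <property name="version" type="u" access="read"/>
  </interface>
</node>
```

Treating the whole XR device set as one leasable unit in `options` would match the earlier suggestion of presenting display, input, and sensors as a single device.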