Kernel config suggestions #8

Open
muke101 opened this issue Nov 5, 2020 · 12 comments
Comments

@muke101

muke101 commented Nov 5, 2020

Hi, I compiled a custom kernel for my reMarkable 1 and was discussing my config changes on the Discord. It was suggested that I share them here, as they may be worth including in the default kernel to improve performance, battery life, or both.

The following changes I see no reason not to have on all kernels:

  • Disable high memory support - the default config enables support for memory addresses greater than 4GB. This seems like an oversight, and disabling it yields a faster kernel.
  • Trim unused kernel symbols - this makes the kernel smaller and gives the compiler more optimization opportunities on the kernel source. Some external kernel modules require the symbols that get trimmed, but I've trimmed them in my kernel and my rm1 works fine. I obviously can't be certain myself, but it's fair to assume this is at least worth looking at. I should mention that I haven't yet found time to build a kernel that includes the proprietary wifi driver, so for all I know it relies on some of the trimmed symbols.
  • Disable SLUB debugging - this adds to the kernel binary size and serves no purpose.
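
For anyone who wants to compare their own kernel against these three suggestions, here is a minimal sketch that scans the text of a `.config` file. The mapping to `CONFIG_HIGHMEM`, `CONFIG_TRIM_UNUSED_KSYMS` and `CONFIG_SLUB_DEBUG` is an assumption on my part (the post names the options only by description), and the helper names are made up for illustration.

```python
import re

# Assumed mapping from the three suggestions to Kconfig symbols; the post
# describes the options in words only, so verify against your kernel tree.
SUGGESTED = {
    "CONFIG_HIGHMEM": False,           # disable high memory support
    "CONFIG_TRIM_UNUSED_KSYMS": True,  # trim unused exported symbols
    "CONFIG_SLUB_DEBUG": False,        # disable SLUB debugging
}

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {symbol: value} for .config text; unset symbols map to 'n'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            values[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            values[match.group(1)] = "n"
    return values


def check_suggestions(text):
    """Return {symbol: suggested value} for symbols that differ from the suggestion."""
    values = parse_config(text)
    mismatches = {}
    for symbol, want_enabled in SUGGESTED.items():
        # A symbol missing from the file is treated as disabled.
        enabled = values.get(symbol, "n") in ("y", "m")
        if enabled != want_enabled:
            mismatches[symbol] = "y" if want_enabled else "n"
    return mismatches
```

Calling `check_suggestions(open(".config").read())` on a build tree would return an empty dict when all three options already match.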

I've made some more changes that I'm slightly uncertain about for enabling by default, but will share them anyway:

  • Disable watchdog timer - this takes up memory and CPU cycles in the background, draining the battery. Supposedly it might be useful for breaking out of bootloops, but I'm not sure how much of an issue those are on the rm1. If they're not, this should definitely be considered for removal.
  • Optimize very likely/unlikely branches - this should make the kernel run faster most of the time and very occasionally run slower on some operations. I decided to make the trade-off, but I'm not sure which might be preferred for default usage.
  • 1 kHz rather than 100 Hz timer frequency - this trades battery life and the amount of cache space the kernel takes up for lower latency in responding to hardware interrupts. I'm less sure this is really worth it, but I think it's safe to assume drawing on the screen involves a hardware interrupt, so it might even reduce drawing latency? I'm unable to benchmark it, but I thought it would be worth mentioning anyway. Any reduction in battery life, even a slight one, obviously needs to be taken into account too. There are also middle grounds between these two extremes.
  • Patched for real-time preemption - the default kernel is set to involuntary preemption, or 'low latency desktop'. This is a trade-off between responsiveness and maximum CPU throughput. I figured that, much like the timer frequency, enabling even more preemption should reduce latency, in this case not just for hardware interrupts but for all user space software, especially when the CPU is under load. Obviously this is a separate patch that needs to be applied, and it may not be worth it at all, but once again I thought it would be worth mentioning.
  • Switch from LZO compression to LZ4 - this makes the kernel slightly bigger but decreases boot times. As it's already at LZO, I figured the additional speed had been judged not worth the additional space, but I decided to make this trade-off myself anyway.

I'm an amateur coming at this with little experience, so forgive me if everything here has already been considered and accounted for, but I figured that if there was a chance even one of these things hadn't been thought of in the existing config, it could be useful to flag! I will say that, though I haven't benchmarked regular usage performance or screen latency, I have recorded an almost exactly three-second decrease in boot time with these settings, which I think is fairly significant.

@cweagans

cweagans commented Mar 2, 2021

@larsim are there any plans to implement some of these changes in the "official" kernel for the reMarkable? Would you be more likely to incorporate these changes if a PR were opened? It would be great to not have to compile and install a custom kernel to get the performance and battery life improvements listed above!

@larsim

larsim commented Mar 26, 2021

@muke101 Thank you for this! @cweagans Yes, we plan to test some of these changes and possibly incorporate them, and we are always looking to improve our kernel. I'm also interested if you have other improvements that you'd suggest I look into while I'm at it.

@LinusCDE

LinusCDE commented Mar 29, 2021

One really cool addition would also be to have uinput in the kernel by default. I don't think the additional size would be noticeable to most users, and it would solve a plethora of challenges for certain custom software.

Examples:

  • Launchers have to resort to ring buffer overflows to cancel input
  • Adding a custom uinput module as a package is possible, but tricky and bound to break after a firmware update
  • More in-depth control over software would be much easier that way (for example, one could then use LD_PRELOAD or, ideally, mount namespaces to swap out an event file for a custom uinput one).
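
To make the uinput idea a bit more concrete, here is a rough sketch of the event records a uinput-backed device would be fed. It only builds the `struct input_event` bytes; opening `/dev/uinput` and creating the device with the `UI_SET_*`/`UI_DEV_CREATE` ioctls is left out. The struct format assumes the native C `long` matches the kernel's `timeval` on the target, and `pack_event`/`key_press` are illustrative names, not part of any existing tool.

```python
import struct

# Native layout of `struct input_event` from <linux/input.h>: a struct timeval
# (two C longs) followed by __u16 type, __u16 code, __s32 value.
# Assumption: the native "long" size matches the kernel's timeval here.
INPUT_EVENT = struct.Struct("llHHi")

# Constants from <linux/input-event-codes.h>.
EV_SYN, EV_KEY = 0x00, 0x01
SYN_REPORT = 0
KEY_POWER = 116


def pack_event(ev_type, code, value, sec=0, usec=0):
    """Pack one input_event record as raw bytes."""
    return INPUT_EVENT.pack(sec, usec, ev_type, code, value)


def key_press(code):
    """Build a key press and release, each followed by a SYN_REPORT."""
    sync = pack_event(EV_SYN, SYN_REPORT, 0)
    return [
        pack_event(EV_KEY, code, 1), sync,
        pack_event(EV_KEY, code, 0), sync,
    ]
```

Each element returned by `key_press(KEY_POWER)` is what would be written to the uinput file descriptor once the virtual device exists.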

The fuse module would probably also be nice for some people, but it is of lesser importance to me personally and I don't see as many use cases for it.

As I said, this is just a suggestion. I fully understand if this is not deemed useful for the device and therefore not added.

Edit: Fixed grammar

@loicpoulain
Contributor

loicpoulain commented Jun 10, 2021

> The following changes I see no reason not to have on all kernels:
>
> * Disable high memory support - the default config enables support for memory addresses greater than 4GB. This seems like an oversight, and disabling it yields a faster kernel.

I agree.

> * Trim unused kernel symbols - this makes the kernel smaller and gives the compiler more optimization opportunities on the kernel source. Some external kernel modules require the symbols that get trimmed, but I've trimmed them in my kernel and my rm1 works fine. I obviously can't be certain myself, but it's fair to assume this is at least worth looking at. I should mention that I haven't yet found time to build a kernel that includes the proprietary wifi driver, so for all I know it relies on some of the trimmed symbols.
>
> * Disable SLUB debugging - this adds to the kernel binary size and serves no purpose.

Ideally, all debug options should be disabled.

> I've made some more changes that I'm slightly uncertain about for enabling by default, but will share them anyway:
>
> * Disable watchdog timer - this takes up memory and CPU cycles in the background, draining the battery. Supposedly it might be useful for breaking out of bootloops, but I'm not sure how much of an issue those are on the rm1. If they're not, this should definitely be considered for removal.

The watchdog is a safety mechanism that prevents the system from hanging indefinitely; it's a must-have feature on these devices. Moreover, the overhead is insignificant: something like a few CPU cycles every ~30 s while the device is running, and nothing while the device is sleeping.

> * Optimize very likely/unlikely branches - this should make the kernel run faster most of the time and very occasionally run slower on some operations. I decided to make the trade-off, but I'm not sure which might be preferred for default usage.

Can you elaborate here? What do you want to change? Add more branch-prediction helper calls?

> * 1 kHz rather than 100 Hz timer frequency - this trades battery life and the amount of cache space the kernel takes up for lower latency in responding to hardware interrupts. I'm less sure this is really worth it, but I think it's safe to assume drawing on the screen involves a hardware interrupt, so it might even reduce drawing latency? I'm unable to benchmark it, but I thought it would be worth mentioning anyway. Any reduction in battery life, even a slight one, obviously needs to be taken into account too. There are also middle grounds between these two extremes.

You should not worry too much about CONFIG_HZ; interrupts are not really affected by it. Hard interrupt handlers are executed 'synchronously' and so do not depend on that value, and 'threaded' interrupt handlers are scheduled with the SCHED_FIFO policy, so they run shortly after the interrupt occurs. CONFIG_HZ mainly affects timer wheel accuracy, which most of the time is not that important. The other impact could be task timeslices being too large, but that is partly addressed by CONFIG_SCHED_HRTICK, which makes the kernel rely on high-resolution timers for fair scheduling.
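
As a rough illustration of the timer accuracy point, the arithmetic below shows how a tick-based timeout gets rounded at 100 Hz versus 1000 Hz. This is a simplified model under my own assumptions (timeouts round up to whole ticks, nothing else is modelled), not the kernel's actual timer code, and the function names are made up.

```python
def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given tick rate."""
    return 1000.0 / hz


def timeout_in_ticks(timeout_ms, hz):
    """Round a millisecond timeout up to a whole number of ticks."""
    return (timeout_ms * hz + 999) // 1000


def effective_timeout_ms(timeout_ms, hz):
    """Timeout actually achievable once it has been rounded up to ticks."""
    return timeout_in_ticks(timeout_ms, hz) * tick_period_ms(hz)
```

With this model a 3 ms timeout becomes one 10 ms tick at 100 Hz but stays at 3 ms at 1000 Hz, which is the granularity difference being discussed; timers backed by high-resolution timers are not bound by it.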

> * Patched for real-time preemption - the default kernel is set to involuntary preemption, or 'low latency desktop'. This is a trade-off between responsiveness and maximum CPU throughput. I figured that, much like the timer frequency, enabling even more preemption should reduce latency, in this case not just for hardware interrupts but for all user space software, especially when the CPU is under load. Obviously this is a separate patch that needs to be applied, and it may not be worth it at all, but once again I thought it would be worth mentioning.

I would say such a device does not host real-time applications, so I'm not really sure what it would improve for the user. Sure, it will bring 'bounded latency', but it will also degrade overall performance. Another problem is that RT-patched Linux is far less tested than mainline Linux, and I'm not sure the downstream drivers integrated for this project have been well tested with the RT patches. The RT patches make substantial modifications (e.g. converting spinlocks to mutexes) that may not have been taken into account by all these drivers, and so can introduce a whole new range of issues.

> * Switch from LZO compression to LZ4 - this makes the kernel slightly bigger but decreases boot times. As it's already at LZO, I figured the additional speed had been judged not worth the additional space, but I decided to make this trade-off myself anyway.

Yes, that is probably worth testing.

> I'm an amateur coming at this with little experience, so forgive me if everything here has already been considered and accounted for, but I figured that if there was a chance even one of these things hadn't been thought of in the existing config, it could be useful to flag! I will say that, though I haven't benchmarked regular usage performance or screen latency, I have recorded an almost exactly three-second decrease in boot time with these settings, which I think is fairly significant.

Thanks for all your ideas, we are looking at them, and some changes have already been applied internally.

@Eeems

Eeems commented Jun 10, 2021

> Thanks for all your ideas, we are looking at them, and some changes have already been applied to Linux 5.4: https://github.com/reMarkable/linux-internal/pull/148

It looks like that's a private repository.

@muke101
Author

muke101 commented Jun 23, 2021

> > * Optimize very likely/unlikely branches - this should make the kernel run faster most of the time and very occasionally run slower on some operations. I decided to make the trade-off, but I'm not sure which might be preferred for default usage.
>
> Can you elaborate here? What do you want to change? Add more branch-prediction helper calls?

'Optimize very likely/unlikely branches' is a kernel config option. It just provides hints to gcc, making things slightly faster when its profiling is right and slightly slower when it's not, which should be the minority case. Maybe you can argue that negativity bias makes disabling it preferable, but I think it's worth considering.

> Thanks for all your ideas, we are looking at them, and some changes have already been applied internally.

This is really cool to hear, glad I could help!

@Etn40ff

Etn40ff commented Feb 1, 2022

One of the suggested changes, namely trimming unused symbols, has been implemented in commit 6df0622.

Unfortunately, because of this change it is now impossible to load custom modules. Could this change be reverted?

@loicpoulain
Contributor

> Unfortunately, because of this change it is now impossible to load custom modules. Could this change be reverted?

Can't you rebuild your own kernel with required config changes to load your module(s)?

@Eeems

Eeems commented Feb 7, 2022

> > Unfortunately, because of this change it is now impossible to load custom modules. Could this change be reverted?
>
> Can't you rebuild your own kernel with required config changes to load your module(s)?

Yes, but that can be risky, and it makes it harder to provide custom software that requires such modules, as it's a lot harder to ask people to replace the kernel on their device.

@cgevans

cgevans commented Feb 18, 2022

I'd similarly ask if you'd consider reverting the TRIM_UNUSED_KSYMS change. It has significantly degraded my experience with my rM2. It breaks almost all ability to add extra modules: something as simple as connecting the device to a VPN now requires rebuilding the whole kernel, with all the risks that this entails (particularly on the rM2), rather than simply compiling a module and copying over a few files, e.g. for WireGuard or tun/tap interfaces.

The kernel config documentation generally recommends against enabling the option, and doesn't even display it in configuration unless expert mode is enabled, noting that it's for 'specialized environments that can tolerate a "non-standard" kernel'. It really does cripple the device for some users, for seemingly minimal benefit. Having it disabled is not a debugging option: keeping ksyms is the standard.
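
To show why trimming breaks out-of-tree modules, here is a small sketch that compares the symbols a module needs against a kernel's export list. It assumes `Module.symvers`-style text with tab-separated fields and the symbol name in the second column; the function names and the sample data are hypothetical.

```python
def exported_symbols(symvers_text):
    """Collect symbol names from Module.symvers-style text.

    Assumed layout: tab-separated fields with the symbol name second
    (CRC, symbol, module, export type, ...).
    """
    symbols = set()
    for line in symvers_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1]:
            symbols.add(fields[1])
    return symbols


def missing_symbols(needed, symvers_text):
    """Return the needed symbols the kernel does not export, sorted."""
    return sorted(set(needed) - exported_symbols(symvers_text))
```

With trimming enabled, the export list shrinks to what the in-tree modules use, so any symbol reported by `missing_symbols` for a custom module would leave that module unloadable.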

@Etn40ff

Etn40ff commented Jun 26, 2022

Thank you 80c6812

@loicpoulain
Contributor

I would suggest closing this issue now.
