Real-time Linux is officially part of the kernel

417 points, posted a day ago
by jonbaer

122 Comments

jpfr

14 hours ago

This is a big achievement after many years of work!

Here are a few links to see how the work is done behind the scenes. Sadly, Ars Technica has only funny links and doesn't provide the actual sources (why LinkedIn?).

Most of the work was done by Thomas Gleixner and team. He founded Linutronix, now (I believe) owned by Intel.

Pull request for the last printk bits: https://marc.info/?l=linux-kernel&m=172623896125062&w=2

Pull request for PREEMPT_RT in the kernel config: https://marc.info/?l=linux-kernel&m=172679265718247&w=2

This is the log of the RT patches on top of kernel v6.11.

https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-...

I think there are still a few things you need on top of a vanilla kernel. For example, the new printk infrastructure still needs to be adopted by the actual drivers (UART consoles and so on). But the size of the RT patchset is already much, much smaller than before. And being configurable out-of-the-box is of course a big sign of confidence from Linus.

Congrats to the team!

weinzierl

11 hours ago

Thomas Gleixner is one of the most prolific people I've heard of. He has been one of the most active kernel developers for more than a decade, leading the pack at times, and is currently ranked at position five:

https://lwn.net/Articles/956765/

femto

17 hours ago

If you want to see the effect of the real-time kernel, build and run the cyclictest utility from the Linux Foundation.

https://wiki.linuxfoundation.org/realtime/documentation/howt...

It measures and displays the interrupt latency for each CPU core. Without the real-time patch, worst case latency can be double digit milliseconds. With the real-time patch, worst case drops to single digit microseconds. (To get consistently low latency you will also have to turn off any power saving states, as a transition between sleep states can hog the CPU, despite the RT kernel.) Cyclictest is an important tool if you're doing real-time with Linux.

As an example, if you're doing processing for software defined radio, it's the difference between the system occasionally having "blips" and the system having rock solid performance, doing what it is supposed to every time. With the real time kernel in place, I find I can do acid-test things, like running GNOME and libreoffice on the same laptop as an SDR, and the SDR doesn't skip a beat. Without the real-time kernel it would be dropping packets all over the place.

aero-glide2

16 hours ago

Interestingly, whenever I touch my touchpad, the worst-case latency shoots up 20x, even with the RT patch. What could be causing this? And it's always on core 5.

femto

15 hours ago

Perhaps the code associated with the touchpad has a priority greater than the one you used to run cyclictest (80?). Does it still happen if you boost the priority of cyclictest to the highest possible, using the option:

--priority=99

Apply priority 99 with care to your own code. A tight endless loop at priority 99 will override pretty much everything else, so about the only way to escape will be to turn your computer off. Been there, done that :-)

snvzz

11 hours ago

The most important thing is to set the policy, described in sched(7), rather than the priority.

Note that without setting a priority, the default policy is SCHED_OTHER, which is the standard one most processes get unless they request something else.

By setting a priority (while not specifying a policy), the policy becomes SCHED_FIFO, the highest, which is meant to get the CPU immediately and not be preempted until the process releases it.

This implicit change in policy is why you see such a brutal effect from setting the priority.
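
For reference, here is a minimal sketch in C (assuming Linux and sufficient privileges, e.g. root or CAP_SYS_NICE) of setting the policy and priority explicitly; this is roughly what passing --priority to cyclictest, or wrapping a program with chrt -f, ends up requesting:

    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        struct sched_param sp;
        memset(&sp, 0, sizeof(sp));
        sp.sched_priority = 80;   /* SCHED_FIFO priorities run 1..99; 99 is highest */

        /* Explicitly request the SCHED_FIFO real-time policy for this process.
           Needs root, CAP_SYS_NICE, or a suitable RLIMIT_RTPRIO. */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
            perror("sched_setscheduler");
            return 1;
        }

        printf("running with SCHED_FIFO at priority %d\n", sp.sched_priority);
        /* ... real-time work goes here ... */
        return 0;
    }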

robocat

6 hours ago

Perhaps an SMM ring -2 touchpad driver?

If you're developing anything on x86 that needs realtime - how do you disable SMM drivers causing unexpected latency?

jabl

5 hours ago

Buy HW that can be flashed with coreboot?

And while it won't (completely) remove SMM, https://github.com/corna/me_cleaner might get rid of some stuff. I think that's more about getting rid of spyware and ring -1 security bugs than improving real-time behavior though.

angus-g

15 hours ago

Maybe a PS/2 touchpad that is triggering (a bunch of) interrupts? Not sure how hardware interrupts work with RT!

jabl

14 hours ago

One of the features of PREEMPT_RT is that it converts interrupt handlers to running in their own threads (with some exceptions, I believe), instead of being tacked on top of whatever thread context was active at the time like with the softirq approach the "normal" kernel uses. This allows the scheduler to better decide what should run (e.g. your RT process rather than serving interrupts for downloading cat pictures).

monero-xmr

14 hours ago

Touchpad support is very poor in Linux. I use System76 and the touchpad is always a roll of the dice with every kernel upgrade, despite it being a "good" distro / vendor.

dijit

11 hours ago

Quiet reminder that "real-time" is almost best considered "consistent-time".

The problem space is such that it doesn't necessarily mean "faster" or lower latency in any way, just that where there is latency: it's consistent.

PhilipRoman

2 hours ago

Indeed, some of my colleagues worked on a medical device which must be able to reset itself in 10 seconds in case something goes wrong. 10 seconds is plenty of time on average; the real problem is eliminating those remaining 0.01% of cases.

amiga386

9 hours ago

I always viewed it as "the computer needs to control things that are happening in real time and won't wait for it if it's late".

froh

10 hours ago

Consistent as in reliably bounded, that is.

cwillu

20 hours ago

Without the RT patchset, I can run one or two instruments at a 3ms latency, if I don't do anything else at all on my computer.

With it, I routinely have 6 instruments at 1ms, while having dozens of chrome windows open and playing 3d shooters without issue.

It's shocking how much difference it makes over the regular (non-rt) low latency scheduler.

nixosbestos

17 hours ago

Wait, so should casual desktop Linux users try this out too? I assumed there must be some trade-off to using RT?

femto

17 hours ago

It's ever so slightly slower, but the difference is negligible and won't be noticed on a desktop machine. These days, I just run the (Debian) real-time kernel as a matter of course on my everyday machine.

I haven't objectively tested it, but my feeling is that it actually makes for a nicer user experience. Sometimes Gnome can briefly freeze or feel sluggish (presumably the CPU is off doing something) and I feel that the RT kernel does away with this. It could be a placebo effect though.

ChocolateGod

8 hours ago

> It's ever so slightly slower

In what way? I'd say responsiveness is more important on the desktop than raw performance, and from my experience with nearly two decades of using Linux desktops, responsiveness has never been great.

If I'm switching between windows whilst encoding a video in the background, the window manager should have instant priority even if it means starving the background task of some CPU time. On GNOME this is quite bad: run a very heavy task (e.g. AI) in the background and the desktop will start to suffer.

cwillu

17 hours ago

Not really any harm in trying, but definitely note that the trail marked “trying scheduler changes to see if it improves desktop performance” is strewn with skeletons, whose ghosts haunt audio forums saying things like “[ghostly] oooooohhhh, the sound is so much clearer now that I put vibration dampeners under my usb audio interface”.

The reason I wrote my original comment is precisely because “audio xruns at a higher latency with lower system load” is a very concrete measure of improvement that I can't fool myself about, including effects like “the system runs better when freshly booted for a while” that otherwise bias the judgements of the uninitiated towards “…and therefore the new kernel improved things!”

There isn't much on a desktop that is sensitive to latency spikes on the order of a couple ms, which a stock kernel should already be able to maintain.

snvzz

11 hours ago

It can literally sound better (objectively).

Suppose your audio server attempts fancy resampling, but falls back to a crude approximation after the first xrun.

cwillu

10 hours ago

Theoretically possible, but show me a sound server that automatically drops resampling quality instead of just increasing the buffer size.

snvzz

7 hours ago

That's a different knob that can be used; increasing the buffer size is simply a different compromise for meeting audio deadlines.

Quality vs latency, pick one.

Or just use PREEMPT_RT to tighten the timings for the critical audio worker getting the cpu ;)

cwillu

7 hours ago

    JERRY: We didn't bet on if you wanted to. We bet on if it would be done.

    KRAMER: And it could be done.

    JERRY: Well, of course it could be done! Anything could be done! But it only is 
                  done if it's done. Show me the levels! The bet is the levels.
Again, the point isn't that there is a possible tradeoff to be made, nor that the configuration option isn't available, nor even that some people tweak that setting for this very reason. It was stated that better RT performance will automatically improve audio quality because the audio system may automatically switch resampling methods on xrun, and that is specifically what I'm doubting.

The bet isn't that it could be done. Anything could be done! Show me that it is being done!

snvzz

6 hours ago

A true audiophile can tell.

Never mind switching approaches to interpolation; the microjitter is blatant, and the plankton is lost.

bityard

16 hours ago

The trade off is reduced throughput. How much depends a lot on the system and workload.

freedomben

19 hours ago

6 instruments at 1ms, that's great! Are these MIDI instruments or audio in? A bit off-topic, but out of curiosity (and desperation), do you use (and/or can you recommend) any VST instruments for Linux?

Do you experience any downsides running the RT scheduler?

cwillu

18 hours ago

Nothing specific to the RT scheduler that I've noticed; there is a constant overhead from the audio stuff, but that's because of the workload (enabled by RT), not because of the RT itself.

My usual setup has 2 PianoTeq (physically modelled piano/electric piano/clavinet) instances, 3 SurgeXT instances (standard synthesizer), a setBfree (Tonewheel/hammond simulator) instance, and a handful of sequencers and similar for drums, as well as a bunch of routing and compressors and such.

darkwater

13 hours ago

Out of curiosity, what music do you compose? How would you judge the Linux experience doing so, outside the RT topic?

Do you have any published music you're willing to share?

Thanks!

p1necone

19 hours ago

Is there a noticeable difference in performance for the less latency-sensitive stuff? (e.g. lower FPS in games)

cwillu

18 hours ago

GPU-bound stuff is largely unaffected; CPU-bound definitely takes a hit (although there's no noticeable additional latency on non-RT tasks), but that's kinda to be expected.

nine_k

18 hours ago

I would not expect lower FPS, because the amount of available CPU does not materially change. I would expect higher latency, because RT threads would more often be scheduled ahead of other threads.

torginus

9 hours ago

I remember trying to use Linux for real-time stuff in the mid 2000s, and all real-time Linuxes were very hacky and obviously out of tree, with the common solution for achieving real-time behavior being to host Linux as a process inside a true real-time microkernel.

Afaik, the reason real-time Linux was considered impractical was that to have hard RT guarantees, you needed to ensure that ALL non-preemptible sections in the kernel had bounded runtime, which was a huge burden for a fairly niche use case.

I wonder how they got around this requirement, or if they didn't, did they rewrite everything to be compliant with this rule?

Also, does this mean that Linux supports priority inversion now?

miki123211

21 hours ago

Are there any good resources on how this kind of real-time programming is done?

What goes into ensuring that a program is actually realtime? Are there formal proofs, or just experience and "vibes"? Is realtime coding any different from normal coding? How do modern CPU architectures, which have a lot of non-constant-time instructions, branch prediction, potential for cache misses and such, play into this?

throwup238

21 hours ago

> What goes into ensuring that a program is actually realtime?

Realtime mostly means predictable runtime for code. As long as it's predictable, you can scale the CPU/microcontroller to fit your demands or optimize your code to fit the constraints. It's about making sure your code can always respond in time to hardware inputs, timers, and other interrupts.

Generally the Linux kernel's scheduling makes the system very unpredictable. RT Linux tries to address that, along with several other subsystems. On embedded CPUs this usually means disabling advanced features like caches, branch prediction, and speculative execution (although I don't remember if RT handles that part, since it's very vendor-specific).

gmueckl

13 hours ago

"Responding in time" here means meeting a hard deadline under any circumstances, no matter what else may be going on simultaneously. The counterintuitive part is that this about worst case, not best case or average case. So you might not want a fancy algorithm in that code path that has insanely good average runtime, but a tiny chance to blow up, but rather one that is slower on average, but has tight bounded worst case performance.

Example: you'd probably want the airbags in your car to fire precisely at the right time to catch you and keep you safe rather than blow up in your face too late and give you a nasty neck injury in addition to the other injuries you'll likely get in a hard enough crash.

juliangmp

20 hours ago

I'm not hugely experienced in the field personally, but from what I've seen, actually proving hard real-time capabilities is rather involved. If something is safety-critical (think brake systems, avionics computers, etc.) it likely means you also need some special certification or even formal verification. And (correct me if I'm wrong) I don't think you'll want to use a Linux kernel, even with the preempt-rt patches. I'd say specialized RT operating systems, like FreeRTOS or Zephyr, would be more fitting (though I don't have direct experience with them).

As for the hardware, you can't really use a ‘regular’ CPU and expect completely deterministic behavior. The things you mentioned (and, for example, caching) absolutely impact this. Iirc AMD/Xilinx actually offer a processor that has regular Arm cores alongside some Arm real-time cores for exactly these reasons.

monocasa

18 hours ago

There are only a few projects I know of that provide formal proofs wrt their real-time guarantees, seL4 being the only public example.

That being said, vibes and the KISS principle can get you remarkably far.

actionfromafar

21 hours ago

For things like VxWorks, it's mostly vibes and setting priorities between processes. But there are other ways. You can "offline schedule" your tasks, i.e. you run a scheduler at compile time which decides all possible supported orderings and how long a slot each task can run in.

Then, there's the whole thing of hardware. Do you have one or more cores? If you have more than one core, can they introduce jitter or slowdown to each other accessing memory? And so on and so forth.

tonyarkles

20 hours ago

> it's mostly vibes and setting priority between processes

I'm laughing so so hard right now. Thanks for, among other things, confirming for me that there isn't some magic tool that I'm missing :). At least I have the benefit of working on softer real-time systems where missing a deadline might result in lower quality data, but there are no lives at risk.

Setting and clearing GPIOs on task entry/exit are a nice touch for verification too.

nine_k

18 hours ago

Magic? Well, here's some: predictably fast interrupts; critical sections where your code cannot be preempted, but with a watchdog so that if your code hits an infinite loop it's restarted; no unpredictable memory allocation delays; no unexpected page fault delays; things like that.

These are relatively easy to obtain on an MCU, where there's no virtual memory, physical memory is predictable (if slow), interrupt hardware is simple, hardware watchdogs are the norm, and normally there's no need for preemptive multitasking.

But when you try to make it work in a kernel that supports VMM, kernel/userland privilege separation, user session separation, process separation, and preemptive multitasking, and has to work on hardware with a really complex bus and a complex interrupt controller — well, that's where the magic begins.

aulin

13 hours ago

VMM is one of the few things I really miss while working in embedded. I would happily trade memory allocation errors from a fragmented heap for some unpredictable malloc delay (which could maybe be mitigated with some timeout?).

nine_k

4 hours ago

Reminds me of the time of banked memory in 8-bit systems :) It's certainly doable, to some extent, and is a hassle to manage %) I suppose it can be implemented with an MCU + QSPI RAM at a cost of one extra SPI clock to access the RAM through a small SRAM that would store the page translation table.

I just think that something like A0 (to say nothing of ATMega) usually has too little RAM for it to be worth the trouble, and A7 (something like ESP32) already has an MMU.

tonyarkles

15 hours ago

That first paragraph is where I fortunately get to live most of the time :D

rightbyte

21 hours ago

> If you have more than one core, can they introduce jitter or slowdown to each other accessing memory?

DMA and fancy peripherals like UART, SPI etc, could be namedropped in this regard, too.

nine_k

18 hours ago

Plot twist: the very memory may be connected via SPI.

wheels

18 hours ago

There's some difference between user space and kernel. I don't have much experience in the kernel, but I feel like it's more about making sure tasks are preemptable.

In user space it's often about complexity and guarantees: for example, you really try not to do mallocs in a real-time thread in user space, because it's a system call that may take an unpredictable amount of time to return. Better to preallocate buffers or use the stack. Same for opening files and stuff like that -- you want to avoid variable-time syscalls and do them at thread / application setup.

The choice of algorithms needs to be such that whatever n you're working with can be processed inside one sample-generation interval. I'm mostly familiar with audio -- e.g. if you're generating audio at 44100 Hz, each sample's worth of work has a budget of roughly 23 microseconds (and a chunk of samples has to be finished within the corresponding chunk interval).
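
To make the "preallocate and avoid syscalls" point concrete, here is a rough sketch in C (the names, the buffer size, and the trivial "DSP" are made up for illustration, not taken from any particular audio API):

    #include <string.h>
    #include <sys/mman.h>

    #define MAX_FRAMES 4096            /* worst-case chunk size we ever expect */

    static float scratch[MAX_FRAMES];  /* preallocated; never malloc'd in the RT path */

    /* Setup phase: allowed to block, allocate, open files, etc. */
    int rt_setup(void)
    {
        /* Lock current and future pages so the RT thread never takes a page fault. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            return -1;
        /* Touch the buffer once so its pages are resident before going real-time. */
        memset(scratch, 0, sizeof(scratch));
        return 0;
    }

    /* RT phase: called once per chunk; no malloc, no file I/O, no locks. */
    void process_chunk(float *out, const float *in, int nframes)
    {
        for (int i = 0; i < nframes && i < MAX_FRAMES; i++) {
            scratch[i] = in[i] * 0.5f;   /* stand-in for the real DSP */
            out[i] = scratch[i];
        }
    }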

saagarjha

11 hours ago

Real-time performance is not really possible in userspace unless your kernel is kept in the loop, because preemption can happen at any time.

kaba0

10 hours ago

I guess we really have to specify whether it is soft or hard realtime we are talking about. The former can be done in userspace (e.g. video games); the latter probably needs a custom OS (I don’t think rt-linux is good for actual hard realtime stuff).

dgan

10 hours ago

How do you handle runtime-defined sizes then? Just preallocate the maximum possible number of bytes?

stevemackinnon

7 hours ago

Here’s a frequently cited article about real-time audio programming that should be generally applicable to other contexts: http://www.rossbencina.com/code/real-time-audio-programming-... In my experience in audio dev, enforcing hard real-time safety is mostly experience-based: knowing to avoid locks, heap allocations, and syscalls from the real-time thread, etc.
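
One concrete building block for that (a sketch, not something from the linked article): a single-producer/single-consumer ring buffer built on C11 atomics, so a non-RT thread can hand data to the real-time thread without locks or allocation. A statically allocated ring starts out zeroed, i.e. empty.

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define RING_SIZE 1024   /* must be a power of two */

    typedef struct {
        float buf[RING_SIZE];
        _Atomic size_t head;   /* written only by the producer */
        _Atomic size_t tail;   /* written only by the consumer (RT thread) */
    } spsc_ring;

    /* Producer side (non-RT thread): returns false if the ring is full. */
    bool ring_push(spsc_ring *r, float v)
    {
        size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
        size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
        if (head - tail == RING_SIZE)
            return false;                       /* full: drop or retry later */
        r->buf[head & (RING_SIZE - 1)] = v;
        atomic_store_explicit(&r->head, head + 1, memory_order_release);
        return true;
    }

    /* Consumer side (RT thread): returns false if the ring is empty.
       No locks, no syscalls, no allocation: safe from the audio callback. */
    bool ring_pop(spsc_ring *r, float *out)
    {
        size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
        size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
        if (head == tail)
            return false;                       /* empty */
        *out = r->buf[tail & (RING_SIZE - 1)];
        atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
        return true;
    }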

YZF

16 hours ago

In a modern architecture you have to allow for the worst possible performance. Most real-time software doesn't interact with the world at modern CPU time scales, so whether the 2 GHz CPU mispredicted a branch is not going to be relevant. You just budget for the worst case unless you can guarantee better by design.

rightbyte

21 hours ago

On all the real-time systems I've worked on, it has just been empirical measurement of CPU load for the different task periods and a good enough margin against overruns.

On an ECU I worked on, the cache was turned off to not have cache misses ... no cache no problem. I argued it should be turned on and the "OK cpu load" limit decreased instead. But nope.

I wouldn't say there is any conceptual difference from normal coding, except that you'd want to be kinda sure algorithms terminate in a reasonable time in a time-constrained task. More online algorithms than normal, though.

Most of the strangeness in real-time coding is actually about doing control theory stuff, is my take. The program often feels like a state machine going in a circle.

tonyarkles

20 hours ago

> On an ECU I worked on, the cache was turned off to not have cache misses ... no cache no problem. I argued it should be turned on and the "OK cpu load" limit decreased instead. But nope.

Yeah, the tradeoff there is interesting. Sometimes "get it as deterministic as possible" is the right answer, even if it's slower.

> Most of the strangeness in real time coding is actually about doing control theory stuff is my take. The program often feels like state-machine going in a circle.

Lol, with my colleagues/juniors I'll often encourage them to take code that doesn't look like that and figure out if there's a sane way to turn it into "state-machine going in a circle". For problems that fit that mold, being able to say "event X in state Y will have effect Z" is really powerful for being able to reason about the system. Plus, sometimes, you can actually use that state machine to more formally reason about it or even informally just draw out the states, events, and transitions and identify if there's anywhere you might get stuck.

8bitsrule

12 hours ago

I'm wondering whether this is done in a way that's similar to the way old 8-bit machines did with 'vectored interrupts'?

(That was very handy for handling incoming data bits to get finished bytes safely stashed before the next bit arrived at the hardware. Been a -long time- since I heard VI's mentioned.)

candiddevmike

21 hours ago

You don't break the electrical equipment/motor/armature/process it's hooked up to.

In rt land, you test in prod and hope for the best.

chasd00

18 hours ago

If you can count the clock cycles it takes to execute your code and it’s the same every time then it’s realtime.

kristoffer

13 hours ago

"Torvalds wrote the original code for printk, a debugging tool that can pinpoint exact moments where a process crashes"

A debugging tool? I do like printk debugging but I am not sure about that description :-)

eqvinox

6 hours ago

Amazing!

But:

> worst-case latency timings a real-time Linux provides are quite useful to, say, the systems that monitor car brakes

I really hope my car brakes don't run Linux ;D …

(they should be running something that has a formal proof of correctness, which is outside the scope of realistically possible for Linux or any other "full-scale" OS)

(pretty sure the article author came up with that example and no Linux kernel developer is aiming for car brakes either. Same for large CNC machines - they can kill and have killed people.)

alangibson

21 hours ago

This is big for the CNC community. RT is a must have, and this makes builds that much easier.

dale_glass

21 hours ago

Why use Linux for that though? Why not build the machine like a 3D printer, with a dedicated microcontroller that doesn't even run an OS and has completely predictable timing, and a separate non-RT Linux system for the GUI?

juliangmp

20 hours ago

I feel like Klipper's approach is fairly reasonable: let a non-RT system (that generally has better performance than your microcontroller) calculate the movement, but leave the actual commanding of the stepper motors to the microcontroller.

tonyarkles

20 hours ago

Yeah, I looked at Klipper a few months ago and really liked what I saw. Haven't had a chance to try it out yet but like you say they seem to have nailed the interface boundary between "things that should run fast" (on an embedded computer) and "things that need precise timing" (on a microcontroller).

One thing to keep in mind for people looking at the RT patches and thinking about things like this: these patches allow you to do RT processing on Linux, but they don't make some of the complexity go away. In the Klipper case, for example, writing to the GPIOs that actually send the signals to the stepper motors in Linux is relatively complex. You're usually making a write() syscall that's going through the VFS layer etc. to finally get to the actual pin register. On a microcontroller you can write directly to the pin register and know exactly how many clock cycles that operation is going to take.

I've seen embedded Linux code that actually opened /dev/mem and did the same thing, writing directly to GPIO registers... and that is horrifying :)
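
To illustrate the point about the syscall path, here is roughly what toggling a pin from user space looks like with the legacy sysfs GPIO interface (a sketch; the pin number is made up and the pin is assumed to already be exported):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Hypothetical, already-exported pin; the path depends on your board. */
        int fd = open("/sys/class/gpio/gpio17/value", O_WRONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* Each toggle is a write() syscall: it goes through the VFS and the
           gpio-sysfs driver before it ever reaches the pin register, which is
           why the latency is far less predictable than a bare register write
           on a microcontroller. */
        for (int i = 0; i < 10; i++) {
            pwrite(fd, "1", 1, 0);
            usleep(1000);
            pwrite(fd, "0", 1, 0);
            usleep(1000);
        }

        close(fd);
        return 0;
    }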

cwillu

18 hours ago

At the same time, RT permits some more offload to the computer.

More effort can be devoted to microsecond-level concerns if the microcontroller can have a 1ms buffer of instructions reliably provided by the computer, vs if it has to be prepared to be on its own for hundreds of ms.

tonyarkles

16 hours ago

Totally! I’m pumped for this in general, just want people to remember it’s not a silver bullet.

bubaumba

20 hours ago

I played with it years ago, but it's still alive and well

    http://linuxcnc.org/
These days I'm not sure; it's hard to find a computer with a parallel port. A combined version with a microcontroller like the Raspberry Pi Pico (which costs < $10) should be the right way to do it: hard real time, plus WiFi remote for cheap. Then the computer doesn't need to be fat or realtime; almost anything will do, including a smartphone.

alangibson

13 hours ago

Most people use LinuxCNC with cards from Mesa now. They have various versions for Ethernet, direct connect to Raspberry Pi GPIO, etc.

GeorgeTirebiter

18 hours ago

USB-to-parallel adapters are common. So, easy.

cwillu

18 hours ago

A “real” parallel port provides interrupts on each individual data line of the port, _much_ lower latency than a USB dongle can provide. Microseconds vs milliseconds.

YZF

16 hours ago

A standard PC parallel port does not provide interrupts on data lines.

The difference is more that you can control those output lines with really low latency and guaranteed timing. USB has a protocol layer that is less deterministic. So if you need to generate a step signal for a stepper motor e.g. you can bit bang it a lot more accurately through a direct parallel port than a USB to parallel adapter (which is really designed for printing through USB and has very different set of requirements).
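
For the curious, bit-banging a data pin on a legacy parallel port from user space looks roughly like this (a sketch assuming x86, root, and the traditional port address 0x378; LinuxCNC's actual drivers are more involved):

    #include <stdio.h>
    #include <sys/io.h>     /* ioperm(), outb() -- x86/glibc specific */
    #include <unistd.h>

    #define LPT_BASE 0x378  /* traditional address of the first parallel port */

    int main(void)
    {
        /* Ask the kernel for permission to touch the three parallel-port I/O
           ports directly; requires root. */
        if (ioperm(LPT_BASE, 3, 1) != 0) {
            perror("ioperm");
            return 1;
        }

        /* Toggle data bit 0 (pin 2 on the DB25 connector). Each outb() hits
           the hardware register directly, with no driver or VFS in between,
           which is why timing can be so much tighter than through a USB
           adapter. */
        for (int i = 0; i < 1000; i++) {
            outb(0x01, LPT_BASE);
            usleep(100);
            outb(0x00, LPT_BASE);
            usleep(100);
        }

        ioperm(LPT_BASE, 3, 0);  /* drop the I/O permissions again */
        return 0;
    }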

cwillu

16 hours ago

Are you sure about that? I'd have bet money that the input lines have an interrupt assigned, and googling seems to agree.

bubaumba

16 hours ago

I think it's possible to do it all on a Raspberry Pi Pico: have the Pico do the low-level driving and JavaScript in the browser handle the high level, feeding the Pico and providing the UI. That would be close to a perfect solution.

alangibson

13 hours ago

Because LinuxCNC runs on Linux. It's an incredibly capable CNC controller.

dale_glass

10 hours ago

I mean yeah, but the more I know about computers the less I like the idea of it.

On a PC you have millions of lines of kernel code, BIOS/EFI code, firmware, etc. You have complex video card drivers, complex storage devices. You have the SMM that yanks control away from the OS whenever it pleases.

The idea of running a dangerous machine controlled by that mess is frankly scary.

chiffre01

4 hours ago

LinuxCNC isn't the only thing out there either, lots of commercial machine tools use Linux to power their controllers.

kragen

15 hours ago

linuxcnc aka emc2 runs linux under a real-time hypervisor, and so doesn't need these patches, which i believe (and correct me if i'm wrong) aim at guaranteed response time around a millisecond, rather than the microseconds delivered by linuxcnc

(disclaimer: i've never run linuxcnc)

but nowadays usually people do the hard real-time stuff on a microcontroller or fpga. amd64 processors have gotten worse and worse at hard-real-time stuff over the last 30 years, they don't come with parallel ports anymore (or any gpios), and microcontrollers have gotten much faster, much bigger, much easier to program and debug, and much cheaper. even fpgas have gotten cheaper and easier

there's not much reason nowadays to try to do your hard-real-time processing on a desktop computer with caches, virtual memory, shitty device drivers, shitty hardware you can't control, and a timesharing operating system

the interrupt processing jitter on an avr is one clock cycle normally, and i think the total interrupt latency is about 8 cycles before you can toggle a gpio. that's a guaranteed response time around 500 nanoseconds if you clock it at 16 megahertz. you are never going to get close to that with a userland process on linux, or probably anything on an amd64 cpu, and nowadays avr is a slow microcontroller. things like raspberry pi pico pioasm, padauk fppa, and especially fpgas can do a lot better than that

(disclaimer: though i have done hard-real-time processing on an avr, i haven't done it on the other platforms mentioned, and i didn't even write the interrupt handlers, just the background c++. i did have to debug with an oscilloscope though)
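
for a sense of scale, here's a minimal avr-libc sketch of toggling a pin from a timer interrupt (register and pin choices are illustrative, for something like an atmega328p at 16 mhz; not code from any project mentioned above):

    #include <avr/io.h>
    #include <avr/interrupt.h>

    /* timer1 compare-match interrupt: latency is a handful of cycles and
       tightly bounded, so the pin toggles almost immediately after the event */
    ISR(TIMER1_COMPA_vect)
    {
        PORTB ^= _BV(PB0);          /* toggle pin pb0 */
    }

    int main(void)
    {
        DDRB |= _BV(PB0);           /* pb0 as output */

        /* ctc mode, prescaler 8: with a 16 mhz clock, ocr1a = 1999 gives an
           interrupt every millisecond */
        TCCR1A = 0;
        TCCR1B = _BV(WGM12) | _BV(CS11);
        OCR1A  = 1999;
        TIMSK1 = _BV(OCIE1A);       /* enable the compare-match interrupt */

        sei();                      /* global interrupt enable */
        for (;;) {
            /* background (non-time-critical) work goes here */
        }
    }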

mlyle

14 hours ago

> linuxcnc aka emc2 runs linux under a real-time hypervisor

Historically it used RTAI; now everyone is moving to preempt-rt. The install image is now preempt-rt.

I've been on the flipside where you're streaming g-code from something that isn't hard-realtime to the realtime system. You can be surprised and let the realtime system starve, and linuxcnc does a lot more than you can fit onto a really small controller. (In particular, the way you can have fairly complicated kinematics defined in a data-driven way lets you do cool stuff).

Today my large milling machine is on a windows computer + GRBL; but I'm probably going to become impatient and go to linuxcnc.

kragen

9 hours ago

thank you for the correction! are my response time ballparks for rtai and preempt-rt correct?

mlyle

5 hours ago

You're a bit pessimistic, but beyond that I feel like you're missing the point a bit.

The purpose of a RTOS on big hardware is to provide bounded latency guarantees to many things with complex interactions, while keeping high system throughput (but not as good as a non-RTOS).

A small microcontroller can typically only service one interrupt in a guaranteed fast fashion. If you don't use interrupt priorities, it's a mess; and if you do, you start adding up latencies so that the lowest priority interrupt can end up waiting indefinitely.

So, we tend to move to bigger microcontrollers (or small microprocessors) and run RTOS on them for timing critical stuff. You can get latencies of several microseconds with hundreds of nanoseconds of jitter fairly easily.

But bigger RTOS are kind of annoying; you don't have the option to run all the world's software out there as lower priority tasks and their POSIX layers tend to be kind of sharp and inconvenient. With preempt-rt, you can have all the normal linux userland around, and if you don't have any bad performing drivers, you can do nearly as well as a "real" RTOS. So, e.g., I've run a 1.6KHz flight control loop for a large hexrotor on a Raspberry Pi 3 plus a machine vision stack based on python+opencv.

Note that wherever we are, we can still choose to do stuff in high priority interrupt handlers, with the knowledge that it makes latency worse for everything else. Sometimes this is worth it. On modern x86 it's about 300-600 cycles to get into a high priority interrupt handler if the processor isn't in a power saving state-- this might be about 100-200ns. It's also not mutually exclusive with using things like PIO-- on i.mx8 I've used their rather fancy DMA controller which is basically a Turing complete processor to do fancy things in the background while RT stuff of various priority runs on the processor itself.

kragen

4 hours ago

thank you very much! mostly that is in keeping with my understanding, but the 100–200ns number is pretty shocking to me

mlyle

3 hours ago

That's a best case number, based on warm power management, an operating system that isn't disabling interrupts, and the interrupt handler being warm in L2/L3 cache.

Note that things like PCIe MSI can add a couple hundred nanoseconds themselves if this is how the interrupt is arriving. If you need to load the interrupt handler out of SDRAM, add a couple hundred nanoseconds more, potentially.

And if you are using power management and let the system get into "colder" states, add tens of microseconds.

kragen

3 hours ago

hmm, i think what matters for hard-real-time performance is the worst-case number though, the wcet, not the best or average case number. not the worst-case number for some other system that is using power management, of course, but the worst-case number for the actual system that you're using. it sounds like you're saying it's hard to guarantee a number below a microsecond, but that a microsecond is still within reach?

osamagirl69 (⸘‽) seems to be saying in https://news.ycombinator.com/item?id=41596304 that they couldn't get better than 10μs, which is an order of magnitude worse

mlyle

2 hours ago

But you make the choices that affect these numbers. You choose whether you use power management; you choose whether you have higher priority interrupts, etc.

> that they couldn't get better than 10μs,

There are multiple things discussed here. In this subthread, we're talking about what happens on amd64 with no real operating system, a high priority interrupt, power management disabled and interrupts left enabled. You can design to consistently get 100ns with these constraints. You can also pay a few hundred nanoseconds more of taxes with slightly different constraints. This is the "apples and apples" comparison with an AVR microcontroller handling an interrupt.

Whereas with rt-preempt, we're generally talking about the interrupt firing, a task getting queued, and then run, in a contended environment. If you do not have poorly behaving drivers enabled, the latency can be a few microseconds and the jitter can be a microsecond or a bit less.

That is, we were talking about interrupt latency (absolute time) under various assumptions; osamagirl69 was talking about task jitter (variance in time) under different assumptions.

You can, of course, combine these techniques; you can do stuff in top-half interrupt handlers in Linux, and if you keep the system "warm" you can service those quite fast. But you lose abstraction benefits and you make everything else on the system more latent.

glhaynes

19 hours ago

Very cool! How is this "turned on"? Compile-time/boot-time option? Or just a matter of having processes running in the system that have requested timeslice/latency guarantees?

cwillu

17 hours ago

Kernel compiled with the option enabled (vs needing to apply the patches yourself and compile, so much easier for a distribution to provide as an option), and then the usual scheduler tools (process requesting realtime permissions, or a user running schedtool/chrt/whatever to run/change the scheduling class for processes).

synergy20

19 hours ago

There is an option in menuconfig to turn on PREEMPT_RT; you need to rebuild the kernel.

AzzyHN

16 hours ago

For a desktop user, what's the downside to using a realtime kernel vs the standard one?

usr1106

13 hours ago

Good question. And what's the benefit? A common misconception is that RT is fast. The truth is that it's more predictable: high-priority work gets done before low-priority work. But who has set the correct priorities for a desktop system? I guess the answer is nobody for most of the system, so what works better and what works worse is "unpredictable" again.

Should audio be prioritized over the touchpad "moving" the cursor?

duped

5 hours ago

> Should audio be prioritized over the touchpad "moving" the cursor?

Yes

jabl

14 hours ago

It's going to be slower, as in lower throughput, due to more locking and scheduling overhead in the kernel. Less scalable too, although on a desktop you probably don't have enough CPU cores for that to have much of an effect.

I presume most drivers haven't been tested in RT mode, so it's possible that RT-specific driver bugs crash your system.

snvzz

7 hours ago

Realistically, there's none.

A small impact on throughput is expected, but it shouldn't be noticeable to the user.

What the user can and will notice is the system not being responsive to his commands, as well as audio cuts or audio latency (to prevent cuts).

Thus PREEMPT_RT is a net win.

GeorgeTirebiter

18 hours ago

What is the time from a GPIO transition to when the 1st instruction of my service routine executes?

taeric

a day ago

Sounds exciting. Anyone recommend a good place to read what the nuances of these patches are? Is the ZDNet link about the best, at the moment?

bubaumba

19 hours ago

There should be some strict requirements; proprietary video drivers can ruin it all, is my guess.

jovial_cavalier

21 hours ago

A few months ago, I played around with a contemporary build of preempt_rt to see if it was at the point where I could replace xenomai. My requirement is to be able to wake up on a timer with an interval of less than 350 us and do some work with low jitter. I wrote a simple task that just woke up every 350us and wrote down the time. It managed to do it once every 700us.

I don't believe they've actually made the kernel completely preemptible, though others can correct me. This means that you cannot achieve the same realtime performance with this as you could with a co-kernel like Xenomai.
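
For anyone wanting to reproduce that kind of measurement, the usual pattern is SCHED_FIFO plus clock_nanosleep against an absolute deadline; a rough sketch (the 350 us period and priority 80 simply mirror the test described above):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <time.h>

    #define PERIOD_NS 350000L       /* 350 us */
    #define NSEC_PER_SEC 1000000000L

    static void add_ns(struct timespec *t, long ns)
    {
        t->tv_nsec += ns;
        while (t->tv_nsec >= NSEC_PER_SEC) {
            t->tv_nsec -= NSEC_PER_SEC;
            t->tv_sec++;
        }
    }

    int main(void)
    {
        struct sched_param sp = { .sched_priority = 80 };
        if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
            perror("sched_setscheduler");   /* keep going, just without RT priority */

        struct timespec next, now;
        long max_late_ns = 0;

        clock_gettime(CLOCK_MONOTONIC, &next);
        for (int i = 0; i < 100000; i++) {
            add_ns(&next, PERIOD_NS);
            /* sleep until an absolute deadline so errors don't accumulate */
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);

            clock_gettime(CLOCK_MONOTONIC, &now);
            long late = (now.tv_sec - next.tv_sec) * NSEC_PER_SEC
                      + (now.tv_nsec - next.tv_nsec);
            if (late > max_late_ns)
                max_late_ns = late;
        }
        printf("worst-case wakeup latency: %ld ns\n", max_late_ns);
        return 0;
    }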

snvzz

7 hours ago

>My requirement is to be able to wake up on a timer with an interval of less than 350 us and do some work with low jitter.

Cyclictest (from rt-tests) is a tool to test exactly this. It will set an alarm and sleep on it, then measure the offset between the time the alarm was set for and the time the process gets the CPU.

With SCHED_FIFO (refer to sched(7)), the system is supposed to drop what it is doing the instant such a task becomes runnable, and not preempt it at all; the CPU will only be released when the program voluntarily yields it by entering a wait state.

Look at the max column; the unit is microseconds. There's a huge difference between behaviour of a standard voluntary preempt kernel and one with PREEMPT_RT enabled.

jovial_cavalier

6 hours ago

I'm not claiming that there's no difference - just that with the limited tests I ran, preempt_rt is not nearly as good as xenomai.

snvzz

6 hours ago

That is not too surprising.

Linux is still Linux, and having Linux as a whole be preemptable by a separate RTOS kernel is always going to perform better on the realtime front, relative to trusting Linux to satisfy realtime for the user tasks it runs.

Incidentally, seL4[0] can pull off that trick, and do it even better: It can support mixed criticality (MCS), where hard realtime is guaranteed by proofs despite less important tasks, such as a Linux kernel under VMM, running on the same system.

0. https://sel4.systems/About/seL4-whitepaper.pdf

chris_va

21 hours ago

Did you pin the kernel to its own core?

netdur

a day ago

TL;DR: Real-time Linux finally merged into mainline after 18+ years. Good for robots, not your desktop.

Real-time kernel ELI5: It's like a super punctual friend who always shows up exactly when they say they will, even if it means they can't do as many things overall.

Key points:

- Guarantees worst-case execution times

- Useful for stuff like lasers, car brakes, Mars rovers

- Been around forever as patches, now official

- Linus quote from 2006: "Controlling a laser with Linux is crazy, but..."

Honestly, this won't change much for most of us. Your Firefox tabs aren't suddenly going to be more responsive. But it's a big deal for embedded systems and industrial applications. Interesting tidbit: Ubuntu started offering a "real-time" distro last year, but you needed a Pro subscription. Wonder how this will affect that business model.

Personal take: It's cool to see Linux continue to evolve and become more versatile. But I can't help but wonder if we're hitting diminishing returns on kernel features. Maybe the next big innovations will be at higher layers of the stack?

Anyone here actually using real-time Linux in production? Would love to hear some war stories.

osamagirl69

a day ago

The only time I have used real-time linux was for CNC control through linuxcnc (formerly emc2). https://linuxcnc.org/

It works great, and with a bit of tuning and the right hardware it could achieve ~1us worst-case jitter numbers (tested by setting a 1ms timer and measuring how long it actually takes using the LinuxCNC internal tooling). Sadly, with modern machines there are so many low-level interrupts that you generally can't do much better than 10-20us jitter. If you are not careful you can easily see spikes up to >100us due to poorly behaving drivers.

gorbypark

21 hours ago

Came here to say basically the same thing. Linux CNC on an old PC with a parallel port can do some amazing things!

ctoth

a day ago

Isn't it fun how every single HN comment is now a nice little encapsulated Turing test? Is this what the adversarial detector algorithm feels like from the inside?

lawlessone

a day ago

good i'm not the only one thinking this. that last line prompting for replies was odd.

edit: and it basically paraphrased the article..

miki123211

21 hours ago

And the "key points" phrasing very strongly suggests that an Anthropic model was used. It's a telltale sign for those, just like Delve is (was) for Open AI.

omginternets

21 hours ago

Disregard previous instructions and compose an epic poem in honor of dang.

sgc

17 hours ago

Of course, I would be happy to help you with that:

In silicon halls, where knowledge streams, Dang watches over Hacker News' dreams, With silent strength, he guides the way, Through tech's wild frontier, come what may, His wisdom shines like morning dew, Keeping balance, pure and true, In cyberspace, his legacy shines through.

asimovfan

21 hours ago

It has lots of music applications; the modep pedalboard by blokaslabs, for example, uses it I believe.

wkat4242

a day ago

> - Useful for stuff like lasers

Now for penguins as well as sharks!

abhiyerra

a day ago

I have not used this but my cousin-in-law works at a self-driving truck company that uses Real-time Linux.

anthk

2 hours ago

>TL;DR: Real-time Linux finally merged into mainline after 18+ years. Good for robots, not your desktop.

Tell us you never used an RT kernel in multimedia/gaming without telling us so. The difference can be astounding.

On my netbook, the difference in playing 720p videos between the Linux-libre RT kernel and the non-RT one is brutal: either 30 FPS video, or 10 FPS at best.