Why use Linux for that though? Why not build the machine like a 3D printer, with a dedicated microcontroller that doesn't even run an OS and has completely predictable timing, and a separate non-RT Linux system for the GUI?
I feel like Klipper's approach is fairly reasonable: let a non-RT system (which generally has better performance than your microcontroller) calculate the movement, but leave the actual commanding of the stepper motors to the microcontroller.
Yeah, I looked at Klipper a few months ago and really liked what I saw. Haven't had a chance to try it out yet but like you say they seem to have nailed the interface boundary between "things that should run fast" (on an embedded computer) and "things that need precise timing" (on a microcontroller).
One thing to keep in mind for people looking at the RT patches and thinking about things like this: these patches allow you to do RT processing on Linux, but they don't make some of the complexity go away. In the Klipper case, for example, writing to the GPIOs that actually send the signals to the stepper motors is relatively complex in Linux. You're usually making a write() syscall that goes through the VFS layer etc. before finally reaching the actual pin register. On a microcontroller you can write directly to the pin register and know exactly how many clock cycles that operation is going to take.
I've seen embedded Linux code that actually opened /dev/mem and did the same thing, writing directly to GPIO registers... and that is horrifying :)
At the same time, RT permits some more offload to the computer.
More effort can be devoted to microsecond-level concerns if the microprocessor can have a 1ms buffer of instructions reliably provided by the computer, vs if it has to be prepared to be on its own for hundreds of ms.
Totally! I’m pumped for this in general, just want people to remember it’s not a silver bullet.
I played with it years ago, but it's still alive and well
http://linuxcnc.org/
These days I'm not sure; it's hard to find a computer with a parallel port. A combined setup with a microcontroller like the Raspberry Pi Pico (which costs < $10) seems like the right way to do it: hard real time, plus a cheap WiFi remote. Then the computer doesn't need to be fat or realtime; almost anything works, including a smartphone.
Most people use LinuxCNC with cards from Mesa now. They have various versions for Ethernet, direct connect to Raspberry Pi GPIO, etc.
USB-to-parallel adapters are common, so that part is easy.
A “real” parallel port provides interrupts on each individual data line of the port, _much_ lower latency than a USB dongle can provide. Microseconds vs milliseconds.
A standard PC parallel port does not provide interrupts on data lines.
The difference is more that you can control those output lines with really low latency and guaranteed timing. USB has a protocol layer that is less deterministic. So if you need to generate a step signal for a stepper motor, e.g., you can bit-bang it a lot more accurately through a direct parallel port than through a USB-to-parallel adapter (which is really designed for printing over USB and has a very different set of requirements).
Are you sure about that? I'd have bet money that the input lines have an interrupt assigned, and googling seems to agree.
I think it's possible to do it all on a Raspberry Pi Pico: have the Pico do the low-level driving, and JavaScript in a browser handle the high level, feeding the Pico and providing the UI. That would be close to a perfect solution.
Because LinuxCNC runs on Linux. It's an incredibly capable CNC controller.
I mean yeah, but the more I know about computers the less I like the idea of it.
On a PC you have millions of lines of kernel code, BIOS/EFI code, firmware, etc. You have complex video card drivers, complex storage devices. You have the SMM that yanks control away from the OS whenever it pleases.
The idea of running a dangerous machine controlled by that mess is frankly scary.
LinuxCNC isn't the only thing out there either, lots of commercial machine tools use Linux to power their controllers.
linuxcnc aka emc2 runs linux under a real-time hypervisor, and so doesn't need these patches, which i believe (and correct me if i'm wrong) aim at guaranteed response time around a millisecond, rather than the microseconds delivered by linuxcnc
(disclaimer: i've never run linuxcnc)
but nowadays usually people do the hard real-time stuff on a microcontroller or fpga. amd64 processors have gotten worse and worse at hard-real-time stuff over the last 30 years, they don't come with parallel ports anymore (or any gpios), and microcontrollers have gotten much faster, much bigger, much easier to program and debug, and much cheaper. even fpgas have gotten cheaper and easier
there's not much reason nowadays to try to do your hard-real-time processing on a desktop computer with caches, virtual memory, shitty device drivers, shitty hardware you can't control, and a timesharing operating system
the interrupt processing jitter on an avr is one clock cycle normally, and i think the total interrupt latency is about 8 cycles before you can toggle a gpio. that's a guaranteed response time around 500 nanoseconds if you clock it at 16 megahertz. you are never going to get close to that with a userland process on linux, or probably anything on an amd64 cpu, and nowadays avr is a slow microcontroller. things like raspberry pi pico pioasm, padauk fppa, and especially fpgas can do a lot better than that
(disclaimer: though i have done hard-real-time processing on an avr, i haven't done it on the other platforms mentioned, and i didn't even write the interrupt handlers, just the background c++. i did have to debug with an oscilloscope though)
> linuxcnc aka emc2 runs linux under a real-time hypervisor
Historically it used RTAI; now everyone is moving to preempt-rt. The install image is now preempt-rt.
I've been on the flipside where you're streaming g-code from something that isn't hard-realtime to the realtime system. You can be surprised and let the realtime system starve, and linuxcnc does a lot more than you can fit onto a really small controller. (In particular, the way you can have fairly complicated kinematics defined in a data-driven way lets you do cool stuff).
Today my large milling machine is on a windows computer + GRBL; but I'm probably going to become impatient and go to linuxcnc.
thank you for the correction! are my response time ballparks for rtai and preempt-rt correct?
You're a bit pessimistic, but beyond that I feel like you're missing the point a bit.
The purpose of a RTOS on big hardware is to provide bounded latency guarantees to many things with complex interactions, while keeping high system throughput (but not as good as a non-RTOS).
A small microcontroller can typically only service one interrupt in a guaranteed fast fashion. If you don't use interrupt priorities, it's a mess; and if you do, you start adding up latencies so that the lowest priority interrupt can end up waiting indefinitely.
So, we tend to move to bigger microcontrollers (or small microprocessors) and run RTOS on them for timing critical stuff. You can get latencies of several microseconds with hundreds of nanoseconds of jitter fairly easily.
But bigger RTOS are kind of annoying; you don't have the option to run all the world's software out there as lower priority tasks and their POSIX layers tend to be kind of sharp and inconvenient. With preempt-rt, you can have all the normal linux userland around, and if you don't have any bad performing drivers, you can do nearly as well as a "real" RTOS. So, e.g., I've run a 1.6KHz flight control loop for a large hexrotor on a Raspberry Pi 3 plus a machine vision stack based on python+opencv.
Note that wherever we are, we can still choose to do stuff in high priority interrupt handlers, with the knowledge that it makes latency worse for everything else. Sometimes this is worth it. On modern x86 it's about 300-600 cycles to get into a high priority interrupt handler if the processor isn't in a power saving state-- this might be about 100-200ns. It's also not mutually exclusive with using things like PIO-- on i.mx8 I've used their rather fancy DMA controller which is basically a Turing complete processor to do fancy things in the background while RT stuff of various priority runs on the processor itself.
thank you very much! mostly that is in keeping with my understanding, but the 100–200ns number is pretty shocking to me
That's a best case number, based on warm power management, an operating system that isn't disabling interrupts, and the interrupt handler being warm in L2/L3 cache.
Note that things like PCIe MSI can add a couple hundred nanoseconds themselves if this is how the interrupt is arriving. If you need to load the interrupt handler out of SDRAM, add a couple hundred nanoseconds more, potentially.
And if you are using power management and let the system get into "colder" states, add tens of microseconds.
hmm, i think what matters for hard-real-time performance is the worst-case number though, the wcet, not the best or average case number. not the worst-case number for some other system that is using power management, of course, but the worst-case number for the actual system that you're using. it sounds like you're saying it's hard to guarantee a number below a microsecond, but that a microsecond is still within reach?
osamagirl69 (⸘‽) seems to be saying in https://news.ycombinator.com/item?id=41596304 that they couldn't get better than 10μs, which is an order of magnitude worse
But you make the choices that affect these numbers. You choose whether you use power management; you choose whether you have higher priority interrupts, etc.
> that they couldn't get better than 10μs,
There are multiple things discussed here. In this subthread, we're talking about what happens on amd64 with no real operating system, a high priority interrupt, power management disabled and interrupts left enabled. You can design to consistently get 100ns with these constraints. You can also pay a few hundred nanoseconds more of taxes with slightly different constraints. This is the "apples and apples" comparison with an AVR microcontroller handling an interrupt.
Whereas with rt-preempt, we're generally talking about the interrupt firing, a task getting queued, and then run, in a contended environment. If you do not have poorly behaving drivers enabled, the latency can be a few microseconds and the jitter can be a microsecond or a bit less.
That is, we were talking about interrupt latency (absolute time) under various assumptions; osamagirl69 was talking about task jitter (variance in time) under different assumptions.
You can, of course, combine these techniques; you can do stuff in top-half interrupt handlers in Linux, and if you keep the system "warm" you can service those quite fast. But you lose abstraction benefits and you make everything else on the system more latent.