Show HN: Tacopy – Tail Call Optimization for Python

94 pointsposted 7 days ago
by raaid-rt

49 Comments

lukasnxyz

2 days ago

what is this, why is 95% of the file useless comments: https://github.com/raaidrt/tacopy/blob/main/src/tacopy/unpar...

you're calling a built in function, why make obfuscate this in an "llm way"

jasonjmcghee

2 days ago

It's very llm-y. But, kudos to them for preserving the commit history / being honest about it.

stingraycharles

2 days ago

I also don’t think it’s actually tail call optimization, but rather an “unrecurser”.

I’m also not convinced that this actually is worth the effort, considering it’s doing runtime rewriting of Python using Python.

throwaway81523

16 hours ago

It's a suboptimal implementation but it's interesting for the purpose of transpiling something like Scheme or ML or maybe Purescript to Python.

Historically, suggestions to make Python tail recursive were rejected on the theory that the added stack frames helped debugging, and that Python was supposed to be an imperative language. But it has been dragged into supporting some FP idioms over the years and that has seemed like a good thing.

hencq

2 days ago

From the description, it doesn't really seem to be full Tail Call Optimization, but only optimizes tail recursion. At least all the examples are about tail recursion, where the function calls itself in tail position, which can indeed easily be changed to a loop. Tail Call Optimization would mean it optimizes any function call in tail position. Typically you'd implement that with a trampoline, but it doesn't seem like this does that.

Edit: It's actually called out under limitations "No mutual recursion: Only direct self-recursion is optimized"

denys_potapov

2 days ago

Cool. Recursion in python is common bottleneck in competitive programming. Will give it a try. I created a similar tool for recursion [1]. But ended with rewriting AST and emulating stack. Pros - no need for accumulator, cons - almost unusable in real world.

[1] https://dev.to/denyspotapov/callonce-python-macro-for-unlimi...

qsort

2 days ago

Do you frequently use Python for competitive programming puzzles? I've done it a bit in the past, and everyone always used C++.

denys_potapov

2 days ago

Always. Probably you can't get in top 100 with python[1]. But I like dicts with tuple keys, bigints, all the list things.

[1] My best is around 1300 place in HackerCup.

btilly

2 days ago

I have to wonder how much better you'd do if they made pypy an option.

jagged-chisel

2 days ago

Taco Py. No Ta Copy. Took my brain a minute or so …

busfahrer

2 days ago

Fun fact: In Scheme, TCO is required by the spec

debugnik

2 days ago

EcmaScript too, but most browser JS runtimes don't (didn't?) support it. It's also part of .NET CIL but only F# uses it so support for it has historically been flakey outside the CLR's main JIT mode.

monster_truck

2 days ago

Safari's JSC (and much more recently, WebAssembly) are the only ones that actually implement it. In practice I don't think it actually ends up being any better than V8, which I believe has some amount of logic to replace them with iterators or trampolines when it can.

IIRC ES6 introduced PTC (Proper Tail Calls), Chrome had an experimental flag for a while but it introduced more issues than it solved (stack traces became a mess and the stack inspection changes came with realistic security concerns). Microsoft and Firefox refused to implement it. Safari refused to un-implement it, and also refused to adopt an opt-in per function flag.

It's crazy how fast javascript has gotten. Ported a classic game earlier this year using all of the new stuff in ES5/6/onwards, the benchmarks are within a couple of percent of what the perf would be were it a standalone game. Runs with 250x monsters at 30x the original tick rate, or >1000x as many monsters at the original tick rate.

dkersten

2 days ago

Once upon a time I tried to write such a decorator too in python 2.x and the byteplay bytecode disassembler library. I was trying to do the conversion at the bytecode level instead of transforming the AST. I believe I got as far as detecting simple self recursive functions, but never actually managed to implement the actual transformation.

srean

2 days ago

> Tacopy is a Python library that provides a decorator to optimize tail-recursive functions by transforming them into iterative loops.

Can this handle mutually recursive calls ? Because those are mostly the only place I use tail calls, rest I translate to iterative loops, list comprehension, maps and reduces.

pansa2

2 days ago

> Limitations […] No mutual recursion: Only direct self-recursion is optimized

srean

2 days ago

Oh! slapping head. How did I miss that. Thanks

javierbg95

2 days ago

Really cool project, fairly succinct and to the point :)

I would love to see support for arbitrarily nested functions, as it is common to wrap these into a public API function without the iteration parameters.

ndr

2 days ago

Yes, it is quite surprised they're not allowed. I wonder what's the likely limitation, any ideas?

ndr

2 days ago

Found out

> Nested function definitions with @tacopy decorators are not supported. Functions decorated with @tacopy must be defined at module level. This constraint exists because inspect.getsource() on nested functions returns the source of the entire enclosing function, making it impossible to reliably extract and transform just the nested function's code. The decorator detects nested functions by checking for '<locals>' in the function's __qualname__ attribute and raises a clear error message instructing users to extract the function to module level.

https://github.com/raaidrt/tacopy/blob/8f5db70da2b7bbfafc41b...

phplovesong

2 days ago

TCO can be implemented easily in non TC optimized langauges with a trampoline wrapper.

Why do i need a fully fledged library for something that is basically a few lines of code?

srean

2 days ago

There's quite a bit of overhead.

I believe Clojure does it with trampoline as JVM does not (as far as I know) does not support tail call optimization. Ironic, given Guy Steele.

ndr

2 days ago

Yes, Clojure doesn't have TCO either.

You get `(loop ... (recur ...))` or `trampoline`.

Naive recursion will consume the stack.

https://clojuredocs.org/clojure.core/loop

https://clojuredocs.org/clojure.core/recur

https://clojuredocs.org/clojure.core/trampoline

pfdietz

2 days ago

Nor can you count on Common Lisp to have TCO. People who are new to CL and treat it like Scheme run into this constantly. Basically never recur down a list, since the list could be long.

This problem also shows up in cases where TCO is not possible. For example, suppose we have a syntax tree for some program and need to traverse the tree. Nodes that represent lists of things might have a long list of children, and in a sufficiently large program recursing down that list can blow out the stack. Just recurse on list elements, which one can reasonably assume don't nest too deeply.

anilakar

2 days ago

> This eliminates the risk of stack overflow errors

When you get stack overflows anywhere from a thousand down to fifty(!) frames in the stack it's not a risk, it's an inevitability in anything more complex than a programming tutorial.

Yeah, I've been bitten by this in production. Writing the functionality in a clean iterative style was just too much of a hassle.

artemonster

2 days ago

Can someone give some real world examples for TCO? Every time I see this I only see fibonacci and gcd and I really want to encounter this one in the wild on something real and applicable

chuckadams

2 days ago

State machines come to mind: a transition is just a function call. Unfortunately that's a general tail call, not always a recursive one, so no love from this library, and that's where "proper" TCO wins (or trampolines if $your_language lacks TCO)

Also it wouldn't help with Fibonacci, since while it's recursive, it's not tail-recursive (yes, it can be written that way, but I'm talking about the idiomatic naive definition).

tpoacher

2 days ago

I don't know about real world "examples", but the beauty of tail-call recursion specifically is the theoretical insight that they have a one-to-one mapping with an loop-based equivalent formulation, and vice versa (which is generally not necessarily true of all recursion).

But, for languages that don't have loop constructs and you need to rely on recursion, all you need to do is write your recipe in standard loop form, and then map back to a tail-call syntax. This is often a LOT easier than trying to think of the problem in a recursive mindset from scratch. (though occasionally, the reverse is also true.)

So the only constraint for re-implementing such looped logic onto tailcalls is that this relies on the stack, which may overflow. By providing TCO you are effectively removing that restriction, so it's a very useful thing for a language to support (especially if they don't provide low-level loops).

The title "tail call optimisation" in the package above is a bit of a misnomer, since this is more of a "transformation" than an "optimisation", but effectively the whole loop-tailcall equivalence is exactly what the package mentioned above relies on to work; it uses decorators to transform tail-call recursive functions to their equivalent loop-based formulations, and thus passing the need to create multiple stacks for the recursion (and risk stack overflow), since the translated loop will now take place in a single stack frame.

artemonster

2 days ago

I know what TCO is. Screw the "beauty", honestly. I want to see at least one real world use case

tpoacher

an hour ago

but i suspect you're talking about tail-recursion rather than TCO specifically. Otherwise the only sensible answer is, why on earth wouldn't you want that if you could have it for free?

so as for tail recursion examples, one nice example i had in the past which made thinking about the problem a lot easier than loops, was when I was designing a 3D maze-like game. The recuraion allowed me to draw each subsequent "step" visible on the screen without having to kniw in advance hiw many steps should be visible. you just draw the "next" room at increasing vanishing distance, until you hit a "wall" (the base case). It was a very simple, elegant result for minimal code; where the equivalent loop would have been long and horrible.

creata

2 days ago

There isn't a killer use case, because tail calls (to yourself or to siblings) can always be easily converted to a loop, and the loop is more idiomatic in most mainstream languages.

jhgb

2 days ago

...and that costs you code modularity and separate compilation. Why lose them when you don't have to?

creata

a day ago

Got an example?

Because it's hard to imagine real "modularity" between a bunch of functions that can cyclically call each other.

mwkaufma

2 days ago

Lua has mandatory TCO and several games I've been on which use it for a scripting use TCO for a state machine. Easy to debug - just look at the stack!

spapas82

2 days ago

For tco to be really useful you need to think in a non procedural way. Imagine that you don't have loops in your language so you need recursion to do stuff multiple times.

Also even in procedural languages there are some problems that are easier to understand and model if you use recursion, for example tree or graph like structures.

artemonster

2 days ago

traversing graph or a tree is not a TCO case because it would involve a stack/queue for DFS/BFS, whatever. I dont want to think in non procedural way, I reserve this nonsense to haskellers, please provide me a valid python use case for TCO :)

bfffbgfdcb

a day ago

Traversing a graph and inspecting each node can definitely make good use of tail call optimization.

For instance: you have a large graph and you are traversing a particular path through it — say a R/B tree seeking a node. You can write it iteratively or recursively. Neither needs to hold more than 1 node reference at a time, the choice is which you prefer to read and write.

I prefer to write that recursively. Sounds like you may not. Observing “well I can write it iteratively so why do I need TCO” is obvious and uninteresting; that’s the point.

jhgb

2 days ago

It gives you arbitrarily complex control flow even in presence of modularity. A tail call is a state transition. Without them, you'd have to resort to a big loop (which breaks modularity), or some sort of trampoline (which works but it's a bit clumsy).

artemonster

2 days ago

this whole thing is equivalent to a goto at the beginning of a function.

dragonwriter

2 days ago

This is true in the same sense that function calls are equivalent to gotos at the beginning and end of the function.

jhgb

2 days ago

Yes, except without all the known disadvantages of goto. That's the whole point.

artemonster

a day ago

huh? what is the disadvantage? i am very tired of people parroting some non-arguments. goto is awesome. solves TCO, multiloop breaks, switches and whole lot more

upghost

2 days ago

A really simple one is traversing a linked list (or any naturally recursive data structure, such as a dictionary or tree). It is very natural to traverse a recursive data structure recursively.

nonameiguess

2 days ago

It's always amusing to see what Python has become thanks to NumPy and Django and how widely deployed it became when its intended purpose was always to be easy to understand, not performant. But just like JavaScript, people are going to hack it to death to make it fast anyway.

Guido on tail calls 16 years ago: https://neopythonic.blogspot.com/2009/04/final-words-on-tail...

upghost

2 days ago

Really impressive! For anyone who's not a pythonista, trying to implement TCO is something akin to solving the Collatz conjecture but for Python. It's often just an exercise in madness. So seeing an elegant solution to this is really cool, I myself was a victim of this madness and was unable to do it so very cool to see someone nail it! This will be a goto tool for sure.