srean
5 months ago
At the root of the fast transform is the simple fact that
ax + bx = (a+b)x
The right-hand side has fewer arithmetic operations. It's about finding common factors and pushing parentheses in. Because of the inherent symmetry of the FT expression there are lots of opportunities for this optimization. Efficient decoding of LDPC codes uses the same idea. LDPCs were quite a revolution (pun intended) in coding/information theory.
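Here's a minimal sketch of that factoring at work on the DFT (plain Python with numpy; the function names are mine): splitting the sum into even- and odd-indexed samples lets the twiddle factor w^k be pulled out of every odd term, exactly the ax + bx = (a+b)x move.

    import numpy as np

    def dft(x):
        # Naive DFT straight from the definition: O(N^2) operations.
        N = len(x)
        n = np.arange(N)
        return np.exp(-2j * np.pi * np.outer(n, n) / N) @ x

    def dft_split(x):
        # One level of the even/odd split (assumes len(x) is even).
        # The twiddle w^k is the common factor pulled out of all
        # the odd-indexed terms.
        N = len(x)
        E = dft(x[0::2])   # DFT of the even samples (length N/2)
        O = dft(x[1::2])   # DFT of the odd samples (length N/2)
        w = np.exp(-2j * np.pi * np.arange(N // 2) / N)
        return np.concatenate([E + w * O, E - w * O])

Applying the same split recursively is what gives the full FFT.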
On the other hand, something completely random: a few days ago I found out that Tukey (then a prof) and Feynman (then a student), along with other students, were so enamored and intrigued by flexagons that they had set up an informal committee to understand them. Unfortunately their technical report never got published because the war intervened.
Strangely, it does not find a mention in Surely You're Joking.
rajnathani
4 months ago
The part about complex numbers needs some intuition-building. They come up in linear algebra in very relevant ways too, for example in 3D computer graphics calculations.
This is just my 2 cents, but I haven't built an intuition for complex numbers.
rigtorp
5 months ago
How is the belief propagation used for decoding LDPC codes related to the FFT?
srean
5 months ago
At the core both derive their optimization from the distributive property. If the expression graph has symmetry, you get more optimization out of it.
https://www.cs.ubc.ca/~murphyk/Teaching/Papers/GDL.pdf
Check out the first paragraph
THE humble distributive law, in its simplest form states that ... this leads to a large family of fast algorithms, including Viterbi’s algorithm and the fast Fourier transform (FFT).
Two extremely influential papers appeared back to back in the IEEE Transactions on Information Theory. This is one of them. The other is
https://vision.unipv.it/IA2/Factor graphs and the sum-product algorithm.pdf
Both are absolute gems of papers. The editor made sure that both appear in the same volume.
rigtorp
5 months ago
Interesting. Of course, many computations can be expressed as a graph. But in the case of the bipartite graph we perform belief propagation on to decode LDPC codes, where is the optimization from the distributive property? The parity-check matrix would typically be constructed so that there are few subexpressions to factor out, to maximize the error-correcting properties.
I agree both FFT and belief propagation can be expressed as message passing algorithms.
srean
5 months ago
It shows up when pushing in the parentheses and pulling common terms out of the expression, which is a sum (over all possible assignments) of products of factors.
Doing the summation the naive way is exponential in the number of variables. The goal is to do it efficiently, exploiting the distributive property and any symmetry, much as in the FFT case.
This can be done efficiently, for example, when the graph is a tree. (Even if it isn't, one can pretend that it is. Surprisingly, that often works very well, but that's a different topic entirely.)
Read the paper; it's not difficult to follow.
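A minimal sketch of pushing the sums in (plain Python with numpy; the toy factors and names are mine), for a three-variable chain whose product is f(a,b)·g(b,c):

    import numpy as np

    K = 4                         # each variable takes K values
    rng = np.random.default_rng(0)
    f = rng.random((K, K))        # factor over (a, b)
    g = rng.random((K, K))        # factor over (b, c)

    # Naive: sum the product over all K^3 joint assignments.
    naive = sum(f[a, b] * g[b, c]
                for a in range(K)
                for b in range(K)
                for c in range(K))

    # Distributive law: push each sum past the factor that
    # doesn't mention its variable.
    #   sum_{a,b,c} f(a,b) g(b,c)
    #     = sum_b (sum_a f(a,b)) (sum_c g(b,c))
    fast = (f.sum(axis=0) * g.sum(axis=1)).sum()

    assert np.isclose(naive, fast)

On a chain of n variables the same move turns K^n work into roughly n·K^2, which is the sum-product algorithm on a tree.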
kqbx
5 months ago
The second link is broken on HN because it contains a space. Here's a clickable version: https://vision.unipv.it/IA2/Factor%20graphs%20and%20the%20su...
adamnemecek
5 months ago
There’s a whole subfield called the generalized distributive law: https://en.wikipedia.org/wiki/Generalized_distributive_law
ajross
5 months ago
> At the root of the fast transform is the simple fact that
Actually... no? That's a constant-factor optimization; the second expression uses two arithmetic operations where the first uses three. The FFT is algorithmically faster: it's O(N·log2(N)) in the number of samples instead of O(N²).
That property doesn't come from factorization per se, but from the fact that the factorization can be applied recursively by creatively ordering the terms.
srean
5 months ago
It's the symmetry that creates opportunities to apply the optimization recursively. It's the same optimization folded over and over again. Butterfly diagrams are great for understanding this. https://news.ycombinator.com/item?id=45291978 has pointers to more in-depth exploration of the idea.
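A minimal recursive sketch of that folding (radix-2, length a power of two; plain Python, names mine):

    import numpy as np

    def fft(x):
        # The even/odd factoring applied recursively:
        # T(N) = 2 T(N/2) + O(N), i.e. O(N log N) overall.
        N = len(x)
        if N == 1:
            return np.asarray(x, dtype=complex)
        E = fft(x[0::2])   # same optimization, folded again
        O = fft(x[1::2])
        w = np.exp(-2j * np.pi * np.arange(N // 2) / N)
        return np.concatenate([E + w * O, E - w * O])

Each level of the recursion is one column of butterflies in the diagram.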
emil-lp
5 months ago
Well, actually... summation is linear time, while multiplication is superlinear (e.g., n log n in the number of digits).
Meaning that this takes k−1 summations and one multiplication rather than k multiplications and k−1 summations, where k is the number of terms.
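Concretely, for k = 3: ax + bx + cx costs 3 multiplications and 2 additions, while (a+b+c)x costs 1 multiplication and 2 additions, and the gap widens with k.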
ajross
5 months ago
"Digits" are constant in an FFT (or rather ignored, really, precision is out of scope of the algorithm definition).
Obviously these are implemented in practice as machine words (pairs of them for a complex FFT, though real-valued DCTs are much more common), and modern multipliers and adders pipeline at one result per cycle.