This seems like a reasonable solution to me. I often stream music through my SSH sessions via port forwards, so I may well already be getting the benefit in some places.
This, and a similar suggestion in another thread, may sound nice and easy ("just add a constant stream of noise"), but it assumes you can generate enough constant noise and intersperse it with real commands in such a way that an observer cannot tell the two apart. The problem is not necessarily that you want to hide (from a network adversary) that you've been typing. It's that you do not want to reveal, through some side channel, what the exact contents were.
On the openssh-unix-dev mailing list, someone recently pointed out[0] that just sending out packets periodically (without jitter) may be problematic due to subtle differences in clock timing. As an aside, they also link to a presentation[1] [PDF] that shows the influence of temperature on clock skew (especially page 18), which opens up a possibility for fingerprinting.
Then there's the challenge of keeping SSH interactive enough that people do not experience too much input lag while typing. What if the user typed a character, but due to such a timing side-channel preventive measure, that character needs to be sent in the next packet, adding latency to the user experience? Surely it improves security, but it may add too much frustration for regular usage.
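To put a rough number on that input lag, here is a minimal sketch. It assumes packets leave only on a fixed tick, and the 20 ms interval is purely illustrative. The function name `added_delay_ms` is hypothetical and does not come from OpenSSH.

```python
def added_delay_ms(keystroke_ms: int, interval_ms: int) -> int:
    """Milliseconds a keystroke waits until the next fixed send tick.

    Assumes ticks fire at integer multiples of interval_ms (an
    illustrative model, not how any real SSH implementation works).
    """
    # Python's % always returns a non-negative result for a positive
    # modulus, so this is the distance up to the next multiple.
    return -keystroke_ms % interval_ms


INTERVAL_MS = 20  # assumed tick length, for illustration only

# Delay for every possible arrival offset within one tick.
delays = [added_delay_ms(t, INTERVAL_MS) for t in range(INTERVAL_MS)]
worst_case = max(delays)
average = sum(delays) / len(delays)
```

Under this model a keystroke waits a little under half a tick on average and almost a full tick in the worst case, on top of the normal network round trip. Whether that is noticeable depends on the interval chosen.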
[0] https://marc.info/?l=openssh-unix-dev&m=169402700622936&w=2
[1] https://murdoch.is/talks/ccs06hotornot.pdf ([2006] Hot or Not: Revealing hidden services by their clock skew, see also https://doi.org/10.1145/1180405.1180410 and an HN thread from 2014: https://news.ycombinator.com/item?id=7694612)
But I don't think the conversation here is about anonymity; it's about side channels that reveal the actual content of the SSH session. The OP is looking at determining the command typed based on keystroke timing. The attacks you link would work for any traffic that could be intercepted, SSH or otherwise, and they wouldn't give any info about the content of the stream.
If we're just focused on removing all traces of keystroke timing from the channel, then I think a decoupled SSH transport layer would go a long way toward mitigating this specific attack: one that provides, say, 1 kB of zero padding every 20 ms for the shell to fill, along with a FIFO to spread the data out, and maybe some logic to ramp the channel bandwidth up and down based on queue length.
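A rough sketch of that idea, under stated assumptions: every frame is exactly 1024 bytes, a 2-byte length header says how much of it is real data, and the rest is zero padding. The names `PaddedFramer` and `unframe` and the header layout are made up for illustration; this is not the SSH wire format, and the frames would still have to be encrypted afterwards so that padding and payload look the same on the wire. The bandwidth ramping is left out.

```python
import struct

FRAME_SIZE = 1024              # bytes per frame (the "1 kB" above)
TICK_SECONDS = 0.020           # intended send interval: one frame every 20 ms
HEADER = struct.Struct("!H")   # hypothetical 2-byte big-endian payload length
MAX_PAYLOAD = FRAME_SIZE - HEADER.size


class PaddedFramer:
    """FIFO that emits fixed-size frames whether or not data is queued."""

    def __init__(self) -> None:
        self._fifo = bytearray()

    def write(self, data: bytes) -> None:
        # Called whenever the shell or user produces bytes; no packet is
        # sent here, so the write time does not show up on the wire.
        self._fifo.extend(data)

    def pending(self) -> int:
        # Queue length; a rate controller could use this to ramp up or down.
        return len(self._fifo)

    def next_frame(self) -> bytes:
        # Called once per tick: take as much queued data as fits and
        # zero-pad the remainder so every frame has the same length.
        payload = bytes(self._fifo[:MAX_PAYLOAD])
        del self._fifo[:len(payload)]
        padding = bytes(MAX_PAYLOAD - len(payload))
        return HEADER.pack(len(payload)) + payload + padding


def unframe(frame: bytes) -> bytes:
    """Recover the real payload from one fixed-size frame."""
    if len(frame) != FRAME_SIZE:
        raise ValueError("unexpected frame size")
    (length,) = HEADER.unpack_from(frame)
    return frame[HEADER.size:HEADER.size + length]
```

A sender would call `next_frame()` on a timer every `TICK_SECONDS`, ideally scheduling against `time.monotonic()` deadlines rather than sleeping a fixed amount, and encrypt each frame before writing it to the socket. Idle ticks then produce frames of the same size as ticks that carry a keystroke.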