I defy you to name an environment in which it is impossible to read an array from back to front.
That's an array, as in, things already in memory. Not a stream. An array. A collection of fixed-size-in-RAM elements, contiguously in order.
And "impossible" does not mean "an environment which doesn't provide a one-liner to do it". C, for instance, may not provide a backwards iterator, but then, it doesn't exactly provide a forwards iterator either. It is trivial to iterate backwards over an array in C. To the extent that it's a pain to write a function that works in either direction, that's C's problem, not evidence that it can't be done. Everything's a pain in C.
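To make "trivial" concrete, here is a minimal sketch of a backwards walk over a plain C array. The function name find_last is made up for illustration; the only real point is the loop shape, which counts an unsigned index down without wrapping below zero.

```c
#include <stddef.h>

/* Hypothetical helper: return the index of the last element equal to key,
   or -1 if there is none. Walks the array from the end toward the start,
   reading each element at most once and copying nothing. */
ptrdiff_t find_last(const int *a, size_t n, int key)
{
    /* i runs from n down to 1; the element examined is a[i - 1].
       Testing i > 0 before the decrement keeps the unsigned index
       from wrapping around. */
    for (size_t i = n; i > 0; i--) {
        if (a[i - 1] == key)
            return (ptrdiff_t)(i - 1);
    }
    return -1;
}
```

With int a[] = {7, 3, 7, 1}, find_last(a, 4, 7) returns 2, the later of the two 7s, which is only the natural answer if you start from the back.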
Cache locality is not an issue here either, because I specified that the array is read once, and I already called out that reversing may be worth it if you're going to read it multiple times. Reading it once to do whatever it is you are doing is faster than reading it once, writing it out in a new order, then reading it in that new order, at least on anything remotely resembling real, existing hardware.
Uh, no, you're missing the point. It's always possible for you to read the array backwards. After all, that's how you are capable of reversing it! The problem is that the thing you are handing it to might not be capable of that. You can't pass an array to write(2) and tell it to print it in reverse order. You can't mmap a page where the first byte of data is actually at the high address. Your DMA engine won't flip your buffer for you. If you want to use these APIs, you have to reverse the array before you hand it to them. There's just no way around it.
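As a rough sketch of what that looks like with write(2): the helper below, write_reversed, is a hypothetical name and not part of any library. It assumes a POSIX system, keeps error handling simple (it does not retry on EINTR), and has to build a forward, contiguous copy before the kernel ever sees the data.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write the n bytes of buf to fd in reverse order.
   Returns 0 on success, -1 on failure. */
int write_reversed(int fd, const char *buf, size_t n)
{
    if (n == 0)
        return 0;

    char *tmp = malloc(n);
    if (tmp == NULL)
        return -1;

    /* The unavoidable step: write(2) only takes a start address and a
       length, so the reversed bytes must exist as a forward buffer. */
    for (size_t i = 0; i < n; i++)
        tmp[i] = buf[n - 1 - i];

    /* write(2) may accept fewer bytes than requested, so loop until
       everything has been written or an error occurs. */
    size_t done = 0;
    while (done < n) {
        ssize_t w = write(fd, tmp + done, n - done);
        if (w < 0) {
            free(tmp);
            return -1;
        }
        done += (size_t)w;
    }

    free(tmp);
    return 0;
}
```

Calling write_reversed(STDOUT_FILENO, "abcde", 5) should put edcba on standard output. The copy into tmp is exactly the extra read and write pass being argued about.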
Also, I do want to note that just because your language is higher level and takes iterators or sequences or whatever abstraction you have on top of "a bunch of elements", that doesn't mean reversing an array is never worth it. It is often the case that your API/compiler/language will accept an arbitrary "view", but if you actually try this you will find that it can only fully reason about the "forward, contiguous" buffer case, which it will optimize well. If you pass it anything funny, it will fall back to element-by-element code, which can be an order of magnitude slower or more (usually because it isn't vectorized). In that case it's often better to pull out an optimized reverse and then pass a normal array to the API. Even though this reads the data twice, it can still be a lot faster.
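A minimal sketch of the "reverse first, then hand over a normal array" pattern, in C for concreteness. All three function names are hypothetical, and dot_forward is only a stand-in for whatever forward-only, contiguous API you actually have. Whether this beats a strided or backwards loop depends on your compiler and hardware, so measure before committing to it.

```c
#include <stddef.h>

/* Reverse a float array in place by swapping from both ends inward. */
void reverse_floats(float *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t lo = 0, hi = n - 1; lo < hi; lo++, hi--) {
        float t = a[lo];
        a[lo] = a[hi];
        a[hi] = t;
    }
}

/* Stand-in for a routine that only understands forward, contiguous
   buffers: the simple loop shape compilers tend to handle best. */
float dot_forward(const float *x, const float *w, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * w[i];
    return acc;
}

/* Dot product of x taken in reverse order with w.
   Note: x is left reversed when this returns. */
float dot_reversed(float *x, const float *w, size_t n)
{
    reverse_floats(x, n);        /* first pass: reorder the data     */
    return dot_forward(x, w, n); /* second pass: plain forward read  */
}
```

With x = {1, 2, 3} and w = {1, 10, 100}, dot_reversed(x, w, 3) computes 3*1 + 2*10 + 1*100 = 123 and leaves x as {3, 2, 1}.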