Please, No More Loops (Than Necessary)

1 point, posted 12 days ago
by zaikunzhang

3 Comments

pklausler

11 days ago

The discussion there badly misunderstands the nature of ELEMENTAL procedures in Fortran and their relevance to parallel execution.

ELEMENTAL is relevant to DO CONCURRENT only indirectly. The ELEMENTAL attribute matters there only because it implies the PURE attribute by default, and PURE is required for procedures referenced in DO CONCURRENT. (DO CONCURRENT is not itself a parallel construct, but that's another matter.)
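A minimal sketch of that relationship, with hypothetical names (scale_mod, scaled) invented for illustration: the function is declared ELEMENTAL without IMPURE, so it is also PURE, and that is what permits the reference inside DO CONCURRENT.

```fortran
module scale_mod
  implicit none
contains
  ! ELEMENTAL without the IMPURE prefix implies PURE.
  elemental function scaled(x) result(y)
    real, intent(in) :: x
    real :: y
    y = 2.0 * x + 1.0
  end function scaled
end module scale_mod

program demo_do_concurrent
  use scale_mod, only: scaled
  implicit none
  real :: a(4), b(4)
  integer :: i

  a = [1.0, 2.0, 3.0, 4.0]

  ! Only PURE procedures may be referenced here; scaled qualifies
  ! because of its ELEMENTAL attribute, and is called with scalars.
  do concurrent (i = 1:size(a))
    b(i) = scaled(a(i))
  end do

  print *, b
end program demo_do_concurrent
```

Declaring the function IMPURE ELEMENTAL instead should make a conforming compiler reject the reference inside the DO CONCURRENT body.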

ELEMENTAL in array expressions (incl. FORALL) should not be understood as being a way for one procedure invocation to receive and return entire arrays as arguments and results. That would require buffering during the evaluation of an array expression. Instead, ELEMENTAL should be viewed (and implemented) as a means of allowing a function to be called as part of the implementation of unbuffered array expression execution.
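One way to picture this, as a sketch rather than a description of any particular compiler: the array expression below can be evaluated as a single loop in which the elemental function is called once per element, with no array-sized temporary for its result. The names elem_mod and scaled are hypothetical.

```fortran
module elem_mod
  implicit none
contains
  elemental function scaled(x) result(y)
    real, intent(in) :: x
    real :: y
    y = 2.0 * x + 1.0
  end function scaled
end module elem_mod

program demo_unbuffered
  use elem_mod, only: scaled
  implicit none
  real :: a(4), b(4), c(4), d(4)
  integer :: i

  a = [1.0, 2.0, 3.0, 4.0]
  b = [10.0, 20.0, 30.0, 40.0]

  ! Array-expression form: scaled is referenced elementally.
  c = scaled(a) + b

  ! Conceptual unbuffered evaluation: one scalar call per element,
  ! fused with the addition, no temporary holding scaled(a).
  do i = 1, size(a)
    d(i) = scaled(a(i)) + b(i)
  end do

  if (any(c /= d)) error stop "forms disagree"
  print *, "array expression and fused loop agree"
end program demo_unbuffered
```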

ELEMENTAL has its roots in compilation for true vector machines. It once caused a function to have multiple versions generated: a normal one with scalar arguments, and a vector one with vector register arguments. This would allow a user-written ELEMENTAL function to be called in a vectorized DO loop, just like a vectorizable intrinsic function such as SIN. A compiler for today's SIMD "vector" ISAs could implement ELEMENTAL in a similar fashion.

ivanpribec

10 days ago

I can see the subtle distinction you make. The flang notes on array composition (https://flang.llvm.org/docs/ArrayComposition.html) provide a good introduction to the way array expressions are treated.

But in practice it looks like the elemental function must be in the same translation unit for vectorization to occur with compilers popular today. Explicit options like !$omp declare simd are a different matter (and have different pitfalls).
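For reference, a hedged sketch of what the explicit OpenMP route can look like. The module and function names are hypothetical, and a plain PURE function is used here to keep the example simple. Whether a SIMD variant is actually generated and used across translation units depends on the compiler and its flags. Without OpenMP enabled, the directives are ordinary comments and the program still runs.

```fortran
module kernels
  implicit none
contains
  pure function axpb(x) result(y)
    ! Ask the compiler to also generate a SIMD variant of axpb.
    !$omp declare simd(axpb)
    real, intent(in) :: x
    real :: y
    y = 2.0 * x + 1.0
  end function axpb
end module kernels

program demo_declare_simd
  use kernels, only: axpb
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n)
  integer :: i

  a = [(real(i), i = 1, n)]

  ! Request vectorization of the loop containing the call.
  !$omp simd
  do i = 1, n
    b(i) = axpb(a(i))
  end do

  print *, b
end program demo_declare_simd
```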

ivanpribec

10 days ago

*same translation unit as the call site