[flang] Add runtime interface for NUM_IMAGES #1076

rasmussn · 2021-09-21T22:47:15Z

NUM_IMAGES takes optional arguments TEAM and TEAM_NUMBER. This
patch provides an interface for only the optional TEAM_NUMBER
dummy argument. It is likely that a different function will
be required to support TEAM arguments.

Reviewed By: PeteSteinfeld, klausler

Differential Revision: https://reviews.llvm.org/D109547

…project into fir-dev Getting ready to merge changes for the CSHIFT intrinsic.

This change implements the lowering runtime portion of the CSHIFT intrinsic. The implementation of the runtime was done in revision D106292.

[flang] Implement the lowering portion of the CSHIFT intrinsic

@banach-space

…roduct Add lowering for the DOT_PRODUCT intrinsic I confirmed that it builds and runs without issue (regression tests pass and an end-to-end dot product test runs correctly too). Thanks @banach-space !

Initial implementation with assumption that rank will be always 2

In processing the UNPACK intrinsic, my implementation had an unneeded extra call to createBox(). This change removes it.

-> Handle case where rank of one the matrices is 1. -> Add test for the above case with logical type. -> Remove return value from the matmul call.

This change fixes a bug in the runtime portion of the CSHIFT intrinsic that happens when the value of the SHIFT argument is negative. Differential Revision: https://reviews.llvm.org/D106292

[flang] Remove an unneeded call to createBox

…soc-review-comments [NFC] address comments from flang-compiler#907

When instantiating variables, identify the ones that must be default initialized at runtime (locals and intent(out) dummies). Call the default initialization runtime routine to perform the initialization. Global variables that have no explicit initialization and have a default initialization are left TODO. The runtime cannot be used for them, the fir.global generation will need to be updated for these variables (it was generating an undefined undefined until now). For now, simply catch and add a hard TODO for these cases. For information, it is firing 13 times in NAG F95 tests. Runtime interface for finalization is added, but nothing is done with it yet (Final procedures are not F95).

…-type-init-2 Lower default initialization of local and dummy entities

[OpenMP][Flang]Add support for modifiers monotonic and non-monotonic

Also insert casts in fir.array_coor op (error revealed by the first fix).

…issue Fix fir.alloca codegen for array with dynamic type parameters

It does not make sense to embox a box. Instead we can convert the box's signature with a fir.convert, which is idempotent on the boxed value.

…faces Since BOZ literal arguments are typeless, we cannot know how to pass them as actual arguments to procedures with implicit interfaces. This change avoids the problem by emitting an error message in such situations. This change stemmed from the following issue -- flang-compiler#794 Differential Revision: https://reviews.llvm.org/D106831

According to C7109, "A boz-literal-constant shall appear only as a data-stmt-constant in a DATA statement, or where explicitly allowed in 16.9 as an actual argument of an intrinsic procedure." This change enforces that constraint for output list items. I also added a general interface to determine if an expression is a BOZ literal constant and changed all of the places I could find where it could be used. I also added a test. This change stemmed from the following issue -- https://gitlab-master.nvidia.com/fortran/f18-stage/issues/108 Differential Revision: https://reviews.llvm.org/D106893

When a WRITE overwrites an endfile record, we need to forget that there was an endfile record. When doing a BACKSPACE after an explicit ENDFILE statement, the position afterwards must be upon the endfile record. Attempts to join list-directed delimited character input across record boundaries was due to a bad reading of the standard and has been deleted, now that the requirements are better understood. This problem would cause a read attempt past EOF if a delimited character input value was at the end of a record. It turns out that delimited list-directed (and NAMELIST) character output is required to emit contiguous doubled instances of the delimiter character when it appears in the output value. When fixed-size records are being emitted, as is the case with internal output, this is not possible when the problematic character falls on the last position of a record. No two other Fortran compilers do the same thing in this situation so there is no good precedent to follow. Because it seems least wrong, with this patch we now emit one copy of the delimiter as the last character of the current record and another as the first character of the next record. (The second-least-wrong alternative might be to flag a runtime error, but that seems harsh since it's not an explicit error in the standard, and the output may not have to be usable later as input anyway.) Consequently, the output is not suitable for use as list-directed or NAMELIST input. If a later standard were to clarify this case, this behavior will of course change as needed to conform. Differential Revision: https://reviews.llvm.org/D106695

NAMELIST I/O formatting uses the runtime infrastructure for list-directed I/O. List-directed input processing has same state that requires reinitialization for each successive NAMELIST input item. This patch fixes bugs with "null" items and repetition counts on NAMELIST input items after the first in the group. Differential Revision: https://reviews.llvm.org/D106694

cherry pick runtime I/O fixes for namelist array input, backspace, rewind

Fix issue flang-compiler#917.

Lowering for the Matmul Intrinsic

All tests from flang/test/Lower/intrinsic-procedures.f90 are moved to dedicated files (with the exception of `lge`, `lgt`, `lle` and `llt`). With this change, every filename clearly defines which intrinsic procedure is being tested. After the initial split, we discovered that some tests were invalid (e.g. `! CHECK` was used instead of `! CHECK:`). These tests are now fixed. Some tests are additionally hardened (e.g. some `CHECK-DAG` directives was replaced with `CHECK`). Some tests are updated with checks generated with mlir/utils/generate-test-checks.py. In other words, this patch _moves_ and _reformats_ the tests. The documentation is updated accordingly (with some extra clarifications, as suggested by the reviewers). Please see the following PRs for a complete discussion: * flang-compiler#927 * flang-compiler#914

…ate_tests [flang][nfc] Move tests from intrinsic-procedures.f90 to dedicated files

Fix the verify test.

…he builder elided those extents. Fixes flang-compiler#942.

Delete code that was redundantly removing the elided extents before t…

…r with a scalar result. See the following Phabricator review for more information: https://reviews.llvm.org/D106820

Fix evaluation semantics of FORALL constructs per 10.2.4.2.4.

The added pass works when mlir::LLVM::LLVMDialect, mlir::acc::OpenACCDialect, and mlir::omp::OpenMPDialect are present, which adds a dependency on those dialect libraries. https://github.com/flang-compiler/f18-llvm-project/blob/bb1ab876a5fec1eb3f2ab0603756f9124515a68a/flang/include/flang/Optimizer/Transforms/Passes.td#L189

Define a VectorSubscriptBox class that allow representing and working with a lowered Designator containing vector subscripts while ensuring all the subscripts expression are only lowered once. It's a bit of a super ExtendedValue (but is is not added as such because it is heavy and its use case is only restricted to IO input so far). The key point of this class is that it has members functions `loopOverElements` and `loopOverElementsWhile` that allow creating loops over each element of the designator, and call a provided callback with the address of the element. This class is used in input IO to create an IO runtime call for each element of the designator, since it is not possible to build a descriptor for it. The `loopOverElementsWhile` version is required when error recovery is enabled in IO. A hidden VectorSubscriptBoxBuilder is in charge of making a custom Designator<T> visit and create the VectorSubscriptBox. Once this is done, the VectorSubscriptBox does not equire any front-end data structures to work with. The motivation for creating such tool is that the current lowering array lowering infrastructure is centered around assignment (to a variable or temporary), and could not be used here to created the loops over the IO runtime calls without some non trivial modification to it. Adding complexity to the array expression lowering framework to cover a corner case did not appear a good idea. The option of creating a temp, passing it to the runtime, and copying it back was explored, but it was not possible to guarantee that the subscript would be evaluated only once given there was no way to "keep the lowered representation" of the designator between the temp creation and the copy back (the temp creation requires evaluating the susbcripts to compute the shape in general). However, note that with the added VectorSubscriptBox, it would actually now also be possible to deploy a temp + copy back mechanism. This can be done later if it appeared beneficial in real world program. The only cases left TODOs are the cases when one of the Components is a parent type reference (not yet handled properly in the general case), the coarray case, and the PDT case (not sure exactly how type params will be threaded in fir.field_index). All other cases are covered, and I tried to add exhaustive tests since I do not expect real world program to be very harsh on this utility (most will just do READ(*, *) x(y)).

…-io-2 Lower IO input with vector subscripts

Also fix some tests that were walls of CHECK-DAG checks.

Changes so that temporaries do not use uniq_name.

- Some file name size made it into transfer.f90 test in flang-compiler#1070 test update through the fir.char<?,90> type. This makes the test dependent on where LLVM directory is. Replace fir.char<?,90> by fir.char<?,{{.*}}> - vector-subscript-io.f90 was not updated with the uniq_name -> bindc_name change in flang-compiler#1070.

Use the array lowering framework to lower elemental subroutine. This required: - Splitting the part of genIterSpace that creates the implicit loops into a new genImplicitLoops function. genIterspace deals with the destination and other explicit contexts set-ups that do not matter in the subroutine case (there is no destination). Other than having no destination, a big difference is that there is no threaded innerArg. genImplicitLoops handles this variation. - Fixing a bug in the iteration shape determination. The code was assuming there would always be at least one array_load, which is not true if there are only RefTransparent operands. This is always true in the subroutine case, but was actually also an issue for things like `print *, elem_func(array)` with the previous code (compile crash). Add a new ArrayOperands structure to keep track of what matters to later deduce the shape of the array expression instead of keeping track of array_loads for that purpose. - Elemental subroutine with intent(out)/intent(inout) arguments must be applied "in array element order" (15.8.3). It turns out this is also the case for impure elemental functions (10.1.4 p5). Instead of always creating loops as "unordered=true" add an unordered field to the ArrayExprLowering class. It is set to true by default, but will be set to false if any impure elemental function is lowered, or in case of intent(out)/intent(inout) arguments in elemental subroutines. - Last, instead of using createFirExpr to lowering subroutine calls, use a new createSubroutineCall that deals with the array/scalar dispatching. A TODO is added for the user defined elemental assignment, because overlap analysis between RHS/LHS must be done, and this require somehow plugging this special and very restricted case of elemental calls into the array_load/array_update/array_merge framework or some ad-hoc lowering.

…ub-2 Lower elemental subroutine calls (outside of user assignment case)

jeanPerier · 2021-09-22T13:26:12Z

This interface looks good to me, however it looks like it did not make it in LLVM as part of https://reviews.llvm.org/D109547 (I see the runtime discussed in the thread, but I cannot see the runtime change itself in the last diff or in https://github.com/flang-compiler/f18-llvm-project/tree/fir-dev/flang/include/flang/Runtime). Was this intended ?

flang/include/flang/Runtime/coarray.h

NUM_IMAGES takes optional arguments TEAM and TEAM_NUMBER. This patch provides an interface for only the optional TEAM_NUMBER dummy argument. It is likely that a different function will be required to support TEAM arguments. Reviewed By: PeteSteinfeld, klausler Differential Revision: https://reviews.llvm.org/D109547

rasmussn · 2021-09-23T21:25:44Z

This interface looks good to me, however it looks like it did not make it in LLVM as part of https://reviews.llvm.org/D109547 (I see the runtime discussed in the thread, but I cannot see the runtime change itself in the last diff or in https://github.com/flang-compiler/f18-llvm-project/tree/fir-dev/flang/include/flang/Runtime). Was this intended ?

I don't recall at this point why I thought I had to commit it to fir-dev. Probably because I have the implementation for NUM_IMAGES done and this will definitely have to go in fir-dev. However, I previously pushed

[flang] Make 'this_image()' an intrinsic function

on Thu Sep 16 17:21:40 and this isn't if fir-dev yet. So perhaps I thought that by submitting a pull request to fir dev I could use the the file as soon as it was accepted rather that waiting around for the file to show up in fir-dev.

I'm a bit confused at the moment how this all works. I have several intrinsic functions I've been working on that I'd like to get into fir-dev and I'm finding that the coding is the easiest part. Now that I have commit privileges on the llvm-project itself it is straight forward for me to push stuff there.

I think I'll just go ahead and submit a pull request for the whole implementation of NUM_IMAGES since the interface has already been accepted.

This commit provides a single-image implementation of NUM_IMAGES. It returns 1.

rasmussn · 2021-09-24T05:06:53Z

I've added the implementation for NUM_IMAGES. It will likely be easier to understand the interface for num_images() in coarray.h in context of the implementation. The implementation is for execution on a single image only, in which case the function returns 1.

schweitzpgi · 2021-09-24T15:59:46Z

Hi Craig,

At this time we are typically cherry picking changes from the main branch onto the fir-dev branch when and where it makes sense to do so. Lowering code is not upstreamed, so changes to that part of the compiler come in as PRs like this one. It looks like this PR could've cherry picked over a change or two, but don't worry about it at this point.

Thanks for the help with coarrays.

rasmussn · 2021-09-28T19:35:52Z

flang/runtime/coarray.cpp

+
+int RTNAME(NumImages)(int team,
+    const char *sourceFile, int sourceLine) {
+  std::fputs("Fortran NUM_IMAGES: single image value is 1", stderr);


This line should go away along with the include of C++ header file.

rasmussn · 2021-09-28T19:38:27Z

flang/runtime/coarray.cpp

+    const char *sourceFile, int sourceLine) {
+  std::fputs("Fortran NUM_IMAGES: single image value is 1", stderr);
+  // return value for single image (default coarray runtime)
+  return 1;


Error checking needs to be added. I need to think through what values team can take on a single image.

rasmussn · 2021-09-28T19:45:58Z

This pull request should wait until https://reviews.llvm.org/D110356 is merged into the main branch. It makes several changes related to TEAM_TYPE in flang/lib/Evaluate/intrinsics.cpp that will be useful here.

jeanPerier · 2021-10-28T13:15:08Z

Hi, I have just rebased fir-dev on LLVM 85bf221 from last week.
That means you will need to update this PR with the rebased fir-dev.
You cannot simply rebase your PR with fir-dev (it will try to reapply all fir-dev changes).

You can update the PR as follow (assuming your branch is called my-branch):

backup your branch into my-branch-bckup (just to be safe :) )
create a new branch my-branch-new following the new fir-dev head
cherry-pick all your commits from this PR into my-branch-new
once it looks good, force push my-branch-new into the upstream my-branch (git push origin my-branch-new:my-branch -f )

Sorry for the inconvenience, I could not wait for every PR to get in and solve all new conflicts on my side, the rebase
is meant to help upstreaming so that we can avoid having this too many times again in the future.

schweitzpgi

This PR requires a rebase.

schweitzpgi

.

psteinfeld and others added 30 commits July 22, 2021 10:41

Merge branch 'fir-dev' of https://github.com/flang-compiler/f18-llvm-…

c121e87

…project into fir-dev Getting ready to merge changes for the CSHIFT intrinsic.

[flang] Implement the lowering portion of the CSHIFT intrinsic

ca8245a

This change implements the lowering runtime portion of the CSHIFT intrinsic. The implementation of the runtime was done in revision D106292.

Merge pull request flang-compiler#937 from flang-compiler/ps-cshift

23d2f3e

[flang] Implement the lowering portion of the CSHIFT intrinsic

Merge pull request flang-compiler#914 from banach-space/andrzej/dot_p…

be011ea

…roduct Add lowering for the DOT_PRODUCT intrinsic I confirmed that it builds and runs without issue (regression tests pass and an end-to-end dot product test runs correctly too). Thanks @banach-space !

Lowering for the Matmul Intrinsic

019b009

Initial implementation with assumption that rank will be always 2

[flang] Remove an unneeded call to createBox

0c69f27

In processing the UNPACK intrinsic, my implementation had an unneeded extra call to createBox(). This change removes it.

Fixes to address TODOs and review comments

dbe480d

-> Handle case where rank of one the matrices is 1. -> Add test for the above case with logical type. -> Remove return value from the matmul call.

[flang] Implement the runtime portion of the CSHIFT intrinsic

e80fbf3

This change fixes a bug in the runtime portion of the CSHIFT intrinsic that happens when the value of the SHIFT argument is negative. Differential Revision: https://reviews.llvm.org/D106292

Merge pull request flang-compiler#940 from flang-compiler/ps-fix-unpack

651a3be

[flang] Remove an unneeded call to createBox

Merge pull request flang-compiler#938 from flang-compiler/jpr-host-as…

90cb2d6

…soc-review-comments [NFC] address comments from flang-compiler#907

Merge pull request flang-compiler#939 from flang-compiler/jpr-derived…

776a3da

…-type-init-2 Lower default initialization of local and dummy entities

Merge pull request flang-compiler#791 from Leporacanthicus/modifiers

18b0621

[OpenMP][Flang]Add support for modifiers monotonic and non-monotonic

Fix fir.alloca codegen for array with dynamic type parameters

c424f04

Also insert casts in fir.array_coor op (error revealed by the first fix).

Merge pull request flang-compiler#943 from flang-compiler/jpr-alloca-…

8abd290

…issue Fix fir.alloca codegen for array with dynamic type parameters

Fix issue flang-compiler#917.

b5bb1ef

It does not make sense to embox a box. Instead we can convert the box's signature with a fir.convert, which is idempotent on the boxed value.

Merge pull request flang-compiler#947 from vdonaldson/vkd2

9065951

cherry pick runtime I/O fixes for namelist array input, backspace, rewind

Merge pull request flang-compiler#949 from flang-compiler/ch-i917

b209722

Fix issue flang-compiler#917.

Merge pull request flang-compiler#931 from kiranchandramohan/matmul

cea71dc

Lowering for the Matmul Intrinsic

Merge pull request flang-compiler#927 from banach-space/andrzej/separ…

3cedc9c

…ate_tests [flang][nfc] Move tests from intrinsic-procedures.f90 to dedicated files

Fix the verify test.

9ba7afe

Merge pull request flang-compiler#955 from schweitzpgi/ch-verify

4b9d243

Fix the verify test.

Delete code that was redundantly removing the elided extents before t…

d3c32a4

…he builder elided those extents. Fixes flang-compiler#942.

Merge pull request flang-compiler#956 from flang-compiler/ch-ch5egs

b4e472a

Delete code that was redundantly removing the elided extents before t…

Cherry pick from llvm-main, the runtime fixes for MAXLOC runtime erro…

55e391e

…r with a scalar result. See the following Phabricator review for more information: https://reviews.llvm.org/D106820

schweitzpgi and others added 9 commits September 16, 2021 13:29

Merge pull request flang-compiler#1063 from schweitzpgi/ch-forall

2af3c27

Fix evaluation semantics of FORALL constructs per 10.2.4.2.4.

Merge pull request flang-compiler#1057 from flang-compiler/jpr-vector…

40d96d4

…-io-2 Lower IO input with vector subscripts

Changes so that temporaries do not use uniq_name.

6286954

Also fix some tests that were walls of CHECK-DAG checks.

Merge pull request flang-compiler#1070 from schweitzpgi/ch-uniqname

76c6aa0

Changes so that temporaries do not use uniq_name.

Merge pull request flang-compiler#1069 from flang-compiler/jpr-elem-s…

0b3fa79

…ub-2 Lower elemental subroutine calls (outside of user assignment case)

schweitzpgi reviewed Sep 22, 2021

View reviewed changes

flang/include/flang/Runtime/coarray.h Outdated Show resolved Hide resolved

rasmussn closed this Sep 23, 2021

rasmussn force-pushed the coarray_h branch from 2daa4df to 0b3fa79 Compare September 23, 2021 20:19

rasmussn reopened this Sep 23, 2021

[flang] NUM_IMAGES(TEAM_NUMBER) runtime implementation

3d2f9d0

This commit provides a single-image implementation of NUM_IMAGES. It returns 1.

schweitzpgi requested review from jeanPerier, clementval and mleair September 24, 2021 15:56

rasmussn commented Sep 28, 2021

View reviewed changes

jeanPerier force-pushed the fir-dev branch from f72c975 to d680db3 Compare October 28, 2021 13:04

schweitzpgi approved these changes Jan 25, 2022

View reviewed changes

schweitzpgi requested changes Jan 25, 2022

View reviewed changes

clementval removed their request for review July 17, 2024 15:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[flang] Add runtime interface for NUM_IMAGES #1076

[flang] Add runtime interface for NUM_IMAGES #1076

rasmussn commented Sep 21, 2021

jeanPerier commented Sep 22, 2021

rasmussn commented Sep 23, 2021

rasmussn commented Sep 24, 2021

schweitzpgi commented Sep 24, 2021

rasmussn Sep 28, 2021 •

edited

Loading

rasmussn Sep 28, 2021

rasmussn commented Sep 28, 2021

jeanPerier commented Oct 28, 2021

schweitzpgi left a comment

schweitzpgi left a comment

[flang] Add runtime interface for NUM_IMAGES #1076

Are you sure you want to change the base?

[flang] Add runtime interface for NUM_IMAGES #1076

Conversation

rasmussn commented Sep 21, 2021

jeanPerier commented Sep 22, 2021

rasmussn commented Sep 23, 2021

rasmussn commented Sep 24, 2021

schweitzpgi commented Sep 24, 2021

rasmussn Sep 28, 2021 • edited Loading

Choose a reason for hiding this comment

rasmussn Sep 28, 2021

Choose a reason for hiding this comment

rasmussn commented Sep 28, 2021

jeanPerier commented Oct 28, 2021

schweitzpgi left a comment

Choose a reason for hiding this comment

schweitzpgi left a comment

Choose a reason for hiding this comment

rasmussn Sep 28, 2021 •

edited

Loading