Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow including the "main" header with < > #53013

Open
reddwarf69 opened this issue Jan 5, 2022 · 17 comments
Open

Allow including the "main" header with < > #53013

reddwarf69 opened this issue Jan 5, 2022 · 17 comments
Labels
clang-format enhancement Improving things as opposed to bug fixing, e.g. new or missing feature good first issue https://github.com/llvm/llvm-project/contribute

Comments

@reddwarf69
Copy link

IncludeCategories in https://clang.llvm.org/docs/ClangFormatStyleOptions.html explains that "The main header for a source file automatically gets category 0".
The first problem is that it fails to document it only considers a header "main" if it has been included with ".
The second issue is that it fails to consider a header that has been included with " as the "main" header.

This is not a problem for the LLVM project because it includes its own headers with ". This makes sense, the standard in "15.3 Source file inclusion" says

[Note: Although an implementation may provide a mechanism for making arbitrary source files available to
the < > search, in general programmers should use the < > form for headers provided with the implementation,
and the " " form for sources outside the control of the implementation.

and https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html says "This variant is used for header files of your own program".

But unfortunately, the C++ Core Guidelines say the usage of < or " should not be considering whether it's "outside the control of the implementation" or "header files of your own program" but considering whether the file is "relative to the including file" (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rs-incform).

A common thing to do is

src/thing.cpp
include/mylib/thing.hpp

to have the headers defining the public API separated from others. And since thing.hpp is not relative to thing.cpp (I don't think .. is defined in any standard, and it would be ugly anyway), the C++ Core Guidelines state that < should be used and clang-format fails to see #include mylib/thing.hpp is the main header of thing.cpp.

One way to solve this would be adding an option to specify the directory prefix used by the project. If I can say IncludeDirectoryOwnHeaders: mylib then clang-format could consider mylib/thing.hpp as the main header without getting confused with otherlib/thing.hpp.

@mydeveloperday mydeveloperday added clang-format enhancement Improving things as opposed to bug fixing, e.g. new or missing feature and removed new issue labels Jan 5, 2022
@mydeveloperday
Copy link
Contributor

Don't you think this would be very messy for large projects IncludeDirectoryOwnHeaders would need to be huge and would likely constantly be changing?

Did I misunderstand why you wouldn't use " for me personally I only ever use < for system headers. (This issue come up from time to time), but I don't really get why people are using < for a local library header, can you help me understand?

@reddwarf69
Copy link
Author

I don't really get why people are using < for a local library header, can you help me understand?

Literally only because the C++ Core Guidelines say so (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rs-incform). To be clear, I'm not claiming it's a good guideline.
Once you have the headers with the public API separated in a different directory (which, who doesn't?) then those headers are not "relative to the including file" (unless I did #include "../include/mylib/thing.hpp", notice the include in there).

It's like east vs west const (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rl-const). Do I like what the C++ Core Guidelines say? No. But I just follow it since, well... they are the C++ Core Guidelines.

FWIW there is isocpp/CppCoreGuidelines#1666, which wants to move it even further to the "< for a local library header".

@reddwarf69
Copy link
Author

Don't you think this would be very messy for large projects IncludeDirectoryOwnHeaders would need to be huge and would likely constantly be changing?

I would not expect a huge (long list?) IncludeDirectoryOwnHeaders. I would expect

myproject/.clang-format
myproject/lib/component_A/.clang_format
myproject/lib/component_A/src/thing.cpp
myproject/lib/component_A/include/myproject/component_A/thing.hpp
myproject/lib/component_A/.clang_format
myproject/lib/component_B/src/thing.cpp
myproject/lib/component_B/include/myproject/component_B/thing.hpp

With myproject/lib/component_A/.clang_format

BasedOnStyle:  InheritParentConfig
IncludeDirectoryOwnHeaders: myproject/component_A

(supposing component_A/component_B have their headers in its own subdirectory, it's not that unusual to have all the components having the headers directly in myproject/).

I certainly think it would be annoying. But, once again, the C++ Core Guidelines basically force me to do this.

@mydeveloperday
Copy link
Contributor

It's like east vs west const (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rl-const). Do I like what the C++ Core Guidelines say? No. But I just follow it since, well... they are the C++ Core Guidelines.

I thought you were trying to win me over! ;-)

What bad things happen if we had an option to allow the main header to be either '"' or '<'? rather than having to maintain a whole list of "ownheaders"?

@reddwarf69
Copy link
Author

What bad things happen if we had an option to allow the main header to be either '"' or '<'? rather than having to maintain a whole list of "ownheaders"?

I have not looked at the implementation. But I suppose only " is being used right now to avoid clashes?

src/thing.cpp
include/mylib/thing.hpp

src/thing.cpp

#include <mylib/thing.hpp>
#include <otherlib/thing.hpp>

which one is the "main" one?

@mydeveloperday
Copy link
Contributor

Yes very likely... wouldn't be easy to differentiate.

Assume we have

1) #include <thing>                
2) #include <some_library/thing.h> 
3) #include "thing.h"                 
4) #include "foo_utils/thing.h"  
5) #include <component_b/thing.h>

And we are in thing.cpp can we make any clear distension over who should win as to who is the main?

a) In the presence of (1,2,3,4 & 5) 3 wins
b )in the presence of (1,2,4 & 5) 4 wins
c) In the presence of (1,2,5) 2 and 5 are effectively the same (not sure how we'd differentiate)
d) In the presence of (1) we must be inside STL itself I assume (so 1 wins)

So from this I understand that you'd want to be able to say "component_b" is in my project so that we can determine who out of 2 and 5 should win.

But ultimately that requires us to allow < to be a main header include AS long as it always looses to " first.

Ultimately if someone write

#include "MyMain.h"

#include "thing.h"
#include <string>

Personally I don't think we shouldn't assume that thing.h is the main header and hoist it out to be: do you?

#include "thing.h"
#include "MyMain.h"

#include <string>

@mydeveloperday
Copy link
Contributor

This mostly comes down to this code in lib/Tooling/Inclusions

bool IncludeCategoryManager::isMainHeader(StringRef IncludeName) const {
  if (!IncludeName.startswith("\""))
    return false;

  IncludeName =
      IncludeName.drop_front(1).drop_back(1); // remove the surrounding "" or <>
  // Not matchingStem: implementation files may have compound extensions but
  // headers may not.
  StringRef HeaderStem = llvm::sys::path::stem(IncludeName);
  StringRef FileStem = llvm::sys::path::stem(FileName); // foo.cu for foo.cu.cc
  StringRef MatchingFileStem = matchingStem(FileName);  // foo for foo.cu.cc
  // main-header examples:
  //  1) foo.h => foo.cc
  //  2) foo.h => foo.cu.cc
  //  3) foo.proto.h => foo.proto.cc
  //
  // non-main-header examples:
  //  1) foo.h => bar.cc
  //  2) foo.proto.h => foo.cc
  StringRef Matching;
  if (MatchingFileStem.startswith_insensitive(HeaderStem))
    Matching = MatchingFileStem; // example 1), 2)
  else if (FileStem.equals_insensitive(HeaderStem))
    Matching = FileStem; // example 3)
  if (!Matching.empty()) {
    llvm::Regex MainIncludeRegex(HeaderStem.str() + Style.IncludeIsMainRegex,
                                 llvm::Regex::IgnoreCase);
    if (MainIncludeRegex.match(Matching))
      return true;
  }
  return false;
}

I personally think we could use another regex to solve this, which might be more powerful than just a list.

@reddwarf69
Copy link
Author

Yes very likely... wouldn't be easy to differentiate.

Assume we have

1) #include <thing>                
2) #include <some_library/thing.h> 
3) #include "thing.h"                 
4) #include "foo_utils/thing.h"  
5) #include <component_b/thing.h>

And we are in thing.cpp can we make any clear distension over who should win as to who is the main?

a) In the presence of (1,2,3,4 & 5) 3 wins b )in the presence of (1,2,4 & 5) 4 wins c) In the presence of (1,2,5) 2 and 5 are effectively the same (not sure how we'd differentiate) d) In the presence of (1) we must be inside STL itself I assume (so 1 wins)

To be fair, the user may have already said which one he prefers with IncludeCategories.

If thing.cpp is in component_b, he may have '^<component_b/' with priority INT_MIN and so he may want only things in component_b considered as main headers.

So from this I understand that you'd want to be able to say "component_b" is in my project so that we can determine who out of 2 and 5 should win.

Honestly, here I would suppose 3 is the main one since it would be the more "local" one (supposing you are following the C++ Core Guidelines, which is not a given). But if thing.cpp is part of component_b the fact that both 3 and 5 exist would be already insane.

You could also argue that, in case of doubt/conflict, if any of the candidates is already the top header, it should be considered the main header. I mean, not only you have 5 different thing.h files includes in a single .cpp file... you also happen to have at the top 1, when it's not really the "main" header? It's likely 1 should win here.

At the end of the day, the "problem" is that clang-format is impossibly flexible when real projects are going to have some kind of more or less sane pattern (the issue being knowing which one). In a completely generic way, making at least any of 1, 3 or 5 the main header could potentially make sense.

Ultimately if someone write

#include "MyMain.h"

#include "thing.h"
#include <string>

Personally I don't think we shouldn't assume that thing.h is the main header and hoist it out to be: do you?

#include "thing.h"
#include "MyMain.h"

#include <string>

No, I would never expect clang-format to do this to my code.

@mydeveloperday
Copy link
Contributor

What do you think about starting by making something like this an option?

bool IncludeCategoryManager::isMainHeader(StringRef IncludeName) const {
  if (!Style.AnyIncludeIsMainHeader && !IncludeName.startswith("\""))
    return false;

@reddwarf69
Copy link
Author

Seems fine to me.

@mydeveloperday
Copy link
Contributor

Suggestions for a good name? not sure I like AnyIncludeIsMainHeader

@reddwarf69
Copy link
Author

Not that I think it's the best name ever. But no, sorry, I don't have a great suggestion either.

@mydeveloperday
Copy link
Contributor

I wonder could this be a regex as well?

by default it would be

image

but could be set to be

image

and then would allow

image

and

image

@seppestas
Copy link

seppestas commented Oct 11, 2022

This looks like a duplicate of #39735 and possibly #27008 to me.

@mydeveloperday
Copy link
Contributor

Reviewing this issue... has the core guideline changed? it seems to hold with the "" for a local header and <> for files elsewhere.

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rs-incform
"SF.12: Prefer the quoted form of #include for files relative to the including file and the angle bracket form everywhere else"

Is this really still a requirement given that we seem to cover the guidelines and https://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html

@reddwarf69
Copy link
Author

reddwarf69 commented Jun 30, 2023

I don't think it has changed. But it says "for files relative to the including file", not "local header" (I guess "local" here would be same project or component).

So, according to the Core Guidelines, in

#include "llvm/Bitstream/BitstreamReader.h"
, supposedly the file is located in "llvm/lib/Bitstream/llvm/Bitstream/BitstreamReader.h", since it's "relative to the including file" (and if such file existed, it would have preference when using gcc). But, in fact, the file is in "llvm/include/llvm/Bitstream/BitstreamReader.h" (https://github.com/llvm/llvm-project/blob/5e2f0947c586042083cc1643f2aac90f2492bacb/llvm/include/llvm/Bitstream/BitstreamReader.h).

Notice the "note":

Failing to follow this results in difficult to diagnose errors due to picking up the wrong file by incorrectly specifying the scope when it is included. For example, in a typical case where the #include "" search algorithm might search for a file existing at a local relative path first, then using this form to refer to a file that is not locally relative could mean that if a file ever comes into existence at the local relative path (e.g. the including file is moved to a new location), it will now be found ahead of the previous include file and the set of includes will have been changed in an unexpected way.

That seems to be the whole argument in the Core Guidelines: What if a header file later appears relative to the including/source file? Better to use <> to avoid the risk.
Honestly, it's silly, you are never going to create a "llvm/Bitstream" directory structure under "llvm/lib/Bitstream". But...

The Core Guideline seem in direct contradiction to both the gcc manual and the C++ standard document. So an option is to try to change the Core Guideline, "the standard says something else" is a decent argument.

@Endilll Endilll added good first issue https://github.com/llvm/llvm-project/contribute and removed beginner labels Aug 15, 2023
@llvmbot
Copy link
Member

llvmbot commented Aug 15, 2023

Hi!

This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:

  1. Assign the issue to you.
  2. Fix the issue locally.
  3. Run the test suite locally.
    3.1) Remember that the subdirectories under test/ create fine-grained testing targets, so you can
    e.g. use make check-clang-ast to only run Clang's AST tests.
  4. Create a git commit
  5. Run git clang-format HEAD~1 to format your changes.
  6. Submit the patch to Phabricator.
    6.1) Detailed instructions can be found here

For more instructions on how to submit a patch to LLVM, see our documentation.

If you have any further questions about this issue, don't hesitate to ask via a comment on this Github issue.

@llvm/issue-subscribers-good-first-issue

j-jorge added a commit to j-jorge/llvm-project that referenced this issue Jan 19, 2024
j-jorge added a commit to j-jorge/llvm-project that referenced this issue Jan 21, 2024
j-jorge added a commit to j-jorge/llvm-project that referenced this issue Jan 22, 2024
owenca pushed a commit that referenced this issue Feb 6, 2024
Resolves #27008, #39735, #53013, #63619.

Hello, this PR adds the MainIncludeChar option to clang-format, allowing
to select which include syntax must be considered when searching for the
main header: quotes (`#include "foo.hpp"`, the default), brackets
(`#include <foo.hpp>`), or both.

The lack of support for brackets has been reported many times, see the
linked issues, so I am pretty sure there is a need for it :)

A short note about why I did not implement a regex approach as discussed
in #53013: while a regex would have allowed many extra ways to describe
the main header, the bug descriptions listed above suggest a very simple
need: support brackets for the main header. This PR answers this needs
in a quite simple way, with a very simple style option. IMHO the feature
space covered by the regex (again, for which there is no demand :)) can
be implemented latter, in addition to the proposed option.

The PR also includes tests for the option with and without grouped
includes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang-format enhancement Improving things as opposed to bug fixing, e.g. new or missing feature good first issue https://github.com/llvm/llvm-project/contribute
Projects
None yet
Development

No branches or pull requests

5 participants