[Clang] [Diagnostics] Simplify filenames that contain '..' #143520

Merged
merged 8 commits into from
Jul 7, 2025

Conversation

Sirraide
Member

@Sirraide Sirraide commented Jun 10, 2025

This can significantly shorten file paths to standard library headers, e.g. on my system, <ranges> is currently printed as

/usr/lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/ranges

but with this change, we instead print

/usr/include/c++/15/ranges

This is of course just a heuristic, so there will definitely be paths that get longer as a result of this, but it helps with the standard library, and a lot of the diagnostics we print tend to originate in standard library headers (especially if you include notes listing overload candidates etc.). Update: We now always use whichever path ends up being shorter.

@AaronBallman pointed out that this might be problematic for network file systems since path resolution might take a while, so this is enabled only for paths that are part of a local filesystem.

The file names are cached in TextDiagnostic. While we could move it up into e.g. TextDiagnosticPrinter, DiagnosticsEngine, or maybe even the FileManager, to me this seems like something that we mainly care about when printing to the terminal (other diagnostics consumers probably don’t mind receiving the original file path). Moreover, this is already where we handle -fdiagnostics-absolute-paths or whatever that flag is called again.
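
To make the caching idea concrete, here is a minimal standalone sketch using only the standard library. The class name `SimplifiedNameCache` and the injected `Simplifier` callback are hypothetical and are not part of the patch, which stores a `llvm::StringMap` directly in `TextDiagnostic`. The point is only that the potentially slow path resolution runs once per distinct file name.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the per-printer cache described above: each
// file name is simplified at most once and then served from the map.
class SimplifiedNameCache {
public:
  using Simplifier = std::function<std::string(const std::string &)>;

  explicit SimplifiedNameCache(Simplifier Fn) : Simplify(std::move(Fn)) {
    assert(Simplify && "a simplifier callback is required");
  }

  // Returns the cached simplified name, computing it on the first request.
  const std::string &get(const std::string &Filename) {
    auto It = Cache.find(Filename);
    if (It == Cache.end()) {
      ++Misses; // The (possibly expensive) resolution runs only on a miss.
      It = Cache.emplace(Filename, Simplify(Filename)).first;
    }
    return It->second;
  }

  unsigned misses() const { return Misses; }

private:
  Simplifier Simplify;
  std::unordered_map<std::string, std::string> Cache;
  unsigned Misses = 0;
};
```

Since diagnostics and their notes tend to mention the same few headers repeatedly, a cache of this shape would mostly be hit rather than missed.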

@Sirraide Sirraide requested a review from AaronBallman June 10, 2025 12:43
@Sirraide Sirraide added the clang:diagnostics New/improved warning or error message in Clang, but not in clang-tidy or static analyzer label Jun 10, 2025
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Jun 10, 2025
@llvmbot
Member

llvmbot commented Jun 10, 2025

@llvm/pr-subscribers-clang-tools-extra
@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang

Author: None (Sirraide)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/143520.diff

2 Files Affected:

  • (modified) clang/include/clang/Frontend/TextDiagnostic.h (+1)
  • (modified) clang/lib/Frontend/TextDiagnostic.cpp (+20-12)
diff --git a/clang/include/clang/Frontend/TextDiagnostic.h b/clang/include/clang/Frontend/TextDiagnostic.h
index e2e88d4d648a2..9c77bc3e00e19 100644
--- a/clang/include/clang/Frontend/TextDiagnostic.h
+++ b/clang/include/clang/Frontend/TextDiagnostic.h
@@ -35,6 +35,7 @@ namespace clang {
 class TextDiagnostic : public DiagnosticRenderer {
   raw_ostream &OS;
   const Preprocessor *PP;
+  llvm::StringMap<SmallString<128>> SimplifiedFileNameCache;
 
 public:
   TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts,
diff --git a/clang/lib/Frontend/TextDiagnostic.cpp b/clang/lib/Frontend/TextDiagnostic.cpp
index b9e681b52e509..edbad42b39950 100644
--- a/clang/lib/Frontend/TextDiagnostic.cpp
+++ b/clang/lib/Frontend/TextDiagnostic.cpp
@@ -738,12 +738,20 @@ void TextDiagnostic::printDiagnosticMessage(raw_ostream &OS,
 }
 
 void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) {
-#ifdef _WIN32
-  SmallString<4096> TmpFilename;
-#endif
-  if (DiagOpts.AbsolutePath) {
-    auto File = SM.getFileManager().getOptionalFileRef(Filename);
-    if (File) {
+  auto File = SM.getFileManager().getOptionalFileRef(Filename);
+
+  // Try to simplify paths that contain '..' in any case since paths to
+  // standard library headers especially tend to get quite long otherwise.
+  // Only do that for local filesystems though to avoid slowing down
+  // compilation too much.
+  auto AlwaysSimplify = [&] {
+    return File->getName().contains("..") &&
+           llvm::sys::fs::is_local(File->getName());
+  };
+
+  if (File && (DiagOpts.AbsolutePath || AlwaysSimplify())) {
+    SmallString<128> &CacheEntry = SimplifiedFileNameCache[Filename];
+    if (CacheEntry.empty()) {
       // We want to print a simplified absolute path, i. e. without "dots".
       //
       // The hardest part here are the paths like "<part1>/<link>/../<part2>".
@@ -759,15 +767,15 @@ void TextDiagnostic::emitFilename(StringRef Filename, const SourceManager &SM) {
       // on Windows we can just use llvm::sys::path::remove_dots(), because,
       // on that system, both aforementioned paths point to the same place.
 #ifdef _WIN32
-      TmpFilename = File->getName();
-      llvm::sys::fs::make_absolute(TmpFilename);
-      llvm::sys::path::native(TmpFilename);
-      llvm::sys::path::remove_dots(TmpFilename, /* remove_dot_dot */ true);
-      Filename = StringRef(TmpFilename.data(), TmpFilename.size());
+      CacheEntry = File->getName();
+      llvm::sys::fs::make_absolute(CacheEntry);
+      llvm::sys::path::native(CacheEntry);
+      llvm::sys::path::remove_dots(CacheEntry, /* remove_dot_dot */ true);
 #else
-      Filename = SM.getFileManager().getCanonicalName(*File);
+      CacheEntry = SM.getFileManager().getCanonicalName(*File);
 #endif
     }
+    Filename = CacheEntry;
   }
 
   OS << Filename;

@Sirraide
Member Author

This is of course just a heuristic, so there will definitely be paths that get longer as a result of this

Actually, it just occurred to me that we could just cache whichever path ends up being shorter, the original one or the resolved one.
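
A rough sketch of the "keep whichever is shorter" rule, using only the standard library. Both function names are hypothetical. Note the assumption: `lexicallySimplify` uses `std::filesystem::path::lexically_normal`, which never touches the filesystem and therefore does not resolve symlinks, whereas the patch calls `FileManager::getCanonicalName` on non-Windows systems.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Purely lexical simplification: collapses "dir/.." pairs without touching
// the filesystem, so symlinks are NOT resolved (unlike the patch).
std::string lexicallySimplify(const std::string &Path) {
  return std::filesystem::path(Path).lexically_normal().generic_string();
}

// Keep whichever spelling is shorter; ties go to the original spelling.
std::string shorterName(const std::string &Original,
                        const std::string &Resolved) {
  return Resolved.size() < Original.size() ? Resolved : Original;
}
```

With this rule, a relative name such as `../x.h` would be kept whenever resolving it produces a longer absolute path.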

@Sirraide
Member Author

I’m not exactly sure how to test this change since this is not only platform-dependent but also path-dependent since we may end up producing absolute paths here.

@AaronBallman
Collaborator

The file names are cached in TextDiagnostic. While we could move it up into e.g. TextDiagnosticPrinter, DiagnosticsEngine, or maybe even the FileManager, to me this seems like something that we mainly care about when printing to the terminal (other diagnostics consumers probably don’t mind receiving the original file path). Moreover, this is already where we handle -fdiagnostics-absolute-paths or whatever that flag is called again.

The downside to it being in TextDiagnostic is that consumers then all have to normalize the path themselves (some file system APIs on some systems are better about relative paths than others). If the paths are always equivalent, it might be kinder to pass the resolved path. WDYT?

@AaronBallman
Collaborator

I’m not exactly sure how to test this change since this is not only platform-dependent but also path-dependent since we may end up producing absolute paths here.

I think this is a case where maybe we want to use unit tests. We have clang/unittests/Basic/DiagnosticTest.cpp or FileManagerTest.cpp already, so perhaps in one of those (depending on where the functionality ends up living)?

@Sirraide
Member Author

The downside to it being in TextDiagnostic is that consumers then all have to normalize the path themselves (some file system APIs on some systems are better about relative paths than others). If the paths are always equivalent, it might be kinder to pass the resolved path. WDYT?

I mean, that’s also true I suppose; the only thing is then that we’d be normalising them twice if -fdiagnostics-absolute-paths is passed—unless we move the handling for that elsewhere as well, but now that’s dependent on the diagnostic options, so it probably shouldn’t be in FileManager—which leaves DiagnosticsEngine? But consumers don’t generally have access to the DiagnosticsEngine, so it’d have to be in the FileManager after all.

I guess we could always compute both the absolute and the ‘short’ path for a file whenever FileManager opens one so that they’re always available. But that might have some impact on performance (though I guess this is a perf/ branch already so we can try and see how it goes)?

@AaronBallman Thoughts?

@AaronBallman
Collaborator

The downside to it being in TextDiagnostic is that consumers then all have to normalize the path themselves (some file system APIs on some systems are better about relative paths than others). If the paths are always equivalent, it might be kinder to pass the resolved path. WDYT?

I mean, that’s also true I suppose; the only thing is then that we’d be normalising them twice if -fdiagnostics-absolute-paths is passed—unless we move the handling for that elsewhere as well, but now that’s dependent on the diagnostic options, so it probably shouldn’t be in FileManager—which leaves DiagnosticsEngine? But consumers don’t generally have access to the DiagnosticsEngine, so it’d have to be in the FileManager after all.

We definitely don't want to normalize twice. Could we parameterize FileManager so we don't have to have it directly depend on diagnostic options?

I guess we could always compute both the absolute and the ‘short’ path for a file whenever FileManager opens one so that they’re always available. But that might have some impact on performance (though I guess this is a perf/ branch already so we can try and see how it goes)?

I think we could try it to see how it goes in terms of performance. Again, I think I'd be most worried about network builds -- I would expect a measurable difference in performance even if there are no diagnostics issued just because we need the file information for SourceManager.

@Sirraide
Member Author

We definitely don't want to normalize twice. Could we parameterize FileManager so we don't have to have it directly depend on diagnostic options?

One idea I just had is we could do something like:

enum class DiagnosticFileNameMode {
  Unmodified, // As specified by the user
  Canonical,  // Absolute path
  Short,      // Whichever is shorter
};

class FileManager {
  // ...
  StringRef getFileNameForDiagnostic(DiagnosticFileNameMode Mode);
};

And then have separate caches in FileManager for each kind of DiagnosticFileNameMode and compute the corresponding file name lazily the first time it’s requested.
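
As a self-contained illustration of this proposal (not the code that eventually landed), the sketch below keeps one lazily filled cache per mode. The class name `FileNameTable`, the extra `Filename` parameter, and the `computations()` counter are hypothetical. As a further assumption, `Canonical` is approximated by lexical normalisation rather than by producing a real absolute, symlink-resolved path.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <string>
#include <unordered_map>

enum class DiagnosticFileNameMode { Unmodified, Canonical, Short };

// Hypothetical stand-in for the FileManager-side caches: one map per mode,
// each entry computed lazily on first request.
class FileNameTable {
public:
  const std::string &getFileNameForDiagnostic(const std::string &Filename,
                                              DiagnosticFileNameMode Mode) {
    auto &Cache = Caches[static_cast<std::size_t>(Mode)];
    auto It = Cache.find(Filename);
    if (It == Cache.end()) {
      ++Computations;
      It = Cache.emplace(Filename, compute(Filename, Mode)).first;
    }
    return It->second;
  }

  unsigned computations() const { return Computations; }

private:
  static std::string compute(const std::string &Filename,
                             DiagnosticFileNameMode Mode) {
    if (Mode == DiagnosticFileNameMode::Unmodified)
      return Filename;
    // Assumption: lexical normalisation stands in for real canonicalisation.
    std::string Normal =
        std::filesystem::path(Filename).lexically_normal().generic_string();
    if (Mode == DiagnosticFileNameMode::Canonical)
      return Normal;
    // Short: whichever spelling is shorter.
    return Normal.size() < Filename.size() ? Normal : Filename;
  }

  std::array<std::unordered_map<std::string, std::string>, 3> Caches;
  unsigned Computations = 0;
};
```

Keeping the mode as a parameter means the table itself never needs to look at the diagnostic options, which is the decoupling asked for above.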

@AaronBallman
Collaborator

We definitely don't want to normalize twice. Could we parameterize FileManager so we don't have to have it directly depend on diagnostic options?

One idea I just had is we could do something like:

enum class DiagnosticFileNameMode {
  Unmodified, // As specified by the user
  Canonical,  // Absolute path
  Short,      // Whichever is shorter
};

class FileManager {
  // ...
  StringRef getFileNameForDiagnostic(DiagnosticFileNameMode Mode);
};

And then have separate caches in FileManager for each kind of DiagnosticFileNameMode and compute the corresponding file name lazily the first time it’s requested.

I think that approach makes sense! Though "short path" means something different to those of us old enough to remember DOS 8.3 filenames. :-D

@Sirraide
Member Author

I think that approach makes sense! Though "short path" means something different to those of us old enough to remember DOS 8.3 filenames. :-D

Ha, those I’m not planning to add support for thankfully...

@Sirraide
Member Author

Actually, we could also just put it in SourceManager because that already has a reference to the DiagnosticsEngine and then a single getNameForDiagnostic(StringRef Filename) function would do.

@llvmbot llvmbot added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Jun 10, 2025
@Sirraide
Member Author

Actually, we could also just put it in SourceManager because that already has a reference to the DiagnosticsEngine and then a single getNameForDiagnostic(StringRef Filename) function would do.

I’ve done that. Also, SARIFDiagnostic::emitFilename() was just a copy-pasted version of TextDiagnostic::emitFilename(), so I’ve updated it to use the new function as well.

@Sirraide
Member Author

Ok, I’ve fixed a crash involving a dangling reference and also disabled the check for a local filesystem on Windows. It also seems that we do have a single test for this already.

Collaborator

@AaronBallman AaronBallman left a comment

In general, I'm in favor of this patch. Precommit CI found relevant failures that need to be fixed, but I think this is otherwise good to go once those are addressed.

Comment on lines +2459 to +2466
#ifdef _WIN32
TempBuf = File->getName();
llvm::sys::fs::make_absolute(TempBuf);
llvm::sys::path::native(TempBuf);
llvm::sys::path::remove_dots(TempBuf, /* remove_dot_dot */ true);
#else
TempBuf = getFileManager().getCanonicalName(*File);
#endif
Contributor

Sort of pre-existing, but I am not sure doing something different for Windows actually makes sense.
Symlinks on Windows exist, they are just very rare.

Member Author

Hmm, I was interpreting the comment above this to mean that Windows itself doesn’t care about symlinks when resolving .. and actually just deletes preceding path segments, but I’m not much of a Windows person so I don’t know to be fair...
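
To illustrate what "just deletes preceding path segments" means, here is a small lexical `..` remover for `/`-separated paths. The function `removeDotsLexically` is a hypothetical illustration and is not `llvm::sys::path::remove_dots`. For `/srv/link/../data/x.h` it yields `/srv/data/x.h`, but if `link` were a symlink to a directory elsewhere, POSIX resolution would follow the link first and end up under the parent of the link target instead. That is the `<part1>/<link>/../<part2>` case the code comment in the diff refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Removes "." and ".." segments textually, without consulting the
// filesystem. Symlinks are therefore ignored entirely.
std::string removeDotsLexically(const std::string &Path) {
  const bool IsAbsolute = !Path.empty() && Path.front() == '/';
  std::vector<std::string> Kept;
  std::istringstream In(Path);
  std::string Segment;
  while (std::getline(In, Segment, '/')) {
    if (Segment.empty() || Segment == ".")
      continue;
    if (Segment == "..") {
      if (!Kept.empty() && Kept.back() != "..") {
        Kept.pop_back(); // Drop the preceding segment.
        continue;
      }
      if (IsAbsolute)
        continue; // "/.." is the same as "/".
    }
    Kept.push_back(Segment);
  }
  std::string Result = IsAbsolute ? "/" : "";
  for (std::size_t I = 0; I < Kept.size(); ++I) {
    if (I != 0)
      Result += '/';
    Result += Kept[I];
  }
  return Result.empty() ? "." : Result;
}
```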

Comment on lines +2436 to +2437
if (!SimplifyPath)
return Filename;
Contributor

I wonder if it would be more efficient to check the map first

Member Author

Hmm, not sure. This is a perf/ branch anyway so when I’m done fixing the tests (I think it’s just a clang-tidy test that’s left at this point?) we can just check if there are any regressions and try doing this instead if so.

@Sirraide
Member Author

Sirraide commented Jul 2, 2025

Ok, looks like the clang-tidy test failure is related to the -header-filter option:

// Check that `-header-filter` operates on the same file paths as paths in
// diagnostics printed by ClangTidy.
#include "dir1/dir2/../header_alias.h"
// CHECK_HEADER_ALIAS: dir1/dir2/../header_alias.h:1:11: warning: single-argument constructors

So, I guess my question is now, should the header filter apply to the original filename, the simplified filename, or both?

@AaronBallman Thoughts? Or alternatively, who is best to ping for clang-tidy related questions?

@Sirraide
Member Author

Sirraide commented Jul 2, 2025

should the header filter apply to the original filename,

I mean, I guess this, because it’s what the user specified and it’s what we’re currently doing? It’s just that the end result might be weird, e.g. if a user writes -exclude-header-filter="a/foo.h" and then we print diagnostics in a/foo.h because it was actually included via "a/b/../foo.h" (assuming I’m not misinterpreting what this option does), but maybe that’s ok?
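
The mismatch can be reproduced in isolation. This sketch assumes the filter is a regular expression searched for within the file path, and uses `std::regex` plus lexical normalisation as stand-ins. `matchesFilter` and `simplified` are hypothetical helpers, not clang-tidy functions.

```cpp
#include <cassert>
#include <filesystem>
#include <regex>
#include <string>

// Hypothetical helper: true if the filter regex matches anywhere in Path.
bool matchesFilter(const std::string &FilterRegex, const std::string &Path) {
  return std::regex_search(Path, std::regex(FilterRegex));
}

// Lexical stand-in for the simplified spelling printed in diagnostics.
std::string simplified(const std::string &Path) {
  return std::filesystem::path(Path).lexically_normal().generic_string();
}
```

A filter written as `a/foo\.h` does not match the spelling `a/b/../foo.h` but does match its simplified form `a/foo.h`, so filtering on the original name while printing the simplified one produces exactly the surprising output described above.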

@AaronBallman
Collaborator

Ok, looks like the clang-tidy test failure is related to the -header-filter option:

// Check that `-header-filter` operates on the same file paths as paths in
// diagnostics printed by ClangTidy.
#include "dir1/dir2/../header_alias.h"
// CHECK_HEADER_ALIAS: dir1/dir2/../header_alias.h:1:11: warning: single-argument constructors

So, I guess my question is now, should the header filter apply to the original filename, the simplified filename, or both?

@AaronBallman Thoughts? Or alternatively, who is best to ping for clang-tidy related questions?

CC @5chmidti @PiotrZSL @HerrCai0907 @LegalizeAdulthood for more opinions on this

I would naively expect that I'm giving the tool a path, the tool will resolve all symlinks and relative parts, etc for me same as it would do when specifying the file to compile/tidy.

@LegalizeAdulthood
Contributor

My expectation would be that if I specify a header filter I'm not going to use weird paths like a/b/../foo.h, but just a/foo.h because that is where foo.h lives.

@AaronBallman
Collaborator

My expectation would be that if I specify a header filter I'm not going to use weird paths like a/b/../foo.h, but just a/foo.h because that is where foo.h lives.

What about symlinks though? Would you expect that passing path/to/file fails because path is a symlink and you should have specified /var/foo/bar/to/file? (It's basically the same problem -- do we make the user pass the resolved path or do we canonicalize the path for the user and use that to do the filtering?)

Collaborator

@AaronBallman AaronBallman left a comment

LGTM!

@Sirraide
Member Author

Sirraide commented Jul 7, 2025

Windows CI failures seem unrelated.

@Sirraide Sirraide merged commit e3e7393 into llvm:main Jul 7, 2025
9 checks passed
Sirraide added a commit to Sirraide/llvm-project that referenced this pull request Jul 8, 2025
Sirraide added a commit that referenced this pull request Jul 8, 2025
I forgot to include a release note in #143520, and it also occurred to me
that while #143514 is technically a bugfix in LLVM/Support, I think we
should have one for it as well.
@justincady
Contributor

@Sirraide I'm seeing failures in the newly added test cases with a repo on a network mount. I see the code changes handle that scenario, but if I understand correctly the tests currently do not.

Could the tests be updated to either be skipped or validate that the canonicalization does not happen in that case?

@Sirraide
Member Author

@Sirraide I'm seeing failures in the newly added test cases with a repo on a network mount. I see the code changes handle that scenario, but if I understand correctly the tests currently do not.

Could the tests be updated to either be skipped or validate that the canonicalization does not happen in that case?

Ah that makes sense; unfortunately, I’m not entirely sure how you’d detect the presence of a network mount in order to skip or change the tests in that case. Does lit have some way of doing that?

@Sirraide
Member Author

Does lit have some way of doing that?

CC @AaronBallman

@AaronBallman
Collaborator

I don't think we have a way to support that. I looked through lit to see what kind of REQUIRES support we have and we can target systems and such, but I don't see any built-in support for non-network mounts.

The only suggestion I have might be terrible, but you could write some python to enumerate mount points, determine which ones are network drives, and then check the test path against the list of mount points. But the easy alternative is for folks in this situation to XFAIL the test locally (unfortunately).
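
The mount-point check could look roughly like the following. The comment suggests Python; this sketch uses C++ to stay in the project's language. It assumes a Linux-style mount table in the `/proc/mounts` layout (`device mountpoint fstype ...`), and the list of network filesystem types is illustrative, not exhaustive. `isOnNetworkMount` is a hypothetical name, and nothing like it exists in lit.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// True if Path equals MountPoint or lies below it.
static bool isUnder(const std::string &Path, const std::string &MountPoint) {
  if (MountPoint == "/")
    return !Path.empty() && Path.front() == '/';
  if (Path.compare(0, MountPoint.size(), MountPoint) != 0)
    return false;
  return Path.size() == MountPoint.size() || Path[MountPoint.size()] == '/';
}

// Finds the longest mount point containing Path and reports whether its
// filesystem type is one of a few well-known network types.
// Note: escaped characters in mount points (e.g. "\040") are not handled.
bool isOnNetworkMount(const std::string &MountTable, const std::string &Path) {
  static const std::set<std::string> NetworkTypes = {"nfs", "nfs4", "cifs",
                                                     "smb3", "fuse.sshfs"};
  std::istringstream Lines(MountTable);
  std::string Line, BestType;
  std::size_t BestLen = 0;
  while (std::getline(Lines, Line)) {
    std::istringstream Fields(Line);
    std::string Device, MountPoint, Type;
    if (!(Fields >> Device >> MountPoint >> Type))
      continue; // Skip malformed lines.
    if (isUnder(Path, MountPoint) && MountPoint.size() >= BestLen) {
      BestLen = MountPoint.size();
      BestType = Type;
    }
  }
  return NetworkTypes.count(BestType) != 0;
}
```

On Linux the table text would come from reading `/proc/mounts`; other platforms would need a different source, which is part of why this approach is awkward for a portable test suite.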

@justincady
Contributor

Instead of operating in place, could you avoid the network mount entirely by creating the required structure under /tmp and running the test there?

@Sirraide
Member Author

The only suggestion I have might be terrible, but you could write some python to enumerate mount points, determine which ones are network drives, and then check the test path against the list of mount points.

I will say that I don’t think I know enough Python (basically none at all really) to implement this approach in a sane manner...

Instead of operating in place, could you avoid the network mount entirely by creating the required structure under /tmp and running the test there?

We could try that (assuming that /tmp exists on every system but I sure hope so); it might run into issues if we somehow pick a file/directory name that something else is already using, but that’s probably not too likely.

@AaronBallman
Collaborator

We could try that (assuming that /tmp exists on every system but I sure hope so); it might run into issues if we somehow pick a file/directory name that something else is already using, but that’s probably not too likely.

Windows doesn't have /tmp for example. I don't think we have any substitutions for getting the temp directory.

@Sirraide
Member Author

Windows doesn't have /tmp for example. I don't think we have any substitutions for getting the temp directory.

Ah, I meant non-windows systems (I thought REQUIRES: shell already meant non-windows because the test uses quite a few unix commands but I might be wrong). But also, doesn’t windows have like %TEMP% or something? If that works we could probably ‘just’ add a substitution for that (I’m not an expert on anything cross-platform though so this might still be an issue for some platforms).

@AaronBallman
Collaborator

Windows doesn't have /tmp for example. I don't think we have any substitutions for getting the temp directory.

Ah, I meant non-windows systems (I thought REQUIRES: shell already meant non-windows because the test uses quite a few unix commands but I might be wrong). But also, doesn’t windows have like %TEMP% or something? If that works we could probably ‘just’ add a substitution for that (I’m not an expert on anything cross-platform though so this might still be an issue for some platforms).

Good call about %TEMP%! But I think if we're going to go down this path, it might make sense to add a new substitution to lit (maybe %tempdir) so the logic is hidden outside of the test and can be reused if anyone else runs into this.

I suppose another option is to use a regex in the test to accept either form of canonicalization with a comment that lit tests are sometimes run from network mounts and that's why the test is the way it is.

@Sirraide
Member Author

I suppose another option is to use a regex in the test to accept either form of canonicalization with a comment that lit tests are sometimes run from network mounts and that's why the test is the way it is.

I mean, at that point we can just delete the test entirely though because testing that ‘either it canonicalises the path or it doesn’t’ feels a bit tautological...

@AaronBallman
Collaborator

I suppose another option is to use a regex in the test to accept either form of canonicalization with a comment that lit tests are sometimes run from network mounts and that's why the test is the way it is.

I mean, at that point we can just delete the test entirely though because testing that ‘either it canonicalises the path or it doesn’t’ feels a bit tautological...

Yeah, but it does test we don't generate something else entirely invented. :-D But yeah, not a super useful test, I suppose.

@jyknight
Member

This change is a major issue for Google:

Our distributed build system constructs a directory tree full of symlinks from the "expected" directory hierarchy for a compilation to a content-addressed-storage directory full of hashes. So, because this resolves symlinks before reporting filenames, errors are now reported like "/build/cas/0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef:1:12: error: ", whenever a path with ".." in it would've been shown before. This makes the error messages useless.

Additionally, I think this is also a bad change even outside of Google's arguably-weird use-case of CAS symlinks, because it can cause a build-root-relative path to be transformed into machine-absolute path.

As such, I'd request that this commit be reverted.

To solve the problem reported here (which I agree is a problem worth solving!), I think we'd be better off leaning on the already-existing "prefix canonicalization" functionality. When Clang finds that its C++ stdlib is at "/usr/lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14", as part of the GCC installation detection, it should canonicalize the path right then -- not later on when printing diagnostics.

I don't know why Clang doesn't already do this -- it seems eminently reasonable to canonicalize the GCC prefix under the same circumstances that Clang will canonicalize its own prefix. (That is: do so by default, but disabled by "-no-canonical-prefixes". Google's buildsystem has specified that flag for aeons.)

@Sirraide
Member Author

To solve the problem reported here (which I agree is a problem worth solving!), I think we'd be better off leaning on the already-existing "prefix canonicalization" functionality. When Clang finds that its C++ stdlib is at "/usr/lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14", as part of the GCC installation detection, it should canonicalize the path right then -- not later on when printing diagnostics.

That would also be an option; the issue is arguably most prevalent when dealing with the standard library. That would also avoid the network filesystem issue because we’d only be resolving a single path, and doing that unconditionally shouldn’t be an issue.

As such, I'd request that this commit be reverted.

I think we’d still want to keep the part of this patch that moves the handling of -fdiagnostics-absolute-paths into SourceManager because that gets rid of some code duplication between the text and SARIF diagnostics handlers. I can revert and then reland this if the issue is pressing for you, but if not then I’ll make a follow-up patch in the next couple of days that changes this to canonicalise only the GCC dir; either way is fine by me.

@eaeltsin
Contributor

I can revert and then reland this if the issue is pressing for you

@Sirraide - this is pressing for us, so please do a clean revert first. Thank you!

Sirraide added a commit that referenced this pull request Jul 12, 2025
Sirraide added a commit that referenced this pull request Jul 12, 2025
…148367)

Revert #143520 for now since it’s causing issues for
people who are using symlinks and prefer to preserve the original path
(i.e. looks like we’ll have to make this configurable after all; I just
need to figure out how to pass `-no-canonical-prefixes` down through the
driver); I’m planning to refactor this a bit and reland it in a few
days.
@Sirraide
Member Author

do so by default, but disabled by "-no-canonical-prefixes"

So just to be clear, while there might be a better place to canonicalise the paths (we can’t treat include paths the same as the resource directory because that’s canonicalised in clang_main), the main thing you care about is being able to turn off the canonicalisation entirely with -no-canonical-prefixes, is that right?

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 12, 2025
…ain '..'" (#148367)

Revert llvm/llvm-project#143520 for now since it’s causing issues for
people who are using symlinks and prefer to preserve the original path
(i.e. looks like we’ll have to make this configurable after all; I just
need to figure out how to pass `-no-canonical-prefixes` down through the
driver); I’m planning to refactor this a bit and reland it in a few
days.
@jyknight
Member

the main thing you care about is being able to turn off the canonicalisation entirely with -no-canonical-prefixes, is that right?

In terms of not breaking the Google buildsystem: yes. But if the question is whether the current patch would be fine if its behavior was also disabled by that flag: I'm not in favor of that.

While that would address the breakage which brought this PR to my attention, my worry about ill-effects from canonicalization of the user's build-dir-relative file paths into absolute paths is a more general one.

@Sirraide
Member Author

my worry about ill-effects from canonicalization of the user's build-dir-relative file paths into absolute paths is a more general one

Yeah, and moreover, another point I just thought of is that users tend to have more control over their own include directories, i.e. if you don’t want your paths to be printed w/ .., then just don’t use relative includes and add the directories you’re including from to the include path. But when it comes to the standard library, users don’t get a choice because we have those paths hard-coded as relative paths. So doing this only for standard library include paths sgtm.

More specifically, my idea now is to canonicalise all paths added via -internal-isystem (unless -no-canonical-prefixes is specified) since

  1. there are a lot of places in which we create relative paths to standard library headers, and doing canonicalisation where we first compose those paths would require doing that in dozens of places; from what I can tell though, we consistently use -internal-isystem (or a variant of it for C headers iirc) for all of them, and
  2. -internal-isystem is not supposed to be a user-facing flag anyway (I think it’s only supported as a cc1 option?), so if you’re using that as a user then you get what you get imo.
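
The plan sketched in the two points above amounts to simplifying each system include directory once, when it is registered, instead of simplifying every file name at diagnostic time. The following is only an illustration under stated assumptions: `prepareSystemIncludeDirs` and its `CanonicalPrefixes` parameter are hypothetical, and lexical normalisation stands in for whatever canonicalisation a follow-up patch would actually use.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical: simplify system include directories once, up front, unless
// the caller opted out (mirroring a "no canonical prefixes" style switch).
std::vector<std::string>
prepareSystemIncludeDirs(const std::vector<std::string> &Dirs,
                         bool CanonicalPrefixes) {
  std::vector<std::string> Result;
  Result.reserve(Dirs.size());
  for (const std::string &Dir : Dirs) {
    if (!CanonicalPrefixes) {
      Result.push_back(Dir); // Preserve the spelling exactly as given.
      continue;
    }
    // Assumption: lexical normalisation only; symlinks are not resolved.
    Result.push_back(
        std::filesystem::path(Dir).lexically_normal().generic_string());
  }
  return Result;
}
```

Because only a handful of directories are processed, the cost is independent of how many headers or diagnostics a compilation produces, and user-provided include paths keep their original spelling.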

@AaronBallman
Collaborator

I think the new plan forward makes sense. But also: thank you to everyone on this thread for the excellent collaboration on identifying an issue, getting the previous incarnation reverted, and discussing a good path forward. :-)

Labels
clang:diagnostics New/improved warning or error message in Clang, but not in clang-tidy or static analyzer clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category clang-tidy clang-tools-extra
8 participants