[clang] Improve nested name specifier AST representation #147835
Summary: Clang recently made major changes to the AST representation of nested name specifiers in [llvm-project#147835](llvm/llvm-project#147835). While waiting for Bindgen to be updated to understand the new representation, this diff changes our Bindgen invocation on `folly::IOBuf` to disregard the nested name `folly::IOBuf::Iterator`, which is unused anyway in the Rust `IOBuf` binding. This unblocks staging builds for the LLVM Server Compiler team.

This diff causes the following difference in the Bindgen-generated Rust binding observed through `buck2 build fbcode//mode/opt fbcode//folly/rust/iobuf:iobuf-sys-bindgen`:

```lang=diff
@@ -571,8 +571,6 @@
         pub const IOBuf_CombinedOption_SEPARATE: root::folly::IOBuf_CombinedOption = 2;
         pub type IOBuf_CombinedOption = ::std::os::raw::c_int;
         pub type IOBuf_value_type = root::folly::ByteRange;
-        pub type IOBuf_iterator = root::folly::IOBuf_Iterator;
-        pub type IOBuf_const_iterator = root::folly::IOBuf_Iterator;
         pub type IOBuf_FreeFunction = ::std::option::Option<
             unsafe extern "C" fn(
                 buf: *mut ::std::os::raw::c_void,
@@ -765,31 +763,6 @@
                 "Offset of field: IOBuf::sharedInfo_",
             ][::std::mem::offset_of!(IOBuf, sharedInfo_) - 48usize];
         };
-        #[repr(C)]
-        #[derive(Debug, Copy, Clone)]
-        pub struct IOBuf_Iterator {
-            pub pos_: *const root::folly::IOBuf,
-            pub end_: *const root::folly::IOBuf,
-            pub val_: root::folly::ByteRange,
-        }
-        #[allow(clippy::unnecessary_operation, clippy::identity_op)]
-        const _: () = {
-            [
-                "Size of IOBuf_Iterator",
-            ][::std::mem::size_of::<IOBuf_Iterator>() - 32usize];
-            [
-                "Alignment of IOBuf_Iterator",
-            ][::std::mem::align_of::<IOBuf_Iterator>() - 8usize];
-            [
-                "Offset of field: IOBuf_Iterator::pos_",
-            ][::std::mem::offset_of!(IOBuf_Iterator, pos_) - 0usize];
-            [
-                "Offset of field: IOBuf_Iterator::end_",
-            ][::std::mem::offset_of!(IOBuf_Iterator, end_) - 8usize];
-            [
-                "Offset of field: IOBuf_Iterator::val_",
-            ][::std::mem::offset_of!(IOBuf_Iterator, val_) - 16usize];
-        };
     }
     pub mod facebook {
         #[allow(unused_imports)]
```

Bindgen issue: [bindgen#3264](rust-lang/rust-bindgen#3264)

Reviewed By: HighW4y2H3ll

Differential Revision: D80186965

fbshipit-source-id: 9f0546a2964b5d4e5c67e6bdd91b542e3e5f7b2c
This is a major change in how we represent nested name qualifications in the AST.

* The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy.
* The ElaboratedType node is removed; all type nodes to which it could previously apply can now store the elaborated keyword and name qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size.

This patch offers a great performance benefit.

It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%.

This has great results on compile-time-tracker as well.

This patch also further enables other optimizations in the future, and will reduce the performance impact of template specialization resugaring when that lands.

It has some other miscellaneous drive-by fixes.
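To make the "lightweight handle with a free bit" idea concrete, here is a minimal sketch of the general tagged-pointer technique. It is not Clang's implementation: `SpecifierHandle`, `TypeNode`, `NamespaceNode`, the 8-byte alignment, and the choice of two tag bits are all assumptions made for illustration. The point is only that a pointer-sized value can carry a kind tag in its low alignment bits and still leave a bit unused for a surrounding container.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for AST nodes; these are not Clang's real classes.
struct alignas(8) TypeNode { const char *name; };
struct alignas(8) NamespaceNode { const char *name; };

// A pointer-sized handle meant to be passed by value. Both node kinds are
// 8-byte aligned, so the low 3 bits of a node address are always zero. This
// sketch spends 2 of them on a kind tag, which leaves 1 low bit unused.
class SpecifierHandle {
public:
  enum class Kind : std::uintptr_t { Null = 0, Type = 1, Namespace = 2 };

  SpecifierHandle() = default;
  explicit SpecifierHandle(const TypeNode *t) : bits_(pack(t, Kind::Type)) {}
  explicit SpecifierHandle(const NamespaceNode *n)
      : bits_(pack(n, Kind::Namespace)) {}

  Kind kind() const { return static_cast<Kind>(bits_ & KindMask); }
  explicit operator bool() const { return kind() != Kind::Null; }

  // Each accessor returns nullptr when the handle holds a different kind.
  const TypeNode *asType() const {
    return kind() == Kind::Type ? static_cast<const TypeNode *>(pointer())
                                : nullptr;
  }
  const NamespaceNode *asNamespace() const {
    return kind() == Kind::Namespace
               ? static_cast<const NamespaceNode *>(pointer())
               : nullptr;
  }

private:
  static constexpr std::uintptr_t KindMask = 0x3;

  static std::uintptr_t pack(const void *p, Kind k) {
    auto raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & 0x7) == 0 && "node must be 8-byte aligned");
    return raw | static_cast<std::uintptr_t>(k);
  }
  const void *pointer() const {
    return reinterpret_cast<const void *>(bits_ & ~KindMask);
  }

  std::uintptr_t bits_ = 0;
};
```

Constructing a handle from a `TypeNode *` and calling `asType()` returns the same pointer, while `asNamespace()` returns null. The handle stays the size of one pointer, which is what makes passing it by value cheap.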
About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started with the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact.

There are also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`.

The rest and bulk of the changes are mostly consequences of the changes in API.

PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just to make rebasing easier. I plan to rename it back after this lands.

Fixes llvm#136624
Fixes llvm#43179
Fixes llvm#68670
Fixes llvm#92757
…lvm#153344) This fixes a regression reported here llvm#147835 (comment), where getTrivialTemplateArgumentLoc can't see through template name sugar when producing a trivial TemplateArgumentLoc for template template arguments. Since this regression was never released, there are no release notes.
…ecializationType (#153646) This was a regression introduced in llvm/llvm-project#147835 Since this regression was never released, there are no release notes. Fixes llvm/llvm-project#153540
We started to see a CI failure around the time this PR landed in https://green.lab.llvm.org/job/llvm.org/job/clang-stage2-Rthinlto/1079/ The build hangs in this step (using a stage 2 compiler):
This is where the compiler seems to hang:
Any ideas what might be going on here?
Out of curiosity. The old
I don't know if this is the same as the other crash report, but we're also seeing assert failures when building chromium. I've minimized the repro to the following program:
This fails when running
This helps establish the source file and module ownership origins of the dependencies of some piece of code.
Seems like it is stuck somewhere calculating the linkage and visibility for some type.
Thanks! However, for the following piece of code:
struct A;
struct A {};
struct A;
A* p;
This fixes a regression reported here #147835 (comment) Since this regression was never released, there are no release notes.
Yes, that is correct. The 3rd line is the declaration that was found by type lookup at that point, and the canonical declaration has long been established in Clang as the first declaration.
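The distinction in the `struct A` exchange above can be sketched with a toy redeclaration chain. This is a simplified model, not Clang's API: `Decl`, `Entity`, `lookupAt`, and `definitionOrSelf` are hypothetical names, and the rule "lookup finds the most recent declaration before the use" is an assumption that only holds for this simple case.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of a redeclaration chain; not Clang's real classes.
struct Decl {
  int line;          // source line where this declaration appears
  bool isDefinition; // true for `struct A {};`, false for `struct A;`
};

struct Entity {
  std::vector<Decl> redecls; // all declarations of one entity, in source order

  // The canonical declaration is modelled as the first one in the chain.
  const Decl *canonicalDecl() const { return &redecls.front(); }

  // Assumed lookup rule: the most recent declaration before the use.
  const Decl *lookupAt(int useLine) const {
    const Decl *found = nullptr;
    for (const Decl &d : redecls)
      if (d.line < useLine)
        found = &d;
    return found;
  }

  // Prefer the definition when the chain has one, else keep the given decl.
  const Decl *definitionOrSelf(const Decl *d) const {
    for (const Decl &r : redecls)
      if (r.isDefinition)
        return &r;
    return d;
  }
};

// Models the four-line snippet: lines 1 to 3 declare A, line 4 uses it.
inline Entity makeStructA() {
  Entity a;
  a.redecls.push_back(Decl{1, false});
  a.redecls.push_back(Decl{2, true});
  a.redecls.push_back(Decl{3, false});
  return a;
}
```

For the use on line 4, `lookupAt(4)` yields the declaration on line 3, `canonicalDecl()` yields line 1, and `definitionOrSelf` applied to the found declaration yields line 2. Three different declarations of the same entity are therefore in play, which is why a type node that remembers the found declaration carries more information than one node per entity.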
Thanks, this is going to be fixed by #153862
…red bindings (#153923) These are implicit VarDecls whose type was never written in source code. Don't create a TypeLoc and give it a fake source location. The fake as-written type also didn't match the actual type, which after fixing this gives some unrelated test churn on a CFG dump, since statement printing prefers type source info if that's available. Fixes #153649 This is a regression introduced in #147835 This regression was never released, so no release notes are added.
cast<InjectedClassNameType>(Ty)->getDecl());
mangleSourceNameWithAbiTags(cast<InjectedClassNameType>(Ty)
                                ->getOriginalDecl()
                                ->getDefinitionOrSelf());
Is this `getDefinitionOrSelf()` call necessary here? Shouldn't `InjectedClassNameType` always refer to the definition where the class name is injected to?
Not always.
I mean you could certainly attempt to define an `InjectedClassNameType` whose declaration necessarily is the definition.
This would cause many annoyances, most of them surmountable, as you wouldn't be able to form a type to an arbitrary declaration (which we do offhandedly in error recovery and printing diagnostics), and the type would play by different rules than other tag types.
Though I think at the end, you would come to the problem of demoted definitions, which come up for example when merging definitions from the GMF of different modules. You would have to throw away the ability to represent an InjectedClassNameType pointing to a demoted definition.
I've caught an example:
template <typename>
struct Tpl;
template <typename>
struct Tpl {
Tpl(const Tpl&) {}
};
The injected name from the copy ctor prototype refers to the forward-declaration from the first two lines. Looks a little strange but Ok.
Are you sure it's not just its canonical type which refers to the first two lines?
Heh, interesting. The `getAs` function actually returns the canonical type.
`getAs` only desugars. This could happen with a TemplateSpecializationType whose canonical type is an InjectedClassNameType, since in that case desugar just returns the canonical type, but that is not the case for that constructor, since it's using an InjectedClassNameType as written in source code.
I've understood: there are specializations of `getAs` for the leaf non-sugar types (llvm-project/clang/include/clang/AST/Type.h, lines 3138 to 3140 in 71925a9):
template <> inline const Class##Type *Type::getAs() const { \
  return dyn_cast<Class##Type>(CanonicalType); \
} \
If you call `getAs<InjectedClassNameType>()` on a `Type` which is already `InjectedClassNameType`, you get its canonical type.
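The effect of the `getAs` specializations quoted above can be reproduced in a small model. This is a sketch under stated assumptions, not Clang code: `TyNode`, `getAsByDesugaring`, and `getAsViaCanonical` are hypothetical names, and `declLine` stands in for "which declaration this type node refers to".

```cpp
#include <cassert>

// Hypothetical miniature type node; not Clang's real Type class.
struct TyNode {
  enum Kind { InjectedClassName, Typedef } kind;
  const TyNode *canonical;  // points to the node itself when already canonical
  const TyNode *underlying; // one level of sugar to strip, or nullptr
  int declLine;             // line of the declaration this node refers to
};

// Generic behaviour: strip sugar step by step and stop at the first node of
// the requested kind, so a node that already matches is returned unchanged.
inline const TyNode *getAsByDesugaring(const TyNode *t, TyNode::Kind k) {
  while (t) {
    if (t->kind == k)
      return t;
    t = t->underlying;
  }
  return nullptr;
}

// Leaf shortcut modelled on `dyn_cast<Class##Type>(CanonicalType)`: it jumps
// straight to the canonical node, even when the input already matches.
inline const TyNode *getAsViaCanonical(const TyNode *t, TyNode::Kind k) {
  const TyNode *c = t->canonical;
  return c->kind == k ? c : nullptr;
}
```

With a canonical node tied to the forward declaration and a second, as-written node tied to the definition, `getAsByDesugaring` returns the as-written node while `getAsViaCanonical` returns the canonical one. That matches the observation in the thread that the result appeared to refer to the forward declaration.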
Ah ok, interesting.
That looks like it needs to change, we probably don't want to do that for sugared leaf types.
Btw, it is a drawback of templates that you can easily overlook a specialization.
@Xazax-hun this could be the same as #153933. Could you confirm if the hang is limited to assertions-enabled LLVM builds, and if #153996 is sufficient to fix the problem?