Skip to content

Conversation

@vladimirlaz
Copy link
Contributor

topperc and others added 30 commits July 16, 2020 11:06
We were previously expanding vselect and matching on the expansion to
generate bitselects, but in some cases the expansion would be further
combined and a bitselect would not get generated. This patch improves
codegen in those cases by legalizing vselect and lowering it to
v128.bitselect. The old pattern that matches the expansion is still
useful for lowering IR that already uses the expansion rather than a
select operation.

Differential Revision: https://reviews.llvm.org/D83734
Updating the simd-select.ll tests manually with consistent named
regexps for the register numbers was taking more time than it was
worth, so this patch updates that test file to have autogenerated
output. This is not a significant readability regression because the
tests in that file are all very small.

Depends on D83734.

Differential Revision: https://reviews.llvm.org/D83736
Many tests use opt's -analyze feature, which does not translate well to
NPM and has better alternatives. The alternative here is to explicitly
add a pass that calls ScalarEvolution::print().

The legacy pass manager RUNs aren't changing, but they are now pinned to
the legacy pass manager.  For each legacy pass manager RUN, I added a
corresponding NPM RUN using the 'print<scalar-evolution>' pass. For
compatibility with update_analyze_test_checks.py and existing test
CHECKs, 'print<scalar-evolution>' now prints what -analyze prints per
function.

This was generated by the following Python script and failures were
manually fixed up:

import sys
for i in sys.argv:
    with open(i, 'r') as f:
        s = f.read()
    with open(i, 'w') as f:
        for l in s.splitlines():
            if "RUN:" in l and ' -analyze ' in l and '\\' not in l:
                f.write(l.replace(' -analyze ', ' -analyze -enable-new-pm=0 '))
                f.write('\n')
                f.write(l.replace(' -analyze ', ' -disable-output ').replace(' -scalar-evolution ', ' "-passes=print<scalar-evolution>" ').replace(" | ", " 2>&1 | "))
                f.write('\n')
            else:
                f.write(l)

There are a couple failures still in ScalarEvolution under NPM, but
those are due to other unrelated naming conflicts.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D83798
There is currently no COMPILER_PATH test. A subsequent --ld-path patch
will improve the coverage here.
This is in preparation for fixing multiple problems with the way AGPR
copies are handled, but this change is NFC itself. First, it's relying
on recursively calling copyPhysReg, which is losing information
necessary to get correct super register handling.

Second, it's constructing a new RegScavenger and doing a O(N^2) walk
on every single sub-spill for every AGPR tuple copy. Third, it's using
the forward form of the scavenger, and not using the preferred
backwards scan.
The size of VTList that is pushed into this container is usually 1, but
often 6 or 7. Change the vector to SmallVector to eliminate frequent
mallocs.  This happens hundreds of thousands of times in each tablegen
execution during the LLVM/clang build.

https://reviews.llvm.org/D83849
Although the SIMD spec proposal does not specifically include a
select instruction, the select instruction in MVP WebAssembly is
polymorphic over the selected types, so it is able to work on v128
values when they are enabled. This patch introduces a new variant of
the select instruction for each legal vector type. Additional ISel
patterns are adapted from the SELECT_I32 and SELECT_I64 patterns.

Depends on D83736.

Differential Revision: https://reviews.llvm.org/D83737
Replace std::vector with SmallVector to reduce the number of mallocs.
This method is frequently executed, and the number of elements in the
vector is typically small.

https://reviews.llvm.org/D83920
Summary:
1. gcc uses `-march` and `-mtune` flag to chose arch and
pipeline model, but clang does not have `-mtune` flag,
we uses `-mcpu` to chose both infos.
2. Add SiFive e31 and u54 cpu which have default march
and pipeline model.
3. Specific `-mcpu` with rocket-rv[32|64] would select
pipeline model only, and use the driver's arch choosing
logic to get default arch.

Reviewers: lenary, asb, evandro, HsiangKai

Reviewed By: lenary, asb, evandro

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D71124
…on (NFCI)

Summary:
This refactors some common support related to shadow memory setup from
asan and hwasan into sanitizer_common. This should not only reduce code
duplication but also make these facilities available for new compiler-rt
uses (e.g. heap profiling).

In most cases the separate copies of the code were either identical, or
at least functionally identical. A few notes:

In ProtectGap, the asan version checked the address against an upper
bound (kZeroBaseMaxShadowStart, which is (2^18). I have created a copy
of kZeroBaseMaxShadowStart in hwasan_mapping.h, with the same value, as
it isn't clear why that code should not do the same check. If it
shouldn't, I can remove this and guard this check so that it only
happens for asan.

In asan's InitializeShadowMemory, in the dynamic shadow case it was
setting __asan_shadow_memory_dynamic_address to 0 (which then sets both
macro SHADOW_OFFSET as well as macro kLowShadowBeg to 0) before calling
FindDynamicShadowStart(). AFAICT this is only needed because
FindDynamicShadowStart utilizes kHighShadowEnd to
get the shadow size, and kHighShadowEnd is a macro invoking
MEM_TO_SHADOW(kHighMemEnd) which in turn invokes:
(((kHighMemEnd) >> SHADOW_SCALE) + (SHADOW_OFFSET))
I.e. it computes the shadow space needed by kHighMemEnd (the shift), and
adds the offset. Since we only want the shadow space here, the earlier
setting of SHADOW_OFFSET to 0 via __asan_shadow_memory_dynamic_address
accomplishes this. In the hwasan version, it simply gets the shadow
space via "MemToShadowSize(kHighMemEnd)", where MemToShadowSize just
does the shift. I've simplified the asan handling to do the same
thing, and therefore was able to remove the setting of the SHADOW_OFFSET
via __asan_shadow_memory_dynamic_address to 0.

Reviewers: vitalybuka, kcc, eugenis

Subscribers: dberris, #sanitizers, llvm-commits, davidxl

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D83247
This reverts commit b16dfbe.

Accidental push, reverting and creating a new revision.
The patch that introduces handling iterators implemented as pointers may
cause crash in some projects because pointer difference is mistakenly
handled as pointer decrement. (Similair case for iterators implemented
as class instances are already handled correctly.) This patch fixes this
issue.

The second case that causes crash is comparison of an iterator
implemented as pointer and a null-pointer. This patch contains a fix for
this issue as well.

The third case which causes crash is that the checker mistakenly
considers all integers as nonloc::ConcreteInt when handling an increment
or decrement of an iterator implemented as pointers. This patch adds a
fix for this too.

The last case where crashes were detected is when checking for success
of an std::advance() operation. Since the modeling of iterators
implemented as pointers is still incomplete this may result in an
assertion. This patch replaces the assertion with an early exit and
adds a FIXME there.

Differential Revision: https://reviews.llvm.org/D83295
…gnment assumptions"

due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753.

An SROA change soon may obviate some of these problems.

This reverts commit 8d09f20.
Python is now handled in CMake with different variables, thus
the intel plugin needs a corresponding update.

Test Plan:

Differential Revision: https://phabricator.intern.facebook.com/D22555992
Summary:
Following guidance in
https://llvm.org/docs/TestingGuide.html#testing-analysis

Reviewers: mehdi_amini

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83918
Some time ago, I introduced shortcut features like dylib-has-no-shared_mutex
to encode whether the deployment target supported shared_mutex (say). This
made the test suite annotations cleaner.

However, the problem with building Lit features on top of other Lit
features is that it's easier for them to become stale, especially when
they are generated programmatically. Furthermore, it makes the bar for
defining configurations from scratch higher, since more features have
to be defined. Instead, I think it's better to put the XFAILs in the
tests directly, which allows cleaning them up with a simple grep.
This test has been failing on some SDKs for a long time because we lack
a proper way of identifying the SDK version in Lit. Until that is possible,
mark the test as unsupported on Apple to restore the CI.
… its the default 32-bit cpu in clang

Alternative to D83897. I believe the big change here is that I removed slow unaligned memory 16

Down side that it may adversely effect tuning if someone explicitly targets -march=pentium4 and expects pentium4 tuned code. Of course pentium4 is so old our default behavior with the previous settings may not have been the best either.

Reviewed By: echristo, RKSimon

Differential Revision: https://reviews.llvm.org/D83913
Statepoint has no static operands, so it cannot be verified
against MCInstrDescr. Revert NumDefs change introduced by ef658eb.
WenleiHe and others added 19 commits July 19, 2020 08:21
…ion remarks

Summary:
This change added a new inline advisor that takes optimization remarks from previous inlining as input, and provides the decision as advice so current inlining can replay inline decisions of a different compilation. Dwarf inline stack with line and discriminator is used as anchor for call sites including call context. The change can be useful for Inliner tuning as it provides a channel to allow external input for tweaking inline decisions. Existing alternatives like alwaysinline attribute is per-function, not per-callsite. Per-callsite inline intrinsic can be another solution (not yet existing), but it's intrusive to implement and also does not differentiate call context.

A switch -sample-profile-inline-replay=<inline_remarks_file> is added to hook up the new inline advisor with SampleProfileLoader's inline decision for replay. Since SampleProfileLoader does top-down inlining, inline decision can be specialized for each call context, hence we should be able to replay inlining accurately. However with a bottom-up inliner like CGSCC inlining, the replay can be limited due to lack of specialization for different call context. Apart from that limitation, the new inline advisor can still be used by regular CGSCC inliner later if needed for tuning purpose.

Subscribers: mgorny, aprantl, hiraditya, llvm-commits

Tags: #llvm

Resubmit for https://reviews.llvm.org/D84086
Hopefully this will make the bot a little less noisy. Rationale for each:

AlignTrailingComments: We don't want to force-align the various expected-error
                       and friends.

CommentPragmas:       Tell clang-format to leave the "// CHECK:" and the
                      "// expected-" alone.

AlwaysBreakTemplateDeclarations: Templates in tests often have no break between
                                 the template-head and the declaration.

Differential Revision: https://reviews.llvm.org/D83901
…eturn value to actually say "consteval".

This warning was modified in 796ed03 to use the term "consteval"
for consteval functions. However the warning has never worked as
intended since the diagnostic's arguments are used in the wrong order.

This was unfortunately missed by 796ed03 since no test did exercise
this specific warning.

Additionally send the NamedDecl* into the diagnostic instead of just the
IdentifierInfo* to correctly work with special names and template
arguments.
…Operator

This patch
- adds `canCreateUndefOrPoison`
- refactors `canCreatePoison` so it can deal with constantexprs

`canCreateUndefOrPoison` will be used at D83926.

Reviewed By: nikic, jdoerfert

Differential Revision: https://reviews.llvm.org/D84007
The function extractArgumentsFromModule() was passing a one-based index to,
but replaceFunctionCalls() was expecting a zero-based argument index. This
resulted in assertion errors when reducing function call arguments with
different types. Additionally, the

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D84099
The getAllOnesValue can only handle things that are bitcast from a
ConstantInt, while here we bitcast through a pointer, so we may see more
complex objects (like Array or Struct).

Differential Revision: https://reviews.llvm.org/D83870
…OR relocs.

When processing a MachO SUBTRACTOR/UNSIGNED pair, if the UNSIGNED target
is non-extern then check the r_symbolnum field of the relocation to find
the targeted section and use the section's address to find 'ToSymbol'.

Previously 'ToSymbol' was found by loading the initial value stored at
the fixup location and treating this as an address to search for. This
is incorrect, however: the initial value includes the addend and will
point to the wrong block if the addend is less than zero or greater than
the block size.

rdar://65756694
The map initialization unnecessarily ends up in almost every
translation unit due to transitive inclusion.
Its use was removed in dff07df ("[SPIRV] Fix assertion due to
duplicate operands in opencl version metadata. This happens when two
SPIR modules are linked together. Remove check for OCL version in
OCLTypeToSPIRV. Only check if language is C or C++.", 2015-12-16).
a7b763b introduces a bug, when it could be possible to map a
single pointer type in LLVM IR to two different pointer types in
SPIR-V (when SPV_INTEL_usm_storage_classes extension is not allowed).
This patch fixes this bug.

Signed-off-by: Dmitry Sidorov <[email protected]>
@bader
Copy link
Contributor

bader commented Jul 20, 2020

/summary:run

@vladimirlaz vladimirlaz merged commit 78f60bf into intel:sycl Jul 20, 2020
@vladimirlaz vladimirlaz deleted the private/vlazarev/llvmspirv_pulldown branch August 5, 2020 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.