P1001R2 Target Vectorization Policies from Parallelism V2 TS to C++20 #2723

Merged
merged 2 commits into from
Mar 14, 2019
104 changes: 68 additions & 36 deletions source/algorithms.tex
@@ -313,6 +313,35 @@
\end{itemize}
\end{example}

\pnum
A standard library function is \defn{vectorization-unsafe}
if it is specified to synchronize with another function invocation, or
another function invocation is specified to synchronize with it,
and if it is not a memory allocation or deallocation function.
\begin{note}
Implementations must ensure that internal synchronization
inside standard library functions does not prevent forward progress
when those functions are executed by threads of execution
with weakly parallel forward progress guarantees.
\end{note}
\begin{example}
\begin{codeblock}
int x = 0;
std::mutex m;
void f() {
int a[] = {1,2};
std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int) {
std::lock_guard<mutex> guard(m); // incorrect: \tcode{lock_guard} constructor calls \tcode{m.lock()}
++x;
});
}
\end{codeblock}
The above program may result in two consecutive calls to \tcode{m.lock()}
on the same thread of execution (which may deadlock),
because the applications of the function object are not guaranteed
to run on different threads of execution.
\end{example}
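
One way to avoid the deadlock in the example above is to drop the shared counter and let the algorithm combine per-element results, so the element access function never takes a lock. The following is a minimal sketch for illustration only, not part of the proposed wording. The function name `count_positive` is hypothetical, and the execution policy is a template parameter so that the caller chooses it.

```cpp
#include <cstddef>
#include <execution>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: counts the positive elements of `values`.
// There is no shared mutable state, so the lambda needs no mutex and
// calls no vectorization-unsafe standard library function.
template <class ExecutionPolicy>
std::size_t count_positive(ExecutionPolicy&& policy,
                           const std::vector<int>& values) {
    return std::transform_reduce(
        std::forward<ExecutionPolicy>(policy),
        values.begin(), values.end(),
        std::size_t{0},   // initial value of the reduction
        std::plus<>{},    // combines partial counts
        [](int v) -> std::size_t { return v > 0 ? 1 : 0; });
}
```

A call such as `count_positive(std::execution::par_unseq, v)` then stays within the rules quoted above. Some toolchains need an additional backend library at link time for the parallel policies.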

\rSec2[algorithms.parallel.user]{Requirements on user-provided function objects}

\pnum
@@ -378,6 +407,29 @@
The invocations are not interleaved; see~\ref{intro.execution}.
\end{note}

\pnum
The invocations of element access functions in parallel algorithms invoked with
an execution policy object of type \tcode{execution::unsequenced_policy}
are permitted to execute in an unordered fashion
in the calling thread of execution,
unsequenced with respect to one another in the calling thread of execution.
\begin{note}
This means that multiple function object invocations
may be interleaved on a single thread of execution,
which overrides the usual guarantee from \ref{intro.execution}
that function executions do not overlap with one another.
\end{note}
The behavior of a program is undefined if
it invokes a vectorization-unsafe standard library function
from user code
called from a \tcode{execution::unsequenced_policy} algorithm.
\begin{note}
Because \tcode{execution::unsequenced_policy} allows
the execution of element access functions
to be interleaved on a single thread of execution,
blocking synchronization, including the use of mutexes, risks deadlock.
\end{note}

\pnum
The invocations of element access functions in parallel algorithms invoked with
an execution policy object of type \tcode{execution::parallel_policy}
@@ -439,7 +491,7 @@

\pnum
The invocations of element access functions in parallel algorithms invoked with
an execution policy of type \tcode{execution::parallel_unsequenced_policy} are
an execution policy object of type \tcode{execution::parallel_unsequenced_policy} are
permitted to execute
in an unordered fashion in unspecified threads of execution, and
unsequenced with respect to one another within each thread of execution.
@@ -451,48 +503,28 @@
This means that multiple function object invocations may be interleaved
on a single thread of execution,
which overrides the usual guarantee from \ref{intro.execution}
that function executions do not interleave with one another.
that function executions do not overlap with one another.
\end{note}
Since \tcode{execution::parallel_unsequenced_policy} allows
The behavior of a program is undefined if
it invokes a vectorization-unsafe standard library function
from user code
called from a \tcode{execution::parallel_unsequenced_policy} algorithm.
\begin{note}
Because \tcode{execution::parallel_unsequenced_policy} allows
the execution of element access functions
to be interleaved on a single thread of execution,
blocking synchronization, including the use of mutexes, risks deadlock.
Thus, the synchronization with \tcode{execution::parallel_unsequenced_policy}
is restricted as follows:
A standard library function is \defn{vectorization-unsafe}
if it is specified to synchronize with another function invocation, or
another function invocation is specified to synchronize with it, and
if it is not a memory allocation or deallocation function.
Vectorization-unsafe standard library functions may not be invoked by user code
called from \tcode{execution::parallel_unsequenced_policy} algorithms.
\begin{note}
Implementations must ensure
that internal synchronization inside standard library functions
does not prevent forward progress
when those functions are executed
by threads of execution with weakly parallel forward progress guarantees.
\end{note}
\begin{example}
\begin{codeblock}
int x = 0;
std::mutex m;
int a[] = {1,2};
std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int) {
std::lock_guard<mutex> guard(m); // incorrect: \tcode{lock_guard} constructor calls \tcode{m.lock()}
++x;
});
\end{codeblock}
The above program may result in two consecutive calls to \tcode{m.lock()}
on the same thread of execution (which may deadlock),
because the applications of the function object are not guaranteed
to run on different threads of execution.
\end{example}

\pnum
\begin{note}
The semantics of the \tcode{execution::parallel_policy} or
the \tcode{execution::parallel_unsequenced_policy} invocation
The semantics of invocation with
\tcode{execution::unsequenced_policy},
\tcode{execution::parallel_policy}, or
\tcode{execution::parallel_unsequenced_policy}
allow the implementation to fall back to sequential execution
if the system cannot parallelize an algorithm invocation
due to lack of resources.
if the system cannot parallelize an algorithm invocation,
e.g., due to lack of resources.
\end{note}

\pnum
2 changes: 1 addition & 1 deletion source/support.tex
@@ -588,7 +588,7 @@
\tcode{<unordered_set>} \\ \rowsep
\defnlibxname{cpp_lib_exchange_function} & \tcode{201304L} &
\tcode{<utility>} \\ \rowsep
\defnlibxname{cpp_lib_execution} & \tcode{201603L} &
\defnlibxname{cpp_lib_execution} & \tcode{201902L} &
\tcode{<execution>} \\ \rowsep
\defnlibxname{cpp_lib_filesystem} & \tcode{201703L} &
\tcode{<filesystem>} \\ \rowsep
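
User code can test the bumped `cpp_lib_execution` value to find out whether `std::execution::unseq` is available. The sketch below is an illustration under that assumption. The helper names `has_unsequenced_policy` and `vectorizing_policy` are hypothetical, and the fallback to `std::execution::seq` is one possible choice rather than anything this change prescribes.

```cpp
#include <execution>

// Hypothetical helper: true when the library reports the value of
// __cpp_lib_execution introduced by this change (201902L) or a later one.
constexpr bool has_unsequenced_policy() {
#if defined(__cpp_lib_execution) && __cpp_lib_execution >= 201902L
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: returns std::execution::unseq when the library
// provides it, and std::execution::seq otherwise.
constexpr auto vectorizing_policy() {
#if defined(__cpp_lib_execution) && __cpp_lib_execution >= 201902L
    return std::execution::unseq;
#else
    return std::execution::seq;
#endif
}
```

The returned object can be passed as the first argument of a parallel algorithm, for example `std::for_each(vectorizing_policy(), first, last, f)`.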
25 changes: 25 additions & 0 deletions source/utilities.tex
@@ -19059,10 +19059,14 @@
// \ref{execpol.parunseq}, parallel and unsequenced execution policy
class parallel_unsequenced_policy;

// \ref{execpol.unseq}, unsequenced execution policy
class unsequenced_policy;

// \ref{execpol.objects}, execution policy objects
inline constexpr sequenced_policy seq{ @\unspec@ };
inline constexpr parallel_policy par{ @\unspec@ };
inline constexpr parallel_unsequenced_policy par_unseq{ @\unspec@ };
inline constexpr unsequenced_policy unseq{ @\unspec@ };
}
\end{codeblock}

@@ -19156,6 +19160,26 @@
\tcode{terminate()} shall be called.
\end{itemdescr}

\rSec2[execpol.unseq]{Unsequenced execution policy}

\indexlibrary{\idxcode{execution::unsequenced_policy}}%
\begin{itemdecl}
class execution::unsequenced_policy { @\unspec@ };
\end{itemdecl}

\pnum
The class \tcode{unsequenced_policy} is an execution policy type
used as a unique type to disambiguate parallel algorithm overloading and
indicate that a parallel algorithm's execution may be vectorized,
e.g., executed on a single thread using instructions
that operate on multiple data items.

\pnum
During the execution of a parallel algorithm with
the \tcode{execution::unsequenced_policy} policy,
if the invocation of an element access function exits via an uncaught exception,
\tcode{terminate()} shall be called.
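
Because an exception that escapes an element access function leads to `terminate()`, one approach is to handle errors inside the function object itself. The sketch below illustrates that idea and is not part of the proposed wording. The name `parse_all` and the `fallback` parameter are hypothetical, and the policy is a template parameter so the same code can be called with `std::execution::unseq` where the library provides it.

```cpp
#include <algorithm>
#include <charconv>
#include <execution>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical helper: parses every string in `texts` as an int and
// stores `fallback` for entries that are not a complete decimal number.
// std::from_chars reports failure through its return value, so no
// exception can leave the lambda.
template <class ExecutionPolicy>
std::vector<int> parse_all(ExecutionPolicy&& policy,
                           const std::vector<std::string>& texts,
                           int fallback) {
    // The output buffer is allocated before the algorithm runs.
    std::vector<int> out(texts.size(), fallback);
    std::transform(std::forward<ExecutionPolicy>(policy),
                   texts.begin(), texts.end(), out.begin(),
                   [fallback](const std::string& s) noexcept {
                       int value = 0;
                       const char* first = s.data();
                       const char* last = s.data() + s.size();
                       auto [ptr, ec] = std::from_chars(first, last, value);
                       // Accept only input that was consumed completely.
                       return (ec == std::errc{} && ptr == last) ? value
                                                                 : fallback;
                   });
    return out;
}
```

Reporting failures through a sentinel value keeps the error path inside each invocation, so the caller can inspect the result after the algorithm returns.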

\rSec2[execpol.objects]{Execution policy objects}

\indexlibrary{\idxcode{seq}}%
@@ -19168,6 +19192,7 @@
inline constexpr execution::sequenced_policy execution::seq{ @\unspec@ };
inline constexpr execution::parallel_policy execution::par{ @\unspec@ };
inline constexpr execution::parallel_unsequenced_policy execution::par_unseq{ @\unspec@ };
inline constexpr execution::unsequenced_policy execution::unseq{ @\unspec@ };
\end{itemdecl}

\begin{itemdescr}