From 4143c51a3a3de64c607c2fdce0a64d2dd3e9262b Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Mon, 1 Aug 2016 19:37:02 -0700
Subject: [PATCH 01/62] Create SemanticARC.md

Create the scaffolding for the semantic arc proposal
---
 docs/SemanticARC.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 docs/SemanticARC.md

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
new file mode 100644
index 0000000000000..c18b24bd1dc9f
--- /dev/null
+++ b/docs/SemanticARC.md
@@ -0,0 +1,22 @@
+
+# Semantic ARC
+
+## Why is ARC difficult to optimize/verify
+
+### Reference Count Identity Problem
+
+### Pairing Problems
+
+## Solving the Reference Count Identity Problem
+
+### Add RC Identity semantics to use-def chains
+
+### Create an RC Identity Verifier
+
+## Solving the Pairing Problem
+
+### Transition to copy_value
+
+### Add Ownership Semantics to use-def chains
+
+### Create an Ownership Semantic Verifier

From df36a0e1f0b22efee24915780284ddf6bf7a64fb Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Mon, 1 Aug 2016 21:35:58 -0700
Subject: [PATCH 02/62] Update SemanticARC.md

Staging point
---
 docs/SemanticARC.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index c18b24bd1dc9f..1c4dae742fd6d 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -1,7 +1,11 @@
 
 # Semantic ARC
 
-## Why is ARC difficult to optimize/verify
+This is a proposal for a series of changes to the SIL IR in order to improve the ability for compiler writers/optimizers to verify and optimize the ARC operations. Traditionally, ARC has been optimized via a bidirectional dataflow algorithm that attempts to prove that a set of retains and releases joint dominate/post-dominate each other. If the appropriate conditions are discovered during the dataflow then the retain/release sets are removed. Even though this algorithm is very powerful and removes most redundant retain/release pairings is does have several weaknesses:
+
+1. It attempts to prove via various heuristics that applying ARC operations to two retainable values will manipulate the same reference count. This is not always easy to prove and in the face of a changing IR the lack of verification causes this to be brittle. In addition, if the heuristics are incorrect, then retain and release operations that do not always manipulate the same reference count. This would then result in a use after free.
+
+2. 
 
 ### Reference Count Identity Problem
 

From 2600ebd0203c04467c67f22c1fcf8d8908ee238b Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 17:27:42 -0700
Subject: [PATCH 03/62] Update SemanticARC.md

More changes
---
 docs/SemanticARC.md | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 1c4dae742fd6d..e375ed9f84444 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -1,15 +1,29 @@
 
 # Semantic ARC
 
-This is a proposal for a series of changes to the SIL IR in order to improve the ability for compiler writers/optimizers to verify and optimize the ARC operations. Traditionally, ARC has been optimized via a bidirectional dataflow algorithm that attempts to prove that a set of retains and releases joint dominate/post-dominate each other. If the appropriate conditions are discovered during the dataflow then the retain/release sets are removed. Even though this algorithm is very powerful and removes most redundant retain/release pairings is does have several weaknesses:
+## Preface
+
+This is a proposal for a series of changes to the SIL IR in order to improve the ability for compiler writers/optimizers to verify and optimize the ARC operations. We assume that the user is familiar with the basic concepts of ARC as this is a proposal meant for compiler writers and implementors.
+
+## Historical Notes
+
+### Representation
+
+ARC was first implemented for Objective C and is the basis for managing life times of reference types in Swift. In Objective C, ARC pointers were represented in LLVM IR as i8* pointers whose lifetimes were managed by retain and release function calls. The convention for passing an argument in Objective C was +0 so that was assumed by default. This caused issues since there was no true semantic verification that a value being passed off to a function was truly supposed to be passed at +0 or if a retain was truly matched with a release (or with a store). As a result, the ability to
+
+### Optimization
+
+Traditionally, ARC has been optimized via a bidirectional dataflow algorithm that attempts to prove that a set of retains and releases joint dominate/post-dominate each other. If the appropriate conditions are discovered during the dataflow then the retain/release sets are removed. Even though this algorithm is very powerful and removes most redundant retain/release pairings is does have several weaknesses:
 
 1. It attempts to prove via various heuristics that applying ARC operations to two retainable values will manipulate the same reference count. This is not always easy to prove and in the face of a changing IR the lack of verification causes this to be brittle. In addition, if the heuristics are incorrect, then retain and release operations that do not always manipulate the same reference count. This would then result in a use after free.
 
-2. 
+2. 
+
+### ARC Problems
 
-### Reference Count Identity Problem
+#### Reference Count Identity Problem
 
-### Pairing Problems
+#### Pairing Problems
 
 ## Solving the Reference Count Identity Problem
 

From c0e5293ebcfdd6cd3aa667e5064aca2d6717219e Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 17:30:46 -0700
Subject: [PATCH 04/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index e375ed9f84444..78e76ec0eacfb 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -9,7 +9,9 @@ This is a proposal for a series of changes to the SIL IR in order to improve the
 
 ### Representation
 
-ARC was first implemented for Objective C and is the basis for managing life times of reference types in Swift. In Objective C, ARC pointers were represented in LLVM IR as i8* pointers whose lifetimes were managed by retain and release function calls. The convention for passing an argument in Objective C was +0 so that was assumed by default. This caused issues since there was no true semantic verification that a value being passed off to a function was truly supposed to be passed at +0 or if a retain was truly matched with a release (or with a store). As a result, the ability to
+ARC was first implemented for Objective C and is the basis for managing life times of reference types in Swift. In Objective C, ARC pointers are represented in LLVM IR as i8* pointers whose lifetimes were managed by retain and release function calls. These function calls do not necessarily have any semantic connection in the IR itself (i.e. one can not verify if two retain, release pairs are truly paired). In addition, two pointers could only be proven to be the same via conservative heuristics. The convention for passing an argument in Objective C was +0 so that was assumed by default. This caused issues since there was no true semantic verification that a value being passed off to a function was truly supposed to be passed at +0.
+
+The ARC implementation in Swift improved upon these issues by attempting to begin specifying ARC relationships at the function boundary level. For instance, in Swift a function specifies the ownership convention that it expects each one of its arguments to
 
 ### Optimization
 

From 8bf3bd1aab1b00224cfef2f55b886df66df4367c Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 17:44:04 -0700
Subject: [PATCH 05/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 78e76ec0eacfb..7867a6f383b81 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -3,23 +3,13 @@
 
 ## Preface
 
-This is a proposal for a series of changes to the SIL IR in order to improve the ability for compiler writers/optimizers to verify and optimize the ARC operations. We assume that the user is familiar with the basic concepts of ARC as this is a proposal meant for compiler writers and implementors.
+This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and verification of ARC semantics in Swift programs. We assume that the user has a basic familiarity with the basic concepts of ARC: this is a proposal meant for compiler writers and implementors.
 
-## Historical Notes
+## Historical Implementations
 
-### Representation
+ARC was first implemented for Objective C. In Objective C, Pointers with ARC semantics are represented in LLVM IR as i8*. The lifetimes of these pointers were managed via retain and release operations and the end of a pointer's lifetime was ascertained via conservative analysis of uses. The retain, release calls did not have any semantic information in the IR itself that showed what operations they were balancing and often times uses of ARC pointers that /should/ have resulted in atomic uses were separated into separate uses. Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two pointers had the same RC Identity conservatively via alias analysis.
 
-ARC was first implemented for Objective C and is the basis for managing life times of reference types in Swift. In Objective C, ARC pointers are represented in LLVM IR as i8* pointers whose lifetimes were managed by retain and release function calls. These function calls do not necessarily have any semantic connection in the IR itself (i.e. one can not verify if two retain, release pairs are truly paired). In addition, two pointers could only be proven to be the same via conservative heuristics. The convention for passing an argument in Objective C was +0 so that was assumed by default. This caused issues since there was no true semantic verification that a value being passed off to a function was truly supposed to be passed at +0.
-
-The ARC implementation in Swift improved upon these issues by attempting to begin specifying ARC relationships at the function boundary level. For instance, in Swift a function specifies the ownership convention that it expects each one of its arguments to
-
-### Optimization
-
-Traditionally, ARC has been optimized via a bidirectional dataflow algorithm that attempts to prove that a set of retains and releases joint dominate/post-dominate each other. If the appropriate conditions are discovered during the dataflow then the retain/release sets are removed. Even though this algorithm is very powerful and removes most redundant retain/release pairings is does have several weaknesses:
-
-1. It attempts to prove via various heuristics that applying ARC operations to two retainable values will manipulate the same reference count. This is not always easy to prove and in the face of a changing IR the lack of verification causes this to be brittle. In addition, if the heuristics are incorrect, then retain and release operations that do not always manipulate the same reference count. This would then result in a use after free.
-
-2. 
+The ARC implementation in Swift improved upon this situation by specifying ARC semantic properties at the function level. For instance,
 
 ### ARC Problems
 

From fcf4c9716d47beac26371a8c9d84ffae17f72baa Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 17:49:57 -0700
Subject: [PATCH 06/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 7867a6f383b81..6d65b9cc43ec3 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -7,11 +7,18 @@ This is a proposal for a series of changes to the SIL IR in order to ease the op
 
 ## Historical Implementations
 
-ARC was first implemented for Objective C. In Objective C, Pointers with ARC semantics are represented in LLVM IR as i8*. The lifetimes of these pointers were managed via retain and release operations and the end of a pointer's lifetime was ascertained via conservative analysis of uses. The retain, release calls did not have any semantic information in the IR itself that showed what operations they were balancing and often times uses of ARC pointers that /should/ have resulted in atomic uses were separated into separate uses. Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two pointers had the same RC Identity conservatively via alias analysis.
+ARC was first implemented for Objective C. In Objective C, Pointers with ARC semantics are represented in LLVM IR as i8*. The lifetimes of these pointers were managed via retain and release operations and the end of a pointer's lifetime was ascertained via conservative analysis of uses. The retain, release calls did not have any semantic information in the IR itself that showed what operations they were balancing and often times uses of ARC pointers that /should/ have resulted in atomic uses were separated into separate uses. Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two pointers had the same RC Identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.
 
-The ARC implementation in Swift improved upon this situation by specifying ARC semantic properties at the function level. For instance,
+The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from many of the same issues as the Objective C implementation of ARC, with one exception: function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.
 
-### ARC Problems
+## Semantic ARC
+
+As shown in the past section, the implementation of ARC in both Swift and Objective C lacked important semantic information in the following areas:
+
+1. Ability to determine semantic ARC pointer equivalence (RC Identity).
+2. Ability to pair semantic ARC operations.
+
+Our proposal solves these problems as follows:
 
 #### Reference Count Identity Problem
 

From 9fbee4b6304a1da21069a49f6f4ede95792a16ba Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 17:50:38 -0700
Subject: [PATCH 07/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 6d65b9cc43ec3..69a9d6ffcb3b5 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -3,7 +3,7 @@
 
 ## Preface
 
-This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and verification of ARC semantics in Swift programs. We assume that the user has a basic familiarity with the basic concepts of ARC: this is a proposal meant for compiler writers and implementors.
+This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and verification of ARC semantics in Swift programs. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC.
 
 ## Historical Implementations
 

From fb067b51ac9810909345922d4e634615765fe542 Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 18:01:45 -0700
Subject: [PATCH 08/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 69a9d6ffcb3b5..b576d739c15fc 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -7,22 +7,29 @@ This is a proposal for a series of changes to the SIL IR in order to ease the op
 
 ## Historical Implementations
 
-ARC was first implemented for Objective C. In Objective C, Pointers with ARC semantics are represented in LLVM IR as i8*. The lifetimes of these pointers were managed via retain and release operations and the end of a pointer's lifetime was ascertained via conservative analysis of uses. The retain, release calls did not have any semantic information in the IR itself that showed what operations they were balancing and often times uses of ARC pointers that /should/ have resulted in atomic uses were separated into separate uses. Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two pointers had the same RC Identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.
+The first historical implementation of ARC in a mid level IR was in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. The retain, release calls, since at the LLVM IR level were just calls, did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing the function call was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.
 
-The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from many of the same issues as the Objective C implementation of ARC, with one exception: function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.
+The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from the same issues as the Objective C implementation of ARC with one exception: argument conventions on function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.
 
 ## Semantic ARC
 
-As shown in the past section, the implementation of ARC in both Swift and Objective C lacked important semantic information in the following areas:
+As shown in the past section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information in the following areas:
 
 1. Ability to determine semantic ARC pointer equivalence (RC Identity).
 2. Ability to pair semantic ARC operations.
 
-Our proposal solves these problems as follows:
+We suggest in this proposal the following changes to the SIL IR which solves these problems.
 
-#### Reference Count Identity Problem
+### Reference Count Identity Problem
 
-#### Pairing Problems
+In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. In Swift this is not as simple as determining that two pointers are the same from an aliasing point of view: This is because Swift has the notion of a non-trivial value type. A non-trivial value type.
+
+In order to solve this problem in a robust way in the face of compiler changes, we propose the following solution:
+
+1. In SILNodes.def, all instructions will have attached to them a notion of whether or not the instruction can "produce" a new reference count identity. There will be 3 states: True, False, Special.
+2. An API will be provided that for any specific SILValue returns the set of RC Identity roots associated with the given SILValue. All RC Identity Roots must be RC Identity root instructions.
+
+### Pairing Semantic ARC Operations
 
 ## Solving the Reference Count Identity Problem
 

From c751f7163e5423b00a08846b8a13e89ec65bb5b2 Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 18:56:47 -0700
Subject: [PATCH 09/62] Update SemanticARC.md

Checkpoint
---
 docs/SemanticARC.md | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index b576d739c15fc..ac17dbc20d845 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -5,24 +5,33 @@
 
 This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and verification of ARC semantics in Swift programs. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC.
 
+*NOTE* We are talking solely about ARC as implemented beginning in Objective C. There may be other ARC implementations that are unknown to the writer.
+
 ## Historical Implementations
 
-The first historical implementation of ARC in a mid level IR was in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. The retain, release calls, since at the LLVM IR level were just calls, did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing the function call was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.
+The first historical implementation of ARC in a mid level IR was in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. The retain, release calls, since at the LLVM IR level were just calls, did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing the function call was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.
 
 The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from the same issues as the Objective C implementation of ARC with one exception: argument conventions on function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.
 
 ## Semantic ARC
 
-As shown in the past section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information in the following areas:
+As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information in the following areas:
 
 1. Ability to determine semantic ARC pointer equivalence (RC Identity).
 2. Ability to pair semantic ARC operations.
+3. Lack of specific atomic ARC operations that provide atomic semantics related to initialization, transfering of ARC operations to and from memory.
 
-We suggest in this proposal the following changes to the SIL IR which solves these problems.
+We suggest in this proposal the following changes to the SIL IR as solutions to these problems:
 
 ### Reference Count Identity Problem
 
-In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. In Swift this is not as simple as determining that two pointers are the same from an aliasing point of view: This is because Swift has the notion of a non-trivial value type. A non-trivial value type.
+In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is preserved by the frontend and all emitted instructions. We wish to define an algorithm which for any specific SILValue in a given program can determine the set of "RC Identity Sources" associated with the given SILValue.
+
+Let F be a given SIL function. Let V(F) be the set of SIL values in F and I(F) be the set of SIL instructions in F. There is a natural embedding of I(F) into V(F) defined by the function, value : I(F) -> V(F) defined by mapping each instruction in I(F) to its result value. Now define the predicate rcidsource : V -> Bool. Then define the set of rc identity source values as:
+
+    RCIDSource = { v ϵ V : rcidsource(v) }
+
+We wish to attempt to define a function RCIDRoots : V(F) -> S with S ⊂ RCIDSource. In words this means that RCIDRoots must be able to map any SIL value v ϵ V(F) to a set of RCIDSource values. This algorithm is trivial to formulate when
 
 In order to solve this problem in a robust way in the face of compiler changes, we propose the following solution:
 

From 62f71d45fec53aa8f87c1d86463f52784a2ebb75 Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Tue, 2 Aug 2016 18:57:26 -0700
Subject: [PATCH 10/62] Change italics -> bold

---
 docs/SemanticARC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index ac17dbc20d845..df54399c384d6 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -5,7 +5,7 @@
 
 This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and verification of ARC semantics in Swift programs. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC.
 
-*NOTE* We are talking solely about ARC as implemented beginning in Objective C. There may be other ARC implementations that are unknown to the writer.
+**NOTE** We are talking solely about ARC as implemented beginning in Objective C. There may be other ARC implementations that are unknown to the writer.
## Historical Implementations From dbc42ee5df58023b0cab4c8496db9fc9ad5678f7 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Tue, 2 Aug 2016 19:21:01 -0700 Subject: [PATCH 11/62] Update SemanticARC.md --- docs/SemanticARC.md | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index df54399c384d6..3812fb452f7fe 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -25,13 +25,21 @@ We suggest in this proposal the following changes to the SIL IR as solutions to ### Reference Count Identity Problem -In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is preserved by the frontend and all emitted instructions. We wish to define an algorithm which for any specific SILValue in a given program can determine the set of "RC Identity Sources" associated with the given SILValue. +In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is preserved by the frontend and all emitted instructions. We wish to define an algorithm which for any specific SILValue in a given program can determine the set of "RC Identity Sources" associated with the given SILValue. Define an RC Identity as a tuple (v, path) where v is a SILValue and path is a projection path into x. Then for a given SILValue v: -Let F be a given SIL function. Let V(F) be the set of SIL values in F and I(F) be the set of SIL instructions in F. There is a natural embedding of I(F) into V(F) defined by the function, value : I(F) -> V(F) defined by mapping each instruction in I(F) to its result value. 
Now define the predicate rcidsource : V -> Bool. Then define the set of rc identity source values as: +1. If v is a SILArgument we can compute its RC Identity Source set directly by . +2. If v is the result of a SILInstruction then we compute its RC Identity Source set as follows: + For a given - RCIDSource = { v ϵ V : rcidsource(v) } +Let F be a given SILFunction. Let V(F) be the set of SILValues in F and I(F) be the set of SILInstructions in F. There is a natural embedding of I(F) into V(F) defined by the function, value : I(F) -> V(F) defined by mapping each instruction in I(F) to its result value. For a given value v, define the type operator type : V(F) -> SILTypes that maps a SILValue to its associated SILType. For any given SILType, define proj_tree : SILTypes -> {(SILType, +)}. For any given SILValue, define ValueTree as ValueTree : V(F) -> {(V, T) : V in V(F), T in SILTypes}. Then define the shard operator, -We wish to attempt to define a function RCIDRoots : V(F) -> S with S ⊂ RCIDSource. In words this means that RCIDRoots must be able to map any SIL value v ϵ V(F) to a set of RCIDSource values. This algorithm is trivial to formulate when + shared : V(F) -> TypeTree(SILTypes) + +Now define the predicate rcidsource : V(F) -> Bool. For a given value v, rcidsource returns true if and only if v is a reference type and Then define the set of rc identity source values as: + + RCIDSource(V) = { v ϵ V : rcidsource(v) } + +We wish to attempt to define a function RCIDRoots : V(F) -> S with S ⊂ RCIDSource(V). In words this means that RCIDRoots must be able to map any SIL value v ϵ V(F) to a set of RCIDSource(V) values. This algorithm is trivial to formulate for a v ϵ RCIDSource(V), namely it is the identity operation. But lets say that we have some v not in RCIDSource(V). 
There are two ways that this can occur namely if v is an

 In order to solve this problem in a robust way in the face of compiler changes, we propose the following solution:

From 30db06b685ed2b79a3c8c28aaf1a092ee0d34d96 Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Wed, 3 Aug 2016 14:02:15 -0700
Subject: [PATCH 12/62] Fix tense.

---
 docs/SemanticARC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 3812fb452f7fe..4addd6dee9f87 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -9,7 +9,7 @@ This is a proposal for a series of changes to the SIL IR in order to ease the op

 ## Historical Implementations

-The first historical implementation of ARC in a mid level IR was in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. The retain, release calls, since at the LLVM IR level were just calls, did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing the function call was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.
+The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language.
In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. The retain, release calls, since at the LLVM IR level were just calls, did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing the function call was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.

 The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from the same issues as the Objective C implementation of ARC with one exception: argument conventions on function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.
From 2693d19e7ba0cf3d3a8a4887240ad7dc445961de Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Thu, 4 Aug 2016 10:32:57 -0700
Subject: [PATCH 13/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 4addd6dee9f87..8e8c7b7c6e860 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -15,36 +15,26 @@ The ARC implementation in Swift, in contrast, to Objective C, is implemented in

 ## Semantic ARC

-As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information in the following areas:
+As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information. We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation.

-1. Ability to determine semantic ARC pointer equivalence (RC Identity).
-2. Ability to pair semantic ARC operations.
-3. Lack of specific atomic ARC operations that provide atomic semantics related to initialization, transfering of ARC operations to and from memory.
+1. RC Identity: For any given SILValue, one should be able to determine its set of RC Identity Roots.
+2. Additional High Level ARC Operations: store_strong, load_strong, copy_value instructions should be added to SIL.
+3. Operand ARC Conventions: Function signature ARC conventions should be extended to all instructions.
+4. ARC Verifier: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics.
+5. Elimination of Memory Locations from High Level SIL. Memory locations should be represented as SSA values instead of memory locations.
This will allow for address only values to be manipulated and have their lifetimes verified just like normal class types.

-We suggest in this proposal the following changes to the SIL IR as solutions to these problems:
+Swift Extensions:

-### Reference Count Identity Problem
-
-In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is preserved by the frontend and all emitted instructions. We wish to define an algorithm which for any specific SILValue in a given program can determine the set of "RC Identity Sources" associated with the given SILValue. Define an RC Identity as a tuple (v, path) where v is a SILValue and path is a projection path into x. Then for a given SILValue v:
-
-1. If v is a SILArgument we can compute its RC Identity Source set directly by .
-2. If v is the result of a SILInstruction then we compute its RC Identity Source set as follows:
- For a given
-
-Let F be a given SILFunction. Let V(F) be the set of SILValues in F and I(F) be the set of SILInstructions in F. There is a natural embedding of I(F) into V(F) defined by the function, value : I(F) -> V(F) defined by mapping each instruction in I(F) to its result value. For a given value v, define the type operator type : V(F) -> SILTypes that maps a SILValue to its associated SILType. For any given SILType, define proj_tree : SILTypes -> {(SILType, +)}. For any given SILValue, define ValueTree as ValueTree : V(F) -> {(V, T) : V in V(F), T in SILTypes}. Then define the shard operator,
+1. One should be able to specify the parameter convention of **all** function parameters.

- shared : V(F) -> TypeTree(SILTypes)
+We now go into depth on each one of those points.

-Now define the predicate rcidsource : V(F) -> Bool.
For a given value v, rcidsource returns true if and only if v is a reference type and Then define the set of rc identity source values as:
-
- RCIDSource(V) = { v ϵ V : rcidsource(v) }
-
-We wish to attempt to define a function RCIDRoots : V(F) -> S with S ⊂ RCIDSource(V). In words this means that RCIDRoots must be able to map any SIL value v ϵ V(F) to a set of RCIDSource(V) values. This algorithm is trivial to formulate for a v ϵ RCIDSource(V), namely it is the identity operation. But lets say that we have some v not in RCIDSource(V). There are two ways that this can occur namely if v is an
+### Reference Count Identity Problem

-In order to solve this problem in a robust way in the face of compiler changes, we propose the following solution:
+In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is guaranteed to be preserved by the frontend and all emitted instructions. We wish to define an algorithm which for any specific SILValue in a given program can determine the set of "RC Identity Sources" associated with the given SILValue. We do this as follows: Define an RC Identity as a tuple consisting of a SILValue and a ProjectionPath to a leaf reference type in the SILValue. Then define an RC Identity Source via the following recursive relation:

-1. In SILNodes.def, all instructions will have attached to them a notion of whether or not the instruction can "produce" a new reference count identity. There will be 3 states: True, False, Special.
-2. An API will be provided that for any specific SILValue returns the set of RC Identity roots associated with the given SILValue. All RC Identity Roots must be RC Identity root instructions.
+1. rcidsource(a: SILArgument) consists of the list of all leaf reference types in a with the associated projection paths into a.
*NOTE* This implies that by default, we consider SILArguments that act as phi nodes to block further association with rc identity sources.
+2. To calculate the rcidsource of an instruction

 ### Pairing Semantic ARC Operations

From 15317322654c2521661d68dcc4799819392695a9 Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Thu, 4 Aug 2016 10:33:23 -0700
Subject: [PATCH 14/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 8e8c7b7c6e860..61eb423137bcb 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -3,7 +3,7 @@

 ## Preface

-This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and verification of ARC semantics in Swift programs. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC.
+This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and allow for static verification of ARC semantics in SIL. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC.

 **NOTE** We are talking solely about ARC as implemented beginning in Objective C. There may be other ARC implementations that are unknown to the writer.
From 583ec85fc6de228d18795f58924ebdafdc4e647d Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Thu, 4 Aug 2016 10:44:09 -0700
Subject: [PATCH 15/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 61eb423137bcb..87a8bc617803d 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -9,18 +9,25 @@ This is a proposal for a series of changes to the SIL IR in order to ease the op

 ## Historical Implementations

-The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. The retain, release calls, since at the LLVM IR level were just calls, did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing the function call was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.
+The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*.
The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. retain, release instructions were just calls that did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing it was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.

-The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from the same issues as the Objective C implementation of ARC with one exception: argument conventions on function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.
+The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from the same issues as the Objective C implementation of ARC with a few exceptions:
+
+1. First Class Retainable Types. retain, release were understood to be IR level operations allowing for type system verification that a value passed into retain, release was actually legal for reference counting operations.
+2. Argument conventions on function signatures.
In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.

 ## Semantic ARC

-As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information. We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation.
+As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information. We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation:

 1. RC Identity: For any given SILValue, one should be able to determine its set of RC Identity Roots.
 2. Additional High Level ARC Operations: store_strong, load_strong, copy_value instructions should be added to SIL.
-3. Operand ARC Conventions: Function signature ARC conventions should be extended to all instructions.
-4. ARC Verifier: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics.
+3. Endow Use-Def edges with ARC Conventions: Function signature ARC conventions should be extended to all instructions and block arguments. Thus **all** use-def edges should have an implied ownership transfer convention.
+4. ARC Verifier: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics by ensuring the following properties are true of all reference counting operations in a function body:
+
+ a. Every use-def edge must connect together a use and a def with compatible ARC semantics. As an example this means that any def that produces a +1 value must be paired with a -1 use.
If one wishes to pass off a +1 value to an unowned use or a guaranteed use, one must use an appropriate conversion instruction. The conversion instruction would work as a pluggable adaptor and only certain adaptors that preserve safe ARC semantics would be provided.
+ b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs.
+
 5. Elimination of Memory Locations from High Level SIL. Memory locations should be represented as SSA values instead of memory locations. This will allow for address only values to be manipulated and have their lifetimes verified just like normal class types.

 Swift Extensions:

From 5450064ee4ee018423ca577cc70a732002c9737b Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Thu, 4 Aug 2016 10:46:10 -0700
Subject: [PATCH 16/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 87a8bc617803d..91439a9b7d18d 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -11,9 +11,9 @@ This is a proposal for a series of changes to the SIL IR in order to ease the op

 The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses.
retain, release instructions were just calls that did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing it was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.

-The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from the same issues as the Objective C implementation of ARC with a few exceptions:
+The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffered from the same issues as the Objective C implementation of ARC but provided some noteworthy improvements:

-1. First Class Retainable Types. retain, release were understood to be IR level operations allowing for type system verification that a value passed into retain, release was actually legal for reference counting operations.
+1. Embedding Reference Semantics into the Type System. By embedding notions of retain, release were understood to be IR level operations allowing for type system verification that a value passed into retain, release was actually legal for reference counting operations.
 2. Argument conventions on function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.
 ## Semantic ARC

From 3551feff072de31586595771ec122a9bf2d76893 Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Thu, 4 Aug 2016 11:13:25 -0700
Subject: [PATCH 17/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 91439a9b7d18d..1c0cc7f2d0cbb 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -26,6 +26,7 @@ As discussed in the previous section, the implementation of ARC in both Swift an
 4. ARC Verifier: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics by ensuring the following properties are true of all reference counting operations in a function body:

  a. Every use-def edge must connect together a use and a def with compatible ARC semantics. As an example this means that any def that produces a +1 value must be paired with a -1 use. If one wishes to pass off a +1 value to an unowned use or a guaranteed use, one must use an appropriate conversion instruction. The conversion instruction would work as a pluggable adaptor and only certain adaptors that preserve safe ARC semantics would be provided.
+
  b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs.

 5. Elimination of Memory Locations from High Level SIL. Memory locations should be represented as SSA values instead of memory locations.
This will allow for address only values to be manipulated and have their lifetimes verified just like normal class types.

From f27b336543725b8e5bb47a9595296246e5493b6f Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Thu, 4 Aug 2016 11:21:47 -0700
Subject: [PATCH 18/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index 1c0cc7f2d0cbb..b5e78f43e1399 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -20,16 +20,12 @@ The ARC implementation in Swift, in contrast, to Objective C, is implemented in

 As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information. We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation:

-1. RC Identity: For any given SILValue, one should be able to determine its set of RC Identity Roots.
-2. Additional High Level ARC Operations: store_strong, load_strong, copy_value instructions should be added to SIL.
-3. Endow Use-Def edges with ARC Conventions: Function signature ARC conventions should be extended to all instructions and block arguments. Thus **all** use-def edges should have an implied ownership transfer convention.
-4. ARC Verifier: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics by ensuring the following properties are true of all reference counting operations in a function body:
-
- a. Every use-def edge must connect together a use and a def with compatible ARC semantics. As an example this means that any def that produces a +1 value must be paired with a -1 use. If one wishes to pass off a +1 value to an unowned use or a guaranteed use, one must use an appropriate conversion instruction.
The conversion instruction would work as a pluggable adaptor and only certain adaptors that preserve safe ARC semantics would be provided.
-
- b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs.
-
-5. Elimination of Memory Locations from High Level SIL. Memory locations should be represented as SSA values instead of memory locations. This will allow for address only values to be manipulated and have their lifetimes verified just like normal class types.
+1. **Split the Canonical SIL Stage into High and Low Level SIL**: High Level SIL will be the result of running the guaranteed passes and is where ARC invariants will be enforced.
+1. **RC Identity**: For any given SILValue, one should be able to determine its set of RC Identity Roots.
+2. **Introduction of new High Level ARC Operations**: store_strong, load_strong, copy_value instructions should be added to SIL.
+3. **Endow Use-Def edges with ARC Conventions**: Function signature ARC conventions should be extended to all instructions and block arguments. Thus all use-def edges should have an implied ownership transfer convention.
+4. **ARC Verifier**: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics.
+5. **Elimination of Memory Locations from High Level SIL**. Memory locations should be represented as SSA values instead of memory locations.
This will allow for address only values to be manipulated and have their lifetimes verified just like normal class types.

 Swift Extensions:

@@ -37,25 +33,28 @@ Swift Extensions:

 We now go into depth on each one of those points.

-### Reference Count Identity Problem
+## High Level SIL and Low Level SIL
+
+ ARC optimization /will/ not occur at the Low Level SIL. This implies that function signature optimization and IPO must occur at High Level SIL. Necessarily both High and Low Level SIL must be able to be lowered by IRGen to LLVM IR to ensure that
+
+## RC Identity

 In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is guaranteed to be preserved by the frontend and all emitted instructions. We wish to define an algorithm which for any specific SILValue in a given program can determine the set of "RC Identity Sources" associated with the given SILValue. We do this as follows: Define an RC Identity as a tuple consisting of a SILValue and a ProjectionPath to a leaf reference type in the SILValue. Then define an RC Identity Source via the following recursive relation:

 1. rcidsource(a: SILArgument) consists of the list of all leaf reference types in a with the associated projection paths into a. *NOTE* This implies that by default, we consider SILArguments that act as phi nodes to block further association with rc identity sources.
 2.
To calculate the rcidsource of an instruction

-### Pairing Semantic ARC Operations
-
-## Solving the Reference Count Identity Problem
+## New High Level ARC Operations

-### Add RC Identity semantics to use-def chains
+## Endow Use-Def edges with ARC Conventions

-### Create an RC Identity Verifier
+## ARC Verifier

-## Solving the Pairing Problem
+ by ensuring the following properties are true of all reference counting operations in a function body:

-### Transition to copy_value
+ a. Every use-def edge must connect together a use and a def with compatible ARC semantics. As an example this means that any def that produces a +1 value must be paired with a -1 use. If one wishes to pass off a +1 value to an unowned use or a guaranteed use, one must use an appropriate conversion instruction. The conversion instruction would work as a pluggable adaptor and only certain adaptors that preserve safe ARC semantics would be provided.
+
+ b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs.
-### Add Ownership Semantics to use-def chains
+## Elimination of Memory Locations from High Level SIL

-### Create an Ownership Semantic Verifier

From 1a1dd0663070628e7f1ed1ed402766a0f89937ec Mon Sep 17 00:00:00 2001
From: Michael Gottesman
Date: Thu, 4 Aug 2016 11:24:43 -0700
Subject: [PATCH 19/62] Update SemanticARC.md

---
 docs/SemanticARC.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md
index b5e78f43e1399..0d2f6adcfa154 100644
--- a/docs/SemanticARC.md
+++ b/docs/SemanticARC.md
@@ -11,11 +11,13 @@ This is a proposal for a series of changes to the SIL IR in order to ease the op

 The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. retain, release instructions were just calls that did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing it was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity.

-The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR.
This suffered from the same issues as the Objective C implementation of ARC but provided some noteworthy improvements:
+The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffers from some of the same issues as the Objective C implementation of ARC but also provides some noteworthy improvements:

-1. Embedding Reference Semantics into the Type System. By embedding notions of retain, release were understood to be IR level operations allowing for type system verification that a value passed into retain, release was actually legal for reference counting operations.
+1. Reference Semantics in the Type System. Since SIL's type system is based upon the Swift type system, the notion of a reference countable type exists. This means that it is possible to verify that reference counting operations only apply to SSA values and memory locations associated reference counted types.
 2. Argument conventions on function signatures. In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier.

+*TODO: I could point out here that many of the ideas in Semantic ARC are extending these ideas throughout the IR?*
+
 ## Semantic ARC

 As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information.
We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation: From 6fb7fbafdfec30674439f4edfa6bb484eaf3d498 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Thu, 4 Aug 2016 11:27:38 -0700 Subject: [PATCH 20/62] Update SemanticARC.md --- docs/SemanticARC.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 0d2f6adcfa154..7b4a1d1582297 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -23,15 +23,11 @@ The ARC implementation in Swift, in contrast, to Objective C, is implemented in As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information. We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation: 1. **Split the Canonical SIL Stage into High and Low Level SIL**: High Level SIL will be the result of running the guaranteed passes and is where ARC invariants will be enforced. -1. **RC Identity**: For any given SILValue, one should be able to determine its set of RC Identity Roots. -2. **Introduction of new High Level ARC Operations**: store_strong, load_strong, copy_value instructions should be added to SIL. +1. **RC Identity**: For any given SILValue, one should be able to determine its set of RC Identity Roots. This makes it easy to reason about which reference counts a reference count operation is affecting. +2. **Introduction of new High Level ARC Operations**: store_strong, load_strong, copy_value instructions should be added to SIL. These operations are currently split into separate low level operations. **TODO: ADD MORE HERE** 3. **Endow Use-Def edges with ARC Conventions**: Function signature ARC conventions should be extended to all instructions and block arguments. Thus all use-def edges should have an implied ownership transfer convention. 4. 
**ARC Verifier**: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics. -5. **Elimination of Memory Locations from High Level SIL**. Memory locations should be represented as SSA values instead of memory locations. This will allow for address only values to be manipulated and have their lifetimes verified just like normal class types. - -Swift Extensions: - -1. One should be able to specify the parameter convention of **all** function parameters. +5. **Elimination of Memory Locations from High Level SIL**. Memory locations should be represented as SSA values instead of memory locations. This will allow for address only values to be manipulated and have their lifetimes verified by the ARC verifier in a trivial way without the introduction of Memory SSA. We now go into depth on each one of those points. @@ -60,3 +56,6 @@ In order to pair semantic ARC operations effectively, one has to be able to dete ## Elimination of Memory Locations from High Level SIL +# Swift Extensions: + +1. One should be able to specify the parameter convention of **all** function parameters. From 565297ab7735156f7ffee339990f7bf9386a1336 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Thu, 4 Aug 2016 12:28:58 -0700 Subject: [PATCH 21/62] Update SemanticARC.md --- docs/SemanticARC.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 7b4a1d1582297..0f5867300b78a 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -23,11 +23,11 @@ The ARC implementation in Swift, in contrast, to Objective C, is implemented in As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information. We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation: 1. 
**Split the Canonical SIL Stage into High and Low Level SIL**: High Level SIL will be the result of running the guaranteed passes and is where ARC invariants will be enforced. -1. **RC Identity**: For any given SILValue, one should be able to determine its set of RC Identity Roots. This makes it easy to reason about which reference counts a reference count operation is affecting. -2. **Introduction of new High Level ARC Operations**: store_strong, load_strong, copy_value instructions should be added to SIL. These operations are currently split into separate low level operations. **TODO: ADD MORE HERE** -3. **Endow Use-Def edges with ARC Conventions**: Function signature ARC conventions should be extended to all instructions and block arguments. Thus all use-def edges should have an implied ownership transfer convention. -4. **ARC Verifier**: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics. -5. **Elimination of Memory Locations from High Level SIL**. Memory locations should be represented as SSA values instead of memory locations. This will allow for address only values to be manipulated and have their lifetimes verified by the ARC verifier in a trivial way without the introduction of Memory SSA. +2. **RC Identity**: For any given SILValue, one should be able to determine its set of RC Identity Roots. This makes it easy to reason about which reference counts a reference count operation is affecting. +3. **Introduction of new High Level ARC Operations**: store_strong, load_strong, copy_value instructions should be added to SIL. These operations are currently split into separate low level operations. **TODO: ADD MORE HERE** +4. **Endow Use-Def edges with ARC Conventions**: Function signature ARC conventions should be extended to all instructions and block arguments. Thus all use-def edges should have an implied ownership transfer convention. +5. 
**ARC Verifier**: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics. + We now go into depth on each one of those points. @@ -50,9 +50,9 @@ In order to pair semantic ARC operations effectively, one has to be able to dete by ensuring the following properties are true of all reference counting operations in a function body: - a. Every use-def edge must connect together a use and a def with compatible ARC semantics. As an example this means that any def that produces a +1 value must be paired with a -1 use. If one wishes to pass off a +1 value to an unowned use or a guaranteed use, one must use an appropriate conversion instruction. The conversion instruction would work as a pluggable adaptor and only certain adaptors that preserve safe ARC semantics would be provided. - - b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs. + a. Every use-def edge must connect together a use and a def with compatible ARC semantics. As an example this means that any def that produces a +1 value must be paired with a -1 use. If one wishes to pass off a +1 value to an unowned use or a guaranteed use, one must use an appropriate conversion instruction. The conversion instruction would work as a pluggable adaptor and only certain adaptors that preserve safe ARC semantics would be provided. + + b. 
Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs. ## Elimination of Memory Locations from High Level SIL From 2b649d88839d94551d167145ab508c983c200ef6 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Thu, 4 Aug 2016 13:42:02 -0700 Subject: [PATCH 22/62] Update SemanticARC.md --- docs/SemanticARC.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 0f5867300b78a..bdb374d04d379 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -33,6 +33,8 @@ We now go into depth on each one of those points. ## High Level SIL and Low Level SIL +The first step to bringing 7 + ARC optimization /will/ not occur at the Low Level SIL. This implies that function signature optimization and IPO must occur at High Level SIL. Necessarily both High and Low Level SIL must be able to be lowered by IRGen to LLVM IR to ensure that ## RC Identity @@ -54,8 +56,8 @@ In order to pair semantic ARC operations effectively, one has to be able to dete b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. 
If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs. -## Elimination of Memory Locations from High Level SIL + From c9c09da10a2ca63cb8ba45ee5629542f8760ce28 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Thu, 4 Aug 2016 13:43:18 -0700 Subject: [PATCH 23/62] Update SemanticARC.md --- docs/SemanticARC.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index bdb374d04d379..83301435baeca 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -9,7 +9,7 @@ This is a proposal for a series of changes to the SIL IR in order to ease the op ## Historical Implementations -The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. retain, release instructions were just calls that did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing it was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity. +The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language. 
In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. retain, release instructions were just calls that did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing it was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity [[1]](#first-footnote) conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity. The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffers from some of the same issues as the Objective C implementation of ARC but also provides some noteworthy improvements: @@ -56,6 +56,8 @@ In order to pair semantic ARC operations effectively, one has to be able to dete b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs. 
+[1] + @@ -33,19 +33,26 @@ We now go into depth on each one of those points. ## High Level SIL and Low Level SIL -The first step to bringing 7 - - ARC optimization /will/ not occur at the Low Level SIL. This implies that function signature optimization and IPO must occur at High Level SIL. Necessarily both High and Low Level SIL must be able to be lowered by IRGen to LLVM IR to ensure that +The first step towards implementing Semantic ARC is to split the "Canonical SIL Stage" into two different stages: High Level and Low Level SIL. The main distinction in between the two stages is, that in High Level SIL, ARC semantic invariants will be enforced via extra conditions on the IR. In contrast, once Low Level SIL has been reached, no ARC semantic invariants are enforced and only very conservative ARC optimization may occur. The intention is that Low Level SIL would /only/ be used when compiling with optimization enabled, so both High and Low Level SIL will necessarily need to be able to be lowered to LLVM IR. ## RC Identity -In order to pair semantic ARC operations effectively, one has to be able to determine that two ARC operations are manipulating the same reference count. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is guaranteed to be preserved by the frontend and all emitted instructions. We wish to define an algorithm which for any specific SILValue in a given program can determine the set of "RC Identity Sources" associated with the given SILValue. We do this as follows: Define an RC Identity as a tuple consisting of a SILValue and a ProjectionPath to a leaf reference type in the SILValue. Then define an RC Identity Source via the following recursive relation: +Once High Level SIL has been implemented, we will embed RC Identity into High Level SIL to ensure that RC identity can always be computed for all SSA values. 
Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is guaranteed to be preserved by the frontend and all emitted instructions. Define an RC Identity as a tuple consisting of a SILValue, V, and a ProjectionPath, P, to from V's type to a sub reference type in V [[2]](#footnote-2). We wish to define an algorithm that given any (V, P) in a program can determine the RC Identity Source associated with (V, P). We do this recursively follows: + +Let V be a SILValue and P be a projection path into V. Then: + +1. If V is a SILArgument, then (V, P) is an RC Identity Source. *NOTE* This implies that by default, SILArguments that act as phi nodes are RC Identity Sources. +2. If V is the result of a SILInstruction I, then if I does not have any operands, (V, P) is an RC Identity Source. If I does have SILOperands then I must define how (V, P) is related to its operands. Some possible relationships are: + i. RC Identity Forwarding. If I is a forwarding instruction, then (V, P) to an analogous RC Identity (OpV, OpP). Some examples of this sort of operation are casts, value projections, and value aggregations. + ii. RC Identity Introducing. These are instructions which introduce new RC Identity values implying that (V, P) is an RC Identity Source. Some examples of these sorts of instructions are: apply, partial_apply. + iii. Unspecified. If I is not an introducer or a forwarder and does not specify any specific semantics, then its RC Identity behavior is unspecified. -1. rcidsource(a: SILArgument) consists of the list of all leaf reference types in a with the associated projection paths into a. *NOTE* This implies that by default, we consider SILArguments that act as phi nodes to block further association with rc identity sources. -2. 
To calculate the rcidsource of an instruction +Our algorithm then is a very simple algorithm that applies the RC Identity Source algorithm to all SSA values in the program and ensures that RC Identity Sources can be computed for them. This should result in trivial use-def list traversal. ## New High Level ARC Operations +Once we are able to reason about + ## Endow Use-Def edges with ARC Conventions ## ARC Verifier @@ -56,7 +63,9 @@ In order to pair semantic ARC operations effectively, one has to be able to dete b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs. -[1] Reference Count Identity ("RC Identity") is a concept that is independent of pointer identity that refers to the set of reference counts that would be manipulated by a reference counted operation upon a specific SSA value. For more information see the [RC Identity](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst#rc-identity) section of the [ARC Optimization guide](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst) +[1] Reference Count Identity ("RC Identity") is a concept that is independent of pointer identity that refers to the set of reference counts that would be manipulated by a reference counted operation upon a specific SSA value. 
For more information see the [RC Identity](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst#rc-identity) section of the [ARC Optimization guide](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst) + +[2] *NOTE* In many cases P will be the empty set (e.g. the case of a pure reference type) We now go into depth on each one of those points. From 39cd3e59f83ca568bb935ed7c24a0c16622c380a Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Thu, 4 Aug 2016 14:36:47 -0700 Subject: [PATCH 28/62] Update SemanticARC.md --- docs/SemanticARC.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index a0c7fcc4eec63..0ca21fe572b6c 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -64,9 +64,9 @@ Once we are able to reason about b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs. -[1] Reference Count Identity ("RC Identity") is a concept that is independent of pointer identity that refers to the set of reference counts that would be manipulated by a reference counted operation upon a specific SSA value. 
For more information see the [RC Identity](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst#rc-identity) section of the [ARC Optimization guide](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst) +[1]Reference Count Identity ("RC Identity") is a concept that is independent of pointer identity that refers to the set of reference counts that would be manipulated by a reference counted operation upon a specific SSA value. For more information see the [RC Identity](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst#rc-identity) section of the [ARC Optimization guide](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst) -[2] *NOTE* In many cases P will be the empty set (e.g. the case of a pure reference type) +[2]**NOTE** In many cases P will be the empty set (e.g. the case of a pure reference type) We now go into depth on each one of those points. ### High Level SIL and Low Level SIL -The first step towards implementing Semantic ARC is to split the "Canonical SIL Stage" into two different stages: High Level and Low Level SIL. The main distinction in between the two stages is, that in High Level SIL, ARC semantic invariants will be enforced via extra conditions on the IR. In contrast, once Low Level SIL has been reached, no ARC semantic invariants are enforced and only very conservative ARC optimization may occur. The intention is that Low Level SIL would /only/ be used when compiling with optimization enabled, so both High and Low Level SIL will necessarily need to be able to be lowered to LLVM IR. +The first step towards implementing Semantic ARC is to split the "Canonical SIL +Stage" into two different stages: High Level and Low Level SIL. The main +distinction in between the two stages is, that in High Level SIL, ARC semantic +invariants will be enforced via extra conditions on the IR. 
In contrast, once +Low Level SIL has been reached, no ARC semantic invariants are enforced and only +very conservative ARC optimization may occur. The intention is that Low Level +SIL would /only/ be used when compiling with optimization enabled, so both High +and Low Level SIL will necessarily need to be able to be lowered to LLVM IR. ### RC Identity -Once High Level SIL has been implemented, we will embed RC Identity into High Level SIL to ensure that RC identity can always be computed for all SSA values. Currently in SIL this is not a robust operation due to the lack of IR level model of RC identity that is guaranteed to be preserved by the frontend and all emitted instructions. Define an RC Identity as a tuple consisting of a SILValue, V, and a ProjectionPath, P, to from V's type to a sub reference type in V [[2]](#footnote-2). We wish to define an algorithm that given any (V, P) in a program can determine the RC Identity Source associated with (V, P). We do this recursively follows: +Once High Level SIL has been implemented, we will embed RC Identity into High +Level SIL to ensure that RC identity can always be computed for all SSA +values. Currently in SIL this is not a robust operation due to the lack of an IR +level model of RC identity that is guaranteed to be preserved by the frontend +and all emitted instructions. Define an RC Identity as a tuple consisting of a +SILValue, V, and a ProjectionPath, P, from V's type to a sub reference type +in V [[2]](#footnote-2). We wish to define an algorithm that given any (V, P) in +a program can determine the RC Identity Source associated with (V, P). We do +this recursively as follows: Let V be a SILValue and P be a projection path into V. Then: -1. If V is a SILArgument, then (V, P) is an RC Identity Source. *NOTE* This implies that by default, SILArguments that act as phi nodes are RC Identity Sources. -2. If V is the result of a SILInstruction I, then if I does not have any operands, (V, P) is an RC Identity Source.
If I does have SILOperands then I must define how (V, P) is related to its operands. Some possible relationships are: - i. RC Identity Forwarding. If I is a forwarding instruction, then (V, P) to an analogous RC Identity (OpV, OpP). Some examples of this sort of operation are casts, value projections, and value aggregations. - ii. RC Identity Introducing. These are instructions which introduce new RC Identity values implying that (V, P) is an RC Identity Source. Some examples of these sorts of instructions are: apply, partial_apply. - iii. Unspecified. If I is not an introducer or a forwarder and does not specify any specific semantics, then its RC Identity behavior is unspecified. - -Our algorithm then is a very simple algorithm that applies the RC Identity Source algorithm to all SSA values in the program and ensures that RC Identity Sources can be computed for them. This should result in trivial use-def list traversal. +1. If V is a SILArgument, then (V, P) is an RC Identity Source. *NOTE* This + implies that by default, SILArguments that act as phi nodes are RC Identity + Sources. +2. If V is the result of a SILInstruction I, then if I does not have any + operands, (V, P) is an RC Identity Source. If I does have SILOperands then I + must define how (V, P) is related to its operands. Some possible + relationships are: + i. RC Identity Forwarding. If I is a forwarding instruction, then (V, P) to an + analogous RC Identity (OpV, OpP). Some examples of this sort of operation are + casts, value projections, and value aggregations. + ii. RC Identity Introducing. These are instructions which introduce new RC + Identity values implying that (V, P) is an RC Identity Source. Some examples + of these sorts of instructions are: apply, partial_apply. + iii. Unspecified. If I is not an introducer or a forwarder and does not specify + any specific semantics, then its RC Identity behavior is unspecified. 
+ +Our algorithm then is a very simple algorithm that applies the RC Identity +Source algorithm to all SSA values in the program and ensures that RC Identity +Sources can be computed for them. This should result in trivial use-def list +traversal. ### New High Level ARC Operations -Once we are able to reason about RC Identity, the next step in implementing Semantic ARC is to eliminate in High Level SIL certain Low Level aggregate operations that have ARC semantics but are not conducive to reasoning about ARC operations on use-def edges. These are specifically: - -1. strong_release, release_value. These in High Level SIL will be replaced by a copy_value instruction with the following semantics: - - a. By default a copy_value will perform a bit by bit copy of its input argument and a retain_value operation. The argument still maintains its own lifetime and the result of the copy_value should semantically be able to be treated as a completely separate value from the program semantic perspective. - - b. If the copy_value instruction has the [take] flag associated with it, then a move is being performed and while a bit by bit copy of the value occurs, no retain_value is applied to it. The original SSA value as a result of this operation has an undefined bit value and in debugging situations could be given a malloc scribbled payload. - -2. strong_release, release_value will be replaced by a destroy_value instruction with the following semantics: - - a. By default a destroy_value will perform a release_value on its input value. After this point, the bit value of the SSA value is undefined and in debugging situations, the SSA value could be given a malloc scribbled payload. - - b. A destroy_value with the [noop] flag attached to it does not perform a release_value on its input value but /does/ scribble over the memory in debugging situations. *FIXME [noop] needs a better name*. - -3. strong store/strong load operations should be provided as instructions. 
This allows for normal loads to be considered as not having any ARC significant operations and eliminates a hole in ARC where a pointer is partially initialized (i.e. it a value is loaded but it has not been retained. In the time period in between those two points the value is partially initialized allowing for optimizer bugs). - -*NOTE* In Low Level SIL, each of these atomic primitives will be lowered to their low level variants. +Once we are able to reason about RC Identity, the next step in implementing +Semantic ARC is to eliminate in High Level SIL certain Low Level aggregate +operations that have ARC semantics but are not conducive to reasoning about ARC +operations on use-def edges. These are specifically: + +1. strong_retain, retain_value. These in High Level SIL will be replaced by a + copy_value instruction with the following semantics: + +a. By default a copy_value will perform a bit by bit copy of its input argument + and a retain_value operation. The argument still maintains its own lifetime and + the result of the copy_value should semantically be able to be treated as a + completely separate value from the program semantic perspective. +b. If the copy_value instruction has the [take] flag associated with it, then a + move is being performed and while a bit by bit copy of the value occurs, no + retain_value is applied to it. The original SSA value as a result of this + operation has an undefined bit value and in debugging situations could be + given a malloc scribbled payload. + +2. strong_release, release_value will be replaced by a destroy_value instruction + with the following semantics: + +a. By default a destroy_value will perform a release_value on its input + value. After this point, the bit value of the SSA value is undefined and in + debugging situations, the SSA value could be given a malloc scribbled payload. + +b.
A destroy_value with the [noop] flag attached to it does not perform a + release_value on its input value but /does/ scribble over the memory in + debugging situations. *FIXME [noop] needs a better name*. + +3. strong store/strong load operations should be provided as instructions. This + allows for normal loads to be considered as not having any ARC significant + operations and eliminates a hole in ARC where a pointer is partially + initialized (i.e. a value is loaded but it has not yet been retained; in the + time period in between those two points the value is partially initialized, + allowing for optimizer bugs). + +*NOTE* In Low Level SIL, each of these atomic primitives will be lowered to +their low level variants. ### Endow Use-Def edges with ARC Conventions -Once we have these higher level operations, the next step is to create the notion of operand and result ARC conventions for all instructions. At a high level this is just the extension of argument/result conventions from apply sites to /all/ instructions. By then verifying that each use-def pair have compatible result/operand conventions, we can statically verify that ARC relationships are being preserved. +Once we have these higher level operations, the next step is to create the +notion of operand and result ARC conventions for all instructions. At a high +level this is just the extension of argument/result conventions from apply sites +to /all/ instructions. By then verifying that each use-def pair has compatible +result/operand conventions, we can statically verify that ARC relationships are +being preserved. In order to simplify this, we will make the following changes: -1. All SILInstructions must assign to their operands one of the following conventions: - * @owned - * @guaranteed - * @unowned @safe - * @unowned @unsafe - * @forwarding +1.
All SILInstructions must assign to their operands one of the following + conventions: +* @owned +* @guaranteed +* @unowned @safe +* @unowned @unsafe +* @forwarding 2. All SILInstructions must assign to their result one of the following conventions: - * @owned - * @unowned @unsafe - * @unowned @safe - * @forwarding +* @owned +* @unowned @unsafe +* @unowned @safe +* @forwarding 3. All SILArguments must have one of the following conventions associated with it: - * @owned - * @guaranteed - * @unowned @unsafe - * @unowned @safe - * @forwarding - -@forwarding is a new convention that we add to reduce the amount of extra instructions needed to implement this scheme. @forwarding is a special convention intended for instructions that forward RC Identity that for simplictuy will be restricted to forwarding the convention of their def instruction to all of the uses of that instruction. Of course, for forwarding instructions with multiple inputs, we require that all of the inputs have the same convention. - -The general rule is that each result convention with the name x, must be matched with the operand convention with the same name with some specific exceptions. Let us consider an example. Consider the struct Foo: - - struct Foo { - var x: Builtin.NativeObject - var y: Builtin.NativeObject - } +* @owned +* @guaranteed +* @unowned @unsafe +* @unowned @safe +* @forwarding + +@forwarding is a new convention that we add to reduce the amount of extra +instructions needed to implement this scheme. @forwarding is a special +convention intended for instructions that forward RC Identity that for +simplictuy will be restricted to forwarding the convention of their def +instruction to all of the uses of that instruction. Of course, for forwarding +instructions with multiple inputs, we require that all of the inputs have the +same convention. 
+ +The general rule is that each result convention with the name x, must be matched +with the operand convention with the same name with some specific +exceptions. Let us consider an example. Consider the struct Foo: + +struct Foo { +var x: Builtin.NativeObject +var y: Builtin.NativeObject +} and the following SIL: - sil @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) - - sil @foo : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { - bb0(%0 : @guaranteed Builtin.NativeObject): - %1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding - %2 = copy_value %1 : $@guaranteed Foo # This converts %1 from @guaranteed -> @owned - %3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) - %4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed - destroy_value %4 : $@owned Builtin.NativeObject - %5 = tuple() - return %5 : $() - } - -Let us consider another example that is incorrect and where the conventions allow for optimizer or frontend error to be caught easily. 
Consider foo2: - - sil @foo2 : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { - bb0(%0 : @guaranteed Builtin.NativeObject): - %1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding - %2 = copy_value [take] %1 : $@guaranteed Foo # ==> ERROR: Can not take a guaranteed parameter <== - %3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) - %4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed - destroy_value %4 : $@owned Builtin.NativeObject - %5 = tuple() - return %5 : $() - } - -In this case, since a copy_value [take] can only accept an @owned parameter as an argument, a simple use-def type verifier would throw, preventing an improper transfer of ownership. +sil @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) + +sil @foo : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { +bb0(%0 : @guaranteed Builtin.NativeObject): +%1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding +%2 = copy_value %1 : $@guaranteed Foo # This converts %1 from @guaranteed -> @owned +%3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) +%4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed +destroy_value %4 : $@owned Builtin.NativeObject +%5 = tuple() +return %5 : $() +} + +Let us consider another example that is incorrect and where the conventions +allow for optimizer or frontend error to be caught easily. 
Consider foo2: + +sil @foo2 : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { +bb0(%0 : @guaranteed Builtin.NativeObject): +%1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding +%2 = copy_value [take] %1 : $@guaranteed Foo # ==> ERROR: Can not take a guaranteed parameter <== +%3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) +%4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed +destroy_value %4 : $@owned Builtin.NativeObject +%5 = tuple() +return %5 : $() +} + +In this case, since a copy_value [take] can only accept an @owned parameter as +an argument, a simple use-def type verifier would throw, preventing an improper +transfer of ownership. ### ARC Verifier -Once we have endowed use-def edges with ARC semantic properties, we can ensure that ARC is statically correct by ensuring that for all function bodies the following is true: - - a. Every use-def edge must connect together a use and a def with compatible ARC semantics. As an example this means that any def that produces a +1 value must be paired with a -1 use. If one wishes to pass off a +1 value to an unowned use or a guaranteed use, one must use an appropriate conversion instruction. The conversion instruction would work as a pluggable adaptor and only certain adaptors that preserve safe ARC semantics would be provided. - - b. Every +1 operation can only be balanced by a -1 once along any path through the program. This would be implemented in the verifier by using the use-def list of a +1, -1 to construct joint-domination sets. The author believes that there is a simple algorithm for disproving joint dominance of a set by an instruction, but if one can not be come up with, there is literature for computing generalized dominators that can be used. 
If computation of generalized dominators is too expensive for normal use, they could be used on specific verification bots and used when triaging bugs. - -This guarantees via each instruction's interface that each +1 is properly balanced by a -1 and that no +1 is balanced multiple times along any path through the program... that is the program is ARC correct = ). +Once we have endowed use-def edges with ARC semantic properties, we can ensure +that ARC is statically correct by ensuring that for all function bodies the +following is true: + +a. Every use-def edge must connect together a use and a def with compatible ARC +semantics. As an example this means that any def that produces a +1 value must +be paired with a -1 use. If one wishes to pass off a +1 value to an unowned use +or a guaranteed use, one must use an appropriate conversion instruction. The +conversion instruction would work as a pluggable adaptor and only certain +adaptors that preserve safe ARC semantics would be provided. + +b. Every +1 operation can only be balanced by a -1 once along any path through +the program. This would be implemented in the verifier by using the use-def list +of a +1, -1 to construct joint-domination sets. The author believes that there +is a simple algorithm for disproving joint dominance of a set by an instruction, +but if one can not be come up with, there is literature for computing +generalized dominators that can be used. If computation of generalized +dominators is too expensive for normal use, they could be used on specific +verification bots and used when triaging bugs. + +This guarantees via each instruction's interface that each +1 is properly +balanced by a -1 and that no +1 is balanced multiple times along any path +through the program... that is the program is ARC correct = ). + +## Semantic ARC Based Optimization + +With this data, new and novel forms of optimization are now possible. We present +an algorithm called the Signature Sequence Algorithm. 
Considering the world of +solely static functions and linkage unit visibility. In such a world, all +dynamic lifetimes of all objects across all region boundaries will be 1 as a +result of this program. + +### "The Signature Optimization" + +Regions of lifetimes are determined solely by polymorphism (i.e. on +restrained by polymorphism). Everything else can be specialized as +appropriate. Similar to strongly connected components. If one images the world +of lifetimes, there is a minimal lifetime starting from a polymorphic function +that is open. The reason why this is true is since one can not know everything +that needs to be specialized. Or if from Storage. + +### "The Cleanup" + +Otherwise, one could perform offsetting retains, releases so that each +1, +1 is +at the same scope. Then run the cleanup crew. ## Footnotes From a64aa4f1a397f08ed9ff4b7352d98760c0baa239 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Sat, 13 Aug 2016 15:38:26 -0400 Subject: [PATCH 46/62] Undo long lines mistakes. 
--- docs/SemanticARC.md | 310 ++++++++++++++------------------------------ 1 file changed, 99 insertions(+), 211 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index a3bf0eabd99df..e556140d9e1da 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -6,258 +6,146 @@ - [Preface](#preface) - [Historical Implementations](#historical-implementations) - [Semantic ARC](#semantic-arc) -- [High Level SIL and Low Level SIL](#high-level-sil-and-low-level-sil) -- [RC Identity](#rc-identity) -- [New High Level ARC Operations](#new-high-level-arc-operations) -- [Endow Use-Def edges with ARC Conventions](#endow-use-def-edges-with-arc-conventions) -- [ARC Verifier](#arc-verifier) -- [Signature Optimization](#signature-optimization) + - [High Level SIL and Low Level SIL](#high-level-sil-and-low-level-sil) + - [RC Identity](#rc-identity) + - [New High Level ARC Operations](#new-high-level-arc-operations) + - [Endow Use-Def edges with ARC Conventions](#endow-use-def-edges-with-arc-conventions) + - [ARC Verifier](#arc-verifier) - [Footnotes](#footnotes) ## Preface -This is a proposal for a series of changes to the SIL IR in order to ease the -optimization of ARC operations and allow for static verification of ARC -semantics in SIL. This is a proposal meant for compiler writers and -implementors, not users, i.e. we assume that the reader has a basic familiarity -with the basic concepts of ARC. +This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and allow for static verification of ARC semantics in SIL. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC. -**NOTE** We are talking solely about ARC as implemented beginning in Objective -C. There may be other ARC implementations that are unknown to the writer. +**NOTE** We are talking solely about ARC as implemented beginning in Objective C. 
There may be other ARC implementations that are unknown to the writer. ## Historical Implementations -The first historical implementation of ARC in a mid level IR is in LLVM IR for -use by the Objective C language. In this model, a pointer with ARC semantics is -represented as an i8*. The lifetimes of the pointer is managed via retain and -release operations and the end of the pointer's lifetime is ascertained via -conservative analysis of uses. retain, release instructions were just calls that -did not have any semantic ARC information that enabled reasoning about what the -corresponding balancing operation opposing it was. Additionally, there were many -operations on ARC pointers that /should/ have resulted in atomic operations were -instead separated into separate uses (i.e. load/store strong when compiling for -optimization). Uses of Objective C pointers by functions were problematic as -well since there was no verification of the semantic ARC convention that a -function required of its pointer arguments. Finally, one could only establish -that two operations on different pointers had the same RC identity -[[1]](#footnote-1) conservatively via alias analysis. This prevents semantic -guarantees from the IR in terms of ability to calculate RC identity. - -The ARC implementation in Swift, in contrast, to Objective C, is implemented in -the SIL IR. This suffers from some of the same issues as the Objective C -implementation of ARC but also provides some noteworthy improvements: - -1. Reference Semantics in the Type System. Since SIL's type system is based upon - the Swift type system, the notion of a reference countable type exists. This - means that it is possible to verify that reference counting operations only - apply to SSA values and memory locations associated reference counted types. -2. Argument conventions on function signatures. In SIL, all functions specify - the ownership convention expected of their arguments and return values. 
Since - these conventions were not specified in the operations in the bodies of - functions though, this could not be used to create a true ARC verifier. +The first historical implementation of ARC in a mid level IR is in LLVM IR for use by the Objective C language. In this model, a pointer with ARC semantics is represented as an i8*. The lifetimes of the pointer is managed via retain and release operations and the end of the pointer's lifetime is ascertained via conservative analysis of uses. retain, release instructions were just calls that did not have any semantic ARC information that enabled reasoning about what the corresponding balancing operation opposing it was. Additionally, there were many operations on ARC pointers that /should/ have resulted in atomic operations were instead separated into separate uses (i.e. load/store strong when compiling for optimization). Uses of Objective C pointers by functions were problematic as well since there was no verification of the semantic ARC convention that a function required of its pointer arguments. Finally, one could only establish that two operations on different pointers had the same RC identity [[1]](#footnote-1) conservatively via alias analysis. This prevents semantic guarantees from the IR in terms of ability to calculate RC identity. + +The ARC implementation in Swift, in contrast, to Objective C, is implemented in the SIL IR. This suffers from some of the same issues as the Objective C implementation of ARC but also provides some noteworthy improvements: + +1. Reference Semantics in the Type System. Since SIL's type system is based upon the Swift type system, the notion of a reference countable type exists. This means that it is possible to verify that reference counting operations only apply to SSA values and memory locations associated reference counted types. +2. Argument conventions on function signatures. 
In SIL, all functions specify the ownership convention expected of their arguments and return values. Since these conventions were not specified in the operations in the bodies of functions though, this could not be used to create a true ARC verifier. *TODO: I could point out here that many of the ideas in Semantic ARC are extending these ideas throughout the IR?* ## Semantic ARC -As discussed in the previous section, the implementation of ARC in both Swift -and Objective C lacked important semantic ARC information. We fix these issues -by embedding the following ARC semantic information into SIL in the following -order of implementation: - -1. **Split the Canonical SIL Stage into High and Low Level SIL**: High Level SIL - will be the result of running the guaranteed passes and is where ARC - invariants will be enforced. -2. **RC Identity**: For any given SILValue, one should be able to determine its - set of RC Identity Roots. This makes it easy to reason about which reference - counts a reference count operation is affecting. -3. **Introduction of new High Level ARC Operations**: store_strong, load_strong, - copy_value instructions should be added to SIL. These operations are - currently split into separate low level operations and are the missing pieces - towards allowing all function local ARC relationships to be expressed via - use-def chains. **TODO: ADD MORE HERE** -4. **Endow Use-Def edges with ARC Conventions**: Function signature ARC - conventions should be extended to all instructions and block arguments. Thus - all use-def edges should have an implied ownership transfer convention. -5. **ARC Verifier**: An ARC verifier should be written that uses RC Identity, - Operand ARC Conventions, and High Level ARC operations to statically verify - that a program obeys ARC semantics. -6. **Elimination of Memory Locations from High Level SIL**. Memory locations -should be represented as SSA values instead of memory locations. 
This will allow -for address only values to be manipulated and have their lifetimes verified by -the ARC verifier in a trivial way without the introduction of Memory SSA. +As discussed in the previous section, the implementation of ARC in both Swift and Objective C lacked important semantic ARC information. We fix these issues by embedding the following ARC semantic information into SIL in the following order of implementation: +1. **Split the Canonical SIL Stage into High and Low Level SIL**: High Level SIL will be the result of running the guaranteed passes and is where ARC invariants will be enforced. +2. **RC Identity**: For any given SILValue, one should be able to determine its set of RC Identity Roots. This makes it easy to reason about which reference counts a reference count operation is affecting. +3. **Introduction of new High Level ARC Operations**: store_strong, load_strong, copy_value instructions should be added to SIL. These operations are currently split into separate low level operations and are the missing pieces towards allowing all function local ARC relationships to be expressed via use-def chains. **TODO: ADD MORE HERE** +4. **Endow Use-Def edges with ARC Conventions**: Function signature ARC conventions should be extended to all instructions and block arguments. Thus all use-def edges should have an implied ownership transfer convention. +5. **ARC Verifier**: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics. + + We now go into depth on each one of those points. ### High Level SIL and Low Level SIL -The first step towards implementing Semantic ARC is to split the "Canonical SIL -Stage" into two different stages: High Level and Low Level SIL. The main -distinction in between the two stages is, that in High Level SIL, ARC semantic -invariants will be enforced via extra conditions on the IR. 
In contrast, once -Low Level SIL has been reached, no ARC semantic invariants are enforced and only -very conservative ARC optimization may occur. The intention is that Low Level -SIL would /only/ be used when compiling with optimization enabled, so both High -and Low Level SIL will necessarily need to be able to be lowered to LLVM IR. +The first step towards implementing Semantic ARC is to split the "Canonical SIL Stage" into two different stages: High Level and Low Level SIL. The main distinction between the two stages is that in High Level SIL, ARC semantic invariants will be enforced via extra conditions on the IR. In contrast, once Low Level SIL has been reached, no ARC semantic invariants are enforced and only very conservative ARC optimization may occur. The intention is that Low Level SIL would /only/ be used when compiling with optimization enabled, so both High and Low Level SIL will necessarily need to be able to be lowered to LLVM IR. ### RC Identity -Once High Level SIL has been implemented, we will embed RC Identity into High -Level SIL to ensure that RC identity can always be computed for all SSA -values. Currently in SIL this is not a robust operation due to the lack of IR -level model of RC identity that is guaranteed to be preserved by the frontend -and all emitted instructions. Define an RC Identity as a tuple consisting of a -SILValue, V, and a ProjectionPath, P, to from V's type to a sub reference type -in V [[2]](#footnote-2). We wish to define an algorithm that given any (V, P) in -a program can determine the RC Identity Source associated with (V, P). We do -this recursively follows: +Once High Level SIL has been implemented, we will embed RC Identity into High Level SIL to ensure that RC identity can always be computed for all SSA values. Currently in SIL this is not a robust operation due to the lack of an IR level model of RC identity that is guaranteed to be preserved by the frontend and all emitted instructions.
Define an RC Identity as a tuple consisting of a SILValue, V, and a ProjectionPath, P, from V's type to a sub reference type in V [[2]](#footnote-2). We wish to define an algorithm that given any (V, P) in a program can determine the RC Identity Source associated with (V, P). We do this recursively as follows: Let V be a SILValue and P be a projection path into V. Then: -1. If V is a SILArgument, then (V, P) is an RC Identity Source. *NOTE* This - implies that by default, SILArguments that act as phi nodes are RC Identity - Sources. -2. If V is the result of a SILInstruction I, then if I does not have any - operands, (V, P) is an RC Identity Source. If I does have SILOperands then I - must define how (V, P) is related to its operands. Some possible - relationships are: - i. RC Identity Forwarding. If I is a forwarding instruction, then (V, P) to an - analogous RC Identity (OpV, OpP). Some examples of this sort of operation are - casts, value projections, and value aggregations. - ii. RC Identity Introducing. These are instructions which introduce new RC - Identity values implying that (V, P) is an RC Identity Source. Some examples - of these sorts of instructions are: apply, partial_apply. - iii. Unspecified. If I is not an introducer or a forwarder and does not specify - any specific semantics, then its RC Identity behavior is unspecified. - -Our algorithm then is a very simple algorithm that applies the RC Identity -Source algorithm to all SSA values in the program and ensures that RC Identity -Sources can be computed for them. This should result in trivial use-def list -traversal. +1. If V is a SILArgument, then (V, P) is an RC Identity Source. *NOTE* This implies that by default, SILArguments that act as phi nodes are RC Identity Sources. +2. If V is the result of a SILInstruction I, then if I does not have any operands, (V, P) is an RC Identity Source. If I does have SILOperands then I must define how (V, P) is related to its operands.
Some possible relationships are: + i. RC Identity Forwarding. If I is a forwarding instruction, then (V, P) corresponds to an analogous RC Identity (OpV, OpP). Some examples of this sort of operation are casts, value projections, and value aggregations. + ii. RC Identity Introducing. These are instructions which introduce new RC Identity values implying that (V, P) is an RC Identity Source. Some examples of these sorts of instructions are: apply, partial_apply. + iii. Unspecified. If I is not an introducer or a forwarder and does not specify any specific semantics, then its RC Identity behavior is unspecified. + +Our algorithm then is a very simple algorithm that applies the RC Identity Source algorithm to all SSA values in the program and ensures that RC Identity Sources can be computed for them. This should result in trivial use-def list traversal. ### New High Level ARC Operations -Once we are able to reason about RC Identity, the next step in implementing -Semantic ARC is to eliminate in High Level SIL certain Low Level aggregate -operations that have ARC semantics but are not conducive to reasoning about ARC -operations on use-def edges. These are specifically: - -1. strong_release, release_value. These in High Level SIL will be replaced by a - copy_value instruction with the following semantics: - -a. By default a copy_value will perform a bit by bit copy of its input argument - and a retain_value operation. The argument still maintains its own lifetime and - the result of the copy_value should semantically be able to be treated as a - completely separate value from the program semantic perspective. -b. If the copy_value instruction has the [take] flag associated with it, then a - move is being performed and while a bit by bit copy of the value occurs, no - retain_value is applied to it. The original SSA value as a result of this - operation has an undefined bit value and in debugging situations could be - given a malloc scribbled payload. - -2.
strong_release, release_value will be replaced by a destroy_value instruction - with the following semantics: - -a. By default a destroy_value will perform a release_value on its input - value. After this point, the bit value of the SSA value is undefined and in - debugging situations, the SSA value could be given a malloc scribbled payload. - -b. A destroy_value with the [noop] flag attached to it does not perform a - release_value on its input value but /does/ scribble over the memory in - debugging situations. *FIXME [noop] needs a better name*. - -3. strong store/strong load operations should be provided as instructions. This - allows for normal loads to be considered as not having any ARC significant - operations and eliminates a hole in ARC where a pointer is partially - initialized (i.e. it a value is loaded but it has not been retained. In the - time period in between those two points the value is partially initialized - allowing for optimizer bugs). - -*NOTE* In Low Level SIL, each of these atomic primitives will be lowered to -their low level variants. +Once we are able to reason about RC Identity, the next step in implementing Semantic ARC is to eliminate in High Level SIL certain Low Level aggregate operations that have ARC semantics but are not conducive to reasoning about ARC operations on use-def edges. These are specifically: + +1. strong_retain, retain_value. These in High Level SIL will be replaced by a copy_value instruction with the following semantics: + + a. By default a copy_value will perform a bit by bit copy of its input argument and a retain_value operation. The argument still maintains its own lifetime and the result of the copy_value should semantically be able to be treated as a completely separate value from the program semantic perspective. + + b. If the copy_value instruction has the [take] flag associated with it, then a move is being performed and while a bit by bit copy of the value occurs, no retain_value is applied to it.
The original SSA value as a result of this operation has an undefined bit value and in debugging situations could be given a malloc scribbled payload. + +2. strong_release, release_value will be replaced by a destroy_value instruction with the following semantics: + + a. By default a destroy_value will perform a release_value on its input value. After this point, the bit value of the SSA value is undefined and in debugging situations, the SSA value could be given a malloc scribbled payload. + + b. A destroy_value with the [noop] flag attached to it does not perform a release_value on its input value but /does/ scribble over the memory in debugging situations. *FIXME [noop] needs a better name*. + +3. strong store/strong load operations should be provided as instructions. This allows for normal loads to be considered as not having any ARC significant operations and eliminates a hole in ARC where a pointer is partially initialized (i.e. a value is loaded but has not yet been retained; in the time period in between those two points the value is partially initialized, allowing for optimizer bugs). + +*NOTE* In Low Level SIL, each of these atomic primitives will be lowered to their low level variants. ### Endow Use-Def edges with ARC Conventions -Once we have these higher level operations, the next step is to create the -notion of operand and result ARC conventions for all instructions. At a high -level this is just the extension of argument/result conventions from apply sites -to /all/ instructions. By then verifying that each use-def pair have compatible -result/operand conventions, we can statically verify that ARC relationships are -being preserved. +Once we have these higher level operations, the next step is to create the notion of operand and result ARC conventions for all instructions. At a high level this is just the extension of argument/result conventions from apply sites to /all/ instructions.
By then verifying that each use-def pair have compatible result/operand conventions, we can statically verify that ARC relationships are being preserved. In order to simplify this, we will make the following changes: -1. All SILInstructions must assign to their operands one of the following - conventions: -* @owned -* @guaranteed -* @unowned @safe -* @unowned @unsafe -* @forwarding +1. All SILInstructions must assign to their operands one of the following conventions: + * @owned + * @guaranteed + * @unowned @safe + * @unowned @unsafe + * @forwarding 2. All SILInstructions must assign to their result one of the following conventions: -* @owned -* @unowned @unsafe -* @unowned @safe -* @forwarding + * @owned + * @unowned @unsafe + * @unowned @safe + * @forwarding 3. All SILArguments must have one of the following conventions associated with it: -* @owned -* @guaranteed -* @unowned @unsafe -* @unowned @safe -* @forwarding - -@forwarding is a new convention that we add to reduce the amount of extra -instructions needed to implement this scheme. @forwarding is a special -convention intended for instructions that forward RC Identity that for -simplictuy will be restricted to forwarding the convention of their def -instruction to all of the uses of that instruction. Of course, for forwarding -instructions with multiple inputs, we require that all of the inputs have the -same convention. - -The general rule is that each result convention with the name x, must be matched -with the operand convention with the same name with some specific -exceptions. Let us consider an example. Consider the struct Foo: - -struct Foo { -var x: Builtin.NativeObject -var y: Builtin.NativeObject -} + * @owned + * @guaranteed + * @unowned @unsafe + * @unowned @safe + * @forwarding + +@forwarding is a new convention that we add to reduce the amount of extra instructions needed to implement this scheme. 
@forwarding is a special convention intended for instructions that forward RC Identity and that, for simplicity, will be restricted to forwarding the convention of their def instruction to all of the uses of that instruction. Of course, for forwarding instructions with multiple inputs, we require that all of the inputs have the same convention. + +The general rule is that each result convention with the name x must be matched with the operand convention with the same name, with some specific exceptions. Let us consider an example. Consider the struct Foo: + + struct Foo { + var x: Builtin.NativeObject + var y: Builtin.NativeObject + } and the following SIL: -sil @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) - -sil @foo : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { -bb0(%0 : @guaranteed Builtin.NativeObject): -%1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding -%2 = copy_value %1 : $@guaranteed Foo # This converts %1 from @guaranteed -> @owned -%3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) -%4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed -destroy_value %4 : $@owned Builtin.NativeObject -%5 = tuple() -return %5 : $() -} - -Let us consider another example that is incorrect and where the conventions -allow for optimizer or frontend error to be caught easily.
Consider foo2: - -sil @foo2 : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { -bb0(%0 : @guaranteed Builtin.NativeObject): -%1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding -%2 = copy_value [take] %1 : $@guaranteed Foo # ==> ERROR: Can not take a guaranteed parameter <== -%3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) -%4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed -destroy_value %4 : $@owned Builtin.NativeObject -%5 = tuple() -return %5 : $() -} - -In this case, since a copy_value [take] can only accept an @owned parameter as -an argument, a simple use-def type verifier would throw, preventing an improper -transfer of ownership. + sil @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) + + sil @foo : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { + bb0(%0 : @guaranteed Builtin.NativeObject): + %1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding + %2 = copy_value %1 : $@guaranteed Foo # This converts %1 from @guaranteed -> @owned + %3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) + %4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed + destroy_value %4 : $@owned Builtin.NativeObject + %5 = tuple() + return %5 : $() + } + +Let us consider another example that is incorrect and where the conventions allow for optimizer or frontend error to be caught easily. 
Consider foo2: + + sil @foo2 : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { + bb0(%0 : @guaranteed Builtin.NativeObject): + %1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding + %2 = copy_value [take] %1 : $@guaranteed Foo # ==> ERROR: Can not take a guaranteed parameter <== + %3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) + %4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed + destroy_value %4 : $@owned Builtin.NativeObject + %5 = tuple() + return %5 : $() + } + +In this case, since a copy_value [take] can only accept an @owned parameter as an argument, a simple use-def type verifier would throw, preventing an improper transfer of ownership. ### ARC Verifier From e4dbab3de3ba71a55cc336913b44468e2e90ab19 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Sat, 13 Aug 2016 15:41:01 -0400 Subject: [PATCH 47/62] Fix up TOC and add back in the section on address only types as values. 
--- docs/SemanticARC.md | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index e556140d9e1da..314bc33feb82b 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -3,15 +3,19 @@ **Table of Contents** -- [Preface](#preface) -- [Historical Implementations](#historical-implementations) -- [Semantic ARC](#semantic-arc) - - [High Level SIL and Low Level SIL](#high-level-sil-and-low-level-sil) - - [RC Identity](#rc-identity) - - [New High Level ARC Operations](#new-high-level-arc-operations) - - [Endow Use-Def edges with ARC Conventions](#endow-use-def-edges-with-arc-conventions) - - [ARC Verifier](#arc-verifier) -- [Footnotes](#footnotes) + - [Preface](#preface) + - [Historical Implementations](#historical-implementations) + - [Semantic ARC](#semantic-arc) + - [High Level SIL and Low Level SIL](#high-level-sil-and-low-level-sil) + - [RC Identity](#rc-identity) + - [New High Level ARC Operations](#new-high-level-arc-operations) + - [Endow Use-Def edges with ARC Conventions](#endow-use-def-edges-with-arc-conventions) + - [Elimination of Memory Locations from High Level SIL](#elimination-of-memory-locations-from-high-level-sil) + - [ARC Verifier](#arc-verifier) + - [Semantic ARC Based Optimization](#semantic-arc-based-optimization) + - ["The Signature Optimization"](#the-signature-optimization) + - ["The Cleanup"](#the-cleanup) + - [Footnotes](#footnotes) ## Preface @@ -39,8 +43,7 @@ As discussed in the previous section, the implementation of ARC in both Swift an 3. **Introduction of new High Level ARC Operations**: store_strong, load_strong, copy_value instructions should be added to SIL. These operations are currently split into separate low level operations and are the missing pieces towards allowing all function local ARC relationships to be expressed via use-def chains. **TODO: ADD MORE HERE** 4. 
**Endow Use-Def edges with ARC Conventions**: Function signature ARC conventions should be extended to all instructions and block arguments. Thus all use-def edges should have an implied ownership transfer convention. 5. **ARC Verifier**: An ARC verifier should be written that uses RC Identity, Operand ARC Conventions, and High Level ARC operations to statically verify that a program obeys ARC semantics. - - +6. **Elimination of Memory Locations from High Level SIL**. Memory locations should be represented as SSA values instead of memory locations. This will allow for address only values to be manipulated and have their lifetimes verified by the ARC verifier in a trivial way without the introduction of Memory SSA. We now go into depth on each one of those points. @@ -147,6 +150,8 @@ Let us consider another example that is incorrect and where the conventions allo In this case, since a copy_value [take] can only accept an @owned parameter as an argument, a simple use-def type verifier would throw, preventing an improper transfer of ownership. +### Elimination of Memory Locations from High Level SIL + ### ARC Verifier Once we have endowed use-def edges with ARC semantic properties, we can ensure @@ -201,8 +206,7 @@ at the same scope. Then run the cleanup crew. [2] **NOTE** In many cases P will be the empty set (e.g. the case of a pure reference type) - From a45a8c925dc63462eba98ae135f53524a0924654 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Sat, 13 Aug 2016 17:18:39 -0400 Subject: [PATCH 48/62] More notes. 
--- docs/SemanticARC.md | 134 ++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 129 insertions(+), 5 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 314bc33feb82b..5ca96246147d0 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -1,6 +1,7 @@ # Semantic ARC + **Table of Contents** - [Preface](#preface) @@ -10,13 +11,29 @@ - [RC Identity](#rc-identity) - [New High Level ARC Operations](#new-high-level-arc-operations) - [Endow Use-Def edges with ARC Conventions](#endow-use-def-edges-with-arc-conventions) - - [Elimination of Memory Locations from High Level SIL](#elimination-of-memory-locations-from-high-level-sil) - [ARC Verifier](#arc-verifier) + - [Elimination of Memory Locations from High Level SIL](#elimination-of-memory-locations-from-high-level-sil) - [Semantic ARC Based Optimization](#semantic-arc-based-optimization) - ["The Signature Optimization"](#the-signature-optimization) - ["The Cleanup"](#the-cleanup) + - [Implementation](#implementation) + - [Phase 1. Preliminaries](#phase-1-preliminaries) + - [Parallel Task 1. Introduce new High Level Instructions. Can be done independently.](#parallel-task-1-introduce-new-high-level-instructions-can-be-done-independently) + - [Parallel Task 2. Introduction of RC Identity Verification and RC Identity Sources.](#parallel-task-2-introduction-of-rc-identity-verification-and-rc-identity-sources) + - [Parallel Task 3. Implement use-def list convention and convention verification.](#parallel-task-3-implement-use-def-list-convention-and-convention-verification) + - [Subtask a. Introduction of signatures to all block arguments without verification.](#subtask-a-introduction-of-signatures-to-all-block-arguments-without-verification) + - [Subtask b. Introduce the notion of signatures to use-def lists.](#subtask-b-introduce-the-notion-of-signatures-to-use-def-lists) + - [Subtask c. 
We create a whitelist of instructions with unaudited use-def lists audited instructions and use it to advance incrementally fixing instructions.](#subtask-c-we-create-a-whitelist-of-instructions-with-unaudited-use-def-lists-audited-instructions-and-use-it-to-advance-incrementally-fixing-instructions) + - [Parallel Task 4. Elimination of memory locations from High Level SIL.](#parallel-task-4-elimination-of-memory-locations-from-high-level-sil) + - [Phase 2. ARC Verifier](#phase-2-arc-verifier) + - [Phase 3. Create Uses of Instrastructure](#phase-3-create-uses-of-instrastructure) + - [Parallel Task 1. Optimization: Create Lifetime Joining algorithm.](#parallel-task-1-optimization-create-lifetime-joining-algorithm) + - [Parallel Task 2. Optimization: Extend Function Signature Optimizer -> Owner Signature Optimizer](#parallel-task-2-optimization-extend-function-signature-optimizer---owner-signature-optimizer) + - [Parallel Task 3. Optimization Copy Propagation](#parallel-task-3-optimization-copy-propagation) - [Footnotes](#footnotes) + + ## Preface This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and allow for static verification of ARC semantics in SIL. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC. @@ -77,7 +94,7 @@ Once we are able to reason about RC Identity, the next step in implementing Sema 2. strong_release, release_value will be replaced by a destroy_value instruction with the following semantics: - a. By default a destroy_value will perform a release_value on its input value. After this point, the bit value of the SSA value is undefined and in debugging situations, the SSA value could be given a malloc scribbled payload. + a. By default a destroy_value will perform a release_value on its input value. 
After this point, the bit value of the SSA value is undefined and in debugging situations, the SSA value could be given a malloc scribbled payload. b. A destroy_value with the [noop] flag attached to it does not perform a release_value on its input value but /does/ scribble over the memory in debugging situations. *FIXME [noop] needs a better name*. @@ -150,8 +167,6 @@ Let us consider another example that is incorrect and where the conventions allo In this case, since a copy_value [take] can only accept an @owned parameter as an argument, a simple use-def type verifier would throw, preventing an improper transfer of ownership. -### Elimination of Memory Locations from High Level SIL - ### ARC Verifier Once we have endowed use-def edges with ARC semantic properties, we can ensure @@ -178,6 +193,8 @@ This guarantees via each instruction's interface that each +1 is properly balanced by a -1 and that no +1 is balanced multiple times along any path through the program... that is the program is ARC correct = ). +### Elimination of Memory Locations from High Level SIL + ## Semantic ARC Based Optimization With this data, new and novel forms of optimization are now possible. We present @@ -193,13 +210,120 @@ restrained by polymorphism). Everything else can be specialized as appropriate. Similar to strongly connected components. If one images the world of lifetimes, there is a minimal lifetime starting from a polymorphic function that is open. The reason why this is true is since one can not know everything -that needs to be specialized. Or if from Storage. +that needs to be specialized. Or if from Storage. We can have loading have +conventions. When one proves that there is a dominating lifetime, one changes +storage signature to be +0. ### "The Cleanup" Otherwise, one could perform offsetting retains, releases so that each +1, +1 is at the same scope. Then run the cleanup crew. +## Implementation + +### Phase 1. Preliminaries + +#### Parallel Task 1.
Introduce new High Level Instructions. Can be done independently. + +This one should be simple to do. Could give to Roman. + +#### Parallel Task 2. Introduction of RC Identity Verification and RC Identity Sources. + +Here we basically fix any issues that come up in terms of RC Identity not +propagating correctly. To test this, we make RCIdentityAnalysis use it so +everything just plugs in. + +#### Parallel Task 3. Implement use-def list convention and convention verification. + +Once this task is complete, we know that all use-def lists in the program are +correct. + +##### Subtask a. Introduction of signatures to all block arguments without verification. + +At this point in time, this work will be done on the SILParser/SILPrinter side +and making sure that it serializes properly and everything. + +##### Subtask b. Introduce the notion of signatures to use-def lists. + +Again, this would not be verified. This is where we would not wire anything up +to it. + +##### Subtask c. We create a whitelist of instructions with unaudited use-def lists audited instructions and use it to advance incrementally fixing instructions. + +We visit each instruction. If the instruction is not in the white list, we skip +it. If the instruction is in the white list, we check its value and its +users. If any user is not in the whitelist, we do not check the +connection. Before, we know everything we get far less coverage. Let's just check +if for all instructions... Done! + +#### Parallel Task 4. Elimination of memory locations from High Level SIL. + +Add any missing instructions. Add SIL level address only type. + +**TODO: ADD SIL EXAMPLE HERE** + +#### Phase 2. ARC Verifier + +I implement this. Using this, we fix up each parallel task. I can farm out the +work to the other people to fix up any issues we run into. + +### Phase 3. Create Uses of Instrastructure + +These run in order bottom up. We first join all lifetimes. + +#### Parallel Task 1.
Optimization: Create Lifetime Joining algorithm. + +Then one can create the lifetime joining algorithm. This takes all of the +copy_addr and discovers any that *could* be joined, i.e. have the same parent +value and the copied value has not been written to. In the case of a pointer, +this is always safe to do. + +Could run lifetime joining as a guaranteed pass. That suggests to me a minimal +thing and that the lifetime joining should happen before the copy propagation. + +#### Parallel Task 2. Optimization: Extend Function Signature Optimizer -> Owner Signature Optimizer + +**ROUGH NOTES** + +1. Define an owner signature as (RC Identity, Last Ownership Start Equivalence + Class). Last Ownership Start Equivalence class are the lists of how ownership + changes. Since we have ownership on PHI arguments, life is good, i.e. no node + can ever have a non-rc identity source. + +1. Create a (RC Identity, Last Ownership Start) Graph. This is a list defined by + the equivalence class of regions that forward from a specific rc identity and + ownership definition of foo. +2. Create a graph on these tuples where there is an arrow from (RC Identity, + Last Ownership Start[n]) -> (RC Identity, Last Ownership Start[n+1]) + i.e. where a new value is introduced. +3. If you view each thing as a safe copy, then if Last Ownership Start[n] -> + Last Ownership Start[n+1] has the same convention, one can forward. +4. Loading from storage and writing from storage is signature?! + +Think of new definitions as new start regions and copies as new start +regions. In a way, those are signatures. One could abstract that to ownership +perhaps? + +1. Create Region Signature Graph annotated where you have 2 types of + nodes. Signature nodes and argument nodes. +2. Any Signature node's def, if it has the same def as that signature and there +3. Flip colored graph. + +Refactor function signature optimization to be able to apply to function +arguments and block arguments. 
When visiting a function, start visiting blocks +and fix up their signatures and then fix up function signatures. Do all of this +bottom up. + +Can use Loop Information to reason about loops. + +#### Parallel Task 3. Optimization Copy Propagation + +Color regions of ownership by if it is +1 or not +1. Things that are +not-polymorphic can not cause a retain/release to occur. That would be an +amazing thing to be able to prove. The rule would be that a copy could only come +from a polymorphic unknown type or a load from an internal in that specific +lifetime value. + ## Footnotes [1] Reference Count Identity ("RC Identity") is a concept that is independent of pointer identity that refers to the set of reference counts that would be manipulated by a reference counted operation upon a specific SSA value. For more information see the [RC Identity](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst#rc-identity) section of the [ARC Optimization guide](https://github.com/apple/swift/blob/master/docs/ARCOptimization.rst) From 605a93bdb003e28313e06ca1d38490d202396239 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Sat, 13 Aug 2016 17:19:59 -0400 Subject: [PATCH 49/62] auto --- docs/SemanticARC.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 5ca96246147d0..c1a6610c6b260 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -25,7 +25,7 @@ - [Subtask b. Introduce the notion of signatures to use-def lists.](#subtask-b-introduce-the-notion-of-signatures-to-use-def-lists) - [Subtask c. We create a whitelist of instructions with unaudited use-def lists audited instructions and use it to advance incrementally fixing instructions.](#subtask-c-we-create-a-whitelist-of-instructions-with-unaudited-use-def-lists-audited-instructions-and-use-it-to-advance-incrementally-fixing-instructions) - [Parallel Task 4. 
Elimination of memory locations from High Level SIL.](#parallel-task-4-elimination-of-memory-locations-from-high-level-sil) - - [Phase 2. ARC Verifier](#phase-2-arc-verifier) + - [Phase 2. ARC Verifier](#phase-2-arc-verifier) - [Phase 3. Create Uses of Instrastructure](#phase-3-create-uses-of-instrastructure) - [Parallel Task 1. Optimization: Create Lifetime Joining algorithm.](#parallel-task-1-optimization-create-lifetime-joining-algorithm) - [Parallel Task 2. Optimization: Extend Function Signature Optimizer -> Owner Signature Optimizer](#parallel-task-2-optimization-extend-function-signature-optimizer---owner-signature-optimizer) @@ -262,7 +262,7 @@ Add any missing instructions. Add SIL level address only type. **TODO: ADD SIL EXAMPLE HERE** -#### Phase 2. ARC Verifier +### Phase 2. ARC Verifier I implement this. Using this, we fix up each parallel task. I can farm out the work to the other people to fix up any issues we run into. From 33405da79b17dd60d9d496b4310706509d7319c9 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Sat, 13 Aug 2016 17:35:19 -0400 Subject: [PATCH 50/62] Small functions. --- docs/SemanticARC.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index c1a6610c6b260..bc8444a066492 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -262,16 +262,16 @@ Add any missing instructions. Add SIL level address only type. **TODO: ADD SIL EXAMPLE HERE** -### Phase 2. ARC Verifier +### Phase 2. Create Uses of Instrastructure -I implement this. Using this, we fix up each parallel task. I can farm out the -work to the other people to fix up any issues we run into. +These run in order bottom up. -### Phase 3. Create Uses of Instrastructure +#### Parallel Task 1. Create Lifetime Verification algorithm. -These run in order bottom up. We first join all lifetimes. +The way this works is that we create an analysis of "verified" good +instructions. Then they all go away. 
-#### Parallel Task 1. Optimization: Create Lifetime Joining algorithm. +#### Parallel Task 2. Optimization: Create Lifetime Joining algorithm. Then one can create the lifetime joining algorithm. This takes all of the copy_addr and discovers any that *could* be joined, i.e. have the same parent @@ -281,7 +281,7 @@ this is always safe to do. Could run lifetime joining as a guaranteed pass. That suggests to me a minimal thing and that the lifetime joining should happen before the copy propagation. -#### Parallel Task 2. Optimization: Extend Function Signature Optimizer -> Owner Signature Optimizer +#### Parallel Task 3. Optimization: Extend Function Signature Optimizer -> Owner Signature Optimizer **ROUGH NOTES** @@ -316,7 +316,7 @@ bottom up. Can use Loop Information to reason about loops. -#### Parallel Task 3. Optimization Copy Propagation +#### Parallel Task 4. Optimization Copy Propagation Color regions of ownership by if it is +1 or not +1. Things that are not-polymorphic can not cause a retain/release to occur. That would be an From c2132e6fc325871bdababe95288d7fda1e3403c8 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Sat, 13 Aug 2016 17:37:44 -0400 Subject: [PATCH 51/62] auto --- docs/SemanticARC.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index bc8444a066492..8f850608028b5 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -1,6 +1,3 @@ - -# Semantic ARC - **Table of Contents** @@ -25,15 +22,18 @@ - [Subtask b. Introduce the notion of signatures to use-def lists.](#subtask-b-introduce-the-notion-of-signatures-to-use-def-lists) - [Subtask c. We create a whitelist of instructions with unaudited use-def lists audited instructions and use it to advance incrementally fixing instructions.](#subtask-c-we-create-a-whitelist-of-instructions-with-unaudited-use-def-lists-audited-instructions-and-use-it-to-advance-incrementally-fixing-instructions) - [Parallel Task 4. 
Elimination of memory locations from High Level SIL.](#parallel-task-4-elimination-of-memory-locations-from-high-level-sil) - - [Phase 2. ARC Verifier](#phase-2-arc-verifier) - - [Phase 3. Create Uses of Instrastructure](#phase-3-create-uses-of-instrastructure) - - [Parallel Task 1. Optimization: Create Lifetime Joining algorithm.](#parallel-task-1-optimization-create-lifetime-joining-algorithm) - - [Parallel Task 2. Optimization: Extend Function Signature Optimizer -> Owner Signature Optimizer](#parallel-task-2-optimization-extend-function-signature-optimizer---owner-signature-optimizer) - - [Parallel Task 3. Optimization Copy Propagation](#parallel-task-3-optimization-copy-propagation) + - [Phase 2. Create Uses of Instrastructure](#phase-2-create-uses-of-instrastructure) + - [Parallel Task 1. Create Lifetime Verification algorithm.](#parallel-task-1-create-lifetime-verification-algorithm) + - [Parallel Task 2. Optimization: Create Lifetime Joining algorithm.](#parallel-task-2-optimization-create-lifetime-joining-algorithm) + - [Parallel Task 3. Optimization: Extend Function Signature Optimizer -> Owner Signature Optimizer](#parallel-task-3-optimization-extend-function-signature-optimizer---owner-signature-optimizer) + - [Parallel Task 4. Optimization Copy Propagation](#parallel-task-4-optimization-copy-propagation) - [Footnotes](#footnotes) +# Semantic ARC + + ## Preface This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and allow for static verification of ARC semantics in SIL. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC. From 99cefa99066e5af9f6cee4eecf1b5e5fd70e11c5 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Mon, 15 Aug 2016 12:19:43 -0400 Subject: [PATCH 52/62] Fix simple example. 
--- docs/SemanticARC.md | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 8f850608028b5..c32458a036a25 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -142,11 +142,18 @@ and the following SIL: sil @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) sil @foo : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { - bb0(%0 : @guaranteed Builtin.NativeObject): - %1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding - %2 = copy_value %1 : $@guaranteed Foo # This converts %1 from @guaranteed -> @owned - %3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) - %4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed + bb0(%0 : @guaranteed $Builtin.NativeObject): + %1 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) + + # This is forwarding, so it is @guaranteed since %0 is guaranteed. + %2 = struct $Foo(%0 : $Builtin.NativeObject, %0 : $Builtin.NativeObject) + + # Then we use copy_value to convert our @guaranteed def to an @owned use. copy_value is a converter instruction that converts + # any @owned or @guaranteed def to an @owned def. + %3 = copy_value %2 : $Foo + + # %1 is called since we pass in the @guaranteed def, @owned def to the @guaranteed, @owned uses. + %4 = apply %1(%2, %3) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed destroy_value %4 : $@owned Builtin.NativeObject %5 = tuple() return %5 : $() @@ -155,17 +162,17 @@ and the following SIL: Let us consider another example that is incorrect and where the conventions allow for optimizer or frontend error to be caught easily. 
Consider foo2: sil @foo2 : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () { - bb0(%0 : @guaranteed Builtin.NativeObject): - %1 = struct $Foo(%0 : $@guaranteed Builtin.NativeObject, %0 : $@guaranteed Builtin.NativeObject) # This is forwarding - %2 = copy_value [take] %1 : $@guaranteed Foo # ==> ERROR: Can not take a guaranteed parameter <== - %3 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) - %4 = apply %3(%1, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) # This needs to be consumed - destroy_value %4 : $@owned Builtin.NativeObject - %5 = tuple() - return %5 : $() + bb0(%0 : @guaranteed $Builtin.NativeObject): + %1 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) + %2 = struct $Foo(%0 : $Builtin.NativeObject, %0 : $Builtin.NativeObject) + # ERROR =><= Passed an @guaranteed definition to an @owned use! + %3 = apply %1(%2, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) + destroy_value %3 : $Builtin.NativeObject + %4 = tuple() + return %4 : $() } -In this case, since a copy_value [take] can only accept an @owned parameter as an argument, a simple use-def type verifier would throw, preventing an improper transfer of ownership. +In this case, since the apply's second argument must be @owned, a simple use-def type verifier would throw, preventing an improper transfer of ownership. ### ARC Verifier From fd15ac2d2309430b82495e8ea2a916058eb96d7a Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Mon, 15 Aug 2016 12:39:14 -0400 Subject: [PATCH 53/62] Move toc back into the correct position. 
--- docs/SemanticARC.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index c32458a036a25..3bde9b4695afe 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -1,3 +1,6 @@ + +# Semantic ARC + **Table of Contents** @@ -31,9 +34,6 @@ -# Semantic ARC - - ## Preface This is a proposal for a series of changes to the SIL IR in order to ease the optimization of ARC operations and allow for static verification of ARC semantics in SIL. This is a proposal meant for compiler writers and implementors, not users, i.e. we assume that the reader has a basic familiarity with the basic concepts of ARC. From 4097c3626b161f2ab677e1d53b4486ad0dbec83c Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Mon, 15 Aug 2016 12:39:46 -0400 Subject: [PATCH 54/62] Update SemanticARC.md --- docs/SemanticARC.md | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 3bde9b4695afe..eaa8aa3199235 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -165,14 +165,33 @@ Let us consider another example that is incorrect and where the conventions allo bb0(%0 : @guaranteed $Builtin.NativeObject): %1 = function_ref @UseFoo : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) %2 = struct $Foo(%0 : $Builtin.NativeObject, %0 : $Builtin.NativeObject) - # ERROR =><= Passed an @guaranteed definition to an @owned use! + # ERROR! Passed an @guaranteed definition to an @owned use! %3 = apply %1(%2, %2) : $@convention(thin) (@guaranteed Foo, @owned Foo) -> (@owned Builtin.NativeObject) destroy_value %3 : $Builtin.NativeObject %4 = tuple() return %4 : $() } -In this case, since the apply's second argument must be @owned, a simple use-def type verifier would throw, preventing an improper transfer of ownership. 
+In this case, since the apply's second argument must be @owned, a simple use-def type verifier would throw, preventing an improper transfer of ownership. Now let us consider a simple switch enum statement: + + sil @switch : $@convention(thin) (@guaranteed Optional) -> () { + bb0(%0 : $Optional): + # switch_enum takes in values at +1. + switch_enum %1, bb1: .Some, bb2: .None + + bb1(%payload : $Builtin.NativeObject): + br bb3 + + bb2: + br bb3 + + bb3: + %result = tuple() + return %result : $() + } + +While this may look correct to the naked eye, it is actually incorrect even in SIL today. This is because switch_enum always takes arguments at +1! Yet, in the IR there is no indication of the problem (and this code will compile). Now let us update the IR given semantic ARC: + ### ARC Verifier From 56071454bb5945f6705fae2de3b62414015f5b1c Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Mon, 15 Aug 2016 13:06:54 -0400 Subject: [PATCH 55/62] Update SemanticARC.md --- docs/SemanticARC.md | 79 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 77 insertions(+), 2 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index eaa8aa3199235..68b2c3d686399 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -176,10 +176,10 @@ In this case, since the apply's second argument must be @owned, a simple use-def sil @switch : $@convention(thin) (@guaranteed Optional) -> () { bb0(%0 : $Optional): - # switch_enum takes in values at +1. - switch_enum %1, bb1: .Some, bb2: .None + switch_enum %0, bb1: .Some, bb2: .None bb1(%payload : $Builtin.NativeObject): + destroy_value %payload : $Builtin.NativeObject br bb3 bb2: @@ -192,6 +192,81 @@ In this case, since the apply's second argument must be @owned, a simple use-def While this may look correct to the naked eye, it is actually incorrect even in SIL today. This is because switch_enum always takes arguments at +1! Yet, in the IR there is no indication of the problem (and this code will compile). 
Now let us update the IR given semantic ARC: + sil @switch : $@convention(thin) (@guaranteed Optional) -> () { + bb0(%0 : @guaranteed $Optional): + switch_enum %0, bb1: .Some, bb2: .None + + # ERROR! Passing an @guaranteed def to an @owned use. + bb1(%payload : @owned $Builtin.NativeObject): + destroy_value %payload : $Builtin.NativeObject + br bb3 + + bb2: + br bb3 + + bb3: + %result = tuple() + return %result : $() + } + +A linear checker would automatically catch such an error and even more importantly there are visual cues for the compiler engineer that the switch enum argument needs to be a +1. We can fix this by introducing a copy_value. + + sil @switch : $@convention(thin) (@guaranteed Optional) -> () { + bb0(%0 : @guaranteed $Optional): + # Change %0 from being an @guaranteed def to an @owned def. + %1 = copy_value %0 : $Optional + # Pass in the @owned def into the switch enum's @owned use. + switch_enum %1, bb1: .Some, bb2: .None + + bb1(%payload : @owned $Builtin.NativeObject): + destroy_value %payload : $Builtin.NativeObject + br bb3 + + bb2: + br bb3 + + bb3: + %result = tuple() + return %result : $() + } +Then this will compile in the semantic ARC world. Let us consider how we could convert the @owned switch_enum parameter to be @guaranteed. What does that even mean. Consider the following switch enum example. + + sil @switch : $@convention(thin) (@owned Optional) -> () { + bb0(%0 : @owned $Optional): + switch_enum %0, bb1: .Some, bb2: .None + + bb1(%payload : @owned $Builtin.NativeObject): + br bb3 + + bb2: + br bb3 + + bb3: + destroy_value %0 : $Builtin.NativeObject + %result = tuple() + return %result : $() + } + +This is correct. But what if we want to perform @owned -> @guaranteed on this function.
We do this as follows: + + sil @switch : $@convention(thin) (@guaranteed Optional) -> () { + bb0(%0 : @guaranteed $Optional): + %1 = copy_value %0 : $Optional + switch_enum %1, bb1: .Some, bb2: .None + + bb1(%payload : @owned $Builtin.NativeObject): + br bb3 + + bb2: + br bb3 + + bb3: + destroy_value %0 : $Builtin.NativeObject + %result = tuple() + return %result : $() + } + +**NOTE** This is only done in coordination with placing a destroy_value in the caller of @switch. ### ARC Verifier From 58d0acd4a058b1abd4e7a79109f7716a21760364 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Mon, 15 Aug 2016 13:07:31 -0400 Subject: [PATCH 56/62] Update SemanticARC.md --- docs/SemanticARC.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 68b2c3d686399..1f6c14bf25383 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -190,7 +190,7 @@ In this case, since the apply's second argument must be @owned, a simple use-def return %result : $() } -While this may look correct to the naked eye, it is actually incorrect even in SIL today. This is because switch_enum always takes arguments at +1! Yet, in the IR there is no indication of the problem (and this code will compile). Now let us update the IR given semantic ARC: +While this may look correct to the naked eye, it is actually incorrect even in SIL today. This is because in SIL today, switch_enum always takes arguments at +1! Yet, in the IR there is no indication of the problem and the code will compile! Now let us update the IR given semantic ARC: sil @switch : $@convention(thin) (@guaranteed Optional) -> () { bb0(%0 : @guaranteed $Optional): From 7917e5a9ddc45720c01fd3aa9ecc224df5bafc62 Mon Sep 17 00:00:00 2001 From: Michael Gottesman Date: Mon, 15 Aug 2016 13:30:51 -0400 Subject: [PATCH 57/62] Add an optimization example. 
--- docs/SemanticARC.md | 90 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 84 insertions(+), 6 deletions(-) diff --git a/docs/SemanticARC.md b/docs/SemanticARC.md index 1f6c14bf25383..7eec2ee232174 100644 --- a/docs/SemanticARC.md +++ b/docs/SemanticARC.md @@ -229,6 +229,7 @@ A linear checker would automatically catch such an error and even more important %result = tuple() return %result : $() } + Then this will compile in the semantic ARC world. Let us consider how we could convert the @owned switch_enum parameter to be @guaranteed. What does that even mean. Consider the following switch enum example. sil @switch : $@convention(thin) (@owned Optional) -> () { @@ -236,37 +237,113 @@ Then this will compile in the semantic ARC world. Let us consider how we could c switch_enum %0, bb1: .Some, bb2: .None bb1(%payload : @owned $Builtin.NativeObject): + destroy_value %payload : $Builtin.NativeObject br bb3 bb2: br bb3 bb3: - destroy_value %0 : $Builtin.NativeObject %result = tuple() return %result : $() } -This is correct. But what if we want to perform @owned -> @guaranteed on this function. We do this as follows: +This is correct. But what if we want to get rid of the destroy_value by performing the @owned -> @guaranteed optimization? We do this by first converting the switch_enum's parameter from @owned to @guaranteed: + + sil @switch : $@convention(thin) (@owned Optional) -> () { + bb0(%0 : @owned $Optional): + # Convert the @owned def to an @guaranteed def. + %1 = guarantee_lifetime %0 : $Optional + # Pass in the @guaranteed optional to the switch. + switch_enum %1, bb1: .Some, bb2: .None + + # NO ERROR! + bb1(%payload : @guaranteed $Builtin.NativeObject): + br bb3 + + bb2: + br bb3 + + bb3: + # End the guaranteed lifetime and convert the object back to @owned. + %2 = destroy_lifetime_guarantee %1 : $Optional + # Destroy the @owned parameter. 
+ destroy_value %2 : $Optional + %result = tuple() + return %result : $() + } + +The key reason to have the guarantee_lifetime/destroy_lifetime_guarantee pair is that it encapsulates, via the use-def list, the region where the lifetime of the object is guaranteed. Once this is done, we then perform the @owned -> @guaranteed optimization [[3]](#footnote-3): sil @switch : $@convention(thin) (@guaranteed Optional) -> () { bb0(%0 : @guaranteed $Optional): + # Convert the @guaranteed argument to an @owned def. %1 = copy_value %0 : $Optional - switch_enum %1, bb1: .Some, bb2: .None + + # Convert the @owned def to an @guaranteed def. + %2 = guarantee_lifetime %1 : $Optional - bb1(%payload : @owned $Builtin.NativeObject): + # Pass in the @guaranteed optional to the switch. + switch_enum %2, bb1: .Some, bb2: .None + + bb1(%payload : @guaranteed $Builtin.NativeObject): + br bb3 + + bb2: + br bb3 + + bb3: + # End the guaranteed lifetime and convert the object back to @owned. + %3 = destroy_lifetime_guarantee %2 : $Optional + # Destroy the @owned parameter. + destroy_value %3 : $Optional + %result = tuple() + return %result : $() + } + +Once this has been done, we can then optimize via use-def lists by noticing that the @owned parameter that we are converting to guaranteed was originally @guaranteed. In such a case, the copy is not necessary. Thus we can rewrite the uses of %2 to refer to %0, rewrite the destroy_value in bb3 to refer to %1, and eliminate the lifetime guarantee instructions, i.e.: + + sil @switch : $@convention(thin) (@guaranteed Optional) -> () { + bb0(%0 : @guaranteed $Optional): + # Convert the @guaranteed argument to an @owned def. + %1 = copy_value %0 : $Optional + + # Pass in the @guaranteed optional to the switch. + switch_enum %0, bb1: .Some, bb2: .None + + bb1(%payload : @guaranteed $Builtin.NativeObject): + br bb3 + + bb2: + br bb3 + + bb3: + # Destroy the @owned parameter. 
+ destroy_value %1 : $Optional + %result = tuple() + return %result : $() + } + +Then we have a dead copy of a value that can thus be eliminated, yielding the following fully optimized function: + + sil @switch : $@convention(thin) (@guaranteed Optional) -> () { + bb0(%0 : @guaranteed $Optional): + + # Pass in the @guaranteed optional to the switch. + switch_enum %0, bb1: .Some, bb2: .None + + bb1(%payload : @guaranteed $Builtin.NativeObject): br bb3 bb2: br bb3 bb3: - destroy_value %0 : $Builtin.NativeObject %result = tuple() return %result : $() } -**NOTE** This is only done in coordination with placing a destroy_value in the caller of @switch. +Beautiful. ### ARC Verifier @@ -431,6 +508,7 @@ lifetime value. [2] **NOTE** In many cases P will be the empty set (e.g. the case of a pure reference type) +[3] **NOTE** This operation is only done in coordination with inserting a destroy_value into callers of @switch.
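The use-def ownership check applied to the switch_enum examples above can be sketched outside the compiler. The Python below is only an illustration under stated assumptions, not the proposed SIL verifier: the names `Use`, `check_ownership`, and the `ANY` convention are hypothetical, the check is flow-insensitive (it ignores control flow between basic blocks), and it assumes that copy_value accepts an operand of either convention.

```python
from dataclasses import dataclass
from typing import Dict, List

OWNED = "owned"            # the use consumes a +1 value
GUARANTEED = "guaranteed"  # the use borrows a value kept alive by someone else
ANY = "any"                # hypothetical: the use accepts either convention


@dataclass(frozen=True)
class Use:
    user: str      # instruction performing the use, e.g. "switch_enum"
    operand: str   # SSA name of the value being used, e.g. "%0"
    expects: str   # OWNED, GUARANTEED, or ANY


def check_ownership(defs: Dict[str, str], uses: List[Use]) -> List[str]:
    """Return one diagnostic per ownership violation (empty list if none)."""
    errors = []
    consumed = set()
    for use in uses:
        have = defs.get(use.operand)
        if have is None:
            errors.append(f"{use.user}: {use.operand} is not defined")
        elif use.expects != ANY and have != use.expects:
            errors.append(
                f"{use.user}: {use.operand} is @{have} "
                f"but the use requires @{use.expects}"
            )
        elif use.expects == OWNED:
            # Linearity: an @owned value may be consumed at most once.
            if use.operand in consumed:
                errors.append(
                    f"{use.user}: {use.operand} is consumed more than once"
                )
            consumed.add(use.operand)
    # Every @owned def must be consumed by some use, otherwise it leaks.
    for name, convention in defs.items():
        if convention == OWNED and name not in consumed:
            errors.append(f"{name} is @owned but never consumed")
    return errors


# Rejected example: the @guaranteed argument %0 feeds switch_enum's @owned use.
broken = check_ownership(
    {"%0": GUARANTEED, "%payload": OWNED},
    [Use("switch_enum", "%0", OWNED), Use("destroy_value", "%payload", OWNED)],
)

# Repaired example: copy_value produces an @owned %1 for switch_enum to consume.
fixed = check_ownership(
    {"%0": GUARANTEED, "%1": OWNED, "%payload": OWNED},
    [
        Use("copy_value", "%0", ANY),
        Use("switch_enum", "%1", OWNED),
        Use("destroy_value", "%payload", OWNED),
    ],
)

print(broken)
print(fixed)
```

In this sketch the first function is rejected with a single diagnostic on switch_enum, while the version with the copy_value passes. A real verifier would have to reason per control-flow path rather than counting uses globally.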