Commit 6a03b2c

Merge branch 'main' into uts18_testcases
2 parents d4744ad + 9ccde19 commit 6a03b2c

79 files changed: +3361 -1623 lines changed

Documentation/Evolution/RegexLiteralPitch.md

Lines changed: 0 additions & 292 deletions
This file was deleted.

Documentation/Evolution/RegexLiterals.md

Lines changed: 389 additions & 0 deletions
Large diffs are not rendered by default.

Documentation/Evolution/RegexTypeOverview.md

Lines changed: 13 additions & 13 deletions
@@ -1,4 +1,3 @@
-
 # Regex Type and Overview

 - Authors: [Michael Ilseman](https://github.com/milseman)
@@ -225,7 +224,7 @@ func processEntry(_ line: String) -> Transaction? {

 The result builder allows for inline failable value construction, which participates in the overall string processing algorithm: returning `nil` signals a local failure and the engine backtracks to try an alternative. This not only relieves the use site from post-processing, it enables new kinds of processing algorithms, allows for search-space pruning, and enhances debuggability.

-Swift regexes describe an unambiguous algorithm, were choice is ordered and effects can be reliably observed. For example, a `print()` statement inside the `TryCapture`'s transform function will run whenever the overall algorithm naturally dictates an attempt should be made. Optimizations can only elide such calls if they can prove it is behavior-preserving (e.g. "pure").
+Swift regexes describe an unambiguous algorithm, where choice is ordered and effects can be reliably observed. For example, a `print()` statement inside the `TryCapture`'s transform function will run whenever the overall algorithm naturally dictates an attempt should be made. Optimizations can only elide such calls if they can prove it is behavior-preserving (e.g. "pure").

 `CustomMatchingRegexComponent`, discussed in [String Processing Algorithms][pitches], allows industrial-strength parsers to be used a regex components. This allows us to drop the overly-permissive pre-parsing step:

@@ -278,14 +277,14 @@ func processEntry(_ line: String) -> Transaction? {
 *Note*: Details on how references work is discussed in [Regex Builders][pitches]. `Regex.Match` supports referring to _all_ captures by position (`match.1`, etc.) whether named or referenced or neither. Due to compiler limitations, result builders do not support forming labeled tuples for named captures.


-### Algorithms, algorithms everywhere
+### Regex-powered algorithms

 Regexes can be used right out of the box with a variety of powerful and convenient algorithms, including trimming, splitting, and finding/replacing all matches within a string.

 These algorithms are discussed in [String Processing Algorithms][pitches].


-### Onward Unicode
+### Unicode handling

 A regex describes an algorithm to be ran over some model of string, and Swift's `String` has a rather unique Unicode-forward model. `Character` is an [extended grapheme cluster](https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries) and equality is determined under [canonical equivalence](https://www.unicode.org/reports/tr15/#Canon_Compat_Equivalence).

@@ -310,12 +309,12 @@ public struct Regex<Output> {
 /// Match a string in its entirety.
 ///
 /// Returns `nil` if no match and throws on abort
-public func matchWhole(_ s: String) throws -> Regex<Output>.Match?
+public func wholeMatch(in s: String) throws -> Regex<Output>.Match?

 /// Match part of the string, starting at the beginning.
 ///
 /// Returns `nil` if no match and throws on abort
-public func matchPrefix(_ s: String) throws -> Regex<Output>.Match?
+public func prefixMatch(in s: String) throws -> Regex<Output>.Match?

 /// Find the first match in a string
 ///
@@ -325,17 +324,17 @@ public struct Regex<Output> {
 /// Match a substring in its entirety.
 ///
 /// Returns `nil` if no match and throws on abort
-public func matchWhole(_ s: Substring) throws -> Regex<Output>.Match?
+public func wholeMatch(in s: Substring) throws -> Regex<Output>.Match?

 /// Match part of the string, starting at the beginning.
 ///
 /// Returns `nil` if no match and throws on abort
-public func matchPrefix(_ s: Substring) throws -> Regex<Output>.Match?
+public func prefixMatch(in s: Substring) throws -> Regex<Output>.Match?

 /// Find the first match in a substring
 ///
 /// Returns `nil` if no match is found and throws on abort
-public func firstMatch(_ s: Substring) throws -> Regex<Output>.Match?
+public func firstMatch(in s: Substring) throws -> Regex<Output>.Match?

 /// The result of matching a regex against a string.
 ///
@@ -344,19 +343,19 @@ public struct Regex<Output> {
 @dynamicMemberLookup
 public struct Match {
 /// The range of the overall match
-public let range: Range<String.Index>
+public var range: Range<String.Index> { get }

 /// The produced output from the match operation
-public var output: Output
+public var output: Output { get }

 /// Lookup a capture by name or number
-public subscript<T>(dynamicMember keyPath: KeyPath<Output, T>) -> T
+public subscript<T>(dynamicMember keyPath: KeyPath<Output, T>) -> T { get }

 /// Lookup a capture by number
 @_disfavoredOverload
 public subscript(
 dynamicMember keyPath: KeyPath<(Output, _doNotUse: ()), Output>
-) -> Output
+) -> Output { get }
 // Note: this allows `.0` when `Match` is not a tuple.

 }
@@ -482,6 +481,7 @@ We're also looking for more community discussion on what the default type system

 The actual `Match` struct just stores ranges: the `Substrings` are lazily created on demand. This avoids unnecessary ARC traffic and memory usage.

+
 ### `Regex<Match, Captures>` instead of `Regex<Output>`

 The generic parameter `Output` is proposed to contain both the whole match (the `.0` element if `Output` is a tuple) and captures. One alternative we have considered is separating `Output` into the entire match and the captures, i.e. `Regex<Match, Captures>`, and using `Void` for for `Captures` when there are no captures.
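
The renames in this file (`matchWhole(_:)` to `wholeMatch(in:)`, `matchPrefix(_:)` to `prefixMatch(in:)`, and the added `in:` label on `firstMatch`) can be illustrated with a minimal sketch. It assumes a Swift 5.7 or later toolchain where `Regex` is available in the standard library; the pattern and input strings are invented for illustration and do not come from the commit.

```swift
// Illustrative pattern: one or more ASCII digits, built at run time.
let digits = try Regex("[0-9]+")
let input = "123abc456"

// wholeMatch(in:) succeeds only if the regex consumes the entire input.
let whole = try digits.wholeMatch(in: input)

// prefixMatch(in:) anchors at the start but may stop before the end.
let prefix = try digits.prefixMatch(in: input)

// firstMatch(in:) searches for the first position where the regex matches.
let first = try digits.firstMatch(in: "abc456")

print(whole == nil)                              // true
print(prefix.map { String(input[$0.range]) } ?? "none")   // 123
print(first.map { String("abc456"[$0.range]) } ?? "none") // 456
```

All three calls return an optional `Regex<Output>.Match`, so a failed match is `nil` and only an aborted match throws.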

Documentation/Evolution/StringProcessingAlgorithms.md

Lines changed: 1 addition & 1 deletion
@@ -187,7 +187,7 @@ public protocol CustomMatchingRegexComponent : RegexComponent {
 _ input: String,
 startingAt index: String.Index,
 in bounds: Range<String.Index>
-) -> (upperBound: String.Index, match: Match)?
+) throws -> (upperBound: String.Index, match: Match)?
 }
 ```

Package.swift

Lines changed: 40 additions & 21 deletions
@@ -3,6 +3,13 @@

 import PackageDescription

+let availabilityDefinition = PackageDescription.SwiftSetting.unsafeFlags([
+"-Xfrontend",
+"-define-availability",
+"-Xfrontend",
+#"SwiftStdlib 5.7:macOS 9999, iOS 9999, watchOS 9999, tvOS 9999"#,
+])
+
 let package = Package(
 name: "swift-experimental-string-processing",
 products: [
@@ -22,7 +29,6 @@ let package = Package(
 ],
 dependencies: [
 .package(url: "https://github.com/apple/swift-argument-parser", from: "1.0.0"),
-.package(url: "https://github.com/apple/swift-docc-plugin", from: "1.0.0"),
 ],
 targets: [
 // Targets are the basic building blocks of a package. A target can define a module or a test suite.
@@ -31,12 +37,14 @@ let package = Package(
 name: "_RegexParser",
 dependencies: [],
 swiftSettings: [
-.unsafeFlags(["-enable-library-evolution"])
+.unsafeFlags(["-enable-library-evolution"]),
+availabilityDefinition
 ]),
 .testTarget(
 name: "MatchingEngineTests",
 dependencies: [
-"_RegexParser", "_StringProcessing"]),
+"_RegexParser", "_StringProcessing"
+]),
 .target(
 name: "_CUnicode",
 dependencies: []),
@@ -45,53 +53,64 @@ let package = Package(
 dependencies: ["_RegexParser", "_CUnicode"],
 swiftSettings: [
 .unsafeFlags(["-enable-library-evolution"]),
+availabilityDefinition
 ]),
 .target(
 name: "RegexBuilder",
 dependencies: ["_StringProcessing", "_RegexParser"],
 swiftSettings: [
 .unsafeFlags(["-enable-library-evolution"]),
-.unsafeFlags(["-Xfrontend", "-enable-experimental-pairwise-build-block"])
+.unsafeFlags(["-Xfrontend", "-enable-experimental-pairwise-build-block"]),
+availabilityDefinition
 ]),
 .testTarget(
 name: "RegexTests",
 dependencies: ["_StringProcessing"],
-swiftSettings: [.unsafeFlags(["-Xfrontend", "-enable-experimental-string-processing"])]
-),
+swiftSettings: [
+.unsafeFlags(["-Xfrontend", "-enable-experimental-string-processing"]),
+.unsafeFlags(["-Xfrontend", "-disable-availability-checking"]),
+]),
 .testTarget(
 name: "RegexBuilderTests",
 dependencies: ["_StringProcessing", "RegexBuilder"],
 swiftSettings: [
-.unsafeFlags(["-Xfrontend", "-enable-experimental-pairwise-build-block"])
+.unsafeFlags(["-Xfrontend", "-enable-experimental-pairwise-build-block"]),
+.unsafeFlags(["-Xfrontend", "-disable-availability-checking"])
 ]),
-.target(
+.testTarget(
 name: "Prototypes",
-dependencies: ["_RegexParser", "_StringProcessing"]),
+dependencies: ["_RegexParser", "_StringProcessing"],
+swiftSettings: [
+.unsafeFlags(["-Xfrontend", "-disable-availability-checking"])
+]),

 // MARK: Scripts
 .executableTarget(
 name: "VariadicsGenerator",
 dependencies: [
-.product(name: "ArgumentParser", package: "swift-argument-parser")
+.product(name: "ArgumentParser", package: "swift-argument-parser")
 ]),
 .executableTarget(
 name: "PatternConverter",
 dependencies: [
-.product(name: "ArgumentParser", package: "swift-argument-parser"),
-"_RegexParser",
-"_StringProcessing"
+.product(name: "ArgumentParser", package: "swift-argument-parser"),
+"_RegexParser",
+"_StringProcessing"
 ]),

 // MARK: Exercises
 .target(
-name: "Exercises",
-dependencies: ["_RegexParser", "Prototypes", "_StringProcessing", "RegexBuilder"],
-swiftSettings: [
-.unsafeFlags(["-Xfrontend", "-enable-experimental-pairwise-build-block"])
-]),
+name: "Exercises",
+dependencies: ["_RegexParser", "_StringProcessing", "RegexBuilder"],
+swiftSettings: [
+.unsafeFlags(["-Xfrontend", "-enable-experimental-pairwise-build-block"]),
+.unsafeFlags(["-Xfrontend", "-disable-availability-checking"])
+]),
 .testTarget(
-name: "ExercisesTests",
-dependencies: ["Exercises"]),
+name: "ExercisesTests",
+dependencies: ["Exercises"],
+swiftSettings: [
+.unsafeFlags(["-Xfrontend", "-disable-availability-checking"])
+])
 ]
 )
-
README.md

Lines changed: 15 additions & 0 deletions
@@ -10,6 +10,21 @@ See [Declarative String Processing Overview][decl-string]

 - [Swift Trunk Development Snapshot](https://www.swift.org/download/#snapshots) DEVELOPMENT-SNAPSHOT-2022-03-09 or later.

+## Trying it out
+
+To try out the functionality provided here, download the latest open source development toolchain. Import `_StringProcessing` in your source file to get access to the API and specify `-Xfrontend -enable-experimental-string-processing` to get access to the literals.
+
+For example, in a `Package.swift` file's target declaration:
+
+```swift
+.target(
+name: "foo",
+dependencies: ["depA"],
+swiftSettings: [.unsafeFlags(["-Xfrontend", "-enable-experimental-string-processing"])]
+),
+```
+
+
 ## Integration with Swift

 `_RegexParser` and `_StringProcessing` are specially integrated modules that are built as part of apple/swift.

Sources/Exercises/Exercises.swift

Lines changed: 0 additions & 1 deletion
@@ -16,7 +16,6 @@ public enum Exercises {
 HandWrittenParticipant.self,
 RegexDSLParticipant.self,
 RegexLiteralParticipant.self,
-PEGParticipant.self,
 NSREParticipant.self,
 ]
 }

Sources/Exercises/Participants/PEGParticipant.swift

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,9 @@
 //
 //===----------------------------------------------------------------------===//

+// Disabled because Prototypes is a test target.
+#if false
+
 struct PEGParticipant: Participant {
 static var name: String { "PEG" }
 }
@@ -51,3 +54,4 @@ private func graphemeBreakPropertyData(forLine line: String) -> GraphemeBreakEnt

 }

+#endif

Sources/Exercises/Participants/RegexParticipant.swift

Lines changed: 3 additions & 3 deletions
@@ -62,8 +62,8 @@ private func extractFromCaptures(
 private func graphemeBreakPropertyData<RP: RegexComponent>(
 forLine line: String,
 using regex: RP
-) -> GraphemeBreakEntry? where RP.Output == (Substring, Substring, Substring?, Substring) {
-line.matchWhole(regex).map(\.output).flatMap(extractFromCaptures)
+) -> GraphemeBreakEntry? where RP.RegexOutput == (Substring, Substring, Substring?, Substring) {
+line.wholeMatch(of: regex).map(\.output).flatMap(extractFromCaptures)
 }

 private func graphemeBreakPropertyDataLiteral(
@@ -80,7 +80,7 @@ private func graphemeBreakPropertyDataLiteral(
 private func graphemeBreakPropertyData(
 forLine line: String
 ) -> GraphemeBreakEntry? {
-line.matchWhole {
+line.wholeMatch {
 TryCapture(OneOrMore(.hexDigit)) { Unicode.Scalar(hex: $0) }
 Optionally {
 ".."

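The builder-closure form of `wholeMatch` used in RegexParticipant.swift can be sketched in a self-contained way. This assumes a Swift 5.7 or later toolchain with the `RegexBuilder` module; the input line is invented, and `Int($0, radix: 16)` stands in for the repository's own `Unicode.Scalar(hex:)` helper, which is not shown in this diff.

```swift
import RegexBuilder

// Illustrative input shaped like a hex range; not taken from the commit.
let line = "0041..005A"

// Each TryCapture converts its hex digits to an Int. If a transform returned
// nil, that capture would fail locally and the engine would backtrack.
let match = line.wholeMatch {
  TryCapture(OneOrMore(.hexDigit)) { Int($0, radix: 16) }
  ".."
  TryCapture(OneOrMore(.hexDigit)) { Int($0, radix: 16) }
}

if let match = match {
  // Captures are addressed by position on the match.
  print(match.1, match.2)  // 65 90
}
```

Because the transforms run during matching, the captures arrive already typed as `Int`, so the call site needs no post-processing step.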
Sources/RegexBuilder/Anchor.swift

Lines changed: 6 additions & 2 deletions
@@ -12,6 +12,7 @@
 import _RegexParser
 @_spi(RegexBuilder) import _StringProcessing

+@available(SwiftStdlib 5.7, *)
 public struct Anchor {
 internal enum Kind {
 case startOfSubject
@@ -28,6 +29,7 @@ public struct Anchor {
 var isInverted: Bool = false
 }

+@available(SwiftStdlib 5.7, *)
 extension Anchor: RegexComponent {
 var astAssertion: AST.Atom.AssertionKind {
 if !isInverted {
@@ -62,6 +64,7 @@ extension Anchor: RegexComponent {

 // MARK: - Public API

+@available(SwiftStdlib 5.7, *)
 extension Anchor {
 public static var startOfSubject: Anchor {
 Anchor(kind: .startOfSubject)
@@ -107,6 +110,7 @@ extension Anchor {
 }
 }

+@available(SwiftStdlib 5.7, *)
 public struct Lookahead<Output>: _BuiltinRegexComponent {
 public var regex: Regex<Output>

@@ -117,15 +121,15 @@ public struct Lookahead<Output>: _BuiltinRegexComponent {
 public init<R: RegexComponent>(
 _ component: R,
 negative: Bool = false
-) where R.Output == Output {
+) where R.RegexOutput == Output {
 self.init(node: .nonCapturingGroup(
 negative ? .negativeLookahead : .lookahead, component.regex.root))
 }

 public init<R: RegexComponent>(
 negative: Bool = false,
 @RegexComponentBuilder _ component: () -> R
-) where R.Output == Output {
+) where R.RegexOutput == Output {
 self.init(node: .nonCapturingGroup(
 negative ? .negativeLookahead : .lookahead, component().regex.root))
 }