From 3c0109b87471bd60b089307b5f5dcb7a3b73ef3d Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Thu, 12 Jan 2023 10:15:47 +0100 Subject: [PATCH 01/28] Inception of a document about Seq --- data/tutorials/ds_05_seq.md | 71 +++++++++++++++++++++++++++++++++++++ 1 file changed, 71 insertions(+) create mode 100644 data/tutorials/ds_05_seq.md diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md new file mode 100644 index 0000000000..7da4746ce3 --- /dev/null +++ b/data/tutorials/ds_05_seq.md @@ -0,0 +1,71 @@ +--- +id: Sequences +title: Sequences +description: > + Learn about one of OCaml's must used, built-in data types +category: "data-structures" +date: 2023-01-12T09:00:00-01:00 +--- + +# Sequences + +A sequence looks a lot like a list. However from a pragmatic perspective, one +should imagine it may be infinite. One way to look at a value of type `'a Seq.t` +is to consider it as an icicle, a frozen stream of data. To understand this +analogy, consider how sequences are defined in the standard libary: +```ocaml +type 'a node = + | Nil + | Cons of 'a * 'a t +and 'a t = unit -> 'a node +``` +This is the mutually recursive definition of two types; `Seq.node` which is almost +the same as `list`: +```ocaml +type 'a list = + | [] + | (::) of 'a * 'a list +``` +and `Seq.t` which is merely a type alias for `unit -> 'a Seq.node`. The whole +point of this definition is the type of second argument of `Seq.Cons`, in `list` +it is a list while in `Seq.t` it is function. Empty lists and empty sequence are +defined the same way (`Seq.Nil` and `[]`). Non-empty lists are non-empty +sequences values are both pairs those former member is a piece of data. But non +empty sequence values have a sequence returning function as latter member +instead of a list. That function is the frozen part of the sequence. When a +non-empty sequence is processed, access to data at the tip of the sequence is +immediate, but access to the rest of the sequence is deferred. 
To access the +tail of non-empty sequence, it has to be microwaved, that is, the tail returning +function must be passed a `unit` value. + +Having frozen-by-function tails explains why sequences should be considered +potentially infinite. Unless a `Seq.Nil` has been found in the sequence, one +can't say for sure if some will ever appear. The tail could be a stream of +client requests in a server, readings from an embedded sensor or logs. All have +unforseenable termination and should be considered infinite. + +Here is how to build seemingly infinite sequences of integers. +```ocaml +# let rec ints_from n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1));; +val ints_from : int -> int Seq.t = +# let ints = ints_from 0;; +val ints : ints Seq.t = +``` +The function `ints_from n` looks as if building the infinite sequence $(n; n + +1; n + 2; n + 3;...)$ while the value `ints` looks as if representing the +infinite sequence $(0; 1; 2; 3; ...)$. In reality, since there isn't an infinite +amount of distinct values of type `int`, those sequences are not increasing, +when reaching `max_int` the values will circle down to `min_int`, actually they +are ultimately periodic. + +The OCaml standard library contains a module on sequences called `Seq`. It contains an `Seq.iter` function, which has the same behaviour as `List.iter`. Writting this +```ocaml +# Seq.iter print_int ints;; +``` +in an OCaml toplevel actually means: “print integers forever” and you have to +type `Crtl-C` to interrupt the execution. Perhaps more interestingly, the +following code is an infinite loop: +```ocaml +# Seq.iter ignore ints;; +``` +But the key point is: it doesn't leak memory. 
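The "unfreezing" described in this patch can be tried by hand. The following is a minimal sketch, not part of the patch: `first_n` is a hypothetical helper (it is not a `Seq` function) that passes `()` to the sequence at most `k` times and collects the heads it finds. It assumes only the `Seq.Nil` and `Seq.Cons` constructors.

```ocaml
(* Same definition as in the patch above. *)
let rec ints_from n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1))

(* [first_n k seq] is a hypothetical helper (not part of Seq): it forces
   at most [k] nodes of [seq] and returns their heads as a list. *)
let rec first_n k (seq : 'a Seq.t) : 'a list =
  if k <= 0 then []
  else
    match seq () with                 (* passing () unfreezes one node *)
    | Seq.Nil -> []
    | Seq.Cons (x, tail) -> x :: first_n (k - 1) tail

let () =
  (* Only five nodes are ever computed, so this terminates. *)
  first_n 5 (ints_from 0) |> List.iter (Printf.printf "%d ");
  print_newline ()
```

Because the tail is only computed when `()` is supplied, `ints_from 0` returns immediately even though it describes an unbounded sequence.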
From 7191b2c35649d9383b9091861a39c624a5b17271 Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Thu, 12 Jan 2023 20:56:38 +0100 Subject: [PATCH 02/28] Apply suggestions from code review Co-authored-by: Christine Rose --- data/tutorials/ds_05_seq.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 7da4746ce3..f82f05f232 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -19,50 +19,50 @@ type 'a node = | Cons of 'a * 'a t and 'a t = unit -> 'a node ``` -This is the mutually recursive definition of two types; `Seq.node` which is almost +This is the mutually recursive definition of two types: `Seq.node`, which is almost the same as `list`: ```ocaml type 'a list = | [] | (::) of 'a * 'a list ``` -and `Seq.t` which is merely a type alias for `unit -> 'a Seq.node`. The whole -point of this definition is the type of second argument of `Seq.Cons`, in `list` -it is a list while in `Seq.t` it is function. Empty lists and empty sequence are +and `Seq.t`, which is merely a type alias for `unit -> 'a Seq.node`. The whole +point of this definition is `Seq.Cons` second argument type; in `list`; in `list` +it is a list, while in `Seq.t`, it is a function. Empty lists and empty sequence are defined the same way (`Seq.Nil` and `[]`). Non-empty lists are non-empty sequences values are both pairs those former member is a piece of data. But non empty sequence values have a sequence returning function as latter member instead of a list. That function is the frozen part of the sequence. When a non-empty sequence is processed, access to data at the tip of the sequence is immediate, but access to the rest of the sequence is deferred. To access the -tail of non-empty sequence, it has to be microwaved, that is, the tail returning +tail of non-empty sequence, it has to be microwaved, i.e., the tail returning function must be passed a `unit` value. 
Having frozen-by-function tails explains why sequences should be considered potentially infinite. Unless a `Seq.Nil` has been found in the sequence, one can't say for sure if some will ever appear. The tail could be a stream of -client requests in a server, readings from an embedded sensor or logs. All have +client requests in a server, readings from an embedded sensor, or logs. All have unforseenable termination and should be considered infinite. -Here is how to build seemingly infinite sequences of integers. +Here is how to build seemingly infinite sequences of integers: ```ocaml # let rec ints_from n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1));; val ints_from : int -> int Seq.t = # let ints = ints_from 0;; val ints : ints Seq.t = ``` -The function `ints_from n` looks as if building the infinite sequence $(n; n + -1; n + 2; n + 3;...)$ while the value `ints` looks as if representing the -infinite sequence $(0; 1; 2; 3; ...)$. In reality, since there isn't an infinite +The function `ints_from n` looks as if building the infinite sequence `$(n; n + +1; n + 2; n + 3;...)$`, while the value `ints` looks as if representing the +infinite sequence `$(0; 1; 2; 3; ...)$`. In reality, since there isn't an infinite amount of distinct values of type `int`, those sequences are not increasing, -when reaching `max_int` the values will circle down to `min_int`, actually they +when reaching `max_int` the values will circle down to `min_int`. Actually, they are ultimately periodic. -The OCaml standard library contains a module on sequences called `Seq`. It contains an `Seq.iter` function, which has the same behaviour as `List.iter`. Writting this +The OCaml standard library contains a module on sequences called `Seq`. It contains an `Seq.iter` function, which has the same behaviour as `List.iter`. 
Writing this ```ocaml # Seq.iter print_int ints;; ``` -in an OCaml toplevel actually means: “print integers forever” and you have to +in an OCaml toplevel actually means “print integers forever,” and you have to type `Crtl-C` to interrupt the execution. Perhaps more interestingly, the following code is an infinite loop: ```ocaml From 536da2792999bfd41c127a413fa9094be5502e36 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Tue, 17 Jan 2023 18:32:26 +0100 Subject: [PATCH 03/28] Add sieve of Eratosthenes example --- data/tutorials/ds_05_seq.md | 152 ++++++++++++++++++++++++++++-------- 1 file changed, 118 insertions(+), 34 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index f82f05f232..8ba653ab57 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -2,70 +2,154 @@ id: Sequences title: Sequences description: > - Learn about one of OCaml's must used, built-in data types + Learn about one an OCaml's must used, built-in data types category: "data-structures" date: 2023-01-12T09:00:00-01:00 --- # Sequences -A sequence looks a lot like a list. However from a pragmatic perspective, one -should imagine it may be infinite. One way to look at a value of type `'a Seq.t` -is to consider it as an icicle, a frozen stream of data. To understand this -analogy, consider how sequences are defined in the standard libary: +## Introduction + +Sequences look a lot like lists. However from a pragmatic perspective, one +should imagine they may be infinite. That's the key intuition to understanding +and using sequences. + +One way to look at a value of type `'a Seq.t` is to consider it as an icicle, a +frozen stream of data. 
To understand this analogy, consider how sequences are +defined in the standard library: ```ocaml type 'a node = | Nil | Cons of 'a * 'a t and 'a t = unit -> 'a node ``` -This is the mutually recursive definition of two types: `Seq.node`, which is almost -the same as `list`: +This is the mutually recursive definition of two types; `Seq.node` which is +almost the same as `list`: ```ocaml type 'a list = | [] | (::) of 'a * 'a list ``` -and `Seq.t`, which is merely a type alias for `unit -> 'a Seq.node`. The whole -point of this definition is `Seq.Cons` second argument type; in `list`; in `list` -it is a list, while in `Seq.t`, it is a function. Empty lists and empty sequence are -defined the same way (`Seq.Nil` and `[]`). Non-empty lists are non-empty -sequences values are both pairs those former member is a piece of data. But non -empty sequence values have a sequence returning function as latter member -instead of a list. That function is the frozen part of the sequence. When a -non-empty sequence is processed, access to data at the tip of the sequence is -immediate, but access to the rest of the sequence is deferred. To access the -tail of non-empty sequence, it has to be microwaved, i.e., the tail returning -function must be passed a `unit` value. - -Having frozen-by-function tails explains why sequences should be considered +and `Seq.t` which is merely a type alias for `unit -> 'a Seq.node`. The whole +point of this definition is the type of the second argument of `Seq.Cons`, which +is a function returning a sequence while its `list` sibling is a list. Let's +compare the constructors of `list` and `Seq.node`: +1. Empty lists and sequences are defined the same way, a constructor without any + parameter: `Seq.Nil` and `[]`. +1. Non-empty lists and sequences are both pairs whose former member is a piece + of data; +1. but the latter member, in lists, is a `list` too, while in sequences, it is a + function returning a `Seq.node`. 
+ +A value of type `Seq.t` is “frozen” because the data it contains isn't +immediately available, a `unit` value has to be supplied to recover it, and +that's “unfreezing”. However, unfreezing only gives access to the tip of the +icicle, since the second argument of `Seq.Cons` is a function too. + +Having frozen-by-function tails explains why sequences may be considered potentially infinite. Unless a `Seq.Nil` has been found in the sequence, one -can't say for sure if some will ever appear. The tail could be a stream of -client requests in a server, readings from an embedded sensor, or logs. All have -unforseenable termination and should be considered infinite. +can't say for sure if some will ever appear. The sequence could be a stream of +client requests in a server, readings from an embedded sensor or system logs. +All have unforeseeable termination and it is easier to consider them infinite. Here is how to build seemingly infinite sequences of integers: ```ocaml -# let rec ints_from n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1));; +# let rec ints_from n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1)) + let ints = ints_from 0;; val ints_from : int -> int Seq.t = -# let ints = ints_from 0;; val ints : ints Seq.t = ``` -The function `ints_from n` looks as if building the infinite sequence `$(n; n + -1; n + 2; n + 3;...)$`, while the value `ints` looks as if representing the -infinite sequence `$(0; 1; 2; 3; ...)$`. In reality, since there isn't an infinite +The function `ints_from n` looks as if building the infinite sequence +$(n; n + 1; n + 2; n + 3;...)$ +while the value `ints` look as if representing the +infinite sequence $(0; 1; 2; 3; ...)$. In reality, since there isn't an infinite amount of distinct values of type `int`, those sequences are not increasing, -when reaching `max_int` the values will circle down to `min_int`. Actually, they -are ultimately periodic. +when reaching `max_int` the values will circle down to `min_int`. 
They are +ultimately periodic. -The OCaml standard library contains a module on sequences called `Seq`. It contains an `Seq.iter` function, which has the same behaviour as `List.iter`. Writing this +The OCaml standard library contains a module on sequences called `Seq`. It +contains a `Seq.iter` function, which has the same behaviour as `List.iter`. +Writing this: ```ocaml # Seq.iter print_int ints;; ``` -in an OCaml toplevel actually means “print integers forever,” and you have to -type `Crtl-C` to interrupt the execution. Perhaps more interestingly, the -following code is an infinite loop: +in an OCaml top-level means: “print integers forever” and you have to type +`Crtl-C` to interrupt the execution. Perhaps more interestingly, the following +code is also an infinite loop: ```ocaml # Seq.iter ignore ints;; ``` But the key point is: it doesn't leak memory. + +## Example + +Strangely, the `Seq` module of the OCaml standard library does not (yet) define +a function returning the elements at the beginning of a sequence. Here is a +possible implementation: +```ocaml +let rec take n seq () = match seq () with +| Seq.Cons (x, seq) when n > 0 -> Seq.Cons (x, take (n - 1) seq) +| _ -> Seq.Nil +``` +`take n seq` returns, at most, the `n` first elements of the sequence `seq`. If +`seq` contains less than `n` elements, an identical sequence is returned. In +particular, if `seq` is empty, an empty sequence is returned. + +Observe the first line of `take`, it is the common pattern for recursive +functions over sequences. The last two parameters are: +* a sequence called `seq`; +* a `unit` value. + +When executed, the function begins by unfreezing `seq` (that is, calling `seq +()`) and then pattern match to look inside the data made available. However, +this does not happen unless a `unit` parameter is passed to `take`. Writing +`take 10 seq` does not compute anything, it is a partial application and returns +a function needing a `unit` to produce a result. 
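The claim that `take 10 seq` computes nothing until a `unit` is supplied can be observed with a side effect. This is a sketch under assumptions: `forced` and `counting` are hypothetical names introduced for illustration, and `take` is copied from the patch.

```ocaml
(* The simplified [take] from the patch. *)
let rec take n seq () = match seq () with
  | Seq.Cons (x, seq) when n > 0 -> Seq.Cons (x, take (n - 1) seq)
  | _ -> Seq.Nil

(* [forced] counts how many nodes of [counting] have been unfrozen.
   Both names are hypothetical, used only for this illustration. *)
let forced = ref 0
let rec counting n : int Seq.t = fun () ->
  incr forced;
  Seq.Cons (n, counting (n + 1))

(* Partial application: no node has been forced yet. *)
let firsts = take 3 (counting 0)
let () = assert (!forced = 0)

(* Converting to a list supplies the [unit] values and forces the nodes. *)
let () = assert (List.of_seq firsts = [0; 1; 2])
let () = Printf.printf "nodes forced: %d\n" !forced
```

With this particular `take`, the counter ends at 4 rather than 3, because the function unfreezes `seq` before checking whether `n > 0`. That is harmless here but can matter when forcing a node has a cost or a side effect.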
+ +This can be used to print integers without looping forever as shown previously: +```ocaml +# ints |> take 43 |> List.of_seq;; +- : int list = +[0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; + 22; 23; 24; 25; 26; 27; 28; 29; 30; 31; 32; 33; 34; 35; 36; 37; 38; 39; 40; + 41; 42] +``` + +The `Seq` module has a function `Seq.filter`: +```ocaml +# Seq.filter;; +- : ('a -> bool) -> 'a Seq.t -> 'a Seq.t = +``` +It builds a sequence of elements satisfying a condition. + +Using `Seq.filter`, it is possible to make a straightforward implementation of the +[Sieve of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). +Here it is: +```ocaml +let rec sieve seq () = match seq () with +| Seq.Cons (m, seq) -> Seq.Cons (m, sieve (Seq.filter (fun n -> n mod m > 0) seq)) +| seq -> seq;; +let facts = ints_from 2 |> sieve +``` + +This code can be used to generate lists of prime numbers. For instance, here is +the list of 100 first prime numbers: +```ocaml +# facts |> take 100 |> List.of_seq;; +- : int list = +[2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71; + 73; 79; 83; 89; 97; 101; 103; 107; 109; 113; 127; 131; 137; 139; 149; 151; + 157; 163; 167; 173; 179; 181; 191; 193; 197; 199; 211; 223; 227; 229; 233; + 239; 241; 251; 257; 263; 269; 271; 277; 281; 283; 293; 307; 311; 313; 317; + 331; 337; 347; 349; 353; 359; 367; 373; 379; 383; 389; 397; 401; 409; 419; + 421; 431; 433; 439; 443; 449; 457; 461; 463; 467; 479; 487; 491; 499; 503; + 509; 521; 523] +``` + +The function `sieve` is recursive, in OCaml and common senses: it is defined +using the `rec` keyword and calls itself. However, some call that kind of +function “corecursive”. This word is used to emphasize that, by design, it does +not terminate. Strictly speaking, the sieve of Eratosthenes is not an +algorithm either since it does not terminate. This implementation behaves the +same. 
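The snippets in this patch depend on each other (`sieve` needs `ints_from`, and the final query needs `take`). As a convenience, here is a self-contained sketch that gathers them into one file. It renames `facts` to `primes` for readability, which is an assumption of this sketch and not what the patch uses, and it spells out the `Seq.Nil` case instead of rebinding `seq`.

```ocaml
let rec ints_from n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1))

let rec take n seq () = match seq () with
  | Seq.Cons (x, seq) when n > 0 -> Seq.Cons (x, take (n - 1) seq)
  | _ -> Seq.Nil

(* Keep the head [m], then sieve the tail with the multiples of [m] removed. *)
let rec sieve seq () = match seq () with
  | Seq.Cons (m, seq) ->
      Seq.Cons (m, sieve (Seq.filter (fun n -> n mod m > 0) seq))
  | Seq.Nil -> Seq.Nil

(* Named [primes] here for readability; the patch calls this value [facts]. *)
let primes = ints_from 2 |> sieve

let () =
  (* [take] bounds the otherwise endless sequence. *)
  primes |> take 10 |> Seq.iter (Printf.printf "%d ");
  print_newline ()
```

Defining `primes` does no work at all; the filters are stacked up only as elements are requested, which is why the corecursive definition is usable despite never terminating on its own.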
From 8d42fac829288f442dd9e920d80f4b3bfe2f3149 Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Wed, 18 Jan 2023 08:14:19 +0100 Subject: [PATCH 04/28] Update data/tutorials/ds_05_seq.md Co-authored-by: Christine Rose --- data/tutorials/ds_05_seq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 8ba653ab57..fad4a37d83 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -2,7 +2,7 @@ id: Sequences title: Sequences description: > - Learn about one an OCaml's must used, built-in data types + Learn about an OCaml's most-used, built-in data types category: "data-structures" date: 2023-01-12T09:00:00-01:00 --- From 3f38371aca51503466c4f3d42546847f139e7795 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Wed, 18 Jan 2023 17:45:44 +0100 Subject: [PATCH 05/28] Add unfold and conversion sections --- data/tutorials/ds_05_seq.md | 140 +++++++++++++++++++++++++++++------- 1 file changed, 115 insertions(+), 25 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index fad4a37d83..182277ca33 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -9,6 +9,16 @@ date: 2023-01-12T09:00:00-01:00 # Sequences +## Prerequisites + +| Concept | Status | Documentation | Reference | +|---|---|---|---| +| Basic types | Mandatory | | | +| Functions | Mandatory | | | +| Lists | Mandatory | | | +| Options | Recommended | | | +| Arrays | Nice to have | | | + ## Introduction Sequences look a lot like lists. However from a pragmatic perspective, one @@ -48,49 +58,47 @@ that's “unfreezing”. However, unfreezing only gives access to the tip of the icicle, since the second argument of `Seq.Cons` is a function too. Having frozen-by-function tails explains why sequences may be considered -potentially infinite. Unless a `Seq.Nil` has been found in the sequence, one -can't say for sure if some will ever appear. 
The sequence could be a stream of -client requests in a server, readings from an embedded sensor or system logs. +potentially infinite. Until a `Seq`.Nil` value has been found in the sequence, +one can't say for sure if some will ever appear. The sequence could be a stream +of client requests in a server, readings from an embedded sensor or system logs. All have unforeseeable termination and it is easier to consider them infinite. Here is how to build seemingly infinite sequences of integers: ```ocaml -# let rec ints_from n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1)) - let ints = ints_from 0;; -val ints_from : int -> int Seq.t = -val ints : ints Seq.t = +# let rec ints n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1)) +val ints : int -> int Seq.t = ``` -The function `ints_from n` looks as if building the infinite sequence -$(n; n + 1; n + 2; n + 3;...)$ -while the value `ints` look as if representing the -infinite sequence $(0; 1; 2; 3; ...)$. In reality, since there isn't an infinite +The function `ints n` look as if building the infinite sequence +$(n; n + 1; n + 2; n + 3;...)$. In reality, since there isn't an infinite amount of distinct values of type `int`, those sequences are not increasing, when reaching `max_int` the values will circle down to `min_int`. They are -ultimately periodic. +ultimately periodic. -The OCaml standard library contains a module on sequences called `Seq`. It -contains a `Seq.iter` function, which has the same behaviour as `List.iter`. -Writing this: +The OCaml standard library contains a module on sequences called +[`Seq`](/releases/5.0/api/Seq.html). It contains a `Seq.iter` function, which +has the same behaviour as `List.iter`. Writing this: ```ocaml -# Seq.iter print_int ints;; +# Seq.iter print_int (ints 0);; ``` in an OCaml top-level means: “print integers forever” and you have to type `Crtl-C` to interrupt the execution. 
Perhaps more interestingly, the following code is also an infinite loop: ```ocaml -# Seq.iter ignore ints;; +# Seq.iter ignore (ints 0);; ``` But the key point is: it doesn't leak memory. ## Example -Strangely, the `Seq` module of the OCaml standard library does not (yet) define +The `Seq` module of the OCaml standard library contains + +does not (yet) define a function returning the elements at the beginning of a sequence. Here is a possible implementation: ```ocaml let rec take n seq () = match seq () with -| Seq.Cons (x, seq) when n > 0 -> Seq.Cons (x, take (n - 1) seq) -| _ -> Seq.Nil + | Seq.Cons (x, seq) when n > 0 -> Seq.Cons (x, take (n - 1) seq) + | _ -> Seq.Nil ``` `take n seq` returns, at most, the `n` first elements of the sequence `seq`. If `seq` contains less than `n` elements, an identical sequence is returned. In @@ -109,14 +117,14 @@ a function needing a `unit` to produce a result. This can be used to print integers without looping forever as shown previously: ```ocaml -# ints |> take 43 |> List.of_seq;; +# Seq.ints 0 |> Seq.take 43 |> List.of_seq;; - : int list = [0; 1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22; 23; 24; 25; 26; 27; 28; 29; 30; 31; 32; 33; 34; 35; 36; 37; 38; 39; 40; 41; 42] ``` -The `Seq` module has a function `Seq.filter`: +The `Seq` module also has a function `Seq.filter`: ```ocaml # Seq.filter;; - : ('a -> bool) -> 'a Seq.t -> 'a Seq.t = @@ -128,9 +136,9 @@ Using `Seq.filter`, it is possible to make a straightforward implementation of t Here it is: ```ocaml let rec sieve seq () = match seq () with -| Seq.Cons (m, seq) -> Seq.Cons (m, sieve (Seq.filter (fun n -> n mod m > 0) seq)) -| seq -> seq;; -let facts = ints_from 2 |> sieve + | Seq.Cons (m, seq) -> Seq.Cons (m, sieve (Seq.filter (fun n -> n mod m > 0) seq)) + | seq -> seq +let facts = ints_from 2 |> sieve;; ``` This code can be used to generate lists of prime numbers. For instance, here is @@ -153,3 +161,85 @@ function “corecursive”. 
This word is used to emphasize that, by design, it d not terminate. Strictly speaking, the sieve of Eratosthenes is not an algorithm either since it does not terminate. This implementation behaves the same. + +## Unfolding Sequences + +Standard higher-order iteration functions are available on Sequences. For instance: +* `Seq.iter` +* `Seq.map` +* `Seq.fold_left` + +All those are also available for `Array`, `List` and `Set`. Since OCaml 4.11 +sequences have something which isn't (yet) available on those: `unfold`. Here is +how it is implemented: +```ocaml +let rec unfold f seq () = match f seq with + | None -> Nil + | Some (x, seq) -> Cons (x, unfold f seq) +``` +And here is its type: +```ocaml +val unfold : ('a -> ('b * 'a) option) -> 'a -> 'b Seq.t = +``` +Unlike previously mentioned iterators `Seq.unfold` does not have a sequence +parameter, but a sequence result. `unfold` provides a general means to build +sequences. For instance, `Seq.ints` can be implemented using `Seq.unfold` in a +fairly compact way: +```ocaml +let ints = Seq.unfold (fun n -> Some (n, n + 1));; +``` + +As a fun fact, observe `map` over sequences can be implemented +using `Seq.unfold`. Here is how to write it: +```ocaml +# let map f = Seq.unfold (fun seq -> seq |> Seq.uncons |> Option.map (fun (x, y) -> (f x, y)));; +val map : ('a -> 'b) -> 'a Seq.t -> 'b Seq.t = +``` +Here is a quick check: +```ocaml +# Seq.ints 0 |> map (fun x -> x * x) |> Seq.take 10 |> List.of_seq;; +- : int list = [0; 1; 4; 9; 16; 25; 36; 49; 64; 81] +``` + +Using this function: +```ocaml +let input_line_opt chan = + try Some (input_line chan, chan) + with End_of_file -> close_in chan; None +``` +It is possible to read a file using `Seq.unfold`: +```ocaml +"README.md" |> open_in |> Seq.unfold input_line_opt |> Seq.iter print_endline +``` + +Although this can be an appealing style, bear in mind it does not prevent from +taking care of open files. 
While the code above is fine, this one no longer is: +```ocaml +"README.md" |> open_in |> Seq.unfold input_line_opt |> Seq.take 10 |> Seq.iter print_endline +``` +Here, `close_in` will never be called over the input channel opened on `README.md`. + + +## Sequences for Conversions + +Throughout the standard library, sequences are used as a bridge to perform +conversions between many datatypes. For instance, here are the signatures of +some of those functions: +* Lists + ```ocaml + val List.of_seq : 'a list -> 'a Seq.t + val List.to_seq : 'a Seq.t -> 'a list + ``` +* Arrays + ```ocaml + val Array.of_seq : 'a array -> 'a Seq.t + val Array.to_seq : 'a Seq.t -> 'a array + ``` +* Strings + ```ocaml + val String.of_seq : string -> char Seq.t + val String.to_seq : char Seq.t -> string + ``` +Similar functions are also provided for sets, maps, hash tables (`Hashtbl`) and +others (except `Seq`, obviously). When implementing a datatype module, it is +advised to expose `to_seq` and `of_seq` functions. From eab0ecb768c62fe4973f2c32b1002c70f82c4579 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Fri, 3 Feb 2023 17:25:22 +0100 Subject: [PATCH 06/28] Cleanup --- data/tutorials/ds_05_seq.md | 54 +++++++++++++++++++++++++------------ 1 file changed, 37 insertions(+), 17 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 182277ca33..3aff10e4c5 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -58,21 +58,27 @@ that's “unfreezing”. However, unfreezing only gives access to the tip of the icicle, since the second argument of `Seq.Cons` is a function too. Having frozen-by-function tails explains why sequences may be considered -potentially infinite. Until a `Seq`.Nil` value has been found in the sequence, +potentially infinite. Until a `Seq.Nil` value has been found in the sequence, one can't say for sure if some will ever appear. 
The sequence could be a stream -of client requests in a server, readings from an embedded sensor or system logs. +of incoming requests in a server, readings from an embedded sensor or system logs. All have unforeseeable termination and it is easier to consider them infinite. +In OCaml, any value `a` of type `t` can be turned into a constant function by +writing `fun _ -> a`, which has type `'a -> t`. When writing `fun () -> a` +instead, we get a function of type `unit -> t`. Such a function is called a +[_thunk_](https://en.wikipedia.org/wiki/Thunk). Using this terminology, sequence +values are thunks. With the analogy used earlier, `a` is frozen in its thunk. + Here is how to build seemingly infinite sequences of integers: ```ocaml -# let rec ints n : int Seq.t = fun () -> Seq.Cons (n, ints_from (n + 1)) +# let rec ints n : int Seq.t = fun () -> Seq.Cons (n, ints (n + 1)) val ints : int -> int Seq.t = ``` The function `ints n` look as if building the infinite sequence $(n; n + 1; n + 2; n + 3;...)$. In reality, since there isn't an infinite amount of distinct values of type `int`, those sequences are not increasing, when reaching `max_int` the values will circle down to `min_int`. They are -ultimately periodic. +ultimately periodic. The OCaml standard library contains a module on sequences called [`Seq`](/releases/5.0/api/Seq.html). It contains a `Seq.iter` function, which @@ -90,11 +96,9 @@ But the key point is: it doesn't leak memory. ## Example -The `Seq` module of the OCaml standard library contains - -does not (yet) define -a function returning the elements at the beginning of a sequence. Here is a -possible implementation: +The `Seq` module of the OCaml standard library contains the definition of the +function `Seq.take` which returns a specified number of elements from the +beginning of a sequence. 
Here is a simplified implementation: ```ocaml let rec take n seq () = match seq () with | Seq.Cons (x, seq) when n > 0 -> Seq.Cons (x, take (n - 1) seq) @@ -164,14 +168,16 @@ same. ## Unfolding Sequences -Standard higher-order iteration functions are available on Sequences. For instance: +Standard higher-order iteration functions are available on sequences. For +instance: * `Seq.iter` * `Seq.map` * `Seq.fold_left` -All those are also available for `Array`, `List` and `Set`. Since OCaml 4.11 -sequences have something which isn't (yet) available on those: `unfold`. Here is -how it is implemented: +All those are also available for `Array`, `List` and `Set` and behave +essentially the same. Observe that there is no `fold_right` function. Since +OCaml 4.11 there is something which isn't (yet) available on other types: +`unfold`. Here is how it is implemented: ```ocaml let rec unfold f seq () = match f seq with | None -> Nil @@ -189,10 +195,10 @@ fairly compact way: let ints = Seq.unfold (fun n -> Some (n, n + 1));; ``` -As a fun fact, observe `map` over sequences can be implemented -using `Seq.unfold`. Here is how to write it: +As a fun fact, one should observe `map` over sequences can be implemented using +`Seq.unfold`. Here is how to write it: ```ocaml -# let map f = Seq.unfold (fun seq -> seq |> Seq.uncons |> Option.map (fun (x, y) -> (f x, y)));; +# let map f = Seq.unfold (fun s -> s |> Seq.uncons |> Option.map (fun (x, y) -> (f x, y)));; val map : ('a -> 'b) -> 'a Seq.t -> 'b Seq.t = ``` Here is a quick check: @@ -200,6 +206,7 @@ Here is a quick check: # Seq.ints 0 |> map (fun x -> x * x) |> Seq.take 10 |> List.of_seq;; - : int list = [0; 1; 4; 9; 16; 25; 36; 49; 64; 81] ``` +The function `Seq.uncons` returns the head and tail of a sequence if it is not empty or `None` otherwise. Using this function: ```ocaml @@ -219,7 +226,6 @@ taking care of open files. 
While the code above is fine, this one no longer is: ``` Here, `close_in` will never be called over the input channel opened on `README.md`. - ## Sequences for Conversions Throughout the standard library, sequences are used as a bridge to perform @@ -243,3 +249,17 @@ some of those functions: Similar functions are also provided for sets, maps, hash tables (`Hashtbl`) and others (except `Seq`, obviously). When implementing a datatype module, it is advised to expose `to_seq` and `of_seq` functions. + +## Miscellaneous + +There are a couple of related Libraries, all providing means to handle large +flows of data: + +* Rizo I [Streaming](/p/streaming) +* Gabriel Radanne [Iter](/p/iter) +* Jane Street `Base.Sequence` + +There used to be a module called [`Stream`](/releases/4.13/api/Stream.html) in +the OCaml standard library. It was +[removed](https://github.com/ocaml/ocaml/pull/10482) in 2021 with the release of +OCaml 4.14. Beware books and documentation written before may still mention it. From 1c4c726ed94aaedcfbf38bc0884393f5a559bcd1 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Fri, 3 Feb 2023 17:46:10 +0100 Subject: [PATCH 07/28] Add corecursion link --- data/tutorials/ds_05_seq.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 3aff10e4c5..6a6efc0b0f 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -161,10 +161,10 @@ the list of 100 first prime numbers: The function `sieve` is recursive, in OCaml and common senses: it is defined using the `rec` keyword and calls itself. However, some call that kind of -function “corecursive”. This word is used to emphasize that, by design, it does -not terminate. Strictly speaking, the sieve of Eratosthenes is not an -algorithm either since it does not terminate. This implementation behaves the -same. +function [_corecursive_](https://en.wikipedia.org/wiki/Corecursion). 
This word +is used to emphasize that, by design, it does not terminate. Strictly speaking, +the sieve of Eratosthenes is not an algorithm either since it does not +terminate. This implementation behaves the same. ## Unfolding Sequences From 21deb5f0423026ccc07e43a0fd800e5524f7d76b Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Mon, 6 Feb 2023 08:02:12 +0100 Subject: [PATCH 08/28] Apply suggestions from code review Thanks @dustanddreams, all suggestions merged Co-authored-by: Miod Vallat <118974489+dustanddreams@users.noreply.github.com> --- data/tutorials/ds_05_seq.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 6a6efc0b0f..e53c65e064 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -43,7 +43,7 @@ type 'a list = ``` and `Seq.t` which is merely a type alias for `unit -> 'a Seq.node`. The whole point of this definition is the type of the second argument of `Seq.Cons`, which -is a function returning a sequence while its `list` sibling is a list. Let's +is a function returning a sequence while its `list` counterpart returns a list. Let's compare the constructors of `list` and `Seq.node`: 1. Empty lists and sequences are defined the same way, a constructor without any parameter: `Seq.Nil` and `[]`. @@ -87,7 +87,7 @@ has the same behaviour as `List.iter`. Writing this: # Seq.iter print_int (ints 0);; ``` in an OCaml top-level means: “print integers forever” and you have to type -`Crtl-C` to interrupt the execution. Perhaps more interestingly, the following +`Ctrl-C` to interrupt the execution. Perhaps more interestingly, the following code is also an infinite loop: ```ocaml # Seq.iter ignore (ints 0);; @@ -252,7 +252,7 @@ advised to expose `to_seq` and `of_seq` functions. 
## Miscellaneous -There are a couple of related Libraries, all providing means to handle large +There are a couple of related libraries, all providing means to handle large flows of data: * Rizo I [Streaming](/p/streaming) From f1f21b03a98e0bbd76df06ecbe8c99ff42c4f71d Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Mon, 6 Feb 2023 09:11:35 +0100 Subject: [PATCH 09/28] Add fibs example --- data/tutorials/ds_05_seq.md | 48 ++++++++++++++++++++++++++++++++++++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index e53c65e064..55dfab4f15 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -1,6 +1,6 @@ --- id: Sequences -title: Sequences +title: sequences description: > Learn about an OCaml's most-used, built-in data types category: "data-structures" @@ -226,6 +226,52 @@ taking care of open files. While the code above is fine, this one no longer is: ``` Here, `close_in` will never be called over the input channel opened on `README.md`. +## Sequences are Functions + +Although this looks like a possible way to define the [Fibonacci +sequence](https://en.wikipedia.org/wiki/Fibonacci_number): +```ocaml +# let rec fibs m n = Seq.cons m (fibs n (n + m));; +val fibs : int -> int -> int Seq.t = <fun> +``` +It actually isn't. It's a non-ending recursion which blows away the stack. +``` +# fibs 0 1;; +Stack overflow during evaluation (looping recursion?). +``` +This definition is behaving as expected: +```ocaml +# let rec fibs m n () = Seq.Cons (m, fibs n (n + m));; +val fibs : int -> int -> int Seq.t = <fun> +``` +It can be used to produce some Fibonacci numbers: +```ocaml +# fibs 0 1 |> Seq.take 10 |> List.of_seq;; +- : int list = [0; 1; 1; 2; 3; 5; 8; 13; 21; 34] +``` +Why is it so? The key difference lies in the recursive call `fibs n (n + m)`.
In +the former definition, the application is complete, `fibs` is provided all the +arguments it expects; in the latter definition, the application is partial, the +`()` argument is missing. Since evaluation is +[eager](https://en.wikipedia.org/wiki/Evaluation_strategy#Eager_evaluation) in +OCaml, in the former case, evaluation of the recursive call is triggered, and +non-terminating looping occurs. In contrast, in the latter case, the partially +applied function is immediately returned as a +[closure](https://en.wikipedia.org/wiki/Closure_(computer_programming)). + +Sequences are functions, as stated by their type: +```ocaml +# #show Seq.t;; +type 'a t = unit -> 'a Seq.node +``` +Functions working with sequences must be written accordingly. +* Sequence consumer: partially applied function parameter +* Sequence producer: partially applied function result + +When code dealing with sequences does not behave as expected, in particular, if +it is crashing or hanging, there's a fair chance a mistake like in the first +definition of `fibs` was made. 
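The producer and consumer rule above can be sketched as follows. This is only an illustration under stated assumptions: the names `countdown`, `sum_prefix` and `fibs_thunked` are made up for this sketch and are not part of the `Seq` module.

```ocaml
(* Producer (hypothetical name): the trailing [()] parameter delays the
   recursive call, so [countdown n] is a partial application that returns
   a closure immediately instead of recursing. *)
let rec countdown n () =
  if n < 0 then Seq.Nil else Seq.Cons (n, countdown (n - 1))

(* Consumer (hypothetical name): the sequence parameter is a function, and
   it is only forced with [seq ()] when an element is actually needed. *)
let rec sum_prefix k seq =
  if k <= 0 then 0
  else
    match seq () with
    | Seq.Nil -> 0
    | Seq.Cons (x, rest) -> x + sum_prefix (k - 1) rest

(* One possible repair of the eager definition that keeps [Seq.cons]:
   the explicit [fun () ->] turns the body into a thunk, so the recursive
   call is not evaluated until the sequence is forced. *)
let rec fibs_thunked m n = fun () -> Seq.cons m (fibs_thunked n (n + m)) ()
```

With these definitions, `sum_prefix 6 (fibs_thunked 0 1)` adds up the first six Fibonacci numbers and terminates, whereas the eager `fibs` never returns.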
+ ## Sequences for Conversions Throughout the standard library, sequences are used as a bridge to perform From bc3fc64afc17587470856107abaecb26154ee4f2 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Mon, 6 Feb 2023 09:57:16 +0100 Subject: [PATCH 10/28] Fix Typo --- data/tutorials/ds_05_seq.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 55dfab4f15..8a686cd253 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -1,6 +1,6 @@ --- -id: Sequences -title: sequences +id: sequences +title: Sequences description: > Learn about an OCaml's most-used, built-in data types category: "data-structures" From d6004bbd2a9486f428ed728f2b9ada351ed91f37 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Mon, 6 Feb 2023 18:56:41 +0100 Subject: [PATCH 11/28] Add exercises --- data/problems/diag.md | 30 ++++++++++++++++++ data/problems/stream.md | 61 +++++++++++++++++++++++++++++++++++++ data/tutorials/ds_05_seq.md | 8 ++++- 3 files changed, 98 insertions(+), 1 deletion(-) create mode 100644 data/problems/diag.md create mode 100644 data/problems/stream.md diff --git a/data/problems/diag.md b/data/problems/diag.md new file mode 100644 index 0000000000..8fb4fdec1a --- /dev/null +++ b/data/problems/diag.md @@ -0,0 +1,30 @@ +--- +title: Diagonal of a Sequence of Sequences +number: "100" +difficulty: intermediate +tags: [ "seq" ] +--- + +# Solution + +```ocaml +let rec diag seq_seq () = + let hds, tls = Seq.filter_map Seq.uncons seq_seq |> Seq.split in + let hd, tl = Seq.uncons hds |> Option.map fst, Seq.uncons tls |> Option.map snd in + let d = Option.fold ~none:Seq.empty ~some:diag tl in + Option.fold ~none:Fun.id ~some:Seq.cons hd d () +``` + +# Statement + +Write a function `diag : 'a Seq.t Seq.t -> 'a Seq` that returns the _diagonal_ +of a sequence of sequences. 
The returned sequence is formed the following way: +The first element of the returned sequence is the first element of the first +sequence; the second element of the returned sequence is the second element of +the second sequence; the third element of the returned sequence is the third +element of the third sequence; and so on. + + + by taking the first element of the +first sequence, the second element of the second second + diff --git a/data/problems/stream.md b/data/problems/stream.md new file mode 100644 index 0000000000..0fa3610e82 --- /dev/null +++ b/data/problems/stream.md @@ -0,0 +1,61 @@ +--- +title: Never-Ending Sequences +number: "101" +difficulty: beginner +tags: [ "seq" ] +--- + +# Solution + +```ocaml +type 'a cons = Cons of 'a * 'a stream +and 'a stream = unit -> 'a cons + +let hd (seq : 'a stream) = let (Cons (x, _)) = seq () in x +let tl (seq : 'a stream) = let (Cons (_, seq)) = seq () in seq +let rec take n seq = if n = 0 then [] else let (Cons (x, seq)) = seq () in x :: take (n - 1) seq +let rec unfold f x () = let (y, x) = f x in Cons (y, unfold f x) +let bang x = unfold (fun x -> (x, x)) x +let ints x = unfold (fun x -> (x, x + 1)) x +let rec map f seq () = let (Cons (x, seq)) = seq () in Cons (f x, map f seq) +let rec filter p seq () = let (Cons (x, seq)) = seq () in let seq = filter p seq in if p x then Cons (x, seq) else seq () +let rec iter f seq = let (Cons (x, seq)) = seq () in f x; iter f seq +let to_seq seq = Seq.unfold (fun seq -> Some (hd seq, tl seq)) seq +let rec of_seq seq () = match seq () with +| Seq.Nil -> failwith "Not a infinite sequence" +| Seq.Cons (x, seq) -> Cons (x, of_seq seq) +``` + +# Statement + +Lists are finite, they always contain a finite number of elements. Sequences may +be finite or infinite. + +The goal of this exercise is to define a type `'a stream` which only contains +infinite sequences. 
Using this type, define the functions following functions: +```ocaml +val hd : 'a stream -> 'a +(** Returns the first element of a stream *) +val tl : 'a stream -> 'a stream +(** Removes the first element of a stream *) +val take : int -> 'a stream -> 'a list +(** [take n seq] returns the n first values of [seq] *) +val unfold : ('a -> 'b * 'a) -> 'a -> 'b stream +(** Similar to Seq.unfold *) +val bang : 'a -> 'a stream +(** [bang x] produces a infinitely repeating sequences of [x] values. *) +val ints : int -> int stream +(* Similar to Seq.ints *) +val map : ('a -> 'b) -> 'a stream -> 'b stream +(** Similar to List.map and Seq.map *) +val filter: ('a -> bool) -> 'a stream -> 'a stream +(** Similar to List.filter and Seq.filter *) +val iter : ('a -> unit) -> 'a stream -> 'b +(** Similar to List.iter and Seq.iter *) +val to_seq : 'a stream -> 'a Seq.t +(** Translates an ['a stream] into an ['a Seq.t] *) +val of_seq : 'a Seq.t -> 'a stream +(** Translates an ['a Seq.t] into an ['a stream] + @raise Failure if the input sequence is finite. *) +``` +Pro tip: Use irrefutable patterns. \ No newline at end of file diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 8a686cd253..9cedaa5868 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -296,7 +296,7 @@ Similar functions are also provided for sets, maps, hash tables (`Hashtbl`) and others (except `Seq`, obviously). When implementing a datatype module, it is advised to expose `to_seq` and `of_seq` functions. -## Miscellaneous +## Miscellaneous Considerations There are a couple of related libraries, all providing means to handle large flows of data: @@ -309,3 +309,9 @@ There used to be a module called [`Stream`](/releases/4.13/api/Stream.html) in the OCaml standard library. It was [removed](https://github.com/ocaml/ocaml/pull/10482) in 2021 with the release of OCaml 4.14. Beware books and documentation written before may still mention it. 
+ +## Exercises + +* [Diagonal](/problems#100) +* [Streams](/problems#101) + From 7b481161d21647e3fc19df0f39e214f6fccf113b Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Tue, 7 Feb 2023 08:17:08 +0100 Subject: [PATCH 12/28] Apply suggestions from code review Grammar fixes Co-authored-by: Christine Rose --- data/problems/diag.md | 6 ++-- data/problems/stream.md | 2 +- data/tutorials/ds_05_seq.md | 63 ++++++++++++++++++------------------- 3 files changed, 35 insertions(+), 36 deletions(-) diff --git a/data/problems/diag.md b/data/problems/diag.md index 8fb4fdec1a..ce4f7b18ed 100644 --- a/data/problems/diag.md +++ b/data/problems/diag.md @@ -18,13 +18,13 @@ let rec diag seq_seq () = # Statement Write a function `diag : 'a Seq.t Seq.t -> 'a Seq` that returns the _diagonal_ -of a sequence of sequences. The returned sequence is formed the following way: +of a sequence of sequences. The returned sequence is formed as follows: The first element of the returned sequence is the first element of the first sequence; the second element of the returned sequence is the second element of the second sequence; the third element of the returned sequence is the third element of the third sequence; and so on. - by taking the first element of the -first sequence, the second element of the second second +By taking the first element of the +first sequence, the second element of the second sequence diff --git a/data/problems/stream.md b/data/problems/stream.md index 0fa3610e82..aa2afb3b78 100644 --- a/data/problems/stream.md +++ b/data/problems/stream.md @@ -28,7 +28,7 @@ let rec of_seq seq () = match seq () with # Statement -Lists are finite, they always contain a finite number of elements. Sequences may +Lists are finite, meaning they always contain a finite number of elements. Sequences may be finite or infinite.
The goal of this exercise is to define a type `'a stream` which only contains diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 9cedaa5868..50a30f8321 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -34,34 +34,33 @@ type 'a node = | Cons of 'a * 'a t and 'a t = unit -> 'a node ``` -This is the mutually recursive definition of two types; `Seq.node` which is +This is the mutually recursive definition of two types: `Seq.node`, which is almost the same as `list`: ```ocaml type 'a list = | [] | (::) of 'a * 'a list ``` -and `Seq.t` which is merely a type alias for `unit -> 'a Seq.node`. The whole -point of this definition is the type of the second argument of `Seq.Cons`, which +and `Seq.t`, which is merely a type alias for `unit -> 'a Seq.node`. The whole +point of this definition is the second argument's type `Seq.Cons`, which is a function returning a sequence while its `list` counterpart returns a list. Let's compare the constructors of `list` and `Seq.node`: 1. Empty lists and sequences are defined the same way, a constructor without any parameter: `Seq.Nil` and `[]`. 1. Non-empty lists and sequences are both pairs whose former member is a piece - of data; -1. but the latter member, in lists, is a `list` too, while in sequences, it is a + of data. +1. However, the latter member in lists is a `list` too, while in sequences, it is a function returning a `Seq.node`. A value of type `Seq.t` is “frozen” because the data it contains isn't -immediately available, a `unit` value has to be supplied to recover it, and -that's “unfreezing”. However, unfreezing only gives access to the tip of the +immediately available. A `unit` value has to be supplied to recover it, which is called “unfreezing.” However, unfreezing only gives access to the tip of the icicle, since the second argument of `Seq.Cons` is a function too. 
-Having frozen-by-function tails explains why sequences may be considered +Frozen-by-function tails explain why sequences may be considered potentially infinite. Until a `Seq.Nil` value has been found in the sequence, one can't say for sure if some will ever appear. The sequence could be a stream -of incoming requests in a server, readings from an embedded sensor or system logs. -All have unforeseeable termination and it is easier to consider them infinite. +of incoming requests in a server, readings from an embedded sensor, or system logs. +All have unforeseeable termination, and it is easier to consider them infinite. In OCaml, any value `a` of type `t` can be turned into a constant function by writing `fun _ -> a`, which has type `'a -> t`. When writing `fun () -> a` @@ -76,8 +75,8 @@ val ints : int -> int Seq.t = ``` The function `ints n` look as if building the infinite sequence $(n; n + 1; n + 2; n + 3;...)$. In reality, since there isn't an infinite -amount of distinct values of type `int`, those sequences are not increasing, -when reaching `max_int` the values will circle down to `min_int`. They are +amount of distinct values of type `int`, those sequences don't increase. +When reaching `max_int`, the values will circle down to `min_int`. They are ultimately periodic. The OCaml standard library contains a module on sequences called @@ -86,7 +85,7 @@ has the same behaviour as `List.iter`. Writing this: ```ocaml # Seq.iter print_int (ints 0);; ``` -in an OCaml top-level means: “print integers forever” and you have to type +in an OCaml toplevel means “print integers forever,” and you have to type `Ctrl-C` to interrupt the execution. Perhaps more interestingly, the following code is also an infinite loop: ```ocaml @@ -97,7 +96,7 @@ But the key point is: it doesn't leak memory. 
## Example The `Seq` module of the OCaml standard library contains the definition of the -function `Seq.take` which returns a specified number of elements from the +function `Seq.take`, which returns a specified number of elements from the beginning of a sequence. Here is a simplified implementation: ```ocaml let rec take n seq () = match seq () with @@ -108,18 +107,18 @@ let rec take n seq () = match seq () with `seq` contains less than `n` elements, an identical sequence is returned. In particular, if `seq` is empty, an empty sequence is returned. -Observe the first line of `take`, it is the common pattern for recursive +Observe the first line of `take`. It is the common pattern for recursive functions over sequences. The last two parameters are: -* a sequence called `seq`; -* a `unit` value. +* a sequence called `seq` +* a `unit` value When executed, the function begins by unfreezing `seq` (that is, calling `seq -()`) and then pattern match to look inside the data made available. However, +()`) and then pattern matching to look inside the available data. However, this does not happen unless a `unit` parameter is passed to `take`. Writing -`take 10 seq` does not compute anything, it is a partial application and returns +`take 10 seq` does not compute anything; it is a partial application and returns a function needing a `unit` to produce a result. -This can be used to print integers without looping forever as shown previously: +This can be used to print integers without looping forever, as shown previously: ```ocaml # Seq.ints 0 |> Seq.take 43 |> List.of_seq;; - : int list = @@ -159,7 +158,7 @@ the list of 100 first prime numbers: 509; 521; 523] ``` -The function `sieve` is recursive, in OCaml and common senses: it is defined +The function `sieve` is recursive in OCaml and common sense. It is defined using the `rec` keyword and calls itself. However, some call that kind of function [_corecursive_](https://en.wikipedia.org/wiki/Corecursion). 
This word is used to emphasize that, by design, it does not terminate. Strictly speaking, @@ -174,9 +173,9 @@ instance: * `Seq.map` * `Seq.fold_left` -All those are also available for `Array`, `List` and `Set` and behave +All those are also available for `Array`, `List`, and `Set` and behave essentially the same. Observe that there is no `fold_right` function. Since -OCaml 4.11 there is something which isn't (yet) available on other types: +OCaml 4.11, there is something which isn't (yet) available on other types: `unfold`. Here is how it is implemented: ```ocaml let rec unfold f seq () = match f seq with @@ -187,7 +186,7 @@ And here is its type: ```ocaml val unfold : ('a -> ('b * 'a) option) -> 'a -> 'b Seq.t = ``` -Unlike previously mentioned iterators `Seq.unfold` does not have a sequence +Unlike previously mentioned iterators, `Seq.unfold` does not have a sequence parameter, but a sequence result. `unfold` provides a general means to build sequences. For instance, `Seq.ints` can be implemented using `Seq.unfold` in a fairly compact way: @@ -195,7 +194,7 @@ fairly compact way: let ints = Seq.unfold (fun n -> Some (n, n + 1));; ``` -As a fun fact, one should observe `map` over sequences can be implemented using +As a fun fact, one should observe `map` over sequences, as it can be implemented using `Seq.unfold`. Here is how to write it: ```ocaml # let map f = Seq.unfold (fun s -> s |> Seq.uncons |> Option.map (fun (x, y) -> (f x, y)));; @@ -206,7 +205,7 @@ Here is a quick check: # Seq.ints 0 |> map (fun x -> x * x) |> Seq.take 10 |> List.of_seq;; - : int list = [0; 1; 4; 9; 16; 25; 36; 49; 64; 81] ``` -The function `Seq.uncons` returns the head and tail of a sequence if it is not empty or `None` otherwise. +The function `Seq.uncons` returns the head and tail of a sequence if it is not empty, or it otherwise returns `None`. 
Using this function: ```ocaml @@ -219,14 +218,14 @@ It is possible to read a file using `Seq.unfold`: "README.md" |> open_in |> Seq.unfold input_line_opt |> Seq.iter print_endline ``` -Although this can be an appealing style, bear in mind it does not prevent from +Although this can be an appealing style, bear in mind that it does not prevent taking care of open files. While the code above is fine, this one no longer is: ```ocaml "README.md" |> open_in |> Seq.unfold input_line_opt |> Seq.take 10 |> Seq.iter print_endline ``` Here, `close_in` will never be called over the input channel opened on `README.md`. -## Sequences are Functions +## Sequences Are Functions Although this looks like a possible way to define the [Fibonacci sequence](https://en.wikipedia.org/wiki/Fibonacci_number): @@ -250,11 +249,11 @@ It can be used to produce some Fibonacci numbers: - : int list = [0; 1; 1; 2; 3; 5; 8; 13; 21; 34] ``` Why is it so? The key difference lies in the recursive call `fibs n (n + m)`. In -the former definition, the application is complete, `fibs` is provided all the -arguments it expects; in the latter definition, the application is partial, the +the former definition, the application is complete because `fibs` is provided all the +arguments it expects. In the latter definition, the application is partial because the `()` argument is missing. Since evaluation is [eager](https://en.wikipedia.org/wiki/Evaluation_strategy#Eager_evaluation) in -OCaml, in the former case, evaluation of the recursive call is triggered, and +OCaml, in the former case, evaluation of the recursive call is triggered and a non-terminating looping occurs. In contrast, in the latter case, the partially applied function is immediately returned as a [closure](https://en.wikipedia.org/wiki/Closure_(computer_programming)). @@ -268,7 +267,7 @@ Functions working with sequences must be written accordingly. 
* Sequence consumer: partially applied function parameter * Sequence producer: partially applied function result -When code dealing with sequences does not behave as expected, in particular, if +When code dealing with sequences does not behave as expected, like if it is crashing or hanging, there's a fair chance a mistake like in the first definition of `fibs` was made. From 2021ef4302199a349cf5aeeaccd35469c245e0ee Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Thu, 9 Feb 2023 07:56:37 +0100 Subject: [PATCH 13/28] Apply suggestions from code review Co-authored-by: Miod Vallat <118974489+dustanddreams@users.noreply.github.com> --- data/problems/stream.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/data/problems/stream.md b/data/problems/stream.md index aa2afb3b78..d448908322 100644 --- a/data/problems/stream.md +++ b/data/problems/stream.md @@ -32,7 +32,7 @@ Lists are finite, meaning they always contain a finite number of elements. Seque be finite or infinite. The goal of this exercise is to define a type `'a stream` which only contains -infinite sequences. Using this type, define the functions following functions: +infinite sequences. 
Using this type, define the following functions: ```ocaml val hd : 'a stream -> 'a (** Returns the first element of a stream *) From 32a1eda3ce491f1158dbd17e052e867ae625d985 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Thu, 9 Feb 2023 08:10:14 +0100 Subject: [PATCH 14/28] Some fixes --- data/problems/diag.md | 5 ----- data/problems/stream.md | 2 +- data/tutorials/ds_05_seq.md | 13 ++++--------- 3 files changed, 5 insertions(+), 15 deletions(-) diff --git a/data/problems/diag.md b/data/problems/diag.md index ce4f7b18ed..96f3640bf6 100644 --- a/data/problems/diag.md +++ b/data/problems/diag.md @@ -23,8 +23,3 @@ The first element of the returned sequence is the first element of the first sequence; the second element of the returned sequence is the second element of the second sequence; the third element of the returned sequence is the third element of the third sequence; and so on. - - -By taking the first element of the -first sequence, the second element of the second sequence - diff --git a/data/problems/stream.md b/data/problems/stream.md index d448908322..b06386fd14 100644 --- a/data/problems/stream.md +++ b/data/problems/stream.md @@ -58,4 +58,4 @@ val of_seq : 'a Seq.t -> 'a stream (** Translates an ['a Seq.t] into an ['a stream] @raise Failure if the input sequence is finite. *) ``` -Pro tip: Use irrefutable patterns. \ No newline at end of file +**Tip:** Use `let ... =` patterns. 
\ No newline at end of file diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 50a30f8321..80ac34778b 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -11,13 +11,8 @@ date: 2023-01-12T09:00:00-01:00 ## Prerequisites -| Concept | Status | Documentation | Reference | -|---|---|---|---| -| Basic types | Mandatory | | | -| Functions | Mandatory | | | -| Lists | Mandatory | | | -| Options | Recommended | | | -| Arrays | Nice to have | | | +You should be comfortable with writing functions over lists and, ideally, +understand what an option is. ## Introduction @@ -85,7 +80,7 @@ has the same behaviour as `List.iter`. Writing this: ```ocaml # Seq.iter print_int (ints 0);; ``` -in an OCaml toplevel means “print integers forever,” and you have to type +in an OCaml top-level means “print integers forever,” and you have to type `Ctrl-C` to interrupt the execution. Perhaps more interestingly, the following code is also an infinite loop: ```ocaml @@ -267,7 +262,7 @@ Functions working with sequences must be written accordingly. * Sequence consumer: partially applied function parameter * Sequence producer: partially applied function result -When code dealing with sequences does not behave as expected, like if +When code dealing with sequences does not behave as expected like if it is crashing or hanging, there's a fair chance a mistake like in the first definition of `fibs` was made. 
From bebd94af4eee4e80c5fa6c85caed109717c1ca5a Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Wed, 3 May 2023 18:51:30 +0200 Subject: [PATCH 15/28] Update text after letting it rest for a while --- data/tutorials/ds_05_seq.md | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 80ac34778b..326b9388ff 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -11,18 +11,17 @@ date: 2023-01-12T09:00:00-01:00 ## Prerequisites -You should be comfortable with writing functions over lists and, ideally, -understand what an option is. +You should be comfortable with writing functions over lists and options. ## Introduction -Sequences look a lot like lists. However from a pragmatic perspective, one +Sequences are very much like lists. However, from a pragmatic perspective, one should imagine they may be infinite. That's the key intuition to understanding and using sequences. -One way to look at a value of type `'a Seq.t` is to consider it as an icicle, a -frozen stream of data. To understand this analogy, consider how sequences are -defined in the standard library: +One way to look at a value of type `'a Seq.t` is to consider it as a list, with +a twist when it's not empty: a frozen tail. To understand this analogy, +consider how sequences are defined in the standard library: ```ocaml type 'a node = | Nil @@ -44,12 +43,13 @@ compare the constructors of `list` and `Seq.node`: parameter: `Seq.Nil` and `[]`. 1. Non-empty lists and sequences are both pairs whose former member is a piece of data. -1. However, the latter member in lists is a `list` too, while in sequences, it is a - function returning a `Seq.node`. +1. However, the latter member in lists is recursively a `list`, while in + sequences, it is a function returning a `Seq.node`. A value of type `Seq.t` is “frozen” because the data it contains isn't -immediately available.
A `unit` value has to be supplied to recover it, which is called “unfreezing.” However, unfreezing only gives access to the tip of the -icicle, since the second argument of `Seq.Cons` is a function too. +immediately available. A `unit` value has to be supplied to recover it, which we +may see as “unfreezing.” However, unfreezing only gives access to the tip of the +sequence, since the second argument of `Seq.Cons` is a function too. Frozen-by-function tails explain why sequences may be considered potentially infinite. Until a `Seq.Nil` value has been found in the sequence, @@ -60,7 +60,7 @@ All have unforeseeable termination, and it is easier to consider them infinite. In OCaml, any value `a` of type `t` can be turned into a constant function by writing `fun _ -> a`, which has type `'a -> t`. When writing `fun () -> a` instead, we get a function of type `unit -> t`. Such a function is called a -[_thunk_](https://en.wikipedia.org/wiki/Thunk). Using this terminology, sequence +[_thunk_](https://en.wikipedia.org/wiki/Thunk). Using this terminology, `Seq.t` values are thunks. With the analogy used earlier, `a` is frozen in its thunk. Here is how to build seemingly infinite sequences of integers: @@ -68,10 +68,10 @@ Here is how to build seemingly infinite sequences of integers: # let rec ints n : int Seq.t = fun () -> Seq.Cons (n, ints (n + 1)) val ints : int -> int Seq.t = ``` -The function `ints n` look as if building the infinite sequence -$(n; n + 1; n + 2; n + 3;...)$. In reality, since there isn't an infinite -amount of distinct values of type `int`, those sequences don't increase. -When reaching `max_int`, the values will circle down to `min_int`. They are +The function `ints n` look as if building the infinite sequence `(n; n + 1; n + +2; n + 3;...)`. In reality, since there isn't an infinite amount of distinct +values of type `int`, those sequences aren't indefinitely increasing. When +reaching `max_int`, the values will circle down to `min_int`. 
They are ultimately periodic. The OCaml standard library contains a module on sequences called @@ -189,7 +189,7 @@ fairly compact way: let ints = Seq.unfold (fun n -> Some (n, n + 1));; ``` -As a fun fact, one should observe `map` over sequences, as it can be implemented using +As a fun fact, one should observe `map` over sequences can be implemented using `Seq.unfold`. Here is how to write it: ```ocaml # let map f = Seq.unfold (fun s -> s |> Seq.uncons |> Option.map (fun (x, y) -> (f x, y)));; From 1351dd0d9f54b185dd566a6cd58bfa9929e1e03b Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Wed, 3 May 2023 18:58:50 +0200 Subject: [PATCH 16/28] Add feedback from @gpetiot --- data/problems/diag.md | 2 +- data/problems/stream.md | 4 +-- data/tutorials/ds_05_seq.md | 60 +++++++++++++++++++------------------ 3 files changed, 34 insertions(+), 32 deletions(-) diff --git a/data/problems/diag.md b/data/problems/diag.md index 96f3640bf6..e239fed169 100644 --- a/data/problems/diag.md +++ b/data/problems/diag.md @@ -1,6 +1,6 @@ --- title: Diagonal of a Sequence of Sequences -number: "100" +number: "101" difficulty: intermediate tags: [ "seq" ] --- diff --git a/data/problems/stream.md b/data/problems/stream.md index b06386fd14..9670597fdd 100644 --- a/data/problems/stream.md +++ b/data/problems/stream.md @@ -1,6 +1,6 @@ --- title: Never-Ending Sequences -number: "101" +number: "100" difficulty: beginner tags: [ "seq" ] --- @@ -43,7 +43,7 @@ val take : int -> 'a stream -> 'a list val unfold : ('a -> 'b * 'a) -> 'a -> 'b stream (** Similar to Seq.unfold *) val bang : 'a -> 'a stream -(** [bang x] produces a infinitely repeating sequences of [x] values. *) +(** [bang x] produces an infinitely repeating sequence of [x] values. 
*) val ints : int -> int stream (* Similar to Seq.ints *) val map : ('a -> 'b) -> 'a stream -> 'b stream diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 326b9388ff..7b169055c8 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -36,8 +36,8 @@ type 'a list = | (::) of 'a * 'a list ``` and `Seq.t`, which is merely a type alias for `unit -> 'a Seq.node`. The whole -point of this definition is the second argument's type `Seq.Cons`, which -is a function returning a sequence while its `list` counterpart returns a list. Let's +point of this definition is the second argument's type `Seq.Cons`, which is a +function returning a sequence while its `list` counterpart returns a list. Let's compare the constructors of `list` and `Seq.node`: 1. Empty lists and sequences are defined the same way, a constructor without any parameter: `Seq.Nil` and `[]`. @@ -51,11 +51,11 @@ immediately available. A `unit` value has to be supplied to recover it, which we may see as “unfreezing.” However, unfreezing only gives access to the tip of the sequence, since the second argument of `Seq.Cons` is a function too. -Frozen-by-function tails explain why sequences may be considered -potentially infinite. Until a `Seq.Nil` value has been found in the sequence, -one can't say for sure if some will ever appear. The sequence could be a stream -of incoming requests in a server, readings from an embedded sensor, or system logs. -All have unforeseeable termination, and it is easier to consider them infinite. +Frozen-by-function tails explain why sequences may be considered potentially +infinite. Until a `Seq.Nil` value has been found in the sequence, one can't say +for sure if some will ever appear. The sequence could be a stream of incoming +requests in a server, readings from an embedded sensor, or system logs. All have +unforeseeable termination, and it is easier to consider them infinite. 
In OCaml, any value `a` of type `t` can be turned into a constant function by writing `fun _ -> a`, which has type `'a -> t`. When writing `fun () -> a` @@ -68,7 +68,7 @@ Here is how to build seemingly infinite sequences of integers: # let rec ints n : int Seq.t = fun () -> Seq.Cons (n, ints (n + 1)) val ints : int -> int Seq.t = ``` -The function `ints n` look as if building the infinite sequence `(n; n + 1; n + +The function `ints n` looks as if building the infinite sequence `(n; n + 1; n + 2; n + 3;...)`. In reality, since there isn't an infinite amount of distinct values of type `int`, those sequences aren't indefinitely increasing. When reaching `max_int`, the values will circle down to `min_int`. They are @@ -80,7 +80,7 @@ has the same behaviour as `List.iter`. Writing this: ```ocaml # Seq.iter print_int (ints 0);; ``` -in an OCaml top-level means “print integers forever,” and you have to type +in an OCaml top-level means “print integers forever,” and you have to press `Ctrl-C` to interrupt the execution. Perhaps more interestingly, the following code is also an infinite loop: ```ocaml @@ -108,10 +108,10 @@ functions over sequences. The last two parameters are: * a `unit` value When executed, the function begins by unfreezing `seq` (that is, calling `seq -()`) and then pattern matching to look inside the available data. However, -this does not happen unless a `unit` parameter is passed to `take`. Writing -`take 10 seq` does not compute anything; it is a partial application and returns -a function needing a `unit` to produce a result. +()`) and then pattern matching to look inside the available data. However, this +does not happen unless a `unit` parameter is passed to `take`. Writing `take 10 +seq` does not compute anything; it is a partial application and returns a +function needing a `unit` to produce a result. 
This can be used to print integers without looping forever, as shown previously: ```ocaml @@ -129,9 +129,9 @@ The `Seq` module also has a function `Seq.filter`: ``` It builds a sequence of elements satisfying a condition. -Using `Seq.filter`, it is possible to make a straightforward implementation of the -[Sieve of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). -Here it is: +Using `Seq.filter`, it is possible to make a straightforward implementation of +the [Sieve of +Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). Here it is: ```ocaml let rec sieve seq () = match seq () with | Seq.Cons (m, seq) -> Seq.Cons (m, sieve (Seq.filter (fun n -> n mod m > 0) seq)) @@ -153,12 +153,12 @@ the list of 100 first prime numbers: 509; 521; 523] ``` -The function `sieve` is recursive in OCaml and common sense. It is defined -using the `rec` keyword and calls itself. However, some call that kind of -function [_corecursive_](https://en.wikipedia.org/wiki/Corecursion). This word -is used to emphasize that, by design, it does not terminate. Strictly speaking, -the sieve of Eratosthenes is not an algorithm either since it does not -terminate. This implementation behaves the same. +The function `sieve` is recursive in OCaml and common sense. It is defined using +the `rec` keyword and calls itself. However, some call that kind of function +[_corecursive_](https://en.wikipedia.org/wiki/Corecursion). This word is used to +emphasize that, by design, it does not terminate. Strictly speaking, the sieve +of Eratosthenes is not an algorithm either since it does not terminate. This +implementation behaves the same. ## Unfolding Sequences @@ -200,7 +200,8 @@ Here is a quick check: # Seq.ints 0 |> map (fun x -> x * x) |> Seq.take 10 |> List.of_seq;; - : int list = [0; 1; 4; 9; 16; 25; 36; 49; 64; 81] ``` -The function `Seq.uncons` returns the head and tail of a sequence if it is not empty, or it otherwise returns `None`. 
+The function `Seq.uncons` returns the head and tail of a sequence if it is not +empty, or it otherwise returns `None`. Using this function: ```ocaml @@ -218,7 +219,8 @@ taking care of open files. While the code above is fine, this one no longer is: ```ocaml "README.md" |> open_in |> Seq.unfold input_line_opt |> Seq.take 10 |> Seq.iter print_endline ``` -Here, `close_in` will never be called over the input channel opened on `README.md`. +Here, `close_in` will never be called over the input channel opened on +`README.md`. ## Sequences Are Functions @@ -244,9 +246,9 @@ It can be used to produce some Fibonacci numbers: - : int list = [0; 1; 1; 2; 3; 5; 8; 13; 21; 34] ``` Why is it so? The key difference lies in the recursive call `fibs n (n + m)`. In -the former definition, the application is complete because `fibs` is provided all the -arguments it expects. In the latter definition, the application is partial because the -`()` argument is missing. Since evaluation is +the former definition, the application is complete because `fibs` is provided +all the arguments it expects. In the latter definition, the application is +partial because the `()` argument is missing. Since evaluation is [eager](https://en.wikipedia.org/wiki/Evaluation_strategy#Eager_evaluation) in OCaml, in the former case, evaluation of the recursive call is triggered and a non-terminating looping occurs. In contrast, in the latter case, the partially @@ -262,8 +264,8 @@ Functions working with sequences must be written accordingly. * Sequence consumer: partially applied function parameter * Sequence producer: partially applied function result -When code dealing with sequences does not behave as expected like if -it is crashing or hanging, there's a fair chance a mistake like in the first +When code dealing with sequences does not behave as expected like if it is +crashing or hanging, there's a fair chance a mistake like in the first definition of `fibs` was made. 
## Sequences for Conversions From 5ef58f18061c6a94bb6049f7bdbf87a992978b7b Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Wed, 3 May 2023 19:17:53 +0200 Subject: [PATCH 17/28] More feedback from @gpetiot --- data/tutorials/ds_05_seq.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 7b169055c8..b8fcbaa6a7 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -20,7 +20,7 @@ should imagine they may be infinite. That's the key intuition to understanding and using sequences. One way to look at a value of type `'a Seq.t` is to consider it as a list, with -a twist when it's not emprty: a frozen tail. To understand this analogy, +a twist when it's not empty: a frozen tail. To understand this analogy, consider how sequences are defined in the standard library: ```ocaml type 'a node = @@ -306,7 +306,7 @@ the OCaml standard library. It was [removed](https://github.com/ocaml/ocaml/pull/10482) in 2021 with the release of OCaml 4.14. Beware books and documentation written before may still mention it. 
-## Exercices +## Exercises * [Diagonal](/problems#100) * [Streams](/problems#101) From 22be1329ac6298947cadcfd7905ea0933d7d5ff4 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Thu, 4 May 2023 13:53:02 +0200 Subject: [PATCH 18/28] Add feedback from @xvw --- data/tutorials/ds_05_seq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index b8fcbaa6a7..db266025d2 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -298,7 +298,7 @@ There are a couple of related libraries, all providing means to handle large flows of data: * Rizo I [Streaming](/p/streaming) -* Gabriel Radanne [Iter](/p/iter) +* Simon Cruanes and Gabriel Radanne [Iter](/p/iter) * Jane Street `Base.Sequence` There used to be a module called [`Stream`](/releases/4.13/api/Stream.html) in From 32b61d9340ca6fe254e31655893edbc84ca30e84 Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Fri, 5 May 2023 10:18:13 +0200 Subject: [PATCH 19/28] Apply suggestions from @christinerose Co-authored-by: Christine Rose --- data/tutorials/ds_05_seq.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index db266025d2..1c7d420a62 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -80,7 +80,7 @@ has the same behaviour as `List.iter`. Writing this: ```ocaml # Seq.iter print_int (ints 0);; ``` -in an OCaml top-level means “print integers forever,” and you have to press +in an OCaml toplevel, this means “print integers forever,” and you have to press `Ctrl-C` to interrupt the execution. Perhaps more interestingly, the following code is also an infinite loop: ```ocaml @@ -110,7 +110,7 @@ functions over sequences. The last two parameters are: When executed, the function begins by unfreezing `seq` (that is, calling `seq ()`) and then pattern matching to look inside the available data. 
However, this does not happen unless a `unit` parameter is passed to `take`. Writing `take 10 -seq` does not compute anything; it is a partial application and returns a +seq` does not compute anything. It is a partial application and returns a function needing a `unit` to produce a result. This can be used to print integers without looping forever, as shown previously: @@ -264,7 +264,7 @@ Functions working with sequences must be written accordingly. * Sequence consumer: partially applied function parameter * Sequence producer: partially applied function result -When code dealing with sequences does not behave as expected like if it is +When code dealing with sequences does not behave as expected, like if it is crashing or hanging, there's a fair chance a mistake like in the first definition of `fibs` was made. From cca2dde218cca9000b99f5f66475eae30f130f9f Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Fri, 5 May 2023 10:19:33 +0200 Subject: [PATCH 20/28] Update ds_05_seq.md --- data/tutorials/ds_05_seq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 1c7d420a62..79a5dce66f 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -247,7 +247,7 @@ It can be used to produce some Fibonacci numbers: ``` Why is it so? The key difference lies in the recursive call `fibs n (n + m)`. In the former definition, the application is complete because `fibs` is provided -all the arguments it expects. In the latter definition, the application is +with all the arguments it expects. In the latter definition, the application is partial because the `()` argument is missing. 
Since evaluation is [eager](https://en.wikipedia.org/wiki/Evaluation_strategy#Eager_evaluation) in OCaml, in the former case, evaluation of the recursive call is triggered and a From 5305db6d4ac12326c62fb541a36e34fbe2cfb175 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Fri, 5 May 2023 13:34:24 +0200 Subject: [PATCH 21/28] Some more edits --- data/tutorials/ds_05_seq.md | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 79a5dce66f..975dc9f2b7 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -36,8 +36,8 @@ type 'a list = | (::) of 'a * 'a list ``` and `Seq.t`, which is merely a type alias for `unit -> 'a Seq.node`. The whole -point of this definition is the second argument's type `Seq.Cons`, which is a -function returning a sequence while its `list` counterpart returns a list. Let's +point of this definition is `Seq.Cons` second argument's type, which is a +function returning a sequence while its `list` counterpart is a list. Let's compare the constructors of `list` and `Seq.node`: 1. Empty lists and sequences are defined the same way, a constructor without any parameter: `Seq.Nil` and `[]`. @@ -65,7 +65,7 @@ values are thunks. With the analogy used earlier, `a` is frozen in its thunk. Here is how to build seemingly infinite sequences of integers: ```ocaml -# let rec ints n : int Seq.t = fun () -> Seq.Cons (n, ints (n + 1)) +# let rec ints n : int Seq.t = fun () -> Seq.Cons (n, ints (n + 1));; val ints : int -> int Seq.t = ``` The function `ints n` looks as if building the infinite sequence `(n; n + 1; n + @@ -100,7 +100,7 @@ let rec take n seq () = match seq () with ``` `take n seq` returns, at most, the `n` first elements of the sequence `seq`. If `seq` contains less than `n` elements, an identical sequence is returned. In -particular, if `seq` is empty, an empty sequence is returned. 
+particular, if `seq` is empty, or `n` is negative, an empty sequence is returned. Observe the first line of `take`. It is the common pattern for recursive functions over sequences. The last two parameters are: @@ -108,7 +108,7 @@ functions over sequences. The last two parameters are: * a `unit` value When executed, the function begins by unfreezing `seq` (that is, calling `seq -()`) and then pattern matching to look inside the available data. However, this +()`) and then pattern matching to look inside the data made available. However, this does not happen unless a `unit` parameter is passed to `take`. Writing `take 10 seq` does not compute anything. It is a partial application and returns a function needing a `unit` to produce a result. @@ -127,7 +127,7 @@ The `Seq` module also has a function `Seq.filter`: # Seq.filter;; - : ('a -> bool) -> 'a Seq.t -> 'a Seq.t = ``` -It builds a sequence of elements satisfying a condition. +It keeps elements of a sequence which satisfies the provided condition. Using `Seq.filter`, it is possible to make a straightforward implementation of the [Sieve of @@ -187,6 +187,7 @@ sequences. For instance, `Seq.ints` can be implemented using `Seq.unfold` in a fairly compact way: ```ocaml let ints = Seq.unfold (fun n -> Some (n, n + 1));; +val ints : int -> int Seq.t = ``` As a fun fact, one should observe `map` over sequences can be implemented using @@ -235,7 +236,7 @@ It actually isn't. It's a non-ending recursion which blows away the stack. # fibs 0 1;; Stack overflow during evaluation (looping recursion?). ``` -This definition is behaving as expected: +This definition is behaving as expected (spot the differences, there are four): ```ocaml # let rec fibs m n () = Seq.Cons (m, fibs n (n + m));; val fibs : int -> int -> int Seq.t = @@ -289,7 +290,7 @@ some of those functions: val String.to_seq : char Seq.t -> string ``` Similar functions are also provided for sets, maps, hash tables (`Hashtbl`) and -others (except `Seq`, obviously). 
When implementing a datatype module, it is +others (except `Seq`, obviously). When implementing a collection datatype module, it is advised to expose `to_seq` and `of_seq` functions. ## Miscellaneous Considerations @@ -308,6 +309,6 @@ OCaml 4.14. Beware books and documentation written before may still mention it. ## Exercises -* [Diagonal](/problems#100) -* [Streams](/problems#101) +* [Streams](/problems#100) +* [Diagonal](/problems#101) From 800df7cc0cfeb6a2a3547fd4d45cb6af97721ca1 Mon Sep 17 00:00:00 2001 From: Cuihtlauac Alvarado Date: Mon, 22 May 2023 16:34:45 +0200 Subject: [PATCH 22/28] Update data/tutorials/ds_05_seq.md Co-authored-by: sabine --- data/tutorials/ds_05_seq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 975dc9f2b7..fa3454644c 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -2,7 +2,7 @@ id: sequences title: Sequences description: > - Learn about an OCaml's most-used, built-in data types + Learn about sequences, one of OCaml's most-used, built-in data types category: "data-structures" date: 2023-01-12T09:00:00-01:00 --- From cc0fa39690264482795d682192749e5723771f13 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Mon, 22 May 2023 18:49:56 +0200 Subject: [PATCH 23/28] Add feedback from Simon Cruanes --- data/tutorials/ds_05_seq.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index fa3454644c..3650e592e1 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -36,7 +36,7 @@ type 'a list = | (::) of 'a * 'a list ``` and `Seq.t`, which is merely a type alias for `unit -> 'a Seq.node`. The whole -point of this definition is `Seq.Cons` second argument's type, which is a +point of this definition is `Seq.Cons` second component's type, which is a function returning a sequence while its `list` counterpart is a list. 
Let's compare the constructors of `list` and `Seq.node`: 1. Empty lists and sequences are defined the same way, a constructor without any @@ -80,7 +80,7 @@ has the same behaviour as `List.iter`. Writing this: ```ocaml # Seq.iter print_int (ints 0);; ``` -in an OCaml toplevel, this means “print integers forever,” and you have to press +in an OCaml top-level, this means “print integers forever,” and you have to press `Ctrl-C` to interrupt the execution. Perhaps more interestingly, the following code is also an infinite loop: ```ocaml @@ -127,7 +127,7 @@ The `Seq` module also has a function `Seq.filter`: # Seq.filter;; - : ('a -> bool) -> 'a Seq.t -> 'a Seq.t = ``` -It keeps elements of a sequence which satisfies the provided condition. +It builds a sequence of elements satisfying a condition. Using `Seq.filter`, it is possible to make a straightforward implementation of the [Sieve of @@ -290,7 +290,7 @@ some of those functions: val String.to_seq : char Seq.t -> string ``` Similar functions are also provided for sets, maps, hash tables (`Hashtbl`) and -others (except `Seq`, obviously). When implementing a collection datatype module, it is +others (except `Seq`, obviously). When implementing a datatype module, it is advised to expose `to_seq` and `of_seq` functions. 
## Miscellaneous Considerations @@ -300,6 +300,7 @@ flows of data: * Rizo I [Streaming](/p/streaming) * Simon Cruanes and Gabriel Radanne [Iter](/p/iter) +* Simon Cruanes [OSeq](/p/oseq) (an extension of `Seq` with more functions) * Jane Street `Base.Sequence` There used to be a module called [`Stream`](/releases/4.13/api/Stream.html) in From 305be61d311cf75e7113a00772eadd860eba2d50 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Mon, 22 May 2023 21:17:26 +0200 Subject: [PATCH 24/28] Add suggestions from Christine Rose --- data/tutorials/ds_05_seq.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 3650e592e1..fdb90eab09 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -40,7 +40,7 @@ point of this definition is `Seq.Cons` second component's type, which is a function returning a sequence while its `list` counterpart is a list. Let's compare the constructors of `list` and `Seq.node`: 1. Empty lists and sequences are defined the same way, a constructor without any - parameter: `Seq.Nil` and `[]`. + parameters: `Seq.Nil` and `[]`. 1. Non-empty lists and sequences are both pairs whose former member is a piece of data. 1. However, the latter member in lists is recursively a `list`, while in @@ -80,7 +80,7 @@ has the same behaviour as `List.iter`. Writing this: ```ocaml # Seq.iter print_int (ints 0);; ``` -in an OCaml top-level, this means “print integers forever,” and you have to press +in an OCaml toplevel, this means “print integers forever,” and you have to press `Ctrl-C` to interrupt the execution. Perhaps more interestingly, the following code is also an infinite loop: ```ocaml @@ -156,7 +156,7 @@ the list of 100 first prime numbers: The function `sieve` is recursive in OCaml and common sense. It is defined using the `rec` keyword and calls itself. 
However, some call that kind of function [_corecursive_](https://en.wikipedia.org/wiki/Corecursion). This word is used to -emphasize that, by design, it does not terminate. Strictly speaking, the sieve +emphasise that, by design, it does not terminate. Strictly speaking, the sieve of Eratosthenes is not an algorithm either since it does not terminate. This implementation behaves the same. From 4d41d72835006c519bd070236c1e278547966875 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Tue, 23 May 2023 08:21:16 +0200 Subject: [PATCH 25/28] Add missing coma --- data/tutorials/ds_05_seq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index fdb90eab09..904f7116af 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -289,7 +289,7 @@ some of those functions: val String.of_seq : string -> char Seq.t val String.to_seq : char Seq.t -> string ``` -Similar functions are also provided for sets, maps, hash tables (`Hashtbl`) and +Similar functions are also provided for sets, maps, hash tables (`Hashtbl`), and others (except `Seq`, obviously). When implementing a datatype module, it is advised to expose `to_seq` and `of_seq` functions. From d52fbdbf3d26d2fcb8fdc959be671e78e53bb6b2 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Tue, 23 May 2023 09:45:41 +0200 Subject: [PATCH 26/28] Fix code samples --- data/tutorials/ds_05_seq.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index 904f7116af..a60197da5a 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -136,7 +136,9 @@ Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). 
Here it is: let rec sieve seq () = match seq () with | Seq.Cons (m, seq) -> Seq.Cons (m, sieve (Seq.filter (fun n -> n mod m > 0) seq)) | seq -> seq -let facts = ints_from 2 |> sieve;; +let facts = Seq.ints 2 |> sieve;; +val sieve : int Seq.t -> int Seq.t = +val facts : int Seq.t = ``` This code can be used to generate lists of prime numbers. For instance, here is From 962f070b7d39802aea3dbb8d0942e71f0cd1e221 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Tue, 23 May 2023 10:50:08 +0200 Subject: [PATCH 27/28] Fix code samples --- data/tutorials/ds_05_seq.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index a60197da5a..b3dee8d5b4 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -144,7 +144,7 @@ val facts : int Seq.t = This code can be used to generate lists of prime numbers. For instance, here is the list of 100 first prime numbers: ```ocaml -# facts |> take 100 |> List.of_seq;; +# facts |> Seq.take 100 |> List.of_seq;; - : int list = [2; 3; 5; 7; 11; 13; 17; 19; 23; 29; 31; 37; 41; 43; 47; 53; 59; 61; 67; 71; 73; 79; 83; 89; 97; 101; 103; 107; 109; 113; 127; 131; 137; 139; 149; 151; @@ -188,7 +188,7 @@ parameter, but a sequence result. `unfold` provides a general means to build sequences. 
For instance, `Seq.ints` can be implemented using `Seq.unfold` in a fairly compact way: ```ocaml -let ints = Seq.unfold (fun n -> Some (n, n + 1));; +# let ints = Seq.unfold (fun n -> Some (n, n + 1));; val ints : int -> int Seq.t = ``` From a72d5699ac5bb54c0d73e0f0d2eadf0dcb6fd530 Mon Sep 17 00:00:00 2001 From: Cuihtlauac ALVARADO Date: Tue, 23 May 2023 11:04:08 +0200 Subject: [PATCH 28/28] Add Credits --- data/tutorials/ds_05_seq.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/data/tutorials/ds_05_seq.md b/data/tutorials/ds_05_seq.md index b3dee8d5b4..eaaa1dd3a8 100644 --- a/data/tutorials/ds_05_seq.md +++ b/data/tutorials/ds_05_seq.md @@ -315,3 +315,18 @@ OCaml 4.14. Beware books and documentation written before may still mention it. * [Streams](/problems#100) * [Diagonal](/problems#101) +## Credits + +* Authors: + + 1. Cuihtlauac Alvarado [@cuihtlauac](https://github.com/cuihtlauac) + +* Suggestions and Corrections: + + * Miod Vallat [@dustanddreams](https://github.com/dustanddreams) + * Sayo Bamigbade [@SaySayo](https://github.com/SaySayo) + * Christine Rose [@christinerose](https://github.com/christinerose) + * Sabine Schmaltz [@sabine](https://github.com/sabine) + * Guillaume Petiot [@gpetiot](https://github.com/gpetiot) + * Xavier Van de Woestyne [@xvw](https://github.com/xvw) + * Simon Cruanes [@c-cube](https://github.com/c-cube)