From 8b1e712d87ab725364ca701a4015e26ed25f25ac Mon Sep 17 00:00:00 2001 From: Erik Tjong Kim Sang Date: Thu, 25 Mar 2021 23:10:52 +0100 Subject: [PATCH 1/3] refactored notebook --- test_mvp.ipynb | 289 +++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 244 insertions(+), 45 deletions(-) diff --git a/test_mvp.ipynb b/test_mvp.ipynb index 64ba8ad..5841d54 100644 --- a/test_mvp.ipynb +++ b/test_mvp.ipynb @@ -2,56 +2,157 @@ "cells": [ { "cell_type": "markdown", + "id": "settled-retro", "metadata": {}, "source": [ - "# `sequgen` demo" + "# `sequgen` demo\n", + "\n", + "This demo shows how to create a time series with sequgen. Before we can run the software, we first need to load a few packages." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "engaging-liabilities", + "metadata": {}, + "outputs": [], + "source": [ + "from matplotlib import pyplot as plt\n", + "import numpy\n", + "from sequgen.deterministic.triangular_peak import triangular_peak\n", + "from sequgen.parameter_space import ParameterSpace\n", + "from sequgen.dimension import Dimension" ] }, { "cell_type": "markdown", + "id": "stuck-current", "metadata": {}, "source": [ - "Print the version of the library as used in this demonstration:" + "The time series will consist of two parts: the time points and the data points. For the time points we want to use the space between the numbers 0 and 20. We want to divide that space into 100 equal sections, so we need 101 points: 100 at the beginning of each section and 1 at the end of the final section. 
We can define these time points with this command:" ] }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, + "id": "precise-dublin", + "metadata": {}, + "outputs": [], + "source": [ + "time_points = numpy.linspace(start=0, stop=20, num=101)" + ] + }, + { + "cell_type": "markdown", + "id": "stable-highway", + "metadata": {}, + "source": [ + "We can verify that this process has worked by checking the contents of the variable `time_points`:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "academic-relief", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "'0.1.0'" + "array([ 0. , 0.2, 0.4, 0.6, 0.8, 1. , 1.2, 1.4, 1.6, 1.8, 2. ,\n", + " 2.2, 2.4, 2.6, 2.8, 3. , 3.2, 3.4, 3.6, 3.8, 4. , 4.2,\n", + " 4.4, 4.6, 4.8, 5. , 5.2, 5.4, 5.6, 5.8, 6. , 6.2, 6.4,\n", + " 6.6, 6.8, 7. , 7.2, 7.4, 7.6, 7.8, 8. , 8.2, 8.4, 8.6,\n", + " 8.8, 9. , 9.2, 9.4, 9.6, 9.8, 10. , 10.2, 10.4, 10.6, 10.8,\n", + " 11. , 11.2, 11.4, 11.6, 11.8, 12. , 12.2, 12.4, 12.6, 12.8, 13. ,\n", + " 13.2, 13.4, 13.6, 13.8, 14. , 14.2, 14.4, 14.6, 14.8, 15. , 15.2,\n", + " 15.4, 15.6, 15.8, 16. , 16.2, 16.4, 16.6, 16.8, 17. , 17.2, 17.4,\n", + " 17.6, 17.8, 18. , 18.2, 18.4, 18.6, 18.8, 19. , 19.2, 19.4, 19.6,\n", + " 19.8, 20. ])" ] }, - "execution_count": 1, + "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "from sequgen.__version__ import __version__\n", - "\n", - "\n", - "__version__" + "time_points" ] }, { "cell_type": "markdown", + "id": "adverse-bunny", "metadata": {}, "source": [ - "Import core classes and run `sequgen`'s minimum viable product:" + "In this time space, we want to create a triangularly shaped signal starting at time point 5 at value 0, rising to value 1 at time point 6 and dropping back to value 0 at time point 10. 
The following command can achieve this:" ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 4, + "id": "concrete-collar", + "metadata": {}, + "outputs": [], + "source": [ + "data_points = triangular_peak(time_points, **{'height': 1, 'placement': 6, 'width_base_left': 1, 'width_base_right': 4})" + ] + }, + { + "cell_type": "markdown", + "id": "analyzed-truth", + "metadata": {}, + "source": [ + "So we have asked for a triangular peak with height 1, placed at time position 6, where the left part of the triangle has width 1 (so it starts at 5) and the right part has width 4 (so it ends at time point 10). We can check the contents of the variable `data_points` to verify that the command has worked:" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "pregnant-tracy", "metadata": {}, "outputs": [ { "data": { - "image/png": "\n", + "text/plain": [ + "array([0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", + " 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", + " 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.2 , 0.4 ,\n", + " 0.6 , 0.8 , 1. , 0.95, 0.9 , 0.85, 0.8 , 0.75, 0.7 , 0.65, 0.6 ,\n", + " 0.55, 0.5 , 0.45, 0.4 , 0.35, 0.3 , 0.25, 0.2 , 0.15, 0.1 , 0.05,\n", + " 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", + " 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", + " 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", + " 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ,\n", + " 0. , 0. ])" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data_points" + ] + }, + { + "cell_type": "markdown", + "id": "arbitrary-value", + "metadata": {}, + "source": [ + "We notice that the standard value of the data points is zero. Only at the triangle that we have defined the values are non-zero. 
A graph provides a better view of the shape:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "acknowledged-world", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", "text/plain": [ "
" ] @@ -63,48 +164,146 @@ } ], "source": [ - "from matplotlib import pyplot as plt\n", - "import numpy\n", - "from sequgen.deterministic.triangular_peak import triangular_peak\n", - "from sequgen.parameter_space import ParameterSpace\n", - "from sequgen.dimension import Dimension\n", - "\n", - "\n", - "def test_mvp():\n", - "\n", - " # where I want the model to predict values\n", - " t_predict = numpy.linspace(-2, 20, 100)\n", - "\n", - " parameter_space = ParameterSpace([\n", - " Dimension(\"height\", 1, 2),\n", - " Dimension(\"placement\", 3, 10),\n", - " Dimension(\"width_base_left\", 0.5),\n", - " Dimension(\"width_base_right\", 2.0, 3.0),\n", - " ])\n", - "\n", - " # draw a sample of the parameter space for each space\n", - " parameters = parameter_space.sample()\n", - "\n", - " # generate predictions of y at t_predict using the model and the parameterization\n", - " y_predict = triangular_peak(t_predict, **parameters)\n", - "\n", - " # plot to verify\n", + "def plot(time_points, data_points, title):\n", " plt.figure()\n", - " plt.plot(t_predict, y_predict, \".b-\")\n", - " plt.title(parameter_space.format_str().format(**parameters))\n", + " plt.plot(time_points, data_points, \".b-\")\n", + " plt.title(title)\n", " plt.show()\n", "\n", - "\n", - "if __name__ == \"__main__\":\n", - " test_mvp()\n" + "title = \"triangular graph\"\n", + "plot(time_points, data_points, title)" + ] + }, + { + "cell_type": "markdown", + "id": "endless-thinking", + "metadata": {}, + "source": [ + "Often, we want to to allow for some variance in the signal. For example, we could have the height of the peak vary between 1 and 2, the peak position could vary between 3 and 10, the left width of the triangle could vary between 0.5 and 1, and the right width of the triangle could vary between 4 and 6. 
We can define these variations in a parameter space:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "minute-minister", + "metadata": {}, + "outputs": [], + "source": [ + "parameter_space = ParameterSpace([\n", + " Dimension(\"height\", 1, 2),\n", + " Dimension(\"placement\", 3, 10),\n", + " Dimension(\"width_base_left\", 0.5, 1),\n", + " Dimension(\"width_base_right\", 4, 6),\n", + "])" + ] + }, + { + "cell_type": "markdown", + "id": "infinite-yield", + "metadata": {}, + "source": [ + "Next, we generate a set of arbitrary values based on this parameter space:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "fluid-narrow", + "metadata": {}, + "outputs": [], + "source": [ + "parameters = parameter_space.sample()" + ] + }, + { + "cell_type": "markdown", + "id": "compact-device", + "metadata": {}, + "source": [ + "We can verify that the parameter value generation was successful by checking the value of the `parameters` variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "cordless-lover", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'height': 1.0918637838415952,\n", + " 'placement': 4.6254030311596175,\n", + " 'width_base_left': 0.6145108647446988,\n", + " 'width_base_right': 5.594328900386932}" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "parameters" + ] + }, + { + "cell_type": "markdown", + "id": "wireless-mentor", + "metadata": {}, + "source": [ + "Each time you run the `sample` command, you will get a different set of values. 
The parameters can be used to create new data points:" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 10, + "id": "historic-waste", "metadata": {}, "outputs": [], - "source": [] + "source": [ + "variable_data_points = triangular_peak(time_points, **parameters)" + ] + }, + { + "cell_type": "markdown", + "id": "negative-treasury", + "metadata": {}, + "source": [ + "And the graph that corresponds with these data points can then be plotted. This time we mention the parameter values in the title of the graph:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "constant-patch", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "title = parameter_space.format_str().format(**parameters)\n", + "plot(time_points, variable_data_points, title)" + ] + }, + { + "cell_type": "markdown", + "id": "victorian-connection", + "metadata": {}, + "source": [ + "Notice that this graph might have a slightly different shape than the previous one. This depends on the position of the peak. Only when the position of the peak (parameter `placement`) coincides with a time point, will the signal shape be perfectly triangular. But since the peak position now is an arbitrary value, this is usually not the case. And when the peak position does not match a time point, the triangle can be slightly irregular at the top." + ] } ], "metadata": { From 12ff16589fec0d83b6e96d6b3c03b2c98719c4c4 Mon Sep 17 00:00:00 2001 From: Erik Tjong Kim Sang Date: Thu, 25 Mar 2021 23:10:58 +0100 Subject: [PATCH 2/3] added jupyter environment import --- README.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index d988207..6f9d181 100644 --- a/README.md +++ b/README.md @@ -35,6 +35,9 @@ pip install --upgrade pip wheel # (from the repository root directory) # install sequgen + jupyter pip install --requirement requirements.txt + +# add the new local environment to jupyter +ipython kernel install --name sequgen-demo --user ``` Start the notebook server: @@ -43,4 +46,4 @@ Start the notebook server: jupyter lab ``` -It will open a web browser with the Jupyter Lab environment, in file-browser on left side bar open the notebook (*.ipynb files) of interest. +It will open a web browser with the Jupyter Lab environment, in file-browser on left side bar open the notebook (\*.ipynb files) of interest. Select the kernel `sequgen-demo` to run the notebook. 
From fa9876b74d5799733c682196ee9d281792a47846 Mon Sep 17 00:00:00 2001 From: Erik Tjong Kim Sang Date: Tue, 30 Mar 2021 14:27:42 +0200 Subject: [PATCH 3/3] added headings --- test_mvp.ipynb | 27 ++++----------------------- 1 file changed, 4 insertions(+), 23 deletions(-) diff --git a/test_mvp.ipynb b/test_mvp.ipynb index 5841d54..08fe4b6 100644 --- a/test_mvp.ipynb +++ b/test_mvp.ipynb @@ -2,7 +2,6 @@ "cells": [ { "cell_type": "markdown", - "id": "settled-retro", "metadata": {}, "source": [ "# `sequgen` demo\n", @@ -13,7 +12,6 @@ { "cell_type": "code", "execution_count": 1, - "id": "engaging-liabilities", "metadata": {}, "outputs": [], "source": [ @@ -26,16 +24,16 @@ }, { "cell_type": "markdown", - "id": "stuck-current", "metadata": {}, "source": [ + "## Time series with static parameters\n", + "\n", "The time series will consist of two parts: the time points and the data points. For the time points we want to use the space between the numbers 0 and 20. We want to divide that space into 100 equal sections, so we need 101 points: 100 at the beginning of each section and 1 at the end of the final section. We can define these time points with this command:" ] }, { "cell_type": "code", "execution_count": 2, - "id": "precise-dublin", "metadata": {}, "outputs": [], "source": [ @@ -44,7 +42,6 @@ }, { "cell_type": "markdown", - "id": "stable-highway", "metadata": {}, "source": [ "We can verify that this process has worked by checking the contents of the variable `time_points`:" @@ -53,7 +50,6 @@ { "cell_type": "code", "execution_count": 3, - "id": "academic-relief", "metadata": {}, "outputs": [ { @@ -82,7 +78,6 @@ }, { "cell_type": "markdown", - "id": "adverse-bunny", "metadata": {}, "source": [ "In this time space, we want to create a triangularly shaped signal starting at time point 5 at value 0, rising to value 1 at time point 6 and dropping back to value 0 at time point 10. 
The following command can achieve this:" @@ -91,7 +86,6 @@ { "cell_type": "code", "execution_count": 4, - "id": "concrete-collar", "metadata": {}, "outputs": [], "source": [ @@ -100,7 +94,6 @@ }, { "cell_type": "markdown", - "id": "analyzed-truth", "metadata": {}, "source": [ "So we have asked for a triangular peak with height 1, placed at time position 6, where the left part of the triangle has width 1 (so it starts at 5) and the right part has width 4 (so it ends at time point 10). We can check the contents of the variable `data_points` to verify that the command has worked:" @@ -109,7 +102,6 @@ { "cell_type": "code", "execution_count": 5, - "id": "pregnant-tracy", "metadata": {}, "outputs": [ { @@ -138,7 +130,6 @@ }, { "cell_type": "markdown", - "id": "arbitrary-value", "metadata": {}, "source": [ "We notice that the standard value of the data points is zero. Only at the triangle that we have defined the values are non-zero. A graph provides a better view of the shape:" @@ -147,7 +138,6 @@ { "cell_type": "code", "execution_count": 6, - "id": "acknowledged-world", "metadata": {}, "outputs": [ { @@ -176,16 +166,16 @@ }, { "cell_type": "markdown", - "id": "endless-thinking", "metadata": {}, "source": [ + "## Time series with random parameters\n", + "\n", "Often, we want to allow for some variance in the signal. For example, we could have the height of the peak vary between 1 and 2, the peak position could vary between 3 and 10, the left width of the triangle could vary between 0.5 and 1, and the right width of the triangle could vary between 4 and 6. 
We can define these variations in a parameter space:" ] }, { "cell_type": "code", "execution_count": 7, - "id": "minute-minister", "metadata": {}, "outputs": [], "source": [ @@ -199,7 +189,6 @@ }, { "cell_type": "markdown", - "id": "infinite-yield", "metadata": {}, "source": [ "Next, we generate a set of arbitrary values based on this parameter space:" @@ -208,7 +197,6 @@ { "cell_type": "code", "execution_count": 8, - "id": "fluid-narrow", "metadata": {}, "outputs": [], "source": [ @@ -217,7 +205,6 @@ }, { "cell_type": "markdown", - "id": "compact-device", "metadata": {}, "source": [ "We can verify that the parameter value generation was successful by checking the value of the `parameters` variable:" @@ -226,7 +213,6 @@ { "cell_type": "code", "execution_count": 9, - "id": "cordless-lover", "metadata": {}, "outputs": [ { @@ -249,7 +235,6 @@ }, { "cell_type": "markdown", - "id": "wireless-mentor", "metadata": {}, "source": [ "Each time you run the `sample` command, you will get a different set of values. The parameters can be used to create new data points:" @@ -258,7 +243,6 @@ { "cell_type": "code", "execution_count": 10, - "id": "historic-waste", "metadata": {}, "outputs": [], "source": [ @@ -267,7 +251,6 @@ }, { "cell_type": "markdown", - "id": "negative-treasury", "metadata": {}, "source": [ "And the graph that corresponds with these data points can then be plotted. This time we mention the parameter values in the title of the graph:" @@ -276,7 +259,6 @@ { "cell_type": "code", "execution_count": 11, - "id": "constant-patch", "metadata": {}, "outputs": [ { @@ -299,7 +281,6 @@ }, { "cell_type": "markdown", - "id": "victorian-connection", "metadata": {}, "source": [ "Notice that this graph might have a slightly different shape than the previous one. This depends on the position of the peak. Only when the position of the peak (parameter `placement`) coincides with a time point, will the signal shape be perfectly triangular. 
But since the peak position now is an arbitrary value, this is usually not the case. And when the peak position does not match a time point, the triangle can be slightly irregular at the top."
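The point about off-grid peaks can be checked with a small standalone sketch. It does not call `sequgen`: `sample_triangle` and its parameters `apex`, `left` and `right` are hypothetical names chosen for illustration, and the piecewise-linear formula is an assumption about what a triangular peak looks like, not the library's implementation. The sketch samples the same triangle on a 0.2-spaced grid twice, once with the apex on a grid point and once with the apex between two grid points, and prints the largest sampled value in each case.

```python
def sample_triangle(times, height, apex, left, right):
    """Evaluate a piecewise-linear triangle at each time in `times`.

    Hypothetical helper for illustration only (not sequgen's API): the
    triangle rises from 0 at `apex - left` to `height` at `apex`, then
    falls back to 0 at `apex + right`.
    """
    values = []
    for t in times:
        if apex - left < t <= apex:
            # rising edge
            values.append(height * (t - (apex - left)) / left)
        elif apex < t < apex + right:
            # falling edge
            values.append(height * ((apex + right) - t) / right)
        else:
            # outside the triangle
            values.append(0.0)
    return values


# 101 evenly spaced time points from 0 to 20 (step 0.2)
times = [20 * i / 100 for i in range(101)]

on_grid = sample_triangle(times, height=1.0, apex=6.0, left=1.0, right=4.0)
off_grid = sample_triangle(times, height=1.0, apex=6.1, left=1.0, right=4.0)

print(max(on_grid))   # the apex is sampled exactly, so the full height is reached
print(max(off_grid))  # the apex falls between two samples, so the maximum is lower
```

With the apex between two samples, the plotted line connects the two neighbouring points directly and the tip of the triangle is cut off, which is the irregularity at the top described above.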