Add pandas example based on XDev workshop #58

Can we add one sentence of explanatory text here, like:
you'll often see the nickname pd used as an abbreviation for pandas in the import statement, just like numpy is often imported as np

Reply via ReviewNB

Agreed. Might we consider adding this and similar explanatory content as a "tip", using the "alert alert-info" class?

I like the alert! Can we fence the code `pd`, `np` so they render as code rather than text? I think that helps readability.
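For reference, a minimal sketch of the import convention under discussion. The variable names are illustrative and are not taken from the notebook.

```python
import numpy as np
import pandas as pd

# `np` and `pd` are only local nicknames; the modules keep their real names.
values = np.arange(3)
series = pd.Series(values)

alias_names = (np.__name__, pd.__name__)
print(alias_names)
```

Any alias would work, but `np` and `pd` are the ones readers will meet in most tutorials and documentation.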

brian-rose · 2021-06-02T02:57:13Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


This is great! I suggest telling the reader more explicitly about the dataset, just one sentence to say what it is and where it comes from.

Reply via ReviewNB

Thanks for adding the description. There's an issue with the rendering of the special character in El Nino that's causing an odd line break.

I added in the tilde - it looks okay in the rendered notebook (El is on a different line, but it still shows up). Not really sure how to fix this one - I'd like to keep the tilde in there

brian-rose · 2021-06-02T02:57:13Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


I suggest adding a sentence of text here to point out that pandas provides really nicely formatted table output in the notebook, which is already an improvement over inspecting a plain list or numpy array!

Reply via ReviewNB

Yes; perhaps first add a line that reads "print (df)", then add a markdown cell that mentions how the output can be nicely-formatted (aka "pretty-printed").
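To make the contrast concrete, here is a small sketch comparing the plain-text output of `print(df)` with the HTML table that pandas can generate. The column names and values are made up for illustration and are not the notebook's dataset.

```python
import pandas as pd

# Hypothetical values, not the dataset used in the notebook.
df = pd.DataFrame({"month": ["Jan", "Feb", "Mar"], "sst": [27.1, 27.4, 27.9]})

plain_text = str(df)       # the text that print(df) writes
html_table = df.to_html()  # an HTML table, similar in spirit to the notebook's rich display

print(plain_text)
```

In a notebook, ending a cell with `df` on its own line triggers the rich table display, while `print(df)` always falls back to the plain-text form.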

brian-rose · 2021-06-02T02:57:13Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


Here, and in the next few cells, we are violating the style suggestion in our template which discourages using code comments as part of the narrative.

I suggest using the text
numpy-like interval slices
label-based slicing
another way of slicing
as sub-sub-headings so the narrative stays organized.

Reply via ReviewNB
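As a rough illustration of how those sub-sub-headings could map onto code, the sketch below selects the same two rows in three ways. The data are hypothetical, and reading "another way of slicing" as a `slice` object is an assumption, not something confirmed by the notebook.

```python
import pandas as pd

# Hypothetical monthly values; the labels are plain strings to keep the sketch small.
df = pd.DataFrame(
    {"sst": [27.1, 27.4, 27.9, 28.2]},
    index=["jan", "feb", "mar", "apr"],
)

positional = df.iloc[1:3]                # NumPy-like interval slice by integer position
by_label = df.loc["feb":"mar"]           # label-based slice using the index labels
with_slice_object = df.iloc[slice(1, 3)] # assumed meaning of "another way": an explicit slice object

print(list(positional.index))
```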

brian-rose · 2021-06-02T02:57:13Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


First sentence: explicitly point out that the integer indexing that we just showed above is not inclusive of the final value.

Then break this section up with a new subheading.

Reply via ReviewNB

I would suggest that we break out the first sentence as a Tip.

Looks good now!
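A short sketch of the point such a Tip would make: integer-position slices stop before the final value, while label-based slices include it. The Series below is hypothetical.

```python
import pandas as pd

# Hypothetical monthly values.
sst = pd.Series([27.1, 27.4, 27.9, 28.2], index=["jan", "feb", "mar", "apr"])

by_position = sst.iloc[0:2]       # stops before position 2, so two values
by_label = sst.loc["jan":"mar"]   # includes the "mar" label, so three values

print(len(by_position), len(by_label))
```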

brian-rose · 2021-06-02T02:57:14Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


we are interested in where the...

Reply via ReviewNB

brian-rose · 2021-06-02T02:57:14Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


Add some text here to explain what you're trying to do.

Reply via ReviewNB

brian-rose · 2021-06-02T02:57:14Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


It might be helpful to remind the reader here that we don't usually recommend converting to numpy array unless you need to, because then all the label information is lost.

Reply via ReviewNB

I suggest using the "Warning" alert rather than "Info" for this one -- goes along with the word "beware"!
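For readers of this thread, a minimal sketch of what is lost when converting with `.values`. The data are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical values with string row labels.
df = pd.DataFrame({"sst": [27.1, 27.4, 27.9]}, index=["jan", "feb", "mar"])

series = df["sst"]
array = series.values  # a plain numpy.ndarray: the numbers survive, the labels do not

print(series.loc["feb"])  # lookup by label still works on the Series
print(array[1])           # the array only supports positional lookup
```

Once the labels are gone, it is up to the code to remember that position 1 meant "feb", which is the kind of bookkeeping the warning is meant to discourage.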

brian-rose · 2021-06-02T02:57:14Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


First sentence... I'm not sure what you mean. The name of the variable is the name we assigned when we created the new column. Clarify?

Reply via ReviewNB

brian-rose · 2021-06-02T02:57:14Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


First bullet point, I suggest strengthening:
Pandas is a very powerful tool for working with tabular (i.e. spreadsheet-style) data

Add a third bullet point, after "multiple ways of subsetting your dataframe"...
Pandas allows you to refer to subsets of data by label, which generally makes code more readable and more robust

Reply via ReviewNB
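One way to illustrate the proposed third bullet is shown below: a label keeps pointing at the same data after the columns are reordered, whereas a position does not. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical coordinates.
df = pd.DataFrame({"lat": [40.0, 42.5], "lon": [-105.0, -73.8]})
reordered = df[["lon", "lat"]]

lat_by_label = reordered["lat"]           # still the latitude column
first_by_position = reordered.iloc[:, 0]  # now the longitude column

print(first_by_position.name)
```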

brian-rose · 2021-06-02T03:00:39Z

core/pandas.md

+
+From the [Pandas documentation](https://pandas.pydata.org/) "is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language."
+
+Pandas can be a useful library when working with tabular data, which is a common data type in the geosciences. You should have basic familiarity with NumPy and Matplotlib prior to working through the Pandas notebooks presented here.


I suggest stronger language here, something like:

"Pandas is a very powerful library for working with tabular data (i.e. anything you might put in a spreadsheet -- a common data type in the geosciences). It allows us to use labels for our data so that we can write expressive and robust code to manipulate the data."

This one is still unresolved.

brian-rose · 2021-06-02T03:02:36Z

Great stuff @mgrover1. I left a bunch of small suggestions here and in ReviewNB.

ktyle · 2021-06-02T16:17:43Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


I think we should briefly define what is meant by a Pandas index (essentially, a list containing the row IDs; by default, a sequential list of integers beginning with 0).

Reply via ReviewNB
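A small sketch of that definition, using made-up station data. The column names are illustrative only.

```python
import pandas as pd

# Hypothetical station data.
df = pd.DataFrame(
    {"station": ["KALB", "KDEN", "KBOU"], "temp_c": [12.0, 18.5, 17.0]}
)

default_labels = list(df.index)       # default index: sequential integers starting at 0
by_station = df.set_index("station")  # use a column as the row labels instead

print(default_labels)
print(list(by_station.index))
```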

ktyle · 2021-06-02T16:17:43Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


Add a tip that the "dot notation" is a "convenience feature" that is mostly interchangeable with the dict notation, except in cases where the column name is not a valid Python identifier, such as names beginning with a number or names that contain spaces.

Reply via ReviewNB

Looks good to me!
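A sketch of the cases such a tip would cover, with hypothetical column names. Python's `str.isidentifier` is used here only as a quick check of which names could work with dot notation.

```python
import pandas as pd

# Hypothetical columns: one valid identifier, one with a space, one starting with a digit.
df = pd.DataFrame(
    {"temp": [12.0, 18.5], "wind speed": [3.0, 5.5], "2m_temp": [11.0, 17.2]}
)

same = df.temp.equals(df["temp"])  # dot and dict notation agree for a simple name
wind = df["wind speed"]            # only dict notation can express this name
t2m = df["2m_temp"]                # likewise for a name starting with a digit

dot_ok = {name: name.isidentifier() for name in df.columns}
print(dot_ok)
```

Dot notation also falls short when a column shares its name with an existing DataFrame attribute or method, such as `count` or `mean`.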

ktyle · 2021-06-02T16:17:43Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


reword as:
and these can be extended in ways that you might expect, but perhaps also in unexpected ways:

Reply via ReviewNB

ktyle · 2021-06-02T16:17:43Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


Eliminate the two markdown cells that follow, and reword this one as follows:
These capabilities extend back to our original DataFrame, as well! Note there are limitations of the dict label notation: for example, neither of the two following code cells will work:

Reply via ReviewNB
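The two failing cells are not shown in this thread, so the sketch below is only one plausible reading of the limitation: on a DataFrame, dict-style brackets with a single key look up column labels, so a row label or an integer position raises `KeyError`. The data and the helper `lookup_fails` are hypothetical.

```python
import pandas as pd

# Hypothetical data; the notebook's actual failing cells may differ.
df = pd.DataFrame({"sst": [27.1, 27.4]}, index=["jan", "feb"])


def lookup_fails(key):
    """Return True if df[key] raises KeyError (hypothetical helper)."""
    try:
        df[key]
    except KeyError:
        return True
    return False


row_label_fails = lookup_fails("jan")  # "jan" is a row label, not a column
position_fails = lookup_fails(0)       # 0 is not a column label either

# The explicit indexers handle both cases.
by_label = df.loc["jan", "sst"]
by_position = df.iloc[0, 0]
print(row_label_fails, position_fails)
```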

ktyle · 2021-06-02T16:17:43Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


Break up these two sentences into their own Markdown cells. The first (which might best be a Tip) should read:
For a more comprehensive explanation, which includes additional examples and limitations and compares indexing methods between DataFrame and Series, see pandas' rules for indexing.

Reply via ReviewNB

ktyle · 2021-06-02T16:17:44Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


dataframes --> DataFrames

Reply via ReviewNB

ktyle · 2021-06-02T16:17:44Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


works --> accepts and returns

Reply via ReviewNB

ktyle · 2021-06-02T16:17:44Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


series --> Series

Reply via ReviewNB

ktyle · 2021-06-02T16:17:44Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


Reword as:
Let's apply this calculation to this Series; this returns another Series object.

Reply via ReviewNB
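To illustrate the reworded sentence, the sketch below applies a simple function to a Series and gets a Series back with the same index. The function `celsius_to_kelvin` is a hypothetical stand-in, since the notebook's actual calculation is not shown in this thread.

```python
import pandas as pd


def celsius_to_kelvin(temperature_c):
    """Hypothetical stand-in for the calculation used in the notebook."""
    return temperature_c + 273.15


# Hypothetical values.
sst_c = pd.Series([27.0, 28.0], index=["jan", "feb"], name="sst")
sst_k = celsius_to_kelvin(sst_c)  # arithmetic is applied element-wise

print(type(sst_k).__name__)
```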

ktyle · 2021-06-02T16:17:44Z

core/pandas/pandas_fullNotebook.ipynb

@@ -0,0 +1,931 @@
+{


pandas series --> pandas Series

Reply via ReviewNB

ktyle · 2021-06-02T18:05:18Z

@mgrover1 and @brian-rose, I'm all set with my initial round of comments. Great work! This is a very comprehensive, yet still concise, intro notebook. Do you have a sense as to how many additional notebooks will fill out this section?

mgrover1 · 2021-06-03T14:25:50Z

Thanks @brian-rose and @ktyle for the comments - just pushed the recent changes.

Do you have a sense as to how many additional notebooks will fill out this section?

I think at least one more, if not two, similar to these two notebooks from the Unidata Python Workshops:
Pythonic Data Analysis
Advanced Pythonic Data Analysis

github-actions · 2021-06-03T14:30:08Z

🚀 📚 Preview for git commit SHA: 64d5293 at: https://60b8e76de8c6620ef92440a3--pythia-foundations.netlify.app

github-actions · 2021-06-03T14:30:48Z

🚀 📚 Preview for git commit SHA: 64d5293 at: https://60b8e79010c0de0d5cc29c2b--pythia-foundations.netlify.app

brian-rose · 2021-06-03T15:19:25Z

Awesome, I put a few follow-up comments over on ReviewNB. I'm not sure how to make those appear here in this discussion, so I'll just repeat:

Render np and pd as code in the first alert box for the Imports section
Issue with the rendering of the word "El Nino" with the special character
Use "warning" alert rather than "info" for the last alert box about converting to numpy array with .values

My comment above on the Pandas section title page is still unresolved.

I went ahead and marked most of @ktyle 's comments as "resolved" where it was obvious that you took his suggestion and it looked all good. I left a few open in case Kevin has any more comments on them.

Nice work!

mgrover1 · 2021-06-04T13:50:20Z

@brian-rose just pushed the most recent changes

Render np and pd as code in the first alert box for the Imports section

Included the lines on both sides of these in the html - the warning blocks require using raw html for this

Issue with the rendering of the word "El Nino" with the special character

It looks like it still renders on the same line - ideally it would be good to keep the tilde in there? In the rendered notebook "El" is just on the line above it. In the rendered notebook locally, it does not create a separate line for the tilde.

Use "warning" alert rather than "info" for the last alert box about converting to numpy array with .values

Changed this to a warning

My comment above on the Pandas section title page is still unresolved.

I made sure to make the changes this time

The only thing still missing is using the data directory - going to wait to talk to @andersy005 about this.

github-actions · 2021-06-04T13:52:48Z

🚀 📚 Preview for git commit SHA: 602d48e at: https://60ba302ce0e9bc3369c7d3ae--pythia-foundations.netlify.app

github-actions · 2021-06-04T13:54:07Z

🚀 📚 Preview for git commit SHA: 602d48e at: https://60ba3077493c41298cc21b63--pythia-foundations.netlify.app

brian-rose · 2021-06-04T17:20:15Z

Just fixed my two remaining nits:

We can just put the character ñ directly into the markdown and the weird formatting problem goes away
Changed from "Danger" alert to "Warning" alert for that last alert box about .values

Also merged in all the latest from main just to make sure there were no conflicts.

One thing to be aware of: when we merge #53, the style of the alerts will change a bit. It's possible we'll need to tweak this notebook once more.

github-actions · 2021-06-04T17:23:35Z

🚀 📚 Preview for git commit SHA: 0fb945a at: https://60ba6194af38b99cc8f04c5f--pythia-foundations.netlify.app

github-actions · 2021-06-04T17:23:56Z

🚀 📚 Preview for git commit SHA: 0fb945a at: https://60ba61a8cb95967e02074508--pythia-foundations.netlify.app

brian-rose · 2021-06-04T17:26:37Z

My suggestion is that we merge now with the local data file just to get the content out there.

We can open an issue about using pythia-datasets, and come back and modify this notebook in a future PR once that is up and running.

In that spirit, I'm approving this now. If you prefer to wait until the datasets are ready @mgrover1, that's fine too.

ktyle

Yep, I agree with @brian-rose ... good to merge!

add intro content

88e71fa

Fix linting issue

e4b5c0d

dcamron mentioned this pull request May 27, 2021

Drop in Unidata pandas workshop notebook #34

Closed

update toc

6dd844e

dcamron added the content Content related issue label May 27, 2021

add pandas markdown file

70caf0f

mgrover1 requested review from brian-rose and ktyle June 1, 2021 13:41

mgrover1 added the ready for review label Jun 1, 2021

brian-rose reviewed Jun 2, 2021

ktyle reviewed Jun 2, 2021

address comments from Kevin/Brian

64d5293

address additional suggestions

602d48e

brian-rose added 2 commits June 4, 2021 13:04

Merge branch 'main' into pandas_example

6621a43

Tweak special character and warning

0fb945a

brian-rose approved these changes Jun 4, 2021

ktyle approved these changes Jun 4, 2021

mgrover1 merged commit 4f8fa7a into main Jun 4, 2021

andersy005 deleted the pandas_example branch June 9, 2021 15:37

