bloominstituteoftechnology · nusc2016 · Jun 9, 2020 · Jun 9, 2020 · Jun 10, 2020 · Jun 10, 2020
diff --git a/Pipfile b/Pipfile
@@ -0,0 +1,12 @@
+[[source]]
+name = "pypi"
+url = "https://pypi.org/simple"
+verify_ssl = true
+
+[dev-packages]
+
+[packages]
+pandas = "*"
+
+[requires]
+python_version = "3.7"
diff --git a/Pipfile.lock b/Pipfile.lock
diff --git a/Unit-3-Sprint-Challenge-2/.gitkeep b/Unit-3-Sprint-Challenge-2/.gitkeep
@@ -0,0 +1 @@
+
diff --git a/Unit-3-Sprint-Challenge-2/challenge.md b/Unit-3-Sprint-Challenge-2/challenge.md
@@ -0,0 +1,144 @@
+# Data Science Unit 3 Sprint Challenge 2
+
+## Databases and SQL
+
+A SQL Query walks into a bar. In one corner of the bar are two tables. The Query
+walks up to the tables and asks:
+
+...
+
+*"Mind if I join you?"*
+
+---
+
+In this sprint challenge you will write code and answer questions related to
+databases, with a focus on SQL but an acknowledgment of the broader ecosystem.
+You may use any tools and references you wish, but your final code should
+reflect *your* work and be saved in `.py` files (*not* notebooks), and (along
+with this file including your written answers) turned in directly to your TL.
+
+For all your code, you may only import/use the following:
+- other modules you write
+- `sqlite3` (from the standard library)
+
+As always, make sure to manage your time - get a section/question to "good
+enough" and then move on to make sure you do everything. You can always revisit
+and polish at the end if time allows.
+
+This file is Markdown, so it may be helpful to open with VS Code or another tool
+that allows you to view it nicely rendered.
+
+Good luck!
+
+### Part 1 - Making and populating a Database
+
+Consider the following data:
+
+| s   | x | y |
+|-----|---|---|
+| 'g' | 3 | 9 |
+| 'v' | 5 | 7 |
+| 'f' | 8 | 7 |
+
+Using the standard `sqlite3` module:
+
+- Open a connection to a new (blank) database file `demo_data.sqlite3`
+- Make a cursor, and execute an appropriate `CREATE TABLE` statement to accept
+  the above data (name the table `demo`)
+- Write and execute appropriate `INSERT INTO` statements to add the data (as
+  shown above) to the database
+
+Make sure to `commit()` so your data is saved! The file size should be non-zero.
+
+Then write the following queries (also with `sqlite3`) to test:
+
+- Count how many rows you have - it should be 3!
+- How many rows are there where both `x` and `y` are at least 5?
+- How many unique values of `y` are there (hint - `COUNT()` can accept a keyword
+  `DISTINCT`)?
+
+Your code (to reproduce all above steps) should be saved in `demo_data.py` and
+added to the repository along with the generated SQLite database.
+
+### Part 2 - The Northwind Database
+
+Using `sqlite3`, connect to the given `northwind_small.sqlite3` database.
+
+![Northwind Entity-Relationship Diagram](./northwind_erd.png)
+
+Above is an entity-relationship diagram - a picture summarizing the schema and
+relationships in the database. Note that it was generated using Microsoft
+Access, and some of the specific table/field names are different in the provided
+data. You can see all the tables available to SQLite as follows:
+
+```python
+>>> curs.execute("SELECT name FROM sqlite_master WHERE type='table' "
+...              "ORDER BY name;").fetchall()
+[('Category',), ('Customer',), ('CustomerCustomerDemo',),
+('CustomerDemographic',), ('Employee',), ('EmployeeTerritory',), ('Order',),
+('OrderDetail',), ('Product',), ('Region',), ('Shipper',), ('Supplier',),
+('Territory',)]
+```
+
+*Warning*: unlike the diagram, the tables in SQLite are singular and not plural
+(do not end in `s`). And you can see the schema (`CREATE TABLE` statement)
+behind any given table with:
+```python
+>>> curs.execute("SELECT sql FROM sqlite_master WHERE name='Customer';").fetchall()
+[('CREATE TABLE "Customer" \n(\n  "Id" VARCHAR(8000) PRIMARY KEY, \n
+"CompanyName" VARCHAR(8000) NULL, \n  "ContactName" VARCHAR(8000) NULL, \n
+"ContactTitle" VARCHAR(8000) NULL, \n  "Address" VARCHAR(8000) NULL, \n  "City"
+VARCHAR(8000) NULL, \n  "Region" VARCHAR(8000) NULL, \n  "PostalCode"
+VARCHAR(8000) NULL, \n  "Country" VARCHAR(8000) NULL, \n  "Phone" VARCHAR(8000)
+NULL, \n  "Fax" VARCHAR(8000) NULL \n)',)]
+```
+
+In particular note that the *primary* key is `Id`, and not `CustomerId`. On
+other tables (where it is a *foreign* key) it will be `CustomerId`. Also note -
+the `Order` table conflicts with the `ORDER` keyword! We'll just avoid that
+particular table, but it's a good lesson in the danger of keyword conflicts.
+
+Answer the following questions (each is from a single table):
+
+- What are the ten most expensive items (per unit price) in the database?
+- What is the average age of an employee at the time of their hiring? (Hint: a
+  lot of arithmetic works with dates.)
+- (*Stretch*) How does the average age of employee at hire vary by city?
+
+Your code (to load and query the data) should be saved in `northwind.py`, and
+added to the repository. Do your best to answer in purely SQL, but if necessary
+use Python/other logic to help.
+
+### Part 3 - Sailing the Northwind Seas
+
+You've answered some basic questions from the Northwind database, looking at
+individual tables - now it's time to put things together, and `JOIN`!
+
+Using `sqlite3` in `northwind.py`, answer the following:
+
+- What are the ten most expensive items (per unit price) in the database *and*
+  their suppliers?
+- What is the largest category (by number of unique products in it)?
+- (*Stretch*) Who's the employee with the most territories? Use `TerritoryId`
+  (not name, region, or other fields) as the unique identifier for territories.
+
+### Part 4 - Questions (and your Answers)
+
+Answer the following questions, baseline ~3-5 sentences each, as if they were
+interview screening questions (a form you fill out when applying for a job):
+
+- In the Northwind database, what is the type of relationship between the
+  `Employee` and `Territory` tables?
+- What is a situation where a document store (like MongoDB) is appropriate, and
+  what is a situation where it is not appropriate?
+- What is "NewSQL", and what is it trying to achieve?
+
+### Part 5 - Turn it in!
+Provide all the files you wrote (`demo_data.py`, `northwind.py`), as well as
+this file with your answers to part 4, directly to your TL. You're also
+encouraged to include the output from your queries as docstring comments, to
+facilitate grading and feedback. Thanks for your hard work!
+
+If you got this far, check out the [larger Northwind
+database](https://github.com/jpwhite3/northwind-SQLite3/blob/master/Northwind_large.sqlite.zip) -
+your queries should run on it as well, with richer results.
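For reference, the Part 1 steps in `challenge.md` could be sketched roughly as follows. This is a minimal sketch, not a reference solution: it assumes an in-memory database in place of `demo_data.sqlite3` so it can be rerun without leftover state, and the column types (`TEXT`, `INTEGER`) are an assumption, since the challenge does not specify them.

```python
import sqlite3

# Assumption: ':memory:' stands in for 'demo_data.sqlite3' so reruns start clean.
conn = sqlite3.connect(":memory:")
curs = conn.cursor()

# Column types are an assumption; the challenge only shows the data.
curs.execute("CREATE TABLE demo (s TEXT, x INTEGER, y INTEGER);")

# Parameterized inserts for the three rows shown in the challenge table.
rows = [("g", 3, 9), ("v", 5, 7), ("f", 8, 7)]
curs.executemany("INSERT INTO demo (s, x, y) VALUES (?, ?, ?);", rows)
conn.commit()

# The three test queries from Part 1.
row_count = curs.execute("SELECT COUNT(*) FROM demo;").fetchone()[0]
both_at_least_5 = curs.execute(
    "SELECT COUNT(*) FROM demo WHERE x >= 5 AND y >= 5;"
).fetchone()[0]
distinct_y = curs.execute("SELECT COUNT(DISTINCT y) FROM demo;").fetchone()[0]

print(row_count, both_at_least_5, distinct_y)
```

Switching the connection string to `'demo_data.sqlite3'` would write the file the challenge asks for; rerunning against an existing file would then fail at `CREATE TABLE` unless the table is dropped first.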
diff --git a/Unit-3-Sprint-Challenge-2/northwind_erd.png b/Unit-3-Sprint-Challenge-2/northwind_erd.png
diff --git a/buddymove_holidayiq.sqlite3 b/buddymove_holidayiq.sqlite3
diff --git a/module1-introduction-to-sql/buddymove_holidayiq.py b/module1-introduction-to-sql/buddymove_holidayiq.py
@@ -0,0 +1,14 @@
+import pandas as pd
+import sqlite3
+# df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/00476/buddymove_holidayiq.csv")
+
+conn = sqlite3.connect('buddymove_holidayiq.sqlite3')
+# df.to_sql("review",con=conn)
+
+# Count the rows in the review table.
+
+print(conn.execute("SELECT COUNT(*) FROM review;").fetchall())
+# Show up to 10 rows where Nature is at least 100.
+print(conn.execute("SELECT * FROM review WHERE Nature >= 100 LIMIT 10;").fetchall())
+# Average value of each review category across all rows.
+print(conn.execute("SELECT AVG(Sports),AVG(Religious),AVG(Nature),AVG(Theatre),AVG(Shopping),AVG(Picnic) FROM review;").fetchall())
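The `AVG(...)` query in `buddymove_holidayiq.py` prints a bare tuple, which is hard to read with six columns. One possible way to label the values is to alias each aggregate and read the names back from `cursor.description`. The sketch below uses an in-memory table with only two of the columns and two made-up rows, purely for illustration; it does not use the real dataset.

```python
import sqlite3

# Hypothetical stand-in for the review table: two columns, made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review (Sports INTEGER, Nature INTEGER);")
conn.executemany("INSERT INTO review VALUES (?, ?);", [(2, 100), (4, 50)])

# Alias each aggregate so the result columns carry readable names.
cur = conn.execute(
    "SELECT AVG(Sports) AS Sports, AVG(Nature) AS Nature FROM review;"
)

# cursor.description holds one 7-tuple per result column; item 0 is the name.
labels = [col[0] for col in cur.description]
averages = dict(zip(labels, cur.fetchone()))

print(averages)
```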
diff --git a/module1-introduction-to-sql/rpg_queries.py b/module1-introduction-to-sql/rpg_queries.py
@@ -0,0 +1,44 @@
+import sqlite3
+conn = sqlite3.connect('rpg_db.sqlite3')  # relative path: run from module1-introduction-to-sql/
+c = conn.cursor()
+
+
+# How many total Characters are there?
+c1 = c
+print(c1.execute('SELECT COUNT(*) FROM charactercreator_character;').fetchall())
+
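+# How many of each specific subclass? (Assumes the usual charactercreator_<subclass> tables.)
+
+print({t: c1.execute(f'SELECT COUNT(*) FROM charactercreator_{t};').fetchone()[0] for t in ('cleric', 'fighter', 'mage', 'necromancer', 'thief')})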
+
+# How many total Items?
+
+print(c1.execute('SELECT COUNT(*) FROM armory_item;').fetchall())
+
+# How many of the Items are weapons?
+
+print(c1.execute('SELECT COUNT(*) FROM armory_weapon;').fetchall())
+
+# How many are not?
+
+print(c1.execute('SELECT (SELECT COUNT(*) FROM armory_item) - (SELECT COUNT(*) FROM armory_weapon);').fetchall())
+
+# How many Items does each character have? (Return first 20 rows)
+
+print(c1.execute('SELECT character_id, COUNT(*) FROM charactercreator_character_inventory GROUP BY character_id LIMIT 20;').fetchall())
+
+# How many Weapons does each character have? (Return first 20 rows)
+
+print(c1.execute('SELECT cci.character_id,count(*) FROM armory_weapon as aw, charactercreator_character_inventory as cci WHERE cci.item_id = aw.item_ptr_id GROUP BY cci.character_id LIMIT 20;').fetchall())
+
+# On average, how many Items does each Character have?
+
+table = c1.execute('SELECT character_id, COUNT(*) FROM charactercreator_character_inventory GROUP BY character_id;').fetchall()
+
+print(sum([x[1] for x in table]) / len(table))
+
+# On average, how many Weapons does each character have? (Counts only characters holding at least one weapon.)
+
+table = c1.execute('SELECT cci.character_id,count(*) FROM armory_weapon as aw, charactercreator_character_inventory as cci WHERE cci.item_id = aw.item_ptr_id GROUP BY cci.character_id;').fetchall()
+
+print(sum([x[1] for x in table]) / len(table))
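The two averages at the end of `rpg_queries.py` are computed in Python from per-character counts. The same result can probably be obtained in SQL alone by wrapping the grouped counts in a subquery and taking `AVG` over it. The sketch below assumes an in-memory copy of `charactercreator_character_inventory` with only the two columns the queries use and a few made-up rows; the real table likely has more columns.

```python
import sqlite3

# Hypothetical miniature of the inventory table, with made-up rows:
# character 1 holds three items, character 2 holds one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE charactercreator_character_inventory "
    "(id INTEGER PRIMARY KEY, character_id INTEGER, item_id INTEGER);"
)
conn.executemany(
    "INSERT INTO charactercreator_character_inventory (character_id, item_id) "
    "VALUES (?, ?);",
    [(1, 10), (1, 11), (1, 12), (2, 10)],
)

# Inner query: one row per character with its item count.
# Outer query: the average of those counts.
avg_items = conn.execute(
    """
    SELECT AVG(item_count) FROM (
        SELECT COUNT(*) AS item_count
        FROM charactercreator_character_inventory
        GROUP BY character_id
    );
    """
).fetchone()[0]

print(avg_items)
```

As with the Python version, characters with no inventory rows at all do not appear in the grouped result, so they are left out of the average.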
diff --git a/module2-sql-for-analysis/Stretch_goal_postgres_and_mongo.txt b/module2-sql-for-analysis/Stretch_goal_postgres_and_mongo.txt
@@ -0,0 +1,19 @@
+def increment(x):
+    return x + 1
+
+def double(x):
+    return x * 2
+
+def run_twice(func, arg):
+    return func(func(arg))
+
+def rec_print(n):
+    print(n)
+    if n > 0:
+        rec_print(n-1)
+
+def add(x, y):
+    return x + y
+
+def identity(x):
+    return x