bloominstituteoftechnology · trevorwjames · Aug 10, 2020
diff --git a/.idea/.gitignore b/.idea/.gitignore
diff --git a/.idea/DS-Unit-3-Sprint-2-SQL-and-Databases.iml b/.idea/DS-Unit-3-Sprint-2-SQL-and-Databases.iml
diff --git a/.idea/inspectionProfiles/Project_Default.xml b/.idea/inspectionProfiles/Project_Default.xml
diff --git a/.idea/inspectionProfiles/profiles_settings.xml b/.idea/inspectionProfiles/profiles_settings.xml
diff --git a/.idea/misc.xml b/.idea/misc.xml
diff --git a/.idea/modules.xml b/.idea/modules.xml
diff --git a/.idea/vcs.xml b/.idea/vcs.xml
diff --git a/Study Guide/Unit 3 Sprint 2 SQL and Databases Study Guide.md b/Study Guide/Unit 3 Sprint 2 SQL and Databases Study Guide.md
@@ -0,0 +1,137 @@
+# Unit 3 Sprint 2 SQL and Databases Study Guide
+
+This study guide should reinforce and provide practice for all of the concepts you have seen in the past week.
+It contains a mix of written questions and coding exercises. Both are equally important for preparing you for the
+sprint challenge and for speaking about these topics comfortably in interviews and on the job.
+
+If you get stuck or are unsure of something, remember the 20-minute rule. If that doesn't help,
+then research a solution with [Google](https://www.google.com) or [StackOverflow](https://www.stackoverflow.com).
+Only once you have exhausted these methods should you turn to your Team Lead; they won't be there during your sprint challenge or during an interview.
+That being said, don't hesitate to ask for help if you truly are stuck.
+
+Have fun studying!
+
+## SQL
+
+**Concepts:**
+
+1. What is SQL?
+- Structured Query Language: a declarative language for querying relational databases to retrieve the data we want from
+specific tables, and for inserting, updating, and deleting that data
+2. What is a RDBMS?
+- Relational Database Management System: software that stores data in related tables and lets us query and manage it with SQL, e.g. SQLite or PostgreSQL (DB Browser is a GUI client for SQLite, not an RDBMS itself)
+3. What is an ETL pipeline?
+- Extract, Transform, Load: a process for moving data from a source system into a destination system.
+-- Extract = pull the data out of the source
+-- Transform = convert it from its original form into the form the destination expects
+-- Load = insert the transformed data into the destination database or structure
+4. What is a schema?
+- A schema is the blueprint of a database: it defines the tables, their columns and data types, constraints such as
+string lengths, and primary keys where necessary, so that the database only accepts data in the expected format
+
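+5. What does each letter in ACID stand for? Give an explanation for each and why they matter.
+	- **A** - Atomicity: a transaction either completes entirely or has no effect at all, so a failure never leaves partial changes
+	- **C** - Consistency: every transaction moves the database from one valid state to another, honoring all constraints
+	- **I** - Isolation: concurrent transactions do not see each other's intermediate states
+	- **D** - Durability: once a transaction is committed, its changes survive crashes and power loss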
+6. Explain each of the table relationships and give an example for each
+	- One-to-One: Country to Capital, each country has exactly one capital and each capital belongs to exactly one country
+	- One-to-Many: Book to Pages, one book has many pages, but each page belongs to only one book
+	- Many-to-Many: Books to Authors, an author can write many books, and a book can have many authors
+
+## Syntax
+For the following section, give a brief explanation of each of the SQL commands.
+
+1. **SELECT** - Specifies the columns to return FROM a given table in the DB
+('SELECT character_id, name, level FROM charactercreator_character')
+2. **WHERE** - Filters rows, keeping only those that satisfy a logical condition; conditions can be combined with AND and OR
+('WHERE character_id > 10 AND level > 10')
+3. **LIMIT** - Caps the number of rows the query returns, and therefore how many rows `.fetchall()` receives
+4. **ORDER** - Used as ORDER BY: sorts the results by the chosen column, ascending by default; add DESC for descending order
+5. **JOIN** - Combines rows from two tables using the condition given after ON. The variants are INNER, LEFT, RIGHT, and FULL OUTER. INNER is often preferred
+because it keeps only rows with a match in both tables, so it introduces no NULLs. LEFT keeps every row of the left table, which is useful for spotting rows with no match in the right table.
+6. **CREATE TABLE** - Defines a new table and its schema (column names and types); the table must exist before data can be loaded into it.
+7. **INSERT** - Adds new rows of data to a table
+8. **DISTINCT** - Keyword used in SELECT that removes duplicate rows from the results
+9. **GROUP BY** - Groups rows that share a value in the given column(s) so that aggregate functions such as COUNT or AVG return one result per group
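+10. **ORDER BY** - Sorts the result set by one or more columns, each either ASC (the default) or DESC
+11. **AVG** - Aggregate function that returns the mean of a numeric column
+12. **MAX** - Aggregate function that returns the largest value in a column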
+13. **AS** - Assigns an alias (a temporary name) to a column or table, which shortens the query and makes the output clearer
+
+## Starting From Scratch
+Create a file named `study_part1.py` and complete the exercise below. The only library you should need to import is `sqlite3`.
+ Don't forget to be PEP8 compliant!
+1. Create a new database file called `study_part1.sqlite3`
+2. Create a table with the following columns
+    ```
+    student - string
+    studied - string
+    grade - int
+    age - int
+    sex - string
+    ```
+
+3. Fill the table with the following data
+
+    ```
+    'Lion-O', 'True', 85, 24, 'Male'
+    'Cheetara', 'True', 95, 22, 'Female'
+    'Mumm-Ra', 'False', 65, 153, 'Male'
+    'Snarf', 'False', 70, 15, 'Male'
+    'Panthro', 'True', 80, 30, 'Male'
+    ```
+
+4. Save your data. You can check that everything is working so far if you can view the table and data in DB Browser
+
+5. Write the following queries to check your work. Query outputs should be formatted for readability: don't simply print a number to the screen with no explanation; add context.
+
+    ```
+    What is the average age? Expected Result - 48.8
+    What are the names of the female students? Expected Result - 'Cheetara'
+    How many students studied? Expected Result - 3
+    Return all students and all columns, sorted by student names in alphabetical order.
+    ```
+
+## Query All the Tables!
+
+### Setup
+Before we get started you'll need a few things.
+1. Download the [Chinook Database here](https://github.com/bundickm/Study-Guides/blob/master/data/Chinook_Sqlite.sqlite)
+2. The schema can be [found here](https://github.com/bundickm/Study-Guides/blob/master/data/Chinook%20Schema.png)
+3. Create a file named `study_part2.py` and complete the exercise below. The only library you should need to import is `sqlite3`. Don't forget to be PEP8 compliant!
+4. Add a connection to the chinook database so that you can answer the queries below.
+
+### Queries
+**Single Table Queries**
+1. Find the average invoice total for each customer and return the results for the first 5 customer IDs
+2. Return all columns in Customer for the first 5 customers residing in the United States
+3. Which employee does not report to anyone?
+4. Find the number of unique composers
+5. How many rows are in the Track table?
+
+**Joins**
+
+6. Get the names of all Black Sabbath tracks and the albums they come from
+7. What is the most popular genre by number of tracks?
+8. Find all customers that have spent over $45
+9. Find the first and last name, title, and the number of customers each employee has helped. If the customer count is 0 for an employee, it doesn't need to be displayed. Order the employees from most to least customers.
+10. Return the first and last name of each employee and who they report to
+
+## NoSQL
+
+### Questions of Understanding
+
+1. What is a document store?
+
+2. What is a `key:value` pair? What data type in Python uses `key:value` pairs?
+
+3. Give an example of when it would be best to use a SQL Database and when it would be best to use a NoSQL Database
+
+4. What are some of the trade-offs between SQL and NoSQL?
+
+5. What does each letter in BASE stand for? Give an explanation for each and why they matter.
+    - B
+    - A
+    - S
+    - E
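A minimal sketch of how several commands from the Syntax section of the study guide combine in practice. The `author` and `book` tables and their rows are hypothetical toy data, not part of Chinook or any database in this PR; the example only assumes Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical toy data held in memory; nothing is written to disk.
conn = sqlite3.connect(':memory:')
curs = conn.cursor()
curs.execute(
    'CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT);')
curs.execute(
    'CREATE TABLE book (book_id INTEGER PRIMARY KEY, '
    'author_id INTEGER, title TEXT, pages INTEGER);')
curs.executemany('INSERT INTO author VALUES (?, ?);',
                 [(1, 'Ann'), (2, 'Bob'), (3, 'Cy')])
curs.executemany('INSERT INTO book VALUES (?, ?, ?, ?);',
                 [(1, 1, 'A1', 100), (2, 1, 'A2', 300), (3, 2, 'B1', 250)])
conn.commit()

# INNER JOIN + GROUP BY: one row per author who has at least one book.
inner_query = """
    SELECT author.name AS author_name,
           COUNT(book.book_id) AS n_books,
           AVG(book.pages) AS avg_pages
    FROM author
    INNER JOIN book ON book.author_id = author.author_id
    GROUP BY author.author_id
    ORDER BY n_books DESC
    LIMIT 5;
"""
inner_rows = curs.execute(inner_query).fetchall()

# LEFT JOIN keeps authors without books; COUNT ignores the NULL book_id.
left_query = """
    SELECT author.name, COUNT(book.book_id) AS n_books
    FROM author
    LEFT JOIN book ON book.author_id = author.author_id
    GROUP BY author.author_id
    ORDER BY author.name;
"""
left_rows = curs.execute(left_query).fetchall()

# DISTINCT: count each author_id in book only once.
distinct_authors = curs.execute(
    'SELECT COUNT(DISTINCT author_id) FROM book;').fetchone()[0]

print('Books per author (inner join):', inner_rows)
print('Books per author (left join):', left_rows)
print('Authors with at least one book:', distinct_authors)
```

The inner join drops the author with no books, while the left join reports that author with a count of 0, which is the practical difference between the two variants.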
diff --git a/Study Guide/study_part1.py b/Study Guide/study_part1.py
@@ -0,0 +1,57 @@
+"""
+Study guide practicing importing data to sqlite file
+"""
+
+import sqlite3
+
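+# Connect to (or create) the SQLite database file and open a cursor
+sl_conn = sqlite3.connect('study_part1.sqlite3')
+sl_curs = sl_conn.cursor()
+
+# Table columns required by the study guide:
+#     student - string
+#     studied - string
+#     grade - int
+#     age - int
+#     sex - string
+
+# Drop any existing table so the script can be re-run from scratch
+sl_curs.execute("DROP TABLE IF EXISTS students;")
+sl_conn.commit()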
+
+create_table = """
+CREATE TABLE students (
+    student TEXT,
+    studied TEXT,
+    grade INT,
+    age INT,
+    sex TEXT
+);
+"""
+
+sl_curs.execute(create_table)
+
+sl_conn.commit()
+
+students = [
+    ('Lion-O', 'True', 85, 24, 'Male'),
+    ('Cheetara', 'True', 95, 22, 'Female'),
+    ('Mumm-Ra', 'False', 65, 153, 'Male'),
+    ('Snarf', 'False', 70, 15, 'Male'),
+    ('Panthro', 'True', 80, 30, 'Male')
+]
+
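+# Parameterized query: sqlite3 binds the values instead of string formatting
+insert = """
+    INSERT INTO students (student, studied, grade, age, sex)
+    VALUES (?, ?, ?, ?, ?);"""
+sl_curs.executemany(insert, students)
+
+sl_conn.commit()
+
+# Read the rows back to confirm the load worked
+sl_curs.execute('SELECT * FROM students;')
+results = sl_curs.fetchall()
+
+print('Rows in students table:')
+for row in results:
+    print(row)
+sl_conn.close()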
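The study guide's step 5 asks for four check queries that this script does not yet include. A minimal sketch of them follows, assuming the same `students` schema and rows as above; it uses an in-memory database as a stand-in for `study_part1.sqlite3` so that it runs on its own.

```python
import sqlite3

# In-memory stand-in for study_part1.sqlite3 (an assumption for this sketch).
conn = sqlite3.connect(':memory:')
curs = conn.cursor()
curs.execute("""
    CREATE TABLE students (
        student TEXT, studied TEXT, grade INT, age INT, sex TEXT
    );
""")
students = [
    ('Lion-O', 'True', 85, 24, 'Male'),
    ('Cheetara', 'True', 95, 22, 'Female'),
    ('Mumm-Ra', 'False', 65, 153, 'Male'),
    ('Snarf', 'False', 70, 15, 'Male'),
    ('Panthro', 'True', 80, 30, 'Male'),
]
curs.executemany('INSERT INTO students VALUES (?, ?, ?, ?, ?);', students)
conn.commit()

# What is the average age?
avg_age = curs.execute('SELECT AVG(age) FROM students;').fetchone()[0]

# What are the names of the female students?
female = [row[0] for row in curs.execute(
    "SELECT student FROM students WHERE sex = 'Female';")]

# How many students studied? (studied is stored as the string 'True')
n_studied = curs.execute(
    "SELECT COUNT(*) FROM students WHERE studied = 'True';").fetchone()[0]

# All students and all columns, sorted alphabetically by name
all_sorted = curs.execute(
    'SELECT * FROM students ORDER BY student;').fetchall()

print(f'Average age: {avg_age}')
print(f'Female students: {", ".join(female)}')
print(f'Number of students who studied: {n_studied}')
print('All students, sorted by name:')
for row in all_sorted:
    print(f'  {row}')
```

Each result is printed with a label, which is what the guide means by adding context to query output.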
diff --git a/Study Guide/study_part1.sqlite3 b/Study Guide/study_part1.sqlite3
diff --git a/module1-introduction-to-sql/.idea/.gitignore b/module1-introduction-to-sql/.idea/.gitignore
diff --git a/module1-introduction-to-sql/.idea/inspectionProfiles/Project_Default.xml b/module1-introduction-to-sql/.idea/inspectionProfiles/Project_Default.xml
diff --git a/module1-introduction-to-sql/.idea/inspectionProfiles/profiles_settings.xml b/module1-introduction-to-sql/.idea/inspectionProfiles/profiles_settings.xml
diff --git a/module1-introduction-to-sql/.idea/misc.xml b/module1-introduction-to-sql/.idea/misc.xml
diff --git a/module1-introduction-to-sql/.idea/module1-introduction-to-sql.iml b/module1-introduction-to-sql/.idea/module1-introduction-to-sql.iml
diff --git a/module1-introduction-to-sql/.idea/modules.xml b/module1-introduction-to-sql/.idea/modules.xml
diff --git a/module1-introduction-to-sql/.idea/vcs.xml b/module1-introduction-to-sql/.idea/vcs.xml
diff --git a/module1-introduction-to-sql/buddymove_holidayiq.py b/module1-introduction-to-sql/buddymove_holidayiq.py
@@ -0,0 +1,64 @@
+import pandas as pd
+import sqlite3
+
+# read in data using pandas
+df = pd.read_csv('buddymove_holidayiq.csv')
+
+# create database for csv file
+conn = sqlite3.connect('buddymove_holidayiq.sqlite3')
+
+# load the df into a 'reviews' table, replacing it if it already exists
+df.to_sql('reviews', con=conn, if_exists='replace')
+
+
+# function for running queries
+def execute_query(cursor, query):
+    """Run a query and return all resulting rows."""
+    cursor.execute(query)
+    return cursor.fetchall()
+
+
+curs = conn.cursor()
+
+# count number of rows in DB
+num_rows = """
+    SELECT COUNT(*)
+    FROM reviews;
+    """
+# Answer: 249
+results1 = execute_query(curs, num_rows)
+
+# How many users who reviewed at least 100 in the `Nature` category also
+#   reviewed at least 100 in the `Shopping` category?
+# "At least 100" includes 100 itself, so the comparison is >=, not >
+nature_and_shopping = """
+    SELECT COUNT(*)
+    FROM reviews
+    WHERE Nature >= 100
+    AND Shopping >= 100;
+    """
+# Answer: 78
+results2 = execute_query(curs, nature_and_shopping)
+
+# - (*Stretch*) What is the average number of reviews for each category?
+avg_reviews = """
+    SELECT AVG(Sports),
+           AVG(Religious),
+           AVG(Nature),
+           AVG(Shopping),
+           AVG(Picnic),
+           AVG(Theatre)
+    FROM reviews;
+    """
+results3 = execute_query(curs, avg_reviews)
+
+
+if __name__ == '__main__':
+    print('Report from buddymove_holidayiq \n'
+          f'Number of users: {results1[0][0]} \n'
+          'Users with at least 100 Nature and 100 Shopping reviews: '
+          f'{results2[0][0]} \n'
+          f'Average number of reviews Sports: {results3[0][0]} \n'
+          f'Average number of reviews Religious: {results3[0][1]} \n'
+          f'Average number of reviews Nature: {results3[0][2]} \n'
+          f'Average number of reviews Shopping: {results3[0][3]} \n'
+          f'Average number of reviews Picnic: {results3[0][4]} \n'
+          f'Average number of reviews Theatre: {results3[0][5]} \n'
+          )
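The question behind the second query says "at least 100", so the choice between `>` and `>=` changes which users are counted. A minimal sketch of the difference follows; the `user_id` column and the four rows are hypothetical and do not come from `buddymove_holidayiq.csv`.

```python
import sqlite3

# Hypothetical rows chosen so that some values sit exactly on the threshold.
conn = sqlite3.connect(':memory:')
curs = conn.cursor()
curs.execute(
    'CREATE TABLE reviews (user_id TEXT, Nature INT, Shopping INT);')
curs.executemany(
    'INSERT INTO reviews VALUES (?, ?, ?);',
    [('User 1', 100, 150),
     ('User 2', 120, 100),
     ('User 3', 99, 200),
     ('User 4', 130, 140)])
conn.commit()

threshold = 100

# "At least 100" includes the value 100 itself.
at_least = curs.execute(
    'SELECT COUNT(*) FROM reviews WHERE Nature >= ? AND Shopping >= ?;',
    (threshold, threshold)).fetchone()[0]

# "More than 100" excludes users sitting exactly on the threshold.
more_than = curs.execute(
    'SELECT COUNT(*) FROM reviews WHERE Nature > ? AND Shopping > ?;',
    (threshold, threshold)).fetchone()[0]

print(f'Users with at least {threshold} in both categories: {at_least}')
print(f'Users with more than {threshold} in both categories: {more_than}')
```

Passing the threshold as a bound parameter also keeps the query reusable for other cutoffs without editing the SQL string.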
diff --git a/module1-introduction-to-sql/buddymove_holidayiq.sqlite3 b/module1-introduction-to-sql/buddymove_holidayiq.sqlite3