diff --git a/.idea/.gitignore b/.idea/.gitignore new file mode 100644 index 00000000..26d33521 --- /dev/null +++ b/.idea/.gitignore @@ -0,0 +1,3 @@ +# Default ignored files +/shelf/ +/workspace.xml diff --git a/.idea/DS-Unit-3-Sprint-2-SQL-and-Databases.iml b/.idea/DS-Unit-3-Sprint-2-SQL-and-Databases.iml new file mode 100644 index 00000000..a22cceb0 --- /dev/null +++ b/.idea/DS-Unit-3-Sprint-2-SQL-and-Databases.iml @@ -0,0 +1,10 @@ + + + + + + + + + + \ No newline at end of file diff --git a/.idea/inspectionProfiles/Project_Default.xml b/.idea/inspectionProfiles/Project_Default.xml new file mode 100644 index 00000000..86b0eecb --- /dev/null +++ b/.idea/inspectionProfiles/Project_Default.xml @@ -0,0 +1,12 @@ + + + + \ No newline at end of file diff --git a/.idea/inspectionProfiles/profiles_settings.xml b/.idea/inspectionProfiles/profiles_settings.xml new file mode 100644 index 00000000..105ce2da --- /dev/null +++ b/.idea/inspectionProfiles/profiles_settings.xml @@ -0,0 +1,6 @@ + + + + \ No newline at end of file diff --git a/.idea/misc.xml b/.idea/misc.xml new file mode 100644 index 00000000..a5d07d10 --- /dev/null +++ b/.idea/misc.xml @@ -0,0 +1,4 @@ + + + + \ No newline at end of file diff --git a/.idea/modules.xml b/.idea/modules.xml new file mode 100644 index 00000000..c2a4bb2f --- /dev/null +++ b/.idea/modules.xml @@ -0,0 +1,8 @@ + + + + + + + + \ No newline at end of file diff --git a/.idea/vcs.xml b/.idea/vcs.xml new file mode 100644 index 00000000..94a25f7f --- /dev/null +++ b/.idea/vcs.xml @@ -0,0 +1,6 @@ + + + + + + \ No newline at end of file diff --git a/Study Guide/Unit 3 Sprint 2 SQL and Databases Study Guide.md b/Study Guide/Unit 3 Sprint 2 SQL and Databases Study Guide.md new file mode 100644 index 00000000..bcef7599 --- /dev/null +++ b/Study Guide/Unit 3 Sprint 2 SQL and Databases Study Guide.md @@ -0,0 +1,137 @@ +# Unit 3 Sprint 2 SQL and Databases Study Guide + +This study guide should reinforce and provide practice for all of the concepts you 
have seen in the past week. +There is a mix of written questions and coding exercises; both are equally important to prepare you for the +sprint challenge as well as to be able to speak on these topics comfortably in interviews and on the job. + +If you get stuck or are unsure of something, remember the 20 minute rule. If that doesn't help, +then research a solution with [google](https://www.google.com) or [StackOverflow](https://www.stackoverflow.com). +Only once you have exhausted these methods should you turn to your Team Lead - they won't be there on your SC or during an interview. +That being said, don't hesitate to ask for help if you truly are stuck. + +Have fun studying! + +## SQL + +**Concepts:** + +1. What is SQL? +- Structured Query Language: the standard language for querying relational databases to obtain the data we want from specific +tables or columns +2. What is a RDBMS? +- Relational Database Management System: software that stores data in related tables and lets us query and manage it, e.g. SQLite (viewed through DB Browser) or Postgres +3. What is an ETL pipeline? +- Extract Transform Load: a process DS uses to take data from one place and put it into another. +-- Extract = pulling the data out of the source +-- Transform = converting it from one form to another form +-- Load = inserting the transformed data into the new database or structure. +4. What is a schema? +- A schema is the blueprint that defines the structure of a database and which formats of data it accepts: table and column names, +datatypes, lengths of strings, and primary keys when necessary + +5. What does each letter in ACID stand for? Give an explanation for each and why they matter? + - **A** + - **C** + - **I** + - **D** +6. 
Explain each of the table relationships and give an example for each + - One-to-One: Country to Capital, each instance has exactly one matching instance on the other side + - One-to-Many: Book to Pages, one instance is connected to many instances on the other side + - Many-to-Many: Books to Authors, many authors write many books, and several authors can share a book. + +## Syntax +For the following section, give a brief explanation of each of the SQL commands. + +1. **SELECT** - Specifies the columns that are wanted FROM a certain table in the DB +('SELECT character_id, name, level FROM charactercreator_character') +2. **WHERE** - Filters rows by a logical condition; conditions can be combined with AND / OR. +('WHERE character_id > 10 AND level > 10') +3. **LIMIT** - Caps the number of rows the query returns, and so the number of rows .fetchall() receives +4. **ORDER** - Choosing the column by which the results are sorted - can use number based columns and add DESC for descending order +5. **JOIN** - INNER, LEFT, RIGHT, FULL OUTER, used with ON. Merges rows from two tables on a matching condition. INNER is often preferred, +because it keeps only rows that match in both tables, so no missing values appear. LEFT keeps every row of the left table, which is useful for seeing which rows have no match. +6. **CREATE TABLE** - When loading data there needs to be a table with a proper schema to load the data into. +7. **INSERT** - Inserts new rows of data into the table +8. **DISTINCT** - Keyword used with SELECT that returns each value only once, dropping repeats +9. **GROUP BY** - Groups rows that share a value so that aggregates such as COUNT or AVG are computed per group +10. **ORDER BY** - +11. **AVG** - +12. **MAX** - +13. **AS** - Gives a column or table an alias, which names the output and limits the amount of writing necessary for the query + +## Starting From Scratch +Create a file named `study_part1.py` and complete the exercise below. The only library you should need to import is `sqlite3`. + Don't forget to be PEP8 compliant! +1. Create a new database file called `study_part1.sqlite3` +2. 
Create a table with the following columns + ``` + student - string + studied - string + grade - int + age - int + sex - string + ``` + +3. Fill the table with the following data + + ``` + 'Lion-O', 'True', 85, 24, 'Male' + 'Cheetara', 'True', 95, 22, 'Female' + 'Mumm-Ra', 'False', 65, 153, 'Male' + 'Snarf', 'False', 70, 15, 'Male' + 'Panthro', 'True', 80, 30, 'Male' + ``` + +4. Save your data. You can check that everything is working so far if you can view the table and data in DBBrowser + +5. Write the following queries to check your work. Query outputs should be formatted for readability: don't simply print a number to the screen with no explanation, add context. + + ``` + What is the average age? Expected Result - 48.8 + What are the names of the female students? Expected Result - 'Cheetara' + How many students studied? Expected Result - 3 + Return all students and all columns, sorted by student names in alphabetical order. + ``` + +## Query All the Tables! + +### Setup +Before we get started you'll need a few things. +1. Download the [Chinook Database here](https://github.com/bundickm/Study-Guides/blob/master/data/Chinook_Sqlite.sqlite) +2. The schema can be [found here](https://github.com/bundickm/Study-Guides/blob/master/data/Chinook%20Schema.png) +3. Create a file named `study_part2.py` and complete the exercise below. The only library you should need to import is `sqlite3`. Don't forget to be PEP8 compliant! +4. Add a connection to the chinook database so that you can answer the queries below. + +### Queries +**Single Table Queries** +1. Find the average invoice total for each customer, return the details for the first 5 ID's +2. Return all columns in Customer for the first 5 customers residing in the United States +3. Which employee does not report to anyone? +4. Find the number of unique composers +5. How many rows are in the Track table? + +**Joins** + +6. Get the name of all Black Sabbath tracks and the albums they came off of +7. 
What is the most popular genre by number of tracks? +8. Find all customers that have spent over $45 +9. Find the first and last name, title, and the number of customers each employee has helped. If the customer count is 0 for an employee, it doesn't need to be displayed. Order the employees from most to least customers. +10. Return the first and last name of each employee and who they report to + +## NoSQL + +### Questions of Understanding + +1. What is a document store? + +2. What is a `key:value` pair? What data type in Python uses `key:value` pairs? + +3. Give an example of when it would be best to use a SQL Database and when it would be best to use a NoSQL Database + +4. What are some of the trade-offs between SQL and NoSQL? + +5. What does each letter in BASE stand for? Give an explanation for each and why they matter? + - B + - A + - S + - E diff --git a/Study Guide/study_part1.py b/Study Guide/study_part1.py new file mode 100644 index 00000000..650f3524 --- /dev/null +++ b/Study Guide/study_part1.py @@ -0,0 +1,57 @@ +""" +Study guide: practice loading data into a sqlite file +""" + +import sqlite3 + +# connect to (or create) the database file + +sl_conn = sqlite3.connect('study_part1.sqlite3') + +sl_curs = sl_conn.cursor() + +""" + student - string + studied - string + grade - int + age - int + sex - string + """ + +sl_curs.execute("DROP TABLE IF EXISTS students;") +sl_conn.commit() + +create_table = """ +CREATE TABLE students ( + student TEXT, + studied TEXT, + grade INT, + age INT, + sex TEXT +); +""" + +sl_curs.execute(create_table) + +sl_conn.commit() + +students = [ + ('Lion-O', 'True', 85, 24, 'Male'), + ('Cheetara', 'True', 95, 22, 'Female'), + ('Mumm-Ra', 'False', 65, 153, 'Male'), + ('Snarf', 'False', 70, 15, 'Male'), + ('Panthro', 'True', 80, 30, 'Male') +] + +for student in students: + insert = """ + INSERT INTO students (student, studied, grade, age, sex) + VALUES (?, ?, ?, ?, ?);""" + sl_curs.execute(insert, student) + +sl_conn.commit() + +sl_curs.execute('SELECT * FROM students;') 
+results = sl_curs.fetchall() + +print(results) diff --git a/Study Guide/study_part1.sqlite3 b/Study Guide/study_part1.sqlite3 new file mode 100644 index 00000000..4c50a843 Binary files /dev/null and b/Study Guide/study_part1.sqlite3 differ diff --git a/module1-introduction-to-sql/.idea/.gitignore b/module1-introduction-to-sql/.idea/.gitignore new file mode 100644 index 00000000..26d33521 --- /dev/null +++ b/module1-introduction-to-sql/.idea/.gitignore @@ -0,0 +1,3 @@ +# Default ignored files +/shelf/ +/workspace.xml diff --git a/module1-introduction-to-sql/.idea/inspectionProfiles/Project_Default.xml b/module1-introduction-to-sql/.idea/inspectionProfiles/Project_Default.xml new file mode 100644 index 00000000..86b0eecb --- /dev/null +++ b/module1-introduction-to-sql/.idea/inspectionProfiles/Project_Default.xml @@ -0,0 +1,12 @@ + + + + \ No newline at end of file diff --git a/module1-introduction-to-sql/.idea/inspectionProfiles/profiles_settings.xml b/module1-introduction-to-sql/.idea/inspectionProfiles/profiles_settings.xml new file mode 100644 index 00000000..105ce2da --- /dev/null +++ b/module1-introduction-to-sql/.idea/inspectionProfiles/profiles_settings.xml @@ -0,0 +1,6 @@ + + + + \ No newline at end of file diff --git a/module1-introduction-to-sql/.idea/misc.xml b/module1-introduction-to-sql/.idea/misc.xml new file mode 100644 index 00000000..d1e22ecb --- /dev/null +++ b/module1-introduction-to-sql/.idea/misc.xml @@ -0,0 +1,4 @@ + + + + \ No newline at end of file diff --git a/module1-introduction-to-sql/.idea/module1-introduction-to-sql.iml b/module1-introduction-to-sql/.idea/module1-introduction-to-sql.iml new file mode 100644 index 00000000..d0876a78 --- /dev/null +++ b/module1-introduction-to-sql/.idea/module1-introduction-to-sql.iml @@ -0,0 +1,8 @@ + + + + + + + + \ No newline at end of file diff --git a/module1-introduction-to-sql/.idea/modules.xml b/module1-introduction-to-sql/.idea/modules.xml new file mode 100644 index 00000000..d53bc76e --- 
/dev/null +++ b/module1-introduction-to-sql/.idea/modules.xml @@ -0,0 +1,8 @@ + + + + + + + + \ No newline at end of file diff --git a/module1-introduction-to-sql/.idea/vcs.xml b/module1-introduction-to-sql/.idea/vcs.xml new file mode 100644 index 00000000..6c0b8635 --- /dev/null +++ b/module1-introduction-to-sql/.idea/vcs.xml @@ -0,0 +1,6 @@ + + + + + + \ No newline at end of file diff --git a/module1-introduction-to-sql/buddymove_holidayiq.py b/module1-introduction-to-sql/buddymove_holidayiq.py new file mode 100644 index 00000000..dc8fe178 --- /dev/null +++ b/module1-introduction-to-sql/buddymove_holidayiq.py @@ -0,0 +1,64 @@ +import pandas as pd +import sqlite3 + +# read in data using pandas +df = pd.read_csv('buddymove_holidayiq.csv') + +# create database for csv file +conn = sqlite3.connect('buddymove_holidayiq.sqlite3') + +# convert df into DB for SQL use (run once; the table already exists afterwards) +# df.to_sql('reviews', con=conn) + +# function for running queries +def execute_query(cursor, query): + cursor.execute(query) + return cursor.fetchall() + +curs = conn.cursor() + +# count number of rows in DB +num_rows = """ + SELECT COUNT(*) + FROM reviews; + """ +# Answer: 249 +results1 = execute_query(curs, num_rows) + +# How many users who reviewed at least 100 in the `Nature` category also +# reviewed at least 100 in the `Shopping` category? +karens = """ + SELECT COUNT(*) +FROM reviews +WHERE Nature >= 100 +AND Shopping >= 100; +""" +# Answer: 78 +results2 = execute_query(curs, karens) + +# - (*Stretch*) What is the average number of reviews for each category? 
+avg_reviews = """ +SELECT AVG(Sports), +AVG(Religious), +AVG(Nature), +AVG(Shopping), +AVG(Picnic), +AVG(Theatre) +FROM reviews; +""" +results3 = execute_query(curs, avg_reviews) + + + + +if __name__ == '__main__': + print(f'Report from buddymove_holidayiq \n' + f'Number of Users: {results1[0][0]} \n' + f'Number of Users with at least 100 Nature reviews and at least 100 Shopping reviews: {results2[0][0]} \n' + f'Average number of reviews Sports: {results3[0][0]} \n' + f'Average number of reviews Religious: {results3[0][1]} \n' + f'Average number of reviews Nature: {results3[0][2]} \n' + f'Average number of reviews Shopping: {results3[0][3]} \n' + f'Average number of reviews Picnic: {results3[0][4]} \n' + f'Average number of reviews Theatre: {results3[0][5]} \n' + ) diff --git a/module1-introduction-to-sql/buddymove_holidayiq.sqlite3 b/module1-introduction-to-sql/buddymove_holidayiq.sqlite3 new file mode 100644 index 00000000..ce15641d Binary files /dev/null and b/module1-introduction-to-sql/buddymove_holidayiq.sqlite3 differ diff --git a/module1-introduction-to-sql/rpg_queries.py b/module1-introduction-to-sql/rpg_queries.py new file mode 100644 index 00000000..25e7d601 --- /dev/null +++ b/module1-introduction-to-sql/rpg_queries.py @@ -0,0 +1,167 @@ +""" +- How many total Characters are there? +- How many of each specific subclass? +- How many total Items? +- How many of the Items are weapons? How many are not? +- How many Items does each character have? (Return first 20 rows) +- How many Weapons does each character have? (Return first 20 rows) +- On average, how many Items does each Character have? +- On average, how many Weapons does each character have? 
+""" + +import sqlite3 + + +def connect_to_db(db_name='rpg_db.sqlite3'): + return sqlite3.connect(db_name) + + +def execute_query(cursor, query): + cursor.execute(query) + return cursor.fetchall() + + +conn = connect_to_db() +curs = conn.cursor() + +# Total Number of Characters +total_characters = """ + SELECT COUNT(*) + FROM charactercreator_character; + """ +results1 = execute_query(curs, total_characters) + +# How many of each specific subclass? +# Cleric Class +cleric_class = """ +SELECT COUNT(*) +FROM charactercreator_cleric; +""" +resultscleric = execute_query(curs, cleric_class) +# answer: 75 + +# Fighter Class +fighter_class = """ +SELECT COUNT(*) +FROM charactercreator_fighter; +""" +resultsfighter = execute_query(curs, fighter_class) +# answer: 68 + +# Mage Class +mage_class = """ +SELECT COUNT(*) +FROM charactercreator_mage; +""" +resultsmage = execute_query(curs, mage_class) +# answer: 108 + +# Necromancer +necromancer_class = """ +SELECT COUNT(*) +FROM charactercreator_necromancer; +""" +resultsnecromancer = execute_query(curs, necromancer_class) +# answer: 11 + +# Thief +thief_class = """ +SELECT COUNT(*) +FROM charactercreator_thief; +""" +resultstheif = execute_query(curs, thief_class) +# Answer: 51 + +# Total Items +total_items = """ +SELECT COUNT(*) +FROM armory_item; +""" +resultsitems = execute_query(curs, total_items) +# Answer: 174 + +# How Many Items are weapons (contain a power attr)? +total_weapons = """ +SELECT COUNT(*) +FROM armory_weapon +""" +results_weapons = execute_query(curs, total_weapons) +# Answer: 37 + +# How Many Items are not weapons (reuses the two counts fetched above) +not_weapons = resultsitems[0][0] - results_weapons[0][0] +# answer: 137 + +# How many Items does each character have? 
(Return first 20 rows) +character_items_20 = """ +SELECT character_id, name, COUNT(item_id) FROM +(SELECT cc.character_id, cc.name, ai.item_id, ai.name AS item_name +FROM charactercreator_character AS cc, +armory_item AS ai, +charactercreator_character_inventory AS cci +WHERE cc.character_id = cci.character_id +AND ai.item_id = cci.item_id) +GROUP BY 1 ORDER BY 3 DESC +LIMIT 20;""" +resultscharitems20 = execute_query(curs, character_items_20) + +# How many weapons does each character have? (Return first 20 rows) +character_weapons_20 = """ +SELECT character_id, name, COUNT(item_ptr_id) FROM +(SELECT cc.character_id, cc.name, aw.item_ptr_id, aw.power +FROM charactercreator_character AS cc, +armory_weapon AS aw, +charactercreator_character_inventory AS cci +WHERE cc.character_id = cci.character_id +AND aw.item_ptr_id = cci.item_id) +GROUP BY 1 ORDER BY 3 DESC +LIMIT 20;""" +resultscharweapons20 = execute_query(curs, character_weapons_20) + +# Avg items per character +avg_items_character = """ +SELECT AVG(nc) FROM +(SELECT character_id, COUNT(DISTINCT item_id) AS nc FROM +(SELECT cc.character_id, cc.name, ai.item_id, ai.name AS item_name +FROM charactercreator_character AS cc, +armory_item AS ai, +charactercreator_character_inventory AS cci +WHERE cc.character_id = cci.character_id +AND ai.item_id = cci.item_id) +GROUP BY 1 ORDER BY 2 DESC) """ +resultsavgitems = execute_query(curs, avg_items_character) +# answer : 2.97 + +# Avg weapons per character (only characters holding at least one weapon are counted) +avg_weapons_character = """ +SELECT AVG(nc) FROM +(SELECT character_id, COUNT(DISTINCT item_ptr_id) AS nc FROM +(SELECT cc.character_id, cc.name, aw.item_ptr_id, aw.power +FROM charactercreator_character AS cc, +armory_weapon AS aw, +charactercreator_character_inventory AS cci +WHERE cc.character_id = cci.character_id +AND aw.item_ptr_id = cci.item_id) +GROUP BY 1 ORDER BY 2 DESC) +""" +resultsavgweapons = execute_query(curs, avg_weapons_character) +# Answer = 1.31 + +if __name__ == '__main__': + print( + f'Report of rpg_queries \n' + f'Total 
Number of Characters: {results1[0][0]} \n \n' + f'Number of Clerics: {resultscleric[0][0]} \n' + f'Number of Fighters: {resultsfighter[0][0]} \n' + f'Number of Mages: {resultsmage[0][0]} \n' + f'Number of Necromancers: {resultsnecromancer[0][0]} \n' + f'Number of Thieves: {resultstheif[0][0]} \n \n' + f'Total Items: {resultsitems[0][0]} \n' + f'Total Weapons: {results_weapons[0][0]}\n' + f'Total Items that are not Weapons: {not_weapons}\n \n' + f'Top 20 Characters Number of Items: {resultscharitems20}\n \n' + f'Top 20 Characters Number of Weapons: {resultscharweapons20}\n \n' + f'Avg items per Character: {round(resultsavgitems[0][0], 2)}\n' + f'Avg weapons per Character: {round(resultsavgweapons[0][0], 2)}\n' + + ) diff --git a/module1-introduction-to-sql/test_db.sqlite3 b/module1-introduction-to-sql/test_db.sqlite3 new file mode 100644 index 00000000..e2b8f8a4 Binary files /dev/null and b/module1-introduction-to-sql/test_db.sqlite3 differ diff --git a/module2-sql-for-analysis/.idea/module2-sql-for-analysis.iml b/module2-sql-for-analysis/.idea/module2-sql-for-analysis.iml new file mode 100644 index 00000000..d0876a78 --- /dev/null +++ b/module2-sql-for-analysis/.idea/module2-sql-for-analysis.iml @@ -0,0 +1,8 @@ + + + + + + + + \ No newline at end of file diff --git a/module2-sql-for-analysis/insert_titanic.py b/module2-sql-for-analysis/insert_titanic.py new file mode 100644 index 00000000..19cdf384 --- /dev/null +++ b/module2-sql-for-analysis/insert_titanic.py @@ -0,0 +1,73 @@ +""" +Transferring the titanic dataset into elephantsql +I worked from the example you showed us during standup, as converting to sqlite and then transferring it with the +for loop was much more difficult. I think going back over how we used the list comprehension would make it +a little easier to understand on my own. 
+""" +# imports +import psycopg2 +import pandas as pd +from psycopg2.extras import execute_values + +# reading in titanic Data +df = pd.read_csv('titanic.csv') + +# renaming columns in order to have them read into elephant +df = df.rename(columns={'Siblings/Spouses Aboard': 'siblingsspouse'}) +df = df.rename(columns={'Parents/Children Aboard': 'parentschildren'}) + +# getting rid of unnecessary apostrophes +df['Name'] = df['Name'].str.replace("'", "") + +# creds for cloud DB, password is TOP SECRET +dbname = 'zgexitff' +user = 'zgexitff' +password = 'XXX' +host = 'isilo.db.elephantsql.com' + +# connection to cloud +pg_conn = psycopg2.connect(dbname=dbname, user=user, password=password, host=host) + +# Cursor +pg_curs = pg_conn.cursor() + +# creating Titanic Table +create_titanic_table = """ +DROP TABLE IF EXISTS Titanic; +CREATE TABLE Titanic ( + index INT, + Survived INT, + Pclass INT, + Name TEXT, + Sex TEXT, + Age REAL, + siblingsspouse INT, + parentschildren INT, + Fare REAL +); +""" + +# creating the table and committing it +pg_curs.execute(create_titanic_table) +pg_conn.commit() + +# using the execute_values function - Would like to go over this again to enhance my understanding +execute_values(pg_curs, """ +INSERT INTO Titanic +(Survived, Pclass, Name, Sex, Age, siblingsspouse, parentschildren, Fare) +VALUES %s; +""", [tuple(row) for row in df.values]) + +# commit +pg_conn.commit() + + +pg_curs.execute(""" +SELECT * +FROM Titanic +LIMIT 1; +""") + +# printing to validate +print(pg_curs.fetchall()) + diff --git a/module2-sql-for-analysis/rpg_db.sqlite3 b/module2-sql-for-analysis/rpg_db.sqlite3 new file mode 100644 index 00000000..837d7f16 Binary files /dev/null and b/module2-sql-for-analysis/rpg_db.sqlite3 differ diff --git a/module2-sql-for-analysis/rpg_transfer.py b/module2-sql-for-analysis/rpg_transfer.py new file mode 100644 index 00000000..d1181c5e --- /dev/null +++ b/module2-sql-for-analysis/rpg_transfer.py @@ -0,0 +1,63 @@ +""" +Take RPG data from sqlite3 to our 
elephant sql DB +""" + +import psycopg2 +import sqlite3 + +dbname = 'zgexitff' +user = 'zgexitff' +password = 'XXX' +host = 'isilo.db.elephantsql.com' + +pg_conn = psycopg2.connect(dbname=dbname, user=user, password=password, host=host) + +pg_curs = pg_conn.cursor() + +sl_conn = sqlite3.connect('rpg_db.sqlite3') + +sl_curs = sl_conn.cursor() + +""" +Create function to run queries +""" + + +# Query for getting the table +get_characters = "SELECT * FROM charactercreator_character;" + +sl_curs.execute(get_characters) + +characters = sl_curs.fetchall() + +create_character_table = """ +DROP TABLE IF EXISTS charactercreator_character; +CREATE TABLE charactercreator_character ( + character_id SERIAL PRIMARY KEY, + name VARCHAR(30), + level INT, + exp INT, + hp INT, + strength INT, + intelligence INT, + dexterity INT, + wisdom INT +); +""" + +pg_curs.execute(create_character_table) + +pg_conn.commit() + +for character in characters: + insert_character = """ + INSERT INTO charactercreator_character + (name, level, exp, hp, strength, intelligence, dexterity, wisdom) + VALUES (%s, %s, %s, %s, %s, %s, %s, %s);""" + pg_curs.execute(insert_character, character[1:]) + +pg_conn.commit() + +if __name__ == '__main__': + pg_curs.execute('SELECT * FROM charactercreator_character LIMIT 5;') + print(pg_curs.fetchall()) diff --git a/module2-sql-for-analysis/titanic-sqlite.py b/module2-sql-for-analysis/titanic-sqlite.py new file mode 100644 index 00000000..7f3fa679 --- /dev/null +++ b/module2-sql-for-analysis/titanic-sqlite.py @@ -0,0 +1,15 @@ +""" +taking titanic file and transferring into sqlite3 file +""" + +import pandas as pd +import sqlite3 + +df = pd.read_csv('titanic.csv') + +conn = sqlite3.connect('titanic.sqlite3') + +# df.to_sql('titanic', con=conn) + +if __name__ == '__main__': + df.to_sql('titanic', con=conn) diff --git a/module2-sql-for-analysis/titanic.sqlite3 b/module2-sql-for-analysis/titanic.sqlite3 new file mode 100644 index 00000000..091a1272 Binary files 
/dev/null and b/module2-sql-for-analysis/titanic.sqlite3 differ diff --git a/module3-nosql-and-document-oriented-databases/MongoDB.py b/module3-nosql-and-document-oriented-databases/MongoDB.py new file mode 100644 index 00000000..23a1b834 --- /dev/null +++ b/module3-nosql-and-document-oriented-databases/MongoDB.py @@ -0,0 +1,30 @@ +""" +Connecting to Mongo DB in order to upload RPG sqlite DB +""" + +# imports for today +from pymongo import MongoClient +import sqlite3 as sql +import ssl + + +sl_conn = sql.connect('rpg_db.sqlite3') +sl_curs = sl_conn.cursor() +characters = sl_curs.execute("SELECT * FROM charactercreator_character").fetchall() + +keys = ( + 'character_id', + 'name', 'level', 'exp', 'hp', 'strength', 'intelligence', 'dexterity', 'wisdom' +) + +password = 'XXX' +dbname = 'test' + +db = MongoClient(f"mongodb+srv://twjames:{password}@cluster0.aplgy.gcp.mongodb." + f"net/{dbname}?retryWrites=true&w=majority", ssl=True, ssl_cert_reqs=ssl.CERT_NONE).rpg_db.characters + +# db.insert_many( +# {k: v for k, v in zip(keys, char)} for char in characters +# ) + +print(*db.find(), sep='\n') diff --git a/module3-nosql-and-document-oriented-databases/Questions b/module3-nosql-and-document-oriented-databases/Questions new file mode 100644 index 00000000..be96acaf --- /dev/null +++ b/module3-nosql-and-document-oriented-databases/Questions @@ -0,0 +1,9 @@ +"How was working with MongoDB different from working with +PostgreSQL? What was easier, and what was harder?" + +- The largest difference in working with MongoDB and PostgreSQL was the necessity for a schema when working with +PostgreSQL. When you need a schema in order to work with a database it makes the data much more rigid. Some situations +call for data to meet set requirements before it is loaded. In my view a schema can help avoid +adding types or values to a DB that are either unnecessary or improper. 
I am a big fan of structure, although in a case +where you need to have a large variety of data placed in a DB that may not always align with what is previously placed +in the DB, a MongoDB would be the way to go. diff --git a/module3-nosql-and-document-oriented-databases/cred.py b/module3-nosql-and-document-oriented-databases/cred.py new file mode 100644 index 00000000..1544197e --- /dev/null +++ b/module3-nosql-and-document-oriented-databases/cred.py @@ -0,0 +1,2 @@ +class cred: + password = \ No newline at end of file diff --git a/module3-nosql-and-document-oriented-databases/rpg_db.sqlite3 b/module3-nosql-and-document-oriented-databases/rpg_db.sqlite3 new file mode 100644 index 00000000..837d7f16 Binary files /dev/null and b/module3-nosql-and-document-oriented-databases/rpg_db.sqlite3 differ diff --git a/module4-acid-and-database-scalability-tradeoffs/Titanic_queries.py b/module4-acid-and-database-scalability-tradeoffs/Titanic_queries.py new file mode 100644 index 00000000..d175d1e9 --- /dev/null +++ b/module4-acid-and-database-scalability-tradeoffs/Titanic_queries.py @@ -0,0 +1,88 @@ +""" +Practicing SQL on the titanic DB hosted locally +""" + +import sqlite3 + +conn = sqlite3.connect('titanic.sqlite3') +curs = conn.cursor() + + +def execute_query(cursor, query): + cursor.execute(query) + return cursor.fetchall() + + +# - How many passengers were in each class? +# - How many passengers survived/died within each class? +# - What was the average age of survivors vs nonsurvivors? +# - What was the average age of each passenger class? +# - What was the average fare by passenger class? By survival? +# - How many siblings/spouses aboard on average, by passenger class? By survival? +# - How many parents/children aboard on average, by passenger class? By survival? +# - Do any passengers have the same name? +# - (Bonus! Hard, may require pulling and processing with Python) How many married +# couples were aboard the Titanic? 
Assume that two people (one `Mr.` and one +# `Mrs.`) with the same last name and with at least 1 sibling/spouse aboard are +# a married couple. + +# - How many passengers survived, and how many died? +pass_survived = """ +SELECT COUNT(survived) +FROM titanic +WHERE survived = 1; +""" +results1 = execute_query(curs, pass_survived) +print(results1) + +# how many died +pass_dead = """ +SELECT COUNT(survived) +FROM titanic +WHERE survived = 0;""" + +results2 = execute_query(curs, pass_dead) +print(results2) + +# - How many passengers survived/died within each class? +pass_survive_class = """ +SELECT pclass, COUNT(survived) +FROM titanic +WHERE survived = 1 +GROUP BY pclass +ORDER BY pclass DESC; """ + +results3 = execute_query(curs, pass_survive_class) +print(results3) + +pass_dead_class = """ +SELECT pclass, COUNT(survived) +FROM titanic +WHERE survived = 0 +GROUP BY pclass +ORDER BY pclass DESC; """ + +results4 = execute_query(curs, pass_dead_class) +print(results4) + +# - What was the average age of survivors vs nonsurvivors? +avg_survivor = """ +SELECT AVG(age) +FROM +(SELECT age, survived +FROM titanic +WHERE survived = 1);""" + +results5 = execute_query(curs, avg_survivor) +print(results5) + +avg_dead = """ +SELECT AVG(age) +FROM +(SELECT age, survived +FROM titanic +WHERE survived = 0);""" + +print(execute_query(curs, avg_dead)) + + diff --git a/module4-acid-and-database-scalability-tradeoffs/titanic.sqlite3 b/module4-acid-and-database-scalability-tradeoffs/titanic.sqlite3 new file mode 100644 index 00000000..091a1272 Binary files /dev/null and b/module4-acid-and-database-scalability-tradeoffs/titanic.sqlite3 differ diff --git a/rpg_db.sqlite3 b/rpg_db.sqlite3 new file mode 100644 index 00000000..e69de29b
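Review note on `Titanic_queries.py`: the separate survived/died queries per class could probably be collapsed into one grouped query. The block below is only a minimal sketch of that idea, not the author's code. It assumes an in-memory SQLite table named `titanic` with just `survived`, `pclass` and `age` columns, and the sample rows and the helper names `build_sample_db` and `survival_by_class` are hypothetical, made up for illustration.

```python
import sqlite3

# Hypothetical sample rows (survived, pclass, age); NOT the real Titanic data.
SAMPLE_ROWS = [
    (1, 1, 38.0),
    (0, 3, 22.0),
    (1, 3, 26.0),
    (0, 1, 54.0),
    (0, 3, 35.0),
]


def build_sample_db():
    """Create an in-memory database holding a small stand-in titanic table."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        'CREATE TABLE titanic (survived INT, pclass INT, age REAL);'
    )
    # Placeholders (?) let sqlite3 bind the values instead of string formatting.
    conn.executemany('INSERT INTO titanic VALUES (?, ?, ?);', SAMPLE_ROWS)
    conn.commit()
    return conn


def survival_by_class(conn):
    """Return (pclass, survived, passenger_count, avg_age) rows."""
    query = """
        SELECT pclass, survived, COUNT(*), AVG(age)
        FROM titanic
        GROUP BY pclass, survived
        ORDER BY pclass, survived;
    """
    return conn.execute(query).fetchall()


if __name__ == '__main__':
    conn = build_sample_db()
    for pclass, survived, count, avg_age in survival_by_class(conn):
        status = 'survived' if survived else 'died'
        print(f'Class {pclass}, {status}: {count} passengers, '
              f'average age {avg_age:.1f}')
```

Grouping on both `pclass` and `survived` keeps the class label next to each count, so the printed report does not depend on remembering which query produced which list.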