Commit 7859d3c

Merge pull request #1 from NIGMS/assign-projects
Assign projects
2 parents d9cc924 + a9f5cef commit 7859d3c

135 files changed

+516120
-0
lines changed

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
## Summary
2+
3+
<!-- Provide a brief description of the changes you are making. Explain the context and the problem that this PR is solving. -->
4+
5+
## Related Issues
6+
7+
<!-- List any related issues, tickets, or Jira tasks that are addressed by this PR.
8+
Use keywords like "Fixes", "Closes", or "Resolves" to automatically link and close issues, e.g., "Fixes #123". -->
9+
10+
## Changes
11+
12+
<!-- Describe the changes in detail. If there are multiple commits, explain each one if necessary.
13+
Consider breaking this section down into smaller parts, such as "Features Added", "Bugs Fixed", "Technical Debt", etc. -->
14+
15+
- **Feature 1**: Added the ability to do X.
16+
- **Bug Fix**: Corrected the issue where Y would fail under condition Z.
17+
- **Refactoring**: Improved the structure of component A without changing its functionality.
18+
19+
## Testing
20+
21+
<!-- Explain how you tested the changes and what steps you took to verify the correctness.
22+
Include instructions for others to test if applicable, such as command-line scripts, UI steps, etc. -->
23+
24+
- [ ] Unit tests
25+
- [ ] Integration tests
26+
- [ ] Manual testing
27+
28+
### How to test
29+
30+
1. Step 1: [Instruction]
31+
2. Step 2: [Instruction]
32+
3. Step 3: [Instruction]
33+
34+
## Screenshots (if applicable)
35+
36+
<!-- Add any relevant screenshots or GIFs to illustrate the changes. This is particularly useful for UI/UX changes. -->
37+
38+
## Checklist
39+
40+
<!-- Ensure that you have completed the following tasks before submitting the PR. -->
41+
42+
- [ ] My code follows the code style of this project.
43+
- [ ] I have performed a self-review of my code.
44+
- [ ] I have commented my code, particularly in hard-to-understand areas.
45+
- [ ] I have made corresponding changes to the documentation.
46+
- [ ] My changes generate no new warnings or errors.
47+
- [ ] I have added tests that prove my fix is effective or that my feature works.
48+
- [ ] New and existing unit tests pass locally with my changes.
49+
50+
## Notes for Reviewers
51+
52+
<!-- Add any additional notes for the reviewers. This could include areas of the code that you would like reviewers to focus on, known issues, or challenges you faced while implementing the changes. -->
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
1+
name: Auto Assign to Project(s)
2+
3+
on:
4+
  issues:
5+
    types: [opened]
6+
  pull_request:
7+
    types: [opened]
8+
  issue_comment:
9+
    types: [created]
10+
11+
jobs:
12+
  assign_one_project:
13+
    runs-on: ubuntu-latest
14+
    name: Assign to One Project
15+
    steps:
16+
    - name: Assign all issues to Project 1
17+
      uses: srggrs/[email protected]
18+
      with:
19+
        project: 'https://github.com/orgs/NIGMS/projects/1'
20+
        column_name: 'Todo'

.github/workflows/check-links.yaml

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
1+
name: 'Check Links'
2+
on:
3+
  push:
4+
  pull_request:
5+
  workflow_dispatch:
6+
7+
8+
jobs:
9+
  link_check:
10+
    name: 'Link Check'
11+
    uses: NIGMS/NIGMS-Sandbox/.github/workflows/check-links.yaml@main
12+
    with:
13+
      repo_link_ignore_list: ""
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
1+
name: 'Lint Notebook'
2+
on:
3+
  push:
4+
  workflow_dispatch:
5+
permissions:
6+
  contents: write
7+
  id-token: write
8+
9+
jobs:
10+
  lint:
11+
    name: 'Linting'
12+
    uses: NIGMS/NIGMS-Sandbox/.github/workflows/notebook-lint.yaml@main
13+
    with:
14+
      directory: .

README.md

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
1+
# Introduction to Python for Bioinformatics
2+
---------------------------------------------------
3+
<img src="https://nnu.edu/wp-content/uploads/2024/09/Regular-Stacked.png" alt="Logo of NNU" width="150">
4+
5+
## **Contents**
6+
7+
- [Introduction to Python for Bioinformatics](#introduction-to-python-for-bioinformatics)
8+
- [**Contents**](#contents)
9+
- [**Overview**](#overview)
10+
- [**Background**](#background)
11+
- [**Before Starting**](#before-starting)
12+
- [**Getting Started**](#getting-started)
13+
- [**Software Requirements**](#software-requirements)
14+
- [**Architecture Design**](#architecture-design)
15+
- [**Data**](#data)
16+
- [**Module Outline**](#module-outline)
17+
- [**Funding**](#funding)
18+
- [**License for Data**](#license-for-data)
19+
20+
## **Overview**
21+
The module prioritizes practical coding techniques for biological scientists who have limited or no background in programming in Python or other languages. The module also utilizes a blend of short instructional videos, interactive demonstrations, and hands-on exercises to facilitate self-directed learning and knowledge retention.
22+
23+
Submodule 0 provides the background you need to create a cloud computing account on Azure, copy the tutorials from GitHub (where they are stored), and use GitHub for your own data storage.
24+
25+
Submodule 1 is a set of foundational tutorials in Python.
26+
27+
Submodule 2 expands the Python toolbox to NumPy and Pandas (great data handling tools), graphing libraries, and statistics for bioinformatics.
28+
29+
Submodule 3 tutorials show how to save and edit Python scripts so that you can run programming tasks reproducibly outside of Jupyter notebooks.
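
As a rough illustration of moving code out of a notebook and into a reusable script, here is a minimal sketch. The file name `gc_content.py`, the `gc_content` function, and the command-line interface are hypothetical and are not taken from the tutorials.

```python
"""gc_content.py: a hypothetical script that reports GC content for DNA sequences."""
import argparse


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence (0.0 for empty input)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def main(argv=None) -> int:
    """Parse sequences from the command line and print one tab-separated line per sequence."""
    parser = argparse.ArgumentParser(description="Report GC content of DNA sequences.")
    parser.add_argument("sequences", nargs="*", help="zero or more DNA sequences")
    args = parser.parse_args(argv)
    for seq in args.sequences:
        print(f"{seq}\t{gc_content(seq):.2f}")
    return 0


# Runs only when the file is executed as a script, not when it is imported.
if __name__ == "__main__":
    main()
```

Saved under that assumed name, the script could be run as `python gc_content.py ATGC GGCC`, and the same `gc_content` function could still be imported into a notebook.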
30+
31+
## **Background**
32+
33+
Bioinformatics enables the extraction of meaningful insights from biological data, contributes to advancements in medicine and biotechnology, and plays a crucial role in understanding the complexities of living organisms. As biological research continues to generate large datasets, skills in bioinformatics are increasingly valuable for students, researchers, and professionals in the life sciences. Biological data science jobs in industry commonly require experience with using and creating NGS pipelines, proficiency in Python and/or R, and familiarity with Git version control. The centrality of these computer-aided approaches in the lab and industry demands the increased incorporation of these skills in undergraduate curricula through advanced research projects.
34+
35+
The enormous data sets used in and created by bioinformatics studies, including metadata, are expected to be made available to the public according to FAIR principles (Wilkinson et al., 2016). Starting in 2023, all NSF and NIH grantees are expected to develop and maintain a public data and tool repository in the cloud. While depositing DNA sequences in NCBI GenBank is familiar to many biology researchers, sharing other types of data faces technical and motivational barriers. Most biologists are unfamiliar with general-purpose repositories and with how Git can manage both data and bioinformatics pipelines. We believe that offering GitHub as a viable data repository will be a strong incentive for researchers, especially computer science novices, to use this training module.
36+
37+
With the generous support of the NIH NIGMS Cloud Computing project, we have created a module to teach foundational Git tools and introductory Python programming for bioinformatics. It is intended to give you a foundation for the more detailed and advanced modules in the NIGMS Sandbox that rely on Python. You will also have the tools you need to conveniently store your data on GitHub according to FAIR principles.
38+
39+
## **Before Starting**
40+
* This Module has been tested on Azure but the method for working with the Notebooks and data on Amazon Web Services (AWS) or Google Cloud Platform (GCP) is similar. You should determine in which cloud platform you will work.
41+
42+
This *Introduction to Python* course, even if you work very slowly, should cost less than $5 of cloud computing time. You will also incur storage charges, so download these tutorials and then delete them from the cloud when you have finished using them.
43+
44+
## **Getting Started**
45+
46+
You can view the information in Submodule 0 in your browser by clicking on the GitHub folder (above) for Submodule 0. It provides instructions on how to set up a cloud account to use this and other tutorials.
47+
48+
Additional information on how to [create an Azure account](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAzureMLNotebooks.md) is provided by the NIGMS in abbreviated form for those with a subscription to Azure.
49+
50+
* Video directions can be viewed at ____________
51+
52+
53+
## **Software Requirements**
54+
* If you use the cloud, **you do not need any additional software**.
55+
* These notebooks *can* be used in a desktop setting with a free download of Anaconda (anaconda.com) using their Jupyter Notebooks module.
56+
* Individual notebooks can be run with limited functionality in a web browser (e.g., colab.research.google.com).
57+
* It is also possible to use GitHub's VS Code functions.
58+
* Submodule 0 provides instructions to download and use GitHub Desktop.
59+
60+
## **Architecture Design**
61+
This course is arranged into four submodules (0 through 3).
62+
63+
- Submodule 0 is intended as a non-technical introduction so that you can do more technical things (e.g., create a cloud account and copy the tutorials).
64+
- Submodule 1 is the very basic Python introduction for those who are new to programming in general or just new to how the Python language works.
65+
- Submodule 2 builds on the foundations of Python in Submodule 1 to introduce powerful tools for large data sets.
66+
- Submodule 3 introduces key object-oriented programming tools. Its tutorials also build on Submodule 1 to give you the skills to build your own Python tools.
67+
68+
Within each submodule, there are several tutorials (basically, topics) with embedded quizzes. They are numbered to make it easy to know the flow of the topics. Each tutorial ends with a link to the next tutorial, or a link to the guided, summative project.
69+
70+
Each module includes at least one guided 'project' that allows you to practice your skills with a coding exercise you might later need to do with your own data or problem.
71+
72+
## **Data**
73+
Data will be obtained from online databases (e.g., NCBI) or will be in folders in the submodule. You will learn to use the Python tools that can read large data sets without needing to download them to your computer hard drive.
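
To make the idea of reading data without first saving it to disk more concrete, here is a minimal sketch that processes a FASTA text stream one record at a time. The `iter_fasta` helper and the sample records are hypothetical and are not taken from the tutorials; an in-memory `io.StringIO` object stands in for a remote file.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs one record at a time from a FASTA text stream."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record once the stream ends.
    if header is not None:
        yield header, "".join(chunks)


# Stand-in for a remote file (assumption): any text stream that yields lines would work here.
example = io.StringIO(">seq1 demo\nATGC\nGGTA\n>seq2 demo\nTTAA\n")
records = list(iter_fasta(example))
```

Because `iter_fasta` is a generator, only one record is held in memory at a time, which is the property that matters for large files.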
74+
75+
## **Module Outline**
76+
**Submodule 0 - Intro to Cloud Computing and Git**
77+
- Lecture (coming)
78+
79+
- Tutorial 1: GitHub Download *how to get the tutorials*
80+
- Tutorial 2: Jupyter Notebooks *how to navigate these tutorials*
81+
- Tutorial 3: AzureML *how to start using a cloud computer*
82+
- Tutorial 3b: AzureML *CloudLab details*
83+
- Tutorial 4: GitHub 4 You *how Git and GitHub can be useful for you as a bioinformatician*
84+
- Tutorial 5: Managing Git *how to manage your GitHub repositories for multiple users*
85+
- Tutorial 6: Digital Object Identifiers for GitHub *Creating citable identifiers for your data at Zenodo*
86+
87+
**Submodule 1 - Foundations of Python**
88+
89+
Learn core concepts, diverse applications, introductory algorithms, ethical considerations, and data challenges.
90+
91+
- Lecture (coming)
92+
93+
- Tutorials
94+
  - Tutorial 1: Python Overview
95+
  - Tutorial 2: Variables
96+
  - Tutorial 3: Data Structures
97+
  - Tutorial 4: Functions
98+
- Project
99+
  - Tutorial 5: Using NCBI sequences
100+
  - Tutorial 5 Project Answer (SOLUTIONS for Using NCBI sequences)
101+
102+
**Submodule 2 - Intro to Data Science with Python**
103+
104+
Learn Data Science with NumPy and Pandas
105+
106+
- Lecture
107+
108+
- Tutorials
109+
  - Tutorial 0: Overview
110+
  - Tutorial 1: NumPy
111+
  - Tutorial 2: Pandas
112+
  - Tutorial 2a: Pandas PDB Exercise
113+
  - Tutorial 2b: Pandas RNA-seq Guided Exercise
114+
  - Tutorial 3: Visualizing Data
115+
  - Tutorial 4: Inferential Statistics
116+
- Project
117+
  - Tutorial 6a: Using data
118+
  - Tutorial 6b: Project with solutions
119+
120+
**Submodule 3 - Object-Oriented Programming (OOP) in Python**
121+
122+
- Tutorial 0: Overview of OOP
123+
- Tutorial 1: Introduction to Python OOP
124+
- Tutorial 2: OOP2
125+
- Project: Patient data import with the Patient Class
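
For a sense of what importing patient data into a class can look like, here is a minimal sketch. The `Patient` fields, the `load_patients` helper, and the sample CSV are hypothetical; the actual project may define its `Patient` class differently.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, TextIO


@dataclass
class Patient:
    """One patient record; these fields are illustrative assumptions."""
    patient_id: str
    age: int
    diagnosis: str

    @classmethod
    def from_row(cls, row: dict) -> "Patient":
        """Build a Patient from one CSV row, converting age from text to int."""
        return cls(
            patient_id=row["patient_id"],
            age=int(row["age"]),
            diagnosis=row["diagnosis"],
        )

    def is_adult(self) -> bool:
        """Return True when the patient is 18 or older."""
        return self.age >= 18


def load_patients(handle: TextIO) -> List[Patient]:
    """Read every row of a CSV stream (with a header line) into Patient objects."""
    return [Patient.from_row(row) for row in csv.DictReader(handle)]


# In-memory sample data standing in for a real CSV file.
sample = io.StringIO("patient_id,age,diagnosis\nP001,42,asthma\nP002,12,diabetes\n")
patients = load_patients(sample)
adults = [p.patient_id for p in patients if p.is_adult()]
```

Keeping the row parsing in `from_row` means the type conversion lives in one place, and methods such as `is_adult` travel with the data they describe.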
126+
127+
128+
## **Funding**
129+
The creation of this training module was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number *******. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of any of the funding agencies.
130+
131+
## **License for Data**
132+
Text and materials are licensed under a Creative Commons license. The [license](https://github.com/drchase55/NNU_nih_python_jrc/blob/Intro_to_Python_Modules/LICENSE) allows you to copy, remix, and redistribute any of our publicly available materials, under the condition that you attribute the work (details in the license) and do not make profits from it.
133+
