README.md
Lines changed: 3 additions & 3 deletions
@@ -47,7 +47,7 @@ You can view the information in Module 0 in your browser by clicking on the Gith
Additional information on how to [create an Azure account](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateAzureMLNotebooks.md) is provided by the NIGMS in abbreviated form for those with a subscription to Azure.
- * Video directions can be viewed at ____________
+ * Video directions can be viewed at [Learning Modules for Cloud-Based Biomedical Research](https://www.youtube.com/playlist?list=PLXaEJPtnQ4w7Vu7vqWbttBjUGrPp4Qa7b).
## **Software Requirements**
@@ -74,7 +74,7 @@ Data will be obtained from online databases (e.g., NCBI) or will be in folders i
## **Module Outline**
**Module 0 - Intro to Cloud Computing and Git**
- - Lecture (coming)
+ - Lecture (upcoming)
- Tutorial 1: Github Download *how to get the tutorials*
- Tutorial 2: Jupyter Notebooks *how to navigate these tutorials*
@@ -88,7 +88,7 @@ Data will be obtained from online databases (e.g., NCBI) or will be in folders i
Learn core concepts, diverse applications, introductory algorithms, ethical considerations, and data challenges.
Submodule 0/Submodule_0_Tutorial_1_GithubDownload.ipynb
Lines changed: 18 additions & 1 deletion
@@ -74,7 +74,15 @@
"\n",
"You are likely familiar with cloud data storage, for example with your files saved on OneDrive (Microsoft) or Google Drive (Google). Cloud *computing* takes that a step farther and carries out data processing and runs programs on computers that are located in some distant site. \n",
"\n",
- "The advantage of \"cloud computing\" is that the computational speed and power is not limited by what you have in *your* desktop or laptop. Rather, cloud computers are very FAST computers with substantially more memory (to work on big biological data sets). Additionally, the providers of cloud computing can make more (or fewer) processors available to your job, depending on the need. "
+ "The advantage of \"cloud computing\" is that the computational speed and power is not limited by what you have in *your* desktop or laptop. Rather, cloud computers are very FAST computers with substantially more memory (to work on big biological data sets). Additionally, the providers of cloud computing can make more (or fewer) processors available to your job, depending on the need. \n",
" <img src=\"../images/TourofGitHub.png\" alt=\"NIH/NIGMS Sandbox Foundations of Python Video 6\", width=\"550\"/>\n",
+ " </a>\n",
+ " <br>\n",
+ " <span> Click above image to watch introductory video </span>\n",
+ "</p>\n",
+ "\n",
"It is possible that you are reading this file having already navigated to the \"Sandbox.\"\n",
"\n",
"The [Sandbox](https://github.com/NIGMS/NIGMS-Sandbox) is housed at github.com. Github is a collaboration tool/website/repository that is being used by NIGMS as a great way to share materials.\n",
Submodule 0/Submodule_0_Tutorial_4_GitHub4You.md
Lines changed: 103 additions & 0 deletions
@@ -4,6 +4,23 @@
## Overview
Git is a powerful version control tool that helps track changes to your data files over time. While Git is traditionally used for computer code, it can be just as effective for managing structured data by recording each change, allowing you to compare versions and collaborate efficiently.
<img src="../images/InstallGit.png" alt="NIH/NIGMS Sandbox Foundations of Python Video 12", width="550"/>
+ </a>
+ <br>
+ <span> Click above image to watch introductory video </span>
+ </p>
+
+
## Learning Objectives
By the end of this lesson, you will be able to:
- Define FAIR data practices
@@ -45,6 +62,15 @@ The key word: *AUTOMATICALLY* though we'll not get to THAT until the next tutori
Version control tools like Git provide a structured way to track, manage, and document changes to data over time, ensuring that every update, correction, or modification is properly recorded.
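
To make "tracking changes" concrete, here is a minimal Python sketch of the kind of comparison a version control tool records between two versions of a data file. It is an illustration only, not how Git or GitHub Desktop is implemented. The sample readings and the helper name `describe_changes` are hypothetical; `difflib` is part of the Python standard library.

```python
import difflib

# Hypothetical data: two versions of a small temperature log.
version_1 = [
    "date,temp_c\n",
    "2024-01-01,4.2\n",
    "2024-01-02,5.1\n",
]
version_2 = [
    "date,temp_c\n",
    "2024-01-01,4.2\n",
    "2024-01-02,5.7\n",  # a corrected reading
    "2024-01-03,3.9\n",  # a new measurement
]


def describe_changes(old, new):
    """Return the line-by-line differences between two versions as a list."""
    return list(
        difflib.unified_diff(old, new, fromfile="version_1", tofile="version_2")
    )


changes = describe_changes(version_1, version_2)
print("".join(changes))
```

Lines starting with `-` were removed or replaced, and lines starting with `+` were added. This is the same notation GitHub uses when it displays a change.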
<img src="../images/VersionControl.png" alt="NIH/NIGMS Sandbox Foundations of Python Video 2", width="550"/>
+ </a>
+ <br>
+ <span> Click above image to watch introductory video </span>
+ </p>
+
+
### Why Version Control for Research Data?
<br>1️⃣ Ensuring Data Integrity Over Time
<br>
@@ -89,6 +115,7 @@ Now that we understand why research labs need version control, let's set up Git
Before you can start using GitHub for your materials, you need to create an account. GitHub is a platform that allows you to store, share, and collaborate on code. It is widely used by developers, students, and organizations for managing software projects using Git, a version control system that tracks changes in your code.
+
To get started, you need to sign up for a free GitHub account. This will give you access to your own profile, repositories, and collaboration tools. Follow the steps below to create your GitHub account.
- Go to GitHub's website
@@ -97,6 +124,15 @@ To get started, you need to sign up for a free GitHub account. This will give yo
- Click Create an account and follow the instructions.
- GitHub will send a verification email. Click the link in the email to verify your account.
<img src="../images/CreateGitHubAccount.png" alt="NIH/NIGMS Sandbox Foundations of Python Video 4", width="550"/>
+ </a>
+ <br>
+ <span> Click above image to watch introductory video </span>
+ </p>
+
+
## Step 2: Setting Up a GitHub Account & Installing GitHub Desktop
Before tracking your data, you need to install GitHub Desktop, a user-friendly application that simplifies version control without needing command-line commands.
@@ -107,6 +143,13 @@ Before tracking your data, you need to install GitHub Desktop, a user-friendly a
2. Install GitHub Desktop and sign in with your GitHub account.
3. Set up your GitHub profile with your name and email (important for tracking contributions) from the account you set up in step 1.
<img src="../images/GitHubDesktopInstall.png" alt="NIH/NIGMS Sandbox Foundations of Python Video 5", width="550"/>
+ </a>
+ <br>
+ <span> Click above image to watch introductory video </span>
+ </p>
## Step 3: Creating a Repository for Your Research Data
A repository (A "repo") is like a folder where you store your research data and track changes over time.
@@ -121,11 +164,35 @@ A repository (A "repo") is like a folder where you store your research data and
5. Check “Initialize this repository with a README” (important for documenting your dataset). This is the appropriate spot to include summary information about this particular repository's purpose
<img src="../images/DesktopTutorialRepo.png" alt="NIH/NIGMS Sandbox Foundations of Python Video 9", width="550"/>
+ </a>
+ <br>
+ <span> Click above image to watch introductory video </span>
+ </p>
+
### Instructions:
1. Open your repository folder *on your computer.*
2. Copy or move your data files (e.g., temperature_data_2024.csv) into the folder.
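
If you prefer to script the two instructions above rather than drag files by hand, the following Python sketch shows one way to do it. It is an assumption-laden illustration: the repository folder name `my-research-data` and the helper `add_file_to_repo` are hypothetical, and the demo runs in a temporary directory so it does not touch your real files.

```python
import shutil
import tempfile
from pathlib import Path


def add_file_to_repo(data_file: Path, repo_folder: Path) -> Path:
    """Copy data_file into repo_folder and return the path of the copy."""
    repo_folder.mkdir(parents=True, exist_ok=True)
    destination = repo_folder / data_file.name
    shutil.copy2(data_file, destination)  # copy2 also preserves timestamps
    return destination


# Demo in a throwaway workspace (all paths below are hypothetical).
workspace = Path(tempfile.mkdtemp())
source = workspace / "temperature_data_2024.csv"
source.write_text("date,temp_c\n2024-01-01,4.2\n")

repo = workspace / "my-research-data"  # stands in for your local repository folder
copied = add_file_to_repo(source, repo)
print(copied.name)
```

Copying a file into the folder does not record it in Git by itself. The file still has to be committed before the change becomes part of the repository's history.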
@@ -167,4 +234,40 @@ Now, every protocol update is documented and timestamped, ensuring full transpar
## Managing a lab group using the same git repository
It is rather unlikely that ONLY one person would be involved in collecting all of the data for a research lab. To prevent contributors from overwriting each other's work, Git provides clear management tools. These are covered in the next tutorial.
In a research lab environment, managing data properly is just as important as collecting it. Labs often deal with long-term datasets, changing protocols, and *multiple* contributors, which can lead to data integrity issues if not properly managed.