Commit 9b1d746

Provide options for importing git log data
1 parent 0114410 commit 9b1d746

5 files changed: 39 additions, 3 deletions

.github/workflows/java-code-analysis.yml

Lines changed: 1 addition & 0 deletions
@@ -127,6 +127,7 @@ jobs:
 env:
   NEO4J_INITIAL_PASSWORD: ${{ secrets.NEO4J_INITIAL_PASSWORD }}
   ENABLE_JUPYTER_NOTEBOOK_PDF_GENERATION: "true"
+  IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT: "full" # Options: "none", "aggregated", "full"
 run: |
   ./../../scripts/analysis/analyze.sh

.github/workflows/typescript-code-analysis.yml

Lines changed: 1 addition & 0 deletions
@@ -132,6 +132,7 @@ jobs:
 env:
   NEO4J_INITIAL_PASSWORD: ${{ secrets.NEO4J_INITIAL_PASSWORD }}
   ENABLE_JUPYTER_NOTEBOOK_PDF_GENERATION: "true"
+  IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT: "full" # Options: "none", "aggregated", "full"
 run: |
   ./../../scripts/analysis/analyze.sh

COMMANDS.md

Lines changed: 9 additions & 0 deletions
@@ -9,6 +9,7 @@
 - [Start an analysis with CSV reports only](#start-an-analysis-with-csv-reports-only)
 - [Start an analysis with Jupyter reports only](#start-an-analysis-with-jupyter-reports-only)
 - [Start an analysis with PDF generation](#start-an-analysis-with-pdf-generation)
+- [Start an analysis without importing git log data](#start-an-analysis-without-importing-git-log-data)
 - [Only run setup and explore the Graph manually](#only-run-setup-and-explore-the-graph-manually)
 - [Generate Markdown References](#generate-markdown-references)
 - [Generate Cypher Reference](#generate-cypher-reference)
@@ -102,6 +103,14 @@ Note: Generating a PDF from a Jupyter notebook using [nbconvert](https://nbconve
 ENABLE_JUPYTER_NOTEBOOK_PDF_GENERATION=true ./../../scripts/analysis/analyze.sh
 ```
 
+#### Start an analysis without importing git log data
+
+To speed up the analysis and get a smaller data footprint, you can switch off the git log data import of the "source" directory (if present) with `IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="none"` as shown below, or choose `IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="aggregated"` to reduce data size by importing only monthly grouped changes instead of all commits.
+
+```shell
+IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="none" ./../../scripts/analysis/analyze.sh
+```
+
 #### Only run setup and explore the Graph manually
 
 To prepare everything for analysis including installation, configuration and preparation queries to explore the graph manually

README.md

Lines changed: 19 additions & 0 deletions
@@ -197,6 +197,25 @@ The [Code Structure Analysis Pipeline](./.github/workflows/java-code-analysis.ym
 ENABLE_JUPYTER_NOTEBOOK_PDF_GENERATION=true ./../../scripts/analysis/analyze.sh
 ```
 
+- How can I disable git log data import?
+👉 Set the environment variable `IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT` to `none`. Example:
+
+```shell
+export IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="none"
+```
+
+👉 Alternatively, prepend your command with `IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="none"`:
+
+```shell
+IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="none" ./../../scripts/analysis/analyze.sh
+```
+
+👉 An in-between option is to import only monthly aggregated changes using `IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="aggregated"`:
+
+```shell
+IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT="aggregated" ./../../scripts/analysis/analyze.sh
+```
+
 - Why are some Jupyter Notebook reports skipped?
 👉 The custom Jupyter Notebook metadata property `code_graph_analysis_pipeline_data_validation` can be set to choose a query from [cypher/Validation](./cypher/Validation) that will be executed preliminary to the notebook. If the query leads to at least one result, the validation succeeds and the notebook will be run. If the query leads to no result, the notebook will be skipped.
 For more details see [Data Availability Validation](./COMMANDS.md#data-availability-validation).
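
The README hunk above offers two ways to set the variable, and they differ in scope: `export` keeps the variable set for every later command in the same shell session, while the prefix form sets it for one command only. The following is a minimal sketch of that standard shell behavior. It uses `printenv` as a stand-in for `analyze.sh` and a hypothetical variable name `DEMO_MODE`, so no real setting is touched.

```shell
#!/usr/bin/env bash
# DEMO_MODE is a hypothetical variable used only for this illustration.
unset DEMO_MODE

# Prefix form: the variable exists only in the environment of this one command.
prefixed="$(DEMO_MODE="none" printenv DEMO_MODE)"
# A later command no longer sees it (printenv fails for unset variables).
after_prefix="$(printenv DEMO_MODE || true)"

# Export form: the variable is inherited by every later command in this shell.
export DEMO_MODE="none"
after_export="$(printenv DEMO_MODE)"

echo "prefix form, same command:  '${prefixed}'"
echo "prefix form, later command: '${after_prefix}'"
echo "export form, later command: '${after_export}'"
```

The prefix form is the safer choice for a one-off run, because a forgotten `export` would silently disable the git log import for later runs in the same terminal.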

scripts/prepareAnalysis.sh

Lines changed: 9 additions & 3 deletions
@@ -7,6 +7,8 @@
 # Fail on any error ("-e" = exit on first error, "-o pipefail" exits on errors within piped commands)
 set -o errexit -o pipefail
 
+IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT=${IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT:-"full"} # Select how to import git log data. Options: "none", "aggregated", "full". Default="full".
+
 ## Get this "scripts" directory if not already set
 # Even if $BASH_SOURCE is made for Bourne-like shells it is also supported by others and therefore here the preferred solution.
 # CDPATH reduces the scope of the cd command to potentially prevent unintended directory changes.
@@ -46,9 +48,13 @@ if ! is_csv_column_greater_zero "${dataVerificationResult}" "sourceNodeCount"; t
 fi
 
 # Preparation - Import git log if source or history is available
-# TODO move into separate analysis compilation/part that is selectable
-source "${SCRIPTS_DIR}/importGitLog.sh"
-source "${SCRIPTS_DIR}/importAggregatedGitLog.sh"
+if [[ ! ${IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT} == "none" ]]; then
+    if [[ ${IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT} == "aggregated" ]]; then
+        source "${SCRIPTS_DIR}/importAggregatedGitLog.sh"
+    else
+        source "${SCRIPTS_DIR}/importGitLog.sh"
+    fi
+fi
 
 # Preparation - Create indices
 execute_cypher "${CYPHER_DIR}/Create_Java_Type_index_for_full_qualified_name.cypher"
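
The selection logic added to `scripts/prepareAnalysis.sh` can be tried in isolation. The following is a minimal sketch, not the repository's code: the function name `select_git_log_import` is hypothetical, and it only prints the name of the script that would be sourced instead of sourcing it. Like the hunk above, it defaults to "full" and treats every value other than "none" and "aggregated" as "full".

```shell
#!/usr/bin/env bash
set -o errexit -o pipefail

# Hypothetical helper (not part of the repository): prints the name of the
# import script that would be sourced for the given mode, or nothing for "none".
select_git_log_import() {
    local mode="${1:-full}" # Unset or empty falls back to "full", as in the diff
    case "${mode}" in
        none)       ;; # Skip the git log import entirely
        aggregated) echo "importAggregatedGitLog.sh" ;;
        *)          echo "importGitLog.sh" ;; # "full" and any unrecognized value
    esac
}

select_git_log_import "${IMPORT_GIT_LOG_DATA_IF_SOURCE_IS_PRESENT:-full}"
```

One consequence of this branch order is that a misspelled value such as "agregated" runs the full import without any warning. A stricter variant could reject unknown values, but that would be a change in behavior compared to the commit.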
