 - [Download Maven Artifacts to analyze](#download-maven-artifacts-to-analyze)
 - [Reset the database and scan the java artifacts](#reset-the-database-and-scan-the-java-artifacts)
+- [Import git log](#import-git-log)
 - [Database Queries](#database-queries)
 - [Cypher Shell](#cypher-shell)
 - [HTTP API](#http-api)
@@ -70,7 +71,7 @@ a profile, the newest versions will be used. Profiles are scripts that can be fo
 ### Notes

 - Be sure to use Java 17 for Neo4j v5 and Java 11 for Neo4j v4
-- Use your own initial Neo4j password
+- Use your own initial Neo4j password with `export NEO4J_INITIAL_PASSWORD=my_own_password`
 - For more details have a look at the script [analyze.sh](./scripts/analysis/analyze.sh)

 ### Examples
@@ -214,6 +215,18 @@ enhance the data further with relationships between artifacts and packages.

 Be aware that this script deletes all previous relationships and nodes in the local Neo4j Graph database.

+### Import git log
+
+Use [importGitLog.sh](./scripts/importGitLog.sh) to import git log data into the Graph.
+It uses `git log` to extract commits, their authors and the changed filenames into an intermediate CSV file that is then imported into Neo4j with the following schema:
+
+The optional parameter `--repository directory-path-to-a-git-repository` can be used to select a different directory for the repository. By default, the `source` directory within the analysis workspace directory is used. This command only needs the git history to be present, so a `git clone --bare` is sufficient. If the `source` directory is also used for the analysis, a full git clone is needed (as for TypeScript).
+
+👉**Note:** Commit messages containing `[bot]` are filtered out to ignore changes made by bots.
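
The remark that a bare clone is sufficient can be checked with a small experiment. The following is only a sketch under assumptions: the throwaway repository, the author identity and the directory names are invented for illustration, and the final call to `importGitLog.sh` is left as a comment because it needs the analysis workspace and a running Neo4j.

```shell
#!/usr/bin/env bash
set -eu
workDirectory=$(mktemp -d)

# Create a small throwaway repository with a single commit (illustration only).
git init --quiet "$workDirectory/origin"
echo "hello" > "$workDirectory/origin/file.txt"
git -C "$workDirectory/origin" add file.txt
git -C "$workDirectory/origin" -c user.name="Example Author" -c user.email="author@example.com" \
    commit --quiet -m "Initial commit"

# A bare clone contains the history but no checked out working tree.
git clone --quiet --bare "$workDirectory/origin" "$workDirectory/history.git"

# git log still works there, including the list of changed files per commit.
commitCount=$(git -C "$workDirectory/history.git" log --oneline | wc -l | tr -d ' ')
changedFiles=$(git -C "$workDirectory/history.git" log --name-only --pretty=format: | grep -v '^$')
echo "commits: $commitCount, changed files: $changedFiles"

# The bare clone could then be passed to the import script:
# ./scripts/importGitLog.sh --repository "$workDirectory/history.git"
```

Since the bare clone has no working tree, it cannot serve as the `source` directory for analyses that read the files themselves.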
 # Prints the git log in CSV format including the changed files.
 # Includes quoted strings, double quote escaping and supports commas in strings.
-# - --pretty=format starts with a space that is needed to detect the start of a line.
-#   gsub(/^ /, "", a[1]); removes that space then afterwards
-# - 3 commas (,,,) should be very unlikely to appear in names, email addresses and commit messages so they are used as an intermediate separator (see split)
-# - gsub(/"/, "\"\"", a[6]) escapes double quotes with two of them (CSV standard)
+# - --no-merges: Excludes merge commits from the log.
+# - %h: Abbreviated commit hash
+# - %an: Author name
+# - %ae: Author email
+# - %aI: Author date, ISO 8601 format
+# - %ct: Committer date, Unix timestamp
+# - %s: Subject of the commit
+# - --name-only: Lists the files affected by each commit.
+# - --pretty=format starts with a space that is needed to detect the start of a commit line.
+# - The chosen delimiters ,,, are used to separate these fields to make parsing easier.
+#   It is very unlikely that they appear in the contents and will be used as an intermediate step before escaping.
+#
+# - BEGIN { COMMA=","; QUOTE="\"" }: Initializes the variables COMMA and QUOTE to hold a comma and a double-quote character respectively.
+# - /^ / { ... }: Processes lines that start with a space (indicating a commit line produced by --pretty=format).
+# - gsub(/^ /, "", a[1]): Removes the leading space from the first field (commit hash) that was used to indicate a new commit.
+# - gsub(/"/, "\"\"", a[6]) escapes double quotes with two double quotes (CSV standard).
+#   a[6] is the commit message column. Double quote escaping is done for every string column.
+# - gsub(/\\/, " ", a[6]): Replaces backslashes in the commit message with spaces.
+#   Otherwise, \" would lead to an error since it would be seen as a non-escaped double quote.
+# - commit=...: Constructs the commit information in CSV format, including the quoted author name, author email, and commit message except for the file name.
+# - NF && !/^\ / { print commit ",\""$0"\"" }: For non-empty lines that do not start with a space (indicating a changed file name),
+#   it prints the commit information followed by the file name, enclosed in quotes.
+#
+# - grep -v -F '[bot]': Filters out commits where the commit message includes [bot]
+#   Used to identify commits made by automated systems or bots.
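
The comments above describe the pipeline step by step, while the command itself is not part of this excerpt. The following is a minimal sketch reconstructed from those comments, not the original script. The function name `gitLogToCsv` and the sample input are assumptions made for illustration. The function reads from standard input so that it can be tried without a repository.

```shell
#!/usr/bin/env bash

# Sketch only: converts "git log --name-only" output (read from stdin) into CSV rows,
# one row per changed file. Commit lines are expected to start with a space
# and to use ",,," as the intermediate field separator.
gitLogToCsv() {
    awk '
        BEGIN { COMMA = ","; QUOTE = "\"" }
        # Commit line: split into hash, name, email, date, timestamp and subject.
        /^ / {
            split($0, a, ",,,")
            gsub(/^ /, "", a[1])      # remove the marker space in front of the hash
            gsub(/"/, "\"\"", a[2])   # CSV-escape double quotes in the author name
            gsub(/"/, "\"\"", a[3])   # CSV-escape double quotes in the author email
            gsub(/\\/, " ", a[6])     # replace backslashes in the commit message
            gsub(/"/, "\"\"", a[6])   # CSV-escape double quotes in the commit message
            commit = a[1] COMMA QUOTE a[2] QUOTE COMMA QUOTE a[3] QUOTE COMMA a[4] COMMA a[5] COMMA QUOTE a[6] QUOTE
            next
        }
        # Non-empty line without a leading space: a file changed by the current commit.
        NF { print commit COMMA QUOTE $0 QUOTE }
    ' | grep -v -F "[bot]"
}

# Hand-written sample input in the shape git log would produce (values are made up).
printf '%s\n' \
    ' 1a2b3c4,,,Jane "JD" Doe,,,jane@example.com,,,2024-01-02T03:04:05+00:00,,,1704164645,,,Fix parser' \
    'src/a.ts' \
    'README.md' \
    '' \
    ' 5d6e7f8,,,dependabot[bot],,,bot@example.com,,,2024-01-03T03:04:05+00:00,,,1704251045,,,Bump version' \
    'package.json' | gitLogToCsv
```

Against a real repository, the same function could be fed with `git log --no-merges --pretty=format:' %h,,,%an,,,%ae,,,%aI,,,%ct,,,%s' --name-only | gitLogToCsv`. In this sketch `grep` works on whole CSV rows, so a row is also dropped when `[bot]` appears in the author name or a file name, not only in the commit message.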
     echo "prepareAnalysis: Error: Data verification failed. At least one DEPENDS_ON relationship required. Check if the artifacts directory is empty or if the scan failed."
     exit 1
 fi

+# Preparation - Import git log if source or history is available