Commit d3da5f5

(Breaking) Change git file relationship to CONTAINS_CHANGED.
1 parent 86751d6 commit d3da5f5
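
Because the change is marked as breaking, graphs imported before this commit still carry the old `CONTAINS` relationship between git nodes and git files. The following is a minimal migration sketch, not part of this commit. It assumes that a full re-import is not wanted, that the source nodes carry the `Git` and `Log` labels from the schema in COMMANDS.md, and that the old relationships hold no properties worth keeping (the import queries in this diff set none).

```Cypher
// Hypothetical one-off migration for graphs imported before this commit.
// Both ends are restricted to Git:Log nodes, so other CONTAINS
// relationships (e.g. Artifact -> File) are left untouched.
MATCH (git_source:Git:Log)-[old_relationship:CONTAINS]->(git_file:Git:Log:File)
MERGE (git_source)-[:CONTAINS_CHANGED]->(git_file)
DELETE old_relationship
RETURN count(*) AS migratedRelationships
```

Running it a second time should report zero migrated relationships, since no `CONTAINS` relationships between git nodes remain.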

6 files changed: 7 additions and 7 deletions

COMMANDS.md

Lines changed: 2 additions & 2 deletions
@@ -233,7 +233,7 @@ Use [importGitLog.sh](./scripts/importGitLog.sh) to import git log data into the
 It uses `git log` to extract commits, their authors and the names of the files changed with them. These are stored in an intermediate CSV file and are then imported into Neo4j with the following schema:
 
 ```Cypher
-(Git:Log:Author)-[:AUTHORED]->(Git:Log:Commit)->[:CONTAINS]->(Git:Log:File)
+(Git:Log:Author)-[:AUTHORED]->(Git:Log:Commit)->[:CONTAINS_CHANGED]->(Git:Log:File)
 (Git:Log:Commit)->[:HAS_PARENT]-(Git:Log:Commit)
 ```
 
@@ -254,7 +254,7 @@ You can use [List_unresolved_git_files.cypher](./cypher/GitLog/List_unresolved_g
 Use [importAggregatedGitLog.sh](./scripts/importAggregatedGitLog.sh) to import git log data in an aggregated form into the Graph. It works similar to the [full git log version above](#import-git-log). The only difference is that not every single commit is imported. Instead, changes are grouped per month including their commit count. This is in many cases sufficient and reduces data size and processing time significantly. Here is the resulting schema:
 
 ```Cypher
-(Git:Log:Author)-[:AUTHORED]->(Git:Log:ChangeSpan)-[:CONTAINS]->(Git:Log:File)
+(Git:Log:Author)-[:AUTHORED]->(Git:Log:ChangeSpan)-[:CONTAINS_CHANGED]->(Git:Log:File)
 ```
 
 ## Database Queries
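
As an illustration of the renamed relationship, the following sketch lists the most frequently changed files after a full git log import. It assumes the `hash` property on commits and the `fileName` property on git files, both of which appear in the queries changed by this commit. The limit of 10 is arbitrary.

```Cypher
// Illustrative query, not part of this commit:
// count the distinct commits that changed each git file.
MATCH (git_commit:Git:Log:Commit)-[:CONTAINS_CHANGED]->(git_file:Git:Log:File)
RETURN git_file.fileName               AS fileName
      ,count(DISTINCT git_commit.hash) AS numberOfCommits
 ORDER BY numberOfCommits DESC
 LIMIT 10
```

Queries that still match `[:CONTAINS]` between git nodes would return no rows against a graph imported after this commit, which is why the change is flagged as breaking.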

cypher/GitLog/Import_aggregated_git_log_csv_data.cypher

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ CALL { WITH row
 })
 MERGE (git_file:Git:Log:File {fileName: row.filename})
 MERGE (git_author)-[:AUTHORED]->(git_change_span)
-MERGE (git_change_span)-[:CONTAINS]->(git_file)
+MERGE (git_change_span)-[:CONTAINS_CHANGED]->(git_file)
 } IN TRANSACTIONS OF 1000 ROWS
 RETURN count(DISTINCT row.author) AS numberOfAuthors
 ,count(DISTINCT row.filename) AS numberOfFiles

cypher/GitLog/Import_git_log_csv_data.cypher

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ CALL { WITH row
 })
 MERGE (git_file:Git:Log:File {fileName: row.filename})
 MERGE (git_author)-[:AUTHORED]->(git_commit)
-MERGE (git_commit)-[:CONTAINS]->(git_file)
+MERGE (git_commit)-[:CONTAINS_CHANGED]->(git_file)
 } IN TRANSACTIONS OF 1000 ROWS
 RETURN count(DISTINCT row.author) AS numberOfAuthors
 ,count(DISTINCT row.filename) AS numberOfFiles

cypher/GitLog/List_ambiguous_git_files.cypher

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 // List ambigiously resolved git files where a single git file is attached to more than one code file for troubleshooting/testing.
 
 MATCH (file:File&!Git)<-[:RESOLVES_TO]-(git_file:File&Git)
-OPTIONAL MATCH (artifact:Artifact:Archive)-[:CONTAINS]->(file)
+OPTIONAL MATCH (artifact:Artifact:Archive)-[:CONTAINS_CHANGED]->(file)
 WITH file.fileName AS fileName
 ,reverse(split(reverse(file.fileName),'.')[0]) AS fileExtension
 ,count(DISTINCT git_file.fileName) AS gitFilesCount
Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 // Set numberOfGitCommits property on code File nodes when aggregated change spans with grouped commits are present.
 
 MATCH (code_file:File&!Git)<-[:RESOLVES_TO]-(git_file:File&Git)
-MATCH (git_file)<-[:CONTAINS]-(git_changespan:Git:ChangeSpan)
+MATCH (git_file)<-[:CONTAINS_CHANGED]-(git_changespan:Git:ChangeSpan)
 WITH code_file, sum(git_changespan.commits) AS numberOfGitCommits
 SET code_file.numberOfGitCommits = numberOfGitCommits
 RETURN count(DISTINCT coalesce(code_file.absoluteFileName, code_file.fileName)) AS changedCodeFiles
Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 // Set numberOfGitCommits property on code File nodes when git commits are present
 
 MATCH (code_file:File&!Git)<-[:RESOLVES_TO]-(git_file:File&Git)
-MATCH (git_file)<-[:CONTAINS]-(git_commit:Git:Commit)
+MATCH (git_file)<-[:CONTAINS_CHANGED]-(git_commit:Git:Commit)
 WITH code_file, count(DISTINCT git_commit.hash) AS numberOfGitCommits
 SET code_file.numberOfGitCommits = numberOfGitCommits
 RETURN count(DISTINCT coalesce(code_file.absoluteFileName, code_file.fileName)) AS changedCodeFiles
