// Anomaly Detection DeepDive: Overview of analyzed code units and the number of anomalies detected. Requires all other labels/*.cypher queries to run first. Variables: projection_language, projection_node_label
// Anomaly Detection Summary: Overview of all analyzed code units in total. Requires all other labels/*.cypher queries to run first. Variables: projection_language, projection_node_label
// Anomaly Detection Summary: Overview of analyzed code units and the number of anomalies detected. Requires all other labels/*.cypher queries to run first. Variables: projection_language, projection_node_label
// Anomaly Detection Labels: Summarizes all labelled archetypes by their anomaly score including their archetype rank. For code units with more than one archetype, the one with the higher rank is shown. Requires all other labels/*.cypher queries to run first. Variables: projection_language, projection_node_label
This report analyzes structural and dependency anomalies across multiple abstraction levels of the codebase.
6
+
The goal is to detect potential **software quality, design, and architecture issues** using graph-based features, anomaly detection (Isolation Forest), and SHAP explainability.
7
+
8
+
### 1.1 Overview of Analyzed Structures
9
+
10
+
<!-- include:AnomaliesPerAbstractionLayer.md -->
11
+
12
+
### 1.2 Anomalies in total
13
+
14
+
<!-- include:AnomaliesPerAbstractionLayer.md -->
15
+
16
+
## 2. Deep Dives by Abstraction Level
17
+
18
+
<!-- include:AnomalyDetectionDeepDives.md -->
19
+
20
+
## 3. Taxonomy of Anomaly Archetypes
21
+
22
+
| Archetype | Feature Profile | Risk for Architecture |
* **Refactor hubs:** Break down god classes/utilities into smaller abstractions.
* **Mitigate bottlenecks:** Add redundancy or alternative paths.
* **Investigate outliers:** Validate if they are justified exceptions or design flaws.
* **Enforce cohesion:** Raise clustering coefficient via better modular boundaries.
* **Stabilize authorities:** Encapsulate widely used but locally weak components, reduce over-generalization, and ensure stable APIs.
* **Clarify bridges:** Validate whether cross-cluster connectors are intentional (adapters/facades) or accidental; refactor or relocate responsibilities to preserve modularity.
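
The "Enforce cohesion" recommendation refers to the clustering coefficient. As a rough illustration of what that metric measures, the sketch below computes the local clustering coefficient by hand for a small undirected dependency graph. The graph, the unit names, and the function name `local_clustering_coefficient` are hypothetical; the report's pipeline presumably derives this feature with a graph library rather than code like this. A low value for a highly connected unit (here `hub`) means its neighbors barely depend on each other, which is the situation better modular boundaries are meant to improve.

```python
from itertools import combinations


def local_clustering_coefficient(graph: dict[str, set[str]], node: str) -> float:
    """Fraction of a node's neighbor pairs that are themselves connected.

    `graph` is an undirected adjacency map (hypothetical example structure).
    """
    neighbors = graph[node] - {node}
    k = len(neighbors)
    if k < 2:
        return 0.0  # no neighbor pairs exist; treated as 0 in this sketch
    linked_pairs = sum(
        1 for a, b in combinations(sorted(neighbors), 2) if b in graph[a]
    )
    return linked_pairs / (k * (k - 1) / 2)


# Hypothetical dependency graph: "hub" touches every unit, but only
# "a" and "b" depend on each other.
graph = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"hub"},
}

hub_score = local_clustering_coefficient(graph, "hub")  # 1 of 6 neighbor pairs linked
a_score = local_clustering_coefficient(graph, "a")      # its single neighbor pair is linked
```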

---

## 5. Appendix

* **Methodology:** Isolation Forest, Random Forest proxy, SHAP explanations.
* **Embedding generation:** Fast Random Projection, PCA (20–35 dims, \~0.9 target variance).
* **Clustering:** HDBSCAN tuned against Leiden communities (golden reference, AMI optimization).
* **Optimization:** Hyperparameter optimization for both the Isolation Forest and the Random Forest proxy, each evaluated by its F1 score.
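
The appendix names the F1 score as the evaluation criterion during hyperparameter optimization. The following is a minimal sketch of how such a score could be computed, assuming the Random Forest proxy's predictions are compared against the Isolation Forest's anomaly labels. That pairing, the label encoding (1 = anomaly, 0 = normal), the sample data, and the function name `f1_score` are assumptions for illustration and are not confirmed by this report.

```python
def f1_score(expected: list[int], predicted: list[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (1)."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    pairs = list(zip(expected, predicted))
    true_pos = sum(1 for e, p in pairs if e == 1 and p == 1)
    false_pos = sum(1 for e, p in pairs if e == 0 and p == 1)
    false_neg = sum(1 for e, p in pairs if e == 1 and p == 0)
    if true_pos == 0:
        return 0.0  # no correct positive prediction; avoids division by zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 marks a code unit flagged as anomalous.
isolation_forest_labels = [1, 0, 0, 1, 0, 1, 0, 0]
proxy_predictions = [1, 0, 1, 1, 0, 0, 0, 0]

score = f1_score(isolation_forest_labels, proxy_predictions)
```

In a tuning loop, a value like `score` would be the quantity to maximize across candidate hyperparameter settings.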