feat: Add privacy specific taxonomy #84

jajanet · 2025-10-13T17:11:34Z

As part of #47, this PR helps ensure that P0 CUJ-1 (log data leak ID and removal) and P0 CUJ-2 (ID sensitive flow to 3P) are addressed in the security:analyze command.

This also helps cover more privacy-specific features by outputting a simple datamap of sources and sinks at the end of the analysis.

Pending more test cases, this is an example of what a run would look like with a small set of tests: https://screenshot.googleplex.com/8nuFzxWcS5V2X6b (computer settings won't let me paste or upload an image to GH for some reason)

In short, this mainly:

- adds a privacy taint analysis skill to make sure those issues are flagged (similar to security ones)
- edits the following analysis field:
  - Location --> Source Location, to make the privacy datamap clearer
- adds the following fields to the analysis:
  - vulnerability type (to differentiate between privacy and security issues)
  - sink (only for privacy issues, to complete the datamap)
  - data type (only for privacy issues, to flag the specific PII)
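To make the field changes above concrete, here is a minimal sketch of what one finding could look like as a record. The class name `Finding`, the attribute names, and the `to_markdown` helper are all hypothetical and only illustrate the idea that the sink and data type fields apply to privacy issues; they are not the extension's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    """One reported issue. Field names are illustrative assumptions."""
    name: str
    vulnerability_type: str               # "security" or "privacy"
    severity: str                         # Critical, High, Medium, or Low
    source_location: str                  # formerly just "Location"
    sink_location: Optional[str] = None   # privacy issues only
    data_type: Optional[str] = None       # privacy issues only, e.g. "email"

    def __post_init__(self) -> None:
        if self.vulnerability_type not in ("security", "privacy"):
            raise ValueError("vulnerability_type must be 'security' or 'privacy'")
        # A privacy finding without a sink or data type gives an incomplete datamap.
        if self.vulnerability_type == "privacy" and not (
            self.sink_location and self.data_type
        ):
            raise ValueError("privacy findings need a sink location and a data type")

    def to_markdown(self) -> str:
        lines = [
            f"*   **Vulnerability Type:** {self.vulnerability_type}",
            f"*   **Severity:** {self.severity}",
            f"*   **Source Location:** {self.source_location}",
        ]
        # Only privacy findings carry the extra datamap fields.
        if self.vulnerability_type == "privacy":
            lines.append(f"*   **Sink Location:** {self.sink_location}")
            lines.append(f"*   **Data Type:** {self.data_type}")
        return "\n".join(lines)


# Hypothetical file paths, for illustration only.
leak = Finding("PII in Logs", "privacy", "High",
               "app/handlers.py:12", "app/logging.py:40", "email")
print(leak.to_markdown())
```

A security finding built the same way would simply omit the last two bullets.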

google-cla · 2025-10-13T17:11:40Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

heltonduarte · 2025-10-14T18:58:34Z

commands/security/analyze.toml

    *   **Action:** Read the entire `DRAFT_SECURITY_REPORT.md` file.
    *   **Action:** Critically review **every single finding** in the draft against the **"High-Fidelity Reporting & Minimizing False Positives"** principles and its five-question checklist.
    *   **Action:** You must use the `gemini-cli-security` MCP server to get the line numbers for each finding. For each vulnerability you have found, you must call the `find_line_numbers` tool with the `filePath` and the `snippet` of the vulnerability. You will then add the `startLine` and `endLine` to the final report.
+    *   **Action:** After reviewing the detailed findings, you will synthesize all identified privacy violations into a summary table. This table must be included at the top of the final report under a `## Privacy Data Map` heading.


I think this pollutes the output too much without bringing extra value compared to the "vulnerability" it already surfaces. One idea is just to add source and sink to the summary of the privacy violation when generating the report.

Got it! I wasn't sure of the best way to rectify this -- currently, I have added fields to the Skillset: Reporting section in GEMINI.md that are conditional on a vulnerability being privacy related, along with a vulnerability type field.

I guess the main question I have is: should the privacy and security issues commingle in the final report?

As of recent changes, they commingle -- for example, we could have a single report which lists a security issue, followed by a couple of privacy issues, which is followed by a security one: XSS, PII in Logs, PII to 3P, SSRF

Alternatively, we could have a separate Security section and Privacy section. For the same example, the Security section would have XSS and SSRF, and the Privacy section would have PII in Logs and PII to 3P.

Thoughts?
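For reference, the two layouts being compared can be sketched as follows, using the four example findings from this comment. The function names `commingled` and `sectioned` are made up for illustration and do not correspond to anything in the extension.

```python
# (finding name, vulnerability type) pairs, in the order they were detected.
FINDINGS = [
    ("XSS", "security"),
    ("PII in Logs", "privacy"),
    ("PII to 3P", "privacy"),
    ("SSRF", "security"),
]


def commingled(findings):
    """Single list: findings stay in detection order, types mixed."""
    return [name for name, _ in findings]


def sectioned(findings):
    """One section per vulnerability type, detection order kept within each."""
    sections = {}
    for name, vuln_type in findings:
        sections.setdefault(vuln_type, []).append(name)
    return sections


print(commingled(FINDINGS))
# ['XSS', 'PII in Logs', 'PII to 3P', 'SSRF']
print(sectioned(FINDINGS))
# {'security': ['XSS', 'SSRF'], 'privacy': ['PII in Logs', 'PII to 3P']}
```

Either layout can be produced from the same findings as long as each one carries a vulnerability type field.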

heltonduarte · 2025-10-14T19:00:49Z

commands/security/analyze.toml

-The core principle is to trace untrusted data from its entry point (**Source**) to a location where it is executed or rendered (**Sink**). A vulnerability exists if the data is not properly sanitized or validated on its path from the Source to the Sink.
+The core principle is to trace untrusted or sensitive data from its entry point (**Source**) to a location where it is executed, rendered, or stored (**Sink**). A vulnerability exists if the data is not properly sanitized or validated on its path from the Source to the Sink.
+
+### Extended Skillset: Privacy Taint Analysis


Have you considered merging this "Privacy Taint Analysis" into the current taxonomy of "Logging of Sensitive Information" and "PII Handling Violations" in GEMINI.md?

Ah yes, that looks like a better spot to put it! Let me move it there!
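As a rough illustration of the source-to-sink principle quoted in the diff above, here is a toy sketch that treats data flow as a directed graph and reports every path from a sensitive source to a sink that does not pass through a sanitizer. This is an assumption-laden simplification: the analysis in this PR is driven by prompt instructions, not by code like this, and the node names below (`request.email`, `redact`, and so on) are invented.

```python
from collections import deque


def find_tainted_flows(edges, sources, sinks, sanitizers):
    """Return source-to-sink paths that never pass through a sanitizer."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    flows = []
    for source in sorted(sources):
        queue = deque([[source]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                flows.append(path)   # sensitive data reached a sink unsanitized
                continue
            for nxt in graph.get(node, []):
                # Stop at sanitizers, and skip nodes already on this path (cycles).
                if nxt in sanitizers or nxt in path:
                    continue
                queue.append(path + [nxt])
    return flows


# Hypothetical flow: an email is logged raw, but redacted before analytics.
edges = [
    ("request.email", "user.email"),
    ("user.email", "logger.info"),
    ("user.email", "redact"),
    ("redact", "analytics.send"),
]
flows = find_tainted_flows(
    edges,
    sources={"request.email"},
    sinks={"logger.info", "analytics.send"},
    sanitizers={"redact"},
)
print(flows)
# [['request.email', 'user.email', 'logger.info']]
```

The logging path is flagged while the third-party path is not, because the latter goes through the sanitizer first.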

heltonduarte · 2025-10-20T21:09:36Z

GEMINI.md

 *   **Severity:** Critical, High, Medium, or Low.
-*   **Location:** The file path where the vulnerability was introduced and the line numbers if that is available.
+*   **Source Location:** The file path where the vulnerability was introduced and the line numbers if that is available.
+*   **Sink Location:** If this is a privacy issue, include this location where sensitive data is exposed or leaves the application's trust boundary


Nit: add a final period here.

shrishabh · 2025-10-23T21:59:50Z

GEMINI.md


 ---

+## Skillset: Privacy Taint Analysis


Since we are effectively expanding the taxonomy, would it be better to include this as 1.7 in the section above? This is essentially an insecure data handling category, I think? cc: @heltonduarte @capachino

Looking at this, I agree -- keeping it under a new 1.7 section would be better because of that and it would keep the tool as a single unified workflow!

jajanet requested review from QuanZhang-William, capachino, evanotero, heltonduarte, pedrour and shrishabh as code owners October 13, 2025 17:11

add privacy specific taxonomy to security analyze command

c01365b

jajanet force-pushed the main branch from 1c60530 to c01365b Compare October 13, 2025 17:21

capachino changed the title ~~Add privacy specific taxonomy~~ feat: Add privacy specific taxonomy Oct 13, 2025

heltonduarte reviewed Oct 14, 2025

jajanet added 2 commits October 15, 2025 17:17

Relocate privacy skillset, remove datamap table in favor of additonal…

26b4986

… privacy fields where relevant

Extra space and some cleanup

60aa578

heltonduarte approved these changes Oct 20, 2025

Merge branch 'gemini-cli-extensions:main' into main

690c9e0

shrishabh reviewed Oct 23, 2025

add period

580ea8b
