feat: Add privacy specific taxonomy #84
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
commands/security/analyze.toml
Outdated
| * **Action:** Read the entire `DRAFT_SECURITY_REPORT.md` file. | ||
| * **Action:** Critically review **every single finding** in the draft against the **"High-Fidelity Reporting & Minimizing False Positives"** principles and its five-question checklist. | ||
| * **Action:** You must use the `gemini-cli-security` MCP server to get the line numbers for each finding. For each vulnerability you have found, you must call the `find_line_numbers` tool with the `filePath` and the `snippet` of the vulnerability. You will then add the `startLine` and `endLine` to the final report. | ||
| * **Action:** After reviewing the detailed findings, you will synthesize all identified privacy violations into a summary table. This table must be included at the top of the final report under a `## Privacy Data Map` heading. |
I think this pollutes the output too much without bringing extra value compared to the "vulnerability" it already surfaces. One idea is just to add source and sink to the summary of the privacy violation when generating the report.
Got it! I wasn't sure of the best way to rectify this. Currently, I added fields to the `Skillset: Reporting` section in GEMINI.md that are conditional on a vulnerability being privacy-related, along with a vulnerability type field.
I guess the main question I have is: should the privacy and security issues commingle in the final report?
As of the recent changes, they commingle. For example, a single report could list a security issue, followed by a couple of privacy issues, followed by another security issue: XSS, PII in Logs, PII to 3P, SSRF.
Alternatively, we could have a separate Security section and Privacy section. For the same example, the Security section would have XSS and SSRF, and the Privacy section would have PII in Logs and PII to 3P.
Thoughts?
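To make the two layouts concrete, here is a minimal sketch using the four example findings from this comment. The `FINDINGS` list and the `commingled` and `sectioned` helpers are hypothetical names used only for illustration; they are not part of the extension.

```python
from collections import defaultdict

# Hypothetical findings in discovery order, taken from the example in this comment.
FINDINGS = [
    ("XSS", "security"),
    ("PII in Logs", "privacy"),
    ("PII to 3P", "privacy"),
    ("SSRF", "security"),
]


def commingled(findings):
    """Keep discovery order and tag each finding with its type."""
    return [f"{name} ({kind})" for name, kind in findings]


def sectioned(findings):
    """Group findings by type, preserving discovery order within each group."""
    sections = defaultdict(list)
    for name, kind in findings:
        sections[kind].append(name)
    return dict(sections)


if __name__ == "__main__":
    print(commingled(FINDINGS))
    print(sectioned(FINDINGS))
```

Either layout carries the same information; the sectioned form only changes how findings are ordered in the report.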
commands/security/analyze.toml
Outdated
| The core principle is to trace untrusted data from its entry point (**Source**) to a location where it is executed or rendered (**Sink**). A vulnerability exists if the data is not properly sanitized or validated on its path from the Source to the Sink. | ||
| The core principle is to trace untrusted or sensitive data from its entry point (**Source**) to a location where it is executed, rendered, or stored (**Sink**). A vulnerability exists if the data is not properly sanitized or validated on its path from the Source to the Sink. | ||
| ### Extended Skillset: Privacy Taint Analysis |
Have you considered merging this "Privacy Taint Analysis" into the current taxonomy of "Logging of Sensitive Information" and "PII Handling Violations" in GEMINI.md?
Ah yes, that looks like a better spot to put it! Let me move it there!
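For readers unfamiliar with the Source-to-Sink principle quoted in the diff above, a toy sketch may help. It assumes the data flows are already available as a simple adjacency map, which is not how the prompt-driven analysis works in practice; `tainted_paths` and every node name below are hypothetical.

```python
def tainted_paths(flows, sources, sinks, sanitizers):
    """Return each path from a source to a sink that passes no sanitizer.

    `flows` maps a node to the nodes its data moves to next.
    """
    results = []

    def walk(node, path):
        if node in sanitizers:
            return  # data is cleaned here, so this branch is not reported
        if node in sinks:
            results.append(path)
            return
        for nxt in flows.get(node, []):
            if nxt not in path:  # guard against cycles
                walk(nxt, path + [nxt])

    for src in sources:
        walk(src, [src])
    return results


if __name__ == "__main__":
    # Hypothetical flow: an email address reaches a log call directly,
    # but reaches a third-party call only after redaction.
    flows = {
        "request.email": ["user.email"],
        "user.email": ["logger.info", "redact"],
        "redact": ["analytics.send"],
    }
    print(tainted_paths(
        flows,
        sources=["request.email"],
        sinks={"logger.info", "analytics.send"},
        sanitizers={"redact"},
    ))
```

In this example only the logging path is reported, because the path to the third-party sink goes through the sanitizer.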
… privacy fields where relevant
GEMINI.md
Outdated
| * **Severity:** Critical, High, Medium, or Low. | ||
| * **Location:** The file path where the vulnerability was introduced and the line numbers if that is available. | ||
| * **Source Location:** The file path where the vulnerability was introduced and the line numbers if that is available. | ||
| * **Sink Location:** If this is a privacy issue, include this location where sensitive data is exposed or leaves the application's trust boundary |
Nit: add a final period here.
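As an illustration of the conditional field in this hunk, the sketch below renders a finding and emits `Sink Location` only for privacy issues. The `Finding` dataclass, the `render` helper, the `Vulnerability Type` label, and the file paths are assumptions for illustration; the actual report is produced by the model from the instructions in GEMINI.md, not by code like this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    vulnerability_type: str  # assumed values: "security" or "privacy"
    severity: str            # Critical, High, Medium, or Low
    source_location: str
    sink_location: Optional[str] = None


def render(finding):
    """Render report bullets; Sink Location appears only for privacy findings."""
    lines = [
        f"* **Vulnerability Type:** {finding.vulnerability_type}",
        f"* **Severity:** {finding.severity}",
        f"* **Source Location:** {finding.source_location}",
    ]
    if finding.vulnerability_type == "privacy" and finding.sink_location:
        lines.append(f"* **Sink Location:** {finding.sink_location}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical paths and line numbers.
    print(render(Finding("privacy", "High", "src/user.py:12", "src/log.py:40")))
```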
| --- | ||
| ## Skillset: Privacy Taint Analysis |
Since we are effectively expanding the taxonomy, would it be better to have this included as 1.7 in the section above? This is essentially the insecure data handling category, I think? cc: @heltonduarte @capachino
Looking at this, I agree -- keeping it under a new 1.7 section would be better because of that and it would keep the tool as a single unified workflow!
As part of #47, this PR helps ensure that P0 CUJ-1 (log data leak ID and removal) and P0 CUJ-2 (ID sensitive flow to 3P) are addressed in the `security:analyze` command.
This also helps cover more privacy-specific features by outputting a simple data map with sources and sinks at the end of the analysis.
Pending more test cases, this is an example of what a run would look like with a small set of tests: https://screenshot.googleplex.com/8nuFzxWcS5V2X6b (computer settings won't let me paste or upload an image to GH for some reason)
In short, this mainly adds: