@AayushTyagi1 (Contributor) commented Nov 20, 2025

Screen.Recording.2025-11-21.at.3.52.25.AM.mov

@gemini-code-assist (Contributor)

Summary of Changes

Hello @AayushTyagi1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new query analysis tool that helps users identify and optimize slow or inefficient queries. The tool reads the system:completed_requests catalog and reports on query duration, frequency, resource consumption, and index utilization, so that performance problems can be tuned proactively.

Highlights

  • New Query Analysis Tool: A new utility function, analyze_queries, has been introduced to provide comprehensive performance analysis of queries executed against the system.
  • Multiple Analysis Types: The analyze_queries function supports various analysis types, including identifying longest-running, most frequent, largest response, large result count, primary index usage, lack of covering indexes, and poorly selective queries.
  • Tool Integration: The new analyze_queries function is now exposed through the src/tools/__init__.py file, making it readily available as part of the tools module for broader use.
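
As a rough illustration of the shape these highlights describe, the sketch below keeps one entry per analysis type and looks the query up by name. The registry name `ANALYSES`, the helper `get_analysis_query`, and the error handling are assumptions made for illustration; the two keys and their query fragments are taken from the snippets quoted later in this review and may be abbreviated relative to the actual code.

```python
from typing import Dict

# Hypothetical registry. Keys and query fragments follow the snippets quoted
# in the review; the real definitions in the PR may contain more clauses.
ANALYSES: Dict[str, Dict[str, str]] = {
    "primary_index": {
        "description": "Queries using primary indexes",
        "query": (
            "SELECT * FROM system:completed_requests "
            "WHERE phaseCounts.`primaryScan` IS NOT MISSING "
            "AND UPPER(statement) NOT LIKE '% SYSTEM:%'"
        ),
    },
    "no_covering_index": {
        "description": "Queries not using covering indexes",
        "query": (
            "SELECT * FROM system:completed_requests "
            "WHERE phaseCounts.`indexScan` IS NOT MISSING "
            "AND phaseCounts.`fetch` IS NOT MISSING "
            "AND UPPER(statement) NOT LIKE '% SYSTEM:%'"
        ),
    },
}


def get_analysis_query(analysis_type: str) -> str:
    """Return the query text for a known analysis type (hypothetical helper)."""
    try:
        return ANALYSES[analysis_type]["query"]
    except KeyError:
        valid = ", ".join(sorted(ANALYSES))
        raise ValueError(
            f"Unknown analysis type {analysis_type!r}; expected one of: {valid}"
        ) from None
```

Rejecting unknown names with a message that lists the valid ones keeps a typo in the analysis type from silently running nothing.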

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a powerful new tool, analyze_queries, for diagnosing query performance issues by analyzing the system:completed_requests catalog. The overall structure is clean and extensible.

My review focuses on improving the robustness and maintainability of the new tool. I've identified some inconsistencies in how system-level queries are filtered across different analysis types, which could lead to incorrect results. I've also suggested performance improvements by avoiding SELECT * and a refactoring to reduce code duplication in the query filter logic. Addressing these points will make the tool more reliable and easier to maintain.

SELECT *
FROM system:completed_requests
WHERE phaseCounts.`primaryScan` IS NOT MISSING
AND UPPER(statement) NOT LIKE '% SYSTEM:%'

high

The filtering for this analysis is inconsistent with other analyses and the function's documentation. It's missing filters for INFER and CREATE INDEX queries. To ensure the analysis provides relevant, application-focused results, please add the missing filters.

Suggested change
- AND UPPER(statement) NOT LIKE '% SYSTEM:%'
+ AND UPPER(statement) NOT LIKE 'INFER %'
+ AND UPPER(statement) NOT LIKE 'CREATE INDEX%'
+ AND UPPER(statement) NOT LIKE '% SYSTEM:%'
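
To see which statements the three patterns actually exclude, the sketch below emulates the filter in plain Python. It assumes standard LIKE semantics in which `%` matches any run of characters, and it only handles `%` (none of the patterns use `_`). The helper names `like_to_regex` and `is_application_query` are made up for this example.

```python
import re

# Patterns copied from the suggested filter; `%` is the LIKE wildcard.
SYSTEM_PATTERNS = ("INFER %", "CREATE INDEX%", "% SYSTEM:%")


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a LIKE pattern that uses only `%` into an anchored regex."""
    parts = (re.escape(part) for part in pattern.split("%"))
    return re.compile("^" + ".*".join(parts) + "$", re.DOTALL)


def is_application_query(statement: str) -> bool:
    """True when no pattern matches, i.e. the filter would keep the row."""
    upper = statement.upper()  # mirrors UPPER(statement)
    return not any(like_to_regex(p).match(upper) for p in SYSTEM_PATTERNS)


for stmt in (
    "SELECT name FROM users WHERE id = 1",
    "infer `travel-sample`",
    "CREATE INDEX idx ON users(name)",
    "select * from system:completed_requests",
    "CREATE PRIMARY INDEX ON users",
):
    print(is_application_query(stmt), stmt)
```

Under these assumptions, `CREATE PRIMARY INDEX ...` is not matched by `'CREATE INDEX%'`, and a statement in which `system:` is not preceded by a space is not matched by `'% SYSTEM:%'`. Whether either case matters depends on the workload.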

FROM system:completed_requests
WHERE phaseCounts.`indexScan` IS NOT MISSING
AND phaseCounts.`fetch` IS NOT MISSING
AND UPPER(statement) NOT LIKE '% SYSTEM:%'

high

Similar to the primary_index analysis, this query is missing filters for INFER and CREATE INDEX queries, which is inconsistent with the documentation and other analyses. Please add these filters to focus on application-level queries.

Suggested change
- AND UPPER(statement) NOT LIKE '% SYSTEM:%'
+ AND UPPER(statement) NOT LIKE 'INFER %'
+ AND UPPER(statement) NOT LIKE 'CREATE INDEX%'
+ AND UPPER(statement) NOT LIKE '% SYSTEM:%'

SELECT statement,
AVG(phaseCounts.`indexScan` - resultCount) AS diff
FROM system:completed_requests
WHERE phaseCounts.`indexScan` > resultCount

high

This analysis is missing all filters for system queries (INFER, CREATE INDEX, SYSTEM:*). This is inconsistent with the docstring and other analyses, and may include irrelevant system queries in the results. Please add the standard filters.

Suggested change
- WHERE phaseCounts.`indexScan` > resultCount
+ WHERE phaseCounts.`indexScan` > resultCount
+ AND UPPER(statement) NOT LIKE 'INFER %'
+ AND UPPER(statement) NOT LIKE 'CREATE INDEX%'
+ AND UPPER(statement) NOT LIKE '% SYSTEM:%'
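
For readers less familiar with the metric behind this analysis, the sketch below reproduces the aggregation in memory: for each statement, the average of `indexScan - resultCount` over requests where the index scan returned more entries than the query produced. The function name `poorly_selective` and the sample records are made up for illustration, and the system-query filters are left out to keep the focus on the metric.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def poorly_selective(records: List[dict], limit: int) -> List[Tuple[str, float]]:
    """Average (indexScan - resultCount) per statement, largest first."""
    diffs: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        scanned = record.get("phaseCounts", {}).get("indexScan")
        # Mirrors WHERE phaseCounts.`indexScan` > resultCount; a missing
        # indexScan count never satisfies the comparison.
        if scanned is not None and scanned > record["resultCount"]:
            diffs[record["statement"]].append(scanned - record["resultCount"])
    ranked = sorted(
        ((statement, mean(values)) for statement, values in diffs.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:limit]


# Made-up records, for illustration only.
SAMPLE = [
    {"statement": "q1", "resultCount": 10, "phaseCounts": {"indexScan": 110}},
    {"statement": "q1", "resultCount": 10, "phaseCounts": {"indexScan": 210}},
    {"statement": "q2", "resultCount": 5, "phaseCounts": {"indexScan": 5}},
    {"statement": "q3", "resultCount": 1, "phaseCounts": {"indexScan": 11}},
]

print(poorly_selective(SAMPLE, limit=10))
```

A large average difference means the index returned many entries that were later discarded, which usually points at an index whose keys do not match the query's predicates well.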

"primary_index": {
"description": "Queries using primary indexes (typically inefficient, should use secondary indexes)",
"query": """
SELECT *

medium

Using SELECT * can be inefficient because it retrieves every field of each system:completed_requests entry. It is better practice to list only the fields the analysis needs, which reduces data transfer and makes the query's intent clearer. Consider selecting key fields such as statement, serviceTime, resultCount, resultSize, and phaseCounts.

Suggested change
- SELECT *
+ SELECT statement, serviceTime, resultCount, resultSize, phaseCounts

"no_covering_index": {
"description": "Queries not using covering indexes (require document fetches, can be optimized)",
"query": """
SELECT *

medium

Using SELECT * can be inefficient because it retrieves every field of each system:completed_requests entry. It is better practice to list only the fields the analysis needs, which reduces data transfer and makes the query's intent clearer. Consider selecting key fields such as statement, serviceTime, resultCount, resultSize, and phaseCounts.

Suggested change
- SELECT *
+ SELECT statement, serviceTime, resultCount, resultSize, phaseCounts

Comment on lines +229 to +237
"query": """
SELECT statement,
AVG(phaseCounts.`indexScan` - resultCount) AS diff
FROM system:completed_requests
WHERE phaseCounts.`indexScan` > resultCount
GROUP BY statement
ORDER BY diff DESC
LIMIT $limit
""",

medium

The WHERE clause to filter out system queries is duplicated across multiple analysis definitions. To improve maintainability and ensure consistency, you could define this filter as a constant string and use an f-string to inject it into the queries. This would make it easier to update the filter in one place in the future.

Example:

SYSTEM_QUERY_FILTER = "UPPER(statement) NOT LIKE 'INFER %' AND UPPER(statement) NOT LIKE 'CREATE INDEX%' AND UPPER(statement) NOT LIKE '% SYSTEM:%'"

analyses = {
    "longest_running": {
        "description": "...",
        "query": f"""
            SELECT ...
            FROM system:completed_requests
            WHERE {SYSTEM_QUERY_FILTER}
            ...
        """,
    },
    # ... other analyses
}
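
One possible way to take this a step further is to build the filter from a list of patterns and add a small check that reports any analysis whose query does not include it. This is a sketch under assumptions: the names `EXCLUDED_PATTERNS` and `analyses_missing_filter` and the key `poorly_selective` are hypothetical, and the queries are assembled from the fragments quoted in this review rather than copied from the PR.

```python
from typing import Dict, List

# Patterns taken from the filter string in the example above.
EXCLUDED_PATTERNS = ("INFER %", "CREATE INDEX%", "% SYSTEM:%")
SYSTEM_QUERY_FILTER = " AND ".join(
    f"UPPER(statement) NOT LIKE '{pattern}'" for pattern in EXCLUDED_PATTERNS
)

# Hypothetical registry assembled from fragments quoted in the review.
ANALYSES: Dict[str, Dict[str, str]] = {
    "primary_index": {
        "query": f"""
            SELECT statement, serviceTime, resultCount, resultSize, phaseCounts
            FROM system:completed_requests
            WHERE phaseCounts.`primaryScan` IS NOT MISSING
            AND {SYSTEM_QUERY_FILTER}
        """,
    },
    "poorly_selective": {  # hypothetical key name
        "query": f"""
            SELECT statement,
            AVG(phaseCounts.`indexScan` - resultCount) AS diff
            FROM system:completed_requests
            WHERE phaseCounts.`indexScan` > resultCount
            AND {SYSTEM_QUERY_FILTER}
            GROUP BY statement
            ORDER BY diff DESC
            LIMIT $limit
        """,
    },
}


def analyses_missing_filter(analyses: Dict[str, Dict[str, str]]) -> List[str]:
    """Names of analyses whose query text lacks the shared filter."""
    return sorted(
        name
        for name, spec in analyses.items()
        if SYSTEM_QUERY_FILTER not in spec["query"]
    )


print(analyses_missing_filter(ANALYSES))
```

Running a check like `analyses_missing_filter` in a unit test would catch the kind of inconsistency flagged in the comments above before it reaches review.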
