@csmith49
Collaborator

This PR addresses #59 by porting over a version of the security analyzer framework.

The main change from v0 to v1 is where the security analyzer is called. Previously, it hooked into the agent controller and intercepted pending actions. However, since the agent is now responsible for executing actions, and will do so unless confirmation mode is set, we now pass the security analyzer to the agent directly.
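
Editor's sketch of the control flow described above, under stated assumptions: the names `Agent`, `SecurityAnalyzer`, `Action`, `needs_confirmation`, and `step`, along with the `LOW`/`HIGH` risk levels, are hypothetical and chosen for illustration; they are not taken from the SDK. The point is only that the agent itself holds the analyzer and consults it before executing an action.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SecurityRisk(Enum):
    UNKNOWN = "UNKNOWN"  # appears in the PR diff
    LOW = "LOW"          # assumed level, for illustration only
    HIGH = "HIGH"        # assumed level, for illustration only


@dataclass
class Action:
    command: str
    security_risk: SecurityRisk = SecurityRisk.UNKNOWN


class SecurityAnalyzer:
    """Hypothetical analyzer: asks for confirmation unless the risk is LOW."""

    def needs_confirmation(self, action: Action, confirmation_mode: bool) -> bool:
        return confirmation_mode and action.security_risk is not SecurityRisk.LOW


@dataclass
class Agent:
    # The analyzer is handed to the agent rather than to a controller.
    security_analyzer: Optional[SecurityAnalyzer] = None
    confirmation_mode: bool = False

    def step(self, action: Action) -> str:
        analyzer = self.security_analyzer
        if analyzer is not None and analyzer.needs_confirmation(
            action, self.confirmation_mode
        ):
            return "WAITING_FOR_CONFIRMATION"
        # No analyzer, or no confirmation needed: the agent executes directly.
        return f"EXECUTED: {action.command}"
```

With this shape, an agent constructed without an analyzer always executes, while one constructed with an analyzer and `confirmation_mode=True` pauses on anything not rated `LOW`.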

default="UNKNOWN", description=SECURITY_RISK_DESC
security_risk: SecurityRisk = Field(
default=SecurityRisk.UNKNOWN,
description="The LLM's assessment of the safety risk of this action.",
Collaborator

This probably doesn't need to be done in this PR, but I wonder if there's a way to dynamically add this field to the action schema only when the LLM analyzer is enabled?

BUT this probably means more dark magic for ActionSchema 😓
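
Editor's sketch of what "dynamically adding the field" could look like, under stated assumptions: it uses the standard library's `dataclasses.make_dataclass` as a stand-in for the project's pydantic-based schema, and `BashAction` and `with_security_risk` are hypothetical names. It is meant to illustrate the idea in the comment above, not the approach the project took.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from enum import Enum


class SecurityRisk(Enum):
    UNKNOWN = "UNKNOWN"


@dataclass
class BashAction:
    """Hypothetical action schema without any security field."""

    command: str


def with_security_risk(action_cls: type, llm_analyzer_enabled: bool) -> type:
    """Return action_cls unchanged, or a subclass with a security_risk field."""
    if not llm_analyzer_enabled:
        return action_cls
    return make_dataclass(
        f"{action_cls.__name__}WithRisk",
        [("security_risk", SecurityRisk, field(default=SecurityRisk.UNKNOWN))],
        bases=(action_cls,),
    )


# Usage: the extra field exists only on the dynamically derived class.
RiskAwareBashAction = with_security_risk(BashAction, llm_analyzer_enabled=True)
print([f.name for f in fields(RiskAwareBashAction)])
```

The trade-off raised in the thread applies here too: the schema seen by the LLM now depends on runtime configuration, which is harder to reason about than a static field.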

Collaborator Author

I'm a little scared about the dynamic generation tbh. We can go there later if it's needed though.

Collaborator

👍 Already created an issue to track this: #300

it'll be good if we can have a better mechanism to keep track of all these :(

Calvin Smith and others added 5 commits September 17, 2025 08:05
- Reduced from 260 lines to 74 lines (71% reduction)
- Removed verbose custom confirmation handling
- Eliminated repetitive test scenarios
- Focused on essential functionality: creating and using LLMSecurityAnalyzer
- Follows the same concise pattern as other examples
- Still demonstrates key concepts: automatic security risk evaluation

Co-authored-by: openhands <[email protected]>
@csmith49 self-assigned this Sep 17, 2025
@github-actions
Contributor

github-actions bot commented Sep 17, 2025

Coverage

Coverage Report
File | Stmts | Miss | Cover | Missing
openhands
   __init__.py | 1 | 0 | 100% |
openhands/agent_server
   __init__.py | 16 | 16 | 0% | 1, 3–5, 9–12, 20–22, 25, 27–29, 31
   __main__.py | 15 | 15 | 0% | 1, 3, 6–8, 11, 14, 22, 24–27, 29, 38–39
   api.py | 20 | 20 | 0% | 1–2, 4, 6, 9, 12, 15, 18, 24–28, 31–32, 36–37, 40–42
   config.py | 42 | 42 | 0% | 1–4, 6, 10–11, 14, 17, 21, 27–28, 31, 34, 38, 41, 47, 54, 61, 67, 73, 77, 79–80, 82, 85–87, 89–90, 93–96, 98, 101, 104, 107, 109–110, 113–114
   conversation_router.py | 50 | 50 | 0% | 3–4, 6, 8, 11, 18, 21–22, 28–29, 48–50, 55–56, 63–64, 67–68, 70–73, 76–77, 82–84, 90–91, 95–96, 99, 102–106, 109, 112–116, 119–124
   conversation_service.py | 195 | 195 | 0% | 1–6, 8, 10–12, 19–22, 25, 28–29, 35–39, 41–48, 50, 57–58, 61–63, 69–70, 72, 75–82, 85–86, 89–93, 96–98, 100–103, 105, 107, 112–113, 115–117, 120–121, 123, 125, 127, 132–136, 140, 144–148, 151–152, 159–160, 173–177, 180, 182–183, 185–191, 193–199, 201–210, 212–215, 217–225, 230–231, 234–236, 238–242, 244, 251–253, 261–263, 265–266, 269–274, 276, 278, 280–281, 283, 285–286, 288, 290–291, 293–294, 297–299, 302, 308–311, 318–319, 323–327, 329, 334, 337, 340, 342–343, 345, 349–351
   event_router.py | 86 | 86 | 0% | 5–8, 10, 18, 20, 23, 31–32, 35–37, 42–43, 65–70, 73–74, 84–88, 91–92, 94–100, 103–104, 109–113, 119–120, 122–127, 130, 133, 137–141, 147–148, 152–156, 159–168, 170, 173–175, 177–182
   event_service.py | 126 | 126 | 0% | 1–4, 6, 13–15, 21–22, 25–26, 32–36, 38–40, 42–45, 47–50, 53, 56, 58, 65–66, 69–71, 73, 78–79, 82–85, 88–89, 92–96, 99–101, 103–106, 108–109, 111–113, 115, 120–121, 123–125, 127, 132–133, 135, 137, 139–143, 145–152, 154–155, 157–158, 160, 162–166, 182–183, 185, 187–190, 192–194, 196, 198–201, 203–207, 209–212, 214–216, 218–220
   middleware.py | 26 | 26 | 0% | 1, 3–6, 9, 14–15, 23–26, 29–30, 33–34, 37, 46–48, 50, 53–57
   models.py | 50 | 50 | 0% | 1–4, 6, 8, 12, 19–20, 27, 30, 33–36, 39, 42–43, 46, 52–54, 59–61, 64, 72, 77, 80, 87, 90–93, 96, 99, 102–104, 107–109, 112, 115–116, 119–120, 123–125
   pub_sub.py | 35 | 35 | 0% | 1–4, 6–7, 10, 13–15, 18, 22–23, 30, 32, 39–42, 44, 51–54, 56, 59, 61, 68–72, 74–75, 78
   utils.py | 28 | 28 | 0% | 1–4, 6, 9, 11, 14–18, 21, 25–30, 33–36, 39–41, 44, 47
openhands/sdk
   __init__.py | 16 | 2 | 87% | 27–28
   logger.py | 73 | 21 | 71% | 33, 57, 64–67, 69–71, 124, 129–131, 134–135, 141–143, 150, 155–156
openhands/sdk/agent
   __init__.py | 4 | 0 | 100% |
   agent.py | 181 | 41 | 77% | 62, 69, 76, 80, 97, 111, 118–119, 124–125, 196–197, 199–201, 203–205, 240, 254, 277, 316, 321–323, 326–327, 330, 360–362, 366–368, 375–376, 380, 384–385, 415, 422
   base.py | 94 | 7 | 92% | 113, 127, 135–137, 153, 173
   spec.py | 15 | 0 | 100% |
openhands/sdk/context
   __init__.py | 4 | 0 | 100% |
   agent_context.py | 57 | 2 | 96% | 146, 152
   manager.py | 3 | 3 | 0% | 1, 4–5
   view.py | 97 | 1 | 98% | 90
openhands/sdk/context/condenser
   __init__.py | 5 | 0 | 100% |
   base.py | 21 | 0 | 100% |
   llm_summarizing_condenser.py | 39 | 3 | 92% | 44–46
   no_op_condenser.py | 6 | 0 | 100% |
   pipeline_condenser.py | 13 | 6 | 53% | 45–50
openhands/sdk/context/microagents
   __init__.py | 4 | 0 | 100% |
   exceptions.py | 5 | 0 | 100% |
   microagent.py | 143 | 25 | 82% | 130, 133–136, 218–221, 229, 251–252, 257–258, 260, 264, 271–273, 281–283, 337, 339–340
   types.py | 21 | 0 | 100% |
openhands/sdk/context/prompts
   __init__.py | 2 | 0 | 100% |
   prompt.py | 30 | 5 | 83% | 12, 15, 24, 44–45
openhands/sdk/conversation
   __init__.py | 7 | 0 | 100% |
   conversation.py | 115 | 11 | 90% | 115, 123–125, 129–130, 193, 275–276, 284–285
   event_store.py | 101 | 8 | 92% | 50–51, 60, 67, 72–73, 129, 142
   persistence_const.py | 5 | 0 | 100% |
   secrets_manager.py | 41 | 1 | 97% | 107
   serialization_diff.py | 0 | 0 | 100% |
   state.py | 101 | 5 | 95% | 141, 164, 200–202
   types.py | 6 | 0 | 100% |
   visualizer.py | 94 | 6 | 93% | 90, 147, 169, 186, 218, 220
openhands/sdk/event
   __init__.py | 7 | 0 | 100% |
   base.py | 74 | 8 | 89% | 55, 75, 87–88, 94, 97–98, 100
   condenser.py | 28 | 7 | 75% | 37, 39, 41–45
   llm_convertible.py | 179 | 16 | 91% | 53, 63–64, 69–70, 246, 280–281, 286, 294, 335–336, 341, 374–375, 380
   metric_events.py | 13 | 0 | 100% |
   types.py | 7 | 0 | 100% |
   user_action.py | 12 | 1 | 91% | 21
   utils.py | 12 | 0 | 100% |
openhands/sdk/io
   __init__.py | 4 | 0 | 100% |
   base.py | 14 | 4 | 71% | 7, 11, 15, 19
   local.py | 56 | 16 | 71% | 43–44, 58, 66–78
   memory.py | 43 | 4 | 90% | 16, 20, 53–54
openhands/sdk/llm
   __init__.py | 8 | 0 | 100% |
   exceptions.py | 36 | 0 | 100% |
   llm.py | 399 | 102 | 74% | 229, 234, 247–249, 253–254, 286, 349, 355–356, 451, 464–465, 470–471, 473–474, 477–479, 484–486, 490–492, 513–516, 523, 541–542, 545–546, 570, 576–577, 623, 672, 689–690, 699, 710, 731, 733–738, 740–757, 760–764, 766–767, 773–782, 786–797, 810, 824, 829
   llm_registry.py | 38 | 0 | 100% |
   message.py | 109 | 4 | 96% | 96, 99, 222–223
   metadata.py | 15 | 0 | 100% |
openhands/sdk/llm/mixins
   fn_call_converter.py | 343 | 92 | 73% | 74, 343, 345, 349, 367, 369, 375, 381, 383, 422, 424, 426, 428, 433–434, 518–520, 522, 524, 545–547, 553, 575, 601–602, 610–613, 615, 617, 639, 648, 656, 701–704, 708–711, 723, 727, 738, 748, 797–798, 800, 829, 833, 859, 867, 870–871, 876, 905–908, 912–913, 918–919, 924, 973–974, 980, 994, 1006, 1008–1009, 1012–1014, 1016–1017, 1023–1025, 1027–1028, 1030, 1032, 1036, 1038, 1043, 1045–1046, 1049
   non_native_fc.py | 39 | 3 | 92% | 64, 75, 91
openhands/sdk/llm/utils
   metrics.py | 111 | 3 | 97% | 17, 117, 311
   model_features.py | 40 | 0 | 100% |
   retry_mixin.py | 50 | 11 | 78% | 47, 50, 64, 86, 90, 94–95, 105, 110–111, 116
   telemetry.py | 136 | 15 | 88% | 71, 94, 99–100, 112–113, 120, 134, 199, 216, 222, 229, 232, 234, 241
   unverified_models.py | 69 | 4 | 94% | 45–46, 51, 73
   verified_models.py | 5 | 0 | 100% |
openhands/sdk/mcp
   __init__.py | 5 | 0 | 100% |
   client.py | 26 | 6 | 76% | 48–49, 62–63, 72–73
   definition.py | 48 | 16 | 66% | 55, 75–80, 82–90
   tool.py | 40 | 13 | 67% | 36–39, 43, 46, 49–52, 101–102, 107
   utils.py | 30 | 4 | 86% | 23–24, 27, 30
openhands/sdk/preset
   __init__.py | 0 | 0 | 100% |
   default.py | 20 | 12 | 40% | 13, 15, 22, 28–29, 31–33, 35–36, 43, 45
openhands/sdk/security
   __init__.py | 2 | 0 | 100% |
   analyzer.py | 37 | 8 | 78% | 42, 75, 77–78, 80–81, 83, 86
   llm_analyzer.py | 9 | 0 | 100% |
   risk.py | 12 | 2 | 83% | 21, 33
openhands/sdk/tool
   __init__.py | 5 | 0 | 100% |
   schema.py | 126 | 11 | 91% | 24–26, 28, 37, 242–245, 265, 280
   spec.py | 15 | 0 | 100% |
   tool.py | 96 | 10 | 89% | 65, 106, 176, 179–185
openhands/sdk/tool/builtins
   __init__.py | 4 | 0 | 100% |
   finish.py | 26 | 1 | 96% | 33
   think.py | 32 | 13 | 59% | 24, 27–28, 31, 33–37, 39, 51, 57, 74
openhands/sdk/utils
   __init__.py | 3 | 0 | 100% |
   async_executor.py | 52 | 7 | 86% | 39, 55–56, 84, 88, 102–103
   async_utils.py | 12 | 0 | 100% |
   discriminated_union.py | 168 | 22 | 86% | 119–127, 141–142, 236, 329, 356, 399–401, 418, 451, 464, 471, 474
   json.py | 28 | 28 | 0% | 1–3, 5, 7–8, 11, 14–21, 25, 28, 30–31, 34, 37–38, 40, 43, 45–48
   protocol.py | 3 | 0 | 100% |
   pydantic_diff.py | 57 | 15 | 73% | 36, 44, 50–58, 60–62, 65
   truncate.py | 10 | 0 | 100% |
   visualize.py | 17 | 4 | 76% | 14–16, 22
openhands/tools
   __init__.py | 17 | 2 | 88% | 29–30
openhands/tools/browser_use
   __init__.py | 3 | 0 | 100% |
   definition.py | 109 | 17 | 84% | 28–29, 31, 35, 37–38, 40, 88, 145, 199, 250, 301, 352, 397, 442, 492, 541
   impl.py | 113 | 73 | 35% | 38, 44, 58–59, 61–70, 73–82, 84–85, 87–91, 95, 97–98, 103–104, 108–109, 114–115, 119–120, 124–125, 129, 131–132, 134–137, 140–141, 144, 146, 148, 153–154, 158–159, 163–164, 169–170, 177–179, 188–189, 196–197, 206–207
   server.py | 45 | 42 | 6% | 11, 13–14, 16–17, 20–21, 24–25, 28, 30–32, 34–35, 38–39, 41, 44, 47–48, 51, 54–55, 57–61, 64–66, 68, 73–78, 80, 87, 89
openhands/tools/execute_bash
   __init__.py | 4 | 0 | 100% |
   constants.py | 9 | 0 | 100% |
   definition.py | 95 | 46 | 51% | 38, 41, 44–45, 47, 50–52, 54–56, 58, 106, 109–111, 114, 116–118, 120, 124–125, 128–130, 132–133, 136–139, 143–145, 150, 154–156, 159–161, 165–166, 168, 248
   impl.py | 40 | 3 | 92% | 55, 58, 62
   metadata.py | 50 | 3 | 94% | 95–96, 100
openhands/tools/execute_bash/terminal
   __init__.py | 6 | 0 | 100% |
   factory.py | 49 | 11 | 77% | 24–25, 30, 32, 35, 37–38, 44–46, 97
   interface.py | 69 | 15 | 78% | 43, 52, 62, 71, 76, 85, 94, 99, 145, 157, 162, 171, 180, 191, 193
   subprocess_terminal.py | 236 | 59 | 75% | 68, 99–100, 126, 132, 139, 146–147, 157–158, 164–165, 179, 181, 185–187, 193, 209, 218–222, 257–259, 264, 276, 290, 314, 316, 325, 346, 362, 367, 373–375, 383–384, 388–389, 391–397, 401–402, 405–406, 408–409, 411–413
   terminal_session.py | 178 | 8 | 95% | 92, 96–98, 235, 281, 297, 317
   tmux_terminal.py | 80 | 21 | 73% | 36, 45, 108, 119, 133, 145–152, 160–161, 163–164, 166, 168–170
openhands/tools/execute_bash/utils
   command.py | 81 | 4 | 95% | 48, 64–66
openhands/tools/str_replace_editor
   __init__.py | 3 | 0 | 100% |
   definition.py | 65 | 9 | 86% | 87, 99, 119, 122, 125, 132, 134, 136, 138
   editor.py | 228 | 11 | 95% | 131, 264, 340, 350, 401–402, 641, 648–649, 663, 668
   exceptions.py | 22 | 0 | 100% |
   impl.py | 26 | 2 | 92% | 31–32
openhands/tools/str_replace_editor/utils
   __init__.py | 0 | 0 | 100% |
   config.py | 2 | 0 | 100% |
   constants.py | 5 | 0 | 100% |
   diff.py | 64 | 1 | 98% | 115
   encoding.py | 54 | 1 | 98% | 81
   file_cache.py | 95 | 9 | 90% | 44–46, 49–50, 54, 59, 151, 154
   history.py | 66 | 1 | 98% | 79
   shell.py | 23 | 0 | 100% |
openhands/tools/task_tracker
   __init__.py | 2 | 2 | 0% | 1, 10
   definition.py | 132 | 132 | 0% | 1–4, 6–7, 9–11, 20, 23–26, 33, 36, 40, 45–46, 48, 51–53, 55–56, 59–60, 62, 65, 68, 71–72, 76–78, 80–81, 83, 85, 87–88, 91, 94–96, 98–99, 102–108, 110–112, 115, 117–120, 122, 125, 128–129, 131–132, 134–135, 137, 140, 143, 150–151, 154–155, 157, 159, 161, 163–165, 171, 173–174, 179–180, 184, 191, 193–194, 196–198, 202–203, 205–208, 210, 212, 214–215, 217–219, 221–225, 229, 231, 233–234, 236–237, 239, 241–245, 249, 382, 396, 399–400, 407, 410
openhands/tools/utils
   __init__.py | 0 | 0 | 100% |
TOTAL | 6694 | 1770 | 73% |

@openhands-ai

openhands-ai bot commented Sep 17, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Pre-commit checks

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #305 at branch `feat/sec-an`

Feel free to include any additional details that might help me get this PR into a better state.

@csmith49 marked this pull request as ready for review September 17, 2025 21:45
@csmith49
Collaborator Author

Holding off on a separate ConfirmationPolicy until another PR. There are a few open questions that deserve more discussion and are totally orthogonal to the changes here.

@csmith49
Collaborator Author

@xingyaoww and @malhotra5 Fixed up all the changes if you want to re-review. Will probably merge in the morning if there aren't any outstanding issues.

Collaborator

@xingyaoww left a comment

LGTM! Just a few nits -- happy to get this merged once they are resolved

risk, state.confirmation_mode
):
state.agent_status = AgentExecutionStatus.WAITING_FOR_CONFIRMATION
return True
Collaborator

@malhotra5 actually, related to this, it just occurred to me: could we implement "confirmation mode" as a different security analyzer that requires confirmation for EVERY action?

Always requiring confirmation and LLM risk-based confirmation seem to be two orthogonal things that could be split into two analyzers, and confirmation_mode could essentially be thrown away -- all we would need to check is whether security_analyzer is None or not. Would that help eliminate a lot of cases in the CLI for you?
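
Editor's sketch of the proposal in this comment, under stated assumptions: `AlwaysConfirmAnalyzer`, `RiskBasedAnalyzer`, `requires_confirmation`, and `should_pause` are hypothetical names, and the `LOW`/`HIGH` risk levels are assumed for illustration. It shows how "confirm everything" and "confirm by LLM-assessed risk" could be two analyzers behind one interface, so the caller only checks whether an analyzer is present.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Optional


class SecurityRisk(Enum):
    UNKNOWN = "UNKNOWN"  # appears in the PR diff
    LOW = "LOW"          # assumed level, for illustration only
    HIGH = "HIGH"        # assumed level, for illustration only


class SecurityAnalyzer(ABC):
    @abstractmethod
    def requires_confirmation(self, risk: SecurityRisk) -> bool:
        """Decide whether the user must confirm an action with this risk."""


class AlwaysConfirmAnalyzer(SecurityAnalyzer):
    """Plays the role of confirmation mode: every action needs confirmation."""

    def requires_confirmation(self, risk: SecurityRisk) -> bool:
        return True


class RiskBasedAnalyzer(SecurityAnalyzer):
    """Confirms only when the LLM-assessed risk is not LOW."""

    def requires_confirmation(self, risk: SecurityRisk) -> bool:
        return risk is not SecurityRisk.LOW


def should_pause(analyzer: Optional[SecurityAnalyzer], risk: SecurityRisk) -> bool:
    # No separate confirmation_mode flag: a missing analyzer means "just run".
    return analyzer is not None and analyzer.requires_confirmation(risk)
```

Under this design the CLI would select behavior by choosing which analyzer (if any) to pass in, rather than by combining a flag with an analyzer.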

Collaborator

☝️ we can merge this PR -- this is non-blocking

@csmith49 enabled auto-merge (squash) September 18, 2025 14:10
@github-actions
Contributor

Agent Server image for this PR

Pull (multi-arch manifest):

docker pull ghcr.io/all-hands-ai/agent-server:726cea1

Run:

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-726cea1 \
  ghcr.io/all-hands-ai/agent-server:726cea1

This tag is a multi-arch manifest (amd64/arm64). Your client pulls the right arch automatically.

@csmith49 merged commit c352746 into main Sep 18, 2025
13 checks passed
@csmith49 deleted the feat/sec-an branch September 18, 2025 14:16
4 participants