Security analyzer framework #305

csmith49 · 2025-09-16T15:30:15Z

This PR addresses #59 by porting over a version of the security analyzer framework.

The main change from v0 to v1 is where the security analyzer is called. Previously, it hooked into the agent controller and intercepted pending actions. However, since the agent is now responsible for executing actions, and will do so unless confirmation mode is set, we now pass the security analyzer to the agent directly.
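As a rough illustration of the v1 arrangement described above, here is a minimal, self-contained sketch in which the agent owns an optional analyzer and checks it before executing an action. Every name here (`Agent.step`, `KeywordAnalyzer`, the `LOW`/`HIGH` risk levels, the string statuses) is hypothetical and chosen for illustration, and the policy "pause on HIGH risk or when confirmation mode is set" is an assumption, not the SDK's actual logic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Protocol


class SecurityRisk(Enum):
    # Only UNKNOWN appears in this PR's diff; LOW and HIGH are assumed levels.
    UNKNOWN = "UNKNOWN"
    LOW = "LOW"
    HIGH = "HIGH"


class SecurityAnalyzer(Protocol):
    """Hypothetical analyzer interface: map an action to a risk level."""

    def security_risk(self, action: str) -> SecurityRisk: ...


class KeywordAnalyzer:
    """Toy analyzer that flags any action containing 'rm -rf' as HIGH risk."""

    def security_risk(self, action: str) -> SecurityRisk:
        return SecurityRisk.HIGH if "rm -rf" in action else SecurityRisk.LOW


@dataclass
class Agent:
    # The analyzer is handed to the agent directly rather than to a controller.
    security_analyzer: Optional[SecurityAnalyzer] = None
    confirmation_mode: bool = False

    def _requires_confirmation(self, action: str) -> bool:
        # Assumed policy: confirmation mode pauses everything;
        # otherwise only HIGH-risk actions pause, and only if an analyzer is set.
        if self.confirmation_mode:
            return True
        if self.security_analyzer is None:
            return False
        return self.security_analyzer.security_risk(action) is SecurityRisk.HIGH

    def step(self, action: str) -> str:
        # The agent executes the action itself unless it must wait for the user.
        if self._requires_confirmation(action):
            return "WAITING_FOR_CONFIRMATION"
        return f"EXECUTED: {action}"
```

With `Agent(security_analyzer=KeywordAnalyzer())`, a call such as `step("ls")` executes immediately, while `step("rm -rf /tmp/x")` returns the waiting status instead of running.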

openhands/sdk/agent/agent.py

openhands/sdk/security/analyzer.py

openhands/sdk/agent/agent.py

xingyaoww · 2025-09-16T16:33:14Z

openhands/sdk/tool/schema.py

-        default="UNKNOWN", description=SECURITY_RISK_DESC
+    security_risk: SecurityRisk = Field(
+        default=SecurityRisk.UNKNOWN,
+        description="The LLM's assessment of the safety risk of this action.",


This probably doesn't need to be done in this PR, but I wonder if there's a way to dynamically add this field to the action schema only when the LLM analyzer is enabled?

BUT this probably means more dark magic for ActionSchema 😓

I'm a little scared about the dynamic generation tbh. We can go there later if it's needed though.

👍 already created an issue here to track
#300

it'll be good if we can have a better mechanism to keep track of all these :(
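For reference, the dynamic-field idea floated in this thread could look roughly like the sketch below. It uses standard-library dataclasses as a stand-in for the real schema classes, so `BashAction` and `with_security_risk` are hypothetical names and this is not how `ActionSchema` works today.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from enum import Enum


class SecurityRisk(Enum):
    UNKNOWN = "UNKNOWN"


@dataclass
class BashAction:
    """Hypothetical action schema with no risk field by default."""

    command: str


def with_security_risk(action_cls: type, enabled: bool) -> type:
    """Return a subclass carrying `security_risk` only when the analyzer is enabled."""
    if not enabled:
        return action_cls
    # Build a derived dataclass at runtime; the base class is left untouched.
    return make_dataclass(
        f"{action_cls.__name__}WithRisk",
        [("security_risk", SecurityRisk, field(default=SecurityRisk.UNKNOWN))],
        bases=(action_cls,),
    )


RiskAwareBashAction = with_security_risk(BashAction, enabled=True)
field_names = [f.name for f in fields(RiskAwareBashAction)]
```

The trade-off raised in the thread still applies: the generated class exists only at runtime, so static type checkers and readers of the code cannot see the extra field.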

openhands/sdk/agent/agent.py

openhands/sdk/security/llm_analyzer.py

- Reduced from 260 lines to 74 lines (71% reduction)
- Removed verbose custom confirmation handling
- Eliminated repetitive test scenarios
- Focused on essential functionality: creating and using LLMSecurityAnalyzer
- Follows the same concise pattern as other examples
- Still demonstrates key concepts: automatic security risk evaluation

Co-authored-by: openhands <[email protected]>

github-actions · 2025-09-17T20:41:55Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands
__init__.py	1	0	100%
openhands/agent_server
__init__.py	16	16	0%	1, 3–5, 9–12, 20–22, 25, 27–29, 31
__main__.py	15	15	0%	1, 3, 6–8, 11, 14, 22, 24–27, 29, 38–39
api.py	20	20	0%	1–2, 4, 6, 9, 12, 15, 18, 24–28, 31–32, 36–37, 40–42
config.py	42	42	0%	1–4, 6, 10–11, 14, 17, 21, 27–28, 31, 34, 38, 41, 47, 54, 61, 67, 73, 77, 79–80, 82, 85–87, 89–90, 93–96, 98, 101, 104, 107, 109–110, 113–114
conversation_router.py	50	50	0%	3–4, 6, 8, 11, 18, 21–22, 28–29, 48–50, 55–56, 63–64, 67–68, 70–73, 76–77, 82–84, 90–91, 95–96, 99, 102–106, 109, 112–116, 119–124
conversation_service.py	195	195	0%	1–6, 8, 10–12, 19–22, 25, 28–29, 35–39, 41–48, 50, 57–58, 61–63, 69–70, 72, 75–82, 85–86, 89–93, 96–98, 100–103, 105, 107, 112–113, 115–117, 120–121, 123, 125, 127, 132–136, 140, 144–148, 151–152, 159–160, 173–177, 180, 182–183, 185–191, 193–199, 201–210, 212–215, 217–225, 230–231, 234–236, 238–242, 244, 251–253, 261–263, 265–266, 269–274, 276, 278, 280–281, 283, 285–286, 288, 290–291, 293–294, 297–299, 302, 308–311, 318–319, 323–327, 329, 334, 337, 340, 342–343, 345, 349–351
event_router.py	86	86	0%	5–8, 10, 18, 20, 23, 31–32, 35–37, 42–43, 65–70, 73–74, 84–88, 91–92, 94–100, 103–104, 109–113, 119–120, 122–127, 130, 133, 137–141, 147–148, 152–156, 159–168, 170, 173–175, 177–182
event_service.py	126	126	0%	1–4, 6, 13–15, 21–22, 25–26, 32–36, 38–40, 42–45, 47–50, 53, 56, 58, 65–66, 69–71, 73, 78–79, 82–85, 88–89, 92–96, 99–101, 103–106, 108–109, 111–113, 115, 120–121, 123–125, 127, 132–133, 135, 137, 139–143, 145–152, 154–155, 157–158, 160, 162–166, 182–183, 185, 187–190, 192–194, 196, 198–201, 203–207, 209–212, 214–216, 218–220
middleware.py	26	26	0%	1, 3–6, 9, 14–15, 23–26, 29–30, 33–34, 37, 46–48, 50, 53–57
models.py	50	50	0%	1–4, 6, 8, 12, 19–20, 27, 30, 33–36, 39, 42–43, 46, 52–54, 59–61, 64, 72, 77, 80, 87, 90–93, 96, 99, 102–104, 107–109, 112, 115–116, 119–120, 123–125
pub_sub.py	35	35	0%	1–4, 6–7, 10, 13–15, 18, 22–23, 30, 32, 39–42, 44, 51–54, 56, 59, 61, 68–72, 74–75, 78
utils.py	28	28	0%	1–4, 6, 9, 11, 14–18, 21, 25–30, 33–36, 39–41, 44, 47
openhands/sdk
__init__.py	16	2	87%	27–28
logger.py	73	21	71%	33, 57, 64–67, 69–71, 124, 129–131, 134–135, 141–143, 150, 155–156
openhands/sdk/agent
__init__.py	4	0	100%
agent.py	181	41	77%	62, 69, 76, 80, 97, 111, 118–119, 124–125, 196–197, 199–201, 203–205, 240, 254, 277, 316, 321–323, 326–327, 330, 360–362, 366–368, 375–376, 380, 384–385, 415, 422
base.py	94	7	92%	113, 127, 135–137, 153, 173
spec.py	15	0	100%
openhands/sdk/context
__init__.py	4	0	100%
agent_context.py	57	2	96%	146, 152
manager.py	3	3	0%	1, 4–5
view.py	97	1	98%	90
openhands/sdk/context/condenser
__init__.py	5	0	100%
base.py	21	0	100%
llm_summarizing_condenser.py	39	3	92%	44–46
no_op_condenser.py	6	0	100%
pipeline_condenser.py	13	6	53%	45–50
openhands/sdk/context/microagents
__init__.py	4	0	100%
exceptions.py	5	0	100%
microagent.py	143	25	82%	130, 133–136, 218–221, 229, 251–252, 257–258, 260, 264, 271–273, 281–283, 337, 339–340
types.py	21	0	100%
openhands/sdk/context/prompts
__init__.py	2	0	100%
prompt.py	30	5	83%	12, 15, 24, 44–45
openhands/sdk/conversation
__init__.py	7	0	100%
conversation.py	115	11	90%	115, 123–125, 129–130, 193, 275–276, 284–285
event_store.py	101	8	92%	50–51, 60, 67, 72–73, 129, 142
persistence_const.py	5	0	100%
secrets_manager.py	41	1	97%	107
serialization_diff.py	0	0	100%
state.py	101	5	95%	141, 164, 200–202
types.py	6	0	100%
visualizer.py	94	6	93%	90, 147, 169, 186, 218, 220
openhands/sdk/event
__init__.py	7	0	100%
base.py	74	8	89%	55, 75, 87–88, 94, 97–98, 100
condenser.py	28	7	75%	37, 39, 41–45
llm_convertible.py	179	16	91%	53, 63–64, 69–70, 246, 280–281, 286, 294, 335–336, 341, 374–375, 380
metric_events.py	13	0	100%
types.py	7	0	100%
user_action.py	12	1	91%	21
utils.py	12	0	100%
openhands/sdk/io
__init__.py	4	0	100%
base.py	14	4	71%	7, 11, 15, 19
local.py	56	16	71%	43–44, 58, 66–78
memory.py	43	4	90%	16, 20, 53–54
openhands/sdk/llm
__init__.py	8	0	100%
exceptions.py	36	0	100%
llm.py	399	102	74%	229, 234, 247–249, 253–254, 286, 349, 355–356, 451, 464–465, 470–471, 473–474, 477–479, 484–486, 490–492, 513–516, 523, 541–542, 545–546, 570, 576–577, 623, 672, 689–690, 699, 710, 731, 733–738, 740–757, 760–764, 766–767, 773–782, 786–797, 810, 824, 829
llm_registry.py	38	0	100%
message.py	109	4	96%	96, 99, 222–223
metadata.py	15	0	100%
openhands/sdk/llm/mixins
fn_call_converter.py	343	92	73%	74, 343, 345, 349, 367, 369, 375, 381, 383, 422, 424, 426, 428, 433–434, 518–520, 522, 524, 545–547, 553, 575, 601–602, 610–613, 615, 617, 639, 648, 656, 701–704, 708–711, 723, 727, 738, 748, 797–798, 800, 829, 833, 859, 867, 870–871, 876, 905–908, 912–913, 918–919, 924, 973–974, 980, 994, 1006, 1008–1009, 1012–1014, 1016–1017, 1023–1025, 1027–1028, 1030, 1032, 1036, 1038, 1043, 1045–1046, 1049
non_native_fc.py	39	3	92%	64, 75, 91
openhands/sdk/llm/utils
metrics.py	111	3	97%	17, 117, 311
model_features.py	40	0	100%
retry_mixin.py	50	11	78%	47, 50, 64, 86, 90, 94–95, 105, 110–111, 116
telemetry.py	136	15	88%	71, 94, 99–100, 112–113, 120, 134, 199, 216, 222, 229, 232, 234, 241
unverified_models.py	69	4	94%	45–46, 51, 73
verified_models.py	5	0	100%
openhands/sdk/mcp
__init__.py	5	0	100%
client.py	26	6	76%	48–49, 62–63, 72–73
definition.py	48	16	66%	55, 75–80, 82–90
tool.py	40	13	67%	36–39, 43, 46, 49–52, 101–102, 107
utils.py	30	4	86%	23–24, 27, 30
openhands/sdk/preset
__init__.py	0	0	100%
default.py	20	12	40%	13, 15, 22, 28–29, 31–33, 35–36, 43, 45
openhands/sdk/security
__init__.py	2	0	100%
analyzer.py	37	8	78%	42, 75, 77–78, 80–81, 83, 86
llm_analyzer.py	9	0	100%
risk.py	12	2	83%	21, 33
openhands/sdk/tool
__init__.py	5	0	100%
schema.py	126	11	91%	24–26, 28, 37, 242–245, 265, 280
spec.py	15	0	100%
tool.py	96	10	89%	65, 106, 176, 179–185
openhands/sdk/tool/builtins
__init__.py	4	0	100%
finish.py	26	1	96%	33
think.py	32	13	59%	24, 27–28, 31, 33–37, 39, 51, 57, 74
openhands/sdk/utils
__init__.py	3	0	100%
async_executor.py	52	7	86%	39, 55–56, 84, 88, 102–103
async_utils.py	12	0	100%
discriminated_union.py	168	22	86%	119–127, 141–142, 236, 329, 356, 399–401, 418, 451, 464, 471, 474
json.py	28	28	0%	1–3, 5, 7–8, 11, 14–21, 25, 28, 30–31, 34, 37–38, 40, 43, 45–48
protocol.py	3	0	100%
pydantic_diff.py	57	15	73%	36, 44, 50–58, 60–62, 65
truncate.py	10	0	100%
visualize.py	17	4	76%	14–16, 22
openhands/tools
__init__.py	17	2	88%	29–30
openhands/tools/browser_use
__init__.py	3	0	100%
definition.py	109	17	84%	28–29, 31, 35, 37–38, 40, 88, 145, 199, 250, 301, 352, 397, 442, 492, 541
impl.py	113	73	35%	38, 44, 58–59, 61–70, 73–82, 84–85, 87–91, 95, 97–98, 103–104, 108–109, 114–115, 119–120, 124–125, 129, 131–132, 134–137, 140–141, 144, 146, 148, 153–154, 158–159, 163–164, 169–170, 177–179, 188–189, 196–197, 206–207
server.py	45	42	6%	11, 13–14, 16–17, 20–21, 24–25, 28, 30–32, 34–35, 38–39, 41, 44, 47–48, 51, 54–55, 57–61, 64–66, 68, 73–78, 80, 87, 89
openhands/tools/execute_bash
__init__.py	4	0	100%
constants.py	9	0	100%
definition.py	95	46	51%	38, 41, 44–45, 47, 50–52, 54–56, 58, 106, 109–111, 114, 116–118, 120, 124–125, 128–130, 132–133, 136–139, 143–145, 150, 154–156, 159–161, 165–166, 168, 248
impl.py	40	3	92%	55, 58, 62
metadata.py	50	3	94%	95–96, 100
openhands/tools/execute_bash/terminal
__init__.py	6	0	100%
factory.py	49	11	77%	24–25, 30, 32, 35, 37–38, 44–46, 97
interface.py	69	15	78%	43, 52, 62, 71, 76, 85, 94, 99, 145, 157, 162, 171, 180, 191, 193
subprocess_terminal.py	236	59	75%	68, 99–100, 126, 132, 139, 146–147, 157–158, 164–165, 179, 181, 185–187, 193, 209, 218–222, 257–259, 264, 276, 290, 314, 316, 325, 346, 362, 367, 373–375, 383–384, 388–389, 391–397, 401–402, 405–406, 408–409, 411–413
terminal_session.py	178	8	95%	92, 96–98, 235, 281, 297, 317
tmux_terminal.py	80	21	73%	36, 45, 108, 119, 133, 145–152, 160–161, 163–164, 166, 168–170
openhands/tools/execute_bash/utils
command.py	81	4	95%	48, 64–66
openhands/tools/str_replace_editor
__init__.py	3	0	100%
definition.py	65	9	86%	87, 99, 119, 122, 125, 132, 134, 136, 138
editor.py	228	11	95%	131, 264, 340, 350, 401–402, 641, 648–649, 663, 668
exceptions.py	22	0	100%
impl.py	26	2	92%	31–32
openhands/tools/str_replace_editor/utils
__init__.py	0	0	100%
config.py	2	0	100%
constants.py	5	0	100%
diff.py	64	1	98%	115
encoding.py	54	1	98%	81
file_cache.py	95	9	90%	44–46, 49–50, 54, 59, 151, 154
history.py	66	1	98%	79
shell.py	23	0	100%
openhands/tools/task_tracker
__init__.py	2	2	0%	1, 10
definition.py	132	132	0%	1–4, 6–7, 9–11, 20, 23–26, 33, 36, 40, 45–46, 48, 51–53, 55–56, 59–60, 62, 65, 68, 71–72, 76–78, 80–81, 83, 85, 87–88, 91, 94–96, 98–99, 102–108, 110–112, 115, 117–120, 122, 125, 128–129, 131–132, 134–135, 137, 140, 143, 150–151, 154–155, 157, 159, 161, 163–165, 171, 173–174, 179–180, 184, 191, 193–194, 196–198, 202–203, 205–208, 210, 212, 214–215, 217–219, 221–225, 229, 231, 233–234, 236–237, 239, 241–245, 249, 382, 396, 399–400, 407, 410
openhands/tools/utils
__init__.py	0	0	100%
TOTAL	6694	1770	73%

openhands-ai · 2025-09-17T20:42:49Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- Pre-commit checks

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #305 at branch `feat/sec-an`

Feel free to include any additional details that might help me get this PR into a better state.

You can manage your notification settings

csmith49 · 2025-09-17T21:45:56Z

Holding off on separate ConfirmationPolicy for another PR. There are a few questions that need to be resolved that deserve more discussion and are totally orthogonal to the changes here.

csmith49 · 2025-09-17T21:47:28Z

@xingyaoww and @malhotra5 Fixed up all the changes if you want to re-review. Will probably merge in the morning if there aren't any outstanding issues.

xingyaoww

LGTM! Just a few nits -- happy to get this merged once it is resolved

examples/17_llm_security_analyzer.py

xingyaoww · 2025-09-18T12:34:31Z

openhands/sdk/agent/agent.py

+                    risk, state.confirmation_mode
+                ):
+                    state.agent_status = AgentExecutionStatus.WAITING_FOR_CONFIRMATION
+                    return True


@malhotra5 actually, related to this, it just occurred to me: could we implement "confirmation mode" as a different security analyzer that requires confirmation for EVERY action?

Always requiring confirmation and LLM-risk-based confirmation seem to be two orthogonal things that can be broken into two analyzers, and confirmation_mode can essentially be thrown away: all we need to check is whether security_analyzer is None or not. Would that help eliminate a lot of cases in the CLI for you?

☝️ we can merge this PR -- this is non-blocking
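A minimal sketch of the two-analyzer proposal above, assuming a hypothetical `requires_confirmation` method on the analyzer base class. The class names, the method signature, and the `HIGH` risk level are illustrative assumptions rather than the SDK's API.

```python
from enum import Enum
from typing import Optional


class SecurityRisk(Enum):
    # Only UNKNOWN appears in this PR's diff; LOW and HIGH are assumed levels.
    UNKNOWN = "UNKNOWN"
    LOW = "LOW"
    HIGH = "HIGH"


class SecurityAnalyzer:
    """Hypothetical base class: decide whether an action must be confirmed."""

    def requires_confirmation(self, action: str, llm_risk: SecurityRisk) -> bool:
        raise NotImplementedError


class AlwaysConfirmAnalyzer(SecurityAnalyzer):
    """Stands in for confirmation mode: every action needs confirmation."""

    def requires_confirmation(self, action: str, llm_risk: SecurityRisk) -> bool:
        return True


class LLMRiskAnalyzer(SecurityAnalyzer):
    """Confirms only when the LLM's own risk assessment is HIGH (assumed policy)."""

    def requires_confirmation(self, action: str, llm_risk: SecurityRisk) -> bool:
        return llm_risk is SecurityRisk.HIGH


def needs_confirmation(
    analyzer: Optional[SecurityAnalyzer], action: str, llm_risk: SecurityRisk
) -> bool:
    # No separate confirmation_mode flag: a missing analyzer means "just execute".
    return analyzer is not None and analyzer.requires_confirmation(action, llm_risk)
```

Under this design the caller's only branch is on whether an analyzer is present, which is the simplification the comment is after.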

tests/sdk/agent/test_agent_immutability.py

github-actions · 2025-09-18T14:16:05Z

Agent Server image for this PR

Pull (multi-arch manifest):

docker pull ghcr.io/all-hands-ai/agent-server:726cea1

Run:

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-726cea1 \
  ghcr.io/all-hands-ai/agent-server:726cea1

This tag is a multi-arch manifest (amd64/arm64). Your client pulls the right arch automatically.

Calvin Smith added 10 commits September 16, 2025 07:11

initial risk implementation

16a178c

initial analyzer base class

39973ad

referencing action base

18c43bc

minor type cleanup

cd2f8ff

clean up of risk enum

2604517

security risk cleanup

9a06679

removing tool/security_prompt, replacing with security risk

ea4a31c

cleaning up analyzer implementation

909a7be

cleanup of analyzer interface

071073c

default test framework, now failing

437ead5

csmith49 commented Sep 16, 2025

openhands/sdk/agent/agent.py

csmith49 commented Sep 16, 2025

openhands/sdk/security/analyzer.py

malhotra5 reviewed Sep 16, 2025

openhands/sdk/agent/agent.py

xingyaoww reviewed Sep 16, 2025

openhands/sdk/agent/agent.py (outdated)

Calvin Smith added 2 commits September 16, 2025 13:36

fixing circular imports and adding tests

b26142f

llm analyzer tests

3fb946c

xingyaoww reviewed Sep 17, 2025

openhands/sdk/security/llm_analyzer.py (outdated)

Calvin Smith and others added 5 commits September 17, 2025 08:05

examples

8c74fa9

examples

e7560d7

example + analyzers are base models now

fc7ca23

simpler base class, types on example

56829cd

csmith49 self-assigned this Sep 17, 2025

Calvin Smith and others added 6 commits September 17, 2025 10:03

removing unused abstract methods from llm analyzer

aebcbf0

moving fields from agent to base

e196ad6

Merge branch 'main' into feat/sec-an

6219619

Merge branch 'main' into feat/sec-an

7bb1b2d

fixing tests and circular imports

ca6ef46

renaming example

57eb708

Calvin Smith and others added 3 commits September 17, 2025 14:35

removing accidental extra cli mode flag

b7df507

fixing immutability test to reference actual fields instead of list of strs

a6cfccc

Merge branch 'main' into feat/sec-an

1f6298d

Calvin Smith added 4 commits September 17, 2025 14:46

removing/updating references to str lit security risks in tool tests

b3322f1

initial confirmation policy impl

b100c1d

fixing user confirmation bug when there's a security analyzer and confirmation mode

f281211

removing unused confirmation policy

1edc716

csmith49 marked this pull request as ready for review September 17, 2025 21:45

Merge branch 'main' into feat/sec-an

f36d191

xingyaoww reviewed Sep 18, 2025

Calvin Smith added 4 commits September 18, 2025 07:50

minor tweaks to attr access in examples

ef1696a

minor tweaks to attr access in examples

68d51ed

propagating tweaks

3fccbd8

strengthening immutability tests

b2c23c9

csmith49 enabled auto-merge (squash) September 18, 2025 14:10

Merge branch 'main' into feat/sec-an

272d8a8

csmith49 merged commit c352746 into main Sep 18, 2025
13 checks passed

csmith49 deleted the feat/sec-an branch September 18, 2025 14:16

malhotra5 mentioned this pull request Oct 20, 2025

Comprehensive overview for difficulties with security analyzer #819

Closed

4 participants
