Conversation

adamkarvonen (Collaborator)

When training SAEs on Qwen models, I found that as we move to later layers and larger models (e.g. Qwen3-32B), the activations begin to contain random attention-sink-like outliers with extremely high norms (>100x the median). They appear at random positions in the sequence, often on seemingly unimportant tokens such as the `0` in `my_list.append(0)`.

This reduces the fraction of variance explained by around 3% and adds loss spikes. It also produces a significant number of dead features early in training (often 25% or more), although this does seem to go away after 100M tokens or so.

To address this, I added an optional argument; when it is set, we filter out activations whose norm exceeds max_activation_norm_multiple times the median activation norm.
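
For reference, a minimal sketch of the filtering logic, assuming activations arrive as a `(batch, d_model)` tensor; the function name and call site here are illustrative, not the exact implementation in this PR:

```python
import torch

def filter_high_norm_activations(
    activations: torch.Tensor,
    max_activation_norm_multiple: float,
) -> torch.Tensor:
    """Drop activations whose L2 norm exceeds
    max_activation_norm_multiple * (median activation norm).

    activations: (batch, d_model) tensor of model activations.
    """
    norms = activations.norm(dim=-1)                       # (batch,)
    median_norm = norms.median()
    keep_mask = norms <= max_activation_norm_multiple * median_norm
    return activations[keep_mask]
```

In practice this runs on each activation batch before it is fed to the SAE, so the outlier tokens never contribute to the reconstruction loss.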

As an example, in the plot below the orange and green lines each show multiple high-norm activations within a single sequence.

[Screenshot (2025-08-20): per-token activation norms across sequences; the orange and green lines each show multiple high-norm spikes within a single sequence]

adamkarvonen merged commit 60ec6bf into main on Aug 21, 2025
3 checks passed