
Conversation

g-w1 (Contributor) commented Feb 28, 2024

This can be useful for hyperparameter search.

g-w1 force-pushed the parallel-more-hyps branch 2 times, most recently from d5d9ddf to 59785e2 on February 29, 2024 16:26
g-w1 added 3 commits February 29, 2024 20:04
This allows one to train multiple autoencoders from a single layer, since submodules
are no longer used as dictionary keys, so duplicate submodules no longer collide.
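
In other words, each autoencoder is paired with its submodule by position rather than by using the submodule object as a dictionary key. A minimal sketch of the difference (the module and autoencoders here are stand-ins, not this repo's classes):
```
import torch.nn as nn

layer3_mlp = nn.Linear(512, 512)  # stand-in for model.gpt_neox.layers[3].mlp
submodules = [layer3_mlp, layer3_mlp, layer3_mlp]
aes = [nn.Linear(512, 8192) for _ in submodules]  # stand-ins for the autoencoders

# Keyed by the submodule object, the three duplicates collapse into one entry,
# so only one autoencoder (and one set of hyperparameters) survives:
by_module = {sub: ae for sub, ae in zip(submodules, aes)}
assert len(by_module) == 1

# Paired by position instead, each entry keeps its own autoencoder:
by_index = list(zip(submodules, aes))
assert len(by_index) == 3
```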

This could be useful for something like hyperparameter tuning, where you
only want to change one thing at a time.

Here's an example:
```
submodules = [model.gpt_neox.layers[3].mlp, model.gpt_neox.layers[3].mlp, model.gpt_neox.layers[3].mlp]
activation_dim = 512 # output dimension of the MLP
dictionary_size = 16 * activation_dim
learning_rates = [3e-4, 1e-3, 3e-3]
```
This allows one to re-train a sparse autoencoder on the same layer
without re-generating all of the activations to train on.
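
As a rough sketch of that idea (not this repo's actual training API): collect the activations once, then fit several autoencoders with different hyperparameters on the same cached tensor instead of re-running the base model for each run.
```
import torch
import torch.nn as nn

activation_dim = 512
dictionary_size = 16 * activation_dim
learning_rates = [3e-4, 1e-3, 3e-3]

# Stand-in for activations already collected from one MLP; in practice these
# would come from running the base model once and caching the results.
cached_acts = torch.randn(4096, activation_dim)

autoencoders = []
for lr in learning_rates:
    # One toy autoencoder per learning rate, all trained on the same activations.
    ae = nn.Sequential(nn.Linear(activation_dim, dictionary_size), nn.ReLU(),
                       nn.Linear(dictionary_size, activation_dim))
    opt = torch.optim.Adam(ae.parameters(), lr=lr)
    for _ in range(10):  # a handful of steps just to illustrate the loop
        opt.zero_grad()
        loss = (ae(cached_acts) - cached_acts).pow(2).mean()
        loss.backward()
        opt.step()
    autoencoders.append(ae)
```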
g-w1 force-pushed the parallel-more-hyps branch from a3ff78e to 72e23be on February 29, 2024 20:09