106 commits
ceb1e04
init branch
LouisYRYJ May 15, 2025
1dc6a53
Merge branch 'main' into approx-unrolling
LouisYRYJ May 15, 2025
8b52f14
EK-FAC running
LouisYRYJ May 22, 2025
1f873c4
checkpoint EKFACs running
LouisYRYJ May 22, 2025
8581b14
Merge branch 'main' into approx-unrolling
LouisYRYJ May 22, 2025
2ac0540
Merge branch 'main' into approx-unrolling
LouisYRYJ May 25, 2025
bc2d8bf
WIP resetting
LouisYRYJ May 26, 2025
b7f1557
averages working
May 26, 2025
07626fc
unrolling pipeline works for small models (otherwise we get OOM)
May 30, 2025
de08537
pre kronfluence vendoring
LouisYRYJ Jun 2, 2025
0d2740f
pipeline with vendored library working
LouisYRYJ Jun 2, 2025
d00f620
removed score utilities from hessian
LouisYRYJ Jun 2, 2025
a692b89
debugging covariance randomness
LouisYRYJ Jun 3, 2025
97ea928
Merge branch 'main' into approx-unrolling
LouisYRYJ Jun 3, 2025
59d15f5
renaming quelle -> bergson
LouisYRYJ Jun 3, 2025
05325bf
Merge branch 'main' into approx-unrolling and testing
LouisYRYJ Jun 4, 2025
8489f44
fsdp testing
LouisYRYJ Jun 5, 2025
16d7022
covariance with hooks working
LouisYRYJ Jun 6, 2025
aa08964
using closure for covariance processing
LouisYRYJ Jun 11, 2025
c9384f6
using fsdp for covariance processing working
LouisYRYJ Jun 11, 2025
cb8466f
refactoring hessians
LouisYRYJ Jun 12, 2025
9b6045d
quick clean up
LouisYRYJ Jun 13, 2025
fce6d0d
merging
LouisYRYJ Jun 13, 2025
ae7bbc0
debugging memory leaks
LouisYRYJ Jun 17, 2025
6651e78
merge main
LouisYRYJ Jun 20, 2025
6e7eaae
ekfac refactoring WIP
LouisYRYJ Jun 21, 2025
3771dc7
EKFAC refactoring WIP
LouisYRYJ Jun 22, 2025
34a2bdf
KFAC done + slow Eigenvalue correction
LouisYRYJ Jun 22, 2025
7eed777
pipeline running
LouisYRYJ Jun 23, 2025
0ede5b6
merge main
LouisYRYJ Jun 24, 2025
24b5820
pipeline WIP
LouisYRYJ Jun 24, 2025
495cc85
pipeline with new set up running
LouisYRYJ Jun 24, 2025
27b8a71
merge main
LouisYRYJ Jun 25, 2025
144bb32
memory efficient pipeline for bigger models WIP
LouisYRYJ Jun 25, 2025
faa5d6b
scaling covariance
LouisYRYJ Jun 26, 2025
d47dd3c
detach grads
LouisYRYJ Jun 26, 2025
b667b90
sharding covariances WIP
LouisYRYJ Jun 27, 2025
1222850
proper saving WIP
LouisYRYJ Jun 29, 2025
7172a14
eigenvectors sharded
LouisYRYJ Jun 30, 2025
7c4e51a
writing and running tests (pipeline running for 7B)
LouisYRYJ Jul 1, 2025
4aaedf8
ekfac tests, 1 device passing
LouisYRYJ Jul 3, 2025
67fa643
clean up WIP
LouisYRYJ Jul 4, 2025
a43df83
refactor + bug fix: .contiguous must be called BEFORE dist.all_reduce
LouisYRYJ Jul 8, 2025
bf14d88
clean up and add README
LouisYRYJ Jul 8, 2025
d141a82
add specification
LouisYRYJ Jul 8, 2025
be41c0b
clean up WIP
LouisYRYJ Jul 8, 2025
086bcbd
merge main
LouisYRYJ Jul 8, 2025
f46c3b6
more clean up
LouisYRYJ Jul 8, 2025
d8600ac
reformatting
LouisYRYJ Jul 8, 2025
c214258
fix cpu nonblocking bug in compute_eigenvector
LouisYRYJ Jul 14, 2025
852855b
sharded matmul refactoring
LouisYRYJ Jul 15, 2025
566fd5e
small fixes
LouisYRYJ Jul 15, 2025
4714af8
rewriting dist WIP
LouisYRYJ Jul 16, 2025
9ac34e6
refactoring distributed done
LouisYRYJ Jul 16, 2025
7b0128f
fix label when prompt exceeds max_token_len
LouisYRYJ Jul 17, 2025
8a90724
attribution with ekfac
LouisYRYJ Jul 17, 2025
f6c6a30
merge main
LouisYRYJ Jul 17, 2025
33138ac
remove path dependency
LouisYRYJ Jul 18, 2025
3a305ee
fix path dependency
LouisYRYJ Jul 18, 2025
1524cca
refactor + added logger + ekfac transform running
LouisYRYJ Jul 18, 2025
7cea24b
big refactor
LouisYRYJ Jul 30, 2025
ad712f3
apply ekfac refactor
LouisYRYJ Jul 31, 2025
63c1343
merging main
LouisYRYJ Jul 31, 2025
a735ad1
reformatting
LouisYRYJ Jul 31, 2025
422157c
(re)move notebooks
LouisYRYJ Jul 31, 2025
a05d95a
minor changes
LouisYRYJ Jul 31, 2025
2528b0e
Remove attribute_results.ipynb from tracking
LouisYRYJ Jul 31, 2025
20e6b92
test ekfac apply passing
LouisYRYJ Aug 5, 2025
8393fe9
clean up
LouisYRYJ Aug 5, 2025
e252d62
apply ekfac + datafiltering
LouisYRYJ Aug 7, 2025
516ebc8
adding peft fsdp test
LouisYRYJ Aug 8, 2025
a203b75
switch fsdp and peft
LouisYRYJ Aug 8, 2025
9e0e87e
fixing peft fsdp interaction
LouisYRYJ Aug 11, 2025
30e45ab
smaller refactor for peft loading
LouisYRYJ Aug 11, 2025
aff9c64
ekfac fix to get same results as kronfluence
LouisYRYJ Aug 15, 2025
64d4eb0
update script + small fix
LouisYRYJ Aug 15, 2025
fa0a4a6
fix script path
LouisYRYJ Aug 15, 2025
8ca6324
running ekfac sweeps
LouisYRYJ Aug 17, 2025
2d13de2
refactoring ekfac computations WIP
LouisYRYJ Aug 27, 2025
c3dc77d
sharded computation moved into different class DONE
LouisYRYJ Aug 27, 2025
6d0f935
distributed friendly logging (only log rank 0)
LouisYRYJ Aug 27, 2025
28b5866
removing processor
LouisYRYJ Sep 3, 2025
644558f
cut normalizers, make ekfac_apply file handling more clear
LouisYRYJ Sep 16, 2025
bff15f2
remove break statement, now attn is included by default
LouisYRYJ Sep 20, 2025
12f26fa
fix transformers deprecated warning + make tests run
LouisYRYJ Oct 14, 2025
80fa04a
cleaner test notebook
LouisYRYJ Oct 14, 2025
8232b77
refactor collector
LouisYRYJ Oct 16, 2025
76767b4
Add pytest and pyright to dev dependencies
smarter Oct 16, 2025
aa4b370
Clear compute_ekfac_ground_truth.ipynb outputs
smarter Oct 17, 2025
8e02f92
compute_ekfac_ground_truth: Adapt to the removal of EkfacCollector
smarter Oct 17, 2025
a7a4589
compute_ekfac_ground_truth: create pile_100/data if it doesn't exist
smarter Oct 17, 2025
64dca1a
compute_ekfac_ground_truth: drop unused incorrect cell
smarter Oct 17, 2025
227c17a
compute_ekfac_ground_truth: Rewrite .ipynb as .py
smarter Oct 18, 2025
725a2a1
Deduplicate deterministic seed logic
smarter Oct 18, 2025
84ac7de
compute_ekfac_ground_truth: add --precision argument
smarter Oct 18, 2025
a9b759f
compute_ekfac_ground_truth: add --output-dir / -o
smarter Oct 18, 2025
9b553c5
Add minimal CI for typechecking
smarter Oct 18, 2025
32b80d8
edit tests to be compatible with new file naming + make documentation…
LouisYRYJ Oct 20, 2025
399c69b
Add a type alias `Precision` for the type of IndexConfig.precision
smarter Oct 29, 2025
51a25a6
Add more type annotations in compute_ekfac_ground_truth
smarter Oct 29, 2025
5e33a30
Make compute_ekfac_ground_truth usable as a notebook
smarter Oct 29, 2025
c7cb117
Reorder cells to be more convenient as a notebook
smarter Oct 30, 2025
7df9e5e
Add a type alias `Batches` for readability
smarter Oct 30, 2025
376c7fd
running precommit hook
LouisYRYJ Oct 30, 2025
7e350e6
use torch dtype + change path in run_test_compute_ekfac.sh to make th…
LouisYRYJ Oct 30, 2025
c3b24de
Merge pull request #53 from smarter/fix-ground-truth
LouisYRYJ Oct 30, 2025
39 changes: 39 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,39 @@
name: build

on:
  push:
    branches:
      - ekfac
  pull_request:
    branches:
      - ekfac
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev,faiss]"
      # TODO: Proper test infrastructure for tests/ekfac_tests
      # - name: Run tests
      #   run: pytest
      # TODO: run pyright on whole codebase
      - name: Type Checking bergson/hessians
        uses: jakebailey/pyright-action@v1
        with:
          version: 1.1.406
          working-directory: bergson/hessians
      - name: Type Checking tests/ekfac_tests
        uses: jakebailey/pyright-action@v1
        with:
          version: 1.1.406
          working-directory: tests/ekfac_tests
      - name: build
        run: pip wheel --no-deps -w dist .
        env:
          HF_HUB_DOWNLOAD_TIMEOUT: 100
34 changes: 29 additions & 5 deletions .gitignore
@@ -161,23 +161,47 @@ dmypy.json
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# JetBrains specific template is maintained in a separate JetBrains. that can
# be found at https://github.com/github//blob/main/Global/JetBrains.
# and can be added to the global or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# VS Code
.vscode/

# Ruff stuff:
.ruff_cache/

# PyPI configuration file
.pypirc

# models
*.pt
*.pth
*.safetensors
*.json
*.jsonl
*.txt
*.arrow
*.bin
*.csv
*.npy

# plots
*.png
*.jpg
*.jpeg
*.gif

# debugging results
*.svg
*.pickle
# Faiss index files
*.faiss
# Local directory for run artifacts
runs/
cache/

wandb/
.vscode/


9 changes: 7 additions & 2 deletions bergson/__main__.py
@@ -1,11 +1,16 @@
 from simple_parsing import parse

-from .build import build_gradient_dataset
+from bergson.distributed import distributed_computing
+
+from .collection import collect_gradients
 from .data import IndexConfig


 def main():
-    build_gradient_dataset(parse(IndexConfig))
+    distributed_computing(
+        parse(IndexConfig),
+        worker_fn=collect_gradients,
+    )


 if __name__ == "__main__":
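
The change above swaps a single hard-wired entry point for a generic launcher that receives the work to run as `worker_fn`. The diff does not show the signature of `distributed_computing` or of `collect_gradients`, so the following is only a minimal, standard-library sketch of that dispatch pattern. `DemoConfig`, `launch`, `record_worker`, and the `(rank, world_size, cfg)` worker signature are all hypothetical stand-ins, not bergson's API, and the ranks run sequentially in one process instead of being spawned.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DemoConfig:
    """Hypothetical stand-in for IndexConfig; the fields are illustrative only."""

    run_path: str = "runs/demo"
    world_size: int = 1


# Assumed worker signature: (rank, world_size, cfg) -> None.
Worker = Callable[[int, int, DemoConfig], None]


def launch(cfg: DemoConfig, worker_fn: Worker) -> None:
    """Call worker_fn once per rank.

    A real launcher would start one process per device; this sketch loops
    over the ranks in the current process to stay self-contained.
    """
    for rank in range(cfg.world_size):
        worker_fn(rank, cfg.world_size, cfg)


calls: list[tuple[int, int, str]] = []


def record_worker(rank: int, world_size: int, cfg: DemoConfig) -> None:
    """Toy worker that records how it was invoked."""
    calls.append((rank, world_size, cfg.run_path))


launch(DemoConfig(world_size=2), record_worker)
```

Passing the worker as an argument lets one launcher serve several pipelines, so a different entry point can reuse it with a different worker function.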
271 changes: 0 additions & 271 deletions bergson/build.py

This file was deleted.