This repo includes support code and replication scripts for the papers "Similarity-Distance-Magnitude Activations" and "Similarity-Distance-Magnitude Language Models". It contains only auxiliary code (e.g., for preprocessing the research datasets) and scripts with the parameters used in the experiments. The main code is in the Reexpress MCP Server repo, version 2.0.0 (commit 78c8465). The preprocessed data is available in the GitHub release binaries of this repo.
Create the conda environment as described in INSTALL.md. The provided scripts assume Linux and CUDA GPUs, but they should also work on CPU or on Apple silicon ('mps') if you install an applicable version of FAISS and adjust the device command-line options accordingly.
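As a rough illustration of choosing a device value before editing the scripts, the sketch below reports which of 'cuda', 'mps', or 'cpu' is usable on the current machine. It assumes the environment provides PyTorch, which this README does not state explicitly. The helper name `pick_device` is hypothetical and is not part of this repo. The name of the actual device option is defined in the provided scripts, not here.

```python
def pick_device(prefer=None):
    """Return 'cuda', 'mps', or 'cpu' based on what PyTorch reports as available.

    Hypothetical helper for illustration only; not part of this repo.
    """
    try:
        import torch  # Assumption: PyTorch is installed in the conda environment.
    except ImportError:
        # Without PyTorch we cannot probe accelerators, so fall back to CPU.
        return "cpu"

    available = {"cpu"}
    if torch.cuda.is_available():
        available.add("cuda")
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        available.add("mps")

    # Honor an explicit preference when that device is actually usable.
    if prefer in available:
        return prefer
    # Otherwise prefer an accelerator over the CPU.
    for name in ("cuda", "mps"):
        if name in available:
            return name
    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

The returned string can then be used as the device value in the provided scripts. A FAISS build that matches the chosen device is still required, as noted above.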
Scripts for training and testing the models for "Similarity-Distance-Magnitude Activations" are in the sdm_activations_paper directory. Additional experiments with variational Bayesian last-layer neural networks are described in supplemental_material.
Scripts for training and testing the models for "Similarity-Distance-Magnitude Language Models" are in the sdm_lms_paper directory.
@misc{Schmaltz-2025-SimilarityDistanceMagnitudeLanguageModels,
title={Similarity-Distance-Magnitude Language Models},
author={Allen Schmaltz},
year={2025},
eprint={2510.26183},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2510.26183},
}
@misc{Schmaltz-2025-SimilarityDistanceMagnitudeActivations,
title={Similarity-Distance-Magnitude Activations},
author={Allen Schmaltz},
year={2025},
eprint={2509.12760},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2509.12760},
}