HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Published in Conference on Neural Information Processing Systems (NeurIPS), 2025
This paper proposes a framework for a family of hyperbolic LLMs, including a mixture-of-curvature experts module in which each expert operates in a distinct curvature space, a hyperbolic Multi-Head Latent Attention mechanism, and hyperbolic rotary positional encoding.
Recommended citation: Neil He, Rishabh Anand, Hiren Madhu, Ali Maatouk, Smita Krishnaswamy, Leandros Tassiulas, Menglin Yang, and Rex Ying. "HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts." arXiv preprint. 2025.
Paper Link | Code Link