Rotate, zoom, and hover to explore the 3D PCA projections of all four embedding spaces. Points are colored by log(token frequency).
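As a rough illustration of how such a view might be produced, the sketch below projects an embedding matrix onto its top three principal components and computes the log-frequency values used for coloring. The function names are hypothetical, and it assumes PCA is applied directly to the raw embedding coordinates with strictly positive token counts; the page does not say how the hyperbolic embeddings were preprocessed before projection.

```python
import numpy as np

def pca_project_3d(embeddings):
    """Project rows of `embeddings` onto their top three principal components."""
    X = np.asarray(embeddings, dtype=float)
    # Center the data so the SVD directions are the principal axes.
    Xc = X - X.mean(axis=0, keepdims=True)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:3].T

def log_frequency_colors(freqs):
    """Return log(frequency) per token; assumes every count is > 0."""
    return np.log(np.asarray(freqs, dtype=float))
```

The resulting `(n_tokens, 3)` array and color vector can be handed to any 3D scatter plotting tool.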
Left: Euclidean GPT-2 embeddings (PPL=48.9, ρ=+0.924). Right: Lorentz GPT-2 embeddings (PPL=43.69, ρ=−0.650). The Lorentz model places frequent tokens in a dense core (dark cluster) with rare tokens radiating outward.
For performance, only 1,000 of the 16,384 tokens are shown, stratified by frequency (sampled evenly across frequency ranks).
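One plausible way to draw such a sample is to sort tokens by frequency and take evenly spaced positions along the rank axis. This is a minimal sketch under that assumption; `stratified_sample_by_rank` is a hypothetical helper, not the page's actual sampling code.

```python
import numpy as np

def stratified_sample_by_rank(freqs, n_samples=1000):
    """Return token indices sampled evenly across frequency ranks."""
    freqs = np.asarray(freqs, dtype=float)
    # Rank tokens from most to least frequent; stable sort keeps ties deterministic.
    order = np.argsort(-freqs, kind="stable")
    # Evenly spaced rank positions, including the most and least frequent token.
    n = min(n_samples, len(order))
    positions = np.linspace(0, len(order) - 1, num=n)
    # Rounding can collide when n is close to the vocabulary size, so deduplicate.
    positions = np.unique(np.round(positions).astype(int))
    return order[positions]
```

With a 16,384-token vocabulary and `n_samples=1000`, the selected ranks are roughly 16 apart, so both the frequent head and the rare tail stay visible in the plot.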
Left: Lorentz GPT-2 LorentzMLR centroids (−d² logits). Right: Poincaré Hyp-Output HyperbolicMLR centroids (Ganea et al.). Both develop frequency-correlated radial structure, but the Lorentz model achieves a 3.8× lower perplexity (PPL).
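To make the "−d² logits" notation concrete, the sketch below scores each point by the negative squared geodesic distance to every centroid on the hyperboloid. It assumes curvature −1 and the standard Lorentz-model distance d(x, y) = arccosh(−⟨x, y⟩_L); the function names are hypothetical, and the actual LorentzMLR layer may include a learned curvature, scale, or bias not shown here.

```python
import numpy as np

def lift_to_hyperboloid(v):
    """Map spatial coordinates (..., n) to points (..., n+1) with <x, x>_L = -1."""
    v = np.asarray(v, dtype=float)
    x0 = np.sqrt(1.0 + np.sum(v * v, axis=-1, keepdims=True))
    return np.concatenate([x0, v], axis=-1)

def lorentz_inner(x, y):
    """Pairwise Minkowski inner products: -x0*y0 + sum_i xi*yi."""
    return -np.outer(x[:, 0], y[:, 0]) + x[:, 1:] @ y[:, 1:].T

def neg_sq_distance_logits(x, centroids):
    """Logits of shape (n_points, n_centroids) equal to -d(x, centroid)^2."""
    inner = lorentz_inner(x, centroids)
    # -inner is >= 1 in exact arithmetic; clip guards against rounding error.
    d = np.arccosh(np.clip(-inner, 1.0, None))
    return -(d ** 2)
```

Under this scoring, the nearest centroid always receives the largest logit, so a softmax over the logits favors the class whose centroid is closest in hyperbolic distance.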