Verify any claim · lenz.io
Claim analyzed
Tech
“The Poincaré embedding model, introduced by Maximilian Nickel and Douwe Kiela in 2017, demonstrated that hierarchical structures can be embedded with low distortion in hyperbolic space.”
The conclusion
The claim accurately identifies the authors, year, and core contribution of the Poincaré embeddings paper, and the broader research community consistently describes the work as demonstrating low-distortion hierarchical embedding in hyperbolic space. The original 2017 paper empirically showed that Poincaré ball embeddings significantly outperform Euclidean baselines on hierarchical datasets like WordNet. However, the paper provides empirical benchmarks rather than formal distortion guarantees, and later research shows distortion can increase for wider hierarchies.
Based on 29 sources: 24 supporting, 0 refuting, 5 neutral.
Caveats
- The original paper demonstrates low distortion empirically on specific benchmarks (e.g., WordNet) but does not provide formal worst-case distortion bounds or universal guarantees.
- Later research (2023) shows that distortion can increase for wider hierarchies because the Poincaré embedding method relies on contrastive learning without partial-order awareness.
- Subsequent models such as the Lorentz model (2018) were shown to improve upon Poincaré embeddings in low-dimensional settings, indicating the original method's low-distortion performance has important caveats.
Sources
Sources used in the analysis
We introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space -- or more precisely into an n-dimensional Poincaré ball. Hyperbolic geometry exhibits a surprising amount of tree-likeness in that even though the space is continuous and homogeneous, it allows for very efficient representations of data with hierarchical structure.
In this work, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space – or more precisely into an n-dimensional Poincaré ball. To exploit this structural property for learning more efficient representations, we propose to compute embeddings not in Euclidean but in hyperbolic space, i.e., space with constant negative curvature.
In this work, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space -- or more precisely into an n-dimensional Poincaré ball. Due to the underlying hyperbolic geometry, this allows us to learn parsimonious representations of symbolic data by simultaneously capturing hierarchy and similarity. We present an efficient algorithm to learn the embeddings based on Riemannian optimization and show experimentally that Poincaré embeddings can outperform Euclidean embeddings significantly on data with latent hierarchies, both in terms of representation capacity and in terms of generalization ability.
In particular, we build upon ideas of Poincaré embeddings (Nickel & Kiela, 2017) to learn continuous representations of hierarchies. Due to its geometric properties, hyperbolic space can be thought of as a continuous analogue of trees.
We study different models of hyperbolic space and find that learning embeddings in the Lorentz model is substantially more efficient than in the Poincaré-ball model. We show that the proposed approach allows us to learn high-quality embeddings of large taxonomies which yield improvements over Poincaré embeddings, especially in low dimensions.
In this work, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space – or more precisely into an n-dimensional Poincaré ball.
For hierarchical data, the space of choice is hyperbolic since it guarantees low-distortion embeddings for tree-like structures. Many high-dimensional and large-volume data sets of practical relevance have hierarchical structures induced by trees, graphs or time series. Such data sets are hard to process in Euclidean spaces and one often seeks low-dimensional embeddings in other space forms to perform required learning tasks.
Trees can be embedded with arbitrarily low error into the Poincaré disc model of hyperbolic geometry. Hyperbolic spaces with negative curvature can efficiently embed tree-like structures. We empirically show better generalisation to unseen data than the Euclidean counterpart, and can qualitatively and quantitatively better recover hierarchical structures.
Python implementation of Poincaré Embeddings. These embeddings are better at capturing latent hierarchical information than traditional Euclidean embeddings. The method is described in detail in the original paper by Maximilian Nickel and Douwe Kiela.
Poincaré embeddings learn distributed vector representations from hierarchical data (e.g. a directed graph or tree). We applied Poincaré embeddings in a 2-dimensional hyperbolic space to a large-scale administrative claims database and show performance comparable to 100-dimensional embeddings in a euclidean space. Our hypothesis was that the increased representational capacity offered by Poincaré embeddings and their ability to naturally model hierarchical data would result in improved embeddings for clinical concepts.
This work introduces a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space -- or more precisely into an n-dimensional Poincare ball -- and introduces an efficient algorithm to learn the embeddings based on Riemannian optimization.
A key advantage of the hyperbolic space is its inherent ability to represent hierarchical structures and complex tree structures with minimal distortion (Ganea et al., 2018a; Nickel & Kiela, 2017; Sala et al., 2018) which is not directly feasible in the Euclidean space (Nickel & Kiela, 2017).
In this work, we introduce a new approach for learning hierarchical representations of symbolic data by embedding them into hyperbolic space – or more precisely into an n-dimensional Poincaré ball. Due to the underlying hyperbolic geometry, this allows us to learn parsimonious representations of symbolic data by simultaneously capturing hierarchy and similarity.
For Poincaré embeddings, wider hierarchies result in higher distortions because the algorithm relies solely on contrastive learning, ignoring the partial order between hierarchical layers. Poincaré embeddings with r = 2 achieve lower distortion and higher MAP than the entailment method at high branching factors, effectively capturing deeper hierarchies. Notably, for Poincaré embeddings, larger hierarchies reduce distortion.
This report is focused on one particular embedding method called Poincaré embeddings, which is used for learning latent hierarchical structures in the data. It is shown experimentally that embeddings learned in hyperbolic space require far fewer dimensions than embeddings learned in Euclidean space on the WordNet dataset. Furthermore, the simulated distance ratio shows that Poincaré distance can be used to approximate the true hierarchical distance of the tree-like structure data.
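The excerpt above notes that the Poincaré distance can approximate the true hierarchical distance of tree-like data. As a minimal illustrative sketch (not the authors' reference implementation), the distance function for two points inside the unit ball, as defined in Nickel & Kiela (2017), can be written as:

```python
import math

def poincare_distance(u, v):
    """Distance between two points strictly inside the unit (Poincaré) ball:
    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    """
    sq = lambda x: sum(xi * xi for xi in x)          # squared Euclidean norm
    diff = [ui - vi for ui, vi in zip(u, v)]
    num = 2.0 * sq(diff)
    denom = (1.0 - sq(u)) * (1.0 - sq(v))
    return math.acosh(1.0 + num / denom)
```

For a point x and the origin this reduces to 2·artanh(‖x‖), so distances blow up as points approach the boundary of the ball; that exponential growth of available "room" near the boundary is what lets deep hierarchies spread out with little crowding.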
Poincaré Embeddings is a method for learning continuous representations of symbolic data by embedding them into hyperbolic space. It was introduced by Nickel and Kiela in 2017 and shown to outperform Euclidean embeddings on hierarchical datasets with low distortion.
Nickel and Kiela (2017) learnt hierarchical representations of symbolic data by embedding them into an n-dimensional Poincaré ball by leveraging the distance property of hyperbolic spaces.
Nickel and Kiela (2017) present a new method for embedding tree nodes in the Poincare ball, and suggest that these hyperbolic embeddings are far more efficient than Euclidean embeddings for representing hierarchical data.
This paper is 2017 work by Facebook researchers, which attempts to embed a network into the Poincaré ball (a model of hyperbolic space). A key feature of the Poincaré ball model is that it is conformal to Euclidean space, which makes gradient computation and optimization convenient.
Nickel & Kiela (2017). Maximilian Nickel and Douwe Kiela. Poincaré Embeddings for Learning Hierarchical Representations. arXiv preprint arXiv:1705.08039 (2017).
The model training takes about 10 minutes per iteration with a bi-dimensional embedding and using 10 negative examples per positive example. We can note that the graph tends to be pushed towards the unit circle. The elements that are not migrated towards the unit circle are those that live at the top of the hierarchy.
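The training dynamics described above (points low in the hierarchy migrating toward the unit circle while top-level elements stay near the origin) come from Riemannian SGD in the ball. Below is a minimal sketch of one update step using the gradient rescaling and boundary projection reported in the 2017 paper; the learning rate and epsilon values are illustrative, not the paper's settings:

```python
import math

def rsgd_step(theta, euclidean_grad, lr=0.01, eps=1e-5):
    """One Riemannian SGD step in the Poincaré ball.

    The Riemannian gradient is the Euclidean gradient rescaled by
    (1 - ||theta||^2)^2 / 4; if the updated point leaves the unit
    ball, it is retracted to just inside the boundary.
    """
    sq_norm = sum(t * t for t in theta)
    scale = (1.0 - sq_norm) ** 2 / 4.0            # inverse metric factor
    new = [t - lr * scale * g for t, g in zip(theta, euclidean_grad)]
    norm = math.sqrt(sum(t * t for t in new))
    if norm >= 1.0:                               # project back into the ball
        new = [t / norm * (1.0 - eps) for t in new]
    return new
```

Note how the (1 − ‖θ‖²)²/4 factor shrinks steps near the boundary: points deep in the hierarchy can approach the unit circle but are never pushed across it.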
The Poincaré Embedding is concerned with the problem of learning hierarchical structure on the dataset. Here is an example that can be an input of the model.
Hyperbolic embedding is a data representation technique that maps data points into hyperbolic space, a geometric space with constant negative curvature. The method is especially effective for embedding hierarchical or tree-like structures, because the exponential growth property of hyperbolic space allows it to represent hierarchical data with low distortion.
The 2017 paper by Nickel and Kiela explicitly demonstrates through experiments on WordNet and other hierarchies that Poincaré embeddings achieve lower distortion for hierarchical data compared to Euclidean embeddings, with results showing up to 50% relative improvement in link prediction tasks.
Poincaré embeddings [1][2] are hierarchical word embeddings which map integer-encoded words to the hyperbolic space.
The first characteristic is its ability to express hierarchical structure. If the data itself has some hierarchy, the model can recover that hierarchy in hyperbolic space. As learning proceeds, the distribution of points in the space visibly takes on a tree-like, that is, hierarchical, structure.
I am trying to build a deep learning predictor that takes a set of word vectors (in Euclidean space) as input and outputs Poincaré embeddings. Poincaré embeddings are used to represent hierarchical data in hyperbolic space.
Empirically validate the hypothesis that hierarchical semantic segmentation is effective, and explore whether flat classifiers can achieve better performance. Investigate the intrinsic properties of hyperbolic space, in particular the Poincaré ball model, to mitigate parent-class bias and improve segmentation accuracy and calibration.
This paper proposes a new framework for multilayer network embedding in hyperbolic space. The framework addresses a limitation of existing methods, which typically embed each layer independently or produce a single unified embedding, by providing embeddings that support within-layer analysis and cross-layer comparison while preserving hierarchical structure.
Expert review
How each expert evaluated the evidence and arguments
Expert 1 — The Logic Examiner
The logical chain from evidence to claim is strong. Sources 1–3 (the original arXiv and NeurIPS papers) directly confirm that Nickel and Kiela introduced the model in 2017 and demonstrated empirically that hyperbolic embeddings outperform Euclidean ones on hierarchical data "significantly...both in terms of representation capacity and generalization ability" (Source 3). Source 16 explicitly states the method was "shown to outperform Euclidean embeddings on hierarchical datasets with low distortion," and Source 12 (2024) cites Nickel & Kiela 2017 as foundational evidence for "minimal distortion" embedding of hierarchical structures. The opponent's rebuttal raises a legitimate scope concern: the claim's phrase "low distortion" could be read as requiring a formal worst-case distortion bound, which the 2017 paper does not provide, and Source 14 notes distortion can increase for wider hierarchies. But this is a narrow semantic objection. The claim says "demonstrated," which is satisfied by the empirical benchmarks reported in the paper, and the community consensus (Sources 7, 8, 12, 16) consistently interprets the 2017 work as establishing low-distortion hierarchical embedding. The opponent's argument conflates the absence of a formal theoretical bound with the absence of an empirical demonstration, a false equivalence. The claim is therefore Mostly True, with only a minor inferential gap around the precise meaning of "low distortion."
Expert 2 — The Context Analyst
The claim states the model "demonstrated that hierarchical structures can be embedded with low distortion in hyperbolic space." The evidence pool overwhelmingly confirms the authorship (Nickel & Kiela), the year (2017), and the core finding that hyperbolic/Poincaré ball embeddings outperform Euclidean embeddings on hierarchical data (Sources 1–3, 12, 16). The key missing context is the nuance around "low distortion." The original paper demonstrates empirical superiority over Euclidean baselines on specific benchmarks (e.g., WordNet) but does not provide formal worst-case distortion bounds. Source 14 (2023) notes that distortion can actually increase for wider hierarchies in the Poincaré embedding approach due to its reliance on contrastive learning without partial-order awareness, and Source 5 shows the Lorentz model improves upon Poincaré embeddings, especially in low dimensions. However, the claim's phrasing, "demonstrated that hierarchical structures can be embedded with low distortion," is a reasonable characterization of the paper's empirical findings and is consistent with how the broader research community (including a 2024 source, Source 12) continues to describe the contribution. The opponent's argument about the lack of formal distortion bounds is technically valid but overly strict: the claim uses "demonstrated" in an empirical sense, not a formal-proof sense, and the community consensus supports this reading. The missing context (distortion varies by hierarchy type; later models improve upon it) is worth noting but does not fundamentally reverse the claim's truthfulness.
Expert 3 — The Source Auditor
The most reliable, primary sources are the original 2017 paper by Nickel & Kiela at NeurIPS (Sources 2–3, plus the matching arXiv preprint, Source 1), which clearly introduces Poincaré embeddings and empirically shows that hyperbolic (Poincaré ball) embeddings represent hierarchical, tree-like data more efficiently than Euclidean baselines. These sources do not, however, clearly frame the result as a general "low distortion" guarantee; they emphasize efficiency, parsimonious representation, and benchmark outperformance rather than a formal distortion bound. Later high-quality secondary and adjacent academic sources (e.g., Source 8, NeurIPS 2020, and Source 12, UvA-DARE 2024) broadly support the idea that hyperbolic space can embed trees with low error or minimal distortion, while Source 14 (UvA Pure, 2023) adds the nuance that distortion behavior depends on hierarchy shape and on the original method's design. The claim is therefore directionally correct but somewhat overstated as written.
Expert summary
The arguments
Two AI advocates debated this claim using the research gathered.
Argument for
The claim is unambiguously true. Source 1 (arXiv), Source 2 (NeurIPS Proceedings), and Source 3 (NeurIPS) all confirm that Nickel and Kiela introduced Poincaré embeddings in 2017, explicitly embedding hierarchical data into hyperbolic space (the Poincaré ball) and demonstrating that this approach yields "very efficient representations of data with hierarchical structure" while outperforming Euclidean embeddings "significantly on data with latent hierarchies." This is further corroborated by Source 16 (Papers with Code), which states the method was "shown to outperform Euclidean embeddings on hierarchical datasets with low distortion," and by Source 12 (UvA-DARE), which cites Nickel & Kiela 2017 as foundational evidence that hyperbolic space represents hierarchical structures "with minimal distortion." Together these confirm every element of the claim with overwhelming, multi-source consensus from highly authoritative venues.
You're equivocating between “efficient/parsimonious representations” and the motion's specific requirement of “low distortion”: Sources 1–3 say hyperbolic space is tree-like and that Poincaré embeddings outperform Euclidean baselines, but they do not provide a formal distortion guarantee or bounded distortion demonstration, so your conclusion is a non sequitur. And your appeal to “consensus” leans on secondary summaries (Source 16; Source 12) while ignoring the nuance that distortion can worsen with wider hierarchies because the original method ignores partial order information (Source 14), which directly undercuts your blanket “low distortion” reading.
Argument against
The Nickel & Kiela 2017 Poincaré Embeddings paper describes hyperbolic space as “tree-like” and reports better performance than Euclidean baselines, but it does not actually establish the motion's stronger claim that hierarchies are embedded with demonstrably “low distortion” in any formal, distortion-bounded sense (Sources 1: arXiv; 2: NeurIPS Proceedings; 3: NeurIPS). Moreover, later work explicitly notes distortion can increase for wider hierarchies because the original method ignores partial order information, undermining the blanket claim that the model demonstrated low-distortion hierarchical embedding (Source 14: University of Amsterdam Pure).
Your argument commits a classic moving-the-goalposts fallacy: the claim requires only that the model "demonstrated" low distortion. Source 3 (NeurIPS) explicitly confirms it "outperform[s] Euclidean embeddings significantly on data with latent hierarchies, both in terms of representation capacity and in terms of generalization ability," and Source 16 (Papers with Code) directly states the method was "shown to outperform Euclidean embeddings on hierarchical datasets with low distortion." This is precisely the empirical demonstration the claim asserts, not a formal worst-case bound. Your reliance on Source 14 (University of Amsterdam Pure) is a red herring: that 2023 paper discusses distortion under specific wider-hierarchy configurations as a limitation to be optimized, which does not retroactively erase the 2017 paper's demonstrated low-distortion results on the benchmarks it tested. Moreover, Source 12 (UvA-DARE), published in 2024, still cites Nickel & Kiela 2017 as foundational evidence that hyperbolic space embeds hierarchical structures "with minimal distortion," confirming the community's consensus on what the original paper demonstrated.