Claim analyzed

Science

“Correlation entropy is used as a statistical feature in the analysis of datasets.”

Submitted by Clever Zebra 548a

The conclusion

Mostly True
8/10

Scholarly studies show correlation entropy has been extracted as a quantitative feature, especially in time-series and independence tests, confirming that the method is indeed used in data analysis. The claim does not specify prevalence, so documented, albeit specialized, usage suffices. Its application, however, is niche and not part of most mainstream statistical or machine-learning toolkits.

Caveats

  • Low-confidence conclusion.
  • Correlation entropy is specialized; its use is largely confined to nonlinear dynamics, serial-independence testing, and time-series studies.
  • Some sources conflate correlation entropy with broader entropy measures such as mutual information; the two are related but not identical.
  • Mainstream libraries rarely implement correlation entropy, so it should not be assumed to be a standard feature across general datasets.

Sources

Sources used in the analysis

#1
PMC (PubMed Central) 2018-08-29 | Entropy Correlation and Its Impacts on Data Aggregation in a ...
SUPPORT

This paper proposes a general distance-independent entropy correlation model based on the relation between joint entropy and the number of members in a group. This relation is estimated using entropy of individual members and entropy correlation coefficients of member pairs. The proposed model is then applied to evaluate two data aggregation schemes in WSNs including data compression and representative schemes.
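
To make the coefficient idea concrete, here is a minimal sketch of one common definition of an entropy correlation coefficient between two discrete variables, rho = 2*I(X;Y) / (H(X) + H(Y)); the cited paper's exact definition may differ, and the joint distribution below is an assumption for illustration only.

    import numpy as np

    def H(p):
        """Shannon entropy in bits, ignoring zero-probability cells."""
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    # Illustrative joint distribution p(x, y) (values are assumptions).
    p_xy = np.array([[0.3, 0.2],
                     [0.1, 0.4]])

    Hx = H(p_xy.sum(axis=1))   # marginal entropy H(X)
    Hy = H(p_xy.sum(axis=0))   # marginal entropy H(Y)
    Hxy = H(p_xy.ravel())      # joint entropy H(X, Y)

    I = Hx + Hy - Hxy          # mutual information I(X; Y)
    rho = 2 * I / (Hx + Hy)    # 0 for independence, 1 for full dependence
    print(f"entropy correlation coefficient: {rho:.3f}")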

#2
PMC 2024-12-23 | Applications of Entropy in Data Analysis and Machine Learning: A Review - PMC
SUPPORT

RbE is used in terms of correlation entropy to test serial independence in [176]. In data analysis, entropy is a powerful tool for the detection of dynamical changes, segmentation, clustering, discrimination, etc. In machine learning, it is used for classification, feature extraction, algorithm optimization, anomaly detection, and more.

#3
NSF PAR 2021-01-01 | Applications of Common Entropy for Causal Inference
NEUTRAL

We show two applications of common entropy in causal inference: First, under the assumption that there are no low-entropy mediators, it can be used to distinguish causation from spurious correlation among almost all joint distributions on simple causal graphs with two observed variables.

#4
Sandia National Laboratories 2022-10-01 | Entropy and its Relationship with Statistics
NEUTRAL

Entropy quantifies the “average amount of surprise” in a random variable and lies at the heart of information theory, which studies the transmission, processing, and storage of information.

#5
PMC 2018-11-23 | Definition and Time Evolution of Correlations in Classical Statistical Mechanics - PMC
SUPPORT

The N-variable distribution function which maximizes the Uncertainty (Shannon's information entropy) and admits as marginals a set of (N−1)-variable distribution functions, is, by definition, free of N-order correlations. This way to define correlations is valid for stochastic systems described by discrete variables or continuous variables, for equilibrium or non-equilibrium states and correlations of the different orders can be defined and measured. Uncertainty U2 increases whenever correlations of order higher than two are created.

#6
arXiv 2025-03-04 | Applications of Entropy in Data Analysis and Machine Learning: A Review - arXiv
SUPPORT

Entropies based on information-theoretical concepts such as the correlation integral... In data analysis, entropy is a powerful tool for detection of dynamical changes, segmentation, clustering, discrimination, etc. In machine learning, it is used for classification, feature extraction, optimization of algorithms, anomaly detection, and more.

#7
PubMed 2024-06-03 | Correlations of Cross-Entropy Loss in Machine Learning - PubMed
SUPPORT

Cross-entropy loss is crucial in training many deep neural networks. In this context, we show a number of novel and strong correlations among various related divergence functions. In particular, we demonstrate that, in some circumstances, (a) cross-entropy is almost perfectly correlated with the little-known triangular divergence, and (b) cross-entropy is strongly correlated with the Euclidean distance over the logits from which the softmax is derived.
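
As a toy numerical check in the spirit of correlation (b), though not the paper's experimental setup (the random-logit model, dimension, and sample count below are assumptions), one can compare cross-entropy between softmax outputs against Euclidean distance between the underlying logits:

    import numpy as np

    rng = np.random.default_rng(0)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def cross_entropy(p, q):
        # Small constant guards against log(0).
        return -np.sum(p * np.log(q + 1e-12))

    # Sample random logit pairs and record both quantities.
    ce, euc = [], []
    for _ in range(500):
        a, b = rng.normal(size=10), rng.normal(size=10)
        ce.append(cross_entropy(softmax(a), softmax(b)))
        euc.append(np.linalg.norm(a - b))

    r = np.corrcoef(ce, euc)[0, 1]
    print(f"Pearson r between cross-entropy and logit distance: {r:.2f}")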

#8
scikit-learn Documentation 2026-01-01 | sklearn.feature_selection.mutual_info_classif
SUPPORT

Mutual information measures the dependency between variables using entropy calculations. It is used as a feature selection method based on entropy-related statistics for analyzing datasets in supervised learning.
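
A minimal usage sketch of the function the documentation describes; the bundled iris dataset and the fixed random_state are stand-ins chosen for this example, not part of the cited page.

    from sklearn.datasets import load_iris
    from sklearn.feature_selection import mutual_info_classif

    X, y = load_iris(return_X_y=True)

    # Score each feature by its estimated mutual information with the
    # class labels (an entropy-derived dependence measure).
    mi_scores = mutual_info_classif(X, y, random_state=0)
    for i, score in enumerate(mi_scores):
        print(f"feature {i}: MI = {score:.3f}")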

#9
University of Oxford Statistics Lecture 4: Measures of Correlation and Dependence - Mutual Information and Maximal Information Coefficient
NEUTRAL

Given (X,Y) ∼ p(x,y), define the conditional entropy H(Y|X). Interpretation: H(Y|X) measures the amount of uncertainty remaining about Y after X is known.
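
The definition can be checked numerically through the chain rule H(Y|X) = H(X,Y) - H(X); the small joint distribution below is an assumed example, not taken from the lecture notes.

    import numpy as np

    def entropy(p):
        """Shannon entropy in bits, ignoring zero-probability cells."""
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    # Illustrative joint distribution p(x, y) over two binary variables.
    p_xy = np.array([[0.4, 0.1],
                     [0.1, 0.4]])

    H_XY = entropy(p_xy.ravel())      # joint entropy H(X, Y)
    H_X = entropy(p_xy.sum(axis=1))   # marginal entropy H(X)

    # Chain rule: H(Y|X) = H(X, Y) - H(X); prints 0.722 bits here.
    print(f"H(Y|X) = {H_XY - H_X:.3f} bits")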

#10
arXiv 2000-01-17 | [math-ph/0001024] The Boltzmann/Shannon entropy as a measure of correlation - arXiv
SUPPORT

It is demonstrated that the entropy of statistical mechanics and of information theory, S(p) = -∑_i p_i log p_i, may be viewed as a measure of correlation. Given a probability distribution on two discrete variables, p_ij, we define the correlation-destroying transformation C: p_ij → π_ij, which creates a new distribution on those same variables in which no correlation exists between the variables.
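
A short numeric sketch of the paper's construction: replacing the joint distribution with the product of its marginals destroys the correlation and can only raise the entropy, and the increase equals the mutual information I(X;Y). The joint distribution below is an assumption for illustration.

    import numpy as np

    def shannon(p):
        """Shannon entropy in nats, ignoring zero-probability cells."""
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    # Illustrative correlated joint distribution p_ij (assumed values).
    p = np.array([[0.35, 0.15],
                  [0.05, 0.45]])

    # Correlation-destroying transformation: product of the marginals.
    pi = np.outer(p.sum(axis=1), p.sum(axis=0))

    gain = shannon(pi.ravel()) - shannon(p.ravel())
    print(f"S(pi) - S(p) = {gain:.4f} nats  (= I(X;Y), nonnegative)")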

#11
Harvard ADS 2007-01-01 | Correlation Problems and Combinatorial Applications of Entropy
NEUTRAL

This project focuses on two areas where combinatorics and other parts of mathematics meet: correlation inequalities, and applications of entropy to combinatorics.

#12
NEUTRAL

This dataset relates to the Statlog project comparing statistical, neural, and symbolic learning algorithms; there is no specific mention of correlation entropy as a feature.

#13
University of North Carolina at Charlotte Using Entropy-Related Measures in Categorical Data ...
SUPPORT

We use measures of entropy, mutual information, and joint entropy as a means of harnessing this discreteness to generate more effective visualizations for large categorical datasets. As suggested in the cited literature, entropy can be employed in quantifying data features for better visualization.
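
A minimal sketch of the kind of per-column entropy statistic such work computes for categorical data; the column names and values below are toy assumptions, not taken from the paper.

    from collections import Counter
    from math import log2

    def column_entropy(values):
        """Empirical Shannon entropy of a categorical column, in bits."""
        n = len(values)
        return -sum((c / n) * log2(c / n) for c in Counter(values).values())

    # Toy categorical columns: higher entropy means more evenly spread
    # categories; a near-constant column carries little information.
    color = ["red", "red", "blue", "green", "red", "blue"]
    size = ["S", "M", "M", "M", "M", "M"]

    print(f"H(color) = {column_entropy(color):.3f} bits")
    print(f"H(size)  = {column_entropy(size):.3f} bits")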

#14
Nebius 2025-01-01 | Entropy in machine learning — applications, examples, ...
NEUTRAL

Beyond classification, entropy also has applications in dimensionality reduction and anomaly detection. It helps you determine the relative correlation between data points.

#15
arXiv 2024-09-27 | Entropy-based feature selection for capturing impacts in ...
SUPPORT

This paper presents the development of a new entropy-based feature selection method for identifying and quantifying impacts. Temporal feature selection is performed by first computing the cross-fuzzy entropy to quantify similarity of patterns between two datasets.

#16
LibreTexts Engineering 2023-01-01 | 13.13: Correlation and Mutual Information
NEUTRAL

This method is extremely useful when applied to a sample of experimental data that can be modeled by a normal distribution function; the chapter treats correlation and mutual information as related dependence measures in statistics.
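
In the normal-distribution setting the chapter invokes, one standard identity ties correlation directly to an entropy-derived quantity: a bivariate Gaussian with correlation coefficient rho has mutual information I(X;Y) = -(1/2) ln(1 - rho^2). A quick numerical illustration (the rho values are arbitrary):

    import numpy as np

    # Mutual information of a bivariate Gaussian as a function of its
    # correlation coefficient: zero when uncorrelated, growing as |rho| -> 1.
    for rho in (0.0, 0.5, 0.9):
        mi = -0.5 * np.log(1 - rho**2)
        print(f"rho = {rho:.1f} -> I(X;Y) = {mi:.3f} nats")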

#17
GeeksforGeeks 2025-06-23 | Entropy in Information Theory - GeeksforGeeks
SUPPORT

Entropy quantifies the amount of "information" contained in a message or system, and is foundational in diverse domains such as data compression, cryptography, statistical mechanics, machine learning, and even neuroscience. In Machine Learning, entropy is used in: Decision trees (e.g. information gain), Regularization (e.g. entropy minimization), Semi-supervised learning (e.g. entropy-based confidence) and Generative models (e.g. maximum entropy models).
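
A minimal sketch of the decision-tree use named first in that list: information gain as the entropy reduction achieved by a split. The labels and the split below are assumed toy data.

    from collections import Counter
    from math import log2

    def entropy(labels):
        """Empirical Shannon entropy of a label list, in bits."""
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def information_gain(labels, groups):
        """Parent entropy minus the size-weighted entropy of the children."""
        n = len(labels)
        child = sum(len(g) / n * entropy(g) for g in groups)
        return entropy(labels) - child

    # Toy binary labels split into two groups by a candidate feature.
    parent = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
    left, right = parent[:4], parent[4:]
    print(f"information gain = {information_gain(parent, [left, right]):.3f} bits")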

#18
mimuw 2016-01-01 | Is the entropy a good measure of correlation?
NEUTRAL

The common entropy of two variables X and Y taking values x and y respectively is given by: H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y). This paper investigates whether entropy can be used as a good measure of correlation, particularly for categorical values where Pearson's correlation coefficient is not applicable.

#19
LLM Background Knowledge Information Theory Applications in Statistics
SUPPORT

Entropy and its variants, including measures like mutual information derived from joint entropy, are standard statistical features used in dataset analysis for tasks such as feature selection, clustering, and correlation assessment in machine learning and signal processing.

#20
ukbonn.de | Entropies: Order/Disorder from Time Series
NEUTRAL

Lecture slides ("Fundamentals of Analyzing Biomedical Signals," p. 21) on estimating entropies from time series list the pros and cons of correlation entropy: it is conceptually easy and the quickest to calculate, but it requires the existence of a scaling region (independent of ε), and if no scaling region can be found the method should not be applied; it needs lots of data, and strong correlations in the data (sampling interval) call for the Theiler correction (see Dimensions).
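
For concreteness, here is a minimal sketch of a correlation-integral (Grassberger-Procaccia style) estimate of correlation entropy K2 from a time series, including the Theiler correction the slides mention. The unit delay, max norm, parameter values, and logistic-map test signal are all assumptions of this sketch, and, as the slides warn, the estimate is only meaningful inside a scaling region in ε.

    import numpy as np

    def correlation_sum(x, m, eps, theiler=10):
        """Fraction of delay-vector pairs within eps (max norm), skipping
        temporally close pairs via a Theiler window."""
        vecs = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
        count, total = 0, 0
        for i in range(len(vecs) - theiler):
            d = np.max(np.abs(vecs[i + theiler:] - vecs[i]), axis=1)
            count += int(np.sum(d < eps))
            total += d.size
        return count / total if total else 0.0

    def k2_estimate(x, m, eps):
        """K2 ~ ln(C_m(eps) / C_{m+1}(eps)), per time step."""
        c_m = correlation_sum(x, m, eps)
        c_m1 = correlation_sum(x, m + 1, eps)
        return np.log(c_m / c_m1) if c_m1 > 0 else np.inf

    # Assumed test signal: the fully chaotic logistic map.
    x = np.empty(1000)
    x[0] = 0.4
    for t in range(len(x) - 1):
        x[t + 1] = 4.0 * x[t] * (1.0 - x[t])

    print(f"K2 estimate: {k2_estimate(x, m=3, eps=0.1):.3f} nats per step")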

#21
Cross Validated (Stack Exchange) 2023-07-10 | Is entropy commonly used as a feature in statistics?
REFUTE

Entropy is primarily a measure of uncertainty, not routinely extracted as a 'feature' like mean or variance; it's more often used internally in algorithms rather than as an input feature for models.

Full Analysis

Expert review

How each expert evaluated the evidence and arguments

Expert 1 — The Logic Examiner

Focus: Inferential Soundness & Fallacies
Mostly True
8/10

The claim only requires existence (“is used”), and the evidence shows correlation entropy being applied as an empirical statistic on data: Source 2 explicitly states that correlation entropy is used to test serial independence, and Source 20 treats correlation entropy as a concrete time-series analysis method. Together these suffice to establish use in dataset analysis, even if that use is not universally common. The opponent's critique mainly attacks typicality and conflation with other entropy measures (Sources 8, 15, 13), but that does not logically negate the narrower existential claim; the claim is therefore true, though the proponent sometimes overextends by implying general-purpose feature status beyond the cited contexts.

Logical fallacies

  • Scope creep/overgeneralization: some proponent reasoning slides from “used in some analyses” (e.g., serial-independence testing, time-series methods) to implying broad, general-purpose feature extraction across datasets.
  • Equivocation: parts of the proponent case treat mutual information or other entropy-based measures (e.g., cross-fuzzy entropy) as if they directly establish “correlation entropy” usage, which is a related but not identical concept.
Confidence: 7/10

Expert 2 — The Context Analyst

Focus: Completeness & Framing
Misleading
5/10

The claim is broad and omits that “correlation entropy” is a specific, domain-dependent family of measures (e.g., time-series/chaos and serial-independence testing) rather than a universally standard, routinely extracted feature across arbitrary datasets; much of the pool instead supports entropy-related dependence measures generally (e.g., mutual information in scikit-learn) rather than correlation entropy specifically [2,8,20]. With full context, it is still accurate that correlation entropy is used as a statistical feature/statistic in some dataset analyses (notably time-series/independence contexts), but the framing can mislead readers into thinking it is common or general-purpose, so the overall impression is overstated [2,20,21].

Missing context

  • “Correlation entropy” is not a single universally used feature; it refers to specific definitions (often in nonlinear dynamics/time-series) with assumptions (e.g., scaling region, data length) that limit applicability.
  • Several cited items support entropy-based dependence features (e.g., mutual information) rather than correlation entropy itself, so the evidence partly shifts the term to broader entropy measures.
  • Where correlation entropy is used, it is often as a specialized test/statistic (e.g., serial independence) rather than a routinely extracted feature in general ML pipelines.
Confidence: 7/10

Expert 3 — The Source Auditor

Focus: Source Reliability & Independence
Mostly True
8/10

High-authority, independent scholarly sources support that “correlation entropy/entropy correlation” is used as an applied statistic in analyzing data: Source 2 (PMC review, 2024) explicitly states correlation entropy is used to test serial independence, and Source 1 (PMC paper, 2018) applies an entropy-correlation model to evaluate data aggregation schemes on sensor datasets, with additional but less direct support from older theoretical work (Source 10, arXiv 2000) and domain lecture notes (Source 20). The only explicit refutation is Source 21 (Stack Exchange), which is low-authority and opinion-based. The trustworthy evidence therefore indicates the claim is true in at least some real dataset-analysis contexts, though it does not establish correlation entropy as a universally standard feature across all dataset analyses.

Weakest sources

  • Source 21 (Cross Validated/Stack Exchange) is low-authority, non-peer-reviewed community commentary and does not outweigh peer-reviewed/curated literature.
  • Source 14 (Nebius blog) is a non-scholarly, potentially promotional explainer and is not strong evidence for technical usage claims.
  • Source 19 (LLM Background Knowledge) is not an independent citable source and should not be used as evidence.
Confidence: 7/10

Expert summary

The claim is
Mostly True
8/10
Confidence: 7/10 · Spread: 3 pts

The arguments

Two AI advocates debated this claim using the research gathered.

Argument for

Proponent Argues TRUE

Multiple high-authority sources directly confirm that correlation entropy is used as a statistical feature in dataset analysis: Source 2 (PMC, 2024) explicitly states that "correlation entropy" is used "to test serial independence" and that entropy is "a powerful tool for detection of dynamical changes, segmentation, clustering, discrimination" in data analysis, while Source 1 (PMC, 2018) demonstrates a full entropy correlation model applied to evaluate data aggregation schemes, and Source 20 (ukbonn.de) lists "correlation entropy" as a named analytical method with documented pros and cons for time-series datasets. Furthermore, Source 8 (scikit-learn Documentation) and Source 15 (arXiv, 2024) confirm that entropy-based measures — including those capturing correlations between variables — are actively employed as statistical features for feature selection and dataset analysis, with Source 13 (University of North Carolina at Charlotte) reinforcing that entropy measures are used to "quantify data features," collectively establishing that correlation entropy is a recognized and applied statistical feature in dataset analysis.

Opponent Rebuttal

The Proponent equivocates between “correlation entropy” as a specific statistic and the broader fact that “entropy-based measures” exist: Source 8 (scikit-learn) is about mutual information (not correlation entropy), Source 15 is about cross-fuzzy entropy, and Source 13 discusses entropy/mutual information/joint entropy generally—none substantiate the motion's precise claim that correlation entropy itself is used as a statistical feature in dataset analysis. Even where “correlation entropy” is named, the cited uses are narrow and methodological rather than feature-extraction practice—Source 2 frames it as a serial-independence test and Source 1 applies an entropy-correlation model to WSN aggregation evaluation—while Source 20 is a low-authority slide noting constraints, so the Proponent's leap from specialized applications to a general “statistical feature” claim is an overgeneralization.

Argument against

Opponent Argues FALSE

The motion is misleading because the brief's strongest “support” largely discusses entropy/correlation concepts as theoretical measures or internal algorithmic quantities (e.g., entropy correlation for WSN aggregation in Source 1, and correlation entropy for serial-independence testing in Source 2), not as a general-purpose statistical feature routinely extracted from arbitrary datasets. The only item that directly challenges the “used as a feature” framing notes entropy is not commonly treated like standard features and is more often used inside methods rather than as an input feature (Source 21, Cross Validated), and the brief provides no concrete, widely adopted dataset-analysis pipeline demonstrating “correlation entropy” as a standard feature.

Proponent Rebuttal

The Opponent's argument rests critically on Source 21 — a Stack Exchange community post of notably low authority — while dismissing multiple peer-reviewed, high-authority sources including Source 2 (PMC, 2024), which explicitly names "correlation entropy" as used "to test serial independence," Source 15 (arXiv, 2024), which demonstrates entropy-based measures deployed as features in a concrete feature selection pipeline, and Source 20 (ukbonn.de), which lists "correlation entropy" as a named analytical method with documented practical pros and cons, collectively constituting precisely the kind of applied, dataset-analysis evidence the Opponent claims is absent. Furthermore, the Opponent commits a no-true-Scotsman fallacy by arbitrarily narrowing "used as a statistical feature" to exclude domain-specific applications such as WSN data aggregation (Source 1) and time-series analysis (Source 20), when the motion makes no such restriction and the evidence clearly demonstrates correlation entropy functioning as a statistical feature across multiple analytical contexts.
