InFeeo
Adaptive Low-Rank Transformer with Dynamic Expert Routing for Continual Learning(creativecommons.org)
Inspired by the role of sleep in biological continual learning, we introduce RVW, a transformer architecture for online continual adaptation of pretrained models. RVW maintains a small pool of per-layer experts that grow and prune in response to distribution shift, with no replay buffer and no explicit task identifier. Applied to TinyLlama-1.1B on a 15,000-chunk six-domain stream, RVW reaches 40 average held-out PPL, substantially better than EWC (158), fine-tuning (164), and LoRA (448) on the same parameter-matched base, while preserving prior-domain performance. Threshold sweeps suggest a combinatorial encoding reading: domain knowledge appears to be carried by routing patterns across layers rather than by individual specialized experts.
Simpson's Paradox(leibniz.stanford.edu)
Stanford Encyclopedia of Philosophy
Simpson’s Paradox
First published Wed Mar 24, 2021; substantive revision Sat Jun 6, 2026

Simpson’s Paradox is a statistical phenomenon where an association between two variables in a population emerges, disappears or reverses when the population is divided into subpopulations. For instance, two variables may be positively associated in a population, but be independent or even negatively associated in all subpopulations. Cases exhibiting the paradox are unproblematic from the perspective of mathematics and probability theory, but nevertheless strike many people as surprising. Additionally, the paradox has implications for a range of areas that rely on probabilities, including decision theory, causal inference, and evolutionary biology. Finally, there are many instances of the paradox, including in epidemiology and in studies of discrimination, where understanding the paradox is essential for drawing the correct conclusions from the data. The following article provides a mathematical analysis of the paradox, explains its role in causal reasoning and inference, compares theories of what makes the paradox seem paradoxical, and surveys its applications in different domains.

1. Introduction
2. Definition and Mathematical Characterization
  2.1 Varieties of Simpson’s Paradox
  2.2 Necessary and Sufficient Conditions
3. Simpson’s Paradox and Causal Inference
  3.1 Probabilistic Causality and Simpson’s Paradox
  3.2 Specific Debates: Causal Interaction, Average Effects, Mediators
  3.3 DAGs and Causal Identifiability
  3.4 Confounding and Pearl’s Analysis of the Paradox
  3.5 Implications
4. What Makes Simpson’s Paradox Paradoxical?
5. Applications
  5.1 Non-Categorical Data and Linear Regression
  5.2 Epidemiology and Meta-Analysis
  5.3 Decision Theory and the Sure-Thing Principle
  5.4 Philosophy of Biology and Natural Selection
  5.5 Policy Questions: Interpreting Data on Discrimination
  5.6 Using Statistics to Evaluate Task Performance
6. Conclusions

1. Introduction

We begin with an illustration of the paradox with concrete data. The numbers in Table 1 summarize the effect of a medical treatment for the overall population (N = 52), and separately for men and women:

| Condition | Full population (\(N=52\)): Success (\(\r{S}\)) | Failure (\(\neg \r{S}\)) | Success rate | Men (\(\r{M}\), \(N=20\)): Success | Failure | Success rate | Women (\(\neg \r{M}\), \(N=32\)): Success | Failure | Success rate |
| Treatment (T) | 20 | 20 | 50% | 8 | 5 | ≈ 61% | 12 | 15 | ≈ 44% |
| Control (¬T) | 6 | 6 | 50% | 4 | 3 | ≈ 57% | 2 | 3 | ≈ 40% |

Table 1: Simpson’s Paradox: the type of association at the population level (positive, negative, independent) changes at the level of subpopulations. Numbers taken from Simpson’s original example (1951).

For matters of exposition, we assume that these frequencies are unbiased estimates of the underlying probabilities. The treatment looks ineffective at the level of the overall population, but it leads to higher success percentages than the control both for men and for women (61% vs. 57% for men and 44% vs. 40% for women). 
Writing these proportions as conditional probabilities, with \(\r{T}\)=treatment, \(\r{S}\)=success/recovery, and \(\r{M}\)=male subpopulation, we obtain \[ p(\r{S}\mid \r{T}) = p(\r{S}\mid \neg \r{T}) \] but at the same time, \[\begin{align*} p(\r{S}\mid \r{T}, \r{M}) & \gt p(\r{S}\mid \neg \r{T}, \r{M} ) \\ p(\r{S}\mid \r{T}, \neg \r{M}) &\gt p(\r{S}\mid \neg \r{T}, \neg \r{M}) \end{align*}\] Should we use the treatment or not? When we know the gender of the patient, we would presumably administer the treatment, whereas it does not look like the right thing to do when we don’t know the patient’s gender, although we know that the patient is either male or female!

This phenomenon was first pointed out in papers by Karl Pearson (1899) and George U. Yule (1903), but it was Simpson’s short paper “The interpretation of interaction in contingency tables” (1951), discussing the interpretation of such association reversals, that led to the phenomenon being labeled as “Simpson’s Paradox”. The phenomenon is, however, broader than independence in the overall population and positive association in the subpopulations; for example, the associations may also be reversed. Nagel and Cohen (1934: ch. 16) provide an example of such a reversal as part of an exercise for logic students.

Understanding the paradox is essential for drawing the proper conclusions from statistical data. To give a recent example involving the paradox (Kügelgen, Gresele, & Schölkopf 2021), early data revealed that the case fatality rate for Covid-19 was higher in Italy than in China overall. Yet within every age group the fatality rate was higher in China than in Italy. One thus appears to get opposite conclusions about the comparative severity of the virus in the countries depending on whether one compares the whole populations or the age-partitioned populations. Having a proper analysis of what is going on in such cases is thus crucial for using statistics to inform policy. 
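The success rates quoted for Table 1 can be rechecked with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the helper names (`TABLE_1`, `success_rate`, `pooled`) are assumptions made for this example and do not come from the article.

```python
from fractions import Fraction

# Counts from Table 1, stored as (successes, failures) per group and condition.
TABLE_1 = {
    "men": {"treatment": (8, 5), "control": (4, 3)},
    "women": {"treatment": (12, 15), "control": (2, 3)},
}
CONDITIONS = ("treatment", "control")


def success_rate(successes, failures):
    """Exact success proportion for one (successes, failures) pair."""
    return Fraction(successes, successes + failures)


def pooled(condition):
    """Sum successes and failures over all groups for one condition."""
    s = sum(TABLE_1[g][condition][0] for g in TABLE_1)
    f = sum(TABLE_1[g][condition][1] for g in TABLE_1)
    return s, f


# Per-group rates, plus the rates for the pooled population.
rates = {g: {c: success_rate(*TABLE_1[g][c]) for c in CONDITIONS} for g in TABLE_1}
rates["all"] = {c: success_rate(*pooled(c)) for c in CONDITIONS}

for group, r in rates.items():
    print(f"{group}: treatment={float(r['treatment']):.3f}, control={float(r['control']):.3f}")
```

Exact fractions are used so that the equality of the pooled rates is not blurred by floating-point rounding.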
In what follows, Section 2 explains different varieties of the paradox, clarifies the logical relationships between them, and identifies precise conditions for when the paradox can occur. While that section focuses on the mathematical characterization of the paradox, Section 3 focuses on its role in causal inference, its implications for probabilistic theories of causality, and its analysis by means of causal models based on directed acyclic graphs (DAGs: Spirtes, Glymour, & Scheines 2000; Pearl 2000 [2009]). Based on these different approaches, Section 4 discusses different analyses of what makes Simpson’s Paradox look paradoxical, and what kind of error it reveals in human reasoning. This section also reports empirical findings on the prevalence of the paradox in reasoning and inference. Section 5 surveys the occurrence and interpretation of the paradox in applied statistics (regression models), philosophy of biology, decision theory and public policy. For example, Simpson’s Paradox is relevant when analyzing data to test for race or gender discrimination (Bickel, Hammel, & O’Connell 1975). Section 6 wraps up our findings and concludes. 2. Definition and Mathematical Characterization This section shows how Simpson’s Paradox can be characterized mathematically, under which conditions it occurs, and how it can be avoided. We begin by further considering the concrete example from the introduction in order to build intuitions that will guide us through the more technical results. The data in Table 1 can be translated into success or recovery rates, showing that treated men have a higher recovery rate than untreated men (roughly 61% vs. 57%), and the same for women (44% vs. 40%). Two observations are key to understanding why this positive association vanishes in the aggregate data. First, the recovery rate of untreated men is still higher than the recovery rate of women who receive treatment (57% vs. 
44%), suggesting that not only treatment, but also gender is a relevant predictor of recovery. Second, while the treatment group is majority female (27 vs. 13), the control group is majority male (7 vs. 5). Speaking informally, the lack of population-level correlation between treatment and recovery results from men being both (i) more likely to recover from the treatment, and (ii) less likely to be in the treatment group. This becomes evident when we use conditional probabilities to represent recovery rates given treatment and/or subpopulation. The overall recovery rates given treatment and control can, by the Law of Total Probability, be written as the weighted average of recovery rates in the subpopulations: \[\begin{align*} p(\r{S}\mid \r{T}) &= p(\r{S}\mid \r{T},\r{M}) p(\r{M}\mid \r{T}) + p(\r{S}\mid \r{T}, \neg \r{M}) p(\neg \r{M}\mid \r{T}) \\ p(\r{S}\mid \neg \r{T}) &= p(\r{S}\mid \neg \r{T},\r{M}) p(\r{M}\mid \neg \r{T}) + p(\r{S}\mid \neg \r{T}, \neg \r{M}) p(\neg \r{M}\mid \neg \r{T})\end{align*}\] Plugging in the numbers from Table 1 to calculate the overall recovery rates via these equations, we see that the first line is a weighted average of success rates for treated men and women (61% and 44%) while the second line is a weighted average of success rates of the two control groups (57% and 40%). These averages are weighted by the percentage of males and females in each group, and in the present case the gender disparity between the groups results in both averages being 50%. Since these weights can be different, the treatment may raise the probability of success among males and females without doing so in the combined population. Later we will show that the positive association in the subpopulations cannot vanish if the correlation of treatment with gender is broken (e.g., by balancing gender rates in both conditions). 
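The weighted-average reading of the Law of Total Probability can be traced numerically. The sketch below uses the Table 1 counts; the function names (`rate`, `mix`, `observed_mix`) are hypothetical, and the 50/50 rebalancing at the end is an assumed example of equal weights, not a procedure described in the article.

```python
from fractions import Fraction

# (successes, failures) cells from Table 1.
men_t, men_c = (8, 5), (4, 3)
women_t, women_c = (12, 15), (2, 3)


def rate(cell):
    """Success proportion of a (successes, failures) cell."""
    s, f = cell
    return Fraction(s, s + f)


def mix(men, women, w_men):
    """Weighted average p(S|.,M) * w + p(S|.,not M) * (1 - w)."""
    return rate(men) * w_men + rate(women) * (1 - w_men)


def observed_mix(men, women):
    """Same average, with the weight p(M | condition) read off the data."""
    n_men, n_women = sum(men), sum(women)
    return mix(men, women, Fraction(n_men, n_men + n_women))


# Observed weights differ between conditions (13/40 men vs. 7/12 men).
p_s_given_t = observed_mix(men_t, women_t)
p_s_given_not_t = observed_mix(men_c, women_c)

# Assumed illustration: identical weights (half men) in both conditions.
balanced_t = mix(men_t, women_t, Fraction(1, 2))
balanced_c = mix(men_c, women_c, Fraction(1, 2))

print(p_s_given_t, p_s_given_not_t)   # pooled rates under the observed weights
print(balanced_t > balanced_c)        # comparison under identical weights
```

With the observed weights both averages coincide, whereas with identical weights the treatment average stays above the control average.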
The weights in each line are then identical—\(p(\r{M}\mid \r{T}) = p(\r{M}\mid \neg \r{T})\)—and associations in subpopulations are preserved for the aggregate data (Theorem 1 in Section 2.2). In fact, the absence of such a correlation rules out Simpson’s Paradox. In what follows, we interpret Simpson’s paradox as a property of association between variables, expressed by conditional probabilities. This perspective is not uncontentious. For Spanos (2021), it amounts to a deductive take on the paradox, contrary to the original understanding of the paradox as a challenge for inductive, statistical learning. In Pearson’s and Yule’s original papers (and in applications involving linear regression models, see Section 5.1), the paradox is about models and data: estimating a model parameter yields spurious correlations that vanish when one refines the model and includes further variables. For Spanos, the paradox emerges as a consequence of statistical model misspecification. We get back to this statistics-centered perspective on Simpson’s paradox at the end of Section 4. 2.1 Varieties of Simpson’s Paradox Simpson’s Paradox can occur for various types of data, but classically, it is formulated with respect to \(2\times2\) contingency tables. Let \(D_i = (a_i, b_i, c_i, d_i)\) be a four-dimensional vector of real numbers representing the \(2\times2\) contingency table for treatment and success in the i-th subpopulation, and let \[D = \sum_{i=1}^N D_i = \left(\sum a_i, \sum b_i, \sum c_i, \sum d_i\right)\] be the aggregate data set over \(N\) subpopulations. These data should be read as shown in Table 2. 
| | Population \(D = D_1+D_2\): Success (\(S\)) | Failure (\(\neg S\)) | Subpopulation \(D_1\): Success (\(S\)) | Failure (\(\neg S\)) | Subpopulation \(D_2\): Success (\(S\)) | Failure (\(\neg S\)) |
| Treatment (\(T\)) | \(a_1 + a_2\) | \(b_1 + b_2\) | \(a_1\) | \(b_1\) | \(a_2\) | \(b_2\) |
| No Treatment (\(\neg T\)) | \(c_1 + c_2\) | \(d_1 + d_2\) | \(c_1\) | \(d_1\) | \(c_2\) | \(d_2\) |

Table 2: Abstract representation of a \(2 \times 2\) contingency table with subpopulations \(D_1\) and \(D_2\).

Let \(\alpha (D_i)\) be a measure of the strength of the probabilistic association between \(T\) and \(S\) in population \(D_i\).[1] By convention, \(\alpha (D_i) = 0\) corresponds to no association between the variables, \(\alpha (D_i) \gt 0\) indicates a positive association, and \(\alpha (D_i) < 0\) a negative one. This can best be translated into the condition \[\begin{align*} \tag{1} \alpha (D_i) & \begin{cases} > 0 & \qquad \text{if and only if} \qquad a_i \, d_i > b_i \, c_i; \\ = 0 & \qquad \text{if and only if} \qquad a_i \, d_i = b_i \, c_i; \\ < 0 & \qquad \text{if and only if} \qquad a_i \, d_i < b_i \, c_i. \end{cases}\end{align*}\] The condition \(a_i \, d_i > b_i \, c_i\) is equivalent to saying that the success rate in the first row (“treatment condition”) is higher than the success rate in the second row (“control condition”): \[ a_i/(a_i+b_i) > c_i/(c_i+d_i).\] Applying all this to our dataset in Table 1, we see that \(\alpha(D) = 0\) although \(\alpha(D_1) > 0\) and \(\alpha(D_2) > 0\). This is a special case of what Samuels (1993) calls Association Reversal (AR). Association reversal occurs if and only if there is a population such that the association in all partitioned subpopulations is either (i) positive, (ii) negative, or (iii) zero, and the type of association in the population does not match that of the subpopulations. 
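The sign condition on \(a_i d_i\) versus \(b_i c_i\) and the verbal definition of association reversal can be sketched in Python as follows. The function names (`alpha_sign`, `aggregate`, `association_reversal`) are hypothetical, and the check follows the verbal definition just given (all subpopulations share one type of association that the aggregate lacks); it is not meant as a general-purpose test.

```python
def alpha_sign(table):
    """Sign of the association in a 2x2 table (a, b, c, d).

    Returns +1 if a*d > b*c, 0 if a*d == b*c, and -1 if a*d < b*c.
    """
    a, b, c, d = table
    cross = a * d - b * c
    return (cross > 0) - (cross < 0)


def aggregate(tables):
    """Cell-wise sum D = D_1 + ... + D_N of the subpopulation tables."""
    return tuple(sum(cells) for cells in zip(*tables))


def association_reversal(tables):
    """True if every subpopulation has the same type of association
    (positive, negative, or zero) and the aggregate has a different one."""
    signs = {alpha_sign(t) for t in tables}
    return len(signs) == 1 and alpha_sign(aggregate(tables)) not in signs


# Table 1 as (a, b, c, d) = (treated success, treated failure,
# control success, control failure).
D1 = (8, 5, 4, 3)     # men
D2 = (12, 15, 2, 3)   # women

print(alpha_sign(D1), alpha_sign(D2), alpha_sign(aggregate([D1, D2])))
print(association_reversal([D1, D2]))
```

Working with the cross-product \(ad - bc\) avoids division, so tables with an empty row do not raise an error.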
Writing this out mathematically, this means for a dataset \(D = \sum_{i=1}^N D_i\) that one of the following two conditions holds, \[\begin{align*} \alpha(D) &\le 0 \qquad \text{and} & \alpha(D_i) &\ge 0 \qquad \forall \; 1 \le i \le N \tag{AR1}\\ \alpha(D) &\ge 0 \qquad \text{and} & \alpha(D_i) &\le 0 \qquad \forall \; 1 \le i \le N \tag{AR2}\end{align*}\] where at least one of the inequalities has to be strict. Association reversal is the standard variety of Simpson’s Paradox (Bandyopadhyay et al. 2011; Blyth 1972, 1973) and also the one that is most frequently investigated in the psychology of reasoning, or by philosophers analyzing the paradox (e.g., Cartwright 1979; Eells 1991; Malinas 2001). An important special case of AR occurs when there is no association in the subpopulations, but an association emerges in the overall dataset: \[\begin{align*} \alpha(D_i) &= 0 \qquad \forall \; 1 \le i \le N \qquad \text{but} & \alpha(D) &\ne 0 \tag{YAP}\end{align*}\] Referring to the pioneering work of the statistician George U. Yule (1903: 132–134), Mittal (1991) calls this Yule’s Association Paradox (YAP). It is typical of spurious correlations between variables with a common cause, that is, variables that are dependent unconditionally (\(\alpha(D) \ne 0\)) but independent given the values of the common cause (\(\alpha(D_i) = 0\)). For example, sleeping in one’s clothes is correlated with having a headache the next morning. However, once we stratify the data according to the levels of alcohol intake on the previous night, the association vanishes: given the same level of drunkenness, people who undress before going to bed will have the same headache, ceteris paribus, as those who kept their clothes on. Finally, the most general version of Simpson’s Paradox is the Amalgamation Paradox (AMP) identified by Good and Mittal (1987). 
This paradox occurs when the overall degree of association is bigger (or smaller) than each degree of association in the subpopulations, or mathematically, \[\begin{align*} \alpha(D) &> \max_{1 \le i \le N} \alpha(D_i) \qquad \text{or} & \alpha(D) &< \min_{1 \le i \le N} \alpha(D_i). \tag{AMP} \end{align*}\] AMP challenges the intuition that the degree of association in the general population, in virtue of being “the sum” of the individual subpopulations, has to fall in between the minimal and the maximal degree of association observed on that level. The logical strength of the paradoxes is inversely related to their generality and frequency of occurrence: \(\text{YAP} \Rightarrow \text{AR} \Rightarrow \text{AMP}\). Variations of the paradox for non-categorical data (e.g., bivariate real-valued data) will be discussed in Section 5.1. 2.2 Necessary and Sufficient Conditions We proceed to characterizing the mathematical conditions under which Simpson’s Paradox occurs. We have already suggested that the paradox arises in the medical example due to correlations between the treatment variable and the partitioning variable, and we can now make this more precise: Theorem 1 (Lindley & Novick 1981; Mittal 1991): If \(\alpha(D) > 0\) and association reversal occurs for the subpopulations characterized by attribute \(\r{M}\) and \(\neg\r{M}\), (i.e., \(\alpha(D_1), \alpha(D_2) \le 0\)), then either \(\r{M}\) is positively related to \(\r{S}\) and \(\r{T}\); or \(\r{M}\) is positively related to \(\neg\r{S}\) and \(\neg\r{T}\). As Theorem 1 makes clear, the lack of correlation between \(\r{M}\) and \(\r{T}\) is sufficient to rule out association reversals (and thus YAP as well). Does it also rule out the more general amalgamation paradox? The answer to this depends on which measure of association one chooses for \(\alpha\). 
Discussions of Simpson’s Paradox commonly treat association as the difference in the success rate between the treated and the untreated, but this is only one of many possibilities (Fitelson 1999). While the lack of association between \(M\) and \(T\) is sufficient to rule out AMP for most measures (including the difference measure), it does not rule it out for all measures, as we will now explain. Readers not interested in the specific details may skip to the following section. Here are some widely used association measures for a dataset \((a, b, c, d)\): \[\begin{align*} \pi_{D} &= \frac{a}{a+b} - \frac{c}{c+d} & \pi_{Y} &= \frac{ad -bc}{N^2}\\ \pi_{R} &= \log \left(\frac{a}{a+b} \cdot \frac{c+d}{c} \right) & \pi_{W} &= \log \left(\frac{a}{a+c} \cdot \frac{b+d}{b} \right) \\ \pi_{O} &= \log \frac{ad}{bc} & \pi_{C} &= \log \left(\frac{d}{c+d} \cdot \frac{a+b}{b} \right) \end{align*}\] Some of these measures can be formulated probabilistically and have been suggested as measures of causal strength and outcome measures for clinical trials (Edwards 1963; Eells 1991; Fitelson & Hitchcock 2011; Greenland 1987; Peirce 1884; Sprenger 2018; Sprenger & Stegenga 2017). For example, \(\pi_{D} = p(\r{S}\mid \r{T}) - p(\r{S}\mid \neg \r{T})\) represents the difference and \(\pi_R = \log \left[ p(\r{S}\mid \r{T}) / p(\r{S}\mid \neg \r{T}) \right]\) the (logarithm of the) ratio of success rates in treatment and control conditions. \(\pi_W\) can be interpreted as the prognostic weight of evidence that treatment provides for success (i.e., as the log-Bayes factor), \(\pi_{Y}\) is Yule’s (1903) measure of association, \(\pi_{O}\) is the log-odds ratio familiar from epidemiological data analysis, and \(\pi_C\) is I.J. Good’s (1960) measure of causal strength. We now consider the extent to which AMP for different measures is ruled out by different experimental designs. Suppose that individuals are uniformly assigned to the treatment and control condition across subpopulations. 
In such a case, where the ratio of persons assigned to the treatment and control condition is equal for each subpopulation, the experimental design is called row-uniform. Specifically, there has to be a \(\lambda > 0\) such that for any subpopulation \(i\) \[ a_i + b_i = \lambda (c_i+d_i) \tag{Row Uniformity} \] In particular, row uniformity holds approximately if our sample is large and we sample at random from the population. Row-uniform design of a trial ensures independence between a potential confounder \(M\) and the treatment variable \(T\). Accordingly, by Theorem 1, it rules out association reversals. Additionally, row-uniform design is sufficient to rule out the AMP for a wide class of association measures:

Theorem 2 (Good & Mittal 1987): If a dataset \(D = \sum D_{i}\) satisfies row uniformity, then the Amalgamation Paradox is avoided for the measures \(\pi_{D}\), \(\pi_{R}\), \(\pi_{Y}\), \(\pi_{W}\) and \(\pi_{C}\). It is not avoided for the log-odds ratio \(\pi_{O}\).

Some studies also exhibit column-uniform design where the proportion of successes and failures is constant across all subpopulations: \[ a_i + c_i = \lambda (b_i+d_i) \tag{Column Uniformity} \] In that case, \(\r{M}\) is independent of \(\r{S}\). Column uniformity can occur in case-control studies with various subpopulations (e.g., different hospitals) where one does not match the number of persons with the explanatory attribute, as one would in an RCT. Instead, for each person with a certain attribute (e.g., a specific form of cancer), one selects a number of persons who do not have this attribute. Column-uniform design avoids AR as well, but among the presented association measures, it suffices to rule out AMP only for \(\pi_Y\).

| Avoids AMP? | \(\pi_{D}\) | \(\pi_{R}\) | \(\pi_{O}\) | \(\pi_{Y}\) | \(\pi_{W}\) | \(\pi_{C}\) |
| Row-uniform design | yes | yes | no | yes | yes | yes |
| Column-uniform design | no | no | no | yes | no | no |
| Both | yes | yes | yes | yes | yes | yes |

Table 3: An overview of how row- and column-uniform design avoid the amalgamation paradox for various association measures.

Table 3 summarizes the properties of all association measures with respect to the AMP and the different forms of experimental design. The behavior of the log-odds measure \(\pi_O\), where neither row- nor column-uniform design suffices to rule out the AMP, will be discussed in Section 5.2.

We now identify one last fundamental condition for when data exhibit association reversal. Have a look at Figure 1, which displays the success proportions for treatment and control graphically.

Figure 1: A geometrical representation of a necessary condition for the occurrence of Association Reversal. The paradox can occur if the proportions are ordered like in the left graph; it cannot occur if they are ordered like in the right graph. [An extended description of figure 1 is in the supplement.]

In both examples, the treatment success rate is for both subpopulations greater than the control success rate. When will this order be preserved at the overall level? We know that the overall success rate for each condition (treatment/control) is constrained by the success rates in the subpopulations:

Fact 1: Suppose \(a_i, b_i > 0\) for all \(1 \le i \le N\). Then also \[\begin{align*}\tag{2} \min \frac{a_i}{a_i+b_i} \le \frac{\sum_{j=1}^N a_j}{\sum_{j=1}^N (a_j+b_j)} \le \max \frac{a_i}{a_i+b_i} \end{align*}\] This fact follows directly from the Law of Total Probability (proof omitted) and it gives us a simple necessary condition for the occurrence of Association Reversal (AR): turning to Figure 1 again, it implies that the overall success rate per condition has to be on the solid lines. 
Thus AR cannot occur in the right part of Figure 1, but it can occur if the proportions are ordered as in the left part of Figure 1. Generally, AR is avoided when the following condition holds: \[\tag{RH} \begin{align*} \max_{1 \le i \le N} \frac{a_i}{a_i+b_i} & < \min_{1 \le i \le N} \frac{c_i}{c_i+d_i} \\ \text{ or } \hspace{5.5em}\\ \min_{1 \le i \le N} \frac{a_i}{a_i+b_i} & > \max_{1 \le i \le N} \frac{c_i}{c_i+d_i} \end{align*} \] Any dataset that satisfies (RH) will be called row-homogenous. By contrast, for any given set of proportions violating condition (RH), we can find datasets exhibiting these very same proportions such that AR indeed occurs (by fiddling with the size of the subpopulations; Lemma 3.1 in Mittal 1991). However, neither row homogeneity, nor the analogous condition of column homogeneity, nor their conjunction is sufficient for avoiding the amalgamation paradox AMP. Finally, one might be interested in how frequently the paradox arises. Simulations by Pavlides and Perlman (2009) suggest that it should not occur frequently: the confidence interval for the probability of AR is a subset of the interval \([0;0.03]\) for both the uniform prior and the (objective) Jeffreys prior. Of course, the practical value of this diagnosis depends on whether the sampling assumptions are sensible, and whether the entire approach makes sense for real-life datasets where researchers can group the data into subpopulations along numerous dimensions. 3. Simpson’s Paradox and Causal Inference Within the philosophical literature, Simpson’s Paradox received sustained attention due to its implications for accounts of causality that posit systematic connections between causal relationships and probability-raising. Specifically, the paradox reveals that facts about probability-raising will not necessarily be preserved when one partitions a population into subpopulations. 
This poses a number of important challenges to philosophical accounts of causal inference based on the concept of probability: What is the appropriate set of background factors for determining when a probabilistic relationship is causal? What do association reversals imply for causal inference? Does Simpson’s Paradox threaten the objectivity of causal relationships? Strategies for treating the paradox and answering these questions have contributed substantially to the development of theories of probabilistic causality (Cartwright 1979; Eells 1991). A different set of answers is provided by more recent work on the paradox in the framework of graphical causal models (e.g., Pearl 1988, 2000 [2009]; Spirtes et al. 2000), and we will discuss both accounts in turn. In particular, we will explain how Simpson’s Paradox can be analyzed through the notions of confounding and the identifiability of a causal effect.

3.1 Probabilistic Causality and Simpson’s Paradox

Early accounts of probabilistic causation (e.g., Reichenbach 1956; Suppes 1970) sought to explicate causal claims purely in terms of probabilistic and temporal facts. On Suppes’ (1970) account, event \(\r{C}\) is a prima facie cause of \(\r{E}\) if and only if (i) \(\r{C}\) occurs before \(\r{E}\) and (ii) \(\r{C}\) raises the probability of \(\r{E}\).[2] As we have already seen in Section 2.1, not all prima facie causes are genuine causes. If I drink a strong blond Belgian beer now, I will probably be happy during the day, but also have a headache tomorrow. However, being happy would not thereby be the cause of the headache: the correlation is explained by the common cause, the beer drinking. The variable for drinking the beer screens off the probabilistic relationship between its effects, meaning that the effects will be uncorrelated when one conditions on it. 
The crux of Suppes’ account is that a prima facie causal relationship between \(\r{C}\) and \(\r{E}\) is a genuine causal relationship iff there is no factor \(\r{F}\) prior to \(\r{C}\) that screens off \(\r{C}\) from \(\r{E}\).[3] Later theorists such as Cartwright (1979) and Eells (1991) developed this condition by making causal claims relative to a causally homogenous background context, which is specified by a set of variables \(\b{K}\). Consider the following example of association reversal presented by Cartwright. Supposing that smoking \((\r{S})\) is a cause of heart disease \((\r{H})\), one might expect that smoking would raise the probability of heart disease. Yet this might not be the case. Suppose that in a population there is a strong correlation between smoking and exercising \((X)\), and that exercise lowers the probability of heart disease by more than smoking raises its probability. In such a case, smoking might lower the probability of heart disease although conditional on either \(X\) or \(\neg X\), \(\r{S}\) raises \(\r{H}\)’s probability. Cartwright interprets this case as follows: causes always raise the probability of their effects, but this can be “concealed” by the correlation between the cause and some other variable (here, \(X\)). In order to isolate the genuine probabilistic relationship between \(\r{C}\) and \(\r{E}\), one needs to consider it in a context where such correlations cannot occur:

Probabilistic Causality (Cartwright) Let \(\b{K}\) denote all and only the causes of \(\r{E}\) other than \(\r{C}\) and effects of \(\r{C}\). Then \(\r{C}\) causes \(\r{E}\) if and only if relative to all combinations of values of the variables in \(\b{K}\), \(\r{C}\) raises the probability of \(\r{E}\): \(p(\r{E}\mid \r{C},\b{K}) > p(\r{E}\mid \neg \r{C},\b{K})\). 
While Suppes defends a reductive account of probabilistic causality, where the elements of \(\b{K}\) are determined without appeal to causal assumptions, Cartwright presents a non-reductive account where \(\b{K}\) must include all and only the causes of \(\r{E}\), excluding \(\r{C}\) itself and any variables that are causally intermediate between \(\r{C}\) and \(\r{E}\). The current consensus is that it is impossible to give a probabilistic account of causation without relying on causal concepts, and thus that no reductive account is feasible (though see Spohn 2012 for a dissenting view). Although non-reductive accounts could not be used to explain causation to someone with no prior causal knowledge, they can nevertheless clarify how causal claims are tested, and illuminate the relationship between causation and probability (see also Woodward 2003: 20–22). Moreover, Cartwright argues that her general criterion for inclusion of background factors in \(\b{K}\) avoids the reference class problem for purely statistical accounts of causal explanation, which arises when probabilistic facts arbitrarily depend on the way one partitions a population into subpopulations. Through specifying the relevant populations for evaluating causal claims, she aims to eliminate a threat to the objectivity of causal explanation. More detail is provided in the entry on probabilistic causality.

3.2 Specific Debates: Causal Interaction, Average Effects, Mediators

Cartwright’s innovations for probabilistic accounts of causality have triggered various debates related to Simpson’s Paradox. We highlight three of them here:

Debate 1: Causal Interaction

Cartwright claims that causes raise the probabilities of their effects across all background contexts,[4] but many purported causes only raise the probabilities of their effects in some contexts. In the latter cases, causes interact with background factors in producing their effects. 
To give Cartwright’s own example (1979: 428), ingesting an acid poison generally causes death, except in contexts where one also ingests an alkali poison (in which case the two cancel one another out). The problem of such interactive causes for probabilistic accounts is that they threaten Cartwright’s picture on which t
Rethinking the Value of Generated Tests for LLM Software Engineering Agents(agents.This)
Large Language Model (LLM) code agents increasingly resolve repository-level issues by iteratively editing code, invoking tools, and validating candidate patches. In these workflows, agents often write tests on the fly, but the value of this behavior remains unclear. For example, GPT-5.2 writes almost no new tests yet achieves performance comparable to top-ranking agents. This raises a central question: do such tests meaningfully improve issue resolution, or do they mainly mimic a familiar software-development practice while consuming interaction budget? To better understand the role of agent-written tests, we analyze trajectories produced by six strong LLMs on SWE-bench Verified. Our results show that test writing is common, but resolved and unresolved tasks within the same model exhibit similar test-writing frequencies. When tests are written, they mainly serve as observational feedback channels, with value-revealing print statements appearing much more often than assertion-based checks. Based on these insights, we perform a prompt-intervention study by revising the prompts used with four models to either increase or reduce test writing. The results suggest that prompt-induced changes in the volume of agent-written tests do not significantly change final outcomes in this setting. Taken together, these results suggest that current agent-written testing practices reshape process and cost more than final task outcomes.
Misu(donate.wikimedia.org)
Misu
From Wikipedia, the free encyclopedia
Grain-based Korean beverage

Misu (Korean: 미수) is a beverage made from the traditional Korean grain powder misu-garu (미숫가루; misutgaru; 'misu powder'), which is a combination of 7–10 different grains. It is usually served on hot summer days to quench thirst or as an instant nutritious drink for breakfast or as a healthy snack. In a Joseon Dynasty (1392–1897) recipe book, misu was mentioned as stir-fried barley (gu). Gu was a delicacy of that time and easy to serve as one went to travel. Misu is made of glutinous rice and other ingredients, such as barley, yulmu (Coix lacryma-jobi var. ma-yuen), brown rice, black rice, black soybeans, corn, white beans, millet, and sesame seeds, which are ground, roasted and/or steamed, then mixed together. Misugaru is commonly added to water or milk and stirred to make a drink. Sugar or condensed milk can be added as a sweetener. The beverage is high in protein, vitamins, calcium, magnesium, molybdenum, folate, and selenium, and is a dieter's drink, as it is quite filling but low in calories.[1]

See also
Chatang – Gruel in Beijing and Tianjin cuisine
Gofio – Toasted flour from the Canary Islands
Kama (food) – Traditional Finnic dish of mixed cereal flour and milk
Rubaboo – Porridge
Tsampa – Roasted flour for cereal

References
[1] Sue. "Healthy Korean Multi-Grain Shakes – Homemade Misutgaru Latte – My Korean Kitchen". My Korean Kitchen. Retrieved 19 August 2015.