plant lover, cookie monster, shoe fiend
10563 stories · 20 followers

Lottery, luck, or legacy. A review of “The Genetic Lottery: Why DNA matters for social equality” - Coop - 2022 - Evolution

1 Comment

The Genetic Lottery: Why DNA Matters for Social Equality aims to convince the reader that recent methodological developments in human genetics should change the broader societal conversation about redistributive justice. The author, Dr. Kathryn Paige Harden, is a Professor of Psychology at the University of Texas at Austin, who specializes in behavioral genetics. Her book starts from the premise that human behaviors, and in particular educational attainment, are “heritable,” i.e., that within a study sample, some fraction of the phenotypic variance is explained by differences in genotypes. As is described, we can now identify some of the genetic loci associated with trait variation through genome-wide association studies (GWAS) and make predictions—currently, quite noisy predictions—of individual outcomes from genotypes. In the author's view, GWAS findings underscore that people differ not only in the social circumstances into which they are born but also in the genetics that they happen to inherit. Since neither social circumstances nor genetics are earned or chosen, both result from “luck.” The book argues that both sources of luck contribute commensurately to social inequalities in educational attainment and ultimately in income, and therefore that genetics is needed in order to better understand and redress social inequalities. In particular, in Harden's view, recent GWAS findings should lead us to be mindful of principles of equity and not just equality.
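For readers who want the quantitative definition behind “heritable,” the sketch below gives the standard variance-partitioning formulation; this is a textbook definition added for orientation, not a formula quoted from the book or the review.

```latex
% Broad-sense heritability: the fraction of phenotypic variance within a
% study sample that is attributable to genetic differences among its members.
H^2 = \frac{V_G}{V_P}, \qquad V_P = V_G + V_E
% V_G: genetic variance; V_E: environmental variance; V_P: phenotypic variance.
% The narrow-sense analogue h^2 = V_A / V_P uses only the additive genetic
% variance V_A, the component that GWAS-based polygenic predictors target.
```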

The author is an extremely talented communicator, and The Genetic Lottery includes discussion of many engaging and thought-provoking examples. But in our view, its central argument mischaracterizes where the field of human genetics stands and what it promises. Although some of the controversy over the book has centered on its premise, the fact that educational attainment is heritable was documented before GWAS and is in some sense trivial. In humans as in any other species, almost all traits that vary within a group are heritable (Barton & Keightley, 2002; Turkheimer, 2000). We thus fully grant the book's starting point. We also happen to support the redistributive policies outlined in her conclusions. However, we believe that many of the arguments made to connect the premise to these conclusions are unwarranted, notably concerning the pertinence of GWAS findings.

Given its broad scope, The Genetic Lottery presents many angles from which to comment. As others have pointed out, it focuses attention on “genetic luck,” when people face social and historical inequities that are anything but random (Martschenko, 2021), and considers the impacts of relatively small social interventions rather than the larger structural inequities in which they are embedded (Panofsky, 2021; Parens, 2021). As population geneticists, and given the importance placed on GWAS and trait prediction in the book, we concentrate on points at which the scientific results are distorted or exaggerated. Cumulatively, these mischaracterizations foster a view of genetic causes of educational attainment as identifiable, intrinsic properties of individuals. As we discuss, this view is not justified by current understanding.

The authors thank Jeremy Berg, Vince Buffalo, Dalton Conley, Doc Edge, Arbel Harpak, Norman Johnson, Hakhamanesh Mostafavi, Magnus Nordborg, Carl Veller, Sivan Yair, and other members of the Coop lab for comments on drafts of this manuscript, and Ewan Birney, Michael Nivard, and Alexander Young for helpful comments on Twitter. Funding was provided by the National Institutes of Health (NIH R01 GM108779 and R35 GM136290 awarded to GC and R01 HG011432 co-awarded to MP). [Correction added on 30th March 2022, after first online publication: due to a system error the abstract and an author email were omitted. A typo in the acknowledgment has also been corrected.]

    Read the whole story
    sarcozona
    4 minutes ago
    Pretty sure I've shared this before, but it is so important

    The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning

    1 Comment

    This is Jessica. In a paper to appear at AIES 2022, Sayash Kapoor, Priyanka Nanayakkara, Arvind Narayanan, and Andrew and I write:

    Recent arguments that machine learning (ML) is facing a reproducibility and replication crisis suggest that some published claims in ML research cannot be taken at face value. These concerns inspire analogies to the replication crisis affecting the social and medical sciences. They also inspire calls for greater integration of statistical approaches to causal inference and predictive modeling.

    A deeper understanding of what reproducibility critiques in research in supervised ML have in common with the replication crisis in experimental science can put the new concerns in perspective, and help researchers avoid “the worst of both worlds,” where ML researchers begin borrowing methodologies from explanatory modeling without understanding their limitations and vice versa. We contribute a comparative analysis of concerns about inductive learning that arise in causal attribution as exemplified in psychology versus predictive modeling as exemplified in ML.

    Our results highlight where problems discussed across the two domains stem from similar types of oversights, including overreliance on theory, underspecification of learning goals, non-credible beliefs about real-world data generating processes, overconfidence based in conventional faith in certain procedures (e.g., randomization, test-train splits), and tendencies to reason dichotomously about empirical results. In both fields, claims from learning are implied to generalize outside the specific environment studied (e.g., the input dataset or subject sample, modeling implementation, etc.) but are often difficult to refute due to underspecification of the learning pipeline. We note how many of the errors recently discussed in ML expose the cracks in long-held beliefs that optimizing predictive accuracy using huge datasets absolves one from having to consider a true data generating process or formally represent uncertainty in performance claims. At the same time, the goals of ML are inherently oriented toward addressing learning failures, suggesting that lessons about irreproducibility could be resolved through further methodological innovation in a way that seems unlikely in social psychology. This assumes, however, that ML researchers take concerns seriously and avoid overconfidence in attempts to reform. We conclude by discussing risks that arise when sources of errors are misdiagnosed and the need to acknowledge the role that human inductive biases play in learning and reform.

    As someone who has followed the replication crisis in social science for years and now sits in a computer science department where it’s virtually impossible to avoid engaging with the huge crushing bulldozer that is modern ML, I often find myself trying to make sense of ML methods and their limitations by comparison to estimation and explanatory modeling. At some point I started trying to organize these thoughts, then enlisted Sayash and Arvind, who had done some work on ML reproducibility, Priyanka, who follows work on ML ethics and related topics, and Andrew as an authority on empirical research failures. It was a good coming together of perspectives, and an excuse to read a lot of interesting critiques and foundational stuff on inference and prediction (we cite over 200 papers!). As a ten-page, conference-style paper this was obviously ambitious, but the hope is that it will be helpful to others who have found themselves trying to understand how, if at all, these two sets of critiques relate. On some level I wrote it with computer science grad students in mind–I teach a course to first-year PhDs where I talk a little about reproducibility problems in CS research and what’s unique compared to reproducibility issues in other fields, and they seem to find it helpful.

    The term learning in the title is overloaded. By “errors in learning” here we are talking about not just problems with whatever the fitted models have inferred–we mean the combination of the model implications and the human interpretation of what we can learn from it, i.e., the scientific claims being made by researchers. We break down the comparison based on whether the problems are framed as stemming from data problems, model representation bias, model inference and evaluation problems, or bad communication.

    [Table: comparison of concerns about learning from data in ML versus psychology]

    The types of data issues that get discussed are pretty different – small samples with high measurement error versus datasets that are too big to understand or document. The underrepresentation of subsets of the population to which the results are meant to generalize comes up in both fields, but with a lot more emphasis on implications for fairness in decision pipelines in ML, given its applied status. ML critics also talk about unique data issues like “harms of representation,” where model predictions reinforce some historical bias, like when you train a model to make admissions decisions based on past decisions that were biased against some group. The idea that there is no value-neutral approach to creating technology, so we need to consider normative ethical stances, is much less prevalent in mainstream psych reform, where most of the problems imply ways that modeling diverges from its ideal value-neutral status. There are some clearer analogies, though, if you look at concerns about overlooking sampling error and power issues in assessing the performance of an ML model.
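    As a concrete illustration of that last point, here is a minimal sketch, with toy numbers of my own rather than anything from the paper, of the sampling error attached to a single reported test-set accuracy, assuming test examples are i.i.d. so that the number of correct predictions is approximately binomial:

    ```python
    import math

    def accuracy_confidence_interval(accuracy, n_test, z=1.96):
        """Normal-approximation 95% CI for an accuracy estimated on n_test examples."""
        se = math.sqrt(accuracy * (1 - accuracy) / n_test)
        return accuracy - z * se, accuracy + z * se

    # A one-point "improvement" can sit entirely inside the sampling noise:
    print(accuracy_confidence_interval(0.91, n_test=2000))  # ~(0.897, 0.923)
    print(accuracy_confidence_interval(0.90, n_test=2000))  # ~(0.887, 0.913)
    ```

    The overlapping intervals are the point: a single accuracy number carries sampling error, which is where the analogy to power and significance concerns in psych is clearest.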

    Choosing representations and doing inference are also obviously different on the surface in ML versus psych, but here the parallels in critiques that reformers are making are kind of interesting. In ML there’s colloquially no need to think about the psychological plausibility of the solutions that a learner might produce; it’s more about finding the representation where the inductive bias, i.e., properties of the solutions that it finds, is desirable for the learning conditions. But if you consider all the work in recent years aimed at improving the robustness of models to adversarial manipulations to input data, which basically grew out of acknowledgment that perturbations of input data can throw a classifier off completely, it’s often implicit that successful learning means the model learns a function that seems plausible to a human. E.g., some of the original results motivating the need for adversarial robustness were surprising because they show that manipulations that a human doesn’t perceive as important (like slight noising of images or masking of parts that don’t seem crucial) can cause prediction failures. Simplicity bias in stochastic gradient descent can be cast as a bad thing when it causes a model to overrely on a small set of features (in the worst case, features that correlate with the correct labels as a result of biases in the input distribution, like background color or camera angle being strongly correlated with what object is in the picture). Some recent work explicitly argues that this kind of “shortcut learning” is bad because it defies expectations of a human who is likely to consider multiple attributes to do the same task (e.g., the size, color, and shape of the object). Another recent explanation is underspecification, which is related but more about how you can have many functions that achieve roughly the same performance given a standard test-validate-train approach but where the accuracy degrades at very different rates when you probe them along some dimension that a human thinks is important, like fairness. So we can’t really escape caring about how features of the solutions that are learned by a model compare to what we as humans consider valid ways to learn how to do the task.
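    Here is a minimal sketch of the shortcut-learning failure mode described above, using synthetic data invented for illustration rather than any example from the paper: a nearly noise-free spurious feature tracks the label during training but not at test time, so an off-the-shelf classifier leans on it and collapses under the shift.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_data(n, shortcut_tracks_label):
        """Binary task with one weak 'real' feature and one spurious shortcut."""
        y = rng.integers(0, 2, n)
        core = y + rng.normal(0, 1.0, n)  # weakly informative genuine signal
        shortcut = (y if shortcut_tracks_label else 1 - y) + rng.normal(0, 0.1, n)
        return np.column_stack([core, shortcut]), y

    X_train, y_train = make_data(5000, shortcut_tracks_label=True)
    X_test, y_test = make_data(5000, shortcut_tracks_label=False)

    model = LogisticRegression().fit(X_train, y_train)
    print("train accuracy:", model.score(X_train, y_train))  # close to 1.0
    print("test accuracy:", model.score(X_test, y_test))     # far below chance
    ```

    Nothing here is adversarial in the crafted-perturbation sense; the distribution shift simply exposes which function was actually learned, which is how a human-implausible solution can look perfect under a standard train-test split.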

    We also compare model-based inference and evaluation across social psych and ML. In both fields, implicit optimization–for statistical significance in psych and better-than-SOTA performance in ML–is suggested to be a big issue. However, in contrast to using analytical solutions like MLE in psych, optimization in ML is typically non-convex, such that the hyperparameters, initial conditions, and computational budget you use in training the model can matter a lot. One problem critics point to is that researchers don’t always acknowledge this in their reporting. How you define the baselines you test against is another source of variance, and potentially of bias if they are chosen in a way that improves your chances of beating SOTA.
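    A minimal sketch of that seed sensitivity, using a generic scikit-learn toy setup chosen for illustration rather than the models or benchmarks the paper discusses: identical data and hyperparameters, with only the seed that controls initialization and shuffling changing across runs.

    ```python
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    # Same architecture, data, and hyperparameters; only the random seed varies.
    scores = [
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                      random_state=seed).fit(X_tr, y_tr).score(X_te, y_te)
        for seed in range(10)
    ]
    print("test accuracy across seeds:", min(scores), "to", max(scores))
    ```

    Reporting a single run, or quietly reporting the best of several, hides this spread; the concern only gets sharper for the non-convex, budget-dependent training loops of deep learning, and for how baselines get tuned.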

    In terms of high-level takeaways, we point out ways that claims are irrefutable by convention across the two fields. In ML research one could say there’s confusion about what’s a scientific claim and what’s an engineering artifact. When a paper claims to have achieved X% accuracy on YZ benchmark with some particular learning pipeline, this might be useful for other researchers to know when attempting progress on the same problem, but the results are more possibilistic than probabilistic, especially when based on only one possible configuration of hyperparameters, etc., and with an implicit goal of showing one’s method worked. The problem is that the claims are often stated more broadly, suggesting that certain innovations (a new training trick, a model type) led to better performance on a loosely defined learning task like ‘reading comprehension,’ ‘object recognition,’ etc. In a field like social psych, on the other hand, you have a sort of inversion of NHST as intended, where a significant p-value leads to acceptance of loosely defined alternative hypotheses, and subject samples are often chosen by convenience and underdescribed, yet claims imply learning something about people in general.

    There’s also some interesting stuff related to how the two fields fail in different ways based on unrealistic expectations about reality. Meehl’s crud factor implies that using noisy measurements, small samples and misspecified models to argue about classes of interventions that have large predictable effects on some well-studied class of outcomes (e.g., political behavior) is out of touch with common sense about how we would expect multiple large effects to interact. In ML, the idea that we can leverage many weak predictors to make good predictions is accepted, but assumptions that distributions are stationary and that good predictive accuracy can stand alone as a measure of successful learning imply a similarly naive view of the world.

    So… what can ML learn from the replication crisis in psych about fixing its problems? This is where our paper (intentionally) disappoints! Some researchers are proposing solutions to ML problems, ranging from fairly obvious steps like releasing all code and data, to templates for reporting on limitations of datasets and behavior of models, to suggestions of registered reports or pre-registration. Especially in an engineering community there’s a strong desire to propose fixes when a problem becomes apparent, and we had several reviewers who seemed to think the work was only really valuable if we made specific recommendations about what psych reform methods can be ported to ML. But instead the lesson we point out from the replication crisis is that if we ignore the various sources of uncertainty we face about how to reform a field—in how we identify problematic claims, how we define the core reasons for the problems, and how we know that a particular reform will be more successful than others—it’s questionable whether we’re making real progress in reform. Wrapping up a pretty nuanced comparison with a few broad suggestions based on our instincts just didn’t feel right.

    Ultimately this is the kind of paper that I’ll never feel is done to satisfaction, since there’s always some new way to look at it, or type of problem we didn’t include. There are also various parts where I think a more technical treatment would have been nice to relate the differences. But as I think Andrew has said on the blog, sometimes you have to accept you’ve done as much as you’re going to and move on from a project.

    Read the whole story
    sarcozona
    10 minutes ago
    Love to see the combination of shitty frequentist stats, shitty ML, and shitty genome assembly & SNP filtering pipelines making all the careers in my field rn

    Links 5/24/22

    1 Share

    Links for you. Science:

    Scientists discover ‘ghost’ fossils beneath a microscope
    The Annihilation of Florida: An Overlooked National Tragedy
    Amazonian ‘camera traps’ provided images for massive archive
    Scientist finds professor who supported her love for bugs when she was 4
    Like it or not, invasive ‘Frankenfish’ are still among us
    Tracking coronavirus in animals takes on new urgency: Inside the global hunt to identify mutations that might lead to more lethal variants.

    Other:

    ‘Long covid’ is going to be a long haul
    George W. Bush Stumbles Into a Moment of Truth
    Voting is surging in Georgia despite controversial new election law
    6 Ways to Fight for Abortion Rights After Roe
    How our system of primary elections could destroy democracy
    Even Under Roe, I Faced Barriers to Get an Abortion: In spite of having made up my mind, I still had to face unnecessary barriers in order to access medical care that I could have received online and by mail.
    How Elise Stefanik, ‘bright light’ of a generation, chose a dark path
    The School Board Culture War: Republicans are pushing national wedge issues to the local level, but smart progressives are beating them.
    Metro’s recurring problems raise questions about oversight, management. As the rail system struggles to lure back riders whose confidence is shaken, future service cuts are likely if passengers don’t return (“The fundamental issue is that the appointing jurisdictions cheapen the board by appointing political ne’er-do-wells who have some contact with the appointing authority, and they know nothing about transit…A number of them have never been in the public sector. A number of them come equipped with preconceived notions …unfounded in reality.”)
    When Right-Wing Attacks on School Textbooks Fell Short: Some essential lessons from an earlier culture war.
    COVID’s Death Milestone And Mass Shootings: Is Mass Death The New Normal?
    Numbers don’t lie: Even in Mass., GOP candidates are outpacing Democrat. The tallies are a warning to Democrats everywhere, said Secretary of State William F. Galvin.
    Louisiana Senator Bill Cassidy: Our Maternal Death Rates Are Only Bad If You Count Black Women
    Entire Maine town forced to shut after its only clerk quits over denied vacation
    The Democrats Really Are That Dense About Climate Change. The party doesn’t even seem to realize that it’s blowing a once-in-a-decade chance to pass meaningful climate legislation.
    Abortion’s Last Stand in the South: A Post-Roe Future Is Already Happening in Florida
    What’s Behind America’s Shocking Baby-Formula Shortage?
    ‘What else have they been missing?’ Massive infant formula recall raises questions about FDA inspections
    Southern Baptist leaders covered up sex abuse, lied about secret database, report says
    Democrats’ Major Campaign Tech Firm Shifts Under New Private Equity Owner
    With the Buffalo massacre, white Christian nationalism strikes again
    How Trump Caused Inflation
    Vogue magazine publisher asked a British pub to change its name. It refused.
    Is the Middle Class Musician Disappearing?

    Read the whole story
    sarcozona
    15 hours ago

    The Annihilation of Florida: An Overlooked National Tragedy ❧ Current Affairs

    1 Share
    Read the whole story
    sarcozona
    16 hours ago

    Temporal correlations among demographic parameters are ubiquitous but highly variable across species

    1 Share
    We use long-term demographic data from 15 bird and mammal species to quantify correlation patterns among five demographic parameters. Results show that positive correlations are ubiquitous and suggest that correlations are more strongly driven by ecological rather than evolutionary factors.

    Abstract: Temporal correlations among demographic parameters can strongly influence population dynamics. Our empirical knowledge, however, is very limited regarding the direction and the magnitude of these correlations and how they vary among demographic parameters and species' life histories. Here, we use long-term demographic data from 15 bird and mammal species with contrasting pace of life to quantify correlation patterns among five key demographic parameters: juvenile and adult survival, reproductive probability, reproductive success and productivity. Correlations among demographic parameters were ubiquitous, more frequently positive than negative, but strongly differed across species. Correlations did not markedly change along the slow-fast continuum of life histories, suggesting that they were more strongly driven by ecological than evolutionary factors. As positive temporal demographic correlations decrease the mean of the long-run population growth rate, the common practice of ignoring temporal correlations in population models could lead to the underestimation of extinction risks in most species.
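    For context on the last sentence, the standard small-noise (Tuljapurkar) approximation for the long-run stochastic growth rate makes the mechanism explicit; it is a textbook result stated here for orientation, not an equation taken from the paper.

    ```latex
    % Small-noise approximation to the long-run (stochastic) growth rate:
    \log \lambda_s \approx \log \bar{\lambda} - \frac{\sigma^{2}_{\lambda}}{2\bar{\lambda}^{2}},
    \qquad
    \sigma^{2}_{\lambda} = \sum_{i,j}
      \frac{\partial \lambda}{\partial \theta_i}\,
      \frac{\partial \lambda}{\partial \theta_j}\,
      \operatorname{Cov}(\theta_i, \theta_j)
    % Positive covariances among demographic parameters \theta_i with like-signed
    % sensitivities inflate \sigma^{2}_{\lambda} and so lower \log \lambda_s;
    % ignoring them overstates growth and understates extinction risk.
    ```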
    Read the whole story
    sarcozona
    16 hours ago

    How network size strongly determines trophic specialisation: A technical comment on Luna et al. (2022)

    1 Share
    Luna et al. (2022) concluded that the environment contributes to explaining specialisation in open plant–pollinator networks. When reproducing their study, we instead found that network size alone largely explained the variation in their specialisation metrics. Thus, we question whether empirical network specialisation is driven by the environment.
    Read the whole story
    sarcozona
    16 hours ago