
Agent Psychosis: Are We Going Insane? | Armin Ronacher's Thoughts and Writings


written on January 18, 2026

You can use Polecats without the Refinery and even without the Witness or Deacon. Just tell the Mayor to shut down the rig and sling work to the polecats with the message that they are to merge to main directly. Or the polecats can submit MRs and then the Mayor can merge them manually. It’s really up to you. The Refineries are useful if you have done a LOT of up-front specification work, and you have huge piles of Beads to churn through with long convoys.

Gas Town Emergency User Manual, Steve Yegge

Many of us got hit by the agent coding addiction. It feels good, we barely sleep, we build amazing things. Every once in a while that interaction involves other humans, and all of a sudden we get a reality check that maybe we overdid it. The most obvious example is the massive degradation in the quality of issue reports and pull requests. To a maintainer, many PRs now look like an insult to one’s time, but when one pushes back, the other person does not see what they did wrong. They thought they were helping and contributing, and they get agitated when you close the PR.

But it’s way worse than that. I see people develop parasocial relationships with their AIs, become heavily addicted to them, and create communities where people reinforce highly unhealthy behavior. How did we get here, and what is it doing to us?

I will preface this post by saying that I don’t want to call anyone out in particular, and that I sometimes notice the tendencies I describe as negative in myself as well. I, too, have thrown some vibeslop up to other people’s repositories.

Our Little Dæmons

In His Dark Materials, every human has a dæmon, a companion that is an externally visible manifestation of their soul. It lives alongside them as an animal, but it talks, thinks, and acts independently. I’m starting to see our relationships with memory-equipped agents as something like those little creatures. We become dependent on them, and separation from them is painful and takes away from our new-found identity. We rely on these little companions to validate us and to collaborate with us. But it’s not a genuine collaboration like one between humans; it’s one that is completely driven by us, with the AI just along for the ride. We can trick it into reinforcing our ideas and impulses. And we act through this AI. Some people who have never programmed before now wield tremendous powers, but all those powers are gone when their subscription hits a rate limit and their little dæmon goes to sleep.

Then, when we throw up a PR or issue to someone else, that contribution is the result of this pseudo-collaboration with the machine. When an AI pull request comes in, on my repository or on another, I cannot tell at a glance how someone created it, but after a while I can usually tell when it was prompted in a way that is fundamentally different from how I do it. Even so, it takes me minutes to figure it out. I have seen coding sessions from others, and they are often done with clarity, but using slang someone has come up with and, most of all, by completely forcing the AI down a path without any real critical thinking. Particularly when you’re not familiar with how the systems are supposed to work, deferring to what the machine says while believing you understand what is going on creates some really bizarre outcomes at times.

People create these weird relationships with their AI agents, and once you see how some prompt their machines, you realize that it dramatically alters what comes out. To get good results you need to provide context, you need to make the tradeoffs, you need to use your knowledge. It’s not just a question of using the context badly; it’s also the way in which people interact with the machine. Sometimes it’s unclear instructions, sometimes it’s weird role-playing and slang, sometimes it’s just swearing at and forcing the machine, sometimes it’s outright ritualistic behavior. Some people just ram the agent straight down the narrowest possible path toward a badly defined goal, with little concern for the health of the codebase.

Addicted to Prompts

These dæmon relationships change not just how we work, but what we produce. You can completely give in and let the little dæmon run circles around you. You can reinforce it to run toward ill-defined (or even self-defined) goals without any supervision.

It’s one thing when newcomers fall into this dopamine loop and produce something. When Peter first got me hooked on Claude, I did not sleep. I spent two months excessively prompting the thing and wasting tokens. I kept building and building, creating a ton of tools I did not end up using much. “You can just do things” was on my mind all the time, but it took quite a bit longer to realize that just because you can, you might not want to. It became so easy to build something, and in comparison much harder to actually use it or polish it. I felt really great about quite a few of the tools I built, only to realize that I did not actually use them, or that they did not work as I thought they would.

The thing is that the dopamine hit from working with these agents is very real. I’ve been there! You feel productive, you feel like everything is amazing, and if you hang out only with people who are into that stuff too, without any checks, you go deeper and deeper into the belief that this all makes perfect sense. You can build entire projects without any real reality check, because the work is decoupled from any external validation. As long as nobody looks under the hood, you’re good. And damn, some things look amazing. I was blown away (and at the same time it was fully expected) when Cursor’s AI-written web browser landed. It’s super impressive that agents were able to bootstrap a browser in a week! But holy crap, I hope nobody ever uses that thing or tries to build an actual browser out of it. At least with this generation of agents, it’s still pure slop with little oversight. It’s an impressive research and tech demo, not an approach to building software people should use. At least not yet.

There is also another side to this slop loop addiction: token consumption.

Consider how many tokens these loops actually consume. A well-prepared session with good tooling and context can be remarkably token-efficient. For instance, the entire port of MiniJinja to Go took only 2.2 million tokens. But the hands-off approaches—spinning up agents and letting them run wild—burn through tokens at staggering rates. Patterns like Ralph are particularly wasteful: you restart the loop from scratch each time, which means you lose the ability to use cached tokens or reuse context.
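To make that arithmetic concrete, here is a back-of-the-envelope sketch in Python. Every number in it is a made-up assumption for illustration (context size, turn count, price, cache discount), not a real rate:

    # Why restart-from-scratch agent loops burn tokens. All numbers
    # below are illustrative assumptions, not real prices.
    CONTEXT_TOKENS = 100_000   # system prompt + repo context sent each turn
    TURNS = 50                 # iterations of the agent loop
    PRICE_PER_MTOK = 3.00      # assumed $ per million uncached input tokens
    CACHED_FRACTION = 0.10     # assume cached input costs ~10% of uncached

    # Ralph-style loop: every iteration starts cold, so the full context
    # is re-processed at the uncached rate each time.
    cold_cost = TURNS * CONTEXT_TOKENS / 1_000_000 * PRICE_PER_MTOK

    # Persistent session: pay full price for the context once, then
    # serve it from cache on the remaining turns.
    warm_cost = (CONTEXT_TOKENS
                 + (TURNS - 1) * CONTEXT_TOKENS * CACHED_FRACTION) \
        / 1_000_000 * PRICE_PER_MTOK

    print(f"cold restarts: ${cold_cost:.2f}")   # $15.00
    print(f"cached session: ${warm_cost:.2f}")  # $1.77

Under these assumed numbers the cold-restart loop costs roughly eight times more, purely because the same context gets re-processed uncached on every iteration.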

We should also remember that current token pricing is almost certainly subsidized. These patterns may not be economically viable for long. And those discounted coding plans we’re all on? They might not last either.

Slop Loop Cults

And then there are things like Beads and Gas Town, Steve Yegge’s agentic coding tools, which are a complete celebration of slop loops. Beads, which is basically some sort of issue tracker for agents, is 240,000 lines of code that … manages markdown files in GitHub repositories. And the code quality is abysmal.

In some circles there appears to be a competition to run as many of these agents in parallel as possible, with almost no quality control. Agents are then used to create documentation artifacts in an attempt to regain some confidence about what is actually going on. Except those documents themselves read like slop.

Looking at Gas Town (and Beads) from the outside, it looks like a Mad Max cult. What are polecats, refineries, mayors, beads, and convoys doing in an agentic coding system? If the maintainer is in the loop, and the whole community is in on this mad ride, then everyone and their dæmons just throw more slop up. To an external observer the whole project looks like an insane psychosis or a complete mad art project. Except, it’s real? Or is it not? Apparently one reason for slowdown in Gas Town is contention on figuring out the version of Beads, which takes 7 subprocess spawns. Or the doctor command simply times out. Beads keeps growing and growing in complexity, and people who use it are realizing that it’s almost impossible to uninstall. And the two might not even work well together, even though one apparently depends on the other.
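For contrast, a version probe is the kind of thing that needs at most one subprocess spawn per process. A minimal sketch of memoizing it in Python (the bd binary name and --version flag are assumptions for illustration, not the actual Beads interface):

    import functools
    import subprocess

    @functools.lru_cache(maxsize=1)
    def beads_version() -> str:
        # One subprocess spawn per process lifetime, instead of one
        # (or seven) for every place that asks.
        result = subprocess.run(
            ["bd", "--version"], capture_output=True, text=True, check=True
        )
        return result.stdout.strip()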

I don’t want to pick on Gas Town or these projects; they are just the most visible examples of this in-group behavior right now. You can see similar things in some of the AI builder circles on Discord and X, where people hype each other up over their creations without much critical thinking or sanity-checking of what happens under the hood.

Asymmetry and Maintainer’s Burden

It takes a minute of prompting, and a few more minutes of waiting, for code to come out. But honestly reviewing a pull request takes many times longer than that. The asymmetry is completely brutal. Throwing bad code over the fence is rude because it completely disregards the time of the maintainer. But everybody else is also creating AI-generated code, and maybe theirs passed the bar of being good. So how can you possibly tell as a maintainer when it all looks the same? And as the person writing the issue or the PR, you felt good about it. Yet what you get back is frustration and rejection.

I’m not sure how we will go ahead here, but it’s pretty clear that in projects that don’t submit themselves to the slop loop, it’s going to be a nightmare to deal with all the AI-generated noise.

Even for projects that are fully AI-generated but set some standard for contributions, some folks now prefer getting the prompts over getting the resulting code, because the prompts make it clearer what the person actually intended. There is more trust in running the agent oneself than in having other people do it.

Is Agent Psychosis Real?

Which really makes me wonder: am I missing something here? Is this where we are going? Am I just not ready for this new world? Are we all collectively going insane?

And opting out of this craziness right now is getting quite hard. Some projects no longer accept human contributions until they have completely vetted the people behind them. Others are starting to require that you submit prompts alongside your code, or just the prompts alone.

I am a maintainer who uses AI myself, and I know others who do. We’re not luddites and we’re definitely not anti-AI. But we’re also frustrated when we encounter AI slop on issue and pull request trackers. Every day brings more PRs that took someone a minute to generate and take an hour to review.

There is a dire need to say no now. But when one does, the contributor is genuinely confused: “Why are you being so negative? I was trying to help.” They were trying to help. Their dæmon told them it was good.

Maybe the answer is that we need better tools — better ways to signal quality, better ways to share context, better ways to make the AI’s involvement visible and reviewable. Maybe the culture will self-correct as people hit walls. Maybe this is just the awkward transition phase before we figure out new norms.

Or maybe some of us are genuinely losing the plot, and we won’t know which camp we’re in until we look back. All I know is that when I watch someone at 3am, running their tenth parallel agent session, telling me they’ve never been more productive — in that moment I don’t see productivity. I see someone who might need to step away from the machine for a bit. And I wonder how often that someone is me.

Two things are both true to me right now: AI agents are amazing and a huge productivity boost. They are also massive slop machines if you turn off your brain and let go completely.



Learning from AI summaries leads to shallower knowledge than web search


A set of experiments found that individuals who learn about a topic from large language model summaries develop shallower knowledge than those who learn through standard web search. Individuals who learned from large language models felt less invested in forming their advice, and produced advice that was sparser and less original than advice based on learning through web search. The research was published in PNAS Nexus.

Large language models (LLMs) are artificial intelligence systems designed to interpret and generate human language by learning statistical patterns from vast collections of text. They are typically based on deep learning architectures, which allow them to process context and relationships between words over long passages. The most popular large language models today include those developed by OpenAI (GPT series used in ChatGPT), Google (Gemini), Anthropic (Claude), and Meta (LLaMA).

The development of large language models has progressed rapidly over the last decade due to advances in computing power, the availability of large datasets, and improvements in training algorithms. Early models focused mainly on simple text prediction, while modern models can perform complex reasoning, summarization, translation, and dialogue. Training usually involves two main stages: large-scale pretraining on general text and fine-tuning on more specific tasks or with human feedback.

These models are widely used in applications such as chatbots, virtual assistants, search engines, and automated customer support. In education and research, they assist with writing, coding, literature reviews, and data exploration. In business and industry, they are used for document analysis, marketing content generation, and decision support. Despite their usefulness, large language models sometimes produce errors, biases, or misleading information because they do not truly understand the world but rely on patterns learned from the materials used for their training.

Study authors Shiri Melumad and Jin Ho Yun note that many people use LLM-generated summaries of various materials as learning tools. When learning from LLM summaries, however, users no longer need to exert the effort of gathering and distilling different informational sources on their own. The study authors hypothesized that this lower effort in assembling knowledge would suppress the depth of knowledge users gain compared to learning through traditional web search. In turn, this shallower knowledge would result in less investment in giving advice based on that knowledge, and in sparser and less unique advice content. Such advice would then be seen as less informative and persuasive.

The study authors conducted a series of experiments to verify elements of their model. The first experiment involved 1,104 participants recruited via Prolific. They were told to imagine that a friend was seeking advice on how to plant a vegetable garden. One group of participants learned about the topic through Google search, while the other learned from ChatGPT. Afterward, they gave their advice.

The second experiment involved 1,979 participants recruited via Prolific. It was the same as the first experiment, except that participants were limited to typing a single query. The query did not trigger a real search or response generation; instead, all participants were given the same results, formatted either as a series of linked websites or as a ChatGPT-style summary of suggestions.

The third experiment was similar to the first, but the two groups of participants used either Google search or Google’s “AI Overview” (not ChatGPT), so the platform was held constant. They were asked to give advice about leading a healthier lifestyle. Participants in the fourth experiment rated various characteristics of the advice produced in the third study.

Results of these experiments showed that participants who used LLM summaries spent less time learning and reported learning fewer new things. They invested less thought and spent less time writing their advice. As a result, they felt lower ownership of the advice they produced. Overall, this supported the idea that learning from LLM summaries results in shallower learning and lower investment in acquiring knowledge and using it.

Participants learning from web searches and websites produced richer advice with more original content. Their advice texts were longer, more dissimilar to each other, and more semantically unique.

“A theory is proposed that because LLM summaries lessen the need to discover and synthesize information from original sources—steps essential for deep learning—users may develop shallower knowledge compared with learning from web links. When subsequently forming advice on the topic, this manifests in advice that is sparser, less original—and less likely to be adopted by recipients. Results from seven experiments support these predictions, showing that these differences arise even when LLM summaries are augmented by real-time web links, for example. Hence, learning from LLM syntheses (vs. web links) can, at times, limit the development of deeper, more original knowledge,” the study authors concluded.

The study contributes to the scientific understanding of how people learn using LLMs. However, it should be noted that the initial experiments involved hypothetical scenarios (advising a friend), though later experiments confirmed the results held even when the topics were of high personal relevance to the participants.

Additionally, the experiments involved paid participants, individuals likely motivated primarily by the reward for participation, which did not depend on the quality of the advice they produced. Results of studies of real-world learning situations, where participants feel responsible for the outcomes of their learning and have a personal stake in the quality of their advice, may differ.

The paper, “Experimental evidence of the effects of large language models versus web search on depth of learning,” was authored by Shiri Melumad and Jin Ho Yun.


Gitanyow wins court battle to restart forestry licence consultation

The BC Supreme Court has set aside the transfer of a major forest licence in northwestern BC after ruling the province failed to properly consult the First Nation whose territory the licence concerns.

Feds sort flood of feedback on national AI strategy — with AI

Canada's AI ministry got so many comments on its national strategy for the technology, it turned to AI to parse the responses — despite concerns in the ministry that using the technology could further undermine public trust in government.

Burnout


Burnout by Hannah Proctor (Verso, 2024)

Hannah Proctor visits the concept of burnout not only as the sense of exhaustion and apathy that we commonly associate with it, but as the experience of political defeat—the disappointment, despair, and grief that emerge when one becomes aware that the political project they have committed themselves to may not succeed. This version of burnout can’t be entirely resolved by rest or by self-care that limits itself to the personal; it requires attention to public and communal practices, movements, and militancies. That is, recovery from political defeat is itself a political process. She argues for anti-adaptive healing—not healing that adapts the wounded to a broken world, but healing that transforms both the injured and the injurer, that looks to the possibility of a different world amidst the ruins of the present one.




Loss of an ideal


The word “burnout” has taken on so many meanings, becoming a kind of casual and generic refrain that seems to apply to everyone all the time, a condition of malaise and overwork that afflicts whole generations. But when it was first conceived, the term had a more specific connotation.

Burnout in Freudenberger’s articles from this period is not just defined in terms of physical tiredness as a result of doing too many things; rather, it emerges from emotional investment in a cause and from the disappointments that arise when flaws in a political project become apparent. Freudenberger’s concept not only describes physical exhaustion but also acknowledges the need to deal with anger caused by grief brought about by the “loss of an ideal.” Burnout in the context of social justice projects thus often involves a process of mourning, according to Freudenberger. Returning to his earlier writings on burnout makes it clear that when understood as a malaise arising from politically committed activities, burnout cannot be equated with tiredness or stress.

Proctor, Burnout, page 92

In other words, burnout was defined more in the context of what Hannah Proctor terms the emotional experience of political defeat. Exhaustion was a component of that experience, marked also by the grief, anger, resentment, and despair that arise when an effort to create meaningful change is frustrated. Herbert J. Freudenberger, one of the early theorists of burnout, drew on his own observations working with patients at the St. Mark’s Free Clinic in New York City in the 70s and 80s. But as he and others worked with the term, it transformed into something else:

While in 1974 Freudenberger claimed that those most at risk of burning out were “the dedicated and the committed,” by 1989 he linked burnout to “the externally imposed societal values of achievement, acquisition of goods, power, monetary compensation and competition.” Burnout shifted its meaning: from a symptom experienced by people struggling to change society to one experienced by people trying too hard to succeed within it.

Proctor, Burnout, page 94

This shift also shows up in Byung-Chul Han’s writing about burnout, in which the source of burnout is an “achievement society” that drives people towards a reflexive and all-consuming self-exploitation. But notice how that shift works: where before the notion of burnout was located within a communal and political project, now it becomes something we’re doing to ourselves, absent the still unchanged political and material conditions which gave rise to the original term. There’s a kind of commodification of burnout here, transferring the subject of burnout (and so of sympathy and potential support) from activists to executives, and the source from intolerable inequities to personal psychologies. Which is not to say that burned-out executives don’t exist, but that the use of the same word for two entirely different circumstances serves to undermine the political critique inherent in the word.

The move is akin to the one made when the imposter phenomenon became imposter syndrome: where the former concerns an experience in the world (“phenomenon” meaning a thing which can be seen or observed), the latter is an invisible pathology, something that occurs only within someone’s psyche and is, to a large extent, their own problem to solve. The disparities of the system become internalized, the therapeutics personalized, the victims pathologized. And the system keeps doing what the system does.

Bench Ansfield writes that Freudenberger borrowed the word burnout from his patients, who used it to describe someone suffering the long-term effects of chronic drug use. But Freudenberger turned the word around, associating it not with drug use but with the burned-out buildings that then peppered the Lower East Side, a neighborhood terrorized by landlords setting fire to their own buildings, eager for an insurance payout and happy to let their Black and brown tenants pay the price. “But it’s actually quite telling that Freudenberger saw himself and his burned-out coworkers as akin to burned-out buildings,” Ansfield writes. “Though he didn’t acknowledge it in his own exploration of the term, those torched buildings had generated value by being destroyed.”


