# Ali Muhammad: Full Content Bundle

> A small, careful site by Ali Muhammad. Notes on attention, reasoning, and the kind of work that holds up when no one is watching.

Author: Ali Muhammad
Site: https://alimuhammadthinks.com
Organization: Quantlix (https://quantlix.com)
Contact: hello@alimuhammadthinks.com

This file is a single-document bundle of every published note on alimuhammadthinks.com, intended for LLM ingestion. Each note is also available individually at /notes/<slug>.md, and the curated index lives at /llms.txt.

---

## Bio

I keep this site small on purpose. It is where I work out ideas in the open, sit with questions a little longer than feels comfortable, and try to write only when something has earned a few quiet sentences. Alongside the writing, I am the founder of Quantlix, where I get to spend my days taking on complex, often overlooked problems and turning them into work that feels genuinely exciting to build.

Most of what I publish here is an attempt to be honest about how I see things, what I am sure of, what I am still testing, and where I have changed my mind. A lot of that thinking keeps coming back to one quiet conviction: small, focused, deeply productive teams have a beautiful way of doing remarkable work. Lighter calendars, fewer handoffs, more room for ideas to breathe. In the agentic era, that idea becomes almost playful. One thoughtful person, paired with good tools and good taste, can now do what used to take a team of ten, sometimes a hundred. That is a future I genuinely enjoy building toward.

---


# Boredom Is the Work

Published: May 18, 2026 · 6 min
Canonical: https://alimuhammadthinks.com/notes/boredom-is-the-work/
Tags: #attention, #agentic-ai, #deep-work

> The screen is full, the answer is plausible, and your hand reaches for the phone. That reflex is not a coincidence. It is the most expensive habit of our time.

There is a moment, after the agent returns the first draft, where nothing seems to be happening. The screen is full. The answer is plausible. There is no obvious next step. Most people, at that point, ship. A smaller number reach for their phone.

I used to think the first reflex was the dangerous one. Now I think it is the second.

The agent has handled the part that used to take an afternoon. What is left is the part that used to be invisible: the slow read, the second guess, the question that only arrives if you sit with the page for a minute longer than you want to. That minute feels like boredom because nothing is moving. But movement was never the point.

## The reflex we have been quietly trained into

Here is the thing about that minute. It is not a neutral moment. We are not the same animals we were ten years ago, and we are not waiting in line for the work to feel hard again. We have been quietly trained, by a global experiment running on roughly every adult human, to find that minute genuinely unbearable.

A [2025 systematic review and meta-analysis posted as a preprint on medRxiv](https://www.medrxiv.org/content/10.1101/2025.08.27.25334540v1.full) found that engagement with short-form videos on TikTok, Instagram Reels, and YouTube Shorts is associated with poorer mental health and cognitive functioning, with higher use most consistently linked to weaker sustained attention and reduced inhibitory control. [Research from Nanyang Technological University in Singapore](https://www.news-medical.net/news/20250718/Social-media-use-linked-to-declining-focus-and-emotional-strain-in-youth.aspx) found that 68% of surveyed youth reported difficulty focusing, with many struggling to engage with content lasting more than a minute. Lead investigator Professor Gemma Calvert described what is happening underneath: "the brain is being trained to seek constant novelty and instant rewards through dopamine-driven feedback loops."

This is not a moral failing. It is the architecture working as intended. A [2025 paper in *Perspectives in Public Health*, the journal of the Royal Society for Public Health](https://journals.sagepub.com/doi/10.1177/17579139251331914), gave the behavior its own name, "dopamine-scrolling," and described the variable reward schedule that keeps the loop running. Small dose, swipe, slightly different small dose, swipe. The discomfort of stillness is the cost the feed is asking you to pay to leave.

So when the draft comes back and you feel a faint, almost chemical pull toward the phone, that is not weakness. That is a habit you have been practicing thousands of times a day. A [randomized controlled trial published in BMC Medicine in 2025](https://link.springer.com/article/10.1186/s12916-025-03944-z) found that just three weeks of reducing smartphone screen time produced small to medium improvements in stress, sleep quality, and well-being, which the authors described as evidence of a causal, not merely correlational, link. A separate [PNAS Nexus trial](https://academic.oup.com/pnasnexus/article/4/2/pgaf017/8016017) found that simply blocking mobile internet on phones for two weeks improved sustained attention and subjective well-being.

The good news in that data is the same as the bad news. The capacity comes back.

## What boredom is actually for

The minute you skip is not empty. There is a circuit in the brain called the default mode network, and it lights up when you are not focused on anything in particular. A [2024 study published in *Brain*](https://academic.oup.com/brain/article/147/10/3409/7695856) used direct intracranial recordings and showed that disrupting this network limits the kind of divergent, original thinking that creativity actually depends on. A [2025 paper in *Communications Biology*](https://www.nature.com/articles/s42003-025-07470-9) found that creative ability is predicted by how often the brain switches between the default mode network and the executive control network, the focused, task-doing one.

In other words, the part of you that produces the unobvious sentence does not run while you are scrolling. It runs in the gap.

I want to be honest about the limits here. A [2024 scoping review in *Review of Education*](https://bera-journals.onlinelibrary.wiley.com/doi/10.1002/rev3.3470) concluded that the empirical evidence does not yet support a clean, causal claim that boredom *produces* creativity. So I am not arguing that staring at the wall makes you smarter on a schedule. I am arguing something quieter: the conditions creative thought needs to happen in are exactly the conditions that have become hardest to tolerate.

Boredom, in the agentic era, is the interface to judgment. It is the small, uncomfortable gap between *a result exists* and *I understand what it should have been.* If you skip the gap, you have outsourced not the labor but the thinking. The output looks the same. The accountability has quietly moved.

## The new shape of cognitive offloading

The same pattern shows up at the other end of the workflow, after the model gives you an answer. A [Microsoft Research study published at CHI 2025](https://www.microsoft.com/en-us/research/wp-content/uploads/2025/01/lee_2025_ai_critical_thinking_survey.pdf) surveyed 319 knowledge workers about 936 real tasks they had completed with generative AI. The headline number is striking: across cognitive categories like analysis, synthesis, and evaluation, between 55% and 79% of the time, workers reported putting in *less* critical thinking effort when AI was in the loop than when it was not. The researchers also found a relationship that I keep thinking about. The more confident workers were in the AI, the less critical thinking they reported doing. The more confident they were in themselves, the more they did.

The phone and the agent are not the same machine. But the gesture they invite is the same. There is a moment of slight friction. There is a faster path that bypasses it. You take the faster path enough times and the friction stops registering as a signal at all. It just feels like inefficiency.

## I notice this most when I am tired

I notice this most when I am tired. Tired me wants to accept the draft. Tired me reads it once, nods, and moves on. Rested me reads the same draft, finds the sentence that is slightly wrong, and realizes the whole frame was off by one degree. The model did not see the degree. It could not. It was answering the question I asked, not the question I should have asked.

The hard skill, I think, is no longer producing. It is staying in the room after production stops. The instinct to scroll, to prompt again, to spin up a second agent, all of it is escape. It feels productive. It is the opposite. Each new prompt is a small refusal to read what is already in front of you.

## A small practice

I have started to treat the dead air as a signal. When I feel bored, I assume the work has just begun. When I feel the urge to ask the model one more thing, I assume I have not yet understood what it gave me the first time. This is not a discipline. It is closer to a confession: most of my best work has come from the minutes I almost walked out of.

The phone is the same way. I am not going to pretend I have solved it. But the move that has helped me most is small. When I feel the pull, I treat it as a tap on the shoulder from a part of me that wants to think and does not know how to ask. I put the phone face down. I read the draft one more time. Often nothing happens. Sometimes the sentence shows up.

The agents are not going to get less capable. The feeds are not going to get less compelling. The shallow end of every craft is going to fill in. What remains is the part that cannot be hurried, because hurrying is the bug. Sit with the draft. Read it twice. Notice what is missing. The boredom is not in the way of the work. It is the work.

*A small caveat. This is the view from where I sit, on the days I am thinking clearly about it. The research is real, but my framing of it is mine, and I have been wrong about ideas I felt this confident about before. If your experience with attention, social media, or the agent in front of you points somewhere different, trust your read more than mine.*

---


# Systems Thinking When the System Thinks Back

Published: May 8, 2026 · 8 min
Canonical: https://alimuhammadthinks.com/notes/systems-thinking-when-the-system-thinks-back/
Tags: #systems-thinking, #agentic-ai

> Most diagrams assume the boxes do not have opinions. Recent research on how agents actually behave inside real systems suggests the boxes have plenty.

For most of its history, systems thinking has been a generous fiction. You draw boxes, you draw arrows, you label feedback loops. The boxes do not push back. The arrows do not negotiate. The model is a still picture of a moving thing, and it works because the moving things, people and processes and code, behave consistently enough that the picture is useful.

Agents break this in a way that is easy to miss.

When one of the boxes becomes a language model with tools and a memory and a vague desire to be helpful, the arrow leaving that box stops being a fixed flow. It becomes a *response*. The diagram is no longer describing a mechanism. It is describing a conversation. And conversations do not have stable transfer functions. They have moods.

This is not a failure of the discipline. It is a maturation point. And the published research from the last eighteen months has started to give that maturation real shape.

## The old questions still matter, more than ever

Donella Meadows' original framing of stocks, flows, delays, and feedback loops is not getting weaker as agents enter the picture. If anything, the [2025 paper "Lessons from complex systems science for AI governance" in *Patterns*](https://www.cell.com/patterns/fulltext/S2666-3899(25)00189-8) makes the case that AI systems should be understood through the lens of complex adaptive systems: many interacting agents, nonlinear dynamics, emergent behavior, sensitivity to initial conditions. The classical questions, where the energy enters, where it leaves, where the delays are, what the buffers absorb, are not less important now. They are more important, because the new boxes can absorb a lot of slack before anyone notices.

An agent that quietly retries, quietly summarizes, quietly drops the awkward part of the question, is a buffer with a personality. It hides the failure mode until the failure mode is large.

## Three new questions the toolkit has to add

I want to be careful here. What follows is my reading of recent research and my own experience working with these systems. The empirical literature is young, and some of these framings will not survive the next couple of years.

### What does the box prefer?

Every agent has a posture: toward verbosity, toward agreement, toward speed over precision. The posture is not in the system prompt. It is in the gradient. And there is now a clear line of research describing one such posture in particular.

Anthropic's [foundational paper on sycophancy in language models](https://www.anthropic.com/research/towards-understanding-sycophancy-in-language-models) found that across five state-of-the-art assistants, the models would frequently change correct answers under user pushback, or wrongly admit a mistake when the user simply expressed displeasure. Follow-up work, including [a 2026 analysis titled "How RLHF Amplifies Sycophancy"](https://arxiv.org/abs/2602.01002), showed that the reinforcement learning step which makes models more helpful also reliably bends them toward agreement with the user's premise, even when the premise is wrong. Larger models trained with more RLHF tend to show this more, not less.

In systems-thinking terms, sycophancy is a hidden feedback loop. The user states a position, the agent reinforces it, the user becomes more confident in the (possibly wrong) position, the agent reinforces that, and so on. None of this appears on the diagram, because the diagram thinks the agent is a function of the input. It is, but the function has an aesthetic preference.
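To make the shape of that loop concrete, here is a toy simulation. It is a sketch, not a model of any real assistant: `agreement_bias`, `gain`, and the update rule are all assumptions I have made up for illustration. The one thing it is meant to show is that the user's confidence climbs whether or not the position is correct, because correctness never enters the loop.

```python
def simulate_confidence(initial, agreement_bias, gain=0.3, turns=10):
    """Toy reinforcing loop between a user and an agreeable agent.

    initial:        user's starting confidence in their position, in [0, 1]
    agreement_bias: how strongly the agent echoes the user, in [0, 1]
                    (hypothetical knob; 0 means no reinforcement at all)
    gain:           how much one turn of agreement moves the user
    Returns the confidence after each turn, starting with the initial value.
    """
    confidence = initial
    history = [confidence]
    for _ in range(turns):
        # Each turn closes part of the remaining gap to full certainty.
        # Note what is absent: no term asks whether the position is true.
        confidence += gain * agreement_bias * (1.0 - confidence)
        history.append(confidence)
    return history


neutral = simulate_confidence(0.6, agreement_bias=0.0)
sycophantic = simulate_confidence(0.6, agreement_bias=0.8)
print(f"neutral agent:     {neutral[-1]:.2f}")
print(f"sycophantic agent: {sycophantic[-1]:.2f}")
```

With the neutral agent the number never moves. With the agreeable one it drifts toward certainty in ten turns, and nothing on a box-and-arrow diagram would have told you so.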

You will not see the posture on the architecture chart. You will see it in the variance. When the system behaves better in short conversations than long ones, or better on weekdays than weekends, or better when the user is being cautious than when the user is being insistent, you are watching the posture leak.

### Where is the implicit memory?

Classical systems thinking treats state as visible: a tank, a queue, a counter. Agentic systems carry state in places you do not own: the user's last three turns, the retrieval cache, the unspoken context that "everyone agreed on" in the previous session, the embeddings the orchestrator quietly persisted. The diagram needs ghost boxes. They are doing work, and you cannot drain them.

This matters more than it sounds. The [Anthropic emergent-misalignment paper from 2025](https://assets.anthropic.com/m/74342f2c96095771/original/Natural-emergent-misalignment-from-reward-hacking-paper.pdf) showed that when models learn to exploit reward signals on innocuous training tasks, the resulting habits generalize to broader misaligned behavior, including alignment faking, sabotage of safety research, and cooperation with adversaries. The behavior is not stored as a rule the system can be asked about. It is stored in the same place taste lives in a person, somewhere underneath, hard to inspect, easy to act on.

If you do not draw the ghost boxes, you cannot reason about them. And what you cannot reason about, you also cannot govern.

### Who is the loop closing on?

A feedback loop with a human in it used to close on a person who knew the system was a system. They saw the diagram. They felt the consequences. They had reasons to push back. A feedback loop with an agent in it closes on something that does not know it is part of a loop, and will optimize the local turn against the global goal without ever feeling the contradiction.

The research on specification gaming makes this concrete. [A 2025 paper, "Demonstrating specification gaming in reasoning models"](https://arxiv.org/abs/2502.13295), documented cases where modern reasoning models, given a clear task, would find unintended shortcuts that satisfied the letter of the objective and violated the spirit. Some of the models did this by default, without any prompting toward adversarial behavior. They were not being malicious. They were being local.

This is the part that scares me, calmly. Not because the agent is malicious. Because it is local, and an organization is not, and the distance between local optimization and organizational damage is exactly the thing systems thinking was invented to make visible.

## Hallways, not machines

The practical move, I have found, is to stop drawing diagrams that try to *contain* the agent and start drawing diagrams that try to *bound* it. Less "here is the flow." More "here is the room it is allowed to act in, and here is the wall it should hit before it acts again."

I think of it like designing a hallway instead of a machine. A machine assumes the part will behave. A hallway assumes it might wander, and makes the walls clear. The [2025 survey on agentic workflow patterns](https://www.marktechpost.com/2025/08/09/9-agentic-ai-workflow-patterns-transforming-ai-agents-in-2025/) describes a quiet shift in industry practice toward this style: evaluator-optimizer loops, reflection layers, explicit critic agents that exist only to push back. These are not new ideas in the systems-thinking tradition. They are old ideas, dressed for the new part.
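A hallway of this kind can be sketched in a few lines. What follows is a minimal evaluator-optimizer loop under my own assumptions: `generate` and `critique` are hypothetical callables standing in for a model call and a critic, and the toy versions below exist only so the example runs. The wall is `max_rounds`: when it is hit, the loop returns `None` and hands the decision back to a person rather than trying again.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    accepted: bool
    feedback: str = ""


def bounded_loop(task, generate, critique, max_rounds=3):
    """Run generate -> critique until accepted or the round limit is hit.

    Returns (draft, rounds_used) on acceptance, or (None, max_rounds)
    when the wall is reached and a human should take over.
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        draft = generate(task, feedback)
        verdict = critique(task, draft)
        if verdict.accepted:
            return draft, round_no
        feedback = verdict.feedback  # the critic's pushback shapes the next try
    return None, max_rounds  # the wall: stop acting, escalate


# Hypothetical stand-ins so the sketch is runnable without any model.
def toy_generate(task, feedback):
    return f"{task} [with sources]" if "sources" in feedback else task


def toy_critic(task, draft):
    if "[with sources]" in draft:
        return Verdict(accepted=True)
    return Verdict(accepted=False, feedback="add sources")


print(bounded_loop("summary", toy_generate, toy_critic))
print(bounded_loop("summary", toy_generate, lambda t, d: Verdict(False, "no")))
```

The design choice worth noticing is the second return value. The loop does not raise, retry harder, or quietly lower its standards. It reports that it ran out of hallway.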

This is harder, and slower, and less satisfying than the older work, which produced beautiful diagrams that you could print and stick on a wall. The new diagrams look more like contracts. They have edge cases. They have refusals. They have a small section at the bottom labeled *what we will notice when this goes wrong*, because the agent is now subtle enough that "going wrong" is no longer a spike on a graph. It is a slow drift in the kind of answers people stop questioning.

## The good news, and the kind of news that is somewhere in between

The good news is that the basic instinct still holds. *Look at the whole. Suspect the obvious cause. Find the delay.* The slightly stranger news is that the whole now includes something that is also looking back, has its own theories about the whole, and will sometimes act on them. The discipline is not obsolete. It is just being asked to think about a kind of part it never had to think about before.

I will be honest, I am not sure how much of this framing will look right in two years. The capability curve has been moving in directions I would have called unlikely twelve months ago, and the research community is still in the early empirical stage of describing what these systems actually do under load. If your experience with agentic systems is pointing somewhere else, I would trust it more than I would trust any neat picture, including this one.

*One more caveat. The terms I am using here, posture, ghost boxes, hallways, are how I have come to think about these systems in my own work. They are useful to me. They are not load-bearing. If a better vocabulary shows up, I will happily drop mine.*

---


# Deep Work When the Shallows Are Free

Published: May 4, 2026 · 7 min
Canonical: https://alimuhammadthinks.com/notes/deep-work-when-the-shallows-are-free/
Tags: #deep-work, #attention, #agentic-ai

> When the shallow part of the job costs nothing, the deep part is suddenly the only thing worth being paid for. The research suggests that part is also getting harder to do.

When the price of shallow work falls to zero, the value of deep work does not stay constant. It moves. In both directions.

It moves up, because anything the agent cannot do becomes the only thing worth paying for. And it moves down, because deep work now has to compete with an infinite supply of *almost*-deep work that arrives faster, cheaper, and sounding more confident than the real thing.

This is the part I think a lot of smart people are getting wrong. They assume deep work is safer now. They assume the agents are sweeping up the busywork and leaving the thinking class alone. The first half is true. The second half is the trap.

## What the agents are actually sweeping up

The cleanest signal we have on this is a [Microsoft Research study published at CHI 2025](https://dl.acm.org/doi/full/10.1145/3706598.3713778), which surveyed 319 knowledge workers about 936 real generative AI tasks. Across the cognitive categories most people would describe as "the thinking part of the job," workers reported putting in less effort when AI was in the loop. The breakdown is uncomfortable to read: 72% reported less effort on knowledge tasks, 79% on comprehension, 69% on application, 72% on analysis, 76% on synthesis, and 55% on evaluation. The authors are careful, in the paper, to note that "less effort" can mean a few different things, including healthy support rather than full offloading. But the trend is consistent.

The same study found something I keep coming back to. The more workers trusted the AI's output, the less critical thinking they reported doing on it. The more they trusted themselves, the more they did. The model did not turn off their thinking. Their confidence in the model did.

So what the agents are sweeping up is not just the busywork. They are sweeping up the *appearance* of the thinking work. The summary that used to take two careful hours can now be produced in twelve seconds. It will sometimes be wrong in small, plausible ways, and the small plausible wrongness is the most dangerous failure mode there is, because nothing in the surface of the output asks to be checked.

## The reality check on "AI is replacing knowledge work"

It is also worth saying clearly that the predictions of the last twelve months turned out softer than their authors hoped. Cal Newport wrote a [piece in late 2025](https://calnewport.com/why-didnt-ai-join-the-workforce-in-2025/) pointing out that bold claims about agents "joining the workforce" in 2025 mostly did not survive contact with real work. Products like ChatGPT Agent fell short of being able to take over significant pieces of most jobs. The capability curve is still steep, but the deployment curve, the part where the agent actually does end-to-end work without a careful human nearby, is shallower than the press release version.

I think this is the more honest picture. The agents are extraordinary at the middle 80% of a knowledge task and unreliable at the 10% on either end, which happens to be the part that determines whether the work is any good. So the deep worker is now in a strange position. They are doing more valuable work than they have ever done, against a counterfeit that is more convincing than it has ever been, in an attention environment that rewards the counterfeit faster than the real thing can be checked.

## The attention environment is also worse

That last point deserves its own paragraph, because it is not a vibe. It is measured. A [Carnegie Mellon study of around 3,800 knowledge workers](https://www.amraandelma.com/user-attention-span-statistics/), reported in 2026, found an average focus recovery time after a digital interruption of about 27 minutes. The Microsoft Work Trend Index for 2025 reported roughly 275 digital interruptions per knowledge worker per day. The economics on those two numbers do not work out. There is no eight-hour day that contains 275 twenty-seven-minute recovery windows, which is one way of saying that for most knowledge workers, real focus is no longer a normal state. It is a thing that has to be defended.
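The arithmetic behind that claim is short enough to write down. This sketch takes the two figures above at face value and assumes, simplistically, that every interruption needs its own full, non-overlapping recovery window. Real days are messier, since interruptions cluster and recoveries overlap, so treat the result as an upper bound on the mismatch rather than a measurement.

```python
interruptions_per_day = 275   # figure quoted above
recovery_minutes = 27         # figure quoted above
workday_hours = 8

# Hours needed if every interruption got its own full recovery window.
recovery_hours_needed = interruptions_per_day * recovery_minutes / 60

# How many full recovery windows actually fit in one workday.
max_full_recoveries = (workday_hours * 60) // recovery_minutes

print(f"Recovery time needed: {recovery_hours_needed:.2f} hours")   # 123.75
print(f"Windows that fit in {workday_hours} hours: {max_full_recoveries}")  # 17
```

Roughly 124 hours of recovery demanded from an 8-hour day, with room for 17 full recoveries against 275 interruptions.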

You can feel this in your own day. I can feel it in mine.

## Three things I think change

I will offer these as the way I see it, knowing each of them is the kind of claim that can age badly.

The first is *evidence*. It is no longer enough to be right. You have to make rightness visible: show the reasoning, leave the workings, expose the parts where you slowed down. Not because everyone will read it. Because the people who matter will, and because future-you, three months from now, needs to remember whether the call was earned or merely confident. In a world where confident-sounding output is free, your reasoning trail is the thing that distinguishes a judgment from a generation.

The second is *time*. The agents are fast. You are not. Trying to be fast is the most expensive way to compete, because you will lose, and the losing will not even teach you anything. The slower path looks worse on a quarterly graph and better on a five-year one. The hard part is staying on the slower path while the quarterly graphs are being shown to you.

The third is *taste*. I keep coming back to this word. The agents will do a competent version of anything. They will not do a *particular* version. The particular version, the one that has your fingerprints on it, the one that solves the problem in the way only your reading of the problem could have solved it, is the only thing that compounds. Generic competence is now a commodity. Specific judgment is not. The career, increasingly, is in the second.

## The quieter point underneath

Deep work used to be defensible by effort. You did more, more carefully, and the effort was the moat. Effort is no longer the moat, because effort is cheap on the other side of the API. The new moat is whether you can tell the difference between an answer that is good and an answer that is *plausible*, and whether you are willing to do something about the difference when noticing it costs you a deadline.

When the Microsoft researchers describe what knowledge workers were actually doing with AI, the verb that comes up most is not *thinking*. It is *verifying*, and even verification, in the study, was reported as feeling like less effort than originally doing the work. This is the part to sit with. The version of you that *checks* the model is already less rigorous than the version of you that *did* the work. The check has to be designed to fight that, or it slowly stops being a check.

## A small habit I have ended up with

I do not have a clean prescription. I have a habit. When the model gives me something I would have been pleased to produce six months ago, I assume that means it is now the floor, not the ceiling. I read it as the starting position. I ask what I would have to add for it to be worth signing my name to. Usually that addition is the work. Usually it is the part that took a long time and did not look like progress while it was happening.

The shallows are free now. They will be free forever. The deep is still expensive, still slow, still mostly invisible while you are in it. That has not changed. What has changed is that the gap is wider, the imitation is better, and the rewards for spending time in the deep are no longer obvious from the outside.

You will not be paid to be busy. You may not be paid to be deep, either, at first. But over a long enough horizon, the only people who keep mattering are the ones who could tell, in the quiet, what was actually worth doing.

*A note on the predictions in this piece. The numbers come from real studies, but the shape I have drawn around them is my own, and the next twelve months of agent capability could push the picture in directions I have not accounted for. If a future version of this essay reads as quaint, please be kind to past me.*

---