dewey 24 hours ago [-]
Building your AI agent "toolkit" is becoming the equivalent of the perfect "productivity" setup where you spend your time reading blog posts, watching YouTube videos telling you how to be productive and creating habits and rituals...only to be overtaken by a person with a simple paper list of tasks that they work through.
Plain Claude, ask it to write a plan, review plan, then tell it to execute still works the best in my experience.
obsidianbases1 24 hours ago [-]
Lots of money being made by luring people into this trap.
The reality is that if you actually know what you want, and can communicate it well (where the productivity app can be helpful), then you can do a lot with AI.
My experience is that most people don't actually know what they want. Or they don't understand what goes into what they want. Asking for a plan is a shortcut to gaining that understanding.
I asked Claude whether these elaborate words like "walk down the design tree" actually mean anything to the LLM and make a difference. The answer confirmed my gut feeling: You can just tell me to "be critical" and get mostly the same results.
Matt did incredible work teaching people TS, but this feels more like trying to create FOMO to sell snake oil and AI courses.
Rumudiez 3 hours ago [-]
I thought that was supposed to be “decision tree” but otherwise, totally agree the exact words don’t actually matter all that much in most instances. I copy-paste templated prompts and every now and then notice some baffling grammar on my side after the fact… claude doesn’t mind
Leynos 11 hours ago [-]
It feels to me that "walk down the design tree" has a specific meaning with respect to treating the design as a hierarchy (although whether that means BFS or DFS is still ambiguous). "Be critical" lacks that specificity.
stingraycharles 10 hours ago [-]
Yes but then it’s better to spell those instructions out explicitly, eg state facts, state ambiguities / assumptions, inspect codebase, challenge assumptions, etc.
This particular skill is not great.
ModernMech 7 hours ago [-]
Problem is they don’t know how to express themselves and many people, especially those interested in tech, don’t want to learn.
I can’t tell you how many times I have a CS student in my office for advising and they tell me they only want to take technical courses, because anything reading or writing or psychology or history based is “soft”, unrelated to their major, and a waste of their time.
I’ve spent years telling them critical reading and expressive writing skills are very important to being a functioning adult, but they insist what they need to know can only be found in the Engineering college.
abustamam 5 hours ago [-]
Much of my time at work is reading through quickly typed messages from my boss and understanding exactly what questions I need to ask in order to make it easy for him to answer clearly.
Engineers who lack soft skills cannot be effective in team environments.
thbb123 20 hours ago [-]
Or, as I like to put it: I need to activate my personal transformers on my inner embedding space to figure out what it is I really want. And still, quite often, I think in terms of the programming language I'm used to and the library I'm familiar with.
So, to really create something new that I care about, LLMs don't help much.
They are still useful for plenty of other tasks.
andrei_says_ 19 hours ago [-]
Bikeshedding seems to have shifted from code to LLMs which is a step further.
We used to have the very difficult task of producing working scalable maintainable code describing complex systems which do what we need them to do.
Now on top of it we have the difficult task of producing this code using constantly mutating complex nondeterministic systems.
We are the circus bear riding a bicycle on a high wire now being asked to also spin plates and juggle chainsaws.
Maybe singularity means that time sunk into managing LLMs is equal to time needed to manually code similar output in assembly or punch cards.
lanthissa 22 hours ago [-]
its not though if you're working in a massive codebase or on a distributed system that has many interconnected parts.
skills that teach the agent how to pipe data, build requests, trace them through a system and datasources, then update code based on those results are a step function improvement in development.
ai has fundamentally changed how productive i am working on a 10m line codebase, and i'd guess less than 5% of that is due to code gen thats intended to go to prod. Nearly all of it is the ability to rapidly build tools and toolchains to test and verify what i'm doing.
sillysaurusx 22 hours ago [-]
But... plain Claude does that. At least for my codebase, which is nowhere close to your 10m line. But we do processing on lots of data (~100TB) and Claude definitely builds one-off tools and scripts to analyze it, which works pretty great in my experience.
What sort of skills are you referring to?
FINDarkside 20 hours ago [-]
I think people are looking at skills the wrong way. It's not like they give Claude some kind of superpowers it couldn't have otherwise. Ideally you'll have Claude write the skills anyway. It's just a shortcut so you don't have to keep rewriting a prompt all over again and/or have Claude keep figuring out how to do the same thing repeatedly. You can save lots of time, tokens and manual guidance with well-thought-out skills.
Some people use these to "larp" some kind of different job roles etc and I don't think that's productive use of skills unless the prompts are truly exceptional.
abustamam 4 hours ago [-]
At work I use skills to maintain code consistency. We implemented a solid "model view viewmodel" architecture for a front-end app, because without any guardrails it was doing redundant data fetching and type casts and was just messy overall. Having an "mvvm" rule and skill that defines the boundaries keeps the LLM from writing a bunch of nonsense code that happens to work.
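As a sketch of what such a skill might look like (the name and boundary rules here are invented, not abustamam's actual setup), a SKILL.md is just markdown with YAML frontmatter:

```markdown
---
name: mvvm-boundaries
description: Use when adding or modifying front-end features, to keep model, view, and viewmodel layers separated.
---

# MVVM boundaries

- Views render props only: no data fetching and no type casts inside components.
- ViewModels own all data fetching and shape exactly the data the view needs.
- Models are plain types generated from the API schema; never cast them inline, extend the viewmodel instead.
- Before adding a new fetch, check whether an existing viewmodel already loads that data.
```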
dotancohen 4 hours ago [-]
This sounds great - skills to ensure that the code maintains proper separation of concerns and is packaged properly.
I'd love to know how this skill was phrased.
jmalicki 19 hours ago [-]
I have sometimes found "LARPing job roles" to be useful for expectations for the codebase.
Claude is kind of decent at doing "when in Rome" sort of stuff with your codebase, but it's nice to reinforce, and remind it how to deploy, what testing should be done before a PR, etc.
jmalicki 21 hours ago [-]
If you build up and save some of those scripts, skills help Claude remember how and when to use them.
Skills are crazy useful to tell Claude how to debug your particular project, especially when you have a library of useful scripts for doing so.
DontTrustOver25 3 hours ago [-]
[dead]
ijustlovemath 16 hours ago [-]
Even the most complex distributed systems can be understood with the context windows we have. Short of 1M+ loc, and even then you could use documentation to get a more succinct view of the whole thing.
zwaps 11 hours ago [-]
This really doesn’t pan out in practice if you work a lot with these models
And we also know why: effective context depends on input and task complexity. Our best guess right now is that effective context length is often between 100k and 200k for frontier, 1M NIAH-type models.
chatmasta 20 hours ago [-]
All I want is for my agent to save me time, and to become a _compounding_ multiplier for my output. As a PM, I mostly want to use it for demos and prototypes and ideation. And I need it to work with my fractured attention span and saturated meeting schedule, so compounding is critical.
I’m still new to this, but the first obvious inefficiency I see is that I’m repeating context between sessions, copying .md files around, and generally not gaining any efficiency between each interaction. My only priority right now is to eliminate this repetition so I can free up buffer space for the next repetition to be eliminated. And I don’t want to put any effort into this.
How are you guys organizing this sort of compounding context bank? I'm talking about basic information like "this is my job, these are the products I own, here's the most recent docs about them, here's how you use them, etc." I would love to point it to a few public docs sites and be done, but that's not the reality of PM work on relatively new/unstable products. I've got all sorts of docs, some duplicated, some outdated, some seemingly important but actually totally wrong… I can't just point the agent at my whole Drive and ask it to understand me.
Should I tell my agent to create or update a Skill file every time I find myself repeating the same context more than twice? Should I put the effort into gathering all the best quality docs into a single Drive folder and point it there? Should I make some hooks to update these files when new context appears?
tern 20 hours ago [-]
It's too early. People are trying all of the above. I use all of the above, specifically:
- A well-structured folder of markdown files that I constantly garden. Every sub-folder has a README. Every file has metadata in front-matter. I point new sessions at the entry point to this documentation. Constantly run agents that clean up dead references, update out-of-date information, etc. Build scripts that deterministically find broken links. It's an ongoing battle.
- A "continuation prompt" skill, that prompts the agent to collect all relevant context for another agent to continue
- Judicious usage of "memory"
- Structured systems made out of skills like GSD (Get Shit Done)
- Systems of "quality gate" hooks and test harnesses
For all of these, I have the agent set them up and manage them, but I've yet to find a context-management system that just works. I don't think we understand the "physics" of context management yet.
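The "scripts that deterministically find broken links" idea from the first bullet can be sketched in a few lines of Python; the regex and layout here are assumptions, not tern's actual tooling:

```python
# Scan a folder of markdown files and report relative links whose
# targets don't exist on disk. External URLs are skipped.
import re
from pathlib import Path

LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#]+)(?:#[^)]*)?\)")

def broken_links(root: Path) -> list[tuple[Path, str]]:
    """Return (file, target) pairs for relative links that don't resolve."""
    broken = []
    for md in root.rglob("*.md"):
        for target in LINK_RE.findall(md.read_text(encoding="utf-8")):
            if "://" in target:  # skip absolute URLs, only check local files
                continue
            if not (md.parent / target).exists():
                broken.append((md, target))
    return broken
```

Run from a build step or a pre-commit hook, this gives the deterministic check that an agent's "please fix the docs" pass can't.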
chatmasta 17 hours ago [-]
On your first point, one unexpected side effect I’ve noticed is that in an effort to offload my thinking to an agent, I often end up just doing the thinking myself. It’s a surprisingly effective antidote to writer’s block… a similar effect to journaling, and a good reason why people feel weird about sharing their prompts.
nzoschke 17 hours ago [-]
The best thing you can do is help build and maintain high quality docs.
Great docs help you, your agents, your team and your customers.
If you’re confused and the agent can’t figure it out reliably how can anyone?
Easier said than done of course. And harder now than ever if the products are rapidly changing from agentic coding too.
One of my only universal AGENTS.md rules is:
> Write the pull request title and description as customer facing release notes.
chatmasta 17 hours ago [-]
I’ve been thinking about this a lot. It’s obviously the ideal state of things. The challenge is that we’ve got existing docs frameworks and teams and inertia and unreleased features… and I don’t have time to wait for that when I’m trying to get something done today. Not to mention the trade off of writing in public vs. private.
One quick win I’ve thought could bridge this is updating our docs site to respond to `Accept: text/markdown` requests with the markdown version of the docs.
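A minimal sketch of that content-negotiation idea; the parsing here is simplified (no quality-value handling), and a real server would use a proper negotiation library:

```python
# Decide which docs representation to serve based on the Accept header:
# raw markdown source for agents that ask for it, rendered HTML otherwise.
def pick_docs_format(accept_header: str) -> str:
    # Strip parameters like ";q=0.9" and whitespace from each media type.
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    if "text/markdown" in accepted:
        return "markdown"  # serve the raw .md source to the agent
    return "html"          # default: the rendered docs site
```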
makerdiety 18 hours ago [-]
Sounds like you need OpenClaw's assistance.
d4rkp4ttern 7 hours ago [-]
Agree. For what it's worth, in interviews Cherny (Claude Code creator) and Steinberger (OpenClaw creator) say they keep things simple and use none of the workflow frameworks. The latter said he doesn't even use plan mode, but I find it very useful: exiting plan mode starts clean with compressed context.
steveklabnik 5 hours ago [-]
They backed out the “clear context and execute plan” thing recently. It’s a bummer, I thought it was great.
gruez 4 hours ago [-]
Maybe they figured it wasn't needed with 1M context?
cornholio 21 hours ago [-]
Let me give you a counterexample. I'm working on a product for the national market, and I need to do all financial tasks (invoicing, submitting to the national fiscal database, etc.) through a local accounting firm. So I integrate their API in the backend; this is a 100% custom API developed by this small European firm, with a few dozen RESTful endpoints supporting various accounting operations, and I need to use it programmatically to maintain sync for legal compliance. No LLM has ever heard of it. It has a few hundred KB of HTML documentation that Claude can ingest perfectly fine and generate a curl command from, but I don't want to blow my token use and context on every interaction.
So I naturally felt the need to (tell Claude to) build a MCP for this accounting API, and now I ask it to do accounting tasks, and then it just does them. It's really ducking sweet.
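As a rough illustration of what wrapping such an API buys you: each documented endpoint becomes one fixed-shape call, so the agent invokes a named tool instead of re-deriving a curl command from hundreds of KB of docs each session. Everything here (base URL, endpoint name, fields) is invented, not the actual API:

```python
# Build an authenticated request for one hypothetical accounting endpoint.
# An MCP tool handler would construct and send exactly this, so the agent
# only supplies the payload, never the URL, auth, or encoding details.
import json
import urllib.request

API_BASE = "https://accounting.example.com/api"  # hypothetical

def build_request(endpoint: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but don't send) an authenticated JSON POST to one endpoint."""
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```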
Another thing I did: after a particularly grueling accounting month close-out, I told Claude to extract the general tasks that we accomplished and build a skill that does them at the end of the month, and now it's like having a junior accountant at my disposal: it just DOES the things a professional would charge me thousands for.
So both custom project MCPs and skills are super useful in my experience.
LuxBennu 7 hours ago [-]
this is exactly how i use it too. i have a few custom MCP servers running on a mac mini homelab, one for permission management, one for infra gateway stuff. the key thing i learned is keeping CLAUDE.md updated with what each MCP server actually does and what inputs it expects. otherwise claude code will either not use the tool when it should, or call it with wrong params and waste a bunch of back and forth. once you document it properly it really does feel like having a team member who just knows how your stack works. the accounting use case is a great example because nobody else's generic tooling would ever cover that.
zormino 20 hours ago [-]
That's what you should be doing. Start from plain Claude, then add on to it for your specific use cases where needed. Skills are fantastic if used this way. The problem is people adding hundreds or thousands of skills that they download and will never use, which just bloat the entire system and drown out the useful ones.
cornholio 11 hours ago [-]
Sure, it's basic use and nothing to flex about - was just responding specifically to the line that plan-review-implement is all you need.
Though, you get such a huge bang from customizing your config that I can easily see how you could go down that slippery slope.
mememememememo 20 hours ago [-]
Your use is maybe more vanilla than you think. I think you are just getting shit done. Which is good.
Claude with an MCP and a skill is still plain to me. Writing your own agent that connects to LLMs to try to be better than Claude Code, using Ralph loops and so on, is the rabbit hole.
endofreach 21 hours ago [-]
What exactly does it do that a professional would charge you thousands for?
(I'm genuinely asking)
cornholio 10 hours ago [-]
The basic problem is that the reporting and accounting rules are double plus bureaucratic and you need to have on hand multiple registers that show the financial situation at any time, submit them to the tax authority etc.
To give you a small taste: you need to issue an electronic invoice for each unique customer and submit it on the fly to the tax authority, but these need to be correlated monthly with the money in your business bank account. The paid invoices don't just go into your bank account; they are disbursed from time to time by the payment processor, on random dates that don't sync with the accounting month, so at end of month you have to correlate precisely which invoices are paid or not. But wait, the card processor won't just send you the money in a lump sum: it will deduct from each payment some random fee determined by their internal formula, then, at the end of each month, add up all those deducted fees (even for payments that have not been paid out to you) and issue another invoice to you, which you need to account for in your books as being partially paid each month (from the fees deducted from payments already disbursed). You also have other payment channels, each with their own fees, etc. So I need to balance this whole overlapping-intervals mess with all sorts of edge cases, chargebacks and manual interventions I refuse to think about again.
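As a toy version of the correlation problem described above, matching one processor payout against invoices net of per-payment fees might look like this (numbers, field names, and the greedy strategy are invented for illustration; real reconciliation also has chargebacks, partial disbursements and rounding to deal with):

```python
# Given a lump-sum payout and a list of open invoices, figure out which
# invoices it covers, knowing the processor forwards each payment minus
# a per-payment fee.
def match_payout(payout: float, invoices: list[dict], fee_rate: float) -> list[str]:
    """Greedily mark invoices as paid, oldest first, until the payout
    (net of per-payment fees) is exhausted."""
    remaining = payout
    paid = []
    for inv in invoices:                     # assumed sorted by date
        net = inv["total"] * (1 - fee_rate)  # what the processor forwards
        if net <= remaining + 1e-9:          # small tolerance for float error
            remaining -= net
            paid.append(inv["id"])
    return paid
```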
This is one example, but there are also issues with wages and their taxation, random tax law changes in the middle of the month, etc. The accountant can of course solve all this for you, but once you hit a few hundred invoices per month (if you sell relatively cheap services) you are considered a "medium" business, so instead of paying less than 100€ per month for basic accounting services (having the certified accountant look over your books and sign them, as required by law), you will need more expensive packages, which definitely add up to thousands within a few months.
Go be an entrepreneur, they said.
thisrobot 22 hours ago [-]
This resonates with me. Sometimes I build up some artifacts within the context of a task, but these almost always get thrown away. There are primarily three reasons I prefer a vanilla setup.
1. I have many and sometimes contradictory workflows: exploration, prototyping, bug fixing, debugging, feature work, PR management, etc. When I'm prototyping, I want reward hacking, I don't care about tests or lints, and it's the exact opposite when I manage PRs.
2. I see hard-to-explain and hard-to-quantify problems with over-configuration. The quality goes down, it loses track faster, it gets caught in loops. This is totally anecdotal, but I've seen it across a number of projects. My hypothesis is that it's related to attention: specifically, since these get added to the system prompt, they pull the distribution by constantly being attended to.
3. The models keep getting better. Similar to 2, sometimes model gains are canceled out by previously necessary instructions. I hear the Anthropic folks clear their claude.md every 30 days or so to alleviate this.
adshotco 21 hours ago [-]
[flagged]
ctoth 23 hours ago [-]
> Plain Claude, ask it to write a plan, review plan, then tell it to execute still works the best in my experience.
Working on an unspecified codebase of unknown size using unconfigured tooling with unstated goals found that less configuration worked better than more.
pfortuny 24 hours ago [-]
Emacs init file bikeshedding comes to mind…
wilkystyle 23 hours ago [-]
but now you can build your AI agent toolkit to work on your init file for you
dotancohen 22 hours ago [-]
My init.el file went from some 300 lines to under 50 with Claude's assistance. Some of that had to do with updating Emacs, but I really only use Emacs for Org mode so that contribution was minimal.
EdwardDiego 9 hours ago [-]
I've put some stuff in my global Claude.md to avoid things like...
* Claude trying to install packages into my Python system interpreter - (always use uv and venvs)
* Claude pushing to main - (don't push to main ever)
* When creating a PR, completely ignoring how to contribute (always read CONTRIBUTING.md when creating a PR)
* Yellow ANSI text in console output - (Color choices must be visible on both dark and light backgrounds)
Because I got sick of repeating myself about the basics.
pdantix 16 hours ago [-]
at work i've spent some time setting up our claude.md files and curated the .claude directory with relevant tools such as linear, figma, sentry, LSP, browser testing. sensible stuff anyone using these tools would want, it all works pretty well.
my only machine-specific config is overriding haiku usage with sonnet in claude code. i outline what i want in linear, have claude synthesize into a plan and we iterate until we're both happy, then i let it rip. works great.
then one of my juniors goes and loads up things like "superpowers" and all sorts of stuff that's started littering his PRs. i'm just not convinced this ricing of agents materially improves anything.
kelnos 10 hours ago [-]
This is what I do; frankly I can't be arsed to take the time to write all these commands and skills and whatnot. I did use /init to get Claude to create a CLAUDE.md file, and I occasionally -- very occasionally -- go through it and correct anything that's no longer valid due to code changes (and then ask Claude to do the same).
But beyond that, I just ask it for what I want done, and that's it. I'm not convinced that putting more time into building the "toolbox" will actually give me significant returns on that time.
I do think that some of this (commands, skills, breaking up CLAUDE.md into separate rules files) can be useful, but it's highly context-dependent, and I think YAGNI applies here: don't front-load this work. Only set those up if you run into specific problems or situations where you think doing this work will make Claude work better.
hypercube33 19 hours ago [-]
Understandable - I find skills for odd duck things and a simple set of rules you routinely prune work the best for me. Went from crappy code in niche projects to it nailing things first prompt almost every time now.
ljm 18 hours ago [-]
I heavily advocate for rawdogging AI agents.
All the fancy frameworks are vibe coded, so why would they do better than something you build yourself?
At most get playwright MCP in so the agent can see the rendered output
jp57 20 hours ago [-]
This. At work I have described this phenomenon as the equivalent of tinkering with the margins and fonts in your word processor instead of just writing your paper.
pjm331 19 hours ago [-]
I've had the same thought recently and this definitely is a thing that you can do - but there are also cases where you get dramatically better results if you put some more effort into your setup.
e.g. spend time creating a skill about how to query production logs
sockgrant 23 hours ago [-]
if you work on platforms, frameworks, tools that are public knowledge, then yeah. If there’s nothing unique to your project or how to write code in it, build it, deploy it, operate it, yeah.
But for some projects there will be things Claude doesn’t know about, or things that you repeatedly want done a specific way and don’t want to type it in every prompt.
I’m seeing this more and more, where people build this artificial wall you supposedly need to climb to try agentic coding. That’s not the right way to start at all. You should start with a fresh .claude, empty AGENTS.md, zero skills and MCP and learn to operate the thing first.
gck1 1 days ago [-]
I'd also go even further and say that you likely should never install ANY skill that you didn't create yourself (guiding Claude to create it for you counts too) or "fork" from an existing one, pulling in only what you need.
Everyone's workflow is different and nobody knows which workflow is the right one. If you turn your harness into a junk drawer of random skills that get auto updated, you introduce yet another layer of nondeterminism into it, and also blow up your context window.
The only skill you should probably install instead of maintaining it yourself is playwright-cli, but that's pretty much it.
dance2die 1 days ago [-]
> I'd also go even further and say that you likely should never install ANY skill that you didn't create yourself
Ignore my original comment below; the post is technical, and so is the parent comment, so it's aimed at techies.
---
That applies to tech users only.
Non-tech users starting to use Claude Code won't care how the job gets done.
Claude introduced skills to bring more non-tech users to the CLI, as a good way to get their feet wet.
Not everyone will go for such minute tweaks.
JohnMakin 24 hours ago [-]
what? non techies are most at risk. There are a huge number of malicious skills. Not knowing or caring how to spot malicious behavior doesn’t mean someone shouldn’t be concerned about it, no matter how much they can’t or don’t want to do it.
I am an administrator of this stuff at my company and it's an absolute effing nightmare devising policies that protect people from themselves. If I heard this come out of the mouth of someone underneath me, I'd tell them to leave the room before I have a stroke.
And this is stuff like, if so and so’s machine is compromised, it could cost the company massive sums of money. for your personal use, fine, but hearing this cavalier attitude like it doesn’t matter is horrifying, because it absolutely does in a lot of contexts.
gck1 14 hours ago [-]
I run a small local non-profit which is essentially a security hardening guide with some helper tooling that simplifies some concepts for non-techies (FDE, MFA, password managers, etc.).
LLMs have completely killed my motivation to continue running it. None of the standard practices apply anymore.
kccqzy 17 hours ago [-]
My company simply bans Claude code for all non-technical users. They can only use the chatbot from the web UI.
claytonia 13 hours ago [-]
[flagged]
hombre_fatal 1 days ago [-]
I had an issue with playwright MCP where only one Claude Code instance could be using it at a time, so I switched to Claude's built-in /chrome MCP.
In practice, I also find it more useful that the Chrome MCP uses my current profile since I might want Claude to look at some page I'm already logged in to.
I'm not very sophisticated here though. I mainly use the browser MCP to get around the fact that ~30% of servers block agent traffic, like Apple's documentation does.
silverwind 13 hours ago [-]
Would love it if there were a way to parallelize playwright MCP using multiple agents and such, but it seems it's a fundamental limitation of that MCP that only one instance/tab can be controlled.
Chrome MCP is much slower and by default pretty much unusable because Claude seems to prefer to read state from screenshots. Also, no Firefox/Safari support means no cross-browser testing.
I was using the built-in chrome skill but it was too unreliable for me. So I switched to playwright cli and I can also have it use firefox to get help debugging browser-specific issues.
s900mhz 1 days ago [-]
Yes this is the path I’m taking. Experiment, build your own toolbox whether it’s hand rolled skills or particular skills you pull out from other public repos. Then maintain your own set.
You do not want to log in one day to find your favorite workflow has changed via updates.
Then again this is all personal preference as well.
smj-edison 16 hours ago [-]
I use vanilla Claude Code, and I've never looked that much into skills, so I'm curious: how do you know when it's time to add a new skill?
solidasparagus 14 hours ago [-]
I use them for repeated problems or workflows I encounter when running with the defaults. If I find myself needing to repeat myself about a certain thing a lot, I put it into claude.md. When that gets too big, or I want detailed token-heavy instructions that are only occasionally needed, I create a skill.
I also import skills or groups of skills like Superpowers (https://github.com/obra/superpowers) when I want to try out someone else's approach to claude code for a while.
gck1 14 hours ago [-]
You observe what it does to accomplish a particular task, and note any instances where it:
1. Had to consume context and turns by reading files, searching web, running several commands for what was otherwise a straightforward task
2. Used a tool that wasn't designed with agent usage in mind, which most of the time means the agent has to run tail, head, or grep on the output by re-running the same command.
Then you create a skill that teaches how to do this in fewer turns, possibly even adding custom scripts it can use as part of that skill.
You almost never need a skill per se; most models will figure things out themselves eventually. A skill is usually just an optimization technique.
Apart from this, you can also use it to teach your own protocols and conventions. For example, I have skills that teach Claude, Codex, and Gemini how to communicate between themselves using tmux with some helper scripts. And then another skill that tells it to do a code review using two models from two providers, synthesize the findings from both, and flag anything that both reported.
Although, I have abandoned the built-in skill system completely, instead using my own tmux wrapper that injects them using predefined triggers, but this is stepping into more advanced territory. Built in skill system will serve you well initially, and since skills are nothing but markdown files + maybe some scripts, you can migrate them easily into whatever you want later.
nunez 1 days ago [-]
This matters for big engineering teams who want to put _some_ kind of guardrails around Claude that they can scale out.
For example, I have a rule [^0] that instructs Claude to never start work until some pre-conditions are met. This works well, as it always seems to check these conditions before doing anything, every turn.
I can see security teams wanting to use this approach to feel more comfortable about devs doing things with agentic tools without worrying _as much_ about them wreaking havoc (or what they consider "havoc").
As well, as someone who's just _really_ getting started with agentic dev, spending time dumping how I work into rules helped Claude not do things I disapprove of, like not signing off commits with my GPG key.
That said, these rules will never be set in stone, at least not at first.
I'm also thinking on how we can put guardrails on Claude - but more around context changes. For example, if you go and change AGENTS.md, that affects every dev in the repo. How do we make sure that the change they made is actually beneficial? and thinking further, how do we check that it works on every tool/model used by devs in the repo? does the change stay stable over time?
nunez 12 hours ago [-]
Given the scope that AGENTS has, I would use PRs to test those changes and discuss them like any other large-impact area of the codebase (like configs).
If you wanted to be more “corporate” about it, then assuming that devs are using some enterprise wrapper around Claude or whatever, I would bake an instruction into the system prompt that ensures that AGENTS is only read from the main branch to force this convention.
This is harder to guarantee since these tools are non-deterministic.
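One deterministic way to get the "only read AGENTS.md from main" behavior is to have the wrapper pull the file out of git rather than the working tree; a sketch, assuming a checkout with a main branch:

```python
# Read AGENTS.md as committed on main, ignoring any local, unreviewed
# edits in the working tree. A wrapper could feed this into the agent's
# context instead of the on-disk file.
import subprocess

def agents_md_from_main(repo_dir: str) -> str:
    return subprocess.run(
        ["git", "-C", repo_dir, "show", "main:AGENTS.md"],
        capture_output=True, text=True, check=True,
    ).stdout
```

Unlike a system-prompt instruction, this can't be talked out of: uncommitted AGENTS.md changes simply never reach the model until they land on main via a PR.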
dominotw 23 hours ago [-]
NO EXCEPTIONS!!!!!!!!!!!!!!!!!!!!!!!!
Cute that you think Claude gives a rat's ass about this.
nunez 12 hours ago [-]
Claude won’t do me wrong; that’s what the exclamation marks are for!
Aurornis 1 days ago [-]
This article isn't saying you must set up a big .claude folder before you start. It repeats several times that it's important to start small and keep it short.
It's also not targeted at first-timers getting their first taste of AI coding. It's a guide for how to use these tools to deal with frustrations you will inevitably encounter with AI coding.
Though really, many of the complaints about AI coding on HN are written by beginners who would also benefit from a simple .claude configuration that includes their preferences and some guidelines. A frequent complaint from people who do drive-by tests of AI coding tools before giving up is that the tools aren't reading their mind or the tools keep doing things the user doesn't want. Putting a couple lines into AGENTS.md or the .claude folder can fix many of those problems quickly.
jameshart 1 days ago [-]
Yes, but as soon as you start checking in and sharing access to a project with other developers these things become shared.
Working out how to work on code on your own with agentic support is one thing. Working out how to work on it as a team where each developer is employing agentic tools is a whole different ballgame.
sceptic123 1 days ago [-]
But why is it different? Why does it need to be? I don't write code the same as other devs so why would/should I use AI the same?
Is this a hangover from when the tools were not as good?
lucideer 1 days ago [-]
I'd see this as being useful for two reasons:
1. Provision of optional tools: I may use an ai agent differently to all other devs on a team, but it seems useful for me to have access to the same set of project-specific commands, skills & MCP configs that my colleagues do. I amn't forced to use them but I can choose to on a case by case basis.
2. Guardrails: it seems sensible to define a small subset of things you want to dissuade everyone's agents from doing to your code. This is like the agentic extension of coding standards.
IanCal 1 days ago [-]
> I don't write code the same as other devs
Most people do, most people don’t have wildly different setups do they? I’d bet there’s a lot in common between how you write code and how your coworkers do.
benoau 1 days ago [-]
I bet there's a lot more consistency now that AI can factor in how things are being done and be guided on top of that too.
thierrydamiba 1 days ago [-]
[dead]
georgeburdell 1 days ago [-]
In my own group, agentic coding made sharing and collaboration go out the window because Claude will happily duplicate a bunch of code in a custom framework
mitchell_h 1 days ago [-]
In my AGENTS.md I have two lines in almost every single one:
- Under no condition should you use emojis.
- Before adding a new function, method, or class, scan the project codebase and attached frameworks to verify that something existing cannot be modified to fit the need.
himmi-01 1 days ago [-]
I'm curious about the token usage when it scans across multiple repositories to find similar methods. As our code grows so fast, is it sustainable?
xmprt 1 days ago [-]
I think the idea is that by creating these shared .claude files, you tell the agent how to develop for everyone and set shared standards for design patterns/architecture so that each user's agents aren't doing different things or duplicating effort.
abtinf 1 days ago [-]
Seriously, just use plan mode first and you get like 90% of the way there, with CC launching subagents that will generally do the right thing anyway.
IMHO most of this “customize your config to be more productive” stuff will go away within a year, obsoleted by improved models and harnesses.
Just like how all the lessons for how to use LLMs in code from 1-2 years ago are already long forgotten.
JohnMakin 22 hours ago [-]
I loved all the dumb prompt “hacks” back then like “try saying please”
imiric 21 hours ago [-]
Modern "skills" and Markdown formats of the day are no different than "save the kittens". All of these practices are promoted by influencers and adopted based on wishful thinking and anecdata.
JohnMakin 21 hours ago [-]
Uh, this couldn't be more false. I've implemented these from scratch at my company and rolled them out org-wide and I've yet to watch a youtube video and don't consume any influencers. Mostly by just using the tools and reading documentation - as any other technical tool.
Perhaps your blanket statement could be wrong, and I would encourage you to let your mind be a bit more open. The landscape here is not what it was 6 months ago. This is an undeniable fact that people are going to have to come to terms with pretty soon. I did not want to be in this spot, I was forced to out of necessity, because the stuff does work.
dinkumthinkum 2 hours ago [-]
To be fair, if you have never watched a YouTube video in your life then how can you say the OP was wrong about what influencers are peddling? Side note, have you ever seen that Onion article on the man that can't stop telling people he doesn't own a TV?
Great, so how do you know this stuff works? Did you evaluate it against other approaches? How do you know it's actually reliable?
The Vercel team had some interesting findings[1]:
> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it.
Others had different findings for commonly accepted practices[2], some you may have adopted from reading documentation, which surely didn't come from influencers.
And yet others swear by magical Markdown documents[3].
So... who is the ultimate authority on what actually works, and who is just cargo culting the trendy practice of the week? And how is any of this different from what was being done a few years ago?
Sorry, but from your first comment, I don’t particularly feel inclined to help you figure this out. I was just offering I’ve already deployed these things at a scale with success using many of the configuration options offered as documentation in the op here. this stuff isn’t some mystical blackbox, although you seem to think it is.
I measure the tooling success with a suite of small prompt tests performing repeatable tasks, measuring the success rate over time, educating the broader team, and providing my own tried and tested in the field skills that I’ve shared to similar successes to the broader teams. We’ve seen a huge increase in velocity and lower bug rate, which are also very easily measurable (and long evaluated stats) enough to put me in the position I am, which was not a reluctant one. You’re perfectly free to view my long history on this topic on this forum to see I am a complete skeptic on this topic, and wouldn’t be here unless I had to.
everyone is figuring this out still. There is no authority, I am my own authority on what I have seen work and what hasn’t. Feel free to take of that what you will. I just wanted to provide a counterpoint to your initial claim. I’m certainly not going to expose to a fine degree what has worked for my org and what hasn’t due to obvious reasons.
have a good day!
bonoboTP 22 hours ago [-]
2 months ago I built (with Claude) a quite advanced Python CLI script and Claude Skill that searches and filters the Claude logs to access information from other sessions or from the same session before context compaction. But today Claude Code has a builtin feature to search its logs and will readily do it when needed.
My point is, these custom things are often short lived band-aids, and may not be needed with better default harnesses or smarter future models.
nzoschke 17 hours ago [-]
This is very insightful thanks for sharing.
I’ve been developing and working on dev tools for more than 15 years. I’ve never seen things evolve so rapidly.
Experiment, have fun and get things done, but don’t get too sure or attached to your patches.
It’s very likely the models and harnesses will keep improving around the gaps you see.
I’ve seen most of my AGENTS.md directives and custom tools fade away too, as the agents get better and better at reading the code and running the tests and feeding back on themselves.
beyonddream 1 days ago [-]
.claude has become the new dotfiles. And what do people do when they want to start using dotfiles ? they copy other’s dotfiles and same is happening here :)
silverwind 12 hours ago [-]
.claude is likely to contain secrets and also garbage like caches, etc. If it is shared, it should only be partially shared.
freedomben 1 days ago [-]
I totally agree with you that this not the right way to start. But, in my experience, the more you use the tool the more of a "feel" you get for it, and knowing how all these different pieces work and line up can be quite useful (though certainly not mandatory). It's been immensely frustrating to me how difficult it is to find all this info with all the low-quality junk that is out there on the internet.
embedding-shape 1 days ago [-]
> all the low-quality junk that is out there on the internet.
Isn't this article just another one in that same drawer?
> What actually belongs in CLAUDE.md - Write: - Import conventions, naming patterns, error handling styles
Then just a few lines below:
> Don’t write: - Anything that belongs in a linter or formatter config
The article overall seems filled with internal inconsistencies, so I'm not sure this article is adding much beyond "This is what an LLM generated after I put the article title with some edits".
Fishkins 1 days ago [-]
I agree with most of this, with one important exception: you should have some form of sandboxing in place before running any local AI agent. The easiest way to do that is with .claude/settings.json[0].
This is important no matter how experienced you are, but arguably the most important when you don't know what you're doing.
0: or if you don't want to learn about that, you can use Claude Code Web
post-it 1 days ago [-]
The default sandboxing works fine for me. It asks before running any command, and I can whitelist directories for reading and non-compound commands.
The part about permissions with settings.json [0] is laughable. Are we really supposed to list all potential variations of harmful commands? In addition to `Bash(cat ./.env)`, we would also need to add `Bash(cat .env)`, `Bash(tail ./.env)`, `Bash(tail .env)`, `Bash(head ./.env)`, `Bash(sed '' ./.env)`, and countless others... while at the same time we allow something like `npm` to run?
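For reference, a sketch of the shape being discussed (patterns illustrative): settings.json permission rules also support `Read` rules, not just `Bash` ones, and a `Read` deny covers the file regardless of which shell command tries to cat/tail/head it:

```json
{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./secrets/**)"
    ],
    "allow": [
      "Bash(npm run test:*)"
    ]
  }
}
```

This sidesteps the command-variation problem for file reads, though it doesn't change the broader "security theater" point below.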
I know the deny list is only for automatically denying, and that non-explicitly allowed command will pause, waiting for user input confirmation. But still it reminds me of the rationale the author of the Pi harness [1] gave to explain why there will be no permission feature built-in in Pi (emphasis mine):
> If you look at the security measures in other coding agents, *they're mostly security theater*. As soon as your agent can write code and run code, it's pretty much game over. [...] If you're uncomfortable with full access, run pi inside a container or use a different tool if you need (faux) guardrails.
As you mentioned, this is a big feature of Claude Code Web (or Codex/Antigravity or whatever equivalent of other companies): they handle the sand-boxing.
Do people really run claude and other clis like this outside a container??
kelnos 10 hours ago [-]
Yes. I don't bother with that. I feel like the risk of Claude Code running amok is pretty low, and I don't have it do long-running tasks that exceeds my desire to monitor it. (Not because I'm worried about it breaking things, it's just I don't use the tool in that way.)
Fishkins 22 hours ago [-]
I'm sure most folks run Claude without isolation or sandboxing. It's a terrible idea, but even most professional software developers don't think much about security.
There are many decent options (cloud VMs, local VMs, Docker, the built-in sandboxing). My point is just that folks should research and set up at least one of them before running an agent.
kenforthewin 1 days ago [-]
Let's not fool ourselves here. If a security feature adds any amount of friction at all, and there's a simple way to disable it, users will choose to do so.
matheusmoreira 1 days ago [-]
How did you contain Claude Code? Did you virtualize it? I just set up a simple firejail script for it. Not completely sure if it's enough but it's at least something.
arcanemachiner 1 days ago [-]
The official Claude Code repo is configured use a devcontainer config:
You can download the devcontainer CLI and use it to start a Docker container with a working Claude Code install, simple firewall, etc. out of the box. (I believe this is how the VSCode extension works: It uses this repo to bootstrap the devcontainer).
this is true, but i think people are best off starting with SOME project that gives users an idea of how to organize and think about stuff. for me, this is gastown, and i now have what has gotta be the most custom gastown install out there. could not agree more that your ai experience must be that which you build for yourself, not a productized version that purports to magically agentize your life. i think this is the real genius of gastown— not how it works, but that it does work and yegge built it from his own mind. so i’ve taken the same lesson and run very, very far with it, while also going in a totally different direction in many ways. but it is a work of genius, and i respect the hell out of him for putting it out there.
butlike 1 days ago [-]
It's not as bucolic as this when trying to get an org on board. We're currently very open to using Claude, but the unknowns are still the unknowns, so the guardrails the `.claude` folder provides gives us comfort when gaining familiarity with the tool.
pgwhalen 1 days ago [-]
Who is building an artificial wall? Maybe I skimmed the post too fast, but it doesn't seem like this information is being presented as "you have to know/do this before you start agentic engineering", just "this is some stuff to know."
ymolodtsov 1 days ago [-]
Peter Steinberger himself says he's just chatting with AI instead of coming up with crazy coding workflows.
hmokiguess 1 days ago [-]
with Anthropic already starting to sell "Claude Certified Architect" exams and a "Partner Network Program", I think a lot of this stuff is around building a side industry on top of it unfortunately
stronglikedan 1 days ago [-]
Seems maybe you're just not the target audience for this article.
heliumtera 1 days ago [-]
Operate == me send https post and pray for the best
dietr1ch 1 days ago [-]
That's the goal, keep spending tokens and claim you are super productive because of it
dominotw 23 hours ago [-]
> empty AGENTS.md, zero skills
which is basically every setup because claude sucks at calling skills and forgets everything in claude.md within a few seconds.
girvo 14 hours ago [-]
Right? I laughed when I read this:
>If you tell Claude to always write tests before implementation, it will. If you say “never use console.log for error handling, always use the custom logger module,” it will respect that every time.
It just isn't true lol
silverwind 12 hours ago [-]
Yep, it regularly ignores CLAUDE.md files. It seems these files are not weighted high enough vs. the prompt.
cloverich 1 days ago [-]
Feels a little like this is generated and not based on experience. Claude.md should be short. TypeScript strict mode isn't a gotcha; it'll figure that out on its own easily, imo omit things like that. People put far too much stuff in claude, just a few lines and links to docs is all it needs. You can also @AGENTS.md and put everything there instead. Don't skills supersede commands? Subagents are good esp if you specify model, forked memory, linked skills, etc. Always ask what you can optimize after you see claude thrashing, then figure out how to encode that (or refactor your scripts or code choices).
Always separate plan from implementation and clear context between; it's the build-up of context that makes it bad ime.
esses 23 hours ago [-]
The intro paragraph sounds exactly like Claude’s phrasing. So much so that I couldn’t read the rest of the article because I assumed I could just ask Claude about the topic.
jmtulloss 14 hours ago [-]
Exactly this. If there is some nuance in the article vs what Claude can tell you, then that's worthwhile. This article is just generated with a specific prompt on style but very little content editing. What's the point? It's like posting the results of a Google search. The prompt would have been more interesting.
It's not against the rules to post AI slop here, and I don't necessarily think it should be. But I do wonder how we value written content going forward. There's value to taste and style and editing and all the other human things... there's very little value in the actual words themselves. We'll figure it out.
hombre_fatal 1 days ago [-]
> Don't skills supersede commands?
Don't skills sit in context while custom slash commands are only manually invoked?
The difference isn't clear to me, especially since, upon googling it right now, I see that skills can also be invoked with a /slash.
wonnage 23 hours ago [-]
They’re the same thing now except that skills can have some frontmatter to allow the agent to execute them automatically.
bushido 21 hours ago [-]
I keep seeing these posts, and here's the most interesting thing, for me.
I get the best results with the least number of skills and unnecessary configuration in place.
People are spending way too much time over-prescribing these documents, but AI is like a competent but nervous adult. The more you give it, the dumber it gets.
Synthetic7346 1 days ago [-]
I wish all model providers would converge on a standard set of files, so I could switch easily from Claude to Codex to Cursor to Opencode depending on the situation
embedding-shape 1 days ago [-]
Issue is that both harness and specific model matters a lot in what type of instruction works best, if you were to use Anthrophic's models together with the best way to do prompting with Codex and GPT models, you'd get a lot worse results compared to if you use GPT models with Codex, prompted in the way GPTs react best to them.
I don't think people realize exactly how important the specific prompts are, with the same prompt you'd get wildly different results for different models, and when you're iterating on a prompt (say for some processing), you'd do different changes depending on what model is being used.
freedomben 1 days ago [-]
Having experimented with soft-linking AGENTS.md into CLAUDE.md and GEMINI.md, this lines up well with my experience. I now just let each tool maintain its own files and don't try to combine them. If it's something like my custom "## Agent Instructions" then I just copy-pasta and it's not been hard, and since that section is mostly identical I just treat AGENTS.md as the canonical one and copy/paste any changes over to the others.
NomDePlum 21 hours ago [-]
[dead]
dbmikus 1 days ago [-]
Are there any good guides on how to write prompt files tailored to different agents?
Would also be interested in examples of a CLAUDE.md file that works well in Claude, but works poorly with Codex.
dhorthy 1 days ago [-]
I think one of the main examples that i saw in a swyx article a while back is that using the sort of ALL CAPS and *IMPORTANT* language that works decently with claude will actually detune the codex models and make them perform worse. I will see if I can find the post
flyingcircus3 22 hours ago [-]
Why would you settle for a guide when you can get a claude skill to do it for you?
Because that just does it for you, it doesn't help me understand how to write better prompts.
Actually, I can just read the skill with my own eyes and then I can also learn. So, thank you for sharing. It's interesting to read through what it suggests for different models - it fits for the ones I work with regularly, but there are many I don't know the strengths and weaknesses of.
b-karl 11 hours ago [-]
Cursor supports all the Claude file patterns, including plugins and marketplaces. We leverage that to support both Claude and Cursor with the same instructions and skills.
conception 23 hours ago [-]
You can just use a single agents.md and have claude.md use it. Then symlink or sync skills/etc between .folders. It’s not perfect but works.
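A minimal sketch of that setup (paths illustrative), demonstrated in a scratch directory:

```shell
# Work in a throwaway directory so nothing real is touched
dir=$(mktemp -d) && cd "$dir"

mkdir -p .agents/skills .claude
echo "# Shared agent instructions" > AGENTS.md

# CLAUDE.md is just a pointer at the canonical AGENTS.md
ln -sf AGENTS.md CLAUDE.md

# One skills directory, shared between tools via symlink
ln -sfn ../.agents/skills .claude/skills

cat CLAUDE.md   # same content as AGENTS.md
```

Edits to AGENTS.md are then picked up by both tools, at the cost of having to keep the symlinks in version control or a bootstrap script.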
mememememememo 20 hours ago [-]
It is early browser days. It is good they are not converging. That is how we got AJAX (now Fetch) and even JS.
0x500x79 1 days ago [-]
Agree, for now im using dotagents by sentry to handle a lot of this.
heliumtera 1 days ago [-]
And why would they ever let switch?
Synthetic7346 1 days ago [-]
Interoperability means that people could switch to them as well
dataviz1000 1 days ago [-]
Claude Fast has very good alternate documentation for this. [0] I don't understand the hate for defining .claude/. It is quite easy to have the main agent write the files. Then, rather than doing one-shot coding, iterate quickly by updating .claude/. I'm at the point where .claude/ makes copies of itself, performs the task, evaluates, and updates itself. I'm not coding code, I'm coding .claude/, which does everything else. This is also a mechanism for testing .claude, agents, and instructions, which would be useful for sharing and reuse in an organization.
I'm in absolute disbelief at the existence of this website. Idiocracy is at an all-time high.
dinkumthinkum 2 hours ago [-]
Yes. I think people that think like us are in a major minority, ironically. It's a sinking ship.
chrisweekly 22 hours ago [-]
Great link, thanks for sharing! Read and bookmarked it.
TLDR "CLAUDE.md isn't documentation for Claude to read - it's an operating system for Claude to run. Define behavior, delegate knowledge to skills, and build a system that improves itself over time."
giancarlostoro 1 days ago [-]
The real wall I never see people talking about: yes, you can tell Claude to update whatever file you want, but if it's .claude/INSTRUCTIONS.md or CLAUDE.md, you need to tell Claude to re-read those files. It wrote the contents, but it's not treating them as fresh instructions; it will run off whatever it read the last time it loaded that file, and if the file never existed, it will not know about it. I believe Claude puts those instructions in a very specific part of its context window.
saadn92 1 days ago [-]
The claim that "whatever you write in CLAUDE.md, Claude will follow" is doing a lot of heavy lifting. In practice CLAUDE.md is a suggestion, not a contract. Complex tasks and compaction will dilute the use of CLAUDE.md, especially once the context window runs out.
dgb23 1 days ago [-]
This is correct. All of these .md files are just blobs of text that the LLM matches against. They might increase the likelihood of something happening or not happening.
They look to me like people actually want to build deterministic workflows, but blobs of text are the wrong approach for that. The right tool is code that controls the agent through specific states and validates the tool calls step by step.
chrisweekly 22 hours ago [-]
That's not quite right. Claude treats certain md files very differently from others. See eg
If true, it is adding more weight. It is not infallible.
mememememememo 20 hours ago [-]
You can create a control loop that runs all tests and then runs a claude session to fix.
Ultimately you can't force claude to solve any problem but you could make it so constraints are kept.
A simple way is a git hook that runs all the deterministic things you care about.
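A minimal sketch of such a control loop (command names illustrative; the `claude -p` fixer assumes the CLI's non-interactive print mode):

```python
import subprocess

def run_checks(cmd):
    """Run one deterministic check command; return (ok, combined output)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def fix_loop(check_cmd, fix_cmd=None, max_rounds=3):
    """Re-run the checks; after each failure, hand the output to a fixer
    (e.g. a fresh claude session) and try again."""
    for _ in range(max_rounds):
        ok, output = run_checks(check_cmd)
        if ok:
            return True
        if fix_cmd is not None:
            # e.g. fix_cmd = ["claude", "-p", "fix the failing tests"]
            subprocess.run(fix_cmd, input=output, text=True)
    return False
```

The constraint lives in `check_cmd` (tests, linters, a git hook), which stays deterministic no matter what the agent does in between.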
philbitt 23 hours ago [-]
[dead]
robertfw 24 hours ago [-]
Yeah, the moment I saw this I knew this article was not going to be very useful.
Getting claude to follow your guidance files consistently is a bit maddening.
gigapotential 1 days ago [-]
Nice! Article didn't mention but ~/.claude/plans is where it stores plan md file when running in plan mode. I find it useful to open or backup plans from the directory.
brandnewideas 8 hours ago [-]
What a pointless, useless article. The shitty generated graphics are just the cherry on the cake.
I've been going heavily in the direction of globally configured MCP servers and composite agents with copilot, and just making my own MCP servers in most cases.
Then all I have to do is let the agents actually figure out how to accomplish what I ask of them, with the highly scoped set of tools and sub agents I give them.
I find this works phenomenally, because all the .agent.md file is, is a description of what the tools available are. Nothing more complex, no LARP instructions. Just a straightforward 'here's what you've got'.
And with agents able to delegate to sub agents, the workflow is self-directing.
Working with a specific build system? Vibe code an MCP server for it.
Making a tool of my own? MCP server for dev testing and later use by agents.
On the flipside, I find it very questionable what value skills and reusable prompts give. I would compare it to an architect playing a recording of themselves from weeks ago when talking to their developers. The models encode a lot of knowledge, they just need orientation, not badgering, at this point.
nzoschke 17 hours ago [-]
I’ve had success with this general approach too.
The best thing I’ve done so far is put GitHub behind an API proxy and reject pushes and pull requests that don’t meet a criteria, plus a descriptive error.
I find it forgets to read or follow skills a lot of the time, but it does always try to route around HTTP 400s when pushing up its work.
einrealist 11 hours ago [-]
So when Anthropic releases a new model that "breaks compatibility" with some Markdown files, do we call it "refactoring" to find (guess) the required changes to have the desired outcome again? Don't we create brittle specifications to fit a version of a model?
Leynos 11 hours ago [-]
Use evals
Coming soon, unit, behavioural and regression tests for your prompts and skills :P
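The joke aside, the core of a prompt regression test is tiny: fixed cases scored against whatever runner you pin. The runner is injected here (names illustrative), so the same cases can be replayed whenever the model or harness version changes:

```python
def eval_prompt(run_model, cases):
    """cases: list of (input, expected_substring) pairs.
    Returns the fraction of cases whose output contains the expected text."""
    passed = sum(expected in run_model(prompt) for prompt, expected in cases)
    return passed / len(cases)

# Usage with a stand-in runner; in practice run_model would call the agent.
rate = eval_prompt(str.upper, [("abc", "ABC"), ("def", "XYZ")])
```

Tracking that rate over time is what turns "it feels smarter" into a number you can compare across model releases.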
stingraycharles 11 hours ago [-]
How do you use evals when you’re using Claude Code, given that Claude Code also changes their prompts all the time?
You’ll have:
* Claude model version
* Claude Code prompts and tools
* Your own prompts and skills and whatnot
* Your repository’s source code (= the input)
All of those change constantly, it’s not like it’s some kind of SWE benchmark.
manudaro 1 days ago [-]
The .claude folder structure reminds me of how Terraform organizes state files. Smart move putting conversation history in JSON rather than some proprietary format; it makes it trivial to grep through old conversations or build custom analysis tools.
unshavedyak 1 days ago [-]
Claude itself will grep through old conversations so it’s handy that Claude understands too
manudaro 19 hours ago [-]
Ha yeah that makes sense. Having the AI read its own conversation history in a format it already understands is a nice side effect of keeping it in plain JSON.
TheRoque 1 days ago [-]
So that's what "software engineering" has become nowadays? Some cargo cult, basically. Seriously, all of this raises red flags. No statements here are provable. It's just like LangChain, which was praised until everyone realized it's absolute dog water. Just like MCP too. The job in 2026 is really sad.
graypegg 1 days ago [-]
I think I'm finding a pretty good niche for myself honestly. IMO, software engineering is more so splitting into different professions based on the work it produces.
This sort of "prompt and pray" flow really works for people, as in they can make products and money; however, I do think the people who succeed today also would've reached for no-code tools 5 years ago and seen similar success. It's just faster and more comprehensive now. I think the general theme of the products remains the same though; not unimportant or worthless, but it tends to be software whose effects stay INSIDE the realm of software. I feel like there's always been a market for that, as it IS important, it's just not WORTH the time and money to the right people to "engineer" those tools. A lot of SaaS products filled that niche for many years.
While it's not a way I want to work, I am also becoming comfortable with respecting that as a different profession for producing a certain brand of software that does have value, and that I wasn't making before. The intersection of that is opportunity I'm missing out on; no fault to anyone taking it!
The software engineer that writes the air traffic avoidance system for a plane better take their job seriously, understand every change they make, and be able to maintain software indefinitely. People might not care a ton about how their sales tracking software is engineered, but they really care about the engineering of the airplane software.
sarchertech 1 days ago [-]
I think this is mostly right. The primary difference is that with no code you had to change platforms, but the Prompt and Pray method can be brought to bear on any software easily even the air traffic avoidance system.
It shouldn’t be, but it’s going to take some catastrophic events to convince people that we have to work to make sure we understand the systems we’re building and keep everything from devolving into vibe coded slop.
graypegg 1 days ago [-]
> the Prompt and Pray method can be brought to bear on any software easily even the air traffic avoidance system.
I guess that's why I see it as a separate profession, as in we have to actually profess a standard for how a professional in our field acts and believes. I think it's OK for it to bifurcate into two different fields, but Software Engineering would need to specifically reject prompt-and-pray on a principled and rational basis.
Sadly yes, that might require real cost to life in order to find out the "why" side of that rational basis. If you meet anyone that went to an engineering school in Québec, ask them about the ceremony they did and the ring they received. [0] It's not like that ceremony fixes anything, but it's a solemn declaration of responsibility which to me at least, sets a contract with society that says "we won't make things that harm you".
> [The] history of the 1907 failure of the Quebec City bridge, which was the inspiration for the Calling of an Engineer ceremony.
baal80spam 1 days ago [-]
> prompt and pray
This is a brilliant reimagining of the old and trusted PnP acronym.
dinkumthinkum 2 hours ago [-]
I would say, yes, it's pretty sad. The hypers are kind of gassing themselves up because they, unironically, think they are using LLMs in some special way and they are going to win. I think the industry is ramping up to speed-run into some Tai Lopez type situation.
donperignon 10 hours ago [-]
trial, pray, error, trial ... such a waste of energy and talent
sunir 1 days ago [-]
When was it not a cargo cult?
63stack 1 days ago [-]
The article starts off really weak:
>Claude Code users typically treat the .claude folder like a black box. They know it exists. They’ve seen it appear in their project root. But they’ve never opened it, let alone understood what every file inside it does.
I know we are living in a post-engineering world now, but you can't tell me that people don't look at PRs anymore, or their own diffs, at least until/if they decide to .gitignore .claude.
sunir 1 days ago [-]
You’re assuming most people using Claude code are senior engineers.
politelemon 1 days ago [-]
And that we're living in a post engineering world.
fogzen 1 days ago [-]
I don't. I have Claude do all my PR reviews, running in a daily loop in the morning. The truth is an LLM is better at code review than the average programmer.
I'm a senior engineer who has been shipping code since before GitHub and PR reviews were a thing. Thankfully LLMs have freed me from being asked to read other people's shit code for hours every day.
dinkumthinkum 2 hours ago [-]
It "freed" you. :) That's an interesting way to put it.
qiine 22 hours ago [-]
Reading the "AGENTS.md" files people write, sometimes, feels like reading "README(2).md"
mememememememo 20 hours ago [-]
echo "Read README.md" > AGENTS.md
phyzix5761 1 days ago [-]
Is there a completely free coding assistant agent that doesn't require you to give a credit card to use it?
I recently tried IntelliJ for Kotlin development and it wanted me to give a credit card for a 30 day trial. I just want something that scans my repo and I tell it the changes I want and it does it. If possible, it would also run the existing tests to make sure its changes don't break anything.
bityard 1 days ago [-]
There are lots! Too many to cover in a single HN comment, and this space is evolving rapidly so I encourage you to look around.
While the coding assistants are pretty much universally free, you still need to connect them to a model. The model tokens generally cost something once you've gone past a certain quota.
I'm not sure if this is still true, but if you have a Google account, Gemini Code Assist had a quite generous "free tier" that I used for a while and found it do be pretty decent.
I think this does a great job of explaining the .claude directories in a beginner friendly way. And I don’t necessarily read it as “you have to do all this, before you start”.
It has a few issues with outdated advice (e.g. commands have been merged with skills), but overall I might share it with co-workers who need an introduction to the concept.
rafaelmn 22 hours ago [-]
It's shocking how shitty the claude code CLI app is - config is brittle shit (setting up a plugin LSP means searching through GitHub issues and guessing which parameters you messed up), hooks render errors in the app when there are none, the permission harness is barely documented, and there are zero customization options (would you like the agent config to come from a different folder than the source root? nope). Going through GitHub issues, the same issue you hit has been open since the beginning of 2025 and ignored - their issue tracker is /dev/null - it's basically a user forum.
frizlab 1 days ago [-]
Completely tangential, but can we please stop putting one million files at the root of the project which have nothing to do with the project? Can we land on a convention like, idk, a `.meta` folder (not the meta company, the actual word), or whatever, in which all of these Claude.md, .swift-version, Code-of-Conduct.md, Codeowners, Contributing.md, .rubocop.yml, .editorconfig, etc. files would go??
flurdy 1 days ago [-]
I was glad when linux went with the .config standard for most dotfiles.
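The resolution rule behind that standard is tiny; a sketch of how an XDG-aware tool picks its config dir (the `mytool` name is hypothetical):

```shell
# XDG base directory lookup: honor $XDG_CONFIG_HOME if set, else fall back to ~/.config.
config_dir="${XDG_CONFIG_HOME:-$HOME/.config}/mytool"   # "mytool" is a made-up tool name
echo "$config_dir"
```

Tools that follow the spec all do some variant of this one-liner, which is why dotfiles can stay out of both `$HOME` and the project root.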
qiine 22 hours ago [-]
Now if only it would be respected more often!
pedropaulovc 23 hours ago [-]
Shameless plug, if you ever need to parse ~/.claude/projects use claude-code-types [1].
Here's a question that I hope is not too off-topic.
Do people find the nano-banana cartoon infographics to be helpful, or distracting? Personally, I'm starting to tire of seeing all the little cartoon people and the faux-hand-drawn images.
Wouldn't Tufte call this chartjunk?
push0ret 1 days ago [-]
I haven't come around any AI generated imagery in documents / slides that adds any value. It's more the opposite, they stand out like a sore thumb and often even reduce usability since text cannot be copied. Oh and don't get me started on leadership adding random AI generated images to their emails just to show that they use AI.
linux2647 1 days ago [-]
> Oh and don't get me started on leadership adding random AI generated images to their emails just to show that they use AI
Feels like generated AI art like this is modern clipart
GaggiX 1 days ago [-]
It may be survivorship bias, you only notice the AI ones that are bad.
pona-a 24 hours ago [-]
The problems are not visual but epistemic. If the author didn't specify enough to produce a useful chart, then it's going to be the diagram equivalent of stock images thrown on a finished presentation by a lazy intern. You can't rejection-sample away this kind of systemic fault.
The simple truth we're about to realize is there is no free lunch: a tool cannot inject more intent into a piece than its author put in. It might smooth out some blemishes or highlight some alternative choices, but it can't transform the input "make me a video game" into something greater than a statistical mishmash of the concept. And traditional tools of automation give you a much better, more precise interface for intent than natural language, which allows these vagaries.
spunker540 1 days ago [-]
Yeah there are almost certainly times when it is gen ai and you just didn’t notice.
elcapitan 1 days ago [-]
When I see AI images, I skip them, and most likely, the entire article. They're a better warning sign than the ones hidden in the text.
SV_BubbleTime 1 days ago [-]
Yeah, I’ve been considering this. They’re going to start removing em dashes, which currently is a surefire way to detect AI text.
Say they lose those and the emoji bullet points. It's going to be a lot harder to detect.
slopinthebag 1 days ago [-]
I don't actually look for em dashes or emojis as indicators, I can tell just from a few paragraphs if the pacing and flow is AI slop.
fny 1 days ago [-]
This is equivalent to "do people find PowerPoint to be helpful or distracting." Sometimes yes, mostly no.
In this case, I'd say helpful because I didn't have to read the article at all to understand what was being communicated.
matsemann 1 days ago [-]
I never trust them to actually be correct. Aka they're probably worse than useless.
randusername 1 days ago [-]
Tufte is evergreen. Zinsser is another.
> Clutter is the disease of American writing. We are a society strangling in unnecessary words, circular constructions, pompous frills and meaningless jargon.
> Look for the clutter in your writing and prune it ruthlessly. Be grateful for everything you can throw away. Reexamine each sentence you put on paper. Is every word doing new work? Can any thought be expressed with more economy?
On Writing Well (Zinsser)
freedomben 1 days ago [-]
Most of the time I find them distracting, and sometimes a huge negative on the article. In this particular article though, they're well done and relevant, and I think they add quite a bit. It's a highly personal opinion kind of thing though for sure.
SV_BubbleTime 1 days ago [-]
The first one is actually quite good.
Some of the others I don't feel added value, but I agree these are some of the best examples of a practice that typically doesn't add a ton of value.
browningstreet 1 days ago [-]
I think it's fine. As someone who blogged a lot, the instant visual differentiation among articles offered by the art within is actually valuable.
eitally 1 days ago [-]
I am a victim of AI-documentation-slop at work, and the result is that I've become far more "Tuftian" in my preferences than ever before. In the past, I was a fan of beautiful design and sometimes liked nice colors and ornaments. Now, though, I'm a fan of sparse design and relevant data (not information -- lots of information is useless slop). I want content that's useful and actionable, and the majority of the documents many of my peers create using Claude, Gemini or ChatGPT are fluffy broadsheets of irrelevant filler, rarely containing insights and calls-to-action.
So yes, it's chartjunk.
btucker 1 days ago [-]
It's not necessarily an AI-generated infographics issue, it's that these aren't good infographics. The graphic part is adding minimal value.
hrmtst93837 9 hours ago [-]
Bad infographics existed long before image models.
If the graphic still needs paragraphs to decode and doesn't let the reader pull out the key facts faster than plain text, it's not an infographic so much as cargo-cult design pasted on top of an explanation.
ramon156 1 days ago [-]
LinkedIn loves these, even if they're broken.
But they had already lost me at all the links, and the fact there's no red thread running through the entire article.
The first thing my eyes skimmed was:
> CLAUDE.md: Claude’s instruction manual
> This is the most important file in the entire system. When you start a Claude Code session, the first thing it reads is CLAUDE.md. It loads it straight into the system prompt and keeps it in mind for the entire conversation.
No it's not. Claude does not read this until it is relevant. And if it does, it's not the source of truth. So no, it's arguably not the most important file.
frotaur 1 days ago [-]
Are you certain? My understanding was that this is automatically injected in the context, and in my experience that's how it worked. I never see 'ReadFile(claude.md)', and yet claude is aware of some conventions I put in there.
hbarka 1 days ago [-]
They’re mistaken. CLAUDE.md is always loaded into context, along with system prompts and memory files.
“CLAUDE.md files are loaded into the context window at the start of every session”
SV_BubbleTime 1 days ago [-]
Maybe. But I kind of view LinkedIn as a social network for people who only by the grace of a couple better decisions are talking about real business and not multilevel marketing schemes… but otherwise use the same themes and terminologies.
Like mostly people who have confused luck and success, or business acumen for religion.
So I wouldn’t use LinkedIn as a positive data point of what’s hot.
simonw 1 days ago [-]
My eye has started skipping past them, even though they're often quite useful if you engage with them.
I think the problem is that they're uninformative slop often enough that I've subconsciously determined they aren't worth risking attention time on.
heliumtera 1 days ago [-]
No. It adds nothing so nothing is preferred
mark_l_watson 24 hours ago [-]
Off topic but earlier today I asked Gemini to read this article and advise how to do the same things for OpenCode. I am fascinated with trying to get good performance from small local models.
forgotusername6 1 days ago [-]
If these different agents could agree on a standard location that would be great. The specs are almost the same for .github and Claude but Claude won't even look at the .github location.
dmix 1 days ago [-]
There already is, it's ~/.agents and you use symlinks for .claude, and the dir structure is pretty similar and anything you want to reuse across models is pretty standardized, just not formalized.
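That layout might look like this, assuming the tool only ever looks in `~/.claude` (paths as described above):

```shell
# One shared agent config dir; tool-specific names are just symlinks into it.
mkdir -p "$HOME/.agents"
ln -sfn "$HOME/.agents" "$HOME/.claude"   # Claude Code reads ~/.claude
```

Add one more `ln -sfn` per tool as needed; anything tool-specific can still live in a real subdirectory next to the shared files.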
arvindrajnaidu 23 hours ago [-]
Is this the best way to do things? If the idea is to simply compose a string to add to the input? Maybe it is.
BoredPositron 1 days ago [-]
Alchemy.
einrealist 10 hours ago [-]
"Vibe prompting"
persedes 1 days ago [-]
huh neat, somehow completely missed out on the rules/ + path filters as a way to extend CLAUDE.md
lukebechtel 18 hours ago [-]
~/.claude/projects is where the real fun is :)
jwilliams 1 days ago [-]
> Simply put: whatever you write in CLAUDE.md, Claude will follow.
No.
CLAUDE.md is just prompt text. Compaction rewrites prompt text.
If it matters, enforce it in other ways.
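One concrete "other way" is a hook, which runs deterministically regardless of compaction. A sketch of a PreToolUse hook in `.claude/settings.json` (shape follows the documented hooks schema; the guard script path is hypothetical):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "$CLAUDE_PROJECT_DIR/scripts/block-main-push.sh" }
        ]
      }
    ]
  }
}
```

The script can inspect the proposed command on stdin and exit nonzero to block it - an enforcement path that doesn't depend on the model remembering anything.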
hbarka 1 days ago [-]
CLAUDE.md survives compaction.
jwilliams 1 days ago [-]
It's meant to, yes.
taormina 1 days ago [-]
Exactly!
adshotco 1 days ago [-]
[flagged]
quang1011 20 hours ago [-]
but why do I have to subscribe to reply when I've already subscribed to the other one? I get the impression this is a different page
Spixel_ 20 hours ago [-]
Are you lost, kitten?
nickphx 12 hours ago [-]
I look forward to the death of this hype machine.
gbrindisi 22 hours ago [-]
are agents/ still relevant after we got skills? I am genuinely confused on why I would need custom system prompts for specific agents, what should I use them for?
Aeroi 22 hours ago [-]
nice writeup. if you have good claude.md, .md files, .skills or mcp cli you want to monetize I built mog.md to let people/agents buy and sell these things.
rdevilla 1 days ago [-]
The fuck? What's next, configuring maven and pom.xml? At least XML is unambiguous, well specified, and doesn't randomly refuse to compile 2% of the time..
imranstrive7 6 hours ago [-]
how can i check this
sergiotapia 1 days ago [-]
In my experience fewer skills is significantly better.
When you have this performative folder of skills the AI wastes a bunch of tool calls, gets confused, doesn't get to the meat of the problem.
beware!
jonnycoder 24 hours ago [-]
Yea I went through my global claude skills and /context yesterday because claude was performing terribly. I deleted a bunch of stuff including memory and anecdotally got better results later on in the day.
Normal_gaussian 1 days ago [-]
Yeah skills get loaded into context which in effect pollutes context.
submeta 1 days ago [-]
Tangential: The image with the heading "Anatomy of the .claude/ folder" is nicely made. Does anyone know what tool was used for it?
phplovesong 12 hours ago [-]
AI agents like Claude are slowly moving into config hell, as we often see with deployment pipelines, project setup, etc. This is always a never-ending timesink, and because of AI it can and probably will need to be altered very frequently.
In the end it will still produce slop you need to review line by line.
The question is: do you want to write code you know and have verified works, or review unverified, junior-dev-quality code written by AI?
galoisscobi 1 days ago [-]
> Most people either write too much or too little. Here’s what works.
> Two folders, not one
Why post AI slop here?
rvz 1 days ago [-]
"Thinking" is about to get even harder to do for most grifters with newsletters to sell.
akalidz 1 days ago [-]
interesting
PetrBrzyBrzek 1 days ago [-]
Why is this AI slop article first on HN?
TacticalCoder 20 hours ago [-]
That sounds like serious AI slop to me:
> "The project-level folder holds team configuration. You commit it to git. Everyone on the team gets the same rules, the same custom commands, the same permission policies."
> "Most people either write too much or too little. Here’s what works."
It feels like I've been teleported into a recent LinkedIn feed. Do real people actually already write like AI or is it AI generated?
Plain Claude, ask it to write a plan, review plan, then tell it to execute still works the best in my experience.
The reality is that if you actually know what you want, and can communicate it well (where the productivity app can be helpful), then you can do a lot with AI.
My experience is that most people don't actually know what they want. Or they don't understand what goes into what they want. Asking for a plan is a shortcut to gaining that understanding.
This particular skill is not great.
I can’t tell you how many times I have a CS student in my office for advising and they tell me they only want to take technical courses, because anything reading or writing or psychology or history based is “soft”, unrelated to their major, and a waste of their time.
I’ve spent years telling them critical reading and expressive writing skills are very important to being a functioning adult, but they insist what they need to know can only be found in the Engineering college.
Engineers who lack soft skills cannot be effective in team environments.
So, to really create something new that I care about, LLMs don't help much.
They are still useful for plenty of other tasks.
We used to have the very difficult task of producing working scalable maintainable code describing complex systems which do what we need them to do.
Now on top of it we have the difficult task of producing this code using constantly mutating complex nondeterministic systems.
We are the circus bear riding a bicycle on a high wire now being asked to also spin plates and juggle chainsaws.
Maybe singularity means that time sunk into managing LLMs is equal to time needed to manually code similar output in assembly or punch cards.
skills that teach the agent how to pipe data, build requests, trace them through a system and datasources, then update code based on those results are a step function improvement in development.
ai has fundamentally changed how productive i am working on a 10m line codebase, and i'd guess less than 5% of that is due to code gen thats intended to go to prod. Nearly all of it is the ability to rapidly build tools and toolchains to test and verify what i'm doing.
What sort of skills are you referring to?
I'd love to know how this skill was phrased.
Claude is kind of decent at doing "when in Rome" sort of stuff with your codebase, but it's nice to reinforce, and remind it how to deploy, what testing should be done before a PR, etc.
Skills are crazy useful to tell Claude how to debug your particular project, especially when you have a library of useful scripts for doing so.
And we also know why: effective context depends on input and task complexity. Our best guess right now is that we often get between 100k and 200k of effective context length for frontier, 1M-context needle-in-a-haystack-type models
I’m still new to this, but the first obvious inefficiency I see is that I’m repeating context between sessions, copying .md files around, and generally not gaining any efficiency between each interaction. My only priority right now is to eliminate this repetition so I can free up buffer space for the next repetition to be eliminated. And I don’t want to put any effort into this.
How are you guys organizing this sort of compounding context bank? I’m talking about basic information like “this is my job, these are the products I own, here’s the most recent docs about them, here’s how you use them, etc.” I would love to point it to a few public docs sites and be done, but that’s not the reality of PM work on relatively new/unstable products. I’ve got all sorts of docs, some duplicated, some outdated, some seemingly important but actually totally wrong… I can’t just point the agent at my whole Drive and ask it to understand me.
Should I tell my agent to create or update a Skill file every time I find myself repeating the same context more than twice? Should I put the effort into gathering all the best quality docs into a single Drive folder and point it there? Should I make some hooks to update these files when new context appears?
- A well-structured folder of markdown files that I constantly garden. Every sub-folder has a README. Every files has metadata in front-matter. I point new sessions at the entry point to this documentation. Constantly run agents that clean up dead references, update out of date information, etc. Build scripts that deterministically find broken links. It's an ongoing battle.
- A "continuation prompt" skill, that prompts the agent to collect all relevant context for another agent to continue
- Judicious usage of "memory"
- Structured systems made out of skills like GSD (Get Shit Done)
- Systems of "quality gate" hooks and test harnesses
For all of these, I have the agent set them up and manage them, but I've yet to find a context-management system that just works. I don't think we understand the "physics" of context management yet.
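The "scripts that deterministically find broken links" step mentioned above can be tiny. A sketch (assumes relative links without spaces and doesn't check `#anchors`):

```shell
# Walk a markdown tree and report relative links whose targets don't exist.
check_links() {
  find "$1" -name '*.md' | while read -r f; do
    grep -oE '\]\([^)#]+' "$f" | sed 's/](//' | while read -r link; do
      case "$link" in http*|mailto:*|/*) continue ;; esac   # skip external/absolute
      [ -e "$(dirname "$f")/$link" ] || echo "broken: $f -> $link"
    done
  done
}
```

Run as `check_links docs` from the repo root; anything printed is a broken link, which an agent (or CI) can then fix or fail on.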
Great docs help you, your agents, your team and your customers.
If you’re confused and the agent can’t figure it out reliably how can anyone?
Easier said than done of course. And harder now than ever if the products are rapidly changing from agentic coding too.
One of my only universal AGENTS.md rules is:
> Write the pull request title and description as customer facing release notes.
One quick win I’ve thought could bridge this is updating our docs site to respond to `Accept: text/markdown` requests with the markdown version of the docs.
So I naturally felt the need to (tell Claude to) build a MCP for this accounting API, and now I ask it to do accounting tasks, and then it just does them. It's really ducking sweet.
Another thing I did was, after a particularly grueling accounting month close out, I've told Claude to extract the general tasks that we accomplished, and build a skill that does it at the end of the month, and now it's like having a junior accountant in at my disposal - it just DOES the things a professional would charge me thousands for.
So both custom project MCPs and skills are super useful in my experience.
Though, you get such a huge bang from customizing your config that I can easily see how you could go down that slippery slope.
Claude with an MCP and a skill is still plain to me. Writing your own agent connecting to LLMs to try to be better than Claude Code, using Ralph loops and so on, is the rabbit hole.
(I'm genuinely asking)
To give you a small taste: you need to issue an electronic invoice for each unique customer and submit it on the fly to the tax authority - but these need to be correlated monthly with the money in your business bank account. The paid invoices don't just go into your bank account; they are disbursed from time to time by the payment processor, on random dates that don't sync with the accounting month, so at end of month you have to correlate precisely which invoices are paid or not. But wait, the card processor won't just send you the money in a lump sum: it will deduct from each payment some random fee determined by their internal formula, then, at the end of each month, add up all those deducted fees (even for payments that have not been paid out to you) and issue another invoice to you, which you need to account for in your books as being partially paid each month (from the fees deducted from payments already disbursed). You also have other payment channels, each with their fees etc. So I need to balance this whole overlapping-intervals mess with all sorts of edge cases, chargebacks and manual interventions I refuse to think about again.
This is one example, but there are also issues with wages and their taxation, random tax law changes in the middle of the month etc. The accountant can of course solve all this for you, but once you go a few hundred invoices per month (if you sell relatively cheap services) you are considered a "medium" business, so instead of paying for basic accounting services less than 100€ per month (have the certified accountant look over your books and sign them, as required by law), you will need more expensive packages which definitely add up to thousands in a few months.
Go be an entrepreneur, they said.
1. I have many and sometimes contradictory workflows: exploration, prototyping, bug fixing debugging, feature work, pr management, etc. When I'm prototyping, I want reward hacking, I don't care about tests or lint's, and it's the exact opposite when I manage prs.
2. I see hard to explain and quantify problems with over configuration. The quality goes down, it loses track faster, it gets caught in loops. This is totally anecdotal, but I've seen it across a number of projects. My hypothesis is that is related to attention, specifically since these get added to the system prompt, they pull the distribution by constantly being attended to.
3. The models keep getting better. Similar to 2, sometime model gains are canceled out by previously necessary instructions. I hear the anthropic folks clear their claude.md every 30 days or so to alleviate this.
Working on an unspecified codebase of unknown size using unconfigured tooling with unstated goals found that less configuration worked better than more.
* Claude trying to install packages into my Python system interpreter - (always use uv and venvs)
* Claude pushing to main - (don't push to main ever)
* When creating a PR, completely ignoring how to contribute (always read CONTRIBUTING.md when creating a PR)
* Yellow ANSI text in console output - (Color choices must be visible on both dark and light backgrounds)
Because I got sick of repeating myself about the basics.
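Distilled into a CLAUDE.md fragment, that list might look like this (the wording is mine, not a canonical format):

```markdown
# CLAUDE.md
- Python: always use uv and a venv; never install into the system interpreter.
- Git: never push to main.
- PRs: read CONTRIBUTING.md before creating a PR.
- Console output: color choices must be visible on both dark and light backgrounds.
```

Short, unconditional rules like these are exactly the cases where a standing instruction beats repeating yourself every session.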
my only machine-specific config is overriding haiku usage with sonnet in claude code. i outline what i want in linear, have claude synthesize into a plan and we iterate until we're both happy, then i let it rip. works great.
then one of my juniors goes and loads up things like "superpowers" and all sorts of stuff that's started littering his PRs. i'm just not convinced this ricing of agents materially improves anything.
But beyond that, I just ask it what I want it to ask, and that's it. I'm not convinced that putting more time into building the "toolbox" will actually give me significant returns on that time.
I do think that some of this (commands, skills, breaking up CLAUDE.md into separate rules files) can be useful, but it's highly context-dependent, and I think YAGNI applies here: don't front-load this work. Only set those up if you run into specific problems or situations where you think doing this work will make Claude work better.
All the fancy frameworks are vibe coded, so why would they do better than something you build yourself?
At most get playwright MCP in so the agent can see the rendered output
e.g. spend time creating a skill about how to query production logs
But for some projects there will be things Claude doesn’t know about, or things that you repeatedly want done a specific way and don’t want to type it in every prompt.
Everyone's workflow is different and nobody knows which workflow is the right one. If you turn your harness into a junk drawer of random skills that get auto updated, you introduce yet another layer of nondeterminism into it, and also blow up your context window.
The only skill you should probably install instead of maintaining it yourself is playwright-cli, but that's pretty much it.
Ignore original comment below, as the post is technical so is the parent comment: for techies
---
That applies to tech users only.
Non-tech users starting out with Claude Code won't care about these tweaks; they just want the job done.
Claude introduced skills to bring more non-tech users to the CLI, as a good way to get your feet wet.
Not everyone will go for such minute tweaks.
I am an administrator of this stuff at my company and it’s an absolute effing nightmare devising policies that protect people from themselves. If I heard this come out of someone’s mouth underneath me I’d tell them to leave the room before I have a stroke.
And this is stuff like, if so and so’s machine is compromised, it could cost the company massive sums of money. for your personal use, fine, but hearing this cavalier attitude like it doesn’t matter is horrifying, because it absolutely does in a lot of contexts.
LLMs have completely killed my motivation to continue running it. None of the standard practices apply anymore
In practice, I also find it more useful that the Chrome MCP uses my current profile since I might want Claude to look at some page I'm already logged in to.
I'm not very sophisticated here though. I mainly use use browser MCP to get around the fact that 30% of servers block agent traffic like Apple's documentation.
Chrome MCP is much slower and by default pretty much unusable because Claude seems to prefer to read state from screenshots. Also, no Firefox/Safari support means no cross-browser testing.
There appears to be https://github.com/sumyapp/playwright-parallel-mcp which may be worth trying.
You do not want to log in one day to find your favorite workflow has changed via updates.
Then again this is all personal preference as well.
I also import skills or groups of skills like Superpowers (https://github.com/obra/superpowers) when I want to try out someone else's approach to claude code for a while.
1. Had to consume context and turns by reading files, searching web, running several commands for what was otherwise a straightforward task
2. Whatever tool it used wasn't designed with agent usage in mind. Which most of the time will mean agent has to do tail, head, grep on the output by re-running the same command.
Then you create a skill that teaches how to do this in fewer turns, possibly even adding custom scripts it can use as part of that skill.
You almost never need a skill per se, most models will figure things out themselves eventually, skill is usually just an optimization technique.
Apart from this, you can also use it to teach your own protocols and conventions. For example, I have skills that teach Claude, Codex, Gemini how to communicate between themselves using tmux with some helper scripts. And then another skill that tell it to do a code review using two models from two providers, synthesize findings from both and flag anything that both reported.
Although, I have abandoned the built-in skill system completely, instead using my own tmux wrapper that injects them using predefined triggers, but this is stepping into more advanced territory. Built in skill system will serve you well initially, and since skills are nothing but markdown files + maybe some scripts, you can migrate them easily into whatever you want later.
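A skill is just a directory with a `SKILL.md` - YAML front-matter plus instructions. A sketch of the "teach it to debug your project" kind described above, with hypothetical names and scripts:

```markdown
---
name: trace-request
description: Trace a request ID through service logs. Use when debugging cross-service failures.
---

1. Run `scripts/trace.sh <request-id>` (a wrapper over grep on structured logs).
2. Correlate timestamps across services; note any gap over one second.
3. Report the first service where the request disappears, with the surrounding log lines.
```

Because it is only markdown plus optional scripts, the same file ports to other harnesses with little or no change.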
For example, I have a rule [^0] that instructs Claude to never start work until some pre-conditions are met. This works well, as it always seems to check these conditions before doing anything, every turn.
I can see security teams wanting to use this approach to feel more comfortable about devs doing things with agentic tools without worrying _as much_ about them wreaking havoc (or what they consider "havoc").
As well, as someone who's just _really_ getting started with agentic dev, spending time dumping how I work into rules helped Claude not do things I disapprove of, like not signing off commits with my GPG key.
That said, these rules will never be set in stone, at least not at first.
[^0]: https://github.com/carlosonunez/bash-dotfiles/blob/main/ai/c...
If you wanted to be more “corporate” about it, then assuming that devs are using some enterprise wrapper around Claude or whatever, I would bake an instruction into the system prompt that ensures that AGENTS is only read from the main branch to force this convention.
This is harder to guarantee since these tools are non-deterministic.
cute that you think claude gives a rat's ass about this.
It's also not targeted at first-timers getting their first taste of AI coding. It's a guide for how to use these tools to deal with frustrations you will inevitably encounter with AI coding.
Though really, many of the complaints about AI coding on HN are written by beginners who would also benefit from a simple .claude configuration that includes their preferences and some guidelines. A frequent complaint from people who do drive-by tests of AI coding tools before giving up is that the tools aren't reading their mind or the tools keep doing things the user doesn't want. Putting a couple lines into AGENTS.md or the .claude folder can fix many of those problems quickly.
Working out how to work on code on your own with agentic support is one thing. Working out how to work on it as a team where each developer is employing agentic tools is a whole different ballgame.
Is this a hangover from when the tools were not as good?
1. Provision of optional tools: I may use an ai agent differently to all other devs on a team, but it seems useful for me to have access to the same set of project-specific commands, skills & MCP configs that my colleagues do. I'm not forced to use them but I can choose to on a case by case basis.
2. Guardrails: it seems sensible to define a small subset of things you want to dissuade everyone's agents from doing to your code. This is like the agentic extension of coding standards.
Most people do, most people don’t have wildly different setups do they? I’d bet there’s a lot in common between how you write code and how your coworkers do.
IMHO most of this “customize your config to be more productive” stuff will go away within a year, obsoleted by improved models and harnesses.
Just like how all the lessons for how to use LLMs in code from 1-2 years ago are already long forgotten.
Perhaps your blanket statement could be wrong, and I would encourage you to let your mind be a bit more open. The landscape here is not what it was 6 months ago. This is an undeniable fact that people are going to have to come to terms with pretty soon. I did not want to be in this spot, I was forced to out of necessity, because the stuff does work.
https://theonion.com/area-man-constantly-mentioning-he-doesn...
The Vercel team had some interesting findings[1]:
> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it.
Others had different findings for commonly accepted practices[2], some you may have adopted from reading documentation, which surely didn't come from influencers.
And yet others swear by magical Markdown documents[3].
So... who is the ultimate authority on what actually works, and who is just cargo culting the trendy practice of the week? And how is any of this different from what was being done a few years ago?
[1]: https://vercel.com/blog/agents-md-outperforms-skills-in-our-...
[2]: https://arxiv.org/abs/2602.11988
[3]: https://soul.md/
I measure the tooling success with a suite of small prompt tests performing repeatable tasks, measuring the success rate over time, educating the broader team, and providing my own field-tested skills, which I’ve shared with the broader teams to similar success. We’ve seen a huge increase in velocity and a lower bug rate, which are also very easily measurable (and long-evaluated) stats, enough to put me in the position I am in, which was not a reluctant one. You’re perfectly free to view my long history on this topic on this forum to see I am a complete skeptic on this topic, and wouldn’t be here unless I had to be.
everyone is figuring this out still. There is no authority, I am my own authority on what I have seen work and what hasn’t. Feel free to take of that what you will. I just wanted to provide a counterpoint to your initial claim. I’m certainly not going to expose to a fine degree what has worked for my org and what hasn’t due to obvious reasons.
have a good day!
My point is, these custom things are often short lived band-aids, and may not be needed with better default harnesses or smarter future models.
I’ve been developing and working on dev tools for more than 15 years. I’ve never seen things evolve so rapidly.
Experiment, have fun and get things done, but don’t get too sure or attached to your patches.
It’s very likely the models and harnesses will keep improving around the gaps you see.
I’ve seen most of my AGENTS.md directives and custom tools fade away too, as the agents get better and better at reading the code and running the tests and feeding back on themselves.
Isn't this article just another one in that same drawer?
> What actually belongs in CLAUDE.md - Write: - Import conventions, naming patterns, error handling styles
Then just a few lines below:
> Don’t write: - Anything that belongs in a linter or formatter config
The article overall seems filled with internal inconsistencies, so I'm not sure this article is adding much beyond "This is what an LLM generated after I put the article title with some edits".
This is important no matter how experienced you are, but arguably most important when you don't know what you're doing.
0: or if you don't want to learn about that, you can use Claude Code Web
I know the deny list is only for automatic denial, and that any command not explicitly allowed will pause and wait for user confirmation. But it still reminds me of the rationale the author of the Pi harness [1] gave to explain why there will be no built-in permission feature in Pi (emphasis mine):
> If you look at the security measures in other coding agents, *they're mostly security theater*. As soon as your agent can write code and run code, it's pretty much game over. [...] If you're uncomfortable with full access, run pi inside a container or use a different tool if you need (faux) guardrails.
As you mentioned, this is a big feature of Claude Code Web (or Codex, Antigravity, or whatever the equivalents from other companies are): they handle the sandboxing.
[0] https://blog.dailydoseofds.com/i/191853914/settingsjson-perm...
[1] https://mariozechner.at/posts/2025-11-30-pi-coding-agent/#to...
I never said "permissions", I said "sandboxing". You can configure that in settings.json.
https://code.claude.com/docs/en/sandboxing#configure-sandbox...
There are many decent options (cloud VMs, local VMs, Docker, the built-in sandboxing). My point is just that folks should research and set up at least one of them before running an agent.
https://github.com/anthropics/claude-code
You can download the devcontainer CLI and use it to start a Docker container with a working Claude Code install, simple firewall, etc. out of the box. (I believe this is how the VSCode extension works: It uses this repo to bootstrap the devcontainer).
Basic instructions:
- Install the devcontainer CLI: `https://github.com/devcontainers/cli#install-script`
- Clone the Claude Code repo: `https://github.com/anthropics/claude-code`
- Navigate to the top-level repo directory and bring up the container: `devcontainer --workspace-folder . up`
- Start Claude in the container: `devcontainer exec --workspace-folder . bash -c "exec claude"`
P.S. It's all just Docker containers under the hood.
Better isolation than running it in a container.
which is basically every setup, because claude sucks at calling skills and forgets everything in claude.md within a few seconds.
>If you tell Claude to always write tests before implementation, it will. If you say “never use console.log for error handling, always use the custom logger module,” it will respect that every time.
It just isn't true lol
Always separate planning from implementation and clear context between them; it's the build-up of context that makes it bad, in my experience.
It's not against the rules to post AI slop here, and I don't necessarily think it should be. But I do wonder how we value written content going forward. There's value to taste and style and editing and all the other human things... there's very little value in the actual words themselves. We'll figure it out.
Don't skills sit in context while custom slash commands are only manually invoked?
The difference isn't clear to me, especially since, upon googling it right now, I see that skills can also be invoked with a /slash.
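From what I can tell from the docs, the structural difference looks roughly like this (the paths match the documented layout, but the file contents below are my own illustration): a skill's frontmatter description sits in context so Claude can decide to invoke it on its own, while a command's body is only loaded when you manually type the slash.

```
.claude/
  skills/
    review-pr/
      SKILL.md        # description is visible to Claude; body loads on invocation
  commands/
    review-pr.md      # body only loads when you type /review-pr yourself

# .claude/skills/review-pr/SKILL.md (illustrative contents)
---
name: review-pr
description: Review a pull request for style and test coverage
---
Steps Claude should follow once this skill is invoked...
```

So "skills can also be invoked with a /slash" is about the trigger, not the mechanism: the auto-invocation path is what's unique to skills.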
I get the best results with the least number of skills and unnecessary configuration in place.
People are spending way too much time over-prescribing these documents, but AI is like a competent but nervous adult. The more you give it, the dumber it gets.
I don't think people realize exactly how important the specific prompts are: with the same prompt you'll get wildly different results from different models, and when you're iterating on a prompt (say, for some processing task), you'd make different changes depending on which model is being used.
Would also be interested in examples of a CLAUDE.md file that works well in Claude, but works poorly with Codex.
https://github.com/nidhinjs/prompt-master
Actually, I can just read the skill with my own eyes and then I can also learn. So, thank you for sharing. It's interesting to read through what it suggests for different models - it fits for the ones I work with regularly, but there are many I don't know the strengths and weaknesses of.
[0] https://claudefa.st/blog/guide/mechanics/claude-md-mastery
TLDR "CLAUDE.md isn't documentation for Claude to read - it's an operating system for Claude to run. Define behavior, delegate knowledge to skills, and build a system that improves itself over time."
These read to me like people actually want to build deterministic workflows, but blobs of text are the wrong approach for that. The right tool is code that controls the agent through specific states and validates the tool calls step by step.
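Concretely, a minimal sketch of that in Python (all names here are illustrative, not any real agent API): the harness, not a prompt, decides which tool calls are legal at each step.

```python
# Sketch of a harness that gates an agent's tool calls through explicit
# states, instead of hoping a prompt blob enforces the workflow.
ALLOWED = {
    "planning":     {"read_file", "search"},
    "implementing": {"read_file", "write_file", "run_tests"},
    "reviewing":    {"read_file", "run_tests"},
}

class Workflow:
    def __init__(self):
        self.state = "planning"

    def validate(self, tool_call: str) -> bool:
        """Reject any tool call that is illegal in the current state."""
        return tool_call in ALLOWED[self.state]

    def advance(self, next_state: str):
        # Deterministic transitions: code, not prose, decides what comes next.
        transitions = {"planning": "implementing", "implementing": "reviewing"}
        if transitions.get(self.state) != next_state:
            raise ValueError(f"illegal transition {self.state} -> {next_state}")
        self.state = next_state

wf = Workflow()
assert wf.validate("read_file")       # reading is fine while planning
assert not wf.validate("write_file")  # writes are blocked until implementing
wf.advance("implementing")
assert wf.validate("write_file")
```

The model is still free-form inside each state, but the transitions and the tool surface are checked by ordinary code.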
https://claudefa.st/blog/guide/mechanics/claude-md-mastery
Ultimately you can't force claude to solve any problem but you could make it so constraints are kept.
A simple way is a git hook that runs all the deterministic things you care about.
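A minimal sketch of such a hook, assuming an npm project with lint and test scripts (both assumptions; substitute whatever deterministic checks your repo has):

```
#!/bin/sh
# .git/hooks/pre-commit (illustrative): run the deterministic checks so a
# commit can't land without them, no matter what the prompt said.
set -e
npm run lint   # style rules belong in the linter config, not CLAUDE.md
npm test       # the test suite is the constraint claude can't route around
```

Since hooks run outside the model entirely, the agent sees a hard failure it has to fix rather than guidance it can ignore.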
Getting claude to follow your guidance files consistently is a bit maddening.
Just read the official Claude documentation:
https://code.claude.com/docs/
Then all I have to do is let the agents actually figure out how to accomplish what I ask of them, with the highly scoped set of tools and sub agents I give them.
I find this works phenomenally, because all the .agent.md file is, is a description of what the tools available are. Nothing more complex, no LARP instructions. Just a straightforward 'here's what you've got'.
And with agents able to delegate to sub agents, the workflow is self-directing.
Working with a specific build system? Vibe code an MCP server for it.
Making a tool of my own? MCP server for dev testing and later use by agents.
On the flipside, I find it very questionable what value skills and reusable prompts give. I would compare it to an architect playing a recording of themselves from weeks ago when talking to their developers. The models encode a lot of knowledge, they just need orientation, not badgering, at this point.
The best thing I’ve done so far is put GitHub behind an API proxy and reject pushes and pull requests that don’t meet a criteria, plus a descriptive error.
I find it forgets to read or follow skills a lot of the time, but it does always try to route around HTTP 400s when pushing up its work.
Coming soon, unit, behavioural and regression tests for your prompts and skills :P
You’ll have:
* Claude model version
* Claude Code prompts and tools
* Your own prompts and skills and whatnot
* Your repository’s source code (= the input)
All of those change constantly, it’s not like it’s some kind of SWE benchmark.
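Half-joking aside, a tiny version of that harness is easy to sketch. Here `call_model` is a stand-in stub, not a real API; in practice it would invoke your actual agent with the prompt and skills under test:

```python
# Toy regression harness for prompts: each case pins an input and a predicate
# on the output, so model/harness/prompt changes show up as failing checks.
def call_model(prompt: str) -> str:
    # Stub that fakes a model response for demonstration purposes only.
    if "logger" in prompt:
        return "import { logger } from './logger';\nlogger.error(err);"
    return "console.log(err);"

CASES = [
    ("use the custom logger module for errors", lambda out: "logger.error" in out),
    ("use the custom logger module for errors", lambda out: "console.log" not in out),
]

def run_suite():
    results = [check(call_model(prompt)) for prompt, check in CASES]
    return sum(results), len(results)

passed, total = run_suite()
print(f"{passed}/{total} prompt checks passed")  # prints "2/2 prompt checks passed"
```

It won't make the inputs stop changing, but at least the drift becomes visible instead of anecdotal.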
This sort of "prompt and pray" flow really works for some people, as in they can make products and money. However, I think the people who succeed today would also have reached for no-code tools 5 years ago and seen similar success. It's just faster and more comprehensive now. The general theme of the products remains the same, though: not unimportant or worthless, but it tends to be software whose effects stay INSIDE the realm of software. I feel like there's always been a market for that, as it IS important; it's just not WORTH the time and money to the right people to "engineer" those tools. A lot of SaaS products filled that niche for many years.
While it's not a way I want to work, I am also becoming comfortable with respecting that as a different profession for producing a certain brand of software that does have value, and that I wasn't making before. The intersection of that is opportunity I'm missing out on; no fault to anyone taking it!
The software engineer that writes the air traffic avoidance system for a plane better take their job seriously, understand every change they make, and be able to maintain software indefinitely. People might not care a ton about how their sales tracking software is engineered, but they really care about the engineering of the airplane software.
It shouldn’t be, but it’s going to take some catastrophic events to convince people that we have to work to make sure we understand the systems we’re building and keep everything from devolving into vibe coded slop.
I guess that's why I see it as a separate profession, as in we have to actually profess a standard for how a professional in our field acts and believes. I think it's OK for it to bifurcate into two different fields, but Software Engineering would need to specifically reject prompt-and-pray on a principled and rational basis.
Sadly yes, that might require real cost to life in order to find out the "why" side of that rational basis. If you meet anyone that went to an engineering school in Québec, ask them about the ceremony they did and the ring they received. [0] It's not like that ceremony fixes anything, but it's a solemn declaration of responsibility which to me at least, sets a contract with society that says "we won't make things that harm you".
[0] https://ironring.ca/home-en/
This is a brilliant reimagining of the old and trusted PnP acronym.
>Claude Code users typically treat the .claude folder like a black box. They know it exists. They’ve seen it appear in their project root. But they’ve never opened it, let alone understood what every file inside it does.
I know we are living in a post-engineering world now, but you can't tell me that people don't look at PRs anymore, or their own diffs, at least until/if they decide to .gitignore .claude.
I'm a senior engineer who has been shipping code since before GitHub and PR reviews was a thing. Thankfully LLMs have freed me from being asked to read other people's shit code for hours every day.
I recently tried IntelliJ for Kotlin development and it wanted me to give a credit card for a 30 day trial. I just want something that scans my repo and I tell it the changes I want and it does it. If possible, it would also run the existing tests to make sure its changes don't break anything.
While the coding assistants are pretty much universally free, you still need to connect them to a model. The model tokens generally cost something once you've gone past a certain quota.
I'm not sure if this is still true, but if you have a Google account, Gemini Code Assist had a quite generous "free tier" that I used for a while and found to be pretty decent.
It is fun to use.
https://www.youtube.com/watch?v=0RLIlNWv1xo
You log in with your Google account.
Opencoder is bring your own model.
You get what you pay for so good luck.
It has a few issues with outdated advice (e.g. commands have been merged with skills), but overall I might share it with co-workers who need an introduction to the concept.
[1]: https://www.npmjs.com/package/claude-code-types
Do people find the nano-banana cartoon infographics to be helpful, or distracting? Personally, I'm starting to tire seeing all the little cartoon people and the faux-hand-drawn images.
Wouldn't Tufte call this chartjunk?
Feels like generated AI art like this is modern clipart
The simple truth we're about to realize is there is no free lunch: a tool cannot inject more intent into a piece than its author put in. It might smooth out some blemishes or highlight some alternative choices, but it can't transform the input "make me a video game" into something greater than a statistical mishmash of the concept. And traditional tools of automation give you a much better, more precise interface for intent than natural language, which allows these vagaries.
Let’s say it loses those and starts using emojis as bullet points. It’s going to be a lot harder to detect.
In this case, I'd say helpful because I didn't have to read the article at all to understand what was being communicated.
> Clutter is the disease of American writing. We are a society strangling in unnecessary words, circular constructions, pompous frills and meaningless jargon.
> Look for the clutter in your writing and prune it ruthlessly. Be grateful for everything you can throw away. Reexamine each sentence you put on paper. Is every word doing new work? Can any thought be expressed with more economy?
On Writing Well (Zinsser)
Some of the others didn’t feel like they added value, but I agree that these are some of the best examples of a practice that typically doesn’t add much.
So yes, it's chartjunk.
If the graphic still needs paragraphs to decode and doesn't let the reader pull out the key facts faster than plain text, it's not an infographic so much as cargo-cult design pasted on top of an explanation.
But they had already lost me with all the links, and the fact that there's no common thread running through the entire article.
The first thing my eyes skimmed was:
> CLAUDE.md: Claude’s instruction manual
> This is the most important file in the entire system. When you start a Claude Code session, the first thing it reads is CLAUDE.md. It loads it straight into the system prompt and keeps it in mind for the entire conversation.
No, it's not. Claude does not read this until it is relevant. And even if it does, it's not the SOT. So no, it's arguably not the most important file.
https://code.claude.com/docs/en/memory
“CLAUDE.md files are loaded into the context window at the start of every session”
Like, mostly people who have confused luck for success, or business acumen for religion.
So I wouldn’t use LinkedIn as a positive data point of what’s hot.
I think the problem is that they're uninformative slop often enough that I've subconsciously determined they aren't worth risking attention time on.
No.
CLAUDE.md is just prompt text. Compaction rewrites prompt text.
If it matters, enforce it in other ways.
When you have this performative folder of skills the AI wastes a bunch of tool calls, gets confused, doesn't get to the meat of the problem.
beware!
In the end it will still produce slop you need to review line by line.
The question is: do you want to write code you know and have verified works, or review unverified, junior-dev-quality code written by AI?
> Two folders, not one
Why post AI slop here?
> "The project-level folder holds team configuration. You commit it to git. Everyone on the team gets the same rules, the same custom commands, the same permission policies."
> "Most people either write too much or too little. Here’s what works."
It feels like I've been teleported into a recent LinkedIn feed. Do real people actually already write like AI or is it AI generated?