IT/Software career thread: Invert binary trees for dollars.

Deathwing

<Bronze Donator>
17,757
8,731
My boss has been using Claude Code to review MRs. He claims it's a quality add, and perhaps it is, but I suspect it's more of a shaming tool and that he doesn't have time to properly review all his MRs. Whatever the actual intent, it worked: more of us have been using the tool. I've used it a few times to review local changes in the ~1k LoC range. Some random musings and ramblings:

How much can you trust the dollar cost in /usage? I used up a third of my daily tokens (sorry, forgot to record the actual amount) and it was ~$5. Yet others at my work are looking at implementing it as part of CI, and cost (among other things) is a problem. $5 for even one true positive is a steal.

It's hit or miss: it will find genuine bugs, but it will also point out issues that other code (which it has access to) easily refutes.

Commit summaries can be impressive, almost worth $5 on that alone.

I'm being generally positive because I'm familiar with the code it's analyzing; I just spent hours/days working with it, so it's easy for me to understand its findings. I don't understand how people are using this to write feature-size code and still have that same familiarity with the results. It seems analogous to asking someone else to review your code: you cannot expect them to take the time to become as intimate with your code as you are (exceptions, of course).




A bit ironic that Claude charged me tokens to add token, context, and cost usage to the status line...and fucked it up.
 

Noodleface

A Mod Real Quick
39,666
18,298
I've been using Claude to review PRs as a first line of defense. Using it inside VS Code, where it has context on the same codebase, works really well. It can get pretty confused if your branch and the PR branch are way out of step.

I could easily see it getting to the point where AI writes the code, another agent reviews it, and another agent merges it with no human interaction.
 

Sheriff Cad

scientia potentia est
<Nazi Janitors>
33,938
82,030
I've been using Claude to review PRs as a first line of defense. Using it inside VS Code, where it has context on the same codebase, works really well. It can get pretty confused if your branch and the PR branch are way out of step.

I could easily see it getting to the point where AI writes the code, another agent reviews it, and another agent merges it with no human interaction.
I do this with legal briefs. I have it write each section with access to case documents and fact documents, then I put a full brief together from the sections it wrote. After that I open a clean conversation, put the full brief in, and say "check this for consistency/grammar/cites/etc." I get a whole host of things it wants to change that it never flagged in the other conversation. There's something about repeated edits/changes within the same conversation that gets it mingled or confused sometimes, and starting fresh gives you way cleaner answers.

Do you guys do that where you may use AI to completely write something or just assist and then put it into a fresh conversation and it finds new errors?
 

sliverstorm

Blackwing Lair Raider
150
304
Do you guys do that where you may use AI to completely write something or just assist and then put it into a fresh conversation and it finds new errors?
That is fairly normal, and I'll do the same when I want to refresh the context window.

Think of the model as dividing its attention between every single thing you've talked about and shared up to that point. Every time you submit a new prompt, the model is generating its output by processing not just that prompt, but also the full history of your conversation--including all your prior prompts and the model's responses to generate each section, any back and forth you had editing or revising, and especially all those other case/fact docs you've exposed the model to. This basically dilutes the model's attention on just the brief text.

When you open a new window and say "Read this doc and provide critical feedback", the full weight of the model's output is generated solely against that prompt + the brief text, which means you get a much sharper response.
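To put rough numbers on that, here's a toy sketch. It is not how attention actually works inside the model; it just measures what fraction of everything the model has to read is the brief itself. The "token" count is a crude word count, and all the document sizes are made-up placeholders.

```python
# Toy illustration only: tokens are approximated by whitespace-split words,
# and every document below is a made-up placeholder, not real data.

def approx_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())

def context_size(history: list[str], new_prompt: str) -> int:
    """Everything processed for this turn: all prior messages plus the new prompt."""
    return sum(approx_tokens(m) for m in history) + approx_tokens(new_prompt)

def brief_share(brief: str, history: list[str], new_prompt: str) -> float:
    """Fraction of the total context that is the brief itself."""
    return approx_tokens(brief) / context_size(history + [brief], new_prompt)

brief = "word " * 2000                      # hypothetical 2,000-word brief
long_history = [
    "case doc " * 5000,                     # hypothetical case documents
    "fact doc " * 3000,                     # hypothetical fact documents
    "draft and revisions " * 1000,          # hypothetical back-and-forth edits
]
prompt = "Read this doc and provide critical feedback"

share_long = brief_share(brief, long_history, prompt)   # same long conversation
share_fresh = brief_share(brief, [], prompt)            # brand-new conversation

print(f"brief share in long conversation:  {share_long:.1%}")
print(f"brief share in fresh conversation: {share_fresh:.1%}")
```

In the long conversation the brief is a small slice of what the model is reading; in the fresh one it's nearly all of it.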
 

Deathwing

<Bronze Donator>
17,757
8,731
I've been using Claude to review PRs as a first line of defense. Using it inside VS Code, where it has context on the same codebase, works really well. It can get pretty confused if your branch and the PR branch are way out of step.

I could easily see it getting to the point where AI writes the code, another agent reviews it, and another agent merges it with no human interaction.
How does that (within VS Code) work if you want to containerize Claude to have some modicum of security?

BTW, the permissions you have to give Claude are fucking scary. With any other application, I'd be noping the fuck out for just 1 of those, let alone all 5.