IT/Software career thread: Invert binary trees for dollars.

Deathwing

<Bronze Donator>
17,757
8,731
My boss has been using Claude Code to review MRs. He claims it's a quality add, perhaps it is, but I suspect it's more a shaming tool and he probably doesn't have time to properly review all his MRs. Whatever the actual intent, it worked, as more of us have been using the tool. I've used it a few times to review local changes in the ~1k LoC range. Some random musings and ramblings:

How much can you trust the dollar cost in /usage? I used up a third of my daily tokens (sorry, forgot to record the actual amount) and it was ~$5. Yet others at my work are looking at implementing it as part of CI, and cost (among other things) is a problem. $5 for even one true positive is a steal.

It's hit or miss: it will find genuine bugs, but it will also point out issues that other code (which it has access to) easily refutes.

Commit summaries can be impressive, almost worth $5 on that alone.

I'm being generally positive because I'm familiar with the code it's analyzing; I just spent hours/days working with the code. Thus, it's easy for me to understand its findings. I don't understand how people are using this to write feature-size code and have that same amount of familiarity with the results. It seems analogous to asking someone else to review your code. You cannot expect them to take the time to become as intimate with your code as you are (exceptions, of course).




A bit ironic that Claude charged me tokens to add token, context, and cost usage to the status line...and fucked it up.
 

Noodleface

A Mod Real Quick
39,669
18,299
I've been using Claude to review PRs as a first line of defense. Using it inside VS Code where it has context to the same codebase works really well. It can get pretty confused if your branch and the PR branch are way out of step.

I could easily see it getting to the point where AI writes the code, another agent reviews it, and another agent merges it with no human interaction.
 

Sheriff Cad

scientia potentia est
<Nazi Janitors>
33,940
82,050
I've been using Claude to review PRs as a first line of defense. Using it inside VS Code where it has context to the same codebase works really well. It can get pretty confused if your branch and the PR branch are way out of step.

I could easily see it getting to the point where AI writes the code, another agent reviews it, and another agent merges it with no human interaction.
I do this with legal briefs. I have it write each section with access to the case documents and fact documents, then I put a full brief together from the sections it wrote. Then I open a clean conversation, put the full brief in, and say "check this for consistency/grammar/cites/etc." I get a whole host of things it wants to change from what it wrote in the other conversation. There's something about repeated edits/changes within the same conversation that gets it mingled or confused sometimes, and starting fresh gives you way cleaner answers.

Do you guys do that where you may use AI to completely write something or just assist and then put it into a fresh conversation and it finds new errors?
 

sliverstorm

Blackwing Lair Raider
150
305
Do you guys do that where you may use AI to completely write something or just assist and then put it into a fresh conversation and it finds new errors?
That is fairly normal, and I'll do the same when I want to refresh the context window.

Think of the model as dividing its attention between every single thing you've talked about and shared up to that point. Every time you submit a new prompt, the model is generating its output by processing not just that prompt, but also the full history of your conversation--including all your prior prompts and the model's responses to generate each section, any back and forth you had editing or revising, and especially all those other case/fact docs you've exposed the model to. This basically dilutes the model's attention on just the brief text.

When you open a new window and say "Read this doc and provide critical feedback", the full weight of the model's output is generated solely against that prompt + the brief text, which means you get a much sharper response.
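To put rough numbers on that, here is a minimal sketch. It assumes a crude one-token-per-word estimate and treats "share of the context" as a stand-in for how much the brief competes with everything else; that is a simplification, not how attention is actually computed. The function names and the sizes of the brief and the history are made up for illustration.

```python
def rough_tokens(text: str) -> int:
    """Very rough token estimate: one token per whitespace-separated word (assumption)."""
    return len(text.split())


def brief_share(brief: str, other_messages: list[str]) -> float:
    """Fraction of the total context that is the brief itself."""
    brief_tokens = rough_tokens(brief)
    total = brief_tokens + sum(rough_tokens(m) for m in other_messages)
    return brief_tokens / total if total else 0.0


# Hypothetical sizes: a 2,000-word brief, five 3,000-word case/fact docs,
# and ten 500-word rounds of drafting and editing in the same conversation.
brief = "word " * 2000
long_history = ["word " * 3000] * 5 + ["word " * 500] * 10
fresh_prompt = ["Read this doc and provide critical feedback"]

print(f"long conversation: brief is {brief_share(brief, long_history):.1%} of the context")
print(f"fresh conversation: brief is {brief_share(brief, fresh_prompt):.1%} of the context")
```

In the long conversation the brief is a small slice of what the model is processing; in the fresh one it is nearly everything.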
 

Deathwing

<Bronze Donator>
17,757
8,731
I've been using Claude to review PRs as a first line of defense. Using it inside VS Code where it has context to the same codebase works really well. It can get pretty confused if your branch and the PR branch are way out of step.

I could easily see it getting to the point where AI writes the code, another agent reviews it, and another agent merges it with no human interaction.
How does that (within VS Code) work if you want to containerize Claude to have some modicum of security?

BTW, the permissions you have to give Claude are fucking scary. Any other application, noping the fuck out for just 1 of those, let alone all 5.
 

Control

Golden Baronet of the Realm
5,677
15,879
Do you guys do that where you may use AI to completely write something or just assist and then put it into a fresh conversation and it finds new errors?
In addition to the context rot, I find that if it hits too much of a snag on something, it can get stuck in rabbit holes or decide that one path is the correct one and keep coming back to it. A new session basically gets you a reroll on its decisions. I'm only using it on relative hobby projects so far, so ymmv.

How does that(within VSCode) work if you want to containerize Claude to have some modicum of security?

BTW, the permissions you have to give Claude are fucking scary. Any other application, noping the fuck out for just 1 of those, let alone all 5.
You can run it in a container or VM. A bit of extra hassle, but I don't want to completely turn it loose unless it's on a sacrificial system, and even then, be careful about what it has access to on your network.
 

pwe

Silver Baronet of the Realm
1,304
6,896
Boys, I had my second interview for a mini job (UiPath) yesterday. 15 hours per week, low pay, but despite that (sort of due to that) I really, really want this. I can get by just fine, and a low stress job sounds amazing. I've never been nervous about something like this before. I am about 10x overqualified, but it's all about culture fit and I wasn't the only one in for a second interview. I hope the rest are assholes.

Team up two and two and cross your wieners for me.
 

Noodleface

A Mod Real Quick
39,669
18,299
How does that(within VSCode) work if you want to containerize Claude to have some modicum of security?

BTW, the permissions you have to give Claude are fucking scary. Any other application, noping the fuck out for just 1 of those, let alone all 5.
Claude is handled by our overlords here, but I work at a megacorp.

Also I agree on permissions. I think we very easily hand over the reins without thinking through the consequences sometimes.
 

stankwoo

Golden Knight of the Realm
185
104
I've been running the agents in VMs - good success with Multipass for headless Linux slices so far. I don't mount or share volumes between the host machine and the slices. You can set up a baseline Multipass dev environment, snapshot it, and then if you and/or the agent do something bad, you can just restore from snapshot. I figured this was worth the effort because I am still learning all these systems but I really wanted to let the agent be as autonomous as possible. The more autonomy they have, the more permissions they have, and the more of a chance something gets fucked up.

For repo access I've done it two ways. The first way, I just used Multipass file I/O to copy the repo over to the agent VM. I ask it to do some activity and it performs the edits. Works fine, but then you are the workhorse on repo management because you have to copy back the files, etc.
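For anyone who wants to script the snapshot/restore and copy-over steps, here is a rough sketch, not the exact setup described above. It only builds the Multipass command lines and prints them unless you turn off the dry run. The VM name, snapshot name, and paths are made up, and the subcommands and flags (`launch --name`, `snapshot --name`, `restore <vm>.<snapshot>`, `transfer --recursive`) are as I understand recent Multipass releases, so check `multipass help` on your version first.

```python
import subprocess


def multipass_cmd(*args: str) -> list[str]:
    """Build one multipass invocation as an argv list."""
    return ["multipass", *args]


def baseline_commands(vm: str, snapshot: str = "baseline") -> list[list[str]]:
    """Launch a VM, then stop it and snapshot it as the known-good baseline."""
    return [
        multipass_cmd("launch", "--name", vm),
        # Install your dev tooling inside the VM here, before snapshotting.
        multipass_cmd("stop", vm),  # snapshots are taken on a stopped instance
        multipass_cmd("snapshot", vm, "--name", snapshot),
        multipass_cmd("start", vm),
    ]


def restore_commands(vm: str, snapshot: str = "baseline") -> list[list[str]]:
    """Roll the VM back to the baseline after something goes wrong."""
    return [
        multipass_cmd("stop", vm),
        # Assumption: restore may prompt before discarding the current state.
        multipass_cmd("restore", f"{vm}.{snapshot}"),
        multipass_cmd("start", vm),
    ]


def copy_repo_commands(vm: str, local_repo: str, remote_dir: str) -> list[list[str]]:
    """Copy a repo into the VM without mounting anything from the host."""
    return [multipass_cmd("transfer", "--recursive", local_repo, f"{vm}:{remote_dir}")]


def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print the commands, or actually run them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical names and paths, dry run only.
run_all(baseline_commands("agent-vm"))
run_all(copy_repo_commands("agent-vm", "./my-repo", "/home/ubuntu/my-repo"))
run_all(restore_commands("agent-vm"))
```

Copying the files back out is the same `transfer` call with the source and destination swapped, which is the manual repo-management chore mentioned above.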

The second way is much easier but takes more setup. I created a GitHub account for the agent and then added that agent account as a collaborator on the repo I want it to work on. Make sure the repo has main branch protection set up. Then in the agent VM you just "gh auth login" like normal into that agent's account. Now you can tell it "Review code on repo XYZ" and it will clone the repo, etc. If you ask it to do edits, it will edit the code, you can review, and when you are happy you can say "Create a PR" and it will do it all.

With the second way, I have multiple agents set up with their own GitHub accounts. The farthest I have gotten right now is a code writer agent who makes the code edits and then a code review agent who only reviews PRs. Pretty funny to see a PR with two robots talking to each other.

Last thing - Visual Studio Code has an extension that lets you SSH into the VM so you can look at the code and more importantly the git diff to see what changes it made before it goes to a PR.
 

ShakyJake

<Donor>
8,633
21,311
I'm being generally positive because I'm familiar with the code it's analyzing, I just spent hours/days working with the code. Thus, it's easy for me to understand its findings. I don't understand how people are using this to write feature-size code and have that same amount of familiarity with the results.
Does it work? Yes? Who cares.
 

ShakyJake

<Donor>
8,633
21,311
This is also exactly why there is money to be made unfucking what the AI fucks.
I am betting that your typical human coders will 'fuck' things up just as much or more. The difference is that when AI does it, we can iterate and fix it stupidly fast and at almost zero marginal cost. Humans? Good luck getting them to even admit the bug exists, let alone ship the fix before the next sprint ends.
 

moonarchia

The Scientific Shitlord
<Bronze Donator>
30,945
61,068
I am betting that your typical human coders will 'fuck' things up just as much or more. The difference is that when AI does it, we can iterate and fix it stupidly fast and at almost zero marginal cost. Humans? Good luck getting them to even admit the bug exists, let alone ship the fix before the next sprint ends.
You can fix it because you know the code. AI is going to put you out of work (hopefully not, but HR and management are going to try to make it happen) and actively prevent anyone after you from putting in the work to learn the code in the future.
 

Khane

Got something right about marriage
21,907
15,899
I am betting that your typical human coders will 'fuck' things up just as much or more. The difference is that when AI does it, we can iterate and fix it stupidly fast and at almost zero marginal cost. Humans? Good luck getting them to even admit the bug exists, let alone ship the fix before the next sprint ends.

Wait what? So before AI we were all just shipping garbage code and then lying about it?