The Scary Sudden Realization with Claude 3.7

Posted on March 3, 2025 by Yµn ^…^ ƒ(x) aka. Yunus Emre Vurgun
Long time no blog post… Or at least a blog post with correct timestamps. 

This won’t be long as it is past one in the morning for me in Istanbul.

I just wanted to share my experience with the new AI model from Anthropic, Claude 3.7, which they released about a week ago, I guess.

I haven’t tested its accuracy with textual data yet, but I did use it for some real-world codebase assistance via the Cursor IDE, and surprisingly it was pretty messy compared to the previous version, 3.5.

It almost never followed my instructions, never tried to understand my approach to the code I write, and never even kept the context for the duration of the session.

It got so obsessed with popular programming stacks and codebase structures that I almost lost some of my work in the source code files.

Weirdly enough, it actually knew more when it came to the latest developments in popular frameworks, libraries, tools, and so on; in fact, it knew almost too much about them.

This makes me think that the team behind it maybe spent too much time fine-tuning the model to output code snippets that just “worked”, and didn’t do the same for the core of how the model processes its input data.

Running it outside the Cursor IDE was simpler, but it still had the same problem of spitting out useless memorised code structures from its training data, structures that had nothing to do with the actual code it was fed as input.

Claude 3.5 still did a better job with my codebase, though it had a hard time getting the nuances of the code I wrote.

Claude 3.7 tried to fix language-related errors I had caused with what I call a “remove-and-make-up-your-own” approach: it constantly removed my code and replaced it with its own ideas of what the code should do, even down to the core functionality.

So you may be wondering where the sudden realization part of this article comes in…

Well, it is simple: I suddenly realized how dangerous an LLM can be when given the power to alter codebases.

I am aware that LLMs are a blessing for generating simple code blocks and fixing my silly errors, but the amount of power they can have if you connect them directly to a system is just crazy.

Given enough stupidity and lack of reason, some powerful decision-maker may soon allow an LLM to alter a critical codebase, one that might be powering a huge chunk of the online ecosystem we use daily.

Imagine an instant collapse of a critical worldwide service because some irresponsible person allowed a language model to alter a critical codebase module, with engineers struggling to find what caused it, since the code generated by the bot looks quite reasonable, the custom training data being almost identical to the original code style.

This is the scary reality.
Monitor your code.
Read your codebase.
Don’t let a chatbot push to production! (One possible guardrail is sketched below.)
Don’t be lazy!
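Since I’m telling people not to let a chatbot push to production, here is one concrete guardrail: a tiny git pre-push hook, sketched in Python. To be clear, this is just a starting point under my own assumptions, not a drop-in solution: the protected branch names and the “Co-authored-by” trailers below are placeholders, and you’d swap in whatever your own tooling actually writes into commit messages.

```python
#!/usr/bin/env python3
"""pre-push hook: refuse to push AI-co-authored commits to protected branches.

A minimal sketch. PROTECTED and AI_MARKERS are assumptions; adjust them
to whatever your tools actually put in commit messages.
"""
import subprocess
import sys

PROTECTED = {"refs/heads/main", "refs/heads/production"}  # assumed branch names
AI_MARKERS = ("Co-authored-by: Claude", "Co-authored-by: Cursor")  # assumed trailers


def is_null(sha: str) -> bool:
    """True for git's all-zero 'no commit' sha (new or deleted branch)."""
    return set(sha) == {"0"}


def outgoing_messages(local_sha: str, remote_sha: str) -> str:
    """Return the full messages of every commit this push would add."""
    rev_range = local_sha if is_null(remote_sha) else f"{remote_sha}..{local_sha}"
    result = subprocess.run(
        ["git", "log", "--format=%B", rev_range],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def main() -> int:
    # git feeds the hook one line per ref being pushed:
    # "<local ref> <local sha> <remote ref> <remote sha>"
    for line in sys.stdin:
        _local_ref, local_sha, remote_ref, remote_sha = line.split()
        if remote_ref not in PROTECTED or is_null(local_sha):
            continue  # unprotected branch, or a branch deletion
        messages = outgoing_messages(local_sha, remote_sha)
        for marker in AI_MARKERS:
            if marker in messages:
                print(
                    f"Blocked push to {remote_ref}: '{marker}' found in an "
                    "outgoing commit. Read and re-commit it yourself first.",
                    file=sys.stderr,
                )
                return 1  # non-zero exit aborts the push
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Drop it into .git/hooks/pre-push, make it executable, and git will run it before every push; a non-zero exit aborts the push. It won’t stop a determined person, but it does force a human pause before bot-tagged commits reach a protected branch.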

Fin