Will you PLEASE stop saying “coding is a practical use case”? This is the third appeal I’ve made on this subject. (Do you read your comments?) If you want bug-ridden code with security issues that is not extensible and that no one understands, then sure, it’s a practical use case. Just as, if you want nonsensical articles with invented facts, then article writing is a practical use case. But as I’ve pointed out already, no reputable editorial outlet is now using LLMs to write its articles. Why is that? Because it obviously doesn’t work.
Let’s face it: the only reason you’re saying “coding is a practical use case” is that you yourself don’t code, and don’t understand it. I can’t see another reason why you would assume the problems experienced in other domains somehow don’t apply to coding. Newsflash: they do. And software engineering definitely doesn’t need the slop any more than anyone else does. So I hope this is my final appeal: please stop perpetuating this myth. If you want more information on the problems of using LLMs to code, I can talk at great length about it - feel free to reach out. Thanks…
The point is, there has always been a trade-off between the speed of development and the quality of engineering (confidence in the code, robustness of the app, etc.). I don’t see LLMs as either changing this trade-off or shifting the needle (greater quality in a shorter time), because they are probabilistic and can’t be relied upon to produce the best solution - or even a correct solution - every time. So you’re going to have to pick your way through every single line it generates in order to have the same confidence you would have if you wrote it - and this is unlikely to save time, because understanding someone else’s code is always more difficult and time-consuming than writing it yourself. When I hear people say it is “making them 10x more productive” at coding, I think, “and also 10x as unsure what you’ve actually produced”…
You’ll also need to correct it when it does something you don’t want. Now this is pretty interesting, if you think about it. Imagine you provide an LLM a prompt, and the LLM produces something, but not exactly what you want. What is the advice on this? “Provide a more specific prompt!” Ok, so then we write a more specific prompt - the results are better, but it still falls short. What now? “Keep making the prompt more specific!” Ok, but wait - eventually won’t my prompt contain about as many tokens as the solution the LLM is going to generate? Because if I’m perfectly specific about what I want, isn’t this just the same as writing the solution myself in a computer language? Indeed, isn’t that the purpose of computer languages in the first place?…
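To make the point concrete, here is a toy sketch of the argument above. Both strings are my own made-up illustration (the `clamp` example and the word-count proxy are not from the thread); it just compares the length of a "perfectly specific" English prompt with the code it specifies:

```python
# Toy illustration with hypothetical strings: once a prompt pins down every
# detail of the desired behaviour, it is roughly as long as the code itself.
fully_specific_prompt = (
    "Write a function named clamp that takes three numbers x, lo and hi, "
    "returns lo if x is less than lo, returns hi if x is greater than hi, "
    "and otherwise returns x unchanged."
)

equivalent_code = (
    "def clamp(x, lo, hi):\n"
    "    if x < lo:\n"
    "        return lo\n"
    "    if x > hi:\n"
    "        return hi\n"
    "    return x\n"
)

# Crude proxy for token count: whitespace-separated words.
prompt_tokens = len(fully_specific_prompt.split())
code_tokens = len(equivalent_code.split())
print(prompt_tokens, code_tokens)
```

Even in this tiny case the fully-specific prompt comes out longer than the function it describes, which is the point: a programming language is already the most compact unambiguous way to state the behaviour you want.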
We software developers very often pull chunks of code from various locations - not just stackoverflow. Very often they are chunks of code we wrote ourselves, that we then adapt to the new system we are inserting it into. This is great, because we don’t need to make an effort to understand the code we’re inserting - we already understand it, because we wrote it…
“You should consider combing through Hacker News to see how people are actually making successful use of LLMs” - the problem with this is there are really a lot of hype-driven stories out there that are basically made up. I’ve caught some that are obvious - e.g. see my comment on this post: https://substack.com/home/post/p-185469925 (archived) - which then makes me quite sceptical of many of the others. I’m not really sure why this kind of fabrication has become so prevalent - I find it very strange - but there’s certainly a lot of it going on. At the end of the day I’m going to trust my own experiences actually trying to use these tools, and not stories about them that I can’t verify.
Absolutely… Thank you, from the very depths of my heart and soul… dear Tom Gracey, programmer, artist… for the marvel you do… for the wisest attitude, for the belief in humans… in effort… in art…
If you want bug-ridden code with security issues that is not extensible and that no one understands, then sure, it’s a practical use case.
This assumes you never review it, meaning it’s at best an argument against vibe coding. It’s not an argument against using LLMs for coding in general.
Additionally, I’ve been writing software for a living for almost 30 years, and I could say the exact same thing about a lot of human generated code I’ve reviewed during that time. I don’t even know how often I’ve explained basic stuff like “security goes in the backend, not in the frontend” to humans.
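As a minimal sketch of the “security goes in the backend, not in the frontend” point, here are two toy functions of my own (nothing here comes from the thread or any real framework): a client-side check that an attacker can skip, and a server-side check that actually enforces the rule.

```python
# Toy model: the "frontend" check runs on the client and can always be
# bypassed, so the "backend" must enforce the rule itself.

def frontend_allows(user_role: str) -> bool:
    # Runs in the browser; an attacker can simply skip this check
    # by crafting the HTTP request by hand.
    return user_role == "admin"

def backend_delete_account(user_role: str) -> str:
    # The server re-validates regardless of what the UI claimed.
    if user_role != "admin":
        return "403 Forbidden"
    return "200 OK"

# An attacker who bypasses the frontend and calls the backend directly
# is still refused, because the check lives server-side:
print(backend_delete_account("guest"))
```

If the role check existed only in `frontend_allows`, a hand-crafted request would delete the account; duplicating the check server-side is what actually secures the endpoint.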
Let’s face it: the only reason you’re saying “coding is a practical use case” is that you yourself don’t code, and don’t understand it.
I certainly do code and if I don’t understand what the LLM outputs it doesn’t go in the project.
I can’t see another reason why you would assume the problems experienced in other domains somehow don’t apply to coding.
I’m a software engineer, I can’t judge LLMs in most other domains. I also don’t think there are no problems. A tool doesn’t have to be 100% problem free to be useful as long as you recognize the limitations.
So you’re going to have to pick your way through every single line it generates in order to have the same confidence you would have if you wrote it
I don’t see a problem with this. The post even mentions pulling code from stackoverflow, which is the same thing. But nobody ever argued that stackoverflow has no use in coding just because you still have to read the code.
Honestly, at this point any article flat out dismissing LLMs for coding just reads to me like the author isn’t even trying to stay up to date. Which is understandable if they don’t like AI, but it makes posting about it a bit pointless.
A year ago I would have had a similar opinion to the author’s, but in the last 3-4 months specifically, it feels like AI-based tools have made a huge leap. I went from using short snippets for learning to letting AI implement entire features and being actually happy with the result.
There is however still a pretty big difference between what it produces for common problems vs. what it produces for specialized difficult ones. It’s also inherently better at some languages than others based on the availability of up-to-date training material. So you need some amount of breadth in your projects to accurately judge it.
If you only try some AI service in free mode on one thing every month, for example, you’ll just have this very polarized opinion that’s either “AI is useless” or “AI can do everything”, but you won’t have a good idea of what it can and can’t do.
A year ago I would have had a similar opinion to the author’s, but in the last 3-4 months specifically, it feels like AI-based tools have made a huge leap. I went from using short snippets for learning to letting AI implement entire features and being actually happy with the result.
Maybe if you’re only working with languages and features that are well documented and have a lot of examples out there. I’ve been trying to use LLM coding to assist me with a process automation at work, and the results are a couple steps up from dog vomit more often than not.
AI code assistants aren’t making big strides, you’re likely just seeing them refine common scenarios to points where it becomes very usable for your specific use cases.
A year ago I would have had a similar opinion to the author’s, but in the last 3-4 months specifically, it feels like AI-based tools have made a huge leap.
I’ve seen this claim made basically weekly for the last couple of years. If we were really having “generational leaps” monthly, these LLMs would by now be capable of doing what people claim they can.
It’s just my experience as someone who was pretty much forced to use AI for coding by my employer for the last few years. For the longest time it was completely useless. And then it suddenly wasn’t. I’m sure you’ll keep hearing this kind of story though, because people have different requirements and AI assisted coding or even agents don’t have to start working for everybody at the same time.
Oh, you must try the newest version of <insert model name here> with <insert genAI IDE name here>, letting it do most of your job while you only do code reviews; otherwise you’ll have to learn how to prompt it…
Sounds like Tom tried LLM-assisted coding once about 6 model release cycles ago and hasn’t revisited it.
ultra copium.jpg