dakshgupta
2 months ago
Hi, I'm Daksh, a co-founder of Greptile. We're an AI code review agent used by 2,000 companies, from startups like PostHog, Brex, and Partiful to F500s and F10s.
About a billion lines of code go through Greptile every month, and we're able to do a lot of interesting analysis on that data.
We decided to compile some of the most interesting findings into a report. This is the first time we've done this, so any feedback would be great, especially around what analytics we should include next time.
neom
2 months ago
If AI tools are making teams 76% faster with 100% more bugs, one would presume you're not more productive, you're just punting more debt. I'm no expert on this stuff, but coupling it with some kind of defect-density insight might be helpful. I'd also be interested to know what percentage of AI-assisted code is "rolled back" or "reverted" within 48 hours. Has there been any change in the number of review iterations over time?
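For what it's worth, the 48-hour revert rate is measurable from git history alone. Here's a rough sketch (mine, not from the report; it assumes the stock `git revert` message format, "This reverts commit <sha>", and the function names are just illustrative):

    import re
    import subprocess

    def commit_time(sha):
        # Committer timestamp (unix seconds) of a single commit.
        out = subprocess.run(["git", "show", "-s", "--format=%ct", sha],
                             capture_output=True, text=True, check=True)
        return int(out.stdout.strip())

    def fast_reverts(hours=48):
        # %H = sha, %ct = timestamp, %B = full message; %x00 separates entries.
        log = subprocess.run(["git", "log", "--format=%H %ct %B%x00"],
                             capture_output=True, text=True, check=True).stdout
        hits = []
        for entry in filter(None, log.split("\x00")):
            parts = entry.strip().split(maxsplit=2)
            if len(parts) < 3:
                continue
            sha, ts, body = parts
            m = re.search(r"This reverts commit ([0-9a-f]{40})", body)
            if m and int(ts) - commit_time(m.group(1)) <= hours * 3600:
                hits.append((sha, m.group(1)))
        return hits

Cross-referencing those shas against which lines were AI-suggested would separate "more code" from "more code that survives review".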
refactor_master
2 months ago
I’m interested in earnings correlating with feature releases. Maybe you’re pushing 100% more bugs, but if you can sell twice as many buggy features as your neighbor in the same time, you might land more contracts.
It’s definitely a race-to-the-bottom scenario, but that was already the scenario we lived in before LLMs.
apercu
2 months ago
Right? I want to see the problem-ticket variance year over year, with something to normalize the data if release cadence has increased.
8note
2 months ago
I wouldn't find that convincing.
Plenty of tickets are never written because they don't seem worth tracking. An LLM speeding up development can have the opposite effect, increasing the number of tickets, because more fixes look possible than before.
apercu
2 months ago
Fair. Everything has nuance.
ChrisbyMe
2 months ago
Hey! Thanks for publishing this.
Would be interested in seeing the breakdown of uplift by company size.
e.g. I work at a FAANG and have seen an uptick in the number of lines on PRs, partly due to AI coding tools and partly due to incentives around performance reviews.
dakshgupta
2 months ago
This is a good one, wish we had included it. I'd run some analysis on this a while ago and it was pretty interesting.
An interesting subtrend is that Devin and other full async agents write the highest proportion of code at the largest companies. Ticket-to-PR hasn't worked nearly as well for startups as it has for the F500.
wrs
2 months ago
It’s hard to reach any conclusion from the quantitative code metrics in the first section because, as we all know, more code is not necessarily better. “Quantity” is not actually the same as “velocity”. And that gets to the most important question people have about AI assistance: does it help you maintain a codebase long term, or does it help you fly headlong into a ditch?
So, do you have any quality metrics to go with these?
dakshgupta
2 months ago
We weren’t able to find a good quality measure. LLM-as-judge didn’t feel right. You’re correct that without one, the data is interesting but not particularly insightful.
Morromist
2 months ago
Thanks for publishing this. People will complain about your metrics, but I would say it's useful just to have metrics of any kind at this point. People talk a lot about AI coding today without having any data, just thousands of anecdotes. This is like a glass of pure water in a desert.
I'm a bit of an AI coding skeptic btw, but I'm open to being convinced as the technology matures.
I actually think LOC is a useful metric. It may or may not be a positive thing to have more LOC, but it's data, and that's great.
I would be interested in seeing how AI has changed coding trends. Are some languages being used less because they work poorly with AI? How much is the average script length changing over time? Stuff like that. Also, how often is code being deleted and rewritten? That might not be easy to figure out (rough sketch below), but it would be interesting.
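The deletion/rewrite question might be roughly answerable from `git log --numstat` alone. A crude sketch (hypothetical, mine; `monthly_churn` is just an illustrative name) that tallies added vs. deleted lines per month as a churn signal:

    import collections
    import subprocess

    def monthly_churn():
        # NUL-prefixed date lines mark commit boundaries; numstat lines
        # follow as "added<TAB>deleted<TAB>path".
        log = subprocess.run(
            ["git", "log", "--numstat", "--date=short", "--format=%x00%cd"],
            capture_output=True, text=True, check=True).stdout
        churn = collections.defaultdict(lambda: [0, 0])  # month -> [added, deleted]
        month = None
        for line in log.splitlines():
            if line.startswith("\x00"):
                month = line[1:8]  # "YYYY-MM" from "YYYY-MM-DD"
            elif month and line.strip():
                added, deleted, _path = line.split("\t", 2)
                if added.isdigit() and deleted.isdigit():  # numstat prints "-" for binaries
                    churn[month][0] += int(added)
                    churn[month][1] += int(deleted)
        return dict(churn)

A deleted-to-added ratio that climbs after AI adoption would be one hint that code is being rewritten rather than kept.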
jacekm
2 months ago
> About a billion lines of code go through Greptile every month, and we're able to do a lot of interesting analysis on that data.
Which stats in the report come from that analysis? I see that most metrics are based on either data from your internal teams or publicly available stats from npm and PyPI.
Regardless of the source, it's still an interesting report, thank you for this!
dakshgupta
2 months ago
Thanks! The first 4 charts as well as Chart 2.3 are all from our data!
chis
2 months ago
Wish you'd show data from past years too! It's hard to know if these are seasonal trends or random variance without that.
Super interesting report though.
alienbaby
2 months ago
I actually ended up enjoying the cards after the charts more than the charts themselves, though the charts were really interesting too.
conartist6
2 months ago
It's a shame that the AI branch of the software engineering industry is so determined to make us look like complete fools.
WHY ARE THEY STILL TALKING ABOUT ADDING LINES OF CODE IF THEY KNOW HOW SOFTWARE COMPLEXITY SCALES?
I could not put it more simply: you don't get the benefit of the doubt anymore. Too many asinine things have been done, like this line-of-code-counting BS, for me not to see it as attempted fraud.
Something we know for sure is that the most productive engineers are usually neutral or negative on lines of code. Bad ones, who are costing your business money by cranking out debt: those amp up your line counts.
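That claim is at least checkable: net lines per author falls out of git history directly. A throwaway sketch (mine, hypothetical, with all the usual caveats about LOC as a measure):

    import collections
    import subprocess

    def net_lines_by_author():
        # Tally added minus deleted lines per author from `git log --numstat`.
        log = subprocess.run(
            ["git", "log", "--numstat", "--format=%x00%an"],  # %an = author name
            capture_output=True, text=True, check=True).stdout
        net = collections.Counter()
        author = None
        for line in log.splitlines():
            if line.startswith("\x00"):
                author = line[1:]
            elif author and line.strip():
                added, deleted, _path = line.split("\t", 2)
                if added.isdigit() and deleted.isdigit():
                    net[author] += int(added) - int(deleted)
        return net

If your strongest engineers really do sit near zero or negative, that shows up immediately.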
conartist6
2 months ago
I cannot believe how often I have to call out ostensibly smart AI people for saying shit that is obviously not literally true.
It's like they all forgot how to think, or forgot that other people can spot exactly where they stopped thinking critically and started going with the hype. Many lines of code good! Few lines of code bad!
dremnik
2 months ago
Very cool report. I've been looking for some data on this (memory + AI SDKs) for a while :)