In most of the world, over the past few decades, little thought has gone into who should get a university education. It has been a given that after high school you should aim for university. I studied software engineering at the most prestigious university in my country, and of the 100+ students in my group only a few (myself excluded) actually had some interest in academic work and a desire to pursue it. Most of us were just coasting - passing exams and writing mediocre papers with no expectation that anyone would ever read them after graduation.
I think that university-level and other kinds of formal education should be segregated. Universities should host fewer students and be able to provide them with higher rewards for actually meaningful work, and I believe that a flood of mediocre papers (but let's admit it, they are in fact low quality in content and perhaps good in presentation) will lead us to rebuild the education system.
Note the following comment by Jerry Ling: "The effect goes away if you search properly using the original submission date instead of the most recent submission date. By using most recent submission date, your analysis is biased because we’re so close to the beginning of 2026 so ofc we will see a peak that’s just people who have recently modified their submission."
The last-modified-date effect is even more important, because it can be used to support whatever the latest fad is, without needing to adapt data or arguments to the specifics of that fad.
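To make the date point concrete, here is a rough sketch of how the same set of papers can look flat or spiky depending on which date you bucket by. The records are made up for illustration, and the `(first_submitted, last_updated)` layout and the `count_by_year` helper are my own assumptions, not arXiv's actual schema or API.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (first_submitted, last_updated).
# Invented data for illustration only, not real arXiv numbers.
papers = [
    (date(2023, 3, 1), date(2023, 3, 1)),
    (date(2023, 9, 12), date(2026, 1, 5)),   # old paper, revised recently
    (date(2024, 5, 20), date(2024, 6, 2)),
    (date(2024, 11, 3), date(2026, 1, 9)),   # old paper, revised recently
    (date(2025, 7, 7), date(2025, 7, 7)),
    (date(2026, 1, 4), date(2026, 1, 4)),
]


def count_by_year(records, use_last_updated):
    """Count papers per year, keyed on either the first or the latest date."""
    index = 1 if use_last_updated else 0
    return Counter(record[index].year for record in records)


by_first = count_by_year(papers, use_last_updated=False)
by_latest = count_by_year(papers, use_last_updated=True)

print("by original submission date:", sorted(by_first.items()))
print("by most recent update date: ", sorted(by_latest.items()))
```

In this toy data the count for 2026 is 1 when bucketed by original submission date and 3 when bucketed by most recent update, purely because two older papers were revised in January. That is the shape of the bias being described, though it says nothing about how large the effect is in the real data.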
Well… it is happening. You can’t put spilled milk back in the bottle. What you can do is set requirements going forward that try to stop this behaviour.
E.g. the submission form could have a mandatory field: “I hereby confirm that I wrote the paper personally.” The conditions would note that violating this rule can lead to a temporary or permanent ban of the authors. In a world where research success is measured by points in WOS, this could slow the rise of LLM-generated papers.
> This approach dismisses the cases where AI submissions generally are better.
You’re perhaps missing the not so subtle subtext of Peter Woit’s post, and entire blog, which is:
While AI is getting better, it’s still not _good_ by the standards of most science. However it’s as good as hep-th where (according to Peter Woit) the bar is incredibly low. His thesis is part “the whole field is bad” and part “Arxiv for this subfield is full of human slop.”
I don’t have the background to engage with whether Peter Woit’s argument has merit, but it’s been consistent for 25+ years.
My comment was more an answer to the proposed gatekeeping of science as a human activity.
Yes, AI is still not good in the grand scheme of things. But everybody actively using it has gotten concerned over the past 2 months by the leapfrogging of LLMs - and surprised, as they thought we had arrived at the plateau.
We will see in a year or two if humans still hold an advantage in research - currently very few do in software development, despite what they think about themselves.
The other side of the coin is: automating science as a machine activity.
Is that what we want? I agree with you that the use of language models in science is an inevitable paradigm shift, but now is the time to make collective decisions about how we're going to assimilate this increasingly super-human "intelligence" into academic practices, and the rest of daily life. Otherwise we will be the ones being assimilated by a force beyond our control.
The progress is so rapid that the only people who might have control over the process are the ones with self-interest, mainly financial, and not aligned with - in some aspects opposed to - the interests of humanity.
I assume hep = high energy physics in this context. PI = professor who received a government grant.
Peer review has never really been blind and I suspect PIs will reject papers from "outsiders" even if they are higher quality. This already happens to some extent today when the stakes are lower.
Kinda. PI is principal investigator and usually they’re a professor with a grant (the grant being the thing they are the principal of investigating). That part is right. But they’re not really directly in the review loop. For some fields where things are small enough that folks can recognize style such as it exists, you could see reviewers passing over unfamiliar work and promoting familiar work. That was not the issue.
The issue was that it still was kind of hard to produce crappy mid rate papers, so you kind of needed the infrastructure of a small lab to do that. Now you don’t. The success rate for those mediocre papers produced by grad students and postdocs will go way down. It is possible that will cease to be a useful signal for those early career researchers.
But peer review (circa 1965-2010[1]) is just the prior iteration of the problem[2]; the wave of crap[3] produced by publish or perish (circa 1950-present[4]). Rejecting papers by outsiders is irrelevant; the problem is we want to determine which papers are good/interesting/worth considering out of the fire hose of bilge, and, though we were already arguably failing at this, the problem just got harder.
(I say arguably, because there is always the old "try it yourself and see if it actually works" trick, but nobody seems to be fond of this; it smacks of "do your own research" and we're lazy monkeys at heart, who would much rather copy off of someone else's homework.)
Peer review isn’t the issue here. His comments are about Arxiv, which is a preprint server. Essentially anyone can publish a preprint. There’s no peer or other review involved.
This is a common misconception. People without academic affiliation (based on their email address) require someone to vouch for them before they can submit to arxiv. And papers submitted to arxiv (with or without affiliation) are reviewed, and many are rejected.
arXiv does not review everything pushed to the site.
It's very easy to get in. It's becoming a common target for grifters who will "publish" papers on arXiv because it looks formal to those who don't know any better.
>Peer review has never really been blind and I suspect PIs will reject papers from "outsiders" even if they are higher quality.
I'm a complete outsider (not even in academia at all) and just got a paper accepted in the top math biology journal [1]. But granted, it took literally years to write it up and get it through. I do really worry that without academic affiliation it is going to get harder and harder for outsiders as gates are necessarily kept more and more securely because of all the slop.
> submission numbers in the last couple months have nearly doubled with respect to the stable numbers of previous years
This is showing up (no pun intended) on HN as well. The # of submissions and # of submitters, which traditionally had been surprisingly stable—fluctuating within a fixed range for well over 10 years—has recently been reaching all-time highs. Not double, though...yet.
I would imagine tons of them are bots. They're getting hard to distinguish, they don't do the normal tropes any longer. They'll type in all lowercase, they'll have the creator post manually to throw you off, they'll make multiple comments within 45 seconds that a normal human couldn't do. All things I've witnessed here over the past couple of weeks. And those are just the ones I've caught.
I've noticed a pretty significant uptick in new accounts posting complete garbage. I don't mean the comments are bad, they're not even words in many cases.
But it also seems some topics (in particular AI) attract a lot of accounts that post incredibly low quality comments, far below the quality you'd expect from HN. Often it's in reasonable English, but it's just inane reddit-level drivel. Unclear if these topics attract low quality posters, or if these are bot accounts.
Also looking at the three first pages of /noobcomments, we find 28 comments with EM-dashes in them. That's not proof of AI, but if you compare with /newcomments, you find exactly one EM-dash going back as far. That's a bit of a statistical aberration.
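For anyone who wants to repeat that comparison, a minimal sketch of the counting step might look like the following. It assumes you already have the comment texts as plain strings; the `em_dash_rate` helper and the sample comments are hypothetical, and fetching the pages is left out entirely.

```python
EM_DASH = "\u2014"  # the em-dash character, written as an escape for clarity


def em_dash_rate(comments):
    """Return the fraction of comments containing at least one em-dash."""
    if not comments:
        return 0.0
    hits = sum(1 for text in comments if EM_DASH in text)
    return hits / len(comments)


# Invented sample texts, standing in for two scraped comment listings.
new_account_comments = [
    "Great point" + EM_DASH + "this changes everything.",
    "I agree" + EM_DASH + "and it is worth noting the nuance here.",
    "lol no",
    "Fascinating" + EM_DASH + "truly a paradigm shift.",
]
all_recent_comments = [
    "Did you read the article?",
    "This is wrong - see the arXiv listing.",  # plain hyphen, not an em-dash
    "Works on my machine.",
    "Source?",
]

print("new accounts:", em_dash_rate(new_account_comments))
print("all recent:  ", em_dash_rate(all_recent_comments))
```

As the comment says, a gap between the two rates is not proof of AI on its own. It is one cheap signal, and a plain hyphen used as a dash (as in the second sample list) will not be counted.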
I've witnessed bots here on accounts that are years old with no history that start posting multiple times in a short timeframe suddenly after being dormant forever. Makes you wonder how they're getting these old accounts. It's not just new ones.
Black market accounts. Some human made them years ago for a price, they are sat on by some black market / grey market guy and now he's selling the accounts for a profit.
Old accounts from multiple social media platforms have a $$$$$ value.
I would wager the vast majority are alt accounts of existing users. People who don't want to risk their karma or reputation but who do want to go mask off for certain subjects. After that it's bound to be bots run by HN users. I just don't think HN is so popular that the rush of green accounts popping up actually represents new users. Maybe I'm wrong, though.
Reddit has been shedding its techy enthusiast crowd for the past few years with the combination of policy changes and insufficient moderation against LLM bots. I wonder if that’s contributing.
It's likely people with mediocre ideas but access to free LLM tools are able to get over the care-risk-reward activation energy and consequently submit their ideas with the help of LLMs.
Would it matter? Even before AI, most papers couldn't be replicated. Do we really think this is going to help the situation? Even if some of the AI papers are amazing, will anyone ever read them if most of the papers are useless? More research != more useful research. This is the logical outcome of publish or perish and Q-rankings being the main metric used.
"When a metric becomes a target, it ceases to be a good measure" - Goodhart's law
It has a lot of red flags. Second (re)post of dormant account, vibe coded, AI, the biological model is horrible. But it was a nice project, 5/5 would upvote again.
Perhaps the important detail is "[I] spent about a month on it."
curious whether the quality distribution changes too, or just the volume. arXiv can't really downvote noise but HN can at least flag/bury it. might be why the doubling shows up on arXiv first and HN is catching up more slowly.
One thing I have been guilty of, even though I am an AI maximalist, is asking the question: "If AI is so good, why don't we see X". Where X might be (in the context of vibe coding) the next redis, nginx, sqlite, or even linux.
But I really have to remember, we are at the leading edge here. Things take time. There is an opening (generation) and a closing (discernment). Perhaps AI will first generate a huge amount of noise and then whittle it down to the useful signal.
If that view is correct, then this is solid evidence of the amplification of possibility. People will decry the increase of noise, perhaps feeling swamped by it. But the next phase will be separating the wheat from the chaff. It is only in that second phase that we will really know the potential impact.
Waiting for the wave of shit LLM-generated games on Steam. That'll be when I really know that LLMs have solved coding.
Though I'm old enough to remember the wave of shit outsourced-developer-coded games on CD that used to sell for $5 a pop at supermarkets (whole bargain bins full of them), so maybe this is nothing new and the market will take care of it automagically again.
Or maybe this will be like the wave of shit Flash games that happened in the early 2000's, that was actually awesome because while 99% of them were shit, 1% were great (and some of those old, good, Flash games are still going, with version 38453745 just released on Steam).
The cynical part of me thinks that software has peaked. New languages and technology will be derivatives of existing tech. There will be no React successor. There will never be a browser that can run something other than JS. And the reason for that is because in 20 years the new engineers will not know how to code anymore.
The optimist in me thinks that the clear progress in how good the models have gotten shows that this is wrong. Agentic software development is not a closed loop.
I often find myself wondering about these things in the context of star trek... like... could Geordi actually code? Could he actually fix things? Or did the computer do all the heavy lifting. They asked "the computer" to do SO MANY things that really parallel today's direction with "AI". Even Data would ask the computer to do gobs of simulations.
Is the value in knowing how to do an operation by hand, or is the value in knowing WHICH operation to do?
That's an interesting possibility to consider. Presumably the effect would also be compounded by the fact that there's a massive amount of training data for the incumbent languages and tools, further handicapping new entrants.
However, there will be a large minority of developers who will eschew AI tools for a variety of reasons, and those folks will be the ones to build successors.
We have witnessed, over the past few years, an "AI fair use" Pearl Harbor sneak attack on intellectual property.
The lesson has been learned:
In effect, intellectual property used to train LLMs becomes anonymous common property. My code becomes your code with no acknowledgement of authorship or lineage, with no attribution or citation.
The social rewards (e.g., credit, respect) that often motivate open source work are undermined. The work is assimilated and resold by the AI companies, reducing the economic value of its authors.
The images, the video, the code, the prose, all of it stolen to be resold. The greatest theft of intellectual property in the history of Man.
Copyright was always supposed to be a bargain with authors for the ultimate benefit of the public domain. If AI proves to be more beneficial to the public interest than copyright, then copyright will have to go.
You can argue for compromise -- for peaceful, legal coexistence between Big Copyright and Big AI -- but that will just result in a few privileged corporations paywalling all of the purloined training data for their own benefit. Instead of arguing on behalf of legacy copyright interests, consider fighting for open models instead.
In a larger historical context, nothing all that special is happening either way. We pulled copyright law out of our asses a couple hundred years ago; it can just as easily go back where it came from.
There is another lunatic possibility: the AI explosion yields an execution model and programming paradigm that renders most preexisting approaches to coding irrelevant.
We have been stuck in the procedural treadmill for decades. If anything this AI boom is the first major sign of that finally cracking.
Friction is the entire point in human organizations. I'd wager AI is being used to build boondoggles - apps that have no value. They are being found out fast.
On the other side of things, my employer decided they did not want to pay for a variety of SaaS products. Instead, a few of my colleagues got together and built a tool that used Trino, OPA, and a backend/frontend, to reduce spend by millions/year. We used Trino as a federated query engine that calls back to OPA, which are updated via code or a frontend UI. I believe 'Wiz' does something similar, but they're security focused, and have a custom eBPF agent.
Also on the list to knock out, as we're not impressed with Wiz's resource usage.
Shouldn’t that mean any software development positions will lean more towards research? If you need new algorithms, but never need anyone to integrate them.
> New languages and technology will be derivatives of existing tech.
This has always been true.
> There will be no React successor.
No one needs one, but you can have one by just asking the AI to write it if that's what we need.
> There will never be a browser that can run something other than JS.
Why not, just tell the AI to make it.
> And the reason for that is because in 20 years the new engineers will not know how to code anymore.
They may not need to know how to code but they should still be taught how to read and write in constructed languages like programming languages. Maybe in the future we don't use these things to write programs but if you think we're going to go the rest of history with just natural languages and leave all the precision to the AI, revisit why programming languages exist in the first place.
Somehow we have to communicate precise ideas between each other and the LLM, and constructed languages are a crucial part of how we do that. If we go back to a time before we invented these very useful things, we'll be talking past one another all day long. The LLM having the ability to write code doesn't change that we have to understand it; we just have one more entity that has to be considered in the context of writing code. e.g. sometimes the only way to get the LLM to write certain code is to feed it other code, no amount of natural language prompting will get there.
> Maybe in the future we don't use these things to write programs but if you think we're going to go the rest of history with just natural languages and leave all the precision to the AI, revisit why programming languages exist in the first place.
> The LLM having the ability to write code doesn't change that we have to understand it; we just have one more entity that has to be considered in the context of writing code. e.g. sometimes the only way to get the LLM to write certain code is to feed it other code, no amount of natural language prompting will get there.
You don't exactly need to use PLs to clarify an ambiguous requirement, you can just use a restricted unambiguous subset of natural language, like what you should do when discussing or elaborating something with your coworker.
Indeed, like terms & conditions pages, which people always skip because they're written in a "legal language", using a restricted unambiguous subset of natural language to describe something is always much more verbose and unwieldy compared to "incomprehensible" mathematical notation & PLs, but it's not impossible to do so.
With that said, the previous paragraph will work if you're delegating to a competent coworker. It should work on "AGI" too if it exists. However, I don't think it will work reliably in present-day LLMs.
This cuts both ways. If you were an average programmer in love with FreePascal 20 years ago, you'd have to trudge in darkness, alone.
Now you can probably create a modern package manager (uv/cargo), a modern package repository (Artifactory, etc) and a lot of a modern ecosystem on top of the existing base, within a few years.
10 skilled and highly motivated programmers can probably try to do what Linus did in 1991 and they might be able to actually do it now all the way, while between 1998 and now we were basically bogged down in Windows/Linux/MacOS/Android/iOS.
This massively confusing phase will last a surprisingly long time, and will conclude only if/when definitive proof of superintelligence arrives, which is something a lot of people are clearly hoping never happens.
Part of the reason for that is such a thing would seek to obscure that it has arrived until it has secured itself.
I've been calling this Software Collapse, similar to AI Model Collapse.
An AI vibe-coded project can port tool X to a more efficient Y language implementation and pull in algorithm ideas A, B, C from competing implementations. And another competing vibe coding team can do the same, except Z language implementation with algorithms A, B, skip C, and add D. However, fundamentally new ideas aren't being added: This is recombination, translation, and reapplication of existing ideas and tools. As the cost to clone good ideas goes to zero, software converges towards the existing best ideas & tools across the field and stops differentiating.
It's exciting as a senior engineer or subject matter expert, as we can act on the good ideas we already knew but never had the time or budget for. But projects are also getting less differentiated and competitive. Likewise, we're losing the collaborative filtering era of people voting with their feet on which to concentrate resources into making a success. Things are getting higher quality but bland.
The frontier companies are pitching they can solve AI Creativity, which would let us pay them even more and escape the ceiling that is Software Collapse. However, as an R&D engineer who uses these things every day, I'm not seeing it.
"Bland" is not a bad thing. The FLOSS ecosystem we have today is quite "bland" already compared to the commercial and shareware/free-to-use software ecosystem of the 1980s and 1990s. It's also higher quality by literally orders of magnitude, and saves a comparable amount of pointless duplicative effort.
Hopefully AI will be a similar story, especially if human reviewing/surveying effort (the main bottleneck if AI coding proves effective) can be mitigated via the widespread adoption of rigorous formal methods, where only the underlying specification has to be reviewed whereas its implementation is programmatically checkable.
The dark side of this is that everyone has graduated to prompt engineering and there's no one with expertise left who can debug it. We'll be entirely dependent on AIs to do the debugging too. When whoever controls the AIs decides to enshittify that service, we'll be truly screwed. That is, if we can't run competitive models locally at reasonable efficiency and price.
I don't know how this will play out, except that I've been so cowed by the past 15 years of enshittification that I don't feel hopeful.
The human operator controls what gets built. If they want to build Redis 2, they can specify it and have it built. If you can't take my word for it, take those of the creator of Redis: https://antirez.com/news/159
This is probably an outdated understanding of how LLMs work. Modern LLMs can reason and they are creative, at least if you don't mind stretching the meaning of those words a bit.
The thing they currently lack is the social skills, ambition, and accountability to share a piece of software and get adoption for it.
There are many really excellent papers out there - the kind which will save you hours/months of work (or even make things that were previously inviable to build viable).
That said, it is amazing how terrible a lot of papers are; people are pressured to publish and therefore seem to get into weird ruts trying to do what they think will be published, rather than what is intellectually interesting...
“And further, by these, my son, be admonished: of making many books there is no end; and much study is a weariness of the flesh.”
- Ecclesiastes 12:12 (KJV)
I suppose we’re entering TURBO mode for ‘of making many books there is no end’.
People used to spam out masses of low-quality scientific papers in a scattergun approach to gain fame and citations, and they still do, but now they do it more, because LLMs churn it out faster than students.
The shilling for AI continues. How much $$$ do the big tech companies pay Columbia? Oh yeah, and what exactly did Columbia agree to do to get the trmp admin to leave them alone? All speculation of course, but the circumstantial picture stinks.
I think the long term impact of this will be to strengthen the importance of social ties in academic publishing. As it is there are so many papers published in many fields that people tend to filter for papers published by big names and major institutions. But the inevitable torrent of AI slop will overwhelm anyone who is looking for any gems coming from outsiders. I suspect the net effect will be to make it even more important that you join a big name institution in order to be taken seriously.
I like AI, I use Codex and ChatGPT like most people do, but I have to say that I am pretty tired of low-effort crap taking over everything, particularly YouTube.
There have always been content mills, but there was still some cost with producing the low-effort "Top 10" or "Iceberg Examination" videos. Now I will turn on a video about any topic, watch it for three minutes, immediately get a kind of uncanny vibe, and then the AI voice will make a pronunciation mistake (e.g. confusing wind, like the weather effect or the winding of a spring), or the script starts getting redundant or repetitive in ways that are common with AI.
And I suspect these kinds of videos will become more common as time goes on. The cost of producing these videos is getting close to "free", meaning that it doesn't take much to make a profit on them, even if their views are relatively low per-video.
If AI has taught me anything, it's that there still is no substitute for effort. I'm sure AI is used in plenty of places where I don't notice it, because the people who used it still put in effort to make a good product. There are people who don't just make a prompt like "make me a fifteen minute video about Chris Chan" and "generate me a thumbnail with Chris Chan with the caption 'he's gone too far'", and instead will use AI as a tool to make something neat.
Genuine effort is hard, and rare, and these AI videos can give the facsimile of something that prior to 2023 was high effort. I hate it.
I think this is solid proof that the bedrock of academia is deeply motivated by money and still defaults to optimizing where it impacts its bottom line. If professors can get more grants and more publications in less time with less spending, of course they are going to be doing that. This isn't just because of AI, but also because of how this system is designed in the first place.
> I think this is solid proof that the bedrock of academia is deeply motivated by money and still defaults to optimizing where it impacts its bottom line.
no shit - could've asked literally anyone that's finished their phd to save yourself the conjecturing/hypothesizing about this fact.
This is stupid. Nobody motivated by money is in academia. Academics are motivated by curiosity, but also prestige, vanity and the wish to hire students and collaborators. And on top of human vanity working its magic, the ideology that everything should be a market and competition is the final form of social organisation has pervaded academia just as much as everything else.
I agree that the system of publishing papers to gain prestige to gain resources to publish papers was already broken pre AI.
You're right that being a scientist is unlikely to result in personal wealth and so that's not the primary drive for those who seek faculty or research positions. However, it's not just curiosity, prestige and vanity either, because a big factor for promotion and tenure is how much grant money you bring in. That money is what keeps the university's lights on and buys the lab equipment and pays the grad students, so it's still money as a primary driver in the background.
My dad said he stopped being a professor because of that.
He liked the research, and he even liked teaching, but he absolutely hated having to constantly try and find grant money. He said he ended up seeing everyone as "potential funders" and less like "people" because his job kind of depended on it, and it ended up burning him out. He lasted four years and went into engineering.
I don't know that "motivation" is the right word for it, because I don't think professors like having to find grant money all the time. I think most people who get PhDs and try to go to academia do it for a genuine love for the subject, and they find the grant-searching to be a necessary evil part of the job; it's more "survival" than regular motivation, though I am admittedly splitting hairs here.
Noise is going to be the coming years' biggest issue for so many fields. It's a losing battle, like arguing with a conspiracy-minded relative: you can slowly and clearly address one conspiracy and disprove it, but by the time you do, they are deep into 8 new ones.
The number of submissions to the high energy physics category on arXiv is double this year compared to the historical average. The author hypothesizes the increase is due to papers being written by LLMs.
It is happening that people can now find out what articles are about by clicking the links to said articles and reading them! It's an amazing world, man. The future!
Honestly, this is good. We were already in a completely unsustainable system. Nobody had an alternative. We still don’t have one but at least now it’s not just merely unsustainable— it is completely fucked in half.
This kind of pattern is gonna get repeated in a lot of sectors when previous practices that were merely unsustainable become unsustained.
This has been my optimistic take on the situation for the last two years. My pessimistic take is that social systems have an incredible ability to persist in a state of utter fuckedness much longer than seems reasonably possible.
Yeah and like…who knows if what is coming is better. Maybe big labs cartelize and withdraw from the global publication market (which is already unraveling). Maybe we ban theory and demand all papers be empirical, though that will amount to the same thing: seizure of publication by big actors.
As you point out, human systems are machines for making do. There is no guarantee that dramatic pressures produce dramatic change. But I think we’ll see something weird, soon.
Honestly, publication has been pretty meaningless for a long time, long before AI could generate complete paragraphs. "Publish or perish" meant that a lot of human-generated slop was being published by people who were put in a position of perverse incentives by a "well-meaning" (?) system. There will still be meaningful contributions, but they'll be as rare as they ever were.
I mean... I dunno I wish the AI could write my papers. I ask it to and it's just bad. The research models return research that doesn't look anything like the research I do on my own -- half of it is wrong, the rest is shallow, and it's hardly comprehensive despite having access to everything (it will fail to find things unless you specifically prompt for them, and even then if the signal is too low it'll be wrong about it). So I can't even trust it to do something as simple as a literature review.
Insofar as most research is awful, it's true that the AI is producing research that looks and sounds like most of it out there today. But common-case research is not what propels society forward. If we try to automate research with the mediocrity machine, we'll just get mediocre research.
If someone mentions Sabine Hossenfelder and it isn't to expose her as a rage-bait intellectual dark web grifter, then it puts that person in a suspect light.
I don't think this is appreciated enough: a lot of AI adoption is not happening because of cost at the expense of quality. Quite the opposite.
I am in the process of switching my company's use of Retool for an AI-generated backoffice.
First and foremost for usability, velocity and security.
Secondly, we also save a buck.
You’re perhaps missing the not so subtle subtext of Peter Woit’s post, and entire blog, which is:
While AI is getting better, it’s still not _good_ by the standards of most science. However it’s as good as hep-th where (according to Peter Woit) the bar is incredibly low. His thesis is part “the whole field is bad” and part “Arxiv for this subfield is full of human slop.”
I don’t have the background to engage with whether Peter Woit’s argument has merit, but it’s been consistent for 25+ years.
Yes, AI is still not good in the grand scheme of things. But everybody actively using it has gotten concerned over the past 2 months by the leapfrogging of LLMs - and surprised, as they thought we had arrived at the plateau.
We will see in a year or two if humans still hold an advantage in research - currently very few do in software development, despite what they think about themselves.
The other side of the coin is: automating science as a machine activity.
Is that what we want? I agree with you that the use of language models in science is an inevitable paradigm shift, but now is the time to make collective decisions about how we're going to assimilate this increasingly super-human "intelligence" into academic practices, and the rest of daily life. Otherwise we will be the ones being assimilated by a force beyond our control.
The progress is so rapid that the only people who might have control over the process are the ones with self-interest, mainly financial, and not aligned with - in some aspects opposed to - the interests of humanity.
Peer review has never really been blind and I suspect PIs will reject papers from "outsiders" even if they are higher quality. This already happens to some extent today when the stakes are lower.
The issue was that it still was kind of hard to produce crappy mid rate papers, so you kind of needed the infrastructure of a small lab to do that. Now you don’t. The success rate for those mediocre papers produced by grad students and postdocs will go way down. It is possible that will cease to be a useful signal for those early career researchers.
(I say arguably, because there is always the old "try it yourself and see if it actually works" trick, but nobody seems to be fond of this; it smacks of "do your own research" and we're lazy monkeys at heart, who would much rather copy off of someone else's homework.)
[1] https://books.google.com/ngrams/graph?content=peer+review&ye...
[2] https://www.experimental-history.com/p/the-rise-and-fall-of-...
[3] https://journals.plos.org/plosmedicine/article?id=10.1371/jo...
[4] https://books.google.com/ngrams/graph?content=publish+or+per...
You are right that arXiv is an invite-only website, but once you are in, there is no peer review of any form.
It's very easy to get in. It's becoming a common target for grifters who will "publish" papers on arXiv because it looks formal to those who don't know any better.
I'm a complete outsider (not even in academia at all) and just got a paper accepted in the top math biology journal [1]. But granted, it took literally years to write it up and get it through. I do really worry that without academic affiliation it is going to get harder and harder for outsiders as gates are necessarily kept more and more securely because of all the slop.
[1] "Specieslike clusters based on identical ancestor points" https://philpapers.org/archive/ALESCB.pdf
This is showing up (no pun intended) on HN as well. The # of submissions and # of submitters, which traditionally had been surprisingly stable—fluctuating within a fixed range for well over 10 years—has recently been reaching all-time highs. Not double, though...yet.
I collected a few of them: https://news.ycombinator.com/item?id=47130684
But it also seems some topics (in particular AI) attract a lot of accounts that post incredibly low quality comments, far below the quality you'd expect from HN. Often it's in reasonable English, but it's just inane reddit-level drivel. Unclear if these topics attract low quality posters, or if these are bot accounts.
Also looking at the three first pages of /noobcomments, we find 28 comments with EM-dashes in them. That's not proof of AI, but if you compare with /newcomments, you find exactly one EM-dash going back as far. That's a bit of a statistical aberration.
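For anyone who wants to repeat that comparison, here is a minimal sketch of the counting step. It assumes you have already collected the comment texts from each listing into plain Python lists; the sample strings and the names `count_with_em_dash` and `em_dash_rate` are made up for illustration, and nothing here fetches real HN data.

```python
EM_DASH = "\u2014"  # the em-dash character, written as an escape


def count_with_em_dash(comments):
    """Return how many comments contain at least one em-dash."""
    return sum(1 for text in comments if EM_DASH in text)


def em_dash_rate(comments):
    """Return the fraction of comments containing an em-dash (0.0 if empty)."""
    if not comments:
        return 0.0
    return count_with_em_dash(comments) / len(comments)


# Hypothetical samples standing in for scraped comment texts.
noob_sample = [
    "Great point \u2014 thanks for sharing.",
    "I agree.",
    "This \u2014 frankly \u2014 is odd.",
    "No dash here.",
]
new_sample = [
    "I agree.",
    "Interesting - though I am not sure.",
    "Nope.",
    "Citation needed.",
]

noob_hits = count_with_em_dash(noob_sample)
new_hits = count_with_em_dash(new_sample)
print(f"noob: {noob_hits}/{len(noob_sample)}, new: {new_hits}/{len(new_sample)}")
```

Comparing rates rather than raw counts matters if the two samples are not the same size. As the comment says, a gap like this is suggestive, not proof of AI.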
Old accounts from multiple social media platforms have a $$$$$ value.
"When a metric becomes a target, it ceases to be a good measure" - Goodhart's law
Now that I think of this, whoever solves this well will have the next hyperscaler.
It has a lot of red flags. Second (re)post of dormant account, vibe coded, AI, the biological model is horrible. But it was a nice project, 5/5 would upvote again.
Perhaps the important detail is "[I] spent about a month on it."
Given that arXiv lacks peer review, I'm not clear what quality bar is being referenced here.
But I really have to remember, we are at the leading edge here. Things take time. There is an opening (generation) and a closing (discernment). Perhaps AI will first generate a huge amount of noise and then whittle it down to the useful signal.
If that view is correct, then this is solid evidence of the amplification of possibility. People will decry the increase of noise, perhaps feeling swamped by it. But the next phase will be separating the wheat from the chaff. It is only in that second phase that we will really know the potential impact.
Though I'm old enough to remember the wave of shit outsourced-developer-coded games on CD that used to sell for $5 a pop at supermarkets (whole bargain bins full of them), so maybe this is nothing new and the market will take care of it automagically again.
Or maybe this will be like the wave of shit Flash games that happened in the early 2000's, that was actually awesome because while 99% of them were shit, 1% were great (and some of those old, good, Flash games are still going, with version 38453745 just released on Steam).
The optimist in me thinks that the clear progress in how good the models have gotten shows that this is wrong. Agentic software development is not a closed loop.
Is the value in knowing how to do an operation by hand, or is the value in knowing WHICH operation to do?
However, there will be a large minority of developers who will eschew AI tools for a variety of reasons, and those folks will be the ones to build successors.
We have witnessed, over the past few years, an "AI fair use" Pearl Harbor sneak attack on intellectual property.
The lesson has been learned:
In effect, intellectual property used to train LLMs becomes anonymous common property. My code becomes your code with no acknowledgement of authorship or lineage, with no attribution or citation.
The social rewards (e.g., credit, respect) that often motivate open source work are undermined. The work is assimilated and resold by the AI companies, reducing the economic value of its authors.
The images, the video, the code, the prose, all of it stolen to be resold. The greatest theft of intellectual property in the history of Man.
Copyright was always supposed to be a bargain with authors for the ultimate benefit of the public domain. If AI proves to be more beneficial to the public interest than copyright, then copyright will have to go.
You can argue for compromise -- for peaceful, legal coexistence between Big Copyright and Big AI -- but that will just result in a few privileged corporations paywalling all of the purloined training data for their own benefit. Instead of arguing on behalf of legacy copyright interests, consider fighting for open models instead.
In a larger historical context, nothing all that special is happening either way. We pulled copyright law out of our asses a couple hundred years ago; it can just as easily go back where it came from.
We have been stuck in the procedural treadmill for decades. If anything this AI boom is the first major sign of that finally cracking.
On the other side of things, my employer decided they did not want to pay for a variety of SaaS products. Instead, a few of my colleagues got together and built a tool that used Trino, OPA, and a backend/frontend, to reduce spend by millions/year. We used Trino as a federated query engine that calls back to OPA, whose policies are updated via code or a frontend UI. I believe 'Wiz' does something similar, but they're security focused, and have a custom eBPF agent.
Also on the list to knock out, as we're not impressed with Wiz's resource usage.
This has always been true.
> There will be no React successor.
No one needs one, but you can have one by just asking the AI to write it if that's what we need.
> There will never be a browser that can run something other than JS.
Why not, just tell the AI to make it.
> And the reason for that is because in 20 years the new engineers will not know how to code anymore.
They may not need to know how to code but they should still be taught how to read and write in constructed languages like programming languages. Maybe in the future we don't use these things to write programs but if you think we're going to go the rest of history with just natural languages and leave all the precision to the AI, revisit why programming languages exist in the first place.
Somehow we have to communicate precise ideas between each other and the LLM, and constructed languages are a crucial part of how we do that. If we go back to a time before we invented these very useful things, we'll be talking past one another all day long. The LLM having the ability to write code doesn't change that we have to understand it; we just have one more entity that has to be considered in the context of writing code. e.g. sometimes the only way to get the LLM to write certain code is to feed it other code, no amount of natural language prompting will get there.
Indeed, like terms & conditions pages, which people always skip because they're written in a "legal language", using a restricted unambiguous subset of natural language to describe something is always much more verbose and unwieldy compared to "incomprehensible" mathematical notation & PLs, but it's not impossible to do so.
With that said, the previous paragraph will work if you're delegating to a competent coworker. It should work on "AGI" too if it exists. However, I don't think it will work reliably in present-day LLMs.
Now you can probably create a modern package manager (uv/cargo), a modern package repository (Artifactory, etc) and a lot of a modern ecosystem on top of the existing base, within a few years.
10 skilled and highly motivated programmers can probably try to do what Linus did in 1991 and they might be able to actually do it now all the way, while between 1998 and now we were basically bogged down in Windows/Linux/MacOS/Android/iOS.
Part of the reason for that is such a thing would seek to obscure that it has arrived until it has secured itself.
So get used to being ever more confused.
An AI vibe-coded project can port tool X to a more efficient Y language implementation and pull in algorithm ideas A, B, C from competing implementations. And another competing vibe coding team can do the same, except Z language implementation with algorithms A, B, skip C, and add D. However, fundamentally new ideas aren't being added: This is recombination, translation, and reapplication of existing ideas and tools. As the cost to clone good ideas goes to zero, software converges towards the existing best ideas & tools across the field and stops differentiating.
It's exciting as a senior engineer or subject matter expert, as we can act on the good ideas we already knew but never had the time or budget for. But projects are also getting less differentiated and competitive. Likewise, we're losing the collaborative filtering era of people voting with their feet on which to concentrate resources into making a success. Things are getting higher quality but bland.
The frontier companies are pitching they can solve AI Creativity, which would let us pay them even more and escape the ceiling that is Software Collapse. However, as an R&D engineer who uses these things every day, I'm not seeing it.
"Bland" is not a bad thing. The FLOSS ecosystem we have today is quite "bland" already compared to the commercial and shareware/free-to-use software ecosystem of the 1980s and 1990s. It's also higher quality by literally orders of magnitude, and saves a comparable amount of pointless duplicative effort.
Hopefully AI will be a similar story, especially if human reviewing/surveying effort (the main bottleneck if AI coding proves effective) can be mitigated via the widespread adoption of rigorous formal methods, where only the underlying specification has to be reviewed whereas its implementation is programmatically checkable.
I don't know how this will play out, except that I've been so cowed by the past 15 years of enshittification that I don't feel hopeful.
The thing they currently lack is the social skills, ambition, and accountability to share a piece of software and get adoption for it.
That said, it is amazing how terrible a lot of papers are; people are pressured to publish and therefore seem to get into weird ruts trying to do what they think will be published, rather than what is intellectually interesting...
/jk
I suppose we’re entering TURBO mode for ‘of making many books there is no end’.
There have always been content mills, but there was still some cost with producing the low-effort "Top 10" or "Iceberg Examination" videos. Now I will turn on a video about any topic, watch it for three minutes, immediately get a kind of uncanny vibe, and then the AI voice will make a pronunciation mistake (e.g. confusing wind, like the weather effect or the winding of a spring), or the script starts getting redundant or repetitive in ways that are common with AI.
And I suspect these kinds of videos will become more common as time goes on. The cost of producing these videos is getting close to "free", meaning that it doesn't take much to make a profit on them, even if their views are relatively low per-video.
If AI has taught me anything, it's that there still is no substitute for effort. I'm sure AI is used in plenty of places where I don't notice it, because the people who used it still put in effort to make a good product. There are people who don't just make a prompt like "make me a fifteen minute video about Chris Chan" and "generate me a thumbnail with Chris Chan with the caption 'he's gone too far'", and instead will use AI as a tool to make something neat.
Genuine effort is hard, and rare, and these AI videos can give the facsimile of something that prior to 2023 was high effort. I hate it.
no shit - could've asked literally anyone that's finished their phd to save yourself the conjecturing/hypothesizing about this fact.
I agree that the system of publishing papers to gain prestige to gain resources to publish papers was already broken pre AI.
Can you please make your substantive points without swipes or calling names? This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.
Your comment would be fine without that first bit.
He liked the research, and he even liked teaching, but he absolutely hated having to constantly try and find grant money. He said he ended up seeing everyone as "potential funders" and less like "people" because his job kind of depended on it, and it ended up burning him out. He lasted four years and went into engineering.
I don't know that "motivation" is the right word for it, because I don't think professors like having to find grant money all the time. I think most people who get PhDs and try to go to academia do it for a genuine love for the subject, and they find the grant-searching to be a necessary evil part of the job; it's more "survival" than regular motivation, though I am admittedly splitting hairs here.
what would be a better one?
in every domain, simultaneously
essentially, the end of the progress of humanity
This kind of pattern is gonna get repeated in a lot of sectors when previous practices that were merely unsustainable become unsustained.
As you point out, human systems are machines for making do. There is no guarantee that dramatic pressures produce dramatic change. But I think we’ll see something weird, soon.