This article feels extremely imprecise. The syntax of the "language" changes from example to example, control structures like conditionals are expressed in English prose, some examples are solved by "do all the work for me" functions like the "toPdf()" example...
This whole thing feels like an elaborate LLM fantasy. Is there any real, usable language behind these examples, or is the author just role-playing with ChatGPT?
I know ACM Queue is a non-peer-reviewed magazine for practitioners, but this still feels too much like an advertisement, without any attempt whatsoever to discuss downsides or limitations. This really doesn't inspire confidence:
>> While this may seem like a whimsical example, it is not intrinsically easier or harder for an AI model compared to solving a real-world problem from a human perspective. The model processes both simple and complex problems using the same underlying mechanism. To lessen the cognitive load for the human reader, however, we will stick to simple targeted examples in this article.
For LLMs this is blatantly false - in fact asking about "used textbooks" instead of "apples" is measurably more likely to result in an error! Maybe the (deterministic, Prolog-style) Universalis language mitigates this. But since Automind (an LLM, I think) is responsible for pre/post validation, naively I would expect it to sometimes output incorrect Universalis code and incorrectly claim an assertion holds when it does not.
Maybe I am making a mountain out of a molehill but this bit about "lessen the cognitive load of the human reader" is kind of obnoxious. Show me how this handles a slightly nontrivial problem, don't assume I'm too stupid to understand it by trying to impress me with the happy path.
Prolog does indeed work very well as a target for generation by an LLM, for input problems limited and similar enough in nature to a given class of templated in-context examples; so well, in fact, that the lack of a succinct, exhaustive text description of your problem becomes the bottleneck. At that point you can specify your problem in Prolog directly, considering Prolog was also invented to model natural-language parsing and not just to solve constraint/logic problems, or you could employ ILP techniques to learn or optimize Prolog solvers from existing problem solutions rather than from text descriptions. See [1].
That link you're citing is old news, and it is also contained/discussed in the Quantum Prolog article. Those observations were made with respect to translating problem descriptions into PDDL, a frame-based, LISPish specification language for AI competitions that encodes planning "domains" in a tight a priori taxonomical framework rather than in logic or any other Turing-complete language. As such, contrary to what's speculated in that link, the expectation is that those results do not carry over to the case of Prolog, which is much more expressive. I actually considered customizing an LLM using an English/Prolog corpus, which should be relatively straightforward given Prolog's NLP roots, but the in-context techniques turned out to be so impressive already (using 2025 SoTA open-weight models) that the bottleneck was indeed the lack of text descriptions for really challenging real-world problems, as mentioned in the article. The reason may be that English-to-Prolog mapping examples and/or English documentation of Prolog code are sufficiently common in the latent space/foundation training data.
I can assure you that Prolog prompting works well for at least the class of robotic planning problems (and similar discrete problems, plus potentially more advanced classes such as scheduling and financial/investment allocation planning that require objective-function optimization), and you can easily check it out yourself with the prompting guide, or even online if you have a capable endpoint you're willing to enter [1].
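To make the claim concrete, here is a minimal sketch of the kind of Prolog a model can be prompted to emit for a toy route-planning problem. The predicates and room names are my own invention for illustration, not taken from the article or the demo; member/2 is defined locally to stay within ISO built-ins.

```prolog
% A toy robot-planning problem of the kind an LLM can be
% prompted to emit: find a route between rooms via corridors.
corridor(kitchen, hall).
corridor(hall, lab).

% Corridors can be traversed in either direction.
connected(A, B) :- corridor(A, B).
connected(A, B) :- corridor(B, A).

% member/2 defined here for ISO self-containedness.
member(X, [X|_]).
member(X, [_|T]) :- member(X, T).

% plan(From, To, Visited, Path): depth-first search for a
% cycle-free route from From to To.
plan(To, To, _, [To]).
plan(From, To, Visited, [From|Rest]) :-
    connected(From, Next),
    \+ member(Next, Visited),
    plan(Next, To, [Next|Visited], Rest).

% ?- plan(kitchen, lab, [kitchen], Path).
% Path = [kitchen, hall, lab].
```

The point is not the search itself but that the relational encoding is close enough to the English problem statement ("the kitchen connects to the hall, the hall to the lab; find a route") for in-context translation to succeed.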
>> Since we're not doing original research, but rather intend to demonstrate a port of the Aleph ILP package to ISO Prolog running on Quantum Prolog, we cite the problem's definition in full from the original paper (ilp09):
Aleph? In 2025. That's just lazy, now. At the very least they should try Metagol or Popper, both with dozens of recent publications (and I'm not even promoting my own work).
You're not wrong, and alternatives were considered, but those were really not fit to be ported to ISO Prolog in bounded time: a complete lack of tests or even a basic reproducible demo, uncontrolled use of non-ISO libraries and of features only available on the originally targeted Prolog implementation, and other issues typical of "academic code."
The lack of unit tests is something I'm guilty of too, and you're very right about it. The community is dimly aware that these systems are more on the "academic prototype" side of things than the "enterprise software" side, but there's so little interest from industry that nobody is going to spend significant effort to change that. Kind of a catch-22, maybe.
How about ISO? Why was this a requirement, out of curiosity?
Glad to see the focus being put on keeping humans in the driver's seat, democratizing coding with the help of AI. The syntax is probably still too verbose to be easily accessible, but I like the overall approach.
Great way to start off ... then we will end up reinventing/re-specifying functions for reusability, modules/packages for higher-level grouping, types/classes, state machines, and control flows [with all the nuances of edge cases and exit conditions]. Then we will need error control and exceptions; sooner or later concurrency, parallelism, data structures, and recursion [let's throw in monads for the Haskellians among us]. Who knows, we may even end up with GOTOs peppered all over the English sentences [with global labels] and wake up to scoping and parameter passing. We can have a whole lot of new fights if we need object-oriented programming, and figure out new design patterns with special "Token Factory Factories".
We took a few decades to figure out how to specify and evolve current code to solve certain classes of problems [nothing is perfect .. but it seems to work at scale, with trade-offs]. I shall watch this from a distance with popcorn.
Is it a good thing to make this easier? We're drowning in garbage already.
[1]: https://quantumprolog.sgml.net
a) a game of roulette where you hope the LLM provider has RLHFed something very close to your use case, or
b) trying to few-shot it with in-context examples requires more engineering (and is still less reliable) than simply doing it yourself
In particular, it's not just "the lack of a succinct, exhaustive text description"; it's also a lack of English-to-Prolog "translations."
It seems like the LLM-Prolog community is well aware of all this (https://swi-prolog.discourse.group/t/llm-and-prolog-a-marria...) but I don't see anything in Universalis that solves the problem. Instead it's just magically invoking the LLM.
[1]: https://quantumprolog.sgml.net/llm-demo/part2.html
Essentially, say that you have an input type like:
And you do something like `.map(it => { doubledAge: it.age * 2 })`. The inferred type of that intermediate operation is now:
Which is wild, since you essentially have TypeScript-like inference in a JVM language. Timestamp to talk:
https://www.youtube.com/watch?v=F5NaqGF9oT4&t=543s
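For comparison, here is what the analogous inference looks like in TypeScript itself. This is a hypothetical sketch (the names `people` and `doubled` are mine; the JVM-language type snippets from the talk are not reproduced in this thread):

```typescript
// Structural type inference on a map step: the element type of
// `doubled` is inferred as { doubledAge: number } purely from the
// shape of the returned object literal.
const people = [
  { name: "Ada", age: 36 },
  { name: "Alan", age: 41 },
];

const doubled = people.map(it => ({ doubledAge: it.age * 2 }));

console.log(doubled); // [ { doubledAge: 72 }, { doubledAge: 82 } ]
```

No type annotations are written anywhere; the intermediate type exists only as an inferred structural type, which is the behavior the comment is comparing the JVM language to.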