
"That is a phrase that I coined in a 2022 essay called “Deep Learning is Hitting a Wall,” which was about why scaling wouldn’t get us to AGI. And when I coined it, everybody dismissed me and said, “No, we’re not reaching diminishing returns. We have these scaling laws. We’ll just get more data.”

How can anyone think he's arguing in good faith at this point? That essay was published after GPT-3 but prior to GPT-4, and he's claiming it was correct!


I lost faith in Marcus after just a few interactions. He is indeed what one would refer to as a "crackpot" in academia. The most glaring thing was that he is technically extremely shallow and doesn't have a clue about most of the details. I also got the impression that he is enormously enamored with having attention and recognition at any cost. Depending on the weather, he will change directions and views, basically doing anything it takes to get that attention, no matter how ridiculous he looks doing it.

While writing this, it occurred to me that he would probably get goosebumps reading this comment because, after all, I am giving him attention.


> Depending on the weather, he will change directions and views

My impression is the opposite: I would describe Gary Marcus as having all his opinions perfectly aligned to a singular viewpoint at all times regardless of weather (or evidence).


At different points he has claimed (1) current models are not intelligent at all and can never be intelligent like us, and (2) we need to ban current models because they will outsmart us and take over the world.

Depending on how he can get an interview or a seat at the table, he may choose the exact opposite position.


Can you show an example of 2)? I've missed that.

I thought "Attention Is All You Need"

I don't think he is always arguing in good faith, unfortunately.

So his timing was slightly off. I don't know why people expected LLMs to improve exponentially. Your iPhone now doesn't look much different from the one 10 years ago. GPT-3, or arguably GPT-4, was the first iPhone moment; everything else will be gradual improvement unless fundamental discoveries are made, but those happen seemingly at random.

If one compares O3-mini's coding abilities to the original GPT-4's, the gap is as large as the one between GPT-3 and GPT-4.

GPT-3: Useful as autocomplete. Still error-prone, but vastly better than any pre-AI autocomplete.

GPT-4: Already capable of independently coding up simple functions based on natural language.

O3-mini: Can code at roughly a top-5% Codeforces level.

There's a two-year gap between each of them.

Moreover, intelligence has superexponential returns: the gain from 90 IQ -> 100 IQ is smaller than the gain from 100 IQ -> 110 IQ.


> Moreover, intelligence has superexponential returns: the gain from 90 IQ -> 100 IQ is smaller than the gain from 100 IQ -> 110 IQ

That's the second time I've seen the claim that linear increases in intelligence have "superexponential" results; the first time was in Altman's blog.

But I've not seen any justification for this.

(Since you specifically say IQ, note that IQ is defined as a mapping of standard deviations rather than a mapping of absolute skill; the usual normalization is 15 points per 1σ.)
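As a quick, hedged illustration of what that σ mapping means in population terms (the fraction_above helper and the 100/15 normalization below are just the conventional definition, not anything from this thread):

    from math import erf, sqrt

    def fraction_above(iq, mean=100.0, sd=15.0):
        # P(X > iq) for X ~ Normal(mean, sd): how rare a given IQ score is
        z = (iq - mean) / sd
        return 0.5 * (1.0 - erf(z / sqrt(2.0)))

    print(fraction_above(110))  # ~0.25 of the population scores above 110
    print(fraction_above(145))  # ~0.0013 scores above 145 (3 sigma)

So equal IQ increments are equal steps in standard deviations of the population distribution, which by itself says nothing about returns in absolute capability.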


AI is spreading across disciplines like science, math, software development, language, music, and health. You’re looking at it too narrowly. Human-computer symbiosis is accelerating at an unprecedented rate, far beyond the pace of something like the iPhone.

More like computer-human parasitism; it weakens the host.

It also only affects those with a "weak immune system" i.e. those whose bullshit filter doesn't function.

AI is here to stay for some tasks (Segment Anything, diffusion image generation for accelerating certain kinds of Photoshop work), but LLMs are a dead end and AI Winter 2 is coming. They don't work for programming or law or medicine or mechanical engineering, or even for writing most emails, because it's faster to just write the email yourself than to ask the AI to do it.


In what sense are the bleeding-edge models incremental improvements over GPT-3 (read his examples of GPT-3 output and imagine any of the top models today producing them!), GPT-3.5, or GPT-4? Look at any benchmark or use them yourself. It's night and day.

Gary Marcus didn't make a lot of specific criticisms or concrete predictions in his essay [0], but some of his criticisms of GPT-3 were:

- "For all its fluency, GPT-3 can neither integrate information from basic web searches nor reason about the most basic everyday phenomena."

- "Researchers at DeepMind and elsewhere have been trying desperately to patch the toxic language and misinformation problems, but have thus far come up dry."

- "Deep learning on its own continues to struggle even in domains as orderly as arithmetic."

Are these not all dramatically improved, no matter how you measure them, in the past three years?

[0] https://nautil.us/deep-learning-is-hitting-a-wall-238440/


To me, the current LLMs aren't qualitatively different from the char RNNs that Karpathy showcased all the way back in 2015. They've gotten a lot more useful, but that is about it. Current LLMs will have as much to do with AGI as computer games have to do with NNs. Which is to say, games were necessary to develop the GPUs that were then used to train NNs, and current LLMs are necessary to incentivize even more powerful hardware to come into existence, but there isn't much gratitude involved in that process.

> To me, the current LLMs aren't qualitatively different from the char RNNs that Karpathy showcased all the way back in 2015.

It's very difficult to understand this statement. What meaning of "qualitatively" could possibly make it true?


The strengths and weaknesses of the algorithmic niche that artificial NNs occupy haven't changed a bit in a decade. They are still bad at the things you'd imagine actual AI would be good at and that I'd actually want to use them for. The only thing that has changed is people's perception. LLMs found a market fit, but notice that, compared to last decade, when we had DeepMind and OpenAI competing at actual AI in games like Go and StarCraft, they've pretty much given up on that in favor of hyping text predictors. For anybody in the field, it should be an obvious bubble.

Underneath it all, there is some hope that an innovation might come about to keep the wave going, and indeed, a new branch of ML being discovered could revolutionize AI and actually be worthy of the hype that LLMs have now, but that has nothing to do with the LLM craze.

It's cool that we have them, and I also appreciate what Stable Diffusion has brought to the world, but in terms of how much LLMs have influenced me, they've only shortened the time it takes for me to read the documentation.

I don't think that machines cannot be more intelligent than humans. I don't think that the fact that they use linear algebra and mathematical functions makes the computers inferior to humans. I just think that the current algorithms suck. I want better algos so we can have actual AI instead of this trash.


The difference between a doomsday conspiracy theorist and a physicist surmising the heat death of the universe is...just timing.

Well, it's true that all of the most recent advances come from changing the architecture to do inference scaling instead of model scaling. Scaling laws as people talked about them in 2022 (that you take a base LLM and just make it bigger) are dead.
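(For context: the 2022-era scaling-law picture was, roughly, the Chinchilla-style fit L(N, D) ≈ E + A/N^α + B/D^β, where N is parameter count and D is training tokens, with the bet being that loss keeps falling predictably as you grow both. That's the kind of scaling being declared dead here, as opposed to scaling test-time compute.)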

I think you want both. To scale the model, e.g. train it with more and more data, you also need to scale your inference step. Otherwise, it just takes too long and it's too costly, no?

Completely anecdotal, but people who do not consume caffeine seem overrepresented among very, very smart people (i.e. math professors, researchers, etc.).

On the other hand, people who consume excessive amounts of caffeine also seem overrepresented among such people.

It might be that both not consuming caffeine, and being very very smart, have the same root cause - like good sleep patterns.

Just an unresearched thought I had!


Incredible how low usage is among lawyers. Does anyone have any intuition on why?

Part of it is selection bias; Claude is much less general-audience than ChatGPT. But any lawyer using LLMs in 2025 deserves to be disbarred:

"A Major Law Firm's ChatGPT Fail" https://davidlat.substack.com/p/morgan-and-morgan-order-to-s...

"Lawyer cites six cases made up by ChatGPT" https://arstechnica.com/tech-policy/2023/05/lawyer-cited-6-f...

"AI 'hallucinations' by ChatGPT end up costing B.C. lawyer" https://www.msn.com/en-ca/news/world/ai-hallucinations-creat...

The list goes on and on. Maybe there's a bespoke RAG solution that works...maybe.


> But any lawyers using LLMs in 2025 deserve to be disbarred

In what year do you think it will be acceptable, and why?

LLMs are tools; I don't see anything wrong with using them in any occupation as long as the user is aware of the limitations.


No - some judge wrote to a family member recently: "I am seeing all these great briefs now," followed by a novice discussion of AI use. This is anecdotal (and recent), but it says to me that non-lawyers, with care, are writing their own legal papers across the USA and doing it well. This fits with other anecdotes here in coastal California about ordinary legal uses.

I think they're especially likely to hallucinate when asked to cite sources; they're prone to making sources up. A lot of the work my lawyer friend has asked of ChatGPT or Claude requires it to cite things, and my friend says it has just made up case law that isn't real. So while it's useful as a launching point and can in fact be helpful and find real case law, you still have to double-check every single thing it says with a fine-tooth comb, so its productivity impact is much lower than with code, where you can clearly see whether the output works immediately.

My guess is because hallucinations in a legal context can be fatal to a case, possibly even career-ending; there have been some high-profile cases where judges have ripped into lawyers pretty destructively.

Because LLMs make things up and the lawyer is liable for using that made up information.

Lawyers are selected for critical thinking skills and they aren't vulnerable to AI hype the way relatively poorly educated computer guys are.

Interesting article about Adam Unikowsky asking Claude to decide a Supreme Court case: https://blog.plover.com/tech/gpt/presidential-emoji.html

"Claude is fully capable of acting as a Supreme Court Justice right now."


Here's an example of the type of question it is achieving 20% on:

The set of natural transformations between two functors F, G : C → D can be expressed as the end Nat(F, G) ≅ ∫_A Hom_D(F(A), G(A)).

Define the set of natural cotransformations from F to G to be the coend CoNat(F, G) ≅ ∫^A Hom_D(F(A), G(A)).

Let:

- F = B∙(Σ4)∗/ be the under-∞-category of the nerve of the delooping of the symmetric group Σ4 on 4 letters under the unique 0-simplex ∗ of B∙Σ4.

- G = B∙(Σ7)∗/ be the under-∞-category of the nerve of the delooping of the symmetric group Σ7 on 7 letters under the unique 0-simplex ∗ of B∙Σ7.

How many natural cotransformations are there between F and G?


As someone who doesn't understand anything beyond the word 'set' in that question, can anyone give an indication of how hard of a problem that actually is (within that domain)?

Also I'm curious as to what percentage of the questions in this benchmark are of this type / difficulty, vs the seemingly much easier example of "In Greek mythology, who was Jason's maternal great-grandfather?".

I'd imagine the latter is much easier for an LLM, and almost trivial for any LLM with access to external sources (such as deep research).


BTW, isn't this question at least really badly worded (and maybe incorrect)? The definitions they give for F and G are categories, not functors... (and both categories are in fact one object with a contractible space of morphisms...)


That's easy Dave: 42.


Do we actually know whether it got this specific example right? It got 20% on HLE, but I think a few questions are quite a bit easier.


It's very interesting to think about what kind of "mental model" might it have, if it's capable of "understanding" all this (to me) gibberish, but is then unable to actually work the problem.


I remember when OpenAI first raised and had the 100x cap, and everyone said that was ridiculous and insane and of course they weren't going to 100x from $1B... That would require them to become a $100B company!


TSMC will never allow the Arizona plant to be a viable replacement. They are extremely incentivized to prevent this from happening.


That's OK. It's on US soil with US employees and can be nationalized if and when need be. I'm sure ASML will be happy to comply or else risk their US operations being nationalized too. Like their DUV/EUV light sources office https://www.asml.com/en/company/about-asml/locations/san-die...


Yup. Just pass another TikTok-style bill to force a sale of the factory to a US buyer.


Holy shit they have 1,900 employees for that


It doesn't need to be a viable replacement. Even if it only ever makes chips that are 1-2 years behind, it's still a huge strategic benefit for the country.


Does it have to be, or does it just have to be enough of a deterrent to China?

I wonder if the strategy behind the CHIPS act is to have enough “backup” capacity in the US that it isn’t completely vulnerable.


China’s interest in Taiwan is about controlling the sea routes, not about chips.


How so? They are also extremely incentivised to make this happen. A war on your doorstep is not good for business.


TSMC is mostly governed by Taiwanese people who would like to maintain Taiwanese sovereignty.


Taiwan might be a more appealing target if all of TSMC's output is located there.


My thinking is that it's less appealing, because the more the USA depends on them, the more the USA will defend them.


Don’t all the machines in Taiwan have explosives fitted in case of invasion?


TSMC is governed by the Taiwanese ruling class. If China launched a widespread attack on Taiwanese soil tomorrow, nothing would happen to any of these people. These people are not your random neighbors harboring nationalistic views.


You don't have to be a rabid nationalist to not wish for your country to be invaded and annexed by others. You don't even have to live there. I'm sure a large percentage of Taiwanese living in countries outside of Taiwan would not wish for it to be invaded.

I'm not even Taiwanese, don't know anyone of Taiwanese descent well, and I don't want Taiwan invaded.

The suggestion that there's some kind of weird oligarchy class of TSMC-controlling Taiwanese who couldn't give a toss if Taiwan was invaded is a mustache-twirling level of caricature.


TSMC is governed by the Taiwanese government, which is a puppet government controlled by the US government and military. TSMC answers to the US directly, as without US support, Taiwan falls to China almost instantly. Nobody besides the US can prevent a blockade of Taiwan


A "puppet government".

This claim is based on... Them wanting not to be invaded?

If anything, they had the foresight to take advantage of US companies not wanting to fab their chips domestically at higher prices. This has led to cooperation between the two.

I see this argument in some fashion every day, claiming US allies are puppets. When in fact, they just find commonalities and cooperate.


The strength of the US defense commitment is likely proportional to the strategic value of the economic assets they still hold. The Taiwanese have every incentive to do just well enough at the AZ plant for the $39 billion checks to clear, and no better.


While true, TSMC has a stronger incentive for its own survival than the survival of Taiwan. If it's easier for them to shift operations to the US and continue to make $$$, I suspect they'd do that over retaining operations in Taiwan and hoping it will convince the US to protect the country.


The biggest shareholder of TSMC is the Taiwan government.


This factory is not for economic independence or economic strategy. It is for geopolitical strategy. This factory is meant to build smarter munitions if war breaks out, not the latest cellphone. The US gov does not give a fuck about Apple's stock price and product plans if war breaks out with China, since, you know, there are real adult problems going on.


This is not true


Psychohistory


I think GPT-4o is probably doing some OCR as preprocessing. It's not really controversial to say that today's VLMs don't pick up fine-grained details; we all know this. You can just look at the output of a VAE to know this is true.


If so, it's better than any other OCR on the market.

I think they just train it on a bunch of text.

Maybe counting squares in a grid just wasn't considered important enough to train for.


Why do you think it's probable? The much smaller LLaVA that I can run on my consumer GPU can also do "OCR", yet I don't believe anyone has hidden an OCR engine inside llama.cpp.


Knowledge graphs were created to solve the problem of making natural, free-flowing text machine-processable. We now have a technology that completely understands natural free flowing text and can extract meaning. Why would going back to structure help when that structure can never be as rich as the text itself? I get it if the KB has new information, but that's not what I'm saying.


> Why would going back to structure help

When your corpus is large it is useful to split it up and combine hierarchically. In their place I would do both bottom-up and top-down summarization passes, so information can percolate from a leaf to the root and from the root to a different leaf. Global context can illuminate local summaries: think of the twist in a novel; it sheds new light on everything.
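Roughly the shape I have in mind, as a minimal sketch (the dict-based chunk tree and the summarize stub are placeholders for a real chunker and an LLM call):

    def summarize(parts):
        # Stand-in for an LLM summarization call; here it just joins and truncates.
        return " / ".join(parts)[:200]

    def bottom_up(node):
        # Leaves summarize their own text; parents summarize their children's summaries,
        # so information percolates from the leaves up to the root.
        if node.get("children"):
            node["summary"] = summarize([bottom_up(c) for c in node["children"]])
        else:
            node["summary"] = summarize([node["text"]])
        return node["summary"]

    def top_down(node, global_context=None):
        # Push the root-level summary back down, so global context (the "twist")
        # can reshape how each local chunk is summarized.
        context = global_context or node["summary"]
        for child in node.get("children", []):
            child["summary"] = summarize([context, child["summary"]])
            top_down(child, context)

    tree = {"children": [{"text": "chapter 1 ..."}, {"text": "chapter 2, with the twist ..."}]}
    bottom_up(tree)
    top_down(tree)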


That's not what a KB is.


> We now have a technology that completely understands natural free flowing text and can extract meaning.

Actually we don't. I know it certainly feels like LLMs do this, but no one who knows how they work would dare stake their life on their output. Still useful, though!


But RAG without graphs just relies on similarity search, which isn't very smart.
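(For what it's worth, "similarity search" in plain RAG usually boils down to something like this sketch, assuming you already have embeddings for the query and the document chunks; only numpy is used:)

    import numpy as np

    def top_k(query_vec, doc_vecs, k=3):
        # Cosine similarity between the query embedding and each chunk embedding,
        # returning the indices of the k nearest chunks to paste into the prompt.
        q = query_vec / np.linalg.norm(query_vec)
        d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
        return np.argsort(-(d @ q))[:k]

There's no notion of relationships between chunks, which is the gap the graph-flavored approaches try to fill.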



