
> Can't we just send LLMs back to the drawing board until they have some semblance of reliability?

Well, at this point they've certainly proven to be a net gain for everyone, regardless of the occasional nonsense they spew.






No, the research findings on this are mixed. There is no consensus that it's a net gain.

That is... debatable. You may be entirely inside the bubble, there.

Not sure if this was posted as humour, but I don't feel that way. In today's world, where I certainly would consider taking the blue pill, I'm having a blast with LLMs!

It has helped me learn things far faster. I find LLMs especially useful for filling gaps in my knowledge and exploring new topics in my own way and language, without needing to wait for an answer from a human (who could also be wrong).

Why does it feel to you that "we are entirely inside the bubble"?


Are you sure it's helped you learn?

In the early days of ChatGPT, when it seemed like this fun new thing, I used it to "learn" C. I don't remember anything it told me, and none of the answers it gave me were anything I couldn't have found elsewhere in different forms - heck, I could have flipped open Kernighan & Ritchie to the right page and gotten the answer.

I had a conversation with an AI/Bitcoin enthusiast recently. Maybe that already tells you everything you need to know about this person, but to hammer the point home, they made a claim similar to yours: "I learn much more and much better with AI". They also said they "fact check" the things it "tells" them. Some moments later they told me "Bitcoin has its roots in Occupy Wall Street".

A simple web search tells you that Bitcoin was conceived a full 2 years before Occupy. How can they be related?

It's a simple error that can be fact-checked simply. It's a pretty innocuous falsehood in this particular case - but how many more falsehoods have they collected? How do those falsehoods influence them on a day-to-day basis?

How many falsehoods influence you?

A very well-meaning activist posted a "comprehensive" list of all the programs that were to be halted by the grant and loan freezes last week. Some of the entries on the list weren't real, or weren't related to the freeze. They revealed they had used ChatGPT to help compile the list and had then gone through it one-by-one to verify each entry.

Even with such meticulous attention to detail, incorrect information still filtered through.

Are you sure you are learning?


I guess the real learning happens outside the AI, here in real life. Does the code run? Sure, it's on my local machine and not in production, but I would never have had the patience to get "that new thing" working without an AI as an assistant.

Does the food taste good? Oops, there are a bit too many vegetables here; they're never gonna fit in this pan of mine. Not a big deal, next time I'll be wiser.

AI is like a hypothesis machine. You're gonna have to figure out whether the output is true. A few years ago, testing any machine's "intelligence" was done pretty quickly, and the machine failed miserably. Now the accuracy is astonishing in comparison.

> How many falsehoods influence you?

That is a great question. The answer is definitely not zero. I try to live with a hacker mentality and I'm an engineer by trade. I read news and comments, which I'm not sure is good for me. But you also need some compassion toward yourself. It's not like ripping everything open will lead to salvation. I believe the truth does set you free, eventually. But all in good time...

Anyway, AI is a tool like any other. Someone will hammer their fingers with it. I just don't understand the hate. It's not like we're drinking any AI Kool-Aid here. It's just like it was 30 years ago (in my personal journey): you had a keyboard and a machine, you asked it things and got gibberish. Now the conversation has just started to get interesting. Peace.


When your bitcoiner friend told you something that's not true, that's a human who hallucinated, not an LLM.

Maybe we're already at AGI and just don't know it because we overestimate the capabilities of most humans.


The assertion is that they "learned" that Bitcoin came from Occupy from an AI.

If AI is teaching you, you are going to collect a thousand papercuts of lies.


>It has helped me learn things far faster. I find LLMs especially useful for filling gaps in my knowledge and exploring new topics in my own way and language

and then you verify every single fact it tells you via traditional methods, by confirming it in human-written documents, right?

Otherwise, how do you use the LLM for learning? If you don't know the answer to what you're asking, you can't tell if it's lying. It also can't tell if it's lying, so you can't ask it.

If you have to look up every fact it outputs anyway, using traditional methods, why not skip straight to looking things up the old-fashioned way and save time?

Occasionally an LLM helps me surface unknown keywords that make traditional searches easier, but they can't teach anything because they don't know anything. They can imagine things you might be able to learn from a real authority, but that's it. That can be useful! But it's not useful for learning alone.

And if you're not verifying literally everything an LLM tells you... are you sure you're learning anything real?


I guess it all depends on the topic and levels of trust. How can I be certain that I have a brain? I just have to take something for granted, don't I? Of course I will "verify" the "important stuff", but what is important? How can I tell? Most of the time the only thing I need is a pointer in the right direction. Wrong advice? I'll know when I get there, I suppose.

I can remember numerous things I was told while growing up that aren't actually true - either plain lies and rumours, or products of the long list of our cognitive biases.

> If you have to look up every fact it outputs anyway, using traditional methods, why not skip straight to looking things up the old-fashioned way and save time?

What is the old-fashioned way? I mean, people learn "truths" these days from TikTok and YouTube. Some of that stuff is actually very good; you just have to distill it based on what you were taught at school. Nobody has yet declared LLMs a substitute for schools (maybe they soon will), but neither "guarantees" us anything. We could just as well be taught political agendas.

I could order a book about construction, but I wouldn't build a house without asking a "verified" expert. Some people build anyway, and we get some catastrophic results.

Levels of trust: it's all fun and games until it gets serious, like what to eat, or doing something that involves life-threatening physics. I take it as playing with a toy. Surely something great has come from only a few pieces of Lego?

> And if you're not verifying literally everything an LLM tells you... are you sure you're learning anything real?

I guess you shouldn't do it that way. But really, so far the topics I've rigorously explored with ChatGPT, for example, have been covered better than in your average journalism. What is real?


> What is the old-fashioned way?

Looking it up in a resource written by someone with sufficient ethos that they can be considered trustworthy.

> What is real?

I'm not arguing ontology about systems that can't do arithmetic. You're not arguing in good faith at all.


Saying you need to verify "literally everything" both overestimates the frequency of hallucinations and underestimates the amount of wrong information found in human-written sources. For example, the infamous case of Google's AI recommending Elmer's glue on pizza was literally a human-written suggestion first: https://www.reddit.com/r/Pizza/comments/1a19s0/my_cheese_sli...

The Gell-Mann amnesia effect applies to LLMs as well!

https://en.m.wikipedia.org/wiki/Gell-Mann_amnesia_effect


> without needing to wait for an answer from a human (who could also be wrong).

The difference is that you have some reassurance that the human is not wrong - their expertise and experience.

The problem with LLMs, as demonstrated by the top-level comment here, is that they constantly make stuff up. While you may think you're learning things quickly, how do you know you're learning them "correctly", for lack of a better word?

Until an LLM can say "I don't know", I really don't think people should be relying on them as a first-class method of learning.


You overestimate the importance of being correct

"Occasional nonsense" doesn't sound great, but would be tolerable.

Problem is - LLMs pull answers from their behind, just like a lazy student on an exam. "Hallucinations" is the word people use to describe this.

Those are extremely hard to spot - unless you happen to know the right answer already, at which point - why ask? And they are everywhere.

One example - recently there was quite a discussion about LLMs being able to understand (and answer) base16 (aka "hex") encoding on the fly, so I went on to try base64, gzipped base64, zstd-compressed base64, etc...

To my surprise, the LLM got most of those encodings/compressions right, decoded/decompressed the question, and answered it flawlessly.

But with a few encodings, the LLM detected base64 correctly, identified the compression algorithm correctly, and then... instead of decompressing, made up a completely different payload and proceeded to answer that. Without any hint that anything sinister was going on.
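(If anyone wants to reproduce this, here's a minimal sketch of how such encoded prompts can be built. This is my own illustration, not the exact setup above: it assumes Python and the third-party zstandard package, and it leaves the actual model call out - you'd paste the prompt into a chat UI or send it through whatever API client you use.)

    import base64, gzip
    import zstandard  # third-party package: pip install zstandard

    question = "What is the capital of France?"  # any test question

    # plain base64
    b64 = base64.b64encode(question.encode()).decode()

    # gzip, then base64
    gz_b64 = base64.b64encode(gzip.compress(question.encode())).decode()

    # zstd, then base64
    zs_b64 = base64.b64encode(
        zstandard.ZstdCompressor().compress(question.encode())
    ).decode()

    for label, payload in [("base64", b64),
                           ("gzip + base64", gz_b64),
                           ("zstd + base64", zs_b64)]:
        prompt = (f"The following question is {label} encoded. "
                  f"Decode it and answer it:\n{payload}")
        print(prompt)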

We really need LLMs to reliably calculate and express confidence. Otherwise they will remain mere toys.


Yeah, what you said represents a 'net gain' over not having any of that at all.

I think as these things get more integrated into customer service workflows - especially for things like insurance claims - there's gonna start being a lot more buyer's remorse on everyone's part.

We've tried for decades to turn people into reliable robots; now many companies are rushing to replace people-robots with (maybe less reliable?) robot-robots. What could go wrong? What are the escalation paths going to be? Who's going to be watching them?


A net gain for everyone? Tell that to the artists it's screwing over!


