"The cost-effective nature of AI makes it highly plausible we're moving towards an agent vs agent future."
Sounds right. I assume we will all have AI agents triaging our emails trying to protect us.
Maybe we will need AI to help us discern what is really true when we search for or consume information as well. The amount and quality of plausible but fake information is only going to increase.
"However, the possibilities of jailbreaks and prompt injections pose a significant challenge to using language models to prevent phishing."
Gives a hint at the arms race between attack and defense.
I don't think that there will necessarily be an arms race. Some security problems are deterministically solvable and don't need AI.
For instance, there is a very good classical algorithm for preventing password brute-forcing: exponential backoff on failure per IP address, perhaps with some additional per-account backoff as well. Combined with sane password rules (e.g. correct horse battery staple, not "you must have one character from every language in Madagascar"), this makes password brute-forcing infeasible and forces attackers to try other approaches - which in the security world counts as success. No AI needed.
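A minimal sketch of that classical defence - per-IP exponential backoff on failed logins. In-memory only; a real deployment would use a shared store and add the per-account backoff too:

```python
import time

class LoginThrottle:
    """Per-IP exponential backoff: the Nth consecutive failure imposes a
    wait of base_delay * 2**(N-1) seconds, capped at max_delay."""

    def __init__(self, base_delay=1.0, max_delay=3600.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.failures = {}  # ip -> (consecutive failure count, last failure time)

    def seconds_until_allowed(self, ip, now=None):
        now = time.monotonic() if now is None else now
        count, last = self.failures.get(ip, (0, 0.0))
        if count == 0:
            return 0.0
        delay = min(self.base_delay * 2 ** (count - 1), self.max_delay)
        return max(0.0, last + delay - now)

    def record_failure(self, ip, now=None):
        now = time.monotonic() if now is None else now
        count, _ = self.failures.get(ip, (0, 0.0))
        self.failures[ip] = (count + 1, now)

    def record_success(self, ip):
        self.failures.pop(ip, None)  # reset on a correct password
```

After ten consecutive failures the attacker is waiting over eight minutes per guess from that address, which is what makes the brute-force economics collapse.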
Kudos to the dev for coming up with the eye position fixing solution.
Building further on this idea, I wonder if instead of changing the image to look at the camera, we could change the "camera" to be where we're looking.
In other words we could simulate a virtual camera somewhere in the screen, perhaps over the eyes of the person talking.
We could simulate a virtual camera by using the image of the real camera (or cameras), constructing a 3D image of ourselves and re-rendering it from the virtual camera location.
I think this would be really cool. It would be like there was a camera in the centre of our screen. We could stop worrying about looking at the camera and look at the person talking.
Of course this is all very tricky, but does feel possible right now. I think the Apple Vision Pro might do something similar already?
In order for this to work for gaze correction, you'd probably need to take into account the location of the camera relative to the eyes of the person on the screen, correct for how the other person is holding their phone, and it would probably only work for one-on-one calls. You'd also need to know the geometry of the phone (camera parameters, screen size, position of the camera relative to the screen).
I think you'd get a lot by just transforming the eyes so the gaze is relative to a virtual camera located on the screen at the position of the face of the person you are talking to. That way you get eye contact only when you are looking at their face on the screen, not when you look somewhere else.
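A quick back-of-envelope for why the correction matters. With assumed numbers (camera 9 cm above where the face is drawn on screen, phone held 35 cm from the eyes - both made up, but typical for a phone), the gaze misses the camera by several degrees:

```python
import math

def gaze_offset_deg(camera_pos, face_pos, viewer_dist):
    """Angle in degrees between 'looking at the on-screen face' and
    'looking at the real camera', as seen from the viewer's eye.
    camera_pos / face_pos: (x, y) in metres in the screen plane.
    viewer_dist: eye-to-screen distance in metres."""
    dx = camera_pos[0] - face_pos[0]
    dy = camera_pos[1] - face_pos[1]
    return math.degrees(math.atan2(math.hypot(dx, dy), viewer_dist))

# Camera 9 cm above the on-screen face, phone 35 cm away (assumptions):
angle = gaze_offset_deg(camera_pos=(0.0, 0.07),
                        face_pos=(0.0, -0.02),
                        viewer_dist=0.35)
```

That works out to roughly 14 degrees of downward-looking gaze, which is a big enough error to clearly read as "not making eye contact" - and it shrinks as the on-screen face gets closer to the camera, which is exactly what the virtual-camera trick exploits.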
If you have control of the tokenizer you could make sure it never produces these tokens from user input - i.e. instead of the special "<eos>" token, emit something like "<", "eos", ">", whatever the 'natural' encoding of that string is.
See, for example, the llama3 tokenizer, which has options to control how special tokens are tokenized.
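As a toy sketch of the idea (not the real llama3 API - the vocab and ids here are invented): an encode() that only matches special-token strings when the caller explicitly opts in, so a literal "<eos>" in user input falls back to its natural sub-tokens:

```python
# Invented toy vocab for illustration only.
SPECIAL = {"<eos>": 0}                           # reserved control tokens
ORDINARY = {"<": 1, "eos": 2, ">": 3, "hi": 4}   # ordinary text tokens

def encode(text, allow_special=False):
    """Greedy longest-match tokenization over the toy vocab.
    Special strings are only matchable when allow_special=True;
    user input should always be encoded with the default False."""
    vocab = dict(ORDINARY)
    if allow_special:
        vocab.update(SPECIAL)
    ids, i = [], 0
    while i < len(text):
        # longest vocab entry matching at position i
        match = max((t for t in vocab if text.startswith(t, i)),
                    key=len, default=None)
        if match is None:
            raise ValueError(f"untokenizable input at {i}: {text[i:]!r}")
        ids.append(vocab[match])
        i += len(match)
    return ids
```

With allow_special=True, "<eos>" encodes to its special id; without it, the same string comes out as the three ordinary tokens "<", "eos", ">", and can never act as a control token downstream.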
A few details here: "Recall leverages your personal semantic index, built and stored entirely on your device. Your snapshots are yours; they stay locally on your PC. You can delete individual snapshots, adjust and delete ranges of time in Settings, or pause at any point right from the icon in the System Tray on your Taskbar. You can also filter apps and websites from ever being saved. You are always in control with privacy you can trust."
I can't track down the citation (Google or DeepMind, I think), but I remember reading research from a year or two ago about how adding extra languages (French, German) improved English-language performance. There may have also been an investigation of multimodality, which found that adding vision or audio helped with text as well.
> Chain-of-thought responses from language models improve performance across most benchmarks. However, it remains unclear to what extent these performance gains can be attributed to human-like task decomposition or simply the greater computation that additional tokens allow. We show that transformers can use meaningless filler tokens (e.g., '......') in place of a chain of thought to solve two hard algorithmic tasks they could not solve when responding without intermediate tokens. However, we find empirically that learning to use filler tokens is difficult and requires specific, dense supervision to converge. We also provide a theoretical characterization of the class of problems where filler tokens are useful in terms of the quantifier depth of a first-order formula. For problems satisfying this characterization, chain-of-thought tokens need not provide information about the intermediate computational steps involved in multi-token computations. In summary, our results show that additional tokens can provide computational benefits independent of token choice. The fact that intermediate tokens can act as filler tokens raises concerns about large language models engaging in unauditable, hidden computations that are increasingly detached from the observed chain-of-thought tokens.
> In this work, we demonstrate that transformers trained on the next-token prediction objective can achieve improved performance on certain tasks when given filler tokens, achieving perfect accuracy whereas the no-filler, immediate-answer setting achieves only low accuracy.
--
I wonder if we could get benefits from adding special computation/register tokens to text LLMs?
Anyone who enjoyed this might also like "All the Horses of Iceland". It's a work of fiction, but written by a medieval scholar, set in the 9th century, in which a Norse trader travels through an alternative central Asia.
I found this a little disappointing after so much build-up. It's good writing, but there was no payoff as to why those words had the impact they did - the reason was effectively just a random choice by a judge.
Exactly. The machinery of justice is decorated with all kinds of indicators of objectivity and wisdom: precedent, settled law, sentencing guidelines, burdens of proof, law schools filled with eminent scholars.
And yet someone has a bad day, or a change of heart, or ate a tasty taco for lunch, or just fell in love, and lives are ruined or saved.
> Every morning I wonder whether that day has Eleven Magic Words and, if it does, whether I’ll be able to figure them out. And every day that potential scares the shit out of me.