Hacker News

If you're going to say something like this, you need to back it up with specific alternatives that provide a better result.

Besides just citing your sources, I'm genuinely curious what the best ones are for this so I can see the competition :)




HunYuan, released by Tencent [1], is much better than Sora. It's 100% open source, supports fine-tuning, ComfyUI, and control nets, and is receiving lots of active development.

That's not the only open video model, either. Lightricks' LTX, Genmo's Mochi, and Black Forest Labs' upcoming models will all be open source video foundation models.

Sora is commoditized like Dall-E at this point.

Video will be dominated by players like Flux and Stable Diffusion.

[1] https://github.com/Tencent/HunyuanVideo/


Something being available as OSS is very different from a turnkey product. Tencent's ~60 GiB VRAM requirement means a setup with at least 3-4 GPUs, which is rare and fairly expensive compared to a time-shared service like Sora, where you pay a relatively small amount per video.

I think the important thing is task quality and I haven't seen any evaluations of that yet.


> Something being available as OSS is very different from a turnkey product. Tencent's ~60 GiB VRAM requirement means a setup with at least 3-4 GPUs, which is rare and fairly expensive compared to a time-shared service like Sora, where you pay a relatively small amount per video.

It took two weeks to go from Mochi running on 8xH100s to Mochi running on 3090s. I don't think you appreciate how rapidly open source moves in this space.

HunYuan landed less than a week ago with just one modality (text-to-video), and it already has LoRA training and fine-tuning code, Comfy nodes, and control nets. Their roadmap is technically impressive, with many more control levers in scope.

I don't think you realize how "commodity" these models are and how quickly closed-off "turnkey solutions" get out-innovated by the wider ecosystem: hardly anyone talks about or uses Dall-E anymore. It's all open models like Flux and Stable Diffusion.

{Text/Image/Video}-to-Video is an inadequate modality for creative work anyway, and OpenAI is already behind on pairing other types of input with their models. This is something that the open ecosystem is excelling at. We have perfect syncing to dance choreography, music reactive textures, and character consistency. Sora has none of that and will likely never have those things.

> something time-sharing like Sora where you pay a relatively small amount per video.

Creators would prefer to run all of this on their own machines rather than pay for hosted SaaS that costs them thousands of dollars.

And for those who do prefer SaaS, there are abundant hosted solutions for running Comfy and a constellation of other models on demand.


If you've got a 4090 and ComfyUI, can you run HunYuan?


There are already Hunyuan fp8 examples running on a 4090 on r/stablediffusion.
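A rough back-of-envelope shows why fp8 makes the difference on a 24 GiB card. This sketch assumes HunyuanVideo's transformer is roughly 13B parameters and counts weights only (activations, the VAE, and text encoders would add on top), so treat the numbers as estimates:

```python
# Back-of-envelope VRAM needed just to hold a ~13B-parameter model's
# weights at different precisions. Parameter count is an assumption.
PARAMS = 13e9  # roughly HunyuanVideo's DiT size

def weight_gib(params: float, bytes_per_param: float) -> float:
    """GiB occupied by the weights alone at a given precision."""
    return params * bytes_per_param / 2**30

for name, bytes_per in [("fp32", 4), ("fp16/bf16", 2), ("fp8", 1)]:
    fits = "fits" if weight_gib(PARAMS, bytes_per) < 24 else "does not fit"
    print(f"{name}: {weight_gib(PARAMS, bytes_per):.1f} GiB "
          f"({fits} in a 4090's 24 GiB)")
```

At fp16 the weights alone already exceed a 4090's 24 GiB, while at fp8 they drop to roughly half of it, leaving headroom for activations, which is consistent with the fp8 examples people are posting.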


RunwayML too, though I'm not sure they won't also get commoditized by OSS video generation.



