srameshc's comments

If time zone handling could be added like Go's time package, that would make it very convenient.

that'll work as soon as everyone in the world agrees to download the latest version of their favorite web browser each time Western Sahara is reclassified

We are discussing the future anyway :)

Deep cut.

Also when the King of Morocco makes another snap decision about Ramadan timezones.


What customer base will Render bring to them? Google directly competes with AWS and Azure, and it needs the big enterprise customers, if any.

Also, all those services basically just sit back and wait until a company gets big enough to have in-house DevOps/SRE/whatever and the inevitable cost-optimising migration gets demanded.

Companies will make acquisitions as a means of preventing competition. It might not be about bringing a new customer base but protecting the one you already have.

Can someone explain how it is planning to achieve federation of that massive amount of data and storage?

Isn't Pixelfed using the ActivityPub protocol?

As far as I know, that makes it scale like email. When you post something, your host sends it to all recipients - everyone who subscribed to your posts.

That does not sound like a lot of traffic. Even if one has 10k followers, the post only goes once to each of their hosts. So if the 10k followers are on 100 hosts, that means 100 messages.
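Roughly, that fan-out looks like the sketch below - a minimal TypeScript illustration of the one-delivery-per-host idea, where the Follower shape and the deliver() helper are made up for illustration, not Pixelfed's actual code:

    // Sketch: ActivityPub-style fan-out is one POST per remote host,
    // not one per follower. Follower shape and deliver() are hypothetical.
    interface Follower {
      inboxUrl: string; // e.g. "https://example.social/users/alice/inbox"
    }

    async function fanOut(activity: object, followers: Follower[]): Promise<void> {
      // Group followers by host so each remote server is contacted once.
      const hosts = new Set(followers.map(f => new URL(f.inboxUrl).host));

      // 10k followers spread over 100 hosts => 100 deliveries.
      await Promise.all(
        [...hosts].map(host => deliver(`https://${host}/inbox`, activity))
      );
    }

    async function deliver(inbox: string, activity: object): Promise<void> {
      // Placeholder: a real server would also sign the request (HTTP Signatures).
      await fetch(inbox, {
        method: "POST",
        headers: { "Content-Type": "application/activity+json" },
        body: JSON.stringify(activity),
      });
    }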


It is indeed using ActivityPub but most people are on the main instance and there's not much federation afaik.

Well, sure - if someone wanted to be the host for all mail traffic on planet Earth, then they would need a lot of resources and a business model to finance it.

But if that is their plan, why would they have used ActivityPub in the first place?

So I guess they could just close their doors as soon as the traffic is too much for them to handle, and tell people to look for another host or host it themselves.


*right now*. Let's make it grow.

I believe the plan is to not have a plan (right now): https://mastodon.social/@dansup/113887622931474663


Pixelfed works like any other Fediverse server. Content is stored on the node where it’s posted by its author. The nodes of people following the account have the option to cache a copy of the posts or just keep a reference, depending on their implementation; Mastodon, for example, would cache it.

Mastodon seems to be doing pretty good so far on scaling up. Pixelfed is smaller than Mastodon, so it will be fine.

mastodon is text, pixelfed is images/videos

As I understand it activitypub itself shouldn't matter there as the only thing going through it should be URL references to the image/video data which you can front with a CDN. Federating instances don't have to host each others' data in that context.

In addition to the other points, there was some discussion about more directly p2p exchange of content (like WebTorrent or something) to share the load more evenly. Spritely is a more methodical approach to it but it could be layered into ActivityPub as an extension without changes to the core protocol.

Pixelfed could make a hash of every blob uploaded and then store that hash in the ActivityPub message. Then the clients could peer using WebTorrent or Trystero and randomly ask peers for that hash until it discovers a peer that has already cached the image. This would reduce load on dansup's server within his own ecosystem anyway.
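Very roughly, the client side could look like the sketch below, where the hash travels with the ActivityPub object; askPeersForBlob() is a hypothetical stand-in for the actual WebTorrent/Trystero wiring, not a real library call:

    // Sketch: content-address the image, try peers first, verify the bytes,
    // and fall back to the origin instance if no peer has them.
    async function sha256Hex(blob: Blob): Promise<string> {
      const digest = await crypto.subtle.digest("SHA-256", await blob.arrayBuffer());
      return [...new Uint8Array(digest)]
        .map(b => b.toString(16).padStart(2, "0"))
        .join("");
    }

    async function fetchImage(
      originUrl: string,                                       // author's instance
      contentHash: string,                                     // hash from the ActivityPub message
      askPeersForBlob: (hash: string) => Promise<Blob | null>  // hypothetical p2p lookup
    ): Promise<Blob> {
      const fromPeer = await askPeersForBlob(contentHash);
      if (fromPeer && (await sha256Hex(fromPeer)) === contentHash) {
        return fromPeer; // verified against the announced hash, origin never hit
      }
      const res = await fetch(originUrl); // fall back to the author's server
      return res.blob();
    }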

> the federation of that massive amount of data and storage

I'm not sure what you mean by that but the media blobs (photos and videos) are not "federated" (i.e. passed around instances) most likely, but hosted in one place (the instance of the author) and referenced by their URL.


it varies by activitypub implementation. mastodon for example caches media per instance, pleroma simply hotlinks.

briefly searching through github issues, i believe pixelfed does not cache remote media. discussion on this issue about remote media caching seems to indicate that pixelfed only caches avatars https://github.com/pixelfed/pixelfed/issues/4571


I was under the impression that federated nodes acts as caches. Is this not the case?

Intuitively, it is unlikely there's 500K individual servers set up.

We can then also observe other comments clarifying that no, there aren't 500K instances.

The other comments provide... tweets? mastodons?... from the maintainer that also clarify that, in practice, there's 1 instance.

People are questioning how that will scale, and the tweet from the maintainer was cited as part of that because its content is, tl;dr, that there's a $2,600/month gap between Patreon income and hosting costs.


Has anyone tried using Cloudflare Bot Management, and how effective is it for such bots?

I put my personal site behind Cloudflare last year specifically to combat AI bots. It's very effective, but I hate that the web has devolved to a state where using a service like Cloudflare is practically no longer optional.

This is the first time I have come across the term "micro-budget" in an AI context.

> end-to-end model on an 8×H100 machine is 2.6 days

Based on the pricing on the Lambda Labs site, it's about $215, which isn't bad for training a model for educational purposes.


If you want to cost optimize even further, you can get 8 x H100 machines for around $4.00 less per hour through Denvr on Shadeform’s GPU Cloud Marketplace (YC S23).

Well said. If only we start looking at both of these issues separately, owner and algorithm, and deal with each one appropriately.

My instant thought was that this was an article from the past, and I wondered why it was being reposted now!! Almost a decade later we are back to this headline again. We will probably read something like this again after another 10 years.


I love these new distributed DBs; CockroachDB is one of them. Still, I think a managed Postgres/MySQL is a better choice. My primary concern is how challenging it will be if you eventually have to move your data out to an RDBMS for cost or other reasons. Does anyone have any experience? I am not talking enterprise scale, but data in the 50-100GB range.


Distributed DBs and traditional RDBMS serve different purposes. Most companies – by which I mean the overwhelming majority – do not need a distributed DB; they need better schema and queries.

My fear is companies without in-house RDBMS expertise see these products as a way to continue to avoid getting that expertise.


I think it is mostly threefold: availability, being cloud agnostic, and scaling from a few TBs to a few hundred TBs. I have been seeing CockroachDB and YugabyteDB in on-prem setups where you don't get the benefits of a cloud provider, but you do get the availability guarantees and fault tolerance.


Why do you think that would be harder? Assume for the moment that the reader here is going to run at the effective rate of a single node and that we're not going to try to parallelize it. Assuming we have transaction isolation, that reader is going to get a consistent snapshot.

A distributed database is potentially more complicated to operate and optimize, and because it's new it potentially has more sharp edges and is maybe less reliable (?), but the extraction of moderate-sized datasets doesn't really seem to be an obvious failing.
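For the extraction case specifically, CockroachDB for example can pin a read-only transaction to a historical timestamp, so a paged export stays on one consistent snapshot. A rough sketch with node-postgres follows; the users table, its integer primary key, and the connection string are all assumptions for illustration:

    import { Client } from "pg"; // CockroachDB speaks the Postgres wire protocol

    // Sketch: page through a table inside a historical transaction so every
    // page reads the same snapshot. Table, columns and connection string are made up.
    async function exportUsers(connectionString: string): Promise<void> {
      const client = new Client({ connectionString });
      await client.connect();
      try {
        // All reads in this transaction are as of ~10s ago, so concurrent
        // writes can't make the exported data internally inconsistent.
        await client.query("BEGIN AS OF SYSTEM TIME '-10s'");

        let lastId = 0;
        for (;;) {
          const { rows } = await client.query(
            "SELECT id, email, created_at FROM users WHERE id > $1 ORDER BY id LIMIT 1000",
            [lastId]
          );
          if (rows.length === 0) break;
          lastId = rows[rows.length - 1].id;
          // ...write this page into the target Postgres/MySQL here...
        }

        await client.query("COMMIT");
      } finally {
        await client.end();
      }
    }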


I would argue the opposite: distributed databases are much easier to operate at large scale. Truly online distributed DDL (at least in TiDB), strong consistency, etc.

People who bang on about Postgres replication have rarely set up replication in Postgres themselves, let alone at the 100s-of-PBs scale.

MySQL replication works well and can be scaled more easily (relative to Postgres) but has its own problems. E.g., DDL is still a nightmare, and lag is a real problem, usually masked by async replication. But then eventual consistency makes the application developer's life more complicated.


I have had a great experience with Astro and HTMX. The first time I tried it, about a year or so ago, I didn't think much about HTMX and thought it was an Astro thing. But this time I had a great experience, and I understand the power of htmx and native web-only JS; using no framework when building simple sites is very refreshing.


Obvious question: how do you protect against this?


Build your API assuming anything public facing will be known. This includes anything downloaded to a device.


Your first line of defence should be a secure API where an attacker doesn't gain anything by knowing it.

You can add obfuscation, but ultimately if the client is shipped to the user you must assume an attacker can reverse engineer it.
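A tiny illustration of that principle, as a sketch with an assumed Express app and made-up route, middleware, and data store (not anyone's actual API): the endpoint can be fully known, and a reverse-engineered client still only gets what the server decides to allow.

    import express, { NextFunction, Request, Response } from "express";

    interface Order { id: string; ownerId: string; total: number; }

    // Placeholder auth: a real implementation verifies a session or signed token.
    function requireAuth(req: Request, res: Response, next: NextFunction) {
      const userId = req.header("x-user-id"); // stand-in for real credential checks
      if (!userId) return res.status(401).json({ error: "unauthenticated" });
      res.locals.userId = userId; // Express's per-request scratch space
      next();
    }

    const orders = new Map<string, Order>(); // stand-in for a real database
    const app = express();

    app.get("/api/orders/:orderId", requireAuth, (req, res) => {
      const order = orders.get(req.params.orderId);
      if (!order) return res.status(404).json({ error: "not found" });

      // Ownership is enforced server-side: knowing the endpoint or rebuilding
      // the client doesn't let an attacker read someone else's data.
      if (order.ownerId !== res.locals.userId) {
        return res.status(403).json({ error: "forbidden" });
      }
      res.json(order);
    });

    app.listen(3000);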


What specifically do you want to protect?


For me: we can't 100% protect against this type of usage, but we can minimize it with good observability and monitoring tools that always check whether the user is running this through a verified path (signed app, web, etc.) or is reverse engineering the API.

Because, guess what? We are the creators of such a system; it's easy to detect bots/such cases when you have good analytical data, because this type of access does not leave the usual "traces".


I find this confusing because the point of an API is to be known, yes? Otherwise who's accessing it?


It's a valid desire, but you have to be really dedicated to the effort to block it, in practice.

You might intend your API to be consumed only by your own clients. E.g. your published mobile apps.

A well-designed API won't allow a third-party client to do anything that your own client wouldn't allow of course. Permissions are always enforced on the back end.

But there are many cases where a user might want a custom/different client:

If your mobile apps are not awesome, or if they deprioritize a specific use case, or if they serve ads ... or even if your users want to automate some action in your service...

If your service is popular enough (or you attract a certain kind of user), you will have some people building their own clients.


Those sound like bad use cases for a client-server model with public endpoints, then? I mean, you could cert-pin yourself in the client app, I guess.


Not sure what you mean here. All endpoints are equally public.


Not necessarily. A common pattern is to build a 'private API' intended to be used by one's own front-end applications. For example: most client-rendered applications, like the Airbnb example on this page.


Modern APIs are actually, most of the time, a poor man's RPC; they don't need to exist, much less be known.




You can read SSL traffic if you're able to install a root certificate on your device and the website/app doesn't use certificate pinning.

I recently used HttpToolkit to reverse engineer a REST endpoint that used SSL encryption


Even if it does use certificate pinning, you can generally disable that using tools like Frida (https://frida.re) with scripts like https://github.com/httptoolkit/frida-interception-and-unpinn...


This isn't true. Mitmproxy and burp can both proxy TLS. Maybe you're misunderstanding the use case.


A good deal of APIs don't pin SSL certs so MITM works for a solid amount of them.


Only as long as you cannot load your own certificates, which you are able to do in a lot of cases. Though on Android you can lock the certificates allowed in an app, this can be circumvented, though it adds another step. I am unsure whether the same is the case for Apple's devices; at the very least you might need a jailbreak there.

