
Let's chat more. I'm almost ready to raise some seed money, hire a second staff dev or find a cofounder, and I'm looking for people who care deeply about the space.

It's only been during the last few months that I decided to go all in on the project, so this is still just the first few pages of a new chapter in the project's history.

(I should also mention that if you're a commercial entity relying on ArchiveBox, you can hire us for dedicated support and uptime guarantees. We have a closed-source fork that has a much better test suite and lots of other goodies.)




It looks like you're doing great work here, thanks a bunch; looking forward to seeing this project develop.

Selling custom integrations, managed instances, white-glove support with an SLA, and so on seems like a reasonable funding model for a project based on an open-source, self-hostable platform. But I'm a little disheartened to read that you're maintaining a closed fork with "goodies" in it.

How do you decide which features (better test suite?) end up in the non-libre, payware fork of your software? If someone contributed a feature to the open-source version that already exists in the payware version, would you allow it to be merged or would you refuse the pull request?


The idea with the plugin system is that plugins are just git repos containing <pluginname>/__init__.py, and you can add any set of git repo plugins you want to your instance.
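
A minimal sketch of how that layout could be discovered on disk, assuming each plugin repo has been cloned into one local directory. The function name `discover_plugins` and the directory layout are hypothetical illustrations, not ArchiveBox's actual loader:

```python
from pathlib import Path


def discover_plugins(plugins_root: Path) -> dict[str, Path]:
    """Map plugin name -> package dir for every <repo>/<pluginname>/__init__.py.

    Hypothetical helper: assumes each git repo was cloned directly under
    plugins_root. This is not ArchiveBox's real plugin loader.
    """
    found: dict[str, Path] = {}
    if not plugins_root.is_dir():
        return found
    for repo in sorted(plugins_root.iterdir()):
        if not repo.is_dir():
            continue
        for package in sorted(repo.iterdir()):
            # A plugin is any directory that contains an __init__.py file.
            if package.is_dir() and (package / "__init__.py").is_file():
                found[package.name] = package
    return found
```

Directories without an `__init__.py` (such as `.git` or `docs`) are skipped, so only the plugin package itself is picked up.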

The marketplace will work by showing all git repos tagged with the "archivebox" tag on github.

My approval is only needed for PRs to the archivebox core engine.

More info on free vs paid + reasoning why it's not all open source: https://news.ycombinator.com/item?id=41863539


"I too would like commit access to your promising looking project's git repo and CI/CD pipeline. Thanks, Jia Tan"


Do you guys have a Discord by chance? I have a close friend who is insanely passionate about archiving. He has a personal instance of ArchiveBox and is working on a video downloading project as well. He has used it almost every day and archived thousands of news articles over the years. He's aware of a lot of the nuances.


We have a Zulip, which is similar to Discord (but self-hosted, and it has better threading): https://zulip.archivebox.io


I love this project. I "independently" "invented" it in my head the other day, and happy to see it already exists!

I'd love to see blockchain proof/notary support: the ability to say "content matching this hash existed at this time."

I'm exceptionally busy now but that being said, I may choose to contribute nonetheless.

I'd love to connect directly, and will connect to the Zulip instance later.

If we align on values, I may be able to connect you with some cash. People often call me an "anarchist" or "libertarian", though I'm just me, no labels necessary.


Can you please explain what you mean by “blockchain proof/notary support”?


Motivation: Have evidence that some content existed at a particular time. For example, let's say a major website publishes an article, and later they remove it, and there is no record of it ever existing. If I host an ArchiveBox, I can look at it and see "Oh, here is that article. Looks like it was published after all." However, why should you believe that I didn't just make it up?

If, when I initially archived it, I computed a cryptographic hash of the content and posted that on a blockchain, then at a future date I can at least claim "As of block N, approximately corresponding to this time UTC, content that hashes to this hash existed."
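
As a rough sketch of that idea: hash the archived bytes with SHA-256, record the digest alongside the block it was posted in, and later check a copy of the content against the record. The `TimestampClaim` type and its `block_height` field are hypothetical names for illustration; actually publishing the digest to a blockchain is assumed to happen elsewhere and is not shown.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TimestampClaim:
    """Hypothetical record: 'content with this digest existed as of block N'."""
    sha256_hex: str
    block_height: int  # assumed to come from whatever chain the digest was posted to


def make_claim(content: bytes, block_height: int) -> TimestampClaim:
    # Hash the archived bytes at capture time and pair the digest with the block.
    return TimestampClaim(hashlib.sha256(content).hexdigest(), block_height)


def matches_claim(content: bytes, claim: TimestampClaim) -> bool:
    # Only shows that these exact bytes match the recorded digest; it says
    # nothing about which URL or author they came from.
    return hashlib.sha256(content).hexdigest() == claim.sha256_hex
```

Any change to the content, even a single byte, produces a different digest, so `matches_claim` fails for an altered copy.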

If multiple unrelated parties also make the same claim, it is stronger evidence.
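
One way to picture this, as a sketch only: collect claims from several parties and count how many distinct attesters reported the same digest. The `(attester_id, sha256_hex)` tuple layout and the name `count_agreeing_attesters` are assumptions for illustration, and the count cannot tell you whether the parties are actually unrelated.

```python
from collections import defaultdict


def count_agreeing_attesters(claims: list[tuple[str, str]]) -> dict[str, int]:
    """claims: (attester_id, sha256_hex) pairs -> distinct attesters per digest."""
    attesters_by_digest = defaultdict(set)
    for attester_id, digest in claims:
        # Repeated claims from the same party only count once.
        attesters_by_digest[digest].add(attester_id)
    return {digest: len(ids) for digest, ids in attesters_by_digest.items()}
```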

Is this sufficient explanation? I can expand on this more later.


There's no reason to believe that the hashed and timestamped content was hosted at a particular domain, however (unless the content was signed by the author, of course, in which case no blockchain is necessary). Sure, multiple peers could make some attestation that they saw it at that URL, but then you're back at square one of the reputation problem.

The Internet Archive, as an institution with a reputation that holds up to a judge, is actually more valuable than a cryptographic proof that x bytes existed at y time.


No, definitely not. I have no inherent reason to trust the people working at the Internet Archive over, let's say, a close friend. For me, trust is always a human-to-human concept, and no amount of tech or institutions will change that.

The more people I hear making a claim, the more likely I am to deem the claim true. This is even true regarding the claims that cryptographic algorithms have the properties that make them useful in these contexts. I say this as someone who has even taken graduate-level classes with Ron Rivest.

I'm not sure what will happen in a court. I imagine the more people that start making claims using cryptography as part of the supporting evidence, the more likely people will start to trust cryptography as a useful tool for resolving disputes about the veracity of claims.

So you would not get any value from multiple people making such claims?


Wow, thanks for sharing your perspective; it's quite different from mine. For me, reality is not democratic: the number of people making a claim doesn't influence the truthiness of it.

I bring up judges because Internet Archive captures have been used as evidence in court cases. The first one I pulled up [0] makes an interesting distinction on whether the archive's snapshots are merely hearsay:

  The hearsay rule does not apply to the document (so far as it contains the representation) if the representation was made:

  (a)    by a person who had or might reasonably be supposed to have had personal knowledge of the asserted fact; or ...
The archive's office manager submitted an affidavit to the court as someone who would have personal knowledge of the fact that the date and claimed availability of the content are accurate. There's no cryptography involved, just an individual's and an institution's reputation. This carries much more weight than any number of anonymous individuals attesting to a cryptographic proof.

[0] https://www.judgments.fedcourt.gov.au/judgments/Judgments/fc...


I think the best solution is to have multiple people with reputation attest to the encrypted TLS content without being able to see the cleartext of it; that way they can't easily tamper with it.

See my comments on TLSNotary stuff below...


Woah, cool, yes, exactly this!

I think I read a paper or blog post about this concept a while ago, but never saw it implemented!



