Simple Encrypted Arithmetic Library goes open source (microsoft.com)
215 points by dsr12 on Dec 3, 2018 | 62 comments



To give some context, this is moving away from Microsoft's "open source" license that basically requires you to give Microsoft access to whatever you make with their libraries.

Moving to MIT is great for this. As someone working on FHE, I think we'll finally be able to see some interesting applications with this library other than Microsoft's research. Great job to the SEAL team!


What does FHE stand for? All my searches are leading to religious articles about spending time with your family


I believe they're referring to Fully Homomorphic Encryption [1]

[1] https://en.wikipedia.org/wiki/Homomorphic_encryption#Fully_h...


Fully Homomorphic Encryption


I am not an encryption expert, but to the casual techy follower, homomorphic encryption seems like it could be huge for the union of machine learning and digital privacy. Are there any experts on HN who could comment on how mature this technology is? Or does anyone know of any interesting projects that use this? I know Numerai is using something of this sort, but it is hard for me to tell how legitimate it is.


I work at Microsoft Research with the SEAL team on applications for Fully Homomorphic Encryption (FHE). Batched integer FHE schemes are great for machine learning. The regular data access patterns in neural networks make using the batching easy and the rather "wide" computations lead to good encryption parameters. Also a new scheme, CKKS, gives the ability to divide by certain plaintext constants, which helps control scaling in fixed point representations. This comes in exchange for results being approximate, which is not a problem in machine learning.
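To make the scaling point concrete, here is a minimal plaintext-only sketch of fixed-point arithmetic. It performs no encryption and is not the SEAL API; the names `SCALE`, `encode`, `multiply`, and `rescale` are made up for illustration. It only shows why a multiplication squares the scale and why dividing by a known constant brings it back, at the cost of a small rounding error.

```python
# Plaintext-only sketch of fixed-point scaling; nothing here is encrypted.
SCALE = 2 ** 20  # hypothetical scaling factor, chosen for illustration


def encode(x: float) -> int:
    """Represent a real number as an integer at scale SCALE."""
    return round(x * SCALE)


def decode(v: int, scale: int = SCALE) -> float:
    """Convert a scaled integer back to a float."""
    return v / scale


def multiply(a: int, b: int) -> int:
    """Multiply two encoded values; the result sits at scale SCALE**2."""
    return a * b


def rescale(v: int) -> int:
    """Divide by SCALE with rounding to return to the working scale."""
    return (v + SCALE // 2) // SCALE


a, b = encode(3.14159), encode(2.71828)
raw = multiply(a, b)      # scale has grown to SCALE**2
product = rescale(raw)    # back at SCALE, slightly rounded
print(decode(product))    # close to 3.14159 * 2.71828, but approximate
```

Without the rescale step the scale would keep growing with every multiplication, which is the problem the division by a plaintext constant addresses.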

Currently what FHE needs is better toolchain support. Selecting encodings and encryption parameters is hard, but a lot can be done with good language support and tooling. We're presenting some of our work on automatically selecting good data layouts for NNs on FHE at the PPML workshop in NIPS next weekend [1].

[1] https://ppml-workshop.github.io/ppml/


Homomorphic encryption (full and partial) opens the road to some unforeseen forms of obfuscation. This basically means that we'll see the status quo change in this area quite soon.

While fully homomorphic encryption schemes tend to be on the slower side for now, partially homomorphic systems are fast.
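As an illustration of a partially homomorphic system, here is a toy sketch of the Paillier scheme, which supports addition on ciphertexts only. The primes are tiny and hard-coded purely for readability, so this is an assumption-laden demo and not secure; the function names are my own.

```python
import math
import secrets

# Toy primes for illustration only; real keys use primes of 1024 bits or more.
P, Q = 61, 53
N = P * Q
N2 = N * N
G = N + 1                                          # common simple generator choice
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAM, -1, N)                               # modular inverse, Python 3.8+


def encrypt(m: int) -> int:
    """Encrypt 0 <= m < N with fresh randomness r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Recover the plaintext using the private values LAM and MU."""
    x = pow(c, LAM, N2)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts mod N."""
    return (c1 * c2) % N2


c_sum = add_encrypted(encrypt(1200), encrypt(34))
print(decrypt(c_sum))  # 1234
```

Each homomorphic addition is a single modular multiplication, which is why schemes of this kind are cheap compared to FHE.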


One issue to be aware of: it is typically very (very) slow.


Another big advantage I'd hope to see is it enabling more interesting volunteer computing efforts or maybe a truly distributed web infrastructure.


What are the performance costs of using such library vs raw data operations?


To give you an estimate: an AES encryption operation takes about 2 seconds to compute in homomorphic encrypted domain on a generic machine.

A non-encrypted AES operation is approximately 10,000,000 times faster.


I understand that initial encryption costs a lot, that can be one-off though. My question was more about the computation on data - whether the same computation (using this library?) on homomorphically encrypted data is performing as well as on unencrypted data? Is there a penalty to pay on every operation after initial encryption?


Has anyone tried to use this to create a distributed cloud? If storage and certain types of computation could be encrypted, anyone could sell their spare computation power without trust issues.


Somewhat Homomorphic Encryption unfortunately only works for a pretty limited set of workloads and FHE is still mind bogglingly slow so a general purpose cloud for arbitrary computation isn't going to be a thing any time soon.


Also, by itself, homomorphic encryption doesn't protect against chosen-ciphertext attacks, which makes it very unattractive for 99.9% of real world crypto use-cases.

https://tonyarcieri.com/all-the-crypto-code-youve-ever-writt...

In my experience, a lot of teams also reach for FHE to solve problems that are easily and securely solved without it (e.g. searchable encryption):

https://github.com/paragonie/ciphersweet

That being said, there is the 0.01% problem space where FHE makes perfect sense (mostly in the realm of encrypted databases and machine learning). For those systems, if the database can ever be considered adversarial (your threat model may rule this out, most applications' do not), a simple construction involving HMAC, a secure digital signature algorithm, and an append-only ledger can protect the application that reads from the database against chosen-ciphertext attacks.

https://paragonie.com/blog/2017/12/assuring-ciphertext-integ...


Why does lack of CCA security make FHE unattractive? The entire point of FHE is to allow malleability of ciphertexts.
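For readers unfamiliar with malleability, here is a minimal sketch using a one-time pad (XOR), which is not FHE but shows the same property in its simplest form. The message format and the attacker's knowledge of it are assumptions made for the demo.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


message = b"PAY 100"
key = secrets.token_bytes(len(message))  # known only to sender and receiver
ciphertext = xor_bytes(message, key)

# The attacker never sees the key, only guesses the plaintext format.
delta = xor_bytes(b"PAY 100", b"PAY 900")
tampered = xor_bytes(ciphertext, delta)

print(xor_bytes(tampered, key))  # b'PAY 900'
```

The receiver decrypts a message the sender never wrote. CCA-secure schemes are designed to reject this kind of modification, while homomorphic schemes deliberately allow a controlled version of it.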


CCAs have historically been useful for defeating the confidentiality guarantees of cryptography protocols. BB'98, Vaudenay '02, the iMessage attack, etc.

Selling "encryption, but not IND-CCA3" to a lot of (clueful, at least) companies is almost impossible.

If you read the linked article, it discusses a way to hack it in.


In case anyone else wondered: https://en.wikipedia.org/wiki/Homomorphic_encryption

"Homomorphic encryption is a form of encryption that allows computation on ciphertexts, generating an encrypted result which, when decrypted, matches the result of the operations as if they had been performed on the plaintext."
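A tiny illustration of that definition, using unpadded "textbook" RSA, which happens to be homomorphic with respect to multiplication. The parameters are toy values picked for readability and this is not how RSA should be used in practice; it is only a sketch of the property.

```python
# Textbook RSA with toy parameters; insecure, for illustration only.
P, Q = 61, 53
N = P * Q
PHI = (P - 1) * (Q - 1)
E = 17
D = pow(E, -1, PHI)  # private exponent, Python 3.8+


def encrypt(m: int) -> int:
    return pow(m, E, N)


def decrypt(c: int) -> int:
    return pow(c, D, N)


# Multiply the ciphertexts without ever decrypting the inputs.
c_product = (encrypt(7) * encrypt(6)) % N
print(decrypt(c_product))  # 42
```

The party doing the multiplication only ever handles ciphertexts, yet the decrypted result matches 7 * 6 computed on the plaintexts.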


Helpful! This kind of encryption is required to make relational data secure. https://arxiv.org/pdf/1512.03498.pdf


I recall several years ago hearing about other companies, particularly IBM, patenting technologies related to homomorphic encryption. I wonder if there is any risk to some of this infringing on a patent and what that means for users of OSS when that happens.


Isn't the "SEAL" acronym a bit overloaded in the realm of cryptography? https://en.wikipedia.org/wiki/SEAL_(cipher)


Now they can say this is not your father's "military-grade encryption". This is SEAL-grade encryption!


I'd never heard of it, a web search turns up only a handful of mentions, and it's not even the only thing in crypto to be called 'seal'.


I figured it wasn't the only one. People who have gone through Cisco material have an outsize exposure to this particular flavor of SEAL because it's one of the few presented alternatives to AES in their material: https://security.stackexchange.com/questions/141879/cisco-sa...


Not simple. But super cool stuff


I know very little about C, C++, or Rust, so I want to ask this to understand why C++ was picked for this. My limited understanding is that people were tolerating C++ libraries like OpenSSL, but that new crypto or security code would likely be written in Rust for safety.

The blog doesn't say much (to me) about the choice:

    In addition to having no external dependencies, Microsoft SEAL
    is written in standard C++, making it easy to compile in many
    different environments.

Is Rust unsuitable for this project? If anyone here worked on SEAL, I'd love to learn more about this -- did the SEAL team consider using Rust?


> My limited understanding is that ... that new crypto or security code would likely be written in Rust for safety.

Where did you get that impression? The creators of Rust would undoubtedly love that to be the case, but Rust is still essentially in its infancy. To my knowledge there are almost no major projects released that have been built in Rust. It's ranked below such heavy hitters as Ada and Prolog for popularity on the TIOBE index[1]. It's pretty unlikely that Rust will become the standard language for everything crypto- or security-related anytime soon (or ever, but it could happen).

[1] https://www.tiobe.com/tiobe-index/


> there are almost no major projects released that have been built in Rust.

https://news.ycombinator.com/item?id=18545373

This includes Microsoft, which I forgot in that post, who uses it in production for their IOT product, and as part of Visual Studio Code.

That said, for crypto, there are some less than ideal things. That’s not stopping projects from working on it.


> That’s not stopping projects from working on it.

To support that...

Grin contains a phenomenal amount of cryptography: https://github.com/mimblewimble/grin

Bulletproofs implementation in rust (small proof sizes): https://github.com/dalek-cryptography/bulletproofs

...with tons of really well done cryptography by that group (dalek): https://github.com/dalek-cryptography

Pairing based crypto library used in the second largest distribution of Ethereum (and maybe zcash?): https://github.com/paritytech/bn

These libraries have prompted me to start learning rust (albeit slowly as I'm using Go and C at work).


To be clear, I did not claim no one was using it. "Almost no major projects" implies that there are a few major projects. It's a small set of projects, though, especially in comparison with something like C++.


Yep, that’s very much true. It’s been on a big upward trajectory lately. But that’s easy when you’re starting small :)


Microsoft might experiment with non-mainstream stuff, but in practice, most of the code in the company is still written in C++, C#, and lately, TypeScript. Much like other large corps, it has a lot of people proficient in certain technologies, and going for something exotic is rarely justifiable purely from the perspective of being able to maintain a team for such a project long-term.


Sure. I certainly don’t mean to imply Microsoft is now built on Rust.

But it is a big place, and some places do get to experiment.


Where is rust used in Code?


Rust isn't directly used in VSCode. VSCode makes use of ripgrep (a Rust-based replacement for grep), though.


Okay that’s what I thought. So that’s not really Microsoft “using Rust in production code” so much as one guy (I know him; he likes coconut la croix; I do too) deciding ripgrep would be a good basis for implementing the FileSearchProvider api.


The Code situation is the least strong; see the other sibling threads for more.


PHP ranks above Swift on TIOBE, I should probably use that for writing iOS applications then.

Edit: I guess to make the point crystal clear since there are downvotes, "search engine popularity" is not equivalent to "appropriate tool for the job". TIOBE could equally be viewed as "which languages are hardest to use" or "which languages attract the most newcomers", both of which skew search engine query rankings.


Not sure how your edit makes your point more clear. The intro explains how it's calculated and how you could use the data that TIOBE provides. Personally I wouldn't worry too much about such lists; use them as inspiration to find new technologies. The more languages you know, the easier it is to switch when the world around you changes.


The downvotes are likely for the fact that no one made your strawman claim. “Rust isn’t very popular” is not a statement of its suitability for any particular purpose. It’s a response to the idea that somehow it would have already displaced C/C++ for all security-sensitive work.


> so I want to ask this to understand why C++ was picked for this

Probably in-house competence & interoperability: tons of languages provide bindings to C++ (including Microsoft's .NET Framework). And I guess researchers at MSR have other things to do than learn a new language for the sole purpose of winning the approval of one or two netizens.


Not sure about this one, but crypto code has some strange requirements (like constant-time execution, or storing keys in some "secure" CPU registers/zones), so you end up writing parts of the code in ASM anyway. Maybe it is easier to mix that with C/C++ than with Rust.


Rust is great, but its adoption is minuscule compared to C++. Given the point of this release is to encourage wide scale adoption of certain algorithms, C++ is the realistic choice at this time.


SEAL was started before or soon after Rust became stable.


This is such an off-tangent comment, made worse by the fact that it pretends to be on-topic.

This adds nothing of value to the conversation and only distracts from conversations worth having about encryption and practical applications, and shoots down the work that’s gone into this.


This is such an off-tangent comment, made worse by the fact that it pretends to be on-topic.

This adds nothing of value to the conversation and only distracts from conversations worth having about the question asked by the parent, and shoots down the work that’s gone into that comment.


Choice of language is a critical design decision for a real world crypto implementation.

One can prove the math is correct, but proving that in actual use it doesn't leave secrets behind in memory or leak information in timing variations requires guarantees that few (if any) languages provide.
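To illustrate the timing side of this, here is a small sketch contrasting an early-exit comparison with the standard library's `hmac.compare_digest`. The function names `leaky_equal` and `constant_time_equal` are my own, and whether the timing difference is actually observable depends on the runtime and hardware, so treat this as a sketch of the idea and not a proof.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Naive comparison: returns at the first mismatching byte."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # early exit; run time depends on the secret
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison designed not to short-circuit on the first mismatch."""
    return hmac.compare_digest(a, b)


secret_tag = b"\x13\x37" * 16
guess = b"\x13\x37" * 15 + b"\x00\x00"
print(leaky_equal(secret_tag, guess), constant_time_equal(secret_tag, guess))
```

Both functions return the same answers; the difference is only in how long they take, which is exactly the kind of property a correctness proof of the math says nothing about.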


> C++ libraries like OpenSSL

OpenSSL is a C library.


Never roll out your own crypto.


When people say "never roll your own crypto", they mean you, the amateur who doesn't have the experience, expertise, or scale to get it right. They don't mean that literally no one should ever write crypto code.


Yes, if there are experts today on homomorphic encryption, one would assume that Microsoft's department of Cryptographic Research plausibly includes some (if not all) of them.

Their publications are online, if one wants to see what peer reviewed works they've published: https://www.microsoft.com/en-us/research/group/cryptography-...


One would assume that the "Cryptography Research Group" at Microsoft would include some world-class cryptographers, and I would think it's okay for cryptographers to roll their own crypto, or there won't be any progress. Their work is open source so their peers in industry and academia can actually review it.


Not that I distrust this particular code, but Microsoft has had some serious blunders in the past. Like chopping off any password over 8 characters.


I always forget that I should not use sarcasm on HN. Apologies.


Perhaps someone can explain why this can be trusted? Microsoft has a long history of cooperating with governments across a broad range of oppressiveness (e.g. US, China, etc). I wasn't able to quickly/easily find any peer reviews of this implementation. Without any more information, how do we know this isn't backdoored in some way?


I believe releasing it to open source makes the code more trustworthy. Now, anyone can go in and verify how it works. Any backdoor would be visible in the code.


>Any backdoor would be visible in the code.

AFAIK, the code for a Dual_EC_DRBG implementation wouldn't look backdoored, as the backdoor was in the mathematics of the crypto itself.

Even if the code itself had a backdoor, a carefully hidden one may go undetected. Like a missing "goto" or a single equals sign; `if x = y`. An example here: https://freedom-to-tinker.com/2013/10/09/the-linux-backdoor-...


> I believe releasing it to open source makes the code more trustworthy

I don't, and I think it's dangerous to assume that. Many open source projects have systems in place for publicly reviewing and approving code, but Microsoft does not have a history of conducting public code reviews for their projects. I would never assume that publicly viewable code is being reviewed by others who know what they are doing if it's not obvious that they are.

> Now, anyone can go in and verify how it works. Any backdoor would be visible in the code

I would only trust those with a high degree of mathematical knowledge, specifically around the subject of cryptography, to be able to review the code in any meaningful way. The rest of us can verify that it builds successfully and, well, that's about it.


Be that as it may, if you’re trying to hide something releasing it as open source is a pretty terrible way to do it.


Not if it's highly unlikely anyone with enough knowledge to know otherwise is going to review it. The upside is that you get folks like the one above who automatically assume open source software is safe, and use it.


Or just don’t bother open sourcing it and running the risk of being caught. In principle I agree with you, but in practical terms it would be a very wacky double bluff to pull.


I am working on the SEAL team and agree that these are reasonable concerns and questions. With open-sourcing we are hoping to build more trust in our implementation of this admittedly complicated technology. In addition to our internal code reviews at Microsoft, we are hoping to engage with the broader OSS and crypto researcher community in identifying and addressing any issues and concerns that are brought to our attention.

Note also that the theory of homomorphic encryption has been developed in the open scientific community; all schemes and security estimates are backed up by published papers, and multiple other publicly available implementations. SEAL itself has been publicly available since 2015, although under a non-commercial license. In 2017 Microsoft helped launch the HomomorphicEncryption.org consortium for standardizing homomorphic encryption (see http://HomomorphicEncryption.org). Today, this group consists of more than 300 scientists from around the world, including partners and participants from industry (Microsoft, IBM, Duality Technologies, Intel, Google, multiple start-ups), top researchers from universities (MIT, Seoul National University, EPFL, ...), and government (e.g. NIST).



