Google Announces First Practical SHA1 Collision

  • #21
    So I'm seeing something really silly in this discussion... what Google has demonstrated doesn't create any particular worry when it comes to file integrity checks. The reason is that it is so overwhelmingly difficult to intentionally manufacture a file that has an SHA1 checksum collision that it isn't a concern for ANYBODY.

    The main consideration here is only whether or not an SHA1 HASH can be duplicated. This has security considerations, NOT related to file integrity checking, but to the storage of SECURITY CREDENTIALS.

    And it is a concern for EVERYBODY, because it has to do with logging into your BANK.

    Here is the thing:
    YOU know your password, your bank does NOT.

    Your bank knows a HASH of your password, which they use to calculate whether or not YOU know your password, and in many cases, this is an SHA1 hash.
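
    To illustrate the mechanism being described, here is a minimal sketch of plain, unsalted hash-based password verification (the variable names and the use of Python's hashlib are purely illustrative, not how any real bank publishes its scheme):

    Code:
import hashlib

# What the site stores at registration time: not the password itself,
# only its SHA1 digest.
stored_hash = hashlib.sha1(b"correct horse battery staple").hexdigest()

def login_attempt(candidate: bytes) -> bool:
    # The site hashes whatever the user types and compares digests.
    # Any input whose SHA1 digest equals the stored one is accepted,
    # even if it is not the original password.
    return hashlib.sha1(candidate).hexdigest() == stored_hash

print(login_attempt(b"correct horse battery staple"))  # True
print(login_attempt(b"wrong guess"))                   # False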

    A hacker breaks into your bank's website and dumps the user database, or an insider leaks it, or WHATEVER, which includes the SHA1 hash of everybody's password.

    If it is practical to create an SHA1 collision, then they can generate a new password to log into your bank account, and that password they generate doesn't necessarily even MATCH yours. This attack also doesn't depend on trying trillions of times to log into the bank's website since they can apply a botnet to generate the fake password and log in on the first attempt!

    And to make it even worse, they have a huge list of customer login hashes to generate colliding passwords for, which could significantly reduce the time before they get something that is useful for stealing money.



    • #22
      Originally posted by droidhacker View Post
      Your bank knows a HASH of your password, ..., this is an SHA1 hash.
      If your bank still uses SHA1 hashes, I would quit the account, like, immediately...



      • #23
        Originally posted by karolherbst View Post
        It is today maybe, but not in the future. And then somebody suddenly finds a much better attack and then it just takes 10 GPUs all of a sudden. And exactly because of things like this, you always consider attackable things as broken. Especially if you can attack those things in a cheap manner in 10 years.
        Again: what exactly do you expect to accomplish by brute-force analyzing a checksum? You can't reverse it and get anything useful out of it. Looking at the server side of things (so, files not accessible from outside the LAN), you don't need checksums for security. If someone is tampering with your files on the server, you've already made a mistake with either your network security or not encrypting the files in the first place. You can't tamper with an encrypted file and expect it to make sense once decrypted (assuming you don't have the decryption key). If corrupting something is your only goal, then it doesn't matter if the checksum matches or not.
        If someone is remotely tampering with files on the server level, again, you have bigger problems than whether your checksums match.
        So, that only leaves us with someone intercepting network packets and tampering with the file they intercepted when a user requests it. Putting aside the woeful complications of intercepting the file alone, do you honestly think that even 100 GPUs would be able to modify a file to have the exact same SHA1 sum in a timely manner and have the replacement file coincidentally functional enough that either:
        A. the hacker benefits from the change
        B. the hackee is negatively affected by the change
        How paranoid could you possibly get? Remember - checksums aren't intended to protect against this.

        SHA1 is now broken, I don't use broken encryption, period. RC4 shouldn't be used either, because it's fundamentally broken. Would you say that even RC4 is "good enough" for average users?
        That's because it isn't encryption. It's fine to use it to aid in the prevention of malice, but if it is your primary dependence for security and encryption, that is a mistake.

        The point of hashes is to identify or check the integrity of files, of course. Sometimes I have to send financially relevant documents over the internet, and I have scanned files for this. Of course it has to be proven somehow that those files weren't modified in between.

        And because e-governance is becoming a thing, this is getting more and more important by the day.
        And like I explicitly said before, it is wise to not use anything less than SHA256 for these things... Doesn't mean you need it, but taking shortcuts with security and data integrity (where that is a priority) is a bad idea.
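
        For example, here is a minimal sketch of computing an SHA256 checksum of a file with Python's hashlib (the file name is just a placeholder):

        Code:
import hashlib

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Both sides compute the digest and compare it out of band.
print(sha256_of_file("scanned_document.pdf"))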

        Sorry, but this is pure bullshit. There might be "evil" actors which won't tell that they found issues within security algorithms. How can I or google or whoever know this? And because you can't be sure, you expose this if you don't belong to the "evil camp" (like secret services, which don't expose such security issues in all cases).
        I already explained why the "evil actors" are irrelevant. Hackers don't spend the time and money attempting to prove something that they already know is statistically possible. Just because they know they can replicate a checksum with a different file, that doesn't change the fact that knowing this doesn't help them figure out how to do it in a useful way. Again, there's not much of a reason to replace a file with another of the exact same checksum if it's corrupted. The only sensible purpose of disguising a checksum is to sneak information through.
        Last edited by schmidtbag; 23 February 2017, 02:03 PM.



        • #24
          Originally posted by TheBlackCat View Post

          No, this attack is not a brute-force attack. So the sentence you were quoting is comparing Google's faster attack against a brute-force attack. This is the typical comparison when measuring the speed of such attacks, since everything by definition is vulnerable to brute-force attacks.
          Oops, my bad



          • #25
            Originally posted by AndyChow View Post
            Still not really an issue. We typically use more than one type of checksum. Even if MD5 and SHA1 are falsifiable individually, they still aren't collidable together. So if it passes the SHA1 but fails the MD5, that doesn't help. Most times there are 3 different checksums done.

            And in that context, MD5 is still very useful. If it fails MD5, discard and move on. If it passes, then SHA1, SHA256, SHA512, then you know your file hasn't been tampered with. Assuming the signature hasn't been compromised, which is more likely than trying to compromise the file by buffering it with some pixie magic that makes it collide.
            Does not work that way. According to this paper https://www.iacr.org/cryptodb/archiv.../1472/1472.pdf, if you use multiple hashes for the same file, the security is determined by the strongest hash and there is no added strength from the additional hashes.
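
            As an illustration of the multi-checksum practice being discussed, here is a minimal sketch that computes several digests of the same file in a single pass (the file name is a placeholder; per the paper above, the combined strength is still roughly that of the strongest hash):

            Code:
import hashlib

def multi_digest(path: str) -> dict:
    """Compute MD5, SHA1 and SHA256 digests of a file in one pass."""
    hashes = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            for h in hashes.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashes.items()}

# A downloaded file is accepted only if every published digest matches.
print(multi_digest("some_release.tar.gz"))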



            • #26
              Though it's still not too easy to come by such an attack: Google's SHA1 "shattered" attack takes 110 GPUs one year of work to produce a collision, while a SHA1 brute-force attack would take 12 million GPUs and a year's worth of work.
              Originally posted by jass0 View Post
              The second occurrence should surely be SHA256?
              The comparison is between a brute-force attack on SHA1 and Google's optimised "Shattered" attack, which makes it much quicker to find a collision.
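
              As a rough sanity check on those numbers (assuming the ~2^63.1 SHA1 computations reported for the Shattered attack and the generic ~2^80 birthday bound for a 160-bit hash):

              Code:
# Generic birthday-bound collision search on a 160-bit hash: ~2**80 SHA1 computations.
# Google's Shattered attack: ~2**63.1 SHA1 computations (per the Shattered paper).
speedup = 2**80 / 2**63.1
print(f"Shattered is roughly {speedup:,.0f}x faster than brute force")

# Roughly consistent with the GPU-count figures quoted above:
print(f"GPU-count ratio: {12_000_000 / 110:,.0f}")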



              • #27
                Originally posted by F.Ultra View Post

                Does not work that way. According to this paper https://www.iacr.org/cryptodb/archiv.../1472/1472.pdf, if you use multiple hashes for the same file, the security is determined by the strongest hash and there is no added strength from the additional hashes.
                You have either not read, or not understood, that paper.



                • #28
                  Originally posted by OneTimeShot View Post
                  For checksums to check you haven't accidentally got a corrupt file you could use CRC for all it matters...
                  (though perhaps not exactly CRC).

                  I totally agree that in theory SHA-1 or MD5 or even CRC would be enough if you just want to check against corruption, or just need to hash something (as opposed to resist against an attacker). Rsync is still using MD4 and MD5 for that exact purpose.

                  I would still stick with SHA-1 specifically, though, as it is found in hardware implementations, which makes it easy.

                  Also, CRC doesn't have very good properties. If you want a lightweight CPU hash, I would go for XXHASH64 (by LZ4's author): better properties than CRC and much faster.
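
                  For instance, a minimal sketch using the third-party xxhash Python package (pip install xxhash; the file name is a placeholder) for a fast, non-cryptographic file checksum:

                  Code:
import xxhash  # third-party package: pip install xxhash

def xxh64_of_file(path: str) -> str:
    """Fast non-cryptographic checksum, suitable for corruption detection only."""
    h = xxhash.xxh64()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

print(xxh64_of_file("backup.img"))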

                  Originally posted by karolherbst View Post
                  It is today maybe, but not in the future. And then somebody suddenly finds a much better attack and then it just takes 10 GPUs all of a sudden.
                  As discussed above, it all depends on the use case.
                  - If you use SHA-1 to detect intentional tampering by an adversary, yup, it's time to shift to SHA-2 or SHA-3 already (and you shouldn't actually have waited for this, but started the push when SHA-2 became a standard).
                  - If you simply use SHA-1 to detect file corruption (e.g. because your target CPU has it as a handy hardware instruction), then SHA-1 is indeed good enough. Cosmic radiation and HDD bitrot won't specifically run Google's attack on GPU clusters with the malicious intent to produce collisions.

                  Originally posted by AndyChow View Post
                  Still not really an issue. We typically use more than one type of checksum. Even if MD5 and SHA1 are falsifiable individually, they still aren't collidable together. So if it passes the SHA1 but fails the MD5, that doesn't help. Most times there are 3 different checksums done.
                  In theory, several hashes might be better than one single hash, but...
                  - that's quite some useless complexity (and complexity is bound to attract implementation problems).
                  - MD5 is way more primitive and simplistic, and negligible in run time compared to SHA-1, therefore...
                  - ...MD5 will probably soon reach the point where collisions are nearly trivial to make. Thus, it should be possible to add it simply as an additional criterion in the SHA-1 search space (not simply brute force, but brute-forcing combinations that also happen to pass MD5), at maybe the cost of a small factor of additional GPU run time.
                  (Compare this to the generation of "vanity" bitcoin addresses: almost normal key-pair generation, except the hash needs to start with a few selected letters. If this additional criterion is small and light enough (like a short nickname instead of trying to cram a whole play by Shakespeare into the hash), it's not much of a penalty.)
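
                  As a toy illustration of what a small additional criterion in a search space means (this has nothing to do with an actual collision attack; the prefix and the counter loop are purely illustrative):

                  Code:
import hashlib
from itertools import count

def find_md5_prefix(prefix: str = "cafe"):
    """Brute-force a counter until its MD5 hex digest starts with prefix.

    A 4-hex-character prefix costs about 16**4 = 65536 attempts on average,
    i.e. only a small constant factor on top of whatever main search you run.
    """
    for n in count():
        digest = hashlib.md5(str(n).encode()).hexdigest()
        if digest.startswith(prefix):
            return n, digest

n, digest = find_md5_prefix()
print(n, digest)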

                  Originally posted by F.Ultra View Post
                  Does not work that way. According to this paper https://www.iacr.org/cryptodb/archiv.../1472/1472.pdf, if you use multiple hashes for the same file, the security is determined by the strongest hash and there is no added strength from the additional hashes.
                  Exactly. As I said above, if you have 2 hashes, H1 and H2, and the complexity of H1 >>> H2, you can simply ignore H2. The strength of the H1|H2 stack is going to be basically the strength of H1.
                  (Same situation as optimising steps: go for the slowest first, etc.)

                  Originally posted by AndyChow View Post
                  And in that context, MD5 is still very useful. If it fails MD5, discard and move on. If it passes, then SHA1, SHA256, SHA512, then you know your file hasn't been tampered with.
                  Instead of using a complex stacked construct, if your aim is to ensure tamper-proofing, go straight for the best available industry standard, or the best available overall.
                  So concentrate on SHA-2 or SHA-3 depending on your audience.

                  Using a whole stack (MD5, SHA-1, etc.) is at best a waste of time, and at worst a big opportunity to make some mistake or end up only counting on SHA-1 in practice.

                  Originally posted by AndyChow View Post
                  Assuming the signature hasn't been compromised, which is more likely than trying to compromise the file by buffering it with some pixie magic that makes it collide.
                  Not necessarily. Currently HMAC-SHA1 isn't compromised (yet). Public-key signing with sufficiently long keys (RSA > 2048 bits, Ed25519, etc.) isn't either (yet).
                  To spoof a file at the signing level you still need to steal the private key, or spoof it (so additional work at social engineering, or Certificate Authority bribing).
                  To spoof a file at the hash level, you only need to produce a malicious file with the same hash, so you don't even need to tamper with the signed hash file (currently possible with MD5, and soon with SHA1, all within your bedroom or at least on your own GPU cluster, without any need for interaction with a 3rd party).

                  So no, signature isn't necessarily more likely to be compromised.
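
                  A minimal sketch of that difference, using Python's standard hmac module (the key and message are placeholders): an HMAC ties the digest to a secret key, so producing a valid tag requires the key, not just a colliding file.

                  Code:
import hashlib
import hmac

key = b"a secret key only the signer holds"      # placeholder
message = b"contents of the file being published"  # placeholder

# A bare hash: anyone who can produce a file with the same digest matches it.
plain_digest = hashlib.sha1(message).hexdigest()

# An HMAC: without the key, an attacker cannot compute a valid tag.
tag = hmac.new(key, message, hashlib.sha1).hexdigest()

# Verification recomputes the tag and compares in constant time.
ok = hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha1).hexdigest())
print(plain_digest, tag, ok)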

                  Originally posted by droidhacker View Post
                  Here is the thing;
                  YOU know your password, your bank does NOT.

                  Your bank knows a HASH of your password, which they use to calculate whether or not YOU know your password, and in many cases, this is an SHA1 hash.
                  If your bank stores passwords as plain hashes, you need to close your account IMMEDIATELY and run as fast as possible.
                  And choose your next bank such that its website wasn't coded by a 14-year-old in his bedroom.

                  News flash, although there are still tons of stupid websites using this kind of shitty practice, any normal bank follows correct cryptographic procedure.

                  At minimum:
                  - a password is always stored salted (so 2 users having the same password will still get 2 different hashes due to different salts), thus preventing the kind of precomputed rainbow-table attack you mention.
                  - a password is always hashed with a specific PASSWORD-hashing function (a.k.a. a key derivation function), not any random hash (like SHA-1, MD5, etc.). The usual standard is PBKDF2; newcomers are scrypt and Argon2. (These functions are designed on purpose to run slowly and eat tons of RAM. It's still realistic for a webserver to run them now and then to check identity only at log-in time. It's completely hopeless to run them millions of times per second on a GPU or on a specialized ASIC for brute force.) A minimal sketch of this follows after the list.

                  At best:
                  - use 2-factor authentication (OTP, chip card and reader, smartphone app, etc.), because no matter what security you throw at your password storage, stupid users are always going to use "Password1!" for absolutely all of their sites (but it had an uppercase, and a number, and a special character! It was secure according to the password meter?). (Look on YouTube for lectures by security experts about "password patterns" to see why in practice this is a shitty technique.)
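
                  A minimal sketch of the "at minimum" points using Python's standard library (the salt size and iteration count are only illustrative values; check current recommendations):

                  Code:
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 200_000):
    """Return (salt, derived_key); the password itself is never stored."""
    salt = os.urandom(16)  # per-user random salt: same password -> different hashes
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, dk

def verify_password(password: str, salt: bytes, stored_dk: bytes,
                    iterations: int = 200_000) -> bool:
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(dk, stored_dk)

salt, dk = hash_password("Password1!")
print(verify_password("Password1!", salt, dk))  # True
print(verify_password("password1!", salt, dk))  # False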




                  DISCLAIMER:
                  I am a doctor. I work in medical research, and although I'm not a cryptographer (cue in Dr McCoy citation), information security is a pet peeve of mine.
                  Last edited by DrYak; 23 February 2017, 03:03 PM.



                  • #29
                    I take it you didn't read the article?

                    In fact, in 2012 noted security researcher Bruce Schneier reported the calculations of Intel researcher Jesse Walker, who found that the estimated cost of performing a SHA-1 collision attack will be within the range of organized crime by 2018 and for a university project by 2021. Walker's estimate suggested then that a SHA-1 collision would cost $2 million in 2012, $700,000 in 2015, $173,000 in 2018 and $43,000 in 2021.
                    Google was pushing to have SHA1 removed by Jan 1, 2017. Most people in the web industry weren't interested in doing it until a few years later. Because of Google's insistence, Microsoft finally (last April) moved up their schedule to July/August of 2017. So, my assertion that Google is dragging the industry away from SHA-1 stands.



                    • #30
                      Originally posted by willmore View Post

                      I take it you didn't read the article?



                      Google was pushing to have SHA1 removed by Jan 1, 2017. Most people in the web industry weren't interested in doing it until a few years later. Because of Google's insistence, Microsoft finally (last April) moved up their schedule to July/August of 2017. So, my assertion that Google is dragging the industry away from SHA-1 stands.
                      The point is that theoretical vulnerabilities have been known for a while. That article is 6 years old and is only what I googled quickly while at work.

