> MD5 hash is fast and pretty reliable for its purpose; CRC is more reliable
> but also much slower
I'm pretty sure that's the wrong way round. In particular, CRCs don't
attempt to foil deliberate attempts to circumvent them.
> MD5 hash is often used to quickly compare two binaries (are they the same or
> not; e.g. for update/file synchronization programs etc.)
> CRC is often used by compression programs (WinZip, WinRAR etc.) to
> check that they are exactly the same at byte level (to check that the file
> has not been corrupted during deflation)
Well, MD5 can be used for the latter as well, but is harder to
deliberately fool.
MD5 isn't the safest hash algorithm around - there are ways to break it
in certain circumstances - but it's a lot safer from a tampering point
of view than CRC.
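
One way to see why a CRC is easy to fool on purpose: CRC-32 is linear over XOR, so for equal-length messages the CRC of their XOR can be predicted from the individual CRCs, while MD5 has no such structure. The following is a minimal Python sketch of that property (Python is used here only for illustration; the thread itself is about .NET). The three messages and the helper xor_bytes are made up for the example.

```python
import hashlib
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)


# Three hypothetical messages of identical length (17 bytes each).
a = b"pay Alice 100 EUR"
b = b"pay Carol 100 EUR"
c = b"pay Alice 999 EUR"
combined = xor_bytes(a, b, c)

# CRC-32: the checksum of the XOR equals the XOR of the checksums
# (this holds for an odd number of equal-length inputs).
crc_predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
crc_actual = zlib.crc32(combined)

# MD5: the same trick does not work, the digests are unrelated.
md5_predicted = xor_bytes(
    hashlib.md5(a).digest(), hashlib.md5(b).digest(), hashlib.md5(c).digest()
)
md5_actual = hashlib.md5(combined).digest()

if __name__ == "__main__":
    print("CRC-32 predictable:", crc_actual == crc_predicted)
    print("MD5 predictable:   ", md5_actual == md5_predicted)
```

Because the effect of any change on a CRC can be computed in advance like this, a CRC protects against accidental corruption only, not against someone who wants the checksum to come out the same.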

Signature
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Arne Vajhøj - 02 Sep 2007 01:52 GMT
>> MD5 hash is fast and pretty reliable for its purpose; CRC is more reliable
>> but also much slower
>
> I'm pretty sure that's the wrong way round. In particular, CRCs don't
> attempt to foil deliberate attempts to circumvent them.
No doubt about the reliable part.
Your assumption about the speed part is the same one I had. CRC ought
to be much faster than MD5.
But apparently it is not.
#ZipLib CRC-32 is only about 5% faster than .NET MD5.
Maybe that implementation is not very good - Java CRC-32 is
40% faster than Java MD5, but still nowhere near the expected
difference.
I guess CRCs are really intended for hardware, not for
software.
BTW, according to Wikipedia the CRC-32s I used are not a
real CRC, but are supposed to be faster than real CRCs, so ...
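
A comparison like the one above can be reproduced with a small timing harness. This is only a sketch in Python, assuming zlib.crc32 and hashlib.md5 as stand-ins; it measures Python's bundled implementations, not #ZipLib, .NET or Java, so the ratio it reports says nothing about those libraries. The function names and the 8 MB buffer size are arbitrary choices for the example.

```python
import hashlib
import time
import zlib


def time_checksum(func, data: bytes, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds for one call of func(data)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best


def compare_speed(size_bytes: int = 8 * 1024 * 1024) -> dict:
    """Time CRC-32 and MD5 over the same in-memory buffer (no file IO)."""
    data = bytes(size_bytes)  # zero-filled buffer of the requested size
    return {
        "crc32": time_checksum(zlib.crc32, data),
        "md5": time_checksum(lambda d: hashlib.md5(d).digest(), data),
    }


if __name__ == "__main__":
    for name, seconds in compare_speed().items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Taking the best of several runs reduces noise from other processes; the absolute numbers will still vary between machines and library versions.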
Arne
Michel Posseth [MCP] - 02 Sep 2007 10:05 GMT
Strange...
Just one week ago, I was asked to create a network file synchronization
mechanism which did not care about file versions
("if the remote file is different from the local version, copy it locally"). As we are
talking about hundreds of files and a total size of 100+ MB,
I needed a fast way to check these files.
So I went digging on the web for which algorithm would be the fastest and found
my previous conclusion on various websites.
My project is finished and performs superbly, but you are telling me now that
CRC ought to be faster but less reliable?
Michel
Jon Skeet [C# MVP] - 02 Sep 2007 12:18 GMT
> My project is finished and performs superbly, but you are telling me now that
> CRC ought to be faster but less reliable?
It *may* be faster, depending on the exact implementation. However,
it's unlikely that the hash performance is going to be significant
compared with the IO cost. Hashing 100MB of data is likely to be very
quick with either algorithm.

Michel Posseth [MCP] - 02 Sep 2007 14:58 GMT
>However,
> it's unlikely that the hash performance is going to be significant
> compared with the IO cost.
Yes... that is a good one. In my situation, that is the cost of copying a file
that did not need replacing.
However, it seems that I got my implementation right, as an MD5 hash would be more
reliable but probably a bit slower than a CRC,
and in my situation it turned out that this is exactly what I need, because it
is more costly to copy the file over the intranet to the client.
So I guess I had a lucky day when I wrote it :-)
Regards,
and thanks for sharing
Michel
Jon Skeet [C# MVP] - 02 Sep 2007 17:59 GMT
> >However,
> > it's unlikely that the hash performance is going to be significant
> > compared with the IO cost.
>
> Yes ... that is a good one .. in my situation the cost of copying a file
> that did not need replacement
What I meant is that the cost of reading a file in order to calculate
the hash is probably bigger than the computational cost of the hash.
Both MD5 and CRC will require the whole file to be read, so there's no
benefit there either.
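
To make the point concrete, here is a minimal Python sketch (illustration only; the thread is about .NET) that reads a file once in chunks and feeds every chunk to both CRC-32 and MD5. The function name checksum_file and the 64 KB chunk size are assumptions made for the example. Either checksum has to see every byte, so the read loop is identical whichever one is chosen.

```python
import hashlib
import os
import tempfile
import zlib


def checksum_file(path: str, chunk_size: int = 64 * 1024):
    """Read the file once, updating CRC-32 and MD5 from each chunk.

    Returns (bytes_read, crc32_hex, md5_hex).
    """
    crc = 0
    md5 = hashlib.md5()
    bytes_read = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)  # running CRC carried across chunks
            md5.update(chunk)
            bytes_read += len(chunk)
    return bytes_read, format(crc, "08x"), md5.hexdigest()


if __name__ == "__main__":
    # Demonstrate on a throwaway temporary file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"123456789")
    try:
        print(checksum_file(tmp.name))
    finally:
        os.remove(tmp.name)
```

The byte count returned always equals the file size, which is the IO cost that both algorithms share.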

Evan Camilleri - 04 Sep 2007 15:29 GMT
I actually wanted to see some code, since I cannot find how to get an MD5 or CRC
for the object's data.
Thanks,
Evan
Jon Skeet [C# MVP] - 04 Sep 2007 15:37 GMT
On Sep 4, 3:29 pm, "Evan Camilleri" <e...@holisticrd.com.nospam>
wrote:
> I actually wanted to see some code since I cannot find how to get MD5 or CRC
> for the object's data
As Peter said, there's no such concept as "the object's data" that
makes taking an MD5 hash sensible in all cases.
What would the MD5 of a NetworkStream be? Would you have to read all
its contents to find out?
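
Both questions can be illustrated with a short Python sketch (Python is used only for illustration; io.BytesIO stands in for a stream such as NetworkStream, and canonical JSON is just one assumed choice of byte representation, not something the framework provides). Hashing a stream means reading it to the end, and hashing an object means first deciding which bytes represent it. The function names are hypothetical.

```python
import hashlib
import io
import json


def md5_of_stream(stream, chunk_size: int = 4096) -> str:
    """Hash whatever can still be read from the stream; this consumes it."""
    md5 = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
    return md5.hexdigest()


def md5_of_object(obj) -> str:
    """Hash one chosen byte representation of obj: JSON with sorted keys."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.md5(encoded).hexdigest()


stream = io.BytesIO(b"payload")
first = md5_of_stream(stream)   # reads the whole stream
second = md5_of_stream(stream)  # stream is exhausted, so this hashes no bytes

if __name__ == "__main__":
    print(first, second)
    print(md5_of_object({"name": "example", "size": 3}))
```

The second call returns the MD5 of empty input because the first call already consumed the stream, which is why a general "hash of the object" is not well defined for stream-like objects.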
Jon