Crack in Computer Security Code Raises Red Flag


Obscure but Worrying Flaw
Compromises 'Fingerprint'
Widely Used on Internet
By CHARLES FORELLE
Staff Reporter of THE WALL STREET JOURNAL
March 15, 2005; Page A1

With worries about online security already at a high pitch, the discovery of a
crack in a widely used Internet encryption technique has raised another red
flag among government agencies and computer-code experts.

The technique, called a "hash function," has been used for years by Web-site
operators to scramble online transmissions containing credit-card information,
Social Security numbers and other sensitive data. Hash functions are at work,
for instance, for most of the millions of transactions that take place on the
Internet every day. The system, involving an algorithm, or mathematical
formula, was thought to be impenetrable.

But last month, a team of researchers from Shandong University in eastern
China began circulating a draft of a paper showing that a key hash function
used in state-of-the-art encryption could be less resistant to an attack by
hackers than had been thought.

Hash functions generate digital fingerprints, or "hashes," of documents or
data. As with fingerprints, the uniqueness of the hash is what makes hash
functions a great tool for verifying the authenticity of information.


But the Chinese team found different pieces of data that yielded the same hash
when team members used a hash algorithm called SHA-1 -- and their method
generated the identical hash far more efficiently than experts thought
possible. SHA-1 is a federal standard promulgated by the National Institute of
Standards and Technology and used by the government and private sector for
handling sensitive information. It is thought to be the most widely used hash
function, and it is regarded as the state of the art.

Cryptographers say exploiting the flaw for malevolent purposes doesn't seem
practical, even using a lot of computer power. Hash functions are also often
used in conjunction with other cryptographic techniques, which haven't shown
any flaws. But if someone were to exploit the newfound flaw, the most
immediate threat would be to applications involving "authentication." A hacker
theoretically could set up a dummy Web site that appears to have the security
credentials of a trusted, secure site -- and then steal data that is shipped
to this site by unsuspecting users.

Despite what are believed to be remote chances of abuse, the discovery has set
off alarms in the computer-security industry because it overturns a bedrock
belief about a popular encryption system. "Our heads have been spun around,"
says Jon Callas, chief technology officer at encryption supplier PGP Corp. of
Palo Alto, Calif. "Everything is now topsy-turvy." PGP has begun to replace
SHA-1 in its programs.

Another provider of widely used security systems, RSA Security Inc. of
Bedford, Mass., is doing an inventory of its products to see how they use
SHA-1 with an eye toward phasing it out. (RSA makes the popular SecurID cards
used by many companies to ensure that only employees have remote access to
computer networks.) The National Institute of Standards and Technology
recommends not using SHA-1 in any new applications and is instructing federal
agencies to develop plans for removing it from existing ones.

The Chinese team hasn't published its paper on SHA-1, but the flaw is "real,"
says Bruce Schneier, a cryptographer and chief technology officer of
Counterpane Internet Security Inc., who has seen a draft of the paper.
"Academically, this is stunning work."

The Chinese researchers "haven't caused panic yet," says Avi Rubin, a
computer-security expert at Johns Hopkins University. But "it's definitely a
wake-up call."

The discovery follows recent research showing flaws in other hash functions.
And it comes at a time when information-security concerns have been sharply
heightened by problems not involving hash functions.

Recent breaches at data aggregators ChoicePoint Inc. and Reed Elsevier PLC's
LexisNexis exposed personal data on more than 100,000 Americans to identity
thieves. And a poorly designed online system allowed scores of business-school
applicants earlier this month to view decision letters ahead of time.

Hash functions take a piece of data -- anything from an e-mail message to a
giant database file -- and generate a short string of ones and zeros, 160 of
them in SHA-1, that functions as the datum's unique fingerprint. Nothing else
should generate the same "hash," and a person in possession of only the hash
can't figure out what the e-mail said or what the database contained.
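
As a rough illustration of the fingerprint idea described above, the short
Python sketch below uses the standard-library hashlib module to compute SHA-1
hashes. The two sample messages and the helper name sha1_fingerprint are
invented for this example and do not come from the article or the research.

```python
import hashlib


def sha1_fingerprint(data: bytes) -> str:
    """Return the SHA-1 hash of `data` as a 40-character hex string."""
    return hashlib.sha1(data).hexdigest()


# Two hypothetical messages that differ by a single character.
message = b"Pay Alice $100"
tampered = b"Pay Alice $900"

fp_message = sha1_fingerprint(message)
fp_tampered = sha1_fingerprint(tampered)

# SHA-1 always emits 20 bytes (160 bits), whatever the input size.
digest_bits = hashlib.sha1(message).digest_size * 8

print(digest_bits)                # length of the fingerprint in bits
print(fp_message)
print(fp_tampered)
print(fp_message == fp_tampered)  # a one-character change alters the hash
```

The hash has the same length whether the input is one line or a giant
database file, and the hex string by itself does not reveal the input.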

Those properties make hash functions well-suited to "authentication" -- they
are used to make sure the Web site to which you send money actually belongs
to, say, your bank or credit-card company -- not some rogue operator out for a
scam. Hash-function-based authentication is at the core of "digital
signatures" used to verify the identity of users producing documents or e-mail
messages.

Two different chunks of data yielding the same hash is known as a "collision,"
and the Shandong team found one in SHA-1 far faster than thought possible.
Their work hasn't shown any instances of a more serious flaw that would enable
attackers to create duplicating hashes for their choice of data.
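
To make the notion of a collision concrete, here is a minimal Python sketch.
It is not the Shandong team's method. It simply shortens SHA-1 to 24 bits (an
assumption made only so the search finishes in moments) and tries inputs one
after another until two different ones share the same shortened hash. The
function names are hypothetical.

```python
import hashlib
from itertools import count


def truncated_sha1(data: bytes, bits: int = 24) -> int:
    """Keep only the top `bits` bits of the 160-bit SHA-1 hash."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


def find_collision(bits: int = 24):
    """Brute-force search: hash 0, 1, 2, ... until two inputs collide."""
    seen = {}  # shortened hash -> first input that produced it
    for i in count():
        data = str(i).encode()
        h = truncated_sha1(data, bits)
        if h in seen:
            return seen[h], data, i + 1
        seen[h] = data


first, second, attempts = find_collision()
print(first, second, attempts)
# The shortened hashes match even though the inputs differ.
print(truncated_sha1(first) == truncated_sha1(second))
```

Note that the search stumbles on whichever pair happens to collide; it does
not let the searcher pick the contents, which matches the distinction drawn
above between finding a collision and forging a hash for chosen data.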

Burt Kaliski, vice president of research at RSA Security, says collisions
don't greatly affect many applications of hashing. But it's possible, he says,
that a person presenting you a document to be signed digitally with a hash has
secretly created a second document designed to "collide" with the first. Then,
by signing the first, you're unknowingly also signing the second.
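
The scenario Mr. Kaliski describes can be sketched in Python under some loud
assumptions: the hash is deliberately crippled to 16 bits so a colliding pair
is easy to find, an HMAC stands in for a real digital signature, and the key,
documents and function names are all invented for the illustration.

```python
import hashlib
import hmac
from itertools import count

TOY_BITS = 16  # assumption: a tiny hash so collisions are quick to find


def toy_hash(data: bytes) -> bytes:
    """First TOY_BITS bits of SHA-1; a weak stand-in for a broken hash."""
    return hashlib.sha1(data).digest()[: TOY_BITS // 8]


def sign(key: bytes, document: bytes) -> bytes:
    """Hash-then-sign. HMAC stands in for a real signature scheme here."""
    return hmac.new(key, toy_hash(document), hashlib.sha256).digest()


def verify(key: bytes, document: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(key, document), signature)


def colliding_pair(benign: bytes, malicious: bytes):
    """Append harmless-looking suffixes until the two documents collide."""
    table = {}
    for i in range(4096):
        variant = benign + b" #%d" % i
        table[toy_hash(variant)] = variant
    for j in count():
        candidate = malicious + b" #%d" % j
        if toy_hash(candidate) in table:
            return table[toy_hash(candidate)], candidate


key = b"signer-secret-key"  # hypothetical signing key
shown, hidden = colliding_pair(b"I owe Bob $10", b"I owe Bob $10000")

signature = sign(key, shown)           # the signer sees only `shown`
print(shown, hidden)
print(verify(key, hidden, signature))  # the signature also fits `hidden`
```

Because only the hash is signed, any document with the same hash inherits the
signature. With a full-strength hash, preparing such a pair is what is
supposed to be out of reach.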

Also worrying cryptographers is a stream of recent hash compromises. At a
conference in August, problems were reported with MD5, widely used to ensure
integrity of computer data, and other, lesser-used functions. And a French
researcher threw cold water on the commonly held belief that using two hash
functions is more secure than using one.

Recent research has also shown that MD4, long known to have problems, was so
weak that collisions could be found with a few hand calculations -- no
supercomputer required. A Czech cryptographer using the Chinese method claimed
this month to have found collisions in MD5 in only eight hours on a standard
laptop.

Hash functions are perhaps the least well understood cryptographic functions,
cryptographers say. The functions perform a bunch of math on a piece of data,
switch the order of some bits, chop the result down to a fixed length and spit
out the fingerprint. Basically, "you stir it all around and hope you can't
unstir," says Mr. Schneier.

The National Institute of Standards and Technology says it recommends moving
to improved variants of SHA-1 that generate a longer hash, making it harder to
find collisions. The National Security Agency says SHA-1 is fine for now, but
should be phased out by 2010.
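
Why a longer hash helps can be seen with some rough arithmetic. For a hash of
n bits with no shortcut available, a brute-force "birthday" search is expected
to need on the order of 2^(n/2) tries to find a collision. The Python sketch
below tabulates that generic bound for SHA-1 and for the longer SHA variants
available in the standard hashlib module. It assumes no weakness specific to
any of the functions and says nothing about the Chinese team's faster method.

```python
import hashlib

# SHA-1 plus the longer variants that ship with Python's hashlib.
VARIANTS = ["sha1", "sha224", "sha256", "sha384", "sha512"]


def hash_bits(name: str) -> int:
    """Length of the hash in bits, as reported by hashlib."""
    return hashlib.new(name).digest_size * 8


def generic_collision_exponent(name: str) -> int:
    """Exponent e such that brute force needs roughly 2**e tries."""
    return hash_bits(name) // 2


table = {name: (hash_bits(name), generic_collision_exponent(name))
         for name in VARIANTS}

for name, (bits, exponent) in table.items():
    print(f"{name}: {bits}-bit hash, about 2^{exponent} tries by brute force")
```

Each extra bit of exponent doubles the work, so going from a 160-bit hash to
a 256-bit one multiplies the generic brute-force effort by 2^48.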

But Mr. Schneier and some other top cryptographers believe federal agencies
and academic researchers need to develop entirely new flavors of
harder-to-break hash functions. "All the red flags are up for the SHA family,"
says Arjen K. Lenstra, a researcher at Lucent Technologies Inc.'s Bell Labs.
"We can no longer trust them."

SHA-1 was based on MD5, which came from MD4. Xiaoyun Wang, the lead author of
the SHA-1 paper, says her team's method "does not seem to apply directly" to
the stronger SHA variants. Still, in an e-mail she recommends developing
"different style algorithms." The small team's work has been presented at
respected cryptography conferences and its hash-function paper, while
unpublished, has been reviewed in draft form by experts.

Experts say the research weighs particularly on the technology underlying
secure Web sites. An online-banking site, for example, displays a
"certificate" of authenticity to a Web browser, which then compares it, using
hashes, to a third-party certificate repository to be sure the site actually
belongs to the bank.

Mr. Lenstra and colleagues used the Chinese method to produce two different
certificates with the same hash -- something that shouldn't happen. The
certificates aren't for real sites.