Entries Tagged 'Crypto' ↓

On privacy, hashing, and your customers

I’ve talked before about not being a dick when it comes to dealing with private data and personally-identifying information. It seems events have conspired to make it worth diving into some more detail.

Only collect data you need to collect (and have asked for)

There’s plenty of information on the iPhone ripe for the taking, as fellow iOS security boffin Nicolas Seriot discussed in his Black Hat paper. You can access a lot of this data without prompting the user: should you?

Probably not: that would mean being a dick. Think about the following questions.

  • Have I made it clear to my customers that we need this data?
  • Have I already given my customers the choice to decline access to the data?
  • Is it obvious to my customer, from the way our product works, that the product will need this data to function?

If the answer is “no” to any of these, then you should consider gathering the data to be a risky business, and the act of a dick. By the way, you’ll notice that I call your subscribers/licensees “your customers” not “the users”; try doing the same in your own discussions of how your product behaves. Particularly when talking to your investors.

Should you require a long-form version of that discussion, there’s plenty more detail on appropriate handling of customer privacy in the GSMA’s privacy guidelines for mobile app developers.

Only keep data you need to keep

Paraphrasing Taligent: There is no data more secure than no data. If you need to perform an operation on some data but don’t need to store the inputs, just throw the data away. As an example: if you need to deliver a message, you don’t need to keep the content after it’s delivered.

Hash things where that’s an option

If you need to understand associations between facts, but don’t need to be able to read the facts themselves, you can store a one-way hash of the fact so that you can trace the associations anonymously.

As an example, imagine that you direct customers to an affiliate website to buy some product. The affiliates then send the customers back to you to handle the purchase. This means you probably want to track the customer’s visit to your affiliate and back into your purchase system, so that you know who to charge for what and to get feedback on how your campaigns are going. You could just send the affiliate your customer’s email address:

X-Customer-Identifier: iamleeg%40gmail.com

But now everybody who can see the traffic – including the affiliate and their partners – can see your customer’s email address. That’s oversharing, or “being a dick” in the local parlance.

So you might think to hash the email address using a function like SHA1; you can track the same hash in and out of the affiliate’s site, but the outsiders can’t see the real data.

X-Customer-Identifier: 028271ebf0e9915b1b0af08b297d3cdbcf290e3c

We still have a couple of problems though. Anyone who can see this hash can take some guesses at what the content might be: they don’t need to reverse the hash, just figure out what it might contain and have a go at that. For example if someone knows you have a user called ‘iamleeg’ they might try generating hashes of emails at various providers with that same username until they hit on the gmail address as a match.

Another issue is that if multiple affiliates all partner with the same third business, that business can match the same hash across those affiliate sites and build up an aggregated view of that customer’s behaviour. For example, imagine that a few of your affiliates all use an analytics company called “Slurry” to track use of their websites. Slurry can see the same customer being passed by you to all of those sites.

So an additional step is to append a different random value called a salt to the data before you hash it in each context. Then the same data seen in different contexts cannot be associated, and it becomes harder to precompute a table of guesses at the meaning of each hash. So, let’s say that for one site you send the hash of “sdfugyfwevojnicsjno” + email. Then the header looks like:

X-Customer-Identifier: 22269bdc5bbe4473454ea9ac9b14554ae841fcf3

[OK, I admit I'm cheating in this case just to demonstrate the progressive improvement: in fact in the example above you could hash the user's current login session identifier and send that, so that you can see purchases coming from a particular session and no-one else can track the same customer on the same site over time.]

N.B. I previously discussed why Apple are making a similar change with device identifiers.

But we’re a startup, we can’t afford this stuff

Startups are all about iterating quickly, finding problems and fixing them or changing strategy, right? The old pivot/persevere choice? Validated learning? OK, tell me this: why doesn’t that apply to security or privacy?

I would say that it’s fine for a startup to release a first version that covers the following minimum requirements (something I call “Just Barely Good Enough” security):

  • Legal obligations to your customers in whatever country those countries (and your data) reside
  • Standard security practices such as mitigating the OWASP top ten or OWASP mobile top ten
  • Not being a dick

In the O2 Labs I’ve been working with experts from various groups – legal, OFCOM compliance, IT security – to draw up checklists covering all of the above. Covering the baseline security won’t mean building the thing then throwing it at a pen tester to laugh at all the problems: it will mean going through the checklist. That can even be done while we’re planning the product.

Now, as with everything else in both product engineering and in running a startup, it’s time to measure, optimise and iterate. Do changes to your product change its conformance with the checklist issues? Are your customers telling you that something else you didn’t think of is important? Are you detecting intrusions that existing countermeasures don’t defend against? Did the law change? Measure those things, change your security posture, iterate: use the metrics to ensure that you’re pulling in the correct direction.

I suppose if I were willing to spend the time, I could package the above up as “Lean Security” and sell a 300-page book. But for now, this blog post will do. Try not to be a dick; check that you’re not being a dick; be less of a dick.

On explaining stuff to people

An article that recently made the rounds, though it was written back in September, is called Apple’s Idioten Vektor. It’s a discussion of how the CCCrypt() function in Apple’s CommonCrypto library, when used in its default cipher block chaining mode, treats the IV (Initialization Vector) parameter as optional. If you don’t supply an IV, it provides its own IV of 0x0.

Professional Cocoa Application Security also covers CommonCrypto, CBC mode, and the Initialization Vector. Pages 79-88 discuss block encryption. The section includes sample code for both one-shot and staged use of the API. It explains how to set the IV using a random number generator, and why this should be done.[1] Mercifully when the author of the above blog post reviewed the code in my book section, he decided I was doing it correctly.

So both publications cover the same content. There’s a clear difference in presentation technique, though. I realise that the blog post is categorised as a “rant” by the author, and that I’m about to be the pot that calls the kettle black. However, I do not believe that the attitude taken in the post—I won’t describe it, you can read it—is constructive. Calling people out is not cool, helping them get things correct is. Laughing at the “fail” is not something that endears people to us, and let’s face it, security people could definitely be more endearing. We have a difficult challenge: we ask developers to do more work to bring their products to market, to spend more money on engineering (and often consultants), in return for potentially protecting some unquantified future lost revenue and customer hardship.

Yes there is a large technical component in doing that stuff, but solving the above challenge also depends very strongly on relationship management. Security experts need to demonstrate that we’re all on the same side; that we want to work with the rest of the software industry to help make better software. Again, a challenge arises: a lot of the help provided by security engineers comes in the form of pointing out mistakes. But we shouldn’t be self promoting douchebags about it. Perhaps we’re going about it wrong. I always strive to help the developers I work with by identifying and discussing the potential mistakes before they happen. Then there’s less friction: “we’re going to do this right” is a much more palatable story than “you did this wrong”.

On the other hand, the Idioten Vektor approach generated a load of discussion and coverage, while only a couple of thousand people ever read Professional Cocoa Application Security. So there’s clearly something in the sensationalist approach too. Perhaps it’s me that doesn’t get it.

[1]Note that the book was written while iPhone OS 3 was the current version, which is why the file protection options are not discussed. If I were covering the same topic today I would recommend eschewing CCCrypto for all but the most specialised of purposes, and would suggest setting an appropriate file protection level instead. The book also didn’t put encryption into the broader context of cryptographic protocols; a mistake I have since rectified.

On the top 5 iOS appsec issues

Nearly 13 months ago, the Intrepidus Group published their top 5 iPhone application development security issues. Two of them are valid issues, the other three they should perhaps have thought longer over.

The good

Sensitive data unprotected at rest

Secure communications to servers

Yes, indeed, if you’re storing data on a losable device then you need to protect the data from being lost, and if you’re retrieving that data from elsewhere then you need to ensure you don’t give it away while you’re transporting it.

Something I see a bit too often is people turning off SSL certificate validation while they’re dealing with their test servers, and forgetting to turn it on in production.

The bad

Buffer overflows and other C programming issues

While you can indeed crash an app this way, I’ve yet to see evidence you can exploit an iOS app through any old buffer overflow due to the stack guards, restrictive sandboxes, address-space layout randomisation and other mitigations. While there are occasional targeted attacks, I would have preferred if they’d been specific about which problems they think exist and what devs can do to address them.

Patching your application

Erm, no. Just get it right. If there are fast-moving parts that need to change frequently, extract them from the app and put them in a hosted component.

The platform itself

To quote Scott Pack in “The DMZ”, If you can’t trust your users to implement your security plan, then your security plan must work without their involvement. In other words, if you have a problem and the answer is to train 110 million people, then you have two problems.

Storing and testing credentials: Cocoa Touch Edition

There’s been quite the media circus regarding the possibility that Sony was storing authentication credentials for its PlayStation Network credentials in plain text. I was even quoted in a UK national daily paper regarding the subject. But none of this helps you: how should you deal with user passwords?

The best solution is also the easiest: if you can avoid it, don’t store the passwords yourself. On the Mac, you can use the OpenDirectory framework to authenticate both local users and users with accounts on the network (where the Mac is configured to talk to a networked directory service). This is fully covered in Chapter 2 of Professional Cocoa Application Security.

On the iPhone, you’re not so lucky. And maybe on the Mac there’s a reason you can’t use the local account: your app needs to manage its own password. The important point is that you never need to see that password—you need to know that the same password was presented in order to know (or at least have a good idea) that the same user is at the touchscreen, but that’s not the same as seeing the password itself.

That means that we don’t even need to use encryption where we can protect the password and recover it when we must check the password. Instead we can use a cryptographic one-way hash function to store data derived from the password: we can never get the password back, but we can always generate the same hash value when we see the same password.

Shut up Graham. Show me the code.

Here it is. This code is provided under the terms of the WTFPL, and comes without any warranty to the extent permitted by applicable law.

The first thing you’ll need to do is generate a salt. This is a random string of bytes that is combined with the password to hash: the point here is that if two users on the same system have the same password, the fact that the salt is different means that they still have different hashes. So you can’t do any statistical analysis on the hashes to work out what some of the passwords are. Otherwise, you could take your knowledge that, say, 10% of people use “password” as their password, and look for the hash that appears 10% of the time.

It also protects the password against a rainbow tables attack by removing the one-one mapping between a password and its hash value. This mitigation is actually more important in the real world than the one above, which is easier to explain :-) .

This function uses Randomization Services, so remember to link Security.framework in your app’s link libraries build phase.

NSString *FZARandomSalt(void) {
    uint8_t bytes[16] = {0};
    int status = SecRandomCopyBytes(kSecRandomDefault, 16, bytes);
    if (status == -1) {
        NSLog(@"Error using randomization services: %s", strerror(errno));
        return nil;
    }
    NSString *salt = [NSString stringWithFormat: @"%2x%2x%2x%2x%2x%2x%2x%2x%2x%2x%2x%2x%2x%2x%2x%2x",
                      bytes[0],  bytes[1],  bytes[2],  bytes[3],
                      bytes[4],  bytes[5],  bytes[6],  bytes[7],
                      bytes[8],  bytes[9],  bytes[10], bytes[11],
                      bytes[12], bytes[13], bytes[14], bytes[15]];
    return salt;
}

Now you pass this string, and the password, to the next function, which actually calculates the hash. In fact, it runs through the hashing function 5,000 times. That slows things down a little—on an A4-equipped iPad it takes nearly 0.088s to compute the hash—but it also slows down brute-force attacks.

NSData *FZAHashPassword(NSString *password, NSString *salt) {
    NSCParameterAssert([salt length] >= 32);
    uint8_t hashBuffer[64] = {0};
    NSString *saltedPassword = [[salt substringToIndex: 32] stringByAppendingString: password];
    const char *passwordBytes = [saltedPassword cStringUsingEncoding: NSUTF8StringEncoding];
    NSUInteger length = [saltedPassword lengthOfBytesUsingEncoding: NSUTF8StringEncoding];
    CC_SHA512(passwordBytes, length, hashBuffer);
    for (NSInteger i = 0; i < 4999; i++) {
        CC_SHA512(hashBuffer, 64, hashBuffer);
    }
    return [NSData dataWithBytes: hashBuffer length: 64];
}

Where do I go now?

You now have two pieces of information: a random salt, like edbfe42b3da2995a159c16c0a7184211, and a hash of the password, like 855fec563d91576db0e66d8745a3a9cb71dbe40d7cb2615a82b1c87958dd2e8e56db02860739422b976f182a7055dd223a3037dd3dcc5e1ca28aaaf0bade8a08. Store both of these on the machine where the password will be tested. In principle there isn’t too much worry about this data being leaked, because it’s super-hard to get the password out of it, but it’s still best practice to restrict access as much as you can so that attackers have to brute-force passwords on your terms.

When you come to verify the user’s password, pass the string presented by the user and the stored salt to FZAHashPassword(). You should get the same hash out that you previously calculated, if the same password was presented.

Anything else?

Yes. The weakest part of this solution is no longer the password storage: it’s the password itself. The salt+hash shown above is actually for the password “password” (try it yourself), and no amount of software is going to change the fact that that’s a questionable choice of password…well, software that finally does away with password authentication will, but that’s a different argument.

If you want to limit a user’s ability to choose a simple password, you have to do this at password registration and change time. Just look at the (plain-text) password the user has given you and decide whether you want to allow its use.

On cryptographic file storage

In Chapter 3 of Professional Cocoa Application Security, I talk about using CommonCrypto to encrypt files stored on either Mac or iOS file systems. In Chapter 4, I talk about using CommonCrypto to generate Hashed Message Authentication Codes (HMACs) to verify the integrity of messages received by an app. Unfortunately, I didn’t connect the two in the book.

I actually did discuss the reasons for providing an HMAC in a recent talk on cryptographic storage for iOS. But I went to Justin Clark’s talk at Security B-Sides London, where he demonstrated the attacks that are possible against cryptosystems that don’t verify the integrity of their content. This talk was very informative: if the presentation ever makes its way online, I fully recommend that you check it out.

The talk reminded me that there are important attacks I didn’t discuss in the book. So I need to bring these concepts together for the readers of PCAS. The tl;dr version is: if you encrypt content, you must also verify the integrity of the ciphertext. The long version follows.

I recommended in the book using CBC, or Cipher Block Chaining, mode for AES encryption. In this mode, the first block of plaintext is XORed with an Initialization Vector (IV), and this is then encrypted. The resultant ciphertext is then XORed with the second block of plaintext, and this is encrypted. And so on. This offers some protection against certain cryptanalysis attacks that Electronic Code Book (EBC) mode does not: for example, blocks of ciphertext cannot be arbitrarily swapped without the decrypted plaintext becoming nonsensical. A change in the ciphertext not only affects the decryption of the changed block, but also subsequent blocks.

Well, it turns out that this propagating-change property of CBC mode can be used by attackers—if they have information about errors encountered during decryption—to decrypt and encrypt data without needing to discover the key. Ouch. See the B-Sides talk described above (or google for Padding Oracle Attack) for the full details.

Of course, not providing the padding oracle is an important solution to this attack: but refusing to handle content that isn’t verifiably valid avoids any attack involving modified ciphertext: including encrypted content that lies about its length, if that’s an important consideration for your file format. Therefore, if you’re encrypting data in your app, you should be generating an HMAC and verifying that before attempting a decryption operation. Tampered data is not data you want to deal with.

On the broken(?) Mac App Store

A day after the Mac App Store was launched, people are reporting that it has been cracked. There are two separate stories here, a vapourware circumvention of the FairPlay DRM used to generate the receipts and a report that certain apps aren’t validating the receipts properly. We can ignore the first case for the moment: it’s important, and if it’s true then Apple needs to fix it (and co-ordinate updating the validation code with us third-party developers). But for the moment, it’s more important that developers are implementing the protections that are in place in their applications – it’s those applications that are supposed to be protected.

Let’s skip, for the moment, the question of whether DRM or anti-cracking mechanisms are ethically right, worthwhile, or how much effort you want to put into them. Apple have done most of the legwork, in providing a vendor-signed receipt that’s part of your signed app bundle. What you need to do is:

  • Check whether you have a receipt
  • Check whether Apple signed the receipt you have
  • Check whether the receipt is valid for your product
  • Check whether the receipt is valid for this version of your product
  • Check whether the receipt is valid for this computer

That’s it in a nutshell. Of course, some nutshells surround very big and complex nuts, and that’s true in this case:

  • There’s some good example code for receipt validation at github/roddi/ValidateStoreReceipt, if you’re going to use it then don’t just paste it wholesale. If everyone uses the same code then it’s super-easy for someone to detect and strip that code from each instance.
  • Check at runtime, as well as before startup. If you just check at startup, then all an attacker needs to do is patch main() to jump straight into NSApplicationMain() and your app runs for free.
  • Code obfuscation is not a very effective tool. Having worked in anti-virus, I know it’s much easier to classify code based on what it does than what it is, and it’s quite easy to find the code that opens the receipt file, or calls exit(173). That said, some of the commercial obfuscation companies offer a guaranteed service, so you can still protect your revenue after the app gets cracked.
  • Update I have been advised privately and seen in a blog post that people are recommending hard-coding their app bundle IDs and version numbers into the binary rather than using Info.plist, because that file can be edited. Well, so can the app binary…and in either case you’d need to re-sign the product with a valid certificate to continue, because Apple have used the kill flag:
    heimdall:~ leeg$ codesign -dvvvv /Applications/Twitter.app/
    Executable=/Applications/Twitter.app/Contents/MacOS/Twitter
    Identifier=com.twitter.twitter-mac
    Format=bundle with Mach-O universal (i386 x86_64)
    CodeDirectory v=20100 size=12452 flags=0x200(kill) hashes=616+3 location=embedded
    CDHash=8e0736639d79a108a5a1ebe89f928d1da0d49d94
    Signature size=4169
    Authority=Apple Mac OS Application Signing
    Authority=Apple Worldwide Developer Relations Certification Authority
    Authority=Apple Root CA
    Info.plist entries=21
    Sealed Resources rules=4 files=78
    Internal requirements count=2 size=344
    

    Changing a hard-coded string in a binary file is not difficult. You can of course obfuscate the string, but the motivated cracker still finds the point where the comparison is made (particularly easily if you use NSStrings). Really, how far you want to go depends on how much you’re willing to spend.

Of course, Fuzzy Aliens Ltd has already been implementing receipt validation for customers, so if this is too hard for you or you don’t have the time… ;-)

A last word on publicising receipt-validation vulnerabilities

You and I both make our living by selling software, or by selling services to people who sell software. Crowing on the interwebs about how this application or that application doesn’t validate its receipts properly is not cool, because you are shitting on your own doorstep. There is no public benefit to public disclosure that class of vulnerability, because DRM is not a user security feature. Don’t do that. Send the developer a private message explaining your findings. Give them a chance to put extra effort into protecting their product, if that’s what they want to do.

Did the UK create a new kind of “Crypto Mule”?

It’s almost always the case that a new or changed law means that there is a new kind of criminal, because there is by definition a way to contravene the new law. However, when the law allows the real criminals to hide behind others who will take the fall, that’s probably a failure in the legislation.

The Regulation of Investigatory Powers Act 2000 may be doing just that. In Section 51, we find that a RIPA order to disclose information can be satisfied by disclosing the encryption key, if the investigating power already has the ciphertext.

Now consider this workflow. Alice Qaeda needs to send information confidentially to Bob Laden (wait: Alice and Bob aren’t always the good guys? Who knew?). She doesn’t want it intercepted by Eve Sergeant, who works for SOCA (wait: Eve isn’t always the bad guy etc.). So she prepares the information, and encrypts it using Molly Mule’s public key. She then gives the ciphertext to Michael Mule.

Michael’s job is to get from Alice’s location to Bob’s. Molly is also at Bob’s location, and can use her private key to show the plaintext to Bob. She doesn’t necessarily see the plaintext herself; she just prepares it for Bob to view.

Now Alice and Bob are notoriously difficult for Eve to track down, so she stops Michael and gets her superintendent to write a RIPA demand for the encryption key. But Michael doesn’t have they key. He’ll still probably get sent down for two years on a charge of failing to comply with the RIPA request. Even if Eve manages to locate and serve Molly with the same request, Molly just needs to lie about knowing the key and go down for two years herself.

The likelihood is that Molly and Michael will be coerced into performing their roles, just as mules are in other areas of organised crime. So has the legislation, in trying to set out government snooping permissions, created a new slave trade in crypto mules?

On how to get crypto wrong

I’ve said time and time again: don’t write your own encryption algorithm. Once you’ve chosen an existing algorithm, don’t write your own implementation.

Today I had to look at an encryption library that had been developed to store some files in an app. The library used a custom implementation of SHA256-HMAC, and a custom implementation of CBC mode. The implementations certainly looked OK, and seemed to match the descriptions in the textbooks. They also seem to work – you can encrypt a file to get gibberish, and decrypt the gibberish to get the file back.

So the first thing I did was to crack open Xcode and replace these custom functions with CommonCrypto. CommonCrypto’s internals also look a lot like the textbook descriptions of the methods, too. So it would be surprising if these two approaches yielded different results.

These two approaches yielded different results. This was surprising. Specifically, I found that the CBC implementation would sometimes use junk memory, which the CommonCrypto version never does. Of course, the way in which this junk was used was predictable enough that the encryption routine was still reversible – but could it be that the custom implementation was leaking information about the plaintext in the cipher-text by inappropriate re-use of the buffer? Possibly, and that’s good enough for me to throw the custom implementation out. Proving whether or not this implementation is “safe” is something that a specialist cryptographer could probably do in half a day. However, as I was able to use half a day to produce something I had more confidence in, just by using a tested implementation, I decided there was no need to do that work.

On the extension of code signing

One of the public releases Apple has made this WWDC week is that of Safari 5, the latest version of their web browser. Safari 5 is the first version of the software to provide a public extensions API, and there are already numerous extensions to provide custom functionality.

The documentation for developing extensions is zero-cost, but you have to sign up with Apple’s Safari developer program in order to get access. One of the things we see advertised before signing up is the ability to manage and download extension certificates using the Apple Extension Certificate Utility. Yes, that’s correct, all Safari extensions will be signed, and the certificates will be provided by Apple.

For those of you who have provisioned apps using the iTunes Connect and Provisioning Portal, you will find the Safari extension certificate dance to be a little easier. There’s no messing around with UDIDs, for a start, and extensions can be delivered from your own web site rather than having to go through Apple’s storefront. Anyway, the point is that earlier statement – despite being entirely JavaScript, HTML and CSS (so no “executable code” in the traditional sense of machine-language libraries or programs), all Safari extensions are signed using a certificate generated by Apple.

That allows Apple to exert tight control over the extensions that users will have access to. In addition to the technology-level security features (extensions are sandboxed, and being JavaScript run entirely in a managed environment where buffer overflows and stack smashes are nonexistent), Apple effectively have the same “kill switch” control over the market that they claim over the iPhone distribution channel. Yes, even without the store in place. If a Safari extension is discovered to be malicious, Apple could not stop the vendor from distributing it but by revoking the extension’s certificate they could stop Safari from loading the extension on consumer’s computers.

But what possible malicious software could you run in a sandboxed browser extension? Quite a lot, as it happens. Adware and spyware are obvious examples. An extension could automatically “Like” facebook pages when it detects that you’re logged in, post Twitter updates on your behalf, or so on. Having the kill switch in place (and making developers reveal their identities to Apple) will undoubtedly come in useful.

I think Apple are playing their hand here; not only do they expect more code to require signing in the future, but they expect to have control over who has certificates that allow signed code to be used with the customers. Expect more APIs added to the Mac platform to feature the same distribution requirements. Expect new APIs to replace existing APIs, again with the same requirement that Apple decide who is or isn’t allowed to play.

Regaining your identity

In my last post, losing your identity, I pointed out an annoying problem with the Sparkle update framework, in that if you lose your private key you can no longer post any updates. Using code signing identities would offer a get-out, in addition to reducing the complexity associated with releasing a build. You do already sign your apps, right?

I implemented a version of Sparkle that does codesign validation, which you can grab using git or view on github. After Sparkle has downloaded its update, it will test that the new application satisfies the designated requirement for the host application – in other words, that the two are the same app. It will not replace the host unless they are the same app. Note that this feature only works on 10.6, because I use the new Code Signing Services API in Security.framework.