Sep 02, 2021

> Apple is not using PhotoDNA's hashes.

Do you have a source for this, or is it an assumption you are making?

> Apple is using a different system called NeuralHash.

No, one of the systems Apple is using is NeuralHash. They also use a second, undisclosed system on the visual derivative before human review. While it’s possible they have invented two separate perceptual hashing systems, it’s also quite possible that they would use PhotoDNA for this.

Details from Apple:

> Once Apple's iCloud Photos servers decrypt a set of positive match vouchers for an account that exceeded the match threshold, the visual derivatives of the positively matching images are referred for review by Apple. First, as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 26, 2021

Their threat model[1] states:

> This feature runs exclusively as part of the cloud storage pipeline for images being uploaded to iCloud Photos and cannot act on any other image content on the device. Accordingly, on devices and accounts where iCloud Photos is disabled, absolutely no images are perceptually hashed. There is therefore no comparison against the CSAM perceptual hash database, and no safety vouchers are generated, stored, or sent anywhere.

and

> Apple’s CSAM detection is a hybrid on-device/server pipeline. While the first phase of the NeuralHash matching process runs on device, its output – a set of safety vouchers – can only be interpreted by the second phase running on Apple’s iCloud Photos servers, and only if a given account exceeds the threshold of matches.

We should also take into account how the blinding of the hash works, per the CSAM paper[2]:

> However, the blinding step using the server-side secret is not possible on device because it is unknown to the device. The goal is to run the final step on the server and finish the process on server. This ensures the device doesn’t know the result of the match, but it can encode the result of the on-device match process before uploading to the server.

What this means is that the whole process is strictly tied to a specific endpoint on the server. For the server to match any other files from the device, those files would also have to be uploaded to the server (the PSI implementation forces this), and based on the pipeline description, uploading other files should not be possible. Even if it were, and Apple suddenly changed policy to scan all files on your device, those files would end up in the same iCloud Photos library as everything else, so you would notice them, and the current protocol gives you no way to opt out of that. To avoid this, they would have to modify the whole protocol so that only the images actually meant to be synced are uploaded while all files are scanned locally, but local-only scans are impossible to match on the server side because of how the PSI protocol works. And if they created some other endpoint for files that are not supposed to end up in iCloud, those files would still need to be stored in the cloud anyway, because of the PSI protocol; otherwise there is no possibility of detecting matches.
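
To make the "only the server can finish the match" property concrete, here is a toy Diffie-Hellman-style sketch of the blinding idea in Python. It is illustrative only: Apple's real protocol uses elliptic curves, a cuckoo table in which every slot is filled (so the device's lookup never reveals whether anything matched), and threshold secret sharing on top, none of which is modelled here.

  import hashlib, secrets

  p = 2**127 - 1      # Mersenne prime; a toy group, NOT production crypto
  g = 3

  def h2g(h: bytes) -> int:
      # map a perceptual hash to a group element (toy "hash to group")
      e = int.from_bytes(hashlib.sha256(h).digest(), "big")
      return pow(g, e % (p - 1), p)

  # --- server setup: blind the known-hash database with a server-only secret s ---
  s = secrets.randbelow(p - 2) + 1
  known_hashes = {b"hash-of-known-image-A", b"hash-of-known-image-B"}
  blinded_db = {h: pow(h2g(h), s, p) for h in known_hashes}      # shipped to devices
  blinded_db[b"filler"] = pow(h2g(b"filler"), s, p)              # stand-in for filled slots

  # --- device side: build a voucher without ever learning the match result ---
  def make_voucher(image_hash: bytes):
      r = secrets.randbelow(p - 2) + 1
      header = pow(h2g(image_hash), r, p)
      # in the real protocol every table slot is filled, so this lookup always
      # "succeeds" and the device cannot tell a true match from a filler entry
      blinded = blinded_db.get(image_hash, blinded_db[b"filler"])
      key = hashlib.sha256(pow(blinded, r, p).to_bytes(16, "big")).digest()
      return header, key            # key would encrypt the safety voucher payload

  # --- server side: only here can the match be completed, using the secret s ---
  def server_key(header: int) -> bytes:
      return hashlib.sha256(pow(header, s, p).to_bytes(16, "big")).digest()

  hdr, k = make_voucher(b"hash-of-known-image-A")
  print(server_key(hdr) == k)       # True: keys agree, the server can decrypt
  hdr, k = make_voucher(b"hash-of-some-unrelated-photo")
  print(server_key(hdr) == k)       # False: no match, the voucher stays opaque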

It sounds like this is quite a bit more than just a policy change away.

Many people have succumbed to populism because it benefits them, and it takes some knowledge and time to really understand the whole system, so I am not surprised that many keep saying it is just a policy change away. Either way, we must either trust what they say, or accept that we can't trust a single feature they put on the devices.

[1]: https://www.apple.com/child-safety/pdf/Security_Threat_Model...

[2]: https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...

Aug 25, 2021

> - It has been proven that the automated flagging system can be fooled

This statement is absolutely false. No such proof has been presented anywhere, including in the linked article.

Adversarial NeuralHash collisions are expected, and the system is designed to avoid false positives when they are encountered.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 25, 2021

No, it doesn’t. There is a second hash that also has to be matched even before human review, and this doesn’t demonstrate an image even getting that far.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 25, 2021

It's easy to imagine innocuous photos with nude coloring being altered to match CSAM hashes. Having a few of them will flag your account for human review. The low-res visual derivative[0][1] that the human moderators see will look credible enough to be CP[2], so they will alert the authorities.

[0] It's federally illegal for anyone to have CP on their system, or to view it.

[1] So the workaround is displaying a distorted low-res version instead https://www.apple.com/child-safety/pdf/Security_Threat_Model...

[2] You can replace CP with anything your oppressive regime is trying to snuff out.

Aug 25, 2021

The Technical Summary uses "visual derivative" without clarification, but their Threat Model PDF clarifies it further as thumbnails:

>The decrypted vouchers allow Apple servers to access a visual derivative – such as a low-resolution version – of each matching image.

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 22, 2021

>> This is an outright lie. The only honest answer is no.

> Are you sure about that?

Yes.

> I'm not... And all the news so far reinforces that oppinion...

There are no news articles that explain how anyone will be falsely accused of having pictures of their own baby.

> Perceptual filter there seems pretty poor in terms of collision resistance

I don’t think you know anything about how poor the filter is. What is the false positive rate on randomly selected photos?

The system is even resistant against intentionally created false positives.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

...

Aug 22, 2021

That isn’t a proof of concept. The system is designed to handle even intentional hash collisions.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 21, 2021

Why do you keep posting links to this collider as though it means something?

As has already been pointed out, the system is designed to handle attacks like this.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 20, 2021

Here’s their paper: https://www.usenix.org/system/files/sec21-kulshrestha.pdf

Their system is vulnerable in all the ways they claim.

However Apple’s system is not the same and does contain mitigations.

> Apple’s muted response about possible misuse is especially puzzling because it’s a high-profile flip-flop.

This is a dishonest statement. Apple has not been muted about the concerns these researchers are presenting.

They address them here: https://www.apple.com/child-safety/pdf/Security_Threat_Model...

There is nothing in this piece that relates to Apple’s actual technology.

These researchers obviously have the background to review what Apple has said and identify flaws, but they have not done so here.

Aug 20, 2021

> Sorry, this is not even wrong.

Probably a mistake to say things like this, when the public documentation contradicts you.

> The visual derivative is not matched against anything, and there is no "original" visual derivative to match against.

Bullshit.

Here is the relevant paragraph from Apple’s documentation:

“as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database. If the CSAM finding is confirmed by this independent hash, the visual derivatives are provided to Apple human reviewers for final confirmation.”

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 20, 2021

As I said, it’s in public documentation you could easily check.

Here:

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 20, 2021

> How is a perceptual hash sophisticated and impressive given that it can be abused by governments demanding Apple scan for political content, etc?

It's pretty sophisticated when you look at everything implemented and also consider the infrastructure / review pipelines that are required. See the link below:

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

It's based on perceptual hashing, but the whole end-to-end system is clearly sophisticated when operating on Apple's scale.

Aug 20, 2021

The first half of your post is predicated on it being likely that the noise added to generate hash A under NeuralHash would also produce a specific hash B under some unknown perceptual hashing function (which Apple specifically calls out [1] as independent of the NeuralHash function, precisely because they don’t want to make this easy, so speculating that it might be NeuralHash run again is incorrect). Hash A is generated via thousands of iterations of an optimization function, guessing and checking to produce a 96-bit (12-byte) number. What shows that the same noise would produce an identical match when run through a completely different hashing function, one designed very differently specifically to avoid these attacks? Just one bit of difference will prevent a match. Nothing you’ve linked to shows any likelihood of that being anywhere close to 10 percent.
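
A back-of-the-envelope way to see why the second, independent hash matters, under the modelling assumption that its bits are effectively independent of a perturbation tuned against NeuralHash. The bit widths are my own illustration; Apple hasn't disclosed the second hash, and NeuralHash's output is reportedly 96 bits:

  # chance that a perturbation crafted against NeuralHash *also* collides with
  # an unknown, independent k-bit perceptual hash (independence is an assumption)
  for k in (16, 32, 64, 96):
      print(f"k = {k:2d} bits -> accidental second-hash match ~ {2.0 ** -k:.2e}")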

For the second part, yes: if an Apple engineer (who had access to this code) leaked the internal hash function they used, or a bunch of example image-to-hash-value pairs, that would allow these adversarial attacks.

Until you can show an example or paper where the same adversarial image generates a specific hash value for two unrelated perceptual hash functions, with one being hidden, it is not right to predict a high likelihood of that first scenario being possible.

Here’s a thought exercise: how long would it have taken researchers to generate a hash collision with that dog image if NeuralHash wasn’t public and you received no immediate feedback that you were “right” or getting closer along the way?

[1] https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 19, 2021

My aim was to point out that the above-referenced "image scaling attack" is easily protected against, because it is fragile to alternate scaling methods -- it breaks if you don't use the scaling algorithm the attacker planned for, and there exist secure scaling algorithms that are immune. [0] Since defeating the image-scaling attack is trivial, once it is addressed the thumbnail will always resemble the full image.

With that out of the way: this obviously just forecloses one particular attack, specifically the one where you want the thumbnail to appear dramatically different from the full image in order to fool the user into thinking it's an innocent image and the reviewer into thinking it's an illegal image. It's still, nevertheless, possible to have a confusing thumbnail -- perhaps an adult porn image engineered to have a CSAM hash collision would be enough to convince a beleaguered or overeager reviewer to pull the trigger. The "image scaling attack" is neither sufficient nor necessary.

(However, that confusing image would almost certainly not also fool Apple's unspecified secondary server-side hashing algorithm, as referenced on page 13 of Apple's Security Threat Model Review, so would never be shown to a human reviewer: "as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash" [1])

[0] Understanding and Preventing Image-Scaling Attacks in Machine Learning https://www.sec.cs.tu-bs.de/pubs/2020-sec.pdf

[1] https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 19, 2021

It would be good, I think, if people read Apple's threat assessment before calling it "pretty trivial":

> • Database update transparency: it must not be possible to surreptitiously change the encrypted CSAM database that’s used by the process.

> • Database and software universality: it must not be possible to target specific accounts with a different encrypted CSAM database, or with different software performing the blinded matching.

I mean, you can argue that Apple's safeguards are insufficient etc., but at least acknowledge that Apple has thought about this, outlined some solutions, and considers it a manageable threat.

ETA:

> Since no remote updates of the database are possible, and since Apple distributes the same signed operating system image to all users worldwide, it is not possible – inadvertently or through coercion – for Apple to provide targeted users with a different CSAM database. This meets our database update transparency and database universality requirements.

> Apple will publish a Knowledge Base article containing a root hash of the encrypted CSAM hash database included with each version of every Apple operating system that supports the feature. Additionally, users will be able to inspect the root hash of the encrypted database present on their device, and compare it to the expected root hash in the Knowledge Base article. That the calculation of the root hash shown to the user in Settings is accurate is subject to code inspection by security researchers like all other iOS device-side security claims.

> This approach enables third-party technical audits: an auditor can confirm that for any given root hash of the encrypted CSAM database in the Knowledge Base article or on a device, the database was generated only from an intersection of hashes from participating child safety organizations, with no additions, removals, or changes. Facilitating the audit does not require the child safety organization to provide any sensitive information like raw hashes or the source images used to generate the hashes – they must provide only a non-sensitive attestation of the full database that they sent to Apple.

[1] https://www.apple.com/child-safety/pdf/Security_Threat_Model...
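
To make the audit step concrete, here is a minimal sketch of the "compare the on-device root hash to the published one" check. The flat SHA-256 is my stand-in; Apple hasn't published the actual root-hash construction:

  import hashlib

  def root_hash(encrypted_db_blob: bytes) -> str:
      # stand-in for whatever root-hash construction Apple actually uses
      return hashlib.sha256(encrypted_db_blob).hexdigest()

  shipped_blob = b"...the encrypted CSAM database inside the signed OS image..."
  published_in_kb_article = root_hash(shipped_blob)     # what Apple would publish
  shown_in_device_settings = root_hash(shipped_blob)    # what the device displays

  # Because the same signed OS image goes to every user, the two must agree:
  assert shown_in_device_settings == published_in_kb_article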

Aug 19, 2021

Apple did mention this in their security threat model document [0]:

> Apple will also refuse all requests to instruct human reviewers to file reports for anything other than CSAM materials for accounts that exceed the match threshold.

[0]: https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 19, 2021

FAQ Part 2/2

Q: If the second, secret hash algorithm is based on a neural network, can we think of its weights (coefficients) as some kind of secret key in the cryptographical sense?

A: Absolutely not. If (as many suspect) the second hash algorithm is also based on some feature-identifying neural network, then we can't think of the weights as a key that (when kept secret) protects the confidentiality and integrity of the system.

Due to the way perceptual hashing algorithms work, having access to the outputs of the algorithm is sufficient to train a high-fidelity "clone" that allows you to generate perfect adversarial examples, even if the weights of the clone are completely different from the secret weights of the original network.

If you have access to both the inputs and the outputs, you can do much more: by choosing them carefully [4], you can eventually leak the actual secret weights of the network. Any of these attacks can be executed by an Apple employee, even one who has no privileged access to the actual secret weights.

Even if you have proof positive that nobody could have accessed the secret weights directly, the entire key might have been leaked anyway! Thus, keeping the weights secret from unauthorized parties does not suffice to protect the confidentiality and integrity of the system, which means that we cannot think of the weights as a kind of secret key in the cryptographical sense.
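
For intuition on why input/output access alone is enough, here is a toy clone-training sketch. The "secret" hash is a random linear-threshold function standing in for the black box; nothing here resembles Apple's actual network, it only illustrates that querying outputs lets you fit a high-agreement surrogate:

  import numpy as np

  rng = np.random.default_rng(0)
  D, BITS, QUERIES = 64, 16, 20_000

  # "secret" toy hash: sign of a random linear projection (the hidden black box)
  W_secret = rng.normal(size=(BITS, D))
  def secret_hash(x):
      return (W_secret @ x > 0).astype(int)

  # the attacker only ever sees inputs and the corresponding outputs
  X = rng.normal(size=(QUERIES, D))
  Y = np.array([secret_hash(x) for x in X])

  # fit a clone per output bit with plain least squares on the observed labels
  W_clone = np.linalg.lstsq(X, Y * 2 - 1, rcond=None)[0].T

  # the clone agrees with the secret function on inputs it has never queried
  X_test = rng.normal(size=(2_000, D))
  agreement = ((X_test @ W_clone.T > 0).astype(int) ==
               np.array([secret_hash(x) for x in X_test])).mean()
  print(f"per-bit agreement on unseen inputs: {agreement:.3f}")   # typically > 0.95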

Q: I heard that it's impossible to determine Apple's CSAM image hashes from the database on the device. Doesn't this make a hash attack impossible?

A: No. The scheme used by Apple (sketched in the technical summary [6]) ensures that the device doesn't _learn_ the result of the match purely from the interaction with server, and that the server doesn't learn information about images whose hash the server doesn't know. The claim that it's "impossible to determine Apple's CSAM image hashes from the database on the device" is a very misleading rephrasing of this, and not true.

Q: Doesn't Apple claim that there is only a one in one trillion chance per year of incorrectly flagging a given account?

A: Apple does claim this, but experts on photo analysis technologies have been calling bullshit [8] on their claim since day one.

Moreover, even if the claimed rate was reasonable (which it isn't), it was derived without adversarial assumptions, and using it is incredibly misleading in an adversarial context.

Let me explain through an example. Imagine that you play a game of craps against an online casino. The casino will throw a virtual six-sided die, secretly generated using Microsoft Excel's random number generator. Your job is to predict the result. If you manage to predict the result 100 times in a row, you win and the casino will pay you $1000000000000 (one trillion dollars). If you fail to predict the result of a throw, you lose and pay the casino $1 (one dollar).

In an ordinary, non-adversarial context, the probability that you win the game is much less than one in one trillion, so this game is very safe for the casino. But this number, one in one trillion, is based on naive assumptions that are completely meaningless in an adversarial context. If your adversary has a decent knowledge of mathematics at the high school level, the serial correlation in Excel's generator comes into play, and the relevant probability is no longer one in one trillion. It's 1 in 216 instead! When faced with a class of sophomore math majors, the casino will promptly go bankrupt.
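
A quick sanity check of the numbers in that example (reading the 1-in-216 figure as needing only three correct guesses before the serial correlation gives away the rest; that interpretation is mine):

  from math import log10

  p_naive = (1 / 6) ** 100                  # 100 correct guesses of a fair die
  print(f"naive win probability ~ 10^{log10(p_naive):.0f}")   # about 10^-78
  print(6 ** 3)                             # 216: three guesses, then free rides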

Q: Aren't these attacks ultimately detectable? Wouldn't I be exonerated by the exculpatory evidence?

A: Maybe. IANAL. I wouldn't want to take that risk. Matching hashes are probably not sufficient to convict you, and possibly not sufficient to take you into custody, but they are more than sufficient to make you a suspect. Reasonable suspicion is enough to get a warrant, which means that your property may be searched, your computer equipment may be hauled away and subjected to forensic analysis, etc. It may be sufficient cause to separate you from your children. If you work with children, you'll be fired for sure. It'll take years to clear your name.

And if they do charge you, it will be in Apple's best interest not to admit to any faults in their algorithm, and to make it as opaque to the court as possible. The same goes for NCMEC.

Q: Why should I trust you? Where can I find out more?

A: You should not trust me. You definitely shouldn't trust the people defending Apple using the claims above. Read the EFF article [7] to learn more about the social dangers of this technology. Consult Apple's Threat Model Summary [5], and the CSAM Detection Technical Summary [6]: these are biased sources, but they provide sketches of the algorithms and the key factors that influenced the current implementation. Read HackerFactor [8] for an independent expert perspective about the credibility of Apple's claims. Judge for yourself.

[1] https://imgur.com/a/j40fMex

[2] https://graphicdesign.stackexchange.com/questions/106260/ima...

[3] https://arxiv.org/abs/1809.02861

[4] https://en.wikipedia.org/wiki/Chosen-plaintext_attack

[5] https://www.apple.com/child-safety/pdf/Security_Threat_Model...

[6] https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni...

[7] https://www.eff.org/deeplinks/2021/08/apples-plan-think-diff...

[8] https://www.hackerfactor.com/blog/index.php?/archives/929-On...

Aug 19, 2021

Apple's "CSAM detection" feature is a two-part on-device/server as I understand it:

1: Run on-device code to perform perceptual hash comparison of each photo against an on-device encrypted database of known CSAM hashes.

2: On iCloud Photos servers, send out the relevant notifications when a user’s iCloud Photos account exceeds a threshold of positive matches.

So as a high level testing strategy, I would want to:

- Verify on-device lookup of CSAM hashes. This could be tested by provisioning a test device with an on-device database containing CSAM hashes of images that aren't illegal. As a bystander, I think I'd be fairly confident with this approach because I'm guessing the on-device database that Apple ships could conceivably be changed over time to expand the definition of the images it will flag as CSAM.

- Do some exploratory testing to discover the threshold of how much image manipulation can be done on a flagged image before the perceptual hash comparison fails to return a match.

- Verify that the notification system notifies the correct parties once a user account exceeds the defined threshold of positive CSAM matches.

- Ensure the flagged account can still be investigated if the user deletes the offending material from iCloud, or deletes their account, before a real person gets around to investigating.

- Ensure that the logging is informative and adequate (contains device name, timestamp, etc.).

- Test behaviour on same iCloud account logged in to multiple devices.

- Figure out any additional business logic - are positive matches a permanent count on the account or are they reset after a certain amount of time?

source: https://www.apple.com/child-safety/pdf/Security_Threat_Model...
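
As a rough illustration of the threshold test in that list, here is a sketch against a fake pipeline object. Every name here is invented; there is no public API for any of this, and 30 is only the initial threshold value Apple has mentioned:

  class FakeMatchPipeline:
      """Stand-in for the on-device + server matching flow (invented for testing)."""
      def __init__(self, threshold: int):
          self.threshold = threshold
          self.matches = 0
          self.review_triggered = False

      def upload(self, is_synthetic_match: bool) -> None:
          if is_synthetic_match:
              self.matches += 1
          if self.matches >= self.threshold:
              self.review_triggered = True   # stand-in for "vouchers become decryptable"

  def test_review_fires_only_at_threshold():
      pipeline = FakeMatchPipeline(threshold=30)
      for _ in range(29):
          pipeline.upload(is_synthetic_match=True)
      assert not pipeline.review_triggered   # one short of the threshold
      pipeline.upload(is_synthetic_match=True)
      assert pipeline.review_triggered       # the 30th match crosses it

  test_review_fires_only_at_threshold()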

Aug 18, 2021

If you read their Security Threat Model Review [1] they're only using the "intersection of hashes provided by at least two child safety organizations operating in separate sovereign jurisdictions".

So you'd have to pressure NCMEC and another org under a different government to both add the non-CSAM hash, plus Apple would need to be pressured to verify a non-CSAM derivative image, plus you'd need enough other hash matches on-device to exceed the threshold before they could even do the review in the first place (they can't even tell whether there was a match unless the threshold is exceeded).
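
In other words, the shipped database is, per Apple's description, the intersection of what the participating organizations independently submit, conceptually something like the following (hashes are made up):

  org_a = {"h1", "h2", "h3", "h_politically_sensitive"}   # e.g. NCMEC (made-up hashes)
  org_b = {"h1", "h2", "h4"}                               # second org, other jurisdiction

  shipped_db = org_a & org_b     # only hashes both orgs submitted survive
  print(shipped_db)              # only 'h1' and 'h2'; the unilateral entry is dropped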

I get why people are concerned, but between this thread and the other thread yesterday it's clear that pretty much everyone discussing this has no idea how it works.

1: https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 18, 2021

Before they make it to human review, photos in decrypted vouchers have to pass a CSAM match against a second perceptual hash that Apple keeps to itself. Presumably, if it doesn’t match the same asset, it won’t be passed along. This is explained towards the end of the threat model document that Apple posted to its website. https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 18, 2021

> First, as an additional safeguard, the visual derivatives themselves are matched to the known CSAM database by a second, independent perceptual hash. This independent hash is chosen to reject the unlikely possibility that the match threshold was exceeded due to non-CSAM images that were adversarially perturbed to cause false NeuralHash matches against the on-device encrypted CSAM database.

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 18, 2021

Apple has outlined[1] multiple levels of protection in place for this:

1. You have to reach a threshold of matches before your account is flagged.

2. Once the threshold is reached, the matched images are checked against a different perceptual hash algorithm on Apple servers. This means an adversarial image would have to trigger a collision on two distinct hashing algorithms.

3. If both hash algorithms show a match, then the “visual derivatives” (low-res versions) of the images are inspected by Apple to confirm they are CSAM.

Only after these three criteria are met is your account disabled and referred to NCMEC. NCMEC will then do their own review of the flagged images and refer to law enforcement if necessary.

[1]: https://www.apple.com/child-safety/pdf/Security_Threat_Model...
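
Roughly, the gating described above looks like the sketch below. The function names, the any/all semantics, and the threshold value of 30 (the figure Apple has mentioned as the initial setting) are placeholders, not a specification:

  THRESHOLD = 30   # initial value Apple has mentioned; illustrative here

  def handle_account(n_positive_vouchers, visual_derivatives,
                     second_hash_matches, human_confirms_csam):
      if n_positive_vouchers < THRESHOLD:
          return "nothing happens: the server cannot even decrypt the vouchers"
      confirmed = [d for d in visual_derivatives if second_hash_matches(d)]
      if not confirmed:
          return "treated as adversarial NeuralHash collisions: no human review"
      if not any(human_confirms_csam(d) for d in confirmed):
          return "human review rejects the derivatives: no report"
      return "account disabled and report sent to NCMEC"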

Aug 18, 2021

Apple said that the probability of a collision is quite a bit higher than that:

> As the system is initially deployed, we do not assume the 3 in 100M image-level false positive rate we measured in our empirical assessment

The "1 in 1 trillion" part is the probability that the number of false positives could exceed the threshold needed to trigger a human review:

> Apple always chooses the match threshold such that the possibility of any given account being flagged incorrectly is lower than one in one trillion, under a very conservative assumption of the NeuralHash false positive rate in the field.

source: https://www.apple.com/child-safety/pdf/Security_Threat_Model..., page 10
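
For anyone who wants to play with the numbers, here is a quick binomial tail calculation of the account-level rate implied by a per-image false positive rate and a threshold. The rate, library size, and threshold below are illustrative, and the independence assumption is exactly the kind of non-adversarial assumption the document makes:

  from math import exp, lgamma, log, log1p

  def log_binom_pmf(k, n, p):
      return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
              + k * log(p) + (n - k) * log1p(-p))

  def p_account_flagged(n_photos, p_fp, threshold, tail_terms=200):
      # P(at least `threshold` false positives among n_photos), independent images
      upper = min(n_photos, threshold + tail_terms)
      return sum(exp(log_binom_pmf(k, n_photos, p_fp))
                 for k in range(threshold, upper + 1))

  # e.g. a 100,000-photo library, a pessimistic 1-in-a-million per-image rate,
  # and a threshold of 30 (the initial value Apple has mentioned):
  print(p_account_flagged(100_000, 1e-6, 30))   # ~3e-63: far below 1e-12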

Aug 18, 2021

That isn’t correct, nor is it credible.

See: https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 18, 2021

> The backdoor might be inactive for now, but there would still be a backdoor on my phone

The key (from Apple's POV) is that this is done on your device, so the model can be audited, and users will know if it changes or is suddenly enabled where it wasn't before. Apple has documented the entire threat model and their design decisions related to each threat vector.[1]

It's worth reading the document, as it becomes pretty clear that this is a step towards enabling E2E for iCloud Photos.

The alternative to what Apple did is cloud-based scanning, which is less transparent, permanently disallows E2EE, and is more vulnerable to being changed by national decree. If CSAM scanning is going to be (or already is) mandatory, I vastly prefer Apple's method here.

[1] https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 13, 2021

They put out a new paper [0] describing the security threat model they were working with, and this paragraph on page 9 stood out to me:

> The perceptual CSAM hash database is included, in an encrypted form, as part of the signed operating system. It is never downloaded or updated separately over the Internet or through any other mechanism. This claim is subject to code inspection by security researchers like all other iOS device-side security claims.

Could someone tell me how that inspection works? Are there researchers who are given the source code?

[0]: https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Aug 13, 2021

FYI, they also published this today: https://www.apple.com/child-safety/pdf/Security_Threat_Model...

With some new bits of info, too.

Aug 13, 2021

Apple published a threat model review recently, with an explicit response to the threat from governments:

https://www.apple.com/child-safety/pdf/Security_Threat_Model...

Specifically, it looks like they will be requiring that hashes exist in two separate databases in two separate sovereign jurisdictions.