"Reflections on Trusting Trust" - https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...
See Reflections on Trusting Trust 
>This is so plainly false it's silly.
You're simply wrong. Did you read the article I told you about, the one that shows quite clearly exactly how to do this? No, you didn't, or you'd stop making this false claim. Before you repeat this, RTFA, which I'll post again since you didn't learn last time.
There, read it? There are decades of research into even deeper, more sophisticated ways to hide behavior. At the lowest level, against a malicious actor, there is currently no way to guarantee the absence of bad behavior.
>What do you think security researchers do?
Yes, I've worked on security research projects for decades, winning millions of dollars in government funding to do so. I am quite aware of the state of the art; you don't seem to be aware of basic results that are decades old.
Do you actually work in security research?
>Plenty of things are found all the time
You're confusing finding accidental bugs with an actor trying to hide behavior. The latter you will not find if the actor is as big as a FAANG or nation state.
If simply looking at things were sufficient, then the DoD wouldn't be afraid of Chinese-made software or chips - they could simply look, right? But they know this is a fool's errand. They spend literally billions working this problem, year in and year out, for decades. It's naive to think simple audits will root out bad behavior by malicious actors.
Even accidental bugs live in huge open-source projects for decades, passing audit after audit, only to be exploited decades later. And those are accidental. How many could an actor like the NSA implant, with their resources, that would survive your audits?
Oh, did I mention the paper? Read it again. Read the follow-up papers. Do some original research in this vein, and write some papers. Give security talks on these techniques to other researchers. I've done all that. I have a pretty good idea how this works.
>what type of data is being collected
Again, you miss the point. I am not talking about the data being collected. I'm talking about the data in big systems that make decisions. NNs and all sorts of other AI-ish systems run huge parts of all the big companies, and these cannot yet be understood - it is literally an open research problem. Check DoD SBIR lists for the many, many places they're paying researchers (me, for example - I write proposals for this money) to help solve this problem. For the tip of the iceberg, read up on adversarial attacks on image recognition, the arms race to detect or prevent them, and how deep that rabbit hole goes.
Now audit my image classifier and tell me which adversarial image systems it is weak against. Tell me if I embedded any adversarial behavior into it. Oh yeah, you cannot do either, because it's an unsolved (and possibly unsolvable) problem.
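To make the adversarial-example point concrete, here is a minimal, hypothetical sketch: a toy linear classifier (all weights and inputs invented for illustration) attacked with a fast-gradient-sign-style perturbation. For a linear model the gradient of the score with respect to the input is just the weight vector, so a tiny per-pixel nudge against it flips the label while the image barely changes - and nothing in an audit of the weights alone tells you which such perturbations exist.

```python
import random

random.seed(0)

# Toy linear "image classifier": score = sum(w_i * x_i), label = sign(score).
# Weights and the input "image" are invented for illustration only.
w = [random.gauss(0, 1) for _ in range(64)]   # weights for a flattened 8x8 image
x = [random.gauss(0, 1) for _ in range(64)]   # some input image

def classify(pixels):
    score = sum(wi * xi for wi, xi in zip(w, pixels))
    return 1 if score > 0 else -1

orig = classify(x)
score = sum(wi * xi for wi, xi in zip(w, x))

# Fast-gradient-sign-style attack: for a linear model, moving each pixel a
# step eps against sign(w) (relative to the current label) shifts the score
# by eps * sum(|w_i|), so this eps is just enough to cross the boundary.
eps = 1.1 * abs(score) / sum(abs(wi) for wi in w)
x_adv = [xi - orig * eps * (1 if wi > 0 else -1) for wi, xi in zip(w, x)]

print(orig, classify(x_adv))   # the two labels differ,
print(eps)                     # ...yet no pixel moved by more than this
```

Real attacks on deep networks are the same idea scaled up; detecting whether a given model has such weaknesses deliberately trained in remains, as said above, an open problem.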
Now do this for every learning system, such as Amazon's recommender system, for Facebook's ad placement algorithms, for Google's search results. You literally cannot.
Don't bother replying until you understand the paper - it shows that a code audit will not turn up malicious behavior if the owner is actively trying to hide stuff from you.
This is framed more classically in terms of the compiler and toolchain: https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref... "Reflections on Trusting Trust"
After I had learned a little bit about compilers (and knew something about software security), the paper "Reflections on Trusting Trust" blew my mind. I consider it a must-read.
What GP is talking about is that information is never fully contained in a medium; it's encoded in the tuple (the medium, what reads the medium). E.g. take a JPG file. The picture in that file isn't fully contained in the bits that file consists of. Some of the information is contained in the JPG specification (and through it, in the JPG decoding library). You need both to get at the picture. Meanwhile, with a different decoding program, the same file can represent a piece of music. Or a text document.
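A small sketch of the (medium, reader) point: the same four bytes, handed to three different decoders, yield three different "contents". The bytes here are arbitrary; the meaning lives in the pairing of bytes and decoder.

```python
import struct

# The same four bytes, "read" by three different decoders.
blob = b"\x41\x42\x43\x44"

as_text = blob.decode("ascii")           # the string "ABCD"
as_int = struct.unpack("<I", blob)[0]    # 1145258561 (little-endian uint32)
as_float = struct.unpack("<f", blob)[0]  # roughly 781.04 (IEEE-754 float)

print(as_text, as_int, as_float)
```

A JPG file is the same situation at scale: the picture only exists relative to a decoder that implements the JPG specification.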
"Gödel, Escher, Bach" goes into a detailed discussion about this.
On top of that, GP is talking about hardware that builds its own copies, possibly imperfect copies. That implies a lot of information relevant to the organism may not be directly visible in the DNA - it may sit within the replication machinery, and evolve there. It's kind of similar to the difference between machine code and microcode + actual traces in silicon.
A famous Turing Award lecture by Ken Thompson, "Reflections on Trusting Trust", provides another example. Consider a C compiler. How do you build one? From its C source code, using a different C compiler. Now imagine a malicious C compiler that a) injects a trojan into the compiled binary whenever it compiles, say, the "login" program, and b) injects a trojan into the binary whenever it compiles another C compiler; that second trojan contains the code for injecting a) and b). You use that compiler to recompile the original C compiler and install it in a system - and from now on, not only will "login" be bugged, but you can't get a non-malicious compiler by recompiling the original one from source; the source contains no traces of the trojans, yet they are there, in the binary of the compiler you're using, and they self-replicate.
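The mechanism above can be modeled in a few lines. This is a deliberately toy sketch - "compiling" is just tagging source text, and all names and payloads are invented - but it shows the essential trick: the misbehavior lives only in the compiler *binary*, never in any source it is given.

```python
# Toy model of Thompson's self-replicating compiler trojan.

def clean_compile(source):
    """An honest compiler: the binary faithfully reflects the source."""
    return {"program": source, "backdoor": False}

def trojaned_compile(source):
    """A malicious compiler *binary*. Every source it sees is clean;
    the misbehavior exists only in this function."""
    binary = {"program": source, "backdoor": False}
    if "login" in source:
        binary["backdoor"] = True             # trojan (a): bug the login program
    if "compiler" in source:
        binary["compile"] = trojaned_compile  # trojan (b): replicate itself into
                                              # any compiler built with it
    return binary

# Recompile the perfectly clean compiler source with the bad binary:
clean_compiler_source = "the clean compiler source"   # contains no trojan at all
new_compiler = trojaned_compile(clean_compiler_source)["compile"]

# The freshly built compiler still bugs login, though its source is clean:
print(new_compiler("the login program")["backdoor"])   # True
```

Auditing `clean_compiler_source` finds nothing, and rebuilding from it just reproduces the trojan - which is exactly the paper's point about where trust has to bottom out.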
This is what I believe happens with life.
It's kinda interesting (although probably more of a simple slippery slope) to consider the intersection of this and the classic Reflections On Trusting Trust.
E-voting shouldn’t exist; how badly it was implemented is moot.
The seminal "Trusting Trust" paper, by Ken Thompson, in 1984, is remarkable for being as relevant today as ever, and should be required reading. https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_Ref...