The Nostalgic Nerds Podcast

S2E9 - CAPTCHA Stolen Cognition

Renee Murphy, Marc Massar Season 2 Episode 9


CAPTCHA was supposed to keep the bots out. A simple lock on a simple door. Instead, it became one of the largest unpaid labour operations in the history of the internet.

Google bought reCAPTCHA in 2009, and every time you clicked a traffic light, a crosswalk, or a bicycle, you were labelling training data for Waymo's self-driving cars. You digitised the New York Times archive. You transcribed millions of Google Books pages. Nobody told you. A UC Irvine study put the total at 819 million hours of human cognitive labour, roughly $6 billion at minimum wage. The AI trained on that work now solves the test at 100% accuracy. Humans manage about 70-90%.

Marc and Renee are angry about it. Marc traces the architecture of a security model that was broken from the start: a gate that checks you once and then forgets, while the real threats happen on the other side. Renee traces the emotional arc of being used as a guinea pig by platforms worth hundreds of billions of dollars. There are CAPTCHA farms in India and the Philippines where humans solve puzzles on behalf of bots for about a dollar per thousand. The system designed to stop bots created a labour market that serves them.

Renee wants to be an orca. Marc just wants to browse without proving he's not a robot. Neither of them is getting what they want.

Join Renee and Marc as they discuss tech topics with a view on their nostalgic pasts in tech that help them understand today's challenges and tomorrow's potential.

email us at nostalgicnerdspodcast@gmail.com

Come visit us at https://www.nostalgicnerdspodcast.com/episodes or wherever you get your podcasts.

We normally talk about stuff that you and I love, right? We talk about stuff that's, like, invisible infrastructure, stuff we have direct experience with: films and data centers and, you know, networks and security and risk. Yay. We love all that stuff. And logistics. But this? Yeah, no. I hate this. I hate it so much. I hate it so much. All right. So I was trying to do a basic search on Google yesterday, and I don't know what changed in the last couple of years, but I am constantly getting the stupid image grids. Constantly. And I don't know why. It says, are you a human, or I'm not a robot, or whatever the little stupid checkbox is. And I check it, and most of the time I get one of the little image grids, you know, the little click-the-squares-with-the-traffic-light thing. I get it all the time, multiple times a day. I just don't get it. I don't understand why. So I'm sitting here staring at the grid, and there's the pole, and the traffic light, and it goes over into the next square. Is that the traffic light? Is just the thing hanging from the pole the traffic light, or is the whole thing the traffic light? I don't know. Or is it just the lights? Is that the traffic light? And I get it wrong, and I get it wrong, and it gives me another one. Crosswalks, right? Bridges, steps, whatever. The little crosswalks drive me nuts. Something's wrong with the image, it's, like, pixelated or something, so is that it? Was it a crosswalk? Was it not a crosswalk? There are images that are faded, like, yeah, that was a crosswalk, but it's not anymore. Is it still a crosswalk? I don't know. I don't know what a crosswalk is. I did go through, like, three rounds the other day.
So, to prove to Google... and it wasn't my bank, it wasn't my email, it wasn't anything critical. It was just a freaking search. I had to prove to Google that I wasn't a robot. I know I'm not a robot. Renee, you know I'm not a robot. I know you're not a robot. Freaking Google should know I'm not a robot. And yet here I was, failing a test that, as it turns out, an AI could pass with 100% accuracy. I had one last week. It was bicycles: click all the squares with bicycles. And there's this one that looked like a scooter, like a bicycle with an engine on it. What is that? It's a moped. And it has two wheels. Is it a bicycle? I don't know. I clicked it. Wrong. And then the next grid was storefronts, and I'm staring at an awning. It's an awning. Is it a storefront? Is it open? I don't know. If it's been closed for years, is it still a storefront? I don't know. So I'll click it again. And this goes on and on and on, right? I hate this stuff. I hate this stuff. And you know what? For me, it really is the one where the tire goes a little bit past the square, but you don't see it in the next square. Do you click it? It's like a puzzle piece. Oh, I hate it so much. And the thing that gets me is the indignity of it, right? It's not just annoying. It's that I have to prove I'm human. I know I'm human. You know I'm human. Why is the burden of proof on me? And what does it say about the system that the test it uses keeps failing the humans while the machines keep passing it? I can't make it pass, but the machine can? This is why I picked this topic, because it's something we hate. All right, folks. Hello and welcome to another episode of the Nostalgic Nerds Podcast, where we talk about the frustrating history of technology and what it teaches us about the present and the future. Tonight, if you didn't figure it out already, we're talking about CAPTCHA in all of its various guises.
It's that little puzzle. Have you seen the new ones where you have to slide the thing from one side to the other to complete a puzzle? Why do we have to keep inventing them? Just leave it the way it is. I know, I know. Right? Just leave it. It's the little puzzles that stand between you and basically everything you have to do online. And it turns out there's a lot more going on behind those image grids than you think. It's one of those technologies that everyone uses, everyone hates, and no one ever stops to think about what it actually is, where it came from, or what it's really doing. And once you know what it's really doing, you're going to be as angry as I am, I swear to you. I swear. You're going to be as mad as I am. Probably. It's going to be like a call to arms. All right. So, Renee, before we get into the history, let's start with what CAPTCHA actually is, because I think most people just think of it as this annoying thing you have to do before you log in or whatever, but there's a real concept behind it. God help me, CAPTCHA stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart, which is one of the great acronyms of all time. I mean, it should go into the Hall of Fame of worst acronyms ever. It's got so many more words in it than it needs. Yeah. The name was coined in 2003 by a team at Carnegie Mellon led by a guy named Luis von Ahn. He was a grad student at the time. I swear to you, I'm going to do a report to see how many things grad students at Carnegie Mellon have done that ended up being the biggest pain in my rear end. Yeah, yeah, yeah. Because this isn't the only thing. This isn't the only one. I swear to you. Of course. Why they were able to just walk around that building over there at CMU doing all kinds of craziness, I'll never understand.
Anyway, we're going to come back to him, because he's basically the main character of this whole story. Or the villain. Or the villain, right? He went on to found Duolingo, by the way, which is a whole other conversation. But the core idea is simple. It's a reverse Turing test. Alan Turing asked: can a machine convince a human that it's human? CAPTCHA flips that: can a human convince a machine that it's human? And the fact that we're increasingly bad at it should tell you something. That's right. It tells us quite a lot, actually. So let's go back to the beginning, because the concept is older than the name itself. When did we first need to tell humans and machines apart online? What was the actual problem that started this, Renee? So the first real-world version shows up at AltaVista in 1997. Oh, that takes me back. Remember AltaVista, before Google? The little mountains. Yeah, it was, like, before everything. AltaVista had a feature where you could submit your website URL to the search index, and bots started flooding it. Thousands and thousands of automated submissions, stuffing the index with spam sites to game the rankings, right? So their chief scientist, Andrei Broder, had this idea. He studied how optical character recognition worked. OCR at the time could read clean printed text, but it couldn't read distorted text. I bet it couldn't read script either. Yeah, but humans could. So he studied his office scanner manual, figured out everything that makes text readable, and then deliberately reversed it all. He warped the fonts, added noise, rotated characters, overlapped the letters, and he started showing these distorted images to users, asking them to type what they saw. If you got it right, you were probably a human. If you didn't, you were probably a bot. I can't believe that's how we got here. I hate the stupid letters, man. I can't.
And then you hit the little microphone or speaker button to hear it, and you can't really tell what it's saying either, because, you know, the bots can interpret that too. Is that a capital Y? Is that a lowercase y? I don't know. Why? I don't know. All right. So the protection works, though, right? Spam submissions dropped 95% overnight. 95%, with one filter. And I think that's worth sitting with for a moment, because that's what the solution actually is. It's basically a lock on a door. You prove you can read this warped text, the door opens, and you're through. And for a while, the lock worked, because there was a gap between what humans could do and what machines could do. Humans could read the distorted text. Machines couldn't. And CAPTCHA sat right in that gap. The entire model depends on that capability gap existing. Okay. Remember that. Remember that. That's important. So when I think back through history, I think of the 1920s and the speakeasy, where you knocked on the door and the little slot slid open and somebody said, say the password. And if you didn't know the password, you were lucky if you didn't get beat up, because they thought you were a fed. You better hope you ran if you didn't know that password, right? Right. And on D-Day, when they dropped the paratroopers into France, they dropped them with these clickers. They called them crickets, but they were like kid toys. We use them now to train dogs. Clicker. Clicker. Oh, right. Yeah. Yeah. So they would drop them in with those. And the whole thing was, it's pitch black, you're all really separated, and you don't know who's friendly.
Are you a foe? I don't know. So you'd click twice, and if the other person was American, they'd click back once, and you'd be like, okay, I won't shoot you, right? But if you clicked twice and no one clicked back, you had two choices: you either shot them or you ran, because they're not American, they don't have a clicker. Right. You know me, I lose stuff. I'd have lost the clicker. I'd have been dead. And then somebody would click it, and you'd be like, a click! I'm gonna click. Click, click, click. Don't shoot me. Yeah. And then phone phreaking. Remember we talked about phone phreaking? Yeah. Remember that? So if you called into a phone phreaking bridge, before you ever said a word, somebody would say to you, what's the frequency? What's the frequency? And you're supposed to say 2,600 hertz. But if you didn't say it fast enough, they just hung up on you. Like, we've been doing this forever, right? So once we start doing it with CAPTCHA, the approach spreads. Different versions of distorted-text CAPTCHAs show up across the web over the next few years. Yahoo deploys one in 2001. Lots of sites start using them. And then, then, the graduate student. Our boy. Our boy, Luis von Ahn, comes back into the story. This is 2007, the same guy who coined the term four years earlier. He has this realization that changes everything. Which is? This: people are spending millions of hours a day solving these puzzles just to log on to things. And all that human effort, all that cognitive work, is being thrown away. It's wasted the moment you solve it. So von Ahn's insight is, what if you could make that effort useful? What if every time somebody solved a CAPTCHA, they were also doing something productive? It makes me want to kick him, honestly. So he builds reCAPTCHA. And the way it works, I guess, is clever, right? It gives you two words instead of one. One word it already knows the answer to. That's your security check. That's the one that proves you're a human being, right?
The other word is from a scanned document that OCR software couldn't read. So your answer helps digitize that word. Every time you solve a CAPTCHA, you're also transcribing a piece of a book or a newspaper that a machine couldn't read on its own. I'm co-opted by a machine to be a machine. That's right. My eye's twitching. Go ahead. So I knew there was the one word and the second word. I didn't know the second word was a training thing. But, you know, now it makes a lot of sense. The first big project to use this was the New York Times archive. Millions and millions of articles going back to 1851, scanned as images but not as searchable text. So reCAPTCHA users transcribed them word by word without even knowing it. By 2009, they'd done over 5 billion words. And they only started in 2007. Exactly. Exactly. Okay, this, to me, is the personification of enshittification, you know, the Doctorow thing: how did the platform get worse? Like, think about it. Five billion words in just a few years. That's messed up. That's, like, you stole from me. You stole from me. Yes. Thank you. You'd have had to pay interns for that. Exactly, exactly. Google buys reCAPTCHA in September of 2009. And now, of course, because it's Google, it's not just newspapers. Google has this other project, which, you know, has some dubious ethical boundaries, let's say: Google Books, where they're trying to digitize every book ever published. Like, I have real problems with Google Books. I do, too. Yeah. Anyway, by 2015 they've scanned over 25 million books, and reCAPTCHA users are cleaning up all the OCR errors. They transcribed enough text to fill more than 17,000 books, with better than 99% accuracy, all done by people who just wanted to check their email. 17,000 books. People sitting at their desks, typing squiggly words to get to their email, and they're digitizing the entire New York Times archive without even knowing it.
I don't know whether to be impressed. No, I know. I'm not impressed. I'm angry. You know what? This is why I'd rather be an orca. Like, this is it. I'd rather be a killer whale. At least no one would do this to me. Right? Did you ever hear of this app called Zooniverse? No, I don't know it. So you download it from, I don't know, the App Store, and you're signing on to research projects, and you're actually named in the research paper that someone's going to publish. Oh, is this the crowdsourcing thing? Yeah. Yeah. Yeah. So they're trying to get you to participate in research. And the research I would participate in, you're essentially teaching a model, right? So you're doing the machine-learning piece of it. They need you to go through a hundred thousand images. Like, I did one: tell me what kind of galaxy this is. Is this a spiral galaxy? Is it a globular cluster? Is it a spiral arm? Like, what is it? And so you would look, right? And clearly this was stuff that was taken from Hubble. So you're just looking at galaxies: this is what it is, this is what it is. They train you so you don't get it wrong, but that's what you're doing. And when that project is done, they close it out and you go find another one. I also did one that was trail cams, and your only job was to say, there's a mountain lion in that picture. They didn't care about anything else. Is there a mountain lion? And you'd be like, no, no, no, deer, no, no... oh, mountain lion. That was your job. You were just training a model to recognize a mountain lion. I think the most interesting one I did: they had transcribed all of the Civil War enlistment records, everyone who joined the Army, except for... the freed slaves.
The freed slaves who joined the Union Army; those records weren't done yet. And they're all in cursive, which means there was no way to just machine-read them, right? And I found out that cursive may as well be ancient Aramaic, because no one reads it anymore. So I thought, okay, I still write in cursive, so I know how to read it. I'm going to go do that one. So you would see: this person enlisted, this is where they were, this is where they came from, here's what regiment they were in, all that stuff. So, yeah, it was really, really interesting. But I knew I was doing it. I signed up for it, and I was happy to participate. Right. Yeah. But this crap? To think that I helped digitize books for Google makes me want to kick Google. Yeah, it does. Definitely. Right. Because, like, did they even have the right to digitize the books in the first place? That's the first question. It's almost like a social experiment. They used to do these on Facebook all the time, too. Like, will you click this button? This is manipulative crap that you shouldn't be allowed to do just because I'm on your platform. That's how I feel about it. And to think that someone comes back and says, I have a good idea, let's do it like this, and I'm going to sell it to Google. It probably made him a multimillionaire. That makes me mad, too, right? Like, he did nothing; it was on the backs of all these other people, who were probably being fed that prompt way more often to get this stuff done in a reasonable amount of time. That's what makes me mad. Like, maybe I didn't need to do it two-thirds of the time. Look how mad I am. If I were an orca, I wouldn't be this mad.
So, look, when this idea dawned on me, it came up while we were working on the internet episode. We were talking about the internet episode, right? And something just kept turning over in my head. On the one hand, this is a very clever idea: take wasted effort and make it productive. Great. Interesting. So, you know, our villain, von Ahn. See, he even has a villain name. No, I'm going to look him up. I bet he looks like a villain. Hang on. Go ahead, keep talking. He's probably a totally nice guy. He sees something that no one else does. Great. But on the other hand, it's the pattern, and the acceptability of the pattern, that gets uncomfortable when you follow it forward. The user is doing the work, but the platform captures the value. And the user doesn't know. The consent model on this is just messed up. They don't know they're working. They think they're solving the puzzle to prove they're not a robot so they can get to their email. Yeah. There's no consent there, except for some user agreement that nobody ever read in the first place. It just makes me mad. So, he looks an awful lot like the CEO of Anthropic. Like, picture that guy in your head, and that's him. Okay, well, not a supervillain. Not a supervillain. Maybe we should cut him a break. Okay. Yeah, maybe. We'll call him the subject of our story, not the villain. We'll make it more neutral. I'll have to find him on LinkedIn and send him this. Anyway, okay. All right. So let's go forward a little bit. Because from this point forward, right, we can only blame Google. I mean, he sold it. Yeah. Right? So he's no longer the evil one; Google is. Yeah. All right. I'm okay with Google being evil, aren't you? I'm totally good with that. Well, I mean, they kind of are. They kind of are. Okay, perfect.
So around 2014, Google shifts reCAPTCHA from text to images. And the reason is straightforward, and you can guess it, right? The capability gap closed. AI models and systems got better at reading distorted text than humans. Google's own research showed that AI could solve text CAPTCHAs with 99.8% accuracy, while humans were getting 97.5%. It's my fault, because I can't tell a capital Y from a lowercase y. But you see, this is where something goes wrong, right? The security system designed to distinguish humans from machines had become something that machines did better than humans. And remember, we talked about the capability gap. They needed a new gap. Okay, so how do we find a new gap? We find one in images. Okay. This is when we get the grids. Click all the traffic lights. Click all the crosswalks. Click all the bicycles. Click the fire hydrants. Street View photos chopped into squares. And this is the part that I think most people don't realize. When you click on those squares, you're not just proving you're human. You're labeling training data. Yeah. Every time you click a square that contains a traffic light, you're teaching a computer vision system to recognize traffic lights. You're annotating images for machine learning, for free. Again, I didn't sign up for this, but I'm doing it, right? Oh, and by the way, do you remember? Google owns Waymo. Waymo builds self-driving cars. Have you seen how bad the Waymo cars are now, though? Have you seen how they get stuck, and get confused, and can't turn around? It's horrible. That's horrible. But Waymo builds self-driving cars, and self-driving cars need to recognize what? Traffic lights, crosswalks, bicycles, fire hydrants, storefronts, pedestrians. All the things you've been clicking on for the last ten years. All right. So the connection took me a while to see, right? The traffic lights, the crosswalks, the bicycles.
These are all the things a self-driving car has to identify in the real world to not kill somebody. And we've been building the data set for them, one CAPTCHA grid at a time. Again, I'm being manipulated by algorithms. I'm being manipulated by software. It's happening. No wonder I feel paranoid. I'm right. I'm right. Maybe Wi-Fi is going to give me brain cancer, I don't know, but I feel like I'm being manipulated constantly, and here's the proof. Here's the proof. Okay. All right. So the numbers are staggering. A study from UC Irvine calculated that reCAPTCHA and related systems have consumed approximately, okay, breathe, 819 million hours of human time. So when Sam Altman talks about, you know, training costs, you should take that into consideration before you blame OpenAI's models and their electricity usage. 819 million hours to train these image systems. That's 93,500 person-years. Oh, my God. If you value that at U.S. minimum wage, it comes to roughly $6 billion. Okay. And I don't make minimum wage. You don't make minimum wage. It's definitely a lot more than $6 billion of unpaid labor. And the return to the user? You get to log in. That's it. That's it. You get to access the thing you already own. Meanwhile, Google captured training data that helped build AI systems worth, I don't know, potentially trillions. The study called reCAPTCHA a tracking cookie farm for profit masquerading as a security service. That's the research at UC Irvine. Renee, do you remember, in the internet episode, we talked about how early platforms extracted value from communities without returning it? Yeah, the communities created the content, and the platforms monetized the attention. And we ended up, this is why it makes me crazy, we ended up being the product, right? Like, how do we get tricked into this every time? Right? Because we're living in the Matrix. Like, let's just admit it.
Let's just admit it. It's the same pattern over and over again, but this time it feels worse. Because at least on social media, you kind of chose to post something. You chose to create, to put your pictures up or whatever. You knew you were participating. Here, you're just trying to get through the door. You didn't sign up to train a self-driving car. You didn't agree to annotate images. You didn't agree to digitize the New York Times or help Google digitize books illegally. You just wanted to search for something on Google. And the platform turned that involuntary interaction into training data worth billions. And you know what? If I were actually doing that to train something to find Bigfoot, I'd be fine with it, to be honest with you. But it's self-driving cars. Like, why was I doing that? I would have been happy to find Bigfoot. Is there a Bigfoot in this picture? No. Is there a Bigfoot in this picture? No. That would be funny. I'd have been down for that. Yeah. I think it's worse, because the AI that was trained on all that human labor can now solve the test better than we can. In September 2024, researchers at ETH Zurich published a paper showing that a modified version of the YOLO image recognition model can solve reCAPTCHA version 2 with 100% accuracy. Humans manage 70 to 90%. We trained the thing that made our own test obsolete, which makes sense, because we're about to do it again. And we're still doing the stupid test. We've made it obsolete, we know we're bad at it, and yet we're still doing it. That's where we're at. And there's another layer that I think is worth knowing about: CAPTCHA farms. Okay, this really makes me crazy too. These are actual companies, real businesses with management structures, APIs, quality assurance.
And they're in India, the Philippines, and Vietnam, where human workers badge in in the morning, sit down in their cubes, and solve CAPTCHAs for bot operators. If a bot encounters a CAPTCHA, it sends the image to the farm, a human solves it and sends back the answer, and the whole cycle takes seconds, right? Workers earn, oh, this is terrible. Yes, it is. A human mind is a terrible thing to waste. Workers earn about $1 to $2 per 1,000 CAPTCHAs solved. It's not much, but in some economies it's enough to be worth sitting in front of a screen all day. And you know what? At least you're not moderating for Facebook. At least all you're doing is this, right? Because it could be a worse job. It's about $100 a month, which, yeah, is not a living wage. Even in those markets, it's not a living wage. It's not, yeah. Yeah, well, I mean, you know, everybody takes advantage of everybody. Yeah. So the security system designed to stop bots created an entire labor market of humans working on behalf of the bots. So we really are the batteries that the machines are using to create the alternate universe known as the Matrix. It's real, people. It's real. Yeah. The humans, in this case, are on the wrong side of the door. That's the thing, right? The gate was supposed to keep the bots out, and instead it created a micro-economy where humans open the gate for bots as a job. I just...
I'm working for the bot. I'm working for the bot. I'm not working for the man; I'm working for the bot. That's not okay. Although, you know what? Maybe that's where we're all headed. I don't know. Did you see this? I can't remember where I saw it, but it was some stupid video, a deepfake or whatever, and it had Musk and Altman and somebody else, and they're pudgy and older, and they're talking about how great the models are and all this stuff. And then it shows the humans, and the humans are all buff and built, and they're in a gym, literally generating electricity to run the models. There you go. Yeah. That's what Sam Altman wants. I'm just going to say it. He wants us to be the electricity. We're going to be running in hamster wheels, because he's not going to be able to afford that electricity bill. It's going to be us. No, no. I mean, the thing that's got me so tuned up about this is that the capability gap has disappeared again, right? You can train a model over and over, and then the gap disappears. But then we're left with the crappy experience. And we know it's a crappy experience. Google knows it's a crappy experience. Facebook knows it's a crappy experience. Everybody knows it's a bad experience, and yet we're still doing it. Why? Because we still have to prove we're human. And every time the technology catches up, it's just like, all right, you need to figure out a way to be more human. Like, maybe the bots could figure out a way to be less human. Maybe the bots should just give it a break, right? Just give it a break. Yeah. But they can't. They never get tired. That's why people like them. And then, okay, there's the latest version: reCAPTCHA version 3, launched in 2018. You've probably never seen it, and that's the point.
There's no checkbox, there's no image grid, there's no puzzle. It just runs in the background on every page that implements it. It watches your mouse movements, your keystroke patterns, how you scroll, how fast you click, your device fingerprint, your browser history, and it generates a score between zero and one. Well, that's pretty binary. Zero means you're definitely a bot. One means you're definitely a human. Your entire behavioral profile is compressed into a single decimal, and you never see it. You never consented to it, beyond the terms of service on a page that nobody reads. It's, plain and simple, surveillance. Mass surveillance with low value. Because at the end of all that watching, all that behavioral data collection, all that tracking, it outputs a number. Oh, yay, a number. A probability. Are you human? 0.7. Maybe. Probably. We think so. It sees everything and understands almost nothing. And the behavioral data it collects? That's not just for bot detection. Of course not. Why would it be? It's another data stream feeding the same machine, right? Users become the product without a return benefit, again. And the return benefit we do get, the bot detection, is still broken, because I'm still clicking on freaking fire hydrants. I mean, it's literally what our intelligence services use to track our enemies. We do the exact same thing, right? It's what the CIA has: malware sitting on computers somewhere in Russia, doing all of that. It's looking at the browser. It's watching how it moves, watching how it types, keeping track of what it is. The way we knew the Russians were interfering in the election in 2016 was because the Dutch secret service had keyloggers on everything in Russia. They were wired into the cameras. They could see people coming and going. They could see everybody.
They knew it. They had transcripts of what those people were typing. And they gave that all up to let America know that, uh-oh, something's up, right? So the same technology that is used to, you know, surveil, like, foreign enemies, they're using on us. And I just, that's so crazy. For what? For what? So that I can get access to a web page that seriously doesn't matter? Like, that's borderline crazy. Yeah. It's borderline crazy. Like, you get access to your email or your whatever. You get to do a Google search. But whoever is building the model gets to build a very close representation of how humans behave on a mass scale. On a mass scale. That's crazy. I don't like that. No, I don't like it. But I kind of want to just step back for a second and think about it sort of differently. And I really have been thinking about this since the internet episode and some other things. I think this whole approach is just architecturally wrong. Of course you do. Yeah. Yeah. Because it's me. And of course, yeah, of course you do. Not just that the CAPTCHAs are annoying and the AI can beat them. The entire model of how we're trying to solve the problem is wrong. I think of CAPTCHA like a gate. It checks you once at the threshold and then it trusts you. And this is not how, like, banking behavioral biometrics works, right? Banking behavioral biometrics checks you at every step of the way. This CAPTCHA stuff, like, you know, it's trying to prevent the bots at the gate, right? Right? It's still the same model as the AltaVista era of bots spamming things. Once you get past that, well, a bot can still do bad things. You can do whatever you want past that gate. But the actual threat was never a bot getting through the door or the gate or the lock, right? The threat is what happens after. Bad behavior is what happens after the door, the lock, the gate, whatever. Spam happens after the door.
Fraud happens after the door. You know, identity takeovers happen after the door. The gate is solving the wrong problem, I think. Yeah, we've seen it before, right? Early online communities, they didn't have gates. Usenet, forums, bulletin boards, right? Like, there was no CAPTCHA, no identity verification at the threshold. And they were more civil, more functional, more human than most of what we have now, because the community observed the behavior continuously. Not once at the door, but all the time. Your handle had a reputation. People remembered what you said. And if you were rude or sloppy or cruel, there were consequences, not because the system enforced them, but because humans remembered, and we used to have shame. Right. And so you would behave yourself because you didn't want to get kicked out of the community. You wanted to participate. So the community had rules, and you would follow them, because if you didn't, you would get locked out of the community. Right. And I think like that. Yeah. Yeah. Those were the good old days when we knew how to behave. We knew how to behave. So what do you think? What do you think happened then? It was accountability, because everybody knew everybody, right? Like, I think now we're so anonymous. Like, I can go on to any social media platform, call myself anything. Calamity Jane. I can call myself Calamity Jane, and you don't know who I am, right? And I can go say whatever I want, and I can do whatever I want. I can call you whatever I want, and I'm gonna. I'm gonna, because that's what we've devolved into. We've devolved. We're all trolls, and we're all doing it anonymously. And I think there's a dopamine hit with that. And we actually really like it. And I think it's really sad. It's sad that, like, community doesn't matter anymore. We're not a community anymore. We don't behave like that. Even in the forums, like, if you join, I used to belong to a Simpsons, like, obscure Simpsons group.
Like, it was really obscure. You'd have to come up with some really obscure stuff to be part of that group. Even to get in, it had to be obscure. Like, what's Skinner's prison number, right? You mean, like, yeah. Oh yeah, no, what was Jean Valjean's prison number? Because that's what it is. Oh, 24601, whatever it was, it's Jean Valjean's prison number. But, like, yeah, like, what's the Crazy Cat Lady's name? Eleanor Abernathy. Like, it was something super obscure. And then you got in the group, and the group was so weird and obscure, it was really fun, right? And then all of a sudden a Nazi got in there, and everybody got mean, and everything got weird, and I'm like, you guys, it was fun while it lasted. I'm out. And so even in those, you know, curated groups where we all thought we had something in common, you know, people still didn't know how to behave in those places, right? So I don't know. I think once, you know, becoming anonymous was something that you could do, and we started getting that dopamine hit from being mean, like that, it was over. It was over. It's sad. Yeah. Let's talk about scale, because I think the scale piece of this is, you know, why. Anonymity. Yeah. Because communities got too big for human memory, right? And the platforms didn't build systems that preserved the accountability, right? Can you imagine if I got an email from Facebook every time I said something mean to somebody? I don't say mean things to people, but if I did, like, can you imagine, right? They built gates instead. Verify once and forget. And the gates were always easier to build than actual governance, right? Can you imagine governing Facebook? No wonder they don't want to do it. Like, I mean, what? And, like, what you just said, right? If I was getting an email every time I
misbehaved. And actual governance, like, that's why people behave badly, because there isn't actual, real governance. So I can get away with it, and I'm not even held responsible for it. I can say whatever I want, right? There's no, I have passed through, along with Facebook. I, Renee Murphy, have passed through. Indemnification. You can't hold me to whatever I said. It's free speech, it's whatever. Like, yeah, it's craziness. It's absolute craziness. All right, go ahead. So the lock is broken, right? AI solves the puzzles better than we do. The invisible version watches everything and still only produces a probability. The CAPTCHA farms route around the gate entirely with cheap human labor. Like, the whole model of verifying identity at the threshold and then forgetting is the same design flaw that broke online communities when they scaled. And then there's the burden of proof. I have to prove I'm a human. I know I'm a human. I wish I was an orca, but I'm not. I know I'm a human. Every other... If there was a checkbox that said, I'm an orca, click. I'd be like, done, I'm an orca, if only. Every other trust system humans have ever built starts from the opposite assumption. In law, you're innocent until proven guilty. In social settings, you're trusted until you give someone a reason not to trust you. CAPTCHA inverts that. You're guilty of being a robot until you click enough fire hydrants to prove you're not. I hate the ones where you click and you have to keep clicking, because the things disappear, you know, and it just keeps popping up with more. You know, just keep clicking. Yeah, right. Yeah. I hate that. And then there's the next group of things, and there's the next group of things. You're just like, I want to kill you. I'm not asking for, like, the nuclear, you know, codes. I'm not. I'm really not. Like, I've spent so much time, you know, at the gate here, right? You're making me not want to come in. You're making me not want to hang out.
Exactly. Just let me in. I would actually rather just sit there for three seconds, you know? So what happens? What replaces the gate here, right? There are some things that are emerging. Cloudflare, bless their hearts, has something called Turnstile that does behavioral analysis without sending your data to Google. So, yay. Wait, don't they have it? Don't they have it? Yes, but they actually have a privacy policy that they publish around this. Yeah. Okay. Yeah. Apple has Private Access Tokens that use your device's hardware to prove you're using a real phone. I know some of the product managers on that. It's pretty good. And it doesn't reveal who you are. There are some proof-of-work systems where your browser does a computation in the background. Again, that feels a little, you know, sketchy, because the human cognition time isn't being taken, but the CPU time is being taken for some task that's, you know, whatever. I'd be okay if it was SETI. Like, I'd be okay if it was solving SETI@home packets or something like that. But yeah. For a normal user, it takes, you know, milliseconds. But for a bot running millions of requests, the computational cost is prohibitive. And then there's the more radical stuff, right? World ID, the Worldcoin project, uses, oh my God, iris scanning. Okay. Okay. Okay. You go to a physical device called an Orb. Of course it is. It scans your, you scan, it's so Simpsons. Remember when Lisa goes to the museum and it's the Orb of Isis? Like, this is literally, right. Okay. So it's called the Orb. It scans your iris, and you get a cryptographic proof that you're a unique human being.
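The proof-of-work idea mentioned above has a well-known shape (it goes back to hashcash-style schemes): the browser hunts for a nonce whose hash meets a difficulty target, and the server verifies it with a single hash. This is a self-contained sketch of that idea, not any particular vendor's implementation; the challenge string and difficulty are illustrative.

```python
import hashlib

def solve_pow(challenge: str, difficulty_bits: int = 16) -> int:
    """Find a nonce so that sha256(challenge:nonce) has at least
    `difficulty_bits` leading zero bits. Milliseconds of CPU for one
    page load; prohibitive for a bot firing millions of requests."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify_pow(challenge: str, nonce: int, difficulty_bits: int = 16) -> bool:
    """Verification costs one hash -- the cost asymmetry is the point."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

# A hypothetical per-session challenge string, for illustration:
nonce = solve_pow("example-session-token")
print(verify_pow("example-session-token", nonce))  # True
```

Unlike image grids, no human cognition is spent here: the "work" is pure CPU time, which is exactly the trade-off the hosts find less objectionable.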
No identity attached, just proof of personhood. Over 9 million people, can I just tell you something? Like, this is why, 9 million people have used it, but this is why I don't own VR headsets. Can I, your eye is as unique as a fingerprint. Yeah. And the way your eye tracks when you read or when you're looking at a screen, that's unique to you, too. No two people look at the same stuff the same way, and they don't do it with the same eyeballs. So imagine if your fingerprint had a fingerprint. Like, that's how unique it is to you. Like, it's why I don't trust anybody, especially not Meta, with that data, right? Like, I would love to go Bigfoot hunting in VR. I would love it. It'd be fun. No, no, I don't do it. I'm not playing with the eyeballs. I'm not playing with the eyeballs. Yeah, I think there's definitely some ups and downs with biometrics. Like, I won't say where I was, but I was at a company and did some work in the biometric space. And we could literally break every single biometric. It didn't matter what it was. So there's definitely ways to attack biometrics. And, you know, I think there's some positive implementations and negative implementations. Positive implementations like Apple, where the biometric data actually stays on the device that you own and it never leaves. I think that's the model that most of these guys have been implementing in the last couple of years, which is good. But there are other biometric implementations that are bad, and you can't replace a fingerprint, you know, and once the fingerprint is compromised, it's bad. But anyways, you know, yeah, biometrics, it solves one problem, but it creates, like, 50 others. Yeah, better bot detection at the cost of biometric data, more tracking, more invasive verification. We keep building better locks. Yeah. Honestly, locking stuff that doesn't even need to be locked up. I mean, it's like putting more locks on the garage that's full of crap you wish someone would steal.
I don't know. You know, there's something to be said for a positive identity claim and proof, right? And that establishes some sort of reputation, you know, whether that's good or bad. But anonymity, like, as much as I'd like to think that, you know, we should be in a position where we have some anonymous, you know, capabilities, anonymity creates so many other problems. Yeah. Yeah, I mean, I guess I don't have a problem being judged. I mean, I do have a rating on Uber. Sure. It always surprises me, too. Like, it's still pretty good, even though I'm the idiot who travels, like, once a month and then forgets to tip somebody, and when I open it to get a new cab, like, a month later, I'm like, oh, I should have tipped them. Oh, no. Okay. Like, yeah. So, like, you get it a month late, so you've already said something bad about me, right? Like, she doesn't tip, what a scumbag. And it's like, oh, no, she did. Like, go re-rate that, I need a five. Yeah. I mean, yeah. But isn't that what they do in China, though? In China, you have a score. Yeah. Yeah, that social score gets you in and out of places, right? If you have a low social score, there's a lot you can't do. I bet you can't even own a cell phone, right? But if you have a really high one, then that means you're a good citizen. You align yourself well to the government. You can, you know, do all kinds of things. But to do that anonymously, right? Like, you don't know who I am, but I have a good score. Like, that makes it all the harder, right? Although, yeah. I mean, yeah, you can anonymize me, but at some point, you can put all that data back together, right? And with large language models, at some point, does it become possible for bots to mimic good behavior, right? Reputational behavior. And then, then what, you know? Why not? It can already do bad things. Why couldn't it do good things, right? Like, to gain trust to do bad things. Like, I could totally see that happening, right? I could totally see that happening.
I mean, even prompt injections could lead your bot that route, right? Like, yeah, they're not smart. They're really not smart. Yeah. So I think this is, like, what do we say all the time, right? This is why we can't have nice things. Yeah. It's why we can't have nice things. All the behaviors remain the same, but things get weaponized because of scale. The technology gets better, and so that means you can reach further, your scope gets bigger, the spread is wider. I don't know. This is one of those problems that's really frustrating. And I think the problem I have with it is that the experience of the internet itself has degraded over the last several years, right? Like, bless GDPR's heart, right? You know, but after GDPR hit, all the cookies, like, now I've got to approve, you know, or select. Yeah, we do in California, too. I realize, like, people in Wisconsin don't put up with this, but, like, in California, everything is, like, you have the right to say no to cookies. And I do. I actually do say no to cookies. Like, I do mess with that stuff, so I'm glad I have the right to. We have the right to be forgotten in California, too. I can have all my stuff deleted from the internet every 45 days if I want. But the anti-patterns, like the bad UX, right? Like, the guys that build the sites and the apps and all that stuff, they know what they're doing. They know what the anti-patterns are. They know what the bad UX is to try to keep your data live, you know, to try to keep you using it. And I just, like, man, that's why we can't have nice things, you know? Why we can't have nice things. Again, but we're not the user. We're the product. Yeah, I know. Right? I know. I mean, if you can just default your head to that, then you're just like, yep, that's what happens. That's how you got, what was that, 819 million hours of work out of us? Like, it's because we're not a user. None of this is for us. None of this is for us. Yeah.
It's kind of sad, actually. I think the alternatives that will actually work here are the ones that stop treating humanness as a gate, though, and start treating it as a continuous signal. Whether that's anonymous or reputation-based or whatever, I don't know. But, you know, continuous, because that's what we do in behavioral biometrics in the banking industry: continuous signal. But we're not using it to collect data, right, and build models of your behavior to then sell your data as a product. You know, we're doing it to make sure that fraudsters don't steal your money. You know, so that's what continuous signaling should be. Well, right. It's how fast I can type in a password. It's, like, yeah. Yeah. And I actually don't mind that, right? Like, in that regard, I think, like, the more you can tell my behavior, the way I behave, apart from someone doing something different than that, and that's how you say, that's fraud, yeah, I think I'm okay with that, right? I'm okay with that, because, again, you're not turning it into a product, you know. But if Google does that same exact thing and then productizes it, you know, to sell the next coolest whatever, Gemini does this or whatever, like, I'm not okay with that. Like, I'd rather pay Google five bucks a month or whatever. So that's how I felt about my smart meter. Like, I'd rather pay you 75 a month for you not to give me that thing. Like, I'd rather have a dumb meter. I'll just pay the money. I don't want the smart meter. It knows too much. Yeah. We're gonna have to do a show about that. Like, electrical metering or gas metering. Oh, that's right, that meter drives me crazy. Okay, nobody talks, nobody be talking about privacy policies on smart meters, you know, and they should. They should. If you knew what that meter knew about you. I'm just saying. I know. I know. Anyways, we'll have to do one about that one. All right, so let's bring this to kind of a
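The continuous-signal idea described here can be made concrete with a toy example: compare a session's inter-keystroke timing against a user's own baseline and flag large deviations. Real banking behavioral biometrics use far richer features and models; this sketch, with made-up timing numbers, only shows the shape of the idea.

```python
from statistics import mean, stdev

def keystroke_anomaly(baseline_ms: list, session_ms: list) -> float:
    """Toy continuous signal: how far this session's average
    inter-keystroke interval sits from the user's historical baseline,
    measured in baseline standard deviations. Checked continuously,
    not once at a gate -- that's the architectural difference."""
    mu, sigma = mean(baseline_ms), stdev(baseline_ms)
    return abs(mean(session_ms) - mu) / sigma

# Hypothetical timings (milliseconds between keystrokes):
baseline = [120, 135, 128, 140, 132, 125, 138, 130]   # the real user
same_user = [127, 133, 129, 136]                      # a normal session
robot_like = [20, 20, 21, 20]                         # inhumanly fast, uniform

# The legitimate session scores far lower than the bot-like one:
print(keystroke_anomaly(baseline, same_user) <
      keystroke_anomaly(baseline, robot_like))  # True
```

The key point the hosts make survives even in the toy: the signal accumulates over the whole session, so a bot that slipped past the front door still gets caught by how it behaves afterward.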
close here. So, for the last 25 years, we've been clicking on distorted text and blurry photos of traffic lights. We digitized the New York Times. We digitized millions of books. We helped train self-driving cars. We created an entire labor market of humans solving puzzles on behalf of bots. And the AI we trained on our own labor can now pass the test better than we can. And we still have to click the stupid fire hydrants. I'm going to think about that every single time I get one now. Every crosswalk, every bicycle, every blurry storefront, I'm going to deliberately get it wrong so Waymo hits somebody. Ha ha ha. Because that's what an orca would do. But see, this is the irony. This is the total irony. Because the bot and the ML is smarter than you are, and if you deliberately give it the wrong stuff, it already knows. It already knows. It can already solve them 100% of the time. So maybe that's the thing, right? Maybe that's it. That actually proves that you're human, if you can't pass. Right? If I'm part of that 73% that just can't sort it out, just can't figure out what's a bicycle. I think, listen, if you're listening, I think we should collectively get together and just tank it. Just tank it. Just get it all wrong all the time. Just go find a couple of pages that are really irritating and just keep getting them wrong over and over. Can we create a plot to just get them wrong? Exactly. That's right. Yes, we can create a plot to get it wrong. Love it. What a great idea. I can't be any more mad. I'm going to have to go take a nap. So this is, but see, this is where, like, we love, you and I, we love tech. Like, I know a lot of times we say we hate tech, but it's not really that we hate it. I don't. I actually really like it. Exactly. I really like it. Like, the promise of technology. We love, like, you and I, we really love the promise. Like, the utopia that could be at the end of that data center tunnel, right? Right. But this is not, this is not that.
This is, this is bad. This is. Yeah. This is an anti-pattern. And I think, but see, this is, I don't know. The weaponization of technology is real. I don't like being taken advantage of at scale. And I feel bad that people think they have to do that. That you have to use me as a guinea pig, or use me as free labor, or, you know, use me as a product. Like, we really have to get away from that. We really, really have to try to get away from that. I just, it's not healthy. It's not fair. And that's not how things should work. Not for companies that are worth hundreds of billions of dollars, right? Exactly. It's just not okay. Maybe they're worth hundreds of billions of dollars because they exploit the labor. Yes, that's exactly why. That's exactly why. Yeah, if they had to pay for 98,500 human hours or human lives to get that done in one year, like, yeah, they would have. But no, instead we did, what was that crazy number? Like, how many thousands of books in three years? Like, that's crazy. That's crazy. Yeah. That's crazy. Yeah. And you can't tell me, here's the thing. I don't think you'd be able to convince me right now that you didn't make me do that more times than I needed to, just so you could get your project right faster. Right. Like, I think that's what you did. You actually destroyed whatever trust I had in trying to figure out if you could trust me. Like, I'm done. I'm done. Yeah. I'm done. I'm totally with you there. Like, I don't know what has happened, but the browsing experience has gotten so bad. And it feels like it's because of this, you know, stuff like cookies and, like, this image recognition, you know, trash. And you're right. It's violated trust. I don't trust any of these platforms that use this stuff. I don't trust them.
So, and I guess this is why, if you subscribe to or you download an LLM, like, you know, ChatGPT or whatever, a chatbot, it's why you go there to search the internet instead of to the internet. Yeah. Like, the internet has become such a bad experience between ads and CAPTCHA and, you know, sponsored browser returns, like, listing returns. Like, at some point, you've got to be like, okay, I've got to cut through the crap. So, yes, Sam Altman, I'm going to waste your GPUs on finding out what time CVS opens. Yep, that's what I'm going to do. Exactly it. But the answer you get may not actually be the right answer. It's going to be probabilistically close. So, you know what? Close enough. That UX experience is so bad, probabilistically good enough is good enough for me. There you go. Good enough. All right, folks. If you liked this episode, or you were angry with us, please like, subscribe, share with your friends. You can reach us at nostalgicnerdspodcast@gmail.com. Thanks for listening, and we'll see you next time. I need a drink. Bye, you guys.