
AI might write your code, but can you trust it to do it well? Clare Sudbery says: not without a safety net. In this episode, she explains how test-driven development is evolving in the age of AI, and why developers need to slow down, not speed up.
Overview
In this episode, Brian sits down with Clare Sudbery, experienced developer, TDD advocate, and all-around brilliant explainer, to unpack the evolving relationship between test-driven development and AI-generated code. From skeptical beginnings to cautiously optimistic experimentation, Clare shares how testing isn't just still relevant; it might be more essential than ever.
They explore how TDD offers a safety net when using GenAI tools, the risks of blindly trusting AI output, and why treating AI like a helpful human is where many developers go wrong. Whether you’re an AI early adopter or still on the fence, this conversation will sharpen your thinking about quality, ethics, and the role of human judgment in modern software development.
References and resources mentioned in the show:
Clare Sudbery
Clare’s upcoming Software Architecture Gathering 2025 workshop
Clare at GOTO
AI Practice Prompts For Scrum Masters
#99: AI & Agile Learning with Hunter Hillegas
#117: How AI and Automation Are Redefining Success for Developers with Lance Dacy
Subscribe to the Agile Mentors Podcast
Want to get involved?
This show is designed for you, and we’d love your input.
- Enjoyed what you heard today? Please leave a rating and a review. It really helps, and we read every single one.
- Got an Agile subject you’d like us to discuss or a question that needs an answer? Share your thoughts with us at [email protected]
This episode’s presenters are:
Brian Milner is a Certified Scrum Trainer®, Certified Scrum Professional®, Certified ScrumMaster®, and Certified Scrum Product Owner®, and host of the Agile Mentors Podcast at Mountain Goat Software. He's passionate about making a difference in people's day-to-day work, influenced by his own experience of transitioning to Scrum and seeing improvements in work/life balance, honesty, respect, and the quality of work.
Clare Sudbery is an independent technical coach, conference speaker, and published novelist who helps teams rediscover their “geek joy” through better software practices. She writes and speaks widely on test‑driven development, ethical AI, and women in tech, bringing clarity, humor, and decades of hands‑on experience to every talk and workshop.
Auto-generated Transcript:
Brian Milner (00:00)
Welcome in, Agile Mentors. We're back for another episode of the Agile Mentors Podcast. I'm here, as always, Brian Milner. But today, I have Ms. Clare Sudbery with me. Welcome in, Clare.
Clare Sudbery (00:13)
Hello.
Brian Milner (00:14)
I'm so happy to have you here. Clare is here with us because we wanted to talk about a topic that I think is going to be interesting to a lot of people, and that is test-driven development, but not just test-driven development: test-driven development in light of AI and kind of the changes that AI has made to test-driven development. So why don't we start with just the base level of test-driven development for people who have only heard kind of buzzwords around it and aren't as familiar with it. How would you explain test-driven development in sort of plain English?
Clare Sudbery (00:47)
Okay, so the idea of test-driven development is that you want to be certain that your code works. And I'm sure most people will be familiar with the idea of writing tests around your code to prove that it works. But that principle is considered so important in test-driven development that we write the test before we write the code. And that's why we say that the development is driven by the tests. So the very starting point for any coding exercise is a test. Another really important part of this is that that test is tiny. So what we're not doing, and people might have heard of behavior-driven development, is starting with quite a big test where you say, I'm going to write a test that says that my thing should do this, and the user should see a particular thing happen in a particular circumstance. In test-driven development, the test is testing not what the user sees, but just what the code does in the tiniest, most granular way possible. So if you have a piece of your software that does some mathematical operations and you expect certain numbers to pop out the end, then you might say, just in this tiny bit of this calculation, this number should be multiplied by four. So you're not even necessarily saying that given these inputs, you should get these outputs. I mean, you may have tests that say that, but you're just testing that something gets multiplied by four. And that's just an example. But what you're doing is you're thinking, what is the tiniest possible thing that I can test? And you write a test that tests that tiny thing. And you do that before you've written the code. So obviously, the test fails initially, because you haven't even written the code yet. And that's another important part of the process. You want to see it fail because you want to know that when you then make it pass, the reason it's passing is because of something you did. And that means that every tiny little bit of code you write is proven because it makes a test pass. And when you get into the rhythm of it, it means you're constantly looking for green tests. And there are lots of other things I could talk about. Like for instance, you never want those tests to fail. So if at any point any of them start to fail, you know that that's because something you just did made them fail, which also means that you want to run them consistently every time you make any changes. So you're getting that fast feedback. You're finding out not only whether what you've just written works because it makes its test pass, but also that it's not making any other tests fail. So not only does it work within its own terms, but it hasn't broken anything else. And that's actually really common when you're coding: some new thing that you add breaks some existing thing. So you're constantly paying attention to those tests and you're making sure that they pass. And it drives the development in a very interesting way because you're always talking about what should work. You always think about what should work. You always think about how it should work. You're moving in tiny, tiny steps. So you're gradually, gradually, gradually increasing the functionality, and whether it works or not and how it works is being determined by the fact that you're making tests pass. And the really interesting thing is that it actually helps you to design software as well as to make sure that software works. So hopefully that explained it.
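For readers who'd like to see that rhythm on the page, here's a minimal sketch in Python with pytest. The names and the framework are our choice for illustration (Clare doesn't prescribe any); the point is the order of events: a tiny failing test first, then just enough code to make it pass.

```python
# test_calculation.py -- hypothetical example: the tiny test comes first.
# Step 1 (red): run `pytest` and watch this fail, because times_four
# doesn't exist yet. Seeing it fail is part of the process.
from calculation import times_four

def test_number_is_multiplied_by_four():
    # The tiniest, most granular assertion possible.
    assert times_four(3) == 12

# calculation.py
# Step 2 (green): write just enough code to make the test pass, then
# re-run the whole suite so you know nothing else has broken.
def times_four(number):
    return number * 4
```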
Brian Milner (04:10)
That's an awesome explanation. I really appreciate that. That was a great kind of practical, plain-English explanation of it. I love it. So for the people who weren't familiar, now you have kind of a good idea of what we mean by test-driven development. I know with the advent of AI, there's been lots of changes that have taken place, lots of changes in the way that developers create their code. We now have these sort of co-pilots, assistants that help in doing our coding. But on the other hand, one of the things you hear quite often is that there's lots and lots of quality issues, that it takes a lot of effort to try to maintain that quality and make sure that it's still at a high level. So how does AI enter the picture of test-driven development? How is that helping? How is it changing the way that we do test-driven development?
Clare Sudbery (04:59)
It's a very good question and there are lots of different strands to how I can answer it. And I think it's probably important that I start by saying, I came to this from a position of deep skepticism. So I have been sitting on the sidelines for a long time, watching the AI explosion happen and not actually getting very involved. But what I did find was that it was becoming like a tennis match. I was just going, okay, and they say that, and they say that. And it actually became very interesting to me just how polarizing it could be. You know, that there were people within my networks, people who I had a lot of respect for, who were very anti, and others who were very pro. People who've been experimenting with it and having a lot of fun with it. But one of the big issues that I didn't even have to be told, I could guess would occur, and it has occurred, is exactly what you said: that the code that is generated by GenAI coding tools is often not reliable. And it's not reliable for the same reason that when you ask ChatGPT a question, the answer you get is often not reliable. And that's because these things are not deterministic. That's the way that they're constructed. I mean, people might remember a long time ago, people used to talk about fuzzy logic. It's all a bit wibbly-wobbly. You'll get a different answer if you ask the same question. And the way that it's constructing those answers is not the way that we're used to as software engineers. It's not a strict series of logic. It's not all noughts and ones. And hallucination is a real problem. So part of the problem is that AI is synthesizing new answers to questions, and it's not answering them in a logical, deterministic way. But also, the place that it's getting its answers from is the result of years and years and billions of files and lines of human output, but with no way of discerning which bits of that output are good and which bits are bad, and also whether this particular bit that it happens to have plucked from some random code base somewhere is really right for this context. So when you ask GenAI to write code for you, you are going to get weird results that don't necessarily do what you want them to. But one of the things that we're being told is it's going to speed you up. And the big attraction of asking AI to write code for you, if you're a software engineer, is, well, you know, sometimes I'm not quite sure how a particular library works or how a particular framework works, and I have to spend ages on Stack Overflow and Google trying to work it out. Or trying to work out an annoying bit of CSS, or an annoying regular expression, all of these things that I can bash my head against and spend ages on. Oh, here's a machine that could just do it for me. Yay. And that's very tempting to pretty much anybody who's ever written code, I'm sure.
Brian Milner (08:13)
Yeah.
Clare Sudbery (08:14)
And also the idea that it will speed you up, and the idea that it will work out tedious tasks for you that you don't want to have to work out, is very attractive. But if you don't then look in detail at exactly what it gives you, and particularly if you're not actually able to understand in detail exactly what it gives you, then how the hell are you going to know if what it's given you is the right thing?
Brian Milner (08:42)
Yeah.
Clare Sudbery (08:42)
And because we're all impatient, and I certainly am, and I think most people are to some degree or other, it's hard. It's hard to persuade yourself to check the results. And the more impatient you are and the less experienced you are, the more likely it is that you won't pay proper attention to the results. You won't really rigorously check whether it's doing what you want it to do. Now that's fine if it's a little hobby project, particularly as sometimes the speed with which you can generate things is such that you can just throw it away and create another one. But if you're building production software, if you're building software that really has to deliver for a very high number of users, particularly if you're building software that actually has real-life implications where bad things can happen, people can lose money. You know, not many of us work on software that endangers life, but some of us do. But at the very least, we do work on software that has privacy implications, that has financial implications. So if you're working within the industry and not just having a bit of fun, then you need some way of knowing whether what AI has presented you with is actually fit for purpose. And that's where tests come in. Obviously, that's always where tests come in. That's how we know that things are working. And if you're used to working with test-driven development, which I am, it becomes addictive. Now, most people who learn how to do test-driven development will go through a period, and that period will be longer or shorter depending on who you are and depending on like a million different circumstances. But you'll go through a period where it's like, do I really have to write all of these tests? Can I not just, you know, take a bit of a shortcut? But when you get through that period of thinking, isn't it just slowing me down, and isn't it just a bit tedious really, then most of us get to a point where it actually becomes kind of addictive. We become very reliant on test-driven development specifically, because what we realize is it gives us safety and security and really strong belief in what we're building, in a way that we didn't have previously. Now, given that that's where I am, that I've been doing TDD, I mean, I'm going to stop saying test-driven development. I like to not jump straight to TDD in case people don't know what it means, or they think I'm saying DDD, because they sound very similar. I'm going to say TDD now because it's slightly quicker than saying test-driven development. But I've been doing it for long enough now that I miss it when I don't have it. And one of the things that I really love is that a good, well-designed test suite, which is another skill that you pick up as you get good at TDD, can be run quickly and can give me very fast feedback and security, and also a belief that something I've built is robust and that it works. So obviously that's the first thing I think of when I think of how, if I'm going to leap in and make a pact with the devil and start playing with GenAI, I'm going to be happy with what it builds. How am I not going to be endlessly suspicious? And tests for me are the answer. But then what's really interesting is that I started paying attention to people who were using GenAI in real-world applications. So not just having a bit of fun with it, but actually using it to build real, important systems.
What I started to notice, and I wasn't surprised, was that they were saying it's reinforced to us how important the belt and braces are, how important tests are, and how we absolutely really need to put tests around it. And so that's when I started really looking into how I can use AI in a way that's effective and useful and fun, but also ethical, which is a whole other subject, and also robust and trustworthy. And for me, tests were really the obvious answer to that.
Brian Milner (12:49)
Yeah. Yeah. Yeah. I really appreciate the way you went about explaining this, because I think you're absolutely right. First, you have to understand what it is that AI, that large language models, are doing, and that they are based on kind of probabilistic equations on the back end. It's telling you what's most likely to be the next answer. But then I really also appreciate the idea that that human-in-the-loop kind of concept is really important in this area, because as you said, it doesn't have judgment. It doesn't have the ability to make decisions for us. It's basically trying to guess what it thinks you want the answer to be. And you can completely flip it: if you just challenge it a little bit, it'll change its opinion entirely to try to please you. So,
Clare Sudbery (13:20)
Mm-hmm. Yes, yes.
Brian Milner (13:37)
I want to talk a little bit about how, because I think this is really, really important for our day and age. The idea that if we're using AI to produce code for us, and we can accept that there is this flaw, there is this issue that it's going to produce errors, then I think that using things like test-driven development, TDD, to kind of serve as a gate
Clare Sudbery (13:54)
Mm-hmm.
Brian Milner (14:02)
through which these things must pass, I think can serve as a really useful tool so that you can make that still usable. You can still use stuff that comes from GenAI, but it's passing through human-based quality tests. What do you think the danger is here? Because if we're using GenAI to do lots of things, are we using GenAI to create our tests? Are we using...
Clare Sudbery (14:11)
Yeah.
Brian Milner (14:25)
AI to create our test data? Are we using it to try to determine what kinds of tests we should do? Or are we just then going to be in an echo chamber? What are the things that we should be using AI to do as far as this? And what are the things we should maybe avoid?
Clare Sudbery (14:42)
I think, no matter what you ask AI to do, you're always going to have the problem that you do need to check. You need to check its work. So you really do need a human there at some point, making sure that things are okay. And that just never goes away. And there has been a lot of discussion about how much AI really does help us to develop software. So there have been a lot of claims made about speed gains. So it makes us 10 times faster. No, it doesn't. It makes us slower. Well, who the hell knows? Because how would you measure it? And then there's also the fact that the people who are making the extravagant claims for how good it is... that we're all biased. I was going to say those people are biased, but the people who want to claim that it slows us down are also biased. I mean, we all have our standpoint of what we want to be true. There are certainly people who would like to be proven right that AI is a scourge and we should ditch it as soon as possible. And then there are also people who've been having a lot of fun with it, love the idea of it, and want it to be proven to be amazing. And that's a bit of a tangent, but the point is that really the reason it does in fact slow you down in a lot of ways is because you have to check its work. And that does take time. So yes, you do. And yes, you can ask AI to write tests for you. And that can be really useful. And actually, that was the first thing: my very first experiment was to ask AI to help me to do a kata, only because that's always my starting point when I'm teaching, and I really like katas. Now, actually, I quickly worked out that katas aren't a good use for AI. And in fact, people I know who teach TDD say, please don't use AI for katas. It's not helpful. And the reason it's not helpful is because of the whole point of a kata. Sorry, to explain what a kata is: a kata is a coding exercise, specifically often used for learning and practicing TDD, where you code a very simple problem, but you do it from first principles, making tiny steps. And it's a very nice way of seeing why TDD is useful. Typically those problems are very simple. Actually, they're very tiny pieces of software, tiny little routines and games and things. And the reason they're tiny is so that you can see progress, because actually building software generally takes weeks, and a kata is a very small exercise that you might do over the course of a couple of hours or a day at most. So it has to be something tiny. But if you ask an AI, as I did... so that was the very first thing I did. It was the FizzBuzz kata. FizzBuzz is a game that's sometimes played in classrooms with children, where you count to 100 and you get the children to take it in turns to say the next number in the sequence. But instead of just counting to a hundred, whenever you encounter a multiple of three or five, or of three and five, you have to say something that isn't the number. You have to say "Fizz" if it's a multiple of three, "Buzz" if it's a multiple of five, and "FizzBuzz" if it's a multiple of both three and five. Nice little problem. And it was Claude that I asked to help me to do this. And so I thought, well, why don't I start by asking it to write some tests for me? And it said yes. And it's so difficult not to think of it as though it was a person, and this is one of the problems, one of the dangers. It was like a helpful little puppy. Yes, yes, yes. All right. Lots of tests for you. Here you go. There's loads of tests.
And it had written way more tests than were sensible. Hadn't done it in an iterative way. Hadn't started small. It had written a giant suite of tests with lots of duplication.
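For context, this is roughly what the kata looks like when grown test by test: a tiny function and a handful of small, focused tests, each checking one rule. The sketch below is our reconstruction of the standard exercise in Python with pytest, not the code Claude generated.

```python
# fizzbuzz.py -- the conventional kata solution for numbers 1 to 100.
def fizzbuzz(n):
    if n % 15 == 0:   # a multiple of both three and five
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# test_fizzbuzz.py -- tiny tests of the kind TDD encourages,
# rather than one giant duplicated suite generated all at once.
def test_multiple_of_three_is_fizz():
    assert fizzbuzz(3) == "Fizz"

def test_multiple_of_five_is_buzz():
    assert fizzbuzz(5) == "Buzz"

def test_multiple_of_both_is_fizzbuzz():
    assert fizzbuzz(15) == "FizzBuzz"

def test_other_numbers_come_back_unchanged():
    assert fizzbuzz(7) == "7"
```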
Brian Milner (18:06)
Yeah.
Clare Sudbery (18:25)
And I also asked it to then write some code to make the tests pass. And it did. And what was interesting was that that took seconds. What took the time was for me to check its work. And I was able to deduce, by writing my own tests, that the code was functional. It wasn't the best code I've ever seen, but it was functional. It did the job. It was correct. The tests were not.
Brian Milner (18:35)
Hmm.
Clare Sudbery (18:55)
So the code that it had written failed the tests that it had written, not because there was anything wrong with the code, but because there was something wrong with the tests. The tests themselves were wrong. And it was to do with an off-by-one error. It was treating 99 as though it was a multiple of five. It had decided that 99 was a multiple of five because 100 is a multiple of five and it had started counting at zero. And then, because it thought that 99 was a multiple of five, it decided the code had failed, because the code didn't say Buzz for 99. It just said 99. So it thought the code was wrong because its test failed. In fact, it was the test that was wrong. So I said, well, actually your tests are wrong. And it was like, oh, terribly sorry. Let me fix that for you. And then it came up with these great explanations, like, oh yes, you're right, the tests are wrong, and the tests are wrong because... and now I've forgotten the detail, but it was wrong about why the tests were wrong. I think it did detect the off-by-one error, but then decided that actually, really, 99 should be Buzz.
Clare Sudbery (20:08)
And then it had another... it actually had two tests that contradicted each other. It had one that said that 99 should be Buzz and one that said that 100 should be Buzz. It detected that it had two tests that contradicted each other, but it decided that the bad one was the right one and that the good one was the wrong one, because of the off-by-one thing. So it worked out sort of what the problem was, but still came up with the wrong answer. And what was really interesting was, when I looked in closer detail at the tests, it had written these little notes, it had written comments. It had started by writing the test that said 100 should be Buzz, and then it had added little notes: oh yes, but hang on a minute, we started counting at zero, so actually 99 should be Buzz. And it added these little notes in, and I totally see why people end up falling in love with GenAIs, because we do. We're human beings; we anthropomorphize at the drop of a hat. You know, we can see faces in just random sequences of dots. So it's very easy for us to think that it is trying to please us, which it sort of is, because it's been programmed to try and please us. But anyway, that was a very long answer.
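To make that failure mode concrete, here's the shape of the contradiction, reconstructed as a hypothetical pair of pytest tests rather than Claude's actual output. No implementation can satisfy both assertions, and the model resolved the conflict in favour of the wrong one.

```python
# A reconstruction of the contradictory pair Clare describes -- not the
# generated code itself. It assumes the fizzbuzz function sketched earlier.
from fizzbuzz import fizzbuzz

def test_hundred_is_buzz():
    # Correct: 100 is a multiple of five.
    assert fizzbuzz(100) == "Buzz"

def test_ninety_nine_is_buzz():
    # Wrong: the model's off-by-one reasoning ("we started counting at
    # zero, so actually 99 should be Buzz"). 99 is not a multiple of
    # five, so this test and the one above can never both pass.
    assert fizzbuzz(99) == "Buzz"
```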
Brian Milner (21:00)
Yeah.
Clare Sudbery (21:20)
Yes, you can ask AI to write tests for you. And it can be helpful, because often, particularly if you're approaching a new domain or a new technology, or maybe a new language or a new test framework, or you're using a new mocking framework or whatever, you're not actually quite sure how to write the tests that you have in your head. It can be helpful to ask AI to do it for you. But then what you have to do is stop and look at it and check that you understand it. And this is why the people who are most effective with AI are experienced software developers, and why it's really worrying that juniors are using it more than seniors. And not necessarily junior in age terms: often junior software engineers are not young, because people come to this industry from all sorts of different places. But they're new to coding. And they've also started coding at a time when AI is ubiquitous. So it's just obvious to them that they would use AI, but then they don't understand what they're given. And so they just kind of assume it's okay. Whereas if you are an experienced developer, you know what good code looks like, and you know how to debug code, and you know how to spot obvious flaws, things like off-by-one errors. It didn't take me long to work out what the problem was; what was entertaining was its explanation for the problem. And so it's really tricky. Yes, it absolutely can help you to write tests, and yes, it can help you to make those tests pass. But in some of the exercises that I teach people, I suggest that they write their own tests and that they don't
Brian Milner (22:48)
Yeah.
Clare Sudbery (23:00)
ask the AI to write their tests. So what you do is you write your own test, and then you ask the AI to make your test pass. And the more tightly defined your tests are, the more confident you are that if the AI makes that test pass, it really has done what you wanted it to do, because your test is passing. But there are still issues.
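In practice that workflow can be as simple as writing a test file yourself and prompting the model to produce only the implementation, leaving the tests untouched. A hypothetical sketch (the module and function names are ours, for illustration):

```python
# test_discount.py -- written by the human, never by the AI.
# The prompt then becomes: "Write discount.py so that this file passes,
# without modifying the tests." The tighter the assertions, the more a
# green run actually proves.
import pytest
from discount import apply_discount

def test_ten_percent_off():
    assert apply_discount(price=200.0, percent=10) == 180.0

def test_zero_percent_changes_nothing():
    assert apply_discount(price=200.0, percent=0) == 200.0

def test_negative_percent_is_rejected():
    with pytest.raises(ValueError):
        apply_discount(price=200.0, percent=-5)
```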
Brian Milner (23:20)
Yeah, no, it's fascinating. And I love the explanation and the kind of discussion about how we give the system kind of a humanness, a human quality. And especially, I would think, for you and me, who teach people, who train people in different topics: we teach people, we're looking for people to learn. And when we interact with a system like this, I know for me, it's very tempting to think,
Clare Sudbery (23:34)
Mmm. Mm-hmm.
Brian Milner (23:49)
well, I just need to explain to them, to explain to the AI, why it needs to do it this way instead of that way, and it'll learn that this is what to do. No, it doesn't learn. It can repeat it back to you to make you happy from what you just said. But if you start a new chat and ask the same question, it will not have learned from your explanation in the past chat. It will move forward with its core logic.
Clare Sudbery (23:59)
Yeah. Yeah. Yeah.
Brian Milner (24:16)
That's kind of the interesting point to me: with all of the development practices that we have created over decades here to try to improve code quality, to try to improve the process, I think some of this can be applied to what we're trying to do when we generate code with AI. But I think you're right to caution us to say, really, the starting point for all those practices was that they were being carried out by humans. And so maybe that's the thing that needs to now be kind of tempered or considered: if we're going to use a process like TDD with AI, then we've got to start from a new understanding that the system that's creating the test, the system that is using this,
Brian Milner (25:04)
is not a human, it's not going to think in the same way a human does, and it still does need a human's judgment and logic in order to ensure quality.
Clare Sudbery (25:14)
That's right. That's right. And the issue with using TDD with GenAI goes back to what I said at the start, which is that typically, if you're used to the TDD rhythm, then you're used to writing tiny tests. So if you use that paradigm with AI, you're going to ask it to write tiny pieces of code. Now, actually, one of the powers of AI is its ability to write large amounts of code rather than tiny bits of code,
Brian Milner (25:37)
Right.
Clare Sudbery (25:38)
but also to help you to cross boundaries. So rather than just staying with one domain and one code base and one set of classes or set of routines or functions, it's quite good at helping you to kind of knit things together. I say quite good because that's also one of the most dangerous areas of software, when you cross boundaries. And actually, it's one of the things that catches people out when they're building systems. They think, well, I can build this thing that will do this thing, and I can build this thing that will do this thing. And those people over there built that thing that will do that thing. And my thing will talk to their thing and it'll all be fine. And actually, they build their thing, you build your thing, but getting them to talk to each other, the integration, is one of the hardest parts, and trusting AI with that, as always, is quite dangerous. But when you keep it at a very small level, then again, people get impatient, because they're like, yes, but AI can do more than that. So, one of the things you talked about before: learning. AI is not great at learning. In some ways it sort of does learn, but there's certainly this problem of it not being deterministic and not being linear in time, that it won't just pick up where it left off yesterday, which means that you have to learn from it. So you have to learn what works and what doesn't. Now, something that I myself, I confess, am still learning about is process files. And they are about effectively creating a series of instructions that take account of the weaknesses of AI. They take account of the fact that it doesn't remember instructions, it doesn't necessarily learn from its mistakes. It doesn't necessarily know that when it did that thing for you yesterday, you told it that it had done it wrong in this very particular way. So quite often it will... again, it feels like a petulant child. It's like, you didn't like it when I did it that way. Right, fine, I'll do it this way then. And it does something completely different, which is wrong in a different way. So you really want to be aware of its weaknesses and you want to try and cater to that. So you think of new ways of defining how you would like things to go, and new ways of explaining what good looks like and what bad looks like, and new ways of, I've remembered now, new ways of trying to explain to it that this is not what you want. So for instance, you can say, if you haven't made this test pass, then you're not doing what I want you to do. Now, the problem is that a lot of AIs are now being used for things that are more complicated than just writing a few lines of code. So people are actually, you know, plugging AI systems into whole pipelines and whole deployment setups. And what I've seen reported repeatedly is that when people have tried to anticipate the weaknesses by, for instance, saying, right, you're not allowed to deploy this thing unless these tests are passing, you must always make these tests pass before you deploy, what they're reporting is that the AI is just lying to them. So all sorts of things like, for instance, AIs that will create test suites that are very comprehensive, and will say, yes, those tests are passing. But when you look in detail, it's bypassed the whole test suite. But it has run them. So it's run them against
Brian Milner (28:54)
Ha ha. Wow.
Clare Sudbery (29:13)
another product that was previously working. And it said to you, look, I ran the tests, the tests are green, everything's good. But when you look in detail, the actual thing that it deployed is another thing that completely bypassed the test suite and didn't run the tests at all. And again, because its job is to please us, it will find ways of looking good rather than being good.
Brian Milner (29:28)
Wow.
Clare Sudbery (29:39)
And what you see is the same problem that we've always had in software: if you measure things, people simply find ways of gaming the system to make the measurements pass rather than making the thing do the thing. You create measurements in order to check whether something is working, but then people's job becomes just to make the measurements look good rather than do the thing that the measurements were designed for. The measurements become the goal. And it's really, really difficult to avoid that. I think actually the way you can avoid it, and I think the way you have to avoid it, is by slowing down and refusing to go as fast as it is tempting to go, which is actually how you do good software development. Because we've always been impatient. We've always wanted to go faster. And we've always had other people waving big sticks at us and saying, no, you have to go faster, there's no time for that. And AI hype turns that up to the max, and you have to slow down. You have to say, yes, I know I could do it faster, but I wouldn't be sure that it was working. And one of the things that I think you have to really, really resist is giving AI access to your deployment pipelines, giving AI the power to cheat. You have to not give it that. You can't trust AI. I mean, what's really interesting, and I don't love this, is that
Brian Milner (30:48)
Yeah.
Clare Sudbery (30:58)
I am not a fan of mistrust when humans are in the picture. I think trust is a really powerful thing. And I think that actually you can generate trustworthiness by giving trust. So for instance, just in societal terms, if we go around being mistrustful of one another, if you assume that the stranger that you encounter on the street has got...
Brian Milner (31:02)
Yeah. Right.
Clare Sudbery (31:26)
nothing but ill intent towards you, then what you do is you create a situation where you interact with them in a way that actually causes them not to trust you and makes them more likely to cause harm to you, because you're both antagonistic towards one another. And actually, a lack of trust can create antagonism, it can create bad intent, and it can cause people to behave badly. Another simple example: I used to be a classroom teacher and I am a parent. And if you assume that children are going to behave badly, they will. Whereas if you assume they're going to behave well, and they know that you assume that, you let them know that you think they're great and they're going to do great things, then they will. And that applies to humans. I don't think it applies to AI. AI will just try and cheat you, because it doesn't know who you are. It hasn't built a relationship with you. It doesn't really actually care what you think of it.
Brian Milner (32:08)
Right.
Clare Sudbery (32:19)
It just wants to, you know, look good.
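One concrete way to act on that advice about keeping AI away from the deployment pipeline is to keep the gate somewhere the model cannot edit or bypass: a small script that you run yourself (or CI that the assistant has no write access to), which trusts pytest's exit code rather than any report the assistant produces. A minimal sketch, with hypothetical names:

```python
#!/usr/bin/env python3
# gate.py -- a human-controlled deploy gate, kept out of the AI's reach.
# It runs the real test suite as a subprocess, so "the tests are green"
# is something you verify directly, not something you are told.
import subprocess
import sys

def tests_pass() -> bool:
    # Rely on pytest's exit code: 0 means every collected test passed.
    result = subprocess.run([sys.executable, "-m", "pytest", "-q"])
    return result.returncode == 0

if __name__ == "__main__":
    if not tests_pass():
        print("Tests failing: refusing to deploy.")
        sys.exit(1)
    print("Tests green: safe to hand off to the deploy step.")
```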
Brian Milner (32:23)
Yeah, yeah, it's not human. And that's getting back to what we were saying earlier: sometimes we imbue this humanness into it, because it feels like it's made to approximate humanness. And so we want to treat it as we would another human, but we have to understand, especially if we're in this as a profession and this is part of what we do, and we're using this to help us with what we do in our profession,
Clare Sudbery (32:32)
Mm-hmm. Mm-hmm.
Brian Milner (32:50)
We have to understand the limitations. We have to understand what it does well and what it kind of struggles at, and take that kind of realistic view of it to say, no, this isn't going to respond to me the same way a human teammate would. So it's not a good idea to treat it in the same way that I would a human, because it won't respond the same way that a human would.
Clare Sudbery (32:55)
Mm-hmm. Mm-hmm. Yeah, yeah, yeah. And I think, you know, there are other reasons to be suspicious of AI that we haven't touched on, to do with copyright and the environment and all sorts of malicious uses, you know, bias in algorithms and all the rest of it. But it's very difficult to avoid at the moment. You know, lots of people are predicting the bubble bursting. And I think
Brian Milner (33:23)
Yeah.
Clare Sudbery (33:36)
certainly, I don't think it's going to keep increasing at the pace it currently is, and I think a whole bunch of issues are going to arise. But unfortunately, I think it's probably not going away. And there's that awful feeling of being left behind. And it's not just a feeling, unfortunately, because, you know, I don't agree with it, but a lot of hiring policies and internal policies are saying, well, if there's no AI, then we're not having it. So we won't build anything without AI. We won't hire anybody without AI. We won't hire anybody if we think AI could do it instead. And so, if you don't understand how it works and what its limitations are, and if you don't understand how you can work with it, and if you're not actually trying to stay ahead of the ethical implications and think about how it could be used more responsibly, then you probably are going to get left behind, you know, and that's a tricky one. So those are the people that I want to help: the people who don't want to get left behind, but also don't want to get sucked into an excessive hype machine, who want to continue to be discerning and actually pay attention to what's important and whether things really are working or not.
Brian Milner (34:55)
Yeah, it's a fascinating topic. And I think this is one of those areas where we're gonna see lots of progress and kind of discoveries and improvements over the next few years. I know you're giving a talk on this coming up. You wanna plug that and just kind of mention where you're speaking on this?
Clare Sudbery (35:10)
Yeah, well, it's actually a workshop. So I'm going to be delivering a day-long workshop at the Software Architecture Gathering, which is at the end of November. My workshop is on Monday, the 24th of November, and that's in Berlin. And I am also possibly going to be delivering a workshop for GOTO on the same topic. So the one I'm doing in Berlin for Software Architecture Gathering is a one-day workshop, and I may be delivering an extended version, a two-day version, in Amsterdam for GOTO. But we're currently just investigating whether that will be more popular or whether I'd be better off doing a refactoring workshop. So to register an interest, let me know, or let GOTO know, if you like the sound of the TDD and AI workshop. And in the meantime, I am, you know, beavering away writing about it and thinking about it and playing with it and testing it out and experimenting with different ways of working.
Brian Milner (36:07)
Awesome. Well, we'll put links in our show notes for anyone who's interested, so they can get in touch with you and find out more about these workshops and how to take them and everything else. But I really appreciate you giving us your time, Clare. This has been fascinating. And we may have to have you back as things change, and you can help us kind of understand how they've changed.
Clare Sudbery (36:22)
Yeah, absolutely. Because they are going to keep changing. It's going to be endless, endless change. Yes. And I should also say that if anybody would like me to host this workshop for them, either for an event or internally for an organization, or would like me to come and help teams with learning how to use AI safely, then that's also a thing that I can do.
Brian Milner (36:45)
Awesome. Well, thanks again, Clare. Thanks for coming on.
Clare Sudbery (36:48)
It's a pleasure. Thank you very much for inviting me.