Shared posts

21 Jun 04:37

Can Approval Testing and Specification by Example Work Together?

Todd Jordan

Approval testing sounds intriguing...

At XP 2013 I attended a workshop by Emily Bache in which we compared different methods of writing tests for an existing piece of legacy code. The methods were coding tests to run in a test framework such as JUnit or RSpec, using Cucumber with pre-written step definitions, and approval testing.

With an approval testing library, you do not write assertions in your test code. Instead, a test passes inputs to the functionality under test and formats the results as text. It passes the text to the approval testing library which compares it against an approved copy from a previous run. If the outputs are different from the approved copy, or this is the first time the test has run and there is no approved copy, the test fails and the user must examine the differences and decide whether to approve the latest outputs. If they do, the latest outputs then become the approved copy to be used in future tests.
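To make that workflow concrete, here is a minimal hand-rolled sketch of the approve-or-fail check (illustrative only; it is not the API of any particular approval testing library). The test renders its results as text and passes them in; the check fails whenever there is no approved copy or the text differs from it, leaving a "received" file on disk for a human to review and, if happy, promote to the new approved copy.

```java
// Minimal illustrative approval check (hand-rolled; not any specific library's API).
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ApprovalCheck {
    public static void verify(String testName, String actualOutput) throws Exception {
        Path approved = Paths.get("approvals", testName + ".approved.txt");
        Path received = Paths.get("approvals", testName + ".received.txt");
        Files.createDirectories(approved.getParent());

        boolean matches = Files.exists(approved)
                && Files.readString(approved).equals(actualOutput);

        if (!matches) {
            // First run, or the behaviour has changed: leave the latest output on disk
            // so a human can diff it against the approved copy and, if happy, promote it
            // (for example by renaming received -> approved).
            Files.writeString(received, actualOutput);
            throw new AssertionError("Output differs from approved copy; review " + received);
        }
        Files.deleteIfExists(received); // tidy up after a passing run
    }
}
```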

I was impressed by how quickly one could use approval testing to get a piece of existing code under test compared to the other two methods. It's a powerful technique for locking down behaviour when refactoring legacy code.

However, we found that it wasn't always easy to understand test failures with approval testing. We had to look in at least three places to understand what was being tested and why it might have failed: the test code, a diff between the output of the test and the currently approved copy, and the code under test.

It was worth expending some effort on how results were formatted as text to make failures easier to understand. Helpful output showed the inputs alongside the results and had explanatory text that identified the feature being tested and what each pair of inputs and outputs was testing.

To me, that begins to sound very much like Specification by Example. But instead of documentation being written up front and parsed by the tests to extract inputs and expected outputs, documents are generated by the tests and then compared against the approved copy. With a diff-friendly document format, such as Markdown, and some string templating, one could use approval testing as a low-overhead way of both testing and documenting a system. The templating would have to generate diff-friendly output too, but it might be enough to just generate Markdown tables from inputs and outputs and ensure that the grid lines are aligned.
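As an illustration of that templating (my own sketch, not something from the workshop), a test could format its input/output pairs as a Markdown table with padded columns, so that a change to any one value produces a small, readable diff; the rendered table is then what gets passed to the approval check.

```java
// Sketch: render input/output pairs as an aligned Markdown table, so that the
// grid lines stay lined up and a change to one value produces a readable diff.
import java.util.List;

public class MarkdownTable {
    public static String render(String[] headers, List<String[]> rows) {
        int[] widths = new int[headers.length];
        for (int c = 0; c < headers.length; c++) {
            widths[c] = Math.max(3, headers[c].length()); // at least three dashes in the rule row
            for (String[] row : rows) {
                widths[c] = Math.max(widths[c], row[c].length());
            }
        }
        String[] rule = new String[headers.length];
        for (int c = 0; c < rule.length; c++) {
            rule[c] = "-".repeat(widths[c]);
        }
        StringBuilder out = new StringBuilder();
        out.append(row(headers, widths)).append(row(rule, widths));
        for (String[] r : rows) {
            out.append(row(r, widths));
        }
        return out.toString();
    }

    private static String row(String[] cells, int[] widths) {
        StringBuilder sb = new StringBuilder("|");
        for (int c = 0; c < cells.length; c++) {
            sb.append(' ').append(String.format("%-" + widths[c] + "s", cells[c])).append(" |");
        }
        return sb.append('\n').toString();
    }
}
```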

Approval testing would also be better than something like Cucumber for testing numerical calculations. The final output for approval does not have to be text. For a numerical function, a test can render a graphical visualisation so one can more easily see calculation errors, such as undesirable discontinuities, than when results are displayed in tabular form.
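For example (again a sketch, with invented names), a test could render the sampled values of a function as a bare-bones SVG polyline and submit that as the output for approval; a discontinuity that would be easy to miss in a column of numbers is obvious as a jump in the line.

```java
// Sketch: plot sampled values of a numerical function as a simple SVG polyline,
// so a reviewer can eyeball discontinuities instead of scanning a table of numbers.
import java.util.Locale;
import java.util.function.DoubleUnaryOperator;

public class FunctionPlot {
    public static String toSvg(DoubleUnaryOperator f, double xMin, double xMax, int samples) {
        int width = 400, height = 200;
        double[] ys = new double[samples];
        double yMin = Double.POSITIVE_INFINITY, yMax = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < samples; i++) {
            ys[i] = f.applyAsDouble(xMin + (xMax - xMin) * i / (samples - 1));
            yMin = Math.min(yMin, ys[i]);
            yMax = Math.max(yMax, ys[i]);
        }
        double yRange = Math.max(yMax - yMin, 1e-9); // avoid division by zero for flat functions
        StringBuilder points = new StringBuilder();
        for (int i = 0; i < samples; i++) {
            double x = (double) i / (samples - 1) * width;
            double y = height - (ys[i] - yMin) / yRange * height;
            points.append(String.format(Locale.US, "%.1f,%.1f ", x, y));
        }
        return "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"" + width
                + "\" height=\"" + height + "\"><polyline fill=\"none\" stroke=\"black\" points=\""
                + points.toString().trim() + "\"/></svg>";
    }
}
```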

Approval testing might help avoid a failure-mode of Specification by Example / BDD, in which it gradually becomes a document-heavy process with an up-front analysis phase. People notice that requirements were missed or misunderstood when they first see the running system. If it has taken a long time to get that running system in front of people, a common reaction is to try and avoid the problem in the future by spending more effort on requirements analysis and writing more documents. In a workshop at XP Day last year I heard how a team that had adopted BDD now wrote so many Cucumber spec documents that they had stopped using them to automate tests and just used Cucumber's Gherkin language as a standard format for documenting requirements before starting development. But this is a vicious circle: the more time spent writing requirements documents instead of getting running code in front of users, the more critical any mistakes in those documents become, because time that could have been spent changing the software has gone into analysis and documentation instead. Reacting to mistakes by spending even more time writing requirements documents only makes the problem worse.

Agile software development takes the opposite approach: relentlessly increase the rate at which you can put running software in front of users and get feedback about how they really use it. The end result, if you push it hard enough, is continuous delivery and, for online applications with large userbases, the lean startup approach. For bespoke applications, the end result is developers and users working together to change the software, the programmer changing the system during conversations with the user about what it should do. When software can be changed that fast, writing Specification by Example documents for new features feels like an unhelpful overhead. But the documentation and regression tests that you get from Specification by Example are helpful as the system is evolved.

Maybe combining approval testing with Specification by Example would help. Rapidly change the code until it does what is desired, then lock down its behaviour with an approval test that generates easily diffable documentation and pass that documentation through a publishing pipeline as part of the build. For example, the Markdown format and pandoc tool could work well for this.

Of course, such a tool would not preclude following a normal acceptance-test driven process for changes that require more time to implement. One can write the golden copy document for the approval test by hand and it will then act like a normal specification by example document in an acceptance-test driven process, reporting where the system does not yet perform the desired behaviour.

Image from VistaICO Toolbar Icons, used under the Creative Commons Attribution 3.0 Unported license.

19 Jun 05:41

"Start Each feature with an Acceptance Test" => end-to-end ?

by michael.azer...@gmail.com (Michael Azerhad)
Todd Jordan

Excellent discussion on Acceptance Tests - whether they should be driven through the UI

Hello,
I'm confused about page 39 of the book, which states:
"We write the acceptance test using only terminology from the application's domain, not from the underlying technologies (such as databases or web servers)".
Does it mean that, after writing our very first test, the Walking Skeleton, being end-to-end, we start writing each acceptance feature with mocking

14 Jun 01:00

May 2013 edition of ThoughtWorks Technology Radar

Todd Jordan

Interesting to read and see which up-and-coming technologies are gaining steam, as well as which are being phased out.

The ThoughtWorks Technology Advisory Board (TAB) has released the latest edition of our technology radar. This is where we highlight some of the technologies that are currently attracting our attention and that we feel are worth taking a look at. In this edition our themes include my long-term interest in breaking down boundaries between people and groups, lightweight options for analytics, infrastructure as code, and applying the practices that have worked well for us in development to places that are missing them.

18 May 19:19

Straw Man TDD

by jason@parlezuml.com (Jason Gorman)
Todd Jordan

Debunking some common misconceptions... good read

A lot of the criticisms of Test-driven Development I hear are really attacks on a mythical version of TDD that no right-minded advocate ever put forward.

Nevertheless, being a TDD trainer and coach, I do still devote time to answering these straw man criticisms and objections. I thought it would be useful to collect some of the most common misconceptions in one place that I can point people to when I'm just too tired and/or drunk to answer them any more.

1. TDD means not doing any up-front thinking about design

Nobody has ever suggested this. It would be madness. Read books like Extreme Programming Explained again. You'll see sketches. You'll see CRC cards. You'll even see UML. (Gasp!)

The question really is about how much up-front design is sufficient. And the somewhat glib answer is "just enough". I tend to qualify that as "just enough to know what tests you need to pass". So, if your approach is focused on roles, responsibilities and interactions, then I'd want to have a high-level idea of what those are before diving into code. If it's more an algorithmic focus, I'd want to have a test list that can act as a roadmap for key examples that - taken together - explain the algorithm. And so on.

I'd stop at the point where I'm asking questions that are best answered in code (e.g., is this an interface? Should this method be exposed? etc.). Code is for details.

2. TDD takes significantly longer because you write twice as much code

Once you've got the hang of TDD - and that can take months of practice - we find it doesn't take significantly longer. Mostly because the bulk of our time isn't spent typing, it's spent thinking and, when we don't take care, fixing problems. Fixing problems, we find, generally takes more time than avoiding them. So much so, in fact, that working in the very short feedback loops of TDD and testing thoroughly as we go can turn out to be a way of saving time.

Most developers and teams who report a loss of productivity when they try TDD are actually reporting the learning curve. Which can be steep. This is why it can make good commercial sense to seek help in those early stages from someone who's been there, done that and got the t-shirt.

3. TDD leads to mountains of test code that make it harder to change your source code

There are three key steps in TDD, but most developers miss out or skimp on the third one - refactoring. So, when they report that they tried TDD for a few months, but found after a while that they couldn't change their source code without breaking loads of unit tests, I'm inclined to believe that this is what's really happened.

Test code is source code. If the test code is difficult to change, your code is difficult to change. So we must apply as much effort to the maintainability of test code as to the code it's testing. It must be easy to read and understand. It must be as simple as we can make it. It must be low in duplication. And, very importantly, it must be loosely coupled to the interfaces of the objects it's testing.

Think of UI testing. Maybe we wrote thousands of lines of scripts that click buttons and populate text boxes and all that sort of thing, binding our UI tests very closely to the implementation of the UI itself. So if we want to change the UI design - and we will - a whole bunch of dependent tests break.

Better to refactor our UI test scripts so that interactions with the concrete UI are encapsulated in one place and invoked through meaningfully-named helper functions, so we can write test scripts in the abstract (e.g., submitMortgageApplication() instead of submitButton.click()).
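By way of illustration, here is a minimal sketch of that idea (the class name, field names and Ui interface are invented for the example, not taken from any particular UI automation tool): the test calls an intention-revealing method, and only the driver knows which widgets to poke.

```java
// Illustrative only: a thin driver that hides concrete widget interactions behind an
// intention-revealing method, so tests read in domain terms and only this class has
// to change when the UI layout changes. The Ui interface stands in for whatever
// UI automation API the project really uses.
public class MortgageApplicationDriver {

    public interface Ui {
        void fillField(String fieldName, String value);
        void click(String controlName);
    }

    private final Ui ui;

    public MortgageApplicationDriver(Ui ui) {
        this.ui = ui;
    }

    public void submitMortgageApplication(String applicantName, String loanAmount) {
        ui.fillField("applicantName", applicantName);
        ui.fillField("loanAmount", loanAmount);
        ui.click("submitButton");
    }
}
```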

The same applies to unit tests. If we repeatedly invoke the same methods on an object in our tests, better to encapsulate those interactions behind abstract and meaningful interfaces so it all happens in one place only.

4. TDD does not guarantee bug-free code

This isn't a straw man, per se. But to say that "we don't bother doing X because X is not completely perfect" isn't much of an argument against doing X when no approach guarantees perfection. When people throw this one at me, I'm naturally keen to see their bug-free code.

Let's face it, the vast majority of teams who don't do TDD would benefit from doing something like TDD. They'd benefit from working towards more explicit, testable outcomes. They'd benefit from shorter and less subjective feedback loops. They'd benefit from continuous refactoring. They'd benefit from fast, cheap regression testing. Their software would be more reliable and easier to maintain, and - once they've worked their way up the learning curve - it won't cost them more to achieve those better results. There are, of course, other approaches than TDD that can achieve these things. But, by Jiminy, they don't half feel like TDD when you're doing them (which I have).

UPDATE

5. You are not designing domain abstractions, you are designing tests

This is a new addition to the fold, courtesy of some chap on That Twitter who obviously thinks I don't know one end of a domain model from a horse's backside.

Now, I've spent a fair chunk of my career modeling businesses - back in the good old days of "enterprise architecture", when that was where the big bucks were. So I do know a thing or two about this.

What I know is that those domain abstractions have to come from somewhere. How do we know we need a customer, and that a customer might have both a billing address and a shipping address, which may be the same address, and that a customer may be a person or a company?

We know it because we see examples that require it to be so. If we don't see examples on which these generalisations are based, then our domain model is pure conjecture based on what we think the world our systems are modeling might look like (probably). I design software to be used, and it has been considered a good idea to drive the design from examples of usage for longer than I've been alive. Even when we're not designing software, but simply modeling the domain in order to understand it - perhaps to improve the way our business works - it works best when we explore with examples and generalise as we go. In TDD, we call this "triangulation".

I will very often sketch out the concepts that play a part in a collection of scenarios - or examples - and create a generalised model that satisfies them all as a basis for the tests I'm about to write. (See Straw Man #1, of which this is just another example.)

When we generalise without exploring examples, we tend to find our domain models suffer from a smell we call "Speculative Generality". We can end up with unnecessarily complex models that often turn out not to be what's needed to satisfy the needs of end users.

Good user-centred software design is a process of discovery. We don't magic these abstractions and generalisations out of thin air. We discover the need for them. At its very essence, that's what TDD is. I can't think of a single mainstream software development method of the last few decades that wasn't driven by usage scenarios or examples. There's a very good reason for that. To just go off and "model the domain" is a fool's errand. Model for a purpose, and that purpose comes first.

If you practice TDD, but don't think about the domain and the design up-front, then you're doing TDD wrong. It's highly recommended you think ahead. Just as long as you don't code ahead.

UPDATE #2

6. TDD doesn't work for the User Interface

Let's backtrack a little. Remember those good old days, about 10 minutes ago, when I told you that you should decouple your test code from the interfaces that it tests?

Those were the days. David Cameron was Prime Minister, and you could buy a pint of beer for under £4.

Anyhoo, it turns out - as if by magic - that it's not such a bad idea to decouple the logic of user interactions from the specific UI implementation in the architecture of your software. That is to say, your knobs and widgets in the UI should do - to use the scientific parlance - "f**k all" as regards the logic of your application.

The workflow of user interactions exists independent of whether that workflow is through a Java desktop application or an iOS smartphone app.

A tiny sliver of code is needed to glue the logical user experience to the physical user experience. If more than 5% of your code is dependent on the UI framework you're using, you're very probably doing it wrong.

And for that last 5%... well, you'd be surprised at how testable it really is. It may take some ingenuity, but it's often more do-able than you think.

Take web apps: all it takes is a fake HTTP context, and we've got ourselves 100% coverage. (Whatever that means.) Java Swing is equally get-at-able. As are .NET desktop GUIs. You just have to know where to stick your wotsit.
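As a rough sketch of that idea (the handler, request/response types, paths and parameter names are invented for illustration; real web frameworks have their own test harnesses): if the application logic is written against plain request and response values, a test can construct the HTTP context by hand and call the handler directly, with no server running.

```java
// Sketch: application logic written against plain request/response values, so a test
// can construct a fake HTTP context directly and call the handler with no server running.
import java.util.Map;

public class GreetingHandler {

    public record Request(String path, Map<String, String> parameters) {}
    public record Response(int status, String body) {}

    public Response handle(Request request) {
        String name = request.parameters().getOrDefault("name", "world");
        return new Response(200, "Hello, " + name + "!");
    }

    // A test in miniature: fake request in, check on the response out.
    public static void main(String[] args) {
        Response response = new GreetingHandler()
                .handle(new Request("/greet", Map.of("name", "Alice")));
        if (response.status() != 200 || !"Hello, Alice!".equals(response.body())) {
            throw new AssertionError("Unexpected response: " + response);
        }
    }
}
```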

UPDATE #3

7. TDD Only Works On Simple, Toy Examples

I hear this surprisingly often. People say "It's all fine and dandy if you're doing FizzBuzz, but can you name a single real-world application of any appreciable size that has been written using TDD?"

I can think of a few dozen I've been directly involved with, plus hundreds more I've coached developers on. The FREESAT version of the BBC's iPlayer was originally written using TDD, for example. Significant parts of the Eclipse platform have been written using TDD. A fair amount of the new Visual Studio was written using TDD (it may surprise you to learn.) They use it at Facebook. They use it at Google. They use it at Amazon. Not across the board, of course. But when did any large software organisation apply any practice consistently?

I've personally used it on everything from slimline desktop gazelles to major server-side mammoths.

I've seen many clients successfully apply TDD on real projects of all shapes and sizes, from a few hundred lines of code to a few million.

To be honest, I'm not sure why - after all these years and all the war stories - people still think that TDD only exists in the laboratory. For sure, it's been documented many times in the wild. It may relate, in some way, to the first myth - that up-front thinking about design is not allowed in TDD. There's an accusation that TDD is somehow "anti-architecture". And, for certain, if you do no thinking up-front - and as you go along - about the design then that will be the case.

Also, too many developers have weak refactoring muscles. As the need for generalisations and abstractions emerges with each new test case, developers may lack the refactoring skills to reshape the design to be optimal for the code at that time. Instead, they let entropy do its worst.

Yes; it turns out that refactoring is really rather important after all.

Chris Pitts (@thirstybear) points out that TDD-ing FizzBuzz can reveal that there's more to a simple example like that than some programmers may have realised. I completely agree. Even an algorithm that simple can have several things go wrong in the implementation.

This is another important issue: TDD, done well, can help us to focus in on those "minor details" that we tend to skim over, or miss completely, when we test after the fact. Be honest now: if you wrote the FizzBuzz code and then thought about writing some tests for it, would you have been as rigorous? Would you have picked out that many potential failing test cases? I've measured teams on this, and testing after the fact has a very noticeable tendency to miss more test cases.

When we discipline ourselves to only write production code when we have a failing test that requires it, we tend to end up with every inch of our code being meaningfully tested. Hence, TDD'd code tends to have much higher assurance, and be much more reliable. It's not the reason for doing TDD, but it's one heck of a fringe benefit.

UPDATE #4

8. I'm A Special Case. TDD Won't Work For Me

Got reminded of this one today by a Twitter debate I seem to have got caught in the crossfire of, where someone claimed "I'd do TDD, but I work on Android", or words to that effect.

The basic form of the argument goes "TDD might work for you, but my case is special, so it won't work for me."

It's either that it's not technically feasible in your "special case", or that there would be no benefit to doing it in your "special case".

Let's face it, we're all special cases, right? We could all dismiss everything that seems to work for other developers by simply playing that card. "Oh, I'd use source control but I'm in Belgium", "I'd get everyone around a whiteboard and do a bit of brainstorming about the architecture, but we're at a high altitude here" or "TDD's fine if you're not working on Android like I am"

It's a given that not every practice is consistently applied, and not every practice makes sense in all situations. But some are very widely applicable. Name me a project or product you've worked on in your career where it would not have been a good idea to version-control the code?

TDD can work in a wide range of situations, if you know how. And the vast majority of teams would benefit from doing it, if they can.

That's not to say that TDD is the only way. There are other ways of producing clean, elegant and reliable code that's easier to change.

But I don't hear teams saying "We'd do TDD, but we're already practicing Design By Contract, so we don't really need to" or "We're running model checkers on our code, so TDD probably wouldn't add much value". (And, before anyone dares to say it, if you're doing BDD, you're doing a kind of TDD.)

Most teams would benefit from doing something like TDD in most situations. Undoubtedly there are special cases, like when the software is safety-critical, when TDD might not be sufficient (although there are those from that background who are thinking perhaps it might be, with some tweaks and an extra dollop of discipline), or when it might be overkill (though I've not come across many situations where even some loosely applied TDD wouldn't have helped, even on spikes).

I suspect there's an element of Special Pleading in many of these instances, and also people extrapolating from the Straw Men of TDD (e.g., "I've heard that it takes longer, and we haven't got the time", "I've heard that it's very difficult to do GUIs, and we have lots of GUIs", etc.).


If you'd like to see a few other TDD myths debunked, while getting some hands-on practice in an intensive and fun workshop, join us in London on July 13th.





10 Apr 14:18

[CLOSED] YTD and Elegant Themes Giveaway

by Cadence Wu

Hello creatives! This week, YTD has partnered with Elegant Themes to give away three (3) developer accounts that give you complete access to their awesome collection of WordPress Themes! All of these themes are equipped with a user-friendly control panel that allows you to take control of their features without touching a line of code. Moreover, the developer account gives you access to all plugins and layered Photoshop files for your website needs. So if you are planning to redesign your WordPress website or create a new one, this giveaway contest might just be for you!

YTD and Elegant Themes Giveaway

How To Join:

1. Like YTD on Facebook or Follow YTD on Twitter.

2. Tweet or Share this article.

3. Copy and post your Tweet or Share link below this article as a comment.

This giveaway contest will run from April 10 – 13, 2013 PST. Three winners will be chosen randomly and will be announced next week. Good luck everyone!

UPDATE: The giveaway has ended and below are the 3 winners of developer accounts from Elegant Themes!

Joan Bagge
Natosha Benning
Adam Majchrzak 

Congratulations! We will be contacting you soon for your accounts! And a big thank you to all those who participated! :)


09 Apr 19:15

We are Principled: 6th Edition

by eric smith
Todd Jordan

I hope someday to write this about my company.

We take responsibility for the correctness of our code by testing it thoroughly.
    We do not tolerate preventable defects. —from the 8th Light Principles of Well-Crafted Software

Bugs. Defects. Errors. Whatever you want to call them, they are the scourge of our industry, and they will continue to be so until we take them seriously.

"But Eric!" you say, "We already take our defects seriously! We categorize them, prioritize them, describe them. We have a huuuge database and a gigantic QA department. How can you say we don't take our defects seriously?"

Simple: If you took your defects seriously you would fix them!

Think of all the things I just mentioned: huge QA departments running manual test scripts, bug tracking databases, bug prioritization meetings, and so on. They all exist because, as an industry, we have accepted the premise that software will have a huge number of defects. So many that we can't possibly fix all of them. Instead we track them, and fix only the "important" ones.

At 8th Light we reject this premise. It is not that we believe we don't make mistakes, but we work as hard as we can to prevent them and take responsibility for fixing them.

We Test Thoroughly

We practice Test Driven Development and it's not negotiable. If you want to "save time" by eliminating this, you'll get a straightforward answer.

No.

Any 8th Light team will immediately begin writing tests first, and will write tests for any untested code they encounter. But TDD alone will not produce a quality product.

Each team will add higher-level testing suitable to the context of the project. We may use Fitnesse, Cucumber, Selenium and others. We'll write integration tests as necessary. When we find a defect we will write a test to make sure it never happens again.

Finally we will manually test, and in a special way. In the third edition of this series Paul detailed our demo process. He mentioned that the demo is practiced, which is something we developed on one of our earliest projects. There are many reasons to practice the demo, but one of the biggest benefits is catching missed requirements. I can't count the number of times one team member would demonstrate a feature only to realize they had left something out or made a simple mistake, usually before the person they were demonstrating to could even catch it. The simple act of walking through the feature slowly proved immensely beneficial.

We Take Responsibility

Have you ever been in a meeting, or meetings, trying to determine the root cause of an expensive defect? Development, QA and management are all represented, and then somebody will ask QA:

"So how come YOU didn't catch this bug?"

It's said with an accusatory emphasis on YOU because, after all, it's QA's job to catch bugs, and if they aren't catching them, then what are they doing? The poor QA lead fumbles around, mentions they'll add a test for it so it won't happen next time, and development goes on doing the same thing they've always done.

Of course the question should be:

"Why did you write that bug?" or better "Why did you ALL miss this bug?"

And the one and only correct answer to this question is really quite simple:

"We screwed up."

You may substitute your own colorful phrasing if you like, but the point is the same. If a mistake was made it's because development made it, not because QA didn't see it, and ideally it's more than one person on the development team. 8th Light backs up this responsibility in the clearest way possible: financially. On the craftsmanship page of 8th Light's website you'll find the following quote:

If we make a mistake, we'll fix it quickly—for free. It's that simple.

I've told other people we do this. I've told them stories about us working for free for days or weeks at a time while we correct large errors. Usually they claim we're idiots, that we'll lose all our money fixing defects for free, because they accept the premise that large numbers of defects are inevitable.

As evidence to the contrary I point out that we've had this policy for as long as I've been an employee, and in that time we've grown from 5 employees to 40. It seems to be working out just fine.

We Do Not Tolerate Preventable Defects

Why do we "manage" defects? The Wikipedia entry comparing issue-tracking systems has 53 entries in it. Do you know what the best issue tracking system is? A piece of paper, with the defects written on it. Cross them off as you fix them. The end. Why the complications of classification, prioritization, assignment and all that?

It only makes sense if we aren't going to fix them all. We're going to tolerate many of them, but not "too" many, so we have complicated systems to track them all. I find the notion offensive both to my client and to the end user.

A simple todo list works just fine if you actually fix the defects. On my projects I've found the best system is to stick defects in the next iteration. Sometimes you can fix them all in a given week, sometimes you can't, and if you find you don't regularly get to 0 known defects then you've got to bite the bullet and do nothing but fix defects for a while. Yes, this will delay features shipping occasionally, but working in nearly bug-free code will mean you work faster the overwhelming majority of the time.

I've worked on products with hundreds, even thousands of open defects. These defects were usually tolerated because individually they were small, but collectively they made for extremely slow development and angry customers. When you've reached that number of defects you really only have one option: a complete rewrite. I assure you that's far more expensive than occasionally moving out a ship date by a week or two to fix bugs.

Real World

Whenever I get on my soapbox about defects people start giving me reasons why it can't work.

  • "You'd never ship!"
  • "What about performance?" or "memory?" or "a feature they didn't like but works?"
  • "Your customers will take advantage of you!"

The truth is most of the time these problems take care of themselves. We can argue all day on the edges of it, in particular what it means for something to be a bug, and someday I may write about it, but my answer now is to say I work for a company that fixes every defect. And it works.

25 Mar 20:00

Evaluating the Odds for Florida Gulf Coast and the Rest of the Final 16 Teams

by NATE SILVER
Todd Jordan

If Nate Silver's predictions on NCAA basketball are anything like his predictions for US president, my bracket is looking good with a Florida vs Louisville championship matchup.

Ten of the 11 teams our model initially gave the greatest chance of winning the tournament have survived to the Round of 16. But their odds have changed based on their margins of victory so far and the teams they are most likely to face.