Unit Tests: Boldly Crossing Boundaries and Gently Breaking Rules

Tuesday July 29thGuest posts, Software Development Category

For the first time ever, OJ’s rants has a guest blogger! Long term friend and highly-respected geek, RobG has put together an interesting piece on something that’s close to the hearts of most Geeks - Unit testing. This is his first post, and I hope it won’t be the last.

Without further ado, here’s Rob!

Firstly, a bit of context is required: I recently read Phil’s post regarding Unit Test Boundaries, and wrote a decent comment there asking him a question or two that were never answered by him (or anyone else for that matter). Similar questions were also being asked by other users also left unanswered, and it seems that my follow-up comment there (meaning to answer some of those) was too long to be a comment, directly resulting in this article instead!

It’s highly likely that this article won’t make much sense without reading the original post first (so please do that), followed by Chris’ comment (question), which I’ve copied below in case it disappears for some reason:

Chris B.
Alright…clearly there are a lot of sharp people here, so I’d like to ask everyone’s opinion for the utterly perfect testing scenario for a couple of simple database operations.

I need to test whether a customer which exists loads successfully from the database, loaded by a unique id. Do I preload the database in a setup script for the test harness, or do I have an entire db filled with test data from a backup, or what?

I need to test whether a customer is saved successfully to the database. When I’m done, what’s the best way to clean up? Do I abort a transaction context, or throw away the entire db (I’ve done that before), or what? What’s the best way to verify that the customer is saved successfully? Direct queries or what?

I’m amenable to the theoretical argument that this article makes, but it seems that pragmatically the kind test we’re saying you shouldn’t write is exactly the test that is most likely to reveal problems (in my experience). I’d love to hear some other ideas.

Just like Chris, I’m all for the theory, but as he pointed out, pragmatically, it’s a lot more like wading through thick syrup. He asked for some guidance, so I thought I’d tell you what I usually do, and what has worked for me. It may or may not work for you, but at least you get to make that call: If you’re a TDD practitioner, you’ll have hit the crux of the problem that we all end up facing eventually with unit testing: You’ve had to make a conscious choice to let your unit test reach into actual (not mock) data-land, because no matter how much you try and persuade yourself, you don’t actually believe that your database will in fact return that customer record you’re asking for until you’ve been able to test it for real – and I mean really real!

So what I’ve done to solve this is setup a new test project for these (crossing the line) tests. This is so that you can run them independently of the faster unit tests. I call them Datastore or Repository tests or something of that nature. Then you have problem number two hit you in the face like a frying pan - How can I make these tests repeatable? I know that if I add a new customer to the database in one unit test, I’m going to need some way of restoring the database to its previous clean state right? I can’t have “TestCustomer1″ running around in the database after I’m done with him, so I need to get rid of all that test data somehow.

I’ve heard various strategies suggested in the past - some of which involve dropping and recreating the database each time. I can’t help thinking that’s like taking out a sledgehammer to smash an ant, but it definitely still works. I’ve also tried other options where you make sure any test data you insert has specific identifying properties like negative ID values (won’t help if you’re using GUIDs though), or prefix/suffix specific column values with something like “test_” or “!test” and running a cleanup script at the end that clears down any lagging data. This has worked pretty well in the past, and if you stick to the rules, the cleanup script can require very little maintenance and can be quite dynamic - but still, I’ve always wanted a more elegant solution.

Finally, it hit me. It’s the solution I’ve been using for a few projects now, and it has worked very well. Only one problem - you have to break another cardinal rule of unit testing. Hrmm…OK I hear you say, what the hell, we’ve already crossed the line, let’s see if the ogre flinches when we spit in its eye?

The rule we’re breaking now is that of only testing ONE thing in a unit test. Well, I don’t like to think of it as breaking - more like gently taking apart - that implies you’re being very careful about when you do it - and you’ve thought it through and not done it by accident.

The unit test name gives it away completely:

ShouldCreateAndDeleteACustomer()

At this point, I can see many people wince violently while sucking air quickly through their teeth – but let me explain. It stands to reason that you’re going to implement the basic CRUD functions into your data strategy right? So why not exercise a couple of them at a time? Well, this is where my own warning bells go off! It’s a bit like walking the tight-rope, it’s dangerous, but if you’re careful and develop enough skill, you can get to the other side without doing any damage.

Now that you’ve gotten over the initial shock, let me explain what the test does. For this one, you’re going to exercise the Create, Retrieve and Delete functions in CRUD, and all of this can be done in a TDD fashion. First you exercise the code that creates the customer:

Assert.True(customerRepository.SaveCustomer(customer));

This line of code will fail (RED) because you haven’t written your data access code yet right…right?! I use LinqToSql, but ultimately you’ve just got to write enough code (SPROC or otherwise) that adds the record to the table (GREEN) so you can verify it with your eyes with a Query Analyser of sorts - then delete the row manually.

Remember, you’re still (hopefully) programming against an interface (ICustomerRepository for example), so it’s the same call that you would make from a Mock (Moq rocks!), and you can still use your favourite IoC Container (mine is StructureMap) for dependency injection - but instead, this time you’re hitting the actual database. My SaveCustomer() methods usually handle Create as well as Update depending on the ID field (I normally use GUIDs), but I’m trying to keep this simple.

Next you exercise the retrieval code:

Assert.NotNull(customerRepository.GetCustomers.WithId(customer.ID));

In case you’re wondering, WithId is an extention method to IQueryable because LinqToSql utilises deferred execution, and WithId triggers the actual DB call.
Again, this will fail (RED) because you haven’t written your data access code yet right?! You write the code (as I said, I use LinqToSql, but you can use a call to a SPROC if you like), and viola it’s (GREEN), but the record is still in the database - so you delete it manually.

Finally, you exercise the code that deletes the record:

Assert.True(customerRepository.DeleteCustomer.WithId(customer.ID));

Once again, this will fail (RED) because you haven’t written the code yet right?! (Is this getting old yet?). So you write the code to delete the record and the test goes (GREEN).

If you’re still a bit sceptical about your delete operation, you can run the retrieval code again and this time Assert.Null instead.

Now what you’ve done in summary is test your Create, Retrieve, Delete and Retrieve (failure) cases in one test. The downside is that you’ve tested more than one operation in a single test, but the upside is you’ve left the database perfectly neat and tidy. With a bit of extra work, you can add a few more calls (hey…why stop there?) to include the “Update” function from CRUD, and what you’ll have is a neat unit test (10 to 15 lines of code) that exercises the entire “Customer” entity in your repository, and leaves no traces behind.

Now for a word of warning - try and make sure that, even if you can’t obey the rules of unit testing, that you at least obey the spirit of unit testing. By that, I mean keep the footprint of your test small and specific. If you’re working with the Customer entity - stick to that. If you’re working with the Order entity, stick to that. Don’t go writing tests that start messing with joins between the two. THAT’S WHAT YOUR “PROPER” unit tests are for - testing behaviour should not come into these - or else you’re heading for a world of pain. Your mocks and interfaces will take care of behaviour. These data tests are purely there to prove that your data methods do what they say they do - for real!

I hope that helps Chris and any other people looking for elegant solutions to sticky problems. If it does, I might consider blogging some more of the pragmatic techniques that have worked for me and my teams in follow up posts.

That’s it from me for now – over and out!

1 Trackbacks/Pingbacks

  1. Pingback: Pages tagged "decent" on July 29, 2008

14 Comments

  1. Rhys Parry
    July 29, 2008

    Exciting stuff. I recently had to write automated functional tests that checked that a tool was manipulating the database correctly. I took the approach of using very specific values that could be excised in the cleanup for my test.

    The problem I see with this particular method is when one of these steps (say the update one) fails (maybe due to some refactoring). The delete operation is not performed and the test leaves data in your database.

    One possible workaround is to wrap it all in a try…finally block to ensure the delete operation is performed. This is certainly an improvement (particularly if you can\’t just use a more generic cleanup method), but relies on the program you are testing to do the cleanup for you.

    Having separate independent code to do the initialise/cleanup steps can help here (although it probably pushes those boundaries even further).

    Nevertheless, the test as it stands will certainly fall over if something breaks, but before the fix operation can begin, those extra rows of test data will need to be manually deleted.

  2. RobG
    July 29, 2008

    You’re absolutely right Rhys, there are a number of considerations to take into account to make sure that no data is left in - even when the test fails as it should because something changes. For the sake of brevity, I left those extra bits out, but like you, depending on the situation, I either use a try -> finally, or I combine the approach in the article with a cleanup script at the end that targets specific “test_data”.

    Sometimes though, in a few situations, I’ve opted to “do nothing”, for the very fact that data left behind has sometimes helped me debug the very issues that caused the test to break in the first place. Either way, it’s your code - and you need to maintain it, so make the choices that fit you best in the circumstances. Like I said - it’s a tricky tight-rope walk :-)

    Cheers,
    Rob

  3. OJ
    July 29, 2008

    The use of the try/finally would most likely cater for the case where your system doesn’t do physical deletes, it just does logical deletes.

    For the most part you’re not going to be running your unit tests on a production system anyway. If you did, you’d no doubt refresh the DB before allowing it to go live.

  4. RobG
    July 29, 2008

    @OJ Yes that does help with proper cleanup of logical deletes - there are a number of ways to skin that particular kitty.

    What exactly do you mean: “For the most part ” ?!? :-P

  5. sparky
    July 29, 2008

    Sometimes you need to check that things come out of persistence layers as you expect them to, or make it through business layers as expected. Mocking and stubs are great tools, but why can’t you also use NUnit for some integration testing?

    One thing that I have done for this scenario is to have the test fixture class have a helper class that is a private member of the test fixture. The helper class knows how to delete items of type x, and anything that references those items. The helper class is responsable for cleaning up after the created db rows. Preferably the cleanup code goes directly to the database.

    CustomerHelper {
     List<int> ids;
     public void AddToDeleteList(int value) { _ids.Add(value) };
     public void Cleanup() { 
     if( !ids.Any() ) return;
     
     // execute some sql that uses an IN clause. execute all the sql at once.
     string sql = string.Format( "delete from tablex where tablexid in {0}",
                          ids.ToCsv() );
     
     try { db.Execute(sql); } catch ...
     ids.Clear();
    }

    The testfixture then holds reference to the helper that it needs, and executes the helper in the test fixture teardown. Don’t put these inside the test method itself, because the test can fail, and we want to ensure the db will be cleaned up if at all possible. By adding the helper to the test fixture level, you can not only have simple cruds, but any number of test methods can add to the delete list, content that the new object will be deleted.

    private CustomerHelper _customerHelper;

    Inside the test method, immediately after the first update() you store the id for later deletion.

    _customerHelper.AddToDeleteList(newId);

    You can definitely make this more robust, but that’s the essence of it all. One of the other reasons this is done is so that certain configurations can be constructed in the db and later cleaned up. No more relying that test code ‘knows’ that ID 1234 is a Manager, and ID 3456 is a graduate employee. Your test creates entities in the database and cleans them up for you.

    Also, typically, you assign a specific category to the test (ie DBIntegration) and only run those daily, because they do take a long time to run, but a daily test by Cruise is better than no run at all.

    * I’m tempted by IDisposable for the helper class, but can I guarantee that the db is still legitimate when IDisposable is called? Maybe IDisposable as a tracking method, then at least you know if the teardown never happened and can log that somewhere.

  6. OJ
    July 29, 2008

    Great comment Sparky. I hope you don’t mind but I modified it to include the code tags so that it was a bit easier to read. :)

    One of the issues that I have with allowing unit tests to grow into integration tests is that it then becomes a “standard”. Over time, other devs get wind of this and no longer make sure that their tests actually remain meaningful at the unit level. In short, all the unit tests become integration tests. Yes this is more of a discipline thing than anything else, but it does happen. If you go down this path you also have the issue of multiple tests covering the same functionality, which results in more maintenance of your tests when things change. It’s bad enough that you have to maintain the tests when the code changes, but having to do it across multiple tests is just horrid.

    Yup, I’m whinging about relatively minor stuff. But at the end of the day, the cost of maintenance should be kept low and having unit test code attempt to cover more than the unit level increases that cost.

    Thanks for the comment :)

  7. Hey Rob,

    Nice post. I couldn’t agree more on your points on the posts on here and the other blog. It is sometimes a lot easier to stay in the world of theory rather than down at the coal face, at the ‘business end’ as I call it.

    Had this problem myself before and saw a solution implemented by a team I joined for this problem in the System.Transactions namespace.

    Basically I think that a TransactionScope was setup on TestFixtureSetup, all the data calls were enlisted into that and then the TransactionScope was rolled back on TestFixtureTearDown, this gave the testing a safe kind of ’sandbox’ to play in.

    So this could give a solution to the whole ‘what if one fails problem’, as it is in one big trannie.

    Whether this is compatible with linq I am not too sure though.

    Interested to hear your thoughts on this.

  8. OJ
    August 7, 2008

    Long time no see H :)

    I don’t really have too much of a problem with this approach, except for cases where you might end up with nested transactions. It does tend to slow your unit tests down too.

    I can’t see why it wouldn’t be compatible with Linq? :)

  9. Rob G
    August 7, 2008

    Hey H - nice to see you still care :-)

    You’re right about your solution being feasible, and yes it’s entirely feasible with LINQ too, but my personal approach is to keep it as simple as possible, and writing transaction code feels a bit too much like writing code that ONLY satisfies/helps a unit test and has no other real value.

    The way I see it, (as in my example in the article) is that if I’m going to write the code to create, and the code to delete - I may as well use them together in the unit test since I’m already breaking some unit testing rules, in which case - I’ve written no extra code that is purely for the unit test.

    In addition, I don’t use Setup or TearDown anymore because I rather keep the production code DRY, but my tests MOIST for ease of maintenance…

    I can write some new articles about all those concepts too - including what I do when it’s not a PHYSICAL delete, but rather a LOGICAL delete. Doesn’t seem there’s much interest there though - but we’ll see.

    Cheers,
    Rob

  10. Hey you guys,

    You know I am big fans of you both, its nice for us to debate like this. Reminds me of the good ol’ days :-)

    Rob, I can totally see the merits of adding all the CRUD operations in one test, if you use this kind of tranny method it may save you having to hand pick the DB clean if a test fails though.

    Maybe this bit of pseudo code would explain the solution I proposed a bit better, I don’t think I gave it a good enough of an attempt last time. I don’t think it is touched by the don’t use Setup or Teardown rules.

    using System.Transactions;
    using NUnit.Framework;
    using System.Data.SqlClient;
     
    public class TestWithTransaction
    {
      [TestFixtureSetUp]
      public void TestSetup ()
      {
        Transaction.Current = new Transaction();
      }
     
      [TestFixtureTearDown]
      public void TearDown()
      {
        Transaction.Current.Rollback();
      }
    }
     
    [TestFixture]
    public class NormalTest: TestWithTransaction
    {
      [Test]
      public void AtoB ()
      {
        CustomerRepository.DoSomething();
      }
    }
     
    public class CustomerRepository
    {
      public static void DoSomething()
      {
        //So if we do not have a current
        //distributed Tranny we can start one.
        if (Transaction.Current == null)
          Transaction.Current == new Transaction()
     
        // now we will have a transaction local
        //to here or a distributed one with several
        //command tied up in it.
        SqlConnection connection = new SqlConnection("") 
        connection.EnlistDistributedTransaction(
        Transaction.Current);
      }
    }

    Never knew you had a blog Rob, send me the url, always interested to hear what you are thinking.

    Oj, Yeah totally agree with your point about having stuff like nested trannies and it degrading test performance. If you have this kind of setup above you can dig yourself that hole quite easily.

    I think it is pretty useful if you are doing some vanilla Db work and want a nice sandbox for your tests to play in.

    One problem with this method is though that if you do not have you data access object setup like this for handling transactions then you will have to change your underlying code to support this which breaks a few rules. And therefore that is where Robs solution of all the CRUD in one test makes perfect sense.

  11. Jon Limjap
    August 26, 2008

    MbUnit already implements a Rollback attribute, so any persistence unit tests can be run without explicit transaction handling in SetUps and TearDowns.

    Quite nifty, although it’s gonna mess a lot with incremental/Identity-keyed tables.

  12. OJ
    August 26, 2008

    @H: Great comment mate. I fixed up your “code” a bit to make it a little more readable. I can see what you’re talking about but again you’re still hitting your DB, which is what the purists think you shouldn’t be doing. Hard one to nail.

    @Jon: Welcome to the blog (no doubt you came from StackOverflow eh? :) ) Thanks for the comment. There are some nifty features of the unit testing frameworks which do help clean things up, but the question still remains: should we be doing it anyway! :)

  13. Jon Limjap
    August 26, 2008

    OJ,

    Yep, I came from stackoverflow :) although I have had your feed on my GoogleReader for some time already (forgot via which article).

    Great blog! I do hope that we find the answer for that question one day.

  14. OJ
    August 27, 2008

    @Jon: Oh cool! It’s nice to hear that you’ve been a subscriber for a while. That makes me happy :)

    I fear that we’ll never all agree on a single answer to the question. That’s the pain about issues like this which are very subjective and very much down to taste. Having said that I hope that I’m wrong :)

    Thanks for the comments!

Leave a comment

Size

Colors