Object Mentor blog

The 4-contact points of software development

Fri, 20/08/2010 - 05:59
The three laws of TDD are:
  • Write no production code without a failing test
  • Write just enough of a test to fail
  • Write just enough production code to get the test to pass

This list doesn’t include refactoring, which is typically an assumed activity. In fact, some people refer to these rules as “red, green, refactor”. An even older version of this, from the Smalltalk community, is Red, Green, Blue. (Why Blue for refactor? I think someone was thinking of the RGB color space; luckily they didn’t try to use CMYK or LAB!)

In this simple model, there are two kinds of code: test & production. There are two kinds of activity: writing & refactoring. Interestingly, at one level it is all code. The thing that distinguishes both sets is intent.

The intent of a test is to demonstrate or maybe specify behavior. The intent of production code is to implement (hopefully) business-relevant functionality.

The intent of writing code is creation. The intent of refactoring code is to change (hopefully improve) its structure without changing its behavior (this is oversimplified but essentially correct).

If you mix those combinations you have the 4-limbs of development:
  • Writing a test
  • Writing production code
  • Refactoring a test
  • Refactoring production code

An important behavior to practice is doing only one of these at a time. That is, when you are writing tests, don’t also write production code. Sure, you might use tools to stub out missing methods and classes, but the heart of what you are doing is writing a test. Finish that train of thought before focusing on writing production code.

On the other hand, if you are refactoring production code, do just that. Don’t change tests at the same time, try to only do one refactoring at a time, etc.
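As a rough illustration of doing only one of these at a time, here is a minimal C++ sketch. The names (sumOfSquaresV1, sumOfSquaresV2, square, testSumOfSquares) are made up for this example and do not come from any real project. The test stays fixed while the production code goes through a single refactoring step, so if the test goes red, the problem can only be in the production code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Production code as first written: just enough to get the test to pass.
int sumOfSquaresV1(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
        total = total + values[i] * values[i];
    return total;
}

// Hypothetical helper extracted during the refactoring step.
int square(int value) { return value * value; }

// The same production code after one refactoring; behavior is unchanged.
int sumOfSquaresV2(const std::vector<int>& values) {
    int total = 0;
    for (int value : values)
        total += square(value);
    return total;
}

// The test is left untouched while the production code is refactored.
// It takes the function under test as a parameter only so that this
// sketch can run it against both versions.
bool testSumOfSquares(const std::function<int(const std::vector<int>&)>& sumOfSquares) {
    return sumOfSquares({}) == 0
        && sumOfSquares({3}) == 9
        && sumOfSquares({1, 2, 3}) == 14;
}
```

Calling testSumOfSquares with either version should return true. In real work there would be one function that changes in place rather than two copies; the two versions exist here only to show the before and after side by side.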

WHY?

First an analogy that almost always misses since most developers don’t additionally rock climb.

When rock climbing, a good general bit of advice is to only move one contact point at a time. For this discussion, consider your two hands and two feet as your four contact points. Sure, you can use your face or knee, but neither are much fun. So just considering two hands and two feet, that suggests that if, for example, you move your right hand, then leave your left hand and both feet in place.

This gives you stability, a chance to easily recover by simply moving the most recent appendage back in place and, when the inevitable happens, another appendage slips, you have a better chance of not eating rock face. If you move more than one thing at a time, you are in more danger because you’ve taken a risky action and reduced the number of points of contact, or stability.

Will you sometimes move multiple appendages? Sure. But not as a habit. Sometimes you need to take risks. The rock face may not always offer up movement patterns that make applying this recommendation possible. Since you know the environment will occasionally work against you, you need to maintain some slack for the inevitable.

Practicing Test Driven Development is similar. If you change production code and tests at the same time, what happens if a test fails? What is wrong? The production code, the test, both, neither? An even more subtle problem is that tests pass but the test is fragile or heavily implementation-dependent. While not necessarily an immediate threat, it represents design debt that will eventually cause problems. (This also happens frequently when tests are written after the production code as it’s seductively easy to write tests that exercise code, expressing the production’s code implementation but fundamentally hiding the intent.)

Notice, if you had only refactored code, then you know the problem is in one place. When you change both, the problem space actually goes from 1 to 3 (4 if you allow for neither). Furthermore, if you are changing both production and test code at the same time and you get to a point where you’ve entered a bottomless pit, you’ll end up throwing away more work if you choose to restore from the repository.

Are there going to be times when you change both? Sure. Sometimes you may not see a clear path that gives you the option to do only one thing at a given time. Sometimes the tests and code will work against you. Often, you’ll be working in a legacy code base where there are no tests. Given that the environment will occasionally (or frequently) work against you, you need to maintain some slack.

Essentially, be focused on a single goal at any given time: write a test, then get it to pass, then clean up the production code while keeping the tests first unchanged and then passing.

I find that this is a hard thing both to learn and to apply. I frequently jump ahead of myself. Unfortunately I’m “lucky” enough when I do jump ahead that when I fail, I thoroughly fall flat on my face.

This approach is contextual (aren’t they all?). Every time you start working on code, you’ll be faced with these four possibilities. Each time you are, you need to figure out what is the most important thing in the moment, and do that one thing. Once you’ve taken care of the most important thing, you may have just promoted the second most important thing to first place. Even so, reassess. What is the most important thing now? Do that.

Good luck trying to apply this idea to your development work. I’m interested in hearing about it.

Categories: Mentoring

Is it worth killing trees over another C++ Book?

Tue, 17/08/2010 - 07:17

I’ve taught a few C++ courses recently to people primarily moving from C to C++. I know C++ has been around for years and it’s not in vogue like it was 15 years ago. I stopped using it full time in 1997. Even so, there’s been quite a bit of work on the library and language standard. And there are still a lot of places developing new systems with C++. I know some people are still learning C++ in school.

In my most recent classes, I’ve been teaching students who have recently taken a class on OO A & D based on the work of Craig Larman. The C++ class attempts to follow that class by dovetailing into what it covers. Because of this, I did not use Object Mentor’s standard C++ and OOD class. Good as it is, it has different starting assumptions. I instead wrote a class, using two problems as the entire basis of all the material I cover. I know, “not built here syndrome.” It pained me to do this. I did my research and I considered retro-fitting the OM course. The discrepancy was too large to consider reuse. And refactoring the existing class to accommodate changes, which I also considered, wasn’t practical. Looking back, I made the right decision.

So this course has a few key design elements:
  • Problem focused
  • Test oriented (sometimes test-first, other times test-driven, occasionally refactor-oriented)

The problem-focus limits the topic coverage. If something about the language doesn’t come up somewhat naturally (not contrived) in the two projects I use, I don’t cover the material. For example, I don’t mention placement new, and I only cover multiple inheritance if asked about it. I also try to focus on classes in the standard library, for example std::array, std::vector, std::map and std::shared_ptr.

Additionally, there’s an early focus on testing. That’s another thing that’s different from how I taught C++ say before 1995 – I really didn’t teach it from 1995 – 2007, so no comments on what I might have done differently in that span of years.

To give you an idea of how early the focus is on test, I only show cout if asked. The first main() calls CommandLineTestRunner::RunAllTests and everything after that runs within a unit test. The last time I taught the class, I demonstrated tests executing with cslim, but still, a test focus.

I have them use unit tests as a way to experiment with the language. In one example, I have them write tests that force a method to become virtual that was not virtual before. In another case, I do the same thing with a virtual destructor. I have them test from raw pointers into shared_ptr and then update their code accordingly.
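To make the virtual destructor experiment concrete, here is a minimal sketch of the kind of check a student might write. Shape, Triangle and the counter are hypothetical names invented for this example, and plain functions stand in for tests in whatever unit testing framework the class uses. Deleting a derived object through a base pointer is undefined behavior when the base destructor is not virtual; in practice the derived destructor is typically skipped, which is what a test like this tends to expose.

```cpp
#include <cassert>
#include <memory>

// Counts how many times the derived destructor actually ran.
static int derivedDestroyed = 0;

struct Shape {
    // Remove 'virtual' here and destroyThroughBasePointer() below is no
    // longer guaranteed to run ~Triangle (formally undefined behavior).
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};

struct Triangle : Shape {
    ~Triangle() { ++derivedDestroyed; }
    int sides() const { return 3; }  // overrides Shape::sides
};

// Deletes a Triangle through a Shape* and reports how many derived
// destructors ran.
int destroyThroughBasePointer() {
    derivedDestroyed = 0;
    Shape* shape = new Triangle;
    delete shape;
    return derivedDestroyed;
}

// Calls a virtual function through a base-class smart pointer.
int sidesThroughBasePointer() {
    std::shared_ptr<Shape> shape(new Triangle);
    return shape->sides();
}
```

The destructor check deliberately uses a raw pointer. A std::shared_ptr<Shape> constructed from a Triangle* remembers how to delete a Triangle, so it would hide the missing virtual destructor rather than reveal it.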

Because of the test focus, I make certain recommendations that impact overall class design. That means learning C++ with designs supporting testability early on. In C++ this means (among other things):
  • Dependency Injection (OK, this is not just C++)
  • Virtual functions and by corollary a virtual destructor
  • Storing pointers because 1. methods called through an auto (by-value) object are not virtually dispatched, and 2. you cannot put references in the standard collections
  • Use std::shared_ptr to avoid memory leaks, which are detected by the unit testing tool I have them use.
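The list above can be sketched in a few lines. This is only an illustration under my own assumptions: Die, LoadedDie and Player are hypothetical classes, not code from the course. It shows a dependency injected through a std::shared_ptr, and what happens when a derived object is stored by value in a standard collection instead of through a pointer.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Die {
    virtual ~Die() {}
    virtual int roll() { return 4; }  // stand-in for a random roll
};

// Test double: always rolls the value it was given.
struct LoadedDie : Die {
    explicit LoadedDie(int fixedValue) : value(fixedValue) {}
    int roll() { return value; }      // overrides Die::roll
    int value;
};

// The Player does not create its own Die; one is injected.
class Player {
public:
    explicit Player(std::shared_ptr<Die> dieToUse) : die(dieToUse), position(0) {}
    void takeTurn() { position += die->roll(); }
    int location() const { return position; }
private:
    std::shared_ptr<Die> die;
    int position;
};

// Injecting a LoadedDie makes the outcome of a turn predictable.
int locationAfterTwoTurns(int fixedRoll) {
    Player player(std::make_shared<LoadedDie>(fixedRoll));
    player.takeTurn();
    player.takeTurn();
    return player.location();
}

// Stored by value, the LoadedDie is sliced down to a plain Die.
int rollStoredByValue() {
    std::vector<Die> dice;
    dice.push_back(LoadedDie(6));
    return dice[0].roll();   // calls Die::roll
}

// Stored through a pointer, the call is virtually dispatched.
int rollStoredByPointer() {
    std::vector<std::shared_ptr<Die>> dice;
    dice.push_back(std::make_shared<LoadedDie>(6));
    return dice[0]->roll();  // calls LoadedDie::roll
}
```

With these definitions, the by-value version rolls 4 (the base class behavior) while the pointer version rolls 6, which is the practical reason for storing pointers in the collections.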

I take the class to a certain level, but I make it clear I’m just scratching the surface. I believe it’s not really possible to learn C++ in a 1-week class. You can get the beginnings of proficiency and be in a good position to continue learning – that’s an assumption I state up front after the students have had their first exercise – about 5 minutes into the start of the class. I try to only go into detail as the students ask questions, but at times I just want to really open up the beast and get into what’s really happening.

To address this, I’ve started novelizing the class I’ve been teaching. I’m following the same outline as the class, but I dive under the surface and at times get to quite a bit of detail that I would not typically get into in a 4.5 day class.

I know this material will augment the class. This will give my students three sources of information:
  • The class itself, which is exercise-driven
  • Online videos
  • A novelization of the class, going into much more detail

The thing I’m wondering is, would the book be worth making more generally available? I’m writing it so that it can stand on its own. There’s a certain advantage to knowing my students have taken the previous OO A & D class I mentioned because it uses a problem that I’ve used on and off since 1992. This allows me to give examples from a problem they have looked at in the past and then at the current problem. I’ve not yet made references to that problem in the book I’m writing. I could, I just have not done so yet – it’s more natural in the second problem and I’m still working on the first problem (and therefore the first half of the book). I’m certain I can make those references without the previous class experience. (It’s the Monopoly problem.)

In any case, I need to do some research. If I do start the publishing process, a key step involves competitive research. Can you recommend any books published for the first time this century (really in the past 7 years) that have any of these characteristics:
  • Cover a minimal set of C++, enough for decent OO solutions
  • Have any kind of emphasis on test – at least half the book?
  • Cover the language strictly through a problem-based approach rather than from a language perspective?
  • Involve deliberately making mistakes and observing those mistakes to learn how the language works?

Additionally, can you recommend any great C++ books? I can list many of the ones I’ve read and enjoyed, but the last time I bought a book for myself on the topic, Amazon did not exist as an online company.

Independent of whether I deal with trying to get this thing published as a printed book, I’m going to finish it because I think it will be useful to my students. I suspect I’ll be teaching this class in the future, so I will find it useful in the future as well. If I don’t attempt publishing it through a major publisher, I’ll put it on my wiki at the very least. Though it’s going to be quite a bit more effort than my typical wiki articles.

But the question I keep coming back to is this: Is there really a need for a dead tree version of this book, or is it primarily useful as supplemental material for a class I teach?

I’d like to hear your opinions.

Categories: Mentoring

Game Of Life with @lunivore

Mon, 16/08/2010 - 03:40

At the #coderetreat in Orlando I spent an hour programming with @lunivore (Liz Keogh). We worked in Clojure on Conway’s game of life. It was quite an experience!

Liz was not very familiar with Clojure, so I felt I had the advantage. Wrong! By the end of the hour she had taken charge and was programming rings around me. It was a lot of fun, if a bit humbling.

Liz came into the session knowing the structure of the algorithm she wanted to implement. She just didn’t know how to implement it in Clojure. When the starting gun was fired, she showed me a picture of a glider (a standard form in life) and said she wanted this to be the acceptance test.

During the session we wrote only a very few unit tests, each very focussed on one particular part of the problem. The size of the steps was considerably larger than I am used to in Java or Ruby, yet it didn’t seem to matter. Writing functions in Clojure is easy, and apparently far less error prone.

We finished 10 minutes before the end of the session, and had some time to refactor.

I’ve posted the code below. I’ve cleaned it up a bit (being unable to leave it alone), but the algorithm remains the same, as do the tests.

Categories: Mentoring

Nearly 22 years ago

Thu, 12/08/2010 - 05:23

Moved the article and the file. Reduced resolution from 300 dpi to 75 dpi (using a quartz filter on OS X from Jerome Colas).

Here’s the story with the updated location: The Moved Blog

Categories: Mentoring

Getting Started With cslim in Visual Studio 2010 Using the Command Line Tools

Mon, 09/08/2010 - 05:08

Some of you asked for it. Here’s something to get you started: http://schuchert.wikispaces.com/cpptraining.UsingCSlimWithVisualStudio2010

These instructions are a work in progress and alpha. However, they do get the basics working.

If you are so inclined, have a look at the NMakefile (you’ll come across it in the instructions) and give me a better way to build it.

I spent probably 8 hours getting a working environment (much yak shaving, including a faulty MacBook Pro DVD drive). I then spent another 8 – 10 hours getting this working. I worked through it about 5 times to minimize the number of changes I needed to make to the original library source.

I ended up using some link and pre-processor seams to get most of this built. However, most of my time was spent trying to figure out the command line tools and their options.

If you can give me some guidance on improving this, I’d like to hear it. However, this is now a low-burning thread, so some assembly required!

Enjoy,

Categories: Mentoring

Rough Notes on using FitNesse with C++

Sat, 31/07/2010 - 05:09

I’ve been working with FitNesse.slim, using the cslim implementation to execute C++ code. I have some rough notes online. These should be enough to get started, though you’ll need to be using G++ 4.4 or later.

In any case, have a look. Give me some feedback if you’d like. I’ll be working on these over the next month or so: http://schuchert.wikispaces.com/cpptraining.ExecutingBinaryOperators

Categories: Mentoring

Preprocessor seams and assignment of responsibility

Thu, 22/07/2010 - 17:32
In my previous blog I mentioned adding a single method to the cslim library:

cslim/include/CSlim/SlimListSerializer.h:

    void SlimList_Release(char *serializedResults);

cslim/src/CSlim/SlimListSerializer.c:

    void SlimList_Release(char *serializedResults) {
      if(serializedResults)
        free(serializedResults);
    }

Now I’m going to explain how I came across this need, what it took to figure it out and then why I picked this solution.

Background

In FitNesse, Query Table Results are a list of list of list:
  • The outer-most list represents “rows”. It collects all objects found. It has zero or more entries.
  • The inner-most list represents a single field. It is a key value pair. It is a list of size 2. The first entry is the name of the field (column to FitNesse). The second field is the value. Both are strings.
  • The middle list represents a single object. It is a collection of fields. It has zero or more entries.
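To make the three levels concrete, here is a minimal sketch of the same shape using plain std::vector and std::string. It deliberately does not use SlimList or any other cslim call; the typedef and function names (Field, Object, QueryResult, makeField, operatorsAsQueryResult) are made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef std::vector<std::string> Field;   // inner-most list: {name, value}
typedef std::vector<Field> Object;        // middle list: fields of one object
typedef std::vector<Object> QueryResult;  // outer-most list: one entry per row

// Builds the two-entry key/value list for a single field.
Field makeField(const std::string& name, const std::string& value) {
    Field field;
    field.push_back(name);
    field.push_back(value);
    return field;
}

// Turns a list of operator names into rows with a single "op" column.
QueryResult operatorsAsQueryResult(const std::vector<std::string>& names) {
    QueryResult rows;
    for (const std::string& name : names) {
        Object object;
        object.push_back(makeField("op", name));
        rows.push_back(object);
    }
    return rows;
}
```

For the names "+" and "-", the result has two rows, each holding one field whose first entry is "op" and whose second entry is the operator name. An empty list of names gives an empty outer list, which matches the "zero or more entries" rule.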

I understand the need for this representation and it takes a little bit to get it built correctly. So much so, I built a simple tool to do it in Java.

C++ is no different. In fact, the authors of cslim thought the same thing and they created a C Abstract Data Type to help out (and an add-on method to create the correct final form):

  • SlimList
  • SlimList_Serializer

They use C because it can be used by both C and C++. I’m using C++ and I wanted to make building query results even easier, so I built a QueryResultAccumulator. The most recent source is in the previous blog and I’ll be putting it on github after I’ve had some more time to work on it.

Here’s the progression to my QueryResultAccumulator class:

  • Wrote a Query-Table based fixture and followed the example provided with cslim (thank you for that!)
  • Moved the code from functions into methods on a class
  • Extracted a class, called SlimListWrapper, which made the fixture code easier to follow.
  • Went to get takeout and realized that I had named the class incorrectly and that it was really accumulating query results (thus the name). The SlimList was a mechanism, not an intent.
  • Refactored the class into QueryResultAccumulator (I left the original alone, created a new class, copied code from one to the other and changed it around a bit).

Now it might sound like I didn’t have any tests. In fact, I did. I had my original Acceptance Test in FitNesse. I kept running that, so in a sense I was practicing ATDD.

I was not happy with that, because I was not sure that I had handled memory allocation correctly. In fact, I had not. The final result is dynamically allocated and I was not releasing that. So I “fixed” it. (It needs to be released after the return from slim back to FitNesse, so the typical pattern is to lazily delete it, or release it in a “destroy” method called after the execution of a single table.)

I have a memory leak?!

I am simplifying this story a bit, so I’m skipping some intermediate results. Ultimately I wrote the following test to check that you could use a single query result accumulator for multiple results correctly:

    TEST(QueryResultAccumulator, CanProduceFinalResultsMultipleTimes) {
      QueryResultAccumulator accumulator;
      accumulator.produceFinalResults();
      accumulator.produceFinalResults();
    }

This caused a memory leak of 60 bytes. It was at this point I was up too late and banging my head against a wall. About 3 hours later I figured that out and went to bed. I fixed the problem and sent a patch to the authors of cslim in maybe 45 minutes. So I should have gotten more sleep.

Where is that damn thing

In any case, I visually checked the code. I debugged it. I did everything I could initially think of, and I was convinced that my code was correct. (As we’ll see, it both was and was not, due to a preprocessor seam in cslim.) I got to the point where I even tried different versions of gcc. (I found a parsing error in g++ 4.5 when handling templates, so in desperation and late at night I wasted 5 minutes switching my compiler.) I had the following code in my class:

    if(result)
      free(result);

This was the correct thing, but it was in the wrong location. Again, this was related to a preprocessor seam in cslim.

Eureka!

I looked at the cslim code and confirmed it was doing basic C stuff, nothing surprising. It was at that point that I remembered something important: cslim depends on CppUTest and uses a different malloc/free.

Ah ha! That’s it. So I tried to recompile my C++ code to use the same thing. However, I was not able to do that. CppUTest’s memory tracking implementation does not work with many of the C++ standard classes like <string> and <vector>. So I could not compile my code using the same approach.

I’m glad this happened because it made me realize that it was the wrong place anyway. Here’s the logic:

  • CSlim has a preprocessor seam: compile with or without -Dfree=cpputest_free, -Dmalloc=cpputest_malloc
  • I’m using a class in cslim to do the allocation, where the policy of allocation is stored
  • I should not release the memory but instead allow the cslim library to release the memory because it has the allocation policy, and therefore the release policy.

About 2 minutes later, I had added a single function to the library:

cslim/include/CSlim/SlimListSerializer.h:

    void SlimList_Release(char *serializedResults);

cslim/src/CSlim/SlimListSerializer.c:

    void SlimList_Release(char *serializedResults) {
      if(serializedResults)
        free(serializedResults);
    }

I updated my QueryResultAccumulator to use SlimList_Release and my false positive disappeared.

It also turns out that this improved symmetry in the library. To allocate and release entries in a SlimList you use the following functions:

  • SlimList* SlimList_Create()
  • void SlimList_Destroy(SlimList*);

Now to serialize a list and release the memory later you use:

  • SlimList_Serialize
  • SlimList_Release

As I write this, I think there’s a better name. I’ll let the authors give it a better name (like SlimList_Release_Serialization_Results). But in any case, if you use a function in the cslim library that allocates something, you use another method in the cslim library to release it.

Since the library has a preprocessor seam, that symmetry removes a false-positive memory leak.
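Here is a small, self-contained sketch of how that false positive can come about. Everything in it is a stand-in invented for the illustration: trackedMalloc/trackedFree play the role of a test framework’s tracking allocator, LIB_MALLOC/LIB_FREE play the role of the seam (the real cslim seam redefines malloc and free with -D options, as described above), and Lib_Serialize/Lib_Release play the role of the library’s allocate and release pair.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Stand-in for a tracking allocator: counts blocks not yet released.
static int outstanding = 0;

void* trackedMalloc(std::size_t size) {
    ++outstanding;
    return std::malloc(size);
}

void trackedFree(void* memory) {
    if (memory) {
        --outstanding;
        std::free(memory);
    }
}

// The "seam": the library is built so that its allocation calls
// resolve to the tracked versions.
#define LIB_MALLOC trackedMalloc
#define LIB_FREE   trackedFree

// Library side: allocates using the library's allocation policy.
char* Lib_Serialize(const char* text) {
    char* copy = static_cast<char*>(LIB_MALLOC(std::strlen(text) + 1));
    if (copy)
        std::strcpy(copy, text);
    return copy;
}

// Library side: releases using the same policy that allocated.
void Lib_Release(char* serialized) {
    if (serialized)
        LIB_FREE(serialized);
}

// Client code compiled without the seam: the memory really is freed,
// but the tracker never hears about it and reports a leak.
int outstandingAfterPlainFree() {
    outstanding = 0;
    char* result = Lib_Serialize("serialized list");
    std::free(result);
    return outstanding;
}

// Client code that hands the memory back to the library.
int outstandingAfterLibraryRelease() {
    outstanding = 0;
    char* result = Lib_Serialize("serialized list");
    Lib_Release(result);
    return outstanding;
}
```

In this sketch the plain free leaves the counter at 1 even though nothing is actually leaked, while releasing through the library brings it back to 0. That is the symmetry argument in miniature: whoever owns the allocation policy should also own the release.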

What took so long?

I had an interesting time with this. Originally I had not released that memory in the class but rather in the unit test. I was working too late and not thinking clearly. I realized that my class needed to manage that.

When I called free in the unit test, it was calling the correct version of free, cpputest_free, because it was a unit test using CppUTest. When I moved the code into the class, which has no knowledge of CppUTest (nor can it), the flow of the code was correct, but the compilation (the preprocessor symbols) was different and it caused a false positive.

Since I changed the code, I assumed it was a problem with how I changed the code. To be more clear, I thought it was a code-flow problem, not a preprocessor seam problem. So I spent a lot of time verifying my code. Once I determined it was correct, I then moved into debugger mode.

It was not long after that when I finally figured out what was going on.

Conclusions

As with many things in life, this is intuitive once you understand it!-)

That cslim depends on CppUTest might be questionable. However, if I treat cslim as a (mostly) black box, and I don’t know its allocation policy, I should not assume a deallocation policy.

By putting the responsibility at the correct library level, it fixed the problem and added symmetry to the overall solution.

I also really enjoyed this (after it was done). I’ve come across memory leaks using CppUTest in the past. Often they were my fault. Sometimes they were not. This was interesting because it both was and was not my fault. I originally had written the code incorrectly. When I put the correct steps in my code, I still had wrong code because I put the responsibility in the wrong place. It was only after I moved the actual implementation into the library and then called it from my code that I had finally written it correctly.

Categories: Mentoring

Some C++ Fixtures for FitNesse.slim

Thu, 22/07/2010 - 07:32

I continue working on these. I was stuck in the airport for 5 hours. Between that and the actual flight, I managed to create three different test examples against a C++ RpnCalculator. Each example uses a different kind of fixture. I had a request from @lrojas to publish some results on the blog. So this is that; however, these are in progress and rough.

I’m still trying different forms to figure out what I like the best.

By the way, that lastValue stuff in the fixtures has to do with the fact that all of the hook methods return a char* but I’m responsible for cleaning up after myself.

A Decision Table

    !|ExecuteBinaryOperator |
    |lhs|rhs|operator|expected?|
    |3 |4 |- |-1 |
    |5 |6 |* |30 |

And Its Fixture Code

    #include <stdlib.h>
    #include <stdio.h>
    #include <string>
    #include "RpnCalculator.h"
    #include "OperationFactory.h"
    #include "Fixtures.h"
    #include "SlimUtils.h"

    struct ExecuteBinaryOperator {
      ExecuteBinaryOperator() {
        lastValue[0] = 0;
      }

      int execute() {
        RpnCalculator calculator(factory);
        calculator.enterNumber(lhs);
        calculator.enterNumber(rhs);
        calculator.executeOperator(op);
        return calculator.getX();
      }

      static ExecuteBinaryOperator* From(void *fixtureStorage) {
        return reinterpret_cast<ExecuteBinaryOperator*>(fixtureStorage);
      }

      OperationFactory factory;
      int lhs;
      int rhs;
      std::string op;
      char lastValue[32];
    };

    extern "C" {
    void* ExecuteBinaryOperator_Create(StatementExecutor* errorHandler, SlimList* args) {
      return new ExecuteBinaryOperator;
    }

    void ExecuteBinaryOperator_Destroy(void* self) {
      delete ExecuteBinaryOperator::From(self);
    }

    static char* setLhs(void* fixture, SlimList* args) {
      ExecuteBinaryOperator *self = ExecuteBinaryOperator::From(fixture);
      self->lhs = getFirstInt(args);
      return self->lastValue;
    }

    static char* setRhs(void* fixture, SlimList* args) {
      ExecuteBinaryOperator *self = ExecuteBinaryOperator::From(fixture);
      self->rhs = getFirstInt(args);
      return self->lastValue;
    }

    static char* setOperator(void *fixture, SlimList* args) {
      ExecuteBinaryOperator *self = ExecuteBinaryOperator::From(fixture);
      self->op = getFirstString(args);
      return self->lastValue;
    }

    static char* expected(void* fixture, SlimList* args) {
      ExecuteBinaryOperator *self = ExecuteBinaryOperator::From(fixture);
      int result = self->execute();
      snprintf(self->lastValue, sizeof(self->lastValue), "%d", result);
      return self->lastValue;
    }

    SLIM_CREATE_FIXTURE(ExecuteBinaryOperator)
      SLIM_FUNCTION(setLhs)
      SLIM_FUNCTION(setRhs)
      SLIM_FUNCTION(setOperator)
      SLIM_FUNCTION(expected)
    SLIM_END
    }

There’s a bit of duplication. I’ve been experimenting with pointers to member functions and template functions to make it better. I really should be using lambdas, but I’m not there yet. I have them available in some form since I’m using gcc 4.5. I simply compile with the option -std=c++0x. Even so, I’m not quite ready to do that.

A Script Table

    !|script |ProgramTheCalculator |
    |startProgramCalled|primeFactorsOfSum |
    |addOperation |sum |
    |addOperation |primeFactors |
    |saveProgram |
    |enter |4 |
    |enter |13 |
    |enter |7 |
    |execute |primeFactorsOfSum |
    |check |stackHas|3|then|2|then|2|then|2|is|true|

And Its Fixture Code

    #include <stdlib.h>
    #include <stdio.h>
    #include <string>
    #include "RpnCalculator.h"
    #include "OperationFactory.h"
    #include "SlimUtils.h"
    #include "SlimList.h"
    #include "Fixtures.h"

    struct ProgramTheCalculator {
      ProgramTheCalculator() : calculator(factory) {
      }

      static ProgramTheCalculator* From(void *fixtureStorage) {
        return reinterpret_cast<ProgramTheCalculator*>(fixtureStorage);
      }

      OperationFactory factory;
      RpnCalculator calculator;
    };

    extern "C" {
    void* ProgramTheCalculator_Create(StatementExecutor* errorHandler, SlimList* args) {
      return new ProgramTheCalculator;
    }

    void ProgramTheCalculator_Destroy(void *fixture) {
      delete ProgramTheCalculator::From(fixture);
    }

    static char* startProgramCalled(void *fixture, SlimList *args) {
      auto *self = ProgramTheCalculator::From(fixture);
      self->calculator.createProgramNamed(getFirstString(args));
      return remove_const("");
    }

    static char* addOperation(void *fixture, SlimList *args) {
      auto *self = ProgramTheCalculator::From(fixture);
      self->calculator.addOperation(getFirstString(args));
      return remove_const("");
    }

    static char* saveProgram(void *fixture, SlimList *args) {
      auto *self = ProgramTheCalculator::From(fixture);
      self->calculator.saveProgram();
      return remove_const("");
    }

    static char* enter(void *fixture, SlimList *args) {
      auto *self = ProgramTheCalculator::From(fixture);
      self->calculator.enterNumber(getFirstInt(args));
      return remove_const("");
    }

    static char* execute(void *fixture, SlimList *args) {
      auto *self = ProgramTheCalculator::From(fixture);
      self->calculator.executeOperator(getFirstString(args));
      return remove_const("");
    }

    // Compares the top four stack values against the four table arguments,
    // dropping each value after it has been checked.
    static char* stackHasThenThenThenIs(void *fixture, SlimList *args) {
      auto *self = ProgramTheCalculator::From(fixture);
      for(int i = 0; i < 4; ++i) {
        if(self->calculator.getX() != getIntAt(args, i))
          return remove_const("false");
        self->calculator.executeOperator("drop");
      }
      return remove_const("true");
    }

    SLIM_CREATE_FIXTURE(ProgramTheCalculator)
      SLIM_FUNCTION(startProgramCalled)
      SLIM_FUNCTION(addOperation)
      SLIM_FUNCTION(saveProgram)
      SLIM_FUNCTION(enter)
      SLIM_FUNCTION(execute)
      SLIM_FUNCTION(stackHasThenThenThenIs)
    SLIM_END
    }

This one is a bit more regular. I am using the updated auto keyword in this code. The fixture is just holding the calculator and its OperationFactory (not my preferred name, but that’s what students wanted to call things like +, -, etc.: operations, not operators).

The Dreaded Query Table

It’s a bit of a pain to produce query results. So much so, I wrote a simple library in Java to make it easier. I can create a well-formed query result from a single object or a list of objects and even do basic transforms (in names and in paths to data). I started using the Jakarta bean utils, but my use was so simple (2 methods), I ripped out that library and just hand-wrote the methods I needed. It was not a case of “not invented here syndrome.” I started by using the library, and I had tests. I didn’t like the size of the library relative to how much I was using it, so I just got rid of it.

Well, here I am working in C++ and I felt compelled to make it easier to work with query results in C++.

First the FitNesse table, then the fixture and finally the support class. I have tests for it as well; I’m not going to show those, however.

    !|Query: SingleCharacterNameOperators|
    |op |
    |+ |
    |* |
    |/ |
    |! |
    |- |

And Its Fixture Code

    #include <stdlib.h>
    #include <stdio.h>
    #include <vector>
    #include <string>
    #include <memory>
    #include "RpnCalculator.h"
    #include "OperationFactory.h"
    #include "Fixtures.h"
    #include "SlimUtils.h"
    #include "QueryResultAccumulator.h"

    // v_string is not defined in the code shown here; it is presumably a
    // typedef for std::vector<std::string> from one of the project headers.
    struct SingleCharacterNameOperators {
      OperationFactory factory;
      RpnCalculator calculator;

      SingleCharacterNameOperators() : calculator(factory), result(0) {
      }

      static SingleCharacterNameOperators* From(void *fixtureStorage) {
        return reinterpret_cast<SingleCharacterNameOperators*>(fixtureStorage);
      }

      // The accumulator owns the serialized buffer and releases it with
      // SlimList_Release, so the fixture keeps only a non-owning pointer.
      // Deleting it here as well would free the same memory twice.
      void resetResult(char *newResult) {
        result = newResult;
      }

      void conditionallyAddOperatorNamed(const std::string &name) {
        if (name.size() == 1) {
          accumulator.addFieldNamedWithValue("op", name);
          accumulator.finishCurrentObject();
        }
      }

      void buildResult() {
        v_string names = calculator.allOperatorNames();
        buildResult(names);
      }

      void buildResult(v_string &names) {
        for (v_string::iterator iter = names.begin(); iter != names.end(); ++iter)
          conditionallyAddOperatorNamed(*iter);
        resetResult(accumulator.produceFinalResults());
      }

      QueryResultAccumulator accumulator;
      char *result;
    };

    extern "C" {
    void* SingleCharacterNameOperators_Create(StatementExecutor* errorHandler, SlimList* args) {
      return new SingleCharacterNameOperators;
    }

    void SingleCharacterNameOperators_Destroy(void *fixture) {
      delete SingleCharacterNameOperators::From(fixture);
    }

    static char* query(void *fixture, SlimList *args) {
      auto *self = SingleCharacterNameOperators::From(fixture);
      self->buildResult();
      return self->result;
    }

    SLIM_CREATE_FIXTURE(SingleCharacterNameOperators)
      SLIM_FUNCTION(query)
    SLIM_END
    }

And the Helper Class

QueryResultAccumulator.h

    #pragma once
    #ifndef QUERYRESULTACCUMULATOR_H_
    #define QUERYRESULTACCUMULATOR_H_

    class SlimList;

    #include <vector>
    #include <string>

    class QueryResultAccumulator {
    public:
      typedef std::vector<SlimList*> v_SlimList;
      typedef v_SlimList::iterator iterator;

      QueryResultAccumulator();
      virtual ~QueryResultAccumulator();

      void finishCurrentObject();
      void addFieldNamedWithValue(const std::string &name, const std::string &value);
      char *produceFinalResults();

    private:
      SlimList* allocate();
      void releaseAll();
      void setInitialConditions();

    private:
      v_SlimList created;
      SlimList *list;
      SlimList *currentObject;
      int lastFieldCount;
      int currentFieldCount;
      char *result;

    private:
      // Not implemented: copying is disabled.
      QueryResultAccumulator(const QueryResultAccumulator&);
      QueryResultAccumulator& operator=(const QueryResultAccumulator&);
    };

    #endif

I know there are too many fields. The counts help with validating correct usage. I also wrote it so one instance could be re-used and I tried to make sure it was in a “ready to receive fields” state when necessary. In any case, this error checking helped find a defect I introduced while refactoring.

QueryResultAccumulator.cpp

    #include "QueryResultAccumulator.h"
    #include "DifferentFieldCountsInObjects.h"
    #include "InvalidStateException.h"

    extern "C" {
    #include "SlimList.h"
    #include "SlimListSerializer.h"
    }

    QueryResultAccumulator::QueryResultAccumulator() : result(0) {
        setInitialConditions();
    }

    QueryResultAccumulator::~QueryResultAccumulator() {
        releaseAll();
        SlimList_Release(result);
    }

    void QueryResultAccumulator::setInitialConditions() {
        releaseAll();
        list = allocate();
        currentObject = allocate();
        lastFieldCount = -1;
        currentFieldCount = -1;
    }

    SlimList* QueryResultAccumulator::allocate() {
        SlimList *list = SlimList_Create();
        created.push_back(list);
        return list;
    }

    void QueryResultAccumulator::releaseAll() {
        for (iterator i = created.begin(); i != created.end(); ++i)
            SlimList_Destroy(*i);
        created.clear();
    }

    void QueryResultAccumulator::finishCurrentObject() {
        if (lastFieldCount >= 0 && lastFieldCount != currentFieldCount)
            throw DifferentFieldCountsInObjects(lastFieldCount, currentFieldCount);

        SlimList_AddList(list, currentObject);
        currentObject = allocate();
        lastFieldCount = currentFieldCount;
        currentFieldCount = -1;
    }

    void QueryResultAccumulator::addFieldNamedWithValue(const std::string &name, const std::string &value) {
        SlimList *fieldList = allocate();
        SlimList_AddString(fieldList, name.c_str());
        SlimList_AddString(fieldList, value.c_str());
        SlimList_AddList(currentObject, fieldList);
        ++currentFieldCount;
    }

    char* QueryResultAccumulator::produceFinalResults() {
        if (currentFieldCount != -1)
            throw InvalidStateException("Current object not written");

        SlimList_Release(result);
        result = SlimList_Serialize(list);
        setInitialConditions();
        return result;
    }

Note, this code uses a method I added to the cslim library:

SlimListSerializer.h – in include/CSlim

    void SlimList_Release(char *serializedResults);

SlimListSerializer.c – in src/CSlim

    void SlimList_Release(char *serializedResults) {
        if (serializedResults)
            free(serializedResults);
    }

I needed to add these methods due to a false-positive memory leak indicated when using CppUTest to test this code. That’s another blog.

Categories: Mentoring

FitNesse, C++ and cslim, step-by-step instructions

Tue, 20/07/2010 - 06:26

Title says it all: http://schuchert.wikispaces.com/cpptraining.GettingStartedWithFitNesseInCpp

First draft. If you have problems, please let me know.

Categories: Mentoring

A Few C plus plus TDD videos

Mon, 12/07/2010 - 13:30

Using CppUTest, gcc 4.4 and the Eclipse CDT.

Rough, as usual.

The Video Album

Might redo first one with increased font size. Considering redoing whole series at 800×600.

Categories: Mentoring

Software Calculus - The Missing Abstraction.

Tue, 06/07/2010 - 03:26

The problem of infinity plagued mathematicians for millennia. Consider Zeno’s paradox, the one with Achilles and the tortoise. While it was intuitively clear that Achilles would pass the tortoise quickly, the algebra and logic of the day seemed to suggest that the tortoise would win every race given a head start. Every time Achilles got to where the tortoise was, the tortoise would have moved on. The ancients had no way to see that an infinite sum could be finite.

Then came Leibniz and everything changed. Suddenly infinity was tractable. Suddenly you could sum infinite series and write the equations that showed Achilles passing the tortoise. Suddenly a whole range of calculations that had either been impossible or intractable became trivial.

Calculus was a watershed invention for mathematics. It opened up vistas that we have yet to fully plumb. It made possible things like Newtonian mechanics, Maxwell’s equations, special and general relativity and quantum mechanics. It supports the entire framework of our modern world. We need a watershed like that for software.

If you listen to my keynote: Twenty-Five Zeros you’ll hear me go on and on about how even though software has changed a lot in form over the last fifty years, it has changed little in substance. Software is still the organization of sequence, selection, and iteration.

For fifty years we have been inventing new languages, notations, and formulations to manage Sequence, Selection, and Iteration (SSI). Structured Programming is simply a way to organize SSI. Objects are another way to organize SSI. Functional is still another. Indeed, almost all of our software technologies are just different ways of organizing Sequence, Selection, and Iteration.

This is similar to Algebra in the days before calculus. We knew how to solve linear and polynomial equations. We knew how to complete squares and find roots. But in the end it was all just different ways to organize adding. That may sound simplistic, but it’s not. Subtracting is just adding in reverse. Multiplying is just adding repeatedly. Division is just multiplication in reverse. In short, Algebra is an organizational strategy for adding.

Algebra went through many different languages and notations too, just like software has. Think about Roman and Greek numerals. Think how long it took to invent the concept of zero, or the positional exponential notation we use today.

And then one day Newton saw an apple fall, and he changed the way we thought about mathematics. Suddenly it wasn’t about adding anymore. Suddenly it was about infinities and differentials. Mathematical reasoning was raised to a new order of abstraction.

Where is that apple for software (pun intended)? Where is the Newton or Leibniz who will transform everything about the way we think about software? Where is that long-sought new level of abstraction?

For a while we thought it would be MDA. Bzzzzt, wrong. We thought it would be logic programming like Prolog[1]. Bzzzt. We thought it would be database scripts and 4GLs. Bzzzt. None of those did it. None of those can do it. They are still just various ways of organizing sequence, selection, and iteration.

Some people have set their sights on quantum computing. While I’ll grant you that computation with bits that can be in both states simultaneously is interesting, in the end I think this is just another hardware trick to increase throughput as opposed to a whole new way to think about software.

This software transformation, whatever it is, is coming. It must come; because we simply cannot keep piling complexity upon complexity. We need some new organizing principle that revamps the very foundations of the way we think about software and opens up vistas that will take us into the 22nd century.

1 Prolog comes closest to being something more than a simple reorganization of sequence, selection, and iteration. At first look logic programming seems very different. In the end, however, an algorithm expressed in prolog can be translated into any of the other languages, demonstrating the eventual equivalence.

Categories: Mentoring

C++ Algorithms, Boost and function currying

Sun, 13/06/2010 - 06:41

I’ve been experimenting with C++ using the Eclipse CDT and gcc 4.4. Since I’m a fan of boost, I’ve been using that as well. I finally got into a realistic use of boost::bind.

I converted this:

    int Dice::total() const {
        int total = 0;
        for (const_iterator current = dice.begin(); current != dice.end(); ++current)
            total += (*current)->faceValue();
        return total;
    }

Into this:

    int Dice::total() const {
        return std::accumulate(
            dice.begin(),
            dice.end(),
            0,
            bind(std::plus<int>(), _1, bind(&Die::faceValue, _2))
        );
    }
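For readers without Boost at hand, the same fold can be sketched with the C++11 standard library. This is only an illustration, not the code from the linked page: the Die class, the DiceList typedef, and the function names below are hypothetical stand-ins, and the elements are assumed to be smart pointers because the loop above dereferences each one before calling faceValue().

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the post's Die; only faceValue() comes from the post.
class Die {
public:
    explicit Die(int value) : value_(value) {}
    int faceValue() const { return value_; }
private:
    int value_;
};

typedef std::vector<std::shared_ptr<Die> > DiceList;

// Same composition as the boost::bind version, using std::bind.
// The inner bind calls faceValue() on the second argument (the current element);
// the outer bind adds that value to the first argument (the running total).
int totalWithBind(const DiceList& dice) {
    using namespace std::placeholders;
    return std::accumulate(dice.begin(), dice.end(), 0,
        std::bind(std::plus<int>(), _1, std::bind(&Die::faceValue, _2)));
}

// The same fold written with a lambda instead of nested binds.
int totalWithLambda(const DiceList& dice) {
    return std::accumulate(dice.begin(), dice.end(), 0,
        [](int sum, const std::shared_ptr<Die>& die) { return sum + die->faceValue(); });
}

// Small usage example: two dice showing 3 and 4 total 7 either way.
void exampleTotals() {
    DiceList dice;
    dice.push_back(std::make_shared<Die>(3));
    dice.push_back(std::make_shared<Die>(4));
    assert(totalWithBind(dice) == 7);
    assert(totalWithLambda(dice) == 7);
}
```

The nested bind mirrors the Boost version one for one; the lambda is arguably easier to read, which may be a reason to prefer it when C++11 is available.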

To see how to go from the first version to the final version with lots of steps in between: http://schuchert.wikispaces.com/cpptraining.SummingAVector.

This is a first draft. I’ll be cleaning it up over the next few days. If you see typos, or if anything is not clear from the code, please let me know where. Also, if my interpretation of what boost is doing under the covers (there’s not much of that) is wrong, please correct me.

Thanks!

Categories: Mentoring

TDD in Clojure

Thu, 03/06/2010 - 19:33

OO is a tell-don’t-ask paradigm. Yes, I know people don’t always use it that way, but one of Kay’s original concepts was that objects were like cells in a living creature. The cells in a living creature do not ask any questions. They simply tell each other what to do. Neurons are tellers, not askers. Hormones are tellers not askers. In biological systems, (and in Kay’s original concept for OO) communication was half-duplex.

Clojure is a functional language. Functional languages are ask-don’t-tell. Indeed, the whole notion of “tell” is to change the state of the system. In a functional program there is no state to change. So “telling” makes little sense.

When we use TDD to develop a tell-don’t-ask system, we start at the high level and write tests using mocks to make sure we are issuing the correct “tells”. We proceed from the top of the system to the bottom of the system. The last tests we write are for the utilities at the very bottom.

In an ask-don’t-tell system, data starts at the bottom and flows upwards. The operation of each function depends on the data fed to it by the lower level functions. There is no mocking framework. So we write tests that start at the bottom, and we work our way up to the top.

Therein lies the rub.

In a tell-don’t-ask system, the tells at the high level are relatively complex. They branch out into lower subsystems getting simpler, but more numerous as they descend. Testing these tells using mocks is not particularly difficult because we don’t need to depend on the lower level functions being there. The mocks make them irrelevant.

In an ask-don’t-tell system the asks at the low level are simple, but as the data moves upwards it gets grouped and composed into lists, maps, sets, and other complex data structures. At the top the data is in its most complex form. Writing tests against that complex data is difficult at best. And there is currently no way to mock out the lower levels[1], so all tests written at the high level depend on all the functions below.

The perception of writing tests from the bottom to the top can be horrific at first. Consider, for example, the Orbit program I just wrote. This program simulates N-body gravitation. Imagine that I am writing tests at the top level. I have three bodies at position Pa, Pb, and Pc. They have masses Ma, Mb, and Mc. They have velocity vectors of Va, Vb, Vc. The test I want to write needs to make sure that new positions Pa’, Pb’, Pc’, and new Velocity vectors Va’, Vb’, and Vc’ are computed correctly. How do I do that?

Should I write a test that looks like this?

    test-update {
      Pa = (1,1)
      Ma = 2
      Va = (0,0)
      Pb = (1,2)
      Mb = 3
      Vb = (0,0)
      Pc = (4,5)
      Mc = 4
      Vc = (0,0)

      update-all

      Pa should == (1.096, 4.128)
      Va should == (0.096, 3.128)
      Pb should == (1.1571348402636772, 0.1571348402636774)
      Vb should == (0.15713484026367727, -1.8428651597363226)
      Pc should == (3.834148869802242, 4.818148869802242)
      Vc should == (-0.16585113019775796, -0.18185113019775795)
    }

A test like this is awful. It’s loaded with magic numbers, and secret information. It tells me nothing about how the update-all function is working. It only tells me that it generated certain numbers. Are those numbers correct? How would I know?

But wait! I’m working in a functional language. That means that every function I call with certain inputs will always return the same value; no matter how many times I call it. Functions don’t change state! And that means that I can write my tests quite differently.

How does update-all work? Simple, given a list of objects it performs the following operations (written statefully):

    update-all(objects) {
      for each object in objects {
        accumulate-forces(object, objects)
      }
      for each object in objects {
        accelerate(object)
        reposition(object)
      }
    }

This is written in stateful form to make it easier for our non-functional friends to follow. First we accumulate the force of gravity between all the objects. This amounts to evaluating Newton’s F=Gm1m2/r^2 formula for each pair of objects, and adding up the force vectors.

Then, for each object we accelerate that object by applying the force vector to its mass, and adding the resultant delta-v vector to its velocity vector.

Then, for each object we reposition that object by applying the velocity vector to its current position.
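For readers who think more easily in C++, the three steps just described can be sketched as functions that return new values instead of changing their arguments. This is not the Orbit code: the Vec2 and Body types, the function names, G = 1, and a time step of 1 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { double x; double y; };

inline Vec2 add(Vec2 a, Vec2 b) { return Vec2{a.x + b.x, a.y + b.y}; }
inline Vec2 sub(Vec2 a, Vec2 b) { return Vec2{a.x - b.x, a.y - b.y}; }
inline Vec2 scale(Vec2 v, double k) { return Vec2{v.x * k, v.y * k}; }

// Hypothetical body record; the field names are not taken from the Orbit source.
struct Body {
    Vec2 position;
    double mass;
    Vec2 velocity;
    Vec2 force;
};

const double G = 1.0; // assumed gravitational constant, in made-up units

// Gravity on `target` due to `source`: magnitude G*m1*m2/r^2, pointing at `source`.
Vec2 forceBetween(const Body& target, const Body& source) {
    Vec2 d = sub(source.position, target.position);
    double r2 = d.x * d.x + d.y * d.y;
    assert(r2 > 0.0); // two bodies at the same point would divide by zero
    double r = std::sqrt(r2);
    double magnitude = G * target.mass * source.mass / r2;
    return scale(d, magnitude / r); // d / r is the unit direction
}

// Returns a copy of `body` whose force is the sum of the pulls from all other bodies.
Body accumulateForces(const Body& body, const std::vector<Body>& all) {
    Body result = body;
    result.force = Vec2{0.0, 0.0};
    for (std::size_t i = 0; i < all.size(); ++i) {
        Vec2 d = sub(all[i].position, body.position);
        if (d.x == 0.0 && d.y == 0.0)
            continue; // same position: treat it as the body itself
        result.force = add(result.force, forceBetween(body, all[i]));
    }
    return result;
}

// Returns a copy with delta-v = force / mass added to the velocity (time step of 1).
Body accelerate(const Body& body) {
    Body result = body;
    result.velocity = add(body.velocity, scale(body.force, 1.0 / body.mass));
    return result;
}

// Returns a copy moved by its velocity (time step of 1).
Body reposition(const Body& body) {
    Body result = body;
    result.position = add(body.position, body.velocity);
    return result;
}
```

Because none of these functions modifies its input, each can be tested on its own with a couple of hand-picked bodies, and calling one twice with the same arguments gives the same answer.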

Here’s the clojure code for update-all

    (defn update-all [os]
      (reposition-all (accelerate-all (calculate-forces-on-all os))))

In this code you can clearly see the bottom-to-top flow of the application. First we calculate forces, then we accelerate, and finally we reposition.

Now, what do these -all functions look like? Here they are:

    (defn calculate-forces-on-all [os]
      (map #(accumulate-forces % os) os))

    (defn accelerate-all [os]
      (map accelerate os))

    (defn reposition-all [os]
      (map reposition os))

If you don’t read clojure, don’t worry. The map function simply creates a new list from an old list by applying a function to each element of the old list. So in the case of reposition-all it simply calls reposition on each element of the list of objects (os), producing a new list of objects that have been repositioned.
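If C++ is more familiar, the closest standard-library relative of map is std::transform. Here is a minimal sketch, with the caveat that mapAll and addOne are hypothetical names, that this version is eager where Clojure’s map is lazy, and that it keeps the element type fixed for simplicity.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Builds a new vector by applying f to every element of input.
// The input vector is left untouched.
template <typename T, typename F>
std::vector<T> mapAll(const std::vector<T>& input, F f) {
    std::vector<T> output;
    output.reserve(input.size());
    std::transform(input.begin(), input.end(), std::back_inserter(output), f);
    return output;
}

int addOne(int x) { return x + 1; }

// Small usage example: [1 2 3] mapped through addOne gives [2 3 4].
void exampleMapAll() {
    std::vector<int> input;
    input.push_back(1);
    input.push_back(2);
    input.push_back(3);
    std::vector<int> output = mapAll(input, addOne);
    assert(output.size() == 3 && output[0] == 2 && output[1] == 3 && output[2] == 4);
    assert(input[0] == 1); // the old list is unchanged
}
```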

From this we can determine that the function of update-all is to call the three functions (accumulate-forces, accelerate, and reposition) on each element of the input list, producing a new list.

Notice how similar that is to a statement we might make about a high level method in an OO program. (It’s got to call these three functions on each element of the list). In an OO language we would mock out the three functions and just make sure they’d been called for each element. The calculations would be bypassed as irrelevant.

Oddly, we can make the same statement in clojure. Here’s the test for update-all

    (testing "update-all"
      (let [o1 (make-object ...)
            o2 (make-object ...)
            o3 (make-object ...)
            os [o1 o2 o3]
            us (update-all os)]
        (is (= (nth us 0) (reposition (accelerate (accumulate-forces o1 os)))))
        (is (= (nth us 1) (reposition (accelerate (accumulate-forces o2 os)))))
        (is (= (nth us 2) (reposition (accelerate (accumulate-forces o3 os)))))))

If you don’t read clojure don’t worry. All this is saying is that we test the update-all function by calling the appropriate functions for each input object, and then see if the elements in the output list match them.

In an OO program we’d find this dangerous because of side-effects. We couldn’t be sure that the functions could safely be called without changing the state of some object in the system. But in a functional language it doesn’t matter how many times you call a function. So long as you pass in the same data, you will get the same result.

So this test simply checks that the appropriate three functions are getting called on each element of the list. This is exactly the same thing an OO programmer would do with a mock object!
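The same style of test can be sketched in C++ as long as the functions involved are free of side effects. Everything below is a toy: the Obj type uses integers so that equality is exact, the “force” is a made-up sum of offsets rather than Newton’s formula, and all names are hypothetical. The point is only the shape of the check, which recomputes the expected element from the lower-level functions instead of hard-coding numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy one-dimensional object; integer fields keep comparisons exact.
struct Obj {
    long position;
    long velocity;
    long force;
};

bool operator==(const Obj& a, const Obj& b) {
    return a.position == b.position && a.velocity == b.velocity && a.force == b.force;
}

// Toy "force": the sum of signed offsets to every object (not real physics).
Obj accumulateForces(const Obj& o, const std::vector<Obj>& all) {
    Obj result = o;
    result.force = 0;
    for (std::size_t i = 0; i < all.size(); ++i)
        result.force += all[i].position - o.position;
    return result;
}

Obj accelerate(const Obj& o) { Obj r = o; r.velocity += o.force; return r; }
Obj reposition(const Obj& o) { Obj r = o; r.position += o.velocity; return r; }

// The function under test: applies the three steps to every element.
std::vector<Obj> updateAll(const std::vector<Obj>& os) {
    std::vector<Obj> updated;
    updated.reserve(os.size());
    for (std::size_t i = 0; i < os.size(); ++i)
        updated.push_back(reposition(accelerate(accumulateForces(os[i], os))));
    return updated;
}

// The check in the style of the post: each output element must equal the
// composition of the three lower-level functions applied to its input element.
void checkUpdateAll(const std::vector<Obj>& os) {
    std::vector<Obj> us = updateAll(os);
    assert(us.size() == os.size());
    for (std::size_t i = 0; i < os.size(); ++i)
        assert(us[i] == reposition(accelerate(accumulateForces(os[i], os))));
}
```

A check like this verifies the wiring, not the arithmetic. It only makes sense when accumulateForces, accelerate, and reposition each have their own tests.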

Is TDD necessary in Clojure?

If you follow the code in the Orbit example, you’ll note that I wrote tests for all the computations, but did not write tests for the Swing-Gui. This is typical of the way that I work. I try to test all business rules, but I “fiddle” with the GUI until I like it.

If you look carefully you’ll find that amidst the GUI functions there are some “presentation” functions that could have been tested, but that I neglected to write with TDD[2]. These functions were the worst to get working. I continuously encountered NPEs and Illegal Cast exceptions while trying to get them to work.

My conclusion is that Clojure without TDD is just as much a nightmare as Java or Ruby without TDD.

Summary

In OO we tend to TDD our way from the top to the bottom by using Mocks. In Clojure we tend to TDD our way from the bottom to the top. In either case we can compose our tests in terms of the functions they should call on the lower level objects. In the case of OO we use mocks to tell us if the functions have been called properly. This protects us from side-effects and allows us to decouple our tests from the whole system. In clojure we can rely on the fact that the language is functional, and that no matter how many times you call a function it will return the same value.

1 Brian Marick is working on something that looks a lot like a mocking framework for clojure. If his ideas pan out, we may be able to TDD from the top to the bottom in Clojure.

2 This is an unconscious game we all play with ourselves. When we have a segment of code that we consider to be immune to TDD (like GUI) then we unconsciously move lots of otherwise testable code into that segment. Yes, I heard my green band complain every time I did it; but I ignored it because I was in the GUI. Whoops.

Categories: Mentoring

Orbit in Clojure

Thu, 03/06/2010 - 00:08

I spent the last two days (in between the usual BS) writing a simple orbital simulator in Clojure using Java interop with Swing. This was a very pleasant experience, and I like the way the code turned out – even the swing code!

You can see the source code here

Those of you who are experienced with Clojure, I’d like your opinion on my use of namespaces and modules and other issues of style.

Those of you who are not experienced with Clojure, should start. You might want to use this application as a tutorial.

And just have fun watching the simulation of the coalescence of an accretion disk around a newly formed star.

Categories: Mentoring

A Coverage Metric That Matters

Fri, 28/05/2010 - 12:39

How much test coverage should your code have? 80%? 90%? If you’ve been writing tests from the beginning of your project, you probably have a percentage that hovers around 90%, but what about the typical project? The project which was started years ago, and contains hundreds of thousands of lines of code? Or millions of lines of code? What can we expect from it?

One of the things that I know is that in these code bases, one could spend one’s entire working life writing tests without doing anything else. There’s simply that much untested code. It’s better to write tests for the new code that you write and write tests for existing code you have to change, at the time you have to change it. Over time, you get more coverage, but your coverage percentage isn’t a goal. The goal is to make your changes safely. In a large existing code base, you may never get more than 20% coverage over its lifetime.

Changes occur in clusters in applications. There’s some code that you will simply never change and there’s other areas of code which change quite often. The other day it occurred to me that we could use that fact to arrive at a better metric, one that is a bit less disheartening and also gives us a sense of our true progress.

The metric I’m thinking about is percentage of commits on files which are covered by tests relative to the number of commits on files without tests.

In the beginning, you can expect to have a very low percentage, but as you start to get tests in place for changes that you make, your percentage will rise rapidly. If you write tests for all of your changes, it will continue to rise. At a certain point, you may want to track only a window of commits, say, the commits which have happened only in the last year. When you do this, you can end up very close to 100%.
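As a rough sketch of how such a number might be computed, assume the version-control history has already been flattened into one record per file touched per commit, and that we have the set of files that currently have tests. The FileChange type and the changeCoverage name are hypothetical, and reading the metric as “share of all file changes that landed on files with tests” is one interpretation of the definition above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One record per file touched by a commit (hypothetical shape).
struct FileChange {
    std::string commitId;
    std::string path;
};

// Percentage of file changes that landed on files which currently have tests.
// Returns 0 when there is no history to measure.
double changeCoverage(const std::vector<FileChange>& changes,
                      const std::set<std::string>& filesWithTests) {
    if (changes.empty())
        return 0.0;
    std::size_t covered = 0;
    for (std::size_t i = 0; i < changes.size(); ++i)
        if (filesWithTests.count(changes[i].path) > 0)
            ++covered;
    return 100.0 * covered / changes.size();
}

// Small usage example: three of four changes touch a file with tests.
void exampleChangeCoverage() {
    std::vector<FileChange> changes;
    changes.push_back(FileChange{"c1", "billing.cpp"});
    changes.push_back(FileChange{"c2", "billing.cpp"});
    changes.push_back(FileChange{"c3", "billing.cpp"});
    changes.push_back(FileChange{"c3", "legacy_report.cpp"});
    std::set<std::string> filesWithTests;
    filesWithTests.insert("billing.cpp");
    assert(changeCoverage(changes, filesWithTests) == 75.0);
}
```

To track only a window of commits, the caller would filter the list of changes by date before passing it in; the calculation itself stays the same.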

If you think this through, it might seem a bit dodgy on a couple of fronts. The first is that having tests for code within a file does not mean that that code is completely covered by those tests. But, I often find that the hardest part of getting started with unit testing is getting classes isolated enough from their dependencies to be testable in a harness at all. Another dodgy bit is the fact that once you get some tests in place for a file, all of the commits you’ve ever done for that file count in the percentage. Again, that’s okay for fundamentally the same reason. Once you start getting coverage, you are in a good position with that particular code.

What about the moving window? If you track this metric over, say, the last N months of commits, you’ll progressively lose information about the code that you just aren’t changing. To me, that’s fine. Coverage matters for the code that we are changing.

The metric I’m considering (maybe we can call it ‘change coverage’) gives us information about how tests are really impacting our day to day work. Moreover, it’s likely that it would be a good motivational tool, and really, that’s one of the things that a good metric should be.

Categories: Mentoring

Hello World Revisited

Thu, 20/05/2010 - 17:12

Surprising revelations while taking a TDD approach to writing hello world.

Here it is, nearly 21 years since I started writing in C++ (and more for C), and I realize I’ve been blindly writing main functions that actually do something.

This insanity must stop!

What am I talking about? Read it here.

Categories: Mentoring