Archive for the ‘Uncategorized’ Category

Refactoring template bloat

July 19, 2009

Refactoring: Extract Type Invariant Base From Template Class.

You have a C++ template class. Some of the class methods either fully or partially don’t depend on template parameters. Also the class might have fields not dependent on template parameters.

Separate template parameter dependent code from template parameter independent one by extracting a template independent superclass

Motivation:

Templates are quite a popular feature of C++ facilitating generic programming. However sometimes developers get so fascinated and carried away by templates that begin dumping their code indiscriminately into template functions and classes. That approach poses a risk of creating so called template bloat.

Template bloat is a result of excessive code instantiations that might significantly increase program memory footprint and compilation time.Template bloat resulting from excessive and indiscriminate use of templates might lead to performance degradation and system resources waste.

Also another possible negative side effect of template bloat might be unnecessary exposure of implementation details (as some compilers require all template code be in a header file).

One of the ways to prevent or reduce a potential template bloat is to separate template parameter independent code from the template parameter dependent one.

uml13

Mechanics

First let’s introduce some useful notions here. Consider a template class A

template
class A
{
   foo (T*t) { _t[0]=t;};
   foo1 (int i) { _t[i]=_t[0];};
   foo2 (int i) { _i=i;};

   T* _t[10];
   int   _i;
};

We’ll use a notion of

  • “methods with template dependent signature” for methods that either have template type arguments or return a template type value. Such as A::foo;
  • “methods with template independent signature” for methods that neither have template type arguments nor return a template type value. Such as A::foo1 and A::foo2;
  • “methods with template dependent implementation” for methods that refer in the implementation to the template parameter type. Such as A::foo and A::foo1;
  • “methods with template independent implementation” for methods that do not refer in the implementation to the template parameter type. Such as A::foo2;
  • “template dependent fields” for fields referring to a template parameter in their declaration. Such as _t;
  • “template independent fields” for fields not referring to a template parameter in their declaration. Such as _i;

Now we are ready to explain the mechanics of the refactoring.

  1. Create a blank non-template superclass; make the original class a subclass of this superclass.
  2. Pull up all template independent fields to the superclass. Declare those fields public in the superclass
  3. Examine implementation of all methods with template independent signature. If their implementation code does not depend on the template type parameter, pull those methods up to the superclass as is. Declare those methods public in the superclass
  4. Examine implementation of methods with template independent signature that were left in the original class on the previous step. For each method extract template dependent code to a separate method with template independent signature if possible (and if it makes sense). Declare them private.
  5. The newly created on step 4 methods have template independent signature but their implementation does depend on a template parameter. Therefore they cannot be pulled up to the superclass, but they can be declared abstract there. Declare them private virtual abstract in the superclass.
  6. Pull up to the superclass the methods that no longer have template dependent implementation after step 4.
  7. Now examine what should be the visibility of those fields and methods that were pulled up to the superclass. Check whether it can be reduced to protected or private. One can rely on compiler for that.
  8. Examine methods with template parameter signature. If there could be extracted methods with template independent signature from them do that and repeat steps 3-6.
  9. Examine which methods implementation in the superclass should be moved to an implementation file and which should stay implemented in the superclass header file.
  10. On each step compile and test.

Example

For the sake of simplicity I decided to make a bit artificial example out of an example published in Bruce Eckel’s Thinking in C++

http://www.codeguru.com/cpp/tic/tic0209.shtml

Method “swap” might be not typical for a Stack class but I added it for the sake of technique demonstration.

template
class Stack
{
public:
    Stack() {size = 0;}
    void push(T* t){_stack[size] = t;increaseSize();}
    T* pop() {decreaseSize();return _stack[size];}
    void swap(int start, int end)
    {
        if(end > size)
            return;
         for(;end > start;)
         {
             T* tmp = _stack[end];
             _stack[end] = _stack[start];
             _stack[start] = tmp;
             start++;
             end--;
         }
     }
    void printAll()
    {
        std::cout << " size: " << size << std::endl;
        for(int i = 0; i < size; i++)
        {
            std::cout << "element: " << i << " " << *_stack[i] << std::endl;
        }
    }
private:
    void increaseSize() {size++;}
    void decreaseSize() {size--;}
private:
    T* _stack[100];
    int size;
};

1. First we create a blank superclass BaseStack and make Stack derive from it

class BaseStack
{

};

template
class Stack : public BaseStack
{
//……… methods and fields declarations ……
};

2. Next, pull up template independent field size to the superclass

class BaseStack
{
    public:
        int size;

};

template
class Stack : public BaseStack
{
//……… methods and fields declarations ……
// size field is removed;
};

3. Next, pull up all methods with template independent signature, which implementation does not depend on the template parameter to the superclass. In our case those methods happen to be increaseSize and decreaseSize.

Note that having the field size in the superclass allows us also to move the implementation of Stack constructor to the constructor of BaseStack

class BaseStack
{
    public:
        BaseStack() {size = 0;}
        void increaseSize() {size++;}
        void decreaseSize() {size--;}
        int size;

};

template
class Stack : public BaseStack
{
public:
    Stack() : BaseStack() {}
    void push(T* t){_stack[size] = t;increaseSize();}
    T* pop() {decreaseSize();return _stack[size];}
    void swap(int start, int end)
    {
        if(end > size)
            return;
         for(;end > start;)
         {
             T* tmp = _stack[end];
             _stack[end] = _stack[start];
             _stack[start] = tmp;
             start++;
             end--;
         }
     }
    void printAll()
    {
        std::cout << " size: " << size << std::endl;
        for(int i = 0; i < size; i++)
        {
            std::cout << "element: " << i << " " << *_stack[i] << std::endl;
        }
    }
private:
    T* _stack[100];
};

4. Now consider methods swap and printAll. Those methods signatures do not depend on a template parameter, but their implementations do. Let’s deal with swap first and extract a template dependent method swap2Elements. Note that this method signature does not depend on a template parameter. The declarations and implementations of other Stack methods left after step 3 as well as class BaseStack are not shown as they are the same

template
class Stack : public BaseStack
{

//………other methods and fields declarations are not affected……
public:
    void swap2Elements(int start, int end)
    {
        T* tmp = _stack[end];
        _stack[end] = _stack[start];
        _stack[start] = tmp;
    }
    void swap(int start, int end)
    {
        if(end > size)
            return;
         for(;end > start;)
         {
             swap2Elements(start, end);
             start++;
             end--;
         }
     }
};

5. Next as swap2Elements method signature does not depend on a template parameter it can be declared abstract in the superclass. At this step that’s all we do.

class BaseStack
{
    public:
        BaseStack() {size = 0;}
        void increaseSize() {size++;}
        void decreaseSize() {size--;}

        int size;
    private:
        virtual void swap2Elements(int start, int end) = 0;
};

6. Now we can pull up swap method to the superclass

class BaseStack
{
    public:
        BaseStack() {size = 0;}
        void increaseSize() {size++;}
        void decreaseSize() {size--;}
        void swap(int start, int end)
        {
            if(end > size)
                return;
             for(;end > start;)
             {
                 swap2Elements(start, end);
                 start++;
                 end--;
             }
         }
        int size;
      private:
        virtual void swap2Elements(int start, int end) = 0;

};

template
class Stack : public BaseStack
{
public:
    Stack() : BaseStack() {}
    void push(T* t){_stack[size] = t;increaseSize();}
    T* pop() {decreaseSize();return _stack[size];}
    void printAll()
    {
        std::cout << " size: " << size << std::endl;
        for(int i = 0; i < size; i++)
        {
            std::cout << "element: " << i << " " << *_stack[i] << std::endl;
        }
    }

private:
    void swap2Elements(int start, int end)
    {
        T* tmp = _stack[end];
        _stack[end] = _stack[start];
        _stack[start] = tmp;
    }
private:
    T* _stack[100];
};

Now we can do the same trick with printAll method. For that we make a new method printIthElement

class BaseStack
{
    public:
        BaseStack() {size = 0;}
        void increaseSize() {size++;}
        void decreaseSize() {size--;}
        void swap(int start, int end)
        {
            if(end > size)
                return;
             for(;end > start;)
             {
                 swap2Elements(start, end);
                 start++;
                 end--;
             }
         }
        void printAll()
        {
            std::cout << " size: " << size << std::endl;
            for(int i = 0; i < size; i++)
            {
                std::cout << "element: " << i << " ";
                printIthElement(std::cout, i);
                std::cout <<std::endl;
            }
        }

        int size;
     private:
        virtual void swap2Elements(int start, int end) = 0;
        virtual void printIthElement(std::ostream& ostr, int i ) = 0;

};

template
class Stack : public BaseStack
{
public:
    Stack() : BaseStack() {}//{size = 0;}
    void push(T* t){_stack[size] = t;increaseSize();}
    T* pop() {decreaseSize();return _stack[size];}

private:
    void printIthElement(std::ostream& ostr, int i )
    {
        ostr << *_stack[i];
    }
    void swap2Elements(int start, int end)
    {
        T* tmp = _stack[end];
        _stack[end] = _stack[start];
        _stack[start] = tmp;
    }
private:
    T* _stack[100];
};

Note the amount of code in the template class is significantly reduced now

7. Finally examine what visibility of methods and fields we really need in the superclass. In many cases one can rely on compiler for that. A quick analysis helps to establish that the visibility of the field size as well as of methods increaseSize and decreaseSize can be made protected.

class BaseStack
{
    public:
        BaseStack() {size = 0;}
        void swap(int start, int end)
        {
            if(end > size)
                return;
             for(;end > start;)
             {
                 swap2Elements(start, end);
                 start++;
                 end--;
             }
         }
        void printAll()
        {
            std::cout << " size: " << size << std::endl;
            for(int i = 0; i < size; i++)
            {
                std::cout << "element: " << i << " ";
                printIthElement(std::cout, i);
                std::cout <<std::endl;
            }
        }

    protected:
        void increaseSize() {size++;}
        void decreaseSize() {size--;}
        // can be made private providing we add a getter and a setter
        int size;
    private:
        virtual void swap2Elements(int start, int end) = 0;
        virtual void printIthElement(std::ostream& ostr, int i ) = 0;

};

Those who are unhappy with the field size protected visibility, consider encapsulating it.

The last step concludes refactoring of the original class

________________________________________________________

I didn’t create any methods with template parameter dependent signature that can be split into template parameter dependent and template parameter independent parts (as described in steps 8 and 9 of the refactoring mechanics) but with a little imagination one can easily create one. Consider for instance a “hybrid” method

pushAndSwap(T* t, int start, int end)
{
    push(t);
   swap(start, end);
}

The chosen example of the Stack class might look a bit non natural but in my past experience I have encountered quite convoluted cases of template bloat.

For instance I still remember a template class with more than thousand lines of code all in a header file. Many methods of the class had template parameter independent signature. The case was even more complicated by another class nested into that template that also depended on the same template parameter. When the developer was asked why template independent part was not separated from the template dependent one the answer was that he didn’t know how to untangle all those dependencies. It took me a few hours to demonstrate how the problem can be solved. The final template dependent code amounted in size to roughly only 20% of the size of the original one.

That story and some others that I encountered in my past prompted me to write this little publication describing the technique.

The method of preventing a template bloat by separating template dependent part from template independent is not new and was described previously, but I haven’t seen a good clear step by step description of refactoring approach in a format similar to that used in Martin Fowler’s “Refactoring: Improving the Design of Existing Code” or Joshua Kirievsky’s “Refactoring to Patterns” books.

That’s why I thought it might be worthwhile to publish this technique in the format that in reality has become a standard for refactoring descriptions.
________________________________________________________

Thanks to legalize for suggesting the name for the refactoring and improving the UML image