Pages

Saturday, July 4, 2015

Implementing a generic vector with C


Vector is a container that can grow as needed unlike arrays or the generic stack implementation that was published in an earlier post. So if you don’t know anything you better look at generic stack post. It discusses the pointer arithmetic with void*, general structure for generic containers and dynamic memory management responsibilities. This is only about new things that come up in memory management of the vector implementation.
First as usual the struct for the vector so that we can make sense of the functions.
#ifndef VECTOR_H_
#define VECTOR_H_
#include <stdio.h>
#include <stdlib.h>
typedef struct{
    int elementSize;
    int size;
    int count;
    void* items;
    void (*memclear)(void*);
}Vector;
void newVector(Vector* v,int
size, int elementSize,void (*memclear)(void*));
void add(Vector* v,void* item);
void get(Vector* v,int
position,void* destination);
void v_dispose(Vector* v);
void v_remove(Vector* v,int
position,void* destination);
#endif
Here elementSIze is the size of an item that would be inserted and size variable holds the current size of the vector. Count is number of items in the vector, items is a pointer to the first byte of memory in the heap allocated for our vector and memclear is a function pointer to deal with cleaning the memory of items held in the vector.
Now the struct is done. Although there are many functions I am only interested in only two they are add and remove functions, others are pretty much the same as in our generic stack. So, first let’s see what happens in the add function.
void add(Vector* v,void* item){
    if(v->elementSize+v->size<=v->count*v->elementSize){
       
v->size=v->size+v->elementSize;
       
v->items=realloc(v->items,v->size);
        add(v,item);
    }else{
        void* dest=(char*)
v->items+v->count*v->elementSize;
        memcpy(item,dest,v->elementSize);
    }
}
If block of the add function executes if we don’t have enough memory to hold the new item. Therefore we have to allocate enough memory to hold the new item and here I’m just allocating memory to hold one more item. A better way might be to double the current size of the vector so that frequent reallocation does not happen. Now the question is why use realloc instead of malloc function. The reason is quite simple; when we use realloc it would expand the size without relocating if there is enough contiguous memory blocks available on heap after the last byte of current allocation. This saves us from copying all items to new block and freeing the memory of older one. If there is not enough contiguous memory copying and freeing would be handled by the realloc function. The else block is simple enough right. Just copy the content of our new item to the next free block of memory in our vector.
Now let’s move to the remove function. Idea of the function is to remove an item from the given position, hand it back to the client so that client owns the item which implies that he is responsible for managing memory for that item and rearrange the vector so that there are no empty slots before the last element in the vector. So, here is the code.
void v_remove(Vector* v,int
position,void* item){
    if(position>=v->count){
        return;
    }
    void*
dest=(char*)v->items+v->elementSize*position;
    memcpy(dest,item,v->elementSize);
    if(v->count > position){
       
memmove(dest,(char*)dest+v->elementSize,(v->count-position)*v->elementSize);
    }
    v->count--;
} 
First thing is we just return blindly if the given index is not filled in our vector using the first if block. Next if it is there we would copy the item to the given address using memcpy. Notice the pointer arithmetic with the void*. If you don’t understand it see the generic stack. So, now we have given it to the caller, then we should remove it from the vector. But if the item that we removed is not the last one in the vector it would create a haul or unused block of memory in the vector. Therefore we need to move or shift elements that are located after the removed element. That’s where the fancy little memmove function comes in. this function moves a block of memory. Of course we could have used memcopy in this situation, but memmove is fancier where we don’t have to worry about the overlapping of memory blocks. So, as a rule of thumb I usually consider using memcpy when we want to copy items in to a separate location where there is no possibility of overlapping.

That is that. If there are any errors pls let me know. Copilation happens gracefully when we use void type but there we cheat the compiler on its type system. Huh!