Pages

Saturday, July 4, 2015

Implementing a generic vector with C


Vector is a container that can grow as needed unlike arrays or the generic stack implementation that was published in an earlier post. So if you don’t know anything you better look at generic stack post. It discusses the pointer arithmetic with void*, general structure for generic containers and dynamic memory management responsibilities. This is only about new things that come up in memory management of the vector implementation.
First as usual the struct for the vector so that we can make sense of the functions.
#ifndef VECTOR_H_
#define VECTOR_H_
#include <stdio.h>
#include <stdlib.h>
typedef struct{
    int elementSize;
    int size;
    int count;
    void* items;
    void (*memclear)(void*);
}Vector;
void newVector(Vector* v,int
size, int elementSize,void (*memclear)(void*));
void add(Vector* v,void* item);
void get(Vector* v,int
position,void* destination);
void v_dispose(Vector* v);
void v_remove(Vector* v,int
position,void* destination);
#endif
Here elementSIze is the size of an item that would be inserted and size variable holds the current size of the vector. Count is number of items in the vector, items is a pointer to the first byte of memory in the heap allocated for our vector and memclear is a function pointer to deal with cleaning the memory of items held in the vector.
Now the struct is done. Although there are many functions I am only interested in only two they are add and remove functions, others are pretty much the same as in our generic stack. So, first let’s see what happens in the add function.
void add(Vector* v,void* item){
    if(v->elementSize+v->size<=v->count*v->elementSize){
       
v->size=v->size+v->elementSize;
       
v->items=realloc(v->items,v->size);
        add(v,item);
    }else{
        void* dest=(char*)
v->items+v->count*v->elementSize;
        memcpy(item,dest,v->elementSize);
    }
}
If block of the add function executes if we don’t have enough memory to hold the new item. Therefore we have to allocate enough memory to hold the new item and here I’m just allocating memory to hold one more item. A better way might be to double the current size of the vector so that frequent reallocation does not happen. Now the question is why use realloc instead of malloc function. The reason is quite simple; when we use realloc it would expand the size without relocating if there is enough contiguous memory blocks available on heap after the last byte of current allocation. This saves us from copying all items to new block and freeing the memory of older one. If there is not enough contiguous memory copying and freeing would be handled by the realloc function. The else block is simple enough right. Just copy the content of our new item to the next free block of memory in our vector.
Now let’s move to the remove function. Idea of the function is to remove an item from the given position, hand it back to the client so that client owns the item which implies that he is responsible for managing memory for that item and rearrange the vector so that there are no empty slots before the last element in the vector. So, here is the code.
void v_remove(Vector* v,int
position,void* item){
    if(position>=v->count){
        return;
    }
    void*
dest=(char*)v->items+v->elementSize*position;
    memcpy(dest,item,v->elementSize);
    if(v->count > position){
       
memmove(dest,(char*)dest+v->elementSize,(v->count-position)*v->elementSize);
    }
    v->count--;
} 
First thing is we just return blindly if the given index is not filled in our vector using the first if block. Next if it is there we would copy the item to the given address using memcpy. Notice the pointer arithmetic with the void*. If you don’t understand it see the generic stack. So, now we have given it to the caller, then we should remove it from the vector. But if the item that we removed is not the last one in the vector it would create a haul or unused block of memory in the vector. Therefore we need to move or shift elements that are located after the removed element. That’s where the fancy little memmove function comes in. this function moves a block of memory. Of course we could have used memcopy in this situation, but memmove is fancier where we don’t have to worry about the overlapping of memory blocks. So, as a rule of thumb I usually consider using memcpy when we want to copy items in to a separate location where there is no possibility of overlapping.

That is that. If there are any errors pls let me know. Copilation happens gracefully when we use void type but there we cheat the compiler on its type system. Huh!

Friday, July 3, 2015

Implementing a generic C Stack


In an earlier post I included the header file and why all kinds of variables should be there in the struct of our stack. This post is about the implementation of the functions and their prototypes.
First lets look at the implementation and then we will see why this and that happens in a particular way.
#include <stdio.h>
#include <stdlib.h>
#include "stack.h"

void newStack(Stack* stack,int size,int elementSize,void
(*clearmem)(void* )){
   
stack->size=size;
    stack->currentIndex=0;
   
stack->items=malloc(size*elementSize);
   
stack->elementSize=elementSize;
   
stack->clearmem=clearmem;
}
void push(Stack* stack,void* item){
   
if(stack->currentIndex>=stack->size){
       
printf("not enough space");
        return;
    }else{
        void*
addr=(char*)stack->items+(stack->currentIndex*stack->elementSize);
       
memcpy(item,addr,stack->elementSize);
       
stack->currentIndex++;
    }
}
void pop(Stack* stack,void* destination){
   
if(stack->currentIndex<=0){
       
printf("sorry empty stack");
        return;
    }else{
       
stack->currentIndex--;
       
memcpy((char*)stack->items+(stack->currentIndex*stack->elementSize),destination,stack->elementSize);
    }
}
void dispose(Stack* stack){
    if(stack->clearmem!=NULL){
       
while(stack->currentIndex>=0){
           
stack->currentIndex--;
           
stack->clearmem((char*)stack->items+(stack->currentIndex*stack->elementSize));
        }
    }
   
free(stack->items);
}

Now the newStack function dosent do anything interesting instead it just initializes the stuff in our struct. The push function has some interesting stuff. The idea of the function is we copy everything from the starting address given with void* to the stack covering the length of a single item (we got it when we initialized our stack). Interesting thing here is how to compute the address within our stack memory block.
Void dosent associate size information. That is if you have int* I and you say i+1 you would get the address of the byte that is 4 bytes a way from the starting address. This happens because we know that int is 4 bytes long (normally). But since we are dealing with void first we cast it in to a char* so that it would be assumed as one byte long and pointer arithmetic would be like normal calculation. That is if we have char* c and we do c+1 it would give the address of the byte that is 1 byte away from the starting address.
Now to the pop function. Why that is needed and address to place the item in the stack instead of just returning the address within our stack. Well think of what happens when one push another item after the pop we would overwrite it. That’s not what we want right?. Other thing is the pointer arithmetic which is mentioned above.
Finally, the dispose function destroys the stack and its content. We are not responsible for freeing up memory for items that was poped instead we would do the cleanup for items that are still in the stack by using the memclear function given to us by our client at the initialization. After cleaning items we would free the memory that was allocated for our stack using free function.

Now its done

Thursday, July 2, 2015

Creating a generic C stack

                Creating generic containers is little tough in c as compared to Java and C++. Templates in C++ would give you different containers for each type that you use where Java gives you only one container which is capable of holding any type due to its singly rooted object base class hierarchy. Generic containers in C are bit like Java containers in a sense that it would give you only one container as opposed to the container for each type in C++.
                It is tricky because we have to deal with void type for variables where we lose the size information and malloc for memory allocation which only deals with raw number of bytes. But it is fun alright!
                So first lets see the header file and see why this and that is there. I mean the purpose of having all kinds of stuff in header.
#ifndef STACK_H_
#define STACK_H_

#include <stdio.h>
#include <stdlib.h>

typedef struct {

    int size;

    int currentIndex;

    int elementSize;
    void* items;
    void
(*clearmem)(void*);
} Stack;

void newStack(Stack* stack ,int size,int elementSize,void
(*clearmem)(void* ));
void push(Stack* stack,void* item);
void pop(Stack* stack,void* destination);
void dispose(Stack* stack);
#endif
Let’s first look at the struct. The size variable defines max number of items that we are going to keep in our stack. The elementSize variable holds the size of a single item that would be kept in our stack. This is needed because we are going generic and we would be taking void type items, therefore we don’t have size information using lets say sizeof .  The currentIndex keep track of the number of elements in the stack and will be used for pointer arithmetic to locate addresses of items in push and pop operations. Void* items refer to a memory location in the heap which points to the starting address of our stack. Last void (*clearmem)(void*) is a function pointer which knows how to free the memory used by items stored in the stack. This is needed because we don’t know how to free the memory if our stack is holding pointers to strings in heap or pointers to structs etc. if we only deals with data types such as int, double etc. this is not needed because it would totally be within our stack.
Let’s recap what we have at the moment. We have size and elementSize which collectively defines the amount of memory or number of bytes that is needed for our stack. Then we have currentIndex and items to find out addresses of the items inside the stack and elementSize tells us how far from these addresses that we need to consider as single item.

Wow! This is getting long. Function prototypes and implementation next time.
Note: this is an old post moved here from a different location.