Archive | Vector RSS for this section

On ArrayLists and Vectors

Long ago, when I first started doing Java, I struggled against Java’s limitations regarding the resizing of arrays. Eventually after poking around for solutions, I found the ArrayList. Before that, I found the Vector, but for whatever reason, NetBeans was saying that Vector was deprecated, so I settled on ArrayList, and it was fine.

A few years later, I took my first class on C++, and learned about the std::vector. I was also introduced to the concepts of the vector and the linked list. Needless to say, I was confused. If vectors and linked lists aren’t the same thing, then what is this ArrayList thing? At the time, I elected not to pursue the question, needing to focus on figuring out these “pointer” things they were making us learn. I put it out of my mind.

Flash forward to today. This morning I was travelling the internet, and I came across a forum thread. The topic of the ArrayList came up. Curiosity finally overcame me, and I looked into the issue.

So, About Those ArrayLists…

The Java Collections framework is broken up into various datatype interfaces. There is an interface for lists, queues, maps, sets, and deqeues.

You’ll notice that there is no vector interface. While a list and an array have different semantics and use cases, you can use them roughly in the same way. “Array indexing” can be implemented on a list by iterating to the nth node, and arrays can be traversed like a list. Since there’s nothing a list can do that a vector can’t and vice versa, it follows that they can both implement the same interface. Sun had to make a choice: name that interface “list” or “vector”. They went with list.

Given that Java’s vector type implements the list interface, it follows that they would reflect that in the name: ArrayList. For all intents and purposes, ArrayList is equivalent to C++’s std::vector. ArrayList implements the Collections framework’s list interface, but it is a growable array. It behaves like a vector, indexing is cheap, resizing is expensive. Java also provides a LinkedList that behaves like a linked list.

So, what about Java’s Vector? It seems that Vector predates the Collections framework. After the introduction of the Collections framework Vector was retrofitted to be a part of it, also implementing the list interface. So, what’s the difference between ArrayList and Vector? Vector is synchronized, and therefore is safe to use in threaded code. Due to this it is significantly slower than ArrayList.

Simply put: use ArrayList in single-threaded code, and Vector in multi-threaded code.

Why Names Matter

This is why names matter. One might not think much of it, but imagine how many hiring managers would be out an interview question if ArrayList and Vector had been named Vector and SynchronizedVector respectively?

Implementing a Vector in C++: Working Smarter, Not Harder

You may remember when I created a vector in C. Yep, just another day in the life… How about we re-invent this wheel again?

This time we’ll be creating a vector in C++. This is much easier to do in C++ thanks to generics and operator overloading. Since this is a template class, we’ll be implementing the class in our .h file, forgoing the .cpp file. First we declare our class. Ensure that you declare it a template class.

namespace clt {
namespace dmp {

template <typename T>
class vector {

..words..

};
}
}

Remember namespaces? Gotta have those namespaces. Aside from that, nothing particularly special about this block. Next I’ll go over the methods and members of this class.

First up: our destructor. Nothing special going on here, we’re just going to delete our array pointer to ensure proper Resource Acquisition is Initialization (RAII). We’re going to let it be virtual to allow for subclassing.

virtual ~vector()
{
delete [] backing_array;
}

Next I’ll discuss the private method: initialize(). This method is called by all constructors and does 3 things: sets backing_array_size, resize_increment, and news up our array:

void initialize(const unsigned size, const unsigned increment)
{
backing_array_size = size;
resize_increment = increment;
backing_array = new T[backing_array_size];
}

Next up is our default constructor. It sets size and increment to 10. I like 10, it’s a nice, noble number. The kind of number you’d bring home to Mom and Dad:

vector()
{
initialize(10, 10);
}

Next we’ll talk about a private method: array_copy(). This method does pretty much what it says: it iterates through the original array and copies its contents into the new array:

void array_copy(T * orig_array, T * new_array)
{
for (int count = 0; count < backing_array_size; count++)
{
new_array[count] = orig_array[count];
}
}

Next to talk about is our copy constructor. It initializes the new vector with the old vector’s size and increment, then calls array_copy:

vector(const vector& orig)
{
initialize(orig.backing_array_size, orig.resize_increment);
array_copy(orig.backing_array, backing_array);
}

Moving along to operator=, which basically functions the same as the copy constructor, aside from the obligatory return statement:

vector& operator=(const vector& orig)
{
initialize(orig.backing_array_size, orig.resize_increment);
array_copy(orig.backing_array, backing_array);
return * this;
}

Next up: resize(). This private method creates a new T array sized backing_array_size+resize_increment. Then, it calls array_copy to copy the contents of the old array to the new. Finally, it increments backing_array_size by resize_increment, calls delete [] on backing_array, and sets backing_array to the new array:

void resize()
{
T * new_b_a = new T[backing_array_size + resize_increment];
array_copy(backing_array, new_b_a);
backing_array_size = backing_array_size + resize_increment;
delete [] backing_array;
backing_array = new_b_a;
}

Which brings us to operator[]. This method indexes into backing_array. If to_get >= size, it calls resize() until to_get is a valid index. It should be noted that this method does not prevent you from attempting to read into a null pointer:

T& operator[](const unsigned to_get)
{
while (to_get >= backing_array_size)
{
resize();
}
return backing_array[to_get];
}

Finally, we have a setter for resize_increment and a getter for backing_array_size:

void set_resize_increment(const unsigned to_set)
{
resize_increment = to_set;
}

const unsigned size()
{
return backing_array_size;
}

Lastly, this class has 3 private fields: a pointer array to the contents of the vector, the size of the array, and the resize increment:

T * backing_array;
unsigned backing_array_size;
unsigned resize_increment;

…and that’s all there is to it. No casting void pointers, no manual resource allocation, and no unweildly method names. Or you could just use std::vector I guess, but what would be the fun in that?

Implementing a Vector in C With Pointers: Taming the Beast

It was an unusually cold spring morning, that fateful April day, and the air was tense. The village had been bracing for a hard year, and the threat of a pointed menace was lurking about. But enough was enough. After decades of living in terror of this undefined horror, one man had the courage to say no more!

Chris set out to slay the beast.

The C programming language has these terrible things called pointers. Pointers are declared like so:

int * foo;
foo = malloc(sizeof(int));
*foo = 1;

… and used like …

printf("foo points to %d", *foo);

… outputting …

foo points to 1

Miss any of that, and things go off the rails. Miss the first line and your program won’t compile. Miss the second, and your pointer is null, resulting in a Undefined Behavior if you try to use it. Miss the third, and you haven’t assigned a value to your pointer. If you try to dereference this pointer, you’ll get Undefined Behavior.

Having had this undefined creature get the drop on him more often than he’s proud of, Chris gave himself an exercise: create a “generic” vector in C using void pointers. This vector should have all the usual resizing fanciness, and should take any type. This being C, and not having access to C++ templates, makes it a bit more difficult. Chris created the following struct to represent his vector:

typedef struct
{
void * contents;
int length;
int increment_by;
size_t size;
} vector;

Going down the list: contents is the array backing the vector, length is the size of the contents array, increment_by is how much vector_resize() should increase the array size by, and size is the sizeof() the type contained by this vector. By itself, this struct contains all the bits required to make this vector work, but as it stands it’s just an annoying to use array. Chris couldn’t even trust himself to not mess this up, what chance did the villagers have? He created helper functions.

This function populates the initial values of all the vector members, and returns a pointer to this newly created vector.

vector * vector_new(size_t type_size, int initial_size, int resize_inc)
{
vector * new_vec = malloc(sizeof(vector));
new_vec->length = initial_size;
new_vec->size = type_size;
new_vec->contents = malloc(new_vec->size * new_vec->length);
new_vec->increment_by = resize_inc;
return new_vec;
}

This function handles freeing all allocated memory.

void vector_delete(vector * to_delete)
{
free(to_delete->contents);
free(to_delete);
}

This function handles accessing the vector. Were this C++ Chris would just overload the [] operator. if to_get is greater than or equal to to_get_from->length it calls vector_resize() to increase the size of the vector. The pointer returned by this function needs to be casted to a pointer of the appropriate type.

void * vector_access(int to_get, vector * to_get_from)
{
while (to_get >= to_get_from->length)
vector_resize(to_get_from);
return accessor(to_get_from->contents, to_get, to_get_from->size);
}

This function resizes the contents array. It creates a new larger array, and copies the contents of the old array to the new one.
Since you can’t use pointer arithmetic on a void pointer, accessor() is used to access the array. The pointer returned by accessor() is casted to char*, and copied byte-for-byte into the new array.

void vector_resize(vector * to_resize)
{
int count;
int pointer_count;
void * new_cont = malloc (to_resize->size * (to_resize->length + to_resize->increment_by));

for (count = 0; count length; count++)
{
char * old = vector_access(count, to_resize);
char * new = accessor(new_cont, count);
for (pointer_count = 0; pointer_count size; pointer_count++)
{
*(new + pointer_count) = *(old + pointer_count);
}
}

free(to_resize->contents);
to_resize->contents = new_cont;
to_resize->length = to_resize->length + to_resize->increment_by;
}

Finally, this function handles indexing into the void pointer array. This is accomplished by casting the pointer to char*, and accessing the correct index by multiplying the index by to_access->size.

void * accessor(void * to_access, int index)
{
char * working = (char*) to_access;
void * return_value = (void*) (working + (index * to_access->size));
return return_value;
}

Together with his vector, Chris heroically sets off to make the night just a little bit less terrifying…

%d bloggers like this: