Archive for August, 2009

Why might a set API be useful

Why use a set api rather than a list API if one doesn’t care about performance

In most languages in built set classes have better performance than lists. But suppose one doesn’t care about performance – why would you use a set rather than a list.

Broadly speaking a set is distinguished from a list by the lack of

a) Ordering
b) Duplication of elements.

How can these properties be useful. In a sense it isn’t true that sets don’t have an ordering, since in most languages you can iterate over the members of a set – rather they just don’t have a very good ordering. So this is not really of any use to us.

So the only thing that could be useful about a set api is that it allows one to avoid duplication of items without writing any code to do this. This will be useful for any problems in which you explicitly want to avoid duplication, examples of this include:

* Only wanting to email a number of people once
* Only wanting to rerun tests once
* Only wanting to list each result once (but having an algorithm that generates duplicates).

This concept of avoiding duplication can be generalised

A dictionary allows one to avoid duplication of parts of an object. That is, if we consider two objects to be near enough identical only parts of them match then we can use these parts as keys in a dictionary – and thereby avoid duplication.

In fact we can encapsulate this behaviour in an equivalence set – a collection with ensures uniqueness up to some equivalence function (if our elements are immutable we could use the following).

class EquivalenceSet(object):
def __init__(self, getRepr):
self.getRepr = getRepr # get representation of equivalence class
self.mapping = {}

def add(self, el):
self.mapping[self.getRepr(el)] = el

def __iter__(self, el):
return self.mapping.itervalues()

This assumes that out “uniqueness” can somehow be represented by a function such that f(x) = f(y) implies x and y are equivalent . However, sometimes we have more interesting notions of not “unique” for example if we are representing objects in a plane – two objects might only be considered “unique” if they do not intersect. If we are in this situation it may not be clear what we should do when elements are not unique.

August 3, 2009 at 5:02 pm Leave a comment

August 2009
« Jul   Nov »