Can you convert these two recursive methods to iterative?

This is the place for queries that don't fit in any of the other categories.

Can you convert these two recursive methods to iterative?

Postby russ » Wed Mar 06, 2013 10:26 am

Code: Select all
class StringsAndThings(object):

    def __init__(self):
        self.tree = self.__randtree__()

    def __repr__(self, tree=None, root=True):
        sep, a, b = self.tree if tree == None else tree
        if type(a) != str:
            a = self.__repr__(a, False)
        if type(b) != str:
            b = self.__repr__(b, False)
        return "(%s %s %s)" % (a, sep, b)

    def __randtree__(self, depth=5):
        if depth == 0:
            return random.choice('abc')
        else:
            sep = random.choice('+-*/^')
            if random.randint(0, 1):
                a = self.__randtree__(depth-1)
            else:
                a = random.choice('abc')
            if random.randint(0, 1):
                b = self.__randtree__(depth-1)
            else:
                b = random.choice('abc')
            return sep, a, b


I'm refactoring a dead project I took on from last year but didn't finish. This is more or less what it looks like. If I can see the iterative solution to these two methods in a way that is clean and efficient I'd be sorted.
russ
 
Posts: 18
Joined: Sat Mar 02, 2013 8:59 am

Re: Can you convert these two recursive methods to iterative

Postby setrofim » Wed Mar 06, 2013 11:12 am

OK, a few points:
  • Methods that start and end with double underscore are "special" in python.
    • There is a pre-existing set of those, each with a specific purpose, and you shouldn't really be inventing your own.
    • You would not typically call the special methods directly, instead they would be invoke by the corresponding function/operator. e.g. ___repr__() method would be invoked by the repr() function.
    • You should not override special methods to take a different number of parameters from what they would normally expect (e.g. __repr__ take only the self param).
  • If you want to mark a method as internal to your class' implementation, a common convention is to use a single leading underscore. This won't be enforced by the Python interpreter but will server as a hint to those reading your code. You could also use double leading underscores (but not ending!), which would actually make the method inaccessible outside the class, but this is generally discouraged.
  • Is there a particular reason you want an iterative solution? Tree generation is one the tasks that lends itself particularly well to recursive implementations; an iterative implementation would probably be less clean than a well-implemented recursive one.

With that in mind, here is a cleaned-up recursive solution:
Code: Select all
import random


class StringsAndThings(object):

    def __init__(self, depth):
        depth = random.choice([0, depth])
        self.is_leaf = depth == 0
        if self.is_leaf:
            self.a = random.choice('abc')
        else:
            self.sep = random.choice('+-*/^')
            self.a = StringsAndThings(depth-1)
            self.b = StringsAndThings(depth-1)

    def __repr__(self):
        if self.is_leaf:
            return self.a
        else:
            return "(%s %s %s)" % (self.a, self.sep, self.b)


print StringsAndThings(5)


EDIT: My attempt at an iterative solution (I'm a little rusty, so may be someone else on the forum can come up with a neater one):

Code: Select all
import random


class StringsAndThings(object):

    def __init__(self, depth):
        if not depth:
            self.root = None
            return
        self.root = TreeNode()
        previous_level = [self.root]
        depth -= 1
        while depth:
            this_level = []
            for node in previous_level:
                if random.choice([0, 1]):
                    node.sep = random.choice('+-*/^')
                    node.a = TreeNode()
                    node.b = TreeNode()
                    this_level.extend([node.a, node.b])
                else:
                    node.a = random.choice('abc')
            previous_level = this_level
            depth -= 1
        for node in previous_level:
            node.a = random.choice('abc')

    def __repr__(self):
        if self.root is None:
            return "()"
        outstring = "%s"
        nodes = [self.root]
        while nodes:
            next_nodes = []
            substrings = []
            for node in nodes:
                if node.sep:
                    next_nodes.extend([node.a, node.b])
                    substrings.append('(%%s %s %%s)' % node.sep)
                else:
                    substrings.append(node.a)
            nodes = next_nodes
            outstring = outstring % tuple(substrings)
        return outstring


class TreeNode(object):

    def __init__(self):
        self.sep = None
        self.a = None
        self.b = None


print StringsAndThings(5)


As you can see, this is pretty ugly, will be slower than the recursive solutions and, for smaller depths at least, is going to consume more memory. For a very large number of nodes, the memory cost of un-pop'd stack frames for recursive calls is going to overtake the cost of the lists used to maintain the intermediate state; that is pretty much the only situation where you would even consider doing this.
setrofim
 
Posts: 288
Joined: Mon Mar 04, 2013 7:52 pm

Re: Can you convert these two recursive methods to iterative

Postby russ » Wed Mar 06, 2013 1:48 pm

That is a very informative answer, thanks!

I seem to have forgotten that __init__ is an actual function that can do things. :lol:

This is really for a module that is will attempt to find the sequence function for a given number sequence using a genetic algorithm.

While refactoring I thought it would be nice if I had my own datatype from a class called Expression so started fiddling with __repr__ and such but not sure yet.

When I was testing the original code it seemed too slow so I thought it must be because of all the recursion going on and I heard it's not good to use recursion with python because function calls are expensive. But I'll stick with it if an iterative solution is going to be slower.

This is my old code.
Code: Select all
from random import choice, random


class ExpressionFormConstructor(object):
    """
    BNF specificaton:
        <expr_form>     ::=     (<operator>, <operand>, <operand>)
        <operator>      ::=     "+" | "-" | "*" | "/" | "^"
        <operand>       ::=     "e" | "c" | "x"
        "e"             ::=     <expr_form>
        "c"             ::=     int
        "x"             ::=     x
    """
    non_terminating_expr_forms = (
        ('+', 'x', 'e'),
        ('-', 'x', 'e'),
        ('*', 'x', 'e'),
        ('/', 'x', 'e'))

    terminating_expr_forms = (
        ('+', 'x', 'c'),
        ('-', 'x', 'c'),
        ('*', 'x', 'c'),
        ('/', 'x', 'c'),
        ('^', 'x', 'c'))
       
    all_expr_forms = non_terminating_expr_forms + terminating_expr_forms

    def rand_form(self, depth=0):
        """
        rand_form() -> tuple(expr_form), list(constants)
        depth refers to maximum nested depth of expr_form
        """
        constants = []
        if depth is 0:
            expr_form = list(choice(self.terminating_expr_forms))
        else:
            expr_form = list(choice(self.all_expr_forms))

        if expr_form[1] is 'c':
            constants.append(0)
        elif expr_form[1] is 'e':
            expr_form[1], con = self.rand_form(depth - 1)
            constants.extend(con)
        else:
            pass

        if expr_form[2] is 'c':
            constants.append(0)
        elif expr_form[2] is 'e':
            expr_form[2], con = self.rand_form(depth - 1)
            constants.extend(con)
        else:
            pass
        return tuple(expr_form), constants


class ExpressionForm(object):

    symb_to_op = {'+':lambda a, b: a + b,
                  '-':lambda a, b: a - b,
                  '*':lambda a, b: a * b,
                  '/':lambda a, b: float(a) / b,
                  '^':pow}

    def __init__(self, expr_form, constants):
        self.expr_form = expr_form
        self.constants = constants # number of constants.
        self.population = [] # each individual is a unique tuple of constants.
        self.graveyard = set() # contains hash values for failed individuals.

    def __str__(self, expr_form=None, root=True):
        expr_form = self.expr_form if expr_form is None else expr_form
        symb, A, B = expr_form
        A = A if A in 'cx' else self.__str__(A, False)
        B = B if B in 'cx' else self.__str__(B, False)
        return ('%s %s %s' if root else '(%s %s %s)') % (A, symb, B)

    def evaluate(self, x, ind, expr=None, c=0):
        """
        x -> int, ind -> from self.population
        """
        symb, A, B = expr if expr else self.expr_form
        op = self.symb_to_op[symb]
        root = True if expr is None else False

        if A is 'c':
            A = ind[c]
            c += 1
        elif A is 'x':
            A = x
        else: # A is an expr_form
            A, c = self.evaluate(x=x, ind=ind, expr=A, c=c)

        if B is 'c':
            B = ind[c]
            c += 1
        elif B is 'x':
            B = x
        else: # B is an expr_form
            B, c = self.evaluate(x=x, ind=ind, expr=B, c=c)

        return op(A, B) if root else (op(A, B), c)


# ---------- testing ---------->



##from random import randint
##
##Constructor = ExpressionFormConstructor()
##
##def test():
##    Expr = ExpressionForm(*Constructor.rand_form(3))
##
##    # populate
##    for ind in range(100):
##        Expr.population.append(tuple([randint(1, 10) for constant in Expr.constants]))
##
##    # test and remove error prone individuals
##    for ind in range(len(Expr.population)):
##        try:
##            print [Expr.evaluate(x, Expr.population[ind]) for x in range(10)]
##        except (OverflowError, ZeroDivisionError, ValueError) as err:
##            print err
##            Expr.population[ind] = False
##    Expr.population = [ind for ind in Expr.population if ind]
##    print "lost:", 100 - len(Expr.population), "individuals"
##
##while True:
##    raw_input()
##    test()



I'm not sure if this is a good design but this is what it's supposed to do.

I delegated the task of generating random expression forms to a class on it's own because if you do it completely randomly there's a few errors that occur as some forms are error prone or just not valid. like (x ^ e) where e is another expression form itself or (c / x) where c is some constant and x can be 0.

ExpressionForm instances will contain a form attribute like (x + (c * x)), and a constants attribute which should be a list of integers for all the constants in the form. Constants are kept separate from the form because I thought it would be simpler and more efficient when making changes to the constants.

A third class called GA or something will will handle the main algorithm.

procedure:
- The third class which I haven't yet made called GA will create a population of unique ExpressionForm instances.
- ExpressionForm instances will have a method optimize(seq, runtime) which will apply a simple stochastic optimization algorithm to find the best set of constants to use with the expression form for the given sequence. This is why constants are kept in a separate list from the form.
- ExpressionForm instances will have a fitness rank which is determined after the simple optimization is done.
- Some selection takes place
- Crossover/mutation
- etc


I really don't know yet. Can you suggest a better layout?
russ
 
Posts: 18
Joined: Sat Mar 02, 2013 8:59 am

Re: Can you convert these two recursive methods to iterative

Postby setrofim » Wed Mar 06, 2013 3:33 pm

russ wrote:When I was testing the original code it seemed too slow so I thought it must be because of all the recursion going on and I heard it's not good to use recursion with python because function calls are expensive.

Code could be slow for a large number of reasons. Without profiling the code, it's usually hard to tell why, however the cost of function calls is unlikely to be hight on the list.

russ wrote: But I'll stick with it if an iterative solution is going to be slower.

Having actually benchmarked the two solutions, they appear to run in about the same amount of time. The recursive solution does start hitting recursion limit somewhere after depth 240 (that's up to 2**240 or about ~10**75, or a Duodecillion, nodes!). If you're not planning to evaluate expressions that big, I'd stick with the recursive implementation for the sake of clarity.

russ wrote:Can you suggest a better layout?

Your description is kinda vague, but your general approach seems sound. I think you might be over-complicating things with the Expression constructor though. Maybe start writing the outline for your GA class to get the overall shape of the program; that will give you a better idea what a good design for your Expression class might be.

Given that you want to implement crossover, representing Expressions as trees seems reasonable. Traversing the tree for each evaluation will impact performance however. As an optimization (get your code to work correctly first!), you might wanna look inot "pre-compiling" your expression trees (e.g. using the ast module). This is a bit of an advanced topic though, so if you're just starting with Python, don't worry about it.
setrofim
 
Posts: 288
Joined: Mon Mar 04, 2013 7:52 pm


Return to General Coding Help

Who is online

Users browsing this forum: Yoriz and 1 guest