
Adaboost in Python

Updated on July 10, 2012

The Adaboost Algorithm

Below is a Python implementation of the famous Adaboost algorithm. Adaboost, originally developed by Freund and Schapire in 1995, combines rules of thumb to compute an ensemble estimate. The underlying idea is that many weak learners can be combined into a single strong learner. One analogy I was taught describes a horse handicapper explaining how he picks winners: basic rules, such as a horse doing well against the rail or in the mud, often contradict each other, so a meta-algorithm is needed to track the individual predictive quality of each rule and generate a consensus. That is exactly how Adaboost works.
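In symbols, each accepted rule h_t earns a vote weight alpha_t from its weighted error rate epsilon_t, and the ensemble predicts the sign of the weighted vote; these are exactly the quantities computed in set_rule and evaluate in the code below:

\alpha_t = \tfrac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t},
\qquad
H(x) = \operatorname{sign}\Big(\sum_t \alpha_t\, h_t(x)\Big)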

The implementation described here classifies points in 2-D space as either positive or negative. Rules are supplied as lambdas, so any callable that maps a point to +1 or -1 will do, which makes extensions using more complicated models (neural nets, perceptrons, etc.) especially easy to craft.
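For example, a hand-written threshold test and a more elaborate decision function plug in interchangeably, as long as each returns +1 or -1. A quick sketch (the linear rule's coefficients are made-up values for illustration, not something fit to the data below):

# A rule maps a 2-D point to +1 or -1. An axis-threshold rule, in the
# same form as the lambdas passed to set_rule() further down:
threshold_rule = lambda x: 2 * (x[0] < 1.5) - 1

# A linear, perceptron-style rule plugs in the same way; the coefficients
# 0.8, -0.6 and the intercept 1.0 are arbitrary here.
linear_rule = lambda x: 1 if 0.8 * x[0] - 0.6 * x[1] + 1.0 > 0 else -1

print(threshold_rule((1, 2)), linear_rule((1, 2)))   # -> 1 1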

Adaboost Algorithm

from numpy import array, ones, zeros, log, exp, sign

class AdaBoost:

    def __init__(self, training_set):
        self.training_set = training_set
        self.N = len(self.training_set)
        # Start with uniform weights over the training examples.
        self.weights = ones(self.N) / self.N
        self.RULES = []
        self.ALPHA = []

    def set_rule(self, func, test=False):
        # Weighted error of the candidate rule under the current weights.
        errors = array([t[1] != func(t[0]) for t in self.training_set])
        e = (errors * self.weights).sum()
        if test: return e
        # Low-error rules get a larger say (alpha) in the final vote.
        alpha = 0.5 * log((1 - e) / e)
        print('e=%.2f a=%.2f' % (e, alpha))
        w = zeros(self.N)
        for i in range(self.N):
            # Up-weight misclassified examples, down-weight correct ones.
            if errors[i]: w[i] = self.weights[i] * exp(alpha)
            else: w[i] = self.weights[i] * exp(-alpha)
        self.weights = w / w.sum()
        self.RULES.append(func)
        self.ALPHA.append(alpha)

    def evaluate(self):
        NR = len(self.RULES)
        for (x, l) in self.training_set:
            # Weighted vote of all accepted rules; the ensemble is correct
            # when the sign of the vote matches the label.
            hx = [self.ALPHA[i] * self.RULES[i](x) for i in range(NR)]
            print(x, sign(l) == sign(sum(hx)))
        
if __name__ == '__main__':

    examples = []
    examples.append(((1,  2  ), 1))
    examples.append(((1,  4  ), 1))
    examples.append(((2.5,5.5), 1))
    examples.append(((3.5,6.5), 1))
    examples.append(((4,  5.4), 1))
    examples.append(((2,  1  ),-1))
    examples.append(((2,  4  ),-1))
    examples.append(((3.5,3.5),-1))
    examples.append(((5,  2  ),-1))
    examples.append(((5,  5.5),-1))

    m = AdaBoost(examples)
    m.set_rule(lambda x: 2*(x[0] < 1.5)-1)
    m.set_rule(lambda x: 2*(x[0] < 4.5)-1)
    m.set_rule(lambda x: 2*(x[1] > 5)-1)
    m.evaluate()
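Running this prints each rule's weighted error and alpha, then True for every training point, since the weighted vote of the three rules classifies the toy set correctly. The test=True flag in set_rule also makes it easy to let the booster pick rules itself: score a pool of candidate stumps and commit to the one with the lowest weighted error each round. A minimal sketch reusing the AdaBoost class and examples above (the candidate thresholds and the three rounds are arbitrary choices for illustration):

# Greedy rule selection: at each round, score every candidate with
# test=True (which leaves the weights untouched), then commit to the best.
# Each candidate votes +1 on one side of an axis threshold; s flips the direction.
booster = AdaBoost(examples)
candidates = [lambda x, a=a, t=t, s=s: s * (2 * (x[a] < t) - 1)
              for a in (0, 1) for t in (1.5, 3.0, 4.5, 5.0) for s in (1, -1)]
for _ in range(3):
    best = min(candidates, key=lambda f: booster.set_rule(f, test=True))
    booster.set_rule(best)
booster.evaluate()

Because rules are plain callables, the same loop works unchanged if the candidate pool holds perceptrons or small neural nets instead of threshold stumps.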

Applications of Adaboost

Applications of Adaboost are quite varied, reflecting the power of the algorithm (I confess: I'm a huge fan of it). Alfaro et al. used boosting and neural networks to predict corporate defaults. Other applications include machine vision (numerous papers on facial recognition) and some work in finance (Sun et al.).
