Implemented a greedy algorithm for building a classification tree from a given data set. Depending on the user input, it uses either the Gini index or information gain as the splitting criterion to choose the best attribute. The algorithm achieved an average accuracy of 0.95, evaluated with 10-fold cross-validation on 10 different data sets.
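As a rough illustration of the two splitting criteria, the sketch below computes Gini impurity, entropy, and the impurity reduction of a candidate split. The function names (`gini`, `entropy`, `split_score`) are hypothetical and are not taken from ClassficationTree.py. The sketch assumes class labels are given as plain Python lists and that a split is represented as a list of label partitions.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def split_score(parent, partitions, criterion="gini"):
    """Impurity of the parent minus the size-weighted impurity of the partitions.

    With criterion="info" this is information gain; with "gini" it is the
    reduction in Gini impurity. A greedy builder would pick the attribute
    whose split gives the highest score.
    """
    impurity = gini if criterion == "gini" else entropy
    n = len(parent)
    weighted = sum(len(part) / n * impurity(part) for part in partitions)
    return impurity(parent) - weighted


if __name__ == "__main__":
    parent = ["yes", "yes", "no", "no"]
    perfect_split = [["yes", "yes"], ["no", "no"]]
    print(split_score(parent, perfect_split, "gini"))
    print(split_score(parent, perfect_split, "info"))
```

A split that separates the classes perfectly gets the full parent impurity as its score, while a split that leaves the class mix unchanged scores zero.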
python ClassficationTree.py "dataset" "," "gini"
The code accepts the following command-line arguments, in this order:
- The path of the dataset file
- The delimiter used in the file to separate attributes
- A string, either "gini" or "info", that selects the criterion used to choose the best splitting attribute.
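The three arguments above might be read and validated along the following lines. This is only a sketch, not the actual code in ClassficationTree.py: the helper names `parse_args` and `load_rows` are hypothetical, and it assumes the delimiter is a single character, as required by Python's `csv` module.

```python
import csv
import sys


def parse_args(argv):
    """Return (path, delimiter, criterion) from an argv-style list."""
    if len(argv) != 4:
        raise SystemExit('usage: python ClassficationTree.py "dataset" "," "gini"')
    path, delimiter, criterion = argv[1:4]
    if criterion not in ("gini", "info"):
        raise SystemExit('criterion must be "gini" or "info"')
    return path, delimiter, criterion


def load_rows(path, delimiter):
    """Read the data set into a list of rows, skipping blank lines.

    Assumption: csv.reader needs a one-character delimiter.
    """
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter=delimiter) if row]


if __name__ == "__main__":
    path, delimiter, criterion = parse_args(sys.argv)
    rows = load_rows(path, delimiter)
    print(f"Loaded {len(rows)} rows; splitting criterion: {criterion}")
```

Rejecting an unknown criterion up front avoids silently falling back to one of the two measures.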