Assignment 1: Exercises from the Textbook
Complete the following exercises from Chapter 4 of the textbook (p. 104):
(1) Problems 18, 19, and 20.
(2) For Problem 18, you may modify the problem if you implement the algorithm yourself.
You may choose any one of the algorithms ID3, C4.5, or CART.
Problem 18:
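1. Algorithm Description
ID3 selects the attribute with the highest information gain as the splitting
attribute. Following this principle, we first compute the entropy of the initial
set, then compute the entropy that results from splitting on each candidate
attribute. From these values we obtain the information gain of each attribute,
and we choose the attribute with the largest information gain as the splitting
attribute.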
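The two quantities used in this description can be sketched as small helper functions. This is a minimal illustration and not the assignment's own implementation: the names `entropy` and `information_gain` are hypothetical, and the sketch assumes each set is summarized by its per-class record counts rather than by the raw records.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Shannon entropy (in bits) of a class distribution given as per-class counts.
// Empty classes contribute nothing, following the convention 0 * log2(0) = 0.
double entropy(const std::vector<int>& counts) {
    int total = 0;
    for (int c : counts) total += c;
    if (total == 0) return 0.0;
    double h = 0.0;
    for (int c : counts) {
        if (c > 0) {
            double p = static_cast<double>(c) / total;
            h -= p * std::log2(p);
        }
    }
    return h;
}

// Information gain of a split: the parent entropy minus the size-weighted
// average entropy of the subsets produced by the split.
double information_gain(const std::vector<int>& parent,
                        const std::vector<std::vector<int>>& subsets) {
    int total = 0;
    for (int c : parent) total += c;
    assert(total > 0 && "parent set must not be empty");
    double weighted = 0.0;
    for (const auto& s : subsets) {
        int n = 0;
        for (int c : s) n += c;
        weighted += static_cast<double>(n) / total * entropy(s);
    }
    return entropy(parent) - weighted;
}
```

For the 15 records used in this assignment (10 labelled "中" and 5 labelled "高"), `entropy({10, 5})` is about 0.918 bits. Splitting on sex gives class counts {6, 3} for "女" and {4, 2} for "男", so `information_gain({10, 5}, {{6, 3}, {4, 2}})` is essentially zero: both subsets keep the same 2:1 class ratio as the full set.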
2. Implementation (Source Code)
#include <iostream>
#include <cmath>
#include <string>
using namespace std;
#define SIZE 15
// One training record: sex, height, and the class label.
struct Data
{
    string sex;
    double stature;
    string output;
};
// Training set. String values: "女" = female, "男" = male;
// class labels: "中" = medium, "高" = tall.
Data data[SIZE]={{"女",1.6,"中"},{"男",2,"中"},
                 {"女",1.9,"高"},{"女",1.88,"高"},
                 {"女",1.7,"中"},{"男",1.85,"中"},
                 {"女",1.6,"中"},{"男",1.7,"中"},
                 {"男",2.2,"高"},{"男",2.1,"高"},
                 {"女",1.8,"中"},{"男",1.95,"中"},
                 {"女",1.9,"高"},{"女",1.8,"中"},{"女",1.75,"中"}};
double calculate(double a,double b);
void origin_entropy(Data data[],double &entropy);
void sex_entropy(Data data[],double &entropy,string &s);
void stature_entropy(Data data[],double &entropy,string &s);
int main()
{
    double origin=0,stature=0,sex=0;
    origin_entropy(data,origin);
    sex_entropy(data,sex,string("sex"));