
Iris: Random Forest Classification Model (RandomForestClassifier)

花匠小妙招
2024-11-06 23:35

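Train on the iris dataset above once more, this time with a random forest classifier (RandomForestClassifier). The requirements are:

1. Read the dataset again with pandas to obtain the feature matrix, and preprocess it: standardize the features and encode the iris class labels.

2. Train a decision tree model on the iris dataset, using 30% of the samples as the test set and 70% as the training set.

3. Set the split criterion (criterion) to "entropy" and the number of trees in the forest (n_estimators) to 10, then print the test-set accuracy to the console. Analyze whether this accuracy is higher than that of the decision tree classifier.

4. To improve the model's generalization, use 10-fold cross-validation to determine the best values of two random forest parameters, one at a time: max_depth (the maximum depth of each tree) and n_estimators (the number of trees). max_depth ranges over 1 to 5 and n_estimators over 1 to 20. Print the best value of each parameter to the console.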
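Requirements 1 to 3 can be sketched roughly as follows. This is only an illustration under a few assumptions: the helper names load_iris_frame and compare_models are hypothetical, the data comes from scikit-learn's bundled copy of the iris dataset as a stand-in for reading iris.data, the scaler is fitted on the training split only, and the seed of 1 mirrors the split used in this post. Tree-based models are largely insensitive to feature scaling, so the standardization step is there mainly to satisfy requirement 1.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier


def load_iris_frame():
    """Stand-in for pd.read_csv('iris.data', header=None) (an assumption).

    Returns a frame with feature columns 0..3 and a string label in column 4.
    """
    iris = load_iris()
    frame = pd.DataFrame(iris.data)
    frame[4] = iris.target_names[iris.target]
    return frame


def compare_models(data, seed=1):
    """Return test accuracy for a decision tree and a 10-tree random forest."""
    x = data[list(range(4))]
    y = LabelEncoder().fit_transform(data[4])  # class names -> 0, 1, 2

    # 70% training, 30% test
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.3, random_state=seed)

    # Standardize with statistics from the training split only
    scaler = StandardScaler().fit(x_train)
    x_train_std = scaler.transform(x_train)
    x_test_std = scaler.transform(x_test)

    models = {
        'decision tree': DecisionTreeClassifier(
            criterion='entropy', random_state=seed),
        'random forest': RandomForestClassifier(
            n_estimators=10, criterion='entropy', random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(x_train_std, y_train)
        results[name] = accuracy_score(y_test, model.predict(x_test_std))
    return results


if __name__ == "__main__":
    for name, acc in compare_models(load_iris_frame()).items():
        print(f'{name} test accuracy: {acc:.4f}')
```

On a dataset this small the two accuracies tend to be close, and the comparison can change with the random split, so a single 70/30 run is weak evidence either way.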

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

if __name__ == "__main__":
    path = 'iris.data'  # path to the data file
    data = pd.read_csv(path, header=None)
    x = data[list(range(4))]  # the four feature columns
    y = LabelEncoder().fit_transform(data[4])  # encode the iris class labels as integers
    x = x.iloc[:, :4]
    # 70/30 split from requirement 2; the cross-validation below scores on the full x, y
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1)

    # Use 10-fold cross-validation to find the best max_depth (maximum tree depth)
    d_scores = []
    for i in range(1, 6):
        model = RandomForestClassifier(n_estimators=10, criterion='entropy', max_depth=i, oob_score=True)
        scores = cross_val_score(model, x, y, cv=10, scoring='accuracy')
        d_scores.append(scores.mean())
    print('Accuracy for max_depth = 1, 2, 3, 4, 5:')
    print(d_scores)
    print('Best accuracy: ', max(d_scores))
    # Depths start at 1, so list index + 1 is the depth
    print('Best max_depth: ', d_scores.index(max(d_scores)) + 1)

    # Use 10-fold cross-validation to find the best n_estimators (number of trees),
    # with max_depth held at 3
    n_scores = []
    for i in range(1, 21):
        model = RandomForestClassifier(n_estimators=i, criterion='entropy', max_depth=3, oob_score=True)
        scores = cross_val_score(model, x, y, cv=10, scoring='accuracy')
        n_scores.append(scores.mean())
    print('Accuracy for n_estimators = 1 to 20:')
    print(n_scores)
    print('Best accuracy: ', max(n_scores))
    print('Best n_estimators: ', n_scores.index(max(n_scores)) + 1)
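The two loops tune each parameter separately, with the other one held fixed. As an alternative, here is a minimal sketch of a joint search over both parameters with GridSearchCV, which the script imports but never uses. The function name grid_search_forest, its arguments, and the fixed random_state are assumptions made for illustration, not part of the original solution.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def grid_search_forest(x, y, depths=range(1, 6), n_trees=range(1, 21),
                       folds=10, seed=0):
    """Cross-validate every (max_depth, n_estimators) pair and return the fitted search."""
    param_grid = {
        'max_depth': list(depths),
        'n_estimators': list(n_trees),
    }
    search = GridSearchCV(
        # random_state is fixed here (an assumption) so repeated runs agree
        RandomForestClassifier(criterion='entropy', random_state=seed),
        param_grid,
        cv=folds,
        scoring='accuracy',
    )
    search.fit(x, y)
    return search
```

Calling grid_search_forest(x, y) with the encoded features and labels from the script above returns a fitted search object whose best_params_ attribute holds the winning pair and whose best_score_ attribute holds its mean cross-validated accuracy. With the default ranges this fits 100 parameter combinations across 10 folds each, so it takes noticeably longer than the two separate loops. Several combinations often tie on iris, in which case the reported pair is just one of the top-ranked candidates.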

Output:
