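Tangles of kudzu overwhelm trees in Georgia, while cane toads threaten habitats in more than a dozen countries worldwide. These are just two of the many invasive species that can have damaging effects on the environment, the economy, and even human health. Despite this widespread impact, efforts to track the location and spread of invasive species are so costly that they are difficult to carry out at scale.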
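Currently, ecosystem and plant distribution monitoring depends on expert knowledge. Trained scientists visit designated areas and take note of the species living there. Relying on such a highly qualified workforce is expensive, time-inefficient, and insufficient, because humans cannot cover large areas when sampling.
In this competition, Kagglers are challenged to develop algorithms that more accurately identify whether images of forests and foliage contain invasive hydrangea. Techniques from computer vision, together with other current technologies such as aerial imaging, can make invasive species monitoring cheaper, faster, and more reliable.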
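4. Model and Evaluation Metric
Model selection
This experiment uses a pretrained resnet34 model. Because the task is binary classification, a linear layer is appended to map the network output to a single value, followed by a sigmoid activation.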
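As a sanity check on the layer summary that follows, a few of its parameter counts can be reproduced by hand. The sketch below is plain Python and makes several assumptions that the summary itself does not state: convolutions are bias-free with 3x3 kernels in the residual blocks and 1x1 kernels on the shortcut path (the standard ResNet-34 layout), each batch-norm row counts four values per channel, and parameters are stored as 4-byte floats. The helper names are illustrative only.

```python
# Hand-check of selected rows of the layer summary (no framework required).

def conv_params(in_ch, out_ch, kernel):
    """Weights of a bias-free 2-D convolution with a square kernel (assumed)."""
    return in_ch * out_ch * kernel * kernel

def batchnorm_entries(channels):
    """Scale, shift, running mean, running variance: four values per channel (assumed)."""
    return 4 * channels

def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output feature."""
    return in_features * out_features + out_features

# (computed value, value reported in the summary)
checks = {
    "Conv2D-2 (3x3, 64 -> 64)": (conv_params(64, 64, 3), 36_864),
    "Conv2D-8 (1x1, 64 -> 128)": (conv_params(64, 128, 1), 8_192),
    "Conv2D-32 (3x3, 512 -> 512)": (conv_params(512, 512, 3), 2_359_296),
    "BatchNorm2D-1 (64 channels)": (batchnorm_entries(64), 256),
    "Linear-1 (512 -> 1000)": (linear_params(512, 1000), 513_000),
    "Linear-2 (1000 -> 1)": (linear_params(1000, 1), 1_001),
}
for name, (computed, reported) in checks.items():
    print(f"{name}: computed={computed}, reported={reported}")

# Totals taken from the summary footer.
total_params = 21_815_697
non_trainable = 34_048
trainable = total_params - non_trainable
params_size_mb = total_params * 4 / 1024 ** 2  # assumes 4 bytes per parameter

print("trainable:", trainable)
print("params size (MB):", round(params_size_mb, 2))
```

The Linear-2 row is the appended binary-classification head: 1000 weights plus one bias, which is where the 1,001 parameters come from.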
Layer (type)         Input Shape           Output Shape        Param #
BatchNorm2D-1        [[32, 64, 112, 112]]  [32, 64, 112, 112]  256
ReLU-5               [[32, 64, 112, 112]]  [32, 64, 112, 112]  0
MaxPool2D-1          [[32, 64, 112, 112]]  [32, 64, 56, 56]    0
Conv2D-2             [[32, 64, 56, 56]]    [32, 64, 56, 56]    36,864
BatchNorm2D-2        [[32, 64, 56, 56]]    [32, 64, 56, 56]    256
ReLU-6               [[32, 64, 56, 56]]    [32, 64, 56, 56]    0
Conv2D-3             [[32, 64, 56, 56]]    [32, 64, 56, 56]    36,864
BatchNorm2D-3        [[32, 64, 56, 56]]    [32, 64, 56, 56]    256
BasicBlock-1         [[32, 64, 56, 56]]    [32, 64, 56, 56]    0
Conv2D-4             [[32, 64, 56, 56]]    [32, 64, 56, 56]    36,864
BatchNorm2D-4        [[32, 64, 56, 56]]    [32, 64, 56, 56]    256
ReLU-7               [[32, 64, 56, 56]]    [32, 64, 56, 56]    0
Conv2D-5             [[32, 64, 56, 56]]    [32, 64, 56, 56]    36,864
BatchNorm2D-5        [[32, 64, 56, 56]]    [32, 64, 56, 56]    256
BasicBlock-2         [[32, 64, 56, 56]]    [32, 64, 56, 56]    0
Conv2D-6             [[32, 64, 56, 56]]    [32, 64, 56, 56]    36,864
BatchNorm2D-6        [[32, 64, 56, 56]]    [32, 64, 56, 56]    256
ReLU-8               [[32, 64, 56, 56]]    [32, 64, 56, 56]    0
Conv2D-7             [[32, 64, 56, 56]]    [32, 64, 56, 56]    36,864
BatchNorm2D-7        [[32, 64, 56, 56]]    [32, 64, 56, 56]    256
BasicBlock-3         [[32, 64, 56, 56]]    [32, 64, 56, 56]    0
Conv2D-9             [[32, 64, 56, 56]]    [32, 128, 28, 28]   73,728
BatchNorm2D-9        [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
ReLU-9               [[32, 128, 28, 28]]   [32, 128, 28, 28]   0
Conv2D-10            [[32, 128, 28, 28]]   [32, 128, 28, 28]   147,456
BatchNorm2D-10       [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
Conv2D-8             [[32, 64, 56, 56]]    [32, 128, 28, 28]   8,192
BatchNorm2D-8        [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
BasicBlock-4         [[32, 64, 56, 56]]    [32, 128, 28, 28]   0
Conv2D-11            [[32, 128, 28, 28]]   [32, 128, 28, 28]   147,456
BatchNorm2D-11       [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
ReLU-10              [[32, 128, 28, 28]]   [32, 128, 28, 28]   0
Conv2D-12            [[32, 128, 28, 28]]   [32, 128, 28, 28]   147,456
BatchNorm2D-12       [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
BasicBlock-5         [[32, 128, 28, 28]]   [32, 128, 28, 28]   0
Conv2D-13            [[32, 128, 28, 28]]   [32, 128, 28, 28]   147,456
BatchNorm2D-13       [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
ReLU-11              [[32, 128, 28, 28]]   [32, 128, 28, 28]   0
Conv2D-14            [[32, 128, 28, 28]]   [32, 128, 28, 28]   147,456
BatchNorm2D-14       [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
BasicBlock-6         [[32, 128, 28, 28]]   [32, 128, 28, 28]   0
Conv2D-15            [[32, 128, 28, 28]]   [32, 128, 28, 28]   147,456
BatchNorm2D-15       [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
ReLU-12              [[32, 128, 28, 28]]   [32, 128, 28, 28]   0
Conv2D-16            [[32, 128, 28, 28]]   [32, 128, 28, 28]   147,456
BatchNorm2D-16       [[32, 128, 28, 28]]   [32, 128, 28, 28]   512
BasicBlock-7         [[32, 128, 28, 28]]   [32, 128, 28, 28]   0
Conv2D-18            [[32, 128, 28, 28]]   [32, 256, 14, 14]   294,912
BatchNorm2D-18       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
ReLU-13              [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-19            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-19       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
Conv2D-17            [[32, 128, 28, 28]]   [32, 256, 14, 14]   32,768
BatchNorm2D-17       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
BasicBlock-8         [[32, 128, 28, 28]]   [32, 256, 14, 14]   0
Conv2D-20            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-20       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
ReLU-14              [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-21            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-21       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
BasicBlock-9         [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-22            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-22       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
ReLU-15              [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-23            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-23       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
BasicBlock-10        [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-24            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-24       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
ReLU-16              [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-25            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-25       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
BasicBlock-11        [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-26            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-26       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
ReLU-17              [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-27            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-27       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
BasicBlock-12        [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-28            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-28       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
ReLU-18              [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-29            [[32, 256, 14, 14]]   [32, 256, 14, 14]   589,824
BatchNorm2D-29       [[32, 256, 14, 14]]   [32, 256, 14, 14]   1,024
BasicBlock-13        [[32, 256, 14, 14]]   [32, 256, 14, 14]   0
Conv2D-31            [[32, 256, 14, 14]]   [32, 512, 7, 7]     1,179,648
BatchNorm2D-31       [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,048
ReLU-19              [[32, 512, 7, 7]]     [32, 512, 7, 7]     0
Conv2D-32            [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,359,296
BatchNorm2D-32       [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,048
Conv2D-30            [[32, 256, 14, 14]]   [32, 512, 7, 7]     131,072
BatchNorm2D-30       [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,048
BasicBlock-14        [[32, 256, 14, 14]]   [32, 512, 7, 7]     0
Conv2D-33            [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,359,296
BatchNorm2D-33       [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,048
ReLU-20              [[32, 512, 7, 7]]     [32, 512, 7, 7]     0
Conv2D-34            [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,359,296
BatchNorm2D-34       [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,048
BasicBlock-15        [[32, 512, 7, 7]]     [32, 512, 7, 7]     0
Conv2D-35            [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,359,296
BatchNorm2D-35       [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,048
ReLU-21              [[32, 512, 7, 7]]     [32, 512, 7, 7]     0
Conv2D-36            [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,359,296
BatchNorm2D-36       [[32, 512, 7, 7]]     [32, 512, 7, 7]     2,048
BasicBlock-16        [[32, 512, 7, 7]]     [32, 512, 7, 7]     0
AdaptiveAvgPool2D-1  [[32, 512, 7, 7]]     [32, 512, 1, 1]     0
Linear-1             [[32, 512]]           [32, 1000]          513,000
ResNet-1             [[32, 3, 224, 224]]   [32, 1000]          0
Dropout-1            [[32, 1000]]          [32, 1000]          0
Linear-2             [[32, 1000]]          [32, 1]             1,001
Sigmoid-2            [[32, 1]]             [32, 1]             0
Flatten-1            [[32, 1]]             [32]                0
Total params: 21,815,697
Trainable params: 21,781,649
Non-trainable params: 34,048
Input size (MB): 18.38
Forward/backward pass size (MB): 2744.86
Params size (MB): 83.22
Estimated Total Size (MB): 2846.45
{'total_params': 21815697, 'trainable_params': 21781649}
Binary Accuracy
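Paddle does not provide a Binary Accuracy metric for evaluating binary classification, so it is implemented here.
The main idea of Binary Accuracy is to apply a threshold to the sigmoid output: if the output is greater than the threshold, it is treated as label 1; otherwise it is treated as label 0.
The implementation here differs slightly: a prediction is counted as correct when the absolute difference between the true label and the sigmoid output is below a threshold. The threshold used in this experiment is 0.5.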
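The two formulations described above can be compared side by side. This is a minimal framework-free sketch, not the Paddle metric used in the experiment; the function names and the sample values are hypothetical and chosen only for illustration.

```python
def accuracy_by_threshold(probs, labels, threshold=0.5):
    """Usual formulation: predict 1 when the sigmoid output exceeds the threshold."""
    hits = sum(int(p > threshold) == y for p, y in zip(probs, labels))
    return hits / len(labels)

def accuracy_by_distance(probs, labels, threshold=0.5):
    """Variant from the text: correct when |label - output| is below the threshold."""
    hits = sum(abs(y - p) < threshold for p, y in zip(probs, labels))
    return hits / len(labels)

# Hypothetical sigmoid outputs and ground-truth labels.
probs = [0.91, 0.12, 0.55, 0.40, 0.08, 0.73]
labels = [1, 0, 0, 1, 0, 1]

print("threshold form:", accuracy_by_threshold(probs, labels))
print("distance form: ", accuracy_by_distance(probs, labels))
```

With a threshold of 0.5 the two functions agree on every sample except an output of exactly 0.5, which the distance form always counts as wrong. For any other threshold they are different criteria, since for a positive label the distance form requires an output above 1 minus the threshold.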
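7. Kaggle Submission Results
8. Summary
The hydrangea dataset is relatively easy: fine-tuning a pretrained model for only 5 epochs reaches a best validation accuracy of 0.9932, and the test-set predictions submitted to Kaggle scored 0.99245. No model ensembling was used in this experiment, so a gap remains to the best result on Kaggle (0.99245 vs 0.99770). Future work will try model ensembling to improve performance further.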