Tomato leaf disease and pest classification dataset: 1.4 GB, more than 20,000 images, 11 classes:
Late blight
Healthy
Early blight
Septoria leaf spot
Tomato yellow leaf curl virus
Bacterial spot
Target spot
Tomato mosaic virus
Leaf mold
Spider mites (two-spotted spider mite)
Powdery mildew
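Tomato Leaf Disease and Pest Classification Dataset
Background
Tomato is one of the most widely grown cash crops in the world, but it is often affected by diseases and pests during growth. Identifying them promptly and accurately is essential for effective control. This dataset contains more than 20,000 images in 11 classes (ten disease or pest types plus healthy leaves), with images evenly distributed across the classes, which makes it suitable for training and evaluating deep learning models.
Application area
AI + plant disease and pest detection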
File directory
tomato_leaf_disease_dataset/
├── train/
│ ├── Late_blight/
│ ├── Healthy/
│ ├── Early_blight/
│ ├── Septoria_leaf_spot/
│ ├── Tomato_yellow_leaf_curl_virus/
│ ├── Bacterial_spot/
│ ├── Target_spot/
│ ├── Tomato_mosaic_virus/
│ ├── Leaf_mold/
│ ├── Spider_mites/
│ ├── Powdery_mildew/
├── valid/
│ ├── Late_blight/
│ ├── Healthy/
│ ├── Early_blight/
│ ├── Septoria_leaf_spot/
│ ├── Tomato_yellow_leaf_curl_virus/
│ ├── Bacterial_spot/
│ ├── Target_spot/
│ ├── Tomato_mosaic_virus/
│ ├── Leaf_mold/
│ ├── Spider_mites/
│ ├── Powdery_mildew/
├── test/
│ ├── Late_blight/
│ ├── Healthy/
│ ├── Early_blight/
│ ├── Septoria_leaf_spot/
│ ├── Tomato_yellow_leaf_curl_virus/
│ ├── Bacterial_spot/
│ ├── Target_spot/
│ ├── Tomato_mosaic_virus/
│ ├── Leaf_mold/
│ ├── Spider_mites/
│ ├── Powdery_mildew/
├── classes.txt
├── README.txt
├── models/
│ └── resnet/
├── src/
│ ├── train.py
│ ├── predict.py
│ ├── utils.py
│ ├── dataset.py
├── weights/
│ └── best_model.pth
├── requirements.txt
└── README.md
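Data description
Dataset size: more than 20,000 images in total.
Training set: 16,000 images.
Validation set: 2,000 images.
Test set: 2,000 images.
Image format: JPG
Classes: 11, with images evenly distributed across the classes.
Preprocessing: the dataset is already split into training, validation, and test sets and can be used for training without further processing.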
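The split sizes and class balance stated above can be checked directly against the folders on disk. The following is a minimal sketch, assuming the train/valid/test layout from the file directory; `count_images` and `imbalance_ratio` are hypothetical helper names, not part of the dataset's own scripts.

```python
from collections import Counter
from pathlib import Path

SPLITS = ("train", "valid", "test")  # split folder names taken from the directory listing


def count_images(root):
    """Return {split: Counter({class_name: number_of_jpg_files})} for a dataset root."""
    counts = {}
    for split in SPLITS:
        split_dir = Path(root) / split
        per_class = Counter()
        if split_dir.is_dir():
            for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
                # Count .jpg/.jpeg files only, ignoring case.
                per_class[class_dir.name] = sum(
                    1 for f in class_dir.iterdir()
                    if f.is_file() and f.suffix.lower() in {".jpg", ".jpeg"}
                )
        counts[split] = per_class
    return counts


def imbalance_ratio(per_class):
    """Largest class count divided by the smallest; 1.0 means perfectly balanced."""
    values = list(per_class.values())
    if not values:
        return float("nan")
    if min(values) == 0:
        return float("inf")  # at least one class folder is empty
    return max(values) / min(values)


if __name__ == "__main__":
    for split, per_class in count_images("tomato_leaf_disease_dataset").items():
        print(split, "total:", sum(per_class.values()), dict(per_class))
        print("  max/min class ratio:", imbalance_ratio(per_class))
```

A ratio close to 1.0 in every split would be consistent with the balanced distribution described above; a much larger value suggests some classes are underrepresented.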
Dependencies (contents of requirements.txt):
torch
torchvision
numpy
pandas
matplotlib
tqdm
pyyaml
opencv-python
Then install the dependencies with the following command:
pip install -r requirements.txt
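After installation, it can be useful to confirm that every package is importable. Below is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper. Two of the listed packages are imported under a different name than the one used with pip: `opencv-python` is imported as `cv2` and `pyyaml` as `yaml`.

```python
import importlib.util

# pip package name -> import name. Most match; opencv-python and pyyaml do not.
REQUIREMENTS = {
    "torch": "torch",
    "torchvision": "torchvision",
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "tqdm": "tqdm",
    "pyyaml": "yaml",
    "opencv-python": "cv2",
}


def missing_packages(requirements=REQUIREMENTS):
    """Return the pip names whose import module cannot be found."""
    return sorted(
        pip_name
        for pip_name, module in requirements.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    print("All dependencies found." if not missing else f"Missing: {', '.join(missing)}")
```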
2. Dataset preparation
Make sure your dataset is organized in the following structure:
tomato_leaf_disease_dataset/
├── train/
│ ├── Late_blight/
│ ├── Healthy/
│ ├── Early_blight/
│ ├── Septoria_leaf_spot/
│ ├── Tomato_yellow_leaf_curl_virus/
│ ├── Bacterial_spot/
│ ├── Target_spot/
│ ├── Tomato_mosaic_virus/
│ ├── Leaf_mold/
│ ├── Spider_mites/
│ ├── Powdery_mildew/
├── valid/
│ ├── Late_blight/
│ ├── Healthy/
│ ├── Early_blight/
│ ├── Septoria_leaf_spot/
│ ├── Tomato_yellow_leaf_curl_virus/
│ ├── Bacterial_spot/
│ ├── Target_spot/
│ ├── Tomato_mosaic_virus/
│ ├── Leaf_mold/
│ ├── Spider_mites/
│ ├── Powdery_mildew/
├── test/
│ ├── Late_blight/
│ ├── Healthy/
│ ├── Early_blight/
│ ├── Septoria_leaf_spot/
│ ├── Tomato_yellow_leaf_curl_virus/
│ ├── Bacterial_spot/
│ ├── Target_spot/
│ ├── Tomato_mosaic_virus/
│ ├── Leaf_mold/
│ ├── Spider_mites/
│ ├── Powdery_mildew/
├── classes.txt
├── README.txt
Each folder contains the image files for its class. Make sure all image files are in .jpg format.
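These two requirements (identical class folders in every split, and .jpg files only) can be checked before training. The sketch below is one possible way to do it with the standard library; `check_layout` is a hypothetical helper, and it only inspects the three split folders, not classes.txt or README.txt.

```python
from pathlib import Path


def check_layout(root, splits=("train", "valid", "test")):
    """Return a list of human-readable problems; an empty list means the layout looks fine."""
    problems = []
    class_sets = {}
    for split in splits:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            problems.append(f"missing split folder: {split}")
            continue
        class_sets[split] = {p.name for p in split_dir.iterdir() if p.is_dir()}
        # Flag every file below this split that does not have a .jpg extension.
        for f in split_dir.rglob("*"):
            if f.is_file() and f.suffix.lower() != ".jpg":
                problems.append(f"not a .jpg file: {f.relative_to(root)}")
    # Every split should expose the same class folders, or label indices will not line up.
    reference = class_sets.get(splits[0], set())
    for split, names in class_sets.items():
        diff = names ^ reference
        if diff:
            problems.append(f"{split} differs from {splits[0]} in classes: {sorted(diff)}")
    return problems


if __name__ == "__main__":
    for problem in check_layout("tomato_leaf_disease_dataset"):
        print(problem)
```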
3.1 src/dataset.py
import os

from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class TomatoLeafDiseaseDataset(Dataset):
    """Loads images from a folder that contains one subfolder per class."""

    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        # Sort the class folders so every split maps class names to the same indices.
        self.classes = sorted(
            d for d in os.listdir(root_dir)
            if os.path.isdir(os.path.join(root_dir, d))
        )
        self.class_to_idx = {cls_name: idx for idx, cls_name in enumerate(self.classes)}
        self.image_paths = []
        self.labels = []

        for cls_name in self.classes:
            cls_dir = os.path.join(root_dir, cls_name)
            for img_name in sorted(os.listdir(cls_dir)):
                self.image_paths.append(os.path.join(cls_dir, img_name))
                self.labels.append(self.class_to_idx[cls_name])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, index):
        img_path = self.image_paths[index]
        label = self.labels[index]
        image = Image.open(img_path).convert("RGB")
        if self.transform:
            image = self.transform(image)
        return image, label


def get_data_loaders(train_dir, valid_dir, test_dir, batch_size=16, num_workers=4):
    # Resize to the 224x224 ResNet input size and normalize with ImageNet statistics.
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    train_dataset = TomatoLeafDiseaseDataset(train_dir, transform=transform)
    valid_dataset = TomatoLeafDiseaseDataset(valid_dir, transform=transform)
    test_dataset = TomatoLeafDiseaseDataset(test_dir, transform=transform)

    # Only the training loader is shuffled.
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)
    valid_loader = DataLoader(valid_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)

    return train_loader, valid_loader, test_loader
4. ResNet training code
4.1 src/train.py
import os

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.tensorboard import SummaryWriter  # needs the tensorboard package installed
from torchvision.models import ResNet50_Weights, resnet50
from tqdm import tqdm

# Running "python src/train.py" puts src/ on sys.path, so import the module directly.
from dataset import get_data_loaders


def train_model(train_dir, valid_dir, test_dir, epochs=100, batch_size=16, learning_rate=1e-4):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # ImageNet-pretrained ResNet50 (the weights= argument needs torchvision >= 0.13),
    # with the final layer replaced by an 11-class classifier.
    model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
    num_classes = 11
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    model = model.to(device)

    # get_data_loaders builds all three loaders, so test_dir must be a real folder.
    train_loader, valid_loader, _ = get_data_loaders(
        train_dir, valid_dir, test_dir, batch_size=batch_size
    )

    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    criterion = nn.CrossEntropyLoss()
    writer = SummaryWriter()
    os.makedirs("weights", exist_ok=True)
    best_valid_acc = 0.0

    for epoch in range(epochs):
        # Training pass
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        for images, labels in tqdm(train_loader, desc=f"Epoch {epoch + 1}/{epochs}"):
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            _, predicted = outputs.max(1)
            total += labels.size(0)
            correct += predicted.eq(labels).sum().item()

        train_loss = running_loss / len(train_loader)
        train_acc = correct / total
        writer.add_scalar('Training Loss', train_loss, epoch)
        writer.add_scalar('Training Accuracy', train_acc, epoch)

        # Validation pass
        model.eval()
        running_valid_loss = 0.0
        correct = 0
        total = 0
        with torch.no_grad():
            for images, labels in valid_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)
                running_valid_loss += loss.item()
                _, predicted = outputs.max(1)
                total += labels.size(0)
                correct += predicted.eq(labels).sum().item()

        valid_loss = running_valid_loss / len(valid_loader)
        valid_acc = correct / total
        writer.add_scalar('Validation Loss', valid_loss, epoch)
        writer.add_scalar('Validation Accuracy', valid_acc, epoch)

        print(f"Epoch {epoch + 1}/{epochs}, Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.4f}, Valid Loss: {valid_loss:.4f}, Valid Acc: {valid_acc:.4f}")

        # Save only when validation accuracy improves, so the file really holds the best model.
        if valid_acc > best_valid_acc:
            best_valid_acc = valid_acc
            torch.save(model.state_dict(), "weights/best_model.pth")

    writer.close()


if __name__ == "__main__":
    train_dir = "tomato_leaf_disease_dataset/train"
    valid_dir = "tomato_leaf_disease_dataset/valid"
    test_dir = "tomato_leaf_disease_dataset/test"
    train_model(train_dir, valid_dir, test_dir)
5. Model evaluation
After training, the model can be evaluated on the test set. For example:
5.1 src/predict.py
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import resnet50

# Running "python src/predict.py" puts src/ on sys.path, so import the module directly.
from dataset import get_data_loaders

# ImageNet statistics used by get_data_loaders; needed to undo the normalization for display.
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])


def predict_and_plot(train_dir, valid_dir, test_dir, model_path, num_samples=5):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # The trained weights are loaded from disk, so no pretrained download is needed
    # (the weights= argument needs torchvision >= 0.13).
    model = resnet50(weights=None)
    num_classes = 11
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    model.load_state_dict(torch.load(model_path, map_location=device))
    model = model.to(device)
    model.eval()

    _, _, test_loader = get_data_loaders(train_dir, valid_dir, test_dir)

    # Predict on the first test batch and show at most num_samples images from it.
    images, labels = next(iter(test_loader))
    with torch.no_grad():
        outputs = model(images.to(device))
    _, predicted = outputs.max(1)
    predicted = predicted.cpu()

    # Convert NCHW tensors to NHWC arrays and undo the normalization.
    images = images.numpy().transpose((0, 2, 3, 1))
    images = np.clip(images * STD + MEAN, 0.0, 1.0)

    num_shown = min(num_samples, len(images))
    fig, axes = plt.subplots(num_shown, 2, figsize=(10, 5 * num_shown), squeeze=False)
    for j in range(num_shown):
        axes[j, 0].imshow(images[j])
        axes[j, 0].set_title(f"Input Image (Label: {labels[j].item()})")
        axes[j, 0].axis('off')
        axes[j, 1].imshow(images[j])
        axes[j, 1].set_title(f"Predicted Label: {predicted[j].item()}")
        axes[j, 1].axis('off')

    plt.tight_layout()
    plt.show()


if __name__ == "__main__":
    train_dir = "tomato_leaf_disease_dataset/train"
    valid_dir = "tomato_leaf_disease_dataset/valid"
    test_dir = "tomato_leaf_disease_dataset/test"
    model_path = "weights/best_model.pth"
    predict_and_plot(train_dir, valid_dir, test_dir, model_path)
6. Running the project
Make sure the dataset is in place in the corresponding folders.
From the project root, run the following command to start training:
python src/train.py
After training finishes, run the following command for evaluation and visualization:
python src/predict.py
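7. Features
Dataset class: the TomatoLeafDiseaseDataset class loads and preprocesses the data.
Data loaders: the get_data_loaders function creates the training, validation, and test data loaders.
Model training: the train.py script trains the ResNet50 model.
Model evaluation: the predict.py script runs the model on the test set and visualizes the input images, true labels, and predictions.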
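The predict.py script as listed visualizes individual predictions but does not print a summary number. A minimal sketch of how overall accuracy and per-class recall could be computed from collected class indices is shown below; `accuracy_report` is a hypothetical helper and is not part of the original scripts.

```python
from collections import Counter


def accuracy_report(labels, predictions, class_names):
    """Return (overall_accuracy, {class_name: recall}) from parallel lists of class indices."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    total = Counter(labels)
    correct = Counter(l for l, p in zip(labels, predictions) if l == p)
    overall = sum(correct.values()) / len(labels) if labels else 0.0
    per_class = {
        name: correct[idx] / total[idx]
        for idx, name in enumerate(class_names)
        if total[idx] > 0  # skip classes with no samples
    }
    return overall, per_class


if __name__ == "__main__":
    overall, per_class = accuracy_report(
        [0, 0, 1, 1], [0, 1, 1, 1], ["Healthy", "Late_blight"]
    )
    print(overall, per_class)
```

In practice the two lists would be filled by looping over the test loader and appending `labels.tolist()` and `predicted.tolist()` for each batch. Per-class recall matters here because a single overall accuracy can hide a class that the model rarely recognizes.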
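8. Detailed notes
dataset.py
Dataset class: defines the TomatoLeafDiseaseDataset class, which loads and preprocesses the data.
Data loaders: defines the get_data_loaders function, which creates the training, validation, and test data loaders.
train.py
Training function: defines the train_model function, which trains the ResNet50 model.
Training process: in each epoch, the model runs forward and backward passes over the training set and is then evaluated on the validation set.
predict.py
Prediction and visualization: defines the predict_and_plot function, which runs predictions on the test set and visualizes the input images, true labels, and predictions.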
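Because get_data_loaders normalizes each channel with the ImageNet mean and standard deviation, pixel values reaching the plotting code are no longer in the [0, 1] range that image display expects. The sketch below illustrates the forward and inverse mapping on a single RGB pixel in plain Python; `normalize` and `denormalize` are hypothetical helper names used only for this illustration.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel mean passed to transforms.Normalize in dataset.py
STD = (0.229, 0.224, 0.225)   # per-channel std passed to transforms.Normalize in dataset.py


def normalize(pixel):
    """Apply (x - mean) / std to one RGB pixel whose channels are in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(pixel, MEAN, STD))


def denormalize(pixel):
    """Invert normalize() and clamp to [0, 1] so the value can be displayed."""
    return tuple(min(1.0, max(0.0, x * s + m)) for x, m, s in zip(pixel, MEAN, STD))


if __name__ == "__main__":
    original = (0.2, 0.5, 0.9)
    print(normalize(original))               # values outside [0, 1] are possible here
    print(denormalize(normalize(original)))  # approximately the original pixel again
```

The same arithmetic applies to whole image arrays: multiply by the standard deviation, add the mean, and clip before passing the result to an image display function.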
URL: How to train with YOLOv8: tomato leaf disease and pest classification dataset, 1.4 GB, more than 20,000 images, 11 classes (tomato dataset) https://m.huajiangbk.com/newsview530414.html