Contents
1. The EfficientNet network
2. depth, width, resolution
3. Structure of the EfficientNet network
4. Training the network from the command line
5. Code
5.1 model
5.2 dataset
5.3 utils
5.4 train
5.5 predict
EfficientNet explores the three parameters that matter most to a network: image resolution, network width, and network depth.
Image resolution: the size of the feature map; h*w is the spatial resolution of the image.
Network width: the number of feature maps, i.e. the number of convolution kernels, or equivalently the number of output channels.
Network depth: the number of layers in the network, as in resnet34, resnet101, and so on.
As illustrated below:
At some point, a 224*224 input resolution seems to have become the de facto standard for neural networks, and nearly all later networks adopted 224*224 as their input size.
Even when a network nominally specifies a 224*224 input, feeding it another size such as 300*300 usually still works, because most implementations place an adaptive pooling layer right before the fully connected layer.
Without that adaptive pooling, an incorrect input size would change the number of features reaching the fully connected layer and raise an error.
With the resolution fixed by this convention, later networks concentrated their effort on width or depth instead; ResNet, for example, can be scaled up to a depth of 1000 layers.
Here is a brief summary of what each of the three parameters does.
Width: increasing the number of channels. Wider networks tend to capture finer-grained features and are easier to train. However, very wide but shallow networks struggle to capture higher-level features, and empirical results show that accuracy saturates quickly once the network gets wider and w grows large.
Depth: increasing the number of layers. Scaling depth is the most common approach for convolutional networks. Deeper ConvNets can capture richer and more complex features and generalize well to new tasks, but they are also harder to train because of the vanishing-gradient problem. Although techniques such as shortcuts and batch normalization alleviate the training problem, the accuracy gain from going deeper diminishes.
Resolution: with higher-resolution input images, convolutions can potentially capture finer-grained patterns. Higher resolution does improve accuracy, but for very high resolutions the accuracy gain diminishes as well.
The authors' conclusion:
EfficientNet argues that balancing the scaling of these three parameters is crucial, because the scaling dimensions are not independent of one another. Intuitively, for higher-resolution images we should increase network depth, so that the larger receptive fields can help capture similar features that span more pixels in the larger image. Correspondingly, at higher resolutions we should also increase network width, in order to capture finer-grained patterns across the additional pixels. These intuitions suggest that the scaling dimensions need to be coordinated and balanced, rather than scaled along a single dimension as in conventional practice.
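The paper formalizes this as compound scaling: a single coefficient phi scales all three dimensions at once, with depth = alpha^phi, width = beta^phi and resolution = gamma^phi, under the constraint alpha * beta^2 * gamma^2 ~= 2 so that FLOPs roughly double per unit of phi. Below is a minimal sketch (not part of the original post), using the alpha, beta, gamma values reported in the EfficientNet paper:

# A minimal sketch of compound scaling; alpha/beta/gamma are the values the
# paper found by grid search on the B0 baseline (alpha*beta^2*gamma^2 ~= 2).
alpha, beta, gamma = 1.2, 1.1, 1.15  # depth, width, resolution bases

def compound_scale(phi: float):
    """Return (depth, width, resolution) multipliers for a given phi."""
    depth_mult = alpha ** phi
    width_mult = beta ** phi
    res_mult = gamma ** phi
    # FLOPs grow roughly by (alpha * beta**2 * gamma**2) ** phi ~= 2 ** phi
    return depth_mult, width_mult, res_mult

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")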
The basic building block of EfficientNet is called MBConv: a 1*1 convolution first expands the channel dimension, then a depthwise (dw) convolution is applied, followed by an SE attention module; a 1*1 convolution then reduces the dimension again, followed by a DropPath (stochastic-depth) dropout. When a shortcut is used, the input is added to this output.
The SE attention module (global average pooling, a 1*1 convolution that squeezes the channels, SiLU, a second 1*1 convolution that restores them, and a sigmoid whose output reweights each channel) is shown below:
The structure of EfficientNet-B0 is as follows:
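(The structure table in the original post is an image. For reference, this is the B0 backbone that the default_cnf list in the code below encodes:)

Stage | Operator            | Kernel | Stride | Out channels | Layers
1     | Conv3x3 (stem)      | 3x3    | 2      | 32           | 1
2     | MBConv1             | 3x3    | 1      | 16           | 1
3     | MBConv6             | 3x3    | 2      | 24           | 2
4     | MBConv6             | 5x5    | 2      | 40           | 2
5     | MBConv6             | 3x3    | 2      | 80           | 3
6     | MBConv6             | 5x5    | 1      | 112          | 3
7     | MBConv6             | 5x5    | 2      | 192          | 4
8     | MBConv6             | 3x3    | 1      | 320          | 1
9     | Conv1x1 & Pool & FC | 1x1    | 1      | 1280         | 1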
EfficientNet B1 - B7 simply scale the width and depth coefficients on top of B0; when these two coefficients change, the input image size must also be changed manually according to the table below.
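These are the coefficients, dropout rates, and expected input sizes used by the factory functions and the img_size dict in the code below:

Model | width_coefficient | depth_coefficient | dropout_rate | input size
B0    | 1.0               | 1.0               | 0.2          | 224
B1    | 1.0               | 1.1               | 0.2          | 240
B2    | 1.1               | 1.2               | 0.3          | 260
B3    | 1.2               | 1.4               | 0.3          | 300
B4    | 1.4               | 1.8               | 0.4          | 380
B5    | 1.6               | 2.2               | 0.4          | 456
B6    | 1.8               | 2.6               | 0.5          | 528
B7    | 2.0               | 3.1               | 0.5          | 600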
Run the training script with -h to list the configurable arguments; here epochs is set to 30.
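Assuming the training script from section 5.4 is saved as train.py (the file name is an assumption), a typical invocation looks like:

python train.py -h            # list all configurable arguments
python train.py --epochs 30   # train for 30 epochs, other arguments left at their defaults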
Training process:
Prediction: only a single image is predicted here.
5. Code
5.1 model
The code of the EfficientNet network:
import math
import copy
from functools import partial
from collections import OrderedDict
from typing import Optional, Callable
import torch
import torch.nn as nn
from torch import Tensor
from torch.nn import functional as F
def _make_divisible(ch, divisor=8, min_ch=None):
    # Round the channel count to the nearest multiple of `divisor`
    # (MobileNet-style rounding), never dropping below 90% of the input.
    if min_ch is None:
        min_ch = divisor
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch
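# A quick check of the rounding behaviour (easy to verify by hand):
#   _make_divisible(32 * 1.1) -> 32   (35.2 rounds down to the nearest multiple of 8)
#   _make_divisible(40 * 1.1) -> 48   (44.0 rounds up)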
def drop_path(x, drop_prob: float = 0., training: bool = False):
    """
    Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    "Deep Networks with Stochastic Depth", https://arxiv.org/pdf/1603.09382.pdf
    This function is taken from rwightman. It can be seen here:
    https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/layers/drop.py#L140
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # broadcast over every dim except batch
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize: 1 keeps the sample's path, 0 drops it
    output = x.div(keep_prob) * random_tensor
    return output
class DropPath(nn.Module):
    """
    Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    "Deep Networks with Stochastic Depth", https://arxiv.org/pdf/1603.09382.pdf
    """
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)
class ConvBNActivation(nn.Sequential):
    def __init__(self,
                 in_planes: int,
                 out_planes: int,
                 kernel_size: int = 3,
                 stride: int = 1,
                 groups: int = 1,
                 norm_layer: Optional[Callable[..., nn.Module]] = None,
                 activation_layer: Optional[Callable[..., nn.Module]] = None):
        padding = (kernel_size - 1) // 2
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if activation_layer is None:
            activation_layer = nn.SiLU
        super(ConvBNActivation, self).__init__(nn.Conv2d(in_channels=in_planes,
                                                         out_channels=out_planes,
                                                         kernel_size=kernel_size,
                                                         stride=stride,
                                                         padding=padding,
                                                         groups=groups,
                                                         bias=False),
                                               norm_layer(out_planes),
                                               activation_layer())
class SqueezeExcitation(nn.Module):
    def __init__(self,
                 input_c: int,    # channels of the block input (sizes the bottleneck)
                 expand_c: int,   # channels of the expanded feature map being recalibrated
                 squeeze_factor: int = 4):
        super(SqueezeExcitation, self).__init__()
        squeeze_c = input_c // squeeze_factor
        self.fc1 = nn.Conv2d(expand_c, squeeze_c, 1)  # squeeze: 1x1 conv acts as an FC layer
        self.ac1 = nn.SiLU()
        self.fc2 = nn.Conv2d(squeeze_c, expand_c, 1)  # excite: back up to expand_c
        self.ac2 = nn.Sigmoid()

    def forward(self, x: Tensor) -> Tensor:
        scale = F.adaptive_avg_pool2d(x, output_size=(1, 1))  # global average pooling
        scale = self.fc1(scale)
        scale = self.ac1(scale)
        scale = self.fc2(scale)
        scale = self.ac2(scale)
        return scale * x  # channel-wise reweighting
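# Example: in an MBConv block with input_c=40 and expand_c=240 (expand ratio 6),
# the SE module squeezes 240 channels down to 40 // 4 = 10 and back up to 240,
# producing one sigmoid weight per channel; an (N, 240, H, W) input keeps its shape.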
class InvertedResidualConfig:
    def __init__(self,
                 kernel: int,
                 input_c: int,
                 out_c: int,
                 expanded_ratio: int,
                 stride: int,
                 use_se: bool,
                 drop_rate: float,
                 index: str,
                 width_coefficient: float):
        self.input_c = self.adjust_channels(input_c, width_coefficient)
        self.kernel = kernel
        self.expanded_c = self.input_c * expanded_ratio
        self.out_c = self.adjust_channels(out_c, width_coefficient)
        self.use_se = use_se
        self.stride = stride
        self.drop_rate = drop_rate
        self.index = index

    @staticmethod
    def adjust_channels(channels: int, width_coefficient: float):
        return _make_divisible(channels * width_coefficient, 8)
class InvertedResidual(nn.Module):
    def __init__(self,
                 cnf: InvertedResidualConfig,
                 norm_layer: Callable[..., nn.Module]):
        super(InvertedResidual, self).__init__()
        if cnf.stride not in [1, 2]:
            raise ValueError("illegal stride value.")
        # the shortcut is only used when stride == 1 and in/out channels match
        self.use_res_connect = (cnf.stride == 1 and cnf.input_c == cnf.out_c)
        layers = OrderedDict()
        activation_layer = nn.SiLU
        # 1x1 conv to expand channels (skipped when the expand ratio is 1)
        if cnf.expanded_c != cnf.input_c:
            layers.update({"expand_conv": ConvBNActivation(cnf.input_c,
                                                           cnf.expanded_c,
                                                           kernel_size=1,
                                                           norm_layer=norm_layer,
                                                           activation_layer=activation_layer)})
        # depthwise conv (groups == channels)
        layers.update({"dwconv": ConvBNActivation(cnf.expanded_c,
                                                  cnf.expanded_c,
                                                  kernel_size=cnf.kernel,
                                                  stride=cnf.stride,
                                                  groups=cnf.expanded_c,
                                                  norm_layer=norm_layer,
                                                  activation_layer=activation_layer)})
        if cnf.use_se:
            layers.update({"se": SqueezeExcitation(cnf.input_c, cnf.expanded_c)})
        # 1x1 conv to project back down; no activation (nn.Identity)
        layers.update({"project_conv": ConvBNActivation(cnf.expanded_c,
                                                        cnf.out_c,
                                                        kernel_size=1,
                                                        norm_layer=norm_layer,
                                                        activation_layer=nn.Identity)})
        self.block = nn.Sequential(layers)
        self.out_channels = cnf.out_c
        self.is_strided = cnf.stride > 1
        if self.use_res_connect and cnf.drop_rate > 0:
            self.dropout = DropPath(cnf.drop_rate)  # stochastic depth, not ordinary dropout
        else:
            self.dropout = nn.Identity()

    def forward(self, x: Tensor) -> Tensor:
        result = self.block(x)
        result = self.dropout(result)
        if self.use_res_connect:
            result += x
        return result
class EfficientNet(nn.Module):
    def __init__(self,
                 width_coefficient: float,
                 depth_coefficient: float,
                 num_classes: int = 1000,
                 dropout_rate: float = 0.2,
                 drop_connect_rate: float = 0.2,
                 block: Optional[Callable[..., nn.Module]] = None,
                 norm_layer: Optional[Callable[..., nn.Module]] = None
                 ):
        super(EfficientNet, self).__init__()
        # B0 baseline: kernel, in_c, out_c, expand_ratio, stride, use_se, drop_rate, repeats
        default_cnf = [[3, 32, 16, 1, 1, True, drop_connect_rate, 1],
                       [3, 16, 24, 6, 2, True, drop_connect_rate, 2],
                       [5, 24, 40, 6, 2, True, drop_connect_rate, 2],
                       [3, 40, 80, 6, 2, True, drop_connect_rate, 3],
                       [5, 80, 112, 6, 1, True, drop_connect_rate, 3],
                       [5, 112, 192, 6, 2, True, drop_connect_rate, 4],
                       [3, 192, 320, 6, 1, True, drop_connect_rate, 1]]

        def round_repeats(repeats):
            # scale the number of blocks per stage by the depth coefficient
            return int(math.ceil(depth_coefficient * repeats))

        if block is None:
            block = InvertedResidual
        if norm_layer is None:
            norm_layer = partial(nn.BatchNorm2d, eps=1e-3, momentum=0.1)
        adjust_channels = partial(InvertedResidualConfig.adjust_channels,
                                  width_coefficient=width_coefficient)
        bneck_conf = partial(InvertedResidualConfig,
                             width_coefficient=width_coefficient)

        b = 0
        num_blocks = float(sum(round_repeats(i[-1]) for i in default_cnf))
        inverted_residual_setting = []
        for stage, args in enumerate(default_cnf):
            cnf = copy.copy(args)
            for i in range(round_repeats(cnf.pop(-1))):
                if i > 0:
                    cnf[-3] = 1      # only the first block of a stage is strided
                    cnf[1] = cnf[2]  # in_channels = out_channels for the repeats
                cnf[-1] = args[-2] * b / num_blocks   # drop rate grows linearly with depth
                index = str(stage + 1) + chr(i + 97)  # e.g. "1a", "2a", "2b", ...
                inverted_residual_setting.append(bneck_conf(*cnf, index))
                b += 1

        layers = OrderedDict()
        layers.update({"stem_conv": ConvBNActivation(in_planes=3,
                                                     out_planes=adjust_channels(32),
                                                     kernel_size=3,
                                                     stride=2,
                                                     norm_layer=norm_layer)})
        for cnf in inverted_residual_setting:
            layers.update({cnf.index: block(cnf, norm_layer)})
        last_conv_input_c = inverted_residual_setting[-1].out_c
        last_conv_output_c = adjust_channels(1280)
        layers.update({"top": ConvBNActivation(in_planes=last_conv_input_c,
                                               out_planes=last_conv_output_c,
                                               kernel_size=1,
                                               norm_layer=norm_layer)})
        self.features = nn.Sequential(layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        classifier = []
        if dropout_rate > 0:
            classifier.append(nn.Dropout(p=dropout_rate, inplace=True))
        classifier.append(nn.Linear(last_conv_output_c, num_classes))
        self.classifier = nn.Sequential(*classifier)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out")
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def _forward_impl(self, x: Tensor) -> Tensor:
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)
def efficientnet_b0(num_classes=1000):
    return EfficientNet(width_coefficient=1.0, depth_coefficient=1.0, dropout_rate=0.2, num_classes=num_classes)

def efficientnet_b1(num_classes=1000):
    return EfficientNet(width_coefficient=1.0, depth_coefficient=1.1, dropout_rate=0.2, num_classes=num_classes)

def efficientnet_b2(num_classes=1000):
    return EfficientNet(width_coefficient=1.1, depth_coefficient=1.2, dropout_rate=0.3, num_classes=num_classes)

def efficientnet_b3(num_classes=1000):
    return EfficientNet(width_coefficient=1.2, depth_coefficient=1.4, dropout_rate=0.3, num_classes=num_classes)

def efficientnet_b4(num_classes=1000):
    return EfficientNet(width_coefficient=1.4, depth_coefficient=1.8, dropout_rate=0.4, num_classes=num_classes)

def efficientnet_b5(num_classes=1000):
    return EfficientNet(width_coefficient=1.6, depth_coefficient=2.2, dropout_rate=0.4, num_classes=num_classes)

def efficientnet_b6(num_classes=1000):
    return EfficientNet(width_coefficient=1.8, depth_coefficient=2.6, dropout_rate=0.5, num_classes=num_classes)

def efficientnet_b7(num_classes=1000):
    return EfficientNet(width_coefficient=2.0, depth_coefficient=3.1, dropout_rate=0.5, num_classes=num_classes)
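A quick sanity check of the model (a hypothetical snippet, not part of the original post):

# Hypothetical sanity check: B0 for 5 classes on a dummy 224x224 batch.
if __name__ == '__main__':
    net = efficientnet_b0(num_classes=5)
    dummy = torch.randn(1, 3, 224, 224)
    out = net(dummy)
    print(out.shape)  # expected: torch.Size([1, 5])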
5.2 dataset
from PIL import Image
import torch
from torch.utils.data import Dataset


class MyDataSet(Dataset):
    """Custom dataset."""
    def __init__(self, images_path: list, images_class: list, transform=None):
        self.images_path = images_path
        self.images_class = images_class
        self.transform = transform

    def __len__(self):
        return len(self.images_path)

    def __getitem__(self, item):
        img = Image.open(self.images_path[item])
        if img.mode != 'RGB':
            raise ValueError("image: {} isn't RGB mode.".format(self.images_path[item]))
        label = self.images_class[item]
        if self.transform is not None:
            img = self.transform(img)
        return img, label

    @staticmethod
    def collate_fn(batch):
        images, labels = tuple(zip(*batch))
        images = torch.stack(images, dim=0)
        labels = torch.as_tensor(labels)
        return images, labels
5.3 utils
import os
import sys
import json
import random
import torch
from tqdm import tqdm
import matplotlib.pyplot as plt
def read_split_data(root: str, val_rate: float = 0.2):
    random.seed(0)  # make the train/val split reproducible
    assert os.path.exists(root), "dataset root: {} does not exist.".format(root)
    flower_class = [cla for cla in os.listdir(root) if os.path.isdir(os.path.join(root, cla))]
    flower_class.sort()
    class_indices = dict((k, v) for v, k in enumerate(flower_class))
    json_str = json.dumps(dict((val, key) for key, val in class_indices.items()), indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)
    train_images_path = []
    train_images_label = []
    val_images_path = []
    val_images_label = []
    every_class_num = []
    supported = [".jpg", ".JPG", ".png", ".PNG"]
    for cla in flower_class:
        cla_path = os.path.join(root, cla)
        images = [os.path.join(root, cla, i) for i in os.listdir(cla_path)
                  if os.path.splitext(i)[-1] in supported]
        images.sort()
        image_class = class_indices[cla]
        every_class_num.append(len(images))
        val_path = random.sample(images, k=int(len(images) * val_rate))
        for img_path in images:
            if img_path in val_path:
                val_images_path.append(img_path)
                val_images_label.append(image_class)
            else:
                train_images_path.append(img_path)
                train_images_label.append(image_class)
    print("{} images were found in the dataset.".format(sum(every_class_num)))
    print("{} images for training.".format(len(train_images_path)))
    print("{} images for validation.".format(len(val_images_path)))
    assert len(train_images_path) > 0, "number of training images must be greater than 0."
    assert len(val_images_path) > 0, "number of validation images must be greater than 0."
    plot_image = False
    if plot_image:
        plt.bar(range(len(flower_class)), every_class_num, align='center')
        plt.xticks(range(len(flower_class)), flower_class)
        for i, v in enumerate(every_class_num):
            plt.text(x=i, y=v + 5, s=str(v), ha='center')
        plt.xlabel('image class')
        plt.ylabel('number of images')
        plt.title('flower class distribution')
        plt.show()
    return train_images_path, train_images_label, val_images_path, val_images_label
def train_one_epoch(model, optimizer, data_loader, device, epoch):
    model.train()
    loss_function = torch.nn.CrossEntropyLoss()
    running_loss = 0.0
    data_loader = tqdm(data_loader, file=sys.stdout)
    for step, (images, labels) in enumerate(data_loader):
        optimizer.zero_grad()
        pred = model(images.to(device))
        loss = loss_function(pred, labels.to(device))
        loss.backward()
        running_loss += loss.item()
        optimizer.step()
        # CrossEntropyLoss already averages over the batch, so report the running
        # mean over steps instead of dividing by the batch size again
        data_loader.desc = "[epoch {}] mean loss {}".format(epoch, round(running_loss / (step + 1), 3))
    return running_loss
@torch.no_grad()
def evaluate(model, data_loader, device):
    model.eval()
    sum_num = 0
    data_loader = tqdm(data_loader, file=sys.stdout)
    for images, labels in data_loader:
        pred = model(images.to(device))
        pred = torch.max(pred, dim=1)[1]
        sum_num += torch.eq(pred, labels.to(device)).sum()
    return sum_num
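Note that evaluate returns the number of correctly classified images, not a ratio; the training script below divides it by the size of the validation set to obtain the accuracy.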
5.4 train
import os
import math
import argparse
import torch
import torch.optim as optim
from torchvision import transforms
import torch.optim.lr_scheduler as lr_scheduler
from torch.utils.data import DataLoader
from model import efficientnet_b0 as create_model
from my_dataset import MyDataSet
from utils import read_split_data, train_one_epoch, evaluate
def main(args):
    device = torch.device(args.device if torch.cuda.is_available() else "cpu")
    if os.path.exists("./weights") is False:
        os.makedirs("./weights")
    train_images_path, train_images_label, val_images_path, val_images_label = read_split_data(args.data_path)
    # every variant expects its own input resolution (see the table above)
    img_size = {"B0": 224, "B1": 240, "B2": 260, "B3": 300, "B4": 380, "B5": 456, "B6": 528, "B7": 600}
    num_model = "B0"
    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(img_size[num_model]),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
        "val": transforms.Compose([transforms.Resize(img_size[num_model]),
                                   transforms.CenterCrop(img_size[num_model]),
                                   transforms.ToTensor(),
                                   transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}
    train_dataset = MyDataSet(images_path=train_images_path, images_class=train_images_label, transform=data_transform["train"])
    val_dataset = MyDataSet(images_path=val_images_path, images_class=val_images_label, transform=data_transform["val"])
    num_trainSet = len(train_dataset)
    num_valSet = len(val_dataset)
    batch_size = args.batch_size
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of dataloader workers
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, pin_memory=True, num_workers=nw, collate_fn=train_dataset.collate_fn)
    val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, pin_memory=True, num_workers=nw, collate_fn=val_dataset.collate_fn)
    model = create_model(num_classes=args.num_classes).to(device)
    if args.weights != "":
        if os.path.exists(args.weights):
            weights_dict = torch.load(args.weights, map_location=device)
            # keep only the weights whose shapes match the current model
            # (the classifier differs when num_classes != 1000)
            load_weights_dict = {k: v for k, v in weights_dict.items() if model.state_dict()[k].numel() == v.numel()}
            print(model.load_state_dict(load_weights_dict, strict=False))
        else:
            raise FileNotFoundError("not found weights file: {}".format(args.weights))
    if args.freeze_layers:
        # freeze everything except the last conv layer and the classifier
        for name, para in model.named_parameters():
            if ("features.top" not in name) and ("classifier" not in name):
                para.requires_grad_(False)
            else:
                print("training {}".format(name))
    pg = [p for p in model.parameters() if p.requires_grad]
    optimizer = optim.SGD(pg, lr=args.lr, momentum=0.9, weight_decay=1E-4)
    # cosine learning-rate schedule decaying from lr to lr * lrf
    lf = lambda x: ((1 + math.cos(x * math.pi / args.epochs)) / 2) * (1 - args.lrf) + args.lrf
    scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)
    best_acc = 0.0
    for epoch in range(args.epochs):
        loss_all = train_one_epoch(model=model, optimizer=optimizer, data_loader=train_loader, device=device, epoch=epoch)
        scheduler.step()
        acc_all = evaluate(model=model, data_loader=val_loader, device=device)
        # loss_all is a sum of per-batch mean losses, so average it over the batches
        print("[epoch :%d],train loss:%.4f ,test accuracy: %.4f" % (epoch, loss_all / len(train_loader), acc_all / num_valSet))
        if acc_all > best_acc:
            best_acc = acc_all
            torch.save(model.state_dict(), './weights/efficientnet_b0.pth')
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--num_classes', type=int, default=5)
    parser.add_argument('--epochs', type=int, default=5)
    parser.add_argument('--batch-size', type=int, default=16)
    parser.add_argument('--lr', type=float, default=0.1)
    parser.add_argument('--lrf', type=float, default=0.01)
    parser.add_argument('--data-path', type=str, default="./data/flower")
    parser.add_argument('--weights', type=str, default='', help='initial weights path')
    parser.add_argument('--freeze-layers', type=bool, default=False)
    parser.add_argument('--device', default='cuda:0', help='device id (i.e. 0 or 0,1 or cpu)')
    opt = parser.parse_args()
    print('start training....')
    main(opt)
    print("finish training!!!")
5.5 predict
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'
import json
import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
from model import efficientnet_b0 as create_model
def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    img_size = {"B0": 224, "B1": 240, "B2": 260, "B3": 300, "B4": 380, "B5": 456, "B6": 528, "B7": 600}
    num_model = "B0"
    # validation-style preprocessing: resize, center crop, normalize
    data_transform = transforms.Compose(
        [transforms.Resize(img_size[num_model]),
         transforms.CenterCrop(img_size[num_model]),
         transforms.ToTensor(),
         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    img_path = "./OIP-C.jpg"
    img = Image.open(img_path)
    plt.imshow(img)
    img = data_transform(img)
    img = torch.unsqueeze(img, dim=0)  # add the batch dimension
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)
    with open(json_path, "r") as f:
        class_indict = json.load(f)
    model = create_model(num_classes=5).to(device)
    model_weight_path = "./weights/efficientnet_b0.pth"
    model.load_state_dict(torch.load(model_weight_path, map_location=device))
    model.eval()
    with torch.no_grad():
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).numpy()
    print_res = "class: {} prob: {:.3}".format(class_indict[str(predict_cla)], predict[predict_cla].numpy())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10} prob: {:.3}".format(class_indict[str(i)], predict[i].numpy()))
    plt.show()


if __name__ == '__main__':
    main()