Deploying YOLOv5 on the Orange Pi 5 Plus

Usage notes

[!IMPORTANT]

Main tasks on the host machine and on the board

Tasks on the host machine:

  1. Train your own YOLOv5 model
  2. Convert your model from best.pt to best.onnx
  3. Convert best.onnx to best.rknn

Tasks on the board (RKNN side):

  1. Deploy your model and run the detection task
  2. Return the detected objects' positions, classes, and so on

Toolkits and system environments used (mismatched versions are very likely to cause problems).

Environment on the host machine (WSL or a virtual machine):

  1. Ubuntu 20.04
  2. rknn_toolkit2 (v1.4.0)
  3. Python 3.8

Environment on the board:

  1. Platform: RK3588

  2. rknn_toolkit_lite2 (installed from the rknn-toolkit2 v1.4.0 repository)

  3. yolov5 (v1.6.0)

  4. Ubuntu 20.04

  5. Python 3.9

Training your own model

Not covered here for now.

See this write-up by someone else instead:

https://zhuanlan.zhihu.com/p/501798155

Converting .pt to .onnx

  • In models/yolo.py, change the forward function of the Detect class (so the exported model returns the three raw, sigmoid-activated feature maps and box decoding is left to CPU post-processing) from:

    def forward(self, x):
        z = []  # inference output
        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
            if not self.training:  # inference
                if self.dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
                if isinstance(self, Segment):  # (boxes + masks)
                    xy, wh, conf, mask = x[i].split((2, 2, self.nc + 1, self.no - self.nc - 5), 4)
                    xy = (xy.sigmoid() * 2 + self.grid[i]) * self.stride[i]  # xy
                    wh = (wh.sigmoid() * 2) ** 2 * self.anchor_grid[i]  # wh
                    y = torch.cat((xy, wh, conf.sigmoid(), mask), 4)
                else:  # Detect (boxes only)
                    xy, wh, conf = x[i].sigmoid().split((2, 2, self.nc + 1), 4)
                    xy = (xy * 2 + self.grid[i]) * self.stride[i]  # xy
                    wh = (wh * 2) ** 2 * self.anchor_grid[i]  # wh
                    y = torch.cat((xy, wh, conf), 4)
                z.append(y.view(bs, self.na * nx * ny, self.no))
        return x if self.training else (torch.cat(z, 1),) if self.export else (torch.cat(z, 1), x)

    to:

    def forward(self, x):
        z = []  # inference output
        for i in range(self.nl):
            if os.getenv('RKNN_model_hack', '0') != '0':
                x[i] = torch.sigmoid(self.m[i](x[i]))  # conv

        return x
  • Then add the following at the top of both yolo.py and export.py:

    import os
    os.environ['RKNN_model_hack'] = 'npu_2'
  • After these changes, export the ONNX model with the following command.

    Replace ./runs/train/exp3/weights/best.pt with the path to your own trained .pt file.

    python export.py --weights ./runs/train/exp3/weights/best.pt --img 640 --batch 1 --include onnx --opset 12
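
    As an optional sanity check (assuming the onnx Python package is installed, which the ONNX export already requires), you can list the outputs of the exported model; with the modified forward you should see three feature-map outputs rather than a single decoded tensor:

    import onnx

    # The path below matches the export command above; change it to your own file.
    model = onnx.load('./runs/train/exp3/weights/best.onnx')
    for out in model.graph.output:
        dims = [d.dim_value for d in out.type.tensor_type.shape.dim]
        print(out.name, dims)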

Converting .onnx to .rknn

  • Create an Ubuntu 20.04 environment using a virtual machine or WSL.

  • rknn-toolkit2 v1.4.0 only ships model-conversion wheels for Python 3.6 and Python 3.8, so you need a Python 3.6 or 3.8 environment. You can create one with conda; that is not covered in detail here.

  • First, go to your RKNN model-conversion workspace and clone the rknn-toolkit2 repository:

    git clone https://github.com/rockchip-linux/rknn-toolkit2.git -b v1.4.0
  • Go into the packages directory inside rknn-toolkit2.

    There should be two .whl files under packages. Since we are using Python 3.8, install the matching wheel:

    pip install rknn_toolkit2-1.4.0_22dcfef4-cp38-cp38-linux_x86_64.whl
  • After the installation finishes, check that it succeeded:

    python

    Inside the Python interpreter:

    from rknn.api import RKNN

    If no error is reported, the installation was successful.

  • Then go into examples/onnx/yolov5 under the rknn-toolkit2 directory:

    cd examples/onnx/yolov5

    Make a copy of the original test.py and name it mytest.py:

    cp test.py ./mytest.py

    A few places need to be modified (use Ctrl+F to jump straight to them):

    ONNX_MODEL = 'best.onnx'     # ONNX model to convert
    RKNN_MODEL = 'best.rknn'     # RKNN model produced by the conversion
    IMG_PATH = './1.jpg'         # image used for the test inference
    DATASET = './dataset.txt'    # text file listing the test/quantization images, one path per line
    QUANTIZE_ON = True           # leave unchanged
    OBJ_THRESH = 0.25            # leave unchanged
    NMS_THRESH = 0.45            # leave unchanged
    IMG_SIZE = 640               # leave unchanged
    CLASSES = ("person",)        # change to the labels of your trained model (note the trailing comma for a single class)

    # rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]])
    # Comment out the line above and replace it with the line below
    rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform='rk3588')  # target platform; here it is the RK3588
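
    dataset.txt is simply a plain-text list of image paths, one per line, used for quantization calibration and testing. Assuming your calibration images sit next to mytest.py as .jpg files (a hypothetical layout), it could be generated like this:

    import glob

    # Write one image path per line; rknn-toolkit2 reads these during quantization.
    with open('dataset.txt', 'w') as f:
        for path in sorted(glob.glob('./*.jpg')):
            f.write(path + '\n')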

    If you also want the script to display the detection result when it finishes, uncomment the following lines:

    # cv2.imshow("post process result", img_1)
    # cv2.waitKey(0)
    # cv2.destroyAllWindows()

    Then run in a terminal:

    python mytest.py

    If a best.rknn file appears in the folder, the conversion succeeded.
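
    For reference, the conversion that mytest.py performs boils down to the following rknn-toolkit2 calls (a condensed sketch; the full script also runs a test inference and draws the results):

    from rknn.api import RKNN

    ONNX_MODEL = 'best.onnx'
    RKNN_MODEL = 'best.rknn'
    DATASET = './dataset.txt'

    rknn = RKNN(verbose=True)
    # Same preprocessing config as in mytest.py, targeting the RK3588 NPU.
    rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform='rk3588')
    rknn.load_onnx(model=ONNX_MODEL)
    rknn.build(do_quantization=True, dataset=DATASET)  # quantize using the images listed in dataset.txt
    rknn.export_rknn(RKNN_MODEL)                       # writes best.rknn
    rknn.release()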

Deploying the RKNN model on the Orange Pi 5 Plus with real-time camera inference

    • Install Miniconda on the board's Ubuntu 20.04 system. Note that the RK3588 board is aarch64, so the Miniconda installer differs from the x86 one used earlier; you must pick the matching aarch64 version.

    • Installing Miniconda:

      Go to https://mirrors.tuna.tsinghua.edu.cn/anaconda/miniconda/ and download the version you need.

      Here we want Python 3.9 on the board, so we choose Miniconda3-py39_23.11.0-1-Linux-aarch64.sh.

      Use Ctrl+F to find that version, copy the installer onto the board, and run it:

      bash ./Miniconda3-py39_23.11.0-1-Linux-aarch64.sh

      Install it under your home directory and do not use sudo; here we install as the normal user without sudo, otherwise the conda command will later be reported as not found.

    • Then create the environment:

      conda create -n rknn_39 python=3.9
    • Activate the environment you just created:

      conda activate rknn_39
    • Get rknn_toolkit_lite2 onto the board.

      Either copy the rknn-toolkit2 (v1.4.0) repository from the PC to the board,

      or clone it again directly on the board:

      git clone https://github.com/rockchip-linux/rknn-toolkit2.git -b v1.4.0
    • Go into rknn_toolkit2/rknn-toolkit2/rknn_toolkit_lite2/packages and install the package matching Python 3.9.

      Unlike the PC side, here we install the lightweight rknn_toolkit_lite2 package:

      pip install rknn_toolkit_lite2-1.4.0-cp39-cp39-linux_aarch64.whl

      Test it:

      python
      from rknnlite.api import RKNNLite

      If no error is reported, the installation was successful.

    • Install the RKNPU runtime (rknpu2):

      git clone https://github.com/rockchip-linux/rknpu2

      Copy the following .so file to /usr/lib/:

      sudo cp rknpu2/runtime/RK3588/Linux/librknn_api/librknnrt.so /usr/lib/librknnrt.so
    • In your YOLOv5 detection workspace on the board, create a file named deploy.py with the following content:

      import urllib
      import time
      import sys
      import numpy as np
      import cv2
      from rknnlite.api import RKNNLite


      RKNN_MODEL = 'yolov5s.rknn'
      IMG_PATH = './bus.jpg'
      OBJ_THRESH = 0.25
      NMS_THRESH = 0.45
      IMG_SIZE = 640
      CLASSES = ("person", "bicycle", "car", "motorbike ", "aeroplane ", "bus ", "train", "truck ", "boat", "traffic light",
                 "fire hydrant", "stop sign ", "parking meter", "bench", "bird", "cat", "dog ", "horse ", "sheep", "cow", "elephant",
                 "bear", "zebra ", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite",
                 "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife ",
                 "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza ", "donut", "cake", "chair", "sofa",
                 "pottedplant", "bed", "diningtable", "toilet ", "tvmonitor", "laptop ", "mouse ", "remote ", "keyboard ", "cell phone", "microwave ",
                 "oven ", "toaster", "sink", "refrigerator ", "book", "clock", "vase", "scissors ", "teddy bear ", "hair drier", "toothbrush ")


      def sigmoid(x):
          return 1 / (1 + np.exp(-x))


      def xywh2xyxy(x):
          # Convert [x, y, w, h] to [x1, y1, x2, y2]
          y = np.copy(x)
          y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
          y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
          y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
          y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
          return y


      def process(input, mask, anchors):
          # Decode one output head: recover box centers/sizes from the grid and anchors
          anchors = [anchors[i] for i in mask]
          grid_h, grid_w = map(int, input.shape[0:2])

          box_confidence = sigmoid(input[..., 4])
          box_confidence = np.expand_dims(box_confidence, axis=-1)

          box_class_probs = sigmoid(input[..., 5:])

          box_xy = sigmoid(input[..., :2]) * 2 - 0.5

          col = np.tile(np.arange(0, grid_w), grid_w).reshape(-1, grid_w)
          row = np.tile(np.arange(0, grid_h).reshape(-1, 1), grid_h)
          col = col.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
          row = row.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
          grid = np.concatenate((col, row), axis=-1)
          box_xy += grid
          box_xy *= int(IMG_SIZE / grid_h)

          box_wh = pow(sigmoid(input[..., 2:4]) * 2, 2)
          box_wh = box_wh * anchors

          box = np.concatenate((box_xy, box_wh), axis=-1)

          return box, box_confidence, box_class_probs


      def filter_boxes(boxes, box_confidences, box_class_probs):
          boxes = boxes.reshape(-1, 4)
          box_confidences = box_confidences.reshape(-1)
          box_class_probs = box_class_probs.reshape(-1, box_class_probs.shape[-1])

          _box_pos = np.where(box_confidences >= OBJ_THRESH)
          boxes = boxes[_box_pos]
          box_confidences = box_confidences[_box_pos]
          box_class_probs = box_class_probs[_box_pos]

          class_max_score = np.max(box_class_probs, axis=-1)
          classes = np.argmax(box_class_probs, axis=-1)
          _class_pos = np.where(class_max_score >= OBJ_THRESH)

          boxes = boxes[_class_pos]
          classes = classes[_class_pos]
          scores = (class_max_score * box_confidences)[_class_pos]

          return boxes, classes, scores


      def nms_boxes(boxes, scores):
          x = boxes[:, 0]
          y = boxes[:, 1]
          w = boxes[:, 2] - boxes[:, 0]
          h = boxes[:, 3] - boxes[:, 1]

          areas = w * h
          order = scores.argsort()[::-1]

          keep = []
          while order.size > 0:
              i = order[0]
              keep.append(i)

              xx1 = np.maximum(x[i], x[order[1:]])
              yy1 = np.maximum(y[i], y[order[1:]])
              xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
              yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])

              w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
              h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
              inter = w1 * h1

              ovr = inter / (areas[i] + areas[order[1:]] - inter)
              inds = np.where(ovr <= NMS_THRESH)[0]
              order = order[inds + 1]
          keep = np.array(keep)
          return keep


      def yolov5_post_process(input_data):
          masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
          anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
                     [59, 119], [116, 90], [156, 198], [373, 326]]

          boxes, classes, scores = [], [], []
          for input, mask in zip(input_data, masks):
              b, c, s = process(input, mask, anchors)
              b, c, s = filter_boxes(b, c, s)
              boxes.append(b)
              classes.append(c)
              scores.append(s)

          boxes = np.concatenate(boxes)
          boxes = xywh2xyxy(boxes)
          classes = np.concatenate(classes)
          scores = np.concatenate(scores)

          nboxes, nclasses, nscores = [], [], []
          for c in set(classes):
              inds = np.where(classes == c)
              b = boxes[inds]
              c = classes[inds]
              s = scores[inds]

              keep = nms_boxes(b, s)

              nboxes.append(b[keep])
              nclasses.append(c[keep])
              nscores.append(s[keep])

          if not nclasses and not nscores:
              return None, None, None

          boxes = np.concatenate(nboxes)
          classes = np.concatenate(nclasses)
          scores = np.concatenate(nscores)

          return boxes, classes, scores


      def draw1(image, boxes, scores, classes):
          for box, score, cl in zip(boxes, scores, classes):
              top, left, right, bottom = box

              print('class: {}, score: {}'.format(CLASSES[cl], score))
              print('box coordinate left,top,right,down: [{}, {}, {}, {}]'.format(top, left, right, bottom))
              top = int(top)
              left = int(left)
              right = int(right)
              bottom = int(bottom)

              cv2.rectangle(image, (top, left), (right, bottom), (255, 0, 0), 2)
              cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
                          (top, left - 6),
                          cv2.FONT_HERSHEY_SIMPLEX,
                          0.6, (0, 0, 255), 2)


      def letterbox(im, new_shape=(640, 640), color=(0, 0, 0)):
          shape = im.shape[:2]  # current shape [height, width]
          if isinstance(new_shape, int):
              new_shape = (new_shape, new_shape)

          r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])

          ratio = r, r  # width, height ratios
          new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
          dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding

          dw /= 2  # divide padding into 2 sides
          dh /= 2

          if shape[::-1] != new_unpad:  # resize
              im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
          top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
          left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
          im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
          return im, ratio, (dw, dh)


      if __name__ == '__main__':
          rknn = RKNNLite()

          print('--> Load RKNN model')
          ret = rknn.load_rknn(RKNN_MODEL)
          if ret != 0:
              print('Load RKNN model failed')
              exit(ret)
          print('done')
          ret = rknn.init_runtime()
          if ret != 0:
              print('Init runtime environment failed!')
              exit(ret)
          print('done')

          capture = cv2.VideoCapture(0)

          ref, frame = capture.read()
          if not ref:
              raise ValueError("error reading")

          fps = 0.0
          while True:
              t1 = time.time()
              # read a frame from the camera
              ref, frame = capture.read()
              if not ref:
                  break
              # BGR to RGB
              frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

              # letterbox the frame to the model input size
              img = frame
              img, ratio, (dw, dh) = letterbox(img, new_shape=(IMG_SIZE, IMG_SIZE))
              img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

              # Inference
              print('--> Running model')
              outputs = rknn.inference(inputs=[img])

              input0_data = outputs[0]
              input1_data = outputs[1]
              input2_data = outputs[2]

              input0_data = input0_data.reshape([3, -1] + list(input0_data.shape[-2:]))
              input1_data = input1_data.reshape([3, -1] + list(input1_data.shape[-2:]))
              input2_data = input2_data.reshape([3, -1] + list(input2_data.shape[-2:]))

              input_data = list()
              input_data.append(np.transpose(input0_data, (2, 3, 0, 1)))
              input_data.append(np.transpose(input1_data, (2, 3, 0, 1)))
              input_data.append(np.transpose(input2_data, (2, 3, 0, 1)))

              boxes, classes, scores = yolov5_post_process(input_data)

              img_1 = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
              if boxes is not None:
                  draw1(img_1, boxes, scores, classes)

              fps = (fps + (1. / (time.time() - t1))) / 2
              print("fps= %.2f" % (fps))
              cv2.imshow("video", img_1[:, :, ::-1])
              c = cv2.waitKey(1) & 0xff
              if c == 27:
                  capture.release()
                  break
          print("Video Detection Done!")
          capture.release()
          cv2.destroyAllWindows()

      Change the following items to match your setup:

      RKNN_MODEL = 'yolov5s.rknn'  # change to the path of your model
      IMG_PATH = './bus.jpg'       # change to the path of the image you want to test
      OBJ_THRESH = 0.25            # usually left unchanged
      NMS_THRESH = 0.45            # usually left unchanged
      IMG_SIZE = 640               # usually left unchanged
      CLASSES = ("person", "bicycle", "car", "motorbike ", "aeroplane ", "bus ", "train", "truck ", "boat", "traffic light",
                 "fire hydrant", "stop sign ", "parking meter", "bench", "bird", "cat", "dog ", "horse ", "sheep", "cow", "elephant",
                 "bear", "zebra ", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite",
                 "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife ",
                 "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza ", "donut", "cake", "chair", "sofa",
                 "pottedplant", "bed", "diningtable", "toilet ", "tvmonitor", "laptop ", "mouse ", "remote ", "keyboard ", "cell phone", "microwave ",
                 "oven ", "toaster", "sink", "refrigerator ", "book", "clock", "vase", "scissors ", "teddy bear ", "hair drier", "toothbrush ")  # change to the labels of your trained model

      capture = cv2.VideoCapture(0)  # change to the index of the camera you are using
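
      One pitfall when adapting CLASSES to a custom single-class model: a one-element tuple in Python needs a trailing comma, otherwise CLASSES[cl] would index into the string itself. A minimal example:

      CLASSES = ("person",)    # correct: a 1-element tuple
      # CLASSES = ("person")   # wrong: this is just the string "person"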
    • Then place the .rknn model you generated into this workspace.

    • Run the code:

      python deploy.py
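
      If the board is running headless (for example over SSH without a display), cv2.imshow may fail; a minimal workaround (just a sketch, the filename is arbitrary) is to save the annotated frame instead:

      # Replace the cv2.imshow / cv2.waitKey lines in deploy.py with:
      cv2.imwrite('result.jpg', img_1[:, :, ::-1])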

If you made it all the way to the end, well done. Treat yourself.

That completes the deployment.

References

rknn-toolkit2 (v1.4.0)

https://github.com/rockchip-linux/rknn-toolkit2/tree/v1.4.0

Deploying a YOLOv5 model (.pt) on the RK3588(S) (real-time camera detection)

https://github.com/ChuanSe/yolov5-PT-to-RKNN

Step-by-step YOLOv5 tutorial: train on your own dataset

https://zhuanlan.zhihu.com/p/501798155

[RK3588] File "rknnlite/api/rknn_runtime.py", line 875, in rknnlite.api.rknn_runtime.RKNNRuntime.bui

https://blog.csdn.net/qq_36497369/article/details/134883399

[Solved] Confidence greater than 1 and spurious boxes after converting ONNX to RKNN

https://blog.csdn.net/zfenggo/article/details/136017885

Error when running the example: E Catch exception when setting inputs. #14

https://github.com/leafqycc/rknn-multi-threaded/issues/14