Preface

I recently needed to identify people in images where a frontal face is not always visible, so I had to turn to ReID (person re-identification) techniques; when a frontal face is available the problem is much easier, since there are plenty of ready-made face detection libraries. I couldn't find a good beginner-friendly tutorial online, so in this post I'll walk through how I use the fast-reid library to extract feature vectors for people, in the hope that it saves the next person some time.

The only ready-made ReID libraries I found were torchreid and fast-reid. I tried the former but couldn't get it running, and its code hasn't been maintained in a long time; fast-reid was released by JD more recently and is in better shape, so I went with the latter.

Environment Setup

Unlike many Python packages, fast-reid cannot simply be installed with pip install fastreid; the project page doesn't mention any pip package at all, so installation takes a bit more work.

Installing the Dependencies

First create a virtual environment (.venv or conda both work), then install the dependencies that fast-reid needs, as listed in the project's documentation:

  1. PyTorch ≥ 1.6
  2. torchvision (a version compatible with your PyTorch)
  3. yacs
  4. gdown
  5. sklearn
  6. termcolor
  7. tabulate
  8. faiss

The first seven can each be installed with pip install package_name. If downloads are slow or fail, append -i https://mirrors.aliyun.com/pypi/simple/ to switch to the Aliyun mirror.

The last one is trickier. Rather than pip, I recommend conda install -c pytorch faiss-cpu, which pulls faiss from the official pytorch channel and is quick and painless.

If you need the GPU build of faiss, use conda install -c pytorch faiss-gpu instead.
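
Once everything is installed, a quick sanity check from Python confirms the environment is usable (this snippet is my own addition, not part of fast-reid):

# Verify that the core dependencies import cleanly and report versions.
import torch
import torchvision
import faiss

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
print("faiss OK:", faiss.IndexFlatL2(8).ntotal == 0)  # build a tiny index as a smoke test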

Installing fast-reid

You can git clone the repository or download it directly as an archive. For feature extraction you only need two folders from the project: fastreid and configs.

  • The fastreid folder contains the library code written by JD
  • configs holds the configuration files; download it too, since the code in fastreid depends on it

Downloading a Model

Finally, download a pretrained model to do the feature extraction; pick whichever one you like from the project's model zoo.

I use the ResNet50 model; the download link is: https://github.com/JDAI-CV/fast-reid/releases/download/v0.1.1/market_bot_R50.pth

A plain wget url does the job; to save typing, just copy the command below to download the ResNet50 weights (the download may be slow or flaky without a proxy):

wget https://github.com/JDAI-CV/fast-reid/releases/download/v0.1.1/market_bot_R50.pth
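
If wget isn't available on your system, the same file can be fetched from Python using only the standard library (a minimal sketch; the output path is up to you):

# Download the pretrained weights without wget, via the standard library.
import urllib.request

url = "https://github.com/JDAI-CV/fast-reid/releases/download/v0.1.1/market_bot_R50.pth"
urllib.request.urlretrieve(url, "./market_bot_R50.pth")
print("saved market_bot_R50.pth")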

Code

Arrange the two folders and the model file like this:

./
├── configs
├── fastreid
└── market_bot_R50.pth

Then create a new Python file and paste in the following code:

# encoding: utf-8
"""
@author: xingyu liao
@contact: sherlockliao01@gmail.com
"""

import atexit
import bisect
from collections import deque

import cv2
import torch
import torch.multiprocessing as mp

from fastreid.engine import DefaultPredictor

try:
    mp.set_start_method('spawn')
except RuntimeError:
    pass


class FeatureExtractionDemo(object):
    def __init__(self, cfg, parallel=False):
        """
        Args:
            cfg (CfgNode):
            parallel (bool): whether to run the model in separate processes,
                which is useful since the visualization logic can be slow.
        """
        self.cfg = cfg
        self.parallel = parallel

        if parallel:
            self.num_gpus = torch.cuda.device_count()
            self.predictor = AsyncPredictor(cfg, self.num_gpus)
        else:
            self.predictor = DefaultPredictor(cfg)

    def run_on_image(self, original_image):
        """
        Args:
            original_image (np.ndarray): an image of shape (H, W, C) (in BGR order).
                This is the format used by OpenCV.

        Returns:
            predictions (np.ndarray): normalized feature of the model.
        """
        # the model expects RGB inputs
        original_image = original_image[:, :, ::-1]
        # Apply pre-processing to the image.
        image = cv2.resize(original_image, tuple(self.cfg.INPUT.SIZE_TEST[::-1]), interpolation=cv2.INTER_CUBIC)
        # Convert to a CHW float tensor and add a batch dimension for the network input.
        image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))[None]
        predictions = self.predictor(image)
        return predictions

    def run_on_loader(self, data_loader):
        if self.parallel:
            buffer_size = self.predictor.default_buffer_size

            batch_data = deque()

            for cnt, batch in enumerate(data_loader):
                batch_data.append(batch)
                self.predictor.put(batch["images"])

                if cnt >= buffer_size:
                    batch = batch_data.popleft()
                    predictions = self.predictor.get()
                    yield predictions, batch["targets"].cpu().numpy(), batch["camids"].cpu().numpy()

            while len(batch_data):
                batch = batch_data.popleft()
                predictions = self.predictor.get()
                yield predictions, batch["targets"].cpu().numpy(), batch["camids"].cpu().numpy()
        else:
            for batch in data_loader:
                predictions = self.predictor(batch["images"])
                yield predictions, batch["targets"].cpu().numpy(), batch["camids"].cpu().numpy()


class AsyncPredictor:
    """
    A predictor that runs the model asynchronously, possibly on more than one GPU.
    Useful when the amount of data is large.
    """

    class _StopToken:
        pass

    class _PredictWorker(mp.Process):
        def __init__(self, cfg, task_queue, result_queue):
            self.cfg = cfg
            self.task_queue = task_queue
            self.result_queue = result_queue
            super().__init__()

        def run(self):
            predictor = DefaultPredictor(self.cfg)

            while True:
                task = self.task_queue.get()
                if isinstance(task, AsyncPredictor._StopToken):
                    break
                idx, data = task
                result = predictor(data)
                self.result_queue.put((idx, result))

    def __init__(self, cfg, num_gpus: int = 1):
        """
        Args:
            cfg (CfgNode):
            num_gpus (int): if 0, will run on CPU
        """
        num_workers = max(num_gpus, 1)
        self.task_queue = mp.Queue(maxsize=num_workers * 3)
        self.result_queue = mp.Queue(maxsize=num_workers * 3)
        self.procs = []
        for gpuid in range(max(num_gpus, 1)):
            cfg = cfg.clone()
            cfg.defrost()
            cfg.MODEL.DEVICE = "cuda:{}".format(gpuid) if num_gpus > 0 else "cpu"
            self.procs.append(
                AsyncPredictor._PredictWorker(cfg, self.task_queue, self.result_queue)
            )

        self.put_idx = 0
        self.get_idx = 0
        self.result_rank = []
        self.result_data = []

        for p in self.procs:
            p.start()

        atexit.register(self.shutdown)

    def put(self, image):
        self.put_idx += 1
        self.task_queue.put((self.put_idx, image))

    def get(self):
        self.get_idx += 1
        if len(self.result_rank) and self.result_rank[0] == self.get_idx:
            res = self.result_data[0]
            del self.result_data[0], self.result_rank[0]
            return res

        while True:
            # Make sure the results are returned in the correct order
            idx, res = self.result_queue.get()
            if idx == self.get_idx:
                return res
            insert = bisect.bisect(self.result_rank, idx)
            self.result_rank.insert(insert, idx)
            self.result_data.insert(insert, res)

    def __len__(self):
        return self.put_idx - self.get_idx

    def __call__(self, image):
        self.put(image)
        return self.get()

    def shutdown(self):
        for _ in self.procs:
            self.task_queue.put(AsyncPredictor._StopToken())

    @property
    def default_buffer_size(self):
        return len(self.procs) * 5


import argparse
import glob
import os
import sys

import torch.nn.functional as F
import cv2
import numpy as np
import tqdm
from torch.backends import cudnn

sys.path.append('.')

from fastreid.config import get_cfg
from fastreid.utils.logger import setup_logger
from fastreid.utils.file_io import PathManager


# Read the configuration file
def setup_cfg(args):
    # load config from file and command-line arguments
    cfg = get_cfg()
    # add_partialreid_config(cfg)
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)
    cfg.freeze()
    return cfg


def get_parser():
    parser = argparse.ArgumentParser(description="Feature extraction with reid models")
    parser.add_argument(
        "--config-file",  # path to the config file, which usually also points at the model
        metavar="FILE",
        help="path to config file",
    )
    parser.add_argument(
        "--parallel",  # whether to run in parallel
        action='store_true',
        help='If use multiprocess for feature extraction.'
    )
    parser.add_argument(
        "--input",  # input image paths
        nargs="+",
        help="A list of space separated input images; "
             "or a single glob pattern such as 'directory/*.webp'",
    )
    parser.add_argument(
        "--output",  # output path
        default='demo_output',
        help='path to save features'
    )
    parser.add_argument(
        "--opts",
        help="Modify config options using the command-line 'KEY VALUE' pairs",
        default=[],
        nargs=argparse.REMAINDER,
    )
    return parser


def postprocess(features):
    # Normalize features to compute cosine distance
    features = F.normalize(features)
    features = features.cpu().data.numpy()
    return features


def get_feature_extractor(model_path="./market_bot_R50.pth",
                          config_file="./configs/Market1501/bagtricks_R50.yml",
                          parallel=False):
    args = get_parser().parse_args([])  # default args, no command line needed
    args.config_file = config_file
    args.opts.extend(["MODEL.WEIGHTS", model_path])
    args.parallel = parallel  # whether to run in parallel

    cfg = setup_cfg(args)  # read the config
    return FeatureExtractionDemo(cfg, parallel=args.parallel)  # build the feature extractor, i.e. load the model

A quick explanation of the code: everything from the top of the file down to the definition of the postprocess function is copied from demo/predictor.py in the repo, which already implements a feature-extraction wrapper, FeatureExtractionDemo: given an image read with cv2, it returns the image's feature vector.

The get_feature_extractor function at the end is my own addition; it initializes the FeatureExtractionDemo. Its parameters:

  • model_path is the path to the downloaded pretrained model
  • config_file is the config file; I use "./configs/Market1501/bagtricks_R50.yml" and haven't tried the others
  • parallel toggles parallel execution

args = get_parser().parse_args([]) builds a default args object, so nothing has to be passed on the command line and the function's own parameters can be used instead:

args.config_file = config_file
args.opts.extend(["MODEL.WEIGHTS", model_path])
args.parallel = parallel  # whether to run in parallel

The rest is just the logic from demo/predictor.py:

cfg = setup_cfg(args)  # read the config
return FeatureExtractionDemo(cfg, parallel=args.parallel)  # build the feature extractor, i.e. load the model

With that in place, create the feature extractor like this, pass it the result of a cv2 read, and you get the feature vector back:

import cv2

model = get_feature_extractor()

img_path = "path/to/your/image"
img = cv2.imread(img_path)
feature = model.run_on_image(img)
print(feature.shape)
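
Since postprocess L2-normalizes the features precisely so that cosine distance works, comparing two people reduces to a dot product. A minimal sketch (the two image paths are placeholders; each crop should contain a single person):

import cv2
import numpy as np

model = get_feature_extractor()

# Extract and L2-normalize a feature for each person crop (paths are placeholders).
feat_a = postprocess(model.run_on_image(cv2.imread("person_a.jpg")))
feat_b = postprocess(model.run_on_image(cv2.imread("person_b.jpg")))

# Features are L2-normalized, so cosine similarity is just a dot product;
# values close to 1 suggest the two crops show the same person.
similarity = float(np.dot(feat_a, feat_b.T))
print(similarity)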