
How to Collect Image Datasets

Published: 2023-04-08 02:13:02. Source: 创意岭.

Hello everyone! Today the editor at 创意岭 looks at how to collect image datasets. The notes below summarize what we gathered on the question.


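Contents:

1. Creating an MXNet dataset and training on it
2. How to extract datasets from a MATLAB fig file and plot them again
3. How do I apply for the Idiap dataset?
4. How are deep learning datasets generated?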


I. Creating an MXNet dataset and training on it

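Installing MXNet

CPU version: pip install mxnet

GPU version: pip install mxnet-cu80

If your CUDA version is 9.0, use pip install mxnet-cu90 instead.

Install scikit-learn (imported as sklearn) and easydict with pip or apt.

Creating the dataset

Sort the images by class and put each class in its own folder.

Run python im2rec.py train --list ./ to generate a .lst file containing the image list.

Run python im2rec.py train ./ to generate the train.rec and train.idx training files.

--train-ratio 0.9 also generates a validation set (the .bin file); the value is the fraction of images assigned to the training set, with the remainder going to validation.

--resize 128 sets the image size of the generated dataset: the shorter edge of each image is resized to 128 pixels.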
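To make the listing step more concrete, here is a minimal Python sketch of what building an image list from class folders with a train/validation split might look like. It is not im2rec.py itself: the function names, the set of accepted extensions, and the exact line layout (tab-separated index, label, relative path) are assumptions chosen for illustration.

```python
import os
import random

# Hypothetical set of extensions treated as images.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def build_image_list(root):
    """Return (index, label, relative_path) rows, one label per class folder."""
    rows = []
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    for label, cls in enumerate(classes):
        for name in sorted(os.listdir(os.path.join(root, cls))):
            if os.path.splitext(name)[1].lower() in IMAGE_EXTS:
                rows.append((len(rows), label, os.path.join(cls, name)))
    return rows


def split_rows(rows, train_ratio=0.9, seed=0):
    """Shuffle a copy of the rows and split it into train and validation parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def write_list(path, rows):
    """Write one tab-separated line per image: index, label, relative path."""
    with open(path, "w", encoding="utf-8") as f:
        for index, label, rel_path in rows:
            f.write(f"{index}\t{label}\t{rel_path}\n")
```

With ten images and train_ratio=0.9, split_rows returns nine training rows and one validation row, which mirrors the idea behind the --train-ratio option.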

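Reference: https://github.com/apache/incubator-mxnet

Put the generated .rec, .idx, and (optional) .bin files in datasets/faces_emore.

Create a text file named property containing the number of images and the image height and width, for example: 86545 128 128

Example: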

python -u train.py --network m1 --loss softmax --dataset emore,1

After training with softmax, nsoftmax, arcface, or cosface, use the resulting model to run triplet-loss training.

Example:

python -u train.py --network m1 --loss triplet --lr 0.005 --pretrained ./models/m1-softmax-emore

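Parameter reference

--dataset: location of the training set; for the exact paths, see lines 108 to 120 of config.py

--network: network model. Candidates: r100 r100fc r50 r50v1 (ResNet-based); d169 d201 (DenseNet-based); y1 y2 (MobileFaceNet-based); m1 m0.5 (MobileNet-based); mnas mnas05 mnas025 (MnasNet-based)

--loss: loss function. Candidates: softmax (standard loss function), nsoftmax (combined loss function), arcface, cosface, combined, triplet (triplet loss), atriplet

--ckpt: when to save the model. 0: never save; 1: save when necessary (when validation accuracy reaches the threshold; nothing is saved if there is no validation set); 3: always save

--lr: learning rate

--lr-steps: learning-rate schedule, for example '10000,20000,2200000'; each time training reaches one of these counts, the learning rate is multiplied by 0.1

--per-batch-size: number of samples per training batch; the smaller the value, the less GPU memory is used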
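The step-wise decay described for --lr-steps can be sketched in a few lines of Python. This is an illustration of the idea only, not the code in train.py: the function names are hypothetical, and it assumes the decay is applied as soon as the counter reaches a milestone.

```python
def parse_lr_steps(text):
    """Turn a string such as '10000,20000,2200000' into a list of integers."""
    return [int(part) for part in text.split(",") if part.strip()]


def learning_rate(base_lr, lr_steps, step, factor=0.1):
    """Multiply base_lr by factor once for every milestone already reached."""
    reached = sum(1 for milestone in lr_steps if step >= milestone)
    return base_lr * (factor ** reached)


steps = parse_lr_steps("10000,20000,2200000")
for step in (0, 10000, 25000):
    print(step, learning_rate(0.005, steps, step))
```

Under these assumptions, a base rate of 0.005 stays unchanged before the first milestone, drops to roughly 0.0005 after it, and to roughly 0.00005 after the second.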

Reference:

https://github.com/deepinsight/insightface/tree/master/recognition

II. How to extract datasets from a MATLAB fig file and plot them again

If both curves are drawn in the same axes:

After plotting, run the following; gcf is the handle of the current figure.

ah=get(gcf,'children');

lineh=get(ah,'children');

x1=get(lineh(1),'xdata');

y1=get(lineh(1),'ydata');

x2=get(lineh(2),'xdata');

y2=get(lineh(2),'ydata');

If subplot was used to draw two plots in the same figure:

After plotting, run the following; gcf is the handle of the current figure.

ah=get(gcf,'children');

lineh1=get(ah(1),'children');

x1=get(lineh1,'xdata');

y1=get(lineh1,'ydata');

lineh4=get(ah(2),'children');

x2=get(lineh4,'xdata');

y2=get(lineh4,'ydata');

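The difference between the two cases:

In the first, the figure has one axes object containing two curves.

In the second, the figure has two axes objects, each containing one curve.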

III. How do I apply for the Idiap dataset?

The Replay-Mobile Database for face spoofing consists of 1190 video clips of photo and video attack attempts on 40 clients, under different lighting conditions. These videos have been recorded with current devices from the market: an iPad Mini2 tablet and an LG-G4 smartphone. This database has been produced at the Idiap Research Institute (Switzerland) within the framework of collaboration with the Galician Research and Development Center in Advanced Telecommunications, Gradiant (Spain).


Note:

The database would have consisted of 1200 videos. For client009 (test subset), however, we were not able to collect real-access videos (the person came for only the first recording session, not the second session). Therefore, for this client, we have only videos for the enrollment and the attacks.


2. Spoofing Attacks Description

This 2D face spoofing attack database consists of 1,190 video clips of photo and video attack attempts of 40 clients, under various lighting conditions.


The data is split into 4 sub-groups:

* Training data ("train"), to be used for training your anti-spoof classifier;

* Development data ("devel"), to be used for threshold estimation;

* Test data ("test"), with which to report error figures;

* Enrollment data ("enroll"), that can be used to verify spoofing sensitivity on face detection algorithms.


The data-sets 'train', 'devel', and 'test' are disjoint. Clients that appear in one of these data sets (train, devel or test) do not appear in the other two sets.


3. Database Description

--------------------

All videos are captured using the front camera of the mobile device (tablet or mobile). The front camera produces colour videos with a high-definition resolution of 720 pixels (width) by 1280 pixels (height), which are saved in ".mov" format. The frame rate is about 25 Hz. Real accesses have been performed using the face of the genuine user. Attack attempts have been performed by displaying a photo or a video recording of the same client for at least 10 seconds.


3.1 Lighting conditions

Real client accesses have been recorded under five different lighting conditions:

* **controlled** : The office light turned on, blinds down, background homogeneous

* **adverse** : The office light turned off, blinds halfway up, background homogeneous

* **direct** : The user captured the video in front of a window with direct sunlight, with more complex background.

* **lateral** : The user captured the video perpendicular to the window with lateral sunlight, with more complex background.

* **diffuse** : The user captured the video in an open hall with diffuse illumination, with more complex background.


With five lighting conditions and two recording devices, each client has 10 real-access videos.
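The count of real-access videos per client follows from pairing every lighting condition with every recording device. A small Python sketch of that arithmetic, where the tuple layout is only an illustration and not the dataset's actual file naming:

```python
from itertools import product

# Lighting conditions and devices as named in the database description.
LIGHTING = ["controlled", "adverse", "direct", "lateral", "diffuse"]
DEVICES = ["tablet", "mobile"]

# One real-access recording per (device, lighting) pair.
real_sessions = list(product(DEVICES, LIGHTING))

for device, lighting in real_sessions:
    print(device, lighting)

print(len(real_sessions))  # 10
```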

To produce the attacks, high-resolution photos and videos from each client have been used under similar conditions as in their authentication sessions.

* **lighton** : The user was sitting and the capture device was held on a tripod; the office light was turned on, blinds up, background homogeneous

* **lightoff** : The user was sitting and the capture device was held on a tripod; the office light was turned off, blinds up, background homogeneous


3.2 Acquiring the attack data

For photo attacks a Nikon Coolpix P520 camera has been used. The images have been captured with 18 Mpixel resolution. Video attacks have been captured using the back camera of an LG-G4 smartphone, which records 1080p FHD video clips through its 16 Mpixel camera.


Two kinds of attacks have been performed:

* **mattescreen** : All the attacks have been displayed on a Philips 227ELH screen (resolution 1920x1080 pixels). This matte screen avoids reflections. The videos have been recorded using devices supported on a stand. Two kinds of attacks have been performed using mattescreens:

    * **photo**: a still photo of the attacked identity is displayed on the screen.

    * **video**: a video showing the attacked identity is replayed on the screen.

* **print**: All the attacks have been printed on a Konica Minolta ineo+ 224e color laser printer. The videos have been recorded in two modes:

    * **fixed** : Using devices supported on a stand.

    * **hand** : Using devices held by the spoofer.

In total, 16 attack videos have been recorded for each client, 8 for each of the attacking modes described above.


3.3 Attack display categories

* 4 x mobile attacks using a mattescreen displaying:

    * 1 x mobile photo/lighton
    * 1 x mobile photo/lightoff
    * 1 x mobile video/lighton
    * 1 x mobile video/lightoff

* 4 x tablet attacks using a mattescreen displaying:

    * 1 x tablet photo/lighton
    * 1 x tablet photo/lightoff
    * 1 x tablet video/lighton
    * 1 x tablet video/lightoff

* 2 x Print attacks captured by smartphone with fixed support. The print image occupied the entire available printing surface on A4 paper for the following samples:

    * 1 x high-resolution print of photo/lighton
    * 1 x high-resolution print of photo/lightoff

* 2 x Print attacks captured by tablet with fixed support. The print image occupied the entire available printing surface on A4 paper for the following samples:

    * 1 x high-resolution print of photo/lighton
    * 1 x high-resolution print of photo/lightoff

* 2 x Print attacks captured by hand-held smartphone. The print image occupied the entire available printing surface on A4 paper for the following samples:

    * 1 x high-resolution print of photo/lighton
    * 1 x high-resolution print of photo/lightoff

* 2 x Print attacks captured by hand-held tablet. The print image occupied the entire available printing surface on A4 paper for the following samples:

    * 1 x high-resolution print of photo/lighton
    * 1 x high-resolution print of photo/lightoff
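The breakdown above can be checked by enumerating the combinations. The sketch below is only a counting aid; the tuples are hypothetical labels, not the file names used in the database.

```python
from itertools import product

DEVICES = ["mobile", "tablet"]    # recording device
LIGHT = ["lighton", "lightoff"]   # lighting of the attack source material

# Mattescreen attacks: each device records a photo and a video replay.
mattescreen = [
    ("mattescreen", device, media, light)
    for device, media, light in product(DEVICES, ["photo", "video"], LIGHT)
]

# Print attacks: each device records the printed photo, fixed or hand-held.
print_attacks = [
    ("print", device, support, light)
    for device, support, light in product(DEVICES, ["fixed", "hand"], LIGHT)
]

attacks = mattescreen + print_attacks
print(len(mattescreen), len(print_attacks), len(attacks))  # 8 8 16
```

This reproduces the stated totals: 8 videos for each of the two attack modes, 16 attack videos per client.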


The following images illustrate the set-up for capturing videos of attacks using a matte screen.

![Alt](images/mattescreen_attack_1.jpg)

![Alt](images/mattescreen_attack_2.jpg)

![Alt](images/mattescreen_attack_3.jpg)

The images below show how print-attack videos have been captured.

![Alt](images/print_attack_1.jpg)

![Alt](images/print_attack_2.jpg)


3.4 Dataset split

The 1200 real-accesses and attacks videos were then divided in the following way:

* **Training set**: contains 120 real-accesses and 192 attacks under different lighting conditions;

* **Development set**: contains 160 real-accesses and 256 attacks under different lighting conditions;

* **Test set**: contains 110 real-accesses and 192 attacks under different lighting conditions;

* **Enrollment set**: contains 160 real-accesses under various lighting conditions, to be used **exclusively** for studying the baseline performance of face recognition systems.
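Adding up the four subsets is a quick way to check the totals. The counts below are copied from the list above; the dictionary layout is just an illustration.

```python
# Video counts per subset, as listed in the split description.
SPLITS = {
    "train": {"real": 120, "attack": 192},
    "devel": {"real": 160, "attack": 256},
    "test": {"real": 110, "attack": 192},
    "enroll": {"real": 160, "attack": 0},
}

total = sum(sum(counts.values()) for counts in SPLITS.values())
attacks = sum(counts["attack"] for counts in SPLITS.values())

print(total)    # 1190
print(attacks)  # 640, i.e. 40 clients x 16 attack videos
```

The sum is 1190, the number of clips given at the start of the database description. The difference from the nominal 1200 is the 10 real-access videos that could not be recorded for client009 in the test subset.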


4. Face Locations

--------------

We also provide face locations (bounding boxes) automatically annotated by a cascade of classifiers based on a variant of Local Binary Patterns (LBP).

The bob package [bob.ip.facedetect](https://github.com/bioidiap/bob.ip.facedetect) has been used to generate the face-locations.

For each video, face-location is computed for each frame, and face-locations for all frames of a video are stored in a single file in a 4-column format (x,y,width, height). For each video, two versions of the face-location file are provided, one in ASCII text format (extension: .face), and the other in hdf5 format (extension: .hdf5).

The face-location files can be found in the folder `database/faceloc`.
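As a hedged illustration of reading the ASCII variant, the Python sketch below parses lines in the 4-column layout (x, y, width, height) described above. It assumes one whitespace-separated row per frame with numeric values; the names FaceBox, parse_face_lines, and read_face_file are hypothetical, and the real files should be inspected before relying on this layout.

```python
from typing import Iterable, List, NamedTuple


class FaceBox(NamedTuple):
    """Bounding box of a face in one video frame."""
    x: int
    y: int
    width: int
    height: int


def parse_face_lines(lines: Iterable[str]) -> List[FaceBox]:
    """Parse rows of 'x y width height', skipping blank lines."""
    boxes = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 columns, got {len(fields)}")
        x, y, width, height = (int(float(value)) for value in fields)
        boxes.append(FaceBox(x, y, width, height))
    return boxes


def read_face_file(path: str) -> List[FaceBox]:
    """Read a .face file and return one FaceBox per annotated frame."""
    with open(path, encoding="ascii") as f:
        return parse_face_lines(f)
```

Under the one-row-per-frame assumption, the frame index is simply the position of the box in the returned list.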


IV. How are deep learning datasets generated?

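Hello.

[...]genet network's pretrained model to train on your own dataset.

OK, the first thing is your own dataset. In MatConvNet, preparing the dataset for ImageNet training is not handled as conveniently as in toolboxes such as Caffe, where a train folder, a test folder, and two txt index files are all you need. It feels rather unfriendly. Later I will change its input to that friendlier format.

However, the class indices there start from 0, which does not suit MATLAB, so I changed them to start from 1. I also added a txt file of class labels. The modified version

After downloading, open the folder and you will see:

Here train contains all the images used for training, test contains all the test images, and train_label lists each image's name followed by its class label (starting from 1). Opening the txt file shows: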
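Assuming each line of the label file holds an image name followed by its class label, the shift from 0-based to 1-based labels described above could be sketched in Python as follows. The function name and the sample file names are hypothetical, and the actual layout of train_label may differ.

```python
def shift_labels(lines, offset=1):
    """Add offset to the trailing integer label of each 'name label' line."""
    shifted = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so names with spaces survive.
        name, label = line.rsplit(None, 1)
        shifted.append(f"{name} {int(label) + offset}")
    return shifted


# Labels that start at 0 (Caffe style) become labels that start at 1.
print(shift_labels(["cat_001.jpg 0", "dog_001.jpg 1"]))
# ['cat_001.jpg 1', 'dog_001.jpg 2']
```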