├── Config.py
├── README.md
├── Test.py
├── Train.py
├── __pycache__
│   ├── Config.cpython-37.pyc
│   ├── augmentations.cpython-37.pyc
│   ├── detection.cpython-37.pyc
│   ├── l2norm.cpython-37.pyc
│   ├── loss_function.cpython-37.pyc
│   ├── my_window.cpython-37.pyc
│   ├── ssd_net_vgg.cpython-37.pyc
│   ├── utils.cpython-37.pyc
│   └── voc0712.cpython-37.pyc
├── augmentations.py
├── bus_dataset.log
├── camera.py
├── camera_detection.py
├── camera_detection_1.py
├── detection.py
├── dnf_test.jpg
├── dnf_test_done.jpg
├── environment.yml
├── eval.py
├── l2norm.py
├── loss_function.py
├── model_file_test.py
├── result.jpg
├── ssd_net_vgg.py
├── test.jpg
├── test_done.jpg
├── utils.py
├── video_detection.py
├── voc0712.py
└── weights
    └── readme.txt

/Config.py:
--------------------------------------------------------------------------------
1 | '''
2 | This project is my free, open-source project on GitHub (Gitee for users in China). If you paid to download it on some platform (CSDN, Taobao), please let me know by email (PengfeiM@outlook.com).
3 | '''
4 | import os.path as osp
5 | sk = [ 15, 30, 60, 111, 162, 213, 264 ]
6 | feature_map = [ 38, 19, 10, 5, 3, 1 ]
7 | steps = [ 8, 16, 32, 64, 100, 300 ]
8 | image_size = 300
9 | aspect_ratios = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]
10 | MEANS = (104, 117, 123)
11 | batch_size = 8
12 | data_load_number_worker = 0
13 | lr = 5e-4
14 | momentum = 0.9
15 | weight_decacy = 5e-4
16 | gamma = 0.1
17 | VOC_ROOT = osp.join('./', "dataset/")
18 | dataset_root = VOC_ROOT
19 | use_cuda = True
20 | lr_steps = (80000, 100000, 120000)
21 | max_iter = 120000
22 | class_num = 5
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Fatigue Driving Detection Based on CNN(基于 CNN 的疲劳驾驶检测)
2 | A simple undergraduate-thesis side project.
3 | ## Update(更新)
4 | > [!NOTE] I haven't updated this project for a long time, for various reasons. I finally found the time to set my development environment up again.
5 | > I'm glad to see that people are still following this project; I just couldn't reply to your questions in time for personal reasons. Please bear with me.
6 | > Over the next while I will fix some of the problems in this project in my spare time (mainly version incompatibilities). I have also committed my own environment to the
7 | > repo: if you use miniconda or anaconda, you can import the project's virtual environment directly from `environment.yml`.
8 | > Even if you don't use conda, I think the list can still help you set up an environment.
9 | > For now I will debug the project on CPU; if possible, I will add the GPU environment setup and instructions later.
10 | 
11 | **My hardware configuration**
12 | > For reference
13 | ```bash
14 | _,met$$$$$gg. revan_m@Ebon-Hawk
15 | ,g$$$$$$$$$$$$$$$P. -----------------
16 | ,g$$P"" """Y$$.". OS: Debian GNU/Linux 12 (bookworm) x86_64
17 | ,$$P' `$$$. Host: Windows Subsystem for Linux - Debian (2.4.13)
18 | ',$$P ,ggs. `$$b: Kernel: Linux 5.15.167.4-microsoft-standard-WSL2
19 | `d$$' ,$P"' . $$$
20 | $$P d$' , $$P
21 | $$: $$. - ,d$$' Shell: zsh 5.9
22 | $$; Y$b._ _,d$P' WM: WSLg 1.0.65 (Wayland)
23 | Y$$. `.`"Y$$$$P"' Terminal: tmux 3.5a
24 | `$$b "-.__ CPU: 11th Gen Intel(R) Core(TM) i7-11800H (4) @ 2.30 GHz
25 | `Y$$b GPU 1: Microsoft Basic Render Driver
26 | `Y$$. GPU 2: Microsoft Basic Render Driver
27 | `$$b. Memory: 740.60 MiB / 7.63 GiB (9%)
28 | `Y$$b. Swap: 0 B / 2.00 GiB (0%)
29 | `"Y$b._
30 | `""""
31 | 
32 | 
33 | 
34 | 
35 | Battery (Microsoft Hyper-V Virtual Batte): 100% [AC Connected]
36 | Locale: zh_CN.UTF-8
37 | ```
38 | Ah, it looks like the GPU info is missing here. My machine actually has a discrete RTX 3060 Laptop GPU; if you know this card, you roughly know what it can do, so I won't elaborate.
39 | 
40 | 
41 | ## Disclaimer(郑重声明):
42 | This project is my free, open-source project on GitHub (Gitee for users in China). I have not authorized any platform (CSDN, Taobao) to sell it.
43 | This project is open-source on github and gitee.
44 | No authorization to any platform to sell my project.
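
As mentioned in the Update section above, the committed `environment.yml` can be imported directly. A minimal sketch of what that looks like with conda (the environment name `torch` comes from the file itself):

```bash
# create the environment from the exported spec, then activate it
conda env create -f environment.yml
conda activate torch
```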
45 | 
46 | > [!IMPORTANT] If you need the thesis, please send an [email](mailto:PengfeiM@outlook.com) directly instead of opening an issue; issues are for problems with the code or for sharing your ideas.
47 | 
48 | ## Runtime Environment(运行环境):
49 | 
50 | > 1. Python 3.7.1
51 | > 2. PyTorch 1.0.1
52 | > 3. opencv-python
53 | > 4. CUDA was probably 8 or 9; it has been too long to remember. What really matters is that the GPU, CUDA, cuDNN and PyTorch versions match each other.
54 | 
55 | ## Notes(说明)
56 | 
57 | Pre-trained weight file: [vgg_16]
58 | 
59 | 1. For the detailed configuration, see Config.py -- the file that saves the configuration
60 | 2. Training: `python Train.py` -- the file that starts the training and controls the loops
61 | 3. Single-image test: `python Test.py` -- the file that tests the SSD with one image
62 | 4. Network evaluation: `python eval.py` -- the file that evaluates the performance
63 | 5. Video test: `python camera_detection.py` -- the file that tests the CNN with a video sequence
64 | 
65 | ## Progress(目前进度): All Done
66 | 
67 | | Item | Status |
68 | | ---------------- | ---- |
69 | | PERCLOS computation | DONE |
70 | | Blink-frequency computation | DONE |
71 | | Yawn detection and counting | DONE |
72 | | Fatigue decision | DONE |
73 | 
74 | ## Main Files(主要文件说明):
75 | 
76 | ssd_net_vgg.py -- defines the SSD class (the SSD CNN)
77 | Train.py -- training code
78 | voc0712.py -- dataset-processing code (the filename was kept; renaming it would mean touching other code as well)
79 | loss_function.py -- the loss function
80 | detection.py -- post-processes the detection results, turning the raw SSD output into a form OpenCV can draw
81 | eval.py -- evaluates network performance
82 | Test.py -- single-image test. PS: there is no argument interface, so to test a different image you edit the filename inside the script
83 | l2norm.py -- L2 normalization
84 | Config.py -- configuration parameters
85 | utils.py -- utility functions
86 | camera.py -- OpenCV camera test
87 | camera_detection.py -- camera detection, V1/V2
88 | video_detection.py -- video detection, V3
89 | 
90 | ## Dataset Layout(数据集结构):
91 | 
92 | > /dataset:
93 | >
94 | > > /Annotations -- the xml files containing the object annotations
95 | > > /ImageSets/Main -- the files listing the image names
96 | > > /JPEGImages -- the images
97 | > > /gray2rgb.m -- converts grayscale images to three channels
98 | > > /txt.py -- the script that generates the ImageSets files
99 | 
100 | ## Weight File Directory(权重文件存放路径):
101 | 
102 | weights
103 | Tested images are written to:
104 | tested
105 | 
106 | ## Reference Code(参考代码):
107 | 
108 | https://github.com/amdegroot/ssd.pytorch
109 | 
110 | ## Dataset and Weight Files(数据集和权重文件):
111 | (As for the file some of the code refers to (ssd_voc_5000_plus.pth): I dug through an old USB drive and managed to find it.)
112 | Baidu Netdisk:
113 | [Dataset and weights](https://pan.baidu.com/s/1cgl94gxSNEW0ZI-wYcZtpQ)
114 | Extraction code: hwsi
115 | Onedrive:
116 | [Dataset](https://mailustceducn-my.sharepoint.com/:u:/g/personal/mpf916_mail_ustc_edu_cn/ER0UB-cAe1VDp9hJZ7e5Ef4B7kGvVX4PePSj7WRtb9VrLQ?e=lbDnjV)
117 | [Weights](https://mailustceducn-my.sharepoint.com/:f:/g/personal/mpf916_mail_ustc_edu_cn/EqGCPA3SGz5Mp-RMHJSoSSwBg-KG09qwgSAPiOjMOcVVtQ?e=v5yhQz)
118 | 
119 | ## Testing(测试)
120 | 
121 | 1. Run Train.py to train.
122 | 2. eval.py tests the whole test set; Test.py tests a single image.
123 | 
124 | ## Issues and Discussions(关于问题讨论)
125 | Feel free to open an issue about problems in the code. Discussions are also enabled on this repository, so please raise common questions in Discussions (I will move some of the older issues there as well).
126 | 
127 | ## Contact(关于咨询)
128 | If issues and Discussions don't cover what you need, you can always email me (PengfeiM@outlook.com) with your question.
129 | Whether it is an issue/discussion or an email, I will reply as soon as I can (GitHub notifies me of updates by email, and I regularly check the GitHub mobile app).
130 | 
131 | **Finally, if you would like to support my work, please scan the QR code below**
132 | ![My Alipay](https://user-images.githubusercontent.com/45191163/116050673-55db0400-a6aa-11eb-9588-cc0546e89f70.jpg)
133 | 
134 | **Thank you for your support and help**
135 | ## Star History
136 | [![Star History Chart](https://api.star-history.com/svg?repos=PengfeiM/Fatigue-Driven-Detection-Based-on-CNN&type=Date)](https://www.star-history.com/#PengfeiM/Fatigue-Driven-Detection-Based-on-CNN&Date)
137 | 
138 | 
--------------------------------------------------------------------------------
/Test.py:
--------------------------------------------------------------------------------
1 | import torch
2 | # import pdb
3 | 
4 | from torch.autograd import Variable
5 | from detection import Detect
6 | # from ssd_net_vgg import *
7 | from ssd_net_vgg import SSD
8 | # from voc0712 import *
9 | from voc0712 import VOC_CLASSES
10 | import Config as config
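# What follows below: build the SSD, load the trained weights, preprocess a single image
# (resize to 300x300, subtract the BGR channel means, flip to RGB, reorder to CHW),
# decode the network output with Detect, then draw the boxes with OpenCV.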
11 | import torch.nn as nn 12 | import numpy as np 13 | import cv2 14 | import utils 15 | if torch.cuda.is_available(): 16 | torch.set_default_tensor_type('torch.cuda.FloatTensor') 17 | colors_tableau = [(255, 255, 255), (31, 119, 180), (174, 199, 232), (255, 127, 14), (255, 187, 120), 18 | (44, 160, 44), (152, 223, 138), (214, 39, 40), (255, 152, 150), 19 | (148, 103, 189), (197, 176, 213), (140, 86, 75), (196, 156, 148), 20 | (227, 119, 194), (247, 182, 210), (127, 127, 127), (199, 199, 199), 21 | (188, 189, 34), (219, 219, 141), (23, 190, 207), (158, 218, 229), (158, 218, 229), (158, 218, 229)] 22 | 23 | net = SSD() # initialize SSD 24 | net = torch.nn.DataParallel(net) 25 | net.train(mode=False) 26 | # net.load_state_dict(torch.load('./weights/ssd300_VOC_100000.pth',map_location=lambda storage, loc: storage)) 27 | net.load_state_dict(torch.load('./weights/ssd_voc_120000.pth', map_location=lambda storage, loc: storage)) 28 | img_id = 60 29 | name = 'test' 30 | image = cv2.imread('./' + name + '.jpg', cv2.IMREAD_COLOR) 31 | x = cv2.resize(image, (300, 300)).astype(np.float32) 32 | x -= (104.0, 117.0, 123.0) 33 | x = x.astype(np.float32) 34 | x = x[:, :, ::-1].copy() 35 | # plt.imshow(x) 36 | x = torch.from_numpy(x).permute(2, 0, 1) 37 | xx = Variable(x.unsqueeze(0)) # wrap tensor in Variable 38 | if torch.cuda.is_available(): 39 | xx = xx.cuda() 40 | y = net(xx) 41 | softmax = nn.Softmax(dim=-1) 42 | # detect = Detect(config.class_num, 0, 200, 0.01, 0.45) 43 | detect = Detect.apply # pytorch新版本需要这样使用 44 | priors = utils.default_prior_box() 45 | 46 | loc, conf = y 47 | loc = torch.cat([o.view(o.size(0), -1) for o in loc], 1) 48 | conf = torch.cat([o.view(o.size(0), -1) for o in conf], 1) 49 | 50 | detections = detect( 51 | loc.view(loc.size(0), -1, 4), 52 | softmax(conf.view(conf.size(0), -1, config.class_num)), 53 | torch.cat([o.view(-1, 4) for o in priors], 0), 54 | config.class_num, 55 | 200, 56 | 0.7, 57 | 0.45 58 | ).data 59 | # detections = detect.apply 60 | 61 | labels = VOC_CLASSES 62 | top_k = 10 63 | 64 | # plt.imshow(rgb_image) # plot the image for matplotlib 65 | 66 | # scale each detection back up to the image 67 | scale = torch.Tensor(image.shape[1::-1]).repeat(2) 68 | for i in range(detections.size(1)): 69 | j = 0 70 | while detections[0, i, j, 0] >= 0.4: 71 | score = detections[0, i, j, 0] 72 | label_name = labels[i - 1] 73 | display_txt = '%s: %.2f' % (label_name, score) 74 | pt = (detections[0, i, j, 1:] * scale).cpu().numpy() 75 | rec = [int(e) for e in pt] 76 | # pdb.set_trace() 77 | coords = (pt[0], pt[1]), pt[2] - pt[0] + 1, pt[3] - pt[1] + 1 78 | color = colors_tableau[i] 79 | # cv2.rectangle(image, (pt[0], pt[1]), (pt[2], pt[3]), color, 2) 80 | print(rec) 81 | cv2.rectangle(img=image, rec=rec, color=color, thickness=2) 82 | cv2.putText(image, display_txt, (int(pt[0]), int(pt[1]) + 10), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 255, 255), 1, 8) 83 | j += 1 84 | # cv2.imshow('test', image) 85 | # cv2.waitKey(100000) # not implemented in headless version 86 | print("------end-------") 87 | cv2.imwrite(name + '_done.jpg', image) 88 | -------------------------------------------------------------------------------- /Train.py: -------------------------------------------------------------------------------- 1 | ''' 2 | 本项目是我在github(国内的话是gitee)的免费开源项目。如果你在某些平台(CSDN、淘宝)付费下载了该项目,烦请告知(邮箱(PengfeiM@outlook.com))。 3 | ''' 4 | 5 | import torch 6 | import Config 7 | if Config.use_cuda: 8 | torch.set_default_tensor_type('torch.cuda.FloatTensor') 9 | if not Config.use_cuda: 10 | 
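    # note: this warning text appears to be inherited from the upstream ssd.pytorch train script;
    # it is printed whenever Config.use_cuda is False, even if no CUDA device exists (this script has no --cuda flag)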
print("WARNING: It looks like you have a CUDA device, but aren't " + 11 | "using CUDA.\nRun with --cuda for optimal training speed.") 12 | torch.set_default_tensor_type('torch.FloatTensor') 13 | 14 | import torch.nn as nn 15 | import cv2 16 | import utils 17 | import loss_function 18 | import voc0712 19 | import augmentations 20 | import ssd_net_vgg 21 | import torch.utils.data as data 22 | import torch.optim as optim 23 | from torch.autograd import Variable 24 | def adjust_learning_rate(optimizer, gamma, step): 25 | """Sets the learning rate to the initial LR decayed by 10 at every 26 | specified step 27 | # Adapted from PyTorch Imagenet example: 28 | # https://github.com/pytorch/examples/blob/master/imagenet/main.py 29 | """ 30 | lr = Config.lr * (gamma ** (step)) 31 | for param_group in optimizer.param_groups: 32 | param_group['lr'] = lr 33 | def detection_collate(batch): 34 | """Custom collate fn for dealing with batches of images that have a different 35 | number of associated object annotations (bounding boxes). 36 | 37 | Arguments: 38 | batch: (tuple) A tuple of tensor images and lists of annotations 39 | 40 | Return: 41 | A tuple containing: 42 | 1) (tensor) batch of images stacked on their 0 dim 43 | 2) (list of tensors) annotations for a given image are stacked on 44 | 0 dim 45 | """ 46 | targets = [] 47 | imgs = [] 48 | for sample in batch: 49 | imgs.append(sample[0]) 50 | targets.append(torch.FloatTensor(sample[1])) 51 | return torch.stack(imgs, 0), targets 52 | def xavier(param): 53 | nn.init.xavier_uniform_(param) 54 | def weights_init(m): 55 | if isinstance(m, nn.Conv2d): 56 | xavier(m.weight.data) 57 | m.bias.data.zero_() 58 | def train(): 59 | dataset = voc0712.VOCDetection(root=Config.dataset_root, 60 | transform=augmentations.SSDAugmentation(Config.image_size, 61 | Config.MEANS)) 62 | data_loader = data.DataLoader(dataset, Config.batch_size, 63 | num_workers=Config.data_load_number_worker, 64 | shuffle=True, collate_fn=detection_collate, 65 | pin_memory=True) 66 | 67 | net = ssd_net_vgg.SSD() 68 | vgg_weights = torch.load('./weights/vgg16_reducedfc.pth') 69 | 70 | net.apply(weights_init) 71 | net.vgg.load_state_dict(vgg_weights) 72 | # net.apply(weights_init) 73 | if Config.use_cuda: 74 | net = torch.nn.DataParallel(net) 75 | net = net.cuda() 76 | net.train() 77 | loss_fun = loss_function.LossFun() 78 | optimizer = optim.SGD(net.parameters(), lr=Config.lr, momentum=Config.momentum, 79 | weight_decay=Config.weight_decacy) 80 | iter = 0 81 | step_index = 0 82 | before_epoch = -1 83 | for epoch in range(1000): 84 | for step,(img,target) in enumerate(data_loader): 85 | if Config.use_cuda: 86 | img = img.cuda() 87 | target = [ann.cuda() for ann in target] 88 | img = torch.Tensor(img) 89 | loc_pre,conf_pre = net(img) 90 | priors = utils.default_prior_box() 91 | optimizer.zero_grad() 92 | loss_l,loss_c = loss_fun((loc_pre,conf_pre),target,priors) 93 | loss = loss_l + loss_c 94 | loss.backward() 95 | optimizer.step() 96 | if iter % 1 == 0 or before_epoch!=epoch: 97 | print('epoch : ',epoch,' iter : ',iter,' step : ',step,' loss : ',loss.item()) 98 | before_epoch = epoch 99 | iter+=1 100 | if iter in Config.lr_steps: 101 | step_index+=1 102 | adjust_learning_rate(optimizer,Config.gamma,step_index) 103 | if iter % 10000 == 0 and iter!=0: 104 | torch.save(net.state_dict(), 'weights/ssd300_VOC_' + 105 | repr(iter) + '.pth') 106 | if iter >= Config.max_iter: 107 | break 108 | torch.save(net.state_dict(), 'weights/ssd_voc_120000.pth') 109 | 110 | if __name__ == '__main__': 111 | 
train() 112 | 113 | 114 | 115 | -------------------------------------------------------------------------------- /__pycache__/Config.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/Config.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/augmentations.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/augmentations.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/detection.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/detection.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/l2norm.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/l2norm.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/loss_function.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/loss_function.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/my_window.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/my_window.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/ssd_net_vgg.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/ssd_net_vgg.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/utils.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/utils.cpython-37.pyc -------------------------------------------------------------------------------- /__pycache__/voc0712.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/__pycache__/voc0712.cpython-37.pyc -------------------------------------------------------------------------------- /augmentations.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torchvision import transforms 3 | import cv2 4 | import numpy as np 5 | import 
types 6 | from numpy import random 7 | 8 | def intersect(box_a, box_b): 9 | max_xy = np.minimum(box_a[:, 2:], box_b[2:]) 10 | min_xy = np.maximum(box_a[:, :2], box_b[:2]) 11 | inter = np.clip((max_xy - min_xy), a_min=0, a_max=np.inf) 12 | return inter[:, 0] * inter[:, 1] 13 | 14 | 15 | def jaccard_numpy(box_a, box_b): 16 | """Compute the jaccard overlap of two sets of boxes. The jaccard overlap 17 | is simply the intersection over union of two boxes. 18 | E.g.: 19 | A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B) 20 | Args: 21 | box_a: Multiple bounding boxes, Shape: [num_boxes,4] 22 | box_b: Single bounding box, Shape: [4] 23 | Return: 24 | jaccard overlap: Shape: [box_a.shape[0], box_a.shape[1]] 25 | """ 26 | inter = intersect(box_a, box_b) 27 | area_a = ((box_a[:, 2]-box_a[:, 0]) * 28 | (box_a[:, 3]-box_a[:, 1])) # [A,B] 29 | area_b = ((box_b[2]-box_b[0]) * 30 | (box_b[3]-box_b[1])) # [A,B] 31 | union = area_a + area_b - inter 32 | return inter / union # [A,B] 33 | 34 | 35 | class Compose(object): 36 | """Composes several augmentations together. 37 | Args: 38 | transforms (List[Transform]): list of transforms to compose. 39 | Example: 40 | >>> augmentations.Compose([ 41 | >>> transforms.CenterCrop(10), 42 | >>> transforms.ToTensor(), 43 | >>> ]) 44 | """ 45 | 46 | def __init__(self, transforms): 47 | self.transforms = transforms 48 | 49 | def __call__(self, img, boxes=None, labels=None): 50 | for t in self.transforms: 51 | img, boxes, labels = t(img, boxes, labels) 52 | return img, boxes, labels 53 | 54 | 55 | class Lambda(object): 56 | """Applies a lambda as a transform.""" 57 | 58 | def __init__(self, lambd): 59 | assert isinstance(lambd, types.LambdaType) 60 | self.lambd = lambd 61 | 62 | def __call__(self, img, boxes=None, labels=None): 63 | return self.lambd(img, boxes, labels) 64 | 65 | 66 | class ConvertFromInts(object): 67 | def __call__(self, image, boxes=None, labels=None): 68 | return image.astype(np.float32), boxes, labels 69 | 70 | 71 | class SubtractMeans(object): 72 | def __init__(self, mean): 73 | self.mean = np.array(mean, dtype=np.float32) 74 | 75 | def __call__(self, image, boxes=None, labels=None): 76 | image = image.astype(np.float32) 77 | image -= self.mean 78 | return image.astype(np.float32), boxes, labels 79 | 80 | 81 | class ToAbsoluteCoords(object): 82 | def __call__(self, image, boxes=None, labels=None): 83 | height, width, channels = image.shape 84 | boxes[:, 0] *= width 85 | boxes[:, 2] *= width 86 | boxes[:, 1] *= height 87 | boxes[:, 3] *= height 88 | 89 | return image, boxes, labels 90 | 91 | 92 | class ToPercentCoords(object): 93 | def __call__(self, image, boxes=None, labels=None): 94 | height, width, channels = image.shape 95 | boxes[:, 0] /= width 96 | boxes[:, 2] /= width 97 | boxes[:, 1] /= height 98 | boxes[:, 3] /= height 99 | 100 | return image, boxes, labels 101 | 102 | 103 | class Resize(object): 104 | def __init__(self, size=300): 105 | self.size = size 106 | 107 | def __call__(self, image, boxes=None, labels=None): 108 | image = cv2.resize(image, (self.size, 109 | self.size)) 110 | return image, boxes, labels 111 | 112 | 113 | class RandomSaturation(object): 114 | def __init__(self, lower=0.5, upper=1.5): 115 | self.lower = lower 116 | self.upper = upper 117 | assert self.upper >= self.lower, "contrast upper must be >= lower." 118 | assert self.lower >= 0, "contrast lower must be non-negative." 
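    # note: __call__ below assumes a float HSV image (PhotometricDistort converts to HSV first);
    # channel 1 is the saturation channel, scaled by a random factor in [lower, upper] on half of the calls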
119 | 120 | def __call__(self, image, boxes=None, labels=None): 121 | if random.randint(2): 122 | image[:, :, 1] *= random.uniform(self.lower, self.upper) 123 | 124 | return image, boxes, labels 125 | 126 | 127 | class RandomHue(object): 128 | def __init__(self, delta=18.0): 129 | assert delta >= 0.0 and delta <= 360.0 130 | self.delta = delta 131 | 132 | def __call__(self, image, boxes=None, labels=None): 133 | if random.randint(2): 134 | image[:, :, 0] += random.uniform(-self.delta, self.delta) 135 | image[:, :, 0][image[:, :, 0] > 360.0] -= 360.0 136 | image[:, :, 0][image[:, :, 0] < 0.0] += 360.0 137 | return image, boxes, labels 138 | 139 | 140 | class RandomLightingNoise(object): 141 | def __init__(self): 142 | self.perms = ((0, 1, 2), (0, 2, 1), 143 | (1, 0, 2), (1, 2, 0), 144 | (2, 0, 1), (2, 1, 0)) 145 | 146 | def __call__(self, image, boxes=None, labels=None): 147 | if random.randint(2): 148 | swap = self.perms[random.randint(len(self.perms))] 149 | shuffle = SwapChannels(swap) # shuffle channels 150 | image = shuffle(image) 151 | return image, boxes, labels 152 | 153 | 154 | class ConvertColor(object): 155 | def __init__(self, current='BGR', transform='HSV'): 156 | self.transform = transform 157 | self.current = current 158 | 159 | def __call__(self, image, boxes=None, labels=None): 160 | if self.current == 'BGR' and self.transform == 'HSV': 161 | image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV) 162 | elif self.current == 'HSV' and self.transform == 'BGR': 163 | image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR) 164 | else: 165 | raise NotImplementedError 166 | return image, boxes, labels 167 | 168 | 169 | class RandomContrast(object): 170 | def __init__(self, lower=0.5, upper=1.5): 171 | self.lower = lower 172 | self.upper = upper 173 | assert self.upper >= self.lower, "contrast upper must be >= lower." 174 | assert self.lower >= 0, "contrast lower must be non-negative." 175 | 176 | # expects float image 177 | def __call__(self, image, boxes=None, labels=None): 178 | if random.randint(2): 179 | alpha = random.uniform(self.lower, self.upper) 180 | image *= alpha 181 | return image, boxes, labels 182 | 183 | 184 | class RandomBrightness(object): 185 | def __init__(self, delta=32): 186 | assert delta >= 0.0 187 | assert delta <= 255.0 188 | self.delta = delta 189 | 190 | def __call__(self, image, boxes=None, labels=None): 191 | if random.randint(2): 192 | delta = random.uniform(-self.delta, self.delta) 193 | image += delta 194 | return image, boxes, labels 195 | 196 | 197 | class ToCV2Image(object): 198 | def __call__(self, tensor, boxes=None, labels=None): 199 | return tensor.cpu().numpy().astype(np.float32).transpose((1, 2, 0)), boxes, labels 200 | 201 | 202 | class ToTensor(object): 203 | def __call__(self, cvimage, boxes=None, labels=None): 204 | return torch.from_numpy(cvimage.astype(np.float32)).permute(2, 0, 1), boxes, labels 205 | 206 | 207 | class RandomSampleCrop(object): 208 | """Crop 209 | Arguments: 210 | img (Image): the image being input during training 211 | boxes (Tensor): the original bounding boxes in pt form 212 | labels (Tensor): the class labels for each bbox 213 | mode (float tuple): the min and max jaccard overlaps 214 | Return: 215 | (img, boxes, classes) 216 | img (Image): the cropped image 217 | boxes (Tensor): the adjusted bounding boxes in pt form 218 | labels (Tensor): the class labels for each bbox 219 | """ 220 | def __init__(self): 221 | self.sample_options = ( 222 | # using entire original input image 223 | None, 224 | # sample a patch s.t. 
MIN jaccard w/ obj in .1,.3,.4,.7,.9 225 | (0.1, None), 226 | (0.3, None), 227 | (0.7, None), 228 | (0.9, None), 229 | # randomly sample a patch 230 | (None, None), 231 | ) 232 | 233 | def __call__(self, image, boxes=None, labels=None): 234 | height, width, _ = image.shape 235 | while True: 236 | # randomly choose a mode 237 | mode = random.choice(self.sample_options) 238 | if mode is None: 239 | return image, boxes, labels 240 | 241 | min_iou, max_iou = mode 242 | if min_iou is None: 243 | min_iou = float('-inf') 244 | if max_iou is None: 245 | max_iou = float('inf') 246 | 247 | # max trails (50) 248 | for _ in range(50): 249 | current_image = image 250 | 251 | w = random.uniform(0.3 * width, width) 252 | h = random.uniform(0.3 * height, height) 253 | 254 | # aspect ratio constraint b/t .5 & 2 255 | if h / w < 0.5 or h / w > 2: 256 | continue 257 | 258 | left = random.uniform(width - w) 259 | top = random.uniform(height - h) 260 | 261 | # convert to integer rect x1,y1,x2,y2 262 | rect = np.array([int(left), int(top), int(left+w), int(top+h)]) 263 | 264 | # calculate IoU (jaccard overlap) b/t the cropped and gt boxes 265 | overlap = jaccard_numpy(boxes, rect) 266 | 267 | # is min and max overlap constraint satisfied? if not try again 268 | if overlap.min() < min_iou and max_iou < overlap.max(): 269 | continue 270 | 271 | # cut the crop from the image 272 | current_image = current_image[rect[1]:rect[3], rect[0]:rect[2], 273 | :] 274 | 275 | # keep overlap with gt box IF center in sampled patch 276 | centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0 277 | 278 | # mask in all gt boxes that above and to the left of centers 279 | m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1]) 280 | 281 | # mask in all gt boxes that under and to the right of centers 282 | m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1]) 283 | 284 | # mask in that both m1 and m2 are true 285 | mask = m1 * m2 286 | 287 | # have any valid boxes? 
try again if not 288 | if not mask.any(): 289 | continue 290 | 291 | # take only matching gt boxes 292 | current_boxes = boxes[mask, :].copy() 293 | 294 | # take only matching gt labels 295 | current_labels = labels[mask] 296 | 297 | # should we use the box left and top corner or the crop's 298 | current_boxes[:, :2] = np.maximum(current_boxes[:, :2], 299 | rect[:2]) 300 | # adjust to crop (by substracting crop's left,top) 301 | current_boxes[:, :2] -= rect[:2] 302 | 303 | current_boxes[:, 2:] = np.minimum(current_boxes[:, 2:], 304 | rect[2:]) 305 | # adjust to crop (by substracting crop's left,top) 306 | current_boxes[:, 2:] -= rect[:2] 307 | 308 | return current_image, current_boxes, current_labels 309 | 310 | 311 | class Expand(object): 312 | def __init__(self, mean): 313 | self.mean = mean 314 | 315 | def __call__(self, image, boxes, labels): 316 | if random.randint(2): 317 | return image, boxes, labels 318 | 319 | height, width, depth = image.shape 320 | ratio = random.uniform(1, 4) 321 | left = random.uniform(0, width*ratio - width) 322 | top = random.uniform(0, height*ratio - height) 323 | 324 | expand_image = np.zeros( 325 | (int(height*ratio), int(width*ratio), depth), 326 | dtype=image.dtype) 327 | expand_image[:, :, :] = self.mean 328 | expand_image[int(top):int(top + height), 329 | int(left):int(left + width)] = image 330 | image = expand_image 331 | 332 | boxes = boxes.copy() 333 | boxes[:, :2] += (int(left), int(top)) 334 | boxes[:, 2:] += (int(left), int(top)) 335 | 336 | return image, boxes, labels 337 | 338 | 339 | class RandomMirror(object): 340 | def __call__(self, image, boxes, classes): 341 | _, width, _ = image.shape 342 | if random.randint(2): 343 | image = image[:, ::-1] 344 | boxes = boxes.copy() 345 | boxes[:, 0::2] = width - boxes[:, 2::-2] 346 | return image, boxes, classes 347 | 348 | 349 | class SwapChannels(object): 350 | """Transforms a tensorized image by swapping the channels in the order 351 | specified in the swap tuple. 
352 | Args: 353 | swaps (int triple): final order of channels 354 | eg: (2, 1, 0) 355 | """ 356 | 357 | def __init__(self, swaps): 358 | self.swaps = swaps 359 | 360 | def __call__(self, image): 361 | """ 362 | Args: 363 | image (Tensor): image tensor to be transformed 364 | Return: 365 | a tensor with channels swapped according to swap 366 | """ 367 | # if torch.is_tensor(image): 368 | # image = image.data.cpu().numpy() 369 | # else: 370 | # image = np.array(image) 371 | image = image[:, :, self.swaps] 372 | return image 373 | 374 | 375 | class PhotometricDistort(object): 376 | def __init__(self): 377 | self.pd = [ 378 | RandomContrast(), 379 | ConvertColor(transform='HSV'), 380 | RandomSaturation(), 381 | RandomHue(), 382 | ConvertColor(current='HSV', transform='BGR'), 383 | RandomContrast() 384 | ] 385 | self.rand_brightness = RandomBrightness() 386 | self.rand_light_noise = RandomLightingNoise() 387 | 388 | def __call__(self, image, boxes, labels): 389 | im = image.copy() 390 | im, boxes, labels = self.rand_brightness(im, boxes, labels) 391 | if random.randint(2): 392 | distort = Compose(self.pd[:-1]) 393 | else: 394 | distort = Compose(self.pd[1:]) 395 | im, boxes, labels = distort(im, boxes, labels) 396 | return self.rand_light_noise(im, boxes, labels) 397 | 398 | 399 | class SSDAugmentation(object): 400 | def __init__(self, size=300, mean=(104, 117, 123)): 401 | self.mean = mean 402 | self.size = size 403 | self.augment = Compose([ 404 | ConvertFromInts(), 405 | ToAbsoluteCoords(), 406 | PhotometricDistort(), 407 | Expand(self.mean), 408 | RandomSampleCrop(), 409 | RandomMirror(), 410 | ToPercentCoords(), 411 | Resize(self.size), 412 | SubtractMeans(self.mean) 413 | ]) 414 | 415 | def __call__(self, img, boxes, labels): 416 | return self.augment(img, boxes, labels) 417 | -------------------------------------------------------------------------------- /bus_dataset.log: -------------------------------------------------------------------------------- 1 | iter: accuracy 2 | 10000 0.906613 3 | 20000 0.912413 4 | 30000 0.940835 5 | 40000 0.939095 6 | 50000 0.956497 7 | 60000 0.946636 8 | 70000 0.966937 9 | 80000 0.941995 10 | 90000 0.968677 11 | 100000 0.972158 12 | 110000 0.970708 13 | 120000 0.969548 14 | final 0.969548 15 | 16 | final=120009 -------------------------------------------------------------------------------- /camera.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import time 3 | 4 | cap=cv2.VideoCapture('G:\\ustc\\bishe\\Captures\\002.WMV') 5 | while cap.isOpened(): 6 | ret,frame=cap.read() 7 | cv2.imshow('capture', frame) 8 | time.sleep(0.050) 9 | if cv2.waitKey(1) & 0xFF == ord('q'): 10 | break 11 | cap.release() 12 | cv2.destroyAllWindows() -------------------------------------------------------------------------------- /camera_detection.py: -------------------------------------------------------------------------------- 1 | from torch.autograd import Variable 2 | from detection import * 3 | from ssd_net_vgg import * 4 | from voc0712 import * 5 | import torch 6 | import torch.nn as nn 7 | import numpy as np 8 | import cv2 9 | import utils 10 | import torch.backends.cudnn as cudnn 11 | import time 12 | #检测cuda是否可用 13 | if torch.cuda.is_available(): 14 | print('-----gpu mode-----') 15 | torch.set_default_tensor_type('torch.cuda.FloatTensor') 16 | else: 17 | print('-----cpu mode-----') 18 | colors_tableau=[ (214, 39, 40),(23, 190, 207),(188, 189, 34),(188,34,188),(205,108,8)] 19 | 20 | def Yawn(list_Y,list_y1): 21 | 
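    # compares the oldest len(list_Y1) mouth states in list_Y against the all-ones template;
    # currently dead code -- the main loop below checks the newest frames inline instead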
list_cmp=list_Y[:len(list_Y1)]==list_Y1 22 | for flag in list_cmp: 23 | if flag==False: 24 | return False 25 | return True 26 | #初始化网络 27 | net=SSD() 28 | net=torch.nn.DataParallel(net) 29 | net.train(mode=False) 30 | net.load_state_dict(torch.load('./weights/ssd300_VOC_100000.pth',map_location=lambda storage,loc: storage)) 31 | if torch.cuda.is_available(): 32 | net = net.cuda() 33 | cudnn.benchmark = True 34 | 35 | img_mean=(104.0,117.0,123.0) 36 | 37 | #调用摄像头 38 | cap=cv2.VideoCapture(0) 39 | max_fps=0 40 | 41 | #保存检测结果的List 42 | #眼睛和嘴巴都是,张开为‘1’,闭合为‘0’ 43 | list_B=np.ones(15)#眼睛状态List,建议根据fps修改 44 | list_Y=np.zeros(50)#嘴巴状态list,建议根据fps修改 45 | list_Y1=np.ones(5)#如果在list_Y中存在list_Y1,则判定一次打哈欠,同上,长度建议修改 46 | blink_count=0#眨眼计数 47 | yawn_count=0 48 | blink_start=time.time()#炸眼时间 49 | yawn_start=time.time()#打哈欠时间 50 | blink_freq=0.5 51 | yawn_freq=0 52 | #开始检测,按‘q’退出 53 | while(True): 54 | flag_B=True#是否闭眼的flag 55 | flag_Y=False 56 | num_rec=0#检测到的眼睛的数量 57 | start=time.time()#计时 58 | ret,img=cap.read()#读取图片 59 | 60 | #检测 61 | x=cv2.resize(img,(300,300)).astype(np.float32) 62 | x-=img_mean 63 | x=x.astype(np.float32) 64 | x=x[:,:,::-1].copy() 65 | x=torch.from_numpy(x).permute(2,0,1) 66 | xx=Variable(x.unsqueeze(0)) 67 | if torch.cuda.is_available(): 68 | xx=xx.cuda() 69 | y=net(xx) 70 | softmax=nn.Softmax(dim=-1) 71 | #detect=Detect(config.class_num,0,200,0.01,0.45) 72 | detect = Detect.apply 73 | priors=utils.default_prior_box() 74 | 75 | loc,conf=y 76 | loc=torch.cat([o.view(o.size(0),-1)for o in loc],1) 77 | conf=torch.cat([o.view(o.size(0),-1)for o in conf],1) 78 | 79 | detections=detect( 80 | loc.view(loc.size(0),-1,4), 81 | softmax(conf.view(conf.size(0),-1,config.class_num)), 82 | torch.cat([o.view(-1,4) for o in priors],0), 83 | config.class_num, 84 | 200, 85 | 0.7, 86 | 0.45 87 | ).data 88 | labels=VOC_CLASSES 89 | top_k=10 90 | 91 | #将检测结果放置于图片上 92 | scale=torch.Tensor(img.shape[1::-1]).repeat(2) 93 | for i in range(detections.size(1)): 94 | 95 | j=0 96 | while detections[0,i,j,0]>=0.4: 97 | score=detections[0,i,j,0] 98 | label_name=labels[i-1] 99 | if label_name=='closed_eye': 100 | flag_B=False 101 | if label_name=='open_mouth': 102 | flag_Y=True 103 | display_txt='%s:%.2f'%(label_name,score) 104 | pt=(detections[0,i,j,1:]*scale).cpu().numpy() 105 | coords=(pt[0],pt[1]),pt[2]-pt[0]+1,pt[3]-pt[1]+1 106 | color=colors_tableau[i] 107 | cv2.rectangle(img,(pt[0],pt[1]),(pt[2],pt[3]),color,2) 108 | cv2.putText(img,display_txt,(int(pt[0]),int(pt[1])+10),cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255,255,255), 1, 8) 109 | j+=1 110 | num_rec+=1 111 | if num_rec>0: 112 | if flag_B: 113 | #print(' 1:eye-open') 114 | list_B=np.append(list_B,1)#睁眼为‘1’ 115 | else: 116 | #print(' 0:eye-closed') 117 | list_B=np.append(list_B,0)#闭眼为‘0’ 118 | list_B=np.delete(list_B,0) 119 | if flag_Y: 120 | list_Y=np.append(list_Y,1) 121 | else: 122 | list_Y=np.append(list_Y,0) 123 | list_Y=np.delete(list_Y,0) 124 | else: 125 | print('nothing detected') 126 | #print(list) 127 | #实时计算PERCLOS 128 | perclos=1-np.average(list_B) 129 | print('perclos={:f}'.format(perclos)) 130 | if list_B[13]==1 and list_B[14]==0: 131 | #如果上一帧为’1‘,此帧为’0‘则判定为眨眼 132 | print('----------------眨眼----------------------') 133 | blink_count+=1 134 | blink_T=time.time()-blink_start 135 | if blink_T>10: 136 | #每10秒计算一次眨眼频率 137 | blink_freq=blink_count/blink_T 138 | blink_start=time.time() 139 | blink_count=0 140 | print('blink_freq={:f}'.format(blink_freq)) 141 | #检测打哈欠 142 | #if Yawn(list_Y,list_Y1): 143 | if 
(list_Y[len(list_Y)-len(list_Y1):]==list_Y1).all():
144 |         print('----------------------打哈欠----------------------')
145 |         yawn_count+=1
146 |         list_Y=np.zeros(50)
147 |         #计算打哈欠频率
148 |         yawn_T=time.time()-yawn_start
149 |         if yawn_T>60:
150 |             yawn_freq=yawn_count/yawn_T
151 |             yawn_start=time.time()
152 |             yawn_count=0
153 |             print('yawn_freq={:f}'.format(yawn_freq))
154 | 
155 |     #此处为判断疲劳部分
156 |     '''
157 |     想法1:最简单,但是太影响实时性
158 |     if(perclos>0.4 or blink_freq<0.25 or yawn_freq>5/60):
159 |         print('疲劳')
160 |         if(blink_freq<0.25)
161 |     else:
162 |         print('清醒')
163 |     '''
164 |     #想法2:
165 |     if(perclos>0.4):
166 |         print('疲劳')
167 |     elif(blink_freq<0.25):
168 |         print('疲劳')
169 |         blink_freq=0.5#如果因为眨眼频率判断疲劳,则初始化眨眼频率
170 |     elif(yawn_freq>5.0/60):
171 |         print("疲劳")
172 |         yawn_freq=0#初始化,同上
173 |     else:
174 |         print('清醒')
175 |     T=time.time()-start
176 |     fps=1/T#实时在视频上显示fps
177 |     if fps>max_fps:
178 |         max_fps=fps
179 |     fps_txt='fps:%.2f'%(fps)
180 |     cv2.putText(img,fps_txt,(0,10),cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255,255,255), 1, 8)
181 |     cv2.imshow("ssd",img)
182 |     if cv2.waitKey(100) & 0xff == ord('q'):
183 |         break
184 | #print("-------end-------")
185 | cap.release()
186 | cv2.destroyAllWindows()
187 | #print(max_fps)
--------------------------------------------------------------------------------
/camera_detection_1.py:
--------------------------------------------------------------------------------
1 | from torch.autograd import Variable
2 | from detection import *
3 | from ssd_net_vgg import *
4 | from voc0712 import *
5 | import torch
6 | import torch.nn as nn
7 | import numpy as np
8 | import cv2
9 | import utils
10 | import torch.backends.cudnn as cudnn
11 | import time
12 | #检测cuda是否可用
13 | if torch.cuda.is_available():
14 |     print('-----gpu mode-----')
15 |     torch.set_default_tensor_type('torch.cuda.FloatTensor')
16 | else:
17 |     print('-----cpu mode-----')
18 | colors_tableau=[ (214, 39, 40),(23, 190, 207),(188, 189, 34),(188,34,188),(205,108,8)]
19 | 
20 | def Yawn(list_Y,list_Y1):
21 |     list_cmp=list_Y[:len(list_Y1)]==list_Y1
22 |     for flag in list_cmp:
23 |         if flag==False:
24 |             return False
25 |     return True
26 | #初始化网络
27 | net=SSD()
28 | net=torch.nn.DataParallel(net)
29 | net.train(mode=False)
30 | net.load_state_dict(torch.load('./weights/ssd_voc_5000_plus.pth',map_location=lambda storage,loc: storage))
31 | if torch.cuda.is_available():
32 |     net = net.cuda()
33 |     cudnn.benchmark = True
34 | 
35 | img_mean=(104.0,117.0,123.0)
36 | 
37 | #调用摄像头
38 | cap=cv2.VideoCapture(0)
39 | max_fps=0
40 | 
41 | #保存检测结果的List
42 | #眼睛和嘴巴都是,张开为‘1’,闭合为‘0’
43 | list_B=np.ones(15)#眼睛状态List,建议根据fps修改,个人电脑fps≈6
44 | list_Y=np.zeros(50)#嘴巴状态list,建议根据fps修改
45 | list_Y1=np.ones(5)#如果在list_Y中存在list_Y1,则判定一次打哈欠,同上,长度建议修改
46 | list_blink=np.zeros(60)#大约是记录10S内信息,眨眼为‘1’,不眨眼为‘0’
47 | list_yawn=np.zeros(360)#大约是一分钟内打哈欠记录,打哈欠为‘1’,不打哈欠为‘0’
48 | 
49 | #blink_count=0#眨眼计数
50 | yawn_count=0
51 | #blink_start=time.time()#眨眼时间
52 | #yawn_start=time.time()#打哈欠时间
53 | blink_freq=0.5
54 | yawn_freq=0
55 | #开始检测,按‘q’退出
56 | while(True):
57 |     flag_B=True#是否闭眼的flag
58 |     flag_Y=False#张嘴flag
59 | 
60 |     num_rec=0#检测到的眼睛的数量
61 |     start=time.time()#计时
62 |     ret,img=cap.read()#读取图片
63 | 
64 |     #检测
65 |     x=cv2.resize(img,(300,300)).astype(np.float32)
66 |     x-=img_mean
67 |     x=x.astype(np.float32)
68 |     x=x[:,:,::-1].copy()
69 |     x=torch.from_numpy(x).permute(2,0,1)
70 |     xx=Variable(x.unsqueeze(0))
71 |     if torch.cuda.is_available():
72 |         xx=xx.cuda()
73 |     y=net(xx)
74 |     softmax=nn.Softmax(dim=-1)
75 |     detect=Detect(config.class_num,0,200,0.01,0.45)
76 | 
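    # note: instantiating Detect and calling it directly is the legacy torch.autograd.Function
    # style and no longer works on recent PyTorch; Test.py and camera_detection.py use Detect.apply instead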
priors=utils.default_prior_box() 77 | 78 | loc,conf=y 79 | loc=torch.cat([o.view(o.size(0),-1)for o in loc],1) 80 | conf=torch.cat([o.view(o.size(0),-1)for o in conf],1) 81 | 82 | detections=detect( 83 | loc.view(loc.size(0),-1,4), 84 | softmax(conf.view(conf.size(0),-1,config.class_num)), 85 | torch.cat([o.view(-1,4) for o in priors],0) 86 | ).data 87 | labels=VOC_CLASSES 88 | top_k=10 89 | 90 | #将检测结果放置于图片上 91 | scale=torch.Tensor(img.shape[1::-1]).repeat(2) 92 | for i in range(detections.size(1)): 93 | 94 | j=0 95 | while detections[0,i,j,0]>=0.4: 96 | score=detections[0,i,j,0] 97 | label_name=labels[i-1] 98 | if label_name=='closed_eye': 99 | flag_B=False 100 | if label_name=='open_mouth': 101 | flag_Y=True 102 | display_txt='%s:%.2f'%(label_name,score) 103 | pt=(detections[0,i,j,1:]*scale).cpu().numpy() 104 | coords=(pt[0],pt[1]),pt[2]-pt[0]+1,pt[3]-pt[1]+1 105 | color=colors_tableau[i] 106 | cv2.rectangle(img,(pt[0],pt[1]),(pt[2],pt[3]),color,2) 107 | cv2.putText(img,display_txt,(int(pt[0]),int(pt[1])+10),cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255,255,255), 1, 8) 108 | j+=1 109 | num_rec+=1 110 | if num_rec>0: 111 | if flag_B: 112 | #print(' 1:eye-open') 113 | list_B=np.append(list_B,1)#睁眼为‘1’ 114 | else: 115 | #print(' 0:eye-closed') 116 | list_B=np.append(list_B,0)#闭眼为‘0’ 117 | list_B=np.delete(list_B,0) 118 | if flag_Y: 119 | list_Y=np.append(list_Y,1) 120 | else: 121 | list_Y=np.append(list_Y,0) 122 | list_Y=np.delete(list_Y,0) 123 | else: 124 | print('nothing detected') 125 | #print(list) 126 | 127 | if list_B[13]==1 and list_B[14]==0: 128 | #如果上一帧为’1‘,此帧为’0‘则判定为眨眼 129 | print('----------------眨眼----------------------') 130 | list_blink=np.append(list_blink,1) 131 | else: 132 | list_blink=np.append(list_blink,0) 133 | list_blink=np.delete(list_blink,0) 134 | 135 | 136 | #检测打哈欠 137 | #if Yawn(list_Y,list_Y1): 138 | if (list_Y[len(list_Y)-len(list_Y1):]==list_Y1).all(): 139 | print('----------------------打哈欠----------------------') 140 | yawn_count+=1 141 | list_Y=np.zeros(50)#此处是检测到一次打哈欠之后将嘴部状态list全部置‘0’,考虑到打哈欠所用时间较长,所以基本不会出现漏检 142 | list_yawn=np.append(list_yawn,1) 143 | else: 144 | list_yawn=np.append(list_yawn,0) 145 | list_yawn=np.delete(list_yawn,0) 146 | 147 | 148 | 149 | #实时计算PERCLOS perblink,peryawn 150 | #即计算平均闭眼时长百分比,平均眨眼百分比,平均打哈欠百分比 151 | perclos=1-np.average(list_B) 152 | perblink=np.average(list_blink) 153 | peryawn=np.average(list_yawn) 154 | #print('perclos={:f}'.format(perclos)) 155 | 156 | #此处为判断疲劳部分 157 | #想法1:两个频率计算改为实时的,所以此处不再修改 158 | if(perclos>0.4 or perblink<0.25 or peryawn>5/60): 159 | print('疲劳') 160 | #if(blink_freq<0.25) 161 | else: 162 | print('清醒') 163 | 164 | '''#想法2: 165 | if(perclos>0.4): 166 | { 167 | print('疲劳') 168 | } 169 | elif(blink_freq<0.25): 170 | { 171 | print('疲劳') 172 | blink_freq=0.5#如果因为眨眼频率判断疲劳,则初始化眨眼频率 173 | } 174 | elif(yawn_freq>5.0/60): 175 | { 176 | print("疲劳") 177 | yawn_freq=0#初始化,同上 178 | } 179 | else: 180 | { 181 | print('清醒') 182 | } 183 | ''' 184 | T=time.time()-start 185 | fps=1/T#实时在视频上显示fps 186 | if fps>max_fps: 187 | max_fps=fps 188 | fps_txt='fps:%.2f'%(fps) 189 | cv2.putText(img,fps_txt,(0,10),cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255,255,255), 1, 8) 190 | cv2.imshow("ssd",img) 191 | if cv2.waitKey(100) & 0xff == ord('q'): 192 | break 193 | #print("-------end-------") 194 | cap.release() 195 | cv2.destroyAllWindows() 196 | #print(max_fps) -------------------------------------------------------------------------------- /detection.py: -------------------------------------------------------------------------------- 1 | import torch 
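# decode() maps the predicted offsets plus priors back to corner-form boxes and nms()
# prunes overlapping detections; both are defined in utils.py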
2 | from torch.autograd import Function 3 | from utils import decode, nms 4 | 5 | 6 | class Detect(Function): 7 | """At test time, Detect is the final layer of SSD. Decode location preds, 8 | apply non-maximum suppression to location predictions based on conf 9 | scores and threshold to a top_k number of output predictions for both 10 | confidence score and locations. 11 | """ 12 | def __init__(self, num_classes, bkg_label, top_k, conf_thresh, nms_thresh): 13 | self.num_classes = num_classes 14 | self.background_label = bkg_label 15 | self.top_k = top_k 16 | # Parameters used in nms. 17 | self.nms_thresh = nms_thresh 18 | if nms_thresh <= 0: 19 | raise ValueError('nms_threshold must be non negative.') 20 | self.conf_thresh = conf_thresh 21 | self.variance = (0.1,0.2) 22 | 23 | @staticmethod 24 | def forward(self, loc_data, conf_data, prior_data, num_classes, top_k, conf_thresh, nms_thresh): 25 | """ 26 | Args: 27 | loc_data: (tensor) Loc preds from loc layers 28 | Shape: [batch,num_priors*4] 29 | conf_data: (tensor) Shape: Conf preds from conf layers 30 | Shape: [batch*num_priors,num_classes] 31 | prior_data: (tensor) Prior boxes and variances from priorbox layers 32 | Shape: [1,num_priors,4] 33 | """ 34 | num = loc_data.size(0) # batch size 35 | num_priors = prior_data.size(0) 36 | output = torch.zeros(num, num_classes, top_k, 5) 37 | conf_preds = conf_data.view(num, num_priors, 38 | num_classes).transpose(2, 1) 39 | 40 | # Decode predictions into bboxes. 41 | variance = (0.1, 0.2) 42 | for i in range(num): 43 | decoded_boxes = decode(loc_data[i], prior_data, variance) 44 | # For each class, perform nms 45 | conf_scores = conf_preds[i].clone() 46 | for cl in range(1, num_classes): 47 | c_mask = conf_scores[cl].gt(conf_thresh) 48 | scores = conf_scores[cl][c_mask] 49 | if scores.dim() == 0: 50 | continue 51 | l_mask = c_mask.unsqueeze(1).expand_as(decoded_boxes) 52 | boxes = decoded_boxes[l_mask].view(-1, 4) 53 | # idx of highest scoring and non-overlapping boxes per class 54 | ids, count = nms(boxes, scores, nms_thresh, top_k) 55 | 56 | if count==0: 57 | continue 58 | output[i, cl, :count] = \ 59 | torch.cat((scores[ids[:count]].unsqueeze(1), 60 | boxes[ids[:count]]), 1) 61 | flt = output.contiguous().view(num, -1, 5) 62 | _, idx = flt[:, :, 0].sort(1, descending=True) 63 | _, rank = idx.sort(1) 64 | flt[(rank < top_k).unsqueeze(-1).expand_as(flt)].fill_(0) 65 | return output 66 | -------------------------------------------------------------------------------- /dnf_test.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/dnf_test.jpg -------------------------------------------------------------------------------- /dnf_test_done.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/dnf_test_done.jpg -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: torch 2 | channels: 3 | - defaults 4 | - https://repo.anaconda.com/pkgs/main 5 | - https://repo.anaconda.com/pkgs/r 6 | dependencies: 7 | - _libgcc_mutex=0.1=main 8 | - _openmp_mutex=5.1=1_gnu 9 | - bzip2=1.0.8=h5eee18b_6 10 | - ca-certificates=2025.2.25=h06a4308_0 11 | - expat=2.7.1=h6a678d5_0 12 | - 
ld_impl_linux-64=2.40=h12ee557_0
13 | - libffi=3.4.4=h6a678d5_1
14 | - libgcc-ng=11.2.0=h1234567_1
15 | - libgomp=11.2.0=h1234567_1
16 | - libstdcxx-ng=11.2.0=h1234567_1
17 | - libuuid=1.41.5=h5eee18b_0
18 | - ncurses=6.4=h6a678d5_0
19 | - openssl=3.0.16=h5eee18b_0
20 | - pip=25.0=py312h06a4308_0
21 | - python=3.12.9=h5148396_0
22 | - readline=8.2=h5eee18b_0
23 | - setuptools=75.8.0=py312h06a4308_0
24 | - sqlite=3.45.3=h5eee18b_0
25 | - tk=8.6.14=h39e8969_0
26 | - tzdata=2025a=h04d1e81_0
27 | - wheel=0.45.1=py312h06a4308_0
28 | - xz=5.6.4=h5eee18b_1
29 | - zlib=1.2.13=h5eee18b_1
30 | - pip:
31 |   - contourpy==1.3.2
32 |   - cycler==0.12.1
33 |   - filelock==3.18.0
34 |   - fonttools==4.57.0
35 |   - fsspec==2025.3.2
36 |   - jinja2==3.1.6
37 |   - kiwisolver==1.4.8
38 |   - markupsafe==3.0.2
39 |   - matplotlib==3.10.1
40 |   - mpmath==1.3.0
41 |   - networkx==3.4.2
42 |   - numpy==2.2.5
43 |   - nvidia-cublas-cu12==12.4.5.8
44 |   - nvidia-cuda-cupti-cu12==12.4.127
45 |   - nvidia-cuda-nvrtc-cu12==12.4.127
46 |   - nvidia-cuda-runtime-cu12==12.4.127
47 |   - nvidia-cudnn-cu12==9.1.0.70
48 |   - nvidia-cufft-cu12==11.2.1.3
49 |   - nvidia-curand-cu12==10.3.5.147
50 |   - nvidia-cusolver-cu12==11.6.1.9
51 |   - nvidia-cusparse-cu12==12.3.1.170
52 |   - nvidia-cusparselt-cu12==0.6.2
53 |   - nvidia-nccl-cu12==2.21.5
54 |   - nvidia-nvjitlink-cu12==12.4.127
55 |   - nvidia-nvtx-cu12==12.4.127
56 |   - opencv-python-headless==4.11.0.86
57 |   - packaging==25.0
58 |   - pillow==11.2.1
59 |   - pyparsing==3.2.3
60 |   - python-dateutil==2.9.0.post0
61 |   - six==1.17.0
62 |   - sympy==1.13.1
63 |   - torch==2.6.0
64 |   - torchvision==0.21.0
65 |   - triton==3.2.0
66 |   - typing-extensions==4.13.2
67 | prefix: $HOME/miniconda3/envs/torch
--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
1 | from torch.autograd import Variable
2 | from detection import *
3 | from ssd_net_vgg import *
4 | from voc0712 import *
5 | import torch
6 | import torch.nn as nn
7 | import numpy as np
8 | import cv2
9 | import utils
10 | import torch.backends.cudnn as cudnn
11 | import time
12 | import torch.utils.data as data
13 | import xml.etree.ElementTree as ET
14 | import os
15 | import pickle
16 | 
17 | #检测cuda是否可用
18 | if torch.cuda.is_available():
19 |     print('-----gpu mode-----')
20 |     torch.set_default_tensor_type('torch.cuda.FloatTensor')
21 | else:
22 |     print('-----cpu mode-----')
23 | colors_tableau=[ (214, 39, 40),(23, 190, 207),(188, 189, 34),(188,34,188),(205,108,8)]
24 | 
25 | net=SSD()
26 | net=torch.nn.DataParallel(net)
27 | net.train(mode=False)
28 | net.load_state_dict(torch.load('./weights/ssd300_VOC_100000.pth',map_location=lambda storage,loc: storage))
29 | if torch.cuda.is_available():
30 |     net = net.cuda()
31 |     cudnn.benchmark = True
32 | 
33 | devkit_path='./dataset/'
34 | annopath=os.path.join(devkit_path,'Annotations', '%s.xml')
35 | ftest=open(devkit_path+'ImageSets/Main/test.txt','r')
36 | img_mean=(104.0,117.0,123.0)
37 | 
38 | def parse_rec(filename):
39 |     '''获取图片中所有的label和坐标'''
40 |     tree=ET.parse(filename)
41 |     objects=[]
42 |     for obj in tree.findall('object'):
43 |         obj_struct={}
44 |         obj_struct['name']=obj.find('name').text
45 |         bbox=obj.find('bndbox')
46 |         obj_struct['bbox']=[int(bbox.find('xmin').text)-1,
47 |             int(bbox.find('ymin').text)-1,
48 |             int(bbox.find('xmax').text)-1,
49 |             int(bbox.find('ymax').text)-1]
50 |         objects.append(obj_struct)
51 | 
52 |     return objects
53 | 
54 | def IoU(obj_R,obj_P):
55 |     #计算交并比
56 |     cood_r=obj_R['bbox']
57 | 
cood_p=obj_P['bbox'] 58 | ixmin=max(cood_r[0],cood_p[0]) 59 | iymin=max(cood_r[1],cood_p[1]) 60 | ixmax=min(cood_r[2],cood_p[2]) 61 | iymax=min(cood_r[3],cood_p[3]) 62 | iw=max(ixmax-ixmin,0.) 63 | ih=max(iymax-iymin,0.) 64 | inters=iw*ih*1.0 65 | uni=((cood_r[2]-cood_r[0])*(cood_r[3]-cood_r[1])+ 66 | (cood_p[2]-cood_p[0])*(cood_p[3]-cood_p[1])- 67 | inters) 68 | overlaps=inters/uni 69 | return overlaps 70 | 71 | count=0 72 | time_start=time.time() 73 | accu_num=0 74 | real_num=0 75 | 76 | for line in ftest: 77 | name=line.strip() 78 | print(name) 79 | obj_real=parse_rec(devkit_path+'Annotations/'+name+'.xml') 80 | real_num+=len(obj_real) 81 | img=cv2.imread(devkit_path+'JPEGImages/'+name+'.jpg',cv2.IMREAD_COLOR) 82 | x=cv2.resize(img,(300,300)).astype(np.float32) 83 | x-=img_mean 84 | x=x.astype(np.float32) 85 | x=x[:,:,::-1].copy() 86 | x=torch.from_numpy(x).permute(2,0,1) 87 | xx=Variable(x.unsqueeze(0)) 88 | if torch.cuda.is_available(): 89 | xx=xx.cuda() 90 | y=net(xx) 91 | softmax=nn.Softmax(dim=-1) 92 | detect=Detect(config.class_num,0,200,0.01,0.45) 93 | priors=utils.default_prior_box() 94 | 95 | loc,conf=y 96 | loc=torch.cat([o.view(o.size(0),-1)for o in loc],1) 97 | conf=torch.cat([o.view(o.size(0),-1)for o in conf],1) 98 | 99 | detections=detect( 100 | loc.view(loc.size(0),-1,4), 101 | softmax(conf.view(conf.size(0),-1,config.class_num)), 102 | torch.cat([o.view(-1,4) for o in priors],0) 103 | ).data 104 | labels=VOC_CLASSES 105 | top_k=10 106 | 107 | scale=torch.Tensor(img.shape[1::-1]).repeat(2) 108 | obj_pre=[] 109 | for i in range(detections.size(1)): 110 | j=0 111 | 112 | while detections[0,i,j,0]>=0.4: 113 | score=detections[0,i,j,0] 114 | obj={} 115 | obj['name']=labels[i-1] 116 | pt=(detections[0,i,j,1:]*scale).cpu().numpy() 117 | obj['bbox']=[int(pt[0]), 118 | int(pt[1]), 119 | int(pt[2]), 120 | int(pt[3])] 121 | obj_pre.append(obj) 122 | 123 | label_name=labels[i-1] 124 | display_txt='%s:%.2f'%(label_name,score) 125 | coords=(pt[0],pt[1]),pt[2]-pt[0]+1,pt[3]-pt[1]+1 126 | color=colors_tableau[i] 127 | cv2.rectangle(img,(pt[0],pt[1]),(pt[2],pt[3]),color,2) 128 | cv2.putText(img,display_txt,(int(pt[0]),int(pt[1])+10),cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255,255,255), 1, 8) 129 | 130 | j+=1 131 | 132 | #把测试过的图片写入磁盘 133 | #cv2.imwrite('./tested/'+name+'.jpg',img) 134 | #print('Pic:'+name+" writed!") 135 | 136 | for obj_R in obj_real: 137 | for obj_P in obj_pre: 138 | if IoU(obj_R,obj_P)>0.5:#阈值暂设为0.5 139 | if obj_R['name']==obj_P['name']: 140 | accu_num+=1 141 | count+=1 142 | print("-------end-------") 143 | elapsed=(time.time()-time_start) 144 | print('共{:d}张图片\n用时:{:f} s\nfps={:f}\n准确率:{:f}' 145 | .format(count,elapsed,count/elapsed,accu_num/real_num)) -------------------------------------------------------------------------------- /l2norm.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from torch.autograd import Function 4 | from torch.autograd import Variable 5 | import torch.nn.init as init 6 | import Config 7 | class L2Norm(nn.Module): 8 | def __init__(self,n_channels, scale): 9 | super(L2Norm,self).__init__() 10 | self.n_channels = n_channels 11 | self.gamma = scale or None 12 | self.eps = 1e-10 13 | if Config.use_cuda: 14 | self.weight = nn.Parameter(torch.Tensor(self.n_channels).cuda()) 15 | else: 16 | self.weight = nn.Parameter(torch.Tensor(self.n_channels)) 17 | self.reset_parameters() 18 | 19 | def reset_parameters(self): 20 | nn.init.constant_(self.weight,self.gamma) 21 | 22 | def 
forward(self, x): 23 | norm = x.pow(2).sum(dim=1, keepdim=True).sqrt()+self.eps 24 | #x /= norm 25 | x = torch.div(x,norm) 26 | out = self.weight.unsqueeze(0).unsqueeze(2).unsqueeze(3).expand_as(x) * x 27 | return out 28 | -------------------------------------------------------------------------------- /loss_function.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import utils 5 | import Config 6 | 7 | class LossFun(nn.Module): 8 | def __init__(self): 9 | super(LossFun,self).__init__() 10 | def forward(self, prediction,targets,priors_boxes): 11 | loc_data , conf_data = prediction 12 | loc_data = torch.cat([o.view(o.size(0),-1,4) for o in loc_data] ,1) 13 | conf_data = torch.cat([o.view(o.size(0),-1,Config.class_num) for o in conf_data],1) 14 | priors_boxes = torch.cat([o.view(-1,4) for o in priors_boxes],0) 15 | if Config.use_cuda: 16 | loc_data = loc_data.cuda() 17 | conf_data = conf_data.cuda() 18 | priors_boxes = priors_boxes.cuda() 19 | # batch_size 20 | batch_num = loc_data.size(0) 21 | # default_box数量 22 | box_num = loc_data.size(1) 23 | # 存储targets根据每一个prior_box变换后的数据 24 | target_loc = torch.Tensor(batch_num,box_num,4) 25 | target_loc.requires_grad_(requires_grad=False) 26 | # 存储每一个default_box预测的种类 27 | target_conf = torch.LongTensor(batch_num,box_num) 28 | target_conf.requires_grad_(requires_grad=False) 29 | if Config.use_cuda: 30 | target_loc = target_loc.cuda() 31 | target_conf = target_conf.cuda() 32 | # 因为一次batch可能有多个图,每次循环计算出一个图中的box,即8732个box的loc和conf,存放在target_loc和target_conf中 33 | for batch_id in range(batch_num): 34 | target_truths = targets[batch_id][:,:-1].data 35 | target_labels = targets[batch_id][:,-1].data 36 | if Config.use_cuda: 37 | target_truths = target_truths.cuda() 38 | target_labels = target_labels.cuda() 39 | # 计算box函数,即公式中loc损失函数的计算公式 40 | utils.match(0.5,target_truths,priors_boxes,target_labels,target_loc,target_conf,batch_id) 41 | pos = target_conf > 0 42 | pos_idx = pos.unsqueeze(pos.dim()).expand_as(loc_data) 43 | # 相当于论文中L1损失函数乘xij的操作 44 | pre_loc_xij = loc_data[pos_idx].view(-1,4) 45 | tar_loc_xij = target_loc[pos_idx].view(-1,4) 46 | # 将计算好的loc和预测进行smooth_li损失函数 47 | loss_loc = F.smooth_l1_loss(pre_loc_xij,tar_loc_xij,size_average=False) 48 | 49 | batch_conf = conf_data.view(-1,Config.class_num) 50 | 51 | # 参照论文中conf计算方式,求出ci 52 | loss_c = utils.log_sum_exp(batch_conf) - batch_conf.gather(1, target_conf.view(-1, 1)) 53 | 54 | loss_c = loss_c.view(batch_num, -1) 55 | # 将正样本设定为0 56 | loss_c[pos] = 0 57 | 58 | # 将剩下的负样本排序,选出目标数量的负样本 59 | _, loss_idx = loss_c.sort(1, descending=True) 60 | _, idx_rank = loss_idx.sort(1) 61 | 62 | num_pos = pos.long().sum(1, keepdim=True) 63 | num_neg = torch.clamp(3*num_pos, max=pos.size(1)-1) 64 | 65 | # 提取出正负样本 66 | neg = idx_rank < num_neg.expand_as(idx_rank) 67 | pos_idx = pos.unsqueeze(2).expand_as(conf_data) 68 | neg_idx = neg.unsqueeze(2).expand_as(conf_data) 69 | 70 | conf_p = conf_data[(pos_idx+neg_idx).gt(0)].view(-1, Config.class_num) 71 | targets_weighted = target_conf[(pos+neg).gt(0)] 72 | loss_c = F.cross_entropy(conf_p, targets_weighted, size_average=False) 73 | 74 | N = num_pos.data.sum().double() 75 | loss_l = loss_loc.double() 76 | loss_c = loss_c.double() 77 | loss_l /= N 78 | loss_c /= N 79 | return loss_l, loss_c 80 | -------------------------------------------------------------------------------- /model_file_test.py: 
-------------------------------------------------------------------------------- 1 | import torch 2 | vgg_weights = torch.load('./vgg16_reducedfc.pth') 3 | print(vgg_weights.keys()) -------------------------------------------------------------------------------- /result.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/result.jpg -------------------------------------------------------------------------------- /ssd_net_vgg.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import l2norm 4 | import Config as config 5 | class SSD(nn.Module): 6 | def __init__(self): 7 | super(SSD,self).__init__() 8 | self.vgg = [] 9 | #vgg-16模型 10 | self.vgg.append(nn.Conv2d(in_channels=3,out_channels=64,kernel_size=3,stride=1,padding=1))#conv1_1 11 | self.vgg.append(nn.ReLU(inplace=True)) 12 | self.vgg.append(nn.Conv2d(in_channels=64,out_channels=64,kernel_size=3,stride=1,padding=1))#conv1_2 13 | self.vgg.append(nn.ReLU(inplace=True)) 14 | self.vgg.append(nn.MaxPool2d(kernel_size=2,stride=2))#maxpool1 15 | self.vgg.append(nn.Conv2d(in_channels=64,out_channels=128,kernel_size=3,stride=1,padding=1))#conv2_1 16 | self.vgg.append(nn.ReLU(inplace=True)) 17 | self.vgg.append(nn.Conv2d(in_channels=128,out_channels=128,kernel_size=3,stride=1,padding=1))#conv2_2 18 | self.vgg.append(nn.ReLU(inplace=True)) 19 | self.vgg.append(nn.MaxPool2d(kernel_size=2,stride=2))#maxpool2 20 | self.vgg.append(nn.Conv2d(in_channels=128,out_channels=256,kernel_size=3,stride=1,padding=1))#conv3_1 21 | self.vgg.append(nn.ReLU(inplace=True)) 22 | self.vgg.append(nn.Conv2d(in_channels=256,out_channels=256,kernel_size=3,stride=1,padding=1))#conv3_2 23 | self.vgg.append(nn.ReLU(inplace=True)) 24 | self.vgg.append(nn.Conv2d(in_channels=256,out_channels=256,kernel_size=3,stride=1,padding=1))#conv3_3 25 | self.vgg.append(nn.ReLU(inplace=True)) 26 | self.vgg.append(nn.MaxPool2d(kernel_size=2,stride=2,ceil_mode=True))#maxpool3 27 | self.vgg.append(nn.Conv2d(in_channels=256,out_channels=512,kernel_size=3,stride=1,padding=1))#conv4_1 28 | self.vgg.append(nn.ReLU(inplace=True)) 29 | self.vgg.append(nn.Conv2d(in_channels=512,out_channels=512,kernel_size=3,stride=1,padding=1))#conv4_2 30 | self.vgg.append(nn.ReLU(inplace=True)) 31 | self.vgg.append(nn.Conv2d(in_channels=512,out_channels=512,kernel_size=3,stride=1,padding=1))#conv4_3 32 | self.vgg.append(nn.ReLU(inplace=True)) 33 | self.vgg.append(nn.MaxPool2d(kernel_size=2,stride=2))#maxpool4 34 | self.vgg.append(nn.Conv2d(in_channels=512,out_channels=512,kernel_size=3,stride=1,padding=1))#conv5_1 35 | self.vgg.append(nn.ReLU(inplace=True)) 36 | self.vgg.append(nn.Conv2d(in_channels=512,out_channels=512,kernel_size=3,stride=1,padding=1))#conv5_2 37 | self.vgg.append(nn.ReLU(inplace=True)) 38 | self.vgg.append(nn.Conv2d(in_channels=512,out_channels=512,kernel_size=3,stride=1,padding=1))#conv5_3 39 | self.vgg.append(nn.ReLU(inplace=True)) 40 | self.vgg.append(nn.MaxPool2d(kernel_size=3,stride=1,padding=1))#maxpool5 41 | self.vgg.append(nn.Conv2d(in_channels=512,out_channels=1024,kernel_size=3,padding=6,dilation=6))#conv6 42 | self.vgg.append(nn.ReLU(inplace=True)) 43 | self.vgg.append(nn.Conv2d(in_channels=1024,out_channels=1024,kernel_size=1))#conv7 44 | self.vgg.append(nn.ReLU(inplace=True)) 45 | self.vgg = nn.ModuleList(self.vgg) 46 | self.conv8_1 = nn.Sequential( 47 
        # extra SSD feature layers on top of the backbone
        self.conv8_1 = nn.Sequential(
            nn.Conv2d(in_channels=1024, out_channels=256, kernel_size=1),
            nn.ReLU(inplace=True)
        )
        self.conv8_2 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True)
        )
        self.conv9_1 = nn.Sequential(
            nn.Conv2d(in_channels=512, out_channels=128, kernel_size=1),
            nn.ReLU(inplace=True)
        )
        self.conv9_2 = nn.Sequential(
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True)
        )
        self.conv10_1 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=128, kernel_size=1),
            nn.ReLU(inplace=True)
        )
        self.conv10_2 = nn.Sequential(
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1),
            nn.ReLU(inplace=True)
        )
        self.conv11_1 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=128, kernel_size=1),
            nn.ReLU(inplace=True)
        )
        self.conv11_2 = nn.Sequential(
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1),
            nn.ReLU(inplace=True)
        )
        # localization heads: one per feature map, (boxes per cell) * 4 offsets
        self.feature_map_loc_1 = nn.Sequential(
            nn.Conv2d(in_channels=512, out_channels=4 * 4, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_loc_2 = nn.Sequential(
            nn.Conv2d(in_channels=1024, out_channels=6 * 4, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_loc_3 = nn.Sequential(
            nn.Conv2d(in_channels=512, out_channels=6 * 4, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_loc_4 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=6 * 4, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_loc_5 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=4 * 4, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_loc_6 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=4 * 4, kernel_size=3, stride=1, padding=1)
        )
        # classification heads: one per feature map, (boxes per cell) * class_num scores
        self.feature_map_conf_1 = nn.Sequential(
            nn.Conv2d(in_channels=512, out_channels=4 * config.class_num, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_conf_2 = nn.Sequential(
            nn.Conv2d(in_channels=1024, out_channels=6 * config.class_num, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_conf_3 = nn.Sequential(
            nn.Conv2d(in_channels=512, out_channels=6 * config.class_num, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_conf_4 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=6 * config.class_num, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_conf_5 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=4 * config.class_num, kernel_size=3, stride=1, padding=1)
        )
        self.feature_map_conf_6 = nn.Sequential(
            nn.Conv2d(in_channels=256, out_channels=4 * config.class_num, kernel_size=3, stride=1, padding=1)
        )


    # forward pass
    def forward(self, image):
        out = image
        # backbone up to and including the ReLU after conv4_3 (vgg[0:23]);
        # this activation is the first (38x38) feature map
        for layer in self.vgg[:23]:
            out = layer(out)
        # L2Norm is rebuilt on every call with a fixed scale of 20, so the
        # scale is never learned; kept this way for compatibility with the
        # released checkpoints, which were trained with the same code
        my_L2Norm = l2norm.L2Norm(512, 20)
        feature_map_1 = my_L2Norm(out)
        loc_1 = self.feature_map_loc_1(feature_map_1).permute((0, 2, 3, 1)).contiguous()
        conf_1 = self.feature_map_conf_1(feature_map_1).permute((0, 2, 3, 1)).contiguous()
        # rest of the backbone, maxpool4 through conv7 (vgg[23:]): the 19x19 map
        for layer in self.vgg[23:]:
            out = layer(out)
        feature_map_2 = out
        loc_2 = self.feature_map_loc_2(feature_map_2).permute((0, 2, 3, 1)).contiguous()
        conf_2 = self.feature_map_conf_2(feature_map_2).permute((0, 2, 3, 1)).contiguous()
        out = self.conv8_1(out)
        out = self.conv8_2(out)
        feature_map_3 = out
        loc_3 = self.feature_map_loc_3(feature_map_3).permute((0, 2, 3, 1)).contiguous()
        conf_3 = self.feature_map_conf_3(feature_map_3).permute((0, 2, 3, 1)).contiguous()
        out = self.conv9_1(out)
        out = self.conv9_2(out)
        feature_map_4 = out
        loc_4 = self.feature_map_loc_4(feature_map_4).permute((0, 2, 3, 1)).contiguous()
        conf_4 = self.feature_map_conf_4(feature_map_4).permute((0, 2, 3, 1)).contiguous()
        out = self.conv10_1(out)
        out = self.conv10_2(out)
        feature_map_5 = out
        loc_5 = self.feature_map_loc_5(feature_map_5).permute((0, 2, 3, 1)).contiguous()
        conf_5 = self.feature_map_conf_5(feature_map_5).permute((0, 2, 3, 1)).contiguous()
        out = self.conv11_1(out)
        out = self.conv11_2(out)
        feature_map_6 = out
        loc_6 = self.feature_map_loc_6(feature_map_6).permute((0, 2, 3, 1)).contiguous()
        conf_6 = self.feature_map_conf_6(feature_map_6).permute((0, 2, 3, 1)).contiguous()
        loc_list = [loc_1, loc_2, loc_3, loc_4, loc_5, loc_6]
        conf_list = [conf_1, conf_2, conf_3, conf_4, conf_5, conf_6]
        return loc_list, conf_list
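
# Added shape-check sketch (not part of the original file; assumes CPU or the
# default device): feed a dummy 300x300 batch through an untrained SSD and
# print the six head output shapes.
if __name__ == '__main__':
    net = SSD()
    net.train(mode=False)
    loc_list, conf_list = net(torch.zeros(1, 3, 300, 300))
    for loc, conf in zip(loc_list, conf_list):
        # (batch, H, W, boxes_per_cell * 4) and (batch, H, W, boxes_per_cell * class_num)
        print(loc.shape, conf.shape)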
--------------------------------------------------------------------------------
/test.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/test.jpg
--------------------------------------------------------------------------------
/test_done.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PengfeiM/Fatigue-Driven-Detection-Based-on-CNN/e2aeabe12bff3544dbbab04fa312ebfd866f2cd5/test_done.jpg
--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------
import Config
from itertools import product
from math import sqrt
import torch

def default_prior_box():
    """Generate the SSD default (prior) boxes for every cell of every feature map."""
    mean_layer = []
    for k, f in enumerate(Config.feature_map):
        mean = []
        for i, j in product(range(f), repeat=2):
            f_k = Config.image_size / Config.steps[k]
            # box center, relative to the image
            cx = (j + 0.5) / f_k
            cy = (i + 0.5) / f_k

            # square box with scale s_k
            s_k = Config.sk[k] / Config.image_size
            mean += [cx, cy, s_k, s_k]

            # square box with scale sqrt(s_k * s_(k+1))
            s_k_prime = sqrt(s_k * Config.sk[k + 1] / Config.image_size)
            mean += [cx, cy, s_k_prime, s_k_prime]
            # one pair of boxes per aspect ratio: ar and 1/ar
            for ar in Config.aspect_ratios[k]:
                mean += [cx, cy, s_k * sqrt(ar), s_k / sqrt(ar)]
                mean += [cx, cy, s_k / sqrt(ar), s_k * sqrt(ar)]
        if Config.use_cuda:
            mean = torch.Tensor(mean).cuda().view(Config.feature_map[k], Config.feature_map[k], -1).contiguous()
        else:
            mean = torch.Tensor(mean).view(Config.feature_map[k], Config.feature_map[k], -1).contiguous()
        mean.clamp_(max=1, min=0)
        mean_layer.append(mean)

    return mean_layer
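
# Added note: with Config.feature_map = [38, 19, 10, 5, 3, 1] and
# 2 + 2 * len(aspect_ratios[k]) boxes per cell ([4, 6, 6, 6, 4, 4]), the loop
# above generates 38*38*4 + 19*19*6 + 10*10*6 + 5*5*6 + 3*3*4 + 1*1*4 = 8732
# priors, the standard SSD300 total.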
def encode(match_boxes, prior_box, variances):
    # offset between matched box center and prior center, scaled by variance
    g_cxcy = (match_boxes[:, :2] + match_boxes[:, 2:]) / 2 - prior_box[:, :2]
    # encode variance
    g_cxcy /= (variances[0] * prior_box[:, 2:])
    # match wh / prior wh
    g_wh = (match_boxes[:, 2:] - match_boxes[:, :2]) / prior_box[:, 2:]
    g_wh = torch.log(g_wh) / variances[1]
    # return target for smooth_l1_loss
    return torch.cat([g_cxcy, g_wh], 1)  # [num_priors,4]

def change_prior_box(box):
    if Config.use_cuda:
        return torch.cat((box[:, :2] - box[:, 2:] / 2,             # xmin, ymin
                          box[:, :2] + box[:, 2:] / 2), 1).cuda()  # xmax, ymax
    else:
        return torch.cat((box[:, :2] - box[:, 2:] / 2,     # xmin, ymin
                          box[:, :2] + box[:, 2:] / 2), 1)

# intersection area of two sets of boxes
def insersect(box1, box2):
    label_num = box1.size(0)
    box_num = box2.size(0)
    max_xy = torch.min(
        box1[:, 2:].unsqueeze(1).expand(label_num, box_num, 2),
        box2[:, 2:].unsqueeze(0).expand(label_num, box_num, 2)
    )
    min_xy = torch.max(
        box1[:, :2].unsqueeze(1).expand(label_num, box_num, 2),
        box2[:, :2].unsqueeze(0).expand(label_num, box_num, 2)
    )
    inter = torch.clamp((max_xy - min_xy), min=0)
    return inter[:, :, 0] * inter[:, :, 1]

def jaccard(box_a, box_b):
    """Compute the Jaccard overlap (IoU):
        A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B)
    """
    inter = insersect(box_a, box_b)
    area_a = ((box_a[:, 2] - box_a[:, 0]) *
              (box_a[:, 3] - box_a[:, 1])).unsqueeze(1).expand_as(inter)  # [A,B]
    area_b = ((box_b[:, 2] - box_b[:, 0]) *
              (box_b[:, 3] - box_b[:, 1])).unsqueeze(0).expand_as(inter)  # [A,B]
    union = area_a + area_b - inter
    return inter / union  # [A,B]

def point_form(boxes):
    # convert (cx, cy, w, h) boxes to corner form
    return torch.cat((boxes[:, :2] - boxes[:, 2:] / 2,      # xmin, ymin
                      boxes[:, :2] + boxes[:, 2:] / 2), 1)  # xmax, ymax
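
# Added sanity example: for two unit squares overlapping by half,
#   jaccard(torch.Tensor([[0., 0., 1., 1.]]), torch.Tensor([[0.5, 0., 1.5, 1.]]))
# gives 0.5 / (1 + 1 - 0.5) = 1/3.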
def match(threshold, truths, priors, labels, loc_t, conf_t, idx):
    """Match each prior (default) box to the ground-truth box with the highest
    Jaccard overlap, encode the matched boxes, and write the results into the
    loc_t and conf_t buffers.
    Args:
        threshold: (float) Jaccard overlap below which a prior is labelled background.
        truths: (tensor) ground-truth boxes.
        priors: (tensor) default boxes.
        labels: (tensor) class labels for the objects in this image.
        loc_t: (tensor) buffer for the encoded location targets.
        conf_t: (tensor) buffer for the matched class label of each prior.
        idx: (int) index of this image within the current batch.
    """
    # Jaccard overlaps between ground truths and priors
    overlaps = jaccard(
        truths,
        # convert priors to corner form (x_min, y_min, x_max, y_max)
        point_form(priors)
    )
    # [1,num_objects] best prior for each ground truth
    best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True)
    # [1,num_priors] best ground truth for each prior
    best_truth_overlap, best_truth_idx = overlaps.max(0, keepdim=True)
    best_truth_idx.squeeze_(0)
    best_truth_overlap.squeeze_(0)
    best_prior_idx.squeeze_(1)
    best_prior_overlap.squeeze_(1)
    # pin each ground truth's best prior to overlap 2, so it survives the
    # threshold filter below
    best_truth_overlap.index_fill_(0, best_prior_idx, 2)

    # make sure every ground truth keeps its best prior
    for j in range(best_prior_idx.size(0)):
        best_truth_idx[best_prior_idx[j]] = j
    matches = truths[best_truth_idx]          # Shape: [num_priors,4]
    conf = labels[best_truth_idx] + 1         # Shape: [num_priors]
    conf[best_truth_overlap < threshold] = 0  # label as background
    # encode the offsets as in the localization loss of the SSD paper
    loc = encode(matches, priors, (0.1, 0.2))
    loc_t[idx] = loc    # [num_priors,4] encoded offsets to learn
    conf_t[idx] = conf  # [num_priors] top class label for each prior


def log_sum_exp(x):
    """Numerically stable log-sum-exp over dim 1, used to compute the
    unaveraged confidence loss across all examples in a batch.
    Args:
        x (Variable(tensor)): conf_preds from conf layers
    """
    x_max = x.data.max()
    return torch.log(torch.sum(torch.exp(x - x_max), 1, keepdim=True)) + x_max

def decode(loc, priors, variances):
    """Decode locations from predictions using priors to undo
    the encoding we did for offset regression at train time.
    Args:
        loc (tensor): location predictions for loc layers,
            Shape: [num_priors,4]
        priors (tensor): Prior boxes in center-offset form.
            Shape: [num_priors,4].
        variances: (list[float]) Variances of priorboxes
    Return:
        decoded bounding box predictions
    """

    boxes = torch.cat((
        priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
        priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), 1)
    boxes[:, :2] -= boxes[:, 2:] / 2
    boxes[:, 2:] += boxes[:, :2]
    return boxes
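
# Added note: decode() is the inverse of encode() above; with the variances
# used in match() (0.1, 0.2), decode(encode(b, priors, (0.1, 0.2)), priors,
# [0.1, 0.2]) recovers the corner-form boxes b up to floating-point error.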
def nms(boxes, scores, overlap=0.5, top_k=200):
    """Apply non-maximum suppression at test time to avoid detecting too many
    overlapping bounding boxes for a given object.
    Args:
        boxes: (tensor) The location preds for the img, Shape: [num_priors,4].
        scores: (tensor) The class prediction scores for the img, Shape:[num_priors].
        overlap: (float) The overlap thresh for suppressing unnecessary boxes.
        top_k: (int) The maximum number of box preds to consider.
    Return:
        The indices of the kept boxes with respect to num_priors.
    """

    keep = scores.new(scores.size(0)).zero_().long()
    if boxes.numel() == 0:
        return keep, 0
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]
    area = torch.mul(x2 - x1, y2 - y1)
    v, idx = scores.sort(0)  # sort in ascending order
    # I = I[v >= 0.01]
    idx = idx[-top_k:]  # indices of the top-k largest vals
    xx1 = boxes.new()
    yy1 = boxes.new()
    xx2 = boxes.new()
    yy2 = boxes.new()
    w = boxes.new()
    h = boxes.new()

    # keep = torch.Tensor()
    count = 0
    while idx.numel() > 0:
        i = idx[-1]  # index of current largest val
        # keep.append(i)
        keep[count] = i
        count += 1
        if idx.size(0) == 1:
            break
        idx = idx[:-1]  # remove kept element from view
        # load bboxes of next highest vals
        torch.index_select(x1, 0, idx, out=xx1)
        torch.index_select(y1, 0, idx, out=yy1)
        torch.index_select(x2, 0, idx, out=xx2)
        torch.index_select(y2, 0, idx, out=yy2)
        # store element-wise max with next highest score
        xx1 = torch.clamp(xx1, min=x1[i])
        yy1 = torch.clamp(yy1, min=y1[i])
        xx2 = torch.clamp(xx2, max=x2[i])
        yy2 = torch.clamp(yy2, max=y2[i])
        w.resize_as_(xx2)
        h.resize_as_(yy2)
        w = xx2 - xx1
        h = yy2 - yy1
        # check sizes of xx1 and xx2.. after each iteration
        w = torch.clamp(w, min=0.0)
        h = torch.clamp(h, min=0.0)
        inter = w * h
        # IoU = i / (area(a) + area(b) - i)
        rem_areas = torch.index_select(area, 0, idx)  # load remaining areas
        union = (rem_areas - inter) + area[i]
        IoU = inter / union  # store result in iou
        # keep only elements with an IoU <= overlap
        idx = idx[IoU.le(overlap)]
    return keep, count

if __name__ == '__main__':
    mean = default_prior_box()
    print(mean)
--------------------------------------------------------------------------------
/video_detection.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
"""
Created on Fri Apr 26 15:58:49 2019

@author: 朋飞
"""

from torch.autograd import Variable
from detection import *
from ssd_net_vgg import *
from voc0712 import *
import torch
import torch.nn as nn
import numpy as np
import cv2
import utils
import torch.backends.cudnn as cudnn
import time

# use CUDA if it is available
if torch.cuda.is_available():
    print('-----gpu mode-----')
    torch.set_default_tensor_type('torch.cuda.FloatTensor')
else:
    print('-----cpu mode-----')
colors_tableau = [(214, 39, 40), (23, 190, 207), (188, 189, 34), (188, 34, 188), (205, 108, 8)]

def Yawn(list_Y, list_Y1):
    """Return True when the most recent len(list_Y1) mouth states in list_Y
    match the yawn template list_Y1."""
    return (list_Y[len(list_Y) - len(list_Y1):] == list_Y1).all()

# initialize the network
net = SSD()
net = torch.nn.DataParallel(net)
net.train(mode=False)
net.load_state_dict(torch.load('./weights/ssd300_VOC_100000.pth', map_location=lambda storage, loc: storage))
if torch.cuda.is_available():
    net = net.cuda()
    cudnn.benchmark = True

img_mean = (104.0, 117.0, 123.0)

# open the video file; set file_name to 0 to open the camera instead
file_name = 'C:/Users/HP/Desktop/9-FemaleNoGlasses.avi'
cap = cv2.VideoCapture(file_name)
max_fps = 0

# sliding-window lists of detection results
# for both eyes and mouth, open is '1' and closed is '0'
video_fps = 20  # the test video runs at 20 fps
list_B = np.ones(video_fps * 3)          # eye states over the last 3 s; adjust to the actual fps
list_Y = np.zeros(video_fps * 10)        # mouth states over the last 10 s
list_Y1 = np.ones(int(video_fps * 1.5))  # yawn template: ~1.5 s of open mouth,
list_Y1[int(video_fps * 1.5) - 1] = 0    # ending with one closed frame; a match counts as one yawn
list_blink = np.ones(video_fps * 10)     # blink events over ~10 s: blink '1', no blink '0'
list_yawn = np.zeros(video_fps * 30)     # yawn events over ~30 s: yawn '1', no yawn '0'
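
# Added illustration: at video_fps = 20 the yawn template list_Y1 is
# [1]*29 + [0], i.e. ~1.5 s of continuously open mouth followed by one closed
# frame; a yawn is counted when the newest 30 entries of list_Y equal it.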

#blink_count=0            # blink counter
#yawn_count=0
#blink_start=time.time()  # blink timing
#yawn_start=time.time()   # yawn timing
blink_freq = 0.5
yawn_freq = 0
# start detection; press 'q' to quit
while cap.isOpened():
    flag_B = True   # eyes-open flag (set False once a closed eye is detected)
    flag_Y = False  # mouth-open flag

    num_rec = 0          # number of detected eyes/mouths in this frame
    start = time.time()  # timing
    ret, img = cap.read()  # read one frame

    # detection
    x = cv2.resize(img, (300, 300)).astype(np.float32)
    x -= img_mean
    x = x.astype(np.float32)
    x = x[:, :, ::-1].copy()
    x = torch.from_numpy(x).permute(2, 0, 1)
    xx = Variable(x.unsqueeze(0))
    if torch.cuda.is_available():
        xx = xx.cuda()
    y = net(xx)
    softmax = nn.Softmax(dim=-1)
    # detect=Detect(config.class_num,0,200,0.01,0.45)
    detect = Detect.apply
    priors = utils.default_prior_box()

    loc, conf = y
    loc = torch.cat([o.view(o.size(0), -1) for o in loc], 1)
    conf = torch.cat([o.view(o.size(0), -1) for o in conf], 1)

    detections = detect(
        loc.view(loc.size(0), -1, 4),
        softmax(conf.view(conf.size(0), -1, config.class_num)),
        torch.cat([o.view(-1, 4) for o in priors], 0),
        config.class_num,
        200,
        0.7,
        0.45
    ).data
    labels = VOC_CLASSES
    top_k = 10

    # draw the detections on the frame
    scale = torch.Tensor(img.shape[1::-1]).repeat(2)
    for i in range(detections.size(1)):

        j = 0
        while detections[0, i, j, 0] >= 0.4:
            score = detections[0, i, j, 0]
            label_name = labels[i - 1]
            if label_name == 'closed_eye':
                flag_B = False
            if label_name == 'open_mouth':
                flag_Y = True
            display_txt = '%s:%.2f' % (label_name, score)
            pt = (detections[0, i, j, 1:] * scale).cpu().numpy()
            coords = (pt[0], pt[1]), pt[2] - pt[0] + 1, pt[3] - pt[1] + 1
            color = colors_tableau[i]
            cv2.rectangle(img, (pt[0], pt[1]), (pt[2], pt[3]), color, 2)
            cv2.putText(img, display_txt, (int(pt[0]), int(pt[1]) + 10), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 255, 255), 1, 8)
            j += 1
            num_rec += 1
    if num_rec > 0:
        if flag_B:
            #print(' 1:eye-open')
            list_B = np.append(list_B, 1)  # eyes open -> '1'
        else:
            #print(' 0:eye-closed')
            list_B = np.append(list_B, 0)  # eyes closed -> '0'
        list_B = np.delete(list_B, 0)
        if flag_Y:
            list_Y = np.append(list_Y, 1)
        else:
            list_Y = np.append(list_Y, 0)
        list_Y = np.delete(list_Y, 0)
    else:
        print('nothing detected')
    #print(list)

    # a '1' in the previous frame followed by a '0' in this one counts as a blink
    if list_B[13] == 1 and list_B[14] == 0:
        print('---------------- blink ----------------------')
        list_blink = np.append(list_blink, 1)
    else:
        list_blink = np.append(list_blink, 0)
    list_blink = np.delete(list_blink, 0)


    # yawn detection
    #if Yawn(list_Y,list_Y1):
    if (list_Y[len(list_Y) - len(list_Y1):] == list_Y1).all():
        print('---------------------- yawn ----------------------')
        # after a detected yawn, clear the whole mouth-state window; a yawn
        # takes long enough that this should not cause missed detections
        list_Y = np.zeros(video_fps * 10)
        list_yawn = np.append(list_yawn, 1)
    else:
        list_yawn = np.append(list_yawn, 0)
    list_yawn = np.delete(list_yawn, 0)


    # compute PERCLOS, perblink and peryawn in real time:
    # the fractions of closed-eye frames, blink events and yawn events
    # within their respective windows
    perclos = 1 - np.average(list_B)
    perblink = np.average(list_blink)
    peryawn = np.average(list_yawn)
    #print('perclos={:f}'.format(perclos))
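    # Added worked example: with video_fps = 20, list_B covers the last 60
    # frames (3 s); if 30 of them are closed-eye zeros, perclos = 1 - 30/60
    # = 0.5, which exceeds the 0.4 threshold below and flags fatigue by itself.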
    # fatigue decision
    # idea 1: both frequencies are now computed in real time, so nothing
    # further needs to change here
    if perclos > 0.4 or perblink < 2.5 / (10 * video_fps) or peryawn > 3 / (30 * video_fps):
        print('fatigued')
    else:
        print('awake')


    T = time.time() - start
    fps = 1 / T  # show the live fps on the video
    if fps > max_fps:
        max_fps = fps
    fps_txt = 'fps:%.2f' % (fps)
    cv2.putText(img, fps_txt, (0, 10), cv2.FONT_HERSHEY_SIMPLEX, 0.4, (255, 255, 255), 1, 8)
    cv2.imshow("ssd", img)
    if cv2.waitKey(100) & 0xff == ord('q'):
        break
#print("-------end-------")
cap.release()
cv2.destroyAllWindows()
#print(max_fps)
--------------------------------------------------------------------------------
/voc0712.py:
--------------------------------------------------------------------------------
"""VOC Dataset Classes

Original author: Francisco Massa
https://github.com/fmassa/vision/blob/voc_dataset/torchvision/datasets/voc.py

Updated by: Ellis Brown, Max deGroot
"""
import os.path as osp
import sys
import torch
import torch.utils.data as data
import cv2
import numpy as np
if sys.version_info[0] == 2:
    import xml.etree.cElementTree as ET
else:
    import xml.etree.ElementTree as ET

VOC_CLASSES = [  # class indices start at 0
    'open_eye', 'closed_eye', 'closed_mouth', 'open_mouth']

# note: if you used our download scripts, this should be right
VOC_ROOT = osp.join('./', "data/VOCdevkit/")


class VOCAnnotationTransform(object):
    """Transforms a VOC annotation into a Tensor of bbox coords and label index
    Initialized with a dictionary lookup of classnames to indexes

    Arguments:
        class_to_ind (dict, optional): dictionary lookup of classnames -> indexes
            (default: alphabetic indexing of this project's 4 classes)
        keep_difficult (bool, optional): keep difficult instances or not
            (default: False)
        height (int): height
        width (int): width
    """

    def __init__(self, class_to_ind=None, keep_difficult=False):
        self.class_to_ind = class_to_ind or dict(
            zip(VOC_CLASSES, range(len(VOC_CLASSES))))
        self.keep_difficult = keep_difficult

    def __call__(self, target, width, height):
        """
        Arguments:
            target (annotation) : the target annotation to be made usable
                will be an ET.Element
        Returns:
            a list containing lists of bounding boxes  [bbox coords, class name]
        """
        res = []
        for obj in target.iter('object'):
            difficult = int(obj.find('difficult').text) == 1
            if not self.keep_difficult and difficult:
                continue
            name = obj.find('name').text.lower().strip()
            bbox = obj.find('bndbox')

            pts = ['xmin', 'ymin', 'xmax', 'ymax']
            bndbox = []
            for i, pt in enumerate(pts):
                cur_pt = int(bbox.find(pt).text) - 1
                # scale x by width, y by height
                cur_pt = cur_pt / width if i % 2 == 0 else cur_pt / height
                bndbox.append(cur_pt)
            label_idx = self.class_to_ind[name]
            bndbox.append(label_idx)
            res += [bndbox]  # [xmin, ymin, xmax, ymax, label_ind]
            # img_id = target.find('filename').text[:-4]

        return res  # [[xmin, ymin, xmax, ymax, label_ind], ... ]
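
# Added example: for a 640x480 image with one 'open_eye' box at
# (xmin, ymin, xmax, ymax) = (96, 13, 438, 332), VOCAnnotationTransform()
# returns [[95/640, 12/480, 437/640, 331/480, 0]] -- coordinates scaled to
# [0, 1] with the class index appended.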

class VOCDetection(data.Dataset):
    """VOC Detection Dataset Object

    input is image, target is annotation

    Arguments:
        root (string): filepath to VOCdevkit folder.
        image_set (string): imageset to use (eg. 'train', 'val', 'test')
        transform (callable, optional): transformation to perform on the
            input image
        target_transform (callable, optional): transformation to perform on the
            target `annotation`
            (eg: take in caption string, return tensor of word indices)
        dataset_name (string, optional): which dataset to load
            (default: 'My_Data')
    """

    def __init__(self, root,
                 image_sets=['trainval'],
                 transform=None, target_transform=VOCAnnotationTransform(),
                 dataset_name='My_Data'):
        self.root = root
        self.image_set = image_sets
        self.transform = transform
        self.target_transform = target_transform
        self.name = dataset_name
        self._annopath = osp.join('%s', 'Annotations', '%s.xml')
        self._imgpath = osp.join('%s', 'JPEGImages', '%s.jpg')
        self.ids = list()
        for name in image_sets:
            # rootpath = osp.join(self.root, 'VOC' + year)
            rootpath = self.root
            for line in open(osp.join(rootpath, 'ImageSets', 'Main', name + '.txt')):
                self.ids.append((rootpath, line.strip()))

    def __getitem__(self, index):
        im, gt, h, w = self.pull_item(index)

        return im, gt

    def __len__(self):
        return len(self.ids)

    def pull_item(self, index):
        img_id = self.ids[index]

        target = ET.parse(self._annopath % img_id).getroot()
        img = cv2.imread(self._imgpath % img_id)
        height, width, channels = img.shape

        if self.target_transform is not None:
            target = self.target_transform(target, width, height)

        if self.transform is not None:
            target = np.array(target)
            img, boxes, labels = self.transform(img, target[:, :4], target[:, 4])
            # to rgb
            img = img[:, :, (2, 1, 0)]
            # img = img.transpose(2, 0, 1)
            target = np.hstack((boxes, np.expand_dims(labels, axis=1)))
        return torch.from_numpy(img).permute(2, 0, 1), target, height, width
        # return torch.from_numpy(img), target, height, width

    def pull_image(self, index):
        '''Returns the original image at index as a cv2 (numpy) image

        Note: not using self.__getitem__(), as any transformations passed in
        could mess up this functionality.

        Argument:
            index (int): index of img to show
        Return:
            cv2 img
        '''
        img_id = self.ids[index]
        return cv2.imread(self._imgpath % img_id, cv2.IMREAD_COLOR)

    def pull_anno(self, index):
        '''Returns the original annotation of image at index

        Note: not using self.__getitem__(), as any transformations passed in
        could mess up this functionality.

        Argument:
            index (int): index of img to get annotation of
        Return:
            list:  [img_id, [(label, bbox coords),...]]
                eg: ('001718', [('dog', (96, 13, 438, 332))])
        '''
        img_id = self.ids[index]
        anno = ET.parse(self._annopath % img_id).getroot()
        gt = self.target_transform(anno, 1, 1)
        return img_id[1], gt

    def pull_tensor(self, index):
        '''Returns the original image at an index in tensor form

        Note: not using self.__getitem__(), as any transformations passed in
        could mess up this functionality.

        Argument:
            index (int): index of img to show
        Return:
            tensorized version of img, squeezed
        '''
        return torch.Tensor(self.pull_image(index)).unsqueeze_(0)
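
# Added usage sketch (assumes the dataset layout described in the README):
#   dataset = VOCDetection(root='./dataset/', image_sets=['trainval'])
#   img, gt = dataset[0]  # img: 3xHxW tensor (BGR when transform is None),
#                         # gt: [[xmin, ymin, xmax, ymax, label], ...]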
--------------------------------------------------------------------------------
/weights/readme.txt:
--------------------------------------------------------------------------------
Place the downloaded weight files here.

The 1000-5000 checkpoints are from the original training run;
the 10000-120000 checkpoints were trained on the later dataset.
--------------------------------------------------------------------------------