Road Sign Classification: Learning to Build a CNN
Every year, automakers are adding more advanced driver-assistance systems (ADAS) to their fleets. These include adaptive cruise control (ACC), forward collision warning (FCW), automatic parking, and more. One study found that ADAS could prevent up to 28% of all crashes in the United States. This technology will only improve, and will eventually develop into Level 5, fully autonomous cars.
For a car to completely drive itself, it needs to be able to understand its environment. This includes other vehicles, pedestrians, and road signs.
Road signs give us important information about the law, warn us about dangerous conditions, and guide us to our desired destination. If a car cannot distinguish the differences in symbols, colours, and shapes, many people could be seriously injured.
The way a car sees the road is different from how we perceive it. We can all tell the difference between road signs and various traffic situations instantly. A computer, on the other hand, only sees an image as a grid of numbers. That means we need to teach the car to learn like humans, or at least to identify signs the way we do.
To solve this problem, I tried building my own convolutional neural network (CNN) to classify traffic signs. In this process, there are three main steps: preprocessing images, building the convolutional neural network, and outputting a prediction.
Preprocessing images
In the preprocessing stage, the images are imported from the “german-traffic-signs” Bitbucket repository. This contains a dataset of labelled images which will allow us to build a supervised learning model. This repository can be cloned to a Google Colab notebook, making it easy to import the dataset and start coding.
To prepare this dataset for training, each image is passed through a greyscale function and then an equalize function.
Greyscale
Currently, the images from the repository are stored as three-dimensional arrays (height, width, and colour channel). This is because coloured pictures have three colour channels, red, green, and blue (RGB), which are stacked onto each other to give them their vibrant colours.
For this machine learning model, three colour channels aren't necessary; only the features of the signs are needed. So, passing the dataset images through a greyscale function cleans up our data and keeps only the important information, reducing each image to a single channel.
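To make the greyscale step concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the code from the project: the function name `to_greyscale`, the choice of the common ITU-R BT.601 luma weights, and the tiny 2x2 sample image are all made up for this example. In practice an image library would normally do this conversion.

```python
def to_greyscale(image):
    """Collapse an RGB image to one channel.

    `image` is a list of rows; each pixel is an (r, g, b) tuple of ints in 0-255.
    Returns a list of rows of single intensity values in 0-255.
    """
    grey = []
    for row in image:
        # Weighted sum of the three channels (BT.601 luma weights, an
        # assumption here; other weightings exist).
        grey.append([int(round(0.299 * r + 0.587 * g + 0.114 * b))
                     for r, g, b in row])
    return grey


# Hypothetical 2x2 image: red, green on the top row; blue, white below.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
grey = to_greyscale(rgb)
print(grey)  # each pixel is now one number instead of three
```

The output has the same height and width as the input, but each pixel carries a single brightness value, which is all the later steps need.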
Equalize
Now that the images are greyscaled, they have lost some of their contrast, that is, the difference between their lightest and darkest pixels. To increase the contrast, the images must be equalized. This is important because the model distinguishes features by picking up on changes in contrast.
Equalizing an image means spreading out the distribution of pixel values so that they cover a wider range between black and white.
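The idea of spreading out the pixel values can be sketched with classic histogram equalization. This is a minimal, dependency-free illustration, assuming 8-bit greyscale images stored as nested lists; the function name `equalize` and the sample values are hypothetical, and the project itself may well rely on a library routine instead.

```python
def equalize(grey, levels=256):
    """Histogram-equalize a greyscale image (rows of ints in [0, levels))."""
    pixels = [p for row in grey for p in row]

    # Histogram: how many pixels have each intensity.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution: how many pixels are at or below each intensity.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        # Every pixel has the same value, so there is nothing to spread out.
        return [row[:] for row in grey]

    # Lookup table that stretches the occupied range over the full scale.
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1)))
           for c in cdf]
    return [[lut[p] for p in row] for row in grey]


# Hypothetical low-contrast image: all values squeezed between 100 and 130.
low_contrast = [[100, 110], [120, 130]]
stretched = equalize(low_contrast)
print(stretched)  # values now span the full black-to-white range
```

The ordering of the pixels is preserved (darker stays darker), but neighbouring intensities are pushed further apart, which makes edges easier to detect.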
Convolutional neural network
A convolutional neural network is a class of deep learning network used to analyze visual imagery. In this case, it is being used to find the features that distinguish one type of road sign from another.
The process it uses is similar to how our eyes and brains sort everything we see. For example, when looking at a set of numbers, you can tell the difference between a 1 and an 8. A 1 is a vertical line, while an 8 is a loop on top of another loop. Of course, you don't actually say this in your head: you've seen these digits so many times that recognizing them has become a habit.
How do they learn?
To extract the important features of an image, a convolutional neural network uses kernels, small grids of weights that scan over the image one step (or stride) at a time.
I think of it as your eyes moving in saccades over an image: they analyze one part, then move horizontally to the next section, until they have taken in the whole picture.
Kernels compare what they see with what they are looking for. When a feature matches, the match is recorded in a feature map. These feature maps are refined versions of the original image: they keep the important features of the sign and ignore the rest. Several different kernels pass over the original image, each extracting a different feature, and their feature maps are then combined to create the final convolved output.
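The scanning described above can be sketched as a small sliding-window function. This is an illustration only: the name `convolve2d`, the hand-written vertical-edge kernel, and the sample image are assumptions for this example. In a real CNN the kernel weights are learned during training rather than written by hand, and a deep learning framework does the sliding for you.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped, so this is
    strictly a cross-correlation, which is what CNN layers call convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1

    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Multiply the patch under the kernel element-wise and sum it up.
            # A large value means the patch looks like the kernel's pattern.
            row.append(sum(image[top + a][left + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        feature_map.append(row)
    return feature_map


# Hypothetical 4x5 greyscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
# Hand-written kernel that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(convolve2d(image, vertical_edge))
```

The feature map is zero where the image is flat and large where the kernel overlaps the edge, which is the sense in which a feature map keeps the important parts of the sign and ignores the rest.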
Solving Overfitting
When working with a small dataset like the one used for this model, an issue called overfitting can arise. This is when the model starts to memorize the training images instead of learning their features. More specifically, when the model trains for too many epochs (an epoch is one full pass through the dataset), it starts relying on the input of some nodes and ignoring others. This reduces the accuracy of the model on new data, because it won't know how to classify images from outside the dataset.
To solve it, a dropout layer is added, which is a simple fix for this model. During training, the layer switches off a random subset of nodes at each step. This prevents overfitting because no single node can memorize the labels when there is a high probability that it will be turned off, so the network has to spread what it learns across many nodes. It's like the teacher who calls on the kid who isn't paying attention in class: by getting his attention, the teacher (hopefully) gets him to focus and contribute to the class.
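What a dropout layer does to a layer's outputs can be sketched in a few lines. This is a minimal illustration of the "inverted dropout" variant (surviving values are scaled up so the expected total stays the same), not the project's code: the function name `dropout`, the 0.5 rate, and the sample activations are assumptions for this example.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Randomly zero a fraction `rate` of the activations during training.

    Survivors are scaled by 1 / (1 - rate) so the expected sum is unchanged.
    At prediction time (training=False) the values pass through untouched.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical outputs of eight nodes; a seeded generator keeps runs repeatable.
rng = random.Random(0)
node_outputs = [1.0] * 8
print(dropout(node_outputs, rate=0.5, rng=rng))        # some nodes silenced
print(dropout(node_outputs, rate=0.5, training=False))  # unchanged at test time
```

Because a different random subset is silenced at every training step, the network cannot lean on any one node, which is the effect described above.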
Prediction
Finally, the model is given an image of a traffic sign, runs it through the convolutional neural network, and outputs the number associated with the corresponding sign class.
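How a network's final outputs become a single class number can be sketched as follows. This assumes, as is typical for classifiers but not confirmed by the article, that the last layer produces one score per class and that a softmax turns those scores into probabilities. The names `softmax` and `predict_class` and the three made-up scores are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores):
    """Return (class index, probability) for the most likely class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Made-up scores for three sign classes; a real model has one per sign type.
scores = [0.5, 4.0, 1.0]
class_id, confidence = predict_class(scores)
print(class_id, round(confidence, 3))
```

The predicted class is simply the index with the highest probability, and that index is then looked up against the dataset's list of sign names.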
When the following random sign is run through the model…
The model predicts the class as [1], which is correct!
For anyone interested in the code, you can find it on my GitHub, here!
Key Takeaways
- Images are preprocessed with a greyscale and an equalize function
- A convolutional neural network (CNN) uses kernels to extract the features of a sign
- The extracted features are used to predict which class a sign belongs to
Hey, I’m Kael Lascelle, a sixteen-year-old Innovator at The Knowledge Society! I have a passion for autonomous systems, especially self-driving cars, as well as sustainable energy.
I would appreciate it if you could follow me on Medium and Twitter! Also, add me on LinkedIn, or send me an email.
Translated from: https://towardsdatascience.com/road-sign-classification-learning-to-build-a-cnn-7771373179d3