Creating Your First Image Classifier
Image classification is an amazing application of deep learning. We can train a powerful algorithm to model a large image dataset. This model can then be used to classify a similar but unknown set of images.
There is no limit to the applications of image classification. You can use it in your next app or to solve some real-world problem. That's all up to you. But to someone who is fairly new to this realm, it might seem very challenging at first. How should I get my data? How should I build my model? What tools should I use?
In this article we will discuss all of that - from finding a dataset to training your model. I will try to make things as simple as possible by avoiding some technical details (PS: Please note that this doesn't mean those details are not important. I will mention some great resources which you can refer to in order to learn more about those topics). The purpose of this article is to explain the basic process of building an image classifier, and that is what we will focus on here.
We will build an image classifier for the Fashion-MNIST dataset. The Fashion-MNIST dataset is a collection of Zalando's article images. It contains 60,000 images for the training set and 10,000 images for the test set (we will discuss the test and training datasets, along with the validation dataset, later). Each image is labeled with one of 10 classes.
Importing Libraries
Our goal is to train a deep learning model that can classify a given set of images into one of these 10 classes. Now that we have our dataset, we should move on to the tools we need. There are many libraries and tools out there that you can choose based on your own project requirements. For this one I will stick to the following:
Numpy - Python library for numerical computation
Pandas - Python library for data manipulation
Matplotlib - Python library for data visualisation
Keras - Python library based on TensorFlow for creating deep learning models
Jupyter - I will run all my code in Jupyter Notebooks. You can install it via the link. You can also use Google Colab if you need better computational power.
Along with these four, we will also use scikit-learn. The purpose of these libraries will become more clear once we dive into the code.
Okay! We have our tools and libraries ready. Now we should start setting up our code.
Start with importing all the above mentioned libraries. Along with importing libraries I have also imported some specific modules from these libraries. Let me go through them one by one.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Dense, Dropout
from keras.layers import Flatten, BatchNormalization
train_test_split: This module splits the training dataset into training and validation data. The reason behind this split is to check whether our model is overfitting. We use the training dataset to train our model and then compare the resulting training accuracy to the validation accuracy. If the difference between the two is significantly large, our model is probably overfitting. In that case we iterate over the model-building process, making the required changes along the way. Once we are satisfied with our training and validation accuracies, we make the final predictions on our test data.
to_categorical: to_categorical is a Keras utility. It is used to convert categorical labels into one-hot encodings. Let's say we have three labels ("apples", "oranges", "bananas"); the one-hot encoding for each of these would be [1, 0, 0] -> "apples", [0, 1, 0] -> "oranges", [0, 0, 1] -> "bananas".
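As a quick illustration, here is a minimal sketch of what to_categorical does for the fruit example above (assuming the integer labels 0, 1, and 2 stand for the three fruits):

from keras.utils import to_categorical

# Hypothetical integer labels: 0 -> "apples", 1 -> "oranges", 2 -> "bananas"
labels = [0, 1, 2, 0]
print(to_categorical(labels))
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]
#  [1. 0. 0.]]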
The rest of the Keras modules we have imported are layers for our network. We will discuss convolutional layers when we start building our model, and we will also take a quick look at what each of these layers does.
Data Pre-processing
For now we will shift our attention to getting our data and analysing it. You should always remember the importance of pre-processing and analysing the data. It not only gives you insights about the data but also helps to locate inconsistencies.
A very slight variation in the data can sometimes lead to a devastating result for your model. This makes it important to preprocess your data before using it for training. So, with that in mind, let's start preprocessing the data.
train_df = pd.read_csv('./fashion-mnist_train.csv')
test_df = pd.read_csv('./fashion-mnist_test.csv')
First of all let's import our dataset (Here is the link to download this dataset on your system). Once you have imported the dataset, run the following command.
train_df.head()
This command shows you what your data looks like: each row holds a label in the first column, followed by 784 pixel values (one for each pixel of a 28x28 image).
We can see how our image data is stored in the form of pixel values. But we cannot feed data to our model in this format. So, we will have to convert it into numpy arrays.
train_data = np.array(train_df.iloc[:, 1:])
test_data = np.array(test_df.iloc[:, 1:])
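As a quick sanity check, you can print the array shapes; with this dataset you should see 60,000 training rows and 10,000 test rows, each with 784 pixel values:

print(train_data.shape)  # (60000, 784)
print(test_data.shape)   # (10000, 784)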
Now, it's time to get our labels.
train_labels = to_categorical(train_df.iloc[:, 0])
test_labels = to_categorical(test_df.iloc[:, 0])
Here, you can see that we have used to_categorical to convert our categorical labels into one-hot encodings.
We will now reshape the data and cast it into float32 type so that we can use it conveniently.
rows, cols = 28, 28
train_data = train_data.reshape(train_data.shape[0], rows, cols, 1)
test_data = test_data.reshape(test_data.shape[0], rows, cols, 1)
train_data = train_data.astype('float32')
test_data = test_data.astype('float32')
We are almost done. Let's finish preprocessing our data by normalizing it. Normalizing the image data maps all the pixel values in each image to values between 0 and 1. This helps us reduce inconsistencies in the data. Before normalizing, the pixel values can vary widely, which can lead to unusual behaviour during the training process.
train_data /= 255.0
test_data /= 255.0
Convolutional Neural Networks
So, data preprocessing is done. Now we can start building our model. We will build a Convolutional Neural Network (CNN) for modeling the image data. CNNs are modified versions of regular neural networks, adapted specifically for image data. Feeding images to a regular neural network would require our network to have a large number of input neurons. For example, even for a small 28x28 image we would require 784 input neurons, which quickly leads to an unmanageable number of training parameters.
CNNs fix this problem by already assuming that the input is going to be an image. The main purpose of convolutional neural networks is to take advantage of the spatial structure of the image and to extract high level features from that and then train on those features. It does so by performing a convolution operation on the matrix of pixel values.
The Conv2D layer we imported earlier performs this convolution operation. A small matrix called a "filter" or "kernel" slides over the input matrix; at each window position, the filter's values are multiplied element-wise with the input values in that window and summed, producing a single output value. The resulting output matrix becomes the input to the next layer.
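To make this concrete, here is a minimal NumPy sketch of the operation (a plain stride-1, no-padding convolution; the helper name convolve2d is mine for illustration, not the optimized implementation Keras actually uses):

import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel over the image (no padding, stride 1).
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window element-wise with the kernel, then sum.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.array([[1, 2, 0],
                  [0, 1, 3],
                  [4, 1, 1]])
kernel = np.array([[1, 0],
                   [0, 1]])
print(convolve2d(image, kernel))  # [[2. 5.]
                                  #  [1. 2.]]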
Other than convolutional layers, a typical CNN also has two other types of layers: 1) a pooling layer, and 2) a fully connected layer.
Pooling layers are used to generalize the output of the convolutional layers. Along with generalizing, pooling also reduces the number of parameters in the model by down-sampling the output of the convolutional layer.
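For example, 2x2 max pooling with stride 2 keeps only the largest value in each 2x2 window, halving each spatial dimension. A toy NumPy sketch:

feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 1, 1],
                        [0, 2, 5, 7],
                        [1, 1, 3, 2]])
# Group the 4x4 map into 2x2 blocks and take the max of each block.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 2]
               #  [2 7]]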
As we just learned, convolutional layers extract high-level features from the image data. Fully connected layers use these high-level features to train the parameters and to learn to classify the images.
We will also use the Dropout, Batch-normalization and Flatten layers in addition to the layers mentioned above. The Flatten layer converts the output of the convolutional layers into a one-dimensional feature vector. It is important to flatten the outputs because Dense (fully connected) layers only accept a feature vector as input, as the toy example below shows. The Dropout layer helps prevent the model from overfitting, while Batch-normalization mainly stabilizes and speeds up training (it also has a mild regularizing effect).
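As a toy example of what Flatten does, consider a single 4x4 feature map with 2 channels; flattening turns it into one 32-element vector per image:

feature_maps = np.arange(32).reshape(1, 4, 4, 2)  # (batch, height, width, channels)
flat = feature_maps.reshape(feature_maps.shape[0], -1)
print(flat.shape)  # (1, 32) - one feature vector per image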
train_x, val_x, train_y, val_y = train_test_split(train_data, train_labels, test_size=0.2)
batch_size = 256
epochs = 5
input_shape = (rows, cols, 1)
def baseline_model():
    model = Sequential()
    model.add(BatchNormalization(input_shape=input_shape))
    model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(Dropout(0.25))
    model.add(BatchNormalization())
    model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10, activation='softmax'))
    return model
The code that you see above is the code for our CNN model. You can structure these layers in many different ways to get good results. There are many popular CNN architectures which give state of the art results. Here, I have just created my own simple architecture for the purpose of this problem. Feel free to try your own and let me know what results you get :)
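If you are curious how the shapes evolve through the network, you can print a summary of the model defined above:

model = baseline_model()
model.summary()
# The two pooling layers shrink the 28x28 input to 14x14 and then 7x7,
# so Flatten produces 7 * 7 * 32 = 1568 features per image.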
Training the model
Once you have created the model, you can instantiate and compile it using the code below.
model = baseline_model()
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.compile configures the learning process for our model. We have passed it three arguments, which define the loss function, the optimizer, and the metrics for our model.
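If plain SGD converges slowly for you, a common alternative is the Adam optimizer; swapping it in is a one-line change (this is my suggestion, not part of the original setup):

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])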
history = model.fit(train_x, train_y,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(val_x, val_y))
And finally by running the code above you can train your model. I am training this model for just five epochs but you can increase the number of epochs. After your training process is completed you can make predictions on the test set by using the following code.
predictions = model.predict(test_data)
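Each row of predictions holds ten class probabilities. Here is a small sketch of one way to turn them into class labels and measure test accuracy, reusing the one-hot test_labels from earlier:

predicted_classes = np.argmax(predictions, axis=1)
true_classes = np.argmax(test_labels, axis=1)
accuracy = np.mean(predicted_classes == true_classes)
print('Test accuracy: {:.4f}'.format(accuracy))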
Conclusion
Congrats! You did it! You have taken your first step into the amazing world of computer vision.
You have created your own image classifier. Even though this is a great achievement, we have just scratched the surface.
There is a lot you can do with CNNs. The applications are limitless. I hope that this article helped you to get an understanding of how the process of training these models works.
Working on other datasets on your own will help you understand this even better. I have also created a GitHub repository for the code I used in this article. So, if this article was useful for you please let me know.
If you have any questions, or you want to share your own results, or if you just want to say "hi", feel free to hit me up on Twitter, and I'll do my best to help you. And finally, thanks a lot for reading this article! :)
Translated from: https://www.freecodecamp.org/news/creating-your-first-image-classifier/