Using train_test_split with images from my local directory

Question:

I am reading the images from my local directory as follows:

from PIL import Image
import os

root = '/Users/xyz/Desktop/data'

# Walk the directory tree and build the full path of every file
for path, subdirs, files in os.walk(root):
    for name in files:
        img_path = os.path.join(path, name)

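I have two subdirectories, category-1 and category-2, and each contains the image files (.jpg) that belong to that category.

How can I use these images and the two categories with the train_test_split() function in Scikit-Learn? In other words, how do I arrange the training and test data?

Thanks.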

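Answer

You have to read the pixel data from the images and store it in a Pandas DataFrame or a numpy array. At the same time, you have to store the corresponding category values, category-1 (1) and category-2 (2), in a list or numpy array. Here is a sketch; I will assume you have some store, categories, that returns 1 or 2 for a given image.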

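import numpy

X = []  # one flattened pixel vector per image
y = []  # the matching category value (1 or 2)

for path, subdirs, files in os.walk(root):
    for name in files:
        img_path = os.path.join(path, name)
        correct_cat = categories[img_path]
        # Flatten the pixel data into a 1-D vector
        img_pixels = numpy.asarray(Image.open(img_path)).ravel()
        X.append(img_pixels)
        y.append(correct_cat)

# Stacking only works if all images have the same size and mode
X = numpy.vstack(X)
y = numpy.array(y)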

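You are effectively storing the image pixels and the category values (converted to integers). There may be other ways to do this as well; the original answer linked to one example ("Check this").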
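The categories store is left undefined above. One possible way to fill that gap, given that each image sits in a folder named after its category, is to derive the label from the folder name. The following is only a sketch under assumptions that are not in the original answer: the LABELS mapping, the load_images helper, and the resize to a common 64x64 RGB shape are all hypothetical choices, the last one made so that every row of X has the same length.

```python
import os

import numpy as np
from PIL import Image

# Hypothetical mapping from subdirectory name to integer label.
LABELS = {"category-1": 1, "category-2": 2}


def load_images(root, size=(64, 64)):
    """Return (X, y): flattened RGB pixels and one label per image."""
    X, y = [], []
    for path, subdirs, files in os.walk(root):
        folder = os.path.basename(path)
        if folder not in LABELS:
            continue  # skip the root itself and any unrelated folders
        for name in sorted(files):
            if not name.lower().endswith(".jpg"):
                continue  # ignore non-image files
            with Image.open(os.path.join(path, name)) as img:
                # Force a common mode and size so every row has equal length.
                img = img.convert("RGB").resize(size)
                X.append(np.asarray(img).ravel())
            y.append(LABELS[folder])
    return np.array(X), np.array(y)
```

Calling load_images(root) with the root path from the question would then give an X of shape (number of images, 64 * 64 * 3) and a y with one label per row.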

Once you have X and y, you can call train_test_split on them:

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y)
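With only two categories it can help to control how the split is made. The sketch below uses small placeholder arrays in place of the real image data (an assumption for illustration only) and shows three optional train_test_split parameters: test_size, stratify, and random_state. The specific values are arbitrary choices, not something the original answer prescribes.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real arrays: 10 "images" of 12 values each.
X = np.arange(120).reshape(10, 12)
y = np.array([1] * 5 + [2] * 5)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,     # hold out 20% of the images for testing
    stratify=y,        # keep the category proportions equal in both splits
    random_state=42,   # make the shuffle reproducible
)
```

Passing stratify=y is mainly useful when one category has far fewer images than the other, since a plain random split could otherwise leave the test set with very few examples of it.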
