Converting a TensorFlow BatchDataset to NumPy arrays of images and labels (TensorFlow data type conversion)

25-04-28 1


This article focuses on converting a TensorFlow BatchDataset to NumPy arrays of images and labels, and on TensorFlow data type conversion. It also gathers four related topics: the "import numpy as np" ImportError: No module named numpy error; the "ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray)" error when classifying images with a TensorFlow CNN; NumPy's statistical functions (3.7 Python Data Processing: NumPy Series (7)); and the difference between import numpy and import numpy as np.


How to convert a TensorFlow BatchDataset to NumPy arrays of images and labels

I have a directory of images, and I am loading them like this:

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,labels="inferred",label_mode="int",class_names=None,color_mode="rgb",
    batch_size=batch_size,image_size=(IMG_HEIGHT,IMG_WIDTH),shuffle=True,seed=123,
    validation_split=Val_Split,subset='training',interpolation="bilinear",follow_links=False,)

Then, after building the model, I run tuner.search(train_ds,epochs=50,validation_split=0.2,callbacks=[stop_early]) with the BatchDataset, which gives me this:

Traceback (most recent call last):
  File "DecisionTree/Gender/main.py",line 133,in <module>
    tuner.search(train_ds,callbacks=[stop_early])
  File "/usr/local/lib/python3.7/dist-packages/kerastuner/engine/base_tuner.py",line 131,in search
    self.run_trial(trial,*fit_args,**fit_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/kerastuner/tuners/hyperband.py",line 354,in run_trial
    super(Hyperband,self).run_trial(trial,**fit_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/kerastuner/engine/multi_execution_tuner.py",line 96,in run_trial
    history = self._build_and_fit_model(trial,fit_args,copied_fit_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/kerastuner/engine/tuner.py",line 141,in _build_and_fit_model
    return model.fit(*fit_args,**fit_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py",line 1041,in fit
    (x,y,sample_weight),validation_split=validation_split))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/data_adapter.py",line 1359,in train_validation_split
    "arrays,found following types in the input: {}".format(unsplitable))
ValueError: `validation_split` is only supported for Tensors or NumPy arrays,found following types in the input: [<class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>]

I have tried many ways of converting it to NumPy arrays with .take(), iter() and next(), but none of them worked. How can I get this to work?

Here is my full code:

import tensorflow as tf
from tensorflow import keras
import IPython.display as display
from PIL import Image 
import numpy as np
import matplotlib.pyplot as plt
import kerastuner as kt
import os
import pathlib
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Conv2D,Flatten,Dropout,MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import cv2
import datetime
import glob
import tensorflow_datasets as tfds
from tensorflow.python.data.ops.dataset_ops import AUTOTUNE

print("Num GPUs Available: ",len(tf.config.list_physical_devices('GPU')))

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Create 2 virtual GPUs with 1GB memory each
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],[tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024),tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus),"Physical GPU,",len(logical_gpus),"Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

AUTOTUNE=AUTOTUNE

epochs = 500
steps_per_epoch = 10
batch_size = 32
IMG_HEIGHT = 180
IMG_WIDTH = 180

train_dir = "Training"
pred_dir = "Pred"

train_image_generator = ImageDataGenerator(rescale=1. / 255)

test_image_generator = ImageDataGenerator(rescale=1. / 255)

pred_image_generator = ImageDataGenerator(rescale=1. / 255)

Val_Split = 0.2

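# Arguments restored from the call shown earlier in the question;
# the validation subset is assumed to reuse the same seed and split.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,labels="inferred",label_mode="int",color_mode="rgb",batch_size=batch_size,
    image_size=(IMG_HEIGHT,IMG_WIDTH),shuffle=True,seed=123,
    validation_split=Val_Split,subset='training')

test_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,labels="inferred",label_mode="int",color_mode="rgb",batch_size=batch_size,
    image_size=(IMG_HEIGHT,IMG_WIDTH),shuffle=True,seed=123,
    validation_split=Val_Split,subset='validation')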

def model_builder(hp):
    model = keras.Sequential()
       
    model.add(Flatten(input_shape=(IMG_HEIGHT,IMG_WIDTH,3)))
    model.add(keras.layers.Dense(64,activation="relu"))
    model.add(keras.layers.Dense(20,activation="relu"))
    hp_units = hp.Int('units',min_value=0,max_value=512,step=32)
    model.add(keras.layers.Dense(10,activation="relu"))
    model.add(keras.layers.Dense(1,activation="sigmoid"))

    hp_learning_rate = hp.Choice('learning_rate',values=[1e-2,1e-3,1e-4])

    model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),loss="binary_crossentropy",metrics=['accuracy'])
                  
    return model
    
tuner = kt.Hyperband(model_builder,objective='val_accuracy',max_epochs=10,factor=3,directory='my_dir',project_name='intro_to_kt')

stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss',patience=5)



tuner.search(test_ds,callbacks=[stop_early])

best_hps=tuner.get_best_hyperparameters(num_trials=1)[0]

print(f"""
The hyperparameter search is complete. The optimal number of units in the first densely-connected
layer is {best_hps.get('units')} and the optimal learning rate for the optimizer
is {best_hps.get('learning_rate')}.
""")


model = tuner.hypermodel.build(best_hps)

model.summary()
#tf.keras.utils.plot_model(model,to_file="model.png",show_shapes=True,show_layer_names=True,rankdir='TB')
checkpoint_path = "training_gender/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,save_weights_only=True,verbose=1)

log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir,histogram_freq=1)

history = model.fit(train_ds,steps_per_epoch=steps_per_epoch,epochs=epochs,validation_data=test_ds,validation_steps=10,callbacks=[cp_callback,tensorboard_callback])

val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))


hypermodel = tuner.hypermodel.build(best_hps)

# Retrain the model
hypermodel.fit(train_ds,epochs=best_epoch,validation_split=0.2)


hypermodel.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
hypermodel.save('gender.h5',include_optimizer=True)

test_loss,test_acc = hypermodel.evaluate(test_ds)
print("Tested Acc: ",test_acc)
print("Tested Acc: ",test_acc*100,"%")

print(hypermodel.predict(test_ds))

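Solution

One way to convert an image dataset into X and Y NumPy arrays is as follows:

Note: this code is borrowed from a GitHub repository. It was written by the GitHub user "PARASTOOP".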

import os
import numpy as np
from os import listdir
from PIL import Image  # scipy.misc.imread/imresize no longer exist in current SciPy; Pillow is used instead
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Settings:
img_size = 64
grayscale_images = False
num_class = 2
test_size = 0.2


def get_img(data_path):
    # Load one image, convert its color mode, and resize it to img_size x img_size:
    img = Image.open(data_path).convert('L' if grayscale_images else 'RGB')
    img = img.resize((img_size,img_size))
    return np.asarray(img)

def get_dataset(dataset_path='data/train'):
    # Load the cached arrays if they exist; otherwise build them from the image folders:
    try:
        X = np.load('npy_dataset/X.npy')
        Y = np.load('npy_dataset/Y.npy')
    except OSError:
        labels = sorted(listdir(dataset_path)) # One sub-folder per class label
        X = []
        Y = []
        for i,label in enumerate(labels):
            datas_path = dataset_path+'/'+label
            for data in listdir(datas_path):
                img = get_img(datas_path+'/'+data)
                X.append(img)
                Y.append(i)
        # Create dataset:
        X = 1-np.array(X).astype('float32')/255.  # Scales to [0, 1] and inverts the pixel values
        Y = np.array(Y).astype('float32')
        Y = to_categorical(Y,num_class)
        if not os.path.exists('npy_dataset/'):
            os.makedirs('npy_dataset/')
        np.save('npy_dataset/X.npy',X)
        np.save('npy_dataset/Y.npy',Y)
    # Split images and labels together so that they stay aligned:
    X,X_test,Y,Y_test = train_test_split(X,Y,test_size=test_size,random_state=42)
    return X,X_test,Y,Y_test

if __name__ == '__main__':
    get_dataset()

With this code you can convert an image dataset into X and Y .npy files, which you can then load like this:

import numpy as np
import matplotlib.pyplot as plt

# Load the entire dataset saved by get_dataset()
X = np.load('npy_dataset/X.npy')
Y = np.load('npy_dataset/Y.npy')

# Show the first image
plt.figure()
plt.imshow(X[0])
plt.colorbar()
plt.grid(False)
plt.show()

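Note: if your dataset is too large, this code will crash because it fills up your RAM. Note: for some reason this code also makes the images look blue when viewed with Matplotlib. A likely cause is the X = 1 - ... step in get_dataset, which inverts the pixel values. If anyone knows a solution, please leave a comment below.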
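The question itself asks how to turn a batched dataset into NumPy arrays directly, so here is a minimal sketch of that step. It assumes the dataset yields (images, labels) batches; the helper name dataset_to_numpy and the small stand-in batches are hypothetical and exist only so that the example runs without TensorFlow installed.

```python
import numpy as np

def dataset_to_numpy(batches):
    """Collect an iterable of (images, labels) batches into two NumPy arrays."""
    image_batches, label_batches = [], []
    for images, labels in batches:
        # np.asarray also accepts eager TensorFlow tensors
        image_batches.append(np.asarray(images))
        label_batches.append(np.asarray(labels))
    # Join the batches along the sample axis
    return np.concatenate(image_batches, axis=0), np.concatenate(label_batches, axis=0)

# Stand-in for a batched dataset: one batch of 4 samples and one of 2
fake_batches = [
    (np.zeros((4, 8, 8, 3), dtype=np.float32), np.zeros(4, dtype=np.int32)),
    (np.ones((2, 8, 8, 3), dtype=np.float32), np.ones(2, dtype=np.int32)),
]
X, Y = dataset_to_numpy(fake_batches)
print(X.shape, Y.shape)
```

With a real dataset the call would look like X, Y = dataset_to_numpy(train_ds.as_numpy_iterator()). Images and labels are collected in the same single pass, so they stay aligned even when the dataset shuffles. As with the approach above, everything ends up in RAM. If the only goal is to get rid of the validation_split error, it may be simpler to keep the datasets as they are and pass the validation subset through validation_data instead, since Keras supports validation_split only for tensors and NumPy arrays.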

"import numpy as np" ImportError: No module named numpy

Problem: numpy is not installed

Solution:

Download and install the file

numpy-1.8.2-win32-superpack-python2.7

After installing, running import numpy produces

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    import numpy
  File "C:\Python27\lib\site-packages\numpy\__init__.py", line 153, in <module>
    from . import add_newdocs
  File "C:\Python27\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "C:\Python27\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
    from .type_check import *
  File "C:\Python27\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "C:\Python27\lib\site-packages\numpy\core\__init__.py", line 6, in <module>
    from . import multiarray
ImportError: DLL load failed: %1 is not a valid Win32 application.

The cause: the installed Python is 64-bit, but the installed numpy is 32-bit

Reinstall numpy as: numpy-1.8.0-win64-py2.7
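Before choosing an installer, it helps to confirm whether the running interpreter is 32-bit or 64-bit. The following is a small sketch that uses only the standard library; the variable names are illustrative.

```python
import platform
import struct
import sys

# Size of a C pointer in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one
pointer_bits = struct.calcsize("P") * 8

# Equivalent check based on the largest native integer index
is_64bit = sys.maxsize > 2**32

print("Python", platform.python_version(), "is a", pointer_bits, "bit build")
```

The numpy build has to match this value, otherwise the compiled extension modules fail to load as in the traceback above.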
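
"ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray)" when classifying images with a TensorFlow CNN

How to fix "ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray)" when classifying images with a TensorFlow CNN

I have been working on a CNN for image classification, but I keep running into the same error. My data loads into a DataFrame, but I cannot convert it to a tensor to feed it into the CNN. As you can see, I use this code to load the pictures into the DataFrame: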


for i in range(len(merged)):
    full_path = merged.iloc[i]['Image Path Rel']
    filename = full_path[-22:-1] + 'G'
    try:
        img = img_to_array(load_img('D:/Serengeti_Data/Compressed/Compressed/' + filename,target_size=(32,32,3)))
    except Exception:
        # Fall back to a blank image when the file cannot be loaded
        img = np.zeros((32,32,3),dtype=np.float32)
        images = images.append({'Capture Id' : merged.iloc[i]['Capture Id'],'Image' : img},ignore_index = True)
    else:
        images = images.append({'Capture Id' : merged.iloc[i]['Capture Id'],'Image' : img},ignore_index = True)

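Then, once the images were loaded with load_img() and img_to_array(), I reshaped them to get the required (32,32,3) shape. I also normalized the values by dividing the Image column by 255.

Then I did this to try to convert it to a tensor: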

train_tf = tf.data.Dataset.from_tensor_slices(images['Image'])
# Also tried this, but got the same error:
# train_tf = tf.convert_to_tensor(train_df['Image'])

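But I keep getting the error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray)

I also tried skipping this step and fitting the model right away, but got exactly the same error: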

trying_df = pd.DataFrame(images['Image'])
target_df = pd.DataFrame(targets)
animal_model = models.Sequential()
animal_model.add(layers.Conv2D(30,kernel_size = (3,3),padding = 'valid',activation = 'relu',input_shape =(32,32,3)))
animal_model.add(layers.MaxPooling2D(pool_size=(1,1)))
animal_model.add(layers.Conv2D(60,kernel_size=(1,1),activation = ''relu''))
animal_model.add(layers.Flatten())
animal_model.add(layers.Dense(100,activation = ''relu''))
animal_model.add(layers.Dense(10,activation = ''softmax''))
## compiler to model
animal_model.compile(loss = 'categorical_crossentropy',metrics = ['accuracy'],optimizer ='adam')
## training the model
animal_model.fit(trying_df,target_df,batch_size = 128,epochs = 15)
animal_model.summary()

TensorFlow version: 2.4.1

NumPy version: 1.19.5

pandas version: 1.0.1

Solution

To load an image, you can use the following code:

image = cv2.imread(filename)
image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)

To resize and rescale the images, it is best to let the model "embed" the preprocessing.

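IMG_SIZE = 180
resize_and_rescale = tf.keras.Sequential([
  layers.experimental.preprocessing.Resizing(IMG_SIZE,IMG_SIZE),
  layers.experimental.preprocessing.Rescaling(1./255)
])
# Layer arguments lost in extraction are assumed to repeat the first Conv2D pattern
model = tf.keras.Sequential(
    [
        resize_and_rescale,
        layers.Conv2D(32,3,activation="relu"),layers.MaxPooling2D(),
        layers.Conv2D(64,3,activation="relu"),layers.MaxPooling2D(),
        layers.Conv2D(128,3,activation="relu"),layers.MaxPooling2D(),
        layers.Flatten(),layers.Dense(128,activation="relu"),
        layers.Dense(len(class_names),activation="softmax"),
    ]
)
model.compile(
    # from_logits=False because the last layer already applies softmax
    optimizer="adam",loss=tf.losses.SparseCategoricalCrossentropy(from_logits=False),metrics=["accuracy"],)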

Note:
When working with images, use tf.data rather than NumPy arrays. You can use the following code as an example:
https://github.com/alessiosavi/tensorflow-face-recognition/blob/90d4acbea8f79539826b50c82a63a7c151441a1a/dense_embedding.py#L155
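If the data does have to stay in NumPy form, the error in the question typically comes from a column whose dtype is object, where every cell holds its own array. The sketch below shows one common fix, which is to stack those per-row arrays into a single numeric array. It is an illustration under that assumption, not the answer author's method, and the name column is a stand-in for something like images['Image'].

```python
import numpy as np

# Stand-in for a DataFrame column of images: an object array in which
# each element is a separate (32, 32, 3) float32 array
column = np.empty(4, dtype=object)
for i in range(4):
    column[i] = np.full((32, 32, 3), i, dtype=np.float32)

# Stack the per-row arrays into one contiguous (N, 32, 32, 3) array
batch = np.stack(list(column))

print(column.dtype, "->", batch.dtype, batch.shape)
```

An array like batch has a plain numeric dtype, which is the form that tensor conversion expects, whereas the object-typed column is not.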

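3.7 Python Data Processing: NumPy Series (7) --- NumPy statistical functions

Preface

Let us look at NumPy's statistical functions in detail.

(1) Function overview

Usage: np.*

.sum(a)	Sum of the elements of array a
.mean(a)	Arithmetic mean
.average(a)	Weighted average (equal to the mean when no weights are given)
.std(a)	Standard deviation
.var(a)	Variance
.ptp(a)	Range (maximum minus minimum)
.median(a)	Median
.min(a)	Minimum value
.max(a)	Maximum value
.argmin(a)	Index of the minimum value, computed on the flattened (1-D) array
.argmax(a)	Index of the maximum value, computed on the flattened (1-D) array
.unravel_index(index, shape)	Converts a flat (1-D) index into a multi-dimensional index for the given shape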

(2) Statistical functions, part 1

(1) Description

(2) Output

.sum(a)

.mean(a)

.average(a)

.std(a)

.var(a)
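The five functions listed above can be tried on a small array. This is a minimal sketch; the array values and the weights are arbitrary choices made for illustration.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])

total = np.sum(a)                    # sum over all elements
col_sums = np.sum(a, axis=0)         # sum down each column
mean = np.mean(a)                    # arithmetic mean
# Weighted average: the last element counts five times as much as the others
weighted = np.average(a, weights=[[1, 1], [1, 5]])
std = np.std(a)                      # population standard deviation
var = np.var(a)                      # population variance

print(total, col_sums, mean, weighted, std, var)
```

Without the weights argument, np.average returns the same value as np.mean. Both np.std and np.var divide by N by default; passing ddof=1 gives the sample versions.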

(3) Statistical functions, part 2

(1) Description

(2) Output

.max(a) .min(a)

.ptp(a)

.median(a)

.argmin(a)

.argmax(a)

.unravel_index(index,shape)
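The second group of functions can be sketched in the same way. The array below is an arbitrary example; note how argmin and argmax return positions in the flattened array, which unravel_index then maps back to row and column.

```python
import numpy as np

a = np.array([[7, 2, 9],
              [4, 8, 1]])

lowest = np.min(a)                   # smallest element
highest = np.max(a)                  # largest element
value_range = np.ptp(a)              # highest minus lowest
middle = np.median(a)                # median of all six elements

flat_min = np.argmin(a)              # position of the minimum in the flattened array
flat_max = np.argmax(a)              # position of the maximum in the flattened array

# Convert the flat positions back into (row, column) indices
min_pos = tuple(int(i) for i in np.unravel_index(flat_min, a.shape))
max_pos = tuple(int(i) for i in np.unravel_index(flat_max, a.shape))

print(lowest, highest, value_range, middle, min_pos, max_pos)
```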

Author: Mark

Date: 2019/02/11, Monday

Difference between import numpy and import numpy as np


I understand that when possible one should use

import numpy as np

This helps keep away any conflict due to namespaces. But I have noticed that while the command below works

import numpy.f2py as myf2py

the following does not

import numpy as np
np.f2py #throws no module named f2py

Can someone please explain this?

Tags: python, numpy

Asked Mar 24 '14 at 23:19 by user1318806; edited Mar 24 '14 at 23:20 by mu 無

@roippi have you tried exiting Python, re-entering it, and just doing import numpy then numpy.f2py? It throws an error in my case too – aha Mar 24 '14 at 23:24

Importing a module doesn't import sub-modules. You need to explicitly import the numpy.f2py module regardless of whether or not/how numpy itself has been imported. – alecb Mar 24 '14 at 23:39

4 Answers

13 votes

numpy is the top package name, and doing import numpy doesn't import submodule numpy.f2py.

When you do import numpy it creates a link that points to numpy, but numpy is not further linked to f2py. The link is established when you do import numpy.f2py

In your above code:

import numpy as np # np is an alias pointing to numpy, but at this point numpy is not linked to numpy.f2py
import numpy.f2py as myf2py # this command makes numpy link to numpy.f2py. myf2py is another alias pointing to numpy.f2py as well

Here is the difference between import numpy.f2py and import numpy.f2py as myf2py:

import numpy.f2py
- put numpy into local symbol table(pointing to numpy), and numpy is linked to numpy.f2py
- both numpy and numpy.f2py are accessible
import numpy.f2py as myf2py
- put myf2py into local symbol table(pointing to numpy.f2py)
- Its parent numpy is not added into local symbol table. Therefore you can not access numpy directly
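The behaviour described in the list above can be reproduced with a throwaway package, which avoids depending on how any particular numpy release sets up its submodules. This is a sketch; the names demo_pkg and sub are hypothetical and are created on the fly in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk: demo_pkg/__init__.py and demo_pkg/sub.py
tmp_dir = tempfile.mkdtemp()
pkg_dir = Path(tmp_dir) / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")        # the package imports nothing itself
(pkg_dir / "sub.py").write_text("VALUE = 42\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_pkg as dp                  # binds only the alias dp
before = hasattr(dp, "sub")            # the submodule has not been loaded yet

import demo_pkg.sub as mysub           # loads the submodule, binds only mysub
after = hasattr(dp, "sub")             # the import attached sub to its parent

# The parent's own name was never bound in this scope
parent_name_bound = "demo_pkg" in dir()

print(before, after, parent_name_bound, mysub.VALUE)
```

Importing the submodule attaches it to the parent package as a side effect, which is why dp.sub works afterwards even though dp was imported first.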

Answered Mar 24 '14 at 23:33 by aha; edited Mar 25 '14 at 0:31

7 votes

The import as syntax was introduced in PEP 221 and is well documented there.

When you import a module via

import numpy

the numpy package is bound to the local variable numpy. The import as syntax simply allows you to bind the import to the local variable name of your choice (usually to avoid name collisions, shorten verbose module names, or standardize access to modules with compatible APIs).

Thus,

import numpy as np

is equivalent to,

import numpy
np = numpy
del numpy

When trying to understand this mechanism, it's worth remembering that import numpy actually means import numpy as numpy.

When importing a submodule, you must refer to the full parent module name, since the importing mechanics happen at a higher level than the local variable scope. i.e.

import numpy as np
import numpy.f2py   # OK
import np.f2py      # ImportError

I also take issue with your assertion that "where possible one should [import numpy as np]". This is done for historical reasons, mostly because people get tired very quickly of prefixing every operation with numpy. It has never prevented a name collision for me (laziness of programmers actually suggests there's a higher probability of causing a collision with np)

Finally, to round out my exposé, here are 2 interesting uses of the import as mechanism that you should be aware of:

1. long subimports

import scipy.ndimage.interpolation as warp
warp.affine_transform(I, ...)

2. compatible APIs

try:
    import pyfftw.interfaces.numpy_fft as fft
except ImportError:
    import numpy.fft as fft
# call fft.ifft(If) with fftw or the numpy fallback under a common name

Answered Mar 25 '14 at 0:59 by hbristow

1 vote

numpy.f2py is actually a submodule of numpy, and therefore has to be imported separately from numpy. As aha said before:

When you do import numpy it creates a link that points to numpy, but numpy is not further linked to f2py. The link is established when you do import numpy.f2py

When you call the statement import numpy as np, you are shortening the phrase "numpy" to "np" to make your code easier to read. It also helps to avoid namespace issues. (tkinter and ttk are a good example of what can happen when you do have that issue. The UIs look extremely different.)

Answered Mar 24 '14 at 23:47 by bspymaster

1 vote

This is a language feature. f2py is a subpackage of the module numpy and must be loaded separately.

This feature allows:

you to load from numpy only the packages you need, speeding up execution.
the developers of f2py to have namespace separation from the developers of another subpackage.

Notice however that import numpy.f2py or its variant import numpy.f2py as myf2py are still loading the parent module numpy.

That said, when you run

import numpy as np
np.f2py

You receive an AttributeError because f2py is not an attribute of numpy, because the __init__.py of the package numpy did not declare in its scope anything about the subpackage f2py.

Answered Mar 24 '14 at 23:57 by gg349

When you do import numpy.f2py as myf2py, how do you access its parent numpy? It seems import numpy.f2py allows you to access its parent numpy, but import numpy.f2py as myf2py doesn't – aha Mar 25 '14 at 0:00

You don't access it because you decided you didn't want to use anything from numpy, and you only care about using the subpackage. It is similar to using from foo import bar: the name foo will not be accessible. See the comment after the first example of the docs, LINK – gg349 Mar 25 '14 at 0:05
