Using numpy matmul and broadcasting row-wise

25-04-28 1

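This article covers using numpy matmul and broadcasting row-wise, and also collects practical information on: einsum and matmul; InvalidArgumentError: cannot compute MatMul as input #0 (zero-based) was expected to be a float tensor but is a double tensor [Op:MatMul]; Error in matMul: inner shapes (1) and (2) of Tensors with shapes 684,1 and 2,1 and transposeA=false and transposeB=false must match; and matmul: Input operand 1 has a mismatch.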

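Contents:

Using numpy matmul and broadcasting row-wise
einsum and matmul
InvalidArgumentError: cannot compute MatMul as input #0 (zero-based) was expected to be a float tensor but is a double tensor [Op:MatMul]
Error in matMul: inner shapes (1) and (2) of Tensors with shapes 684,1 and 2,1 and transposeA=false and transposeB=false must match
matmul: Input operand 1 has a mismatch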

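Using numpy matmul and broadcasting row-wise

How to resolve: using numpy matmul and broadcasting row-wise

I have an (n,3) array of 3D points that are to be rotated about the origin using 3x3 rotation matrices stored as an nx3x3 array.

At the moment I do this in a for loop with matmul, but I suspect that is wasteful, because there must be a faster, broadcast-based way to do it.

Current code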

import numpy as np

n = 10
points = np.random.random([n,3])
rotation_matrices = np.tile(np.random.random([3,3]),(n,1,1))

result = []

# rotate each point by its own 3x3 matrix
for point in range(len(points)):
    rotated_point = np.matmul(rotation_matrices[point],points[point])
    result.append(rotated_point)

result = np.asarray(result)

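Note: in this example I simply tiled the same rotation matrix, but in my real case every 3x3 rotation matrix is different.

What I want to do

I assume there must be some way to broadcast this, because the for loop becomes very slow once the point cloud gets large. I would like to do this: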

np.matmul(rotation_matrices,points)

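where each row of points is multiplied by its corresponding rotation matrix. There may be a way to do this with np.einsum, but I could not figure out the signature.

Solution

If you look at https://ghostbin.co/paste/s2qdw, np.einsum('ij,jk',a,b) is the signature for matmul.

So you can try np.einsum with this signature:

np.einsum('kij,kj->ki',rotation_matrices,points)

Here k runs over the n points and is kept in the output, while j is summed over, so row k of the result is rotation_matrices[k] applied to points[k].

Test:

einsum = np.einsum('kij,kj->ki',rotation_matrices,points)
manual = np.array([np.matmul(x,y) for x,y in zip(rotation_matrices,points)])
np.allclose(einsum,manual)
# True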
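
The same row-wise product can also be written with np.matmul itself, which is what the question originally asked for. The sketch below assumes the shapes from the question, (n,3,3) and (n,3); the helper name rotate_rowwise is made up for illustration. Giving each point a trailing axis of length 1 turns points into a stack of n column vectors, so matmul broadcasts over the leading axis.

```python
import numpy as np

def rotate_rowwise(rotation_matrices, points):
    """Multiply each (3,3) matrix by its matching (3,) point via matmul broadcasting."""
    # points[:, :, None] has shape (n, 3, 1): a stack of n column vectors.
    rotated = np.matmul(rotation_matrices, points[:, :, None])  # shape (n, 3, 1)
    return rotated[:, :, 0]  # drop the trailing axis -> shape (n, 3)

rng = np.random.default_rng(0)
n = 10
points = rng.random((n, 3))
rotation_matrices = rng.random((n, 3, 3))  # a different matrix for every point

broadcast = rotate_rowwise(rotation_matrices, points)
looped = np.array([R @ p for R, p in zip(rotation_matrices, points)])
einsum = np.einsum('kij,kj->ki', rotation_matrices, points)

print(np.allclose(broadcast, looped), np.allclose(broadcast, einsum))
```

Calling np.matmul(rotation_matrices, points) directly treats points as a single (n,3) matrix rather than n separate vectors, which is why the extra axis is needed.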

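einsum and matmul

How to resolve: einsum and matmul

Related question: BLAS with symmetry in higher order tensor in Fortran

I am trying to use Python code to exploit symmetry in a tensor contraction, A[a,b] B[b,c,d] = C[a,c,d], when B[b,c,d] = B[b,d,c] and therefore C[a,c,d] = C[a,d,c]. (The Einstein summation convention is assumed, i.e. the repeated b means it is summed over.)

With the following code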

import numpy as np
import time
# A[a,b] * B[b,c,d] = C[a,c,d]
na = nb = nc = nd = 100
A = np.random.random((na,nb))
B = np.random.random((nb,nc,nd))
C = np.zeros((na,nc,nd))
C2= np.zeros((na,nc,nd))
C3= np.zeros((na,nc,nd))
# symmetrize B
for c in range(nc):
    for d in range(c):
        B[:,c,d] = B[:,d,c]
start_time = time.time()
C2 = np.einsum('ab,bcd->acd',A,B)
finish_time = time.time()
print('time einsum',finish_time - start_time )
start_time = time.time()
for c in range(nc):
# c+1 is needed, since range(0) will be skipped
    for d in range(c+1):
       #C3[:,c,d] = np.einsum('ab,b->a',A[:,:],B[:,c,d] )
       C3[:,c,d] = np.matmul(A[:,:],B[:,c,d] )

# fill the other triangle from the computed one
for c in range(nc):
    for d in range(c+1,nd):
        C3[:,c,d] = C3[:,d,c]
finish_time = time.time()
print( 'time partial einsum',finish_time - start_time )
for a in range(int(na/10)):
    for c in range(int(nc/10)):
        for d in range(int(nd/10)):
            if abs((C3-C2)[a,c,d])> 1.0e-12:
                print('warning',a,c,d,(C3-C2)[a,c,d])

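It seems to me that np.matmul is faster than np.einsum. For example, using np.matmul I get

time einsum 0.07406115531921387
time partial einsum 0.0553278923034668

Using np.einsum, I get

time einsum 0.0751657485961914
time partial einsum 0.11624622344970703

Is the performance difference above general? I often take einsum for granted.

Solution

As a general rule I would expect matmul to be faster, although in simpler cases einsum appears to actually use matmul.

But here are my timings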

In [20]: C2 = np.einsum('ab,bcd->acd',A,B)
In [21]: timeit C2 = np.einsum('ab,bcd->acd',A,B)
126 ms ± 1.3 ms per loop (mean ± std. dev. of 7 runs,10 loops each)

Your symmetry attempt with einsum:

In [22]: %%timeit
    ...: for c in range(nc):
    ...: # c+1 is needed, since range(0) will be skipped
    ...:     for d in range(c+1):
    ...:        C3[:,c,d] = np.einsum('ab,b->a',A[:,:],B[:,c,d] )
    ...:        #C3[:,c,d] = np.matmul(A[:,:],B[:,c,d] )
    ...: 
    ...: for c in range(nc):
    ...:     for d in range(c+1,nd):
    ...:         C3[:,c,d] = C3[:,d,c]
    ...: 
128 ms ± 3.39 ms per loop (mean ± std. dev. of 7 runs,10 loops each)

The same with matmul:

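In [23]: %%timeit
    ...: for c in range(nc):
    ...: # c+1 is needed, since range(0) will be skipped
    ...:     for d in range(c+1):
    ...:        #C3[:,c,d] = np.einsum('ab,b->a',A[:,:],B[:,c,d] )
    ...:        C3[:,c,d] = np.matmul(A[:,:],B[:,c,d] )
    ...: 
    ...: for c in range(nc):
    ...:     for d in range(c+1,nd):
    ...:         C3[:,c,d] = C3[:,d,c]
    ...: 
81.3 ms ± 1.14 ms per loop (mean ± std. dev. of 7 runs,10 loops each)

Direct matmul: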

In [24]: C4 = np.matmul(A,B.reshape(100,-1)).reshape(100,100,100)
In [25]: np.allclose(C2,C4)
Out[25]: True
In [26]: timeit C4 = np.matmul(A,B.reshape(100,-1)).reshape(100,100,100)
14.9 ms ± 167 µs per loop (mean ± std. dev. of 7 runs,100 loops each)

einsum also has an optimize flag. I thought it only mattered with 3 or more arguments, but it seems to help here:

In [27]: timeit C2 = np.einsum('ab,bcd->acd',A,B,optimize=True)
20.3 ms ± 688 µs per loop (mean ± std. dev. of 7 runs,10 loops each)
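
The variants timed above can be checked against each other on a small case. This is only a sketch with made-up sizes (20, 30, 8); it confirms that the four spellings give the same numbers and prints the contraction plan reported by np.einsum_path. It makes no claim about speed, which depends on the machine and the NumPy build.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((20, 30))
B = rng.random((30, 8, 8))

plain = np.einsum('ab,bcd->acd', A, B)
optimized = np.einsum('ab,bcd->acd', A, B, optimize=True)
# Same contraction via reshape + matmul, as in the "direct matmul" timing.
reshaped = np.matmul(A, B.reshape(30, -1)).reshape(20, 8, 8)
# tensordot contracts axis 1 of A with axis 0 of B.
tdot = np.tensordot(A, B, axes=([1], [0]))

# einsum_path returns the chosen contraction order and a readable report.
path, report = np.einsum_path('ab,bcd->acd', A, B, optimize='optimal')
print(report)
```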

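Sometimes, when the arrays are very large, some iteration is faster because it reduces memory-management complexity. But I don't think it is worth it when trying to exploit symmetry. Other SO answers have shown that in some cases matmul can detect symmetry and use a custom BLAS call, but I don't think that is the case here (it cannot detect [expression lost in the source] without expensive comparisons).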
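
If the symmetry is still worth exploiting, the Python double loop can be avoided by packing the unique (c,d) pairs into one axis and doing a single matmul. This is an illustrative sketch, not the method from the answer above; the sizes are small and arbitrary, and whether it beats the plain reshape + matmul depends on the cost of the packing copy.

```python
import numpy as np

rng = np.random.default_rng(0)
na, nb, n = 6, 7, 5
A = rng.random((na, nb))
B = rng.random((nb, n, n))
B = B + B.transpose(0, 2, 1)  # make B symmetric in its last two axes

rows, cols = np.tril_indices(n)   # index pairs with cols <= rows
B_packed = B[:, rows, cols]       # shape (nb, n*(n+1)//2): unique pairs only
C_packed = A @ B_packed           # one matmul over the packed axis

C = np.empty((na, n, n))
C[:, rows, cols] = C_packed       # lower triangle and diagonal
C[:, cols, rows] = C_packed       # mirror into the upper triangle

reference = np.einsum('ab,bcd->acd', A, B)
print(np.allclose(C, reference))
```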

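InvalidArgumentError: cannot compute MatMul as input #0 (zero-based) was expected to be a float tensor but is a double tensor [Op:MatMul]

Can someone explain how TensorFlow's eager mode works? I am trying to build a simple regression as follows: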

import tensorflow as tf
tfe = tf.contrib.eager
tf.enable_eager_execution()
import numpy as np

def make_model():
    net = tf.keras.Sequential()
    net.add(tf.keras.layers.Dense(4, activation='relu'))
    net.add(tf.keras.layers.Dense(1))
    return net

def compute_loss(pred, actual):
    return tf.reduce_mean(tf.square(tf.subtract(pred, actual)))

def compute_gradient(model, pred, actual):
    """compute gradients with given noise and input"""
    with tf.GradientTape() as tape:
        loss = compute_loss(pred, actual)
    grads = tape.gradient(loss, model.variables)
    return grads, loss

def apply_gradients(optimizer, grads, model_vars):
    optimizer.apply_gradients(zip(grads, model_vars))

model = make_model()
optimizer = tf.train.AdamOptimizer(1e-4)

x = np.linspace(0,1,1000)
y = x+np.random.normal(0,0.3,1000)
y = y.astype('float32')
train_dataset = tf.data.Dataset.from_tensor_slices((y.reshape(-1,1)))

epochs = 2  # 10
batch_size = 25
itr = y.shape[0] // batch_size
for epoch in range(epochs):
    for data in tf.contrib.eager.Iterator(train_dataset.batch(25)):
        preds = model(data)
        grads, loss = compute_gradient(model, preds, data)
        print(grads)
        apply_gradients(optimizer, grads, model.variables)
#         with tf.GradientTape() as tape:
#             loss = tf.sqrt(tf.reduce_mean(tf.square(tf.subtract(preds, data))))
#         grads = tape.gradient(loss, model.variables)
#         print(grads)
#         optimizer.apply_gradients(zip(grads, model.variables),global_step=None)

Gradient output: [None, None, None, None, None, None]. The error is as follows:

----------------------------------------------------------------------
ValueError                           Traceback (most recent call last)
<ipython-input-3-a589b9123c80> in <module>
     35         grads, loss = compute_gradient(model, preds, data)
     36         print(grads)
---> 37         apply_gradients(optimizer, grads, model.variables)
     38 #         with tf.GradientTape() as tape:
     39 #             loss = tf.sqrt(tf.reduce_mean(tf.square(tf.subtract(preds, data))))

<ipython-input-3-a589b9123c80> in apply_gradients(optimizer, grads, model_vars)
     17 
     18 def apply_gradients(optimizer, grads, model_vars):
---> 19     optimizer.apply_gradients(zip(grads, model_vars))
     20 
     21 model = make_model()

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py in apply_gradients(self, grads_and_vars, global_step, name)
    589     if not var_list:
    590       raise ValueError("No gradients provided for any variable: %s." %
--> 591                        ([str(v) for _, v, _ in converted_grads_and_vars],))
    592     with ops.init_scope():
    593       self._create_slots(var_list)

ValueError: No gradients provided for any variable:

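Edit

I have updated my code. The problem is now in the gradient computation, which returns zero (the output above shows None for every variable). I have checked that the loss value is non-zero.

Answer 1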

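Part 1:
The problem is indeed the data type of your input. By default your keras model expects float32, but you are passing float64. You can either change the model's dtype or convert the input to float32.

To change the model: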

def make_model():
    net = tf.keras.Sequential()
    net.add(tf.keras.layers.Dense(4, activation='relu', dtype='float32'))
    net.add(tf.keras.layers.Dense(4, activation='relu'))
    net.add(tf.keras.layers.Dense(1))
    return net

To change the input: y = y.astype('float32')
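
The dtype can be checked on the NumPy side before anything reaches TensorFlow. In the error message, "float" means float32 and "double" means float64. The sketch below rebuilds the arrays from the question and shows that NumPy produces float64 unless told otherwise; the names x32 and y32 are introduced here for illustration only.

```python
import numpy as np

x = np.linspace(0, 1, 1000)
y = x + np.random.normal(0, 0.3, 1000)
print(x.dtype, y.dtype)  # NumPy defaults to float64 for both arrays

# Convert to float32 and make each array a column, ready to feed a model.
x32 = x.astype('float32').reshape(-1, 1)
y32 = y.astype('float32').reshape(-1, 1)
print(x32.dtype, x32.shape)
```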

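Part 2:
You need to call the function that computes the model's output (i.e. model(data)) under tf.GradientTape(). For example, you can replace your compute_loss method with the following: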

def compute_loss(model, x, y):
    pred = model(x)
    return tf.reduce_mean(tf.square(tf.subtract(pred, y)))

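Error in matMul: inner shapes (1) and (2) of Tensors with shapes 684,1 and 2,1 and transposeA=false and transposeB=false must match

How to resolve: Error in matMul: inner shapes (1) and (2) of Tensors with shapes 684,1 and 2,1 and transposeA=false and transposeB=false must match

I am a complete beginner in AI and tensorflow.js, and I am currently following Stephen Grider's machine learning course. I should get an output from the code below, but I get an error instead. Please help:

Code: linear-regression.js: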

const tf = require('@tensorflow/tfjs');
class LinearRegression {
    constructor(features,labels,options) {
        this.features = tf.tensor(features);
        this.labels = tf.tensor(labels);
        this.features = tf.ones([this.features.shape[0],1]).concat(this.features) // generates the column of ones for the horsepower
        this.options = Object.assign(
            { learningRate: 0.1,iterations: 1000 },options
        ); // the default learning rate is 0.1; a provided value overrides it. iterations is the number of times gradient descent runs
        this.weights = tf.zeros([2,1]); // initial tensor: both m and b are zero
    }
    gradientDescent() {
        const currentGuesses = this.features.matMul(this.weights); // matMul is matrix multiplication: features * weights
        const differences = currentGuesses.sub(this.labels); // (features * weights) - labels
        const slopes = this.features
            .transpose()
            .matMul(differences)
            .div(this.features.shape[0]); // slope of MSE with respect to both m and b: features^T * ((features * weights) - labels) / number of rows
        
        this.weights = this.weights.sub(slopes.mul(this.options.learningRate));
    }
    train() {
        for (let i=0; i < this.options.iterations; i++) {
            this.gradientDescent();
        }
        /*test(testFeatures,testLabels) {
            testFeatures = tf.tensor(testFeatures);
            testLabels = tf.tensor(testLabels);
        } */
    }
}
module.exports = LinearRegression;

index.js:

require('@tensorflow/tfjs-node');
const tf = require('@tensorflow/tfjs');
const loadCSV = require('./load-csv');
const LinearRegression = require('./linear-regression');
let { features,labels,testFeatures,testLabels } = loadCSV('./cars.csv',{
    shuffle: true,splitTest: 50,dataColumns: ['horsepower'],labelColumns: ['mpg']
});
const regression = new LinearRegression(features,labels,{
    learningRate: 0.002,iterations: 100
});
regression.train();
console.log(
    'Updated M is:',regression.weights.get(1,0),'Updated B is:',regression.weights.get(0,0)
    );

Error:

D:\Application Development\MLKits-master\MLKits-master\regressions\node_modules\@tensorflow\tfjs-core\dist\ops\operation.js:32
            throw ex;
            ^
Error: Error in matMul: inner shapes (1) and (2) of Tensors with shapes 684,1 and 2,1 and transposeA=false and transposeB=false must match.
    at Object.assert (D:\Application Development\MLKits-master\MLKits-master\regressions\node_modules\@tensorflow\tfjs-core\dist\util.js:36:15)
    at matMul_ (D:\Application Development\MLKits-master\MLKits-master\regressions\node_modules\@tensorflow\tfjs-core\dist\ops\matmul.js:25:10)
    at Object.matMul (D:\Application Development\MLKits-master\MLKits-master\regressions\node_modules\@tensorflow\tfjs-core\dist\ops\operation.js:23:29)
    at Tensor.matMul (D:\Application Development\MLKits-master\MLKits-master\regressions\node_modules\@tensorflow\tfjs-core\dist\tensor.js:315:26)
    at LinearRegression.gradientDescent (D:\Application Development\MLKits-master\MLKits-master\regressions\linear-regression.js:19:46)
    at LinearRegression.train (D:\Application Development\MLKits-master\MLKits-master\regressions\linear-regression.js:34:18)
    at Object.<anonymous> (D:\Application Development\MLKits-master\MLKits-master\regressions\index.js:18:12)
    at Module._compile (internal/modules/cjs/loader.js:1063:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1092:10)
    at Module.load (internal/modules/cjs/loader.js:928:32)

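Solution

The error is thrown by

this.features.matMul(this.weights)

This is a matrix multiplication between this.features, of shape [684,1], and this.weights, of shape [2,1]. To multiply a matrix A (shape [a,b]) by B (shape [c,d]), b and c must match, which is not the case here.

To fix the problem here, this.weights should be transposed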

this.features.matMul(this.weights,false,true)
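
The inner-dimension rule is easy to check in NumPy. The sketch below also explores one possible origin of the 684-row shape, under an assumption the source does not confirm: that the training set has 342 rows and that the column of ones was concatenated along the default axis 0 instead of axis 1. The row count and the helper can_matmul are hypothetical.

```python
import numpy as np

n = 342  # hypothetical row count; note that 2 * 342 = 684
features = np.random.random((n, 1))
ones = np.ones((n, 1))
weights = np.zeros((2, 1))

stacked_rows = np.concatenate([ones, features], axis=0)  # shape (684, 1)
stacked_cols = np.concatenate([ones, features], axis=1)  # shape (342, 2)

def can_matmul(a, b):
    """Return True when the inner dimensions of two 2-D arrays agree."""
    return a.shape[1] == b.shape[0]

print(can_matmul(stacked_rows, weights))    # inner dims are 1 and 2
print(can_matmul(stacked_rows, weights.T))  # inner dims are 1 and 1
print(can_matmul(stacked_cols, weights))    # inner dims are 2 and 2

guesses = stacked_cols @ weights  # shape (342, 1): one guess per row
print(guesses.shape)
```

Under that assumption, transposing the weights makes the shapes compatible but yields a (684, 2) result, whereas concatenating along axis 1 gives one guess per data row, so it may be worth checking the concat axis as well.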

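matmul: Input operand 1 has a mismatch

How to resolve: matmul: Input operand 1 has a mismatch

It shows:

matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 5 is different from 1)

When I run: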

pre = lm.predict(y_test)

Please suggest what to do.
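
The source gives no answer for this one, so the following is only a guess at the cause. The message says the second operand expects 5 values along the shared dimension while the first supplies 1, which is what happens when a model fit on 5 feature columns is given an array with a single column (for example the target y_test instead of the feature matrix). The sketch reproduces that situation with plain NumPy; coef, X_bad and X_ok are stand-in names, not part of the original code.

```python
import numpy as np

coef = np.arange(5, dtype=float)  # stands in for weights fit on 5 features
X_bad = np.ones((10, 1))          # 1 column: wrong width for these weights
X_ok = np.ones((10, 5))           # 5 columns: matches the fitted width

try:
    np.matmul(X_bad, coef)
    message = None
except ValueError as exc:
    message = str(exc)
print(message)

predictions = np.matmul(X_ok, coef)  # shape (10,)
print(predictions.shape)
```

If this matches the real situation, passing the test features (with the same number of columns used for fitting) to lm.predict would be the thing to try.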
