TensorFlow 2.0 Basic Operations: The Broadcasting Mechanism

3.4_ Broadcasting

Broadcasting mechanism
Main process
Understanding
Advantages
Can it be broadcast?
Exercises
tf.broadcast_to
Broadcast VS Tile

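Broadcasting is a way to expand a tensor's dimensions without copying the underlying data.
It is a data optimization technique that is both efficient and intuitive.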

Broadcasting mechanism

expand
without copying data
VS tf.tile, which copies the data
tf.broadcast_to

Main process

Insert dimensions of size 1 where needed.
Expand each dimension of size 1 to the required size.

Example:

Feature maps: [4, 32, 32, 3]
- Bias: [3] → [1, 1, 1, 3] → [4, 32, 32, 3]
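
The two steps can be spelled out by hand to see what broadcasting does implicitly. The following is a minimal sketch, not the library's internal implementation: the names feature_maps, bias_4d and bias_full, as well as the bias values, are made up for illustration.

```python
import tensorflow as tf

feature_maps = tf.zeros([4, 32, 32, 3])   # hypothetical feature maps
bias = tf.constant([1.0, 2.0, 3.0])       # hypothetical bias, shape [3]

# Step 1: insert size-1 dimensions on the left: [3] -> [1, 1, 1, 3]
bias_4d = tf.reshape(bias, [1, 1, 1, 3])
# Step 2: expand every size-1 dimension: [1, 1, 1, 3] -> [4, 32, 32, 3]
bias_full = tf.broadcast_to(bias_4d, feature_maps.shape)

manual = feature_maps + bias_full   # explicit expansion, then add
implicit = feature_maps + bias      # let the addition broadcast on its own

# Both routes should give the same tensor.
same = bool(tf.reduce_all(tf.equal(manual, implicit)))
print(bias_4d.shape, bias_full.shape, same)
```

In practice only the last form, feature_maps + bias, is needed; the explicit steps are shown purely to make the shape changes visible.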


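Understanding

When the dimension does not exist:
- Insert it, which amounts to introducing a new concept: the same values are shared along the new axes.
- [classes, students, scores] + [scores]
When a dimension of size 1 exists:
- Treat it as having the required size.
- [classes, students, scores] + [students, 1]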
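
Both cases can be tried on a small tensor. This is an illustrative sketch that assumes 4 classes, 32 students per class and 8 scores per student; the offset tensors per_subject and per_student are hypothetical.

```python
import tensorflow as tf

# Assumed sizes: 4 classes, 32 students per class, 8 scores per student.
scores = tf.zeros([4, 32, 8])

# [scores]: one offset per subject, shared by every class and every student.
per_subject = tf.range(8, dtype=tf.float32)                         # shape [8]
# [students, 1]: one offset per student, shared by every class and subject.
per_student = tf.reshape(tf.range(32, dtype=tf.float32), [32, 1])   # shape [32, 1]

by_subject = scores + per_subject   # [8]     -> [1, 1, 8]  -> [4, 32, 8]
by_student = scores + per_student   # [32, 1] -> [1, 32, 1] -> [4, 32, 8]

print(by_subject.shape, by_student.shape)
print(by_subject[0, 0].numpy())       # values vary along the score axis
print(by_student[0, :3, 0].numpy())   # values vary along the student axis
```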

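Advantages

It matches real needs.
- [classes, students, scores]
- Requirement: add 5 points to every student's score: + 5 score
- [4, 32, 8] + [4, 32, 8]: not necessary!
- [4, 32, 8] + [5.0]: this is enough
It reduces memory consumption.
- [4, 32, 8] → 1024 elements
- bias=[8]: [5.0,5.0,5.0,…] → 8 elements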
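
The element counts above follow directly from the shapes. The sketch below reproduces the arithmetic in plain Python; the helper num_elements is hypothetical, and the byte figures assume float32 storage (4 bytes per element).

```python
import math

def num_elements(shape):
    """Number of elements in a tensor of the given shape."""
    return math.prod(shape)

BYTES_PER_FLOAT32 = 4  # assumption: values are stored as float32

full = num_elements([4, 32, 8])   # an explicitly expanded offset tensor
bias = num_elements([8])          # the broadcastable bias

print(full, bias)
print(full * BYTES_PER_FLOAT32, bias * BYTES_PER_FLOAT32)
```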

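Can it be broadcast?

Match dimensions starting from the last (rightmost) one.

If the two sizes are equal, the dimension already matches.
If the current dimension has size 1, expand it.
If the current dimension does not exist, insert a dimension of size 1, then expand it.
Otherwise, the tensors cannot be broadcast.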

Example 1:

[4, 32, 14, 14]
[1, 32, 1, 1] → [4, 32, 14, 14] YES

Example 2:

[4, 32, 14, 14]
[14, 14] → [1, 1, 14, 14] → [4, 32, 14, 14] YES

Example 3:

[4, 32, 14, 14]
[2, 32, 14, 14] NO
- Dim 0 already exists, so a new dimension cannot be inserted and expanded to match
- Dim 0 has a different size (2 vs 4) and is not size 1
- Not broadcastable
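
The matching rule can be written as a small shape check. This is a plain-Python sketch of the rule as described here, not TensorFlow's own code; the function name broadcast_shape is hypothetical.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or None if the shapes are incompatible."""
    a, b = list(shape_a), list(shape_b)
    # A missing leading dimension is treated as an inserted dimension of size 1.
    rank = max(len(a), len(b))
    a = [1] * (rank - len(a)) + a
    b = [1] * (rank - len(b)) + b
    result = []
    # After padding, the shapes are right-aligned and compared pairwise.
    for da, db in zip(a, b):
        if da == db:
            result.append(da)     # sizes already match
        elif da == 1:
            result.append(db)     # expand the size-1 dimension of a
        elif db == 1:
            result.append(da)     # expand the size-1 dimension of b
        else:
            return None           # distinct sizes, neither is 1
    return result

print(broadcast_shape([4, 32, 14, 14], [1, 32, 1, 1]))     # example 1
print(broadcast_shape([4, 32, 14, 14], [14, 14]))          # example 2
print(broadcast_shape([4, 32, 14, 14], [2, 32, 14, 14]))   # example 3
```

The first two calls return [4, 32, 14, 14], and the third returns None because dim 0 is 4 on one side and 2 on the other.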

Exercises

[4, 32, 32, 3]
+ [3]
+ [32, 32, 1]
+ [4, 1, 1, 1]

import tensorflow as tf

a = tf.random.normal([4, 32, 32, 3])
b = tf.ones([3])
c = tf.fill([32, 32, 1], 2.)
d = tf.random.uniform([4, 1, 1, 1])

(a+b).shape   # (4, 32, 32, 3)
(a+c).shape   # (4, 32, 32, 3)
(a+d).shape   # (4, 32, 32, 3)

tf.broadcast_to

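Note: tf.broadcast_to expands an existing tensor to a target shape (here, the shape of another existing tensor), whereas the tf.*_like functions such as tf.ones_like create a new, initialized tensor with the same shape as a given one.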

import tensorflow as tf

x = tf.ones([4, 32, 32, 3])
y = tf.ones([1, 32, 1])

(x+y).shape   # (4, 32, 32, 3)
tf.broadcast_to(y, x.shape).shape   # (4, 32, 32, 3)
tf.ones_like(x).shape   # (4, 32, 32, 3)

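Broadcast VS Tile

Implicit broadcasting (as in a + b) does not copy data in memory, whereas tf.tile does. Note that an explicit tf.broadcast_to call returns a tensor with the full target shape.
The usage differs; broadcasting is more concise.
Note: the multiples argument of tf.tile gives the number of repetitions along each dimension.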

# Broadcast VS Tile
import tensorflow as tf

a = tf.ones([3,4])
a1 = tf.broadcast_to(a, [2,3,4])
a1.shape   # (2, 3, 4)

a2 = tf.expand_dims(a, axis=0)   # [3, 4] -> [1, 3, 4]
a2 = tf.tile(a2, [2,1,1])  # number of repetitions along each dimension
a2.shape   # (2, 3, 4)
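
To check that the two routes agree in values as well as in shape, the results can be compared element by element. A minimal sketch, using non-constant input so the comparison is meaningful; the variable names are illustrative only.

```python
import tensorflow as tf

# Distinct values per element, so an incorrect expansion would be visible.
a = tf.reshape(tf.range(12, dtype=tf.float32), [3, 4])

via_broadcast = tf.broadcast_to(a, [2, 3, 4])
via_tile = tf.tile(tf.expand_dims(a, axis=0), [2, 1, 1])

identical = bool(tf.reduce_all(tf.equal(via_broadcast, via_tile)))
print(via_broadcast.shape, via_tile.shape, identical)
```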

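Copyright notice: This is an original article by z_feng12489, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.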

Original link: https://blog.csdn.net/z_feng12489/article/details/89332012