
Is the mask not used in the encoder? And is the mask computation in self-attention incorrect? #10

Open
@wjx-git

Description


At line 92 of bert_model.py:

    encoder_class = Encoder(self.d_model, self.d_k, self.d_v, self.sequence_length, self.h, self.batch_size,
                            self.num_layer, self.input_representation, self.input_representation,
                            dropout_keep_prob=self.dropout_keep_prob,
                            use_residual_conn=self.use_residual_conn)

Why is the `mask` parameter never given a value here? Does that mean no mask is applied by default? Masking should be required in the encoder.
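For reference, a minimal sketch (not this repository's code) of how a padding mask is commonly built from the input ids and converted to an additive mask before being handed to an encoder; the pad id of 0, the shapes, and the assumption that the `mask` argument expects an additive mask are all illustrative:

```python
import tensorflow as tf

# Token ids for a batch of two sentences, padded with 0 (pad id assumed to be 0).
input_ids = tf.constant([[5, 9, 3, 0, 0],
                         [7, 2, 4, 8, 0]], dtype=tf.int32)    # [batch, sequence_length]

# 1.0 where there is a real token, 0.0 at padding positions.
pad_mask = tf.cast(tf.not_equal(input_ids, 0), tf.float32)    # [batch, sequence_length]

# 0 for positions to keep, -1e9 for positions to ignore; something like this is
# what one would expect to be supplied as the Encoder's `mask` argument.
additive_mask = (1.0 - pad_mask) * -1e9                       # [batch, sequence_length]
```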

At line 82 of multi_head_attention.py:

    mask = tf.expand_dims(self.mask, axis=-1)  # [batch, sequence_length, 1]
    mask = tf.expand_dims(mask, axis=1)        # [batch, 1, sequence_length, 1]
    dot_product = dot_product + mask           # [batch, h, sequence_length, 1]

How can the masking operation be a plain addition?
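For comparison, here is a minimal, self-contained sketch of the standard additive-masking trick (names and shapes are illustrative, not taken from the repository): positions to be ignored get a large negative value added to their attention logits before the softmax, so their attention weights end up close to zero. Whether `self.mask` already holds such large negative values, and whether the broadcast axes above are the intended ones, is exactly what this issue is asking.

```python
import numpy as np
import tensorflow as tf

batch, heads, seq_len = 1, 2, 4

# Scaled dot-product scores Q·K^T / sqrt(d_k); random values just for illustration.
scores = tf.constant(np.random.randn(batch, heads, seq_len, seq_len).astype(np.float32))

# 1.0 for real tokens, 0.0 for padding; the last position is padding in this example.
pad_mask = tf.constant([[1.0, 1.0, 1.0, 0.0]])                 # [batch, seq_len]

# Additive mask: 0 where attention is allowed, -1e9 where it is not.
additive_mask = (1.0 - pad_mask) * -1e9                        # [batch, seq_len]
additive_mask = additive_mask[:, tf.newaxis, tf.newaxis, :]    # [batch, 1, 1, seq_len]

# Broadcast addition over heads and query positions, then softmax over the keys:
# the -1e9 entries drive the corresponding attention weights to ~0.
attention = tf.nn.softmax(scores + additive_mask, axis=-1)     # [batch, heads, seq_len, seq_len]
```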
