Using the nezha_base_www model, the resulting embedding vectors are NaN #227

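# Load the NeZha model
from transformers import BertConfig, NezhaModel, NezhaConfig

# Excerpt from inside the model class: the config is read with BertConfig,
# then the weights are loaded into NezhaModel.
self.config = BertConfig.from_pretrained(config_path)
self.bert_module = NezhaModel.from_pretrained(bert_dir, config=self.config)
# x, mask and segs are the input ids, attention mask and segment ids.
bert_outputs = self.bert_module(input_ids=x,
                                attention_mask=mask,
                                token_type_ids=segs,
                                output_hidden_states=True)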

In bert_outputs, the hidden states of several layers are NaN, and I don't know what causes this.
BaseModelOutputWithPoolingAndCrossAttentions(last_hidden_state=tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
   ...,
        [nan, nan, nan,  ..., nan, nan, nan]], device='cuda:0'), hidden_states=(tensor([[[ 0.5742, -0.2564,  0.4186,  ...,  0.8307, -1.6965,  0.6848],
         [-0.6152,  0.1826, -1.1161,  ...,  0.6985, -3.4405,  1.4675],
         [-0.2423,  0.8284,  0.5155,  ...,  1.0843, -1.4233,  0.5122],
         ...,
         [-0.2828, -0.2603, -0.6676,  ...,  0.5609, -2.0621,  0.5314],
       
         [ 0.5203,  0.3228, -0.4273,  ..., -0.2345, -0.1468, -0.2845],
         [ 0.5203,  0.3228, -0.4273,  ..., -0.2345, -0.1468, -0.2845],
         [ 0.5203,  0.3228, -0.4273,  ..., -0.2345, -0.1468, -0.2845]]],
       device='cuda:0'), tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
       
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]]], device='cuda:0'),), past_key_values=None, attentions=None, cross_attentions=None)
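
To narrow this down, it may help to find the first entry of hidden_states that contains NaN. In transformers, index 0 of hidden_states is the embedding output and index i is the output of encoder layer i. The following is only a minimal diagnostic sketch, not a fix: the helper names contains_nan and first_nan_layer are hypothetical and not part of transformers, and the sketch assumes each entry is either a nested list of floats or an object with a .tolist() method (as torch.Tensor has).

```python
import math


def contains_nan(values):
    """Return True if a (possibly nested) list/tuple of numbers holds a NaN."""
    if isinstance(values, (list, tuple)):
        return any(contains_nan(v) for v in values)
    return isinstance(values, float) and math.isnan(values)


def first_nan_layer(hidden_states):
    """Return the index of the first entry containing NaN, or None if all are finite."""
    for index, layer in enumerate(hidden_states):
        # Tensors are converted to nested Python lists; plain lists pass through.
        data = layer.tolist() if hasattr(layer, "tolist") else layer
        if contains_nan(data):
            return index
    return None
```

Calling first_nan_layer(bert_outputs.hidden_states) on the output above would show where the values first turn into NaN. A result of 0 would point at the embeddings, while a later index would point at that encoder layer. Converting with .tolist() is slow for large batches, so this is meant for a single small input.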
