wmma smem data movement question #396

@fenghuoxiguozu

Description

hgemm/wmma
line 127

#define LDST64BITS(value) (reinterpret_cast<float2 *>(&(value))[0])
...
// s_a, 64*16, each thread loads 4 half; each row needs 4 threads, 64 rows, 256 threads in total
const int load_smem_a_m = tid / 4; // 0~63
const int load_smem_a_k = (tid % 4) * 4; // 0,4,8,12
...
LDST64BITS(s_a[load_smem_a_m][load_smem_a_k]) = (LDST64BITS(A[load_gmem_a_addr]));

Each thread reads 4 elements for s_a, so why is LDST64BITS used for the copy? Isn't LDST64BITS defined to move 2 numbers?
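One way to look at the question is by size rather than by element count. A float2 is 64 bits and a half is 16 bits, so a single 64-bit access covers 4 half values. The float2 in the macro only sets the width of the access and is not a count of half elements. Below is a minimal host-side C++ sketch of that arithmetic. `half_bits`, `Float2`, `copy_64bits`, and `halves_moved_by_one_copy` are hypothetical stand-ins invented for illustration, not the CUDA headers or the repository's code, and `memcpy` replaces the `reinterpret_cast` so the sketch stays well-defined on the host.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical host-side stand-ins (assumptions, not the real CUDA headers):
// CUDA's half is 16 bits wide and float2 holds two 32-bit floats.
using half_bits = std::uint16_t;
struct Float2 { float x, y; };

static_assert(sizeof(half_bits) == 2, "half is 2 bytes");
static_assert(sizeof(Float2) == 8, "float2 is 8 bytes, i.e. 64 bits");
static_assert(sizeof(Float2) == 4 * sizeof(half_bits),
              "one float2-sized access spans 4 half values");

// Copy one 64-bit chunk, the amount a single LDST64BITS access moves.
inline void copy_64bits(half_bits* dst, const half_bits* src) {
    std::memcpy(dst, src, sizeof(Float2));
}

// Count how many half-sized slots of dst are written by one 64-bit copy.
inline int halves_moved_by_one_copy() {
    half_bits src[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    half_bits dst[8] = {0, 0, 0, 0, 0, 0, 0, 0};
    copy_64bits(dst, src);
    int moved = 0;
    for (int i = 0; i < 8; ++i) {
        if (dst[i] != 0) ++moved;
    }
    return moved;
}
```

Calling `halves_moved_by_one_copy()` counts the destination slots touched by one 64-bit copy. This is consistent with `load_smem_a_k` advancing in steps of 4, so each thread starts at a 4-half (8-byte) offset within its row, assuming the row base itself is 8-byte aligned.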
