Open
Labels
bug: Something isn't working
Description
This is very similar to my bug report in #846: it looks like something is not being constant-folded. Again, it works with CUDA.jl but fails with AMDGPU.jl.
MWE:

```julia
using QEDbase
using QEDbase.Mocks
using KernelAbstractions
using Random

@kernel function mwe_kernel(dest::AbstractVector)
    id = @index(Global)
    dest[id] = zero(eltype(dest))
end

RNG = MersenneTwister(137137)

# works ->
using CUDA
moms = CuVector([Mocks._rand_momenta(RNG, 1, MockMomentum{Float32})[1] for _ in 1:128])
mwe_kernel(get_backend(moms))(moms; ndrange = length(moms))
KernelAbstractions.synchronize(get_backend(moms))
@info "CUDA Success"

# crashes ->
using AMDGPU
moms = ROCVector([Mocks._rand_momenta(RNG, 1, MockMomentum{Float32})[1] for _ in 1:128])
mwe_kernel(get_backend(moms))(moms; ndrange = length(moms))
KernelAbstractions.synchronize(get_backend(moms))
@info "AMDGPU Success"
```

What I'm getting for AMDGPU is this:
ERROR: LoadError: InvalidIRError: compiling MethodInstance for gpu_mwe_kernel(::KernelAbstractions.CompilerMetadata{…}, ::AMDGPU.Device.ROCDeviceVector{…}) resulted in invalid LLVM IR
Reason: unsupported call to an external C function (call to jl_string_to_genericmemory)
Reason: unsupported call to an external C function (call to jl_genericmemory_to_string)
Reason: unsupported call to an external C function (call to ijl_pchar_to_string)
Reason: unsupported call to an external C function (call to ijl_rethrow)
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erroneous code with Cthulhu.jl
Stacktrace:
[1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, args::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/validation.jl:167
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:417 [inlined]
[3] macro expansion
@ ~/.julia/packages/Tracy/tYwAE/src/tracepoint.jl:163 [inlined]
[4] emit_llvm(job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:416
[5] emit_llvm
@ ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:182 [inlined]
[6] compile_unhooked(output::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:95
[7] compile_unhooked
@ ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:80 [inlined]
[8] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:67
[9] compile
@ ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:55 [inlined]
[10] #hipcompile##0
@ ~/.julia/packages/AMDGPU/TqRG0/src/compiler/codegen.jl:211 [inlined]
[11] JuliaContext(f::AMDGPU.Compiler.var"#hipcompile##0#hipcompile##1"{GPUCompiler.CompilerJob{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}}; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:34
[12] JuliaContext(f::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/driver.jl:25
[13] hipcompile(job::GPUCompiler.CompilerJob)
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/TqRG0/src/compiler/codegen.jl:210
[14] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(AMDGPU.Compiler.hipcompile), linker::typeof(AMDGPU.Compiler.hiplink))
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/execution.jl:245
[15] cached_compilation(cache::Dict{Any, AMDGPU.HIP.HIPFunction}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{GPUCompiler.GCNCompilerTarget, AMDGPU.Compiler.HIPCompilerParams}, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/wvn1Y/src/execution.jl:159
[16] macro expansion
@ ~/.julia/packages/AMDGPU/TqRG0/src/compiler/codegen.jl:166 [inlined]
[17] macro expansion
@ ./lock.jl:376 [inlined]
[18] hipfunction(f::typeof(gpu_mwe_kernel), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, AMDGPU.Device.ROCDeviceVector{…}}}; kwargs::@Kwargs{})
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/TqRG0/src/compiler/codegen.jl:160
[19] hipfunction(f::typeof(gpu_mwe_kernel), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{…}, AMDGPU.Device.ROCDeviceVector{…}}})
@ AMDGPU.Compiler ~/.julia/packages/AMDGPU/TqRG0/src/compiler/codegen.jl:159
[20] macro expansion
@ ~/.julia/packages/AMDGPU/TqRG0/src/highlevel.jl:155 [inlined]
[21] (::KernelAbstractions.Kernel{…})(args::ROCArray{…}; ndrange::Int64, workgroupsize::Nothing)
@ AMDGPU.ROCKernels ~/.julia/packages/AMDGPU/TqRG0/src/ROCKernels.jl:96
[22] top-level scope
@ ~/repos/QEDbase.jl/temp.jl:24
[23] include(mapexpr::Function, mod::Module, _path::String)
@ Base ./Base.jl:307
[24] top-level scope
@ REPL[3]:1
in expression starting at /home/reinha57/repos/QEDbase.jl/temp.jl:24
Some type information was truncated. Use `show(err)` to see complete types.
The implementation of the `zero(::Type)` function is the following:

```julia
Base.zero(mom_type::Type{<:AbstractMockMomentum}) = mom_type(zeros(eltype(mom_type), 4))
```

A workaround is to change this to

```julia
function Base.zero(mom_type::Type{T}) where {EL_T, T <: AbstractMockMomentum{EL_T}}
    return mom_type(zero(EL_T), zero(EL_T), zero(EL_T), zero(EL_T))
end
```

which works with both backends.
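
One possible explanation, not confirmed in this report, is that `zeros(eltype(mom_type), 4)` builds a heap-allocated `Vector` inside the kernel, and that temporary must be optimized away entirely for the GPU code to be valid. The workaround passes only scalars, so nothing has to be eliminated. The difference can be sketched on the CPU without any GPU package. The names `AbstractSketchMomentum`, `SketchMomentum`, `zero_via_vector`, and `zero_via_scalars` are hypothetical stand-ins for illustration, not the QEDbase definitions.

```julia
# Hypothetical stand-in for a four-momentum type; NOT the QEDbase definition.
abstract type AbstractSketchMomentum{T} end

struct SketchMomentum{T<:Real} <: AbstractSketchMomentum{T}
    E::T
    px::T
    py::T
    pz::T
end

# Constructor from a vector, mirroring the pattern used by the original `zero` method.
SketchMomentum{T}(v::AbstractVector) where {T} = SketchMomentum{T}(v[1], v[2], v[3], v[4])

Base.eltype(::Type{<:AbstractSketchMomentum{T}}) where {T} = T

# Variant 1: goes through `zeros`, which creates a temporary Vector first.
zero_via_vector(::Type{M}) where {M<:AbstractSketchMomentum} = M(zeros(eltype(M), 4))

# Variant 2: passes four scalar zeros, so no temporary array is involved.
function zero_via_scalars(::Type{M}) where {T, M<:AbstractSketchMomentum{T}}
    return M(zero(T), zero(T), zero(T), zero(T))
end

# Both variants produce the same value; they differ only in how they get there.
a = zero_via_vector(SketchMomentum{Float32})
b = zero_via_scalars(SketchMomentum{Float32})
println(a == b)
```

Wrapping each variant in a function and measuring it with `@allocated` on the CPU may help spot such temporaries before a kernel is compiled for a GPU backend, although the CPU and GPU compilers can optimize differently.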