[Bug]: FlashInfer CUTLASS MoE backend causes CUDA illegal memory access on H100 during CUDA graph capture (Qwen3-Next-80B BF16)
April 8, 2026 · #39288
Python
Difficulty: Easy
Parent Repository
vllm-project/vllm
Python repository
75,721 15,332