
@KlaudiuszRydzy

Summary

Nine counters in counters.py had empty or incorrect NCU metric mappings, which caused warnings when profiling on H100.

Fixed 2 broken mappings:

  • pipe_fp16: H100 renamed the fp16 pipe metric to fma_type_fp16
  • kernel_name: "Kernel Name" is not a valid NCU metric; replaced with launch__kernel_name

Filled 7 previously empty mappings:

  • dram_util, dram_throughput, imc_hitrate, shared_ld_requests, shared_st_requests, gnic_lg_read_requests_precoalescing, gnic_lg_read_requests_postcoalescing

Warnings observed

    [WARNING] The report doesn't has this counter: pipe_fp16 -> sm__inst_executed_pipe_fp16...
    [WARNING] The report doesn't has this counter: dram_util ->
    [WARNING] The report doesn't has this counter: dram_throughput ->
    [WARNING] The report doesn't has this counter: imc_hitrate ->
    [WARNING] The report doesn't has this counter: shared_ld_requests ->
    [WARNING] The report doesn't has this counter: shared_st_requests ->
    [WARNING] The report doesn't has this counter: gnic_lg_read_requests_postcoalescing ->
    [WARNING] The report doesn't has this counter: gnic_lg_read_requests_precoalescing ->

The new mappings were chosen by semantically matching each counter against the metrics available in H100 .ncu-rep profiles.
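
The matching step can be approximated by a keyword search over the metric names a report exposes. The sketch below is an illustration only: `candidate_metrics` and the `available` set are hypothetical names, and the set is hand-filled with metric names cited in this PR rather than read from a real .ncu-rep file.

```python
def candidate_metrics(available, keywords):
    """Return the metric names that contain every keyword (case-insensitive), sorted."""
    lowered = [k.lower() for k in keywords]
    return sorted(
        name for name in available
        if all(k in name.lower() for k in lowered)
    )


# Hypothetical excerpt of the metric names available in an H100 profile.
# The entries are the ones named in this PR, not a full dump.
available = {
    "dram__cycles_active.avg.pct_of_peak_sustained_elapsed",
    "dram__throughput.avg.pct_of_peak_sustained_elapsed",
    "smsp__imc_request_hit_rate.pct",
    "l1tex__t_requests_pipe_lsu_mem_dshared_op_ld.sum",
    "l1tex__t_requests_pipe_lsu_mem_dshared_op_st.sum",
}

# Narrow the search for shared_ld_requests to shared-memory load metrics.
for name in candidate_metrics(available, ["dshared", "op_ld"]):
    print(name)
```

A search like this only shortlists candidates; choosing the metric that matches a counter's meaning still takes a manual review.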

When profiling on H100, 9 counters in counters.py produced warnings due
to empty or incorrect NCU metric mappings:

    [WARNING] The report doesn't has this counter: pipe_fp16 -> sm__inst_executed_pipe_fp16...
    [WARNING] The report doesn't has this counter: dram_util ->
    [WARNING] The report doesn't has this counter: shared_ld_requests ->
    ...

Fixed 2 broken mappings:
- pipe_fp16: H100 renamed the fp16 pipe metric to fma_type_fp16
- kernel_name: "Kernel Name" is not a valid NCU metric, use launch__kernel_name

Filled 7 previously empty mappings by matching against available H100
NCU profile metrics:
- dram_util -> dram__cycles_active.avg.pct_of_peak_sustained_elapsed
- dram_throughput -> dram__throughput.avg.pct_of_peak_sustained_elapsed
- imc_hitrate -> smsp__imc_request_hit_rate.pct
- shared_ld_requests -> l1tex__t_requests_pipe_lsu_mem_dshared_op_ld.sum
- shared_st_requests -> l1tex__t_requests_pipe_lsu_mem_dshared_op_st.sum
- gnic_lg_read_requests_precoalescing -> l1tex__t_requests_pipe_lsu_mem_global_op_ld.sum
- gnic_lg_read_requests_postcoalescing -> l1tex__t_output_wavefronts_pipe_lsu_mem_global_op_ld.sum
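
For illustration, the seven filled mappings can be written as a plain dictionary with a small check that flags any counter left without a metric. This is a sketch under assumptions: the names `H100_FILLED_MAPPINGS` and `empty_mappings` are hypothetical, and the actual data structure in counters.py may differ.

```python
# Hypothetical container; the real layout in counters.py may differ.
H100_FILLED_MAPPINGS = {
    "dram_util": "dram__cycles_active.avg.pct_of_peak_sustained_elapsed",
    "dram_throughput": "dram__throughput.avg.pct_of_peak_sustained_elapsed",
    "imc_hitrate": "smsp__imc_request_hit_rate.pct",
    "shared_ld_requests": "l1tex__t_requests_pipe_lsu_mem_dshared_op_ld.sum",
    "shared_st_requests": "l1tex__t_requests_pipe_lsu_mem_dshared_op_st.sum",
    "gnic_lg_read_requests_precoalescing": "l1tex__t_requests_pipe_lsu_mem_global_op_ld.sum",
    "gnic_lg_read_requests_postcoalescing": "l1tex__t_output_wavefronts_pipe_lsu_mem_global_op_ld.sum",
}


def empty_mappings(mappings):
    """Return the counters whose metric string is empty or whitespace-only, sorted."""
    return sorted(
        counter for counter, metric in mappings.items()
        if not metric.strip()
    )


# An empty mapping, as before this change, is reported by the check.
print(empty_mappings({"dram_util": "", "imc_hitrate": "smsp__imc_request_hit_rate.pct"}))

# After this change, none of the seven counters is left unmapped.
print(empty_mappings(H100_FILLED_MAPPINGS))
```

Running a check of this kind before profiling would catch empty mappings earlier than the runtime warnings do.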
@jhdavis8 jhdavis8 merged commit c6c5866 into master Feb 2, 2026