Skip to content

HDL Conventions

Jeff Bush edited this page Jul 14, 2018 · 78 revisions

I've tried to use some conventions and rules through this HDL implementation. These both make the design easier to read and understand and try to avoid bugs and synthesis mismatches. I've documented them here with motivations. Some of these are relaxed for test bench code, as they are primarily to ensure proper synthesis. Endnotes indicate sources that give more background and justification.

General

  • Use logic to define nets and flops rather than reg and wire. This is cleaner, since reg and wire are somewhat inconsistent in their usage.11

  • Initialize nets in an initial block rather than as part of the declaration. Some simulators don't work correctly with the first form. This is also more consistent: mixing both forms could result in unexpected behavior.

    No:

    logic burst_w = '0;

    Yes:

    logic burst_w;
    ...
    initial
    begin
        burst_w = '0;
  • Use interfaces and structs to group related signals.

    • This makes the code more concise and intent more explicit.
    • Less likely to miss flops, since a single statement can register all signals in a struct.20
  • Prefer typedefs rather than hardcoded sizes. e.g. rather than:

    logic[3:0] subcycle1;
    logic[3:0] subcycle2;

    Do this:

    typedef logic[3:0] subcycle_t; // in defines.sv
    
    subcycle_t subcycle1;
    subcycle_t subcycle2;
    • More readable, less visual noise with array indices.
    • Makes code more understandable by giving context about signals.
    • Avoids size mismatch bugs (for example, if I changed a signal size and missed one spot).
    • Since this is an experimental design that is in flux, this makes it easier to change things.
  • Use enumerations wherever possible (as opposed to constant parameters or `defines).

    • Tools will raise an error if code assigns a non-enumerated type to them, which can catch bugs (unfortunately, Verilator does not yet).20
    • Will catch if two constants have the same values
    • Checks that signals are wide enough to hold all values. Since the enum type is used everywhere and size is defined once, less likely to have one-off size mismatches and easier to add new constant values.
  • Avoid using 'X' values and casex. These have well known problems that can cause simulation/synthesis mismatches,2 but also Verilator ignores them (as it is a two-state simulator), so tests may not catch issues that would break 4-state simulators.12

  • Instantiate memories using sram_1r1w and sram_2r1w rather than arrays.

    No:

        logic[31:0] my_memory[4096];

    Yes:

    sram_1r1w #(
        .DATA_WIDTH(32),
        .SIZE(4096),
        .READ_DURING_WRITE("NEW_DATA")
    ) my_memory(.*);
    • This ensures FPGA tools infer memories rather than a huge number of flops.7
    • Avoids implicitly creating extra ports.
    • These enforce proper synchronous access semantics.
    • This guarantees consistent read-after-write behavior (when the same address is read and written in a cycle) and makes these requirements explicit. This behavior tends to vary in structural FPGA primitives between vendors and even between FPGAs in the same family. Using behavioral arrays may generate more logic than necessary to support unneeded semantics.
    • In simulation, these modules clear themselves to avoid X propagation bugs in 4-state simulators.
    • Simulation randomizes output when read enables aren't active to catch bugs.
    • For ASIC synthesis, this allows using memory generator tools.
  • Don't use translate_off/translate on. Use ifdef SIMULATION where necessary (but only if necessary; this should be rare).9, 16

    • Using the preprocessor will catch missing `endifs, where there is no checking for missing translate_offs.
    • But these constructs cause synthesis/simulation mismatches by design, so ensure it is something that only simulation cares about (for example, extra runtime error checking). Synthesis tools ignore most non-synthesizable constructs, so these are usually not necessary.
  • Don't use full_case/parallel_case pragmas. These cause simulation and synthesis to differ and can cause subtle bugs that tests won't catch.10 Behavior can also vary between synthesis tools.

  • Use default statements for all case statements.

    • Makes design more robust so it can recover from glitches
    • Avoids inferring latches.7
    • Avoids X-latching.2
  • Generally, case statements should be simple and parallel (only one case is true). Use the 'unique' keyword on all case statements to verify this with the simulator.20 Use constant values for the case clauses, and prefer a single net in the case statement. This makes it easier for synthesis tools to optimize logic, and gives more consistent among vendors and between synthesis and simulation:

    Yuck:

    logic a;
    logic b;
    logic c:
    case (1'b1)
        a: // do something
        b: // do something else
        c: // another thing
    endcase

    Better:

        unique case (dd_request_vaddr.offset[1:0])
            2'd0: byte_aligned = mem_load_lane[31:24];
            2'd1: byte_aligned = mem_load_lane[23:16];
            2'd2: byte_aligned = mem_load_lane[15:8];
            2'd3: byte_aligned = mem_load_lane[7:0];
            default: byte_aligned = '0;
        endcase
  • Use assertions liberally. As well as catching bugs, assertions clarify assumptions and requirements.19

    • Add a comment above each assertion to describe why it should be true.
    • assert should only be inside an always_ff and shouldn't be reachable when reset is high. Otherwise they may trigger incorrectly.
    • Use $onehot or $onehot0 to ensure one-hot signals only have one signal set.
    • Do not add assertions that duplicate RTL. Assertions should validate behavior, invariants, and inter-module dependencies.

    No:

    assign foo = bar && (baz || !boo);
    ...
    assert(foo == bar && (baz || !boo));

    Yes:

    for (way_idx = 0; way_idx < `L1D_WAYS; way_idx++)
    begin : hit_check_gen
        assign way_hit_oh[way_idx] = dt_request_paddr.tag == dt_tag[way_idx]
            && dt_valid[way_idx];
    end
    
    // Make sure data is not present in more than one way.
    assert(!dcache_load_en || $onehot0(way_hit_oh));
  • Allow tools to infer state machines from a case statement/next_state logic rather than trying to code something like a one-hot state machine explicitly. This allows the tools to generate more efficient logic and is easier to understand (thus less error prone). 7

  • Make all internal enable signals active high. It's easier to understand and less error prone. The exception is external interfaces that are already specified (e.g. SPI or SDRAM), which should have an _n suffix if they are active low.

  • Avoid generating logic with functions. These are harder to reason about and can infer latches16

  • Do not use tri-state/bidirectional (inout) signals internally. These may affect the ability of FPGA synthesis tools to optimize across hierarchical boundaries.7 These are okay for external pins (which have I/O cells that can actually float).

  • When computing address widths, use $clog2 rather than hardcoding.20 Use $bits where needed to determine the size of a signal.

Clocks and Reset

  • Do not use behavioral delays. These are not supported in Verilator, nor by synthesis tools, and can cause mismatches.16

    #4 a = b + c;
    q <= #2 d;
  • The main clock is named 'clk', and all flops are positive edge triggered. Mixing positive and negative edge triggered blocks is more complex and can reduce maximum clock speed.18

  • Keep the number of clock domains to a minimum.5 There are a limited clock resources on FPGAs. Clock domain crossings are difficult to test and debug in simulation and can cause subtle bugs. Prefer to use flip flop enables where possible instead.

  • Do not use the same signal as a clock (i.e. in a @posedge/@negedge block) and as data (i.e. RHS of an expression). Same with resets. These complicate synthesis and timing analysis.5 Note that some external signals named clocks (for example, the SPI clock) are treated as solely as enables, which is fine, as, despite the name, they are never routed into clock inputs of flops.

  • Use always_ff and always_comb to avoid inferred latches or sensitivity list bugs.6 always_comb also triggers at time 0 to avoid subtle initialization bugs.13

  • Use nonblocking assignment for sequential logic (always_ff), blocking for combinational (always_comb).1

    always_ff @(posedge clk)
        my_ff <= another_ff;
    
    always_comb
        my_signal = another_signal;
  • Do not use nonblocking and blocking assignments in the same block.1

  • Do not make assignments to the same variable from multiple always blocks.1 Verilator does not flag this as an error, but many other tools do, and it creates ambiguous logic definitions.

  • Do not use latches. These are difficult to make glitch-free, have slower timing performance on FPGAs, and timing analysis tools cannot model them as accurately.17

  • There is a single reset named 'reset', which is active high. All flops are reset asynchronously. This style reduces area and increases maximum clock frequency on Altera FPGAs, as a synchronous reset adds an extra gate in front of each flop. Since this is a large design that barely fits on some FPGAs, this is important.7 Reset is globally synchronized to clk. If using another clock domain, be careful to ensure reset is handled properly.4

  • Reset only to constant values (not to other signals) to avoid subtle reset bugs:

    Bad:

        always_ff @(posedge clk, posedge reset)
        begin
            if (reset)
                my_ff <= data_value_in;

    Okay:

        always_ff @(posedge clk, posedge reset)
        begin
            if (reset)
                my_ff <= '0;
    
  • If an always_ff block has a reset, make sure to reset all flops that are assigned anywhere in the block. Otherwise, synthesis tools may use reset as a data enable.4 Use AUTORESET to automatically generate these and reduce bugs.8

  • Only use reset if necessary. If a signal doesn't need to be reset for proper operation (data path signal), put it in a block without reset in the sensitivity list. This improves area and timing:

    If a signal needs reset:

        always_ff @(posedge clk, posedge reset)
        begin
            if (reset)
                control_signal <= '0;
            else
                control_signal <= a && (!b || c);

    If a signal does not need reset:

        always_ff @(posedge clk)
        begin
            data_signal <= some_other_data_signal;

Naming/Formatting

  • Keep the same signal name through all levels of the hierarchy5 (the exception is for generic modules like SRAMs). This makes it easier to find signals across different modules. It also allows implicit port connections with .*, which are less error prone and reduce redundant code.14

    No:

        my_module my_module(
           .foo(bar),
           .bar(baz));
  • Signal and module names use snake_case, lowercase only. Parameters and preprocessor defines use uppercase.

  • Make identifier names succinct, but don't abbreviate unnecessarily. Remove things that don't add information.3

    Bad:

    ins_vld, is_unaligned, load_signal

    Better:

    instruction_valid, unaligned, load
  • Break up complex logic by adding named signals to make it clearer:

    Okay:

    if (is_infinity || (add_result_exponent == 8’hFF && !is_nan))

    Better:

    logic add_overflow;
    
    assign add_overflow = add_result_exponent == 8’hFF && !is_nan;
    if (is_infinity || add_overflow)
  • Use named constants rather than hardcoded, magic values to make code clearer and easier to understand.3

    Okay:

    if (load_address == MEMORY_BASE_ADDRESS)

    No:

    if (load_address == 'x123456)
  • Use one definition per line, and do not separate multiple definitions with commas. This makes the code more readable, and also makes it easier to search the code for definitions.

    No:

    logic[31:0] a, b;
    logic foo,
          bar;

    Okay:

    logic[31:0] a;
    logic[31:0] b;
    logic foo;
    logic bar;
  • Use one module per file, and name the file the same as the module. This makes it easier to find modules when navigating source.

  • Connect ports and parameters by name and not position. Otherwise it is fragile and can break in subtle ways if a module is updated.

    No:

    my_module my_module(foo, bar, baz);

    Yes:

    my_module my_module(
        .foo(foo),
        .bar(bar),
        .baz(baz));
  • Modules tend to have the following order to make them easier to navigate:

    • Type definitions and local parameters
    • Signals (define all signals at the beginning of the module. Although Verilator doesn't complain about missing forward references, many other tools will raise errors).
    • Combinational logic
    • Sequential logic (usually registering the output and not the input)
  • Define global constants as parameters in defines.sv. For constants inside a module, use localparam. When a module sets a preprocessor variable, some tools make this visible in all subsequently compiled modules, which can cause namespace collisions. Using localparam restricts the scope to a single module. Using a constant parameter in defines.sv avoids conflicts when this chip is integrated into a design with unrelated IP, because it is wrapped in a namespace.

  • For non-generic components, make the instance name be the same as the component name. This makes it simpler to find the source module when viewing waveform traces

    Yes:

    writeback_stage writeback_stage(

    No:

    writeback_stage wback(
  • Use verilog-2001 style port definitions, specifying direction and type in one definition. This makes the sources more concise and captures all information about a signal in one place. Group module ports by source/destination and add a comment for each group.15 Prefix signals that go between non-generic components with an abbreviation of the source module. This makes it easier to search sources using automated tools and is easier to understand.

     // From io_request_queue
     input scalar_t                ior_read_value,
     input logic                   ior_rollback_en,
    
     // From control registers
     input scalar_t                cr_creg_read_val,
     input thread_bitmap_t         cr_interrupt_en,
     input scalar_t                cr_fault_handler,
  • Signals often use the following suffixes:

    Suffix Meaning
    _en Use for a signal that enables some operation. Internal enables are active high.
    _oh One-hot. No more than one signal will be set, corresponding to the index
    _idx Signal is an index. Usually used when one-hot signals of the same name are also present
    _t Typedef
    _gen Generated block
    _nxt Combinational logic that generates the next value (input) for a flop. Used to distinguish the input from the output of the flop

    Do not use _i or _o to designate inputs and output ports. This is implied by the rule above that net names must match port names. Nets are neither inputs or outputs, as a single net connects to both.

  • Use AUTOLOGIC to avoid incorrectly inferring single bit nets.13

  • Name all generate blocks (with begin : <label>) so their logic will appear in VCD traces.13

    generate
        for (byte_lane = 0; byte_lane < CACHE_LINE_BYTES; byte_lane++)
        begin : lane_bypass_gen
           ...
        end
    endgenerate

Comments and Documentation

Comments are as important as code. While excessive or complex comments can be signal that code can be simplified, I believe the concept of "self documenting" code is a myth, especially for something as low level as HDL. The comments augment the source by clarifying intent and giving context. The code captures only a small part of a larger abstract mental model.

  • Useful things to comment:

    • Side effects (for example, if this modifies some global state)
    • Preconditions and constraints (what values are legal and which are not)
    • Describe algorithms and heuristics used (link to external documents if these are common, describe in detail if not)
    • Define what a module/block is responsible for and what it is not. This helps keep coupling and cohesion clean in a design, and is something that is not always directly clear from the code.
    • What state are things left in when an error occurs. What responsibilities does the client of this code/logic/module have in this case?
    • For logic, is a signal registered/expected to be registered
    • When doing something unusual, describe why (e.g. this was a timing long path, working around a tool bug)
    • Describe the gist of what a block of statements is doing. This can help think about grouping logic clearly. For example, if the intent of a few statements is to push something on a stack.
  • Eliminate tautologies and comments that add no information. Use comments to provide background on why a piece of code exists.

    No:

    state = 0; // reset state

    Yes:

    // This code path restarts the operation as a side effect, so
    // reset the state machine
    state = 0;
  • Eliminate comments if the code can be clarified with a more descriptive identifier (but keep identifiers concise and avoid trying to cram too much information into them).

    No:

    bytes++; // increment number of bytes read
    

    Yes:

    bytes_read++;
    
  • A constraint or invariant can often be better expressed and testing with an assertion, but usually still requires a comment to indicate why it exists.

    No:

    // only one bit of way_hit_oh should be set
    

    Yes:

    // Make sure data is not present in more than one way.
    assert($onehot0(way_hit_oh));
    
  • Avoid boilerplate comments. While consistency is a good thing to strive for, there is usually not enough in common between things to warrant it, so most of it ends up being blank. It also encourages laziness. Think about what a reader would want or need to know.

  • For short inline comments, use imperative mood, as if you are telling the program what to do. This is more direct and concise. An exception is expository comments at the top of a module, which can have a more conversational tone.

    No:

    // Adds values together and writes back the result

    Yes:

    // Add the values together and write back result
  • Prefer present tense. Consistent tense is easier to understand, and present tense is more concise. It also avoids implying something hasn't been implemented yet (this more applicable to longer comments, as the rule for using imperative voice implies using present tense).

    No:

    // This will handle interrupts

    Yes:

    // This handles interrupts 
  • Prefer singular. When there are direct objects, this avoids ambiguity about the number. If the subject is plural, the object must be plural, which makes it is unclear whether there is a one-to-one or one-to-many relationship.

    No:

    // Arithmetic instructions update destination registers

    Yes:

    // An arithmetic instruction updates its destination register
  • Use third person to describe actions and events. The subject should be the program/hardware. Indicate which subsystem/function is performing the action wherever possible. Use active voice.

    No:

    // We load a value from memory
    // A value is loaded from memory

    Yes:

    // The cache loads a value from memory
  • Eliminate meaningless words and simplify phrases. e.g.

    • Note that/It should be noted -> eliminate
    • It is possible for X to -> X may
    • Actually/Technically -> eliminate (this is true for most adverbs). "If you see an adverb, kill it"21

References

  1. Cummings, Clifford E. "Nonblocking assignments in verilog synthesis, coding styles that kill!." SNUG (Synopsys Users Group) 2000 User Papers (2000).
  2. Turpin, Mike. "The Dangers of Living with an X (bugs hidden in your Verilog)." Synopsys Users Group Meeting. 2003.
  3. McConnell, Steve. Code complete. Pearson Education, 2004
  4. Cummings, Clifford E., Don Mills, and Steve Golson. "Asynchronous & synchronous reset design techniques-part deux." SNUG Boston 9 (2003).
  5. Amitay, Yair, Jamil Khatib, and Damjan Lampret. "OpenCores Coding Guidelines." (2001).
  6. Cummings, Clifford E. "SystemVerilog Logic Specific Processes for Synthesis-Benefits and Proper Usage."
  7. Altera, Quartus II. "Handbook Version 11.1, chapter 11. Recommended HDL Coding styles."
  8. Snyder, Wilson. "Verilog-mode: Reducing the veri-tedium." Synopsys Users Group Conference, San Jose. 2001
  9. Snyder, Wilson. "The Ten Edits I Make Against Most IP (And Wish I Didn’t Have To)" SNUG Boston 9 (2007).
  10. Cummings, Clifford E. "" full_case parallel_case", the Evil Twins of Verilog Synthesis." Proc Synopsys Users Group (SNUG) (1999).
  11. Sutherland, Stuart, and Don Mills. "Synthesizing SystemVerilog Busting the Myth that SystemVerilog is only for Verification." SNUG Silicon Valley (2013).
  12. Sutherland, Stuart. "I’m Still in Love with My X!." Design and Verification Conference (DVCon). 2013.
  13. Sutherland, Stuart, and Don Mills. "Standard gotchas subtleties in the verilog and systemverilog standards that every engineer should know." Synopsys User Group Conference Proc.(SNUG 2006/7). 2006.
  14. Cummings, Clifford E. "SystemVerilog implicit port enhancements accelerate system design & verification." Proceedings of the 45th annual Design Automation Conference. ACM, 2008.
  15. NetFPGA Coding Guidelines
  16. Mills, Don, and Clifford E. Cummings. "RTL Coding Styles That Yield Simulation and Synthesis Mismatches." SNUG (Synopsys Users Group) 1999 Proceedings. 1999.
  17. Quartus II "Handbook Volume 1: Design and Synthesis." Altera Corporation (2014).
  18. Arora, Mohit. The art of hardware architecture: Design methods and techniques for digital circuits. Springer Science & Business Media, 2011.
  19. Sutherland HDL. "Getting Started with SystemVerilog Assertions." DesignCon 2006
  20. Sutherland, S., & Mills, D. (2014). Can my synthesis compiler do that? What ASIC and FPGA synthesis compilers support in the systemverilog-2012 standard. DVCon-2014, San Jose. Chicago
  21. Mark Twain
Clone this wiki locally