Dual-rank DIMMs effectively double the memory density of a module, enabling DDR3 DIMMs with larger capacities. However, this potential can only be unlocked by a controller designed to support dual-rank operation.
Now, imagine plugging an 8GB dual-rank DDR3 DIMM into your system, only to find that you can access only 4GB of it. Frustrating, right? This happens when the memory controller supports only single-rank DIMMs, leaving half of the capacity unused. But, of course, this is not a problem for UberDDR3! With UberDDR3 now supporting dual-rank DIMMs, you can fully unlock the potential of your high-capacity dual-rank memory modules.
Table of Contents:
I. Dual rank vs Single rank
II. UberDDR3 Support for Dual-Rank
III. Design Implementation
III.I Bank Monitoring for Dual-Rank
III.II Initialization Sequence for Dual-Rank
IV. Simulation Testbench
IV.I Initialization Sequence Analysis
IV.II Normal Operation Analysis
V. Project Demonstration
VI. Conclusion
I. Dual rank vs Single rank
In DDR3 memory modules, a rank (shown below) refers to a group of memory chips that share the same set of address, control, and data lines and can all be enabled simultaneously by a chip-select signal.
A single-rank module consists of one 64-bit bus group, typically made up of either eight x8 memory chips or sixteen x4 memory chips. When data is accessed, all the memory chips in the rank are activated simultaneously, making it straightforward in design but limited in capacity.
A dual-rank module, on the other hand, comprises two 64-bit ranks, each consisting of eight x8 memory chips. These ranks are usually located on opposite sides of the DIMM card. Each rank can be independently addressed via its chip-select line, allowing one rank to remain active while the other is idle. This architecture effectively doubles the memory capacity.
Below is a nice illustration from Memory4less:
To determine the number of ranks in your DDR3 DIMM, check the sticker on the front. For example, as shown below, "8GB 2Rx8" indicates an 8GB capacity, "2R" means dual rank, and "x8" signifies that the memory chips are x8:
A 64-bit bus requires eight x8 chips. Since this is a dual-rank module, there are two sets of eight x8 chips—one set on the front and another on the back of the DIMM, matching what we can see above!
II. UberDDR3 Support for Dual-Rank
The addition of dual-rank support in UberDDR3 unlocks the full capacity of dual-rank DIMMs. Without this feature, only one rank would be accessible, effectively halving the available memory capacity.
Now, after much anticipation, UberDDR3 fully supports dual-rank DIMMs! Enabling this feature is as simple as setting the top-level parameter DUAL_RANK_DIMM to 1. With this, UberDDR3 can access both ranks, maximizing the usable memory capacity.
It's important to note that this feature is relevant only for 64-bit DDR3 DIMMs. For the smaller x16 DDR3 chips commonly found on low-end FPGA boards, the dual-rank parameter should be left disabled.
To give an example, take an 8GB dual-rank DDR3 DIMM: with DUAL_RANK_DIMM = 0, only 4GB is accessible to UberDDR3, while with DUAL_RANK_DIMM = 1 the full 8GB becomes accessible.
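In RTL, enabling it is a one-line parameter override. Below is a minimal sketch, assuming the top-level module is named ddr3_top as in the UberDDR3 repository; the instance name and omitted ports are placeholders for whatever your design already uses:

// Sketch only: DUAL_RANK_DIMM comes from this post; everything else
// here is an assumption about your particular setup.
ddr3_top #(
    .DUAL_RANK_DIMM(1) // 0 = single-rank (default), 1 = dual-rank DIMM
) u_uberddr3 (
    /* clocks, reset, Wishbone interface, DDR3 pins as usual */
);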
III. Design Implementation
Let’s dive into how dual-rank support is implemented in UberDDR3, starting with a quick overview of DRAM addressing.
In a typical DDR3 module, the address is divided into three main components: row, column, and bank. For example, a common configuration uses 14 bits for the row, 10 bits for the column, and 3 bits for the bank. To perform any read or write operation, the memory controller must first specify the corresponding row, column, and bank addresses.
From UberDDR3's top-level perspective, the main input you need to worry about is the Wishbone address (i_wb_addr). This address, typically 24 bits wide, is mapped to the row, column, and bank fields, as shown below (figure from the Micron DDR3 datasheet here):
Column Address: {i_wb_addr[6:0], 3'b000} – The 7 most significant bits of the column address are sent to the Column Decoder block to select one of 128 columns.
Bank Address: i_wb_addr[9:7] – A 3-bit bank address is passed to the Bank Control Logic block to select one of the eight available banks.
Row Address: i_wb_addr[23:10] – A 14-bit row address is fed to the Row-Address MUX block to select one of 16,384 rows.
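Restated as a short Verilog sketch (only i_wb_addr is an actual UberDDR3 signal; col_addr, bank_addr, and row_addr are illustrative names):

// Hypothetical decode of the 24-bit Wishbone address into DDR3 fields.
wire [9:0]  col_addr  = {i_wb_addr[6:0], 3'b000}; // low 3 bits zero (burst of 8)
wire [2:0]  bank_addr = i_wb_addr[9:7];           // selects 1 of 8 banks
wire [13:0] row_addr  = i_wb_addr[23:10];         // selects 1 of 16,384 rows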
Since dual rank effectively doubles the DRAM capacity, the first step is to add one more bit to the Wishbone address (i_wb_addr). The original 24-bit address becomes 25 bits, with the most significant bit (i_wb_addr[24]) selecting which rank to access.
As far as the user is concerned, adding that one bit to the Wishbone address makes the capacity of both ranks accessible. Now you might think that the modification needed for UberDDR3 to support dual rank is as simple as connecting that last bit of the Wishbone address to the chip-selects, like this:
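A minimal sketch of that naive decode, assuming an internal active-low chip-select cs_n_internal coming from the command logic (a made-up name for illustration; o_ddr3_cs_n is the actual output bus):

// Naive rank decode: route the controller's chip-select to the rank
// picked by the address MSB. cs_n_internal is hypothetical.
assign o_ddr3_cs_n[0] = i_wb_addr[24] ? 1'b1 : cs_n_internal; // rank 0
assign o_ddr3_cs_n[1] = i_wb_addr[24] ? cs_n_internal : 1'b1; // rank 1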
Well, as far as controlling the chip-select line goes, this logic is not wrong. But there is A LOT more to consider when supporting dual ranks. The main reason is that you have to monitor all timing parameters and bank usage of every bank FOR BOTH RANKS.
III.I Bank Monitoring for Dual-Rank
In DDR3 memory, managing bank activity is critical. Each bank state must be tracked, as well as timing delays for commands like precharge, activate, read, and write.
For each of the eight banks, we need to track:
bank_status: Whether the bank is active or idle.
bank_active_row: The currently active row in the bank.
Additionally, timing delays must be monitored:
precharge_counter: Remaining clock cycles before the next precharge command.
activate_counter: Remaining clock cycles before the next activate command.
write_counter: Remaining clock cycles before the next write command.
read_counter: Remaining clock cycles before the next read command.
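A hedged sketch of what this per-bank bookkeeping could look like for a single rank; the counter widths here are guesses and UberDDR3's actual declarations may differ:

// Hypothetical per-bank bookkeeping for one rank (8 banks).
reg        bank_status_q    [0:7]; // 1 = bank active, 0 = idle
reg [13:0] bank_active_row_q[0:7]; // row currently open in each bank
reg [3:0]  precharge_counter[0:7]; // cycles until precharge is allowed
reg [3:0]  activate_counter [0:7]; // cycles until activate is allowed
reg [3:0]  write_counter    [0:7]; // cycles until a write is allowed
reg [3:0]  read_counter     [0:7]; // cycles until a read is allowed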
In dual-rank configurations, each rank operates independently, meaning rank 0's banks could all be active while rank 1's banks remain idle (this can happen if all user accesses target rank 0 and none target rank 1). Because of this, separate monitoring variables are required for each rank, effectively doubling the monitoring requirements.
Now, instead of managing eight banks, we monitor two sets of eight banks, one for each rank:
This increases the bank indexing from 3 bits (0–7) to 4 bits (0–15). The most significant bit of the Wishbone address (i_wb_addr[24]) is also used as the most significant bit of the bank addressing to determine which rank’s banks are accessed.
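In RTL terms, the doubling is just one extra index bit, sketched below with assumed names:

// Doubled arrays: entries 0-7 belong to rank 0, entries 8-15 to rank 1.
reg        bank_status_q[0:15];
wire [3:0] bank_index = {i_wb_addr[24], i_wb_addr[9:7]}; // {rank, bank}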
III.II Initialization Sequence for Dual-Rank
Now that we can monitor all banks for both ranks, UberDDR3 is ready to run in normal operation and be accessed via Wishbone read and write operations. However, take note that before the controller enters normal operation, it must first execute a long calibration sequence. Unfortunately, the initialization process for dual rank differs significantly from that of a single rank.
For a single-rank configuration, the process is straightforward as shown below:
Perform a hard reset.
Execute the DDR3 calibration sequence (as described here in the Micron technical note).
Begin normal operation, allowing the Wishbone interface to access UberDDR3.
For a dual-rank configuration, the process is more complex, since calibration must be performed separately for each rank:
Rank 0 Calibration: Initialize and calibrate rank 0, starting from a hard reset.
Reset Controller: Reset the controller to prepare for rank 1 calibration.
Rank 1 Calibration: Initialize and calibrate rank 1.
Normal Operation: Both ranks are now ready for operation.
A key challenge during dual-rank initialization is ensuring rank 0's refresh requirement (a refresh every 7.8 us) is not violated while waiting for rank 1's very long calibration to end. Instead of adding unnecessary complexity by refreshing rank 0 in parallel with rank 1 calibration, UberDDR3 leverages its self-refresh feature (more about self-refresh in the previous blog post):
Rank 0 enters self-refresh mode before rank 1 calibration starts, so rank 0 refreshes itself internally while waiting for rank 1 calibration to end.
Once rank 1 calibration completes, rank 1 also enters self-refresh.
Both ranks then exit self-refresh simultaneously.
By synchronizing the self-refresh exit of both ranks, the controller needs to track only a single refresh timeout and can issue subsequent refresh commands to both ranks at the same time. Without this synchronization, the refresh timeouts of the two ranks would have to be monitored separately, adding unnecessary complexity to the controller.
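To make the flow concrete, here is a hypothetical sketch of the bring-up sequence as a small state machine; the state and signal names are invented for illustration and do not match UberDDR3's actual encoding:

// Hypothetical sketch of the dual-rank bring-up flow.
module dual_rank_init_sketch (
    input clk,
    input rank0_cal_done, // rank 0 calibration finished
    input rank1_cal_done  // rank 1 calibration finished
);
    localparam CAL_RANK0  = 3'd0, // calibrate rank 0 from hard reset
               SREF_RANK0 = 3'd1, // rank 0 enters self-refresh
               RESET_CTRL = 3'd2, // soft-reset the controller for rank 1
               CAL_RANK1  = 3'd3, // calibrate rank 1
               SREF_RANK1 = 3'd4, // rank 1 also enters self-refresh
               SREF_EXIT  = 3'd5, // both ranks exit self-refresh together
               NORMAL_OP  = 3'd6; // Wishbone access begins
    reg [2:0] state = CAL_RANK0;

    always @(posedge clk) begin
        case (state)
            CAL_RANK0:  if (rank0_cal_done) state <= SREF_RANK0;
            SREF_RANK0: state <= RESET_CTRL;
            RESET_CTRL: state <= CAL_RANK1;
            CAL_RANK1:  if (rank1_cal_done) state <= SREF_RANK1;
            SREF_RANK1: state <= SREF_EXIT;
            SREF_EXIT:  state <= NORMAL_OP;
            NORMAL_OP:  state <= NORMAL_OP;
        endcase
    end
endmodule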
IV. Simulation Testbench
For the testbench simulation, as usual, we use the Micron DDR3 model file. The test closely mirrors previous ones: first, we run UberDDR3 from a hard reset until calibration is complete; afterwards, we perform random read and write operations to various addresses, ensuring that both rank 0 and rank 1 are accessed.
To run the testbench, refer to the blog post Getting Started with UberDDR3 (Part 1). Ensure the following configurations are set in ddr3_dimm_micron_sim.sv:
Uncomment the define EIGHT_LANES_x8.
Set the DUAL_RANK_DIMM local parameter to 1.
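In ddr3_dimm_micron_sim.sv, those two settings look roughly like this (only the define and parameter names come from the steps above; the surrounding context is paraphrased):

// Inside ddr3_dimm_micron_sim.sv (sketch):
`define EIGHT_LANES_x8          // uncomment: eight x8 lanes = 64-bit bus
localparam DUAL_RANK_DIMM = 1;  // simulate a dual-rank DIMM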
IV.I Initialization Sequence Analysis
During initialization, UberDDR3 transitions through the calibration process. The following key signals are observed:
current_rank: Indicates the rank currently being calibrated. As shown above, the calibration starts with rank 0 and eventually switches to rank 1.
sync_rst_controller: This signal resets the controller. It is asserted after rank 0 calibration is complete, switching current_rank to rank 1.
final_calibration_done: Indicates that the initialization sequence is complete. This is asserted only after the calibration for both rank 0 and rank 1 is finished.
o_ddr3_cs_n[1]: The active-low chip-select signal for rank 1. This is asserted (low) during rank 1 calibration.
o_ddr3_cs_n[0]: The active-low chip-select signal for rank 0. This is asserted (low) during rank 0 calibration.
o_ddr3_cke[1]: The clock-enable signal for rank 1. It is asserted (high) during rank 1 calibration.
o_ddr3_cke[0]: The clock-enable signal for rank 0. It is asserted (high) during rank 0 calibration but switches low during rank 1 calibration, indicating that rank 0 is in self-refresh mode.
IV.II Normal Operation Analysis
Once calibration is complete, UberDDR3 enters normal operation. The following key observations are made during random read and write requests sent to the Wishbone interface:
i_wb_addr: The most significant bit (i_wb_addr[26]) determines which rank is accessed. (This testbench configuration uses a wider Wishbone address than the 25-bit example earlier, so the rank bit lands on bit 26 rather than bit 24.)
When i_wb_addr[26] is low, the controller accesses rank 0.
When i_wb_addr[26] is high, the controller accesses rank 1.
bank_status_q: Indicates which bank is active.
For rank 0 (when i_wb_addr[26] is low), active banks such as bank_status_q[0] and bank_status_q[4] are observed changing states.
For rank 1 (when i_wb_addr[26] is high), active banks such as bank_status_q[8] and bank_status_q[12] are observed changing states.
o_ddr3_cs_n: The active-low chip-select signal corresponds to the rank being accessed.
For rank 0, o_ddr3_cs_n[0] is asserted low.
For rank 1, o_ddr3_cs_n[1] is asserted low.
V. Project Demonstration
In the previous section, we validated that UberDDR3's dual-rank support is working through testbench simulation. However, the ultimate proof of functionality is seeing this new feature operate successfully on real hardware!
For this demonstration, I’m using the AX7325B FPGA board from ALINX, which includes a SODIMM slot supporting up to 8GB DDR3 memory:
For the memory, I’m using a Hynix 8GB 2Rx8 DDR3 SODIMM module:
To showcase dual-rank functionality, I’ll use the example demo for UberDDR3, as detailed in the previous blog post, Getting Started with UberDDR3 (Part 2). This design integrates UberDDR3 with a simple UART interface:
When the UART receives lowercase letters, it writes these characters to DDR3 memory.
When the UART receives uppercase letters, it reads back the lowercase letters previously stored in DDR3 at the corresponding addresses.
For example:
Sending "abcdefg" through the UART terminal stores these characters in DDR3.
Sending "ABCDEFG" retrieves the stored lowercase equivalents, outputting "abcdefg".
To ensure proper initialization, I added debug logic to tap into critical signals. The snapshot below, captured using the Integrated Logic Analyzer (ILA), shows the signals when current_rank switches from rank 0 to rank 1 during calibration:
current_rank: Begins at rank 0 (low) for calibration, then switches to rank 1 (high).
reset_after_rank_1: Resets the controller after rank 0 calibration is complete. When this signal asserts, current_rank switches to rank 1.
user_self_refresh_q: Asserts after rank 0 calibration, just before the controller resets. This prepares rank 0 for self-refresh mode before rank 1 calibration begins.
lane: Indicates the lane (0 to 7) currently being calibrated. It starts from lane 0 and increments to lane 7. Upon switching to rank 1 calibration, it resets back to lane 0.
state_calibrate: Represents the calibration state. It increments sequentially from state 0 to state 23. After the controller resets, state_calibrate restarts at state 0 for rank 1 calibration.
After confirming the initialization sequence, I tested the UART interface to ensure normal DDR3 operations. The test principle is simple: if DDR3 can handle both read and write operations correctly, then UberDDR3's dual-rank calibration and functionality are verified.
As shown below, sending lowercase letters "a-f" followed by uppercase letters "A-F" results in the correct retrieval of lowercase letters "a-f". This demonstrates successful operation, exactly as expected!
And that’s it! With this demonstration, we’ve verified that UberDDR3 works seamlessly with dual-rank support enabled. While this is a basic test, it provides a strong foundation for future, more rigorous testing.
VI. Conclusion
In conclusion, adding dual-rank support to UberDDR3 unlocks the full potential of dual-rank DDR3 DIMMs. With just a simple parameter change—DUAL_RANK_DIMM—you can take advantage of this feature. Behind the scenes, though, it involves careful design updates and improved monitoring for both ranks to ensure everything runs smoothly. From address handling and bank tracking to calibration and self-refresh, UberDDR3 takes care of the heavy lifting, making it a solid choice for memory-intensive applications.
That wraps up this post. Catch you in the next blog post!