feat(igc): add single-queue polling datapath #683
Conversation
Pull request overview
Adds an initial single-queue, polling-based TX/RX datapath for the Intel I225/I226 (igc) PCIe NIC, including DMA/bus-master enablement and minimal NetDevice wiring.
Changes:
- Enable PCIe bus-mastering (device + bridge) needed for DMA.
- Implement single-queue polling TX submission/reclaim and RX consume/refill, wired through `send()`, `recv()`, `can_send()`, and `capabilities()`.
- Add a few igc descriptor constants required for advanced TX descriptors.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| awkernel_drivers/src/pcie.rs | Enables BUS_MASTER (and forwarding bits on bridges) so endpoints can DMA through bridges. |
| awkernel_drivers/src/pcie/intel/igc.rs | Implements single-queue polling TX/RX datapath and integrates with the NetDevice trait. |
| awkernel_drivers/src/pcie/intel/igc/igc_defines.rs | Adds constants for advanced TX descriptor encoding (DTYP + PAYLEN shift). |
```rust
fn igc_txeof(&self, que_id: usize, tx: &mut Tx) -> Result<(), IgcDriverErr> {
    membar_sync();
    let reg_tdh = read_reg(&self.info, igc_regs::IGC_TDH(que_id))? as usize;
```
igc_txeof reads IGC_TDH to determine which descriptors have been processed, but Intel's SDM advises against reading the TX head register as it may return stale values during DMA. Since IGC_TXD_CMD_RS is already set on each descriptor, consider using the DD bit in each descriptor's write-back status for reclamation instead, which is the approach used by the Linux kernel igc driver.
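A minimal sketch of DD-bit reclamation under that suggestion. The types and the exact bit value below (`TxRing`, `IGC_TXD_STAT_DD`) are mocked stand-ins for illustration, not the actual driver API; the real loop would read the descriptor write-back status out of DMA memory.

```rust
// Illustrative sketch: reclaim TX descriptors by polling each descriptor's
// write-back DD (Descriptor Done) bit instead of reading IGC_TDH over MMIO.
// Names and the DD bit value are assumptions for this example.
const IGC_TXD_STAT_DD: u32 = 0x1;
const NUM_TX_DESC: usize = 8;

struct TxRing {
    // Simulated write-back status words the NIC would DMA back.
    wb_status: [u32; NUM_TX_DESC],
    next_to_clean: usize,
    free_count: usize,
}

impl TxRing {
    /// Walk forward from `next_to_clean`, reclaiming every descriptor whose
    /// DD bit the hardware has set. No TDH register read is needed.
    fn txeof(&mut self) {
        while self.free_count < NUM_TX_DESC {
            let status = u32::from_le(self.wb_status[self.next_to_clean]);
            if status & IGC_TXD_STAT_DD == 0 {
                break; // hardware has not finished this descriptor yet
            }
            self.wb_status[self.next_to_clean] = 0;
            self.next_to_clean = (self.next_to_clean + 1) % NUM_TX_DESC;
            self.free_count += 1;
        }
    }
}

fn main() {
    let mut ring = TxRing { wb_status: [0; NUM_TX_DESC], next_to_clean: 0, free_count: 5 };
    // Pretend the NIC completed descriptors 0 and 1.
    ring.wb_status[0] = IGC_TXD_STAT_DD.to_le();
    ring.wb_status[1] = IGC_TXD_STAT_DD.to_le();
    ring.txeof();
    assert_eq!(ring.next_to_clean, 2);
    assert_eq!(ring.free_count, 7);
}
```

Because the DD bit is written back per descriptor, reclamation never races a stale head value; the loop simply stops at the first descriptor the hardware has not completed.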
```rust
if !self.link_info.link_active {
    return Ok(());
```
When link_active is false (here) or when the TX ring is full (line 909), igc_send returns Ok(()) without actually transmitting. The caller cannot distinguish a successful send from a silently dropped packet. Consider returning an error in these cases so the caller can decide whether to retry.
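One way to surface both conditions, sketched with made-up error-variant names (`LinkDown`, `RingFull` are not from the driver; the real code would extend `IgcDriverErr` or the NetDevice error type):

```rust
// Illustrative sketch: report "link down" and "ring full" to the caller
// instead of silently returning Ok(()). Variant names are assumptions.
#[derive(Debug, PartialEq)]
enum SendError {
    LinkDown,
    RingFull,
}

struct TxQueue {
    link_active: bool,
    unused_desc: usize,
}

impl TxQueue {
    fn send(&mut self, _frame: &[u8]) -> Result<(), SendError> {
        if !self.link_active {
            return Err(SendError::LinkDown); // caller may retry after link-up
        }
        if self.unused_desc == 0 {
            return Err(SendError::RingFull); // caller may reclaim and retry
        }
        self.unused_desc -= 1; // descriptor consumed; the real driver fills it here
        Ok(())
    }
}

fn main() {
    let mut q = TxQueue { link_active: false, unused_desc: 1 };
    assert_eq!(q.send(b"frame"), Err(SendError::LinkDown));
    q.link_active = true;
    assert_eq!(q.send(b"frame"), Ok(()));
    assert_eq!(q.send(b"frame"), Err(SendError::RingFull));
}
```

With distinct variants the caller can drop on `LinkDown` but queue-and-retry on `RingFull`, which a bare `Ok(())` cannot express.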
```rust
let mut node = MCSNode::new();
let mut tx = inner.queue_info.que[0].tx.lock(&mut node);
inner.igc_txeof(0, &mut tx).is_ok() && tx.igc_desc_unused() > 0
```
can_send() acquires the TX lock, issues a memory barrier, reads IGC_TDH via MMIO, and runs the igc_txeof reclamation loop. In a typical if can_send() { send() } polling pattern, all of this work is duplicated since send() performs the same operations. Other NIC drivers in this codebase (igb, ixgbe, genet) implement can_send() as a lightweight flag check. Consider making this a simple if_flags/link_active check and relying on send() for reclamation.
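The lightweight variant could look roughly like this. The field names here (`running`, `link_active`) are illustrative stand-ins for the driver's `NetFlags::RUNNING` and cached link state:

```rust
// Illustrative sketch: can_send() as a cheap flag check, with no TX lock,
// no MMIO read, and no reclamation loop. Field names are assumptions.
struct Device {
    running: bool,     // stands in for NetFlags::RUNNING
    link_active: bool, // cached link state, updated by the link-check path
}

impl Device {
    fn can_send(&self) -> bool {
        self.running && self.link_active
    }
}

fn main() {
    let down = Device { running: true, link_active: false };
    assert!(!down.can_send());
    let up = Device { running: true, link_active: true };
    assert!(up.can_send());
}
```

Reclamation then happens exactly once per packet, inside `send()`, instead of twice per `if can_send() { send() }` iteration.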
```rust
const MAX_FRAME_SIZE: u32 = 9234;
const RX_BUFFER_SIZE: usize = 4096 * 3;
const TX_BUFFER_SIZE: usize = 2048;
```
TX_BUFFER_SIZE is 2048 while RX_BUFFER_SIZE is 12288 and MAX_FRAME_SIZE is 9234. The driver advertises VLAN_MTU but any frame over 2048 bytes is rejected by igc_send. This creates an asymmetry where the driver can receive frames it cannot retransmit. Either increase the TX buffer to match the supported frame size, or document this intentional limitation.
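If the choice is to match the supported frame size, one possible derivation, assuming the page-rounding style of `RX_BUFFER_SIZE` (the rounding rule itself is my assumption, not the PR's):

```rust
// Illustrative sketch: derive the TX buffer size from the advertised maximum
// frame, rounded up to whole 4 KiB pages like RX_BUFFER_SIZE (4096 * 3).
const MAX_FRAME_SIZE: usize = 9234;
const TX_BUFFER_SIZE: usize = (MAX_FRAME_SIZE + 4095) / 4096 * 4096;

fn main() {
    assert_eq!(TX_BUFFER_SIZE, 12288); // 3 pages, same as RX_BUFFER_SIZE
    assert!(TX_BUFFER_SIZE >= MAX_FRAME_SIZE);
}
```

Deriving one constant from the other keeps TX and RX symmetric even if the maximum frame size changes later.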
```rust
fn igc_recv(&self, que_id: usize) -> Result<Option<net_device::EtherFrameBuf>, IgcDriverErr> {
```
can_send() checks NetFlags::RUNNING before proceeding, but igc_recv does not. During igc_stop(), there is a window between removing the RUNNING flag and freeing DMA buffers where igc_recv could access hardware that is being reset. Consider adding a RUNNING check here for consistency.
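The guard amounts to an early return before touching the ring, sketched here with mocked state (`running`, `pending` are illustrative, not the driver's fields):

```rust
// Illustrative sketch: bail out of the RX path when the device is not
// RUNNING, so a concurrent igc_stop() teardown never races the ring access.
struct Device {
    running: bool,           // stands in for NetFlags::RUNNING
    pending: Vec<Vec<u8>>,   // stands in for received frames in the RX ring
}

impl Device {
    fn recv(&mut self) -> Option<Vec<u8>> {
        if !self.running {
            return None; // device stopping: do not touch hardware or DMA buffers
        }
        self.pending.pop()
    }
}

fn main() {
    let mut dev = Device { running: false, pending: vec![vec![0xAB]] };
    assert!(dev.recv().is_none()); // guarded even though a frame is pending
    dev.running = true;
    assert_eq!(dev.recv(), Some(vec![0xAB]));
}
```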
```rust
read.olinfo_status = u32::to_le((ether_frame.data.len() as u32) << IGC_ADVTXD_PAYLEN_SHIFT);
```
```rust
tx.next_avail_desc += 1;
```
next_avail_desc is incremented and the descriptor is written before the write_reg(IGC_TDT) call. If write_reg fails, the ring's software state is advanced but the hardware TDT is not updated, leaving a zombie descriptor. Consider advancing the index only after the write_reg succeeds.
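A sketch of that ordering, with `write_reg` mocked as a fallible doorbell write (the helper signature and failure injection are assumptions for the example):

```rust
// Illustrative sketch: stage the new tail index locally and commit the ring's
// software state only after the fallible TDT doorbell write succeeds.
#[derive(Debug)]
struct RegError;

struct Tx {
    next_avail_desc: usize,
    num_desc: usize,
}

// Mocked MMIO write; `fail` injects the error path for demonstration.
fn write_reg(fail: bool, _tdt: usize) -> Result<(), RegError> {
    if fail { Err(RegError) } else { Ok(()) }
}

fn submit(tx: &mut Tx, fail: bool) -> Result<(), RegError> {
    // Compute the would-be tail without touching the ring state yet.
    let new_tail = (tx.next_avail_desc + 1) % tx.num_desc;
    write_reg(fail, new_tail)?;      // doorbell first ...
    tx.next_avail_desc = new_tail;   // ... advance software state only on success
    Ok(())
}

fn main() {
    let mut tx = Tx { next_avail_desc: 3, num_desc: 8 };
    assert!(submit(&mut tx, true).is_err());
    assert_eq!(tx.next_avail_desc, 3); // unchanged after a failed doorbell
    submit(&mut tx, false).unwrap();
    assert_eq!(tx.next_avail_desc, 4);
}
```

On failure the software index still matches the hardware tail, so the same descriptor slot is reused on the next attempt rather than left as a zombie.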
Description
This PR adds the first usable single-queue polling datapath for the Intel I225/I226 driver.
It enables the PCIe bus-mastering required for DMA, sets up single-queue TX/RX descriptor rings, and wires the minimal polling path through send(), recv(), can_send(), and capabilities(). It also adds TX reclamation based on TDH/TDT and RX consume/refill handling for the single-queue path.
This change is intentionally limited to the minimal end-to-end datapath. It does not add shell debug commands, NetDevice::debug_dump, IRQ RX buffering, multi-queue support, or checksum offload.
Related links
How was this PR tested?
Real-machine smoke test:
Notes for reviewers