feat(igc): add single-queue polling datapath#683

Open
ytakano wants to merge 2 commits into tier4:main from ytakano:igc_single_queue_polling
Conversation

@ytakano
Collaborator

@ytakano ytakano commented Apr 22, 2026

Description

This PR adds the first usable single-queue polling datapath for the Intel I225/I226 driver.

It enables the PCIe bus-mastering required for DMA, sets up single-queue TX/RX descriptor rings, and wires the minimal polling path through send(), recv(), can_send(), and capabilities(). It also adds TX reclamation based on TDH/TDT and RX consume/refill handling for the single-queue path.

This change is intentionally limited to the minimal end-to-end datapath. It does not add shell debug commands, NetDevice::debug_dump, IRQ RX buffering, multi-queue support, or checksum offload.
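For readers unfamiliar with the TDH/TDT scheme, the reclamation step amounts to circular-ring arithmetic against the hardware head index. The following is a minimal sketch of that idea; function and parameter names are illustrative, not the PR's actual identifiers.

```rust
// Sketch of TDH-based ring arithmetic, assuming a circular descriptor
// ring. Descriptors in [next_to_clean, tdh) have been fetched by the
// hardware and can be reclaimed by software.
fn completed_descriptors(tdh: usize, next_to_clean: usize, ring_len: usize) -> usize {
    (tdh + ring_len - next_to_clean) % ring_len
}

fn main() {
    // No wraparound: head advanced from 3 to 7 in an 8-entry ring.
    assert_eq!(completed_descriptors(7, 3, 8), 4);
    // Wraparound: head passed the end of the ring.
    assert_eq!(completed_descriptors(1, 6, 8), 3);
}
```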

Related links

How was this PR tested?

  • make clippy
  • make x86_64 RELEASE=1
  • make check_x86_64

Real-machine smoke test:

  • Connected over /dev/ttyUSB0 at 38400
  • Ran (reboot) from the Awkernel serial shell
  • Confirmed the target rebooted and returned to the shell
  • Ran (ifconfig) and confirmed interface enumeration, including the igc-backed interface

Notes for reviewers

  • Review intent is narrow: minimal polling datapath only.
  • Files changed are limited to:
    • awkernel_drivers/src/pcie.rs
    • awkernel_drivers/src/pcie/intel/igc.rs
    • awkernel_drivers/src/pcie/intel/igc/igc_defines.rs
  • The main runtime-sensitive pieces are:
    • PCIe endpoint/bridge bus-master enablement
    • single-queue TX descriptor submission/reclaim
    • RX descriptor consume/refill
  • Observability, IRQ buffering, multi-queue, and checksum offload are intentionally deferred to later PRs.

@ytakano ytakano requested a review from Copilot April 22, 2026 06:07
@ytakano ytakano marked this pull request as ready for review April 22, 2026 06:11
@ytakano ytakano requested review from atsushi421 and nokosaaan April 22, 2026 06:11

Copilot AI left a comment


Pull request overview

Adds an initial single-queue, polling-based TX/RX datapath for the Intel I225/I226 (igc) PCIe NIC, including DMA/bus-master enablement and minimal NetDevice wiring.

Changes:

  • Enable PCIe bus-mastering (device + bridge) needed for DMA.
  • Implement single-queue polling TX submission/reclaim and RX consume/refill, wired through send(), recv(), can_send(), and capabilities().
  • Add a few igc descriptor constants required for advanced TX descriptors.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

  • awkernel_drivers/src/pcie.rs: Enables BUS_MASTER (and forwarding bits on bridges) so endpoints can DMA through bridges.
  • awkernel_drivers/src/pcie/intel/igc.rs: Implements the single-queue polling TX/RX datapath and integrates with the NetDevice trait.
  • awkernel_drivers/src/pcie/intel/igc/igc_defines.rs: Adds constants for advanced TX descriptor encoding (DTYP + PAYLEN shift).


Comment thread awkernel_drivers/src/pcie/intel/igc.rs
Comment thread awkernel_drivers/src/pcie/intel/igc.rs Outdated
Comment thread awkernel_drivers/src/pcie/intel/igc.rs Outdated

fn igc_txeof(&self, que_id: usize, tx: &mut Tx) -> Result<(), IgcDriverErr> {
membar_sync();
let reg_tdh = read_reg(&self.info, igc_regs::IGC_TDH(que_id))? as usize;
Contributor

igc_txeof reads IGC_TDH to determine which descriptors have been processed, but Intel's SDM advises against reading the TX head register as it may return stale values during DMA. Since IGC_TXD_CMD_RS is already set on each descriptor, consider using the DD bit in each descriptor's write-back status for reclamation instead, which is the approach used by the Linux kernel igc driver.
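The DD-bit reclamation suggested here could look like the following sketch. Struct and function names are illustrative, and IGC_TXD_STAT_DD uses the conventional bit value, which should be checked against the datasheet; this is not the PR's actual code.

```rust
// Descriptor Done bit, set by hardware in the write-back status once a
// descriptor submitted with the RS (Report Status) bit has been processed.
const IGC_TXD_STAT_DD: u32 = 0x0000_0001;

struct TxDesc {
    wb_status: u32, // write-back status dword, little-endian on the wire
}

// Reclaim completed descriptors starting at `next_to_clean`, stopping at
// the first descriptor the hardware has not yet written back.
fn reclaim(ring: &mut [TxDesc], next_to_clean: usize, in_flight: usize) -> usize {
    let mut cleaned = 0;
    while cleaned < in_flight {
        let i = (next_to_clean + cleaned) % ring.len();
        if u32::from_le(ring[i].wb_status) & IGC_TXD_STAT_DD == 0 {
            break; // hardware is still working on this descriptor
        }
        ring[i].wb_status = 0; // clear status so the slot can be reused
        cleaned += 1;
    }
    cleaned
}

fn main() {
    let mut ring = vec![
        TxDesc { wb_status: 0 },
        TxDesc { wb_status: u32::to_le(IGC_TXD_STAT_DD) },
        TxDesc { wb_status: u32::to_le(IGC_TXD_STAT_DD) },
        TxDesc { wb_status: 0 },
    ];
    // Two of three in-flight descriptors are done; the third stops the loop.
    assert_eq!(reclaim(&mut ring, 1, 3), 2);
}
```

This avoids the MMIO head read entirely: completion state travels in the descriptor memory the hardware already writes back.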

}

if !self.link_info.link_active {
return Ok(());
Contributor

When link_active is false (here) or when the TX ring is full (line 909), igc_send returns Ok(()) without actually transmitting. The caller cannot distinguish a successful send from a silently dropped packet. Consider returning an error in these cases so the caller can decide whether to retry.
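One way to surface these cases could be dedicated error variants, as in the sketch below. The variants are hypothetical, not the driver's actual IgcDriverErr definitions.

```rust
// Illustrative errors for the two silent-drop cases described above.
#[derive(Debug, PartialEq)]
enum SendError {
    LinkDown,
    RingFull,
}

fn igc_send_sketch(link_active: bool, unused_descs: usize) -> Result<(), SendError> {
    if !link_active {
        return Err(SendError::LinkDown); // caller can retry once link is up
    }
    if unused_descs == 0 {
        return Err(SendError::RingFull); // backpressure instead of a silent drop
    }
    Ok(()) // descriptor submission would happen here
}

fn main() {
    assert_eq!(igc_send_sketch(false, 8), Err(SendError::LinkDown));
    assert_eq!(igc_send_sketch(true, 0), Err(SendError::RingFull));
    assert!(igc_send_sketch(true, 8).is_ok());
}
```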


let mut node = MCSNode::new();
let mut tx = inner.queue_info.que[0].tx.lock(&mut node);
inner.igc_txeof(0, &mut tx).is_ok() && tx.igc_desc_unused() > 0
Contributor

can_send() acquires the TX lock, issues a memory barrier, reads IGC_TDH via MMIO, and runs the igc_txeof reclamation loop. In a typical if can_send() { send() } polling pattern, all of this work is duplicated since send() performs the same operations. Other NIC drivers in this codebase (igb, ixgbe, genet) implement can_send() as a lightweight flag check. Consider making this a simple if_flags/link_active check and relying on send() for reclamation.
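The lightweight variant suggested here would be a pure flag check, with field names as illustrative stand-ins for the driver's state:

```rust
// Sketch of a cheap can_send(): no TX lock, no memory barrier, no MMIO
// read, no reclamation loop. Reclamation is left to send().
struct NicState {
    running: bool,     // stand-in for a NetFlags::RUNNING check
    link_active: bool, // stand-in for link_info.link_active
}

fn can_send(state: &NicState) -> bool {
    state.running && state.link_active
}

fn main() {
    assert!(can_send(&NicState { running: true, link_active: true }));
    assert!(!can_send(&NicState { running: true, link_active: false }));
    assert!(!can_send(&NicState { running: false, link_active: true }));
}
```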


const MAX_FRAME_SIZE: u32 = 9234;
const RX_BUFFER_SIZE: usize = 4096 * 3;
const TX_BUFFER_SIZE: usize = 2048;
Contributor

TX_BUFFER_SIZE is 2048 while RX_BUFFER_SIZE is 12288 and MAX_FRAME_SIZE is 9234. The driver advertises VLAN_MTU but any frame over 2048 bytes is rejected by igc_send. This creates an asymmetry where the driver can receive frames it cannot retransmit. Either increase the TX buffer to match the supported frame size, or document this intentional limitation.

Ok(())
}

fn igc_recv(&self, que_id: usize) -> Result<Option<net_device::EtherFrameBuf>, IgcDriverErr> {
Contributor

can_send() checks NetFlags::RUNNING before proceeding, but igc_recv does not. During igc_stop(), there is a window between removing the RUNNING flag and freeing DMA buffers where igc_recv could access hardware that is being reset. Consider adding a RUNNING check here for consistency.
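The suggested guard is a small early return; the flag constant and function shape below are illustrative stand-ins for the driver's NetFlags and EtherFrameBuf types.

```rust
// Hypothetical stand-in for NetFlags::RUNNING.
const RUNNING: u32 = 1 << 0;

fn igc_recv_sketch(if_flags: u32) -> Option<Vec<u8>> {
    // Bail out during igc_stop(), before touching hardware or DMA
    // buffers that may be mid-reset.
    if if_flags & RUNNING == 0 {
        return None;
    }
    Some(Vec::new()) // placeholder for the real consume/refill path
}

fn main() {
    assert!(igc_recv_sketch(0).is_none());
    assert!(igc_recv_sketch(RUNNING).is_some());
}
```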

);
read.olinfo_status = u32::to_le((ether_frame.data.len() as u32) << IGC_ADVTXD_PAYLEN_SHIFT);

tx.next_avail_desc += 1;
Contributor

next_avail_desc is incremented and the descriptor is written before the write_reg(IGC_TDT) call. If write_reg fails, the ring's software state is advanced but the hardware TDT is not updated, leaving a zombie descriptor. Consider advancing the index only after the write_reg succeeds.
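Deferring the software index update could look like the sketch below, where `write_tdt` stands in for the driver's fallible write_reg(IGC_TDT) call; names are illustrative.

```rust
// Ring the doorbell first; commit software state only on success, so a
// failed MMIO write leaves no zombie descriptor behind.
fn submit(next_avail: &mut usize, ring_len: usize,
          write_tdt: impl Fn(usize) -> Result<(), ()>) -> Result<(), ()> {
    let new_index = (*next_avail + 1) % ring_len;
    write_tdt(new_index)?;
    *next_avail = new_index;
    Ok(())
}

fn main() {
    let mut idx = 5usize;
    assert!(submit(&mut idx, 8, |_| Ok(())).is_ok());
    assert_eq!(idx, 6); // advanced on success
    assert!(submit(&mut idx, 8, |_| Err(())).is_err());
    assert_eq!(idx, 6); // unchanged on failure
}
```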
