[Intel-wired-lan] Re: [question] i40e: Question about the i40e_clear_pxe_mode() function
Xuguizhu (Guzhu Xu, Intelligent Computing Business Dept)
xuguizhu at huawei.com
Thu Mar 28 12:34:12 UTC 2019
Today's testing showed that the I40E_GLLAN_RCTL_0 register and the Clear PXE Mode admin command (opcode 0x0110) must be kept consistent. In the i40e driver in the upstream Linux kernel, the first call to i40e_clear_pxe_mode() touches only the I40E_GLLAN_RCTL_0 register. i40e_probe() calls i40e_pf_reset(), which in turn calls i40e_clear_pxe_mode(). At that point the Admin Queue has not been created yet, so i40e_clear_pxe_mode() skips i40e_aq_clear_pxe_mode() but still writes the I40E_GLLAN_RCTL_0 register. i40e_probe() then calls i40e_clear_pxe_mode() a second time, and only this call sends the Clear PXE Mode admin command (opcode 0x0110). This leads to the problem I described in my previous email.
I then adjusted the code as follows. In our testing, this resolves the problem I found earlier:
void i40e_clear_pxe_mode(struct i40e_hw *hw)
{
	u32 reg;

	if (i40e_check_asq_alive(hw)) {
		i40e_aq_clear_pxe_mode(hw, NULL);

		/* Clear single descriptor fetch/write-back mode */
		reg = rd32(hw, I40E_GLLAN_RCTL_0);

		if (hw->revision_id == 0) {
			/* As a work around clear PXE_MODE instead of setting it */
			wr32(hw, I40E_GLLAN_RCTL_0, (reg & (~I40E_GLLAN_RCTL_0_PXE_MODE_MASK)));
		} else {
			wr32(hw, I40E_GLLAN_RCTL_0, (reg | I40E_GLLAN_RCTL_0_PXE_MODE_MASK));
		}
	}
}
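To make the ordering difference easier to see, here is a small user-space sketch that replays the two calls made during probe. It is only an illustration under stated assumptions: struct mock_hw, the mock_* helpers, replay_probe() and the mask value are hypothetical stand-ins, not driver code, and the mock only records the order of events ('R' for a write to the register, 'A' for the admin command).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; none of this is driver code. */
#define MOCK_PXE_MODE_MASK 0x1u	/* illustrative bit, not the real mask value */
#define MOCK_LOG_MAX 16

struct mock_hw {
	bool asq_alive;		/* models the result of i40e_check_asq_alive() */
	uint8_t revision_id;
	uint32_t gllan_rctl_0;	/* models the I40E_GLLAN_RCTL_0 register */
	char log[MOCK_LOG_MAX];	/* 'R' = register write, 'A' = admin command */
	size_t log_len;
};

static void mock_log(struct mock_hw *hw, char ev)
{
	assert(hw->log_len + 1 < MOCK_LOG_MAX);
	hw->log[hw->log_len++] = ev;
	hw->log[hw->log_len] = '\0';
}

/* Models sending the Clear PXE Mode admin command. */
static void mock_aq_clear_pxe_mode(struct mock_hw *hw)
{
	mock_log(hw, 'A');
}

/* Models the read-modify-write of the register. */
static void mock_write_rctl(struct mock_hw *hw)
{
	if (hw->revision_id == 0)
		hw->gllan_rctl_0 &= ~MOCK_PXE_MODE_MASK;
	else
		hw->gllan_rctl_0 |= MOCK_PXE_MODE_MASK;
	mock_log(hw, 'R');
}

/* Ordering of the in-kernel version: the register write is unconditional. */
static void clear_pxe_mode_upstream(struct mock_hw *hw)
{
	if (hw->asq_alive)
		mock_aq_clear_pxe_mode(hw);
	mock_write_rctl(hw);
}

/* Ordering of the adjusted version: both steps need a live Admin Queue. */
static void clear_pxe_mode_adjusted(struct mock_hw *hw)
{
	if (hw->asq_alive) {
		mock_aq_clear_pxe_mode(hw);
		mock_write_rctl(hw);
	}
}

/* Replays the two calls made during probe and returns the event log. */
static const char *replay_probe(struct mock_hw *hw,
				void (*clear_pxe)(struct mock_hw *))
{
	memset(hw, 0, sizeof(*hw));
	hw->revision_id = 1;

	hw->asq_alive = false;	/* first call, via i40e_pf_reset() */
	clear_pxe(hw);

	hw->asq_alive = true;	/* second call, directly from i40e_probe() */
	clear_pxe(hw);

	return hw->log;
}
```

Calling replay_probe(&hw, clear_pxe_mode_upstream) yields the log "RAR", meaning the register is written before any admin command has been sent. With clear_pxe_mode_adjusted the log is "AR", so the register write always follows the admin command.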
-----Original Message-----
From: Xuguizhu (Guzhu Xu, Intelligent Computing Business Dept)
Sent: March 27, 2019 18:59
To: Intel Wired LAN <intel-wired-lan at lists.osuosl.org>
Cc: Zoujingzhou (Eitan, Intelligent Computing R&D) <zoujingzhou.zoujingzhou at huawei.com>; tangkun <tangkun1 at huawei.com>; chenliyong (A) <chenliyong1 at huawei.com>
Subject: [Intel-wired-lan][question] i40e: Question about the i40e_clear_pxe_mode() function
While testing the NCSI function of the X722 NIC, we booted Ubuntu 18.04 in legacy mode and forced the system into a panic by injecting a fault with the "echo c > /proc/sysrq-trigger" command. With the i40e driver shipped in Ubuntu, NCSI is broken after the fault injection. The problem still exists after upgrading the kernel to 5.1-rc2. With the driver released on sourceforge.net, the same test leaves the NCSI function working normally.
After comparing the two drivers, we found that this problem may be related to the i40e_clear_pxe_mode() function. The code in the i40e driver released in the kernel community is as follows:
void i40e_clear_pxe_mode(struct i40e_hw *hw)
{
	u32 reg;

	if (i40e_check_asq_alive(hw))
		i40e_aq_clear_pxe_mode(hw, NULL);

	/* Clear single descriptor fetch/write-back mode */
	reg = rd32(hw, I40E_GLLAN_RCTL_0);

	if (hw->revision_id == 0) {
		/* As a work around clear PXE_MODE instead of setting it */
		wr32(hw, I40E_GLLAN_RCTL_0, (reg & (~I40E_GLLAN_RCTL_0_PXE_MODE_MASK)));
	} else {
		wr32(hw, I40E_GLLAN_RCTL_0, (reg | I40E_GLLAN_RCTL_0_PXE_MODE_MASK));
	}
}
The i40e_clear_pxe_mode() function in the driver released on sourceforge.net is as follows:
void i40e_clear_pxe_mode(struct i40e_hw *hw)
{
	if (i40e_check_asq_alive(hw))
		i40e_aq_clear_pxe_mode(hw, NULL);
}
We then took the driver released on sourceforge.net and changed only its i40e_clear_pxe_mode() function. With that single change, testing still reproduces the NCSI breakage.
Our questions are: why was this code added, and how can we solve the problem we are now facing?