[Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?

Fujinaka, Todd todd.fujinaka at intel.com
Mon Dec 21 15:16:47 UTC 2020


I would listen to you on Linus' list, but this is Intel-wired-lan.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujinaka at intel.com

-----Original Message-----
From: Paul Menzel <pmenzel at molgen.mpg.de> 
Sent: Monday, December 21, 2020 7:10 AM
To: Fujinaka, Todd <todd.fujinaka at intel.com>; Ben Greear <greearb at candelatech.com>
Cc: intel-wired-lan at lists.osuosl.org; Greg KH <gregkh at linuxfoundation.org>; Linus Torvalds <torvalds at linux-foundation.org>; Brandeburg, Jesse <jesse.brandeburg at intel.com>; Nguyen, Anthony L <anthony.l.nguyen at intel.com>
Subject: Re: [Intel-wired-lan] ixgbe: 5.10.0 kernel regression for 2.5Gbps link negotiation?

Dear Todd,


I kindly ask you again: please do not top-post. It’s impolite and, more importantly, it wastes the readers’ time, as it loses context and results in misunderstandings.

Am 19.12.20 um 17:19 schrieb Fujinaka, Todd:
> This is a bad case with no ideal solution. Detecting the case is not 
> possible as autonegotiation happens in the hardware without software 
> involvement.
> 
> One solution was to update the firmware of the switch that is the 
> link partner that gives us the most trouble. The issue 
> appears to be in competing or half-implemented standards. 2.5G and 5G 
> were initially non-IEEE standards that different manufacturers hacked 
> onto 1G in different ways. We implemented it to one of the standards 
> which should be interoperable, but the corner case of the 
> widely-deployed switch will take the link from 10G to 1G with no 
> automated way to fix it.

Thank you for the background, which should have been in the commit message.

Can you please tell us the name of the problematic switch, the problematic firmware version, and the version in which this issue is fixed?

> Updating switches means a lot of downtime for a lot of datacenters and 
> the OEMs we deal with would not accept that answer.

Well, then please discuss the problem and possible solutions on the mailing list. Breaking other people's setups is unacceptable. A Linux kernel runtime parameter would be one solution your customers could have used.

> Our solution was to disable 2.5G and 5G by default. This fixes 10G 
> linking at 1G on that switch, but 2.5G and 5G will link at 1G by 
> default. And, as I said, I've had very little contact with people 
> using 2.5G and 5G and I'm the guy on all the mailing lists.

Unfortunately, a lot of users are not on the mailing list.

> I apologize for making your life harder, but it seems like it's just 
> you so far. Paul seems to be arguing with me just for the fun of it.

Please keep the discussion respectful, and do not insult others.

Unfortunately, at work we have now been bitten several times by regressions when updating to the current mainline Linux kernel, causing friction in the team about which Linux kernel to use.

I am missing a statement from you acknowledging that the commit and the whole communication were a failure, and explaining how you will fix the regression. Additionally, an analysis of where the process failed would be nice (why was the commit message incomplete, and why did the testing not spot the issue, given that a Tested-by tag is present?), along with how to improve the process to avoid such a situation in the future.


Kind regards,

Paul

