[Intel-wired-lan] [PATCH 2/3] e1000e: ignore status during auto-negotiation

Neftin, Sasha sasha.neftin at intel.com
Mon Jan 7 15:49:39 UTC 2019


On 1/7/2019 16:15, Jan-Marek Glogowski wrote:
> 
> 
> On 07.01.19 at 10:00, Jan-Marek Glogowski wrote:
>>
>>
>> On 07.01.19 at 07:32, Neftin, Sasha wrote:
>>> On 1/6/2019 21:53, Jan-Marek Glogowski wrote:
>>>> On 6 January 2019 at 16:28:42 CET, "Neftin, Sasha" <sasha.neftin at intel.com> wrote:
>>>>> On 1/4/2019 15:31, Jan-Marek Glogowski wrote:
>>>>>> My problem is the fallback of the hardware to 10 Mbps after a
>>>>>> re-connect, which happens almost every time. In the broken case
>>>>>> the status field always has the 0x40000000 bit set.
>>>>>>
>>>>>> Still, the name of the status flag is just a guess. Ignoring
>>>>>> the status when this bit is set solves my problem. But I have
>>>>>> only one notebook (I219-LM, rev 21) that exhibits the
>>>>>> problem. It doesn't happen on my other notebook with I219-V
>>>>>> (rev 21) hardware (or it's just much less likely).
>>>>>>
>>>>>> Signed-off-by: Jan-Marek Glogowski <glogow at fbihome.de>
>>>>>> ---
>>>>>>     drivers/net/ethernet/intel/e1000e/defines.h | 1 +
>>>>>>     drivers/net/ethernet/intel/e1000e/ich8lan.c | 3 ++-
>>>>>>     drivers/net/ethernet/intel/e1000e/mac.c     | 2 ++
>>>>>>     3 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/defines.h
>>>>> b/drivers/net/ethernet/intel/e1000e/defines.h
>>>>>> index fd550de..3cd9f99 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/defines.h
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/defines.h
>>>>>> @@ -221,6 +221,7 @@
>>>>>>     #define E1000_STATUS_LAN_INIT_DONE 0x00000200   /* Lan Init
>>>>> Completion by NVM */
>>>>>>     #define E1000_STATUS_PHYRA      0x00000400      /* PHY Reset
>>>>> Asserted */
>>>>>>     #define E1000_STATUS_GIO_MASTER_ENABLE    0x00080000    /* Master Req
>>>>> status */
>>>>>> +#define E1000_STATUS_AUTONEG    0x40000000      /* in
>>>>> auto-negotiation */
>>>>>>     
>>>>> There is no such indication. Should be removed.
>>>>>>     #define HALF_DUPLEX 1
>>>>>>     #define FULL_DUPLEX 2
>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>> b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>> index fd59970..8588eb7 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>> @@ -1390,7 +1390,8 @@ static s32
>>>>> e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
>>>>>>             u16 speed;
>>>>>>             u8 duplex;
>>>>>>     -        e1000e_get_speed_and_duplex_copper(hw, &speed, &duplex);
>>>>>> +        if (e1000e_get_speed_and_duplex_copper(hw, &speed, &duplex))
>>>>>> +            goto out;
>>>>>>             tipg_reg = er32(TIPG);
>>>>>>             tipg_reg &= ~E1000_TIPG_IPGT_MASK;
>>>>>>     diff --git a/drivers/net/ethernet/intel/e1000e/mac.c
>>>>> b/drivers/net/ethernet/intel/e1000e/mac.c
>>>>>> index 19c816c..ada8fbb 100644
>>>>>> --- a/drivers/net/ethernet/intel/e1000e/mac.c
>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/mac.c
>>>>>> @@ -1310,6 +1310,8 @@ s32 e1000e_get_speed_and_duplex_copper(struct
>>>>> e1000_hw *hw, u16 *speed,
>>>>>>            status = er32(STATUS);
>>>>>>     +    if (status & E1000_STATUS_AUTONEG)
>>>>>> +        return 1;
>>>>> This is wrong. We have no AUTONEG indication in bit 30 of the E1000_STATUS
>>>>> (0x0008) register. This piece of code should be removed.
>>>>>>         if (!(status & E1000_STATUS_LU))
>>>>>>             return 1;
>>>>>>    
>>>>> Hello Jan-Marek,
>>>>> It is okay to use a u8 for the duplex indication and a u16 for the
>>>>> link indication, as you do in the previous patch.
>>>>> But using the 'autoneg status' is wrong.
>>>>
>>>> Just as a reminder: I have no idea what this bit actually indicates. This is just a guess I had
>>>> when looking into the problem. I don't know if the device was still negotiating at this point, but
>>>> this bit was set in the status register.
>>>>
>>>>> I wonder how this can solve the problem. Have you
>>>>> encountered this problem on other platforms with our devices? (I mean different, not similar,
>>>>> HW)
>>>>
>>>> Other platforms such as Windows? I'm just doing Linux development, but I'll ask the Windows people and
>>>> can check whether this problem also happens there.
>>>>
>>>> I don't see this problem with older HW (Fujitsu E7x6, also Skylake based, but I219-V). It happens
>>>> with both of my U7x7 test notebooks. I have some older Haswell based HW (E7x4), which I didn't yet
>>>> test. Google tells me they have "Intel 82579LM Gigabit" ethernet.
>>>>
>>>> All of these three series are in use and we have a few hundred or even thousand of them. This
>>>> problem was found during the tests for our next Ubuntu 18.04 based release. This just seems to
>>>> happen with the "new" U-series. I'm not aware of any problems like this with the older E-series HW.
>>>> And it probably just happens more often now for whatever reason.
>>>>
>>>>> Anyway, the 0x40000000 indication is not related to auto-negotiation.
>>>>> May I ask you to repeat your experiments with ME disabled (via BIOS) and see whether
>>>>> the same problem still happens?
>>>>
>>>> Disabling ME shouldn't be a problem to test.
>>>>
>>> You mentioned that there is no problem on I219-V. The main difference between I219-LM and
>>> I219-V is the 'Intel Standard Manageability' feature. So I suggest disabling ME and re-checking.
>>>> I'll continue testing all the HW tomorrow, with both our releases, and report back. And maybe
>>>> there is an easier way to trigger the problem than re-plugging the cable all the time (maybe
>>>> better to get a switch and power cycle that...).
>>>>
>>>> Please tell me if there is anything else I should look for or test.
>>> The next step would most likely be to dump registers and try to access the
>>> PHY. But let's check with ME disabled as the first step.
>>
>> According to the BIOS ME is actually disabled.
>> Nevertheless I selected "UnConfigure ME", which didn't change anything in the BIOS (ME
>> v11.8.50.3425 FWIW). I did look for vendor BIOS updates, as you think this problem might be ME
>> related. There is an update available.
> 
> So I did the BIOS update - no changes regarding the network auto-negotiation behavior.
> 
> I also tried both of my E-Series. The old Haswell series (E7x4) also has a disabled ME and as
> suspected the following HW:
> 
> 00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 04)
>          Subsystem: Fujitsu Limited. Ethernet Connection I217-LM
>          Flags: bus master, fast devsel, latency 0, IRQ 27
>          Memory at f0500000 (32-bit, non-prefetchable) [size=128K]
>          Memory at f053f000 (32-bit, non-prefetchable) [size=4K]
>          I/O ports at 3080 [size=32]
>          Capabilities: [c8] Power Management version 2
>          Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>          Capabilities: [e0] PCI Advanced Features
>          Kernel driver in use: e1000e
>          Kernel modules: e1000e
> 
> I tried the patched module on both E-series HW and they always have the 0x40000000 bit set when
> decoding the speed from the status register (always 0x40080083), either with or without the ME
> available. So my patch breaks my older HW, as you probably suspected. I removed the 0x40000000 test
> from the module, and they always negotiated 1000 Mbps just fine.
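For reference, the reported value 0x40080083 can be decoded against the STATUS masks the driver already defines. The following is only a minimal sketch, not driver code: the mask values are assumed to match e1000e's defines.h, and the helper names status_speed_mbps(), status_full_duplex() and status_unlisted_bits() are made up for illustration.

```c
#include <stdint.h>

/* STATUS masks, assumed to match drivers/net/ethernet/intel/e1000e/defines.h. */
#define E1000_STATUS_FD                0x00000001u /* full duplex */
#define E1000_STATUS_LU                0x00000002u /* link up */
#define E1000_STATUS_SPEED_MASK        0x000000C0u
#define E1000_STATUS_SPEED_100         0x00000040u
#define E1000_STATUS_SPEED_1000        0x00000080u
#define E1000_STATUS_GIO_MASTER_ENABLE 0x00080000u

/* Hypothetical helper: speed in Mbps, or 0 when the link-up bit is clear. */
unsigned int status_speed_mbps(uint32_t status)
{
    if (!(status & E1000_STATUS_LU))
        return 0;
    if (status & E1000_STATUS_SPEED_1000)
        return 1000;
    if (status & E1000_STATUS_SPEED_100)
        return 100;
    return 10;
}

/* Hypothetical helper: 1 for full duplex, 0 for half duplex. */
int status_full_duplex(uint32_t status)
{
    return (status & E1000_STATUS_FD) ? 1 : 0;
}

/* Hypothetical helper: bits not covered by the masks listed above. */
uint32_t status_unlisted_bits(uint32_t status)
{
    const uint32_t listed = E1000_STATUS_FD | E1000_STATUS_LU |
                            E1000_STATUS_SPEED_MASK |
                            E1000_STATUS_GIO_MASTER_ENABLE;

    return status & ~listed;
}
```

With these masks, 0x40080083 reads as link up, full duplex, 1000 Mbps and GIO master enable set, with bit 30 (0x40000000) as the only bit left over. That matches the E-series notebooks negotiating 1000 Mbps while the bit is set.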
> 
> I've attached logs for all three notebooks with my patched module (without the  0x40000000 test) and
> a debug filter for all files of the module (echo "file */e1000e-20/* +p;" >
> /sys/kernel/debug/dynamic_debug/control).
> 
> My test consisted of rmmod'ing, sleep 1, insmod'ing, set debug filter + two reconnects.
> 
> So I'm basically back to square one.
> 
> How to proceed?
> 
ME disabled - good. How long do you wait for 1000 Mbps after
reconnecting the cable? Could you please wait 5-10 s and see whether the link
goes back to 1000 Mbps?
Unfortunately, we have no such HW in our labs. I will ask whether our PAE
can help with more debugging if needed.
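The wait could be timed from user space instead of by hand. Below is a rough sketch under stated assumptions: it reads the standard sysfs attribute /sys/class/net/<ifname>/speed once per second, the function names are invented for this example, and the interface name passed in (for instance "eth0") is a placeholder for whatever the e1000e device is called on the test machine.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Parse the text of a sysfs "speed" attribute; -1 if it is not a positive number. */
int parse_speed_mbps(const char *text)
{
    char *end;
    long v = strtol(text, &end, 10);

    if (end == text || v <= 0 || v > 1000000)
        return -1;
    return (int)v;
}

/* Read /sys/class/net/<ifname>/speed; -1 if unreadable, e.g. while the link is down. */
int read_link_speed(const char *ifname)
{
    char path[128], buf[32];
    int speed = -1;
    FILE *f;

    snprintf(path, sizeof(path), "/sys/class/net/%s/speed", ifname);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fgets(buf, sizeof(buf), f))
        speed = parse_speed_mbps(buf);
    fclose(f);
    return speed;
}

/*
 * Poll once per second for up to timeout_s seconds.
 * Returns the number of seconds waited until want_mbps was seen, or -1 on timeout.
 */
int wait_for_speed(const char *ifname, int want_mbps, int timeout_s)
{
    for (int t = 0; t <= timeout_s; t++) {
        if (read_link_speed(ifname) == want_mbps)
            return t;
        if (t < timeout_s)
            sleep(1);
    }
    return -1;
}
```

Calling wait_for_speed("eth0", 1000, 10) right after re-plugging the cable would show whether the link reaches 1000 Mbps within the 10 s window, and after how many seconds.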
> JMG
> 
Sasha

