[Intel-wired-lan] [PATCH] e1000e: bump up timeout to wait when ME un-configure ULP mode
Paul Menzel
pmenzel at molgen.mpg.de
Thu Mar 26 18:37:11 UTC 2020
Dear Jeff, dear David,
Could you please comment as maintainers?
Am 26.03.20 um 15:34 schrieb Neftin, Sasha:
> On 3/26/2020 13:41, Paul Menzel wrote:
>> Am 26.03.20 um 12:29 schrieb Kai-Heng Feng:
>>
>>>> On Mar 25, 2020, at 23:49, Paul Menzel <pmenzel at molgen.mpg.de> wrote:
>>
>>>> Am 25.03.20 um 14:58 schrieb Neftin, Sasha:
>>>>> On 3/25/2020 08:43, Aaron Ma wrote:
>>>>
>>>>>> On 3/25/20 2:36 PM, Neftin, Sasha wrote:
>>>>>>> On 3/25/2020 06:17, Kai-Heng Feng wrote:
>>>>
>>>>>>>>> On Mar 24, 2020, at 03:16, Aaron Ma <aaron.ma at canonical.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> ME takes 2+ seconds to un-configure ULP mode done after resume
>>>>>>>>> from s2idle on some ThinkPad laptops.
>>>>>>>>> Without enough wait, reset and re-init will fail with error.
>>>>>>>>
>>>>>>>> Thanks, this patch solves the issue. We can drop the DMI quirk in
>>>>>>>> favor of this patch.
>>>>>>>>
>>>>>>>>> Fixes: f15bb6dde738cc8fa0 ("e1000e: Add support for S0ix")
>>>>>>>>> BugLink: https://bugs.launchpad.net/bugs/1865570
>>>>>>>>> Signed-off-by: Aaron Ma <aaron.ma at canonical.com>
>>>>>>>>> ---
>>>>>>>>> drivers/net/ethernet/intel/e1000e/ich8lan.c | 4 ++--
>>>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>>>>> b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>>>>> index b4135c50e905..147b15a2f8b3 100644
>>>>>>>>> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>>>>> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>>>>>>>> @@ -1240,9 +1240,9 @@ static s32 e1000_disable_ulp_lpt_lp(struct
>>>>>>>>> e1000_hw *hw, bool force)
>>>>>>>>> ew32(H2ME, mac_reg);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> - /* Poll up to 300msec for ME to clear ULP_CFG_DONE. */
>>>>>>>>> + /* Poll up to 2.5sec for ME to clear ULP_CFG_DONE. */
>>>>>>>>> while (er32(FWSM) & E1000_FWSM_ULP_CFG_DONE) {
>>>>>>>>> - if (i++ == 30) {
>>>>>>>>> + if (i++ == 250) {
>>>>>>>>> ret_val = -E1000_ERR_PHY;
>>>>>>>>> goto out;
>>>>>>>>> }
>>>>>>>>
>>>>>>>> The return value was not caught by the caller, so the error ends up
>>>>>>>> unnoticed.
>>>>>>>> Maybe let the caller check the return value of
>>>>>>>> e1000_disable_ulp_lpt_lp()?
>>>>
>>>>>>> I a bit confused. In our previous conversation you told ME not
>>>>>>> running.
>>>>>>> let me shimming in. Increasing delay won't be solve the problem
>>>>>>> and just
>>>>>>> mask it. We need to understand why ME take too much time. What is
>>>>>>> problem with this specific system?
>>>>>>> So, basically no ME system should works for you.
>>>>>>
>>>>>> Some laptops ME work that's why only reproduce issue on some laptops.
>>>>>> In this issue i219 is controlled by ME.
>>>>>
>>>>> Who can explain - why ME required too much time on this system?
>>>>> Probably need work with ME/BIOS vendor and understand it.
>>>>> Delay will just mask the real problem - we need to find root cause.
>>>>>> Quirk is only for 1 model type. But issue is reproduced by more
>>>>>> models.
>>>>>> So it won't work.
>>>>
>>>> (Where is Aaron’s reply? It wasn’t delivered to me yet.)
>>>>
>>>> As this is clearly a regression, please revert the commit for now,
>>>> and then find a way to correctly implement S0ix support. Linux’
>>>> regression policy demands that as no fix has been found since
>>>> v5.5-rc1. Changing Intel ME settings, even if it would work around
>>>> the issue, is not an acceptable solution. Delaying the resume time
>>>> is also not a solution.
>>>
>>> The s0ix patch can reduce power consumption, finally makes S2idle an
>>> acceptable sleep method. So I'd say it's a fix, albeit a regression
>>> was introduced.
>>>
>>>> Regarding Intel Management Engine, only Intel knows what it does
>>>> and what the error is, as the ME firmware is proprietary and
>>>> closed.
>>>>
>>>> Lastly, there is no way to fully disable the Intel Management
>>>> Engine. The HAP stuff claims to stop the Intel ME execution, but
>>>> nobody knows, if it’s successful.
>>>>
>>>> Aaron, Kai-Hang, can you send the revert?
>>>
>>> I consider that as an important fix for s2idle, I don't think
>>> reverting is appropriate.
>>
>> If there is a fix with no other regression, I agree. But there has not
>> been one, so please revert. It doesn’t matter if the commit
>> introducing the regression fixes something else. It gets too
>> complicated to decide, which regression is worth it or not. The Linux
>> kernel guarantees its users, they can update any time without breaking
>> user space (in this case suspend/resume). Please read Linus’ several
>> messages about that. His message [1] exactly addresses your arguments.
>>
> Revert is no option. S0ix supported on none ME system, approved by Intel
> design team and power management domain owner.
> Vendor should provide none ME BIOS I thought. Our PAE will work toward
> meet this.
Did you read Linus’ messages? It doesn’t matter.
Requiring people to change system firmware settings is a no-go.
>>> Yeah, HELL NO!
>>>
>>> Guess what? You're wrong. YOU ARE MISSING THE #1 KERNEL RULE.
>>>
>>> We do not regress, and we do not regress exactly because your are
>>> 100% wrong.
>>>
>>> And the reason you state for your opinion is in fact exactly *WHY* you
>>> are wrong.
>>>
>>> Your "good reasons" are pure and utter garbage.
>>>
>>> The whole point of "we do not regress" is so that people can upgrade
>>> the kernel and never have to worry about it.
>>>
>>>> Kernel had a bug which has been fixed
>>>
>>> That is *ENTIRELY* immaterial.
>>>
>>> Guys, whether something was buggy or not DOES NOT MATTER.
>>
>> So, please (also Intel developers), please adhere to this rule, and
>> revert the commit.
Kind regards,
Paul
>> [1]: https://lkml.org/lkml/2018/8/3/621
More information about the Intel-wired-lan
mailing list