[Intel-wired-lan] [PATCH iwl-net v3 2/6] igc: Lengthen the hardware retry time to prevent timeouts

Paul Menzel pmenzel at molgen.mpg.de
Thu Nov 7 05:56:19 UTC 2024


[Cc: -Sasha, 550 #5.1.0 Address rejected.]

Am 07.11.24 um 06:48 schrieb Paul Menzel:
> [Cc: +Sasha]
> 
> Dear Christopher,
> 
> 
> Am 07.11.24 um 00:53 schrieb Hall, Christopher S:
> 
>>> From: Paul Menzel <pmenzel at molgen.mpg.de>
>>> Sent: Wednesday, November 06, 2024 3:14 PM
>>
>>> Subject: Re: [Intel-wired-lan] [PATCH iwl-net v3 2/6] igc: Lengthen the hardware retry time to prevent timeouts
> 
>>> I’d use the more specific summary/title below:
>>
>> Will do.
>>
>>> igc: Lengthen hardware retry time to 4 μs to prevent timeouts
>>>
>>> Am 06.11.24 um 19:47 schrieb Christopher S M Hall:
>>>> Lengthen the hardware retry timer to four microseconds.
>>>>
>>>> The i225/i226 hardware retries if it receives an inappropriate response
>>>> from the upstream device. If the device retries too quickly, the root
>>>> port does not respond.
>>>
>>> Any idea why? Is it documented somewhere?
>>
>> I do not. Theoretically, 1 us should work, but it does not. It could be a root
>> port problem or an issue with i225/i226 NIC. I am not able to directly observe
>> the state of either. 4 us has worked in all my testing I am comfortable with
>> that value. 2 us also works, but given the limited hardware at my disposal
>> I doubled the value to 4 us to be safe. PTM is not time critical. Typically,
>> software initiates a transaction between 8 and 32 times per second. There
>> is no performance impact for PTM or any other function of the card. The
>> timeout occurs rarely, but if the retry time is too short the PTM state
>> machine does not recover.
> 
> Thank you for clearing this up. If it’s not time critical, why not 
> revert the original patch and go back to 10 μs.
> 
> The referenced commit 6b8aa753a9f9 (igc: Decrease PTM short interval 
> from 10 us to 1 us) also says, that 1 μs was suggested by the hardware 
> team. Were you able to talk to them?
> 
>>>> The issue can be reproduced with the following:
>>>>
>>>> $ sudo phc2sys -R 1000 -O 0 -i tsn0 -m
>>>>
>>>> Note: 1000 Hz (-R 1000) is unrealistically large, but provides a way to
>>>> quickly reproduce the issue.
>>>>
>>>> PHC2SYS exits with:
>>>>
>>>> "ioctl PTP_OFFSET_PRECISE: Connection timed out" when the PTM transaction
>>>>     fails
>>>
>>> Why four microseconds, and not some other value?
>>
>> See above.
> 
> It’d be great, if you extended the commit message.
> 
>>>> Fixes: 6b8aa753a9f9 ("igc: Decrease PTM short interval from 10 us to 
>>>> 1 us")
>>>>
>>>> -#define IGC_PTM_SHORT_CYC_DEFAULT    1   /* Default short cycle interval */
>>>> +#define IGC_PTM_SHORT_CYC_DEFAULT    4   /* Default short cycle interval */
> 
> Maybe also add a comment, that 1 μs should work, but does not.
> 
> 
> Kind regards,
> 
> Paul


More information about the Intel-wired-lan mailing list