[Intel-wired-lan] Linux 4.12+ memory leak on router with i40e NICs

Paweł Staszewski pstaszewski at itcare.pl
Tue Oct 17 11:05:22 UTC 2017



On 2017-10-17 at 12:59, Paweł Staszewski wrote:
>
>
> On 2017-10-17 at 12:51, Paweł Staszewski wrote:
>>
>>
>> On 2017-10-17 at 12:20, Paweł Staszewski wrote:
>>>
>>>
>>> On 2017-10-17 at 11:48, Paweł Staszewski wrote:
>>>>
>>>>
>>>> On 2017-10-17 at 02:44, Paweł Staszewski wrote:
>>>>>
>>>>>
>>>>> On 2017-10-17 at 01:56, Alexander Duyck wrote:
>>>>>> On Mon, Oct 16, 2017 at 4:34 PM, Paweł Staszewski 
>>>>>> <pstaszewski at itcare.pl> wrote:
>>>>>>>
>>>>>>> On 2017-10-16 at 18:26, Paweł Staszewski wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On 2017-10-16 at 13:20, Pavlos Parissis wrote:
>>>>>>>>> On 15/10/2017 02:58 AM, Alexander Duyck wrote:
>>>>>>>>>> Hi Pawel,
>>>>>>>>>>
>>>>>>>>>> To clarify, is that Dave Miller's tree or Linus's that you are
>>>>>>>>>> talking about? If it is Dave's tree, how long ago was it you
>>>>>>>>>> pulled it, since I think the fix was just pushed by Jeff Kirsher
>>>>>>>>>> a few days ago.
>>>>>>>>>>
>>>>>>>>>> The issue should be fixed in the following commit:
>>>>>>>>>>
>>>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/drivers/net/ethernet/intel/i40e/i40e_txrx.c?id=2b9478ffc550f17c6cd8c69057234e91150f5972 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Do you know when it is going to be available in the net-next and
>>>>>>>>> linux-stable repos?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Pavlos
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I will run some tests tonight with the "net" git tree, where this
>>>>>>>> patch is included.
>>>>>>>> Starting from 0:00 CET
>>>>>>>> :)
>>>>>>>>
>>>>>>>>
>>>>>>> Upgraded, and it looks like the problem is not solved by that patch.
>>>>>>> Currently running the system with the
>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/
>>>>>>> kernel.
>>>>>>>
>>>>>>> About 0.5GB of memory is still leaking somewhere.
>>>>>>>
>>>>>>> I can also confirm that the latest kernel where memory is not
>>>>>>> leaking (using the i40e driver, Intel 710 cards) is 4.11.12.
>>>>>>> With kernel 4.11.12, after an hour there is no change in memory usage.
>>>>>>>
>>>>>>> I also checked that with ixgbe instead of i40e, on the same net.git
>>>>>>> kernel, there is no memleak: after an hour the memory usage is the
>>>>>>> same. So I am 100% sure this is an i40e driver problem.
>>>>>> So how long was the run to get the .5GB of memory leaking?
>>>>> 1 hour
>>>>>
>>>>>>
>>>>>> Also is there any chance of you being able to bisect to determine
>>>>>> where the memory leak was introduced since as you pointed out it
>>>>>> didn't exist in 4.11.12 so odds are it was introduced somewhere
>>>>>> between 4.11 and the latest kernel release.
>>>>> That may be hard, because for now I need to go back to 4.11.12: this
>>>>> is a production host/router.
>>>>> I will try to find a free test router for tests/bisects with the i40e
>>>>> driver (Intel 710 cards).
>>>>>
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> - Alex
>>>>>>
>>>>>
>>>>>
>>>> I also forgot to add the errors that i40e logs when the driver initializes:
>>>> [   15.760569] i40e 0000:02:00.1: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>> [   16.365587] i40e 0000:03:00.3: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>> [   16.367686] i40e 0000:02:00.2: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>> [   16.368816] i40e 0000:03:00.0: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>> [   16.369877] i40e 0000:03:00.2: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>> [   16.370941] i40e 0000:02:00.3: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>> [   16.372005] i40e 0000:02:00.0: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>> [   16.373029] i40e 0000:03:00.1: Error I40E_AQ_RC_ENOSPC adding RX filters on PF, promiscuous mode forced on
>>>>
>>>> Some parameters that are set for these NICs:
>>>>         ip link set up dev $i
>>>>         ethtool -A $i autoneg off rx off tx off
>>>>         ethtool -G $i rx 1024 tx 2048
>>>>         ip link set $i txqueuelen 1000
>>>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 512 tx-usecs 128
>>>>         ethtool -L $i combined 6
>>>>         #ethtool -N $i rx-flow-hash udp4 sdfn
>>>>         ethtool -K $i ntuple on
>>>>         ethtool -K $i gro off
>>>>         ethtool -K $i tso off
>>>>
>>>>
>>>>
>>>>
>>> Also, after turning TSO/GRO on, memory usage changes and the leak gets
>>> faster. The image below shows memory usage before the change (TSO/GRO
>>> off) and after enabling TSO/GRO:
>>>
>>> https://ibb.co/dTqBY6
>>>
>>>
>>> Thanks
>>> Pawel
>>>
>>>
>>>
>> With settings like this:
>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2 enp3s0f3'
>> for i in $ifc
>>         do
>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 512 tx-usecs 128
>>         ethtool -K $i gro on
>>         ethtool -K $i tso on
>>         done
>>
>> The server is leaking about 4-6 MB every 10 seconds:
>> MEMLEAK:
>> 5  MB/10sec
>> 6  MB/10sec
>> 4  MB/10sec
>> 4  MB/10sec
>>
>>
>> Other settings, with TSO/GRO off:
>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2 enp3s0f3'
>> for i in $ifc
>>         do
>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 512 tx-usecs 128
>>         ethtool -K $i gro off
>>         ethtool -K $i tso off
>>         done
>>
>> Same leak, about 5 MB per 10 seconds:
>> MEMLEAK:
>> 5  MB/10sec
>> 5  MB/10sec
>> 5  MB/10sec
>>
>>
>> Other settings, with rx-usecs changed from 512 to 1024:
>> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2 enp3s0f3'
>> for i in $ifc
>>         do
>>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 1024 tx-usecs 128
>>         ethtool -K $i gro off
>>         ethtool -K $i tso off
>>         done
>>
>> MEMLEAK:
>> 4  MB/10sec
>> 3  MB/10sec
>> 4  MB/10sec
>> 4  MB/10sec
>>
>>
>> So the memleak has something to do with rx-usecs (fewer interrupts but
>> higher latency for traffic).
>>
>>
>> But enabling TSO/GRO also makes the leak about 1 MB bigger per 10
>> seconds.
>>
>>
>>
> So far the best config is:
> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2 enp3s0f3'
> for i in $ifc
>         do
>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 64 tx-usecs 512
>         ethtool -K $i gro off
>         ethtool -K $i tso on
>         done
>
> MEMLEAK: about 2 MB per 10 seconds
> 2  MB/10sec
> 2  MB/10sec
> 2  MB/10sec
>
>
> With rx-usecs set to 256 (memleak of about 7-9 MB per 10 seconds):
> ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2 enp3s0f3'
> for i in $ifc
>         do
>         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 256 tx-usecs 512
>         ethtool -K $i gro off
>         ethtool -K $i tso on
>         done
>
> MEMLEAK:
> 7  MB/10sec
> 7  MB/10sec
> 8  MB/10sec
> 9  MB/10sec
>
>

And an even smaller memleak with rx-usecs set to 32:
ifc='enp2s0f0 enp2s0f1 enp2s0f2 enp2s0f3 enp3s0f0 enp3s0f1 enp3s0f2 enp3s0f3'
for i in $ifc
         do
         ethtool -C $i adaptive-rx off adaptive-tx off rx-usecs 32 tx-usecs 512
         ethtool -K $i gro off
         ethtool -K $i tso on
         done


MEMLEAK: about 0-2 MB every 10 seconds
0  MB/10sec
1  MB/10sec
0  MB/10sec
2  MB/10sec
1  MB/10sec
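
For anyone who wants to reproduce numbers in the same "N  MB/10sec" form,
here is a minimal sketch of one way to sample them. This thread does not say
which counter the MEMLEAK figures came from, so the use of MemAvailable from
/proc/meminfo is an assumption, and the function names (mem_available_kb,
leak_mb, watch_leak) are made up for illustration.

```shell
#!/bin/sh
# Sketch: print how far MemAvailable drops per interval, in whole MB.
# ASSUMPTION: MemAvailable is a reasonable proxy for the leak; the thread
# does not state how the MEMLEAK numbers were actually collected.

# Print MemAvailable in kB from a meminfo-style file (default /proc/meminfo).
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Print the drop from $1 kB to $2 kB in whole MB (positive = memory lost).
leak_mb() {
    echo $(( ($1 - $2) / 1024 ))
}

# Sample every $1 seconds (default 10), $2 times (default 6).
watch_leak() {
    interval=${1:-10}
    count=${2:-6}
    prev=$(mem_available_kb)
    while [ "$count" -gt 0 ]; do
        sleep "$interval"
        cur=$(mem_available_kb)
        echo "$(leak_mb "$prev" "$cur")  MB/${interval}sec"
        prev=$cur
        count=$((count - 1))
    done
}

# Only start sampling when called with "run", so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    watch_leak 10 6
fi
```

Saved under any name and started with the argument "run", this prints one
line per 10 second interval. On a router under steady traffic, normal cache
and buffer churn also moves MemAvailable, so single samples are noisy and
only a trend over many intervals is meaningful.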



