[Intel-wired-lan] Linux 4.12+ memory leak on router with i40e NICs

Pavlos Parissis pavlos.parissis at gmail.com
Thu Oct 19 11:41:59 UTC 2017


On 19 October 2017 at 01:40, Paweł Staszewski <pstaszewski at itcare.pl> wrote:
>
>
> On 2017-10-19 at 01:29, Alexander Duyck wrote:
>
>> On Mon, Oct 16, 2017 at 10:51 PM, Vitezslav Samel <vitezslav at samel.cz>
>> wrote:
>>>
>>> On Tue, Oct 17, 2017 at 01:34:29AM +0200, Paweł Staszewski wrote:
>>>>
>>>> On 2017-10-16 at 18:26, Paweł Staszewski wrote:
>>>>>
>>>>> On 2017-10-16 at 13:20, Pavlos Parissis wrote:
>>>>>>
>>>>>> On 15/10/2017 02:58 AM, Alexander Duyck wrote:
>>>>>>>
>>>>>>> Hi Pawel,
>>>>>>>
>>>>>>> To clarify, is that Dave Miller's tree or Linus's that you are talking
>>>>>>> about? If it is Dave's tree, how long ago did you pull it? I
>>>>>>> think the fix was pushed by Jeff Kirsher just a few days ago.
>>>>>>>
>>>>>>> The issue should be fixed in the following commit:
>>>>>>>
>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/drivers/net/ethernet/intel/i40e/i40e_txrx.c?id=2b9478ffc550f17c6cd8c69057234e91150f5972
>>>>>>
>>>>>> Do you know when it is going to be available on net-next and
>>>>>> linux-stable repos?
>>>>>>
>>>>>> Cheers,
>>>>>> Pavlos
>>>>>>
>>>>>>
>>>>> I will run some tests tonight with the "net" git tree, where this patch
>>>>> is included.
>>>>> Starting from 0:00 CET
>>>>> :)
>>>>>
>>>>>
>>>> Upgraded, and it looks like the problem is not solved by that patch.
>>>> Currently running the system with the
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/
>>>> kernel.
>>>>
>>>> About 0.5GB of memory is still leaking somewhere.
>>>>
>>>> I can also confirm that the latest kernel where memory is not leaking
>>>> (using the i40e driver with Intel 710 cards) is 4.11.12.
>>>> With kernel 4.11.12 there is no change in memory usage after an hour.
>>>>
>>>> I also checked that with ixgbe instead of i40e, on the same net.git
>>>> kernel, there is no memleak: after an hour memory usage is the same.
>>>> So this is 100% an i40e driver problem.
>>>
>>>    I have (probably) the same problem here but with X520 cards: booting
>>> 4.12.x gives me an oops after circa 20 minutes of our workload. Booting
>>> 4.9.y is OK. This machine is in production, so any testing is very
>>> limited.
>>>
>>>    The machine was stable for >2 months (on the desk, before it went to
>>> production) with 4.12.8 but with no traffic on the X520 cards.
>>>
>>>          Cheers,
>>>
>>>                  Vita
>>
>> Sorry, but it can't be the same issue since we are discussing a
>> different driver (i40e) running different hardware (X710 or XL710).
>> You might want to start a new thread for your issue, and/or if
>> possible file a bug on e1000.sf.net.
>>
>> Thanks.
>>
>> - Alex
>>
> Sorry, but bugs reported on e1000.sf.net are delayed, some by about 6 or
> more months. When I reported my first bug there, I got a reply after a year
> about no activity :):) haha, and the bug reported there is still active :)
> It is better for me now to change NICs (for sure cheaper from the
> perspective of clients :) ) to Mellanox, or just to replace them and use
> ixgbe, which does not have this bug (Mellanox and ixgbe have no such bug:
> I have many servers with them with the same config, and only one with i40e,
> which has the same config and the memleak).
>
> If nobody from Intel wants to reproduce this, cool, this is not my problem
> but Intel's :) There are now many good NICs to use, like Mellanox, or I can
> just stick with the many 10G cards based on ixgbe, which is a really good
> driver. But really? The Intel guys have no XL710 cards? I don't want to buy
> more buggy cards only to do kernel bisects .... sorry ....
> To do good bisects for this bug you need to spend maybe 200/300 bisects,
> and to confirm each one you need maybe 30 minutes, so count how much time
> you need: that would cost more than 100 cards from Mellanox, maybe :)
>

I have similar issues to yours regarding the stability of the i40e
driver. I will need to open another thread about them, but I would
like to mention that you are not the only one who suffers from
problems related to the i40e driver. In my case I can't simply change
NICs, so it is even worse.

Cheers,
Pavlos

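For anyone wanting to compare kernels the way the thread describes (run the
same workload for about an hour and see whether memory usage changes), here
is a minimal sketch. It is not from the thread: the function names are
hypothetical, and it assumes a kernel that exposes MemTotal and MemAvailable
in /proc/meminfo. A rising value only hints at a leak; it does not show
which driver is responsible.

```python
import re


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}."""
    fields = {}
    for line in text.splitlines():
        # Lines look like "MemTotal:  16000000 kB"; some fields have no unit.
        m = re.match(r"^(\w+(?:\(\w+\))?):\s+(\d+)(?:\s+kB)?\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def used_kb(fields):
    """Memory not available to new workloads, in kB (an approximation)."""
    return fields["MemTotal"] - fields["MemAvailable"]


def growth_kb_per_hour(samples):
    """Average growth between the first and last (seconds, used_kB) samples."""
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (u1 - u0) * 3600.0 / (t1 - t0)


def sample_used_kb(path="/proc/meminfo"):
    """Read the current value from the running system (Linux only)."""
    with open(path) as f:
        return used_kb(parse_meminfo(f.read()))
```

To use it, call sample_used_kb() at the start and end of a test run, record
the elapsed seconds, and pass both (seconds, used_kB) pairs to
growth_kb_per_hour(). A result near zero matches the "no change after an
hour" behaviour reported for 4.11.12 and for ixgbe.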
