[Intel-wired-lan] [RFC PATCH] i40e: enable PCIe relax ordering for SPARC

tndave tushar.n.dave at oracle.com
Fri Dec 9 01:16:08 UTC 2016



On 12/08/2016 04:45 PM, tndave wrote:
>
>
> On 12/08/2016 08:05 AM, Alexander Duyck wrote:
>> On Thu, Dec 8, 2016 at 2:43 AM, David Laight
>> <David.Laight at aculab.com> wrote:
>>> From: Alexander Duyck
>>>> Sent: 06 December 2016 17:10
>>> ...
>>>> I was thinking about it and I realized we can probably simplify
>>>> this even further.  In the case of most other architectures the
>>>> DMA_ATTR_WEAK_ORDERING has no effect anyway.  So from what I can
>>>> tell there is probably no reason not to just always pass that
>>>> attribute with the DMA mappings.  From what I can tell the only
>>>> other architecture that uses this is the PowerPC Cell
>>>> architecture.
>>>
>>> And I should have read all the thread :-(
>>>
>>>> Also I was wondering if you actually needed to enable this
>>>> attribute for both Rx and Tx buffers or just Rx buffers?  The
>>>> patch that enabled DMA_ATTR_WEAK_ORDERING for Sparc64 seems to
>>>> call out writes, but I didn't see anything about reads.  I'm just
>>>> wondering if changing the code for Tx has any effect?  If not you
>>>> could probably drop those changes and just focus on Rx.
>>>
>>> 'Weak ordering' only applies to PCIe read transfers, so can only
>>> have an effect on descriptor reads and transmit buffer reads.
>>>
>>> Basically PCIe is a comms protocol and an endpoint (or the host)
>>> can have multiple outstanding read requests (each of which might
>>> generate multiple response messages). The responses for each request
>>> must arrive in order, but responses for different requests can be
>>> interleaved. Setting 'not weak ordering' lets the host interwork
>>> with broken endpoints. (Or, like we did, you fix the fpga's PCIe
>>> implementation.)
>>
>> I get the basics of relaxed ordering.  The question is how does the
>> Sparc64 IOMMU translate DMA_ATTR_WEAK_ORDERING into relaxed ordering
>> messages, and at what level the ordering is relaxed.  Odds are the
>> wording in the description where this attribute was added to Sparc
>> is just awkward, but I was wanting to verify if this only applies to
>> writes, or also read completions.
> On Sparc64, passing DMA_ATTR_WEAK_ORDERING to DMA map/unmap only affects
> the PCIe root complex (host bridge). With DMA_ATTR_WEAK_ORDERING, the
> requested DMA transaction can be relaxed-ordered within the PCIe root
> complex.
>
> On Sparc64, memory writes can be held at the PCIe root complex, blocking
> other memory writes behind them. Passing DMA_ATTR_WEAK_ORDERING to DMA
> map/unmap allows memory writes to bypass other memory writes in the PCIe
> root complex. (This applies only to the PCIe root complex and has no
> effect at any other level of the PCIe hierarchy, e.g. PCIe bridges.
> Also, when bypassing memory writes, the PCIe root complex follows the
> relaxed ordering rules of the PCIe specification.)
>
> For reference [old but still relevant write-up]: PCI-Express Relaxed
> Ordering and the Sun SPARC Enterprise M-class Servers
> https://blogs.oracle.com/olympus/entry/relaxed_ordering
>
>>
>>> In this case you need the reads of both transmit and receive rings
>>> to 'overtake' reads of transmit data.
>>
>> Actually that isn't quite right.  With relaxed ordering completions
>> and writes can pass each other if I recall correctly, but reads will
>> always force all writes ahead of them to be completed before you can
>> begin generating the read completions.
> That is my understanding as well.
>
>>
>>> I'm not at all clear how this 'flag' can be set on dma_map(). It is
>>> a property of the PCIe subsystem.
> Because on Sparc64, passing the DMA_ATTR_WEAK_ORDERING flag to DMA
> map/unmap adds an entry to the IOMMU/ATU table so that accesses to the
> requested DMA address from the PCIe root complex can be relaxed-ordered.
>>
>> That was where my original question on this came in.  We can do a
>> blanket enable of relaxed ordering for Tx and Rx data buffers, but
>> if we only need it on Rx then there isn't any need for us to make
>> unnecessary changes.
> I ran some quick tests, and it is likely that we don't need
> DMA_ATTR_WEAK_ORDERING for any Tx DMA buffers (because for Tx DMA
> buffers, it's all memory reads from device).
In the above line, s/from/by/
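
To illustrate the Rx-only idea discussed above, here is a minimal userspace
sketch. It is not code from the patch. The enum and the attribute macro are
local stand-ins for the kernel definitions so that the snippet compiles on its
own, and sketch_dma_attrs() is a hypothetical helper name. It assumes only
buffers the device writes into (Rx, DMA_FROM_DEVICE) need the attribute.

```c
#include <assert.h>

/*
 * Stand-ins for kernel definitions so this sketch builds in userspace.
 * In a driver these would come from the kernel's DMA headers; the
 * values below are only placeholders for illustration.
 */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,	/* Tx buffers: memory reads by the device */
	DMA_FROM_DEVICE = 2,	/* Rx buffers: memory writes by the device */
	DMA_NONE = 3,
};

#define DMA_ATTR_WEAK_ORDERING	(1UL << 1)

/*
 * Hypothetical helper: choose the DMA attributes for a mapping based on
 * its direction. Only Rx buffers request weak ordering, since the
 * behavior described in this thread concerns memory writes held at the
 * root complex. Other directions get no extra attributes.
 */
static inline unsigned long sketch_dma_attrs(enum dma_data_direction dir)
{
	assert(dir != DMA_NONE);	/* DMA_NONE is not a valid mapping direction */

	if (dir == DMA_FROM_DEVICE)
		return DMA_ATTR_WEAK_ORDERING;

	return 0;
}
```

In a driver, the returned value would presumably be passed as the attrs
argument of dma_map_single_attrs() and the matching
dma_unmap_single_attrs() call for the same buffer.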

+ cc sparclinux at vger.kernel.org

-Tushar
>
> -Tushar
>>
>> - Alex
>>
>

