[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-dev] packed ring layout proposal
On Mon, Sep 19, 2016 at 09:06:19PM +0200, Paolo Bonzini wrote: > > > On 19/09/2016 19:15, Michael S. Tsirkin wrote: > > On Mon, Sep 19, 2016 at 02:01:57PM +0200, Paolo Bonzini wrote: > >> The main issue with VRING_DESC_F_INDIRECT, at least as it is implemented > >> in Linux, is the cost of allocating and freeing the indirect descriptor. > >> But if you know that the number of descriptor is fixed (e.g. 2 for > >> request header + data) you can pre-allocate the descriptors with a 1:1 > >> relationship with the entries of the ring buffer. This might be useful > >> for the virtio-net tx ring. > > > > Yes, I've been thinking the same thing. Is there any advantage > > to have the indirect addresses pre-programmed as > > opposed to stored in the descriptors? > > > > In theory > > if (desc.flags & indirect) > > use(*address[i]) > > > > might be speculated as compared to > > if (desc.flags & indirect) > > use(*desc.addr) > > which can't. > > But OTOH this might add to cache pressure as desc is already in cache > > anyway. > > Yes, I think this would be the worst of both worlds. :) > > If the driver knows it has 2 descriptors per request, it could allocate > a big block in advance and do for example > > struct { > // pre-allocated and pre-initialized > // to point to ->hdr > struct virtio_desc req_desc; > // only flags prefilled > struct virtio_desc data_desc; > struct req_header hdr; > } *indirect_data; > > i = index++ & (size - 1); > indirect_addr = &indirect_data[i]; > indirect_data[i]->data_desc.addr = this_data_addr; > indirect_data[i]->data_desc.len = this_data_len; > // ... fill in indirect_data[i]->hdr ... > > desc[i].addr = indirect_addr; > smp_wmb(); > desc[i].index = i | DESC_HW; > > where there is no need for an array of descriptor addresses, > &indirect_data[i]. As long as the indirect_data size is a power of 2, > indirect_data needn't even be physically contiguous. > > But I prefer all these tricks to be hidden within the driver. It seems > a good idea in the beginning to rig the device to second-guess what a > driver could do, but then it makes things awkward. (This is also why > I'd rather get rid of VRING_DESC_F_NEXT). Right, I'll CC dpdk list on this proposal. As dpdk uses _NEXT almost exclusively I'd like to make sure we are not messing things up. Maybe the right thing to do is to disallow _NEXT if _INDIRECT is negotiated. Each device can then decide whether it wants to use _INDIRECT or _NEXT (or both). How's that? > > If all descriptors are known ahead of the time to be indirect we could > > pack more of them per cache line by dropping addr (and maybe len?). > > This might work well for virtio blk. > > Nope, virtio-blk doesn't know at all the size of the sg list (for > small-I/O workloads it is always 2, but potentially unlimited). For the > above ideas I was thinking of the virtio tx header (but only for > MTU=1500, as Stephen pointed out). > > Paolo Length in the indirect descriptor itself might not be needed to figure out s.g list if we use the VRING_DESC_F_NEXT in the direct one that it points to. Using a flag in the descriptor might cause a stall but I think that's typically cheaper than a cache miss (although we are not talking a full miss here, more like 1/8 of a miss). -- MST
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]