

Subject: Re: [mqtt] Recharter: Shared subscriptions and $share


I'm writing up some notes on yesterday's discussion. I will transfer them into the JIRA, but here they are in email form


1. Main Use case (as in Dominik's email)

To allow multiple consumer instances (which we should assume could be running on separate processors or servers) to share the load of processing messages that arrive on one or more MQTT topics, without them all getting copies of the same message

It must be possible to allow the number of instances to grow and shrink over time, without the publishers having to be aware of this. In particular this provides a level of fault-tolerance, because if one of n consumers fails there are still n-1 left to process the messages.

Maintaining message ordering isn't a key requirement, but it's something that I will consider later.

There is the possibility that the set of consumer instances might want to share the load for one topic, but also each get a copy of every message for another topic. This isn't as contrived as it might sound: you could have one topic that distributes work, and another that sends configuration parameters or sends commands that all the instances need to see.

To simplify things we agreed that "consumer instance" can be represented by an MQTT Session, in other words each instance must make its own distinct MQTT connection and if a piece of application code makes multiple MQTT connections then we view that as multiple "consumer instances".

Note that, within an MQTT server, every Session has a unique id which is referred to as a "clientID". In cases where cleanSession=false this is always assigned by the client itself, in cases where cleanSession=true it can be assigned either by the client or the server. [note on this note, I am using "MQTT Server" to mean a logical server - it could be distributed across many physical servers].


2. Model

We discussed a number of ways in which this could be achieved, building on what we have today, and ended up in effect validating the "shared subscription" approach. I will write down what I think this means, but I will start with a description of the non-shared subscription support in 3.1.1.

a) Non-shared subscriptions
b) Shared Subscriptions

The proposal is to add a new kind of Subscription, which I will term a "Shared Subscription". As with the non-shared case, it has a Topic Filter and a Max QoS, and is responsible for collecting and delivering messages. However it differs from a non-shared subscription in the following ways.


Note, this approach does lead to an interesting side-effect.  You could have a single cleanSession=false Session which you use to create and maintain the shared subscription, and make sure that it never gets garbage-collected, while the Session(s) that actually receive the messages all use cleanSession = true.

3. Syntax options

The mechanism described above requires a ShareName.   We agreed that this will be asserted by the client applications, a bit like clientID used to be in 3.1.  There's no mechanism for getting the server to assign one.

The client needs a way to pass the ShareName. For example, let's suppose that it wants to use share name Foo and subscribe with a topic filter /Bar.  There are three options

i) It could supply it on Connect, and the name would then qualify all Subscribes and Unsubscribes that it issues. In our example it would pass sharename Foo on Connect and then subscribe to /Bar in the normal way. The server would have remembered the sharename Foo from the Connect and use it when it sees the subscribe.  We rejected this approach because we want to allow cases where the clients want to use multiple shared subscriptions with different ShareNames, or wish to use a mixture of shared and non-shared (technically this could be done using cleanSession=false and reconnecting multiple times, but that's ugly).

ii) Add the ShareName as a new parameter on Subscribe and Unsubscribe. A Subscribe request to /Bar without the ShareName would be treated as a non-shared subscription

iii) Devise a syntax to represent the sharename as part of the Topic Filter parameter of Subscribe/Unsubscribe. In effect this is repurposing the Topic Filter to be the Subscription Identifier (that's what it really is on Unsubscribe anyway).

Here are some suboptions for iii)

a) Use one of the spare bits in byte 1 to indicate that this is a Shared version of Subscribe/Unsubscribe.   Change the format of the TopicFilter (or whatever we decide to call it) to be sharename:topic filter , where : is a separator character that we forbid use of in the sharename (we could use a different separator if people prefer).   Our example would then be  Foo:/Bar.  The separator isn't strictly necessary if we forbid / in the sharename, but I think it makes things less confusing to have it as otherwise the string looks too much like a conventional topic filter.

b) Don't use the spare bit, but start the Topic filter with the sequence $$, e.g. $$Foo:/Bar.  This approach would take away the ability of an application to do a non-shared subscription to a topic that starts $$ but my assumption here is that nobody today is likely to be using topics that start $$

c) If we don't want to take that risk, then we could use a trigger sequence that is completely illegal today, such as ++ or ##, e.g. ++Foo:/Bar
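
To make suboptions b) and c) a little more concrete, here is a minimal Python sketch of how a server might split such a string into a ShareName and a Topic Filter. The function name parse_subscription_id, the decision to accept both $$ and ++ as trigger sequences, and the error handling are all assumptions made for this illustration; none of this is agreed syntax.

```python
from typing import Optional, Tuple

# Hypothetical trigger sequences taken from suboptions b) and c).
TRIGGERS = ("$$", "++")
# Separator character that would be forbidden in the ShareName.
SEPARATOR = ":"


def parse_subscription_id(value: str) -> Tuple[Optional[str], str]:
    """Return (share_name, topic_filter).

    share_name is None when the string is an ordinary (non-shared)
    Topic Filter.
    """
    for trigger in TRIGGERS:
        if value.startswith(trigger):
            rest = value[len(trigger):]
            # Split at the first separator only, so the Topic Filter
            # itself may still contain the separator character.
            share_name, sep, topic_filter = rest.partition(SEPARATOR)
            if not sep or not share_name or not topic_filter:
                raise ValueError(f"malformed shared subscription: {value!r}")
            return share_name, topic_filter
    return None, value
```

With this sketch both $$Foo:/Bar and ++Foo:/Bar yield the ShareName Foo and the Topic Filter /Bar, while a plain /Bar is treated as non-shared.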


4. Other details to be worked out

i) Overlapping subscriptions.  Because the shared sessions are global I think the only sensible approach is for each message to be filtered independently by each shared subscription. So if I have two shared subscriptions ++Foo:/Bar and ++Foo:/# then they both get a copy of a message published on Bar, and each distributes it to its respective member sessions. Similarly if a Session has a non-shared subscription to /Bar or /# that shouldn't interfere with these shared subscriptions (even if they were associated with that Session).

ii) QoS 1. I think it makes sense to allow a server to redeliver an unacked QoS 1 message to another member of the shared subscription if it detects that it has lost connection to the original intended recipient. We probably wouldn't want to require this.

iii) Nacks.  This is a case where the hard/soft error thing might be useful.  In the soft error case it might be handy to attempt to redeliver to another recipient, but we'd want to avoid the situation where the server sends it out one at a time to everyone only to get it nacked each time.  Again this might be a case for giving the server some latitude.

iv) Mixed cleanSession true/false.  I don't see this as being a real issue.  The contract between the Shared Subscription and its associated Sessions is independent of the cleanSession flag. That just controls the lifetime of the Sessions. If we've started to deliver a QoS 2 message to a cleanSession=true session and that crashes while the message is in flight, it just gets lost.

v) Message ordering.  It would be nice if we could allow the server to assign messages to Sessions based on an order preservation scheme.  What that means here is that we should permit the server to choose to hang on to a message if its preferred recipient happens to be offline, and not force it to send to one of the other Sessions just because they have an active connection.

vi) ShareName clashes.  The ShareNames are global.  Do we need to find a way to protect an application from inadvertently joining someone else's shared subscription and taking their messages? This one does worry me.
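
As a rough illustration of item i) above, the following Python sketch filters each message independently against every shared subscription and hands it to exactly one member Session of each match. The class and function names are hypothetical, round-robin selection is just one possible policy, and the matcher ignores the special rules for topics beginning with $.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT filter match ('+' = one level, '#' = the rest)."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, level in enumerate(f_levels):
        if level == "#":
            return True
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


@dataclass
class SharedSubscription:
    share_name: str
    topic_filter: str
    members: List[str] = field(default_factory=list)  # Session clientIDs
    cursor: int = 0  # round-robin position (an assumed policy)

    def pick_member(self) -> str:
        member = self.members[self.cursor % len(self.members)]
        self.cursor += 1
        return member


def dispatch(subs: List[SharedSubscription], topic: str) -> List[Tuple[str, str, str]]:
    """Return one (share_name, topic_filter, session) per matching subscription."""
    deliveries = []
    for sub in subs:
        if sub.members and topic_matches(sub.topic_filter, topic):
            deliveries.append((sub.share_name, sub.topic_filter, sub.pick_member()))
    return deliveries
```

For the two subscriptions ++Foo:/Bar and ++Foo:/#, a message published on /Bar produces two deliveries, one per shared subscription, even though both use the ShareName Foo.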

Regards


Peter Niblett
IBM Senior Technical Staff Member
Member of the IBM Academy of Technology
+44 1962 815055
+44 7825 657662 (mobile)




From:        Dominik Obermaier <dominik.obermaier@dc-square.de>
To:        Peter Niblett/UK/IBM@IBMGB, Andrew Schofield <mqtt@lists.oasis-open.org>
Cc:        mqtt@lists.oasis-open.org
Date:        11/04/2016 02:39
Subject:        Re: [mqtt] Recharter: Shared subscriptions and $share




I’m not sure if I’m missing something, but from my point of view the overlapping subscriptions with shared subscriptions should not be an issue if “Shared Subscription Group Names” are used. For the next examples I’m assuming that a Shared Subscription consists of the following properties:

* Group Name
* An Actual Topic Filter

So e.g. “$share:groupname:my/topic/filter/#”.

The main use cases I saw for Shared Subscriptions were the following:

* A “hot topic” has a message rate too high for a single MQTT client to handle
* Multiple clients are used as some kind of failover mechanism in case a single client becomes unavailable


img1.png (attached to this e-mail) shows how the principle of Shared Subscriptions could work with group names. This example has two group names “group1” and “group2”, consisting of two individual MQTT clients each. One client of “group1” and one client of “group2” receive a copy of the message.

img2.png (attached to this e-mail) shows almost the same scenario, this time client2 (red) subscribes to two separate topic groups. In this scenario, the client2 can receive a message twice (once for group1 and once for group2).

img3.png (attached to this e-mail) shows a scenario where client2 has a subscription for the topic and is in two shared subscription groups (group1 and group2), so the client can receive the message up to three times; once due to the standard subscription, at most once for group1 and at most once for group2. This is intentional, since it may be desired behaviour (the client is a worker application for group1 and group2 and executes standard business logic on its standard topic subscriptions).

I think the missing piece in such scenarios is some kind of metadata. If the PUBLISH message contained the name of the group through which the client received it, the client could execute different business logic for different shared subscription groups. Overlapping topics would then not lead to problems, because a client subscribed to different groups would not see the messages as duplicates and would be able to distinguish between them. IIRC the current metadata proposal would help here by adding group metadata. A PUBLISH received through a standard subscription would not carry this metadata.
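
A minimal sketch of what this could look like on the client side, in Python. The group argument stands in for the hypothetical group metadata on PUBLISH (it does not exist in 3.1.1), and make_dispatcher and the handler names are invented for this example.

```python
from typing import Callable, Dict, Optional

Handler = Callable[[str, bytes], str]


def make_dispatcher(handlers: Dict[Optional[str], Handler]) -> Callable[..., str]:
    """Build an on_publish callback that routes by group name.

    The key None is used for messages received through a standard
    (non-shared) subscription, which would carry no group metadata.
    """

    def on_publish(topic: str, payload: bytes, group: Optional[str] = None) -> str:
        handler = handlers.get(group)
        if handler is None:
            raise KeyError(f"no handler registered for group {group!r}")
        return handler(topic, payload)

    return on_publish


# A client like client2 in img3: one standard subscription plus two groups.
on_publish = make_dispatcher({
    None: lambda topic, payload: "standard business logic",
    "group1": lambda topic, payload: "group1 worker logic",
    "group2": lambda topic, payload: "group2 worker logic",
})
```

The same topic and payload can then arrive up to three times without being mistaken for duplicates, because each copy is routed by the group it came through.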

If the client group were an attribute of the MQTT session, it would not be possible to have a client that is in multiple groups and this would dramatically reduce the usefulness of shared subscriptions IMHO.

--
Dominik Obermaier

 
dc-square GmbH - Software Solutions
Innere Münchener Straße 30
84036 Landshut

Tel. +49 871 - 97 50 63 00
Fax. +49 871 - 97 50 63 29
Web. www.dc-square.de

Geschäftsführer Christian Götz, Dominik Obermaier
Registergericht Landshut, HRB 8906
USt.ID: DE283445184

On 11 April 2016 at 03:13:18, Peter Niblett (peter_niblett@uk.ibm.com) wrote:

I don't think the words that Brian quotes from the charter are an issue here.  They say
· The TC will not identify MQTT topics nor prescribe any mechanism or convention for classification of MQTT topics or topic spaces.

The field in question is a Topic Filter, not a Topic, so we're free to define syntactic devices in that field to express things that aren't part of the topic name.  The + and # characters are examples of this, and it would in principle be ok to define a new syntax that lets us insert additional flags or subfields into the Topic Filter should we want to do that.


There is a bit of a problem doing that, as at the moment any leading character other than + / or # is assumed to be the first character of the topic name. Moreover Server implementations are permitted to use strings that start with a $ as real topic names. The compatibility sentences that Ken quotes mean that - if we want to go this way - we'd have to use a syntax that doesn't take away anything that people might already be using.  I would have thought that starting with $$ or ++ might be safe.


Anyway, before we get too wrapped up in the syntax, it's important to agree on the semantics of the thing, and what restrictions might be imposed on it. Andrew is suggesting quite a drastic restriction, but I suspect it will meet the requirements of a large number of users.


1. The simplest case to consider is one where the consuming application has a collection of threads, all subscribed to a single topic. We need a way to place them all in the same "consumer group" so that each incoming message is delivered to exactly one thread.  The application should be able to add or remove threads from this group dynamically, e.g. by subscribing or unsubscribing.


2. The next case is the same, but with a wildcarded topic expression. Again, we want the messages to be delivered to exactly one consuming thread.


Note that in both these cases, I am assuming that the application doesn't require exact message ordering (or those messages that do need to be ordered come sufficiently far apart from each other that you are prepared to take a risk). If you need more guarantees you would do it by partitioning the topic into subtopics and not use shared subscriptions, as you want to make sure that there's only one consuming thread per subtopic.


3. Now consider the case where the application submits two or more subscriptions. There seem to be two cases:


i) They still want to have a single pool of threads each processing exactly one of the messages. In effect this is the same as 2; the only reason you would need multiple subscriptions is if you couldn't express the filter with a single wildcard expression. In this case each thread would submit the same set of subscriptions. It's unlikely that the subscriptions would overlap, but I guess it could happen.


ii) They want to control the number of threads dedicated to the two subscriptions separately. In this case the threads in each different subpool would use the same topic filter, but the filters would be different for the different subpools. Again there's the unlikely possibility of overlap. The case where one of the subpools only has a single thread is in effect a non-shared subscription.


I think Andrew's suggestion covers 1, and 3 i). To do 3 ii) I think you would have to say that you get a different subscription group for each topic filter/share name combination, as otherwise we would force people to use different sessions for each subpool
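
To illustrate the idea of one subscription group per topic filter/share name combination, here is a small Python sketch of a registry keyed on that pair. ShareRegistry and its methods are hypothetical names, and what should happen to a group when its last member leaves is deliberately left open.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

GroupKey = Tuple[str, str]  # (share name, topic filter)


class ShareRegistry:
    """Tracks which Sessions belong to which subscription group."""

    def __init__(self) -> None:
        self.groups: Dict[GroupKey, Set[str]] = defaultdict(set)

    def subscribe(self, session: str, share_name: str, topic_filter: str) -> None:
        # The same share name with a different filter is a different group.
        self.groups[(share_name, topic_filter)].add(session)

    def unsubscribe(self, session: str, share_name: str, topic_filter: str) -> None:
        members = self.groups.get((share_name, topic_filter))
        if members is not None:
            members.discard(session)
```

A single Session can then join two subpools simply by subscribing with two different filters under the same share name, without needing a second connection.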


I don't really understand what would happen with his suggestion if there is overlap between the topic filters in case 3.


I think there might be other semantics we have to think about, with regards to in-flight QoS 1 or QoS 2 messages and shared subscriptions.

Peter Niblett
IBM Senior Technical Staff Member
Member of the IBM Academy of Technology
+44 1962 815055
+44 7825 657662 (mobile)





From:        Andrew Schofield/UK/IBM@IBMGB
To:        mqtt@lists.oasis-open.org
Date:        09/04/2016 09:52
Subject:        Re: [mqtt] Recharter: Shared subscriptions and $share
Sent by:        <mqtt@lists.oasis-open.org>





One final thought. I think a sticking point here is that it's hard to handle a mixture of shared and nonshared subscriptions from the same client. I think supporting that mixture is unnecessary and, if it makes it too complex, let's prevent it.

How about making the share name a property of a session? When a client connects it can provide an optional client ID and it could also supply an optional share name. For a client with a share name, any subscriptions it makes are shared. Sharing is still a property of the subscription, but the way in which we choose which subscriptions are shared would be different. This would work with clean-session and so on, just making the identity of the session a pair <client ID, share name> instead of just the client ID.




From:        Nicholas O'Leary/UK/IBM@IBMGB
To:        mqtt@lists.oasis-open.org
Date:        09/04/2016 01:02
Subject:        Re: [mqtt] Recharter: Shared subscriptions and $share
Sent by:        <mqtt@lists.oasis-open.org>




Okay, let's say it is a property of the subscription and can be used with any topic. It would be helpful to me if we could flesh out the expected/desired behaviour in the following scenarios.

Both Andrews have implied the existence of a share name that goes alongside the topic. I presume that is so you can have more than one group of clients sharing the messages amongst themselves.

Eg client A, B and C subscribe to foo with a share name of alpha. Clients D, E and F subscribe to foo with a share name of beta. So there are effectively two groups of clients for which the messages are shared between the members of each group.

1. Is the share name scoped to the topic? If client G then subscribes to topic NotFoo with a share name of alpha, is that considered a different share?

2. If the share name is scoped to the topic, I presume more accurately it is scoped to the topic filter used in the subscription request? That is to say a sub to foo/# with share name gamma would not be related to a sub to foo/+ with share name gamma even when messages are published to foo/bar (which matches both filters)

3. I realise after I typed this one out it's what Ken addresses below - what to do when a client has overlapping subs that are a mix of shared and non shared. I'm inclined to agree the cleanest answer would be to prohibit overlapping subs of mixed type.

I don't agree with the argument that using topic prefix is akin to adding point to point to mqtt and that using a share name on the subscription  isn't. The whole shared sub concept is, as far as the participating clients are concerned, queuing. But that's just semantics and not worth arguing over. (Because I expect one of you to say that in the share name case, you can still have normal subscribers on the topic receiving everything... To which I say, great, we now have a hybrid of queuing and pub/sub....)

Andrew S - I take your point that using a topic prefix on its own loses flexibility - that you cannot have a mix of subscribers. But you could combine it with the share name concept to allow different groups of subscribers. The key difference being if a client didn't provide a share name it would be put in a default share group - allowing 3.1.1 clients to partake without needing to know about share names. Just a thought.

I think my preference for a topic prefix approach came from the sense it had little, if any, impact on client implementation and on-the-wire formats. We could get shared sub behaviour quite easily. As has been identified, that approach has its limitations - the question is whether those limitations make it not fit for purpose.

The share name approach is more flexible at the cost of implementation complexity and client churn.

I think I find myself back on the fence at this point.






Nick O'Leary
IBM Emerging Technology Services

Twitter: @knolleary

IBM United Kingdom Ltd registered in England and Wales with number 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants, PO6 3AU


Ken Borgendale --- Re: [mqtt] Recharter: Shared subscriptions and $share ---
From: "Ken Borgendale" <kwb@us.ibm.com>
To: mqtt@lists.oasis-open.org
Date: Fri, 8 Apr 2016 22:45
Subject: Re: [mqtt] Recharter: Shared subscriptions and $share




As a charter issue, whether or not $share is an attribute of a subscription or of a topic, we would be defining semantics of the topic space which is what the existing and proposed charter do not allow.

I agree with Andrew that making shared an attribute of the topic makes it a queue rather than a pub/sub topic. While queues do have some useful semantics, this would be a radical departure for MQTT and we would certainly need to make this an in-scope item if that is what we want to do.

It is perfectly reasonable for there to be multiple subscriptions to a topic, some shared and some not. There is not a significant semantic issue with this. The significant semantic issue comes only with overlapping shared and non-shared subscriptions by the same client. This is messy today (see MQTT-217) and shared subs is only likely to make it messier. We could solve this for shared subs by saying that a new shared subscription MUST be rejected if it overlaps an existing non-shared subscription and vice versa. We could make all overlapping subscriptions invalid, but that again would strain the compatibility clauses of the charter :

Changes to the input document should be compatible with implementations of previous versions of the standard such that it is possible for a client or server to implement multiple versions of the standard, allowing a client coded to an older version of the protocol to connect to and use a server that implements both the previous and current versions of the standard.


Specifying that either single or all subscriptions in a client must be honored in the case of overlapping subscriptions would most certainly break this.
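
For illustration only, here is a Python sketch of the rejection rule suggested above: a new subscription is refused if it overlaps an existing subscription of the other kind held by the same client. The function names are hypothetical, and the overlap test is a simplification that ignores the special treatment of topics beginning with $.

```python
from typing import Iterable, Tuple


def filters_overlap(a: str, b: str) -> bool:
    """True if some topic name could match both topic filters."""
    a_levels, b_levels = a.split("/"), b.split("/")
    for i in range(max(len(a_levels), len(b_levels))):
        la = a_levels[i] if i < len(a_levels) else None
        lb = b_levels[i] if i < len(b_levels) else None
        if la == "#" or lb == "#":
            return True  # '#' absorbs everything from here on
        if la is None or lb is None:
            return False  # one filter is longer, with no multi-level wildcard
        if la != "+" and lb != "+" and la != lb:
            return False  # two different literal levels
    return True


def may_subscribe(existing: Iterable[Tuple[str, bool]],
                  new_filter: str, new_shared: bool) -> bool:
    """existing holds (topic_filter, is_shared) pairs for one client."""
    for topic_filter, is_shared in existing:
        if is_shared != new_shared and filters_overlap(topic_filter, new_filter):
            return False
    return True
```

Overlaps between two subscriptions of the same kind are still accepted in this sketch, which is the part that keeps existing 3.1.1 behaviour untouched.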


Ken Borgendale -- kwb@us.ibm.com 1-207-805-6708 1-207-371-8082
Senior Programmer -- IBM MessageSight and Watson Internet of Things Connect - Architect



From: Andrew Schofield <andrew_schofield@uk.ibm.com>
To: mqtt@lists.oasis-open.org
Date: 04/08/2016 04:30 PM
Subject: Re: [mqtt] Recharter: Shared subscriptions and $share
Sent by: <mqtt@lists.oasis-open.org>





Hi Nick,

If I've understood your comment correctly, you are suggesting that a topic starting with the prefix "$shared" would actually treat all subscriptions as shared subscriptions. Unfortunately, I think that's a really bad idea. It's gone from N publishers->M subscribers to N publishers->1 subscription->M consumers. That's not pub/sub - it's just a queue with multiple consumers.

In a publish/subscribe system, ideally I want the publishers to be totally unaware of the number and availability of the subscribers. By having this rule for "$shared", if I want to use workload balancing among a set of consumers, I have to publish on a topic starting "$shared" and each message published can be consumed by at most one consumer. I can no longer dream up another purpose for the messages and add another subscription - it would just end up getting a small share of the existing shared subscription.

What I would seek to enable is exactly what you didn't like: I want subscribers that can take all of the messages to be able to have them all, and subscribers who cannot keep up with the message rate to be able to share the stream of messages with like-minded consumers.


So, I think sharing is a property of the subscription, not the topic. I prefer having a separate share name which is not part of the topic name.

Thanks,
Andrew

Andrew Schofield
Chief Architect, Hybrid Cloud Messaging
Senior Technical Staff Member
IBM Cloud Application Services

IBM United Kingdom Limited
Mail Point 211
Hursley Park
Winchester
Hampshire
SO21 2JN

Phone ext. 37248357 (External:+44-1962-818357), DE2J22
Internet mail: andrew_schofield@uk.ibm.com




From: Nicholas O'Leary/UK/IBM@IBMGB
To: mqtt@lists.oasis-open.org
Date: 08/04/2016 19:46
Subject: [mqtt] Recharter: Shared subscriptions and $share
Sent by: <mqtt@lists.oasis-open.org>





There are two fundamental approaches here.

1. The client chooses whether a subscription is shared or not
2. The ‘shared’ nature is a property of the topic itself


If the client gets to choose, then you need to define what happens if clients A and B subscribe to topic foo with the shared flag set, but client C subscribes without the shared flag set. I think trying to support that type of combination would overly complicate implementations. We could say that in that scenario client C’s sub is rejected, but it doesn’t feel like the right approach to me.

I think it is much cleaner for the shared nature to be a pre-existing property of the topic - in that way no changes are needed in client implementations at all, as demonstrated by the fact some broker implementations already support shared subs within the confines of 3.1.1.

The question is then whether it should be done by virtue of a well-known topic prefix - $shared, or have it as a property on the topic that can be administratively set on the broker.

The advantage of the latter is that any topic could be set as a shared topic - and would be within the constraints of our charter to not define topic spaces. The downside is it requires administrative action to use the feature - something that isn’t required anywhere else in the protocol. To deploy a new application that requires a shared sub would now require co-ordinated administrative action on the broker - that doesn’t feel lightweight to me and would fail the Just Works test that so much of MQTT benefits from.


That brings me to the conclusion that the cleanest solution is to define any topic that is prefixed with $shared as being a shared topic. We shouldn’t let the current text of the charter, which we’re currently redrafting anyway, stop us making the right technical choices for the protocol.


Nick O'Leary
IBM Emerging Technology Services

Twitter: @knolleary

IBM United Kingdom Ltd registered in England and Wales with number 741598 Registered office: PO Box 41, North Harbour, Portsmouth, Hants, PO6 3AU



Brian Raymor --- [mqtt] Recharter: Shared subscriptions and $share ---
From: "Brian Raymor" <Brian.Raymor@microsoft.com>
To: mqtt@lists.oasis-open.org
Date: Mon, 4 Apr 2016 16:25
Subject: [mqtt] Recharter: Shared subscriptions and $share







Forwarding to the mailing list for broader awareness and discussion.

The conversation is related to this out-of-scope item in the draft charter:


· The TC will not identify MQTT topics nor prescribe any mechanism or convention for classification of MQTT topics or topic spaces.

…Brian


From: Andrew Banks [mailto:andrew_banks@uk.ibm.com]
Sent: Friday, April 1, 2016 12:35 PM
To: Ken Borgendale <kwb@us.ibm.com>
Cc: Brian Raymor <Brian.Raymor@microsoft.com>
Subject: Re: MQTT shared subs and charter


Ken, using the $share topic prefix seems the more natural way to go to me.

To avoid using the $ prefix we would still have to encode the sharename into the subscribe packet, probably using the same metadata encoding method as for the Publish Metadata. The existence of the sharename should signal the shared subscription and so remove the need to use one of the QoS bits. However, the Subscribe packet can carry multiple subscriptions and SubAck can carry multiple outcomes so that is where it starts to look unnatural.

The $ prefix was reserved in the V3.1.1 specification for uses such as this, and it is being used in practice, so I think we should alter the charter to make its use allowed.



Andrew Banks
Telephone (44) 1962 816123




From: Ken Borgendale/Austin/IBM
To: Andrew Banks/UK/IBM@IBMGB
Date: 31/03/2016 22:18
Subject: MQTT shared subs and charter







The issue I have with the charter and shared subs is the prohibition in the charter of defining topics. I would prefer to keep this charter limitation, but it means that using $shared or any other $ topic to indicate shared subs would not be allowed. I think it makes much more sense to use one of the bits in the QoS byte of SUBSCRIBE for this purpose. This is especially true if we also implement nolocal in this way, as there needs to be an explanation of the intersection of shared subs and nolocal. The problem is that there is no corresponding bit on UNSUBSCRIBE, but perhaps we should have an options byte per UNSUBSCRIBE and on UNSUBACK. This makes them symmetrical with SUBSCRIBE and allows for a return status on individual UNSUBSCRIBEs.
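
As a sketch of the bit-flag idea: in 3.1.1 the byte that follows each Topic Filter in SUBSCRIBE carries the requested QoS in bits 0 and 1, and the upper six bits are reserved. The Python below assumes, purely for illustration, that bit 2 is chosen as the "shared" flag; the constant and function names are hypothetical and no bit has actually been assigned.

```python
from typing import Tuple

QOS_MASK = 0b0000_0011     # bits 0-1: requested QoS, as in 3.1.1
SHARED_FLAG = 0b0000_0100  # bit 2: hypothetical "shared subscription" flag


def encode_options(qos: int, shared: bool) -> int:
    """Build the per-filter options byte for SUBSCRIBE."""
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1 or 2")
    return qos | (SHARED_FLAG if shared else 0)


def decode_options(byte: int) -> Tuple[int, bool]:
    """Return (requested QoS, shared flag) from an options byte."""
    if byte & ~(QOS_MASK | SHARED_FLAG) & 0xFF:
        raise ValueError("reserved bits must be zero")
    qos = byte & QOS_MASK
    if qos == 3:
        raise ValueError("QoS 3 is not valid")
    return qos, bool(byte & SHARED_FLAG)
```

An equivalent options byte on UNSUBSCRIBE would let the same flag identify which kind of subscription is being removed.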

On the issue of shared subs, how you specify shared subs is the simple issue. Some of the semantics of shared subs are very messy especially any text around overlapping subs and shared subs.





Ken Borgendale -- kwb@us.ibm.com 1-207-805-6708 1-207-371-8082
Senior Programmer -- IBM MessageSight - Architect


Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

[attachment "img1.png" deleted by Peter Niblett/UK/IBM] [attachment "img2.png" deleted by Peter Niblett/UK/IBM] [attachment "img3.png" deleted by Peter Niblett/UK/IBM]


