Set env.MSB_BROKER_RECONNECT to false by default #200

shumsky · 2017-07-26T10:27:04Z

Hello,

amqp-coffee library which the AMQP adapter is based on provides a feature of connection retry. MSB enables that feature by default:

  config.amqp = {
    // ...
    reconnect: env.MSB_BROKER_RECONNECT != 'false',
    // ...
  };

The drawback of that feature is that the retry is done silently without logging an error or emitting any events. If a microservice which consumes a queue via MSB is misconfigured and fails to establish a connection with the borker, it will retry forever without giving any signs of the problem.

I propose to disable connection retry in MSB by default, namely set MSB_BROKER_RECONNECT to false. So that the microservice would terminate with an error printed to stderr once the connection fails to be established or is lost. In that case the problem would be immediately detected.

When the application crashes, PM2 will try to restart the it with limited number of attempts. If the connection problem is temporary and may be gone in a moment, the microservice could be healed after a restart. If the connection problem is permanent, PM2 will exhaust the number of attempts and set the microservice to the errored state.

What do you think about that?

The text was updated successfully, but these errors were encountered:

agritsik · 2017-07-26T10:36:32Z

Consider this as a minor fix for the 1.x version. Still backward compatible but brings a new behavior in case of connectivity issue.

The same fix should be added to 2.x version as well.

agritsik · 2017-07-26T12:36:04Z

duplicates #56

Kumjami · 2017-07-27T07:27:18Z

Hi @shumsky,

I agree that an app/module doing things silently without notifying is a very annoying behaviour.
By deactivating this silent retry you are delegating retry responsibility to process level (PM2 in that case).
Not sure if retry is a required feature for MSB. Actually, wasn't it one of the main reasons to pick amqp-coffe on the first place?
If amqp-coffe retry functionality is not something pleasant to work with, I would rather go to use amqplib which is the official package recommended by RabbitMQ javascript examples and implement retry logic by ourselves if that is a requirement.

Maybe I'm going a little bit out of the scope. To stuck with the initial issue: I would disable retry if that is working silently by default and try to document really well that.
For a further step I would plan a way to fix this silently behaviour.

shumsky · 2017-07-27T08:29:45Z

Hi @Kumjami,

To me, MSB should have retry, whether it's provided by the underlying AMQP library or implemented by ourselves on the adapter level. But it looks like the current solution brings more harm than good.

So maybe it's worth to disable it by default (BTW, it is disabled by default in the amqp-coffee itself) and create an issue for implementing a better solution which could then become the default behavior. As you suggested, we could also consider replacement of the amqp-coffee with the amqplib within the scope of the new solution.

I like the way the retry is done in the RabbitMQ Java client. msb-java is aligned with it. We could consider implementing it in MSB for Node.

Kumjami · 2017-07-27T12:48:23Z

@shumsky I completely agree.

shumsky · 2017-07-31T10:17:27Z

I compared default behaviour of three AMQP libraries in a few different scenarios of connection loss:

Library / Scenrario	Consumption on startup	Consumption while running	Publish
amqp-coffee with RECONNECT=true	silent retry	silent retry	silent retry
amqp-coffee with RECONNECT=false	stop	stop	stop
amqplib for Node	stop	stop	stop
amqp-client for Java	stop	retry with logging; recovery of consumers and bindings	throw exception

The worst behaviour is amqp-coffee with RECONNECT=true, where silent retry is applied. And a good behaviour, to me, is amqp-client for Java, which we could consider implementing in MSB and which is also an official RabbitMQ client for Java.

Summing it up. Due to amqp-coffee API being not fluent enough, it is hard to adjust its retry mechanism without workarounds in MSB or microservices. We can do next sequence of steps:

Disable internal retry in amqp-coffee, i.e. set RECONNECT=false (align with amqplib for Node behaviour)
Since we are now aligned with amqplib behaviour, we can consider switching to it in the MSB AMQP adapter. amqplib compared to amqp-coffee looks more mature and stable
Implement a retry mechanism on MSB adapter level (align MSB with amqp-client for Java or another good strategy)

The first step is the solution I initially proposed in this issue. Once we encounter a connection loss, we would let the microservice die and delegate recovery to the process supervisor (e.g. PM2). The same approach is also advocated by the creator of the amqplib for Node: amqp-node/amqplib#25 (comment)

Kumjami · 2017-08-14T07:57:29Z

Hi @shumsky, great report.

So I understand that official amqp libraries has different behaviours in Java and Node. This is something a little bit confusing.

In my opinion we should try to have the logic a the same place for different implementations. So, if in java the retry is at amqp lib level, in Node it should be also at that level, and not in MSB logic itself.
If retry logic is at MSB level, then it should be implemented in Java and Node versions. Having different logic is a little bit strange for me.

So maybe it would be a possibility to open a PR to add retry and logging at amqplib or to add the logging while retrying at amqp-coffee. Do you think that could be possible?

shumsky · 2017-08-17T09:38:21Z

@Kumjami Agree that the retry approach should be aligned in Java and Node versions of MSB. I think opening a PR to either amqplib or amqp-coffee would be the right way. Though it might take time. The maintainer of the amqp-coffee hasn't been responsive for last few weeks. And the maintainer of amqplib seems to have concerns regarding retry functionality as a corresponding issue has been unresolved since 2013.

miguel-cotta · 2017-08-29T07:45:28Z

Hi all,
just wanted to add some other points regarding the amqplib vs amqp-coffee debate. Newer versions (>= 2.0.0) of the New Relic nodejs agent support amqplib:

https://docs.newrelic.com/docs/agents/nodejs-agent/supported-features/message-queues

agritsik · 2017-08-29T08:04:11Z

Thanks for the update. It is certainly one more good point for migrating to amqplib.

agritsik · 2017-08-29T08:22:26Z

All details regarding migration to amqplib will be described there #204

agritsik assigned agritsik and unassigned agritsik Jul 26, 2017

agritsik added the minor/feature label Jul 26, 2017

shumsky mentioned this issue Jul 31, 2017

Set env.MSB_BROKER_RECONNECT to false by default #201

Merged

agritsik mentioned this issue Aug 29, 2017

Migrate to amqplib #204

Open

agritsik closed this as completed Aug 29, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set env.MSB_BROKER_RECONNECT to false by default #200

Set env.MSB_BROKER_RECONNECT to false by default #200

shumsky commented Jul 26, 2017

agritsik commented Jul 26, 2017 •

edited

Loading

agritsik commented Jul 26, 2017

Kumjami commented Jul 27, 2017

shumsky commented Jul 27, 2017

Kumjami commented Jul 27, 2017

shumsky commented Jul 31, 2017

Kumjami commented Aug 14, 2017

shumsky commented Aug 17, 2017 •

edited

Loading

miguel-cotta commented Aug 29, 2017

agritsik commented Aug 29, 2017

agritsik commented Aug 29, 2017

Set env.MSB_BROKER_RECONNECT to false by default #200

Set env.MSB_BROKER_RECONNECT to false by default #200

Comments

shumsky commented Jul 26, 2017

agritsik commented Jul 26, 2017 • edited Loading

agritsik commented Jul 26, 2017

Kumjami commented Jul 27, 2017

shumsky commented Jul 27, 2017

Kumjami commented Jul 27, 2017

shumsky commented Jul 31, 2017

Kumjami commented Aug 14, 2017

shumsky commented Aug 17, 2017 • edited Loading

miguel-cotta commented Aug 29, 2017

agritsik commented Aug 29, 2017

agritsik commented Aug 29, 2017

agritsik commented Jul 26, 2017 •

edited

Loading

shumsky commented Aug 17, 2017 •

edited

Loading