Another connection issue thread with lots of research and will to help and test!

There is another thing which might help to narrow down the issue.

So, for example, if you send
{"id": 1, "method": "set_power", "params": ["on", "smooth", 500]}
the bulb will send {"method":"props","params":{"power":"on"}} to all open sockets.

If you send the same set_power command a second time, there will be no {"method":"props","params":{"power":"on"}} notification, as the light is already on, which is completely fine.

If you then switch the power off, there will be a {"method":"props","params":{"power":"off"}} notification again on all sockets.
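The behaviour described above can be written down as a toy model. This is only a sketch of what was observed from the outside, not the firmware's actual logic; the `ToyBulb` class and its "sockets" (plain lists) are made up for illustration.

```python
class ToyBulb:
    """Hypothetical model: notify every open socket only when state changes."""

    def __init__(self):
        self.power = "off"
        self.sockets = []  # each "socket" is just a list collecting notifications

    def set_power(self, value):
        if value == self.power:
            return  # state unchanged, so no notification goes out
        self.power = value
        note = {"method": "props", "params": {"power": value}}
        for sock in self.sockets:
            sock.append(note)


bulb = ToyBulb()
client_a, client_b = [], []
bulb.sockets = [client_a, client_b]

bulb.set_power("on")   # both clients are notified
bulb.set_power("on")   # already on: nobody is notified
bulb.set_power("off")  # both clients are notified again
```

If get_prop were accidentally gated by a similar "only when something changed" check, it would go quiet in the same way, which is the suspicion raised here.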

Maybe get_prop behaves like the other commands and somehow has some variable (state) saved in the firmware which gets stuck.
get_prop shouldn’t work on the same principle as the set commands; it should always answer without conditions. Maybe there’s a condition.

Hi, Tomas,

I tried your procedure but had no luck; the issue did not reproduce. The commands work just fine.

I’m now curious about how third-party software processes the messages received from the bulbs.
There are two types of messages (in JSON format):

  1. Response to requests. The bulbs send one response to each line of input from the clients
  2. Notifications of state change. Whenever the state of a bulb changes, the bulb broadcasts a notification to all connected clients.

I guess HA waits for the response from the bulbs to make sure they are ‘present’ or ‘available’. But please note that, after sending out the request, it should not expect the next message received to be the response; instead, it may need to go through several notifications until it reaches the response message.
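To illustrate the point, here is a minimal Python sketch of a client that skips over notifications until it finds the response to its own request. The function name `wait_for_response` and the response shape `{"id": ..., "result": [...]}` are assumptions for illustration; only the `props` notification format appears in this thread.

```python
import json


def wait_for_response(lines, request_id):
    """Return (response, notifications) for the request with the given id.

    `lines` is any iterable of JSON strings, one message per line.
    Notifications seen before the response are collected, not treated as errors.
    """
    notifications = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        if message.get("id") == request_id:
            return message, notifications
        if message.get("method") == "props":
            notifications.append(message)
    return None, notifications  # input ended without a matching response


# Two notifications arrive before the response (assumed response shape).
incoming = [
    '{"method":"props","params":{"power":"on"}}',
    '{"method":"props","params":{"power":"off"}}',
    '{"id":1,"result":["off"]}',
]
response, notifications = wait_for_response(incoming, request_id=1)
```

A client that instead treats the first line after its request as the response would misread the first notification here.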

This is just a wild guess. If anyone knows how HA interworks with the bulbs, it really would help a lot.

br,
Fei Liu

Hello,

There is no way the procedure does not work. I tested it with a total of 5 bulbs.
Maybe you are on a different firmware version?
I’m on 2.0.6_0030 connected to Frankfurt.

Here is a recording of the procedure.
https://www.youtube.com/watch?v=AFaTAUqau4k&feature=youtu.be
Make sure to open 5 simultaneous connections after you power on the bulb.
You need to hit the connection limit, in order to trigger the behaviour. So 5.

Make sure to use the get_prop command. Other commands work fine.

best regards,
Thomas

Hi, Thomas,

Very nice video! The key difference is that in my test, the telnet sessions were closed ‘gracefully’ by the client software, but in your video, the PuTTY windows were closed ‘abruptly’. Anyway, the issue can be reproduced in the lab now. A patch will be ready soon. Please hold on.

br,
Fei Liu

Hello,

awesome.

I didn’t answer you on the inner workings of Home Assistant because I wanted the issue with the firmware itself to get addressed first.

So in Home Assistant there are 2 ways in which it gets data from the bulb.

  1. It sends a request with an ID and waits for an answer; there is a 5-second timeout.
  2. It gets notifications of state changes from the open socket and processes them.

The first thing Home Assistant does when it starts up is the following. As long as availablevariable is false, it will retry the entire procedure.

I’ll try to write it in pseudocode.

availablevariable = FALSE (bulb will show as unavailable)
IF socket OPEN THEN
{
    IF send_command('get_prop') = TRUE THEN
    {
        IF get_response = TRUE within 5 seconds THEN
        {
            availablevariable = TRUE
        }
        ELSE
        {
            close the socket
        }
    }
    ELSE
    {
        close the socket
    }
}

So this means that if the socket is open, it tries to send the command; if the command is sent successfully, it checks for an answer. If either send_command or get_response is not true, it will close the socket and reattempt later. It always starts with availablevariable = false.
I do not really see a problem with this approach.
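For anyone who prefers real code, the same procedure might look roughly like this in Python. This is a sketch of the logic described above, not the actual Home Assistant or Python Yeelight library code; the function name `check_available`, the `["power"]` parameter and the `\r\n` line terminator are assumptions.

```python
import json
import socket
import time


def check_available(sock, request_id=1, timeout=5.0):
    """Send get_prop and wait up to `timeout` seconds for the matching response.

    Returns True if a response with the same id arrives in time.
    Otherwise closes the socket and returns False, so the caller can retry later.
    """
    request = {"id": request_id, "method": "get_prop", "params": ["power"]}
    deadline = time.monotonic() + timeout
    buffer = b""
    try:
        sock.settimeout(timeout)
        sock.sendall((json.dumps(request) + "\r\n").encode())
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # overall timeout reached
            sock.settimeout(remaining)
            chunk = sock.recv(4096)
            if not chunk:
                break  # peer closed the connection
            buffer += chunk
            while b"\n" in buffer:
                line, buffer = buffer.split(b"\n", 1)
                if not line.strip():
                    continue
                message = json.loads(line)
                if message.get("id") == request_id:
                    return True  # notifications before this are simply skipped
    except (OSError, ValueError):
        pass  # send failed, receive timed out, or the reply was not valid JSON
    sock.close()
    return False
```

The deadline covers the whole wait, so a stream of notifications cannot keep extending the 5 seconds.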

The other thing, regarding state changes: it uses the open socket (so only 1 socket per bulb) to send the different set commands (such as set_power), and then the bulb notifies all open sockets about the change.
So when I change the state on Device A, Device B will show the change. This behaviour of Home Assistant is as expected per the Yeelight documentation.

So, long story short: the Home Assistant Yeelight plugin respects the documentation provided by Yeelight for the bulb. It is also not spamming new sockets.

best regards,
Thomas B

Hi, Thomas,

We have located the root cause of the issue with get_prop. I wonder if you would prefer a beta version, or would rather wait for a ‘release’ version, which would take several more weeks.

The HA process looks impeccable, though there is still one thing I want to be sure about. Does get_response ignore all notifications and wait until a response is received? Or, if a notification comes before the expected response, will HA deem it a fault?

br,
Fei Liu

Hello,

Thanks for the quick answers.
I can imagine that it needs intensive testing before being released to that many devices around the globe. I would be more than happy to receive the update and test it on my side as well. My MI ID is: 6382797779

Regarding the thing you want to be sure, I’m more than happy to answer that.
So HA starts an asynchronous request.
When it sends out get_prop, it sends it with an ID and expects an answer with the same ID. If other notifications come in in the meantime, it will just process them without affecting this request.
So, no. It will not deem it a fault.
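A rough Python sketch of that idea, using asyncio: responses are matched to pending requests by ID, and everything else is handled as a notification. The `MessageRouter` class is hypothetical and only illustrates the principle; it is not taken from the HA source, and the response shape is an assumption.

```python
import asyncio
import json


class MessageRouter:
    """Hypothetical router: match responses to requests by id."""

    def __init__(self):
        self.pending = {}        # request id -> future awaiting the response
        self.notifications = []  # messages that are not responses

    def expect(self, request_id):
        future = asyncio.get_running_loop().create_future()
        self.pending[request_id] = future
        return future

    def feed(self, line):
        message = json.loads(line)
        future = self.pending.pop(message.get("id"), None)
        if future is not None:
            future.set_result(message)  # the response we were waiting for
        else:
            self.notifications.append(message)  # processed independently


async def demo():
    router = MessageRouter()
    reply = router.expect(7)
    # A notification arrives first; it does not disturb the pending request.
    router.feed('{"method":"props","params":{"power":"off"}}')
    router.feed('{"id":7,"result":["off"]}')
    response = await asyncio.wait_for(reply, timeout=5)
    return response, router.notifications
```

Calling `asyncio.run(demo())` returns the response with ID 7 plus the one notification that arrived before it.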


Given that we have such good communication, there is one more issue with the firmware which should be addressed, which I just remembered and which you can replicate easily.

Just unplug the Wi-Fi router and plug it back in. The bulb will not reconnect until the next physical power off and on.
The real-world scenario is:
There is an electrical outage. When power returns, the bulb firmware will load in maybe 3 seconds, whereas most ISP home routers will take at least 1 minute.

If I’m not mistaken, you are using an ESP8266 or something similar in the bulb.
You could simply check for Wi-Fi and reconnect in the loop(){} function. That’s what I am doing with all my custom sensors and devices. It might be resource intensive to do it on every loop, but you could, for example, put it on a timer, like once every minute or something like this.
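The timing logic I mean is roughly the following. On the bulb it would of course be C/C++ inside loop(); I am writing it in Python only to show the idea, and `ReconnectTimer`, `is_connected` and `reconnect` are made-up names standing in for whatever the firmware actually provides.

```python
import time


class ReconnectTimer:
    """Check the connection at most once per `interval` seconds."""

    def __init__(self, is_connected, reconnect, interval=60.0):
        self.is_connected = is_connected  # callable returning True/False
        self.reconnect = reconnect        # callable that starts a reconnect
        self.interval = interval
        self._next_check = 0.0

    def poll(self, now):
        """Call on every loop pass; returns True if a reconnect was attempted."""
        if now < self._next_check:
            return False  # too early, skip the (possibly expensive) check
        self._next_check = now + self.interval
        if self.is_connected():
            return False
        self.reconnect()
        return True


# In the main loop one would call: timer.poll(time.monotonic())
```

This way the check costs almost nothing on most loop passes and still recovers within about a minute once the router is back.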

Maybe there actually is a reconnect function and I didn’t wait long enough? Not sure.

It is not a really, really big issue for me, but imagine using only Yeelight bulbs in a big project. Let’s say you have a hotel. Imagine having to go around the building switching all bulbs off and on. You could solve it by mounting “smart wall switches” and firing an off/on procedure if the bulbs are not responsive, just like @com_wolf mentioned above, but this would be an additional investment (again, imagine a hotel; it can get very expensive) for something which could maybe be solved by a few lines of code.

best regards,
Thomas

OK. It could be related to some AutoIP functions. One thing I’m sure of: there ARE scanning and reconnect procedures before the bulbs finally connect to Wi-Fi. But if the DHCP server is not present (soon enough), AutoIP will come into play. I’m not sure though. I need to confirm with the team after going to the office tomorrow (not sure what time it is for you). Be back soon.

Sure.
Thanks! Appreciated.

Hey @liufei,

I’m posting so late, as I didn’t want to interrupt the conversation between you and @tomb92!
And thanks for the reply, really appreciated!!

And thanks for looking into the internet issue, that is awesome! Let me know if I can provide any more details, like about my network or anything!

About your HA questions, I think @tomb92 answered a lot already. I’m also not an expert on how HA works, but I’m pretty proficient in Python, so I do understand most of the things it does. So I’ll try to respond:

Does the HA flood a bunch of requests to the bulb by any chance before it complained connect loss?

No, HA only does one request to each bulb every 30 seconds (by default), which is get_properties(). And that’s exactly the point where it throws the error message I posted

Or does HA have any tuning of TCP layer, like changing the keepalive parameters?

As far as I can see, no. The HA plugin definitely does not; maybe the Python Yeelight lib does, but I couldn’t spot that there either.

Does HA read from the socket from time to time even when it’s not sending any requests? It could cause congestion in the bulbs sending queue which in time affects all traffic when bulbs’ (very limited) memory runs out.

As far as I can understand, yes. I am not 100% sure how it handles the socket connections as in: I don’t 100% understand if it keeps the socket open forever or recreates it at certain events. But I think @tomb92 already responded to that.

I had a quick chat with @tomb92 and he also thinks my issue could be the get_properties issue he mentioned.
Would it be possible for me to also get this fixed build? As I mentioned, I do have quite a few different lamps I could test it on! Not only the bulbs, but also different ceiling lights and the Mi bedside lamps that throw this error every few minutes.
My MI Account ID: 1890771080

Thanks and greetings,

Andy!

Hi, Andy,

Thanks for the reply, and the waiting, very considerate :slight_smile: .

I think we have located the cause of the issue ‘Constant flapping state in Home Assistant’ when internet access is banned, and are working on a fix (should be ready within one day or two). Please provide a MI ID so that we can whitelist it, if you also prefer an early beta version. Thanks.

br,
Fei Liu

Thx @liufei! Really looking forward to that!! :slight_smile: I already sent my Mi account ID in the last post:

Greetings,

Andy!

Sorry I missed that…

No worries <3

@liufei, @ezcGman, @tomb92 In the other thread (Why is there no support?), after the DHCP discussion, @_guofeng was so kind as to provide me with the updated firmware for the lights, and for the last 28 hours my system recorded only one case where a single light dropped the connection. That is huge progress compared to the several drops an hour for each of the lights previously.
I do not have such Python knowledge, but it was very interesting to read the discussion.

Good afternoon everyone.
Thank you all for the great research. I also have this issue and, if possible, I would also like to test the beta firmware. I also made a thread about this today before I saw this post. (connection issues Home Assistant after update 2.0.6_0030 - Uncategorized - Yeelight Forum)

My MI ID: 6317188532
Yeelight 1S
My bulb model is: Yeelink.light.color4

Hi all,

I have been following this issue closely.

@liufei - Could you kindly provide me access to the beta firmware? I am running 5 Yeelight Color 2 bulbs with HA.

My info is

Mi Account ID: 1906383297
Hardware: 5 Yeelight Color 2 bulbs

Best,
Nick

Hi everyone
thank you so much for your impressive work on this topic.
I have the same problem, and I would like to test the beta firmware.
I’d test on the light color4 bulb
My ID 1866952536
Can you provide me access to the beta firmware test?
Thanks

There’s no fix for color2 yet, sorry about that, because it is based on different Wi-Fi chips from those of bslamp2, color4, etc. We have not been able to reproduce the issue in the lab for color2 yet, but we will keep looking into it. Please provide as many clues as possible so that we can understand what’s going on.

Hi, everyone. We’re working on the beta version… Will let you know when it’s ready…
Bad news: color4 will be delayed due to a non-technical issue. It won’t be long, 1 week or so.
Other models are on schedule.