Yeelight color bulbs offline

Hi again,

It’s a TP-Link TL-MR6400.

It’s been 3 days now without any issues. I even tried turning the power to the bulbs off and on again, and they still work. Earlier, when my kids turned the power off, the bulbs went offline and I had to re-pair them.

I just checked the product spec for that router; it’s a 4G LTE router. Did you use the 4G SIM slot or the WAN port for the uplink?

The 4G SIM slot, as I’m using 4G for internet.

I have this issue with a TP-Link Archer C7 v2. It’s been this way for as long as I have had Yeelight bulbs and strips, and all are affected. I have 35 devices, and over time they randomly start to “fall off” the network and require a power off and power on. I have waited about a year for the Yeelight devs to pick up and fix this issue, and it’s a bit crushing to see that you’re still asking people what servers they are connected to when it has been explained in every discussion that it also affects LAN control. It’s your device: I have easily 40 other smart devices in my home, from Xiaomi smart plugs to Google Homes, Broadlink devices, Arduinos, Smart Life bulbs and other LED controllers, none of which are affected.

There’s not going to be any more Yeelight devices on my network until this is fixed. There is no point having smart home technology that I have to manually walk around powering off and on to get it back online. This is one of so many threads on this problem and still nothing seems to be done; I think all hope is lost. I will start returning Yeelight devices as faulty because I don’t see any chance of a resolution. Yeelight devs will just keep asking the same questions, or wait until people replace their routers and then suggest that was the problem all along.

Hi,

We fixed some connectivity issues that occur when LAN control mode is enabled. The patch has been released with version 1.4.2_0070. Please update to this version and see if it works for you. Thanks.

By the way, what’s the status of the bulb when it goes offline? Is it on or off, in color mode or color temperature mode, and the brightness?

br,
Liu Fei

Hi Liu Fei,

I have a lightstrip which has been offline for about a week now, so I can give you the exact status.

When a device goes offline, it is always stuck at its current setting. In my case it is stuck off; however, I have had many lights get stuck on at their current brightness and colour, which varies from room to room. There is no specific brightness or RGB colour that causes my bulbs to get stuck.

Here is my Homeassistant view of the two light strips mounted behind my TV:

image

As you can see, the upper strip is unavailable and the lower strip is available and on. Both of these are plugged into the same power outlet. I try pinging the available strip:

PING xmstrip3.lan (192.168.16.87) 56(84) bytes of data.
64 bytes from xmstrip3.lan (192.168.16.87): icmp_seq=1 ttl=255 time=1.52 ms
64 bytes from xmstrip3.lan (192.168.16.87): icmp_seq=2 ttl=255 time=5.22 ms
64 bytes from xmstrip3.lan (192.168.16.87): icmp_seq=3 ttl=255 time=11.0 ms

I try pinging the unavailable strip:

PING xmstrip4.lan (192.168.16.88) 56(84) bytes of data.
^C
--- xmstrip4.lan ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5093ms
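For anyone who wants to run this check across many bulbs, here is a minimal sketch that reads the summary line ping prints. It assumes iputils-style output like the lines above; the function names `parse_ping_summary` and `is_offline` are made up for illustration.

```python
import re

# Matches the summary line printed by iputils ping, for example:
# "6 packets transmitted, 0 received, 100% packet loss, time 5093ms"
SUMMARY_RE = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) received,"
    r".*?(?P<loss>\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(output: str):
    """Return (sent, received, loss_percent), or None if no summary line is found."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match["sent"]), int(match["received"]), float(match["loss"])


def is_offline(output: str) -> bool:
    """Treat a host as offline only when a summary exists and nothing came back."""
    summary = parse_ping_summary(output)
    return summary is not None and summary[1] == 0
```

Feeding it the captured text of a ping run (for example from `subprocess.run(..., capture_output=True, text=True).stdout`) gives a yes/no answer per device without reading the output by eye.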

The status in Homeassistant shows:


xmstrip4 was in the exact same state as xmstrip3 before it became unavailable. All of the values would be identical as they are using the same scene.

If I power off and power on the strip (which means powering down the whole TV) you can see that it will become available again:

PING xmstrip4.lan (192.168.16.88) 56(84) bytes of data.
64 bytes from xmstrip4.lan (192.168.16.88): icmp_seq=1 ttl=255 time=2.97 ms
64 bytes from xmstrip4.lan (192.168.16.88): icmp_seq=2 ttl=255 time=8.43 ms
64 bytes from xmstrip4.lan (192.168.16.88): icmp_seq=3 ttl=255 time=2.56 ms

However in the next few weeks another few bulbs and strips will go offline and I will have to manually reset them like this. By the way, I noticed the blue light was flashing on the controller when I climbed up to power down the light strips at the back of the TV.

Hi Liu Fei,

Any more information you need from me?

@liufei

OK, I have a bulb that has stopped responding today. I am going to give you any info you need to diagnose why. The first thing I want to show you is below: I have 6 Yeelights in a single room, only metres from each other, and only 1 of them has stopped working. This is very normal (and frustrating) and will continue to happen; if I left it long enough they would all end up like this. Obviously it’s the one that is still lit, which should have been turned off by a timer when motion stopped, but it is now unreachable:

In the app, as you can see, it shows offline:

It does not respond to pings:

6 packets transmitted, 0 received, 100% packet loss, time 5126ms

I check for any instances of its MAC address attached to my router, it is not attached currently. I see all my other bulbs but not this one.

Thankfully my router logs via syslog to a server on my network. I pull up all of the logs and I see:

The bulb with that MAC address was performing group key handshakes every 10 minutes (please give me an email address and I can send the full logs to you) until the bulb itself seems to disconnect:

Dec 9 06:01:15 xxxx hostapd: wlan1-2: STA xx:xx:xx:xx:66:fa WPA: group key handshake completed (RSN)
Dec 9 06:11:15 xxxx hostapd: wlan1-2: STA xx:xx:xx:xx:66:fa WPA: group key handshake completed (RSN)
Dec 9 06:21:15 xxxx hostapd: wlan1-2: STA xx:xx:xx:xx:66:fa WPA: group key handshake completed (RSN)
Dec 9 06:31:19 xxxx hostapd: wlan1-2: AP-STA-DISCONNECTED xx:xx:xx:xx:66:fa
Dec 9 06:31:24 xxxx hostapd: wlan1-2: STA xx:xx:xx:xx:66:fa IEEE 802.11: deauthenticated due to local deauth request

It is not heard from again after this point. Does this not show the bulb disconnecting from the wifi network, and why would it do that? None of the other bulbs surrounding it are affected.
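To pull the same picture out of a full day of rotated logs, something like the sketch below could help. It assumes the syslog line layout shown above (timestamp, host, `hostapd:`, interface, message). The helper names `events_for` and `handshake_gaps` are hypothetical, and the `year` argument is an assumption, since syslog timestamps carry no year.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt above:
# "<Mon> <day> <HH:MM:SS> <host> hostapd: <iface>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+hostapd:\s+\S+:\s+(?P<msg>.*)$"
)


def events_for(lines, mac, year=2000):
    """Return [(datetime, message)] for hostapd lines that mention the given MAC.

    The year is a placeholder; it only matters when logs span a year boundary.
    """
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or mac.lower() not in match["msg"].lower():
            continue
        stamp = " ".join(match["ts"].split())  # collapse padded day fields
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, match["msg"]))
    return events


def handshake_gaps(lines, mac, year=2000):
    """Seconds between consecutive 'group key handshake completed' events."""
    times = [
        when
        for when, msg in events_for(lines, mac, year)
        if "group key handshake completed" in msg
    ]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
```

On the five lines quoted above this reports two 600-second gaps, and the last event for the MAC is the deauthentication message. That makes it easy to see whether a disconnect lines up with the rekey schedule.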

OK so this explains the behavior I have been following over at the Home Assistant forums which was originally thought to be because of Music Mode being on. Turned it off but my lights still go unreachable from time to time. Following

Hi,
I read your PM and noticed that several devices show very poor signal strength (<= -70). Please check whether they are the devices that go offline frequently.

We have observed similar issues where low RSSI causes devices to go offline and stay offline. This can be caused by a long distance from the router, by surroundings that weaken the signal, or sometimes by a hardware defect.

If that’s the case, please move the devices nearer to the router and check the RSSI again. If the RSSI stays low, the hardware is probably defective. If the RSSI looks good in the new location, check whether the offline issue is gone.

Overheating can also drive the device offline. This only occurs on devices that have bad temperature sensors installed, and only when they are turned on. In this case, turn the light off and see if the issue goes away.

@liufei
I don’t think that is a fair representation of the results I sent you.

Of the results I sent:

5 were -40 to -49 dB (classed as excellent)
9 were -50 to -59 dB (classed as good)
12 were -60 to -69 dB (classed as fair)
3 were -70 dB or lower (classed as weak)

The reason for this is that I previously had 2 routers serving the house and turned one off to debug this problem that you keep blaming on your customers’ routers.

What I will do now is turn the second one back on, which will push up the signal strength to the other half of the house. I will send you the updated screenshots and I will have you either investigate this problem once and for all or refund my money for this defective product.

I am 100% sure that RSSI is not the issue here. I am sure of that because the devices which have failed in the last 2 weeks are nowhere near the lowest RSSI of the devices I showed you; those are on the other side of the house.

Also with your comment here:

Overheating can also drive the device offline. This only occurs on devices that have bad temperature sensors installed, and only when they are turned on. In this case, turn the light off and see if the issue goes away.

I can’t do anything once the wifi connection is lost; I have no control over the bulb. Of course turning it off and on fixes the problem in every case, but that doesn’t mean it had overheated and needed to cool down first. I don’t let it sit before I power it back on. I just know by now that there’s a fault in your product which causes it to disconnect from the wifi after a couple of weeks of continuous operation, and that if I power it off and straight back on again it will reconnect.

I also get the best experience with your product when there is a power outage, since it powers off all of the devices at the same time, so I know I won’t have to reset a light for another 2-3 weeks. If there’s no power outage, I know it will be a light every few days. So there is a review of your product: “works best after a power outage, before it goes back to being buggy, with no fix after almost 2 years”.

@liufei I also see nothing addressing the router logs that I showed you above. I will reset that device now and give you its particular RSSI value and then you can explain to me the log entries I shared above which shows the bulb disconnecting.

You also haven’t shared an email address with me so I can send you my entire unredacted and unfiltered set of router logs for the last 8 days to help in your analysis.

Edit: Sent you a PM with a screenshot. The RSSI value of the bulb after the reboot, and before I turn the second router back on, is -52 dB. This is considered a good value (https://www.netspotapp.com/what-is-rssi-level.html), only 2 dB away from excellent.
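For reference, the bands used in the breakdown earlier in this post can be written down as a small sketch. The thresholds are taken from that list, not from any Yeelight specification, and the function names are made up for illustration.

```python
from collections import Counter


def classify_rssi(rssi: int) -> str:
    """Map an RSSI reading to the bands used in this thread.

    Readings stronger than -40 are also treated as "excellent" (an assumption).
    """
    if rssi >= -49:
        return "excellent"
    if rssi >= -59:
        return "good"
    if rssi >= -69:
        return "fair"
    return "weak"


def summarize(readings) -> Counter:
    """Count how many readings fall into each band."""
    return Counter(classify_rssi(r) for r in readings)
```

With these bands, `classify_rssi(-52)` returns `"good"`, which matches the reading reported for the bulb that failed.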

Why did this bulb just stop responding the day before yesterday and why did I need to turn it back on again today?

Also, finally, why are these other devices not affected:

I have never had to reset any of these devices. Even if the RSSI level is low for a couple of the bulbs you saw, it does not explain the following:

  • Why a low-data-rate device like a Yeelight even cares
  • Why a lossy wifi signal would cause a bulb to disconnect from the network and never reconnect until it is powered off and on
  • Why bulbs with perfectly good RSSI values are affected
  • Why a reconnect interval can’t simply be coded into your bulb firmware so that it at least tries to renegotiate with the router once it loses the connection

Can you explain the above for me?
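To illustrate the kind of reconnect interval meant in the last point, here is a minimal sketch of a retry loop with exponential backoff. It is only an illustration of the idea and not how the bulb firmware works. `connect` is a hypothetical callable standing in for whatever rejoins the network, and the delay values are arbitrary assumptions.

```python
import time
from typing import Callable


def reconnect_with_backoff(
    connect: Callable[[], bool],
    *,
    initial_delay: float = 1.0,
    max_delay: float = 60.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call connect() until it succeeds, doubling the wait after each failure.

    Returns the attempt number that succeeded, or 0 if every attempt failed.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        if connect():
            return attempt
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the wait between attempts
    return 0
```

A device running a loop like this after a deauthentication would at least keep trying to rejoin instead of waiting for a power cycle.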

@ngardiner I googled the key phrase ‘deauthentication due to local deauth request’ and found the link below:
https://www.raspberrypi.org/forums/viewtopic.php?t=191287
which says “This problem is related the config option “wpa_group_rekey”, because reducing the value to 600 (10 minutes) make the issue appear after 1 hour. Starting hostapd directly shows the following not really helpful”.
This seems to match the log you captured on the router, because the disconnect happens at the point where the fourth ‘group key handshake’ would have taken place.
Please check, if possible, whether this setting can be increased on your router, to see if that fixes the issue. We will do some local testing with this setting to find out more.

This is not true. There have been hundreds of group key handshake events, I just showed a subset from a single log file, the logs rotate daily. I offered twice now to send them to you for analysis if you just give me a way to do so privately.

I have changed this parameter as you have asked me to do, I will see if it has any effect, however either way I still feel I am making an AP side change due to a device side bug, and you have no documentation to explain why this is or what others should set on their side to avoid triggering the same bug.

Hi,

Thanks for helping to test this parameter out. We have modified this setting on our router and set up a test bed to collect logs on the device side, hoping to capture the offline event. The first step in narrowing down the root cause of such an issue is to find the pattern and reproduce it. The next step is to fix it on the device side.

If the setting has nothing to do with this issue, we would try purchasing a C7 router.

My email address is liufei@yeelight.com. Sorry for missing your request twice…

@liufei

I have disabled this feature (group key handshake) and I have some preliminary findings to share.

What I would like to ask you first are 2 questions:

  1. Are any of the devices in your test lab using hostapd for the AP functionality? (I would think this is quite common)
  2. If so, can you please check the hostapd.conf file and confirm the setting of the following variable, which I have changed on my routers:

wpa_group_rekey=0

Please note the following description from hostapd.conf documentation:

Time interval for rekeying GTK (broadcast/multicast encryption keys) in
seconds. (dot11RSNAConfigGroupRekeyTime)
This defaults to 86400 seconds (once per day) when using CCMP/GCMP as the
group cipher and 600 seconds (once per 10 minutes) when using TKIP as the
group cipher.
wpa_group_rekey=86400

Note that after turning this off, rekey events still occur when a device leaves the wifi network:

Rekey GTK when any STA that possesses the current GTK is leaving the BSS.
(dot11RSNAConfigGroupRekeyStrict)
wpa_strict_rekey=1

So, this setting only affects the behaviour of the scheduled rekeying and not the strict rekeying when a device leaves the network.
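For anyone comparing routers, a small sketch like the one below can report what a given hostapd.conf says about scheduled rekeying. It only reads plain `key=value` lines, and the function names are hypothetical. Treating `0` as "scheduled rekeying off" follows the behaviour described in this post.

```python
def parse_hostapd_conf(text: str) -> dict:
    """Collect key=value pairs, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def describe_group_rekey(settings: dict) -> str:
    """Summarise the scheduled GTK rekey setting in plain words."""
    value = settings.get("wpa_group_rekey")
    if value is None:
        return "wpa_group_rekey not set; the hostapd default applies"
    seconds = int(value)
    if seconds == 0:
        return "scheduled GTK rekeying disabled"
    return f"GTK rekeyed every {seconds} seconds"
```

Running `describe_group_rekey(parse_hostapd_conf(open(path).read()))` on each AP is a quick way to confirm that both routers carry the same setting.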

My findings so far:

  • It is not possible in such a short timeframe to tell whether the problem is solved, however
  • My Yeelights are noticeably more responsive now. In the past they have required several events to turn them on; they now seem to come on immediately

That said, I want to make it clear that:

  • I don’t consider this fixed. I have not confirmed that it will not happen again, and it could simply be a placebo effect
  • I still don’t understand why I need to change a hostapd default security setting to stop a behaviour in a single device from taking that device offline and never reconnecting. Note that, as I have said ad nauseam, no other device on my network suffers from this issue
  • I will correct any attempt to call this a router or customer side bug until it is proven to be so
  • I still expect Yeelight to fix this or come up with appropriate guidance for other customers who cannot tune this setting on their side (or in the event that this setting causes a significant security risk, which I am not fully sure of yet), as hostapd is going to be a widely used platform

Just some other food for thought. Searching on this topic brings me to a few very similar stories:

https://forums.whirlpool.net.au/archive/1089217

I think a rekey value of zero actually prevents the AP from changing keys at all. This is bad because it reduces the security. It would prevent drop outs from wireless cards that aren’t properly implementing TKIP / AES though…

Also one of the simplest questions:

  • Group key is used for broadcast and multicast only. Unicast traffic is not affected. Why is this key causing a full loss of wifi connectivity?

Same problem: the bulb is offline or not listed.

Please fix the problem.

Edit: Yeelink-Light-color2_miapbf69

Six hours ago I had no problems, but now I cannot control it.

  1. Resetting device (Success)
  2. Connection to Wi-Fi or Internet (Success)
  3. Connection steps (Success)
  4. Connection complete (Success)
  5. Connecting and control list, but no bulb? (Failed)

@liufei

Although I have noticed faster response times from my Yeelights after making the change (again though, as I mentioned, that may be because I am paying more attention now), today another light went offline:

image

In the logs I see:

Dec 14 17:59:27 xxxx hostapd: wlan1-1: AP-STA-DISCONNECTED xx:xx:xx:xx:xx:9d
Dec 14 17:59:27 xxxx hostapd: wlan1-1: STA xx:xx:xx:xx:xx:9d IEEE 802.11: disassociated due to inactivity
Dec 14 17:59:28 xxxx hostapd: wlan1-1: STA xx:xx:xx:xx:xx:9d IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)

I believe I have read about this message as well, and about yet another option that I will need to set to work around this problem, namely:

disassoc_low_ack=0

Again, this seems to be a workaround for device issues but I will try it and report back.
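Since two different disconnect reasons have now shown up in the logs, a tally across the rotated files may help to see which one dominates. This is a sketch that assumes the hostapd message wording shown in the excerpts in this thread; `tally_disconnect_reasons` is a made-up name.

```python
import re
from collections import Counter

# Matches lines such as:
# "... hostapd: wlan1-1: STA <mac> IEEE 802.11: disassociated due to inactivity"
REASON_RE = re.compile(
    r"hostapd: \S+: STA (?P<mac>\S+) IEEE 802\.11: "
    r"(?P<event>deauthenticated|disassociated) due to (?P<reason>.+)$"
)


def tally_disconnect_reasons(lines) -> Counter:
    """Count (event, reason) pairs over an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = REASON_RE.search(line.strip())
        if match:
            counts[(match["event"], match["reason"])] += 1
    return counts
```

Sorting the result with `most_common()` shows at a glance whether drops are mostly "local deauth request" or "inactivity", which is useful when judging whether a setting change had any effect.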

Edit: Just to confirm, setting has been applied to both routers and I had to power down the bulb and power it back on again to get it back on the network. We will see if any other bulbs go offline.

@liufei

Just an update on what I see as a result of the change I made above. All other devices appear to work as normal and I have not so far had any bulbs go offline (however this tends to happen every few days and at this point is still inconclusive) but I would like to add this one observation:

All of the responsiveness improvement that I mentioned in the previous posts, after changing the first setting we tried, is now gone after making the second change. I can assure you both settings are active at the same time on my router; the second setting has not cancelled out the first. I am not sure exactly what the implications of these settings are for the bulb response times, but they are back to how they were originally - which I am fine with, it’s how they have been for their entire lifetime. I’m just trying to gather some data points here.

I am again seeing the behaviour I always have which is that if I walk into a room with motion sensors (I use the Xiaomi Aqara motion sensors) with a group of lights, only some of the lights may illuminate immediately, others may take another second to light up. I don’t care about this personally and don’t have any reason to pursue it (I see no timeouts in the logs, I don’t see any warnings in homeassistant or the app, I think it’s just the way it is with a lot of bulbs on a wifi network). It’s the bulbs going offline that I need solved as I plan to remove all light switches and replace them with touchscreens and if I do so, I won’t have a way to reset the lights if they go offline.

Hi @ngardiner, thanks for the information. This will help a lot in reproducing and analyzing this issue.
We have referred the issues reported in this topic to Marvell, the MCU provider of this product, and they asked for ‘the sniffer log in 802.11 plus radiotap header as below shown’. I wonder if you would be willing to help collect the data for them.