Endless unresponsive loop after ZST39 upgrade to firmware 1.30 (SDK 7.21.3) #6874
Replies: 30 comments 177 replies
-
👋 Hey @vladm! It looks like you attached a logfile, but its filename doesn't look like a driver log that came from Z-Wave JS. Please double-check that you uploaded the correct logfile. If you did, disregard this comment. As a reminder, here's how to create one:
-
I don't see the issue happen in the driver logs, only in the application logs, which don't include the information needed to diagnose it. Please unplug and re-plug the stick and include a new driver log, which should hopefully capture what you described.
-
@AlCalzone thank you for your response. I updated the ZST39 back to the new 7.21.3, though this time I used Silicon Labs' PC Controller software from Simplicity Studio. What's interesting is that after this update the controller works OK, and Z-Wave JS exhibits none of the issues it had after I used HA to update the firmware last time. I wonder if the problem is rather with the update process via Home Assistant, and I ended up with a corrupted controller last time? Unrelated: even though the controller is at SDK version 7.21.3, the UI still complains about 7.19.3...
-
Similar issue here. I was on 1.1 (7.18.3) and working, but since I was getting probably 30+ "jammed" events a day, I thought I would upgrade to see if it helped at all (although I knew it wasn't "fixed" in 7.21.3). So I updated to 1.3 (7.21.3) in Z-Wave JS UI and now I just get non-stop errors in the logs. I can't seem to downgrade or install any of the .gbl files in PC Controller or in Z-Wave JS UI in bootloader-only mode either, so I probably bricked my device. Such is life. I have another ZST39 on the way, and have NVM backups, so hopefully I can get a new stick working without re-pairing 94 devices...
-
I had the exact same experience as you. I had been running 1.20 without issue for at least 6 months and foolishly decided to upgrade my stick a few days ago via the zwave-js-ui interface. I started sporadically experiencing the same problem in an LXC container on Proxmox. I was able to restore the ZST39 to normal operation only by shutting down the container and physically disconnecting and reconnecting the stick. I tried requesting a soft reset and simply restarting the container, but both were ineffective. I downgraded to 7.19.3 via the zwave-js-ui interface last night and am hoping all is well again (though I had to manually reset the RF region back to USA).
-
@AlCalzone some update: I was playing in Z-Wave JS UI with rebuilding routes and sending pings to nodes, and suddenly it failed (when I was checking node 044, which is an old, non-Z-Wave Plus node with S0 security) and, as others indicated above, the controller was no longer responsive. Restarting the zwave-js container didn't help; I had to unplug the stick to get it back online. I captured a log, see attached:
-
Got it.
Maybe we should put a big warning somewhere telling people not to upgrade
to this version?
Is there a reliable way to downgrade? I had thought this was prevented by
the bootloader but have seen comments indicating it can be done.
On Thu, 30 May 2024, 11:39 am Botched1 wrote:
Yes and it has been plaguing people for months now.
If only there were members of the Zwave Alliance that could help push
getting this fixed. :)
(I'm joking)
-
Just a report. I upgraded a ZST39 from 7.19.3 to 7.21.3 via Z-Wave JS UI and got an infinite controller unresponsive loop. Any kind of firmware update or downgrade fails with error 0x18 (Simplicity Studio, Z-Wave JS). I restored an NVM backup to a spare ZST10, which bricked it completely (no terminal). Then I bought another ZST39, flashed 7.19.3 via Simplicity Studio and restored the NVM backup via Z-Wave JS UI. After a few loops it seems to work. 800LR devices did not restore. It seems like upgrading to 7.21.3 via Z-Wave JS UI has a chance to brick the stick in a really weird way...
-
Do you have some instructions on how to downgrade firmware versions? I tried many times and had no luck.
-
Haven't seen anyone mention it, but the Z-Wave SDK 7.21.3 release notes mention an issue which can cause 700/800 controllers to lock up. Not sure if this is the cause or not, but I'd guess so.
-
Does anyone except @vladm have driver logs showing the controller become unresponsive? Silabs are trying to figure out what could trigger this. In the log above it might be lots of incoming traffic, but I wonder if that's the same for others with the same problem.
-
I also have the endless unresponsive loop error.
-
Also began to encounter this issue after upgrading. I foolishly updated the firmware when I never had any issues on the previous firmware (1.20). A few days later, in the middle of the night, the network became unresponsive (1.30). The update didn't seem to fail in zwavejs, but it did show an error when it finished, with no obvious problem afterwards. I'm gonna try reinstalling the update in Simplicity Studio and will see what happens.

Update 1: It has been approximately 24 hours since I flashed the new firmware using Simplicity Studio (Windows) and I have NOT faced any timeouts. Prior to this, it would usually time out during the night, though there was one instance where it timed out in the afternoon. I will post another update in maybe 72 hours, whether nothing happens or it times out again.
-
After reading these posts as well as the HA community posts, it is clear that 7.21 introduced or exacerbated an issue with the 800 series, including the ZST39.
-
I've been having the original jammed lockup issue as well. I was able to make it very rare by avoiding the use of my ZEN37. If you do a couple of hold events on a ZEN37, at least in my setup it's basically guaranteed to cause a lockup. Unfortunately, my ZST39 is in a vacation rental, but I will be there tomorrow for a week to do some work on it, so I can try to get some logs if it will help. I tried firmware 1.30 last time I was there, but it was crashing every few minutes, so I went back to 1.20 (I had no time to diagnose the issue). That, combined with the removal of the ZEN37, has resulted in relatively few issues.
-
I got a tip from Silicon Labs about enabling the hardware watchdog in the controller. This should help in the case where the controller locks up completely, because the watchdog should restart it. Should have an update ready tomorrow.
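
For anyone unfamiliar with the term: a hardware watchdog is a timer that the firmware has to reset ("kick") regularly, and if the firmware hangs and stops kicking it, the timer expires and restarts the chip. Below is a toy Python model of that idea only. The class, the function, and the 2-second timeout are made up for illustration and say nothing about how the ZST39 firmware or the Z-Wave SDK actually implements it.

```python
class Watchdog:
    """Toy model of a watchdog timer (illustrative, not real firmware logic)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_kick = 0.0
        self.resets = 0

    def kick(self, now: float) -> None:
        """Healthy firmware calls this regularly to prove it is still running."""
        self.last_kick = now

    def tick(self, now: float) -> bool:
        """Return True (and count a reset) if the deadline was missed."""
        if now - self.last_kick > self.timeout_s:
            self.resets += 1
            self.last_kick = now  # the chip restarts, so the timer starts over
            return True
        return False


def simulate(hang_at: float, duration: float,
             timeout_s: float = 2.0, step: float = 0.5) -> int:
    """Run a main loop that stops kicking the watchdog at `hang_at` seconds.

    Returns how many times the watchdog had to restart the (simulated) chip.
    """
    wd = Watchdog(timeout_s)
    t = 0.0
    while t <= duration:
        if t < hang_at:
            wd.kick(t)  # normal operation
        if wd.tick(t):
            hang_at = float("inf")  # in this toy model a restart clears the hang
        t += step
    return wd.resets
```

With `simulate(hang_at=3.0, duration=10.0)` the loop hangs at 3 s, the watchdog fires once a little over 2 s later, and normal operation resumes, which is the recovery behaviour the comment above is hoping for.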
-
I also have the unresponsive controller, adding my logs in case they help track down the problem.
zwave-js-ui: 9.14.1
running on HAOS
34 Z-Wave devices (mostly switches) including the controller
ZST39 LR FW: v1.30 SDK: v7.21.3
I've had this controller lockup issue for weeks. I finally put the stick on a powered USB hub plugged into a Wi-Fi controlled outlet to "unplug" it when it goes unresponsive. It happens most often when HA tries to trigger a scene, so it seems to have something to do with a large volume of traffic, or perhaps the controller has to interact with a device it doesn't like. I have mostly Zooz 77s and 76s with some Inovelli 2-1s, a ZAC38 (range extender), a ZEN37 scene controller, and a few others (I can add the whole list if it's useful). I am attaching a debug log from today, when it most recently locked up. I have many more log files I can post if they help. I will stay tuned to this thread in hopes there is some news. Thanks to those of you who are helping to track this down; this is quite a frustrating puzzle.
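
The "unplug it when it goes unresponsive" workaround above could be automated by watching the log for repeated unresponsive events. Here is a rough Python sketch of just the detection part. The timestamp format follows the log lines quoted elsewhere in this thread, but the matched text ("controller is unresponsive"), the threshold of 3, and the 5-minute window are assumptions, so check them against your own logs before relying on this.

```python
import re
from datetime import datetime, timedelta

# ASSUMPTION: illustrative message text; adjust to what your logs actually say.
UNRESPONSIVE = re.compile(r"controller is unresponsive", re.IGNORECASE)
# Timestamp at the start of a line, e.g. "2024-09-12 16:39:34.375"
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def should_power_cycle(lines, threshold=3, window=timedelta(minutes=5)):
    """Return True if `threshold` unresponsive events fall within `window`."""
    events = []
    for line in lines:
        if not UNRESPONSIVE.search(line):
            continue
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # skip lines without a parseable timestamp
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        events.append(ts)
        # keep only the events that are still inside the sliding window
        events = [e for e in events if ts - e <= window]
        if len(events) >= threshold:
            return True
    return False
```

The return value would then drive whatever toggles the outlet. That part is left out on purpose, since it depends entirely on the outlet and how it is controlled.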
-
Small update: Not sure how it came to this, but as of this morning, I am able to reproduce the situation where the controller hangs during transmission of a frame and becomes unresponsive afterwards. Working with Silabs right now to give them the information they need to reproduce and/or fix it.
-
Zooz shared a firmware based on SDK 7.22.0 for the ZST39, see link below. They tested upgrades and downgrades from and to 1.10, 1.20, and 1.30.
-
I just tried the latest Zooz beta (ZST39_SDK_7.22.0_US-LR_V01R32_BETA.gbl) and after restoring my NVM, all I see is an infinite loop of controller unresponsive/ready messages. (I did a Soft Controller Reset as was mentioned earlier. It did not help.) The curious thing is that I had originally attempted to include 2 ZSE44s as Long Range, but had to revert when I went back to my Aeotec 700 stick. For some reason, those 2 sensors are now showing up as Long Range with my Zooz 800LR stick. (I did restore the NVM, so shouldn't it just run as a normal inclusion? The provisioning entries are not set to Long Range, so I'm not sure what's going on there.) In any case, this build is DOA for me 😔 I'm providing both driver and UI logs.

Update: I had to spend an extra hour figuring out how to handle the 2 ZSE44s (plus a third one that was gen1 and doesn't support LR), which were now just sitting in the PROTOCOL_INFO state. I ended up EXCLUDING all 3 and re-INCLUDING them to restore them. Once you know all these quirks, it's not too bad, but the fact that I had any issues with those 3 ZSE44s in the first place is baffling.

Reminder for anybody else trying this: when you exclude and re-include, you may want to refresh the browser, since the zwave-js UI list may still show excluded devices and not show newly included ones. Just refresh and you should get the correct list.
-
Too bad I didn't know I had to avoid updating to 1.30. If I understand correctly, it's better to wait for 1.40. Do you know roughly how long that will take? If it's several months, I'll go back to my 700 controller and wait. If the downgrade works I could go back to version 1.20, but I'm afraid of bricking the stick for nothing.
-
I wanted to give an update on the 1.40 firmware, but I cannot find the thread right now. Zooz have the firmware ready, but aren't able to upload it to their website right now.
-
Updated to 1.4.0 last night. Over 10 mains-powered Zooz devices that I've been unable to update over the air for months (persistent across soft resets, power cycles, excluding, etc.) were all able to update smoothly today, which I hope bodes well for others. Some others asked which SDK version 1.4.0 was based on, which is addressed in the ZST39 1.4.0 release notes published today. The notes confirm that this release is based upon Z-Wave SDK 7.22.0.
-
Updated to 1.4.0 and restored my configuration. The Zooz 800 controller still locks up every few minutes. It does seem to recover if you wait, but the lockups keep happening. The problem is not resolved. PS: And now the controller sometimes goes into a "controller is not responsive" / "controller is ready" loop.
-
I was wondering if there is any relation to this issue: zwave-js/zwave-js-ui#3639
-
Zooz told me they have a beta version 1.41 available, based on SDK 7.22.1, which should improve stability of the 800 series even more.
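
Since the firmware and SDK numbers are scattered across this thread, here they are collected in one place as a small Python lookup. These pairs are only what posters reported here, not an official Zooz table. Treating "1.1" and "1.10" as the same release is an assumption, and the helper names are made up for illustration.

```python
from typing import Optional, Tuple

# ZST39 firmware -> Z-Wave SDK, as reported by posters in this thread.
# ASSUMPTION: "1.1 (7.18.3)" from an earlier comment is the same release as 1.10.
REPORTED_SDK = {
    "1.10": "7.18.3",
    "1.20": "7.19.3",
    "1.30": "7.21.3",
    "1.40": "7.22.0",
    "1.41": "7.22.1",  # beta at the time of the comment above
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "7.21.3" into (7, 21, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def sdk_at_least(firmware: str, minimum_sdk: str) -> Optional[bool]:
    """Is the SDK behind `firmware` at least `minimum_sdk`?

    Returns None when this thread does not say which SDK a firmware uses.
    """
    sdk = REPORTED_SDK.get(firmware)
    if sdk is None:
        return None
    return parse_version(sdk) >= parse_version(minimum_sdk)
```

For example, `sdk_at_least("1.30", "7.22.0")` is False and `sdk_at_least("1.40", "7.22.0")` is True. Comparing tuples rather than strings matters because "7.9.0" would otherwise sort after "7.21.3".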
-
Just tried to update via Home Assistant. It interrupted the firmware upload pretty soon after the start, and I think I bricked my stick. In the zwave-js logs:
2024-09-12 16:39:34.375 INFO Z-WAVE: Controller status: Driver: Failed to recover from bootloader. Please flash a new firmware to continue... (ZW0100)
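
If you need to pull lines like this out of a long log, a small Python parser along these lines may help. It is a sketch that assumes every line has the same shape as the one quoted above (timestamp, level, module, message, with an optional error code in parentheses at the end). Real logs may contain other layouts, and the names used here are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Shape assumed from the quoted line:
# "<date> <time> <LEVEL> <MODULE>: <message> (<ZWnnnn>)"
LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[A-Z]+) (?P<module>[A-Z0-9-]+): (?P<message>.*)$"
)
# Trailing error code such as "(ZW0100)"
CODE = re.compile(r"\((ZW\d{4})\)\s*$")


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    module: str
    message: str
    error_code: Optional[str]


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into fields, or return None if it has another shape."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    code = CODE.search(match["message"])
    return LogEntry(
        timestamp=match["ts"],
        level=match["level"],
        module=match["module"],
        message=match["message"],
        error_code=code.group(1) if code else None,
    )
```

Filtering a log for entries whose `error_code` is `"ZW0100"` would then find every occurrence of the bootloader recovery failure reported above.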
-
1.50 has been very smooth for a couple of weeks.
-
What's the consensus on the 1.50 firmware? Is it stable? Any lockups or jams?
-
Updated the ZST39 stick to the latest FW 1.30 (SDK 7.21.3) using HA. After that, the stick boots, does the initial network detection, and then enters an endless 'unresponsive' loop.
Downgraded the FW to 1.20 (based on SDK 7.19.3) using Simplicity Studio and everything is back to normal.
Attached is the log from Docker. Please let me know if it's hard to interpret; I can upgrade one more time and try to log to a file instead.
Someone else also reported the same thing here:
https://community.home-assistant.io/t/zooz-zst39-lr-800-firmware-7-21-3-fail/733405
Software versions
zwave-js-ui: 9.12.0.96eeb76
zwave-js: 12.5.6
Device information
Manufacturer: Zooz
Model name: ZST39
Node ID: 1
Upload Logfile
zw.zip