Author |
|
grif091 Super User
Joined: March 26 2008 Location: United States
Online Status: Offline Posts: 1357
|
Posted: April 28 2008 at 13:33 | IP Logged
|
|
|
Looking over your last post, those messages look like an "ACK of Direct Message" for a Direct ON command. Were those messages the result of turning on those switches from ph using a Direct ON command/message? If so, then the switch response (as the Responder) with an "ACK of Direct Message" is to be expected. The flag byte of an "ACK of Direct Message" is 0x2x, so if something is ORing 0x20 into messages (as I suspect), you would not see the issue here because the "ACK of Direct Message" already has the 0x20 bit on.
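The masking effect described above can be shown with a few lines of Python. This is only a minimal sketch: it assumes 0x20 is the ACK bit of the flag byte, and the values 0x0F and 0x2F are illustrative, not taken from a trace.

```python
ACK_BIT = 0x20  # the bit suspected of being ORed into inbound messages

def with_stray_bit(flags: int) -> int:
    """Return the flag byte as it would look after 0x20 is ORed in."""
    return flags | ACK_BIT

direct = 0x0F         # illustrative Direct message flags: 0x20 bit is off
ack_of_direct = 0x2F  # illustrative ACK of Direct flags: 0x20 bit already on

print(hex(with_stray_bit(direct)))         # byte changes, so the fault is visible
print(hex(with_stray_bit(ack_of_direct)))  # byte is unchanged, so the fault is hidden
```

In other words, an ACK trace entry cannot confirm or rule out the stray bit; only message types that normally have 0x20 off can.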
__________________ Lee G
|
grif091 Super User
Joined: March 26 2008 Location: United States
Online Status: Offline Posts: 1357
|
Posted: April 28 2008 at 13:41 | IP Logged
|
|
|
Our last posts crossed paths on the way to the forum. Thanks for confirming that those last two messages were from a ph-initiated command. The ACK from the switches is expected in this case, since the switches are Responders to the ph-initiated command. You would not see the 0x20 problem because that ACK is both expected and already has the 0x20 bit on.
__________________ Lee G
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: April 28 2008 at 14:03 | IP Logged
|
|
|
Tony, not to be old-fashioned (but I am), what changed before April 15th, prior to your first post on this subject?
__________________ Pete - X10 Oldie
|
TonyNo Moderator Group
Joined: December 05 2001 Location: United States
Online Status: Offline Posts: 2889
|
Posted: April 28 2008 at 17:22 | IP Logged
|
|
|
Nothing I know of. Seems like my PLC must have broken around then.
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: April 28 2008 at 17:40 | IP Logged
|
|
|
Things I would think of might be noise, a change to linking, or an added Insteon device. Does everything else work as normal? Do you have a backup from before the failure that you could load for a test? I hope Lee is stirring his brew for new thoughts, as my kettle is getting low.
__________________ Pete - X10 Oldie
|
TonyNo Moderator Group
Joined: December 05 2001 Location: United States
Online Status: Offline Posts: 2889
|
Posted: April 28 2008 at 19:29 | IP Logged
|
|
|
Seems like noise couldn't decide to materialize only for paddle presses. Nothing was changed. No links, no device additions. Wait... The PLC warranty just expired.
This may be a good time to get a PLM.
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: April 28 2008 at 19:34 | IP Logged
|
|
|
LOL !!!! My new PLM screams…
__________________ Pete - X10 Oldie
|
grif091 Super User
Joined: March 26 2008 Location: United States
Online Status: Offline Posts: 1357
|
Posted: April 28 2008 at 23:47 | IP Logged
|
|
|
It may not be only on paddle presses. The same thing could be happening to the two "ACK of Direct Message" entries in the earlier post; you just would not see a symptom because the 0x20 bit is already part of the expected ACK message generated by the switch. If you have other devices, like KeypadLincs, you could try a button press on each of those and see if the 0x20 bit shows up in the flag byte of the "Group Broadcast" and "Group Cleanup Direct" messages generated by that device type. I think it will, because I'm assuming it will be in all inbound messages presented by the PLC to sdm3 from the powerline. That is only an assumption at this point, to be proven or disproved by actual tracing. I'm thinking of some hardware failure, like a memory failure, that results in that bit being set. Is this problem intermittent or solid now? Regardless of whether there is an external symptom, do all the sdm3 trace entries for inbound messages that you have traced have the 0x20 bit on?
Seems like moving to a PLM is a good idea independent of this symptom. If the 0x20 bit continues to be seen after moving to a PLM (or a different PLC), then you would be looking for an individual device that is repeating the inbound messages incorrectly.
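One way to answer the "do all trace entries have the 0x20 bit on" question is to scan a saved trace. The sketch below assumes each entry uses the `PLC:receiveinsteonraw=` layout seen in this thread, with the flag byte as the eighth hex field; the function names are made up for illustration, and the second sample line is a hypothetical healthy entry.

```python
ACK_BIT = 0x20

def flag_byte(trace_line: str) -> int:
    """Return the flag byte (8th hex field) from one receiveinsteonraw entry."""
    fields = trace_line.split("=", 1)[1].split()
    return int(fields[7], 16)

def count_with_ack_bit(trace_lines: list) -> tuple:
    """Return (entries with 0x20 on, total entries)."""
    flagged = sum(1 for line in trace_lines if flag_byte(line) & ACK_BIT)
    return flagged, len(trace_lines)

sample = [
    "PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 17 01",
    "PLC:receiveinsteonraw=02 03 6F 97 00 00 01 CF 17 01",  # hypothetical healthy entry
]
flagged, total = count_with_ack_bit(sample)
print(f"{flagged} of {total} entries have the 0x20 bit on")
```

Keep in mind that genuine ACK messages also have 0x20 on, so a count like this is only meaningful for message types where the bit should be off.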
Thanks to you both regarding the tracing question.
__________________ Lee G
|
TonyNo Moderator Group
Joined: December 05 2001 Location: United States
Online Status: Offline Posts: 2889
|
Posted: April 29 2008 at 08:03 | IP Logged
|
|
|
No KPLs here. The problem seems solid.
Here are two manual change presses for another set of messages...
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 17 01
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 18 00
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 17 00
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 18 00
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: April 29 2008 at 08:12 | IP Logged
|
|
|
I’m still digging.
TonyNo wrote:
Rebuilt the links, even deleted and reset controller/responder, with no change. |
|
|
Did you factory reset the physical devices that are linked together?
__________________ Pete - X10 Oldie
|
grif091 Super User
Joined: March 26 2008 Location: United States
Online Status: Offline Posts: 1357
|
Posted: April 29 2008 at 09:06 | IP Logged
|
|
|
These are four "Group Broadcast" messages from 03.6F.97, Group 1, for a Start BRIGHT, Stop, Start DIM, Stop command sequence. They all have the 0x20 bit on in the flag byte, in error. The EF should be a CF in all 4 messages. This is the same failure seen in the other traces you have posted.
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 17 01 Start BRIGHT
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 18 00 Stop
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 17 00 Start DIM
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 18 00 Stop
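The EF versus CF difference can be checked by decoding the top three bits of the flag byte. This is a sketch that assumes the message-type table from the Insteon Details document (bit 7 Broadcast/NAK, bit 6 Group, bit 5 ACK); the function name is just for illustration.

```python
# Message type is carried in the top three bits of the flag byte.
MESSAGE_TYPES = {
    0b000: "Direct",
    0b001: "ACK of Direct",
    0b010: "Group Cleanup Direct",
    0b011: "ACK of Group Cleanup Direct",
    0b100: "Broadcast",
    0b101: "NAK of Direct",
    0b110: "Group Broadcast",
    0b111: "NAK of Group Cleanup Direct",
}

def message_type(flags: int) -> str:
    """Decode the message type from a one-byte flag value (0x00-0xFF)."""
    return MESSAGE_TYPES[flags >> 5]

seen = 0xEF                # flag byte as traced
expected = seen & ~0x20    # same byte with the stray 0x20 bit cleared

print(f"{seen:#04x}", message_type(seen))          # 0xef NAK of Group Cleanup Direct
print(f"{expected:#04x}", message_type(expected))  # 0xcf Group Broadcast
```

With 0x20 on, a Group Broadcast flag byte decodes as a different message type, which would explain why the receiving software does not treat it as a normal paddle press.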
While it is failing solid, pull up the little power-off tab on all but one of your devices. Then paddle press the remaining device. If the trace entries continue to show 0x20 in the flag byte of messages where it should not be, then I believe the PLC has a hardware failure. Should the 0x20 disappear, then one of the devices that has been powered off is repeating Insteon messages incorrectly. If that is the case, continue the paddle press test with that same switch, restoring power to each device, one at a time, until the 0x20 reappears in the trace. The last device you reapplied power to will be the culprit. Be careful when pushing in the power-off tab to restore it only to the normal position; don't push past that point or you will reset the device.
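The elimination procedure above can be written down as a small loop. This is only a sketch of the logic: `fault_seen` stands in for "do a paddle press and look at the trace", and the device names other than the test switch are hypothetical.

```python
from typing import Callable, List, Optional, Set

def find_culprit(test_switch: str,
                 others: List[str],
                 fault_seen: Callable[[Set[str]], bool]) -> Optional[str]:
    """Walk through the power-off-tab test, one device at a time."""
    powered = {test_switch}            # everything else has its tab pulled
    if fault_seen(powered):
        # 0x20 still present with a single device powered
        return "PLC (or the test switch itself)"
    for device in others:
        powered.add(device)            # restore power to one more device
        if fault_seen(powered):
            return device              # last device restored is the culprit
    return None                        # fault never reappeared

# Simulated run: pretend the device called "hall" corrupts repeated messages.
bad_device = "hall"
result = find_culprit("foyer", ["kitchen", "hall", "porch"],
                      lambda powered: bad_device in powered)
print(result)  # hall
```

The loop restores one device per step, which matches the one-at-a-time instruction; a faster halving search would also work but is easier to get wrong at the wall switches.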
__________________ Lee G
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: April 29 2008 at 09:16 | IP Logged
|
|
|
So far I’ve counted 3 devices with the problem, and Lee, it may be a different device in the repeat chain causing the problem. I think I would pull the RF repeaters also, if applicable. Is the repeat process like Token Ring?
__________________ Pete - X10 Oldie
|
grif091 Super User
Joined: March 26 2008 Location: United States
Online Status: Offline Posts: 1357
|
Posted: April 29 2008 at 09:57 | IP Logged
|
|
|
That is a good point about pulling the RF repeaters. They repeat messages just like other Insteon devices, plus they do it cross phase. Thanks Pete, I would have missed those. Also one more thought on testing. If paddle testing with one device powered on still shows the 0x20 in the flag byte of the trace entries (where I said I thought it would be the PLC), test with a different switch to be sure the only switch powered on is not the one that is failing. That is, turn off the test switch and turn on a different switch and do the paddle test. The probability of starting the single switch test with the bad switch is pretty low, but not impossible.
I have not counted the number of different device addresses that show the erroneous 0x20 bit, but it has been more than one for sure. Since I don't believe more than one device is failing, that points to either the common PLC, where all messages go, or one of the individual devices that is repeating the message incorrectly. The Insteon Details document describes a Cyclic Redundancy Check (CRC) character on the end of each Insteon message. Assuming that part of the architecture has been implemented, the bit has to be added by a device that is building the full message. Anything that injected noise would not be so specific, and even X10 repeaters would not have the ability to add a bit and correct the CRC character along with it. Sure wish there was a completely independent test device that recorded powerline traffic.
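The CRC argument can be illustrated with a generic check. This sketch uses a plain CRC-8 with polynomial 0x07 purely as a stand-in; it is an assumption for illustration and not the actual Insteon CRC. The message bytes are the from address, to address, flags, and two command bytes of a Group Broadcast like the ones traced here, with the flag byte set to the expected CF.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Generic bitwise CRC-8 (illustrative, NOT the Insteon algorithm)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

good = bytes([0x03, 0x6F, 0x97, 0x00, 0x00, 0x01, 0xCF, 0x17, 0x01])
bad = bytearray(good)
bad[6] |= 0x20                    # flip on the stray bit in the flag byte

sent_crc = crc8(good)             # CRC the originator would have appended
print(crc8(bytes(bad)) == sent_crc)  # False: the old CRC no longer matches
```

A single flipped bit always changes a CRC, so a message altered in transit would arrive with a bad check character. For the altered flag byte to pass, whatever sets the bit must also recompute the CRC, or the bit must be set after the check has already passed (for example, inside the PLC).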
Edited by grif091 - April 29 2008 at 10:19
__________________ Lee G
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: April 29 2008 at 10:21 | IP Logged
|
|
|
Well, I think I answered my own question. It appears the repeating process is done by simulcasting, so the signal is not regenerated but amplified, and any device could add the bit. That is true for the powerline devices but not for the RF cross-phase modules, which retransmit in a different time slot if they are on another phase. My gut feel is that either the originators are doing it for whatever reason, or we are back to the PLC as the most likely culprit, since the only trace we have comes from the PLC itself, which could be altering it.
__________________ Pete - X10 Oldie
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: April 29 2008 at 10:29 | IP Logged
|
|
|
Tony, one other thought: could you switch the SDM over from what you're using (235 or 308) and see if the pattern remains the same?
__________________ Pete - X10 Oldie
|
TonyNo Moderator Group
Joined: December 05 2001 Location: United States
Online Status: Offline Posts: 2889
|
Posted: April 29 2008 at 11:33 | IP Logged
|
|
|
Wow. Thanks for all the suggestions guys!
I'll try to isolate one dimmer at a time and the SDM version shift tonight.
|
TonyNo Moderator Group
Joined: December 05 2001 Location: United States
Online Status: Offline Posts: 2889
|
Posted: May 01 2008 at 07:47 | IP Logged
|
|
|
Hmm. Off press...
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 13 00
PLC:receiveinsteonraw=01 03 6F 97 00 D4 2C 65 13 01
A PLM should be here in 2 days.
|
BeachBum Super User
Joined: April 11 2007 Location: United States
Online Status: Offline Posts: 1880
|
Posted: May 01 2008 at 08:20 | IP Logged
|
|
|
Does that mean the FOYER is the culprit or do you mean no change after testing?
As I said before, the PLM screams… Just follow Nadler’s 3 steps plus Dave’s and it will be seamless.
__________________ Pete - X10 Oldie
|
grif091 Super User
Joined: March 26 2008 Location: United States
Online Status: Offline Posts: 1357
|
Posted: May 01 2008 at 09:30 | IP Logged
|
|
|
Basically more of the same.
PLC:receiveinsteonraw=02 03 6F 97 00 00 01 EF 13 00 Group Broadcast
PLC:receiveinsteonraw=01 03 6F 97 00 D4 2C 65 13 01 Group Cleanup Direct
First message is Group Broadcast, from 03.6F.97, to no specific device, Group 1, CMD OFF
Second message is Group Cleanup Direct, from 03.6F.97, to 00.D4.2C, Group 1, CMD OFF
This is the message sequence expected from an OFF paddle press at 03.6F.97, except that the flag byte has 0x20 on when it should not.
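The field breakdown above can be reproduced by splitting a trace entry into its bytes. This is a sketch under the assumption that each `receiveinsteonraw` entry is a leading byte followed by from address (3 bytes), to address (3 bytes), flag byte, cmd1 and cmd2; the meaning of the leading byte is not covered in this thread, so it is kept but not interpreted. The class and function names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class InsteonTrace:
    prefix: int      # leading byte, not interpreted here
    from_addr: str
    to_addr: str     # for a Group Broadcast the last byte is the group number
    flags: int
    cmd1: int
    cmd2: int

def parse_trace(line: str) -> InsteonTrace:
    """Parse one 'PLC:receiveinsteonraw=...' entry; trailing notes are ignored."""
    tokens = line.split("=", 1)[1].split()[:10]
    raw = [int(tok, 16) for tok in tokens]
    addr = lambda part: ".".join(f"{b:02X}" for b in part)
    return InsteonTrace(raw[0], addr(raw[1:4]), addr(raw[4:7]),
                        raw[7], raw[8], raw[9])

msg = parse_trace("PLC:receiveinsteonraw=01 03 6F 97 00 D4 2C 65 13 01 Group Cleanup Direct")
print(msg.from_addr, "->", msg.to_addr)   # 03.6F.97 -> 00.D4.2C
print(f"flags {msg.flags:#04x}, without the stray bit {msg.flags & ~0x20:#04x}")
print(f"cmd1 {msg.cmd1:#04x}, cmd2 {msg.cmd2:#04x}")  # 0x13 is OFF, cmd2 is the group
```

Clearing 0x20 from the traced 0x65 gives 0x45, which is the flag byte a Group Cleanup Direct message would be expected to carry.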
Has the device testing shown a pattern, anything that we did not know before?
__________________ Lee G
|
TonyNo Moderator Group
Joined: December 05 2001 Location: United States
Online Status: Offline Posts: 2889
|
Posted: May 01 2008 at 20:11 | IP Logged
|
|
|
Lee, thanks for digging. No, there has been nothing new. Hopefully, by tomorrow evening, this will be fixed by the PLM.
|