**** BEGIN LOGGING AT Mon Jul 25 02:59:56 2022
Jul 26 02:01:09 is zmatt here
Jul 26 02:01:22 I need help with this uart in py-uio
Jul 26 02:02:31 https://pastebin.com/MFvXZ5Ya
Jul 26 02:06:30 break detected generally either means wrong baudrate or the pin isn't muxed
Jul 26 02:06:44 (uart rxd pin)
Jul 26 02:07:25 receive fifo overrun means too much data was received that you didn't process fast enough
Jul 26 02:07:47 what is weird is it was working fine and it will misbehave, I am wondering if it is hardware related
Jul 26 02:07:58 like I can poweroff
Jul 26 02:08:17 and close is a method on the I/O object (pruss.uart.io)
Jul 26 02:08:39 and then boot up drop into python3 and just try and send a command through the UART and I will get the FIFO runtime error
Jul 26 02:08:56 will close and reinit help you think if I get stuck with that
Jul 26 02:09:26 FIFO overrun you mean... yes, once you configure the load cell to start spamming it will... start spamming. unless the data is being processed continuously you are guaranteed to get a receive FIFO overrun
Jul 26 02:10:35 like, py-uio's uart interface is primarily meant to do initial setup of the uart and whatever device is connected to the uart prior to starting pru
Jul 26 02:11:13 handling a continuous stream of received uart data is a real-time task that you can't really do in python, certainly not reliably
Jul 26 02:11:27 if you try, receive FIFO overruns are pretty much guaranteed
Jul 26 02:11:40 if you try from the REPL, they're definitely guaranteed :P
Jul 26 02:12:33 hmmm
Jul 26 02:12:52 the only thing here that looks weird is the break detected... all the rest looks like exactly what I'd expect to happen
Jul 26 02:13:39 ok
Jul 26 02:14:17 does reading from PRU have the same limitation
Jul 26 02:14:22 no
Jul 26 02:14:25 ok
Jul 26 02:14:50 so maybe this is not my problem really...
Jul 26 02:14:52 "Receive FIFO overrun" is a performance problem...
you're not keeping up with the data as fast as it's being sent
Jul 26 02:15:19 ok this error is generated after I get an error transmitted
Jul 26 02:15:29 ??
Jul 26 02:15:43 one second let me pastebin something
Jul 26 02:18:27 ok in my application in which the pru is reading the UART. I have been running fine but occasionally I will get the following to throw an error
Jul 26 02:18:28 https://pastebin.com/yCL2rW9h
Jul 26 02:19:06 then when I try and troubleshoot I will try and interact with the UART through python
Jul 26 02:19:08 okay, so that would be the "Break detected" .. like I said, that's pretty weird
Jul 26 02:19:23 and that gave me that earlier paste bin
Jul 26 02:19:37 basically it means the line was low for too long a time
Jul 26 02:19:42 the rxd line
Jul 26 02:20:00 so my code was running fine for 2 weeks did this behavior all the sudden fixed and now it is doing it again
Jul 26 02:20:12 do you think it is wiring or HW possibly
Jul 26 02:20:24 everything was working fine now it is not
Jul 26 02:20:43 definitely possible... "break" is called that because in certain wiring setups it indicates a physical break in the cable
Jul 26 02:21:16 does the uart rxd pin have pull-up or pull-down on the beaglebone side?
Jul 26 02:21:38 let me check need to power on
Jul 26 02:21:49 a power interruption of the load cell could cause it too
Jul 26 02:23:02 pull up
Jul 26 02:23:17 okay, then a cable break can't cause a break condition
Jul 26 02:23:19 both txd and rxd
Jul 26 02:24:01 power interruption of the load cell comes to mind, or maybe the load cell resets and produces some garbage on the line during that
Jul 26 02:24:21 or something briefly shorts the rxd line to ground
Jul 26 02:24:35 is there any way to guard against that
Jul 26 02:24:40 from the SW
Jul 26 02:24:48 or is that something where I need to check physical connections
Jul 26 02:26:22 I mean, obviously you could add logic to python to automatically recover/restart when the system fails...
but your first priority should probably be to figure out what's causing the failure to begin with
Jul 26 02:29:00 now one of my C routines works and does not throw the error so that makes me think I am doing something wrong on the coding
Jul 26 02:29:40 i can pastebin
Jul 26 02:29:53 one second
Jul 26 02:30:03 I don't see an obvious way you could cause a break condition through a software mistake, unless you're messing with the uart's registers
Jul 26 02:30:35 no I haven't changed the code
Jul 26 02:30:51 could moisture and heat make a difference
Jul 26 02:31:07 all my electronics are in a relatively well sealed box
Jul 26 02:31:21 but it is sitting in a 95% humidity 37C incubator
Jul 26 02:33:33 uhh, right now we don't even know what's going on that's causing a break condition to be detected... like, some of my guesses include the load cell resetting, but those are just _guesses_ ... if it doesn't out it *is* resetting then you can start wondering about what might cause it to do so, such as environmental factors, but that line of thinking seems excessively premature
Jul 26 02:33:48 *if it turns out
Jul 26 02:34:09 ok
Jul 26 02:34:49 so my loadcells go to a summing box before going to the em100 digitizer that goes to the UART
Jul 26 02:34:55 so I should be testing the EM100?
Jul 26 02:35:00 or the actual load cells
Jul 26 02:35:45 I've always meant the EM100, which is what you've previously referred to as the load cell hence which I'll continue to call that :P
Jul 26 02:35:57 the-thing-driving-your-uart-rxd-line
Jul 26 02:38:17 so again, a break condition means the rxd line was detected as being low (i.e. closer to 0V) for a longer amount of time than it can ever be as part of a normal character transmission.
if this is actually happening on your uart rxd line then for example that's something a scope could capture if it has a way to trigger on "signal low for at least "
Jul 26 02:40:33 if it's not something really happening at an electrical level the only alternatives are that the uart is confused, e.g. due to messing with its registers in a weird way, or the uart is being deceived, e.g. because the pinmux is being changed causing the uart to no longer be actually connected to the external pin
Jul 26 02:42:39 if it *is* happening at an electrical level then the next question becomes why... a scope capture of the event might perhaps already give a clue, e.g. whether it's some form of interference or something the EM100 is doing. if it's something the EM100 is doing then the diagnosis would proceed from there
Jul 26 02:42:51 ok
Jul 26 02:44:06 all I do is play with the shared memory which I assume would not impact the UART at all
Jul 26 02:44:25 correct
Jul 26 02:44:29 and I definitely did not change the pinmux
Jul 26 02:44:38 so I will scope tomorrow
Jul 26 02:44:49 very strange
Jul 26 02:45:07 was working fine for weeks
Jul 26 02:45:18 now all over the place
Jul 26 02:45:49 if you're suspecting software, then just go back to the last known-good version, which you obviously committed to git and/or backed up right?
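[Editor's note: the break condition described above comes down to simple timing arithmetic. The sketch below estimates how long one character frame lasts, which is roughly the "low for at least" threshold a scope trigger would need. The baud rate and the 8N1 framing are assumptions for illustration; the log does not state the actual settings of the EM100 link, and the function names are hypothetical.]

```python
def frame_time_s(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Duration of one character frame: start bit + data + parity + stop."""
    bits = 1 + data_bits + parity_bits + stop_bits
    return bits / baud


def looks_like_break(low_time_s, baud, **framing):
    """True if rxd stayed low for longer than a whole character frame.

    In a normal frame the stop bit is always high, so a low period longer
    than the full frame cannot be part of an ordinary character.
    """
    return low_time_s > frame_time_s(baud, **framing)


if __name__ == "__main__":
    baud = 115200  # assumed value, not taken from the log
    t = frame_time_s(baud)
    print(f"one 8N1 frame at {baud} baud lasts {t * 1e6:.1f} us")
    print("1 ms low is a break:", looks_like_break(1e-3, baud))
```

[With the real baud rate and framing substituted in, the frame time gives a starting point for a pulse-width trigger on the rxd line.]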
Jul 26 02:46:34 I can revert but the software was running fine as well
Jul 26 02:46:56 the problem went away on its own and now is back
Jul 26 02:47:08 definitely sounds more like hardware than software
Jul 26 02:47:12 which makes me think it is HW but got to scope it
Jul 26 02:47:34 next time it happens, try running this utility function to dump some data from the uart: https://pastebin.com/W9qXEjUK
Jul 26 02:47:43 ok
Jul 26 02:47:52 just for debugging
Jul 26 02:48:23 so after it crashes run this
Jul 26 02:48:41 yeah after the pru core has halted
Jul 26 02:48:41 I don't catch the error so I would have to ctrl-c to get a prompt
Jul 26 02:48:44 ok
Jul 26 02:55:55 does it make sense that I cannot run io.close() without getting FIFO overrun
Jul 26 02:56:07 ??
Jul 26 02:56:48 io.close() never throws an error
Jul 26 02:56:53 well I tried running again crashed immediately, i ran that dump script that you just gave me and it printed nothing probably because I bombed immediately
Jul 26 02:57:09 so I tried to reset it without powering everything off
Jul 26 02:57:22 so I go into python and try and call uart.io.close()
Jul 26 02:57:30 and it gives me the FIFO overrun error
Jul 26 02:57:51 want the stack trace?
Jul 26 02:58:01 nothing you just said makes any sense
Jul 26 02:58:51 the little function I gave (which is not a standalone script, it assumes there's a pruss already) should be executed when pru has aborted with this specific problem
Jul 26 02:58:59 https://pastebin.com/BmmxYBzr
Jul 26 02:59:14 neither pruss nor its uart must be reset between the problem occurring and the debug function being run
Jul 26 02:59:34 oh, close calls flush
Jul 26 02:59:44 hmm
**** ENDING LOGGING AT Tue Jul 26 02:59:56 2022
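[Editor's note: the two errors discussed in this log, "Receive FIFO overrun" and "Break detected", correspond to status flags that 16550-style UARTs report in their line status register. The sketch below decodes such a value into readable names. It assumes the PRU UART follows the common 16550 bit layout, and it is not the debug function from the pastebin links, whose contents are not shown here. How the raw value is read from the hardware is deliberately left out; `decode_lsr` and the sample value are hypothetical.]

```python
# Bit layout of a 16550-style line status register (assumed to apply here).
LSR_FLAGS = {
    0x01: "data ready",
    0x02: "receive FIFO overrun",
    0x04: "parity error",
    0x08: "framing error",
    0x10: "break detected",
    0x20: "transmit holding register empty",
    0x40: "transmitter empty",
    0x80: "error in receive FIFO",
}


def decode_lsr(value):
    """Return the names of all flags set in a raw line-status value."""
    return [name for bit, name in LSR_FLAGS.items() if value & bit]


if __name__ == "__main__":
    raw = 0x12  # hypothetical sample: overrun and break both set
    for name in decode_lsr(raw):
        print(name)
```

[Seeing overrun and break set together would match the pattern described in the conversation: a break on rxd followed by unprocessed data piling up in the receive FIFO.]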