**** BEGIN LOGGING AT Fri Feb 12 02:59:58 2016
Feb 12 03:39:56 hey matt, thanks for the comments and clarifications i'm not insane
Feb 12 03:40:13 ds2: 50cal? what's that?
Feb 12 03:42:26 zmatt: the usb on our prototype board is simply dp, dm, and drvvbus directly from the processor to the le910 (on the board). no connectors, no other peripherals. that should be pretty clean, no?
Feb 12 03:46:10 although i don't know the difference between drvvbus and vbus. i guess the latter is the 5V supply on the bus (out the connector)?
Feb 12 03:46:39 oh, i see. drvvbus is the output in host mode, vbus is the input in device mode?
Feb 12 03:47:16 what? rtfm?
Feb 12 03:47:43 hey, thanks for the musb doc. that's good.
Feb 12 03:49:20 zmatt: if you or someone could clarify if my logic is right, i would feel much better about this problem. namely this:
Feb 12 03:49:57 I looked into the drivers/musb_core.c driver (very roughly) and it appears that the babble message comes from receiving a babble interrupt from the AM335x on-board USB peripheral. If that is the case, how can any patch or kernel version fix this problem? It appears what we need to do to fix our problem is understand what causes the AM335x USB peripheral to generate this interrupt, e.g., does it monitor power variations, DM/DP voltages, etc., or what?
Feb 12 03:51:43 the driver can help by reinitializing the usb bus (reenumerating, etc.), but otherwise how can it (or a patch) help with the root cause?!?
Feb 12 03:53:04 btw, our project manager has escalated this up through our local ti reps, and she told me she's passed it on to the european experts. we'll see what happens.
Feb 12 03:54:52 the most recent workarounds i've seen for it are changing a timeout in musb_core.c and disabling power management, from a guy named yordan on e2e: https://e2e.ti.com/support/embedded/linux/f/354/t/484475
Feb 12 03:59:04 does a 300-MHz AM3352 have any possible frequency scaling?
i only see one frequency, 300, in cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies (from http://processors.wiki.ti.com/index.php/AM335x_Linux_Power_Management_User_Guide)
Feb 12 03:59:55 also what i can't get my mind around is why, in our case/our board, having the eth0 plugged in makes the problem largely go away?
Feb 12 04:00:52 one theory (which I think is a bit of a reach) is that the kernel is using power-management and when it sees that the eth0 isn't present switches to a lower power operation, which glitches something.
Feb 12 04:05:00 no one knows anything about these things here?
Feb 12 04:12:21 aha!
Feb 12 04:13:15 section 8.5.5 defines the conditions under which a babble interrupt is generated.
Feb 12 04:13:18 excellent!
Feb 12 04:30:10 i think i just found the problem
Feb 12 04:34:48 musb?! :P
Feb 12 05:08:59 well, not exactly...
Feb 12 05:09:34 i'll update yall tomorrow - gotta run and get this code uploaded to our other board.
Feb 12 05:09:36 night!
Feb 12 05:09:49 thanks zmatt, btw, for the information and especially the Mentor doc.
Feb 12 05:58:03 a proper fix for the babble err is complex
Feb 12 12:01:08 http://beaglecore.com/
Feb 12 12:12:24 $55 ?
Feb 12 12:46:30 ogra_: industrial
Feb 12 12:46:41 ah
Feb 12 12:47:52 industrial? conrad?
I associate them more with overpriced hobbyist electronics
Feb 12 12:51:25 so for the price of bbb, you get something that has only a small subset of the IO, fewer features, and requires a pcb layout that will still include
Feb 12 12:51:29 many high-speed traces
Feb 12 12:52:32 zmatt: yes, but a BBB has no industrial grade quality assurance
Feb 12 12:53:01 zmatt: also, conrad would be dead if only selling overpriced hobbyist electronics
Feb 12 12:54:19 zmatt: also, the bbcore is not the first SoM out there
Feb 12 12:54:30 I know
Feb 12 12:54:51 http://phytec.com/products/system-on-modules/phycore/am335x/
Feb 12 12:54:53 I don't understand SoMs very well, but I guess they have a market or they wouldn't exist
Feb 12 12:55:06 the phytec one was around 50 too IIRC
Feb 12 12:56:18 or, well, I understand a SoM if you need a fast processor with its necessary high-speed entourage but have only relatively basic I/O needs
Feb 12 12:59:17 zmatt: I dont do industrial, so I dont know either. but the phytec ppl tell me these sell nicely
Feb 12 12:59:33 although not having control over the power supply architecture however sucks when you want to connect any non-trivial amount of external hardware...
Feb 12 13:00:35 and gawd I keep forgetting how infuriatingly awful the beaglecore website is
Feb 12 13:01:04 it also draws 100% cpu load, nicely done
Feb 12 13:38:26 If you need a fast time to market, you have to go with something like that. if the need arises, you can optimize later
Feb 12 15:53:02 hey peeps
Feb 12 15:53:43 wanna hear my musb problem resolution?
Feb 12 15:55:17 Yordan, are you here?
Feb 12 16:13:02 <_av500_> pray tell
Feb 12 16:19:57 this is our uart0 configuration: http://www.digitalsignallabs.com/uart0.png
Feb 12 16:20:20 the tx/rx lines at the right side of U16 go to the LE910
Feb 12 16:20:47 i did not disable uart0 as log/console
Feb 12 16:22:28 still needs verification, but the theory is that the log messages associated with eth0 being disconnected were getting sent to the le910 over this serial line, and the le910 was somehow subsequently misbehaving on the usb bus, causing the babble
Feb 12 16:23:04 the whole "musb has babble problems" history was a shill.
Feb 12 16:23:38 fix THAT with a kernel patch - ha ha!
Feb 12 16:23:56 i worked around the problem by shorting P6 pins 2 and 3
Feb 12 16:25:01 good bedtime story for your grandchildren.
Feb 12 16:25:50 but it gets better...
Feb 12 16:26:15 ah, telit..
Feb 12 16:26:22 almost at exactly the same time last night, our hardware guy discovered that the OE to U16 was not connected. layout and/or fab error.
Feb 12 16:26:45 "our footprint won't change this time, we swear!"
Feb 12 16:27:15 so we really don't know what U16 was doing..
Feb 12 16:28:22 cc0_1: ha ha. we haven't encountered that. yet.
Feb 12 16:29:52 it's partly my fault - i had the h/w guys run uart0 to the le910 because i wasn't sure we would be using the qmi driver, perhaps just plain old ppp instead.
Feb 12 16:33:54 _av500_: kapish?
Feb 12 16:34:46 capisce?
Feb 12 16:37:51 yates: of all the uarts available on the BBB you used uart0 for that? :/
Feb 12 16:38:13 and regardless of what the le910 was doing it shouldn't have been babbling
Feb 12 16:38:16 we're using all 3 uarts on the am3352
Feb 12 16:38:35 "shouldn't have" != does
Feb 12 16:39:09 it has 7 uarts
Feb 12 16:39:14 this may be a good thing to bubble up to telit.
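
The uart0 theory above depends on the kernel console still being routed to that port. A minimal sketch of how one might check which devices are named in `console=` arguments on a kernel command line; the function name and the sample command line are made up for illustration, and the actual tty name on a given board may differ:

```python
def console_devices(cmdline: str) -> list[str]:
    """Return the device name from every console= argument, in order."""
    devices = []
    for arg in cmdline.split():
        if arg.startswith("console="):
            # "console=ttyO0,115200n8" -> "ttyO0" (drop baud rate/options)
            devices.append(arg[len("console="):].split(",")[0])
    return devices


# Hypothetical command line, not taken from the board discussed here.
sample = "console=ttyO0,115200n8 root=/dev/mmcblk0p2 rootwait quiet"
print(console_devices(sample))
```

On a running system the string would come from reading `/proc/cmdline` instead of the sample above.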
Feb 12 16:40:15 and given that babble errors are rather frequently seen with musb (even though they should be a protocol error that's not ever seen under any normal circumstance) I'm not inclined to absolve musb here that quickly :P
Feb 12 16:40:34 though a fair share of problems seems to be musb overreacting to some mild external disturbance
Feb 12 16:40:51 so it would still be an interaction problem of sorts
Feb 12 16:42:27 zmatt: the datasheet says all the am335x family have 6 uarts
Feb 12 16:42:50 there is no uart6 pin - searching the datasheet reveals that.
Feb 12 16:43:04 so i still stand corrected. i did not realize that.
Feb 12 16:43:15 i don't know why i was thinking there were only 3.
Feb 12 16:43:59 pruss has another uart
Feb 12 16:44:34 (less featureful but higher max speed)
Feb 12 16:45:27 i see. then they should correct the datasheet.
Feb 12 16:45:45 btw, should you ever find yourself in need of digging into usb again: the relevant docs are the musb doc you now have, the USB subsystem chapter of the TRM (obviously), and also some stuff in the Control Module chapter
Feb 12 16:46:06 but i see them now, pr0_uartxyz
Feb 12 16:46:25 zmatt: yes, i looked at 2/3 of those already.
Feb 12 16:46:33 2 out of 3
Feb 12 16:46:39 and iirc the DM814x (which has a virtually identical USB subsystem) has a different subset of documentation in the USB chapter of its TRM
Feb 12 16:47:43 i really don't think this is a musb problem, it's a "we fucked up" and/or "telit's chip burps on the usb when getting certain crap on the uart input" problem.
Feb 12 16:48:16 know that i've looked at it 100x more carefully than you...
Feb 12 16:48:26 our particular situation, that is
Feb 12 16:48:30 possibly. so far in all cases I've had usb problems I can also assign external blame, but in all cases musb also overreacted
Feb 12 16:49:32 perhaps this is not such a case. perhaps the 910 DID do something illegal on the bus.
Feb 12 16:49:40 perhaps
Feb 12 16:49:53 not really enough data to KNOW one way or the other (yet).
Feb 12 16:50:20 true, and often annoyingly difficult to get full insight
Feb 12 16:50:39 these complex, multilayer PCBs make it damn near impossible to see buried traces.
Feb 12 16:50:53 coupled with bgas
Feb 12 16:50:58 yes
Feb 12 16:53:59 going to the gym...
Feb 12 18:28:58 rcn-ee: found our problem (at least 80 percent sure)
Feb 12 20:52:41 hello
Feb 12 20:52:44 guys
Feb 12 20:52:48 hi Zmatt
Feb 12 20:52:52 are you here
Feb 12 20:56:54 does the raspberry and the beagle have the same usb cable ?
Feb 12 21:10:28 No, my rPi has a micro USB and my BBB has a mini
Feb 12 21:10:44 (Assuming you are talking about BeagleBone Blacks)
Feb 12 21:10:48 unless the beagle is green. ;)
Feb 12 21:11:24 I don't know what the other Beagles have
Feb 12 21:11:49 they stuck a micro on the green...
Feb 12 21:54:01 superglue?
Feb 12 22:09:14 http://paste.debian.net/382084/
Feb 12 22:09:29 babble at line 342
Feb 12 22:10:00 what are all these "...Failed to get cpu0 regulator/voltdm"?
Feb 12 22:13:53 yates, the voltdm was something i had cherry picked from v4.1.x-ti, however after v4.4.0 it needs re-porting with the new opp later changes..
Feb 12 22:14:06 i ripped them out a couple weeks back: https://github.com/RobertCNelson/bb-kernel/commit/eb2313a05f71b28b58c99b6e31353f543169fc2e
Feb 12 22:19:58 opp == operating performance points? this is associated with run-time power management?
Feb 12 22:20:21 yeapers..
Feb 12 22:21:37 yates, here's the quick gist: http://git.ti.com/gitweb/?p=ti-linux-kernel/ti-linux-kernel.git;a=commit;h=d3e2dd94ed47bdfbd1cce104b8e2d0f5584a35e8
Feb 12 22:23:38 could this be causing usb babble interrupts? (i know that's vague, please interpolate for me)
Feb 12 22:23:54 thanks for the link. read it.
Feb 12 22:24:33 i have branch am33x-v4.4 checked out of bb-kernel. if i just do a git pull, will i get these changes?
Feb 12 22:24:34 sorry nope... un-related..
Feb 12 22:24:40 yeap..
Feb 12 22:25:03 hokay.
Feb 12 22:25:39 do you know Yordan? (e2e)
Feb 12 22:26:06 he has implied there that runtime power management issues can cause babbles.
Feb 12 22:26:39 by suggesting as a potential solution turning it off in the build (config menu)
Feb 12 22:27:38 namely, here: https://e2e.ti.com/support/embedded/linux/f/354/t/484475
Feb 12 22:28:24 is that essentially not correct?
Feb 12 22:32:57 nah i think border line usb spec hardware causes babble issues. ;)
Feb 12 22:33:37 you mean the hardware ip on the SoC? or on the circuit board?
Feb 12 22:33:59 or on client devices?
Feb 12 22:34:07 client devices..
Feb 12 22:34:32 you said yourself in that email that you're using an LE910 with no problem.
Feb 12 22:34:38 that's the only thing we have on our usb
Feb 12 22:34:58 of course there are a lot of versions of LE910 with assorted versions too
Feb 12 22:35:12 -svg, -nvg, -na, ...
Feb 12 22:35:28 assorted firmware revisions, that is
Feb 12 22:35:30 -nag...
Feb 12 22:35:45 -nvg here.
Feb 12 22:36:06 Bus 001 Device 003: ID 1bc7:0021 Telit HE910
Feb 12 22:36:49 nothing from: `dmesg | grep -i babble`
Feb 12 22:37:07 it's been running for 6hours..
Feb 12 22:37:15 Bus 001 Device 003: ID 1bc7:1201 Telit
Feb 12 22:37:52 are you running it on some sort of a cape?
Feb 12 22:38:06 although it's talking to a "tower" it hasn't fully connected yet... Supposed to have a world sim... but...
Feb 12 22:38:40 it's connected on the nimblelink cape..
Feb 12 22:38:43 that's the frustrating thing here: i can get on the network and start uploading to the cloud just fine, IF I can avoid babble interrupts!
Feb 12 22:38:53 grr!
Feb 12 22:39:17 it's got a built-in hub: https://paste.debian.net/382107/
Feb 12 22:39:21 so maybe that saves it?
Feb 12 22:39:45 you're saying on the beaglebone green?
Feb 12 22:40:05 no
Feb 12 22:40:06 i see
Feb 12 22:40:17 on the client?
Feb 12 22:40:22 no i moved from the green, couldn't push enough current thru the usb connector...
Feb 12 22:40:50 black/nimblelink cape/telit he910..
Feb 12 22:41:51 so the cape has the hub on it?
Feb 12 22:41:54 http://paste.debian.net/382109/
Feb 12 22:42:31 yates, correct, line 3: |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M that's the hub between the am335x and telit he910..
Feb 12 22:43:19 that could definitely be the reason yours works and mine doesn't!
Feb 12 22:44:22 do you have room to stick a hub on your board?
Feb 12 22:45:20 well we've directly wired things: dm, dp, and drvvbus directly to the le910 from the am3352. i'm not even sure the traces aren't buried.
Feb 12 22:45:33 we have room, though, yes.
Feb 12 22:47:06 i was thinking the same thing, though. sure would be nice to try. this problem has skewered our project.
Feb 12 22:48:30 I've also seen a hub *cause* babbling though
Feb 12 22:49:07 though it was an iffy one with patch wires :P
Feb 12 22:49:31 maybe some hub/root_hub interfaces are good and others aren't?
Feb 12 22:52:58 root_hub is the virtual one-port "hub" representing the usb port
Feb 12 22:54:19 I'm wondering what exactly the usb phy is reporting to musb when this purported babble is happening... since some kind of receiver problem seems more likely to me than actual babbling
Feb 12 22:54:54 that was answered nicely in that mentor doc, section 8.5.5
Feb 12 22:55:46 no that's just stating the (usb-standard) definition of babble
Feb 12 22:57:04 however unless the usb peripheral is very broken it is unlikely to be committing such a protocol error, so that suggests the phy is misbehaving, or at least not getting along with the device
Feb 12 22:57:15 s/peripheral/device/
Feb 12 22:57:47 i.e. reporting the bus is still active when it isn't
Feb 12 22:58:43 btw, are you getting only babbles or also vbus errors?
Feb 12 22:59:29 i don't know - how do i check? grep for vbus in kern.log?
Feb 12 22:59:41 for example
Feb 12 23:00:24 no instance of "vbus" there
Feb 12 23:00:37 case-insensitive
Feb 12 23:00:51 several babbles, though..
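
The case-insensitive check for babble and vbus messages described above could also be scripted. A rough sketch, assuming only that the messages contain the words "babble" or "vbus" somewhere on the line; the function name and the sample lines are invented for illustration and are not real driver output:

```python
import re

# Case-insensitive patterns, matching the "grep -i" approach from the chat.
PATTERNS = {
    "babble": re.compile(r"babble", re.IGNORECASE),
    "vbus": re.compile(r"vbus", re.IGNORECASE),
}


def count_usb_errors(lines):
    """Count how many lines mention each keyword (any letter case)."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Made-up log lines, only to exercise the function.
sample_log = [
    "kernel: example-usb: Babble detected",
    "kernel: example-eth: link is down",
    "kernel: example-usb: BABBLE detected again",
]
print(count_usb_errors(sample_log))
```

For a real log, an open file object can be passed directly, e.g. `count_usb_errors(open("/var/log/kern.log"))`, where the path is an assumption about where kern.log lives on the board.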
Feb 12 23:00:56 ok
Feb 12 23:01:43 so by "phy" you mean the physical interface? the wires, connectors, etc?
Feb 12 23:02:43 no, look at the block diagram of the usb system (TRM figure 16-1)
Feb 12 23:02:55 the am335x trm?
Feb 12 23:02:59 yeah
Feb 12 23:03:32 the block labeled "USB 2.0" is musbmhdrc
Feb 12 23:03:38 *blocks
Feb 12 23:04:40 however connects via UTMI+3 to the PHY which contains the actual transceivers and such
Feb 12 23:05:23 (this arrangement is somewhat analogous to an ethernet peripheral connecting via MII to an ethernet PHY)
Feb 12 23:05:41 except these PHYs are integrated on chip
Feb 12 23:06:23 they (along with every other part of the subsystem apart from the two musb cores) are TI's own creations
Feb 12 23:06:37 i see
Feb 12 23:07:36 there are so many possibilities though
Feb 12 23:07:56 unless the musb core is somehow confused about when the frame should have ended, a false babble would imply the PHY is reporting the bus is still active when in fact it's not
Feb 12 23:08:27 how is a 'bus active' state determined? some state of dp and dm?
Feb 12 23:08:34 both hi, e.g.?
Feb 12 23:09:23 rather, "bus inactive"
Feb 12 23:09:23 I think line undriven (i.e. SE0)
Feb 12 23:09:42 where are you getting this terminology from?
Feb 12 23:09:55 i'm a usb virgin..
Feb 12 23:11:25 it's usb standard terminology
Feb 12 23:11:37 even appears on its wikipedia page
Feb 12 23:12:05 reading this now - looks good: http://www.usbmadesimple.co.uk/ums_3.htm
Feb 12 23:12:45 yeah, though highspeed is somewhat different thuogh
Feb 12 23:12:47 *though
Feb 12 23:13:28 are those 15k pulldowns inside the am335x phy?
Feb 12 23:13:34 yes
Feb 12 23:13:36 we don't have them on our board.
Feb 12 23:14:30 i wonder if we have a grounding issue
Feb 12 23:14:43 e.g., the ground of the le910 is not connected to the SoC ground.
Feb 12 23:14:54 it better be
Feb 12 23:16:11 se0 = both lines low
Feb 12 23:16:21 yes, that's the idle state for highspeed
Feb 12 23:16:24 iric
Feb 12 23:16:26 *iirc
Feb 12 23:17:39 i was wrong about the traces
Feb 12 23:17:51 our hw guys actually brought the signals out to test points
Feb 12 23:17:51 in fact since the highspeed voltage levels are lower than low/full-speed's logic-high, a bus communicating at highspeed looks like it's in reset from a non-highspeed pov (e.g. the fullspeed usb bus analyzer we have)
Feb 12 23:17:58 should i have a look?
Feb 12 23:18:02 (damn right!)
Feb 12 23:18:51 probably babble on se1
Feb 12 23:18:59 for one condition
Feb 12 23:19:23 I hope they were careful with those test points to avoid adding significant stubs to the bus
Feb 12 23:19:36 se1 should never appear on a highspeed usb bus
Feb 12 23:19:58 on any bus, high speed or not, right?
Feb 12 23:20:55 possibly, I don't know all details of usb at electrical level either
Feb 12 23:21:39 but a HS receiver cannot even report se1 as linestate, illegal or not
Feb 12 23:22:31 rather it's a differential receiver with detection of when the line is idle ("squelch")
Feb 12 23:23:04 i'm going to have a look at dm and dp, using the processor ground as reference. seem like a good idea?
Feb 12 23:23:40 keep in mind it's half a gigahertz of bandwidth... probably very easily disturbed
Feb 12 23:24:11 holy *&##@
Feb 12 23:24:23 i guess i won't see much with my 40 MHz scope?
Feb 12 23:25:06 at least i can check the idle states and do things like pull the ethernet plug..
Feb 12 23:25:29 or if a device is connected, is the host periodically polling it?
Feb 12 23:25:37 yes
Feb 12 23:25:54 unless in suspend, but then it's also no longer in HS mode
Feb 12 23:26:24 i dunno - you think it's worth it?
Feb 12 23:26:31 doubtful
Feb 12 23:27:21 of course you're free to try, but I'm not sure the bus will even still work once that probe touches it
Feb 12 23:27:48 back to the babble condition specified in the musb doc, aren't they saying "this is the condition under which our ip block will report a babble"?
Feb 12 23:28:03 yes, which is also specified by the usb standard
Feb 12 23:31:00 note btw there are also a whole bunch of debug registers, especially for the phy, though many of them quite obscure
Feb 12 23:33:09 this problem is hard. i don't know how to tackle it or what to do about it.
Feb 12 23:33:22 i have just trying things and not knowing the real root cause.
Feb 12 23:33:29 s/have/hate/
Feb 12 23:34:02 I know how you feel
Feb 12 23:34:30 btw, another one of our boards is getting babble interrupts independent of whether or not the eth0 is plugged in.
Feb 12 23:35:02 it is almost a direct correlation on my board.
Feb 12 23:35:46 what would you try first, matt?
Feb 12 23:36:43 I'm slightly worried by the testpoints you mentioned, since they're not very commonly seen on highspeed traces...
Feb 12 23:37:01 I hope the stubs are very short
Feb 12 23:38:00 you mentioned the traces were buried... routed as striplines?
Feb 12 23:39:05 they're just pads, no pins on them.
Feb 12 23:39:17 i don't know re: striplines
Feb 12 23:40:21 stripline meaning basically a certain distance apart and thickness based on board material?
Feb 12 23:40:29 certain characteristic impedance?
Feb 12 23:41:39 stripline = signal trace sandwiched between two ground planes
Feb 12 23:42:09 vs microstrip = signal trace routed on surface (over a ground plane on the next layer)
Feb 12 23:42:59 for usb going a short distance between two ICs microstrip would make more sense to me
Feb 12 23:45:24 "NOTE: Test points of any kind are NOT permitted on the DP/DM pair."
Feb 12 23:45:33 -- AM335x and AM43xx USB Layout Guidelines
Feb 12 23:45:47 clearly my worry was not unjustified :/
Feb 12 23:47:47 http://www.ti.com/lit/an/sprabt8a/sprabt8a.pdf stuff related to connectors and esd protection doesn't apply obviously, but the rest of it does
Feb 12 23:48:43 huh. good find.
Feb 12 23:49:20 yeah, i finally got the geometry of stripline and microstrip, thanks to Johnon and Graham's, "High-Speed Digital Design"
Feb 12 23:49:36 nice picture on p.140
Feb 12 23:49:43 Johnson
Feb 12 23:49:43 wikipedia has pages too
Feb 12 23:50:44 why is the conductor not centered between the ground planes in the stripline picture there?
Feb 12 23:51:52 probably to show the general case (i.e. it can be centered, but isn't necessarily)
Feb 12 23:52:21 that's actually mentioned on the page
Feb 12 23:52:28 yeah, now i read it.
Feb 12 23:53:58 stripline is often overkill though for shorter distances, plus you need vias to enter and exit it
Feb 12 23:54:47 the two parts are no more than 2 inches apart
Feb 12 23:57:24 how short is that string?
Feb 12 23:57:25 :)
Feb 12 23:57:45 looks like the usb device port on the BBB is similar (routed as microstrip on the bottom of pcb)
Feb 13 00:02:54 isn't it true that the gpios are initialized to inputs by the SoC pad hw interface at reset?
Feb 13 00:03:41 yeah, with weak pull up or down on most of them (see e.g. my pins spreadsheet)
Feb 13 00:03:47 i.e., at reset, before the first loc is executed.
Feb 13 00:03:56 ah.
Feb 13 00:03:59 where is that?
Feb 13 00:04:11 https://goo.gl/Jkcg0w
Feb 13 00:05:10 the orange tabs are BBB specific, the rest isn't
Feb 13 00:05:44 nice
Feb 13 00:06:08 some pins have different state during reset than immediately after (I'm not 100% sure whether this refers to any kind of reset or more specifically POR)
Feb 13 00:06:22 hence the two columns Reset vs Post-reset state
Feb 13 00:11:52 gotta go to the gym
Feb 13 00:12:10 thanks for the insights amatt
Feb 13 00:13:00 yw
Feb 13 02:29:23 zmatt: you still there?
Feb 13 02:29:36 i had this idea:
Feb 13 02:30:20 isn't there a mechanism by which the usb bus determines if there is a low-speed (12 MHz?) device on the bus and slow down its operation accordingly?
Feb 13 02:31:26 so maybe we could force that mode somehow and see if that kills the babble.
Feb 13 02:57:36 or is there a way in software to force the musb into low-speed mode?
**** ENDING LOGGING AT Sat Feb 13 02:59:58 2016
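
For reference on the speed question that closes the log: USB 2.0 defines three signalling rates, and 12 Mbit/s is the full-speed rate, not the low-speed one. A small lookup sketch (the dictionary and function names are made up for illustration):

```python
# Nominal USB 2.0 signalling rates in Mbit/s.
USB_SPEEDS_MBIT = {
    "low": 1.5,
    "full": 12.0,
    "high": 480.0,
}


def speed_name(mbit_per_s: float) -> str:
    """Map a nominal signalling rate to its USB 2.0 speed name."""
    for name, rate in USB_SPEEDS_MBIT.items():
        if rate == mbit_per_s:
            return name
    raise ValueError(f"not a USB 2.0 signalling rate: {mbit_per_s}")


print(speed_name(12.0))
print(speed_name(480.0))
```

The "480M" shown for the hub in the pasted USB tree earlier in the log corresponds to the high-speed entry.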