**** BEGIN LOGGING AT Mon May 18 02:59:57 2020
May 18 03:01:54 Set - I added a new section to my html file. I added a section called builders corner.
May 18 03:02:34 Every link is verified and has stock.
May 18 03:45:39 Is this channel active?
May 18 03:46:30 Looking for a place to go for technical support. I need to do a factory reset of my board
May 18 03:48:44 I've loaded an SD card with a bbai image. Do I hold the reset button and power cycle? Because that's not working. You see I messed up network configuration and so my board doesn't show up on wifi
May 18 07:53:39 zmatt: the pps-gmtimer module "works". There seems to be something wrong with it, though: I see no difference to pps-gpio.
May 18 08:40:38 thinkfat: yeah I just looked at the code, and it's a complete mystery to me... I have no clue wtf it's trying to do
May 18 08:41:08 I don't quite get the pps part of it ;)
May 18 08:41:53 the rest is more or less clear - when a capture interrupt happens it reads the current counter and the capture register and then computes a picosecond offset between both
May 18 08:43:20 so, basically it tries to find out the time difference between the IRQ handler entry and the capture event, but what it then does I cannot get
May 18 08:43:28 oh like that. that seems... pointlessly complicated
May 18 08:44:20 it subtracts this difference from the pps timestamp. I don't get why. this difference must be very very noisy since the interrupt entry itself is not timestamped
May 18 08:45:25 well in theory if you take the timer difference between capture and now and subtract it from the current timestamp you should get a timestamp for the capture event
May 18 08:45:38 assuming what you're doing doesn't get interrupted itself
May 18 08:45:53 and that it always takes the same amount of time...
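Editor's sketch of the back-dating idea described above: take a system timestamp in the IRQ handler, read the timer counter, and subtract the time elapsed since the capture register latched. This is not the pps-gmtimer code; the function name is hypothetical, the 32-bit counter width is an assumption, and the 24 MHz rate is the one reported elsewhere in this log.

```python
# Hypothetical illustration, not the driver's actual implementation.
COUNTER_BITS = 32          # assumption: 32-bit free-running timer counter
TIMER_HZ = 24_000_000      # timer rate reported in this log (24 MHz)
NSEC_PER_SEC = 1_000_000_000


def capture_timestamp_ns(irq_timestamp_ns, counter_now, capture_value,
                         timer_hz=TIMER_HZ, counter_bits=COUNTER_BITS):
    """Back-date a timestamp taken in the IRQ handler to the capture event."""
    mask = (1 << counter_bits) - 1
    # Modular subtraction stays correct if the counter wrapped between
    # the capture event and the moment the counter was read.
    elapsed_cycles = (counter_now - capture_value) & mask
    elapsed_ns = elapsed_cycles * NSEC_PER_SEC // timer_hz
    return irq_timestamp_ns - elapsed_ns


if __name__ == "__main__":
    # 2400 cycles at 24 MHz is 100 microseconds before the IRQ timestamp.
    print(capture_timestamp_ns(1_000_000_000, 2400, 0))
```

Any delay between taking the system timestamp and reading the counter goes straight into the result, which is the jitter source discussed in this conversation.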
May 18 08:46:00 nope that's not an assumption
May 18 08:46:30 (and interrupt handlers don't get interrupted unless you're using an RT kernel)
May 18 08:47:12 I do but the handler is "IRQF_TIMER" so it should not run threaded...
May 18 08:47:51 ok. well then the only remaining jitter is due to variable access latency to the two timers
May 18 08:48:31 *should be
May 18 08:48:37 if there's more jitter than that, something is wrong
May 18 08:48:57 hm, I switched the system clocksource to the same timer that captures the events...
May 18 08:49:33 that will only help if the driver takes advantage of this, which it doesn't seem to
May 18 08:53:48 what rate does the timer report in the pr_info() on line 265 ?
May 18 08:54:37 24MHz
May 18 08:54:39 that's ok
May 18 08:54:42 indeed
May 18 08:54:46 but I think I found something
May 18 08:55:44 no, all good...
May 18 08:56:14 I think what it does is OK...
May 18 08:56:46 I mean, if you're seeing as much jitter as with pps-gpio then it's clearly still doing something wrong
May 18 08:56:58 it should be several orders of magnitude lower
May 18 08:57:06 on IRQ entry it gets a system timestamp and then subtracts the computed delta. That seems to be fine
May 18 08:57:36 it's not ideal but it should still greatly improve jitter
May 18 08:58:18 it's actually not so bad, unless the time between getting the system timestamp and reading the counter register is variable
May 18 08:58:20 (ideal would be using the same timer as clocksource and for capture, and directly convert the capture counter value to timestamp)
May 18 08:58:30 yes
May 18 08:58:57 maybe check if any of the functions used to access either of the timestamps involved grabbing a mutex
May 18 08:58:57 I did switch the system clocksource to the same timer, but the driver doesn't take advantage of it
May 18 08:59:00 ?
May 18 08:59:15 no, it's supposed to be run from IRQ, no mutexes here.
May 18 08:59:21 spinlocks, yes, but no mutexes
May 18 08:59:29 hmmm
May 18 08:59:32 spinlocks are mutexes on an RT kernel
May 18 08:59:33 but spinlock is just as bad
May 18 08:59:51 (and are hence forbidden in non-threaded irq handlers)
May 18 09:14:02 hm, using the raw capture value seems to be a bit difficult, the timekeeping kernel api doesn't seem to have a function for that
May 18 09:21:17 lemme see what I did, since I actually acquired the need to convert timer counter values (captured by PRU) to raw monotonic timestamps myself
May 18 09:23:42 ah I use ktime_get_snapshot() to get a matching pair of (timer cycles, raw monotonic timestamp) and just use that as a reference point to translate other timer values
May 18 09:24:07 though actually, if you know you're the clocksource, translation is easy, and there are kernel functions for that
May 18 09:25:49 I think?
May 18 09:26:21 I might be wrong
May 18 09:27:06 I kind of assumed a clocksource would embed a struct timecounter or something, but that doesn't seem to be the case
May 18 09:27:19 pps_get_ts() also uses ktime_get_snapshot(), so I guess your approach is the same?
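Editor's sketch of the reference-point translation described above: given one matching pair of (timer cycles, raw monotonic nanoseconds), any other counter value from the same timer can be mapped to a timestamp. The helper names, the 32-bit counter width, and the chosen shift are assumptions for illustration. The multiply-and-shift conversion mirrors the arithmetic of the kernel's clocksource_cyc2ns(), but this is plain Python, not kernel code.

```python
NSEC_PER_SEC = 1_000_000_000


def mult_for(timer_hz, shift):
    """Fixed-point scale factor so that (cycles * mult) >> shift gives ns."""
    return ((NSEC_PER_SEC << shift) + timer_hz // 2) // timer_hz


def cyc2ns(cycles, mult, shift):
    """Cycles to nanoseconds using a multiply and a right shift."""
    return (cycles * mult) >> shift


def translate(counter_value, ref_cycles, ref_ns, mult, shift, counter_bits=32):
    """Map a counter value to a timestamp using one (cycles, ns) reference."""
    mask = (1 << counter_bits) - 1
    delta = (counter_value - ref_cycles) & mask
    # Treat the difference as signed so values captured slightly before
    # the reference point translate to earlier timestamps.
    if delta >> (counter_bits - 1):
        delta -= 1 << counter_bits
    if delta >= 0:
        return ref_ns + cyc2ns(delta, mult, shift)
    return ref_ns - cyc2ns(-delta, mult, shift)


if __name__ == "__main__":
    shift = 24                          # hypothetical choice of shift
    mult = mult_for(24_000_000, shift)  # 24 MHz timer, as in this log
    print(translate(100 + 24_000, 100, 5_000_000, mult, shift))
```

The reference pair only stays valid while the clock's rate is not being adjusted, so in practice it would be refreshed regularly.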
May 18 09:27:51 the difference is that ktime_get_snapshot includes the raw counter value
May 18 09:27:56 ah, indeed
May 18 09:28:14 along with a pointer to the clocksource
May 18 09:28:51 so you can check that the clocksource is your own, and if so use the cycles value from the snapshot
May 18 09:29:20 yes, that is a good idea
May 18 09:29:39 and if not, fall back to using the other method and using pps_get_ts()
May 18 09:34:02 meh, the kernel seems to be compiled without CONFIG_NTP_PPS :-(
May 18 09:34:24 yeah what I did to implement "cross-timestamping" was add a small extension to the clocksource API to acquire the current clocksource (with a mutex on it), so I can read the clocksource very close to sampling my own counter (under a single spinlock) and then convert it with clocksource_cyc2ns outside the spinlock
May 18 09:34:29 see https://github.com/dutchanddutch/bb-kernel/blob/am33x-v4.14/patches/local/0016-cpts-add-cross-timestamping-support.patch
May 18 09:38:01 the amount of time taken for the cross-timestamp is pretty reasonable:
May 18 09:38:03 [ 10.240803] cpsw 4a100000.ethernet: cross-timestamp took 58 cycles (290 ns)
May 18 09:38:36 that seems appropriate for reading a peripheral register
May 18 09:39:30 well, no, it's still a bit too long for that, but it's not too bad
May 18 09:40:49 ("cycles" here means timer-cycles of the 200 MHz PRU eCAP timer I'm using as clocksource)
May 18 09:44:16 hmmm
May 18 09:44:50 I'm a bit unhappy about the uint32_t use in all the picosecond calculations
May 18 09:44:59 I hope there's no overflowing
May 18 09:45:18 picoseconds seem silly anyway, the linux clocksource only has nanosecond resolution
May 18 09:47:11 (and its fixed-point conversion from timer cycles to ns uses a scaling factor computed to ensure no overflow happens, see cs->mult and cs->shift, and clocksource_cyc2ns)
May 18 09:59:24 ooh, that looks much, much better
May 18 09:59:45 using the system timestamp directly
May 18 10:00:28 the corresponding cycle count
instead of reading the counter register separately
May 18 10:00:47 that gives a standard deviation of 15ns instead of 600ns
May 18 10:08:25 there ya go
May 18 10:17:14 hm, weird. all of a sudden I'm back to "rotten" behaviour
May 18 10:17:50 press ctrl-Z to go back
May 18 10:17:58 lol
May 18 10:18:29 now it came good again...
May 18 10:18:54 space weather event, maybe
May 18 10:19:08 there's a gps receiver connected to the pps input
May 18 10:21:22 yeah, now it's good again
May 18 10:34:34 yup, seems to be the gps input, I see it also in the gpsdo statistics
May 18 12:09:41 now, sawtooth correction, that would be a thing, too.
May 18 14:09:58 zmatt: a 200MHz timer would bring timestamp granularity in very interesting ranges...
May 18 14:11:24 zmatt: can the eCAP do interval counting?
May 18 14:19:01 what's your definition thereof?
May 18 14:20:25 sub-nanosecond
May 18 14:20:40 nothing on the bbb is sub-nanosecond
May 18 14:20:53 interval counting means, start on one signal, stop on another signal
May 18 14:21:11 ah, yes.
May 18 14:22:39 "signal" ? you mean edges of the external input? then yes. I mean, anything that can timestamp edges can do that, though eCAP can do so better since it has multiple capture registers (so it can measure the width of a pulse or the distance between two pulses even if software doesn't have time to process the first timestamp before the next one is triggered)
May 18 14:23:07 ok, I misread..
May 18 14:23:11 200MHz means 5ns
May 18 14:23:16 close, but not quite
May 18 14:23:18 correct
May 18 14:23:37 which is the best you can do on the BBB I think
May 18 14:24:25 yes, edges on two separate external inputs would be ideal, though not required
May 18 14:25:29 eCAP has only a single input, but it's possible to synchronize the counters of the eCAP instances
May 18 14:26:14 can it count between rising and falling edge of the input?
May 18 14:27:10 well, moot point.
5ns is not enough
May 18 14:27:23 it can
May 18 14:41:09 the only thing more precise than 5ns is eHRPWM, whose output supports sub-nanosecond adjustment (of either phase or duty cycle) using a delay line, but it's an output only and it's uncalibrated
May 18 14:46:51 zmatt: well I'm fine with my dedicated time-to-digital chip, I get 56ps resolution for a very reasonable price
May 18 14:59:37 yeah that's hard to compete with
May 18 14:59:54 what are you using?
May 18 15:01:44 zmatt: someone said you're the resident emmc wizard
May 18 15:02:11 I have a beaglebone enhanced (sancloud) with a dead-ish emmc
May 18 15:02:22 reading is fine
May 18 15:02:36 any write makes it lock up until power is removed
May 18 15:02:44 and nothing is actually written
May 18 15:02:57 mru: yep, that's what I've typically seen when an eMMC is worn out
May 18 15:03:07 no more spare blocks?
May 18 15:03:34 I don't know how the eMMC works internally or what causes it to lock up like this instead of returning a proper error
May 18 15:03:53 but it's what I've typically seen on wearing out an eMMC
May 18 15:04:06 the odd thing is this one left the factory in february last year
May 18 15:04:33 we have some 10k of them out in the field, and this is the first time we've seen this
May 18 15:04:39 wearout is dependent on amount of data written, not on age
May 18 15:05:11 yes, but writing lots takes time
May 18 15:05:39 many have been running the same system for much longer without issue
May 18 15:06:40 you can easily write in excess of 10 gigabytes per hour if you somehow fuck up
May 18 15:07:10 alternatively, it's conceivable that a power interruption during write could result in corruption of internal bookkeeping tables
May 18 15:07:43 it shouldn't, but hey these are bottom of the market eMMC chips so who knows
May 18 15:08:23 there's no sign that such abnormal amounts of data were written shortly before it died
May 18 15:08:45 if there was a burst of writes several months ago, the traces of that have
been lost
May 18 15:11:08 excess writes are the only clear cause I've seen for dead eMMC. however, I have seen eMMC die where the cause could not be clearly established (but excess writes could also not be excluded as cause)
May 18 15:12:02 is there some tool that can get more info from the chip?
May 18 15:12:10 like smartctl for hard drives
May 18 15:12:30 does the BBE use cheap kingston eMMC like the BBB does?
May 18 15:12:43 micron judging by the logo
May 18 15:12:54 hmm. what bga code?
May 18 15:13:14 where do I find that?
May 18 15:14:18 on the chip :P probably begins with "JW"
May 18 15:15:52 e.g. this one has bga marking "JW896": https://photos.app.goo.gl/iFpxtrs4pMCgmd427
May 18 15:16:32 damn, this is some tough conformal coating
May 18 15:19:17 https://www.sancloud.co.uk/wp-content/uploads/2018/02/IMG_4042.jpg this shows JY976
May 18 15:21:08 JY976 here too
May 18 15:22:13 then dunno, datasheet isn't freely downloadable and my micron account doesn't seem to work anymore (it was attached to an email address at my previous employer)
May 18 15:22:40 later eMMC spec versions have defined some stuff for getting more information, but I've never seen an eMMC that implements it
May 18 15:26:07 you can try grabbing mmc-utils (git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc-utils.git), compile it, and then: sudo ./mmc extcsd read /dev/mmcblk1
May 18 15:29:42 zmatt: TDC7200
May 18 15:30:57 of course there's no need for that kind of resolution for a gpsdo, but it's easier to throw bits away than to create them.
May 18 15:38:24 cute chip
May 18 15:40:16 zmatt: well, that spat out a lot of lines
May 18 15:47:13 mru: yes, yes it does
May 18 15:49:07 so, which one am I looking for?
May 18 15:49:44 are any of the "Vendor Specific Fields" nonzero?
May 18 15:50:05 some are
May 18 15:50:44 okay, then it's possible it's trying to say something about the state of the eMMC...
perhaps the publicly unavailable datasheet can shed more light on it :)
May 18 15:51:36 is "eMMC Life Time Estimation" of any relevance?
May 18 15:52:19 sounds plausible enough, what does it say?
May 18 15:52:53 there's eMMC Life Time Estimation A and B
May 18 15:52:58 both have value 0x01
May 18 15:53:17 let me check a good one and compare
May 18 15:53:32 "0% - 10% device life time used"
May 18 15:54:04 i.e. it's saying that the device is not worn out, not even a little bit
May 18 15:54:36 yeah, we don't normally do a lot of writing
May 18 15:55:14 do you have a PRE_EOL_INFO field?
May 18 15:55:21 also 1
May 18 15:55:25 "normal"
May 18 15:55:52 so yeah, the eMMC is saying it's fine, apart from crashing on write ;P
May 18 15:56:21 I guess you can try contacting micron and/or sancloud
May 18 15:56:59 or forget about it since it's only happened once
May 18 15:57:18 ... so far.
May 18 15:57:59 sending out some emails is relatively low-effort
May 18 15:58:30 and if your eMMC claims it's perfectly fine and not worn out at all, and then the controller crashes on attempted write, that's a pretty obvious device failure
May 18 15:58:57 at least sancloud may want to know that this is happening, in case there are more reports of it
May 18 16:02:09 hmm, this other board has an older emmc chip
May 18 16:02:49 anyhow "Exception events status" is 0 on the good board, 1 on the bad one
May 18 16:03:03 reasonable to guess that something bad did happen?
May 18 16:04:18 hmm, "URGENT_BKOPS"
May 18 16:05:07 is that a field name?
May 18 16:05:22 that's bit 0 of EXCEPTION_EVENTS_STATUS
May 18 16:06:11 ok
May 18 16:06:26 meaning?
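Editor's sketch of how the health fields read out above could be interpreted. Only the values seen in this log are confirmed by it: life time estimate 0x01 meaning "0% - 10% device life time used", PRE_EOL_INFO 1 meaning "normal", and URGENT_BKOPS being bit 0 of EXCEPTION_EVENTS_STATUS. The remaining table entries are the editor's reading of the eMMC spec and should be checked against it. The function names are hypothetical and the inputs are plain integers, not parsed mmc-utils output.

```python
URGENT_BKOPS = 1 << 0   # bit 0 of EXCEPTION_EVENTS_STATUS, per this log

# 0x01 is confirmed by the log; 0x02 and 0x03 are assumed from the spec.
PRE_EOL_INFO = {0x01: "normal", 0x02: "warning", 0x03: "urgent"}


def life_time_estimate(value):
    """Describe a life time estimation value (type A or type B)."""
    if 0x01 <= value <= 0x0A:
        # Each step is assumed to cover a 10% band of estimated life time.
        return f"{(value - 1) * 10}% - {value * 10}% device life time used"
    if value == 0x0B:
        return "exceeded maximum estimated device life time"
    return "not defined / reserved"


def summarize_health(life_a, life_b, pre_eol, exception_events):
    """Collect the health indicators into one readable summary."""
    return {
        "life_time_a": life_time_estimate(life_a),
        "life_time_b": life_time_estimate(life_b),
        "pre_eol": PRE_EOL_INFO.get(pre_eol, "not defined / reserved"),
        "urgent_bkops": bool(exception_events & URGENT_BKOPS),
    }


if __name__ == "__main__":
    # Values reported for the failing board in this log.
    print(summarize_health(0x01, 0x01, 0x01, 0x01))
```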
May 18 16:06:30 it indicates the eMMC is configured to use manual background ops (meaning the OS will tell it what's a good moment to do background operations for internal bookkeeping), and these urgently need to be done
May 18 16:06:39 sorry, I've never had to delve this deeply into the workings of mmc
May 18 16:06:59 I'm guessing linux isn't issuing the command because you're accessing the device in a read-only fashion
May 18 16:07:30 any attempt to access it in a write fashion locks up
May 18 16:08:14 yeah so doing background ops presumably would lock up the controller as well
May 18 16:08:35 anyway, this doesn't seem indicative of an error to me
May 18 16:09:08 how would you kick it into doing those background ops?
May 18 16:09:08 although it might be interesting to tell the eMMC to perform background ops and see if that also causes a lockup... but I'm not sure if there's an easy way to do so
May 18 16:10:19 but "background ops urgently needed" isn't a failure state anyway, it just means performance may be degraded
May 18 16:11:04 what's the value of BKOPS_STATUS ?
May 18 16:11:06 and it would normally do these urgent things in the course of normal operation?
May 18 16:11:22 BKOPS_STATUS: 0x03
May 18 16:11:51 if the OS doesn't give background ops an explicit opportunity, then the eMMC is forced to do them as part of normal operations, causing those to slow down
May 18 16:12:28 "Host shall check the status periodically and start background operations as needed, so that the device has enough time for its maintenance operations, to help reduce the latencies during foreground operations. If the status is at level 3 ("critical"), some operations may extend beyond their original timeouts due to maintenance operations that cannot be delayed anymore."
May 18 16:12:31 but it shouldn't cause the device to break, right?
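Editor's sketch of the BKOPS_STATUS levels discussed above. Level 3 being "critical", with operations possibly exceeding their timeouts, comes from the spec text quoted in this log. The wording for levels 0 to 2 and the two-bit mask are the editor's recollection of the spec and are assumptions. The function names are hypothetical.

```python
# Level 3 is confirmed by the quoted spec text; levels 0-2 are assumed.
BKOPS_LEVELS = {
    0: "no operations required",
    1: "outstanding, non-critical",
    2: "outstanding, performance being impacted",
    3: "outstanding, critical",
}


def describe_bkops_status(raw):
    """Return (level, description) for a raw BKOPS_STATUS byte."""
    level = raw & 0x03   # assumption: the level lives in the low two bits
    return level, BKOPS_LEVELS[level]


def may_exceed_timeouts(raw):
    """At level 3 some operations may run past their original timeouts."""
    return (raw & 0x03) == 3


if __name__ == "__main__":
    # Value reported for the failing board in this log.
    print(describe_bkops_status(0x03), may_exceed_timeouts(0x03))
```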
May 18 16:14:03 no, it just causes performance degradation
May 18 16:14:53 BKOPS is just a performance optimization feature
May 18 16:15:06 so this state could be related to the brokenness but not a cause of it
May 18 16:15:48 it's probably just linux being unwilling to issue bkops (which internally cause writes) to a device that's being used in a readonly fashion
May 18 16:16:12 which would make a lot of sense, since otherwise you probably wouldn't have been able to access this eMMC anymore at all and wouldn't be able to recover data from it :)
May 18 16:16:49 if it issued bkops immediately on seeing the eMMC complain about it
May 18 16:18:10 I could probably patch in a command to perform manual bkops if you think it would be informative to try
May 18 16:18:32 I doubt it
May 18 16:32:47 mru: well, added it anyway: https://github.com/dutchanddutch/mmc-utils/commit/ec5fc0b92a25
May 18 16:41:00 hmm, BKOPS_EN is 0
May 18 16:41:08 LOL
May 18 16:45:37 I mean, I think there's technically nothing in the eMMC spec forbidding an eMMC from reporting non-zero bkops_status while bkops_en is 0
May 18 16:45:46 but it makes no sense
May 18 16:46:32 what the heck, let's try to enable it and see what happens
May 18 16:46:44 beware that that's irreversible
May 18 16:46:53 I don't care what happens to this device
May 18 16:47:10 I've made a full copy of it just in case
May 18 16:48:03 actually eMMC reporting bkops_status >= 2 while !(bkops_en & 1) is pretty bad, and breaks exception handling (since the "urgent_bkops" exception cannot be masked)
May 18 16:48:43 so it results in the exception flag being persistently set while the OS can't do anything about it
May 18 16:49:51 setting BKOPS_EN worked
May 18 16:50:02 but bkops start has no further discernible effect
May 18 16:50:13 no effect on BKOPS_STATUS ?
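Editor's sketch of the inconsistency pointed out above, written as a check on two integer register values. The condition is taken directly from the conversation (bkops_status >= 2 while manual background ops are not enabled). The function name is hypothetical.

```python
def bkops_state_is_inconsistent(bkops_status, bkops_en):
    """True when the device demands urgent background ops that the host
    was never allowed to start (manual bkops enable bit is clear)."""
    manual_bkops_enabled = bool(bkops_en & 1)
    return bkops_status >= 2 and not manual_bkops_enabled


if __name__ == "__main__":
    # Values reported for the failing board before BKOPS_EN was set.
    print(bkops_state_is_inconsistent(bkops_status=3, bkops_en=0))
```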
May 18 16:50:18 still 3
May 18 16:50:36 okay that's definitely a chip bug, the eMMC controller is drunk
May 18 16:51:07 unless I just fucked up the implementation of course
May 18 16:51:47 but I don't see much possibility for that, the implementation is pretty trivial
May 18 16:52:01 "Writing any value to this field shall manually start background operations. Device shall stay busy till no more background operations are needed."
May 18 16:52:23 so after writing anything to BKOPS_START, it is required that BKOPS_STATUS is zdero
May 18 16:52:26 *zero
May 18 16:53:48 looks like you're writing zero
May 18 16:54:29 I am yes, which seems like a perfectly adequate instance of "any value"
May 18 16:54:40 not disagreeing
May 18 16:55:09 btw, just below in do_write_bkops_en() the error message uses "value" while you're writing BKOPS_ENABLE
May 18 16:55:21 or whoever wrote that code
May 18 16:55:33 yeah I noticed that bug
May 18 17:31:03 zmatt: you'll like this
May 18 17:31:10 writing 1 has much more of an effect
May 18 17:31:29 it locks up the device
May 18 17:42:49 mru: ah yes, because 1 is much more an "any value" than 0
May 18 17:44:54 it's anier
May 18 17:45:05 (any, anier, aniest)
May 18 17:45:19 all numbers are any, but some are more any than others
May 19 01:55:16 It is amazing how books from 2011 still are relevant to today and source.
May 19 01:56:31 This web dev. book I figured would be bust. I was/am completely wrong. It rules!
May 19 02:02:18 Why are people moving away from Apache2 and towards Nginx?
May 19 02:03:15 a better question is why are so many people still using apache :P
May 19 02:04:03 I do not understand. It is like apache2 was around for so long and nginx is something new and gets "credit."
May 19 02:04:27 They both seem to be similar from my standpoint.
May 19 02:04:52 I mean, nginx has been around for quite a while now too
May 19 02:05:05 Right but not as long as apache2, right?
May 19 02:05:44 correct, apache2 initially gained at the expense of the NCSA webserver
May 19 02:05:53 @zmatt: Do you get what I am saying? Why the switch in usage from a dev's view?
May 19 02:06:15 Oh.
May 19 02:06:55 "Nginx was written with an explicit goal of outperforming the Apache web server. Out of the box, serving static files, Nginx uses much less memory than Apache, and can handle roughly four times as many requests per second." -- wikipedia
May 19 02:07:18 I am sure there is much to it. I am just starting to see the initial ideas of apache2 being removed from specific software like cherrypy or w/ BBB.io's boards for servers.
May 19 02:07:19 Oh.
May 19 02:07:25 Well, that pretty much sums things up.
May 19 02:07:44 that doesn't mean it's better at everything obviously
May 19 02:07:54 but it helps to understand why it has popularity
May 19 02:08:20 Right. It makes more sense now. Less memory for some of the same tasks and that matters.
May 19 02:08:25 truth nothing is perfect each has their flaws people are more familiar with Apache's flaws :D
May 19 02:09:27 I have issues w/ apache2 at times w/ my fun site.
May 19 02:10:09 I run the site w/ apache2 and it sometimes allows me to do simple things but I have not mastered it yet. Now, I "need" to switch.
May 19 02:10:53 I tried nginx years ago and the nginx server set up for the config file kept disallowing me to make headway.
May 19 02:10:56 better now before you get into bad habits...
May 19 02:11:05 Sort of. Anyway, too many errors kept me away.
May 19 02:11:16 just use NCSA Httpd
May 19 02:11:18 ;)
May 19 02:11:28 hehe
May 19 02:11:32 Is that "oldschool?"
May 19 02:11:56 from sys-v?
May 19 02:12:15 set_: it was the hot webserver software in 1995 or so
May 19 02:12:24 it is the parent of apache
May 19 02:12:35 Oh. See. I was unaware. Hmm.
May 19 02:12:40 '95!
May 19 02:12:45 and browse with Mosaic
May 19 02:12:59 I only remember '96 and Netscape.
May 19 02:13:17 this newfangled "http"...
gopher is clearly superious
May 19 02:13:23 *superior
May 19 02:13:55 Get this, get that. I only remember how games were fast and how the WWW was slow.
May 19 02:14:04 bzzt, garble, bzzt.
May 19 02:15:42 Defender was the superb up and down motion one needed to stay "mesmerized."
May 19 02:16:06 windows 95 LOL lots of songs about that.
May 19 02:16:16 About defender?
May 19 02:16:22 Oh. Win. 95.
May 19 02:16:34 Win 3.11 W4W
May 19 02:17:35 What is w4w?
May 19 02:17:52 Windows for workgroups
May 19 02:17:55 Oh.
May 19 02:17:57 Ha.
May 19 02:18:02 has an IP stack unlike plain windows
May 19 02:18:29 I guess there is more than meets the eye.
May 19 02:18:34 right so they could actually network...
May 19 02:18:43 Oh. Intranet?
May 19 02:19:03 I remember that too, connect everything all the time!
May 19 02:20:12 He quit?
May 19 02:23:04 This person's source still works and it has been nine years, "Python3 Web Development (Anders 2011)."
May 19 02:23:18 Usually anything I type up breaks in minutes.
May 19 02:23:19 Ha.
May 19 02:25:53 Anders' set up a time thing w/ clicking the screen for .py, .html, and AJAX w/ a db, and some json.
May 19 02:26:13 So much complication for just the time but it has a sign in screen!
May 19 02:26:45 This is what I need for promoting bots, i.e. so people cannot just go to my IP and port.
May 19 02:39:01 Hey!
May 19 02:39:37 Why do the new BBB images have a :8080 port that allows for opening it online?
May 19 02:39:53 I can go to my ip w/ that port and do nothing (for now).
May 19 02:40:02 But, it is steadily running.
May 19 02:40:19 Was that on purpose?
May 19 02:41:12 You can do all sorts of stuff w/ it, e.g. online bot connections, sign in screen, splashscreen, and etc.
May 19 02:56:02 I know this is off subject for now but this is what I came across just now: https://en.wikipedia.org/wiki/Bombardier_beetle.
May 19 02:56:21 "My first wikipedia post!"
May 19 02:56:44 They fly!
May 19 02:57:04 Now, if my BBB could only spy on bugs for me!
May 19 02:58:26 Does anyone want to see a bug spy of a BBB, a webcam, and some nice lights?
May 19 02:58:53 Now, you can be outdoors w/out being infected by the insects!
**** ENDING LOGGING AT Tue May 19 02:59:57 2020