**** BEGIN LOGGING AT Fri May 23 03:00:01 2014
May 23 03:23:15 panto : ping!
May 23 07:04:53 karki_, probe means I've matched a driver with a device, do your thing
May 23 07:10:08 panto: How much latency should I target if 61us is way too fast?
May 23 07:12:05 panto : So it's setting up all the IRQ's claiming control over that device etc.
May 23 07:13:28 Abhishek_, target latency in the 10s of ms to be safe on the linux side
May 23 07:13:40 but
May 23 07:13:51 why do you care about latency to user-space?
May 23 07:14:08 this is a logic analyzer right?
May 23 07:14:14 yep
May 23 07:14:23 as long as you have a big enough buffer in DRAM
May 23 07:14:46 it's ok to have a few hundreds of ms latency
May 23 07:15:05 for instance even in the case where you have a GUI updated at realtime
May 23 07:15:28 the worst case refresh rate is 60Hz
May 23 07:15:41 16ms
May 23 07:16:04 the direction you were talking on your email sounds right
May 23 07:16:11 use PRU1 as a frontend
May 23 07:16:25 stuff data using XFER
May 23 07:16:36 to the PRU0, and have the PRU0 update DRAM
May 23 07:17:16 the whole point is to manage to 'cover' DRAM's latency
May 23 07:17:40 indeed, and it's done using the two PRUs in tandem
May 23 07:17:40 karki_, not only that, everything that needs to make the device operational
May 23 07:17:46 yeah
May 23 07:17:49 that's the way to go
May 23 07:18:26 using rproc and vrings it (might be) possible to directly transfer to the user-space buffers using scatter gather
May 23 07:18:53 note, there's a complication with DRAM
May 23 07:18:57 panto: I tried giving uio_pruss extram_pool_sz above 8 MB but it results in a kernel dump
May 23 07:19:12 :/
May 23 07:19:20 it's called dram refresh latency
May 23 07:19:37 which could be a few us
May 23 07:20:20 that's why steinkuehler suggested building a ring buffer in the shared RAM
May 23 07:20:53 so if my timing is bad, I can miss a *lot* of samples, isn't it?
May 23 07:21:26 panto : I did read up on this.
I understand the 'whats', but not the 'whys'. Why can't I write a kernel module which when loaded takes control of the PRU and sets it up (have all the setup logic in the init of the module, instead of having probe? )
May 23 07:22:01 karki_, cause that's the way things are done in the kernel
May 23 07:22:06 this ain't an arduino
May 23 07:22:19 form is liberating
May 23 07:22:45 Abhishek_, in theory you can use DMA to transfer your SRAM buffers to DRAM
May 23 07:22:48 panto: another weird idea: Put the buffer in shared RAM and DMA to DRAM?
May 23 07:22:55 this will cover the latency
May 23 07:23:00 :)
May 23 07:23:05 I think you understand
May 23 07:23:10 but
May 23 07:23:21 setting up the DMA controller and what not is not free
May 23 07:23:40 so before you go that way measure if straight to DRAM works
May 23 07:24:24 hmm, yes, I have cycle counters working on both PRUs, so I should be able to spot them
May 23 07:25:30 panto : Fine, I'll read up on platform drivers today. But just a question, what would happen if I put the probe logic in the init of the module? (i.e.) treat it as a normal driver?
May 23 07:25:43 that's not the way things work :)
May 23 07:25:52 there is a reason this is done this way
May 23 07:25:59 karki_: You'll get flamed by Linus.
May 23 07:26:16 the bone is a DT platform
May 23 07:26:31 if your patch makes it to the mainline.
May 23 07:26:32 doing it your way will blow up on a standard kernel
May 23 07:26:56 * karki_ has blown up things before :(
May 23 07:27:45 panto : so on a non DT platform, is the scenario of platform drivers different?
May 23 07:28:05 * karki_ will spend a lot of time on platform drivers now.
May 23 07:28:30 no, it's the same
May 23 07:28:39 but it's even worse on DT platforms
May 23 07:29:41 oh, and run length compress
May 23 07:29:53 this should dramatically decrease bandwidth requirements
May 23 07:30:29 panto: I sent another mail in reply to the mail last night. Are you able to see?
May 23 07:30:34 logic signals do not change level on each cycle
May 23 07:30:46 yeah
May 23 07:47:45 panto: did you get what I tried to say in the last paragraph about RLE?
May 23 07:49:15 don't do the RLE on PRU1
May 23 07:49:19 do it on PRU0
May 23 07:49:42 dedicate PRU1 to be stupid fast and reliable
May 23 07:50:14 RLE processing will cause jitter on PRU0
May 23 07:50:17 *PRU1
May 23 07:50:26 k
May 23 07:50:55 Do you think it can be done in 10 cycles for 16 samples?
May 23 07:51:17 that would be the headroom I have
May 23 07:55:09 no idea
May 23 07:57:06 I was thinking of some way to constructively utilize the NOPs in between two successive sampling instructions, see: https://github.com/abhishek-kakkar/BeagleLogic/blob/prutest/PRULATest/src/pru1fw.asm#L55
May 23 12:36:10 panto: I ran an experiment, it seems that I have to buffer at least 10ms of data to avoid missing interrupts
May 23 12:36:47 *about 10ms, it's not an exact figure
May 23 12:45:25 sounds reasonable
May 23 12:45:41 and you're using uio correct?
May 23 12:45:46 yep
May 23 12:45:53 that's high overhead
May 23 12:45:57 some trouble with residual interrupts from last execution
May 23 12:46:06 a real kernel driver has much better latency
May 23 12:46:26 the residual interrupts are from the non-acking
May 23 12:46:41 is there a way to clear them?
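The run-length compression suggested in the log works because logic signals usually hold their level across many consecutive samples. Below is a minimal Python sketch of the idea only. It assumes each sample is a plain integer (for example one 16-bit word per sample); `rle_encode` and `rle_decode` are hypothetical names for illustration, and this is not the BeagleLogic firmware, which would have to do the equivalent in PRU assembly within its cycle budget.

```python
def rle_encode(samples):
    """Collapse consecutive identical samples into (value, run_length) pairs."""
    runs = []
    for sample in samples:
        if runs and runs[-1][0] == sample:
            # Same level as the previous sample: extend the current run.
            runs[-1][1] += 1
        else:
            # Level changed (or first sample): start a new run.
            runs.append([sample, 1])
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sample list."""
    samples = []
    for value, count in runs:
        samples.extend([value] * count)
    return samples
```

The saving depends entirely on how often the inputs toggle: a signal that changes on every sample produces one pair per sample and gets larger, not smaller.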
May 23 12:47:23 well, you have to implement the scheme we were talking about
May 23 12:47:43 actually I now maintain two independent register counters
May 23 12:47:58 interrupts working as indications only, not as transfer requests
May 23 12:48:46 I mean, whenever I do a LDI R31.w0, IRQ, I increment a counter and store it in the PRU's internal RAM
May 23 12:49:26 that can work as long as the counter is only read from the ARM side
May 23 12:49:29 the application, when it receives the interrupt, reads the value from the SRAM
May 23 12:50:06 panto: Let me paste a sample output with just a 6 KB buffer, the one that was sending interrupts too fast
May 23 12:50:12 k
May 23 12:53:39 panto: https://gist.github.com/abhishek-kakkar/03236cc76ba2574e77d0
May 23 12:57:53 ok, what am I looking at?
May 23 12:58:46 this is the sample output from the application, which polls the PRU 20 times for PRU0 and PRU1 interrupts in sequence
May 23 12:59:32 every time the PRU triggered the interrupt, it incremented an internal counter, and my application in userspace sampled that RAM value
May 23 13:00:54 I added some things in the gist too, to help you interpret the output
May 23 13:02:08 well, you get about 10+ missed interrupts
May 23 13:02:32 which is no problem if you use the interrupt as a trigger to read the counter
May 23 13:03:36 yes, so I know if there's a buffer underflow if the counter on my userspace and the counter on the PRU interrupt aren't the same
May 23 13:05:01 don't do that
May 23 13:05:13 the counter in userspace should only have a copy of the last PRU counter
May 23 13:05:31 don't use an interrupt count
May 23 13:05:44 okay.
May 23 13:06:13 you can tell if you missed something by the delta being more than 1
May 23 13:06:43 indeed
May 23 13:08:10 Currently trying with 4 MB buffers (20 ms), and I'm able to catch all interrupts independently.
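The scheme agreed on above treats the interrupt purely as a hint and the PRU-side counter as the source of truth: userspace keeps a copy of the last counter value it read and looks at the difference. A rough Python sketch of that bookkeeping follows. It assumes a free-running 32-bit counter that wraps around (the width is an assumption, not stated in the log), and `counter_delta` and `describe` are hypothetical names; reading the value from PRU RAM is replaced here by a plain argument.

```python
COUNTER_BITS = 32  # assumption: the PRU counter is a wrapping 32-bit value


def counter_delta(last_seen, current, bits=COUNTER_BITS):
    """Number of PRU counter increments since the value userspace last read.

    The modulo keeps the result correct across counter wrap-around.
    """
    return (current - last_seen) % (1 << bits)


def describe(last_seen, current):
    """Interpret the delta the way the log suggests."""
    delta = counter_delta(last_seen, current)
    if delta == 0:
        # Interrupt without a counter change, e.g. a residual one.
        return "nothing new"
    if delta == 1:
        return "in sync"
    # More than one increment happened between two reads.
    return f"missed {delta - 1} notification(s)"
```

Because userspace never counts interrupts itself, a dropped or residual interrupt cannot push the two sides out of step; only the value stored by the PRU matters.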
May 23 13:08:23 sounds OK
May 23 13:09:02 but this was running in sort of *emulator* mode, there were no actual memory writes
May 23 13:10:18 trying with actual memory writes now
May 23 15:39:23 hi praveendath92, how are you now?
May 23 15:49:58 * karki__ wonders how he got so many " __ "
May 23 15:52:46 panto : what exactly is an attribute file used for? where does it appear in the fs?
May 23 15:52:58 it's a sysfs attribute file
May 23 15:54:20 Okay :) where does it appear in the sysfs? as in which sub directory? (sorry! I'm new to this)
May 23 15:55:47 underneath the device's sysfs directory
May 23 15:55:49 read about sysfs
May 23 15:56:15 On the bright side I was able to understand the platform driver model thanks to free electrons :)
May 23 15:56:42 I'll read about sysfs tomorrow! (after my last exam :D )
May 23 15:57:09 free electrons have good learning resources
May 23 16:01:27 panto : "module_platform_driver(pruproc_driver)" takes care of __init and __exit and binding to the platform core, right?
May 23 16:01:35 yes
May 23 16:01:56 * karki__ finally thinks he is up to something ;)
May 23 16:03:00 I remember searching for __init and __exit a couple of days back and was thoroughly confused!
May 23 17:01:19 panto : "platform_get_resource" is populated as per the DTO?
May 23 17:01:36 sorry I meant the structure
May 23 17:02:32 the resource structure that is utilized in probe...
May 23 17:46:09 panto: is there a DMA-optimized memset exposed to userspace in Linux?
May 23 17:47:44 what's a memset?
May 23 17:47:56 function?
May 23 17:48:09 I mean the memset function?
May 23 17:48:17 yes
May 23 17:49:32 apparently memsetting an area of 8 MB takes ~200ms
May 23 17:52:35 * karki__ is too sleepy for his own good! suddenly thought of memset as a ocp :/
May 23 17:53:22 * Abhishek_ wonders if the PRU can equal the rhetoric task
May 23 18:25:44 panto: For short bursts of 32 bytes, I seem to get the same latency as PRU->RAM transfers.
Crunching a number for maybe 4 MB of data
May 23 19:00:52 panto: Applying the same concept of counters, I created a counting register in PRU0 that is XOUTed with the 8 sample registers. Should I use the received value of this register to add as an offset in my SBBO to DDR, or keep a separate counter in a register for the purpose?
May 23 20:12:47 * Abhishek_ reports success with collecting 4 MB of samples from the PRU @100 MHz :) [ds2 jkridner mranostay panto]
May 23 20:13:11 *4 Msamples (8 MB)
May 24 00:08:08 Abhishek_: cool
May 24 00:08:16 Abhishek_: any underflow detection?
**** ENDING LOGGING AT Sat May 24 02:59:58 2014
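The buffer sizes and latencies quoted in the log can be related with simple arithmetic. The sketch below takes two figures from the log as its inputs: sampling at 100 MHz, and 4 Msamples occupying 8 MB (so 2 bytes per sample). The function names are hypothetical, and "MB" is read as MiB here, which is an assumption.

```python
SAMPLE_RATE_HZ = 100_000_000  # from the log: samples collected at 100 MHz
BYTES_PER_SAMPLE = 2          # from the log: 4 Msamples occupy 8 MB


def buffer_duration_us(buffer_bytes, rate_hz=SAMPLE_RATE_HZ,
                       bytes_per_sample=BYTES_PER_SAMPLE):
    """Microseconds of capture that fit in a buffer of the given size."""
    return buffer_bytes * 1_000_000 / (rate_hz * bytes_per_sample)


def min_buffer_bytes(latency_us, rate_hz=SAMPLE_RATE_HZ,
                     bytes_per_sample=BYTES_PER_SAMPLE):
    """Smallest buffer (in bytes) that covers latency_us of capture.

    Integer arithmetic with ceiling division avoids float rounding.
    """
    total = rate_hz * bytes_per_sample * latency_us
    return -(-total // 1_000_000)


if __name__ == "__main__":
    four_mib = 4 * 1024 * 1024
    print(f"4 MiB holds {buffer_duration_us(four_mib) / 1000:.2f} ms")
    print(f"10 ms needs {min_buffer_bytes(10_000)} bytes")
```

Under these assumptions a 4 MiB buffer holds about 21 ms of samples, in line with the roughly 20 ms quoted in the log, and covering the roughly 10 ms figure from the experiment needs about 2 MB per buffer.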