**** BEGIN LOGGING AT Tue Oct 08 02:59:57 2019
Oct 08 07:25:41 TheKit: Thanks. I'll try that later. I think I should probably post on the forum describing the things I'm stuck on. You seem to be the only experienced person active in this channel at the moment ;)
Oct 08 07:26:38 if you use Telegram, there is also a somewhat more active group there at https://t.me/GeminiPDA
Oct 08 07:55:07 Hm, I just caused a spontaneous reboot by trying to start jackd with the pulseaudio-jack module. (One goal of my whole exercise is to learn the supercollider audio synthesis language while on the go, without using a clunky laptop. But having a pocket-sized laptop is a nice side-effect.)
Oct 08 07:58:47 Hm, never tried or considered telegram before.
Oct 08 09:45:43 Ok... I figured out that I can get out of chromium fullscreen mode by pressing Fn-Shift-0 (F10 on a normal keyboard) to open the context menu in chromium. And if I then manage to get my hands on a shell (mouse not working, can't switch windows, so I have to close windows until one pops up that has a shell prompt) I can restart qtile with ~/.virtualenvs/qtile/bin/qtile-cmd -o cmd -f restart and everything
Oct 08 09:45:44 will be ok. (That also works if I can ssh in.)
Oct 08 09:46:09 I guess the qtile guys might have an idea how to debug this on their side. It's definitely not an issue in lxqt.
Oct 08 09:46:42 Starting scsynth or jackd causing a spontaneous reboot is more troublesome :-\
Oct 08 09:47:19 Could there be a watchdog doing it?
Oct 08 09:48:14 Would this appear in a log somewhere, which I can inspect after the reboot? Or tail while ssh'd in?
Oct 08 09:48:58 jackd/scsynth are *trying* to get realtime priority, but say they fail because of permissions... before the reboot happens.
Oct 08 09:49:04 (sclang as well)
Oct 08 09:49:39 (If rtpriority worked, it could cause a lockup, which then could cause the watchdog to bite, iirc.)
Oct 08 09:49:39 Perhaps :). Is netconsole available?
Oct 08 09:49:54 Uhm, what is netconsole?
Oct 08 09:50:50 A kernel module for sending kernel log messages over the network. Over usb in this case I imagine. Might work over wifi as well..
Oct 08 09:51:46 hm, lsmod doesn't work... how do I check?
Oct 08 09:52:26 ah, /proc/config.gz says netconsole is off
Oct 08 09:52:50 (Sorry for asking too fast. I need to give myself more time to think...)
Oct 08 09:52:58 Ok, so no then :/
Oct 08 09:53:49 I guess you checked journal and /var/log/*?
Oct 08 09:53:52 Would that be enabled in a kernel I could get somewhere? Or do I need to compile one?
Oct 08 09:54:40 Self-compiling is probably the way to go
Oct 08 09:54:42 Well... journal is swamped... I did check and made a few attempts to grep, but I'm not sure what I'm looking for.
Oct 08 09:54:50 Maybe I should disable logd.
Oct 08 09:54:54 (Or maybe not.)
Oct 08 09:55:45 Use journalctl -k --since etc?
Oct 08 09:55:56 messages has nothing
Oct 08 09:57:15 I'm afraid journalctl -k starts *after* the reboot, so I could add -f and make the next attempt.
Oct 08 09:59:15 So no persistent logs?
Oct 08 10:01:39 Just did that... nothing visible in journalctl -k -f ...
Oct 08 10:02:53 Couldn't find any clue so far... I didn't really look very closely at /var/log/syslog and other logs so far.
Oct 08 10:04:33 It looks like the system locks up pretty soon after jackd starts (though not immediately) and then reboots, which is quite likely to be a watchdog kicking in. I remember there are lots of posts about this regarding jackd (as it's a common effect of misconfiguration), so I'll check the jackd / rtprio docs next.
Oct 08 10:05:58 Unfortunately the misconfiguration in my case might be that I'm trying to use pulseaudio with jackd but I think I don't have an alternative here. I should probably try more thoroughly to prevent jackd/scsynth from gaining realtime permissions. Probably I messed up there.
Oct 08 10:11:34 Could it be that jack tries to kill pulseaudio, triggering the watchdog?
Oct 08 10:28:43 Ah... I just found out I can reliably cause the reboot with this command: pactl load-module module-jack-source channels=2 connect=0
Oct 08 10:29:16 (from https://askubuntu.com/questions/1164862/can-supercollider-work-without-jack-with-just-pulseaudio-in-ubuntu-or-can-jac/1170081#1170081 )
Oct 08 10:33:27 Hm... convincing supercollider to use pulseaudio directly would make things a lot easier, as I don't care about the things that jack provides in this case (it's just for learning and experimenting, I don't care about latency).
Oct 08 10:39:46 pulseaudio is the only way I *can* get sound under gemian, right? Since it does some black magic with the android side of things, or somesuch.
Oct 08 10:40:21 (And yes, play soundfile.wav works fine, if I don't fiddle with jack or pulseaudio modules)
Oct 08 10:59:03 Ok, I found another way that leads to a spontaneous reboot: https://github.com/brummer10/pajackconnect ... Looks like I may need to post my findings on the forum.
Oct 08 11:00:47 I only have a very fuzzy idea of how gemian is interfacing with the hardware. I think I've used both approaches successfully on a "normal" linux system before.
Oct 08 11:06:50 Other approaches to get this working would be termux (which also needs to use pulseaudio for interfacing with the android side of things) or building supercollider packages for sailfish (which means building a lot of packages). And I have no idea how sound works on sailfish at all ;)
Oct 08 11:07:20 what distro are you running?
Oct 08 11:07:41 I'm running debian TP3 on the gemini pda.
Oct 08 11:08:45 does `/proc/cmdline` refer to a watchdog? does `find /etc|grep -i watchdog` reveal anything? `dpkg -l -a | grep -i watchdog` ?
Oct 08 11:09:21 because if I were to configure a watchdog for a mobile phone, I would include the pulseaudio process in its watch list :-).
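The three watchdog checks suggested above can be bundled into one small helper. This is only a sketch: the function name `watchdog_hints` and its two optional arguments are hypothetical, added so the checks can be pointed at other paths; the commands themselves are the ones quoted in the log.

```shell
#!/bin/sh
# Sketch: look for signs that a watchdog is configured.
# watchdog_hints and its arguments are hypothetical names; the defaults
# are the real locations mentioned in the chat.
watchdog_hints() {
    cmdline_file="${1:-/proc/cmdline}"
    etc_dir="${2:-/etc}"

    # 1. Does the kernel command line mention a watchdog?
    if grep -qi watchdog "$cmdline_file" 2>/dev/null; then
        echo "cmdline: watchdog mentioned"
    else
        echo "cmdline: no watchdog"
    fi

    # 2. Any file or directory under /etc with "watchdog" in its path?
    matches=$(find "$etc_dir" 2>/dev/null | grep -i watchdog || true)
    if [ -n "$matches" ]; then
        echo "etc: $matches"
    else
        echo "etc: nothing found"
    fi

    # 3. Any installed package mentioning a watchdog? (Debian-style systems only)
    if command -v dpkg >/dev/null 2>&1; then
        dpkg -l -a 2>/dev/null | grep -i watchdog || echo "dpkg: no matching packages"
    fi
}
```

Calling `watchdog_hints` with no arguments inspects the running system. A package hit only shows that the word appears in a package description, not that a hardware watchdog is armed.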
Oct 08 11:09:39 the problem could be that loading that module kills pulseaudio (due to a bug)
Oct 08 11:09:50 you could try out this theory by killing pulseaudio manually and see what happens
Oct 08 11:12:20 I don't see the word "watchdog" in /proc/cmdline.
Oct 08 11:13:15 find /etc|grep -i watchdog comes up empty
Oct 08 11:13:38 I guess this busts that theory then.. :-)
Oct 08 11:13:47 # dpkg -l -a | grep -i watchdog
Oct 08 11:13:49 ii rtkit 0.11-4+deb9u1 arm64 Realtime Policy and Watchdog Daemon
Oct 08 11:14:06 and if I kill pulseaudio, it's restarted immediately
Oct 08 11:15:07 so all in all this seems strange
Oct 08 11:16:31 When I put autospawn = no into ~/.config/pulse/client.conf I can kill pulseaudio and nothing bad happens.
Oct 08 11:18:26 and when I *then* start the scsynth server from the sclang prompt, the gemini reboots.
Oct 08 11:19:01 maybe you could try enabling persistent logging in journald
Oct 08 11:19:18 also pay attention to the `-b` switch of `journalctl`
Oct 08 11:19:42 it does sound a lot like the kernel panics and therefore reboots
Oct 08 11:20:18 btw, what are the settings for `grep . /proc/sys/kernel/panic*`?
Oct 08 11:20:22 Hm, this is weird, I see: "JACK server starting in realtime mode with priority 10" and it doesn't complain about not being able to get rtprio... maybe there's some configuration I overlooked, or it's suid root or something.
Oct 08 11:21:32 Ah... it needs suid because it opens a raw socket...
Oct 08 11:22:34 Will check when the reboot/fsck is finished.
Oct 08 11:22:43 :D
Oct 08 11:22:54 fsck is a good hint that it's not a very controlled reboot indeed
Oct 08 11:23:26 (and enable persistent logging... but sometimes those lockups happen so fast that nothing can be persisted to disk, which is why serial console is really recommended for cases like this)
Oct 08 11:23:51 I guess serial console is not really feasible in this case
Oct 08 11:24:00 it might even be that netconsole wouldn't work
Oct 08 11:24:21 IIRC there is also some mechanism to store kernel messages or kernel dump in RAM so that the next boot will not overwrite it
Oct 08 11:24:38 but this needs preallocating space for it and I'm 99% sure it's not enabled in your kernel
Oct 08 11:26:04 debian does support it with the kdump package, though.. on x86 at least.
Oct 08 11:31:16 # grep . /proc/sys/kernel/panic*
Oct 08 11:31:18 /proc/sys/kernel/panic:1
Oct 08 11:31:20 /proc/sys/kernel/panic_on_oops:1
Oct 08 11:38:03 There's a kdump-tools package available, but the kernel doesn't have the correct options set.
Oct 08 11:38:33 you could try setting those to 0 and then doing whatever you did to reboot it
Oct 08 11:38:46 ..I guess you should be prepared to remove the battery.. ;-)
Oct 08 11:39:21 I wonder if the power button is actually wired fail-safely to the hardware
Oct 08 11:43:53 Uhm, I'm not prepared to remove the battery right now. I opened the case and I'm afraid of all these tiny cables and stickers and glue attached to it, as I have two left hands... :-}
Oct 08 11:44:51 (I managed to break some of the keyboard mat by carefully (!) following the instructions how to replace keys. Really not a hardware guy.)
Oct 08 11:44:55 btw, you said it reboots immediately? because panic:1 should mean it reboots one second after a panic. though I guess it could also mean reboot immediately if 0 means don't reboot 🤔
Oct 08 11:45:05 No, it takes 1 or 2 seconds.
Oct 08 11:45:19 ..if it's a panic-induced boot that is
Oct 08 11:45:27 regardless I think it should be safe to write e.g. 10 to that file and see what happens
Oct 08 11:45:30 ok, so then it is probably a panic boot
Oct 08 11:45:54 but I think e.g. `journalctl -k -f` should have logged the OOPS during that one second already
Oct 08 11:45:55 Ok, that's a cool tip, maybe that means I'll see more in those persistent logs.
Oct 08 11:46:37 you did use the `-k` flag when `journalctl -f`-ing, right? but maybe even better is to do `dmesg -w`
Oct 08 11:48:28 I see call traces.
Oct 08 12:02:51 Hm, there was no oops anywhere, so I put a 10 into both files and tried again and the resulting reboot was *much* faster than 10 seconds. Like 1 second at the most.
Oct 08 12:05:25 And dmesg -w didn't show anything either.
Oct 08 12:05:52 well, that's interesting
Oct 08 12:06:09 not sure if there's any other way to debug this than kdump
Oct 08 12:08:24 Thanks a lot for your help and input in any case. I learned (and re-learned) a lot. I think I'll take my findings to the OESF forum at some point.
Oct 08 12:08:39 good luck :)
Oct 08 13:39:17 TauPan, I think accessing the ALSA device directly causes a kernel panic due to a buggy kernel driver
Oct 08 13:40:21 basically PulseAudio is using https://github.com/mer-hybris/pulseaudio-modules-droid on Sailfish/Gemian to output audio through Android HAL. Android HAL in turn talks to ALSA in kernel, but does it in a particular way that doesn't crash the device
Oct 08 13:40:41 to see reboot reason, you can use cat /proc/last_kmsg
Oct 08 14:10:18 TheKit: Before I go down a rabbit hole, is https://github.com/NotKit/wlroots/tree/hwcomposer actually working?
Oct 08 14:10:46 it was, but I didn't try with latest wlroots library
Oct 08 14:11:02 also I didn't implement screen blanking, so it's just for testing
Oct 08 14:11:09 TheKit: Ok great :)
Oct 08 14:11:29 I'm porting another distro to the Gemini and just need some simple wayland compositor to test with
Oct 08 14:11:33 So that's perfect
Oct 08 14:12:02 which distro?
Oct 08 14:12:05 NixOS
Oct 08 14:13:34 test if EGL_PLATFORM=hwcomposer test_hwcomposer is working before anything else
Oct 08 14:14:40 TheKit: Yes of course :)
Oct 08 14:15:09 TheKit: Btw thanks for all your hard work! I keep seeing your name popping up all over the place
Oct 08 14:27:34 it would be pretty interesting to get a useful Wayland compositor on Gemini, should improve performance a lot
**** ENDING LOGGING AT Wed Oct 09 03:01:40 2019
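For the reboot debugging discussed earlier in this log, the two read-only checks (the panic sysctls and `/proc/last_kmsg`) could be wrapped roughly like this. It is a sketch under assumptions: the function names and the optional path arguments are hypothetical, and `/proc/last_kmsg` only exists on kernels that provide it.

```shell
#!/bin/sh
# Sketch: inspect panic settings and keep the previous boot's kernel log.
# show_panic_settings and save_last_kmsg are hypothetical helper names.

# Print the panic-related sysctls, like `grep . /proc/sys/kernel/panic*`.
show_panic_settings() {
    sysctl_dir="${1:-/proc/sys/kernel}"
    grep . "$sysctl_dir"/panic* 2>/dev/null || echo "no panic settings readable"
}

# After a reboot, copy the previous boot's kernel messages to a regular
# file, if the kernel preserved them.
save_last_kmsg() {
    src="${1:-/proc/last_kmsg}"
    dest_dir="${2:-.}"
    if [ -r "$src" ]; then
        cat "$src" > "$dest_dir/last_kmsg.txt"
        echo "saved $dest_dir/last_kmsg.txt"
    else
        echo "$src not available"
    fi
}
```

Run `show_panic_settings` before reproducing the crash and `save_last_kmsg` right after the device comes back up. For persistent journald logs, creating `/var/log/journal` (with the default `Storage=auto`) and restarting systemd-journald is usually enough; `journalctl -k -b -1` then shows the kernel messages of the previous boot, if any reached the disk.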