**** BEGIN LOGGING AT Sat Jul 16 23:59:56 2005
Jul 17 02:44:52 regarding "reboot problem" in openslug. Seems that if you umount /initrd early it will reboot like it's supposed to. Why that would be completely escapes me.
Jul 17 02:45:15 It Completely escapes me too. :-)
Jul 17 02:46:52 so that's why I never had problems rebooting
Jul 17 02:49:57 Anyway, that's my observation of the day.
Jul 17 02:50:52 my openslug installation is kind of a bastard child right now; so it would be nice if someone could confirm or deny that.
Jul 17 03:26:42 <[cc]smart> dyoung: until now what i have thought is that it would be the case if i'm not on internal flash as root. but that would well be covered by the initrd case.
Jul 17 03:26:46 <[cc]smart> so i'll try that
Jul 17 03:30:08 another something to chew on...
Jul 17 03:30:17 why is this in our defconfig?
Jul 17 03:30:25 # CONFIG_USB_OHCI_BIG_ENDIAN is not set
Jul 17 03:30:25 CONFIG_USB_OHCI_LITTLE_ENDIAN=y
Jul 17 03:31:15 <[cc]smart> dunno... but maybe this could solve the hassle with usb disk/irq issues ?
Jul 17 03:31:48 <[cc]smart> iirc the most obvious problems were usb20 ?
Jul 17 03:32:16 <[cc]smart> unmounted /initrd and system rebooted fine
Jul 17 03:32:20 <[cc]smart> there you have it
Jul 17 03:34:04 ohci is the usb1.1 stuff though, and is on irq27 and irq28.
Jul 17 03:34:36 <[cc]smart> too bad
Jul 17 03:34:54 <[cc]smart> rechecked with /initrd left mounted, and it fails to reboot
Jul 17 03:35:06 <[cc]smart> so i'd count the observation correct
Jul 17 03:39:04 why that would be completely escapes me.
Jul 17 03:39:35 anyways, if a few others can verify the behaviour, I vote for making that bit part of the /boot/foo scripts.
Jul 17 05:12:45 bwalle * unslung/ (Makefile make/fIcy.mk): Added fIcy package for testing.
Jul 17 07:51:18 ingeba * unslung/make/gnutls.mk: Found the old 1.2.4 version in the attic and readjusted SITE info
Jul 17 08:38:49 So rmdir /initrd might give reliable reboots...
Jul 17 08:39:19 But what does the reboot look like? All LEDs lit, 4s delay then orange?
Jul 17 08:39:36 Or is it all LEDs lit, fractional second delay, then orange?
Jul 17 08:43:18 ingeba * unslung/make/imap.mk: Upgraded to 2004e as 2004d was no longer available - moving it to ready_for_testing
Jul 17 08:49:00 ingeba * unslung/Makefile: Moved imap to ready_for_testing due to version upgrade
Jul 17 09:04:58 The reboot is a hard (watchdog) reboot, the soft reboot still hangs.
Jul 17 09:06:01 It seems to work now doing the umount at K06. The problem is that this is a turbo slug and I don't have a non-turbo - when we tried this before it wasn't reliable but maybe it wasn't getting enough time.
Jul 17 09:06:31 Need to know how long it requires.
Jul 17 09:08:01 bzhou * unslung/Makefile: ocaml put in the wrong place, should be under NATIVE_PACKAGES_READY_FOR_TESTING
Jul 17 09:10:47 <[cc]smart> jbowler: you might add a loop that waits until it's umounted
Jul 17 09:11:10 umount waits until it is unmounted
Jul 17 09:11:35 <[cc]smart> er, actually yes, you're right. but now i wonder what you meant :)
Jul 17 09:12:14 I need to know how much time is required between the umount and the reboot.
Jul 17 09:12:24 <[cc]smart> there's a delay ?
Jul 17 09:12:55 There needs to be a delay - that's what dyoung is suggesting anyway.
Jul 17 09:13:15 <[cc]smart> don't get why a delay should be needed...
Jul 17 09:13:37 <[cc]smart> umount syncs the FS
Jul 17 09:13:38 Well, no, I don't either - finding out why would be useful...
Jul 17 09:13:52 sync doesn't help
Jul 17 09:14:04 <[cc]smart> sure, since umount does it already
Jul 17 09:14:21 <[cc]smart> this would have been the only thing i could think of that would make a delay possible
Jul 17 09:14:36 <[cc]smart> s/possible/sensible/
Jul 17 09:16:37 Well, it certainly seems to work having a delay, but maybe something else important happens after the umount in rc6.d/*
Jul 17 09:17:24 I can't see what though - nothing touches the flash in there
Jul 17 09:18:18 I tried 10s after the existing umount, it still hangs.
Jul 17 09:21:02 hm, I umount /initrd on bootup
Jul 17 09:21:31 Does shutdown -r work consistently for you (i.e. never hang, apart from the 4s where all the LEDs are alight).
Jul 17 09:21:56 I never looked at the leds, but it does reboot yes
Jul 17 09:22:17 What is your rootfs (disk, nfs, usb stick?)
Jul 17 09:22:56 I have two, one with usb disk, and one with usb stick.. the only difference being that the stick is mounted noatime
Jul 17 09:23:50 The hang has been seen to happen with everything - including NFS - but seems not to always happen with NFS.
Jul 17 09:24:02 Adding a 60s delay after the umount of /initrd doesn't help either...
Jul 17 09:24:06 maybe I've just been lucky :)
Jul 17 09:25:25 Yeah, that's possible, we've known for a long time that umounting /initrd would at least reduce the problem, but I don't think anyone ever took the time to see whether it could be eliminated.
Jul 17 09:27:53 Hum. One interesting thing of course is that the /initrd was mounted on / by the kernel.
Jul 17 09:31:03 ingeba * unslung/make/man-pages.mk: Updated man pages to version 2.05 as version 2.01 was no longer available and 2.06 did not have correct access rights
Jul 17 09:31:05 <[cc]smart> jbowler: the test i did was definitely waiting less than 60s, and it worked with /initrd unmounted
Jul 17 09:32:57 Yes, I've eliminated a simple delay as a possibility (not that I liked the idea of doing that very much, shutdown is slow enough already).
Jul 17 09:35:42 <[cc]smart> what always wondered me is why "mount" and "cat /etc/mtab" give different results.
Jul 17 09:35:49 ... and I've eliminated the possibility that it was because the /initrd was from the kernel / mount - I still get the hang if I rmdir /initrd, reboot, mkdir /initrd and mount it.
Jul 17 09:36:30 [cc]smart: busybox mount is effectively cat /proc/mounts, /etc/mtab is written by busybox mount.
Jul 17 09:36:48 Some distros do ln -s /proc/mounts /etc/mtab
Jul 17 09:37:27 <[cc]smart> just like openslug
Jul 17 09:37:45 <[cc]smart> which could be the reason
Jul 17 09:37:55 <[cc]smart> if busybox mount writes into /proc/mounts
Jul 17 09:37:58 <[cc]smart> what happens ?
Jul 17 09:39:33 Oh, ok - on openslug busybox "mount" actually attempts to fix up the contents of /etc/mtab (or whatever it reads) to have the correct information using /etc/fstab.
Jul 17 09:39:54 <[cc]smart> which still makes busybox write into /proc/mounts
Jul 17 09:40:03 <[cc]smart> since there's the link
Jul 17 09:40:05 /proc/pid/mounts is read only.
Jul 17 09:40:05 <[cc]smart> not a file
Jul 17 09:40:34 <[cc]smart> correct
Jul 17 09:40:47 <[cc]smart> except that it's /proc/self/mounts
Jul 17 09:41:06 'self' is just a link to $$
Jul 17 09:41:13 <[cc]smart> ah
Jul 17 09:41:14 <[cc]smart> k
Jul 17 09:41:31 <[cc]smart> i'll still try now replacing mtab with a file... let's see what happens...
Jul 17 09:41:39 There seems to be no global mount table in Linux (which makes sense to me).
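
A quick way to tell which of the two /etc/mtab setups discussed above a system uses is to check whether the path is a symlink or a regular file. This is only a sketch: `mtab_kind` is a hypothetical helper name, and it assumes a `readlink` command is available (coreutils and busybox both ship one).

```shell
# mtab_kind [PATH]
# Hypothetical helper: reports whether PATH (default /etc/mtab) is a
# symlink, a regular file, or missing. It only inspects the path and
# never writes to it.
mtab_kind() {
    f=${1:-/etc/mtab}
    if [ -L "$f" ]; then
        # A symlink, e.g. the "ln -s /proc/mounts /etc/mtab" arrangement
        printf 'symlink -> %s\n' "$(readlink "$f")"
    elif [ -f "$f" ]; then
        # A traditional mtab that something in userspace has to maintain
        printf 'regular file\n'
    else
        printf 'missing\n'
    fi
}
```

Running `mtab_kind` with no argument inspects /etc/mtab itself.
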
Jul 17 09:42:05 <[cc]smart> doesn't make sense to me though
Jul 17 09:42:09 I think the support for file mtab is in the init scripts, but I'm not sure.
Jul 17 09:42:10 <[cc]smart> what do new processes get ?
Jul 17 09:42:40 Same as everything else - it's inherited with variations (like the signal mask).
Jul 17 09:43:07 <[cc]smart> which means there is a more global entry up to the most global which is THE global
Jul 17 09:43:19 <[cc]smart> if there's really a different one
Jul 17 09:43:23 <[cc]smart> couldn't see it really
Jul 17 09:43:30 At least I'm assuming that's what happens - after a pivot_root there are two different views of the world.
Jul 17 09:43:30 <[cc]smart> how would backup be possible if there isn't
Jul 17 09:43:42 <[cc]smart> don't think so either
Jul 17 09:44:00 <[cc]smart> this being the reason why you must leave all processes
Jul 17 09:44:30 <[cc]smart> so that the base can be SWITCHED
Jul 17 09:44:39 <[cc]smart> otherwise you'd just have an alternative view
Jul 17 09:44:54 <[cc]smart> which would not require closing all processes
Jul 17 09:45:08 <[cc]smart> but i don't know it really
Jul 17 09:45:25 <[cc]smart> it's just what i would expect/is most logical to me
Jul 17 09:45:28 Got it! mount -o remount,ro /initrd works too!
Jul 17 09:46:44 If this is correct, shutdown -r would always work after reflash too.
Jul 17 09:47:34 <[cc]smart> in any case, removing /etc/mtab as a link and replacing it as a file removes the discrepancy in between the two outputs
Jul 17 09:48:07 <[cc]smart> ah ...
Jul 17 09:48:10 <[cc]smart> this strikes me
Jul 17 09:48:15 <[cc]smart> now assume what
Jul 17 09:48:28 <[cc]smart> how about the following:
Jul 17 09:48:40 <[cc]smart> when i do cat /etc/mtab
Jul 17 09:49:08 <[cc]smart> what i get is the situation for the shell i'm running since /proc/self/mounts hits the shell, no ?
Jul 17 09:49:16 <[cc]smart> if i do it via mount
Jul 17 09:49:31 <[cc]smart> i see a different process accessing /proc/self/mounts
Jul 17 09:49:39 <[cc]smart> or maybe not
Jul 17 09:49:47 <[cc]smart> in any case the two see different situations
Jul 17 09:51:10 ? I don't understand "since /proc/self/mounts hits the shell, no ?" - everything in /proc is inside the kernel.
Jul 17 09:51:28 <[cc]smart> yes, but as you said self is $$
Jul 17 09:51:46 <[cc]smart> /proc/self/mounts is a different thing for every process
Jul 17 09:52:14 <[cc]smart> so there must be two different views
Jul 17 09:52:21 <[cc]smart> for whatever reason
Jul 17 09:52:38 <[cc]smart> in any case, replacing /etc/mtab by a file doesn't fix rebooting
Jul 17 09:52:53 Do a diff on /proc/[1-9]*/mounts and see if they are all the same...
Jul 17 09:53:21 I think they might be unless something survives from before the pivot_root, but maybe chroot processes are different?
Jul 17 09:54:24 <[cc]smart> one sec
Jul 17 09:54:30 <[cc]smart> at least in one point you where wrong
Jul 17 09:54:38 <[cc]smart> s/h//
Jul 17 09:54:45 Making /etc/mtab a file results in an empty file after reboot - so there is no support for /etc/mtab (in the traditional form).
Jul 17 09:54:48 <[cc]smart> i replaced /etc/mtab with a file
Jul 17 09:54:56 <[cc]smart> correct
Jul 17 09:55:01 <[cc]smart> it's empty on my side too
Jul 17 09:55:13 <[cc]smart> but mount still gives correct (or wrong then?) output
Jul 17 09:55:17 <[cc]smart> where from ?
Jul 17 09:55:45 <[cc]smart> yes mount gives incorrect output, at least by means of "what i'd expect"
Jul 17 09:55:51 <[cc]smart> /dev/sda1 on /initrd type jffs2 (rw,noatime)
Jul 17 09:55:51 <[cc]smart> /dev/sda1 on / type reiserfs (rw,noatime)
Jul 17 09:55:55 <[cc]smart> which is incorrect
Jul 17 09:56:06 <[cc]smart> since /initrd is mtdblock4 no ?
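
The "diff on /proc/[1-9]*/mounts" suggestion above can be sketched as a small function that groups processes by the checksum of their mount table, so a single output line means every process sees the same mounts. `group_mount_views` is a hypothetical name, the proc root is a parameter only so the function can be pointed at a test directory, and the sketch assumes `cksum`, `sort`, `uniq` and `awk` exist on the target.

```shell
# group_mount_views [PROC_ROOT]
# Hypothetical helper: prints "<process count> <checksum>" for each
# distinct per-process mount table found under PROC_ROOT (default /proc).
group_mount_views() {
    proc_root=${1:-/proc}
    for f in "$proc_root"/[0-9]*/mounts; do
        # Skip processes that exited or whose table is not readable
        [ -r "$f" ] || continue
        cksum < "$f"
    done | sort | uniq -c | awk '{ print $1, $2 }'
}
```

More than one line of output would indicate that some processes hold a different view of the mounts, which is the pivot_root/chroot case raised in the discussion.
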
Jul 17 09:56:08 I don't see any difference - it seems to be ignoring /etc/mtab
Jul 17 09:56:16 <[cc]smart> that's what it does
Jul 17 09:56:30 <[cc]smart> but as it was a link i saw two different contents
Jul 17 09:56:41 <[cc]smart> depending if i'd cat /etc/mtab or do mount
Jul 17 09:56:51 busybox doesn't understand the concept of 'rootfs' I think.
Jul 17 09:57:18 /proc/mounts is correct:
Jul 17 09:57:28 rootfs / rootfs rw 0 0
Jul 17 09:57:31 /dev/root /initrd jffs2 ro,noatime 0 0
Jul 17 09:57:31 /dev/sda3 / reiserfs rw,noatime 0 0
Jul 17 09:57:38 <[cc]smart> so what i assume is the following
Jul 17 09:57:49 <[cc]smart> busybox mount thinks /dev/sda1 is on /initrd
Jul 17 09:58:01 <[cc]smart> and it umounts /dev/sda1 for that, not /initrd
Jul 17 09:58:12 <[cc]smart> which makes / fail
Jul 17 09:58:28 No. It doesn't know what /dev/root is, so it assumes the root device (which is /dev/sda1).
Jul 17 09:58:35 <[cc]smart> nana
Jul 17 09:58:40 <[cc]smart> do "mount"
Jul 17 09:58:50 No. Look at man(2) umount
Jul 17 09:58:52 <[cc]smart> you'll see /dev/sda1 on /initrd type jffs2 (rw,noatime)
Jul 17 09:59:17 <[cc]smart> whose man page is that ?
Jul 17 09:59:25 Linux
Jul 17 09:59:37 (Since we are running linux ;-)
Jul 17 09:59:52 <[cc]smart> i thought we are running busybox
Jul 17 10:00:41 Yes, that too. I gave you a reference to a system call man page, not a command man page.
Jul 17 10:00:43 <[cc]smart> i bet a pair of used pants on umount doing umount -f /dev/sda1 in my case
Jul 17 10:01:07 <[cc]smart> and pulling its own / therefore
Jul 17 10:01:12 <[cc]smart> leaving the kernel survived
Jul 17 10:01:21 <[cc]smart> which blocks the slug from doing hardreset
Jul 17 10:02:10 I would strongly recommend against passing devices to the umount command. You don't know what will happen.
Jul 17 10:02:22 <[cc]smart> i'm not saying I do that
Jul 17 10:02:36 <[cc]smart> i'm saying, this is what i think might be happening
Jul 17 10:02:45 No, it doesn't.
Jul 17 10:02:53 <[cc]smart> then i was in bad luck
Jul 17 10:03:04 <[cc]smart> i would fit the behaviour perfectly though
Jul 17 10:03:11 <[cc]smart> s/i/it/
Jul 17 10:03:16 What behaviour?
Jul 17 10:03:23 <[cc]smart> the hanging reboot
Jul 17 10:03:57 That's a hardware erratum.
Jul 17 10:04:22 Sometimes the hardware reset fails to reset all the system.
Jul 17 10:04:59 <[cc]smart> and you fix the hardware by unmounting /initrd ??????
Jul 17 10:04:59 It's not clear to me whether the flash driver stuff is at fault or whether it merely makes the hang more likely, but the flash chip hardware is definitely involved.
Jul 17 10:05:37 umounting /initrd - or mounting it ro - presumably allows the hardware to settle into a state where the problem is much less likely.
Jul 17 10:06:15 But it's not just a time delay...
Jul 17 10:09:11 Hum... I guess I have another potential fix though - the remount,ro seems to work reliably for me. Problem is I thought I had a fix when I did the umount of /initrd in the first place, and that didn't work...
Jul 17 10:10:06 <[cc]smart> so unmounting /initrd is unreliable ? ok, that changes this. didn't see this here.
Jul 17 10:11:04 /initrd is umounted just before the reboot - it reliably does not work!
Jul 17 10:11:44 When I did that originally I had thought this was the problem (having /dev/mtdblock4 mounted at shutdown), because my tests to umount it then shutdown -r seemed to work...
Jul 17 10:12:25 So my tests were much earlier in the shutdown sequence, and when I moved the umount and removed my debug code it stopped working.
Jul 17 10:12:47 So I figured someone with a serial port had to debug it.
Jul 17 10:13:00 <[cc]smart> have serial
Jul 17 10:13:04 (Since the debug code wrote log messages to a disk)
Jul 17 10:13:50 Yes, but in fact there are no errors from any of the shutdown commands.
Jul 17 10:14:34 It was, I believe, the simple presence of my debugging which introduced enough stuff for it to work reliably.
Jul 17 10:15:35 <[cc]smart> er, now i see sth.
Jul 17 10:15:41 <[cc]smart> what was that called..
Jul 17 10:15:43 <[cc]smart> one sec
Jul 17 10:15:52 <[cc]smart> think i might have sth.
Jul 17 10:16:16 <[cc]smart> if only the slug would now be in slow mode
Jul 17 10:16:16 Then someone (Tiersten?) pointed out that there was a hardware erratum, so I tried to make software reboot work...
Jul 17 10:16:49 <[cc]smart> give a short moment to go back into slug and tell you what i just did/saw
Jul 17 10:17:36 <[cc]smart> do cat /etc/init.d/umountfs
Jul 17 10:17:49 <[cc]smart> i added : echo $dev $mp $type $opts
Jul 17 10:18:07 <[cc]smart> just behind the if read in unmount
Jul 17 10:18:13 <[cc]smart> heh
Jul 17 10:18:16 <[cc]smart> it's cool
Jul 17 10:18:20 <[cc]smart> this will be it
Jul 17 10:18:22 <[cc]smart> i'm sure
Jul 17 10:18:30 <[cc]smart> now what i saw
Jul 17 10:18:36 <[cc]smart> a couple lines of mtab go by
Jul 17 10:18:46 <[cc]smart> then screen was flooded by empty lines
Jul 17 10:18:53 <[cc]smart> the reason prolly:
Jul 17 10:19:05 <[cc]smart> it unmounted its own access to /proc/mounts or similar
Jul 17 10:19:18 <[cc]smart> or sth. along the lines of that
Jul 17 10:19:30 <[cc]smart> in any case, it didn't finish the recursion
Jul 17 10:19:44 It's possible it has a bug, but I hacked a direct 'umount /initrd' into my copy of the file right at the top.
Jul 17 10:20:25 <[cc]smart> what i say is the "if read" succeeds when it shouldn't
Jul 17 10:20:36 <[cc]smart> ah o
Jul 17 10:20:39 <[cc]smart> grr
Jul 17 10:20:44 <[cc]smart> maybe tricked myself here
Jul 17 10:20:51 <[cc]smart> nother try
Jul 17 10:21:15 If the 'read' keeps on succeeding then probably the shell script will run until it gets OOM and dies.
Jul 17 10:21:51 <[cc]smart> yes, that's what i thought was the fault, but the fault at least in this test was my silliness
Jul 17 10:21:55 <[cc]smart> so retry
Jul 17 10:22:25 That function does work fine outside the reboot sequence - that is how I tested it (well, it used to work fine.)
Jul 17 10:29:04 <[cc]smart> i have two testruns here which might be of use, dunno:
Jul 17 10:29:06 <[cc]smart> http://pastebin.com/315096
Jul 17 10:29:18 marceln * unslung/make/inetutils.mk: COSMETIC Change: removed a space in the section name
Jul 17 10:31:23 The ramfs failure is unexpected, I suspect it may be something to do with /dev (which cannot be umounted at this point)
Jul 17 10:31:48 The eth0 is a problem, IMO - it's been there for ever and is happening inside the kernel shutdown.
Jul 17 10:32:07 <[cc]smart> /dev/ is not tried for unmount
Jul 17 10:32:25 <[cc]smart> so the failure must be for a different request
Jul 17 10:32:28 <[cc]smart> funny thing
Jul 17 10:32:32 <[cc]smart> there is no ramfs tried
Jul 17 10:32:54 <[cc]smart> only "out" lines are getting unmounted
Jul 17 10:33:09 Look in the script immediately below the unmount call
Jul 17 10:33:42 <[cc]smart> grr
Jul 17 10:33:47 <[cc]smart> will modify test...
Jul 17 10:34:13 I expect the /dev umount to fail - /dev/console is in use outputting those messages!
Jul 17 10:34:33 I don't quite understand why this should lock /dev, but something seems to.
Jul 17 10:34:56 <[cc]smart> you're right, it's /dev
Jul 17 10:35:22 I wasn't able to find a way to umount it. IMO it would be good if it could be umounted, just for cleanness.
Jul 17 10:37:38 The ramfs in the error message is there because when the umount fails it tries the equivalent of 'mount -o remount,ro ramfs /dev'
Jul 17 10:37:41 <[cc]smart> i will try to identify the sinner....
Jul 17 10:38:49 ls -l /proc/*/fd/* can be a useful trick.
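
The `ls -l /proc/*/fd/*` trick can be narrowed down to just the descriptors that point below one directory, which is what matters when hunting for whatever keeps /dev busy. A minimal sketch follows; `open_under` is a hypothetical name, the second argument exists only so the function can be run against a test tree, and a `readlink` command is assumed to be present.

```shell
# open_under DIR [PROC_ROOT]
# Hypothetical helper: prints "<pid> <fd> <target>" for every open file
# descriptor whose symlink target lies below DIR.
open_under() {
    dir=${1%/}
    proc_root=${2:-/proc}
    for link in "$proc_root"/[0-9]*/fd/*; do
        # Each fd entry is a symlink to the open file; skip anything else
        [ -L "$link" ] || continue
        target=$(readlink "$link") || continue
        case $target in
            "$dir"/*)
                fd=${link##*/}          # last path component is the fd number
                pid=${link%/fd/*}       # strip "/fd/<n>" ...
                pid=${pid##*/}          # ... and keep the pid directory name
                printf '%s %s %s\n' "$pid" "$fd" "$target"
                ;;
        esac
    done
}
```

For example, `open_under /dev` lists candidates on a live system. The match is on the path only, so it cannot tell which filesystem a target really lives on.
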
Jul 17 10:39:57 <[cc]smart> http://pastebin.com/315100
Jul 17 10:40:32 <[cc]smart> init 1 root 10u FIFO 0,11 564 /dev/initctl
Jul 17 10:40:57 <[cc]smart> rc 993 root 0u CHR 5,1 255 /dev/console
Jul 17 10:40:58 <[cc]smart> rc 993 root 1u CHR 5,1 255 /dev/console
Jul 17 10:40:58 <[cc]smart> rc 993 root 2u CHR 5,1 255 /dev/console
Jul 17 10:41:02 Ooh, look at that mtdblockd
Jul 17 10:42:34 Yeah, those /dev/consoles are where the output is. I tried to find a way to kill them, but I don't think it matters - /dev mounted shouldn't be an issue anywhere.
Jul 17 10:43:09 <[cc]smart> need to read what mtdblockd is all about....
Jul 17 10:44:32 init has /dev/initctl open, but that's actually on /, not on the ramfs
Jul 17 10:46:47 The mtdblockd is a kernel thread from the jffs2 file system support. Since jffs2 is built in there is no way to rmmod it therefore, I think, no way to stop it.
Jul 17 10:47:32 <[cc]smart> may mtdblockd be stopping to flush too early/leave a "half flushed" state ? : http://www.ussg.iu.edu/hypermail/linux/kernel/0408.1/1913.html
Jul 17 10:47:43 It might be possible to shut down eth in a controlled fashion by rmmod, but it clearly doesn't stop the hardware reset.
Jul 17 10:50:14 I don't believe it can be a software kernel problem because of the way the reboots - soft or hard - happen. Hard is a watchdog timer which pulls the reset line on the chip down. Soft is a direct jump into the RedBoot code.
Jul 17 10:52:14 <[cc]smart> i think the situation somehow doesn't allow the kernel to stop
Jul 17 10:52:31 <[cc]smart> afaik, the kernel is keeping the watchdog happy for as long as it works
Jul 17 10:54:49 Oh no, the kernel stops. Believe me I can stop the kernel...
Jul 17 10:55:24 And the watchdog fires too, because otherwise those LEDs would stay all lit (or go to just amber.)
Jul 17 10:56:06 Anyway, I'm just removing that code - it was only for debugging, I'm going to revert to hard reboot and, if that seems to work, check it in and see whether it works for everyone.
Jul 17 10:56:22 (I.e I have a change to umountinitrd.sh to remount,ro /initrd on boot.)
Jul 17 10:58:09 <[cc]smart> this would suggest two problems
Jul 17 10:58:21 <[cc]smart> one in software which puts HW in strange state
Jul 17 10:58:36 <[cc]smart> one in HW which is that watchdog reset doesn't work in all cases
Jul 17 10:59:31 Yes
Jul 17 10:59:43 And the software one presumably kills the software reboot too.
Jul 17 11:00:53 hiho
Jul 17 11:00:54 <[cc]smart> so a possible scenario with -o ro relieving might be that flash is put into write mode and it doesn't return from that
Jul 17 11:01:22 <[cc]smart> iirc that flash needs write mode to raise voltage etc to make a write
Jul 17 11:02:08 <[cc]smart> which would be a physical ""
Jul 17 11:02:16 <[cc]smart> "state"
Jul 17 11:02:22 Yes, possible. Indeed it seems certain that something the flash driver does to support write must leave the hardware in a state where reset doesn't work.
Jul 17 11:03:32 I would like to add loop-support and iso9660 filesystem support to my nslu2. For that it would be nice to have a .config for the kernel running on my unslung 4.x. Any suggestions where to find such a .config?
Jul 17 11:04:46 <[cc]smart> monotone repo should prolly have that
Jul 17 11:04:48 <[cc]smart> lemme see
Jul 17 11:06:50 <[cc]smart> seems we have such
Jul 17 11:06:53 <[cc]smart> in two pieces
Jul 17 11:07:00 <[cc]smart> one prolly the original linksys
Jul 17 11:07:04 <[cc]smart> and a patch for that
Jul 17 11:07:22 <[cc]smart> can monotone be browsed via web ?
Jul 17 11:07:39 <[cc]smart> or do you have a copy of the repo ?
Jul 17 11:08:28 <[cc]smart> openembedded/packages/linux/nslu2-linksys-kernel-2.4.22/nslu2/defconfig
Jul 17 11:08:58 <[cc]smart> openembedded/packages/linux/nslu2-linksys-kernel-2.4.22/config-fixes.patch
Jul 17 11:09:26 <[cc]smart> ah forget about the patch
Jul 17 11:10:14 sorry, but where can I access those files?
Jul 17 11:11:28 it's not the cvs from sf.net right?
Jul 17 11:12:54 <[cc]smart> no
Jul 17 11:12:58 <[cc]smart> monotone repo
Jul 17 11:13:04 <[cc]smart> but maybe it's on the cvs too
Jul 17 11:13:14 <[cc]smart> it should be a relatively old file
Jul 17 11:14:20 <[cc]smart> do you maybe have /proc/config on your slug ?
Jul 17 11:15:41 no :(
Jul 17 11:16:08 how to access the monotone repo?
Jul 17 11:16:49 <[cc]smart> try this: http://monotone.vanille.de/viewmtn/getfile.py?id=fa7e3b0e1fa2d612403b4249a3699780c86048ce&path=packages/linux/nslu2-linksys-kernel-2.4.22/nslu2/defconfig
Jul 17 11:17:05 thanks :)
Jul 17 11:24:13 jbowler org.openembedded.nslu2-linux * r726e7491... / (7 files in 6 dirs):
Jul 17 11:24:13 propagate from branch 'org.openembedded.dev' (head 888665f8d2b7e9a8781c746c740393ab7115129a)
Jul 17 11:24:13 to branch 'org.openembedded.nslu2-linux' (head a51050bb0bb6b5d2f70e71bda3224baf9d97929d)
Jul 17 11:43:04 [cc]smart: I just got a hang after a reflash - a system where /dev/mtdblock4 was not mounted anywhere.
Jul 17 11:45:18 <[cc]smart> hm
Jul 17 11:45:41 <[cc]smart> but it had booted... ?
Jul 17 11:46:16 ? I executed shutdown -r, so yes, it had to have booted.
Jul 17 11:46:42 <[cc]smart> hrhr... cause you said "after a reflash" which i interpreted different :)
Jul 17 11:47:05 xscale-reset.patch also seems to be relevant, since now I can't even get the original working scenario to work.
Jul 17 11:47:06 <[cc]smart> so you made a flash image that would go to external media directly ?
Jul 17 11:47:23 no, I did reflash -i then shutdown -r and it hung.
Jul 17 11:47:51 Then I power cycled, booted to the new kernel, and now shutdown -r hangs even though my remount,ro fix is there.
Jul 17 11:47:56 <[cc]smart> er, but then mtdblock4 is / no ?
Jul 17 11:49:00 <[cc]smart> hmmm....
Jul 17 11:49:06 <[cc]smart> new observation...
Jul 17 11:49:42 <[cc]smart> ah no
Jul 17 11:49:44 <[cc]smart> irrelevant
Jul 17 11:52:05 No, everything is as before - diskfull system, /dev/mtdblock4 mounted on /initrd ro, or not mounted at all.
Jul 17 11:52:30 I'm backing out the kernel change though - I'm just going to remove the 4s timeout.
Jul 17 11:59:05 <[cc]smart> does reboot -n actually do what it should ?
Jul 17 12:00:16 I should think so, it can be fairly important.
Jul 17 12:03:41 <[cc]smart> these are the mounts left right before reboot:
Jul 17 12:03:43 <[cc]smart> rootfs / rootfs rw 0 0
Jul 17 12:03:43 <[cc]smart> /dev/sda1 / reiserfs ro 0 0
Jul 17 12:03:43 <[cc]smart> proc /proc proc rw,nodiratime 0 0
Jul 17 12:03:43 <[cc]smart> ramfs /dev ramfs rw 0 0
Jul 17 12:04:32 All of which are totally harmless I think. What's more shutdown -r always works if the root is on /dev/mtdblock4.
Jul 17 12:05:25 I'm finding your suggestion more and more believable - suppose that the hang happened if the last access to the flash was a write, not a read?
Jul 17 12:05:33 <[cc]smart> but then there's no rootfs / rootfs rw 0 0
Jul 17 12:06:03 ? That's the original kernel root.
Jul 17 12:07:33 I.e. linux hangs everything off of that pseudo filesystem.
Jul 17 12:07:59 <[cc]smart> yes, could it be that the reboot tries to get rid of that ?
Jul 17 12:08:16 <[cc]smart> eh
Jul 17 12:08:22 <[cc]smart> but still the kernel is halted
Jul 17 12:08:25 <[cc]smart> or hung
Jul 17 12:08:29 <[cc]smart> really hung
Jul 17 12:08:32 <[cc]smart> maybe
Jul 17 12:09:24 <[cc]smart> verification of the mode of the flash should be relatively easy with a scope
Jul 17 12:09:44 <[cc]smart> so it could be compared between normal boot and this reboot situation
Jul 17 12:14:33 I haven't checked what flash chip is in the slug but most flash chips I've had to deal with recently generate Vpp internally so you can't tell from outside if it is still in write state.
Jul 17 12:15:18 <[cc]smart> i love it when plans work
Jul 17 12:16:58 Well, looking at the state of the system might tell us if it hung at the reset or whether it got some way into the RedBoot reboot before it got stuck. That would be useful...
Jul 17 12:17:57 Hum. util-linux doesn't work for me with glibc either - it tries to link something using the build gcc at the do_install step.
Jul 17 12:18:18 | gcc mount.o fstab.o sundries.o xmalloc.o realpath.o mntent.o version.o get_label_uuid.o mount_by_label.o mount_blkid.o mount_guess_fstype.o getusername.o ../lib/setproctitle.o ../lib/env.o nfsmount.o nfsmount_xdr.o nfsmount_clnt.o lomount.o ../lib/xstrncpy.o -o mount
Jul 17 12:18:18 | /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5-20050130/../../../../i686-pc-linux-gnu/bin/ld: mount.o: Relocations in generic ELF (EM: 40)
Jul 17 12:18:18 | mount.o: could not read symbols: File in wrong format
Jul 17 12:18:24 Anyone else see that?
Jul 17 12:19:08 <[cc]smart> will try baking
Jul 17 12:29:28 <[cc]smart> hmmm. could the main Makefile be modified to use a configfile, too, so that i can select what it should care for and what is not used by me ?
Jul 17 12:29:56 <[cc]smart> don't need unslung optware and bitbake there
Jul 17 12:30:18 There's a target for just openslug I think.
Jul 17 12:30:33 Also, cd openslug, source setup-env, bitbake -i ;-)
Jul 17 12:30:49 <[cc]smart> but the update stuff is nice
Jul 17 12:31:34 I have to admit I just have a shell script which syncs everything, does a backup, then waits 10 minutes...
Jul 17 12:31:38 <[cc]smart> so you don't forget the cvs on oe-symlinks et al
Jul 17 12:31:52 Ah. Yes, well, I don't use those.
Jul 17 12:32:45 <[cc]smart> you backup all stuff every 10 minutes ?
Jul 17 12:33:05 <[cc]smart> and a high rate of HD exchange...
Jul 17 12:33:43 I back up all the synced stuff by monotone sync to an NSLU2 running a monotone server.
Jul 17 12:34:00 <[cc]smart> ah
Jul 17 12:34:08 <[cc]smart> that eases the load a bit
Jul 17 12:34:11 <[cc]smart> :)
Jul 17 12:34:12 Because if I don't back it up every time there is a new rev the NSLU2 monotone server will take forever...
Jul 17 12:35:03 jbowler: One thing I noticed with Unslung is that it prints something like "putting flash into read mode" as it reboots ...
Jul 17 12:35:30 <[cc]smart> eek
Jul 17 12:35:32 Ah ha, now that is a clue...
Jul 17 12:36:00 I wonder if someone can strace the Do_Reboot in Unslung ...
Jul 17 12:36:03 <[cc]smart> now just tell me that there is a linksys kernel patch around ....
Jul 17 12:36:14 <[cc]smart> i will snoop unslung sources a bit
Jul 17 12:36:18 It's also consistent with everything we've seen so far... I just got my shutdown -r now fix working, but at the cost of retaining the fake soft reboot and 0.25s watchdog timeout.
Jul 17 12:36:36 [cc]smart: build unslung
Jul 17 12:36:38 ;-)
Jul 17 12:36:53 * rwhitby-treo goes back to sleep ...
Jul 17 12:37:33 <[cc]smart> sth awoke the undead
Jul 17 12:37:40 <[cc]smart> :)
Jul 17 12:38:25 in fact I think bb unslung-kernel will probably work - but take care to delete tmp/deploy/zImage to avoid confusion...
Jul 17 12:40:05 Ok. This is crazy but it works - my xscale-reset patch, soft reboot (which never completes) and mount -o remount,ro /initrd Yuck.
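
The `mount -o remount,ro /initrd` part of that workaround can be generalised to "remount every writable jffs2 mount read-only before rebooting". The following is only a sketch of that idea, not the change that was checked in: both function names are hypothetical, and it assumes the table has the /proc/mounts column layout quoted earlier in the log (device, mount point, type, options).

```shell
# rw_jffs2_mounts [MOUNTS_FILE]
# Hypothetical helper: prints the mount point of every jffs2 filesystem
# that is currently mounted read-write, reading a /proc/mounts style table.
rw_jffs2_mounts() {
    awk '$3 == "jffs2" && $4 ~ /(^|,)rw(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}

# remount_flash_ro [MOUNTS_FILE]
# Hypothetical helper: remounts each of those mount points read-only.
# Needs root; meant to be called late in the shutdown sequence.
remount_flash_ro() {
    rw_jffs2_mounts "$@" | while read -r mp; do
        mount -o remount,ro "$mp"
    done
}
```

Calling `rw_jffs2_mounts` on its own is harmless and shows what `remount_flash_ro` would touch.
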
Jul 17 12:41:06 <[cc]smart> NOTE: package util-linux-2.12q: completed
Jul 17 12:41:06 <[cc]smart> NOTE: build 200507172131: completed
Jul 17 12:51:20 [cc]smart: can you tell me what the timestamps are on two files in the working directory - ls -l work/util-linux*/*/mount/nfsmount_xdr.?
Jul 17 12:54:05 jbowler org.openembedded.nslu2-linux * r6b3cc2b0... /packages/ (6 files in 4 dirs):
Jul 17 12:54:05 Horrible hacky, maybe working, work round for shutdown -r. Please report
Jul 17 12:54:05 any future hangs on shutdown -r in slugbug 145.
Jul 17 12:57:03 <[cc]smart> er, timestamp of noe ?
Jul 17 12:57:31 <[cc]smart> s/noe/now/
Jul 17 13:03:14 The modification time of the two files - the .o and the .c
Jul 17 13:05:15 <[cc]smart> already deleted again
Jul 17 13:05:28 <[cc]smart> but well, they were of now
Jul 17 13:05:35 <[cc]smart> current time
Jul 17 13:05:40 <[cc]smart> CET
Jul 17 13:06:45 <[cc]smart> seen the OE mail by freyther regarding quoting in bb files ?
Jul 17 13:06:55 <[cc]smart> will car for the openslug ones
Jul 17 13:07:00 <[cc]smart> s/car/care/
Jul 17 13:10:49 I just checked out Rod's suggestion (but didn't want to actually reboot the slug)
Jul 17 13:10:49 DO_Reboot is a shell script which calls /etc/rc.d/rc.reboot
Jul 17 13:10:49 /etc/rc.d/rc.reboot is a shell script which calls busybox to do the reboot
Jul 17 13:10:49 I haven't looked but I assume the Linksys source download includes the version of busybox they used
Jul 17 13:13:06 heh - bad assumption
Jul 17 13:15:27 <[cc]smart> it doesn't seem that linksys busybox contains unusual/flash related strings.
Jul 17 13:18:31 maybe it's in the kernel ...
Jul 17 13:20:19 <[cc]smart> how do you uncompress vmlinuz ?
Jul 17 13:20:49 It's LZ77 I believe, gunzip might work.
Jul 17 13:22:24 <[cc]smart> didn't
Jul 17 13:22:28 <[cc]smart> need to check...
Jul 17 13:22:40 [cc]smart: see the firmware/Makefile in unslung cvs (might have to look in the attic) Jul 17 13:22:52 unslung 1.0 used to binary sed the kernel Jul 17 13:25:06 <[cc]smart> arg Jul 17 13:25:25 <[cc]smart> something scary for nightmares Jul 17 13:26:01 it wasn't just strings either - we binary edited a number in the code ... Jul 17 13:45:34 +#ifdef CONFIG_MTD_CFI_INTELEXT Jul 17 13:45:34 +/* Jul 17 13:45:35 + * Set the Intel flash back to read mode as MTD may leave it in command mode Jul 17 13:45:35 + */ Jul 17 13:46:10 where's that? Jul 17 13:46:25 2.4.22-xfs-nslu2.patch Jul 17 13:46:34 there ya go Jul 17 13:46:36 Here's another: Jul 17 13:46:38 ++/* Jul 17 13:46:38 ++ * HACK: Put flash back in read mode so RedBoot can boot properly. Jul 17 13:46:38 ++ */ Jul 17 13:47:27 That's the one with the message, but it does the same thing. Jul 17 13:47:51 well, that calls for an openslug binary release :-) Jul 17 13:48:36 If it works then the only remaining issue is irq26 - it would be nice to kill that too. Jul 17 13:49:08 apparently lennert thinks that's a "comment it out" issue :-) Jul 17 13:49:21 (but I've got that third or fourth -hand) Jul 17 13:50:41 <[g2]> jbowler, yes lennert does feel that way Jul 17 13:51:14 <[g2]> jbowler, excellent news on the reboot Jul 17 13:51:57 <[g2]> bewoolie had some concerns that that might be the issue as he ran into a similar problem on the sharp platform Jul 17 13:54:50 <[g2]> You'd think that the jffs2 code would automatically do that when umounting the jffs2 system Jul 17 13:55:08 <[g2]> actually... that *last* jffs2 partition Jul 17 13:55:15 it's board-specific Jul 17 13:55:16 Why? Jul 17 13:55:27 It's not a jffs2 issue. Jul 17 13:55:35 jbowler: so it will go in nslu2-setup.c ? Jul 17 13:56:06 I don't know what it's doing yet. It's a patch to the MTD driver. Jul 17 13:56:51 <[g2]> I think both Redboot and APEX don't boot properly Jul 17 13:57:08 [g2]: it's bootloader independent Jul 17 13:57:30 <[g2]> nod... 
the Intel Flash chip is left in a bad state right ? Jul 17 13:58:02 jbowler: nod - it's a mtd maps patch Jul 17 13:58:05 The fix is in the current kernel source. Jul 17 13:58:23 It probably just isn't getting called... Jul 17 13:58:27 <[g2]> The avila boards may have the same issue or they may yank the chip reset and have a work around Jul 17 13:59:25 <[g2]> jbowler, is that 2.6.12.2 ? Jul 17 13:59:46 Someone who understands the device drivers needs to have a look at this. It seems to be there in the driver remove code. Jul 17 13:59:54 jbowler: can't see it in mtd/maps/ixp4xx.c ... ? Jul 17 14:00:40 Line 121 Jul 17 14:01:16 ah, right. Jul 17 14:01:29 So I don't think the whole remove thing is actually called. Jul 17 14:02:09 And it would be called if the .remove proc was invoked. Jul 17 14:02:10 <[g2]> probably be easy enough to add a printk in there Jul 17 14:03:10 * [g2] wonders if info is properly set Jul 17 14:03:19 called from del_mtd_device in mtdcore.c, but that might not be called either Jul 17 14:05:45 * [g2] hugs dyoung-zzzz Jul 17 14:06:11 <[g2]> it's *really* nice having kernel messages back again Jul 17 14:08:28 jbowler: I think we need to add the explicit register_reboot_notifier to ixp4xx.c with a NSLU2 arch ifdef around it Jul 17 14:08:41 /nick dyoung Jul 17 14:08:53 * [g2] hugs dyoung again... Jul 17 14:08:55 so the reboot issue is done? Jul 17 14:09:06 and I can stop looking now? Jul 17 14:09:19 dyoung: stop looking and start patching :-) Jul 17 14:09:44 what was the verdict? mtd driver needs to be forcibly put into read mode? Jul 17 14:10:10 There's code in there to close down the flash correctly, the hypothesis is that none of it is called. 
Jul 17 14:10:41 I think we need to take the ixp425_mtd_reboot, ixp425_mtd_notifier and the calls to [un]register_reboot_notifier from unslung's drivers/mtd/maps/ixp425.c and put them in openslug's drivers/mtd/maps/ixp4xx.c with an NSLU2 ifdef around them Jul 17 14:11:33 * rwhitby-away is getting ready for work soon, so won't be able to actually do that Jul 17 14:12:14 Deepak Saxena did the original patch, and did the ixp4xx code too - since the reboot notifier was removed it was probably unnecessary. Jul 17 14:12:45 * [g2] wants to know if it's already getting called or not -- printk and dumping info will tell Jul 17 14:12:54 After all there must be some way of calling that remove API... Jul 17 14:13:27 line 105 of drivers/mtd/maps/ixp4xx.c Jul 17 14:13:48 <[g2]> actually... in 2.6.12.2 we are probably not using the api except for the code that dyoung put in for the nslu2-setup.c Jul 17 14:13:48 I've seen these calls before, one moment. Jul 17 14:13:59 Yes - all that stuff, although the first patch only did line 121. Jul 17 14:14:55 I suspect it worked for a hard reset, but without the last line a soft reset wouldn't work. Jul 17 14:15:02 jbowler: my guess is that we are missing something in our shutdown sequence, which keeps the mtd device driver loaded, and so that remove routine never gets called Jul 17 14:15:48 Yes, that is also possible, however I've reproduced the problem where the last thing done to the flash was writing direct to /dev/mtdblock? Jul 17 14:15:54 maybe deepak never had occasion to pivot_root an ixp4xx board ... Jul 17 14:16:38 <[g2]> well.. I was thinking we've pivoted and umounted the initrd Jul 17 14:16:52 <[g2]> is the driver even active ? Jul 17 14:17:22 who knows. I vote for just putting the reboot notifier in, and punting on the whole issue. 
Jul 17 14:17:53 <[g2]> for testing yes Jul 17 14:18:01 it's not like it's going to have any side-effects after that code is executed :-) Jul 17 14:18:12 (apart from allowing the reboot to proceed) Jul 17 14:18:13 <[g2]> for production... we should fix this properly and get it upstream if it's not already there Jul 17 14:18:38 <[g2]> I'm looking to get all the kernel changes upstream Jul 17 14:19:03 I disagree. A forcible reboot should always work, no matter what other state the rest of the kernel drivers are in. Jul 17 14:19:12 (disagree on testing vs production) Jul 17 14:19:38 My position is that this reboot notifier must be in production too, to handle rebooting when the kernel drivers are in a bad state. Jul 17 14:19:50 <[g2]> I fully agree the forcible reboot should always work too Jul 17 14:20:06 To assume that the mtd device can always be unloaded when you need to reboot is not a safe assumption. Jul 17 14:20:13 <[g2]> I'm just saying that I'm not convinced the fix is the only and most proper one Jul 17 14:20:41 class_simple_device_remove? Jul 17 14:21:10 [g2]: I'm saying that irrespective of whether there is another fix which should be done, the reboot notifier should be there anyway. Jul 17 14:21:32 <[g2]> that may very well be the case Jul 17 14:22:04 of course adding the reboot notifier will mask whatever the other fix may be required for ... Jul 17 14:22:20 but if it's for a reboot, then who cares ... Jul 17 14:22:27 <[g2]> I'm wondering if our 2.6.12.x port is missing somethings like possibly the reboot notifier because of the way we implemented it Jul 17 14:23:00 rwhitby-away: it looks to me that the platform_notify slot has already been used up in common-pci.c Jul 17 14:23:27 Ah, hang on, that's remove, not reboot... Jul 17 14:24:41 Ok, so there is a registration interface, not a single slot, but is it used? 
Jul 17 14:29:58 can someone verify the sanity of line 51 in nslu2-setup.c please Jul 17 14:33:15 dyoung: it seems to be the same as ixdp425-setup.c (which doesn't mean it is correct) Jul 17 14:36:50 <[g2]> reflash -k zImage-printk Jul 17 14:36:50 <[g2]> reflash: /boot/rootfs.jffs2: root file system image file not found Jul 17 14:36:52 <[g2]> ???? Jul 17 14:37:22 That would seem to be correct, if the file name extensions are to be believed. Jul 17 14:38:04 BTW that's an old reflash I think. Jul 17 14:38:24 <[g2]> is there a version number ? Jul 17 14:38:57 Try reflash --help Jul 17 14:39:09 jbowler, is that line the same as saying .dev.platform_data = &nslu2_flash_data ? Jul 17 14:39:12 <[g2]> reflash --help Jul 17 14:39:12 <[g2]> reflash: usage: /usr/sbin/reflash [-k kernel -j rootfs] | -i image Jul 17 14:39:12 <[g2]> -k [/boot/zImage]: the new compressed kernel image ('zImage') Jul 17 14:39:12 <[g2]> -j [/boot/rootfs.jffs2]: the new root file system (jffs2) Jul 17 14:39:12 <[g2]> -i image: a complete flash image (gives both kernel and jffs2) Jul 17 14:39:13 <[g2]> The current jffs2 will be umounted if mounted. Jul 17 14:39:29 That's an old one. Jul 17 14:40:02 Note the options: [-k kernel -j rootfs] | -i image Jul 17 14:40:27 <[g2]> it should only be a week or so old Jul 17 14:41:05 that's a looooong slugtime Jul 17 14:41:43 ipkg install openslug-init Jul 17 14:41:58 (from unstable) Jul 17 14:42:54 <[g2]> I've got a test kernel which does the printk before and after the check for the !info the mtd remove function Jul 17 14:43:18 <[g2]> this is a clean FatTurbo that was setting of for package testing Jul 17 14:43:48 <[g2]> however, this is an very important topic and I'd like to try to help get closure on this issue Jul 17 14:44:48 [g2] thats what I'm trying to determine. Jul 17 14:44:53 does info return 0 or not Jul 17 14:45:05 if it doesnt, we know where to look. 
Jul 17 14:45:15 <[g2]> I'm ready to test and I'm search the logs for the devio comd Jul 17 14:45:16 <[g2]> cmd Jul 17 14:45:43 <[g2]> Jun 01 11:22:38 This command: devio '<>/dev/mtdblock2;wb$16+4;fb12,0;cp$' Jul 17 14:45:48 <[g2]> that one Jul 17 14:46:34 <[g2]> DOH! I'm such a dumbass Jul 17 14:47:18 * [g2] is running with APEX booting from the jffs2 Jul 17 14:47:34 <[g2]> it's a cp /initrd/boot for me Jul 17 14:47:39 <[g2]> and a reboot Jul 17 14:49:35 <[g2]> sigh... Jul 17 14:49:50 <[g2]> Ok... this slug is a little out of date Jul 17 14:51:49 <[g2]> ok... this will work better Jul 17 14:59:45 <[g2]> http://pastebin.ca/17940 Jul 17 14:59:58 <[g2]> it appears to me that the routine is not getting called Jul 17 15:00:32 Yes, that's the same as [cc]smart saw earlier today. Jul 17 15:01:35 <[g2]> ah.. sorry I missed that Jul 17 15:01:56 <[g2]> I didn't know you guys had already verified via running code that it didn't get called Jul 17 15:03:41 No - we didn't - I'm saying that there is no extra message in your code Jul 17 15:04:06 Adding a printk didn't change anything. Jul 17 15:04:15 <[g2]> Oh.. I did have extra messages in my code and that was the print out with the missing messages Jul 17 15:04:27 Right. Jul 17 15:09:21 is that code supposed to be run at startup or shutdown Jul 17 15:09:40 I would have expected it at startup Jul 17 15:10:14 ixp4xx_flash_remove? It's run on error from the probe, to clean up, and when the device is removed. Jul 17 15:12:25 <[g2]> what's the mbcache module btw ? Jul 17 15:19:38 <[g2]> dyoung, jbowler rwhitby-away etc.... From my kernel novice understanding we are running the cfi driver and the ixp4xx.c code is running the ixp4xx driver Jul 17 15:20:29 http://www.amazon.com/exec/obidos/tg/detail/-/B0000DF2GD/ref=e_de_a_smtpd/104-3248000-3615135?v=glance&s=electronics&vi=tech-data <- this thing runs linux ... Jul 17 15:21:37 <[g2]> yeah... 
the AXIS like is for the SOC and there was even an Open FPGA code that did the video encoding Jul 17 15:22:08 <[g2]> they've been doing that stuff for 1-2 years probably Jul 17 15:22:22 more than that. Jul 17 15:22:26 <[g2]> there was a really cool article on linuxdevice about that Jul 17 15:22:37 I had a linux-ified axis many years ago. Jul 17 15:22:45 it was $$$ Jul 17 15:22:45 <[g2]> nod Jul 17 15:22:50 <[g2]> nod. Jul 17 15:22:51 <[g2]> :) Jul 17 15:23:44 <[g2]> So... I think we are passing a device text name of IXP4XX-Flash, but are really running the CFI driver Jul 17 15:23:56 <[g2]> agree, disagree ? Jul 17 15:26:03 but we're calling the cfi detection routines. Jul 17 15:27:26 <[g2]> right CFI finds the flash Jul 17 15:28:26 dyoung: the axis 205 is starting to get into a reasonable price range ... Jul 17 15:29:01 alternatively: this one has pan and tilt too: http://www.amazon.com/exec/obidos/tg/detail/-/B0002GS4Z0/ref=pd_bxgy_text_1/104-3248000-3615135?v=glance&s=electronics&st=* Jul 17 15:29:27 Yeah, at $180 it's starting to get cheaper to get one of those instead of fooling around with nslu2's Jul 17 15:33:51 Something weird here: remove is called to dissociate a driver with a device. This may be Jul 17 15:33:51 called if a device is physically removed from the system, if the Jul 17 15:33:51 driver module is being unloaded, or during a reboot sequence. Jul 17 15:51:33 ingeba * unslung/make/openvpn.mk: Added dependency for kernel-module-tun.mk Jul 17 15:52:16 ingeba * unslung/make/openvpn.mk: Bumped IPK number Jul 17 15:54:49 ingeba * unslung/make/samba.mk: Added cups staging to ensure the presence of cups.h in staging area Jul 17 15:55:30 [g2]: can you try out this patch: http://pastebin.ca/17943 Jul 17 15:55:47 @bob_tm-toandfrom: Are you Inge Bjørnvall Arnesen? Jul 17 15:55:56 It should result in the remove API being called, but on my system something goes wrong at that point. 
Jul 17 15:55:57 bwalle: Yes Jul 17 15:56:24 Because of the capital letter issue: Do you have the latest ipkg-utils installed? Jul 17 15:56:27 I don't have this problem. Jul 17 15:56:39 Are you cross or native? Jul 17 15:56:43 But if it is the policy to have lowercase letters in package names, I can change Jul 17 15:56:45 cross Jul 17 15:57:04 Hmmm... it is all from the common make system - just a moment. Jul 17 15:57:22 <[g2]> jbowler, sure I think I and try that Jul 17 15:57:29 <[g2]> I can Jul 17 15:57:40 You should get your message then probably an oops. Jul 17 15:57:53 (It doesn't hang the kernel - the power button still works). Jul 17 15:58:03 I have the solution: It depends on $LANG Jul 17 15:58:19 Normally I have LANG=de_DE.utf-8, no error. With LANG=C I get the same error as you Jul 17 15:58:39 This means I'll fix the package name. Just wait a moment Jul 17 15:59:31 Well, this means I have to delete fIcy.mk and re-add ficy.mk Jul 17 15:59:32 bwalle: Ah - that explains it. AFAIK all packages are lower case only - sort of remember an issue about gIft (or whatever the real name is) Jul 17 16:00:32 bwalle: You won't believe it, but there are 11 versions of ipkg-build in the make system (some are i686 and some ARM though) Jul 17 16:01:36 Don't understand the last message Jul 17 16:02:07 ingeba * unslung/make/sysstat.mk: Upgraded to 6.0.1 as 6.0.0 was no longer available - bumped IPK number Jul 17 16:02:37 Oh - it is just that there are so many binaries of the same program in the Unslung/openslug make file system, so it is hard to find which is actually being used. Jul 17 16:03:04 :) Jul 17 16:03:49 bwalle: Need to go to bed. I'm on CET summer time like you, but my sleep pattern is all screwed up. Jul 17 16:04:11 Ok, fixed hard reboot but not soft - something after the 0x55 stuff hangs. Jul 17 16:04:27 I will look at fIcy tomorrow - will be fun. Thanks for the port! Keep up the good work! 
Jul 17 16:04:29 ok Jul 17 16:04:45 Good night :) Jul 17 16:04:50 Night all Jul 17 16:06:22 <[g2]> jbowler, our code is different Jul 17 16:06:36 bwalle * unslung/ (Makefile make/ficy.mk make/fIcy.mk): Renamed fIcy to ficy. Jul 17 16:06:43 <[g2]> you've got: Jul 17 16:06:47 <[g2]> # Jul 17 16:06:47 <[g2]> struct platform_device *dev = to_platform_device(_dev); Jul 17 16:06:47 <[g2]> # Jul 17 16:06:47 <[g2]> @@ -243,6 +248,7 @@ Jul 17 16:07:05 <[g2]> I've got Jul 17 16:07:09 <[g2]> .name = "IXP4XX-Flash", Jul 17 16:07:09 <[g2]> Jul 17 16:07:31 It's a patch to 2.6.12.2 Jul 17 16:08:47 <[g2]> jbot, I've got 2.6.12.2 Jul 17 16:09:04 <[g2]> ~lart autocomplete Jul 17 16:09:06 It's just setting up the 'shutdown' member of the driver structure (with the remove function). Jul 17 16:09:07 * jbot readies the nuke launcher and fires some rounds at autocomplete Jul 17 16:09:09 wait a minute Jul 17 16:09:16 [g2] what file are you trying to patch? Jul 17 16:09:43 I think you're talking about different files Jul 17 16:10:05 <[g2]> openslug-kernel-2.6.12.2-r0/linux-2.6.12.2/drivers/mtd/maps/ixp4xx.c Jul 17 16:10:35 I have [g2]'s line in my (driver/mtd/maps) file, it's after the probe API. Let me check if I damaged the .orig Jul 17 16:11:00 <[g2]> very similar Jul 17 16:11:51 No diff. Jul 17 16:12:24 I.e. my orig file is an exact match for the ixp4xx.c in a build tree I have from 0715 Jul 17 16:15:36 2634d3ce89ebac6cdffa804747127183 ixp4xx.c Jul 17 16:15:51 And it is identical in 2.6.11.2 as well. Jul 17 16:16:41 <[g2]> are you looking at the pre or post patched version Jul 17 16:16:57 post Jul 17 16:17:07 It's a patch to the source in the tree. Jul 17 16:17:45 <[g2]> is that the sha1 on the ixp4xx.c ? Jul 17 16:20:56 Yes Jul 17 16:21:30 It's the sha1 (md5sum) of the original (OE patched) file, before my patch. 
Jul 17 16:21:53 <[g2]> sha1sum tmp/work/openslug-kernel-2.6.12.2-r0/linux-2.6.12.2/drivers/mtd/maps/ixp4xx.c Jul 17 16:21:53 <[g2]> fc87a723aa5563bccc1714d31d14ef0e56e27057 tmp/work/openslug-kernel-2.6.12.2-r0/linux-2.6.12.2/drivers/mtd/maps/ixp4xx.c Jul 17 16:22:28 <[g2]> md5sum... Jul 17 16:22:48 <[g2]> md5sum tmp/work/openslug-kernel-2.6.12.2-r0/linux-2.6.12.2/drivers/mtd/maps/ixp4xx.c Jul 17 16:22:49 <[g2]> 2634d3ce89ebac6cdffa804747127183 tmp/work/openslug-kernel-2.6.12.2-r0/linux-2.6.12.2/drivers/mtd/maps/ixp4xx.c Jul 17 16:22:58 <[g2]> md5's match Jul 17 16:23:59 <[g2]> line 15 in the patch file isn't right then in http://pastebin.ca/17943 Jul 17 16:24:00 tabs converted to white space? Jul 17 16:24:18 <[g2]> could be from Jul 17 16:24:24 <[g2]> something like that Jul 17 16:25:07 <[g2]> but the struct before the 243 is clearly not the same as in the file Jul 17 16:26:55 That's line 149. Looks identical to me (apart from tab vs spaces) Jul 17 16:27:26 <[g2]> Ahhh... ok Jul 17 16:27:47 <[g2]> the stuff after line 243 starts with .bus.... Jul 17 16:28:11 No, that would be line 243, and that is .bus Jul 17 16:28:50 bzhou * unslung/make/py-psycopg.mk: source site moved, fixed the location Jul 17 16:34:45 What's really annoying is that every test involves rebooting twice, once to get the new kernel after flashing it, once to test it... Jul 17 16:35:08 <[g2]> jbowler, actually you can just load the kernel via tftp and boot Jul 17 16:35:45 I never managed to get that to work. Jul 17 16:36:57 The (new) hang is in one of the following three lines: Jul 17 16:37:00 del_mtd_partitions(info->mtd); Jul 17 16:37:00 map_destroy(info->mtd); Jul 17 16:37:07 iounmap((void *) info->map.map_priv_1); Jul 17 16:38:47 Maybe those things can't be done in shutdown. Jul 17 16:49:06 It's one of the first two lines. Jul 17 17:34:02 jbowler org.openembedded.nslu2-linux * r3e52cc02... 
/packages/linux/ (5 files in 2 dirs): Jul 17 17:34:02 Fix for the hardware reset hang problem - shut down the flash memory Jul 17 17:34:02 device (ensure it is in read mode, not write) on halt/reboot. Jul 17 17:35:29 The two lines for info->mtd are still commented out. It's clear from the code that if something deletes the info->mtd stuff the driver will fail, so presumably that's what happens. Jul 17 17:49:13 jbowler org.openembedded.nslu2-linux * rb61dbb4f... /packages/ (33 files in 19 dirs): Jul 17 17:49:13 propagate from branch 'org.openembedded.dev' (head 905c8d454a8f78a32652b51edb7d9fc32edc640b) Jul 17 17:49:13 to branch 'org.openembedded.nslu2-linux' (head 3e52cc026575dc951fd4e165189b07d652bb7adc) Jul 17 18:06:56 I need to fix 197 before a binary release. Jul 17 18:09:03 jbowler org.openembedded.nslu2-linux * r78d702d0... /packages/ (meta/openslug-packages.bb util-linux/util-linux.inc): Jul 17 18:09:03 Work round ccache/timestamp problem in util-linux caused by a machine Jul 17 18:09:03 generated .c seeming to be newer than the .o generated from it (caused by Jul 17 18:09:03 fine granularity make timestamps apparently). Jul 17 18:57:33 monotone 0.21 is out Jul 17 18:58:29 hmm, is it compatible with 0.20 ? Jul 17 18:58:51 Nope Jul 17 18:59:27 upgrade from 0.20: you need to run db migrate against each database Jul 17 19:00:16 maybe we can console ourselves that we are helping monotone mature by actually using it Jul 17 19:01:31 some consolation Jul 17 19:02:00 Hey ByronT-Away I was just gossiping about you. Jul 17 19:02:04 Oh FFS Jul 17 19:02:16 what have I done or not done now? Jul 17 19:02:18 :) Jul 17 19:02:19 have you plugged your Harmony 880 into the slug yet? Jul 17 19:02:26 should I? Jul 17 19:02:30 <[g2]> jbowler, hey.... Jul 17 19:03:21 <[g2]> sorry, guests arrived before.... 
Jul 17 19:03:33 <[g2]> they're gone now Jul 17 19:03:42 I dunno, was just wondering if you did or not Jul 17 19:04:38 monotone.nslu2-linux.org has been updated to 0.21 (not that it matters, cause the netsync protocol hasn't changed) Jul 17 19:05:16 <[g2]> the .20 db I pulled a couple hours ago isn't any good any ? Jul 17 19:05:25 so I should upgrade my box to 0.21 then right? Jul 17 19:05:27 can someone test a pull from nslu2-linux? Jul 17 19:06:02 everyone should upgrade. You run "monotone db migrate" to update your local database - it just adds a few more sql indexes to speed things up Jul 17 19:06:04 working Jul 17 19:06:05 <[g2]> I emerged today and it was 20 Jul 17 19:06:27 note that you don't have to upgrade, unless you want to remain slow :-) Jul 17 19:06:38 s/unless/if/ Jul 17 19:07:08 dyoung, I'll hook it up to FrankenSlug later - he's down atm, and ProdSlug doesn't have serial Jul 17 19:07:36 I'm pulling with 0.20. Jul 17 19:07:43 to verify it works. Jul 17 19:07:49 cool. I'll do it as I'm stuck with weird errors anyways at 0.19. Jul 17 19:07:54 I'll try it with 0.21 if it ever finishes. Jul 17 19:08:10 VoodooZ, you cannot use 0.19 from monotone.nslu2-linux.org Jul 17 19:08:14 Ooh, did they fix the apparent n^2 check the revisions issue? Jul 17 19:08:24 the version of netsync changed between 0.19 and 0.20. Jul 17 19:08:29 VoodooZ: We switched to 0.20 recently and it's not compatible with < 0.20 Jul 17 19:09:33 OOh, --exclude on branches, goodbye zecke... Jul 17 19:09:51 I know. that's not what I was saying. I had problems last week so I'm just glad we're going to 0.21 Jul 17 19:12:21 35 revs, just finished. Jul 17 19:12:32 so it worked from a local 0.20 Jul 17 19:12:38 now to upgrade to 0.21 Jul 17 19:15:22 The 0.20 patches still work... Jul 17 19:16:46 But they're trying to run a test program from configure... Jul 17 19:18:14 <[g2]> jbowler, so iounmap wasn't getting called ? 
Jul 17 19:18:52 I don't know - the version I checked in has the iounmap stuff and works. Jul 17 19:22:21 <[g2]> jbowler COOL, congrats! Jul 17 19:22:26 <[g2]> that's a big fix Jul 17 19:22:48 Try removing the #if 0 and seeing what the kernel oops is... Jul 17 19:23:05 I can't get it because it happens after syslogd has been killed. Jul 17 19:23:35 huh? jbowler fixed openslug reboot? Jul 17 19:24:11 we fixed the two "big" bugs over the last 2 days. Jul 17 19:24:40 <[g2]> dyoung, what was the other ? Jul 17 19:24:47 serial Jul 17 19:24:51 console Jul 17 19:24:55 what?? that's fixed too? Jul 17 19:24:57 <[g2]> heh Jul 17 19:25:06 you guys are amazing (as usual) Jul 17 19:25:16 * [g2] hugs dyoung for the 3rd time today Jul 17 19:25:38 * jacques has been preoccupied with moving/trying to get a job. Jul 17 19:25:47 <[g2]> moving ? Jul 17 19:26:03 my lease is up, owner is going to sell house, I gotta be out by end of august Jul 17 19:26:12 * [g2] been preoccupied with emerge (170+ packages) yesterday/today Jul 17 19:26:16 so I probably won't stay in this area Jul 17 19:26:49 Wow, that's kind of a bummer eh? Didn't you just kind of get settled sorta recently? Jul 17 19:27:24 Wow, that *is* fast. Jul 17 19:27:53 0.21 is a good move. Jul 17 19:28:04 dyoung, yeah Jul 17 19:30:58 darn. Fedora core4 repos don't have monotone 0.21 yet. Jul 17 19:31:11 I know! I'm lazy! Jul 17 19:31:43 <[g2]> jacques, the tarball appears to be building for me Jul 17 19:31:56 <[g2]> I had just emerged 20 today Jul 17 19:32:03 <[g2]> and pulled my first db Jul 17 19:36:24 <[g2]> jacques, 21 build just fine from the tarball Jul 17 19:39:55 ok. Just to make sure: I just installed monotone 0.21 on my machine. I had 0.19 client/db before. do I do the "monotone db migrate" step anyways? Jul 17 19:40:28 Yes Jul 17 19:41:44 thanks Jul 17 19:58:50 so damn hot Jul 17 19:58:57 ~weather kcov Jul 17 19:58:58 same here! 
Jul 17 19:59:00 I can't find station code "KCOV" (see http://www.nws.noaa.gov/oso/site.shtml or http://www.nws.noaa.gov/tg/siteloc.shtml for ICAO locations codes). Jul 17 19:59:03 ~weather kcvo Jul 17 19:59:05 Corvallis, Corvallis Municipal Airport, OR, United States; (KCVO) 44-30N 123-17W; last updated: 2005.07.18 0215 UTC; Dew Point: 51 F (11 C); Pressure (altimeter): 29.87 in. Hg (1011 hPa); Relative Humidity: 24%; Sky conditions: clear; Temperature: 93 F (34 C); Visibility: 10 mile(s); Wind: from the N (360 degrees) at 9 MPH (8 KT) Jul 17 19:59:20 there was fog on the ice while playing hockey earlier!! Jul 17 19:59:25 was 97F before Jul 17 19:59:34 what's that in degrees C? Jul 17 20:00:07 (97 f = 36.111111 c) Jul 17 20:00:09 with humidex it feels like 34 celsius here and it's 11pm! Jul 17 20:00:14 ouch! Jul 17 20:00:24 it's 8pm here Jul 17 20:00:41 hot! Jul 17 20:01:01 we had rain last night so I'm surprised the humidity didn't go away more. Jul 17 20:02:22 is it normal that "monotone db migrate" returned after a few seconds with no messages? Jul 17 20:03:14 nothing changed with the Makefile either? I can just do make update? Jul 17 20:03:29 I have to make sure as I always manage to screw things up! Jul 17 20:07:01 I get: monotone: misuse: no branch pattern given and no default pattern set Jul 17 20:08:28 monotone -d monotone/nslu2-linux.db pull monotone.nslu2-linux.org Jul 17 20:08:28 org.openembedded.* org.nslu2-linux.* Jul 17 20:08:39 http://groups.yahoo.com/group/nslu2-linux/message/7488 Jul 17 20:09:51 thanks dyoung Jul 17 20:10:21 Sorry for not reading the list. I should start reading more often Jul 17 20:10:58 np Jul 17 20:17:38 it's a long process. Jul 17 20:24:47 <[g2]> how long does the verify take with the monotone 21 dl ? Jul 17 20:24:53 bzhou * unslung/make/monotone.mk: upstream upgrade from 0.21 to 0.20 Jul 17 20:26:36 night Jul 17 20:46:09 <[g2]> how many revs need to be written on the verify ? Jul 17 20:46:19 <[g2]> for the monotone pull ? 
Jul 17 21:05:15 my guess is the same as "revs in", i'm actually 100/301 right now Jul 17 21:05:20 About 5550 Jul 17 21:05:24 Sorry, 550 Jul 17 21:06:48 revs written often seems to be less than revs in. Jul 17 21:07:03 i c Jul 17 21:08:02 <[g2]> jbowler, thx Jul 17 21:08:17 <[g2]> doh.... only at 378 Jul 17 21:10:05 <[g2]> nite all Jul 17 21:10:30 <[g2]> The fresh pull should have a clean build early tomorrow Jul 17 21:45:22 i'm trying to get familiar with openslug package structure, i guess oe-symlinks/packages/ got all the packages ported, but how can i tell native package from cross package? Jul 17 22:22:48 bzhou * unslung/ (make/cogito.mk sources/cogito/Makefile.patch): upstream upgrade from 0.12 to 0.12.1 **** ENDING LOGGING AT Sun Jul 17 23:59:57 2005