**** BEGIN LOGGING AT Wed Jun 15 23:59:57 2005
Jun 16 05:57:10 Anybody did a full build of openslug since yesterday?
Jun 16 05:57:40 Mine seems to die at openslug-image complaining it can't find the file Packages. ???
Jun 16 05:59:19 ipkg: /local/openslug/build/tmp/deploy/ipk/Packages: No such file or directory
Jun 16 05:59:41 is that near the beginning or end of the build?
Jun 16 06:00:03 Near the end. at the openslug-image step.
Jun 16 06:00:23 oh, you said openslug-image already, sorry
Jun 16 06:00:43 I checked and there is no such Packages file.
Jun 16 06:00:46 I can start one, but it'll take > an hour
Jun 16 06:01:01 Something must have changed as I didn't touch my repo.
Jun 16 06:01:19 no problem. at least I'll know if i'm not alone.
Jun 16 06:01:34 clearing .ccache now
Jun 16 06:01:39 thanks man
Jun 16 06:01:46 (slow disk on that machine)
Jun 16 06:05:13 cleared tmp
Jun 16 06:05:18 pulling latest oe now
Jun 16 06:05:48 Applying 1 revisions to classes/rootfs_ipk.bbclass
Jun 16 06:05:53 I wonder what that's about
Jun 16 06:06:52 ChangeSet@1.3190.1.341, 2005-06-15 10:27:59+02:00, schurig@mnz66.mn-solutions.de
Jun 16 06:06:52 classes/rootfs_ipk.bbclass:
Jun 16 06:06:52 allows to keep the Packages file
Jun 16 06:06:52 (my own python script to create Packages is way faster :-)
Jun 16 06:07:29 and later:
Jun 16 06:07:31 ChangeSet@1.3190.1.353, 2005-06-15 21:57:10+01:00, RP@tim.rpsys.net
Jun 16 06:07:32 rootfs_ipk: Make the shell script valid
Jun 16 06:07:42 eother of those could have broken it
Jun 16 06:08:12 or first broke it ans second fixed it
Jun 16 06:11:02 starting build to see if it's still broken
Jun 16 07:21:48 VoodooZ_Work, my build completed
Jun 16 07:22:01 do you have ChangeSet@1.3190.1.353 ?
Jun 16 07:37:01 <[g2]> jbowler-away, ping
Jun 16 07:42:49 jacques: Thanks. I guess it got fixed in this morning's pull.
Jun 16 07:44:12 Mine just finished successfully too.
Jun 16 07:46:13 <[g2]> VoodooZ_Work, is there a new fix in there ? I just pulled this moring
Jun 16 07:47:01 I don't understand. I didn't see anything remotely related in the pull but something fixed it.
Jun 16 07:54:23 VoodooZ_Work, it wasn't ChangeSet@1.3190.1.353 that fixed it?
Jun 16 07:54:53 not too sure. I didn't see that one I guess.
Jun 16 08:48:59 [g2] pong
Jun 16 08:49:24 <[g2]> jbowler, morning !
Jun 16 08:55:31 VoodooZ_Work jacques: I pulled last night, my build had both of the root_ipk.bbclass changes - no problems.
Jun 16 08:55:43 morning [g2]
Jun 16 08:56:18 <[g2]> jbowler, I had a question but 10:36 (1.5 hours seems like a long time ago ) :)
Jun 16 08:56:23 [g2]: it would be convenient for me too if the APEX config used (and maintained) the last block...
Jun 16 09:03:15 <[g2]> jbowler, is preserve in there now ?
Jun 16 09:03:36 <[g2]> I noticed a changeset that had some preserve stuff init
Jun 16 09:06:55 <[g2]> jbowler, I think my question from earlier was regarding some packages and uclibc compatibility, however I think they were already in there when I checked
Jun 16 09:07:26 "reflash" is in there - it preserves and restores the config. I haven't used upslug for days now ;-)
Jun 16 09:07:53 I could add a 'preserve to flash' option, but there is no need for it in a normal upgrade.
Jun 16 09:08:29 Still, if Rod wants to do unslung stuff he needs to preserve the whole of SysConf (at least), and ideally we should agree on somewhere to preserve it so that it is consistent.
Jun 16 09:08:48 <[g2]> nod.
Jun 16 09:09:19 <[g2]> so "reflash" overwrites the jffs2 partition ?
Jun 16 09:09:21 (Of course he could use reflash on unslung too, that might be better - no RedBoot upgrade ever...)
Jun 16 09:09:40 reflash: that's correct. It should work on APEX too.
Jun 16 09:09:59 <[g2]> does it just take the jffs2 rootfs file ?
Jun 16 09:10:05 What it has to be able to find is 'Kernel' and 'Flashdisk' in the output of 'cat /proc/mtd'
Jun 16 09:10:30 Input to reflash is a complete image (-i) or the zImage and the jffs2 rootfs.
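Editor's note: the requirement that reflash find 'Kernel' and 'Flashdisk' in the output of 'cat /proc/mtd' can be illustrated with a small sketch. This is not the reflash script itself. The function names and the sample partition table are hypothetical, and the sizes in the sample are made up for illustration.

```python
import re

# Hypothetical sample in the usual /proc/mtd layout. The sizes are
# illustrative only and are not taken from a real slug.
SAMPLE_PROC_MTD = '''dev:    size   erasesize  name
mtd0: 00040000 00020000 "RedBoot"
mtd1: 00020000 00020000 "SysConf"
mtd2: 00100000 00020000 "Kernel"
mtd3: 00660000 00020000 "Flashdisk"
'''

# One partition per line: device, size (hex), erase size (hex), quoted name.
MTD_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def parse_proc_mtd(text):
    """Return a dict mapping partition name -> (device, size, erasesize)."""
    parts = {}
    for line in text.splitlines():
        match = MTD_LINE.match(line.strip())
        if match:  # the "dev: size erasesize name" header does not match
            dev, size, erase, name = match.groups()
            parts[name] = (dev, int(size, 16), int(erase, 16))
    return parts


def missing_partitions(text, required=("Kernel", "Flashdisk")):
    """List the required partition names that are absent from the table."""
    parts = parse_proc_mtd(text)
    return [name for name in required if name not in parts]
```

A caller would pass in the contents of /proc/mtd and refuse to continue if `missing_partitions` returns a non-empty list.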
Jun 16 09:11:22 <[g2]> ok.. it'd be nice to accomadate the latest APEX changes
Jun 16 09:11:40 <[g2]> you know we can boot directly from the zImage in the jffs2 rootfs right ?
Jun 16 09:12:19 Yes
Jun 16 09:12:43 There are two issues at present: 1) How to detect APEX, 2) How to find the eth hardware id.
Jun 16 09:13:15 <[g2]> Ok... 2) is handled by already running openslug right ?
Jun 16 09:13:55 <[g2]> 1) would be a different switch or option like -j for the jffs2 partition only
Jun 16 09:14:32 (2) is handled if (1) you flash APEX yourself from the openslug command line or (2) you don't invalidate SysConf in /proc/mtd
Jun 16 09:15:09 <[g2]> initially, I don't think we need to lose the Sysconf partition
Jun 16 09:15:32 But if you flash a complete 8Byte image you loset the ethernet id.
Jun 16 09:16:32 <[g2]> several things
Jun 16 09:16:41 If you release an APEX which relies on SysConf then you may well be relying on it forever ;-)
Jun 16 09:17:06 <[g2]> APEX doesn't rely on anything
Jun 16 09:17:17 <[g2]> not even the partition table yet
Jun 16 09:17:41 <[g2]> back to "lost"
Jun 16 09:17:49 <[g2]> that's a semantic thing
Jun 16 09:18:06 <[g2]> first off, the MAC is on the box, case, and on the board inside
Jun 16 09:18:25 <[g2]> secondly, it's only "lost" of if don't save it off and restore it
Jun 16 09:18:53 <[g2]> thirdly, I'd like to be able to
Jun 16 09:19:04 <[g2]> a) fix the reboot / shutdown issues
Jun 16 09:19:17 <[g2]> b) possibly on warm boot leave information in memory
Jun 16 09:19:40 <[g2]> thought ?
Jun 16 09:20:25 (a) is the only thing between us and a release
Jun 16 09:20:40 * [g2] hugs jbowler
Jun 16 09:21:03 What does (b) give, I don't understand...
Jun 16 09:21:21 <[g2]> with be you'd "preserve" in memory
Jun 16 09:22:46 <[g2]> I don't know what the granularity is, but we could probably steal the top 4/8/16K of memory
Jun 16 09:23:12 preserve doesn't need to work across a boot - reflash loads a new jffs2 root then puts the config back then reboots
Jun 16 09:23:42 The only point to a preserve is to freeze information across a complete reflash.
Jun 16 09:24:03 <[g2]> which is why the saving to memory would be useful
Jun 16 09:24:07 I.e. where the whole kernel/rootfs stuff is rewritten externally.
Jun 16 09:24:24 <[g2]> right
Jun 16 09:25:09 I don't think we really care, except for the MAC - if someone does a complete reflash not from within openslug (or unslung) they want to reset everything ;-)
Jun 16 09:25:11 <[g2]> 1st, the user should have a copy of the first couple blocks saved off somewhere
Jun 16 09:25:33 <[g2]> right on the complete reflash
Jun 16 09:26:15 <[g2]> it might only be an issue in the not-to-distant future if we fully redo the partition structure
Jun 16 09:26:27 Hum. It's too long ago since I checked reflash in. I think it's fairly aggressive about getting the MAC back if it can't read the configuration.
Jun 16 09:27:19 <[g2]> for now, I'm only considering removing the kernel partition
Jun 16 09:27:31 <[g2]> and growing the jffs2 partition
Jun 16 09:28:11 I think if you remove SysConf on an APEX system you will find that you have little choice but to re-enter the MAC from the label :-)
Jun 16 09:29:09 The fact that your slug ends up called 'brokenslug' and that it has a fixed IP address and that the /etc/motd is very loud...
Jun 16 09:29:14 <[g2]> actually, I think once we start using the config area for APEX, APEX will copy the MAC into its config block
Jun 16 09:30:39 Yes, but openslug can't find it unless the config block is 80bytes at the end of the second erase block and has a sercomm signature at the end.
Jun 16 09:31:23 Search for 'APEX' in a recent /etc/init.d/sysconfsetup
Jun 16 09:32:40 I can't be sure the code in there works because I've got no safe (non-brick-making) way of doing a complete test. The individual parts work from the command line...
Jun 16 09:32:44 <[g2]> # APEX: may need extra code to set initmac here.
Jun 16 09:33:15 Yep, an elif cause for when /proc/mtd does not contain RedBoot
Jun 16 09:33:25 s/cause/clause/
Jun 16 09:34:10 The loud /etc/motd is at the end of /etc/init.d/sysconfsetup
Jun 16 09:34:40 (note British English spelling ;-)
Jun 16 09:34:52 <[g2]> ;)
Jun 16 09:35:03 is irq26 a release blocker?
Jun 16 09:35:13 (I'd say no)
Jun 16 09:35:29 I say no, I think it should be left in there no (not disabled).
Jun 16 09:35:39 but I'd like to hear more about the reboot/shutdown issues
Jun 16 09:36:20 is it the intermittent errors/hang when trying to reboot with a disk attached ?
Jun 16 09:36:20 <[g2]> jbowler, thx for pointing that out in the sysconfsetup
Jun 16 09:36:24 And fixing halt - the reboot problem is that the machine hangs, but if halt pulled the power the power button would work.
Jun 16 09:36:45 jbowler, is the reboot problem repeatable ?
Jun 16 09:36:51 <[g2]> for now it doesn't matter because we don't reflash the second block and we don't change the partition name :)
Jun 16 09:37:14 <[g2]> jacques, irq26 isn't a release blocker
Jun 16 09:37:49 <[g2]> we should (and I probably will) test with the noirqdebug kernel parameter
Jun 16 09:38:02 <[g2]> that will mask the problem
Jun 16 09:38:16 The reboot problem is repeatable on a disk system almost all the time.
Jun 16 09:38:31 <[g2]> shutdown -h and halt will be fixed by pulling the power via GPIO8
Jun 16 09:38:43 is it somehow related to the GPIO8 thing?
Jun 16 09:38:54 <[g2]> I don't think so
Jun 16 09:38:59 does linksys reboot do anything openslug's doesn't?
Jun 16 09:39:16 <[g2]> I think it's only related by the fact that you windup in the same code which is hanging
Jun 16 09:39:27 In my experience the 2.4.22-xfs kernel had a similar problem, symptoms were identical.
Jun 16 09:39:38 and it was fixed?
Jun 16 09:40:16 (well, obviously it was fixed) :-)
Jun 16 09:40:17 I don't know - I run that kernel with an old openslug root.
Jun 16 09:40:42 ooh, you mean linksys kernel has the issue with openslug rootfs
Jun 16 09:40:44 I don't think it was fixed - I think something that my openslug root does causes the bug to manifest.
Jun 16 09:40:59 jacques: correct
Jun 16 09:41:04 there really aren't any linksys reboot patches ?
Jun 16 09:41:34 <[g2]> jacques, linksys never released the busybox changes which it should have
Jun 16 09:41:44 what???
Jun 16 09:41:48 <[g2]> remember the violation for the SetLEDs ?
Jun 16 09:41:59 jeeez I didn't know that
Jun 16 09:42:12 <[g2]> we told andersee about it like 10 months ago
Jun 16 09:42:22 <[g2]> I've never followed up on what happened
Jun 16 09:42:25 what are we doing on unslung? using the linksys shutdown and reboot binaries?
Jun 16 09:42:37 <[g2]> NOD
Jun 16 09:42:38 jacques: my 2.4.22-xfs has a disk rootfs, standard linksys has a ramfs rootfs, and I think ramfs rootfs on openslug is fine!
Jun 16 09:43:05 I guess one test would be to try the linksys reboot in openslug
Jun 16 09:43:24 <[g2]> well the openslug reboot sometimes works
Jun 16 09:43:27 And unslung has used ramfs (fine), jffs2 (fine) and disk (?) rootfs
Jun 16 09:43:45 <[g2]> I'm guessing we are fully doing something right or looking at a bad place in memory
Jun 16 09:43:46 jbowler, oooh damn I see your point
Jun 16 09:44:03 unslung has never used a disk rootfs that I know of
Jun 16 09:44:17 What is 5.5 using?
Jun 16 09:44:57 <[g2]> the shutown -h will certainly point to cleaning up some errors
Jun 16 09:45:19 <[g2]> but cause you get bogus scsi devices found *after* the power down message
Jun 16 09:45:27 yeah
Jun 16 09:45:53 <[g2]> I've been poking around in the kernel looking for the exact code that's getting run
Jun 16 09:46:07 <[g2]> I haven't found it yet but I'm pretty darn close
Jun 16 09:47:29 There's also still a but in the eth driver, but I'm sure this is unrelated: [error] ixEthMiiPhyScan : unexpected Mii PHY ID 00008201
Jun 16 09:47:54 not really a bug from what I've been told
Jun 16 09:48:16 just that the particular PHY Linksys used is not listed in the driver
Jun 16 09:48:49 (which I guess is still a bug) :-)
Jun 16 09:49:02 but only cosmetic IIRC
Jun 16 09:49:05 The code seems to make some decisions based on the ID, so I was a little uncertain about whether it was safe.
Jun 16 09:49:27 Presumably LinkSys actually have a hacked version of the Intel driver...
Jun 16 09:49:46 I remember trying to track that down during the whole "is it really running in full duplex mode" debate
Jun 16 09:53:02 There's also the 'dropbear using /dev/urandom' problem. So that makes (1) reboot (2) halt (3) /dev/random starvation
Jun 16 09:53:46 ah yeah I was wondering what had happened with the whole lack of entropy issue just the other day
Jun 16 09:54:04 did we ever tell the entropy driver to use the NIC interrupts ?
Jun 16 09:54:07 <[g2]> well there is no entropy on the slug so (3) is a mute point
Jun 16 09:54:09 We need an entropy source. eth0 is now interrupt drive.
Jun 16 09:54:37 jacques: no, I think I tried to make that change and found the driver missed some interface.
Jun 16 09:55:01 <[g2]> right even with eth0 on my usual setup point to point no traffic = 0 entrophy
Jun 16 09:55:03 oh that reminds me, wht about the issue I brought up in the other channel a couple of days ago?
Jun 16 09:56:11 <[g2]> jacques, I couldn't remember why I ping'd jbowler an hour and a half ago, could you throw me a bone on your question ?
Jun 16 09:56:41 [g2], look in other channel
Jun 16 10:26:11 <[g2]> jacques, jbowler would you like to talk about the reset issue at all ?
Jun 16 10:27:08 The reboot/halt issue? Yes.
Jun 16 10:27:43 <[g2]> yeah
Jun 16 10:28:22 <[g2]> from best I can tell we are winding up near here http://lxr.linux.no/source/kernel/sys.c?a=arm#L390
Jun 16 10:29:30 <[g2]> which I think takes us to http://lxr.linux.no/source/arch/arm/kernel/process.c?a=arm#L141
Jun 16 10:31:08 Does it get to the printk?
Jun 16 10:31:28 <[g2]> the System restarting. ?
Jun 16 10:31:35 <[g2]> For SURE
Jun 16 10:31:44 The 'Reboot failed -- System halted'
Jun 16 10:33:04 <[g2]> no.. I never see that message
Jun 16 10:33:57 I think then that it probably gets to arch_reset, and that is in include/asm-arm/arch-ixp4xx/system.h
Jun 16 10:34:36 <[g2]> iirc that's where I was looking next i think
Jun 16 10:34:40 <[g2]> ebooting... [fatal] IXETHACC:ixEthAccPortDisable: ixEthAccPortDisable failed port 0 (state = 4)
Jun 16 10:34:40 <[g2]> ixp425_eth: eth0: BUG: ixEthAccPortDisable(0) failed
Jun 16 10:34:40 <[g2]> Restarting system.
Jun 16 10:35:02 <[g2]> when it fails that's the last I see and the amber light stops blinking
Jun 16 10:35:39 reboot_mode defaults to 'h' (for hard?)
Jun 16 10:35:51 <[g2]> I'm wondering what the reboot mode is ?
Jun 16 10:36:46 <[g2]> static char reboot_mode = 'h';
Jun 16 10:36:56 <[g2]> http://lxr.linux.no/source/arch/arm/kernel/process.c?a=arm#L117
Jun 16 10:37:49 reboot_mode controls what happens in arch_reset
Jun 16 10:38:06 <[g2]> I know that I meant what it was initialized to
Jun 16 10:39:04 You could try a printk after the setup_mm... - then we would know if it was getting to arch_reset
Jun 16 10:39:15 (But printk may not work at that point).
Jun 16 10:39:42 <[g2]> So you think we are winding up here (which I'd tend to agree with) http://lxr.linux.no/source/include/asm-arm/arch-ixp4xx/system.h?a=arm#L28
Jun 16 10:40:01 setup_mm_for_reboot doesn't use 'mode', but if reboot_mode is 'h' it would seem that it is not necessary to call it...
Jun 16 10:40:13 <[cc]smart> hi all...
Jun 16 10:40:22 hi [cc]smart
Jun 16 10:40:30 <[g2]> hi [cc]smart
Jun 16 10:40:40 [g2]: yes, I'm hoping, otherwise it is hanging in setup_mm_for_reboot.
Jun 16 10:40:40 <[cc]smart> afaik thje licensing for bk is going to be changed
Jun 16 10:41:00 <[cc]smart> will openslug do sth. in this respect ?
Jun 16 10:41:23 [cc]smart: follow what OE does
Jun 16 10:41:40 <[g2]> yeah you were too fast for me :)
Jun 16 10:41:47 <[cc]smart> which is ? undecided ?
Jun 16 10:42:14 <[g2]> dunno
Jun 16 10:42:26 [g2]: I think we want a hard reset, i.e. mode != 's', and that's probably what is happening.
Jun 16 10:42:37 <[g2]> we have a SVN backup which can be switch to in minutes
Jun 16 10:43:41 [g2]: hum, beewoolie made some comment before about the jump to 0 having potential problems, I wonder if we are doing 's'?
Jun 16 10:44:16 <[cc]smart> BTW: regarding rmrecovery i came to conclusion that inittab would be the better place, since the execution is generally independent from any runlevel considerations.
Jun 16 10:46:19 [cc]smart: I'm not sure what that means, inittab is a container, in some sense, for the rc scripts, but I don't fully understand the model being used (if, indeed, there is one...)
Jun 16 10:49:06 <[cc]smart> using /etc/rc*.d you always define a runlevel realtion. in inittab (or rc.local, which now that i mention it would be a good place too) et al you can have it runlevel independend
Jun 16 10:49:36 <[cc]smart> so there will be no question regarding for which runlevels to configure rmrecovery, which has no sensible answer
Jun 16 10:51:25 The answer is 'all user runlevels' - any runlevel which corresponds to a useable system. I.e. not rc6 or rc0
Jun 16 10:51:36 <[cc]smart> btw2, i just did bk citool and it mumbles sth. about openlogging.org which i've got no clue about...
Jun 16 10:52:12 That's the license agreement (it was in the license when you downloaded bitkeeper)
Jun 16 10:52:37 <[cc]smart> his repository is configured to send changeset summaries to ogging@openlogging.org
Jun 16 10:52:50 <[cc]smart> We need to get an OK from each user that it is OK to publish the
Jun 16 10:52:50 <[cc]smart> change logs.
Jun 16 10:52:58 <[cc]smart> seems not licensing...
Jun 16 10:53:37 <[cc]smart> ah denying is always right
Jun 16 10:53:53 Denying it is always wrong.
Jun 16 10:54:08 <[cc]smart> ha, they tell me without accepting i'd have to buy it.
Jun 16 10:54:11 That's not the license we have with BitMover, or you.
Jun 16 10:54:32 <[cc]smart> bk just got the kick from me
Jun 16 10:54:39 You should read the license, it's quite explicit.
Jun 16 10:55:07 <[cc]smart> i don't need anything from them
Jun 16 10:55:34 Eh, bitkeeper?
Jun 16 10:56:13 <[cc]smart> i just forgot about this... what's it called ? bitkeeper ?
Jun 16 10:59:55 [g2]: reboot_mode can be set by 'reboot=h' or 'reboot=s' on the kernel command line
Jun 16 11:00:11 <[g2]> jbowler, I'm thinking the following
Jun 16 11:00:26 <[g2]> in the 's' portion
Jun 16 11:03:36 (I believe halt doesn't work because pm_power_off is not set)
Jun 16 11:05:17 <[g2]> *IXP4XX_GPIO_GPOUTR = *IXP4XX_GPIO_GPOUTR & 0xFFFFFFFB;
Jun 16 11:05:22 <[g2]> and in the other case
Jun 16 11:05:27 <[g2]> *IXP4XX_GPIO_GPOUTR = *IXP4XX_GPIO_GPOUTR & 0xFFFFFFF7;
Jun 16 11:06:03 <[g2]> that would turn on Disk1 for 's' and Disk2 for the other correct ?
Jun 16 11:06:48 I have a feeling that it may be necessary to set an enable register first, let me check...
Jun 16 11:06:59 <[g2]> there already setup
Jun 16 11:07:07 <[g2]> they're
Jun 16 11:08:09 Yep, that's all n2lm_ledon does - looks good.
Jun 16 11:17:39 <[g2]> jbowler, Ok kernel is built
Jun 16 11:18:14 <[g2]> I can just ipkg upgrade kernel ?
Jun 16 11:18:53 No: I still don't know how to make that work (chicken and egg problem with the rootfs).
Jun 16 11:19:10 I scp'ed it then used devio.
Jun 16 11:19:44 Oh, you're on APEX - yes. ipkg upgrade should work if you bumped the PR.
Jun 16 11:19:48 <[g2]> actually I think I can just ipkg upgrade the kernel then copy it to my /initrd
Jun 16 11:20:13 <[g2]> ok ipkg install kernel :)
Jun 16 11:20:33 I'm running the mod'ed kernel - I didn't see either LED on a succesful (jffs2 rootfs) reboot.
Jun 16 11:21:05 <[g2]> interesting :)
Jun 16 11:21:11 /initrd: yes, you need to copy the zImage there (if you don't have /dev/mtdblock? as the rootfs)
Jun 16 11:21:22 I suspect it was just too fast.
Jun 16 11:22:08 I'm growing a turnip now.
Jun 16 11:24:04 <[g2]> actually, I think we could just have dest in ipkg point to /initrd and the do the update right ?
Jun 16 11:25:00 I think so.
Jun 16 11:25:48 Hum, I don't get either disk light, but I think, given that I didn't see anything on a successful reboot, it may be a problem in setting the leds.
Jun 16 11:25:58 I'll put in an mdelay.
Jun 16 11:26:31 Can you change the kernel command line from APEX?
Jun 16 11:26:39 <[g2]> yes :)
Jun 16 11:26:50 <[g2]> I'm dd'ing to the old kernel space now :)
Jun 16 11:27:15 <[g2]> or were you thinking the command line parameter
Jun 16 11:27:17 Try 'reboot=s' if the LEDs don't work (I short-circuited all the bitbake stuff so maybe I didn't build the right file.)
Jun 16 11:27:49 <[g2]> Ok.. I'll try that after I boot this test kernel
Jun 16 11:29:08 <[g2]> Ok it just hung
Jun 16 11:29:19 <[g2]> ixp425_eth: eth0: BUG: ixEthAccPortDisable(0) failed
Jun 16 11:29:19 <[g2]> Restarting system.
Jun 16 11:31:06 Ah, but we wouldn't see the LED without an mdelay, because something that happens does turn the LEDs off (cause the hung system has no LEDs on).
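Editor's note: the 'reboot=h' / 'reboot=s' switch discussed above can be modelled in a few lines. This is a sketch, not the kernel code. It assumes the mode is simply the first character after 'reboot=' and that 'h' is used when the parameter is absent, as the quoted `static char reboot_mode = 'h';` suggests. The function name is hypothetical.

```python
# Default taken from the `static char reboot_mode = 'h';` line quoted in the log.
DEFAULT_REBOOT_MODE = "h"


def reboot_mode_from_cmdline(cmdline, default=DEFAULT_REBOOT_MODE):
    """Return the reboot mode character selected by a kernel command line.

    Assumption for this sketch: only the first character of the value
    matters, and an empty 'reboot=' is ignored.
    """
    prefix = "reboot="
    mode = default
    for token in cmdline.split():
        if token.startswith(prefix) and len(token) > len(prefix):
            mode = token[len(prefix)]  # first character of the value
    return mode


def is_soft_reset(cmdline):
    """True when the command line selects the 's' path rather than the hard reset."""
    return reboot_mode_from_cmdline(cmdline) == "s"
```

With no 'reboot=' token the function returns 'h', which corresponds to the hard reset path that jbowler expects to be the default.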
Jun 16 11:31:20 <[g2]> pretty nifty
Jun 16 11:31:46 <[g2]> I forgot to copy the header but when I loaded the kernel I just started from the proper addresss
Jun 16 11:33:35 <[g2]> darn the reboot worked
Jun 16 11:33:52 nothing fails when you want it to ;-)
Jun 16 11:37:14 <[g2]> Kernel command line: console=ttyS0,115200 root=/dev/mtdblock4 rw rootfstype=jffs2 init=/linuxrc reboot=s
Jun 16 11:38:26 <[g2]> locked up but slightly differently
Jun 16 11:38:36 <[g2]> the Ready/Status is amber
Jun 16 11:38:43 Ah ha!
Jun 16 11:38:46 <[g2]> usually it's out
Jun 16 11:39:18 Right, so something in the mode!='s' case is killing the LEDs. I'm about to confirm because I have the mdelay in now.
Jun 16 11:40:35 <[g2]> could be a real hw reset maybe :)
Jun 16 11:42:06 <[g2]> Kernel command line: console=ttyS0,115200 root=/dev/mtdblock4 rw rootfstype=jffs2 init=/linuxrc reboot=h
Jun 16 11:46:42 <[g2]> with /initrd mounted and reboot=h the system locked up
Jun 16 11:47:10 <[g2]> which lends support to beewoolie's theory that the flash isn't in a happy state
Jun 16 11:48:18 Disk1 LED comes on (default command line etc). All LEDs go off after the mdelay (1s).
Jun 16 11:49:22 I.e. the bottom LED comes on.
Jun 16 11:50:30 <[g2]> DISK1 ?
Jun 16 11:51:24 Weird. I'm going to double check this with a different LED...
Jun 16 11:51:37 <[g2]> Right so that's the hard reboot with the watch dog
Jun 16 11:52:04 <[g2]> The FFFF7 code
Jun 16 11:52:11 <[g2]> GPIO3
Jun 16 11:52:50 No - it's 's'
Jun 16 11:53:43 <[g2]> so you code is switched from mine then
Jun 16 11:54:32 <[g2]> what value are you writing in the 's' path ?
Jun 16 11:55:27 The FB one, but I just changed it to write F3.
Jun 16 11:56:35 <[g2]> well the FB should set GPIO2 low which is DISK 2
Jun 16 11:57:23 <[g2]> I wonder if we need to tristate the GPIO8 first
Jun 16 11:59:50 The comments in nslu2-io.c say: "IXP4XX_GPIO_GPOUTR &= DISK1_ON; //0xfffffffb"
Jun 16 12:00:15 Aaargh, the comments are wrong, DISK1_ON is F7...
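Editor's note: the FB / F7 / F3 confusion above comes down to which bit each AND-mask clears. The sketch below redoes that arithmetic. It follows the reading in the log that GPIO2 is Disk 2, that GPIO3 is Disk 1 (DISK1_ON is F7), and that an LED is lit by driving its line low. The helper names are hypothetical and nothing here touches real hardware.

```python
MASK32 = 0xFFFFFFFF  # the GPIO output register is treated as 32 bits wide


def clear_bit_mask(bit):
    """AND-mask that forces one bit to 0 and leaves the others unchanged."""
    return MASK32 & ~(1 << bit)


def cleared_bits(mask):
    """List the bit positions that a 32-bit AND-mask forces to 0."""
    return [b for b in range(32) if not (mask >> b) & 1]


def drive_low(register_value, bit):
    """Model of `reg = reg & mask`: returns the new register value."""
    return register_value & clear_bit_mask(bit)


# Per the log: GPIO2 is Disk 2 and GPIO3 is Disk 1.
DISK2_ON = clear_bit_mask(2)  # 0xFFFFFFFB
DISK1_ON = clear_bit_mask(3)  # 0xFFFFFFF7
```

The F3 value jbowler switched to clears bits 2 and 3 together, so under this reading it would light both disk LEDs at once.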
Jun 16 12:00:23 <[g2]> http://www.nslu2-linux.org/wiki/Info/GPIOConnections
Jun 16 12:01:12 <[g2]> well it was a 50/50 shot :)
Jun 16 12:01:41 Fixed the comments...
Jun 16 12:02:46 <[g2]> jbowler, so the point being you're winding up in the soft reset
Jun 16 12:03:27 No - I used the comments. I'm in hard reset, I just confirmed it.
Jun 16 12:03:39 <[g2]> OK
Jun 16 12:03:43 So one or more of those three IXP4XX_ lines is wrong ;-)
Jun 16 12:04:10 Or maybe some extra setup is required.
Jun 16 12:04:35 <[g2]> the only setup is writing to the output enable register
Jun 16 12:05:03 <[g2]> cat icache.S
Jun 16 12:05:03 <[g2]> .section text
Jun 16 12:05:03 <[g2]> .align 0
Jun 16 12:05:03 <[g2]> init:
Jun 16 12:05:03 <[g2]> mov r1, #0xC8000000 @ Setup the GPIO address in R1 0xC8004000
Jun 16 12:05:04 <[g2]> mov r2, #0x4000 @ Load 0x40..
Jun 16 12:05:06 <[g2]> orr r2, r2, r1 @ R2 should now contain 0xC8004000
Jun 16 12:05:08 <[g2]> mov r0, #0xff00 @ We forgot to invert last time :)
Jun 16 12:05:10 <[g2]> orr r0, r0, #0xf0
Jun 16 12:05:12 <[g2]> str r0, [r2,#4] @ Store it a GPIO_ER
Jun 16 12:05:14 <[g2]> mov r0, #0x0003 @ Light up GPIOs 2-3 (DISK 1 and 2)
Jun 16 12:05:16 <[g2]> str r0, [r2,#0] @ Set them
Jun 16 12:05:18 <[g2]> loop:
Jun 16 12:05:20 <[g2]> b loop @ Loop forever
Jun 16 12:05:33 <[g2]> that's the code we ran from both Redboot and directly loading it into the mini I cache
Jun 16 12:06:40 <[g2]> I ran it in Redboot and ep1220 ran it by loading dir mini icache lines directly
Jun 16 12:07:41 So that doesn't do anything to the watchdog timer.
Jun 16 12:08:01 (I assume that is what 'WDT' is)
Jun 16 12:08:12 <[g2]> yes WDT is the watch dog timer
Jun 16 12:08:44 <[g2]> right that code just verified the loading of the mini i cache visually by changing the LEDs
Jun 16 12:09:09 Clearly the IXP4XX_OSW[KTE] writes are doing something because the LEDs go off...
Jun 16 12:09:18 <[g2]> nod.
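Editor's note: the constants in the icache.S listing above can be checked by hand. The sketch repeats the mov/orr arithmetic and lists which of the low bits end up as 0. Reading a 0 bit in GPIO_ER as "output enabled" is an assumption here, based on the "forgot to invert" comment in the listing. The variable and function names are hypothetical.

```python
def zero_bits(value, width):
    """Bit positions (below `width`) that are 0 in `value`."""
    return [b for b in range(width) if not (value >> b) & 1]


# mov r1, #0xC8000000 ; mov r2, #0x4000 ; orr r2, r2, r1
gpio_base = 0x4000 | 0xC8000000

# mov r0, #0xff00 ; orr r0, r0, #0xf0 ; str r0, [r2,#4]
gpio_er_address = gpio_base + 4
gpio_er_value = 0xFF00 | 0xF0

# mov r0, #0x0003 ; str r0, [r2,#0]
gpio_out_address = gpio_base + 0
gpio_out_value = 0x0003

# Assumption: a 0 bit in GPIO_ER enables the output driver for that line.
output_lines = zero_bits(gpio_er_value, 16)

# Among GPIO0-3, the lines written as 0 are the ones driven low (LED lit).
lit_lines = zero_bits(gpio_out_value, 4)
```

Each constant is built in two steps, presumably because an ARM data-processing immediate is limited to an 8-bit value with a rotation, so neither 0xC8004000 nor 0xFFF0 fits in a single mov.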
Jun 16 12:09:33 <[g2]> I think we are actually reseting but then becomming unhappy
Jun 16 12:10:28 <[g2]> which fits with bee's theory
Jun 16 12:10:42 <[g2]> the processor is reset, but can't pull the instructions from flash
Jun 16 12:10:57 <[g2]> cause it's in an unhappy place
Jun 16 12:11:45 <[g2]> I think APEX toggles the LEDs pretty early and I don't see any
Jun 16 12:12:43 The kernel is at address 0, that needs to be removed so that the flash is at address 0, right?
Jun 16 12:13:42 <[g2]> I don't know what that status of the chip is after the dog bites
Jun 16 12:13:47 Um, no, this is beyond my experience - I've never really looked at bootstrap code.
Jun 16 12:14:12 <[g2]> I've written some boot strap code
Jun 16 12:14:39 <[g2]> I'll have to check out the reference manual
Jun 16 12:15:20 <[g2]> it might be worth looking at the memory setup. We may need to flush the caches or disable something before we reboot
Jun 16 12:15:37 It may be more than that because of the other buses in there - i.e. it's more than just the CPU code and MMU.
Jun 16 12:15:44 s/code/core/
Jun 16 12:16:09 <[g2]> I'm not worried about the other buses
Jun 16 12:16:16 <[g2]> APEX initializes those
Jun 16 12:16:40 <[g2]> we had a PCI bus init issue which originally caused the USB and Ethernet not to work
Jun 16 12:17:21 Then maybe it isn't even executing anything in RedBoot/APEX?
Jun 16 12:17:24 <[g2]> If we get to the code on the other end of the dog biting we'll be able to trace down what's happending
Jun 16 12:17:43 <[g2]> jbowler, nod. That's the theory beewoolie has
Jun 16 12:18:06 <[g2]> he's thought is that the flash is in a bad state and the reads aren't happening
Jun 16 12:18:20 <[g2]> I'm wondering what state the MMU and caches are in
Jun 16 12:19:29 Presumably not completely reset...
Jun 16 12:19:29 <[g2]> Ideally, someone should just have a logic analyzer hooked up to the bus
Jun 16 12:19:35 <[g2]> right
Jun 16 12:20:02 <[g2]> then we could see what's happending
Jun 16 12:20:55 <[g2]> JTAG debugging would be helpful also
Jun 16 12:21:40 ...this would seem to be a list of things neither of us have...
Jun 16 12:22:04 <[g2]> that's OK
Jun 16 12:22:24 <[g2]> we've got our minds which are actually much better than a logic analyzer :)
Jun 16 12:22:41 <[g2]> I just ping'ed Tiersten in openJTAG
Jun 16 12:22:57 <[g2]> I think he's got both the logic analyzer and JTAG
Jun 16 12:23:22 <[g2]> I think if he's running openslug he could just JTAG in can see the state of the CPU and probably the trace buffer
Jun 16 12:23:51 * [g2] has faith
Jun 16 12:25:35 It certainly looks like it is just a simple piece of missing initialisation.
Jun 16 12:25:43 <[g2]> nod.
Jun 16 12:26:02 <[g2]> it's just finding the right piece :)
Jun 16 12:26:25 <[g2]> jbowler, on a totally separate note: are you interested in the IXP425 at all ?
Jun 16 12:27:55 I don't think so - as I remember it the extra stuff it has is pretty much just Intel complexity. Using it means either using Intel proprietary code or doing a whole lot of low level programming...
Jun 16 12:28:19 <[g2]> ok.
Jun 16 12:28:21 <[g2]> THX
Jun 16 12:30:04 Where's the watchdog documented? I'm looking at the XScale core manual, but there doesn't seem to be anything there...
Jun 16 12:31:48 <[g2]> I'd guess in here 25248005.pdf that the 03/2005 Developer's Manual
Jun 16 12:33:14 <[g2]> Page 411
Jun 16 12:41:13 Well the watchdog is pretty much useless if when it fires it does not reliably reset the system regardless of state...
Jun 16 12:41:42 <[g2]> I'm thinking the following:
Jun 16 12:41:56 <[g2]> 1) The watchdog isn't really firing
Jun 16 12:42:24 <[g2]> 2) Due to the Linksys design with the power circuit there an issue there
Jun 16 12:42:46 <[g2]> 3) There may be eratta on the chip which we haven't run across
Jun 16 12:43:03 <[g2]> 4) could be something else entirely :)
Jun 16 12:43:26 <[g2]> looks like the chip is reset to it's default state
Jun 16 12:43:44 <[g2]> the only difference is that they warm start bit is set
Jun 16 12:44:09 <[g2]> "warm reset bit" that is
Jun 16 12:45:28 Yes, that's how it reads to me too. Still if RedBoot had a bug in the warm reset handling I wouldn't expect APEX to have the same bug...
Jun 16 12:45:46 <[g2]> right and APEX works some of the time
Jun 16 12:46:06 <[g2]> but it's the times that it doesn't work that are the issue :)
Jun 16 12:47:10 And RedBoot works some of the time...
Jun 16 12:48:06 <[g2]> right so it's not a "hard" failure
Jun 16 12:48:50 <[g2]> jbowler, THX for all the help
Jun 16 12:49:09 <[g2]> I've got to run out for a while but it's been good getting a lot closer to the issue
Jun 16 12:49:53 Me too, meanwhile I'm trying setting the watchdog timer to >0.
Jun 16 12:50:07 <[g2]> Ok
Jun 16 12:50:10 <[g2]> GL
Jun 16 12:50:14 <[g2]> see you later
Jun 16 12:50:19 <[g2]> cheers
Jun 16 14:55:07 [g2]: setting the watchdog timer to 0.5s shows that the code gets past the rights fine then hangs when the watchdog fires.
Jun 16 14:55:45 <[g2]> jbowler, ok that's good input
**** ENDING LOGGING AT Thu Jun 16 23:59:57 2005