**** BEGIN LOGGING AT Thu Jan 10 03:00:02 2019
Jan 10 06:02:25 morning
Jan 10 06:02:35 jow: did you take a look at https://patchwork.ozlabs.org/patch/1018435/ ?
Jan 10 06:02:44 does it look ok for you? I'd like to push it
Jan 10 08:21:58 Hi can anyone tell me if this bug in openwrt is fixed?
Jan 10 08:22:02 https://bugs.openwrt.org/index.php?do=details&task_id=1810
Jan 10 08:29:04 Tapper: no, otherwise it would be closed :)
Jan 10 08:29:26 Tapper: it seems no developer has this hardware so it will likely remain broken
Jan 10 09:08:59 rmilecki: patch looks good
Jan 10 09:09:43 jow: thank you
Jan 10 09:39:31 Tapper: it seems to be fixed if you use snapshots, but it's broken in the latest 18.06.1 release
Jan 10 09:52:10 ynezz ok mate thanks I will build a snapshot for my friend.
Jan 10 09:53:31 you can download it
Jan 10 09:53:59 http://downloads.openwrt.org/snapshots/targets/kirkwood/generic/
Jan 10 09:55:24 but it would be nice if you could build image from this branch https://git.openwrt.org/?p=openwrt/staging/jow.git;a=shortlog;h=refs/heads/backport-18.06 and test it, then update the FS#1810 ticket with the results
Jan 10 10:06:00 kab-el: a driver for those cards is being worked on
Jan 10 10:12:23 rmilecki: any idea about https://bugs.openwrt.org/index.php?do=details&task_id=1926#comment5898 ?
Jan 10 10:15:10 jow: I don't understand that part of the report:
Jan 10 10:15:12 "The kernel states that the ‘factory’ partition starts at 0x2e00000 (that’s correct), but in reality OpenWrt will search for the partition at 0x2e20000 (2e00000 + (1 * 128KiB))."
Jan 10 10:15:22 if kernel has a proper partition created...
Jan 10 10:15:38 what exactly OpenWrt is doing wrong then?
Jan 10 10:16:17 no idea, to me it sounds as if people expect the factory partition to always start at 0x2e00000 even if there happens to be a few bad blocks there
Jan 10 10:16:27 let me ask
Jan 10 10:16:31 in the report
Jan 10 10:16:36 possibly important because uboot or something expects fixed locations
Jan 10 10:18:55 that CONFIG_APPEND stuff looks really handy
Jan 10 10:19:12 rmilecki: the issue is that reading the first blocks of mtd4 (/dev/mtdblock4 I guess) does not return the contents at 0x2e00000 but from 0x2e20000
Jan 10 10:19:45 OK ynezz thanks. I will build for him because he's new to openwrt and I would have to talk him through installing packages anyway. :-)
Jan 10 10:20:15 jow: i'm totally unaware of mtd doing any hacks like that (bumping read offset)
Jan 10 10:20:28 I am not sure how to change branches. I build form snapshot.
Jan 10 10:20:35 from*
Jan 10 10:20:37 ynezz: Tapper: someone reported via the bugs ml "Mind not to forget to install the gpio module, because that's needed to enter failsave boot; without the reset button does nothing on boot."
Jan 10 10:21:02 apparently some gpio module is not in the default package selection for the EA3500
Jan 10 10:21:05 no idea which one
Jan 10 10:21:31 Ok I will have a look see what I can find out.
Jan 10 10:21:36 thanks both
Jan 10 10:24:20 rmilecki: the linked wiki page has some more details.
apparently in the case of bad blocks, reading the mac fails with "mt76x2e 0000:01:00.0: EEPROM data check failed: ffff"
Jan 10 10:24:26 rmilecki: the device dts in turn has mediatek,mtd-eeprom = <&factory 0x8000>;
Jan 10 10:24:30 rmilecki: jow: from a first glance it might be that the mt7621 nand driver "hides" bad blocks and pretends all the good blocks are one contiguous block, causing the misaligned offsets
Jan 10 10:24:31 jow: i saw that too
Jan 10 10:24:36 jow: i'm reading forum thread as well
Jan 10 10:25:04 KanjiMonster_: okay, that would be serious I guess
Jan 10 10:28:14 does u-boot allow dumping flash content easily for a selected offset + size?
Jan 10 10:31:11 KanjiMonster_: there is this code: https://pastebin.com/eGFu6kCm
Jan 10 10:31:19 do you mean something like that?
Jan 10 10:31:36 5.9.2.5. md - memory display
Jan 10 10:33:46 jow: probably, it was last year when I looked at the driver. I also haven't verified that this happens, it's just a suspicion from that code part IIRC (the driver is quite complex, and I don't have any mt7621 devices)
Jan 10 10:34:04 omg
Jan 10 10:34:13 I'm not experienced with kernel mtd code, the driver is over my head
Jan 10 10:34:14 that driver really includes some crazy blocks mapping
Jan 10 10:35:11 rmilecki: I guess the underlying question is - how are things like <&mtdpart 0x8000> dts references supposed to behave when there happens to be a bad block
Jan 10 10:35:37 I guess the intention for these pointers is to be logical addresses
Jan 10 10:37:57 those are raw on-flash addresses, bad blocks need to be handled by the consumer
Jan 10 10:38:55 ah wait in that case ... dunno. probably depends on what's behind mtdpart
Jan 10 10:40:40 how do I find the C code dealing with a particular dts property?
Jan 10 10:40:44 such an access results in using mtd_read I believe
Jan 10 10:40:49 mtd_read skips bad blocks
Jan 10 10:41:17 (i believe)
Jan 10 10:41:24 grep -r "mtd-eeprom" did not yield any results in a linux-4.14 tree
Jan 10 10:41:46 jow: what's the property named?
Jan 10 10:41:53 mediatek,mtd-eeprom = <&factory 0x8000>
Jan 10 10:42:40 that's likely within the driver for device (the mediatek, part implies its driver proprietary)
Jan 10 10:43:59 when I grep this over our target/linux/ I only find references in *.dts files and one single match in target/linux/ramips/base-files/etc/hotplug.d/firmware/10-rt2x00-eeprom
Jan 10 10:44:22 which is an echo statement rt2x00_eeprom_die "Please define mtd-eeprom in $board DTS file!"
Jan 10 10:45:25 apparently it translates to "soc_wmac.eeprom" somehow
Jan 10 10:47:19 package/kernel/mac80211/patches/rt2x00/604-rt2x00-load-eeprom-on-SoC-from-a-mtd-device-defines-.patch
Jan 10 10:47:20 jow: https://github.com/openwrt/mt76/blob/master/eeprom.c#L40
Jan 10 10:48:07 rmilecki: you found ralink,mtd-eeprom, not mediatek,mtd-eeprom ;p
Jan 10 10:48:19 right
Jan 10 10:48:29 same code
Jan 10 10:48:30 prolly the same code
Jan 10 10:49:25 ok, so it uses mtd_read() to read the offset specified in dts
Jan 10 10:50:32 right and mtd_read() uses ->_read() or ->_read_oob()
Jan 10 10:50:55 and those pointers most likely use functions from nand_base.c
Jan 10 10:52:42 nand_base.c: mtd->_read_oob = nand_read_oob;
Jan 10 10:55:39 ok, so it seems mtd_read() doesn't skip bad blocks
Jan 10 10:55:40 my bad
Jan 10 10:55:44 i suggested that before
Jan 10 10:55:54 so maybe we just need to skip the block on our own?
Jan 10 10:56:00 let me look at the reports again
Jan 10 10:56:18 we *may* need something like target/linux/generic/pending-4.19/431-mtd-bcm47xxpart-check-for-bad-blocks-when-calculatin.patch
Jan 10 10:57:13 nah
Jan 10 10:57:37 so we do need to handle bad blocks when dealing with "mtd-eeprom" DT property
Jan 10 10:57:47 is mtd_read(), on a high level, supposed to transparently return remapped blocks in case of bad blocks?
Jan 10 10:57:50 but it's important only when factory partition contains bad blocks
Jan 10 10:58:26 jow: no, mtd_read() won't handle bad blocks magically
Jan 10 10:58:31 there is no mapping table in the kernel
Jan 10 10:58:37 UBI has blocks mapping
Jan 10 10:58:52 okay
Jan 10 10:58:52 that's why you need to be aware of bad blocks when using mtd_read() & NAND
Jan 10 10:59:03 anyway, I still believe the real problem is NAND driver
Jan 10 10:59:11 and why has the mt76 nand driver internal bad block logic?
Jan 10 10:59:27 no idea at all
Jan 10 10:59:30 is it some kind of additional feature to allow use with non-ubifs filesystems?
Jan 10 10:59:38 it's a big pile of hacks after all
Jan 10 11:00:14 okay so to summarize, the issue appears to be mediatek,mtd-eeprom and not some general mtd layer problem in openwrt
Jan 10 11:00:50 mediatek,mtd-eeprom is a local "hack" in mt76 which uses naive mtd_read() without bad block error handling
Jan 10 11:00:50 no, mediatek,mtd-eeprom is OK
Jan 10 11:01:23 it is?
Jan 10 11:01:29 mt76_get_of_eeprom() has to be improved, but it only matters if partitions pointed by mediatek,mtd-eeprom has a bad block
Jan 10 11:01:40 this case is different
Jan 10 11:01:44 which is the point of the ticket
Jan 10 11:01:48 reading "Factory" partition is broken
Jan 10 11:01:53 because of the flash driver
Jan 10 11:02:19 I do not understand the logic yet
Jan 10 11:02:39 sec
Jan 10 11:02:40 <&factory 0x8000> -> read byte offset 0x8000 relative to start of factory mtd partition
Jan 10 11:02:51 right
Jan 10 11:02:54 0x000002e00000-0x000002f00000 : "factory"
Jan 10 11:03:10 so 0x000002e08000
Jan 10 11:03:12 so that should result in mtd_read(0x2e00000 + 0x8000)
Jan 10 11:03:33 what happens when the block at 0x2e00000 is broken
Jan 10 11:03:38 sorry, bad
Jan 10 11:03:42 the problem is that for mtd_read(0x2e00000 + 0x8000) NAND drivers returns a content of mtd_read(0x2e00000 + 0x8000 + bad blocks before the Factory)
Jan 10 11:04:01 jow: look at this:
Jan 10 11:04:02 [ 2.969549] Bad eraseblock 266 at 0x000002140000
Jan 10 11:04:15 the bad block is totally out of the "factory" partition
Jan 10 11:04:38 ok, the kernel shifts the factory partition if there happens to be a bad block before it?
Jan 10 11:04:46 no, the partition always stays fixed
Jan 10 11:04:47 no
Jan 10 11:04:54 fixed-partitions means really fixed
Jan 10 11:05:10 right, otherwise stuff like on-flash eeprom data might make no sense
Jan 10 11:05:29 right
Jan 10 11:05:40 okay and if the fixed on-flash eeprom has a bad block in it then it is also supposed to simply fail
Jan 10 11:05:49 at least according to the current impl
Jan 10 11:05:58 right
Jan 10 11:06:07 for now i'd suggest ignoring that
Jan 10 11:06:24 as we have more serious problem at the more base level
Jan 10 11:06:39 ok so the working theory is that the mt76 nand driver has some sort of built in convenience bad block handling
Jan 10 11:06:39 NAND driver shifts all reads that are past the bad block
Jan 10 11:06:49 and that breaks the whole idea of fixed partitions
Jan 10 11:06:50 which we actually do not want
Jan 10 11:06:59 correct
Jan 10 11:07:03 not even in the fixed partition case but in general
Jan 10 11:07:10 right
Jan 10 11:07:14 we want ubi etc. and the other standard subsystems take care of that
Jan 10 11:07:22 yes
Jan 10 11:11:10 btw. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/net/wireless/mediatek,mt76.txt
Jan 10 11:11:51 thanks!
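The working theory above (the NAND driver presents the good blocks as one contiguous range, so every read past a bad block is shifted) can be checked with a little arithmetic. The following is only a rough model of that suspected behaviour, not code from the driver: `shifted_address` and the constant names are hypothetical, while the numbers (128 KiB erase blocks, the "factory" partition at 0x2e00000, the 0x8000 EEPROM offset, bad eraseblock 266) are the ones quoted in the log.

```python
ERASE_BLOCK = 128 * 1024    # 128 KiB erase block, as quoted from the bug report
FACTORY_START = 0x2E00000   # start of the "factory" partition per the kernel log
EEPROM_OFFSET = 0x8000      # from mediatek,mtd-eeprom = <&factory 0x8000>
BAD_BLOCKS = [266]          # "Bad eraseblock 266 at 0x000002140000"

# Address that a truly fixed partition layout says should be read.
EXPECTED = FACTORY_START + EEPROM_OFFSET


def shifted_address(addr, bad_blocks, block_size=ERASE_BLOCK):
    """Hypothetical model of the suspected remapping.

    Logical block n is served from the n-th good physical block, so every
    bad block below the requested address pushes the read one block further.
    """
    logical_block, offset = divmod(addr, block_size)
    bad = set(bad_blocks)
    physical_block = -1
    good_seen = -1
    # Walk physical blocks until we have passed logical_block + 1 good ones.
    while good_seen < logical_block:
        physical_block += 1
        if physical_block not in bad:
            good_seen += 1
    return physical_block * block_size + offset


if __name__ == "__main__":
    print(hex(EXPECTED))
    print(hex(shifted_address(EXPECTED, BAD_BLOCKS)))
```

Under this model the read for 0x2e08000 is served from 0x2e28000, exactly one erase block later, which is consistent with the 0x2e20000 partition start described in the bug report even though block 266 lies far outside the "factory" partition.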
Jan 10 11:12:03 I wonder if it is as simple as disabling CONFIG_MTK_NAND_BMT
Jan 10 11:12:07 blogic: ping
Jan 10 11:15:00 rmilecki: BMT seems to be some mediatek specific high level bad block remapping logic
Jan 10 11:15:15 it reserves an area of the nand to maintain some remapping table there
Jan 10 11:15:40 and a lot of io functions have #ifdef MTK_NAND_BMT #else #endif guards
Jan 10 11:15:51 if we could throw that out, the driver would shrink a lot
Jan 10 11:16:41 jow: hey
Jan 10 11:17:29 BMT is the bad block translation crap
Jan 10 11:17:42 blogic: I think we can throw that out completely
Jan 10 11:17:51 they have their own layer thus making the nand appear as a nor and you can do raw access without intermediate ubi
Jan 10 11:17:56 probably
Jan 10 11:18:05 i will abandon the driver when we move to v4.19
Jan 10 11:18:46 https://pastebin.com/iAjEGPj4
Jan 10 11:19:03 seems to be the non-BMT case would be what the usual linux subsystems expect
Jan 10 11:19:24 jow: correct
Jan 10 11:19:27 blogic: I see. Do we even need "access like nor" in OpenWrt presently?
Jan 10 11:19:37 no idea
Jan 10 11:19:48 iirc all of ramips/mt76xx should be nand/ubi foo
Jan 10 11:19:50 there is only 1 board that uses the driver and i dont own it
Jan 10 11:19:54 nuke it from orbit
Jan 10 11:19:59 hm
Jan 10 11:20:37 sounds in general like a bad idea, and seems to rely on the wrong assumption that good blocks can't go bad later
Jan 10 11:20:43 that board does not happen to be Netgear R6220 ?
Jan 10 11:20:47 KanjiMonster_: correct
Jan 10 11:20:58 and xiaomi i think
Jan 10 11:21:14 any mt7621 I suppose
Jan 10 11:21:29 (judging from the patch name target/linux/ramips/patches-4.14/0039-mtd-add-mt7621-nand-support.patch)
Jan 10 11:27:08 MTK_NAND_BMT seems to be already disabled
Jan 10 11:27:15 at least the only define for it is commented out
Jan 10 11:27:54 are you sure there aren't more drivers using it?
Jan 10 11:28:18 s/drivers/devices/
Jan 10 11:29:14 I think EdgeRouter X uses that mt7621 nand driver (ping russell--)
Jan 10 11:32:12 HiWiFi HC5962 and NETIS WF-2881 also seem to be 7621 with nand
Jan 10 11:32:29 btw, I have an MT7621 device (mikrotik RB750Gr3) that I only have for testing, so if there is any recent development anyone wants tested on that device I can do that.
Jan 10 11:33:14 SwedeMike: that one uses NOR flash, so not affected
Jan 10 11:33:15 jow: yea, I've noticed it, so I've looked at the seed configs for 18.06.1 and for snapshots and couldn't find any difference nor any gpio module, and device packages is set to `kmod-mwl8k swconfig wpad-basic kmod-gpio-button-hotplug` so he probably meant kmod-gpio-button-hotplug, anyway my suspect is this commit `6a27c2f base-files: drop fwtool_pre_upgrade` I've found between v18.06.0 and v18.06.1
Jan 10 11:35:06 since those linksys devices seems to fiddle with fw_setenv and Sven has hit some issue like this probably also ebd57de1f9894a91
Jan 10 11:35:30 which nand chip function mtd_read() usually dispatches to?
Jan 10 11:35:39 but the truth is we need to do sth about mt7621 nand driver
Jan 10 11:35:43 it's unmaintainable
Jan 10 11:35:43 nand_chip->read_page or nand_chip->read_buf ?
Jan 10 11:35:55 read_page probably
Jan 10 11:36:02 I think so
Jan 10 11:36:04 not sure
Jan 10 11:41:40 rmilecki: I guess the upstream mtk nand driver available in 4.19 can be used on mt7621 with some modification.
Jan 10 11:42:06 gch981213: "some modification" is the problem ;)
Jan 10 11:45:30 rmilecki: That should be a smaller problem than overhauling current mt7621 nand driver. I remember seeing someone saying mt7623/mt7621 has similar nand controller IPs.
I haven't do any comparison though :)
Jan 10 11:45:46 *done
Jan 10 11:55:38 gch981213: ok, sounds good
Jan 10 11:55:45 gch981213: i wish I got time to look at that
Jan 10 11:59:29 now, it'll get real fun when we fix mt7621 nand driver
Jan 10 11:59:58 after fixing it, i guess i won't be able to read correctly existing UBIs
Jan 10 12:00:10 at least on NANDs with bad blocks
Jan 10 12:04:08 so BMT seems to be already off
Jan 10 12:07:24 and the kernel will call mtd_read(mtd) >> mtd->_read(mtd) >> nand_read(mtd) >> nand_do_read_ops(mtd) >> mtd->chip->ecc.read_page() >> mtk_nand_read_page_hwecc() >> mtk_nand_exec_read_page()
Jan 10 12:08:20 neither mtk_nand_read_page_hwecc() nor mtk_nand_exec_read_page() have any obvious remapping logic, even when bmt is enabled
Jan 10 12:09:41 mtk_nand_read_page() has (which is registered as mtd->chip->read_page) but the mtd_read() code path will apparently never use mtd->chip->read_page, only mtd->chip->ecc.read_page
Jan 10 12:09:47 so no idea, especially without hw :)
Jan 10 12:51:58 jow: rmilecki: what about the shift_on_bbt / block_remap stuff?
Jan 10 12:58:22 KanjiMonster_: looks like a possible cause
Jan 10 12:58:24 write_next_on_fail also seems to be a recipe for disaster (while the write fails, mark block as bad and try to write stuff to the next block)
Jan 10 12:58:25 jow: ^^
Jan 10 12:58:36 KanjiMonster_: omg...
Jan 10 12:59:41 Hey can someone tell how to modify uboot for lantiq devices?
Jan 10 12:59:55 the buildroot seems to only give me the option to build uboot or not to build
Jan 10 13:05:38 rex_victor: select it to be built, do a make package/u-boot/prepare, modify the extracted sources in build_dir/... , then run make package/u-boot/compile
Jan 10 13:07:39 KanjiMonster_: How do I get to something like menuconfig?
Jan 10 13:08:04 run menuconfig from within the extracted sources
Jan 10 13:09:06 KanjiMonster_: you mean cd package/... ; make menuconfig?
Jan 10 13:09:35 rex_victor: no, I mean cd build_dir/..../u-boot-...; make menuconfig
Jan 10 13:17:59 KanjiMonster_: shift_on_bbtp appears to be not relevant to mtk_nand_read_page_hwecc() either
Jan 10 13:28:18 KanjiMonster_: I see. It seems the patches are not integrated their though
Jan 10 13:28:28 *there
Jan 10 13:54:37 in target/linux/ath79/image/Makefile, how ok is it to nest definitions with regard to SUPPORTED_DEVICES?
Jan 10 13:55:03 background: https://github.com/openwrt/openwrt/pull/1379#discussion_r244956207
Jan 10 15:12:19 DonkeyHotei: you can probably just add empty `SUPPORTED_DEVICES :=` to your Device/netgear_ar7240
Jan 10 15:16:00 ynezz: i'll push an update
Jan 10 15:16:10 I would first test it :)
Jan 10 15:17:18 BTW did you check that netgear,ar7240 is actually added to the supported devices?
Jan 10 15:18:01 how would i check that?
Jan 10 15:20:08 it's appended as a JSON at the end of the firmware image
Jan 10 15:21:26 there's something like "supported_devices":["netgear,wndr3700v2","wndr3700v2"]
Jan 10 15:21:53 { "supported_devices":["netgear,ex7300"], "version": { "dist": "OpenWrt", "version": "SNAPSHOT", "revision": "r8978+2-eb1887b", "board": "ath79" } }
Jan 10 15:22:57 so i suppose it's a no-op
Jan 10 15:23:47 yep, seems so
Jan 10 15:30:58 i'll build an updated image and look at the json before pushing
Jan 10 15:31:26 looks like gcc was updated so it may take a bit
Jan 10 15:46:19 ynezz: with `SUPPORTED_DEVICES :=` the json is missing altogether
Jan 10 15:52:44 i'll push without that
Jan 10 16:17:32 jow: rmilecki: you missed in your call trace patches-4.14/0040-nand-hack.patch, which makes nand_do_read_ops(mtd) do chip->read_page() which is mtk_read_page() which *does* the remap crap
Jan 10 16:18:09 oh, great
Jan 10 16:37:48 KanjiMonster_: ahh!
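Going back to the `supported_devices` question from 15:17 to 15:23: once the JSON metadata has been pulled out of an image, checking whether a board name is listed takes only a few lines. This is a minimal sketch that assumes the JSON text is already in hand (how it is extracted from the image is not shown in the log and is not modelled here); `METADATA` is copied from the message at 15:21:53 and `supports_board` is a hypothetical helper name.

```python
import json

# Metadata JSON exactly as pasted in the log at 15:21:53.
METADATA = (
    '{ "supported_devices":["netgear,ex7300"], "version": { "dist": "OpenWrt", '
    '"version": "SNAPSHOT", "revision": "r8978+2-eb1887b", "board": "ath79" } }'
)


def supports_board(metadata_json, board_name):
    """Return True if board_name is listed under "supported_devices".

    Hypothetical helper: a missing "supported_devices" key is treated as an
    empty list rather than as an error.
    """
    meta = json.loads(metadata_json)
    return board_name in meta.get("supported_devices", [])


if __name__ == "__main__":
    print(supports_board(METADATA, "netgear,ex7300"))
    print(supports_board(METADATA, "netgear,ar7240"))
```

For the sample above only `netgear,ex7300` is listed, so a lookup for `netgear,ar7240` comes back false, in line with the "so i suppose it's a no-op" conclusion.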
Jan 10 16:38:30 yeah I never looked at the entire context
Jan 10 16:39:08 so this driver has not only one but two remapping code paths
Jan 10 16:39:47 seems we should get rid of all #ifdef MTK_NAND_BMT and if (shift_on_bbt) { ... } code parts
Jan 10 16:40:04 the result should be considerably smaller than the current patch
Jan 10 16:41:10 but given that blogic wants to kill the driver anyway we can probably simply change the sole "shift_on_bbt = 1" to "shift_on_bbt = 0"
Jan 10 17:03:06 mtd_nand_write does the try next block until write succeeds regardless of this flag, which needs to be removed as well
Jan 10 17:03:35 hm, true
Jan 10 17:04:34 I am tempted to clean this up
Jan 10 17:04:49 1) move driver to files-4.14 for easier review
Jan 10 17:05:10 2) drop all MTK_NAND_BMT code parts (and bmt.{h,c})
Jan 10 17:05:22 3) drop all shift_on_bbt logic
Jan 10 17:06:07 any idea for a cheap mt7621 device I can order?
Jan 10 17:07:22 https://wikidevi.com/wiki/ZBT_WE1326 perhaps?
Jan 10 17:07:49 mir3g too, but the naming is more complicated and you would want to get exactly the right one.
Jan 10 17:09:15 order via aliexpress?
Jan 10 17:12:10 jow: https://www.ebay.de/itm/163010688146 at least claims to be 7621 based, so should be the right one
Jan 10 17:13:35 and from germany, so should be fast at your door, and you can send it back if it's the wrong one, or you didn't like the colour
Jan 10 17:14:56 thanks! brought
Jan 10 17:14:59 bought
Jan 10 17:15:38 (um, I didn't check whether that was a mt7621 with nand though? did that need to be?)
Jan 10 17:15:56 toh says it's 128mb nand though, so nvm...
Jan 10 17:21:01 jow: on a closer look I'm not sure anymore if it actually comes from germany ...
it says "Artikelstandort: Hamburg, HB, Deutschland", but it also says "Versandoptionen aus dem Ausland (MiniPak aus dem Ausland)" as shipping method
Jan 10 17:21:26 well the seller name on the paypal confirmation was chinese gibberish
Jan 10 17:21:45 I guess it's import stuff, also matches the 1-2 week delivery timeframe
Jan 10 17:22:41 yeah
Jan 10 17:23:40 jow: in case you used paypal, if you have https://www.paypal.com/de/webapps/mpp/refunded-returns activated, paypal will pay for the return shipment (might be too late for this one though)
Jan 10 17:26:21 hah, a colleague of mine can provide me with a zbt-wg2626 as well
Jan 10 17:26:27 great
Jan 10 17:26:35 it's very annoying to not know whether the thing is shipped from .de or from outside EU, since here in Sweden now there is 7 EUR administration fee + 25% VAT on things from !EU. I buy from amazon.de because I presume it's from EU and includes VAT, but I've heard others getting screwed so...
Jan 10 17:30:40 SwedeMike: I'm quite grateful about amazon.[com|co.jp] offering pre-paid import VAT/taxes, which sidesteps this issue. unfortunately almost no other stores offer that
Jan 10 17:31:27 and you also have to let them know not to apply VAT since you'll pay taxes on that too
Jan 10 17:31:45 france only has 19% VAT AFAIK
Jan 10 17:31:53 nvm
Jan 10 17:32:21 I was thinking about norway
Jan 10 17:32:43 nbd: will the driver for those cards (mt7615) be part of mt76? or is it going to be a separate driver? are there some open repositories already?
Jan 10 17:56:19 nbd: and more importantly, there is a cortex-r4 for full host offload on that device. Do you know if the actual firmware really does everything, as in ath10k and is closed?
Jan 10 18:27:48 rmilecki: pong
Jan 10 20:20:57 pang
Jan 10 23:57:46 build #773 of layerscape/armv8_32b is complete: Failure [failed fetchrefs] Build details are at http://phase1.builds.lede-project.org/builders/layerscape%2Farmv8_32b/builds/773 blamelist: Rafał Miłecki
Jan 11 00:09:46 build #1154 of kirkwood/generic is complete: Failure [failed] Build details are at http://phase1.builds.lede-project.org/builders/kirkwood%2Fgeneric/builds/1154 blamelist: Rafał Miłecki
Jan 11 00:45:26 build #1193 of x86/geode is complete: Failure [failed updatefeeds] Build details are at http://phase1.builds.lede-project.org/builders/x86%2Fgeode/builds/1193 blamelist: Rafał Miłecki
**** ENDING LOGGING AT Fri Jan 11 02:59:57 2019