Patch Net-next Bnx2x: Revising Locking Scheme For Mac
Date: Fri, 19 Feb 2016 16:24:06 +0900 (text/plain, inline)

Package: src:linux
Version: 4.4.2-1
Severity: critical
Justification: breaks the whole system

Dear all,

till now I was running linux-image-4.4.0-trunk-amd64 (from experimental) without any problem. Today I installed linux-image-4.4.0-1-amd64 (version 4.4.2-1) and tried to boot into it. Booting stopped at Loading initrd and remains there. The previous image (4.4.0-trunk-amd64) is still working.

Installing the kernel package again gives:

    $ dpkg -i /var/cache/apt/archives/linux-image-4.4.0-1-amd64_4.4.2-1_amd64.deb
    (Reading database ... 607463 files and directories currently installed.)
    Preparing to unpack .../linux-image-4.4.0-1-amd64_4.4.2-1_amd64.deb ...
    Unpacking linux-image-4.4.0-1-amd64 (4.4.2-1) over (4.4.2-1) ...
    Setting up linux-image-4.4.0-1-amd64 (4.4.2-1) ...
    /etc/kernel/postinst.d/initramfs-tools:
    update-initramfs: Generating /boot/initrd.img-4.4.0-1-amd64
    cryptsetup: WARNING: failed to detect canonical device of /dev/sdb2
    cryptsetup: WARNING: could not determine root device from /etc/fstab
    dropbear: WARNING: Invalid authorized_keys file, remote unlocking of cryptroot via ssh won't work!
    /etc/kernel/postinst.d/zz-update-grub:
    Generating grub configuration file ...
    Found background image: /usr/share/images/desktop-base/desktop-grub.png
    Found linux image: /boot/vmlinuz-4.4.0-1-amd64
    Found initrd image: /boot/initrd.img-4.4.0-1-amd64
    Found linux image: /boot/vmlinuz-4.4.0-trunk-amd64
    Found initrd image: /boot/initrd.img-4.4.0-trunk-amd64
    Found linux image: /boot/vmlinuz-4.3.0-1-amd64
    Found initrd image: /boot/initrd.img-4.3.0-1-amd64
    Found memtest86+ image: /boot/memtest86+.bin
    Found memtest86+ multiboot image: /boot/memtest86+_multiboot.bin
    Found Mac OS X on /dev/sda2
    done
    $

This is a MacPro from 2009.

Package-specific info:
Kernel log: boot messages should be attached

Booting into vmlinuz-4.4.0-trunk-amd64 gives me the attached dmesg output.

Model information
sys_vendor: Apple Inc.
product_name: MacPro4,1
product_version: 0.0
chassis_vendor: Apple Inc.
chassis_version: Mac-F221BEC8
bios_vendor: Apple Inc.
bios_version: MP41.88Z.0081.B729
board_vendor: Apple Inc.

Date: Fri, 19 Feb 2016 23:25:09 +0100 (text/plain, inline)

On Fri, 19 Feb 2016 16:24:06 +0900 Norbert Preining wrote:
> Package: src:linux
> Version: 4.4.2-1
> Severity: critical
> Justification: breaks the whole system
>
> Dear all,
>
> till now I was running linux-image-4.4.0-trunk-amd64 (from experimental) without any problem. Today I installed linux-image-4.4.0-1-amd64 (version 4.4.2-1) and tried to boot into it. Booting stopped at Loading initrd and remains there.

I'm experiencing this issue too, on a 2015 Lenovo ThinkPad X250, using UEFI boot. I've tried to boot a vanilla 4.4.2 kernel with custom configuration, which boots fine. I'm rebuilding a vanilla 4.4.2 kernel with the Debian configuration to check whether it boots fine or not. It might be related to the UEFI patches added on top of the 4.4.2 kernel; I'm not sure when they appeared.

Regards,
--
Yves-Alexis

(application/pgp-signature, inline)

To: Norbert Preining, 'gustavo panizzo (gfa )', Yves-Alexis Perez, Jim Barber, Vincent Bernat, Sebastian Fontius, Martin Dickopp, Serhii Yehorov, Guy Durrieu, Mateusz Kaduk, Arash Zeini, Zdravko Yanakiev, Wolfgang Walter, Eric Kelm, Alexander Clouter.
Date: Sat, 20 Feb 2016 17:43:27 +0000 (text/plain, inline)

I apologise for this regression, which of course didn't affect any of the several machines I was able to test on.

Please can you each reply to the bug address with the following details:

- Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?
- If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).
- If it boots with UEFI, does the kernel parameter 'efi=noruntime' work around the problem?
- If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?

--
Ben Hutchings
Tomorrow will be cancelled due to lack of interest.
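
For anyone collecting the details requested above, the following is a minimal Python sketch, not part of any Debian tool. It assumes the usual Linux conventions that the directory /sys/firmware/efi is only present when the kernel was booted through UEFI and that /proc/cmdline holds the parameters the kernel was started with. The function names are illustrative.

```python
import os


def booted_with_uefi(efi_dir="/sys/firmware/efi"):
    """Return True if the running kernel exposes the EFI sysfs directory."""
    return os.path.isdir(efi_dir)


def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (for example 'quiet') map to None. Quoted values
    that contain spaces are not handled in this sketch.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel."""
    with open(path) as handle:
        return handle.read().strip()
```

On an affected machine, `parse_cmdline(read_cmdline())` can be used to confirm that 'efi=noruntime' or an 'earlyprintk=' setting really reached the kernel and that 'quiet' was removed before a report is sent.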

(application/pgp-signature, inline)

To: 815125@bugs.debian.org, Norbert Preining, 'gustavo panizzo (gfa )', Jim Barber, Vincent Bernat, Sebastian Fontius, Martin Dickopp, Serhii Yehorov, Guy Durrieu, Mateusz Kaduk, Arash Zeini, Zdravko Yanakiev, Wolfgang Walter, Eric Kelm, Alexander Clouter.
Date: Sat, 20 Feb 2016 19:24:46 +0100 (text/plain, inline)

On Sat, 2016-02-20 at 17:43 +0000, Ben Hutchings wrote:
> I apologise for this regression, which of course didn't affect any of the several machines I was able to test on.

Hi Ben, no issue, that's why it's called “unstable” :)

> Please can you each reply to the bug address with the following details:
> - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?

UEFI, no CSM.

> - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).

Nothing

> - If it boots with UEFI, does the kernel parameter 'efi=noruntime' work around the problem?

Anything you want from that boot?

> - If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?

Same thing happens.

Regards,
--
Yves-Alexis

(application/pgp-signature, inline)

Date: Sat, 20 Feb 2016 19:32:09 +0100 (text/plain, inline)

On 18:43, Ben Hutchings wrote:
> - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?

The system uses UEFI without Secure Boot or legacy compatibility features.

> - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).

No helpful information is printed with either of these parameters, only a line saying 'Booting a command list.'

> - If it boots with UEFI, does the kernel parameter 'efi=noruntime' work around the problem?

Yes, this parameter fixes the problem for me. The kernel boots fine and the system works as usual.

> - If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?

Yes, the same thing happens using the kernel from linux-image-4.5.0-rc4-amd64.

(application/pgp-signature, attachment)

Date: Sat, 20 Feb 2016 23:12:16 +0000 (text/plain, inline)

On Sat, 2016-02-20 at 23:05 +0000, Jim Barber wrote:
> I also tried booting with the kernel parameter 'earlyprintk=efi,keep' that I saw someone used in one of the merged bug reports.

Ah, good thinking, I didn't expect that would work.

> This outputs messages, but was extremely slow to scroll each line :) However it shows the crash I think, so I have attached a photo. Unfortunately it is tiny text rendered on a Hi-DPI display, taken with a poor quality phone camera, but seems readable when zoomed up.

This is very useful, thanks.

--
Ben Hutchings
Tomorrow will be cancelled due to lack of interest.

(application/pgp-signature, inline)

Date: Sat, 20 Feb 2016 23:05:18 +0000 (text/plain, inline)

From: Ben Hutchings mailto:ben@decadent.org.uk
> Please can you each reply to the bug address with the following details:

Hi Ben. Thanks for your help with this.
I have upgraded from version 4.4.2-1 to 4.4.2-2 of the package with the same results.

> - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?

The system is booting with the UEFI boot protocol with no BIOS compatibility options enabled.

> - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).

If I do this, then only an initial 'Booting a command list' is added to the screen at the top. I see:

    Booting a command list
    Loading Linux 4.4.0-1-amd64 ...
    Loading initial ramdisk ...

I also tried booting with the kernel parameter 'earlyprintk=efi,keep' that I saw someone used in one of the merged bug reports. This outputs messages, but was extremely slow to scroll each line :) However it shows the crash I think, so I have attached a photo. Unfortunately it is tiny text rendered on a Hi-DPI display, taken with a poor quality phone camera, but seems readable when zoomed up.

> - If it boots with UEFI, does the kernel parameter 'efi=noruntime' work around the problem?

Yes, using this parameter allows the system to boot.

> - If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?

I haven't tried this yet, but given the other responses to this bug, it looks likely that this will have the same problem.

(image/jpeg, attachment)

To: Norbert Preining, 'gustavo panizzo (gfa )', Yves-Alexis Perez, Jim Barber, Vincent Bernat, Sebastian Fontius, Martin Dickopp, Serhii Yehorov, Guy Durrieu, Mateusz Kaduk, Arash Zeini, Zdravko Yanakiev, Wolfgang Walter, Eric Kelm, Alexander Clouter.
Date: Sat, 20 Feb 2016 23:33:48 +0000 (text/plain, inline)

On Sat, 2016-02-20 at 17:43 +0000, Ben Hutchings wrote:
> I apologise for this regression, which of course didn't affect any of the several machines I was able to test on.
> Please can you each reply to the bug address with the following details:
> - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?
> - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).

Apparently 'earlyprintk=efi,keep' is likely to work better. Jim Barber was able to get a traceback this way.

It looks like the efi-bgrt driver is crashing, and there is an upstream fix for it. I'll upload a test version of the package shortly. However, if you see a traceback where the IP is *not* shown as being in 'efi_bgrt_init', please report that.

> - If it boots with UEFI, does the kernel parameter 'efi=noruntime' work around the problem?
> - If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?

--
Ben Hutchings
Tomorrow will be cancelled due to lack of interest.

(application/pgp-signature, inline)

Date: Sun, 21 Feb 2016 10:56:14 +0900 (text/plain, inline)

Hi Ben, thanks for taking this up.

> - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?

UEFI

> - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).
> Apparently 'earlyprintk=efi,keep' is likely to work better.
> Jim Barber was able to get a traceback this way.

I tried that one, but it ground the whole system to a halt. The kernel messages built up at about 1 line per 5 sec (!!!!). At the end it stopped at

    [    4.688226] ACPI: 3 ACPI AML tables successfully acquired and loaded

But it took far longer than 4 sec to arrive there, more like 5 min! After that nothing moves. I attach the screenshot from when it was hanging at the end.

> - If it boots with UEFI, does the kernel parameter 'efi=noruntime' work around the problem?

Does not boot with UEFI.

> - If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?

I have the same hangs. Nothing goes on after Loading initial ramdisk.

Norbert
--
PREINING, Norbert
JAIST, Japan
TeX Live & Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

(image/jpeg, attachment)

Date: Sun, 21 Feb 2016 10:02:53 +0800

Package: src:linux
Version: 4.4.2-2
Followup-For: Bug #815125

1. To add another datum, Dell Inspiron 5758 also does not boot past displaying the kernel boot command line (from rEFInd booter). Booting proceeds normally with 'efi=noruntime', suggested elsewhere.
BIOS UEFI is in Legacy mode. 'earlyprintk=efi,keep' shows several pages of scrolling info, finishing with a kernel panic and lockup as above. I didn't take a photo at the time. Let me know if you need it and I'll reboot again.

Trimmed details below:

-- Package-specific info:
Version: Linux version 4.4.0-1-amd64 (debian-kernel@lists.debian.org) (gcc version 5.3.1 20160205 (Debian 5.3.1-8) ) #1 SMP Debian 4.4.2-2 (2016-02-19)
Command line: boot vmlinuz-4.4.0-1-amd64 root=UUID=8e8b29e9-6bb0-40ad-93bf-b5cc4baa5e9a ro quietd initrd=boot initrd.img-4.4.0-1-amd64 efi=noruntime
Not tainted.

Model information
sys_vendor: Dell Inc.
product_name: Inspiron 5758
product_version: 01
chassis_vendor: Dell Inc.
chassis_version:
bios_vendor: Dell Inc.
bios_version: A07
board_vendor: Dell Inc.

To: Norbert Preining, 'gustavo panizzo (gfa )', Yves-Alexis Perez, Jim Barber, Vincent Bernat, Sebastian Fontius, Martin Dickopp, Serhii Yehorov, Guy Durrieu, Mateusz Kaduk, Arash Zeini, Zdravko Yanakiev, Wolfgang Walter, Eric Kelm, Alexander Clouter.
Date: Sun, 21 Feb 2016 03:45:24 +0000 (text/plain, inline)

On Sat, 2016-02-20 at 23:33 +0000, Ben Hutchings wrote:
> On Sat, 2016-02-20 at 17:43 +0000, Ben Hutchings wrote:
> > I apologise for this regression, which of course didn't affect any of the several machines I was able to test on.
> > Please can you each reply to the bug address with the following details:
> > - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?
> > - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).
> Apparently 'earlyprintk=efi,keep' is likely to work better. Jim Barber was able to get a traceback this way.
> It looks like the efi-bgrt driver is crashing, and there is an upstream fix for it. I'll upload a test version of the package shortly.

This test version is now available at Please report back whether this does or doesn't fix the problem for you.

--
Ben Hutchings
Time is nature's way of making sure that everything doesn't happen at once.

(application/pgp-signature, inline)

Date: Sun, 21 Feb 2016 08:40:54 +0100 (text/plain, inline)

Hi Ben,

> - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?

I have grub-efi-amd64 installed so I believe I am using UEFI.

> - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).

After adding earlyprintk=efi,keep, you can see an error about ACPI in the attached photo.

> - If it boots with UEFI, does the kernel parameter 'efi=noruntime' work around the problem?

Yes, that fixes the problem.

> - If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?

I am using 4.5.0-rc4-amd64 from experimental at the moment as that boots without issues. Only 4.4.0 seems to be affected.

> This test version is now available at Please report back whether this does or doesn't fix the problem for you.

I installed 4.4.2-3a.test and it replaced the previous 4.4.x package, but the system is not booting either. Let me know if you need anything.

/Mateusz

(text/html, inline)

Cc: Norbert Preining, 'gustavo panizzo (gfa )', Yves-Alexis Perez, Jim Barber, Vincent Bernat, Sebastian Fontius, Martin Dickopp, Serhii Yehorov, Guy Durrieu, Mateusz Kaduk, Arash Zeini, Zdravko Yanakiev, Wolfgang Walter, Eric Kelm.
Date: Sun, 21 Feb 2016 08:15:09 +0000

On Sun, Feb 21, 2016 at 03:45:24AM +0000, Ben Hutchings wrote:
> > Apparently 'earlyprintk=efi,keep' is likely to work better. Jim Barber was able to get a traceback this way.
> > It looks like the efi-bgrt driver is crashing, and there is an upstream fix for it. I'll upload a test version of the package shortly.
> This test version is now available at Please report back whether this does or doesn't fix the problem for you.

Works for Me(tm)

Thanks

--
Alexander Clouter
.sigmonster says: Excellent day to have a rotten day.

Date: Sun, 21 Feb 2016 12:18:59 +0000 (text/plain, inline)

On Sun, 2016-02-21 at 08:40 +0100, Mateusz Kaduk wrote:
> Hi Ben,
> > - If you haven't already reported this, does the Linux 4.5-rc4 package from experimental have the same problem?
> I am using 4.5.0-rc4-amd64 from experimental at the moment as that boots without issues. Only 4.4.0 seems to be affected.
> > This test version is now available at Please report back whether this does or doesn't fix the problem for you.
> I installed 4.4.2-3a.test and it replaced previous 4.4.x package, but system is not booting either. Let me know if you need anything.

It sounds like this is a slightly different bug; please open a new bug report.

--
Ben Hutchings
Time is nature's way of making sure that everything doesn't happen at once.

(application/pgp-signature, inline)

Date: Sun, 21 Feb 2016 13:45:11 +0100 (text/plain, inline)

Hi Ben,

Unfortunately, the test packages that you uploaded don't fix the problem on my machine. I'm attaching a photo of the output with the kernel parameter 'earlyprintk=efi,keep' enabled.

Zdravko Yanakiev

On 04:45, Ben Hutchings wrote:
> On Sat, 2016-02-20 at 23:33 +0000, Ben Hutchings wrote:
> > On Sat, 2016-02-20 at 17:43 +0000, Ben Hutchings wrote:
> > > I apologise for this regression, which of course didn't affect any of the several machines I was able to test on.
> > > Please can you each reply to the bug address with the following details:
> > > - Does the affected system boot using the BIOS boot protocol (including CSM) or UEFI boot protocol?
> > > - If you boot with the added kernel parameter 'earlyprintk=vga' or 'earlyprintk=serial' (and without 'quiet'), do any boot messages appear before the hang/reboot? Please send them (a photo is fine).
> > Apparently 'earlyprintk=efi,keep' is likely to work better. Jim Barber was able to get a traceback this way.
> > It looks like the efi-bgrt driver is crashing, and there is an upstream fix for it. I'll upload a test version of the package shortly.
> This test version is now available at Please report back whether this does or doesn't fix the problem for you.

(image/jpeg, attachment) (application/pgp-signature, attachment)

Date: Sun, 21 Feb 2016 13:23:59 +0000

Source: linux
Source-Version: 4.4.2-3

We believe that the bug you reported is fixed in the latest version of linux, which is due to be installed in the Debian FTP archive. A summary of the changes between this version and the previous one is attached. Thank you for reporting the bug, which will now be closed. If you have further comments please address them to 815125@bugs.debian.org, and the maintainer will reopen the bug report if appropriate.
Debian distribution maintenance software pp.

Date: Sun, 21 Feb 2016 22:34:00 +0900

Hi Ben,

On Sun, 21 Feb 2016, Ben Hutchings wrote:
> Norbert, I didn't get confirmation from you whether the problem on your system is fixed by the new patches I found. If it isn't, please *unmerge* this bug report before reopening it.

Sorry, this is my work computer and I didn't have access to it today after I sent the one email. I will try tomorrow whether it fixes the problem.

Thanks for your work on that

Norbert
--
PREINING, Norbert
JAIST, Japan
TeX Live & Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

Date: Tue, 23 Feb 2016 11:08:56 +0900

unmerge 815125
found 815125 4.4.2-3
thanks

Hi Ben,

I have now tried the kernel you uploaded and it shows the same problems, so it seems related to something else. When I add earlyprintk=efi,keep it boots extremely slowly, see this little (6sec) video It stops at the same place as before, so please see the previously attached screenshots.

Thanks

Norbert
--
PREINING, Norbert
JAIST, Japan
TeX Live & Debian Developer
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13

Date: Tue, 23 Feb 2016 20:17:03 +0100 (text/plain, inline)

Hi,

I have the same traceback as Zdravko in Message #121 (NULL pointer dereference at RIP=0xffffffff81063682, change_page_attr_set_clr+0x242). If I add the 'efi=oldmap' parameter to the kernel cmdline, the kernel boots fine.
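
The location quoted in that traceback has the usual symbol+offset form, so the start address of the faulting function can be recovered by subtracting the offset from RIP. The short Python sketch below works through that arithmetic. The helper name is hypothetical, and the symbol is assumed to be the kernel's change_page_attr_set_clr.

```python
def function_start(rip, location):
    """Return (symbol, start_address) for an oops location string.

    Accepts 'symbol+0xoff' as well as the 'symbol+0xoff/0xsize' form.
    """
    symbol, _, rest = location.partition("+")
    offset = int(rest.split("/")[0], 16)
    return symbol, rip - offset


# Values taken from the traceback quoted in the message above.
symbol, start = function_start(0xFFFFFFFF81063682,
                               "change_page_attr_set_clr+0x242")
print(symbol, hex(start))
```

The resulting address can be compared with the entry for the same symbol in the System.map file of that exact kernel build, which is a quick check that a photographed traceback was transcribed correctly.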

Also, this might help Norbert to get a traceback printed: using the 'quiet earlyprintk=efi,keep' kernel cmdline options will print only the traceback (so it should be faster to get the kernel to the crash traceback, if the cause is really a crash).

After compiling several kernels, I narrowed down the crash to the patch 'x86-efi-build-our-own-page-table-structures.patch'.
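
The message does not say how the test kernels were chosen. One common way to narrow a regression down to a single patch in an ordered series is a binary search over prefixes of the series, sketched below in Python. Here `boots_ok` is a hypothetical callback standing for "build a kernel with exactly these patches applied and try to boot it"; nothing in this sketch is taken from the Debian packaging tools.

```python
def first_bad_patch(patches, boots_ok):
    """Return the first patch whose inclusion makes the boot fail.

    Assumes the kernel boots with no patches applied, fails with all of
    them applied, and stays broken once the bad patch is included.
    """
    good, bad = 0, len(patches)  # boots with `good` patches, fails with `bad`
    while bad - good > 1:
        mid = (good + bad) // 2
        if boots_ok(patches[:mid]):
            good = mid
        else:
            bad = mid
    return patches[bad - 1]
```

Each call to `boots_ok` costs one kernel build and one boot attempt, so a series of N patches needs roughly log2(N) builds instead of N.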
Without this patch, the kernel boots fine (dmesg output in attachment dmesglinux-4.4.2.txt). With this patch, I get the crash in efi_call (photo of 'earlyprintk=efi,keep' output in attachment tracebacklinux-4.4.2withpatch.jpg).

I also added 4 printk calls to print information before the crash when calling efi_call_phys with efi_phys.set_virtual_address_map (see also 'additionnalprintk.diff' in attachment). (Not sure if it can help.) When I added these printk calls, the traceback stops at efi_call (change_page_attr_set_clr isn't in the traceback anymore) but RIP is still the same as without these changes.

See also the traceback I get in attachment tracebacklinux-4.4.2-3unmodified.jpg with the current 4.4 kernel (version 4.4.2-3 unmodified).

Also, let me know if a new bug should be opened for this.

Thanks,
Alexis Murzeau

(text/plain, attachment) (image/jpeg, attachment) (image/jpeg, attachment) (text/plain, attachment)

Date: Sun, 28 Feb 2016 23:31:42 +0100 (text/plain, inline)

Hi,

I would like to add this bug bit me too. I am on a Dell XPS 13 9350 (Skylake, late 2015 model).
So far, all 4.4.0 kernel packages from Unstable hang at boot (i.e. 4.4.2-1 through -3). Booting in UEFI mode, CSM disabled. I had built my own kernel based on a 4.4.0 config from an earlier Experimental build (RC8) and updated that on every patchlevel release; it still works fine (on 4.4.3 now).

The efi=noruntime boot argument does the trick for me; the 4.4.2 testing package Ben uploaded however still hangs, just like the 4.5 RC4 package from Experimental. Can test/provide more info if needed.

Cheers

Stijn Segers

On Tue, 23 Feb 2016 20:17:03 +0100 Alexis Murzeau wrote:
> Hi,
>
> I have the same traceback as Zdravko in Message #121 (NULL pointer dereference at RIP=0xffffffff81063682, change_page_attr_set_clr+0x242). If I add 'efi=oldmap' parameter to kernel cmdline, the kernel boots fine.
>
> Also, this might help Norbert to have a traceback printed: using 'quiet earlyprintk=efi,keep' kernel cmdline options will print only the traceback (so it should be faster to get the kernel to the crash traceback, if the cause is really a crash).
>
> After compiling several kernels, I narrowed down the crash to the patch 'x86-efi-build-our-own-page-table-structures.patch'. Without this patch, the kernel boots fine (dmesg output in attachment dmesglinux-4.4.2.txt) With this patch, I get the crash in efi_call (photo of 'earlyprintk=efi,keep' output in attachment tracebacklinux-4.4.2withpatch.jpg).
>
> I also added 4 printk to add information before the crash when calling efi_call_phys with efi_phys.set_virtual_address_map (see also 'additionnalprintk.diff' in attachment). (Not sure if it can help) When I added these printk, the traceback stop at efi_call (change_page_attr_set_clr isn't anymore in the traceback) but RIP is still the same as without these changes.
>
> See also the traceback I get in attachment tracebacklinux-4.4.2-3unmodified.jpg with the current 4.4 kernel (version 4.4.2-3 unmodified)
>
> Also, let me know if a new bug should be opened for this.
>
> Thanks,
> Alexis Murzeau

(text/html, inline)

Date: Mon, 29 Feb 2016 12:25:35 +0000

On Mon, 29 Feb, at 10:49:54AM, Raphael Hertzog wrote:
> Hello Matt and Borislav,
>
> in Debian we got a report (see below and ) that was breaking early boot on some machines. Can you have a look at those failures?

Can someone provide me with the list of EFI patches that were applied for linux-image-4.4.0-1-amd64 that are not part of the stable kernel tree for linux-4.4.y? The patch you referenced above isn't in Linus' tree yet and there are a bunch of prerequisite patches required to make it work.

Date: Mon, 29 Feb 2016 21:34:55 +0900

On Mon, Feb 29, 2016 at 9:25 PM, Matt Fleming wrote:
> On Mon, 29 Feb, at 10:49:54AM, Raphael Hertzog wrote:
> > Hello Matt and Borislav,
> >
> > in Debian we got a report (see below and ) that was breaking early boot on some machines. Can you have a look at those failures?
>
> Can someone provide me with the list of EFI patches that were applied for linux-image-4.4.0-1-amd64 that are not part of the stable kernel tree for linux-4.4.y?

Debian's kernel patch is located in: and EFI related, I guess, is in: the final '?h=sid' implies it's for sid, which is currently 4.4; the master branch is for preparing 4.5-rc now.

Cheers,
--
Roger Shimizu, GMT +9 Tokyo
PGP/GPG: 17B3ACB1

Date: Mon, 29 Feb 2016 13:51:24 +0000

On Mon, 29 Feb, at 09:34:55PM, Roger Shimizu wrote:
> On Mon, Feb 29, 2016 at 9:25 PM, Matt Fleming wrote:
> > On Mon, 29 Feb, at 10:49:54AM, Raphael Hertzog wrote:
> > > Hello Matt and Borislav,
> > >
> > > in Debian we got a report (see below and ) that was breaking early boot on some machines. Can you have a look at those failures?
> >
> > Can someone provide me with the list of EFI patches that were applied for linux-image-4.4.0-1-amd64 that are not part of the stable kernel tree for linux-4.4.y?
>
> Debian's kernel patch is located in: and EFI related, I guess, is in: the final '?h=sid' implies it's for sid which is currently 4.4 the master branch is for preparing 4.5-rc now.

Thanks Roger. OK, that rules out an error porting the feature because all the required patches are present.

Looking at tracebacklinux-4.4.2withpatchx86-efi-build-our-own-page-table-structures.jpg from comment #164, it appears as though the firmware is trying to access an address that isn't mapped in our new dedicated EFI page tables while inside of SetVirtualAddressMap.

Curiously the E820 memory map describes the range covering the faulting IP (0x00000000aa9462ee) as 'type 20' which is a bogus E820 type and a bogus EFI memory map type.

Alexis, could you boot a kernel with CONFIG_EFI_PGT_DUMP enabled, efi=debug on the command line and upload the dmesg output? Booting with efi=oldmap,debug should be fine (so your machine won't crash).

Date: Tue, 1 Mar 2016 01:03:22 +0100 (text/plain, inline)

2016-02-29 14:51 GMT+01:00 Matt Fleming:
> Thanks Roger. OK, that rules out an error porting the feature because all the required patches are present.
>
> Looking at tracebacklinux-4.4.2withpatchx86-efi-build-our-own-page-table-structures.jpg from comment #164, it appears as though the firmware is trying to access an address that isn't mapped in our new dedicated EFI page tables while inside of SetVirtualAddressMap.
>
> Curiously the E820 memory map describes the range covering the faulting IP (0x00000000aa9462ee) as 'type 20' which is a bogus E820 type and a bogus EFI memory map type.
>
> Alexis, could you boot a kernel with CONFIG_EFI_PGT_DUMP enabled, efi=debug on the command line and upload the dmesg output? Booting with efi=oldmap,debug should be fine (so your machine won't crash).

I've updated my additional debug code to dump all entries of virtualmap when calling SetVirtualAddressMap.

Date: Fri, 4 Mar 2016 13:07:00 +0000

On Tue, 01 Mar, at 01:03:22AM, Alexis Murzeau wrote:
> I've updated my additional debug code to dump all entries of virtualmap when calling SetVirtualAddressMap.
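
To illustrate why 'type 20' stands out, here is a small Python sketch that checks a type number against both numbering schemes. The tables are assumptions written from memory: the E820 values follow the Linux 4.4 x86 headers (kernel-internal values are left out) and the EFI names follow the UEFI memory type list as spelled in the kernel's include/linux/efi.h. They should be verified against those sources before being relied on.

```python
# Assumed E820 type numbers (Linux 4.4 era); verify against e820.h.
E820_TYPES = {
    1: "E820_RAM",
    2: "E820_RESERVED",
    3: "E820_ACPI",
    4: "E820_NVS",
    5: "E820_UNUSABLE",
    7: "E820_PMEM",
    12: "E820_PRAM",
}

# Assumed EFI memory types, indexed by their numeric value (0..14).
EFI_MEMORY_TYPES = [
    "EFI_RESERVED_TYPE",
    "EFI_LOADER_CODE",
    "EFI_LOADER_DATA",
    "EFI_BOOT_SERVICES_CODE",
    "EFI_BOOT_SERVICES_DATA",
    "EFI_RUNTIME_SERVICES_CODE",
    "EFI_RUNTIME_SERVICES_DATA",
    "EFI_CONVENTIONAL_MEMORY",
    "EFI_UNUSABLE_MEMORY",
    "EFI_ACPI_RECLAIM_MEMORY",
    "EFI_ACPI_MEMORY_NVS",
    "EFI_MEMORY_MAPPED_IO",
    "EFI_MEMORY_MAPPED_IO_PORT_SPACE",
    "EFI_PAL_CODE",
    "EFI_PERSISTENT_MEMORY",
]


def describe(type_id):
    """Return (e820_name, efi_name); None marks an undefined type."""
    e820 = E820_TYPES.get(type_id)
    efi = EFI_MEMORY_TYPES[type_id] if 0 <= type_id < len(EFI_MEMORY_TYPES) else None
    return e820, efi


print(describe(20))
print(describe(7))
```

Under these assumptions the value 20 has no meaning in either table, which matches the observation that the firmware-provided map contains an entry neither scheme defines.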

Date: Wed, 9 Mar 2016 01:02:44 +0100

2016-03-04 14:07 GMT+01:00 Matt Fleming:
> It must have been a herculean effort to take photos of the screen while the buggy kernel booted. I'm not really seeing anything jumping out as obviously wrong apart from the fact that we don't have all of EFI_CONVENTIONAL_MEMORY mapped in the buggy kernel. Could you try this patch?
>
> ---
>
> diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
> index 49e4dd4a1f58..f5e77d240ff1 100644
> --- a/arch/x86/platform/efi/efi_64.c
> +++ b/arch/x86/platform/efi/efi_64.c
> @@ -241,15 +241,6 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
>          efi_scratch.use_pgd = true;
>
>          /*
> -         * When making calls to the firmware everything needs to be 1:1
> -         * mapped and addressable with 32-bit pointers. Map the kernel
> -         * text and allocate a new stack because we can't rely on the
> -         * stack pointer being
> -         */
> -        if (!IS_ENABLED(CONFIG_EFI_MIXED))
> -                return 0;
> -
> -        /*
>           * Map all of RAM so that we can access arguments in the 1:1
>           * mapping when making EFI runtime calls.
>           */
> @@ -268,6 +259,15 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
> +        /*
> +         * When making calls to the firmware everything needs to be 1:1
> +         * mapped and addressable with 32-bit pointers. Map the kernel
> +         * text and allocate a new stack because we can't rely on the
> +         * stack pointer being
> +         */
> +        if (!IS_ENABLED(CONFIG_EFI_MIXED))
> +                return 0;
> +
>          page = alloc_page(GFP_KERNEL|__GFP_DMA32);
>          if (!page)
>                  panic("Unable to allocate EFI runtime stack

Date: Wed, 9 Mar 2016 22:56:07 +0000

On Wed, 09 Mar, at 11:01:18PM, Alexis Murzeau wrote:
> Indeed I get the 'Could not reserve range' message, and with a kernel v4.3 the physical address 0x1 contains the value 1. And this patch works and makes an unmodified 4.4 Debian kernel plus this patch boot. Nice, well found :)

Great, thanks for testing.

> However, now a bad page state is reported in dmesg (which doesn't seem to affect the kernel to me as a user but might hide something buggy):
>
>     [    0.030096] BUG: Bad page state in process swapper/0 pfn:00000
>     [    0.030100] page:ffffea count:0 mapcount:1 mapping: (null) index:0x0
>     [    0.030102] flags: 0x0
>
> The efi_free_boot_services function seems to expect size 0 to not free non reserved memory according to commit 7d68dc3. Not sure if this bad page state is related to this patch though, but I don't get this with the 4.3 kernel.
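
When comparing kernels as in the report above, it can help to check a saved dmesg log for this warning automatically. The Python sketch below looks for the 'BUG: Bad page state' line and extracts the process name and page frame number. The regular expression is an assumption modelled on the single message quoted here, and the function name is illustrative.

```python
import re

# Modelled on: "BUG: Bad page state in process swapper/0 pfn:00000"
BAD_PAGE = re.compile(
    r"BUG: Bad page state in process (?P<process>\S+)\s+pfn:(?P<pfn>[0-9a-fA-F]+)"
)


def find_bad_pages(dmesg_text):
    """Return a list of (process, pfn) tuples found in a dmesg dump."""
    return [
        (match.group("process"), int(match.group("pfn"), 16))
        for match in BAD_PAGE.finditer(dmesg_text)
    ]
```

Running it over the dmesg output of a 4.3 boot and a patched 4.4 boot would show whether the warning appears only with the patch, which is the comparison the report makes by hand.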

Yeah, it's definitely related to my quick and dirty patch. I'll have a think about how to fix it properly tomorrow morning.

Date: Thu, 10 Mar 2016 16:40:19 +0000 (text/plain, inline)

On Wed, 09 Mar, at 10:56:07PM, Matt Fleming wrote:
> On Wed, 09 Mar, at 11:01:18PM, Alexis Murzeau wrote:
> > Indeed I get the 'Could not reserve range' message, and with a kernel v4.3 the physical address 0x1 contains the value 1. And this patch works and make a unmodified + this patch 4.4 debian kernel boots, nice well found :)
>
> Great, thanks for testing.
>
> > However, now a bad page state is reported in dmesg (which doesn't seem to affect the kernel to me as a user but might hide something buggy):
> >
> >     [    0.030096] BUG: Bad page state in process swapper/0 pfn:00000
> >     [    0.030100] page:ffffea count:0 mapcount:1 mapping: (null) index:0x0
> >     [    0.030102] flags: 0x0
> >
> > The efi_free_boot_services function seems to expect size 0 to not free non reserved memory according to commit 7d68dc3. Not sure if this bad page state is related to this patch though, but I don't get this with the 4.3 kernel.
>
> Yeah, it's definitely related to my quick and dirty patch. I'll have a think about how to fix it properly tomorrow morning.

Alexis, could you, and anybody else that hit this bug, please try out the attached patch? If it works for you I'll pull it into the EFI tree ASAP - the merge window is approaching fast.

(text/plain, attachment)

Date: Thu, 10 Mar 2016 23:48:04 +0100 (text/plain, inline)

2016-03-10 17:40 GMT+01:00 Matt Fleming:
> Alexis, could you, and anybody else that hit this bug, please try out the attached patch? If it works for you I'll pull it into the EFI tree ASAP - the merge window is approaching fast.

I tried your patch with both Debian linux 4.4.2 and the next branch of your git repo. Both kernels boot without warnings or errors in dmesg.

Thanks for your help to make this work :)

Alexis Murzeau

(text/plain, attachment) (text/plain, attachment)

Date: Thu, 17 Mar 2016 12:35:54 +0000

Source: linux
Source-Version: 4.4.6-1

We believe that the bug you reported is fixed in the latest version of linux, which is due to be installed in the Debian FTP archive. A summary of the changes between this version and the previous one is attached. Thank you for reporting the bug, which will now be closed. If you have further comments please address them to 815125@bugs.debian.org, and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software pp.