JustKernel

Ray Of Hope

braindump- memory corruption

Linux kernel 3.10 makes somewhat dubious use of __GFP_WAIT, which in more recent kernels has been replaced by __GFP_RECLAIM and __GFP_DIRECT_RECLAIM, which are more sensible.
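To make the difference concrete, here is a minimal userspace model of the flag split. The bit values and the MODEL_/model_ names are made up for illustration (the real definitions live in include/linux/gfp.h and differ between versions); the only point is that the single "may wait" bit became two separate bits, one for direct reclaim (the caller may sleep) and one for waking kswapd.

```c
#include <stdbool.h>

/* Illustrative bit positions only, NOT the kernel's real values. */
#define MODEL_GFP_WAIT           0x01u /* 3.10: caller may sleep and reclaim */
#define MODEL_GFP_DIRECT_RECLAIM 0x02u /* 4.4: caller may enter direct reclaim */
#define MODEL_GFP_KSWAPD_RECLAIM 0x04u /* 4.4: allocator may wake kswapd */
#define MODEL_GFP_RECLAIM \
	(MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM)

/* 3.10 style: one bit answers "may this allocation sleep?". */
bool model_may_sleep_3_10(unsigned int flags)
{
	return (flags & MODEL_GFP_WAIT) != 0;
}

/* 4.4 style: only the direct-reclaim bit means the caller may sleep. */
bool model_may_sleep_4_4(unsigned int flags)
{
	return (flags & MODEL_GFP_DIRECT_RECLAIM) != 0;
}

/* 4.4 style: background reclaim can be requested without sleeping. */
bool model_may_wake_kswapd_4_4(unsigned int flags)
{
	return (flags & MODEL_GFP_KSWAPD_RECLAIM) != 0;
}
```

With this split, an atomic-style allocation can carry only the kswapd bit (wake background reclaim, never sleep), while a normal allocation carries both bits, which is something a single __GFP_WAIT bit cannot express.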

I have been debugging a customer issue where AMD GPU passthrough for an S9050 card causes a XenServer host running the 3.10 kernel to crash, while the XenServer 7.x Tech Preview release with kernel 4.4 works fine.

Stack trace:
BUG: unable to handle kernel paging request at 0000000400180013 --> user space address.
Sep 28 17:53:52 skekung kernel: [ 3961.707950] IP: [] kmem_cache_alloc_trace+0x7b/0x130
Sep 28 17:53:52 skekung kernel: [ 3961.707959] PGD 180a06067 PUD 0 --> PUD is NULL as the address is not mapped and doesn't exist.
Sep 28 17:53:52 skekung kernel: [ 3961.707962] Oops: 0000 [#1] SMP
Sep 28 17:53:52 skekung kernel: [ 3961.707966] Modules linked in: tun des_generic md4 sha256_generic cifs bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt openvswitch(O) gre libcrc32c 8021q garp mrp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack iptable_filter dm_multipath coretemp ipmi_devintf crc32_pclmul aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper dcdbas dm_mod lpc_ich microcode nfsd wmi sg mfd_core ipmi_si ipmi_msghandler hed shpchp auth_rpcgss oid_registry nfs_acl lockd nls_utf8 isofs sunrpc ip_tables x_tables sr_mod cdrom hid_generic usbhid usb_storage hid sd_mod ahci libahci igb(O) ehci_pci ehci_hcd libata xhci_hcd ixgbe(O) ptp pps_core megaraid_sas(O) scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh scsi_mod ipv6 autofs4
Sep 28 17:53:52 skekung kernel: [ 3961.708031] CPU: 0 PID: 1428 Comm: cdrommon Tainted: G O 3.10.0+10 #1
Sep 28 17:53:52 skekung kernel: [ 3961.708034] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.2.10 03/09/2015
Sep 28 17:53:52 skekung kernel: [ 3961.708037] task: ffff8801832f8000 ti: ffff880180a18000 task.ti: ffff880180a18000
Sep 28 17:53:52 skekung kernel: [ 3961.708040] RIP: e030:[] [] kmem_cache_alloc_trace+0x7b/0x130
Sep 28 17:53:52 skekung kernel: [ 3961.708044] RSP: e02b:ffff880180a19ae0 EFLAGS: 00010202
Sep 28 17:53:52 skekung kernel: [ 3961.708046] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 000000000001a872
Sep 28 17:53:52 skekung kernel: [ 3961.708049] RDX: 000000000001a871 RSI: 0000000000000010 RDI: ffffffff811a9612
Sep 28 17:53:52 skekung kernel: [ 3961.708051] RBP: ffff880180a19b10 R08: 0000000000015980 R09: 0000000000000010
Sep 28 17:53:52 skekung kernel: [ 3961.708054] R10: 0000000000000008 R11: ffffffffa007e408 R12: 0000000400180013
Sep 28 17:53:52 skekung kernel: [ 3961.708056] R13: 0000000000000010 R14: 0000000000000018 R15: ffff880187803c80
Sep 28 17:53:52 skekung kernel: [ 3961.708062] FS: 00007f20d42bb740(0000) GS:ffff880188600000(0000) knlGS:ffff880188600000
Sep 28 17:53:52 skekung kernel: [ 3961.708065] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 28 17:53:52 skekung kernel: [ 3961.708067] CR2: 0000000400180013 CR3: 0000000184332000 CR4: 00000000000506a0
Sep 28 17:53:52 skekung kernel: [ 3961.708070] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 28 17:53:52 skekung kernel: [ 3961.708072] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 28 17:53:52 skekung kernel: [ 3961.708074] Stack:
Sep 28 17:53:52 skekung kernel: [ 3961.708075] ffffffff811a9612 0000000000000001 0000000000000001 0000000000000008
Sep 28 17:53:52 skekung kernel: [ 3961.708080] 0000000000000000 ffff880180a19cf0 ffff880180a19b88 ffffffff811a9612
Sep 28 17:53:52 skekung kernel: [ 3961.708084] ffff880180a19b30 ffffffff812c4fee ffff88017fbaf1c0 ffffffff812ca782
Sep 28 17:53:52 skekung kernel: [ 3961.708089] Call Trace:
Sep 28 17:53:52 skekung kernel: [ 3961.708095] [] ? bio_copy_user_iov+0x102/0x430
Sep 28 17:53:52 skekung kernel: [ 3961.708099] [] bio_copy_user_iov+0x102/0x430
Sep 28 17:53:52 skekung kernel: [ 3961.708105] [] ? elv_set_request+0x1e/0x30
Sep 28 17:53:52 skekung kernel: [ 3961.708108] [] ? get_request+0x3b2/0x6d0
Sep 28 17:53:52 skekung kernel: [ 3961.708112] [] bio_copy_kern+0x3f/0xf0
Sep 28 17:53:52 skekung kernel: [ 3961.708116] [] blk_rq_map_kern+0x14b/0x190
Sep 28 17:53:52 skekung kernel: [ 3961.708119] [] ? blk_get_request+0x9c/0xd0
Sep 28 17:53:52 skekung kernel: [ 3961.708129] [] scsi_execute+0x108/0x170 [scsi_mod]
Sep 28 17:53:52 skekung kernel: [ 3961.708134] [] sr_do_ioctl+0x13b/0x2d0 [sr_mod]
Sep 28 17:53:52 skekung kernel: [ 3961.708138] [] sr_packet+0x39/0x50 [sr_mod]
Sep 28 17:53:52 skekung kernel: [ 3961.708143] [] cdrom_get_media_event+0x60/0xc0 [cdrom]
Sep 28 17:53:52 skekung kernel: [ 3961.708147] [] sr_drive_status+0x8c/0x110 [sr_mod]
Sep 28 17:53:52 skekung kernel: [ 3961.708152] [] cdrom_ioctl+0x896/0xf80 [cdrom]
Sep 28 17:53:52 skekung kernel: [ 3961.708156] [] ? mntput+0x35/0x40
Sep 28 17:53:52 skekung kernel: [ 3961.708164] [] ? _raw_spin_unlock_irqrestore+0x1e/0x30
Sep 28 17:53:52 skekung kernel: [ 3961.708168] [] ? __pm_runtime_resume+0x67/0x80
Sep 28 17:53:52 skekung kernel: [ 3961.708173] [] sr_block_ioctl+0x6d/0xd0 [sr_mod]
Sep 28 17:53:52 skekung kernel: [ 3961.708176] [] blkdev_ioctl+0x80e/0x8a0
Sep 28 17:53:52 skekung kernel: [ 3961.708179] [] block_ioctl+0x41/0x50
Sep 28 17:53:52 skekung kernel: [ 3961.708183] [] do_vfs_ioctl+0x4f1/0x530
Sep 28 17:53:52 skekung kernel: [ 3961.708186] [] ? _raw_spin_lock+0xe/0x20
Sep 28 17:53:52 skekung kernel: [ 3961.708189] [] ? final_putname+0x3f/0x50
Sep 28 17:53:52 skekung kernel: [ 3961.708193] [] SyS_ioctl+0x57/0x90
Sep 28 17:53:52 skekung kernel: [ 3961.708199] [] system_call_fastpath+0x16/0x1b
Sep 28 17:53:52 skekung kernel: [ 3961.708201] Code: 85 e4 75 1c 48 89 fa 44 89 ee 4c 89 ff e8 28 b9 3d 00 41 f7 c5 00 80 00 00 49 89 c4 74 4a eb 31 49 63 47 20 48 8d 4a 01 4d 8b 07 <49> 8b 1c 04 4c 89 e0 65 49 0f c7 08 0f 94 c0 84 c0 74 ae 49 63
Sep 28 17:53:52 skekung kernel: [ 3961.708231] RIP [] kmem_cache_alloc_trace+0x7b/0x130
Sep 28 17:53:52 skekung kernel: [ 3961.708235] RSP
Sep 28 17:53:52 skekung kernel: [ 3961.708236] CR2: 0000000400180013
Sep 28 17:53:52 skekung kernel: [ 3961.740905] ---[ end trace b1bf2c5404c6589d ]---
Sep 28 17:53:52 skekung kernel: [ 3961.799528] BUG: unable to handle kernel paging request at 0000000400180013
Sep 28 17:53:52 skekung kernel: [ 3961.799539] IP: [] kmem_cache_alloc_trace+0x7b/0x130
Sep 28 17:53:52 skekung kernel: [ 3961.799548] PGD 0
Sep 28 17:53:52 skekung kernel: [ 3961.799551] Oops: 0000 [#2] SMP
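
As a sanity check on the "PGD ... PUD 0" line, the faulting address can be split into its page table indices by hand. The sketch below assumes standard x86-64 4-level paging with 4 KiB pages (9+9+9+9+12 bits); the struct and function names are mine, not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Indices into each page table level for one virtual address. */
struct va_parts {
	unsigned int pgd;    /* bits 47..39 */
	unsigned int pud;    /* bits 38..30 */
	unsigned int pmd;    /* bits 29..21 */
	unsigned int pte;    /* bits 20..12 */
	unsigned int offset; /* bits 11..0  */
};

/* Split a virtual address, assuming 4-level paging and 4 KiB pages. */
struct va_parts split_va(uint64_t va)
{
	struct va_parts p;

	p.pgd = (unsigned int)((va >> 39) & 0x1ff);
	p.pud = (unsigned int)((va >> 30) & 0x1ff);
	p.pmd = (unsigned int)((va >> 21) & 0x1ff);
	p.pte = (unsigned int)((va >> 12) & 0x1ff);
	p.offset = (unsigned int)(va & 0xfff);
	return p;
}

/* Lower canonical half: the range used for user space addresses. */
bool is_lower_half(uint64_t va)
{
	return va <= 0x00007fffffffffffULL;
}
```

For CR2 = 0x0000000400180013 this gives PGD index 0, PUD index 16, PMD index 0, PTE index 384 and page offset 0x13. The address sits in the lower half, so it looks like a user space address, and the walk stops at PUD slot 16, which the oops reports as empty.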

The stack trace shows that the issue occurred in the block driver while copying a user space buffer to kernel space (bio_copy_user_iov). But it's hard to understand how GPU passthrough can cause the block driver to crash.

Approach:
1) Compared kernel 4.4 and kernel 3.10. Found some useful and relevant changes that could explain the crash, but there were too many changes to port.

2) I focussed on the crash dump that I had in hand and started debugging it using addr2line, gdb and printk.

Code walk: The code walk revealed that we are doing a DMA transfer to the device using bounce buffers. The bounce buffer is allocated in the 16 GB range, and in bio_copy_user_iov we copy this buffer into kernel space so that it can be transferred to the device.

Most probably the bounce buffer has been taken back by the kernel (freed or allocated to some other process) from under the block driver's feet. bio_copy_user_iov uses copy_from_user, which should do the necessary checks on whether the buffer is valid. Why isn't that happening?
Also, as per the crash dump, the fault is in kmem_cache_alloc_trace, when a new buffer is being allocated; that doesn't seem related to memory corruption at all. It's a totally different investigation path.
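
On the copy_from_user question, it helps to remember what the up-front check can and cannot catch. Below is a rough userspace model of a range check in the style of access_ok. It is an assumption-laden sketch: the limit of (1 << 47) minus one page is what I believe x86-64 uses for user space, and the MODEL_/model_ names are hypothetical. The check only asks whether the range lies below the user space limit. It does not ask whether the address is actually mapped; an unmapped user address is only noticed later, when the copy faults.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE  4096ULL
/* Assumed user space limit for x86-64; treat as illustrative. */
#define MODEL_USER_LIMIT ((1ULL << 47) - MODEL_PAGE_SIZE)

/*
 * Range check only: true if [addr, addr + size) lies entirely below
 * the user space limit. Written to avoid overflow in addr + size.
 * It says nothing about whether the pages are mapped.
 */
bool model_access_ok(uint64_t addr, uint64_t size)
{
	if (size > MODEL_USER_LIMIT)
		return false;
	return addr <= MODEL_USER_LIMIT - size;
}
```

In this model the faulting address 0x0000000400180013 passes the range check even though nothing is mapped there, while a kernel address such as the RSP value from the oops is rejected. So a range check by itself would not flag a stale pointer that happens to fall in the user range.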

As this is a customer environment, I can't reproduce it in our lab, so I have shared a debug build with the customer to get feedback.

mail_to: anshul_makkar@justkernel.com

