    • Logfs: Allow NULL block_isbad() methods · f2933e86
      Joern Engel authored
      Not all mtd drivers define block_isbad().  Let's assume no bad blocks
      instead of refusing to mount.
      Signed-off-by: Joern Engel <>
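The optional-hook pattern described in this commit can be sketched in userspace C. This is only an illustration, not the LogFS or MTD code: `struct toy_mtd`, `toy_block_isbad()` and `example_isbad()` are hypothetical names, and the real driver interface has more fields and different signatures. The only idea taken from the commit message is that a missing `block_isbad()` method is treated as "no bad blocks" rather than as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an MTD device descriptor. */
struct toy_mtd {
    /* Optional hook: nonzero means the block at `ofs` is bad. May be NULL. */
    int (*block_isbad)(struct toy_mtd *mtd, long ofs);
};

/* Returns nonzero for a bad block; a missing hook means "no bad blocks". */
static int toy_block_isbad(struct toy_mtd *mtd, long ofs)
{
    if (mtd->block_isbad == NULL)
        return 0;
    return mtd->block_isbad(mtd, ofs);
}

/* Example driver hook (an assumption): only offset 4096 is reported bad. */
static int example_isbad(struct toy_mtd *mtd, long ofs)
{
    (void)mtd;
    return ofs == 4096;
}
```

A device whose driver leaves the hook unset is then usable as-is, while a driver that does provide the hook is still consulted for every block.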
    • logfs: Grow inode in delete path · bbe01387
      Joern Engel authored
      Can be necessary if an inode gets deleted (through -ENOSPC) before being
      written.  Might be better to move this into logfs_write_rec(), but for
      now go with the stupid&safe patch.
      Signed-off-by: Joern Engel <>
    • logfs: Free areas before calling generic_shutdown_super() · 1bcceaff
      Joern Engel authored
      Or hit an assertion in map_invalidatepage() instead.
      Signed-off-by: Joern Engel <>
    • logfs: remove useless BUG_ON · 6c69494f
      Joern Engel authored
      It prevents write sizes >4k.
      Signed-off-by: Joern Engel <>
    • logfs: Propagate page parameter to __logfs_write_inode · 0bd90387
      Prasad Joshi authored
      During GC, LogFS has to rewrite each valid block to a separate segment.
      The rewrite operation reads data from an old segment and writes it to a
      newly allocated segment. Since every write operation changes the data
      block pointers maintained in the inode, the inode must also be rewritten.
      In the GC path, to avoid an AB-BA deadlock, LogFS marks a page with
      PG_pre_locked in addition to locking the page (PG_locked). The page
      lock is ignored iff the page is pre-locked.
      LogFS uses a special file called the segment file. The segment file
      maintains an 8-byte entry for every segment. It keeps track of the erase
      count, level etc. for every segment.
      Bad things happen when a segment belonging to the segment file is GCed:
       ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/readwrite.c:297!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: logfs joydev usbhid hid psmouse e1000 i2c_piix4
      		serio_raw [last unloaded: logfs]
      Pid: 20161, comm: mount Not tainted 3.1.0-rc3+ #3 innotek GmbH
      EIP: 0060:[<f809132a>] EFLAGS: 00010292 CPU: 0
      EIP is at logfs_lock_write_page+0x6a/0x70 [logfs]
      EAX: 00000027 EBX: f73f5b20 ECX: c16007c8 EDX: 00000094
      ESI: 00000000 EDI: e59be6e4 EBP: c7337b28 ESP: c7337b18
      DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process mount (pid: 20161, ti=c7336000 task=eb323f70 task.ti=c7336000)
      f8099a3d c7337b24 f73f5b20 00001002 c7337b50 f8091f6d f8099a4d f80994e4
      00000003 00000000 c7337b68 00000000 c67e4400 00001000 c7337b80 f80935e5
      00000000 00000000 00000000 00000000 e1fcf000 0000000f e59be618 c70bf900
      Call Trace:
      [<f8091f6d>] logfs_get_write_page.clone.16+0xdd/0x100 [logfs]
      [<f80935e5>] logfs_mod_segment_entry+0x55/0x110 [logfs]
      [<f809460d>] logfs_get_segment_entry+0x1d/0x20 [logfs]
      [<f8091060>] ? logfs_cleanup_journal+0x50/0x50 [logfs]
      [<f809521b>] ostore_get_erase_count+0x1b/0x40 [logfs]
      [<f80965b8>] logfs_open_area+0xc8/0x150 [logfs]
      [<c141a7ec>] ? kmemleak_alloc+0x2c/0x60
      [<f809668e>] __logfs_segment_write.clone.16+0x4e/0x1b0 [logfs]
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<c10dd563>] ? mempool_kmalloc+0x13/0x20
      [<f809696f>] logfs_segment_write+0x17f/0x1d0 [logfs]
      [<f8092e8c>] logfs_write_i0+0x11c/0x180 [logfs]
      [<f8092f35>] logfs_write_direct+0x45/0x90 [logfs]
      [<f80934cd>] __logfs_write_buf+0xbd/0xf0 [logfs]
      [<c102900e>] ? kmap_atomic_prot+0x4e/0xe0
      [<f809424b>] logfs_write_buf+0x3b/0x60 [logfs]
      [<f80947a9>] __logfs_write_inode+0xa9/0x110 [logfs]
      [<f8094cb0>] logfs_rewrite_block+0xc0/0x110 [logfs]
      [<f8095300>] ? get_mapping_page+0x10/0x60 [logfs]
      [<f8095aa0>] ? logfs_load_object_aliases+0x2e0/0x2f0 [logfs]
      [<f808e57d>] logfs_gc_segment+0x2ad/0x310 [logfs]
      [<f808e62a>] __logfs_gc_once+0x4a/0x80 [logfs]
      [<f808ed43>] logfs_gc_pass+0x683/0x6a0 [logfs]
      [<f8097a89>] logfs_mount+0x5a9/0x680 [logfs]
      [<c1126b21>] mount_fs+0x21/0xd0
      [<c10f6f6f>] ? __alloc_percpu+0xf/0x20
      [<c113da41>] ? alloc_vfsmnt+0xb1/0x130
      [<c113db4b>] vfs_kern_mount+0x4b/0xa0
      [<c113e06e>] do_kern_mount+0x3e/0xe0
      [<c113f60d>] do_mount+0x34d/0x670
      [<c10f2749>] ? strndup_user+0x49/0x70
      [<c113fcab>] sys_mount+0x6b/0xa0
      [<c142d87c>] syscall_call+0x7/0xb
      Code: f8 e8 8b 93 39 c9 8b 45 f8 3e 0f ba 28 00 19 d2 85 d2 74 ca eb d0 0f 0b 8d 45 fc 89 44 24 04 c7 04 24 3d 9a 09 f8 e8 09 92 39 c9 <0f> 0b 8d 74 26 00 55 89 e5 3e 8d 74 26 00 8b 10 80 e6 01 74 09
      EIP: [<f809132a>] logfs_lock_write_page+0x6a/0x70 [logfs] SS:ESP 0068:c7337b18
      ---[ end trace 96e67d5b3aa3d6ca ]---
      The patch passes the locked page to __logfs_write_inode, which calls
      logfs_get_wblocks() to pre-lock the page. This ensures that any further
      attempts to lock the page are ignored (especially from get_erase_count).
      Acked-by: Joern Engel <>
      Signed-off-by: Prasad Joshi <>
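The pre-lock rule from this commit message ("the page lock is ignored iff the page is pre-locked") can be modelled with two booleans. The sketch below is a single-threaded toy, not the LogFS implementation: `struct toy_page`, `toy_prelock()` and `toy_lock_write_page()` are hypothetical names, and the real code operates on page flags and may wait for the lock instead of returning a status.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page: two booleans stand in for the PG_locked and PG_pre_locked bits. */
struct toy_page {
    bool locked;
    bool pre_locked;
};

enum toy_lock_result {
    TOY_LOCK_TAKEN,       /* page was free, this call locked it */
    TOY_LOCK_IGNORED,     /* page is pre-locked by the GC path, lock is skipped */
    TOY_LOCK_WOULD_BLOCK  /* locked by someone else; a real kernel would wait */
};

/* GC path: take the page lock and mark the page as pre-locked. */
static void toy_prelock(struct toy_page *page)
{
    page->locked = true;
    page->pre_locked = true;
}

/* Write path: the page lock is ignored iff the page is pre-locked. */
static enum toy_lock_result toy_lock_write_page(struct toy_page *page)
{
    if (page->pre_locked)
        return TOY_LOCK_IGNORED;
    if (page->locked)
        return TOY_LOCK_WOULD_BLOCK;
    page->locked = true;
    return TOY_LOCK_TAKEN;
}
```

In this model, a page that is locked but not pre-locked is the problematic case: a nested write started from the GC path cannot make progress on it. Pre-locking the page before the nested write turns that case into `TOY_LOCK_IGNORED`.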
    • logfs: set superblock shutdown flag after generic sb shutdown · ecfd8909
      Prasad Joshi authored
      While unmounting the file system, LogFS calls generic_shutdown_super().
      The function does the file system independent superblock shutdown.
      However, it might result in a call to file system specific inode eviction.
      LogFS marks the FS as shutting down by setting bit LOGFS_SB_FLAG_SHUTDOWN
      in super->s_flags. Since inode eviction might call truncate on the inode,
      the following BUG is observed when the file system is unmounted:
      ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/segment.c:362!
      invalid opcode: 0000 [#1] PREEMPT SMP
      CPU 3
      Modules linked in: logfs binfmt_misc ppdev virtio_blk parport_pc lp
      	parport psmouse floppy virtio_pci serio_raw virtio_ring virtio
      Pid: 1933, comm: umount Not tainted 3.0.0+ #4 Bochs Bochs
      RIP: 0010:[<ffffffffa008c841>]  [<ffffffffa008c841>]
      		logfs_segment_write+0x211/0x230 [logfs]
      RSP: 0018:ffff880062d7b9e8  EFLAGS: 00010202
      RAX: 000000000000000e RBX: ffff88006eca9000 RCX: 0000000000000000
      RDX: ffff88006fd87c40 RSI: ffffea00014ff468 RDI: ffff88007b68e000
      RBP: ffff880062d7ba48 R08: 8000000020451430 R09: 0000000000000000
      R10: dead000000100100 R11: 0000000000000000 R12: ffff88006fd87c40
      R13: ffffea00014ff468 R14: ffff88005ad0a460 R15: 0000000000000000
      FS:  00007f25d50ea760(0000) GS:ffff88007fd80000(0000)
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000d05e48 CR3: 0000000062c72000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process umount (pid: 1933, threadinfo ffff880062d7a000,
      	task ffff880070b44500)
      ffff880062d7ba38 ffff88005ad0a508 0000000000001000 0000000000000000
      8000000020451430 ffffea00014ff468 ffff880062d7ba48 ffff88005ad0a460
      ffff880062d7bad8 ffffea00014ff468 ffff88006fd87c40 0000000000000000
      Call Trace:
      [<ffffffffa0088fee>] logfs_write_i0+0x12e/0x190 [logfs]
      [<ffffffffa0089360>] __logfs_write_rec+0x140/0x220 [logfs]
      [<ffffffffa0089312>] __logfs_write_rec+0xf2/0x220 [logfs]
      [<ffffffffa00894a4>] logfs_write_rec+0x64/0xd0 [logfs]
      [<ffffffffa0089616>] __logfs_write_buf+0x106/0x110 [logfs]
      [<ffffffffa008a19e>] logfs_write_buf+0x4e/0x80 [logfs]
      [<ffffffffa008a6b8>] __logfs_write_inode+0x98/0x110 [logfs]
      [<ffffffffa008a7c4>] logfs_truncate+0x54/0x290 [logfs]
      [<ffffffffa008abfc>] logfs_evict_inode+0xdc/0x190 [logfs]
      [<ffffffff8115eef5>] evict+0x85/0x170
      [<ffffffff8115f126>] iput+0xe6/0x1b0
      [<ffffffff8115b4a8>] shrink_dcache_for_umount_subtree+0x218/0x280
      [<ffffffff8115ce91>] shrink_dcache_for_umount+0x51/0x90
      [<ffffffff8114796c>] generic_shutdown_super+0x2c/0x100
      [<ffffffffa008cc47>] logfs_kill_sb+0x57/0xf0 [logfs]
      [<ffffffff81147de5>] deactivate_locked_super+0x45/0x70
      [<ffffffff811487ea>] deactivate_super+0x4a/0x70
      [<ffffffff81163934>] mntput_no_expire+0xa4/0xf0
      [<ffffffff8116469f>] sys_umount+0x6f/0x380
      [<ffffffff814dd46b>] system_call_fastpath+0x16/0x1b
      Code: 55 c8 49 8d b6 a8 00 00 00 45 89 f9 45 89 e8 4c 89 e1 4c 89 55
      b8 c7 04 24 00 00 00 00 e8 68 fc ff ff 4c 8b 55 b8 e9 3c ff ff ff <0f>
      0b 0f 0b c7 45 c0 00 00 00 00 e9 44 fe ff ff 66 66 66 66 66
      RIP  [<ffffffffa008c841>] logfs_segment_write+0x211/0x230 [logfs]
      RSP <ffff880062d7b9e8>
      ---[ end trace fe6b040cea952290 ]---
      Therefore, move the super->s_flags setting to after the fs-independent
      work has finished.
      Reviewed-by: Joern Engel <>
      Signed-off-by: Prasad Joshi <>
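The ordering problem in this commit can be shown with a small userspace model. Everything here is hypothetical (`struct toy_super`, `toy_kill_sb_old()`, `toy_kill_sb_fixed()` and so on), and the model assumes that a write attempted after the shutdown flag is set is the failing condition. The real code reports this as a kernel BUG; the toy only counts such writes.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_super {
    bool shutdown;       /* models LOGFS_SB_FLAG_SHUTDOWN */
    int dirty_inodes;    /* inodes whose eviction still triggers a write */
    int rejected_writes; /* writes attempted after the flag was set */
};

/* Assumption: writing while the shutdown flag is set is the error case. */
static void toy_segment_write(struct toy_super *sb)
{
    if (sb->shutdown)
        sb->rejected_writes++;
}

/* Models the generic shutdown: evicting inodes may write them back. */
static void toy_generic_shutdown(struct toy_super *sb)
{
    while (sb->dirty_inodes > 0) {
        sb->dirty_inodes--;
        toy_segment_write(sb);
    }
}

/* Old order: flag first, then generic shutdown (evictions hit the flag). */
static void toy_kill_sb_old(struct toy_super *sb)
{
    sb->shutdown = true;
    toy_generic_shutdown(sb);
}

/* Fixed order: generic shutdown first, flag only once eviction is done. */
static void toy_kill_sb_fixed(struct toy_super *sb)
{
    toy_generic_shutdown(sb);
    sb->shutdown = true;
}
```

With two dirty inodes, the old order records two rejected writes, while the fixed order records none and still ends with the flag set.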
    • logfs: take write mutex lock during fsync and sync · 13ced29c
      Prasad Joshi authored
      LogFS uses super->s_write_mutex while writing data to disk. Taking the
      same mutex lock in sync and fsync code path solves the following BUG:
      ------------[ cut here ]------------
      kernel BUG at /home/prasad/logfs/dev_bdev.c:134!
      Pid: 2387, comm: flush-253:16 Not tainted 3.0.0+ #4 Bochs Bochs
      RIP: 0010:[<ffffffffa007deed>]  [<ffffffffa007deed>]
                      bdev_writeseg+0x25d/0x270 [logfs]
      Call Trace:
      [<ffffffffa007c381>] logfs_open_area+0x91/0x150 [logfs]
      [<ffffffff8128dcb2>] ? find_level.clone.9+0x62/0x100
      [<ffffffffa007c49c>] __logfs_segment_write.clone.20+0x5c/0x190 [logfs]
      [<ffffffff810ef005>] ? mempool_kmalloc+0x15/0x20
      [<ffffffff810ef383>] ? mempool_alloc+0x53/0x130
      [<ffffffffa007c7a4>] logfs_segment_write+0x1d4/0x230 [logfs]
      [<ffffffffa0078f8e>] logfs_write_i0+0x12e/0x190 [logfs]
      [<ffffffffa0079300>] __logfs_write_rec+0x140/0x220 [logfs]
      [<ffffffffa0079444>] logfs_write_rec+0x64/0xd0 [logfs]
      [<ffffffffa00795b6>] __logfs_write_buf+0x106/0x110 [logfs]
      [<ffffffffa007a13e>] logfs_write_buf+0x4e/0x80 [logfs]
      [<ffffffffa0073e33>] __logfs_writepage+0x23/0x80 [logfs]
      [<ffffffffa007410c>] logfs_writepage+0xdc/0x110 [logfs]
      [<ffffffff810f5ba7>] __writepage+0x17/0x40
      [<ffffffff810f6208>] write_cache_pages+0x208/0x4f0
      [<ffffffff810f5b90>] ? set_page_dirty+0x70/0x70
      [<ffffffff810f653a>] generic_writepages+0x4a/0x70
      [<ffffffff810f75d1>] do_writepages+0x21/0x40
      [<ffffffff8116b9d1>] writeback_single_inode+0x101/0x250
      [<ffffffff8116bdbd>] writeback_sb_inodes+0xed/0x1c0
      [<ffffffff8116c5fb>] writeback_inodes_wb+0x7b/0x1e0
      [<ffffffff8116cc23>] wb_writeback+0x4c3/0x530
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff8116cd6b>] wb_do_writeback+0xdb/0x290
      [<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
      [<ffffffff814d6208>] ? _raw_spin_unlock_irqrestore+0x18/0x40
      [<ffffffff8105aa5a>] ? del_timer+0x8a/0x120
      [<ffffffff8116cfac>] bdi_writeback_thread+0x8c/0x2e0
      [<ffffffff8116cf20>] ? wb_do_writeback+0x290/0x290
      [<ffffffff8106d2e6>] kthread+0x96/0xa0
      [<ffffffff814de514>] kernel_thread_helper+0x4/0x10
      [<ffffffff8106d250>] ? kthread_worker_fn+0x190/0x190
      [<ffffffff814de510>] ? gs_change+0xb/0xb
      RIP  [<ffffffffa007deed>] bdev_writeseg+0x25d/0x270 [logfs]
      ---[ end trace 0211ad60a57657c4 ]---
      Reviewed-by: Joern Engel <>
      Signed-off-by: Prasad Joshi <>
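The locking rule stated in this commit (the sync and fsync paths take the same mutex as the regular write path) can be sketched with a POSIX mutex. This is a userspace illustration under assumptions: `struct toy_super`, `toy_writepage()` and `toy_sync()` are hypothetical names, `write_mutex` merely plays the role of `super->s_write_mutex`, and the `assert()` in `toy_write_segment()` is an invented invariant standing in for whatever check fires in the trace above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the superblock state guarded by the write mutex. */
struct toy_super {
    pthread_mutex_t write_mutex; /* plays the role of super->s_write_mutex */
    int writer_active;           /* 1 while a writer is in the critical section */
    int segments_written;
};

static void toy_super_init(struct toy_super *sb)
{
    pthread_mutex_init(&sb->write_mutex, NULL);
    sb->writer_active = 0;
    sb->segments_written = 0;
}

/* Must be called with write_mutex held; the assert models an invariant check. */
static void toy_write_segment(struct toy_super *sb)
{
    assert(sb->writer_active == 0);
    sb->writer_active = 1;
    sb->segments_written++;
    sb->writer_active = 0;
}

/* Regular write path: already serialized by the mutex. */
static void toy_writepage(struct toy_super *sb)
{
    pthread_mutex_lock(&sb->write_mutex);
    toy_write_segment(sb);
    pthread_mutex_unlock(&sb->write_mutex);
}

/* Sync path: takes the same mutex instead of writing unprotected. */
static void toy_sync(struct toy_super *sb)
{
    pthread_mutex_lock(&sb->write_mutex);
    toy_write_segment(sb);
    pthread_mutex_unlock(&sb->write_mutex);
}
```

Because both entry points go through the same mutex, a writeback thread and a sync caller can no longer be inside `toy_write_segment()` at the same time.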
    • logfs: Prevent memory corruption · 934eed39
      Joern Engel authored
      This is a bad one.  I wonder whether we were so far protected by
      no_free_segments(sb) usually being smaller than LOGFS_NO_AREAS.
      Found by Dan Carpenter <> using smatch.
      Signed-off-by: Joern Engel <>
      Signed-off-by: Prasad Joshi <>
    • logfs: update page reference count for pined pages · 96150606
      Prasad Joshi authored
      LogFS sets the PG_private flag to indicate a pinned page. We assumed that
      marking a page as private is enough to ensure its existence. Instead,
      it is necessary to hold a reference count on the page.
      The change resolves the following BUG:
      BUG: Bad page state in process flush-253:16  pfn:6a6d0
      page flags: 0x100000000000808(uptodate|private)
      Suggested-and-Acked-by: Joern Engel <>
      Signed-off-by: Prasad Joshi <>
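The difference between flagging a page and holding a reference on it can be illustrated with a toy reference counter. The names below (`struct toy_page`, `toy_pin_page()`, `toy_unpin_page()`, `toy_can_free()`) are hypothetical, and the model assumes the simplest possible rule, namely that a page may be freed once its reference count reaches zero. The kernel's page lifetime rules are more involved.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_page {
    int refcount;      /* number of holders keeping the page alive */
    bool private_flag; /* models PG_private, used here to mark "pinned" */
};

/* Pin: set the flag and also take a reference, so the page stays alive. */
static void toy_pin_page(struct toy_page *page)
{
    if (!page->private_flag) {
        page->private_flag = true;
        page->refcount++;
    }
}

/* Unpin: clear the flag and drop the reference taken by toy_pin_page(). */
static void toy_unpin_page(struct toy_page *page)
{
    if (page->private_flag) {
        page->private_flag = false;
        page->refcount--;
    }
}

/* Assumed rule for this model: a page is freeable at refcount zero. */
static bool toy_can_free(const struct toy_page *page)
{
    return page->refcount == 0;
}
```

If pinning only set the flag, dropping the last other reference would make the page freeable while it is still marked private, which matches the "uptodate|private" state reported in the BUG line above.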
    • fs: add module.h to files that were implicitly using it · 143cb494
      Paul Gortmaker authored
      Some files were using the complete module.h infrastructure without
      actually including the header at all.  Fix them up in advance so
      once the implicit presence is removed, we won't get failures like this:
        CC [M]  fs/nfsd/nfssvc.o
      fs/nfsd/nfssvc.c: In function 'nfsd_create_serv':
      fs/nfsd/nfssvc.c:335: error: 'THIS_MODULE' undeclared (first use in this function)
      fs/nfsd/nfssvc.c:335: error: (Each undeclared identifier is reported only once
      fs/nfsd/nfssvc.c:335: error: for each function it appears in.)
      fs/nfsd/nfssvc.c: In function 'nfsd':
      fs/nfsd/nfssvc.c:555: error: implicit declaration of function 'module_put_and_exit'
      make[3]: *** [fs/nfsd/nfssvc.o] Error 1
      Signed-off-by: Paul Gortmaker <>
    • fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers · 02c24a82
      Josef Bacik authored
      Btrfs needs to be able to control how filemap_write_and_wait_range() is called
      in fsync to make it less of a painful operation, so push down taking i_mutex and
      the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
      file systems can drop taking the i_mutex altogether it seems, like ext3 and
      ocfs2.  For correctness' sake I just pushed everything down in all cases to make
      sure that we keep the current behavior the same for everybody, and then each
      individual fs maintainer can make up their mind about what to do from there.
      Acked-by: Jan Kara <>
      Signed-off-by: Josef Bacik <>
      Signed-off-by: Al Viro <>
    • sanitize <linux/prefetch.h> usage · 268bb0ce
      Linus Torvalds authored
      Commit e66eed65 ("list: remove prefetching from regular list
      iterators") removed the include of prefetch.h from list.h, which
      uncovered several cases that had apparently relied on that rather
      obscure header file dependency.
      So this fixes things up a bit, using
         grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
         grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')
      to guide us in finding files that either need <linux/prefetch.h>
      inclusion, or have it despite not needing it.
      There are more of them around (mostly network drivers), but this gets
      many core ones.
      Reported-by: Stephen Rothwell <>
      Signed-off-by: Linus Torvalds <>