1. 17 Jan, 2019 2 commits
    • afs: Fix race in async call refcounting · 34fa4761
      David Howells authored
      There's a race between afs_make_call() and afs_wake_up_async_call() in the
      case that an error is returned from rxrpc_kernel_send_data() after it has
      queued the final packet.
      
      afs_make_call() will try to clean up the mess, but the call state may have
      been moved on, thereby causing afs_process_async_call() to also try to
      delete the call.
      
      Fix this by:
      
       (1) Getting an extra ref for an asynchronous call for the call itself to
           hold.  This makes sure the call doesn't evaporate on us accidentally
           and will allow the call to be retained by the caller in a future
           patch.  The ref is released on leaving afs_make_call() or
           afs_wait_for_call_to_complete().
      
       (2) In the event of an error from rxrpc_kernel_send_data():
      
           (a) Don't set the call state to AFS_CALL_COMPLETE until *after* the
           	 call has been aborted and ended.  This prevents
           	 afs_deliver_to_call() from doing anything with any notifications
           	 it gets.
      
           (b) Explicitly end the call immediately to prevent further callbacks.
      
           (c) Cancel any queued async_work and wait for the work if it's
           	 executing.  This allows us to be sure the race won't recur when we
           	 change the state.  We put the work queue's ref on the call if we
           	 managed to cancel it.
      
           (d) Put the call's ref that we got in (1).  This belongs to us as long
           	 as the call is in state AFS_CALL_CL_REQUESTING.
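
      The ref handling described in (1) and (2)(c)-(d) can be sketched in
      userspace C as below.  This is only an illustration using C11 atomics;
      struct demo_call and the demo_*() helpers are hypothetical stand-ins,
      not the kernel's afs_call code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct afs_call: only the refcount is modelled. */
struct demo_call {
	atomic_int usage;	/* number of outstanding references */
	bool destroyed;		/* set when the last ref is put */
};

/* A new call starts with one ref, owned by its creator. */
static void demo_call_init(struct demo_call *call)
{
	atomic_init(&call->usage, 1);
	call->destroyed = false;
}

/* Take an extra ref, e.g. for the work queue or for the call itself. */
static void demo_get_call(struct demo_call *call)
{
	atomic_fetch_add(&call->usage, 1);
}

/* Drop one ref; returns true if that was the last one. */
static bool demo_put_call(struct demo_call *call)
{
	if (atomic_fetch_sub(&call->usage, 1) == 1) {
		call->destroyed = true;
		return true;
	}
	return false;
}
```

      With one ref for the creator, one for the queued work and one held by
      the call itself, the object only goes away on the third put, so an
      error path that puts the work queue's ref cannot free the call under
      the caller.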
      
      Fixes: 341f741f ("afs: Refcount the afs_call struct")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Provide a function to get a ref on a call · 7a75b007
      David Howells authored
      Provide a function to get a reference on an afs_call struct.
      Signed-off-by: David Howells <dhowells@redhat.com>
  2. 15 Nov, 2018 1 commit
    • rxrpc: Fix life check · 7150ceaa
      David Howells authored
      The life-checking function, which is used by kAFS to make sure that a call
      is still live in the event of a pending signal, only samples the received
      packet serial number counter; it doesn't actually provoke a change in the
      counter, rather relying on the server to happen to give us a packet in the
      time window.
      
      Fix this by adding a function to force a ping to be transmitted.
      
      kAFS then keeps track of whether there's been a stall, and if so, uses the
      new function to ping the server, resetting the timeout to allow the reply
      to come back.
      
      If there's a stall, a ping and the call is *still* stalled in the same
      place after another period, then the call will be aborted.
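
      The stall handling described above can be sketched as a small state
      machine.  This is a userspace illustration only; struct demo_life,
      demo_check_life() and the action names are hypothetical and do not
      correspond to the actual rxrpc or kAFS functions.

```c
#include <stdbool.h>

/* What the caller should do after one life check (hypothetical names). */
enum demo_action { DEMO_CONTINUE, DEMO_PING, DEMO_ABORT };

struct demo_life {
	unsigned int last_serial;	/* last sampled rx serial counter */
	bool stalled;			/* a ping has already been sent */
};

/* Called once per timeout period with the current serial counter. */
static enum demo_action demo_check_life(struct demo_life *life,
					unsigned int serial)
{
	if (serial != life->last_serial) {
		/* The peer sent something: the call is live. */
		life->last_serial = serial;
		life->stalled = false;
		return DEMO_CONTINUE;
	}
	if (!life->stalled) {
		/* First stall: provoke a reply and wait another period. */
		life->stalled = true;
		return DEMO_PING;
	}
	/* Still stalled in the same place after a ping. */
	return DEMO_ABORT;
}
```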
      
      Fixes: bc5e3a54 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
      Fixes: f4d15fb6 ("rxrpc: Provide functions for allowing cleaner handling of signals")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 23 Oct, 2018 9 commits
    • afs: Probe multiple fileservers simultaneously · 3bf0fb6f
      David Howells authored
      Send probes to all the unprobed fileservers in a fileserver list on all
      addresses simultaneously in an attempt to find out the fastest route whilst
      not getting stuck for 20s on any server or address that we don't get a
      reply from.
      
      This alleviates the problem whereby attempting to access a new server can
      take a long time because the rotation algorithm ends up rotating through
      all servers and addresses until it finds one that responds.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Eliminate the address pointer from the address list cursor · 2feeaf84
      David Howells authored
      Eliminate the address pointer from the address list cursor as it's
      redundant (ac->addrs[ac->index] can be used to find the same address) and
      address lists must be replaced rather than being rearranged, so is of
      limited value.
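
      A minimal sketch of why the separate pointer is redundant, assuming a
      simplified cursor layout (struct demo_addr_cursor and its members are
      hypothetical; addresses are shown as strings for brevity):

```c
#include <stddef.h>

/* Hypothetical cursor: the index alone identifies the current address. */
struct demo_addr_cursor {
	const char *const *addrs;	/* current (replaceable) address array */
	unsigned int nr_addrs;		/* number of entries in addrs */
	unsigned int index;		/* position of the current address */
};

/* Look the address up on demand instead of caching a pointer to it. */
static const char *demo_cursor_addr(const struct demo_addr_cursor *ac)
{
	if (!ac->addrs || ac->index >= ac->nr_addrs)
		return NULL;
	return ac->addrs[ac->index];
}
```

      Because the lookup goes through the array each time, replacing the
      array cannot leave the cursor holding a pointer into the old one.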
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Calc callback expiry in op reply delivery · 12d8e95a
      David Howells authored
      Calculate the callback expiration time at the point of operation reply
      delivery, using the reply time queried from AF_RXRPC on that call as a
      base.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Implement the YFS cache manager service · 35dbfba3
      David Howells authored
      Implement the YFS cache manager service which gives extra capabilities on
      top of AFS.  This is done by listening for an additional service on the
      same port and indicating that anyone requesting an upgrade should be
      upgraded to the YFS port.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Add a couple of tracepoints to log I/O errors · f51375cd
      David Howells authored
      Add a couple of tracepoints to log the production of I/O errors within the AFS
      filesystem.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Handle EIO from delivery function · 4ac15ea5
      David Howells authored
      Fix afs_deliver_to_call() to handle -EIO being returned by the operation
      delivery function, indicating that the call found itself in the wrong
      state, by printing an error and aborting the call.
      
      Currently, an assertion failure will occur.  This can happen, say, if the
      delivery function falls off the end without calling afs_extract_data() with
      the want_more parameter set to false to collect the end of the Rx phase of
      a call.
      
      The assertion failure looks like:
      
      	AFS: Assertion failed
      	4 == 7 is false
      	0x4 == 0x7 is false
      	------------[ cut here ]------------
      	kernel BUG at fs/afs/rxrpc.c:462!
      
      and is matched in the trace buffer by a line like:
      
      kworker/7:3-3226 [007] ...1 85158.030203: afs_io_error: c=0003be0c r=-5 CM_REPLY
      
      Fixes: 98bf40cd ("afs: Protect call->state changes against signals")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Set up the iov_iter before calling afs_extract_data() · 12bdcf33
      David Howells authored
      afs_extract_data sets up a temporary iov_iter and passes it to AF_RXRPC
      each time it is called to describe the remaining buffer to be filled.
      
      Instead:
      
       (1) Put an iterator in the afs_call struct.
      
       (2) Set the iterator for each marshalling stage to load data into the
           appropriate places.  A number of convenience functions are provided to
           this end (eg. afs_extract_to_buf()).
      
           This iterator is then passed to afs_extract_data().
      
       (3) Use the new ITER_DISCARD iterator to discard any excess data provided
           by FetchData.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Better tracing of protocol errors · 160cb957
      David Howells authored
      Include the site of detection of AFS protocol errors in trace lines to
      better be able to determine what went wrong.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells authored
      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch-statements to access them rather
      than chains of bitwise-AND statements.  This makes it easier to add further
      iterator types.  Also, this can be more efficient: to implement a switch
      of small contiguous integers, the compiler can use ~50% fewer compare
      instructions than it would need bitwise-AND instructions.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.
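
      The shape of the change can be sketched as follows.  This is a
      simplified userspace model, not the kernel's struct iov_iter; all
      demo_* names and the enum values are hypothetical.

```c
#include <stddef.h>

/* Iterator kinds as small contiguous integers rather than flag bits. */
enum demo_iter_type {
	DEMO_ITER_IOVEC,
	DEMO_ITER_KVEC,
	DEMO_ITER_BVEC,
	DEMO_ITER_PIPE,
	DEMO_ITER_DISCARD,
};

enum demo_iter_dir { DEMO_ITER_READ, DEMO_ITER_WRITE };

struct demo_iter {
	enum demo_iter_type type;	/* kept separate from the direction */
	enum demo_iter_dir dir;
};

/* Accessors, so callers do not depend on how the fields are stored. */
static enum demo_iter_type demo_iter_get_type(const struct demo_iter *i)
{
	return i->type;
}

static enum demo_iter_dir demo_iter_get_dir(const struct demo_iter *i)
{
	return i->dir;
}

/* The setup function takes only the direction and sets the type itself. */
static void demo_iter_discard_init(struct demo_iter *i, enum demo_iter_dir dir)
{
	i->type = DEMO_ITER_DISCARD;
	i->dir = dir;
}

/* A switch on the type replaces a chain of "if (type & FLAG)" tests. */
static const char *demo_iter_type_name(const struct demo_iter *i)
{
	switch (demo_iter_get_type(i)) {
	case DEMO_ITER_IOVEC:	return "iovec";
	case DEMO_ITER_KVEC:	return "kvec";
	case DEMO_ITER_BVEC:	return "bvec";
	case DEMO_ITER_PIPE:	return "pipe";
	case DEMO_ITER_DISCARD:	return "discard";
	}
	return "unknown";
}
```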
      Signed-off-by: David Howells <dhowells@redhat.com>
  4. 15 Oct, 2018 1 commit
    • afs: Fix clearance of reply · f0a7d188
      David Howells authored
      The recent patch to fix the afs_server struct leak didn't actually fix the
      bug, but rather fixed some of the symptoms.  The problem is that an
      asynchronous call that holds a resource pointed to by call->reply[0] will
      find the pointer cleared in the call destructor, thereby preventing the
      resource from being cleaned up.
      
      In the case of the server record leak, the afs_fs_get_capabilities()
      function in devel code sets up a call with reply[0] pointing at the server
      record that should be altered when the result is obtained, but this was
      being cleared before the destructor was called, so the put in the
      destructor does nothing and the record is leaked.
      
      Commit f014ffb0 removed the additional ref obtained by
      afs_install_server(), but the removal of this ref is actually used by the
      garbage collector to mark a server record as being defunct after the record
      has expired through lack of use.
      
      The offending clearance of call->reply[0] upon completion in
      afs_process_async_call() has been there from the origin of the code, but
      none of the asynchronous calls actually use that pointer currently, so it
      should be safe to remove (note that synchronous calls don't involve this
      function).
      
      Fix this by the following means:
      
       (1) Revert commit f014ffb0.
      
       (2) Remove the clearance of reply[0] from afs_process_async_call().
      
      Without this, afs_manage_servers() will suffer an assertion failure if it
      sees a server record that didn't get used because the usage count is not 1.
      
      Fixes: f014ffb0 ("afs: Fix afs_server struct leak")
      Fixes: 08e0e7c8 ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. 03 Aug, 2018 1 commit
  6. 21 Jun, 2018 1 commit
  7. 23 May, 2018 1 commit
  8. 14 May, 2018 3 commits
    • afs: Fix the non-encryption of calls · 4776cab4
      David Howells authored
      Some AFS servers refuse to accept unencrypted traffic, so can't be accessed
      with kAFS.  Set the AF_RXRPC security level to encrypt client calls to deal
      with this.
      
      Note that incoming service calls are set by the remote client and so aren't
      affected by this.
      
      This requires an AF_RXRPC patch to pass the value set by setsockopt to calls
      begun by the kernel.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix the handling of an unfound server in CM operations · a86b06d1
      David Howells authored
      If the client cache manager operations that need the server record
      (CB.Callback, CB.InitCallBackState, and CB.InitCallBackState3) can't find
      the server record, they abort the call from the file server with
      RX_CALL_DEAD when they should return okay.
      
      Fixes: c35eccb1 ("[AFS]: Implement the CB.InitCallBackState3 operation.")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix giving up callbacks on server destruction · f2686b09
      David Howells authored
      When a server record is destroyed, we want to send a message to the server
      telling it that we're giving up all the callbacks it has promised us.
      
      Apply two fixes to this:
      
       (1) Only send the FS.GiveUpAllCallBacks message if we actually got a
           callback from that server.  We assume this to be the case if we
           performed at least one successful FS operation on that server.
      
       (2) Send it to the address last used for that server rather than always
           picking the first address in the list (which might be unreachable).
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
  9. 09 Apr, 2018 1 commit
  10. 27 Mar, 2018 1 commit
    • rxrpc, afs: Use debug_ids rather than pointers in traces · a25e21f0
      David Howells authored
      In rxrpc and afs, use the debug_ids that are monotonically allocated to
      various objects as they're allocated rather than pointers as kernel
      pointers are now hashed making them less useful.  Further, the debug ids
      aren't reused anywhere nearly as quickly.
      
      In addition, allow kernel services that use rxrpc, such as afs, to take
      numbers from the rxrpc counter, assign them to their own call struct and
      pass them in to rxrpc for both client and service calls so that the trace
      lines for each will have the same ID tag.
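
      The ID sharing can be sketched like this, assuming a single global
      counter.  The demo_* structs and functions are hypothetical userspace
      stand-ins, not the rxrpc API.

```c
#include <stdatomic.h>

/* One monotonically increasing counter shared by both layers. */
static atomic_uint demo_debug_id_counter;

/* Allocate the next debug ID; the first ID handed out is 1. */
static unsigned int demo_next_debug_id(void)
{
	return atomic_fetch_add(&demo_debug_id_counter, 1) + 1;
}

/* Hypothetical transport-level and filesystem-level call records. */
struct demo_rxrpc_call { unsigned int debug_id; };
struct demo_afs_call {
	unsigned int debug_id;
	struct demo_rxrpc_call rx;
};

/* The upper layer takes an ID and passes the same one down. */
static void demo_begin_call(struct demo_afs_call *call)
{
	call->debug_id = demo_next_debug_id();
	call->rx.debug_id = call->debug_id;
}
```

      Trace lines from either layer can then be matched on the same number,
      which, unlike a hashed pointer, is not quickly reused.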
      Signed-off-by: David Howells <dhowells@redhat.com>
  11. 20 Mar, 2018 1 commit
  12. 02 Jan, 2018 1 commit
  13. 13 Nov, 2017 16 commits
    • afs: Protect call->state changes against signals · 98bf40cd
      David Howells authored
      Protect call->state changes against the call being prematurely terminated
      due to a signal.
      
      What can happen is that a signal causes afs_wait_for_call_to_complete() to
      abort an afs_call because it's not yet complete whilst afs_deliver_to_call()
      is delivering data to that call.
      
      If the data delivery causes the state to change, this may overwrite the state
      of the afs_call, making it not-yet-complete again - but no further
      notifications will be forthcoming from AF_RXRPC as the rxrpc call has been
      aborted and completed, so kAFS will just hang in various places waiting for
      that call or on page bits that need clearing by that call.
      
      A tracepoint to monitor call state changes is also provided.
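
      The idea of a guarded state change can be sketched as below.  Note
      that this illustration swaps in a C11 compare-and-exchange rather
      than whatever locking the patch itself uses, and the demo_* names
      and states are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A reduced, hypothetical set of call states. */
enum demo_call_state {
	DEMO_CL_AWAIT_REPLY,
	DEMO_CL_PROC_REPLY,
	DEMO_COMPLETE,
};

struct demo_state_call {
	_Atomic int state;
};

/*
 * Move from 'from' to 'to' only if the call is still in 'from'.  If a
 * signal handler path has already completed the call, the transition
 * is refused and the completed state is left untouched.
 */
static bool demo_set_call_state(struct demo_state_call *call, int from, int to)
{
	return atomic_compare_exchange_strong(&call->state, &from, to);
}
```

      Data delivery that loses the race therefore cannot overwrite a
      completed call and make it look not-yet-complete again.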
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Use a dynamic port if 7001 is in use · 83732ec5
      Marc Dionne authored
      It is not required that the afs client operate on port 7001.
      The port could be in use because another kernel or userspace
      client has already bound to it.
      
      If the port is in use, just fall back to using a dynamic port.
      Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Trace the sending of pages · 2c099014
      David Howells authored
      Add a pair of tracepoints to log the sending of pages for an FS.StoreData
      or FS.StoreData64 operation.
      
      Tracepoint afs_send_pages notes each set of pages added to the operation.
      There may be several of these per operation as we get at most 8
      contiguous pages in one go because the bvec we're using is on the stack.
      
      Tracepoint afs_sent_pages notes the end of adding data from a whole run of
      pages to the operation and the completion of the request phase.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Trace the initiation and completion of client calls · 025db80c
      David Howells authored
      Add tracepoints to trace the initiation and completion of client calls
      within the kafs filesystem.
      
      The afs_make_vl_call tracepoint watches calls to the volume location
      database server.
      
      The afs_make_fs_call tracepoint watches calls to the file server.
      
      The afs_call_done tracepoint watches for call completion.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix total-length calculation for multiple-page send · 1199db60
      David Howells authored
      Fix the total-length calculation in afs_make_call() when the operation
      being dispatched has data from a series of pages attached.
      
      Although the patched code looks as though it should reduce mathematically
      to the current code, it doesn't, because the 32-bit unsigned arithmetic
      used to calculate the page-offset difference doesn't correctly extend
      to a 64-bit value when the result is effectively negative.
      
      Without this, some FS.StoreData operations that span multiple pages fail,
      reporting too little or too much data.
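
      The arithmetic pitfall can be reproduced in isolation.  The functions
      below are a hypothetical illustration of the widening problem, not
      the afs_make_call() code; the parameter names are assumptions.

```c
#include <stdint.h>

/*
 * Buggy form: the difference is computed in 32-bit unsigned arithmetic,
 * wraps around when last_to < first_offset, and is only then widened,
 * so a huge positive value is added instead of a small negative one.
 */
static uint64_t demo_total_buggy(uint64_t base, uint32_t last_to,
				 uint32_t first_offset)
{
	return base + (last_to - first_offset);
}

/*
 * Fixed form: widen each operand first so the whole expression is
 * evaluated in 64 bits and the subtraction cancels correctly.
 */
static uint64_t demo_total_fixed(uint64_t base, uint32_t last_to,
				 uint32_t first_offset)
{
	return base + (uint64_t)last_to - (uint64_t)first_offset;
}
```

      The two forms agree whenever last_to >= first_offset, which is why
      they look mathematically equivalent; they differ only when the
      offset difference is negative.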
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Only progress call state at end of Tx phase from rxrpc callback · 5f0fc8ba
      David Howells authored
      Only progress the AFS call state at the end of Tx phase from the callback
      passed to rxrpc_kernel_send_data() rather than setting it before the last
      data send call.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Overhaul volume and server record caching and fileserver rotation · d2ddc776
      David Howells authored
      The current code assumes that volumes and servers are per-cell and are
      never shared, but this is not enforced, and, indeed, public cells do exist
      that are aliases of each other.  Further, an organisation can, say, set up
      a public cell and a private cell with overlapping, but not identical, sets
      of servers.  The difference is purely in the database attached to the VL
      servers.
      
      The current code will malfunction if it sees a server in two cells as it
      assumes global address -> server record mappings and that each server is in
      just one cell.
      
      Further, each server may have multiple addresses - and may have addresses
      of different families (IPv4 and IPv6, say).
      
      To this end, the following structural changes are made:
      
       (1) Server record management is overhauled:
      
           (a) Server records are made independent of cell.  The namespace keeps
           	 track of them, volume records have lists of them and each vnode
           	 has a server on which its callback interest currently resides.
      
           (b) The cell record no longer keeps a list of servers known to be in
           	 that cell.
      
           (c) The server records are now kept in a flat list because there's no
           	 single address to sort on.
      
           (d) Server records are now keyed by their UUID within the namespace.
      
           (e) The addresses for a server are obtained with the VL.GetAddrsU
               operation rather than with VL.GetEntryByName, using the
               server's UUID as a parameter.
      
           (f) Cached server records are garbage collected after a period of
           	 non-use and are counted out of existence before purging is allowed
           	 to complete.  This protects the work functions against rmmod.
      
           (g) The servers list is now in /proc/fs/afs/servers.
      
       (2) Volume record management is overhauled:
      
           (a) An RCU-replaceable server list is introduced.  This tracks both
               servers and their corresponding callback interests.
      
           (b) The superblock is now keyed on cell record and numeric volume ID.
      
           (c) The volume record is now tied to the superblock which mounts it,
           	 and is activated when mounted and deactivated when unmounted.
           	 This makes it easier to handle the cache cookie without causing a
           	 double-use in fscache.
      
           (d) The volume record is loaded from the VLDB using VL.GetEntryByNameU
           	 to get the server UUID list.
      
           (e) The volume name is updated if it is seen to have changed when the
           	 volume is updated (the update is keyed on the volume ID).
      
       (3) The vlocation record is got rid of and VLDB records are no longer
           cached.  Sufficient information is stored in the volume record, though
           an update to a volume record is now no longer shared between related
           volumes (volumes come in bundles of three: R/W, R/O and backup).
      
      and the following procedural changes are made:
      
       (1) The fileserver cursor introduced previously is now fleshed out and
           used to iterate over fileservers and their addresses.
      
       (2) Volume status is checked during iteration, and the server list is
           replaced if a change is detected.
      
       (3) Server status is checked during iteration, and the address list is
           replaced if a change is detected.
      
       (4) The abort code is saved into the address list cursor and -ECONNABORTED
           returned in afs_make_call() if a remote abort happened rather than
           translating the abort into an error message.  This allows actions to
           be taken depending on the abort code more easily.
      
           (a) If a VMOVED abort is seen then this is handled by rechecking the
           	 volume and restarting the iteration.
      
           (b) If a VBUSY, VRESTARTING or VSALVAGING abort is seen then this is
               handled by sleeping for a short period and retrying and/or trying
               other servers that might serve that volume.  A message is also
               displayed once until the condition has cleared.
      
           (c) If a VOFFLINE abort is seen, then this is handled as VBUSY for the
           	 moment.
      
           (d) If a VNOVOL abort is seen, the volume is rechecked in the VLDB to
           	 see if it has been deleted; if not, the fileserver is probably
           	 indicating that the volume couldn't be attached and needs
           	 salvaging.
      
           (e) If statfs() sees one of these aborts, it does not sleep, but
           	 rather returns an error, so as not to block the umount program.
      
       (5) The fileserver iteration functions in vnode.c are now merged into
           their callers and more heavily macroised around the cursor.  vnode.c
           is removed.
      
       (6) Operations on a particular vnode are serialised on that vnode because
           the server will lock that vnode whilst it operates on it, so a second
           op sent will just have to wait.
      
       (7) Fileservers are probed with FS.GetCapabilities before being used.
           This is where service upgrade will be done.
      
       (8) A callback interest on a fileserver is set up before an FS operation
           is performed and passed through to afs_make_call() so that it can be
           set on the vnode if the operation returns a callback.  The callback
           interest is passed through to afs_iget() also so that it can be set
           there too.
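
      The abort handling in procedural change (4) can be sketched as a
      dispatch function.  This is an illustration only: the enum names are
      hypothetical, no real abort code values are implied, and applying
      the statfs() exception to the sleep-and-retry cases is one reading
      of (4)(e).

```c
#include <stdbool.h>

/* Hypothetical labels for the aborts named above (values are arbitrary). */
enum demo_abort {
	DEMO_VMOVED,
	DEMO_VBUSY,
	DEMO_VRESTARTING,
	DEMO_VSALVAGING,
	DEMO_VOFFLINE,
	DEMO_VNOVOL,
	DEMO_OTHER_ABORT,
};

/* What the rotation loop should do next. */
enum demo_step {
	DEMO_RECHECK_VOLUME_AND_RESTART,
	DEMO_SLEEP_AND_RETRY,
	DEMO_RECHECK_VLDB,
	DEMO_RETURN_ERROR,
};

static enum demo_step demo_handle_abort(enum demo_abort code, bool from_statfs)
{
	switch (code) {
	case DEMO_VMOVED:
		return DEMO_RECHECK_VOLUME_AND_RESTART;
	case DEMO_VBUSY:
	case DEMO_VRESTARTING:
	case DEMO_VSALVAGING:
	case DEMO_VOFFLINE:	/* handled as VBUSY for the moment */
		/* statfs() must not sleep, so as not to block umount. */
		return from_statfs ? DEMO_RETURN_ERROR : DEMO_SLEEP_AND_RETRY;
	case DEMO_VNOVOL:
		return DEMO_RECHECK_VLDB;
	default:
		return DEMO_RETURN_ERROR;
	}
}
```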
      
      In general, record updating is done on an as-needed basis when we try to
      access servers, volumes or vnodes rather than offloading it to work items
      and special threads.
      
      Notes:
      
       (1) Pre AFS-3.4 servers are no longer supported, though this can be added
           back if necessary (AFS-3.4 was released in 1998).
      
       (2) VBUSY is retried forever for the moment at intervals of 1s.
      
       (3) /proc/fs/afs/<cell>/servers no longer exists.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Add an address list concept · 8b2a464c
      David Howells authored
      Add an RCU replaceable address list structure to hold a list of server
      addresses.  The list also holds the
      
      To this end:
      
       (1) A cell's VL server address list can be loaded directly via insmod or
           echo to /proc/fs/afs/cells or dynamically from a DNS query for AFSDB
           or SRV records.
      
       (2) Anyone wanting to use a cell's VL server address must wait until the
           cell record comes online and has tried to obtain some addresses.
      
       (3) An FS server's address list, for the moment, has a single entry that
           is the key to the server list.  This will change in the future when a
           server is instead keyed on its UUID and the VL.GetAddrsU operation is
           used.
      
       (4) An 'address cursor' concept is introduced to handle iteration through
           the address list.  This is passed to the afs_make_call() as, in the
           future, stuff (such as abort code) that doesn't outlast the call will
           be returned in it.
      
      In the future, we might want to annotate the list with information about
      how each address fares.  We might then want to propagate such annotations
      over address list replacement.
      
      Whilst we're at it, we allow IPv6 addresses to be specified in
      colon-delimited lists by enclosing them in square brackets.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Rename struct afs_call server member to cm_server · d0676a16
      David Howells authored
      Rename the server member of struct afs_call to cm_server as we're only
      going to be using it for incoming calls for the Cache Manager service.
      This makes it easier to differentiate from the pointer to the target server
      for the client, which will point to a different structure to allow for
      callback handling.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Potentially return call->reply[0] from afs_make_call() · 33cd7f2b
      David Howells authored
      If call->ret_reply0 is set, return call->reply[0] on success.  Change the
      return type of afs_make_call() to long so that this can be passed back
      without bit loss and then cast to a pointer if required.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Condense afs_call's reply{,2,3,4} into an array · 97e3043a
      David Howells authored
      Condense struct afs_call's reply anchor members - reply{,2,3,4} - into an
      array.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Consolidate abort_to_error translators · f780c8ea
      David Howells authored
      The AFS abort code space is shared across all services, so there's no need
      for separate abort_to_error translators for each service.
      
      Consolidate them into a single function and remove the function pointers
      for them.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Allow IPv6 address specification of VL servers · 3838d3ec
      David Howells authored
      Allow VL server specifications to be given IPv6 addresses as well as IPv4
      addresses, for example as:
      
      	echo add foo.org 1111:2222:3333:0:4444:5555:6666:7777 >/proc/fs/afs/cells
      
      Note that ':' is the expected separator for separating IPv4 addresses, but
      if a ',' is detected or no '.' is detected in the string, the delimiter is
      switched to ','.
      
      This also works with DNS AFSDB or SRV record strings fetched by upcall from
      userspace.
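
      The delimiter rule described above can be sketched as a small helper.
      This is a hypothetical userspace function that follows the wording of
      the note, not the kernel's parsing code.

```c
#include <string.h>

/*
 * Pick the list delimiter for an address string: ':' is the default,
 * but switch to ',' if a ',' is present or if there is no '.' at all
 * (in which case the ':' characters must belong to IPv6 addresses).
 */
static char demo_pick_delimiter(const char *text)
{
	if (strchr(text, ',') || !strchr(text, '.'))
		return ',';
	return ':';
}
```

      For example, "192.0.2.1:192.0.2.2" keeps ':' as the delimiter, while
      "2001:db8::1" and "192.0.2.1,2001:db8::1" both select ','.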
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Keep and pass sockaddr_rxrpc addresses rather than in_addr · 4d9df986
      David Howells authored
      Keep and pass sockaddr_rxrpc addresses around rather than keeping and
      passing in_addr addresses to allow for the use of IPv6 and non-standard
      port numbers in future.
      
      This also allows the port and service_id fields to be removed from the
      afs_call struct.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Lay the groundwork for supporting network namespaces · f044c884
      David Howells authored
      Lay the groundwork for supporting network namespaces (netns) in the AFS
      filesystem by moving various global features to a network-namespace struct
      (afs_net) and providing an instance of this as a temporary global variable
      that everything uses via accessor functions for the moment.
      
      The following changes have been made:
      
       (1) Store the netns in the superblock info.  This will be obtained from
           the mounter's nsproxy on a manual mount and inherited from the parent
           superblock on an automount.
      
       (2) The cell list is made per-netns.  It can be viewed through
           /proc/net/afs/cells and also be modified by writing commands to that
           file.
      
       (3) The local workstation cell is set per-ns in /proc/net/afs/rootcell.
           This is unset by default.
      
       (4) The 'rootcell' module parameter, which sets a cell and VL server list,
           modifies the init net namespace, thereby allowing an AFS root fs to be
           theoretically used.
      
       (5) The volume location lists and the file lock manager are made
           per-netns.
      
       (6) The AF_RXRPC socket and associated I/O bits are made per-ns.
      
      The various workqueues remain global for the moment.
      
      Changes still to be made:
      
       (1) /proc/fs/afs/ should be moved to /proc/net/afs/ and a symlink emplaced
           from the old name.
      
       (2) A per-netns subsys needs to be registered for AFS into which it can
           store its per-netns data.
      
       (3) Rather than the AF_RXRPC socket being opened on module init, it needs
           to be opened on the creation of a superblock in that netns.
      
       (4) The socket needs to be closed when the last superblock using it is
           destroyed and all outstanding client calls on it have been completed.
           This prevents a reference loop on the namespace.
      
       (5) It is possible that several namespaces will want to use AFS, in which
           case each one will need its own UDP port.  These can either be set
           through /proc/net/afs/cm_port or the kernel can pick one at random.
           The init_ns gets 7001 by default.
      
      Other issues that need resolving:
      
       (1) The DNS keyring needs net-namespacing.
      
       (2) Where do upcalls go (eg. DNS request-key upcall)?
      
       (3) Need something like open_socket_in_file_ns() syscall so that AFS
           command line tools attempting to operate on an AFS file/volume have
           their RPC calls go to the right place.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • Pass mode to wait_on_atomic_t() action funcs and provide default actions · 5e4def20
      David Howells authored
      Make wait_on_atomic_t() pass the TASK_* mode onto its action function as an
      extra argument and make it 'unsigned int' throughout.
      
      Also, consolidate a bunch of identical action functions into a default
      function that can do the appropriate thing for the mode.
      
      Also, change the argument name in the bit_wait*() function declarations to
      reflect the fact that it's the mode and not the bit number.
      
      [Peter Z gives this a grudging ACK, but thinks that the whole atomic_t wait
      should be done differently, though he's not immediately sure as to how]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      cc: Ingo Molnar <mingo@kernel.org>
  14. 18 Oct, 2017 1 commit
    • rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals · bc5e3a54
      David Howells authored
      Make AF_RXRPC accept MSG_WAITALL as a flag to sendmsg() to tell it to
      ignore signals whilst loading up the message queue, provided progress is
      being made in emptying the queue at the other side.
      
      Progress is defined as the base of the transmit window having been
      advanced within 2 RTT periods.  If the period is exceeded with no progress,
      sendmsg() will return anyway, indicating how much data has been copied, if
      any.
      
      Once the supplied buffer is entirely decanted, the sendmsg() will return.
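
      The progress test can be sketched as a single comparison.  The
      function and its parameters are hypothetical, and times are assumed
      to be in microseconds on a common monotonic clock.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Progress is being made if the base of the transmit window last
 * advanced no more than two RTT periods ago.
 */
static bool demo_tx_making_progress(uint64_t now_us,
				    uint64_t last_window_advance_us,
				    uint64_t rtt_us)
{
	return now_us - last_window_advance_us <= 2 * rtt_us;
}
```

      While this holds, signals would be ignored and the queue kept loaded;
      once it fails, the send would return with however much was copied.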
      Signed-off-by: David Howells <dhowells@redhat.com>