This project is mirrored from The repository failed to update .
Repository mirroring has been paused due to too many failed attempts, and can be resumed by a project maintainer.
Last successful update .
  1. 15 May, 2012 1 commit
  2. 15 Apr, 2012 4 commits
    • John Fastabend's avatar
      net: rtnetlink notify events for FDB NTF_SELF adds and deletes · 3ff661c3
      John Fastabend authored
      It is useful to be able to monitor for FDB events in user space.
      This patch adds support to generate netlink events when a change
      is made to a device supporting the FDB ops.
      This brings embedded switches inline with the SW net/bridge which
      triggers events on FDB updates as well.
      Signed-off-by: default avatarJohn Fastabend <>
      Signed-off-by: default avatarDavid S. Miller <>
    • John Fastabend's avatar
      net: add fdb generic dump routine · d83b0603
      John Fastabend authored
      This adds a generic dump routine drivers can call. It
      should be sufficient to handle any bridging model that
      uses the unicast address list. This should be most SR-IOV
      enabled NICs.
      v2: return error on nlmsg_put and use -EMSGSIZE instead
          of -ENOMEM this is inline other usages
      Signed-off-by: default avatarJohn Fastabend <>
      Signed-off-by: default avatarDavid S. Miller <>
    • John Fastabend's avatar
      net: add generic PF_BRIDGE:RTM_ FDB hooks · 77162022
      John Fastabend authored
      This adds two new flags NTF_MASTER and NTF_SELF that can
      now be used to specify where PF_BRIDGE netlink commands should
      be sent. NTF_MASTER sends the commands to the 'dev->master'
      device for parsing. Typically this will be the linux net/bridge,
      or open-vswitch devices. Also without any flags set the command
      will be handled by the master device as well so that current user
      space tools continue to work as expected.
      The NTF_SELF flag will push the PF_BRIDGE commands to the
      device. In the basic example below the commands are then parsed
      and programmed in the embedded bridge.
      Note if both NTF_SELF and NTF_MASTER bits are set then the
      command will be sent to both 'dev->master' and 'dev' this allows
      user space to easily keep the embedded bridge and software bridge
      in sync.
      There is a slight complication in the case with both flags set
      when an error occurs. To resolve this the rtnl handler clears
      the NTF_ flag in the netlink ack to indicate which sets completed
      successfully. The add/del handlers will abort as soon as any
      error occurs.
      To support this new net device ops were added to call into
      the device and the existing bridging code was refactored
      to use these. There should be no required changes in user space
      to support the current bridge behavior.
      A basic setup with a SR-IOV enabled NIC looks like this,
                veth0  veth2
                  |      |
                |  bridge0 |   <---- software bridging
        ethx.y      ethx
          VF         PF
           \         \          <---- propagate FDB entries to HW
           \         \
        |  Embedded Bridge |    <---- hardware offloaded switching
      In this case the embedded bridge must be managed to allow 'veth0'
      to communicate with 'ethx.y' correctly. At present drivers managing
      the embedded bridge either send frames onto the network which
      then get dropped by the switch OR the embedded bridge will flood
      these frames. With this patch we have a mechanism to manage the
      embedded bridge correctly from user space. This example is specific
      to SR-IOV but replacing the VF with another PF or dropping this
      into the DSA framework generates similar management issues.
      Examples session using the 'br'[1] tool to add, dump and then
      delete a mac address with a new "embedded" option and enabled
      ixgbe driver:
      # br fdb add 22:35:19:ac:60:59 dev eth3
      # br fdb
      port    mac addr                flags
      veth0   22:35:19:ac:60:58       static
      veth0   9a:5f:81:f7:f6:ec       local
      eth3    00:1b:21:55:23:59       local
      eth3    22:35:19:ac:60:59       static
      veth0   22:35:19:ac:60:57       static
      #br fdb add 22:35:19:ac:60:59 embedded dev eth3
      #br fdb
      port    mac addr                flags
      veth0   22:35:19:ac:60:58       static
      veth0   9a:5f:81:f7:f6:ec       local
      eth3    00:1b:21:55:23:59       local
      eth3    22:35:19:ac:60:59       static
      veth0   22:35:19:ac:60:57       static
      eth3    22:35:19:ac:60:59       local embedded
      #br fdb del 22:35:19:ac:60:59 embedded dev eth3
      I added a couple lines to 'br' to set the flags correctly is all. It
      is my opinion that the merit of this patch is now embedded and SW
      bridges can both be modeled correctly in user space using very nearly
      the same message passing.
      [1] 'br' tool was published as an RFC here and will be renamed 'bridge'

      Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for
      valuable feedback, suggestions, and review.
      v2: fixed api descriptions and error case with both NTF_SELF and
          NTF_MASTER set plus updated patch description.
      Signed-off-by: default avatarJohn Fastabend <>
      Signed-off-by: default avatarDavid S. Miller <>
    • Eric Dumazet's avatar
      net: cleanup unsigned to unsigned int · 95c96174
      Eric Dumazet authored
      Use of "unsigned int" is preferred to bare "unsigned" in net tree.
      Signed-off-by: default avatarEric Dumazet <>
      Signed-off-by: default avatarDavid S. Miller <>
  3. 13 Apr, 2012 1 commit
  4. 02 Apr, 2012 2 commits
  5. 28 Mar, 2012 1 commit
  6. 05 Mar, 2012 1 commit
  7. 01 Mar, 2012 1 commit
  8. 26 Feb, 2012 1 commit
  9. 21 Feb, 2012 1 commit
    • Greg Rose's avatar
      rtnetlink: Fix problem with buffer allocation · 115c9b81
      Greg Rose authored
      Implement a new netlink attribute type IFLA_EXT_MASK.  The mask
      is a 32 bit value that can be used to indicate to the kernel that
      certain extended ifinfo values are requested by the user application.
      At this time the only mask value defined is RTEXT_FILTER_VF to
      indicate that the user wants the ifinfo dump to send information
      about the VFs belonging to the interface.
      This patch fixes a bug in which certain applications do not have
      large enough buffers to accommodate the extra information returned
      by the kernel with large numbers of SR-IOV virtual functions.
      Those applications will not send the new netlink attribute with
      the interface info dump request netlink messages so they will
      not get unexpectedly large request buffers returned by the kernel.
      Modifies the rtnl_calcit function to traverse the list of net
      devices and compute the minimum buffer size that can hold the
      info dumps of all matching devices based upon the filter passed
      in via the new netlink attribute filter mask.  If no filter
      mask is sent then the buffer allocation defaults to NLMSG_GOODSIZE.
      With this change it is possible to add yet to be defined netlink
      attributes to the dump request which should make it fairly extensible
      in the future.
      Signed-off-by: default avatarGreg Rose <>
      Signed-off-by: default avatarDavid S. Miller <>
  10. 26 Jan, 2012 1 commit
  11. 05 Jan, 2012 1 commit
  12. 14 Dec, 2011 1 commit
  13. 16 Oct, 2011 1 commit
    • Greg Rose's avatar
      if_link: Add additional parameter to IFLA_VF_INFO for spoof checking · 5f8444a3
      Greg Rose authored
      Add configuration setting for drivers to turn spoof checking on or off
      for discrete VFs.
      v2 - Fix indentation problem, wrap the ifla_vf_info structure in
           #ifdef __KERNEL__ to prevent user space from accessing and
           change function paramater for the spoof check setting netdev
           op from u8 to bool.
      v3 - Preset spoof check setting to -1 so that user space tools such
           as ip can detect that the driver didn't report a spoofcheck
           setting.  Prevents incorrect display of spoof check settings
           for drivers that don't report it.
      Signed-off-by: default avatarGreg Rose <>
      Signed-off-by: default avatarJeff Kirsher <>
  14. 11 Aug, 2011 1 commit
  15. 01 Jul, 2011 1 commit
    • Thomas Graf's avatar
      rtnl: provide link dump consistency info · 4e985ada
      Thomas Graf authored
      This patch adds a change sequence counter to each net namespace
      which is bumped whenever a netdevice is added or removed from
      the list. If such a change occurred while a link dump took place,
      the dump will have the NLM_F_DUMP_INTR flag set in the first
      message which has been interrupted and in all subsequent messages
      of the same dump.
      Note that links may still be modified or renamed while a dump is
      taking place but we can guarantee for userspace to receive a
      complete list of links and not miss any.
      I have added 500 VLAN netdevices to make sure the dump is split
      over multiple messages. Then while continuously dumping links in
      one process I also continuously deleted and re-added a dummy
      netdevice in another process. Multiple dumps per seconds have
      had the NLM_F_DUMP_INTR flag set.
      I guess we can wait for Johannes patch to hit net-next via the
      wireless tree.  I just wanted to give this some testing right away.
      Signed-off-by: default avatarThomas Graf <>
      Signed-off-by: default avatarDavid S. Miller <>
  16. 10 Jun, 2011 1 commit
    • Greg Rose's avatar
      rtnetlink: Compute and store minimum ifinfo dump size · c7ac8679
      Greg Rose authored
      The message size allocated for rtnl ifinfo dumps was limited to
      a single page.  This is not enough for additional interface info
      available with devices that support SR-IOV and caused a bug in
      which VF info would not be displayed if more than approximately
      40 VFs were created per interface.
      Implement a new function pointer for the rtnl_register service that will
      calculate the amount of data required for the ifinfo dump and allocate
      enough data to satisfy the request.
      Signed-off-by: default avatarGreg Rose <>
      Signed-off-by: default avatarJeff Kirsher <>
  17. 25 May, 2011 1 commit
    • Eric Dumazet's avatar
      net: hold rtnl again in dump callbacks · 2907c35f
      Eric Dumazet authored
      Commit e67f88dd (dont hold rtnl mutex during netlink dump callbacks)
      missed fact that rtnl_fill_ifinfo() must be called with rtnl held.
      Because of possible deadlocks between two mutexes (cb_mutex and rtnl),
      its not easy to solve this problem, so revert this part of the patch.
      It also forgot one rcu_read_unlock() in FIB dump_rules()
      Add one ASSERT_RTNL() in rtnl_fill_ifinfo() to remind us the rule.
      Signed-off-by: default avatarEric Dumazet <>
      CC: Patrick McHardy <>
      CC: Stephen Hemminger <>
      Signed-off-by: default avatarDavid S. Miller <>
  18. 23 May, 2011 1 commit
  19. 10 May, 2011 1 commit
  20. 09 May, 2011 1 commit
    • Eric Dumazet's avatar
      net: use batched device unregister in veth and macvlan · 226bd341
      Eric Dumazet authored
      veth devices dont use the batched device unregisters yet.
      Since veth are a pair of devices, it makes sense to use a batch of two
      unregisters, this roughly divides dismantle time by two.
      Fix this by changing dellink() callers to always provide a non NULL
      head. (Idea from Michał Mirosław)
      This patch also handles macvlan case : We now dismantle all macvlans on
      top of a lower dev at once.
      Reported-by: Alex Bligh's avatarAlex Bligh <>
      Signed-off-by: default avatarEric Dumazet <>
      Cc: Michał Mirosław <>
      Cc: Jesse Gross <>
      Cc: Paul E. McKenney <>
      Cc: Ben Greear <>
      Signed-off-by: default avatarDavid S. Miller <>
  21. 05 May, 2011 1 commit
  22. 02 May, 2011 1 commit
    • Eric Dumazet's avatar
      net: dont hold rtnl mutex during netlink dump callbacks · e67f88dd
      Eric Dumazet authored
      Four years ago, Patrick made a change to hold rtnl mutex during netlink
      dump callbacks.
      I believe it was a wrong move. This slows down concurrent dumps, making
      good old /proc/net/ files faster than rtnetlink in some situations.
      This occurred to me because one "ip link show dev ..." was _very_ slow
      on a workload adding/removing network devices in background.
      All dump callbacks are able to use RCU locking now, so this patch does
      roughly a revert of commits :
      1c2d670f : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
      6313c1e0 : [RTNETLINK]: Remove unnecessary locking in dump callbacks
      This let writers fight for rtnl mutex and readers going full speed.
      It also takes care of phonet : phonet_route_get() is now called from rcu
      read section. I renamed it to phonet_route_get_rcu()
      Signed-off-by: default avatarEric Dumazet <>
      Cc: Patrick McHardy <>
      Cc: Remi Denis-Courmont <>
      Acked-by: default avatarStephen Hemminger <>
      Signed-off-by: default avatarDavid S. Miller <>
  23. 31 Mar, 2011 1 commit
  24. 14 Feb, 2011 1 commit
  25. 30 Jan, 2011 1 commit
    • Eric W. Biederman's avatar
      net: Fix ip link add netns oops · 13ad1774
      Eric W. Biederman authored
      Ed Swierk <> writes:
      > On
      >  ip link add link eth0 netns 9999 type macvlan
      > where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang:
      > [10663.821898] BUG: unable to handle kernel NULL pointer dereference at 000000000000006d
      >  [10663.821917] IP: [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.821933] PGD 1d3927067 PUD 22f5c5067 PMD 0
      >  [10663.821944] Oops: 0000 [#1] SMP
      >  [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
      >  [10663.821959] CPU 3
      >  [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class
      >  [10663.822155]
      >  [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO
      >  [10663.822167] RIP: 0010:[<ffffffff8149c2fa>] [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.822177] RSP: 0018:ffff88014aebf7b8 EFLAGS: 00010286
      >  [10663.822182] RAX: 00000000fffffff4 RBX: ffff8801ad900800 RCX: 0000000000000000
      >  [10663.822187] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffff88014ad63000
      >  [10663.822191] RBP: ffff88014aebf808 R08: 0000000000000041 R09: 0000000000000041
      >  [10663.822196] R10: 0000000000000000 R11: dead000000200200 R12: ffff88014aebf818
      >  [10663.822201] R13: fffffffffffffffd R14: ffff88014aebf918 R15: ffff88014ad62000
      >  [10663.822207] FS: 00007f00c487f700(0000) GS:ffff880001f80000(0000) knlGS:0000000000000000
      >  [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      >  [10663.822216] CR2: 000000000000006d CR3: 0000000231f19000 CR4: 00000000000026e0
      >  [10663.822221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      >  [10663.822226] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      >  [10663.822231] Process ip (pid: 6000, threadinfo ffff88014aebe000, task ffff88014afb16e0)
      >  [10663.822236] Stack:
      >  [10663.822240] ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6
      >  [10663.822251] <0> 0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818
      >  [10663.822265] <0> ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413
      >  [10663.822281] Call Trace:
      >  [10663.822290] [<ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0
      >  [10663.822298] [<ffffffff8149c413>] dev_alloc_name+0x43/0x90
      >  [10663.822307] [<ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0
      >  [10663.822314] [<ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570
      >  [10663.822321] [<ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570
      >  [10663.822332] [<ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20
      >  [10663.822339] [<ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290
      >  [10663.822346] [<ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290
      >  [10663.822354] [<ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0
      >  [10663.822360] [<ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40
      >  [10663.822367] [<ffffffff814c223e>] netlink_unicast+0x2de/0x2f0
      >  [10663.822374] [<ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0
      >  [10663.822383] [<ffffffff81488533>] sock_sendmsg+0xf3/0x120
      >  [10663.822391] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822400] [<ffffffff81168656>] ? __d_lookup+0x136/0x150
      >  [10663.822406] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822414] [<ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80
      >  [10663.822422] [<ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110
      >  [10663.822429] [<ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70
      >  [10663.822435] [<ffffffff81493308>] ? verify_iovec+0x88/0xe0
      >  [10663.822442] [<ffffffff81489020>] sys_sendmsg+0x240/0x3a0
      > [10663.822450] [<ffffffff8111e2a9>] ? __do_fault+0x479/0x560
      >  [10663.822457] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20
      >  [10663.822465] [<ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150
      >  [10663.822473] [<ffffffff8158d76e>] ? do_page_fault+0x15e/0x350
      >  [10663.822482] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b
      >  [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55
      >  [10663.822618] RIP [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170
      >  [10663.822627] RSP <ffff88014aebf7b8>
      >  [10663.822631] CR2: 000000000000006d
      >  [10663.822636] ---[ end trace 3dfd6c3ad5327ca7 ]---
      This bug was introduced in:
      commit 81adee47
      Author: Eric W. Biederman <>
      Date:   Sun Nov 8 00:53:51 2009 -0800
          net: Support specifying the network namespace upon device creation.
          There is no good reason to not support userspace specifying the
          network namespace during device creation, and it makes it easier
          to create a network device and pass it to a child network namespace
          with a well known name.
          We have to be careful to ensure that the target network namespace
          for the new device exists through the life of the call.  To keep
          that logic clear I have factored out the network namespace grabbing
          logic into rtnl_link_get_net.
          In addtion we need to continue to pass the source network namespace
          to the rtnl_link_ops.newlink method so that we can find the base
          device source network namespace.
      Signed-off-by: default avatarEric W. Biederman <>
      Acked-by: default avatarEric Dumazet <>
      Where apparently I forgot to add error handling to the path where we create
      a new network device in a new network namespace, and pass in an invalid pid.
      Reported-by: default avatarEd Swierk <>
      Signed-off-by: default avatar"Eric W. Biederman" <>
      Signed-off-by: default avatarDavid S. Miller <>
  26. 27 Jan, 2011 1 commit
  27. 21 Jan, 2011 1 commit
  28. 20 Jan, 2011 2 commits
  29. 19 Jan, 2011 1 commit
  30. 10 Jan, 2011 1 commit
  31. 28 Nov, 2010 1 commit
    • Thomas Graf's avatar
      rtnl: make link af-specific updates atomic · cf7afbfe
      Thomas Graf authored
      As David pointed out correctly, updates to af-specific attributes
      are currently not atomic. If multiple changes are requested and
      one of them fails, previous updates may have been applied already
      leaving the link behind in a undefined state.
      This patch splits the function parse_link_af() into two functions
      validate_link_af() and set_link_at(). validate_link_af() is placed
      to validate_linkmsg() check for errors as early as possible before
      any changes to the link have been made. set_link_af() is called to
      commit the changes later.
      This method is not fail proof, while it is currently sufficient
      to make set_link_af() inerrable and thus 100% atomic, the
      validation function method will not be able to detect all error
      scenarios in the future, there will likely always be errors
      depending on states which are f.e. not protected by rtnl_mutex
      and thus may change between validation and setting.
      Also, instead of silently ignoring unknown address families and
      config blocks for address families which did not register a set
      function the errors EAFNOSUPPORT respectively EOPNOSUPPORT are
      returned to avoid comitting 4 out of 5 update requests without
      notifying the user.
      Signed-off-by: default avatarThomas Graf <>
      Signed-off-by: default avatarDavid S. Miller <>
  32. 17 Nov, 2010 1 commit
    • Thomas Graf's avatar
      rtnetlink: Link address family API · f8ff182c
      Thomas Graf authored
      Each net_device contains address family specific data such as
      per device settings and statistics. We already expose this data
      via procfs/sysfs and partially netlink.
      The netlink method requires the requester to send one RTM_GETLINK
      request for each address family it wishes to receive data of
      and then merge this data itself.
      This patch implements a new API which combines all address family
      specific link data in a new netlink attribute IFLA_AF_SPEC.
      IFLA_AF_SPEC contains a sequence of nested attributes, one for each
      address family which in turn defines the structure of its own
      attribute. Example:
         [IFLA_AF_SPEC] = {
             [AF_INET] = {
                 [IFLA_INET_CONF] = ...,
             [AF_INET6] = {
                 [IFLA_INET6_FLAGS] = ...,
                 [IFLA_INET6_CONF] = ...,
      The API also allows for address families to implement a function
      which parses the IFLA_AF_SPEC attribute sent by userspace to
      implement address family specific link options.
      Signed-off-by: default avatarThomas Graf <>
      Signed-off-by: default avatarDavid S. Miller <>
  33. 12 Nov, 2010 1 commit
    • Thomas Graf's avatar
      rtnetlink: Fix message size calculation for link messages · 369cf77a
      Thomas Graf authored
      nlmsg_total_size() calculates the length of a netlink message
      including header and alignment. nla_total_size() calculates the
      space an individual attribute consumes which was meant to be used
      in this context.
      Also, ensure to account for the attribute header for the
      IFLA_INFO_XSTATS attribute as implementations of get_xstats_size()
      seem to assume that we do so.
      The addition of two message headers minus the missing attribute
      header resulted in a calculated message size that was larger than
      required. Therefore we never risked running out of skb tailroom.
      Signed-off-by: default avatarThomas Graf <>
      Acked-by: default avatarPatrick McHardy <>
      Signed-off-by: default avatarDavid S. Miller <>
  34. 21 Oct, 2010 1 commit
  35. 24 Aug, 2010 1 commit