1. 26 Feb, 2019 1 commit
    • David Ahern's avatar
      ipv4: Return error for RTA_VIA attribute · b6e9e5df
      David Ahern authored
      IPv4 currently does not support nexthops outside of the AF_INET family.
      Specifically, it does not handle RTA_VIA attribute. If it is passed
      in a route add request, the actual route added only uses the device
      which is clearly not what the user intended:
      
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        $ ip ro ls
        ...
        172.16.1.0/24 dev eth0
      
      Catch this and fail the route add:
        $ ip ro add 172.16.1.0/24 via inet6 2001:db8:1::1 dev eth0
        Error: IPv4 does not support RTA_VIA attribute.
      
      Fixes: 03c05665 ("mpls: Netlink commands to add, remove, and dump routes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b6e9e5df
  2. 15 Jan, 2019 1 commit
    • Ido Schimmel's avatar
      net: ipv4: Fix memory leak in network namespace dismantle · f97f4dd8
      Ido Schimmel authored
      IPv4 routing tables are flushed in two cases:
      
      1. In response to events in the netdev and inetaddr notification chains
      2. When a network namespace is being dismantled
      
      In both cases only routes associated with a dead nexthop group are
      flushed. However, a nexthop group will only be marked as dead in case it
      is populated with actual nexthops using a nexthop device. This is not
      the case when the route in question is an error route (e.g.,
      'blackhole', 'unreachable').
      
      Therefore, when a network namespace is being dismantled such routes are
      not flushed and leaked [1].
      
      To reproduce:
      # ip netns add blue
      # ip -n blue route add unreachable 192.0.2.0/24
      # ip netns del blue
      
      Fix this by not skipping error routes that are not marked with
      RTNH_F_DEAD when flushing the routing tables.
      
      To prevent the flushing of such routes in case #1, add a parameter to
      fib_table_flush() that indicates if the table is flushed as part of
      namespace dismantle or not.
      
      Note that this problem does not exist in IPv6 since error routes are
      associated with the loopback device.
      
      [1]
      unreferenced object 0xffff888066650338 (size 56):
        comm "ip", pid 1206, jiffies 4294786063 (age 26.235s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 b0 1c 62 61 80 88 ff ff  ..........ba....
          e8 8b a1 64 80 88 ff ff 00 07 00 08 fe 00 00 00  ...d............
        backtrace:
          [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220
          [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
          [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380
          [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690
          [<0000000014f62875>] netlink_sendmsg+0x929/0xe10
          [<00000000bac9d967>] sock_sendmsg+0xc8/0x110
          [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
          [<000000002e94f880>] __sys_sendmsg+0xf7/0x250
          [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610
          [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<000000003a8b605b>] 0xffffffffffffffff
      unreferenced object 0xffff888061621c88 (size 48):
        comm "ip", pid 1206, jiffies 4294786063 (age 26.235s)
        hex dump (first 32 bytes):
          6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
          6b 6b 6b 6b 6b 6b 6b 6b d8 8e 26 5f 80 88 ff ff  kkkkkkkk..&_....
        backtrace:
          [<00000000733609e3>] fib_table_insert+0x978/0x1500
          [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220
          [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
          [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380
          [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690
          [<0000000014f62875>] netlink_sendmsg+0x929/0xe10
          [<00000000bac9d967>] sock_sendmsg+0xc8/0x110
          [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
          [<000000002e94f880>] __sys_sendmsg+0xf7/0x250
          [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610
          [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<000000003a8b605b>] 0xffffffffffffffff
      
      Fixes: 8cced9ef ("[NETNS]: Enable routing configuration in non-initial namespace.")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f97f4dd8
  3. 24 Oct, 2018 1 commit
  4. 16 Oct, 2018 4 commits
  5. 11 Oct, 2018 1 commit
    • Sabrina Dubroca's avatar
      net: ipv4: update fnhe_pmtu when first hop's MTU changes · af7d6cce
      Sabrina Dubroca authored
      Since commit 5aad1de5 ("ipv4: use separate genid for next hop
      exceptions"), exceptions get deprecated separately from cached
      routes. In particular, administrative changes don't clear PMTU anymore.
      
      As Stefano described in commit e9fa1495 ("ipv6: Reflect MTU changes
      on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
      the local MTU change can become stale:
       - if the local MTU is now lower than the PMTU, that PMTU is now
         incorrect
       - if the local MTU was the lowest value in the path, and is increased,
         we might discover a higher PMTU
      
      Similarly to what commit e9fa1495 did for IPv6, update PMTU in those
      cases.
      
      If the exception was locked, the discovered PMTU was smaller than the
      minimal accepted PMTU. In that case, if the new local MTU is smaller
      than the current PMTU, let PMTU discovery figure out if locking of the
      exception is still needed.
      
      To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
      notifier. By the time the notifier is called, dev->mtu has been
      changed. This patch adds the old MTU as additional information in the
      notifier structure, and a new call_netdevice_notifiers_u32() function.
      
      Fixes: 5aad1de5 ("ipv4: use separate genid for next hop exceptions")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      af7d6cce
  6. 08 Oct, 2018 1 commit
    • David Ahern's avatar
      rtnetlink: Update fib dumps for strict data checking · e8ba330a
      David Ahern authored
      Add helper to check netlink message for route dumps. If the strict flag
      is set the dump request is expected to have an rtmsg struct as the header.
      All elements of the struct are expected to be 0 with the exception of
      rtm_flags (which is used by both ipv4 and ipv6 dumps) and no attributes
      can be appended. rtm_flags can only have RTM_F_CLONED and RTM_F_PREFIX
      set.
      
      Update inet_dump_fib, inet6_dump_fib, mpls_dump_routes, ipmr_rtm_dumproute,
      and ip6mr_rtm_dumproute to call this helper if strict data checking is
      enabled.
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e8ba330a
  7. 22 Sep, 2018 1 commit
  8. 21 Sep, 2018 1 commit
  9. 29 Jul, 2018 1 commit
  10. 08 Jul, 2018 1 commit
  11. 12 Jun, 2018 1 commit
    • Kees Cook's avatar
      treewide: kzalloc() -> kcalloc() · 6396bb22
      Kees Cook authored
      The kzalloc() function has a 2-factor argument form, kcalloc(). This
      patch replaces cases of:
      
              kzalloc(a * b, gfp)
      
      with:
              kcalloc(a * b, gfp)
      
      as well as handling cases of:
      
              kzalloc(a * b * c, gfp)
      
      with:
      
              kzalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kzalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kzalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kzalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kzalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kzalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kzalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kzalloc
      + kcalloc
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kzalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kzalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kzalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kzalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kzalloc(sizeof(THING) * C2, ...)
      |
        kzalloc(sizeof(TYPE) * C2, ...)
      |
        kzalloc(C1 * C2 * C3, ...)
      |
        kzalloc(C1 * C2, ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kzalloc
      + kcalloc
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      6396bb22
  12. 29 May, 2018 1 commit
  13. 25 May, 2018 1 commit
  14. 23 May, 2018 2 commits
  15. 17 May, 2018 1 commit
  16. 27 Mar, 2018 1 commit
  17. 13 Feb, 2018 1 commit
    • Kirill Tkhai's avatar
      net: Convert pernet_subsys, registered from inet_init() · f84c6821
      Kirill Tkhai authored
      arp_net_ops just addr/removes /proc entry.
      
      devinet_ops allocates and frees duplicate of init_net tables
      and (un)registers sysctl entries.
      
      fib_net_ops allocates and frees pernet tables, creates/destroys
      netlink socket and (un)initializes /proc entries. Foreign
      pernet_operations do not touch them.
      
      ip_rt_proc_ops only modifies pernet /proc entries.
      
      xfrm_net_ops creates/destroys /proc entries, allocates/frees
      pernet statistics, hashes and tables, and (un)initializes
      sysctl files. These are not touched by foreigh pernet_operations
      
      xfrm4_net_ops allocates/frees private pernet memory, and
      configures sysctls.
      
      sysctl_route_ops creates/destroys sysctls.
      
      rt_genid_ops only initializes fields of just allocated net.
      
      ipv4_inetpeer_ops allocated/frees net private memory.
      
      igmp_net_ops just creates/destroys /proc files and socket,
      noone else interested in.
      
      tcp_sk_ops seems to be safe, because tcp_sk_init() does not
      depend on any other pernet_operations modifications. Iteration
      over hash table in inet_twsk_purge() is made under RCU lock,
      and it's safe to iterate the table this way. Removing from
      the table happen from inet_twsk_deschedule_put(), but this
      function is safe without any extern locks, as it's synchronized
      inside itself. There are many examples, it's used in different
      context. So, it's safe to leave tcp_sk_exit_batch() unlocked.
      
      tcp_net_metrics_ops is synchronized on tcp_metrics_lock and safe.
      
      udplite4_net_ops only creates/destroys pernet /proc file.
      
      icmp_sk_ops creates percpu sockets, not touched by foreign
      pernet_operations.
      
      ipmr_net_ops creates/destroys pernet fib tables, (un)registers
      fib rules and /proc files. This seem to be safe to execute
      in parallel with foreign pernet_operations.
      
      af_inet_ops just sets up default parameters of newly created net.
      
      ipv4_mib_ops creates and destroys pernet percpu statistics.
      
      raw_net_ops, tcp4_net_ops, udp4_net_ops, ping_v4_net_ops
      and ip_proc_ops only create/destroy pernet /proc files.
      
      ip4_frags_ops creates and destroys sysctl file.
      
      So, it's safe to make the pernet_operations async.
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Acked-by: default avatarAndrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f84c6821
  18. 25 Jan, 2018 1 commit
  19. 20 Dec, 2017 1 commit
  20. 01 Nov, 2017 1 commit
  21. 21 Sep, 2017 1 commit
    • Paolo Abeni's avatar
      net: avoid a full fib lookup when rp_filter is disabled. · 6e617de8
      Paolo Abeni authored
      Since commit 1dced6a8 ("ipv4: Restore accept_local behaviour
      in fib_validate_source()") a full fib lookup is needed even if
      the rp_filter is disabled, if accept_local is false - which is
      the default.
      
      What we really need in the above scenario is just checking
      that the source IP address is not local, and in most case we
      can do that is a cheaper way looking up the ifaddr hash table.
      
      This commit adds a helper for such lookup, and uses it to
      validate the src address when rp_filter is disabled and no
      'local' routes are created by the user space in the relevant
      namespace.
      
      A new ipv4 netns flag is added to account for such routes.
      We need that to preserve the same behavior we had before this
      patch.
      
      It also drops the checks to bail early from __fib_validate_source,
      added by the commit 1dced6a8 ("ipv4: Restore accept_local
      behaviour in fib_validate_source()") they do not give any
      measurable performance improvement: if we do the lookup with are
      on a slower path.
      
      This improves UDP performances for unconnected sockets
      when rp_filter is disabled by 5% and also gives small but
      measurable performance improvement for TCP flood scenarios.
      
      v1 -> v2:
       - use the ifaddr lookup helper in __ip_dev_find(), as suggested
         by Eric
       - fall-back to full lookup if custom local routes are present
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6e617de8
  22. 09 Aug, 2017 1 commit
  23. 03 Aug, 2017 1 commit
    • Ido Schimmel's avatar
      net: core: Make the FIB notification chain generic · 04b1d4e5
      Ido Schimmel authored
      The FIB notification chain is currently soley used by IPv4 code.
      However, we're going to introduce IPv6 FIB offload support, which
      requires these notification as well.
      
      As explained in commit c3852ef7 ("ipv4: fib: Replay events when
      registering FIB notifier"), upon registration to the chain, the callee
      receives a full dump of the FIB tables and rules by traversing all the
      net namespaces. The integrity of the dump is ensured by a per-namespace
      sequence counter that is incremented whenever a change to the tables or
      rules occurs.
      
      In order to allow more address families to use the chain, each family is
      expected to register its fib_notifier_ops in its pernet init. These
      operations allow the common code to read the family's sequence counter
      as well as dump its tables and rules in the given net namespace.
      
      Additionally, a 'family' parameter is added to sent notifications, so
      that listeners could distinguish between the different families.
      
      Implement the common code that allows listeners to register to the chain
      and for address families to register their fib_notifier_ops. Subsequent
      patches will implement these operations in IPv6.
      
      In the future, ipmr and ip6mr will be extended to provide these
      notifications as well.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      04b1d4e5
  24. 20 Jul, 2017 1 commit
    • Mahesh Bandewar's avatar
      ipv4: initialize fib_trie prior to register_netdev_notifier call. · 8799a221
      Mahesh Bandewar authored
      Net stack initialization currently initializes fib-trie after the
      first call to netdevice_notifier() call. In fact fib_trie initialization
      needs to happen before first rtnl_register(). It does not cause any problem
      since there are no devices UP at this moment, but trying to bring 'lo'
      UP at initialization would make this assumption wrong and exposes the issue.
      
      Fixes following crash
      
       Call Trace:
        ? alternate_node_alloc+0x76/0xa0
        fib_table_insert+0x1b7/0x4b0
        fib_magic.isra.17+0xea/0x120
        fib_add_ifaddr+0x7b/0x190
        fib_netdev_event+0xc0/0x130
        register_netdevice_notifier+0x1c1/0x1d0
        ip_fib_init+0x72/0x85
        ip_rt_init+0x187/0x1e9
        ip_init+0xe/0x1a
        inet_init+0x171/0x26c
        ? ipv4_offload_init+0x66/0x66
        do_one_initcall+0x43/0x160
        kernel_init_freeable+0x191/0x219
        ? rest_init+0x80/0x80
        kernel_init+0xe/0x150
        ret_from_fork+0x22/0x30
       Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
       RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
       CR2: 0000000000000014
      
      Fixes: 7b1a74fd ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
      Fixes: 7f9b8052 ("[IPV4]: fib hash|trie initialization")
      Signed-off-by: default avatarMahesh Bandewar <maheshb@google.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8799a221
  25. 05 Jul, 2017 1 commit
  26. 04 Jul, 2017 1 commit
  27. 30 May, 2017 2 commits
  28. 22 May, 2017 2 commits
  29. 16 May, 2017 1 commit
    • David Ahern's avatar
      net: Improve handling of failures on link and route dumps · f6c5775f
      David Ahern authored
      In general, rtnetlink dumps do not anticipate failure to dump a single
      object (e.g., link or route) on a single pass. As both route and link
      objects have grown via more attributes, that is no longer a given.
      
      netlink dumps can handle a failure if the dump function returns an
      error; specifically, netlink_dump adds the return code to the response
      if it is <= 0 so userspace is notified of the failure. The missing
      piece is the rtnetlink dump functions returning the error.
      
      Fix route and link dump functions to return the errors if no object is
      added to an skb (detected by skb->len != 0). IPv6 route dumps
      (rt6_dump_route) already return the error; this patch updates IPv4 and
      link dumps. Other dump functions may need to be ajusted as well.
      Reported-by: default avatarJan Moskyto Matejka <mq@ucw.cz>
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f6c5775f
  30. 27 Apr, 2017 1 commit
  31. 17 Apr, 2017 1 commit
  32. 13 Apr, 2017 1 commit
  33. 22 Mar, 2017 1 commit
  34. 01 Mar, 2017 1 commit