Commit 80f2321

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
 "Highlights:

   1) Support AES128-CCM ciphers in kTLS, from Vakul Garg.
   2) Add fib_sync_mem to control the amount of dirty memory we allow to
      queue up between synchronize RCU calls, from David Ahern.
   3) Make flow classifier more lockless, from Vlad Buslov.
   4) Add PHY downshift support to aquantia driver, from Heiner Kallweit.
   5) Add SKB cache for TCP rx and tx, from Eric Dumazet. This reduces
      contention on SLAB spinlocks in heavy RPC workloads.
   6) Partial GSO offload support in XFRM, from Boris Pismenny.
   7) Add fast link down support to ethtool, from Heiner Kallweit.
   8) Use siphash for IP ID generator, from Eric Dumazet.
   9) Pull nexthops even further out from ipv4/ipv6 routes and FIB
      entries, from David Ahern.
  10) Move skb->xmit_more into a per-cpu variable, from Florian Westphal.
  11) Improve eBPF verifier speed and increase maximum program size, from
      Alexei Starovoitov.
  12) Eliminate per-bucket spinlocks in rhashtable, and instead use bit
      spinlocks. From Neil Brown.
  13) Allow tunneling with GUE encap in ipvs, from Jacky Hu.
  14) Improve link partner cap detection in generic PHY code, from
      Heiner Kallweit.
  15) Add layer 2 encap support to bpf_skb_adjust_room(), from Alan
      Maguire.
  16) Remove SKB list implementation assumptions in SCTP, yours truly.
  17) Various cleanups, optimizations, and simplifications in r8169
      driver. From Heiner Kallweit.
  18) Add memory accounting on TX and RX path of SCTP, from Xin Long.
  19) Switch PHY drivers over to use dynamic feature detection, from
      Heiner Kallweit.
  20) Support flow steering without masking in dpaa2-eth, from Ioana
      Ciocoi.
  21) Implement ndo_get_devlink_port in netdevsim driver, from Jiri
      Pirko.
  22) Increase the strict parsing of current and future netlink
      attributes, also export such policies to userspace. From Johannes
      Berg.
  23) Allow DSA tag drivers to be modular, from Andrew Lunn.
  24) Remove legacy DSA probing support, also from Andrew Lunn.
  25) Allow ll_temac driver to be used on non-x86 platforms, from Esben
      Haabendal.
  26) Add a generic tracepoint for TX queue timeouts to ease debugging,
      from Cong Wang.
  27) More indirect call optimizations, from Paolo Abeni"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1763 commits)
  cxgb4: Fix error path in cxgb4_init_module
  net: phy: improve pause mode reporting in phy_print_status
  dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings
  net: macb: Change interrupt and napi enable order in open
  net: ll_temac: Improve error message on error IRQ
  net/sched: remove block pointer from common offload structure
  net: ethernet: support of_get_mac_address new ERR_PTR error
  net: usb: smsc: fix warning reported by kbuild test robot
  staging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check
  net: dsa: support of_get_mac_address new ERR_PTR error
  net: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats
  vrf: sit mtu should not be updated when vrf netdev is the link
  net: dsa: Fix error cleanup path in dsa_init_module
  l2tp: Fix possible NULL pointer dereference
  taprio: add null check on sched_nest to avoid potential null pointer dereference
  net: mvpp2: cls: fix less than zero check on a u32 variable
  net_sched: sch_fq: handle non connected flows
  net_sched: sch_fq: do not assume EDT packets are ordered
  net: hns3: use devm_kcalloc when allocating desc_cb
  net: hns3: some cleanup for struct hns3_enet_ring
  ...

2 parents 82efe43 + a9e41a5 commit 80f2321

File tree

1,636 files changed

+126811
-26978
lines changed


.clang-format

Lines changed: 4 additions & 4 deletions
@@ -387,14 +387,14 @@ ForEachMacros:
   - 'rhl_for_each_entry_rcu'
   - 'rhl_for_each_rcu'
   - 'rht_for_each'
-  - 'rht_for_each_continue'
+  - 'rht_for_each_from'
   - 'rht_for_each_entry'
-  - 'rht_for_each_entry_continue'
+  - 'rht_for_each_entry_from'
   - 'rht_for_each_entry_rcu'
-  - 'rht_for_each_entry_rcu_continue'
+  - 'rht_for_each_entry_rcu_from'
   - 'rht_for_each_entry_safe'
   - 'rht_for_each_rcu'
-  - 'rht_for_each_rcu_continue'
+  - 'rht_for_each_rcu_from'
   - '__rq_for_each_bio'
   - 'rq_for_each_bvec'
   - 'rq_for_each_segment'

.mailmap

Lines changed: 9 additions & 0 deletions
@@ -16,6 +16,9 @@ Alan Cox <[email protected]>
1616
1717
Aleksey Gorelov <[email protected]>
1818
Aleksandar Markovic <[email protected]> <[email protected]>
19+
Alexei Starovoitov <[email protected]> <[email protected]>
20+
Alexei Starovoitov <[email protected]> <[email protected]>
21+
Alexei Starovoitov <[email protected]> <[email protected]>
1922
2023
2124
@@ -46,6 +49,12 @@ Christoph Hellwig <[email protected]>
4649
Christophe Ricard <[email protected]>
4750
Corey Minyard <[email protected]>
4851
Damian Hobson-Garcia <[email protected]>
52+
53+
54+
55+
56+
57+
4958
David Brownell <[email protected]>
5059
David Woodhouse <[email protected]>
5160

Documentation/ABI/testing/sysfs-class-net-batman-adv renamed to Documentation/ABI/obsolete/sysfs-class-net-batman-adv

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+This ABI is deprecated and will be removed after 2021. It is
+replaced with the batadv generic netlink family.
 
 What: /sys/class/net/<iface>/batman-adv/elp_interval
 Date: Feb 2014

Documentation/ABI/testing/sysfs-class-net-mesh renamed to Documentation/ABI/obsolete/sysfs-class-net-mesh

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+This ABI is deprecated and will be removed after 2021. It is
+replaced with the batadv generic netlink family.
 
 What: /sys/class/net/<mesh_iface>/mesh/aggregated_ogms
 Date: May 2010

Documentation/bpf/bpf_design_QA.rst

Lines changed: 27 additions & 2 deletions
@@ -85,8 +85,33 @@ Q: Can loops be supported in a safe way?
 A: It's not clear yet.
 
 BPF developers are trying to find a way to
-support bounded loops where the verifier can guarantee that
-the program terminates in less than 4096 instructions.
+support bounded loops.
+
+Q: What are the verifier limits?
+--------------------------------
+A: The only limit known to user space is BPF_MAXINSNS (4096).
+It's the maximum number of instructions that an unprivileged bpf
+program can have. The verifier has various internal limits,
+such as the maximum number of instructions that can be explored during
+program analysis. Currently, that limit is set to 1 million,
+which essentially means that the largest program can consist
+of 1 million NOP instructions. There is a limit to the maximum number
+of subsequent branches, a limit to the number of nested bpf-to-bpf
+calls, a limit to the number of the verifier states per instruction,
+a limit to the number of maps used by the program.
+All these limits can be hit with a sufficiently complex program.
+There are also non-numerical limits that can cause the program
+to be rejected. The verifier used to recognize only pointer + constant
+expressions. Now it can recognize pointer + bounded_register.
+bpf_map_lookup_elem(key) had a requirement that 'key' must be
+a pointer to the stack. Now, 'key' can be a pointer to map value.
+The verifier is steadily getting 'smarter'. The limits are
+being removed. The only way to know that the program is going to
+be accepted by the verifier is to try to load it.
+The bpf development process guarantees that the future kernel
+versions will accept all bpf programs that were accepted by
+the earlier versions.
+
 
 Instruction level questions
 ---------------------------
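
The one limit the Q&A text calls visible to user space, BPF_MAXINSNS (4096), can be checked before attempting a load. The following is a minimal sketch in C, not kernel or libbpf code: the ``EX_`` constants and the ``ex_fits_unprivileged_limit()`` helper are hypothetical names, and the 8-byte instruction size is an assumption based on ``sizeof(struct bpf_insn)``. Passing this check says nothing about the verifier's internal limits; as the text notes, only an actual load attempt can confirm acceptance.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local copy of the limit quoted in the Q&A text (hypothetical name). */
#define EX_BPF_MAXINSNS 4096u
/* Assumption: one BPF instruction occupies 8 bytes. */
#define EX_BPF_INSN_SIZE 8u

/*
 * Return true if a program image of prog_bytes bytes holds a whole,
 * non-zero number of instructions that does not exceed the
 * unprivileged instruction limit.
 */
static bool ex_fits_unprivileged_limit(size_t prog_bytes)
{
	if (prog_bytes == 0 || prog_bytes % EX_BPF_INSN_SIZE != 0)
		return false;
	return prog_bytes / EX_BPF_INSN_SIZE <= EX_BPF_MAXINSNS;
}
```

This is a necessary condition only: a program under the instruction limit can still be rejected for exceeding the branch, call-depth, state, or map limits listed in the hunk.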

Documentation/bpf/btf.rst

Lines changed: 57 additions & 0 deletions
@@ -82,6 +82,8 @@ sequentially and type id is assigned to each recognized type starting from id
     #define BTF_KIND_RESTRICT 11 /* Restrict */
     #define BTF_KIND_FUNC 12 /* Function */
     #define BTF_KIND_FUNC_PROTO 13 /* Function Proto */
+    #define BTF_KIND_VAR 14 /* Variable */
+    #define BTF_KIND_DATASEC 15 /* Section */
 
 Note that the type section encodes debug info, not just pure types.
 ``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.
@@ -393,6 +395,61 @@ refers to parameter type.
 If the function has variable arguments, the last parameter is encoded with
 ``name_off = 0`` and ``type = 0``.
 
+2.2.14 BTF_KIND_VAR
+~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid C identifier
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_VAR
+  * ``info.vlen``: 0
+  * ``type``: the type of the variable
+
+``btf_type`` is followed by a single ``struct btf_var`` with the
+following data::
+
+    struct btf_var {
+        __u32 linkage;
+    };
+
+``struct btf_var`` encoding:
+  * ``linkage``: currently only static variable 0, or globally allocated
+    variable in ELF sections 1
+
+Not all types of global variables are supported by LLVM at this point.
+The following is currently available:
+
+  * static variables with or without section attributes
+  * global variables with section attributes
+
+The latter is for future extraction of map key/value type id's from a
+map definition.
+
+2.2.15 BTF_KIND_DATASEC
+~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid name associated with a variable or
+    one of .data/.bss/.rodata
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_DATASEC
+  * ``info.vlen``: # of variables
+  * ``size``: total section size in bytes (0 at compilation time, patched
+    to actual size by BPF loaders such as libbpf)
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``::
+
+    struct btf_var_secinfo {
+        __u32 type;
+        __u32 offset;
+        __u32 size;
+    };
+
+``struct btf_var_secinfo`` encoding:
+  * ``type``: the type of the BTF_KIND_VAR variable
+  * ``offset``: the in-section offset of the variable
+  * ``size``: the size of the variable in bytes
+
 3. BTF Kernel API
 *****************
 
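
The encoding requirements for ``BTF_KIND_VAR`` and ``BTF_KIND_DATASEC`` in the hunk above can be illustrated with a small validity check. This is a sketch under stated assumptions, not the kernel's BTF verifier: all ``ex_`` and ``EX_`` names are hypothetical local mirrors of the structures in the text, and the bit layout of ``info`` (``vlen`` in bits 0-15, ``kind`` in bits 24-27, ``kind_flag`` in bit 31) is assumed rather than quoted from this hunk. The bounds check on each ``btf_var_secinfo`` is an extra sanity check added for illustration and only makes sense after a loader has patched ``size`` to the real section size.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_BTF_KIND_VAR     14
#define EX_BTF_KIND_DATASEC 15

/* Illustrative mirror of struct btf_type (hypothetical name). */
struct ex_btf_type {
	uint32_t name_off;
	uint32_t info;         /* assumed: vlen 0-15, kind 24-27, kind_flag 31 */
	uint32_t size_or_type; /* 'type' for VAR, 'size' for DATASEC */
};

struct ex_btf_var {
	uint32_t linkage;      /* 0: static, 1: global allocated in ELF section */
};

struct ex_btf_var_secinfo {
	uint32_t type;
	uint32_t offset;
	uint32_t size;
};

static uint32_t ex_info_kind(uint32_t info)  { return (info >> 24) & 0x0f; }
static uint32_t ex_info_vlen(uint32_t info)  { return info & 0xffff; }
static uint32_t ex_info_kflag(uint32_t info) { return info >> 31; }

/* VAR: kind_flag must be 0, vlen must be 0, linkage must be 0 or 1. */
static bool ex_var_is_valid(const struct ex_btf_type *t,
			    const struct ex_btf_var *v)
{
	return ex_info_kind(t->info) == EX_BTF_KIND_VAR &&
	       ex_info_kflag(t->info) == 0 &&
	       ex_info_vlen(t->info) == 0 &&
	       v->linkage <= 1;
}

/* DATASEC: kind_flag must be 0; vlen entries must each fit in the section. */
static bool ex_datasec_is_valid(const struct ex_btf_type *t,
				const struct ex_btf_var_secinfo *si)
{
	uint32_t i, n = ex_info_vlen(t->info);

	if (ex_info_kind(t->info) != EX_BTF_KIND_DATASEC ||
	    ex_info_kflag(t->info) != 0)
		return false;
	for (i = 0; i < n; i++) {
		/* 64-bit sum so offset + size cannot wrap around. */
		if ((uint64_t)si[i].offset + si[i].size > t->size_or_type)
			return false;
	}
	return true;
}
```

A caller would walk the type section, dispatch on ``ex_info_kind()``, and pass the trailing ``btf_var`` or ``btf_var_secinfo`` array that follows each record.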

Documentation/bpf/index.rst

Lines changed: 10 additions & 0 deletions
@@ -36,6 +36,16 @@ Two sets of Questions and Answers (Q&A) are maintained.
    bpf_devel_QA
 
 
+Program types
+=============
+
+.. toctree::
+   :maxdepth: 1
+
+   prog_cgroup_sysctl
+   prog_flow_dissector
+
+
 .. Links:
 .. _Documentation/networking/filter.txt: ../networking/filter.txt
 .. _man-pages: https://www.kernel.org/doc/man-pages/

Documentation/bpf/prog_cgroup_sysctl.rst

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+===========================
+BPF_PROG_TYPE_CGROUP_SYSCTL
+===========================
+
+This document describes the ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program type that
+provides a cgroup-bpf hook for sysctl.
+
+The hook has to be attached to a cgroup and will be called every time a
+process inside that cgroup tries to read from or write to a sysctl knob in proc.
+
+1. Attach type
+**************
+
+The ``BPF_CGROUP_SYSCTL`` attach type has to be used to attach a
+``BPF_PROG_TYPE_CGROUP_SYSCTL`` program to a cgroup.
+
+2. Context
+**********
+
+``BPF_PROG_TYPE_CGROUP_SYSCTL`` provides access to the following context from
+a BPF program::
+
+    struct bpf_sysctl {
+        __u32 write;
+        __u32 file_pos;
+    };
+
+* ``write`` indicates whether sysctl value is being read (``0``) or written
+  (``1``). This field is read-only.
+
+* ``file_pos`` indicates file position sysctl is being accessed at, read
+  or written. This field is read-write. Writing to the field sets the starting
+  position in sysctl proc file ``read(2)`` will be reading from or ``write(2)``
+  will be writing to. Writing zero to the field can be used e.g. to override
+  whole sysctl value by ``bpf_sysctl_set_new_value()`` on ``write(2)`` even
+  when it's called by user space on ``file_pos > 0``. Writing non-zero
+  value to the field can be used to access part of sysctl value starting from
+  specified ``file_pos``. Not all sysctls support access with ``file_pos !=
+  0``, e.g. writes to numeric sysctl entries must always be at file position
+  ``0``. See also ``kernel.sysctl_writes_strict`` sysctl.
+
+See `linux/bpf.h`_ for more details on how context fields can be accessed.
+
+3. Return code
+**************
+
+A ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program must return one of the following
+return codes:
+
+* ``0`` means "reject access to sysctl";
+* ``1`` means "proceed with access".
+
+If the program returns ``0``, user space will get ``-1`` from ``read(2)`` or
+``write(2)`` and ``errno`` will be set to ``EPERM``.
+
+4. Helpers
+**********
+
+Since a sysctl knob is represented by a name and a value, sysctl specific BPF
+helpers focus on providing access to these properties:
+
+* ``bpf_sysctl_get_name()`` to get sysctl name as it is visible in
+  ``/proc/sys`` into a buffer provided by the BPF program;
+
+* ``bpf_sysctl_get_current_value()`` to get string value currently held by
+  sysctl into a buffer provided by the BPF program. This helper is available
+  on both ``read(2)`` from and ``write(2)`` to sysctl;
+
+* ``bpf_sysctl_get_new_value()`` to get new string value currently being
+  written to sysctl before actual write happens. This helper can be used only
+  on ``ctx->write == 1``;
+
+* ``bpf_sysctl_set_new_value()`` to override new string value currently being
+  written to sysctl before actual write happens. Sysctl value will be
+  overridden starting from the current ``ctx->file_pos``. If the whole value
+  has to be overridden BPF program can set ``file_pos`` to zero before calling
+  the helper. This helper can be used only on ``ctx->write == 1``. New
+  string value set by the helper is treated and verified by kernel same way as
+  an equivalent string passed by user space.
+
+A BPF program sees sysctl value same way as user space does in proc filesystem,
+i.e. as a string. Since many sysctl values represent an integer or a vector
+of integers, the following helpers can be used to get numeric value from the
+string:
+
+* ``bpf_strtol()`` to convert initial part of the string to long integer
+  similar to user space `strtol(3)`_;
+* ``bpf_strtoul()`` to convert initial part of the string to unsigned long
+  integer similar to user space `strtoul(3)`_;
+
+See `linux/bpf.h`_ for more details on helpers described here.
+
+5. Examples
+***********
+
+See `test_sysctl_prog.c`_ for an example of BPF program in C that accesses
+sysctl name and value, parses string value to get vector of integers and uses
+the result to make decision whether to allow or deny access to sysctl.
+
+6. Notes
+********
+
+``BPF_PROG_TYPE_CGROUP_SYSCTL`` is intended to be used in **trusted** root
+environment, for example to monitor sysctl usage or catch unreasonable values
+an application, running as root in a separate cgroup, is trying to set.
+
+Since ``task_dfl_cgroup(current)`` is called at ``sys_read`` / ``sys_write``
+time it may return results different from that at ``sys_open`` time, i.e. the
+process that opened sysctl file in proc filesystem may differ from the process
+that is trying to read from / write to it, and two such processes may run in
+different cgroups, which means ``BPF_PROG_TYPE_CGROUP_SYSCTL`` should not be
+used as a security mechanism to limit sysctl usage.
+
+As with any cgroup-bpf program additional care should be taken if an
+application running as root in a cgroup should not be allowed to
+detach/replace BPF program attached by administrator.
+
+.. Links
+.. _linux/bpf.h: ../../include/uapi/linux/bpf.h
+.. _strtol(3): http://man7.org/linux/man-pages/man3/strtol.3p.html
+.. _strtoul(3): http://man7.org/linux/man-pages/man3/strtoul.3p.html
+.. _test_sysctl_prog.c:
+   ../../tools/testing/selftests/bpf/progs/test_sysctl_prog.c
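
The decision flow the new document describes (read the string value, parse it into a vector of integers, return ``1`` to proceed or ``0`` to reject) can be sketched as a user-space analogue. This is not a BPF program and does not use ``bpf_strtol()`` or the ``bpf_sysctl_*`` helpers; it uses plain `strtol(3)` so that it compiles and runs anywhere. The function names, the three-element limit, and the "every value must be at least ``min_allowed``" policy are all hypothetical choices made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse up to max whitespace-separated decimal integers from s into out.
 * Returns the number of values parsed, or -1 if the string contains
 * anything else (garbage, out-of-range numbers, or more than max values).
 */
static int ex_parse_int_vector(const char *s, long *out, size_t max)
{
	size_t n = 0;
	char *end;
	long v;

	while (n < max) {
		errno = 0;
		v = strtol(s, &end, 10);
		if (end == s)           /* nothing converted: stop scanning */
			break;
		if (errno == ERANGE)    /* value does not fit in a long */
			return -1;
		out[n++] = v;
		s = end;
	}
	/* Only trailing whitespace may remain after the parsed values. */
	while (*s == ' ' || *s == '\t' || *s == '\n')
		s++;
	return *s == '\0' ? (int)n : -1;
}

/*
 * Mirror of the return-code convention in the text:
 * 1 means "proceed with access", 0 means "reject access to sysctl".
 * Hypothetical policy: reads always proceed; writes proceed only if the
 * new value is a vector of 1..3 integers, each >= min_allowed.
 */
static int ex_sysctl_decide(int write, const char *new_value, long min_allowed)
{
	long v[3];
	int n, i;

	if (!write)
		return 1;
	n = ex_parse_int_vector(new_value, v, 3);
	if (n <= 0)
		return 0;
	for (i = 0; i < n; i++)
		if (v[i] < min_allowed)
			return 0;
	return 1;
}
```

In a real program of this type the string would come from ``bpf_sysctl_get_new_value()`` and the parsing from ``bpf_strtol()``; see the ``test_sysctl_prog.c`` selftest referenced in the document for the in-kernel form.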

Documentation/networking/bpf_flow_dissector.rst renamed to Documentation/bpf/prog_flow_dissector.rst

Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
 .. SPDX-License-Identifier: GPL-2.0
 
-==================
-BPF Flow Dissector
-==================
+============================
+BPF_PROG_TYPE_FLOW_DISSECTOR
+============================
 
 Overview
 ========

Documentation/devicetree/bindings/net/altera_tse.txt

Lines changed: 2 additions & 3 deletions
@@ -46,9 +46,8 @@ Required properties:
 - reg: phy id used to communicate to phy.
 - device_type: Must be "ethernet-phy".
 
-Optional properties:
-- local-mac-address: See ethernet.txt in the same directory.
-- max-frame-size: See ethernet.txt in the same directory.
+The MAC address will be determined using the optional properties defined in
+ethernet.txt.
 
 Example:
 
