Commit 93d085d

Merge branch 'end-of-ip-csum'
Tom Herbert says:

====================
net: The beginning of the end for NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM

Background:

This patch set starts to address one front in the battle against protocol ossification. Protocol ossification describes the state that we have arrived at in the evolution of the Internet where we are materially limited to only using a very narrow range of protocols and protocol features. For instance, only TCP and UDP are sufficiently supported on the Internet, so deploying alternative protocols such as SCTP and DCCP is a non-starter. Similarly, IP options and IPv6 extension headers are typically not considered feasible for wide deployment, so we have lost the extensibility of IP protocols.

Protocol ossification is not only a problem on the Internet, but in the data center as well. A root cause of this seems to be narrow, protocol specific optimizations implemented in switches (for doing ECMP) and in NICs (NIC offloads). These tend to be performance optimizations around TCP and UDP packets, and they have become requirements for implementing performant network solutions at scale.

Attempts to deal with protocol ossification in the data center have yielded ad hoc, sub-optimal solutions. A main driver of foo-over-UDP (e.g. GRE/UDP, MPLS/UDP) is to leverage the existing ECMP and RSS support for UDP by setting the source port as an entropy value. This has seen some success, but the cost of additional overhead and layering limits its usefulness. An even more extreme solution is STT, where non-TCP packets are spoofed as TCP to leverage NIC offloads.

This patch set endeavours to address protocol ossification caused by techniques used in transmit checksum offload for NICs. Future work will address protocol ossification in the other primary NIC offloads, namely receive checksum offload, LSO, LRO, and RSS.

NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM:

NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM exemplify the problem of protocol ossification.
These features are relics from a simpler time in the Internet, before encapsulation, before GRE and IPIP. Many hardware vendors only saw the need to provide checksum offload for simple UDP and TCP packets over IPv4 (IPv6 support was also an afterthought). In today's Internet and data centers, checksum offload is well established as a valuable feature, but we can no longer afford to be constrained to a handful of protocols and features that are supported at the discretion of NIC vendors. Generic and protocol agnostic methods are needed.

The actual interface that the stack uses with drivers for checksum offload is CHECKSUM_PARTIAL. This is a generic and protocol agnostic interface. A driver for a device that supports this generic interface advertises NETIF_F_HW_CSUM.

Goals of this patch set:

We propose that drivers advertise NETIF_F_HW_CSUM instead of the protocol specific values NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM. If the driver's device is constrained (for instance, it can only offload simple IPv4 and IPv6 packets), then these constraints can be checked in the transmit path and skb_checksum_help would be called for packets that the driver is unable to offload. In order to facilitate this, we add some helper functions that take a specification argument indicating the type of packets a device is able to offload. If a packet does not match the specification, the helper function calls skb_checksum_help.

Benefits of this approach are:
- Simplify the stack and clarify the interface for checksum offload
- Encourage NIC vendors to implement the generic, protocol agnostic checksum offload methods in hardware
- Encourage feature parity in NIC offloads for IPv4 and IPv6

Many drivers advertise NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM and it probably isn't feasible to convert them all in a given time frame (although if we could, this would be a great simplification to the stack).
A reasonable direction may be to declare that new drivers must use NETIF_F_HW_CSUM, as NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are considered deprecated.

There is a class of drivers that should now be converted to advertise NETIF_F_HW_CSUM, namely those that support offload of encapsulated checksums. These drivers have to date been using skb->encapsulation to infer that checksum offload is being performed for an encapsulated checksum. This is not strictly correct. skb->encapsulation indicates that the inner headers are valid in the skbuff, whereas the stack indicates checksum offload arguments exclusively in csum_start and csum_offset. At some point we may want to set the inner headers for an skbuff but offload the outer transport checksum, so this needs to be fixed.

In this patch set:
- Rename some of the constants involved in checksum offload to be more reflective of their function
- Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUM entirely as unnecessary convolutions
- Fix conditions in tcp_sendpage and tcp_sendmsg to take IP protocol into account when determining if checksum offload can be done
- Add driver helper functions for determining if a checksum can be offloaded to a device. If not, the helper function can call skb_checksum_help
- Document the checksum offload interface between the stack and drivers with detail and specifics

Testing:

Have been testing ixgbe and mlx4. No noticeable regressions seen yet.
====================

Signed-off-by: David S. Miller <[email protected]>
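
Editor's note: the helper idea described in the cover letter can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel code from this series: `struct pkt`, `struct csum_spec`, `pkt_checksum_help` and `pkt_csum_offload_chk` are hypothetical names invented for illustration, and the caller is assumed to guarantee that `csum_start + csum_offset + 2 <= len`. It shows the two pieces the message talks about: the stack describes the work purely through `csum_start` and `csum_offset`, and a driver with a constrained device compares the packet against a specification and falls back to a software checksum when the packet does not match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the sk_buff fields that matter here. */
struct pkt {
	uint8_t *data;
	size_t len;
	size_t csum_start;	/* where summing begins, as an offset into data */
	size_t csum_offset;	/* where the result goes, relative to csum_start */
	uint8_t ip_version;	/* 4 or 6 */
	uint8_t l4_proto;	/* 6 = TCP, 17 = UDP */
	bool csum_partial;	/* models ip_summed == CHECKSUM_PARTIAL */
};

/* Hypothetical description of what a constrained device can offload. */
struct csum_spec {
	bool ipv4_okay;
	bool ipv6_okay;
	bool tcp_okay;
	bool udp_okay;
};

/* 16-bit one's complement checksum over n bytes (fine for packet-sized n). */
static uint16_t csum_range(const uint8_t *p, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < n; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < n)
		sum += (uint32_t)p[i] << 8;	/* pad an odd trailing byte */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Software fallback: sum from csum_start to the end, then store the result. */
static void pkt_checksum_help(struct pkt *p)
{
	uint8_t *field = p->data + p->csum_start + p->csum_offset;
	uint16_t c = csum_range(p->data + p->csum_start,
				p->len - p->csum_start);

	field[0] = (uint8_t)(c >> 8);
	field[1] = (uint8_t)(c & 0xff);
	p->csum_partial = false;
}

/*
 * Returns true if the device may offload this packet. Otherwise the
 * checksum is finished in software (if one was pending) and false is
 * returned.
 */
static bool pkt_csum_offload_chk(struct pkt *p, const struct csum_spec *spec)
{
	bool l3_ok, l4_ok;

	if (!p->csum_partial)
		return false;	/* nothing to offload */

	l3_ok = (p->ip_version == 4 && spec->ipv4_okay) ||
		(p->ip_version == 6 && spec->ipv6_okay);
	l4_ok = (p->l4_proto == 6 && spec->tcp_okay) ||
		(p->l4_proto == 17 && spec->udp_okay);
	if (l3_ok && l4_ok)
		return true;

	pkt_checksum_help(p);
	return false;
}
```

In this model a driver's transmit path would call `pkt_csum_offload_chk()` once per packet and only program the hardware checksum engine when it returns true. Note that the decision never consults an "encapsulation" flag; only the protocol fields and the `csum_start`/`csum_offset` pair are used, which is the separation the cover letter argues for.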
2 parents b4bc88a + 7a6ae71 commit 93d085d

41 files changed: +437 -114 lines changed

drivers/net/bonding/bond_main.c

Lines changed: 3 additions & 4 deletions
@@ -1067,12 +1067,12 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
 	return features;
 }
 
-#define BOND_VLAN_FEATURES (NETIF_F_ALL_CSUM | NETIF_F_SG | \
+#define BOND_VLAN_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
 				 NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
 				 NETIF_F_HIGHDMA | NETIF_F_LRO)
 
-#define BOND_ENC_FEATURES (NETIF_F_ALL_CSUM | NETIF_F_SG | NETIF_F_RXCSUM |\
-				 NETIF_F_ALL_TSO)
+#define BOND_ENC_FEATURES (NETIF_F_HW_CSUM | NETIF_F_SG | \
+				 NETIF_F_RXCSUM | NETIF_F_ALL_TSO)
 
 static void bond_compute_features(struct bonding *bond)
 {
@@ -4182,7 +4182,6 @@ void bond_setup(struct net_device *bond_dev)
 				NETIF_F_HW_VLAN_CTAG_RX |
 				NETIF_F_HW_VLAN_CTAG_FILTER;
 
-	bond_dev->hw_features &= ~(NETIF_F_ALL_CSUM & ~NETIF_F_HW_CSUM);
 	bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
 	bond_dev->features |= bond_dev->hw_features;
 }

drivers/net/ethernet/emulex/benet/be_main.c

Lines changed: 1 addition & 1 deletion
@@ -5289,7 +5289,7 @@ static netdev_features_t be_features_check(struct sk_buff *skb,
 	    skb->inner_protocol != htons(ETH_P_TEB) ||
 	    skb_inner_mac_header(skb) - skb_transport_header(skb) !=
 	    sizeof(struct udphdr) + sizeof(struct vxlanhdr))
-		return features & ~(NETIF_F_ALL_CSUM | NETIF_F_GSO_MASK);
+		return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
 
 	return features;
 }
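
Editor's note: most hunks in this commit swap NETIF_F_ALL_CSUM for NETIF_F_CSUM_MASK in expressions that clear feature bits. The userspace sketch below illustrates that pattern. The bit positions and the `F_*` names are hypothetical, and the definition of the mask as the union of the three transmit checksum flags is an assumption about the kernel headers rather than something shown in this excerpt. The helper `demo_features_check` is likewise invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t features_t;	/* stands in for netdev_features_t */

/* Hypothetical bit assignments, for illustration only. */
#define F_IP_CSUM	(1ULL << 0)
#define F_HW_CSUM	(1ULL << 1)
#define F_IPV6_CSUM	(1ULL << 2)
#define F_SG		(1ULL << 3)
#define F_TSO		(1ULL << 4)

/* Assumed shape of the mask: every transmit checksum offload flag. */
#define F_CSUM_MASK	(F_IP_CSUM | F_IPV6_CSUM | F_HW_CSUM)
#define F_GSO_MASK	(F_TSO)

/*
 * Per-packet feature check in the style of the *_features_check()
 * callbacks above: if the tunnel header is longer than the device can
 * parse, strip checksum and segmentation offload for this packet only.
 */
static features_t demo_features_check(features_t features,
				      size_t tunnel_hdr_len,
				      size_t max_tunnel_hdr_len)
{
	if (tunnel_hdr_len > max_tunnel_hdr_len)
		return features & ~(F_CSUM_MASK | F_GSO_MASK);

	return features;
}
```

Read this way, the rename separates two roles that the old name blurred: a mask used to clear or test every checksum flag at once, and NETIF_F_HW_CSUM as the single capability a device advertises.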

drivers/net/ethernet/ibm/ibmveth.c

Lines changed: 3 additions & 2 deletions
@@ -763,7 +763,7 @@ static netdev_features_t ibmveth_fix_features(struct net_device *dev,
 	 */
 
 	if (!(features & NETIF_F_RXCSUM))
-		features &= ~NETIF_F_ALL_CSUM;
+		features &= ~NETIF_F_CSUM_MASK;
 
 	return features;
 }
@@ -928,7 +928,8 @@ static int ibmveth_set_features(struct net_device *dev,
 		rc1 = ibmveth_set_csum_offload(dev, rx_csum);
 		if (rc1 && !adapter->rx_csum)
 			dev->features =
-				features & ~(NETIF_F_ALL_CSUM | NETIF_F_RXCSUM);
+				features & ~(NETIF_F_CSUM_MASK |
+					     NETIF_F_RXCSUM);
 	}
 
 	if (large_send != adapter->large_send) {

drivers/net/ethernet/intel/fm10k/fm10k_netdev.c

Lines changed: 1 addition & 1 deletion
@@ -1357,7 +1357,7 @@ static netdev_features_t fm10k_features_check(struct sk_buff *skb,
 	if (!skb->encapsulation || fm10k_tx_encap_offload(skb))
 		return features;
 
-	return features & ~(NETIF_F_ALL_CSUM | NETIF_F_GSO_MASK);
+	return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
 }
 
 static const struct net_device_ops fm10k_netdev_ops = {

drivers/net/ethernet/intel/i40e/i40e_main.c

Lines changed: 2 additions & 2 deletions
@@ -8766,7 +8766,7 @@ static netdev_features_t i40e_features_check(struct sk_buff *skb,
 	if (skb->encapsulation &&
 	    (skb_inner_mac_header(skb) - skb_transport_header(skb) >
 	     I40E_MAX_TUNNEL_HDR_LEN))
-		return features & ~(NETIF_F_ALL_CSUM | NETIF_F_GSO_MASK);
+		return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
 
 	return features;
 }
@@ -8842,7 +8842,7 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 
 	netdev->features = NETIF_F_SG |
 			   NETIF_F_IP_CSUM |
-			   NETIF_F_SCTP_CSUM |
+			   NETIF_F_SCTP_CRC |
 			   NETIF_F_HIGHDMA |
 			   NETIF_F_GSO_UDP_TUNNEL |
 			   NETIF_F_GSO_GRE |

drivers/net/ethernet/intel/i40evf/i40evf_main.c

Lines changed: 1 addition & 1 deletion
@@ -2321,7 +2321,7 @@ int i40evf_process_config(struct i40evf_adapter *adapter)
 	netdev->features |= NETIF_F_HIGHDMA |
 			    NETIF_F_SG |
 			    NETIF_F_IP_CSUM |
-			    NETIF_F_SCTP_CSUM |
+			    NETIF_F_SCTP_CRC |
 			    NETIF_F_IPV6_CSUM |
 			    NETIF_F_TSO |
 			    NETIF_F_TSO6 |

drivers/net/ethernet/intel/igb/igb_main.c

Lines changed: 2 additions & 2 deletions
@@ -2379,8 +2379,8 @@ static int igb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 
 	if (hw->mac.type >= e1000_82576) {
-		netdev->hw_features |= NETIF_F_SCTP_CSUM;
-		netdev->features |= NETIF_F_SCTP_CSUM;
+		netdev->hw_features |= NETIF_F_SCTP_CRC;
+		netdev->features |= NETIF_F_SCTP_CRC;
 	}
 
 	netdev->priv_flags |= IFF_UNICAST_FLT;
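
Editor's note: the rename from NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC fits the cover letter's goal of making constant names reflect their function. SCTP protects packets with CRC32c rather than the 16-bit one's complement Internet checksum used by TCP and UDP, so a device that offloads one does not thereby offload the other. The bitwise routine below is a minimal reference sketch of CRC32c (reflected polynomial 0x82F63B78), written for clarity rather than speed. It is not taken from the kernel, and the function name `crc32c_bitwise` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-at-a-time CRC32c (Castagnoli): initial value 0xFFFFFFFF,
 * reflected polynomial 0x82F63B78, final bitwise inversion.
 */
static uint32_t crc32c_bitwise(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++) {
			/* mask is all ones when the low bit is set */
			uint32_t mask = 0u - (crc & 1u);

			crc = (crc >> 1) ^ (0x82F63B78u & mask);
		}
	}
	return ~crc;
}
```

Unlike the Internet checksum, this value cannot be built up by adding partial sums in arbitrary order, which is one reason hardware treats it as a separate offload capability.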

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c

Lines changed: 3 additions & 3 deletions
@@ -8598,7 +8598,7 @@ ixgbe_features_check(struct sk_buff *skb, struct net_device *dev,
 
 	if (unlikely(skb_inner_mac_header(skb) - skb_transport_header(skb) >
 		     IXGBE_MAX_TUNNEL_HDR_LEN))
-		return features & ~NETIF_F_ALL_CSUM;
+		return features & ~NETIF_F_CSUM_MASK;
 
 	return features;
 }
@@ -8995,8 +8995,8 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	case ixgbe_mac_X540:
 	case ixgbe_mac_X550:
 	case ixgbe_mac_X550EM_x:
-		netdev->features |= NETIF_F_SCTP_CSUM;
-		netdev->hw_features |= NETIF_F_SCTP_CSUM |
+		netdev->features |= NETIF_F_SCTP_CRC;
+		netdev->hw_features |= NETIF_F_SCTP_CRC |
 				       NETIF_F_NTUPLE;
 		break;
 	default:

drivers/net/ethernet/jme.c

Lines changed: 1 addition & 1 deletion
@@ -2753,7 +2753,7 @@ static netdev_features_t
 jme_fix_features(struct net_device *netdev, netdev_features_t features)
 {
 	if (netdev->mtu > 1900)
-		features &= ~(NETIF_F_ALL_TSO | NETIF_F_ALL_CSUM);
+		features &= ~(NETIF_F_ALL_TSO | NETIF_F_CSUM_MASK);
 	return features;
 }
 

drivers/net/ethernet/marvell/sky2.c

Lines changed: 1 addition & 1 deletion
@@ -4380,7 +4380,7 @@ static netdev_features_t sky2_fix_features(struct net_device *dev,
 	 */
 	if (dev->mtu > ETH_DATA_LEN && hw->chip_id == CHIP_ID_YUKON_EC_U) {
 		netdev_info(dev, "checksum offload not possible with jumbo frames\n");
-		features &= ~(NETIF_F_TSO|NETIF_F_SG|NETIF_F_ALL_CSUM);
+		features &= ~(NETIF_F_TSO | NETIF_F_SG | NETIF_F_CSUM_MASK);
 	}
 
 	/* Some hardware requires receive checksum for RSS to work. */
