Commit 5b62996

Author: Paolo Abeni
Merge branch 'netconsole-add-taskname-sysdata-support'
Breno Leitao says:

====================
netconsole: Add taskname sysdata support

This patchset introduces a new feature to the netconsole extradata
subsystem that enables the inclusion of the current task's name in the
sysdata output of netconsole messages.

This enhancement is particularly valuable for large-scale deployments,
such as Meta's, where netconsole collects messages from millions of
servers and stores them in a data warehouse for analysis. Engineers
often rely on these messages to investigate issues and assess kernel
health.

One common challenge we face is determining the context in which a
particular message was generated. By including the task name
(task->comm) with each message, this feature provides a direct answer to
the frequently asked question: "What was running when this message was
generated?"

This added context will significantly improve our ability to diagnose
and troubleshoot issues, making it easier to interpret output of
netconsole.

The patchset consists of seven patches that implement the following changes:

 * Refactor CPU number formatting into a separate function
 * Prefix CPU_NR sysdata feature with SYSDATA_
 * Patch to convert a bitwise operation into boolean
 * Add configfs controls for taskname sysdata feature
 * Add taskname to extradata entry count
 * Add support for including task name in netconsole's extra data output
 * Document the task name feature in Documentation/networking/netconsole.rst
 * Add test coverage for the task name feature to the existing sysdata
   selftest script

These changes allow users to enable or disable the task name feature via
configfs and provide additional context for kernel messages by showing
which task generated each console message.

I have tested these patches on some servers and they seem to work as
expected.
v1: https://lore.kernel.org/r/[email protected]

Signed-off-by: Breno Leitao <[email protected]>
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
2 parents 188fa9b + d7a2522 commit 5b62996

3 files changed: +153 -21 lines changed

Documentation/networking/netconsole.rst

Lines changed: 28 additions & 0 deletions
@@ -240,6 +240,34 @@ Delete `userdata` entries with `rmdir`::

 It is recommended to not write user data values with newlines.

+Task name auto population in userdata
+-------------------------------------
+
+Inside the netconsole configfs hierarchy, there is a file called
+`taskname_enabled` under the `userdata` directory. This file is used to enable
+or disable the automatic task name population feature. This feature
+automatically populates the current task name that is scheduled in the CPU
+sending the message.
+
+To enable task name auto-population::
+
+  echo 1 > /sys/kernel/config/netconsole/target1/userdata/taskname_enabled
+
+When this option is enabled, the netconsole messages will include an additional
+line in the userdata field with the format `taskname=<task name>`. This allows
+the receiver of the netconsole messages to easily find which application was
+currently scheduled when that message was generated, providing extra context
+for kernel messages and helping to categorize them.
+
+Example::
+
+  echo "This is a message" > /dev/kmsg
+  12,607,22085407756,-;This is a message
+   taskname=echo
+
+In this example, the message was generated while "echo" was the current
+scheduled process.
+
 CPU number auto population in userdata
 --------------------------------------
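On the receiving side, the `taskname=<task name>` line documented above can be picked out of a captured message with standard tools. The following is a minimal sketch and not part of this commit: the function name `extract_taskname` and the sample capture are hypothetical, and it assumes the sysdata entry arrives on its own (optionally indented) line, as in the documentation example.

```shell
# Hypothetical helper (not part of the commit): print the task name carried
# in the "taskname=<task name>" line of a captured netconsole message.
# Reads the message from stdin; prints nothing if no taskname line exists.
extract_taskname() {
	# Match an optionally indented "taskname=" line and keep the value.
	sed -n 's/^[[:space:]]*taskname=\(.*\)$/\1/p' | head -n 1
}

# Sample capture, modeled on the documentation example in this commit.
sample='12,607,22085407756,-;This is a message
 taskname=echo'

printf '%s\n' "${sample}" | extract_taskname
```

Because `task->comm` may itself contain spaces, the sketch keeps everything after `taskname=` up to the end of the line rather than splitting on whitespace.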

drivers/net/netconsole.c

Lines changed: 81 additions & 14 deletions
@@ -103,7 +103,9 @@ struct netconsole_target_stats {
  */
 enum sysdata_feature {
 	/* Populate the CPU that sends the message */
-	CPU_NR = BIT(0),
+	SYSDATA_CPU_NR = BIT(0),
+	/* Populate the task name (as in current->comm) in sysdata */
+	SYSDATA_TASKNAME = BIT(1),
 };

 /**
@@ -418,12 +420,26 @@ static ssize_t sysdata_cpu_nr_enabled_show(struct config_item *item, char *buf)
 	bool cpu_nr_enabled;

 	mutex_lock(&dynamic_netconsole_mutex);
-	cpu_nr_enabled = !!(nt->sysdata_fields & CPU_NR);
+	cpu_nr_enabled = !!(nt->sysdata_fields & SYSDATA_CPU_NR);
 	mutex_unlock(&dynamic_netconsole_mutex);

 	return sysfs_emit(buf, "%d\n", cpu_nr_enabled);
 }

+/* configfs helper to display if taskname sysdata feature is enabled */
+static ssize_t sysdata_taskname_enabled_show(struct config_item *item,
+					     char *buf)
+{
+	struct netconsole_target *nt = to_target(item->ci_parent);
+	bool taskname_enabled;
+
+	mutex_lock(&dynamic_netconsole_mutex);
+	taskname_enabled = !!(nt->sysdata_fields & SYSDATA_TASKNAME);
+	mutex_unlock(&dynamic_netconsole_mutex);
+
+	return sysfs_emit(buf, "%d\n", taskname_enabled);
+}
+
 /*
  * This one is special -- targets created through the configfs interface
  * are not enabled (and the corresponding netpoll activated) by default.
@@ -699,7 +715,9 @@ static size_t count_extradata_entries(struct netconsole_target *nt)
 	/* Userdata entries */
 	entries = list_count_nodes(&nt->userdata_group.cg_children);
 	/* Plus sysdata entries */
-	if (nt->sysdata_fields & CPU_NR)
+	if (nt->sysdata_fields & SYSDATA_CPU_NR)
+		entries += 1;
+	if (nt->sysdata_fields & SYSDATA_TASKNAME)
 		entries += 1;

 	return entries;
@@ -837,6 +855,40 @@ static void disable_sysdata_feature(struct netconsole_target *nt,
 	nt->extradata_complete[nt->userdata_length] = 0;
 }

+static ssize_t sysdata_taskname_enabled_store(struct config_item *item,
+					      const char *buf, size_t count)
+{
+	struct netconsole_target *nt = to_target(item->ci_parent);
+	bool taskname_enabled, curr;
+	ssize_t ret;
+
+	ret = kstrtobool(buf, &taskname_enabled);
+	if (ret)
+		return ret;
+
+	mutex_lock(&dynamic_netconsole_mutex);
+	curr = !!(nt->sysdata_fields & SYSDATA_TASKNAME);
+	if (taskname_enabled == curr)
+		goto unlock_ok;
+
+	if (taskname_enabled &&
+	    count_extradata_entries(nt) >= MAX_EXTRADATA_ITEMS) {
+		ret = -ENOSPC;
+		goto unlock;
+	}
+
+	if (taskname_enabled)
+		nt->sysdata_fields |= SYSDATA_TASKNAME;
+	else
+		disable_sysdata_feature(nt, SYSDATA_TASKNAME);
+
+unlock_ok:
+	ret = strnlen(buf, count);
+unlock:
+	mutex_unlock(&dynamic_netconsole_mutex);
+	return ret;
+}
+
 /* configfs helper to sysdata cpu_nr feature */
 static ssize_t sysdata_cpu_nr_enabled_store(struct config_item *item,
 					    const char *buf, size_t count)
@@ -850,7 +902,7 @@ static ssize_t sysdata_cpu_nr_enabled_store(struct config_item *item,
 		return ret;

 	mutex_lock(&dynamic_netconsole_mutex);
-	curr = nt->sysdata_fields & CPU_NR;
+	curr = !!(nt->sysdata_fields & SYSDATA_CPU_NR);
 	if (cpu_nr_enabled == curr)
 		/* no change requested */
 		goto unlock_ok;
@@ -865,13 +917,13 @@ static ssize_t sysdata_cpu_nr_enabled_store(struct config_item *item,
 	}

 	if (cpu_nr_enabled)
-		nt->sysdata_fields |= CPU_NR;
+		nt->sysdata_fields |= SYSDATA_CPU_NR;
 	else
 		/* This is special because extradata_complete might have
 		 * remaining data from previous sysdata, and it needs to be
 		 * cleaned.
 		 */
-		disable_sysdata_feature(nt, CPU_NR);
+		disable_sysdata_feature(nt, SYSDATA_CPU_NR);

 unlock_ok:
 	ret = strnlen(buf, count);
@@ -882,6 +934,7 @@ static ssize_t sysdata_cpu_nr_enabled_store(struct config_item *item,

 CONFIGFS_ATTR(userdatum_, value);
 CONFIGFS_ATTR(sysdata_, cpu_nr_enabled);
+CONFIGFS_ATTR(sysdata_, taskname_enabled);

 static struct configfs_attribute *userdatum_attrs[] = {
 	&userdatum_attr_value,
@@ -942,6 +995,7 @@ static void userdatum_drop(struct config_group *group, struct config_item *item)

 static struct configfs_attribute *userdata_attrs[] = {
 	&sysdata_attr_cpu_nr_enabled,
+	&sysdata_attr_taskname_enabled,
 	NULL,
 };

@@ -1117,28 +1171,41 @@ static void populate_configfs_item(struct netconsole_target *nt,
 	init_target_config_group(nt, target_name);
 }

+static int append_cpu_nr(struct netconsole_target *nt, int offset)
+{
+	/* Append cpu=%d at extradata_complete after userdata str */
+	return scnprintf(&nt->extradata_complete[offset],
+			 MAX_EXTRADATA_ENTRY_LEN, " cpu=%u\n",
+			 raw_smp_processor_id());
+}
+
+static int append_taskname(struct netconsole_target *nt, int offset)
+{
+	return scnprintf(&nt->extradata_complete[offset],
+			 MAX_EXTRADATA_ENTRY_LEN, " taskname=%s\n",
+			 current->comm);
+}
 /*
  * prepare_extradata - append sysdata at extradata_complete in runtime
  * @nt: target to send message to
  */
 static int prepare_extradata(struct netconsole_target *nt)
 {
-	int sysdata_len, extradata_len;
+	u32 fields = SYSDATA_CPU_NR | SYSDATA_TASKNAME;
+	int extradata_len;

 	/* userdata was appended when configfs write helper was called
 	 * by update_userdata().
 	 */
 	extradata_len = nt->userdata_length;

-	if (!(nt->sysdata_fields & CPU_NR))
+	if (!(nt->sysdata_fields & fields))
 		goto out;

-	/* Append cpu=%d at extradata_complete after userdata str */
-	sysdata_len = scnprintf(&nt->extradata_complete[nt->userdata_length],
-				MAX_EXTRADATA_ENTRY_LEN, " cpu=%u\n",
-				raw_smp_processor_id());
-
-	extradata_len += sysdata_len;
+	if (nt->sysdata_fields & SYSDATA_CPU_NR)
+		extradata_len += append_cpu_nr(nt, extradata_len);
+	if (nt->sysdata_fields & SYSDATA_TASKNAME)
+		extradata_len += append_taskname(nt, extradata_len);

 	WARN_ON_ONCE(extradata_len >
 		     MAX_EXTRADATA_ENTRY_LEN * MAX_EXTRADATA_ITEMS);
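The bitmask handling in this file can be modeled outside the kernel to see which sysdata lines a given `sysdata_fields` value produces and in what order. The sketch below is a user-space illustration only, under the assumption that the two feature bits and the ` cpu=` then ` taskname=` ordering are as shown in the diff; the function names `build_sysdata` and `count_sysdata_entries` are hypothetical and do not exist in the driver.

```shell
# User-space model (illustration only, not kernel code) of the sysdata
# feature bits introduced or renamed by this diff.
SYSDATA_CPU_NR=$((1 << 0))    # mirrors SYSDATA_CPU_NR = BIT(0)
SYSDATA_TASKNAME=$((1 << 1))  # mirrors SYSDATA_TASKNAME = BIT(1)

# build_sysdata <fields> <cpu> <taskname>
# Prints one " key=value" line per enabled feature: cpu first, then
# taskname, the same order prepare_extradata() appends them in.
build_sysdata() {
	local fields=$1 cpu=$2 taskname=$3

	if [ $(( fields & SYSDATA_CPU_NR )) -ne 0 ]; then
		printf ' cpu=%u\n' "${cpu}"
	fi
	if [ $(( fields & SYSDATA_TASKNAME )) -ne 0 ]; then
		printf ' taskname=%s\n' "${taskname}"
	fi
	return 0
}

# count_sysdata_entries <fields>
# Prints how many sysdata features are enabled, like the sysdata part of
# count_extradata_entries().
count_sysdata_entries() {
	local fields=$1 entries=0

	if [ $(( fields & SYSDATA_CPU_NR )) -ne 0 ]; then
		entries=$((entries + 1))
	fi
	if [ $(( fields & SYSDATA_TASKNAME )) -ne 0 ]; then
		entries=$((entries + 1))
	fi
	echo "${entries}"
}

# Both features enabled, message sent from CPU 3 by a task named "echo".
build_sysdata $((SYSDATA_CPU_NR | SYSDATA_TASKNAME)) 3 echo
```

Each enabled feature counts as one extradata entry, which is why the store handler checks `count_extradata_entries(nt) >= MAX_EXTRADATA_ITEMS` before setting a new bit.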

tools/testing/selftests/drivers/net/netcons_sysdata.sh

Lines changed: 44 additions & 7 deletions
@@ -31,17 +31,38 @@ function set_cpu_nr() {
 	echo 1 > "${NETCONS_PATH}/userdata/cpu_nr_enabled"
 }

+# Enable the taskname to be appended to sysdata
+function set_taskname() {
+	if [[ ! -f "${NETCONS_PATH}/userdata/taskname_enabled" ]]
+	then
+		echo "Not able to enable taskname sysdata append. Configfs not available in ${NETCONS_PATH}/userdata/taskname_enabled" >&2
+		exit "${ksft_skip}"
+	fi
+
+	echo 1 > "${NETCONS_PATH}/userdata/taskname_enabled"
+}
+
 # Disable the sysdata cpu_nr feature
 function unset_cpu_nr() {
 	echo 0 > "${NETCONS_PATH}/userdata/cpu_nr_enabled"
 }

-# Test if MSG content and `cpu=${CPU}` exists in OUTPUT_FILE
-function validate_sysdata_cpu_exists() {
+# Once called, taskname=<..> will not be appended anymore
+function unset_taskname() {
+	echo 0 > "${NETCONS_PATH}/userdata/taskname_enabled"
+}
+
+# Test if MSG contains sysdata
+function validate_sysdata() {
 	# OUTPUT_FILE will contain something like:
 	# 6.11.1-0_fbk0_rc13_509_g30d75cea12f7,13,1822,115075213798,-;netconsole selftest: netcons_gtJHM
 	# userdatakey=userdatavalue
 	# cpu=X
+	# taskname=<taskname>
+
+	# Echo is what this test uses to create the message. See runtest()
+	# function
+	SENDER="echo"

 	if [ ! -f "$OUTPUT_FILE" ]; then
 		echo "FAIL: File was not generated." >&2
@@ -62,12 +83,19 @@ function validate_sysdata_cpu_exists() {
 		exit "${ksft_fail}"
 	fi

+	if ! grep -q "taskname=${SENDER}" "${OUTPUT_FILE}"; then
+		echo "FAIL: 'taskname=echo' not found in ${OUTPUT_FILE}" >&2
+		cat "${OUTPUT_FILE}" >&2
+		exit "${ksft_fail}"
+	fi
+
 	rm "${OUTPUT_FILE}"
 	pkill_socat
 }

-# Test if MSG content exists in OUTPUT_FILE but no `cpu=` string
-function validate_sysdata_no_cpu() {
+# Test if MSG content exists in OUTPUT_FILE but no `cpu=` and `taskname=`
+# strings
+function validate_no_sysdata() {
 	if [ ! -f "$OUTPUT_FILE" ]; then
 		echo "FAIL: File was not generated." >&2
 		exit "${ksft_fail}"
@@ -85,6 +113,12 @@ function validate_sysdata_no_cpu() {
 		exit "${ksft_fail}"
 	fi

+	if grep -q "taskname=" "${OUTPUT_FILE}"; then
+		echo "FAIL: 'taskname=' found in ${OUTPUT_FILE}" >&2
+		cat "${OUTPUT_FILE}" >&2
+		exit "${ksft_fail}"
+	fi
+
 	rm "${OUTPUT_FILE}"
 }

@@ -133,10 +167,12 @@ OUTPUT_FILE="/tmp/${TARGET}_1"
 MSG="Test #1 from CPU${CPU}"
 # Enable the auto population of cpu_nr
 set_cpu_nr
+# Enable taskname to be appended to sysdata
+set_taskname
 runtest
 # Make sure the message was received in the dst part
 # and exit
-validate_sysdata_cpu_exists
+validate_sysdata

 #====================================================
 # TEST #2
@@ -148,7 +184,7 @@ OUTPUT_FILE="/tmp/${TARGET}_2"
 MSG="Test #2 from CPU${CPU}"
 set_user_data
 runtest
-validate_sysdata_cpu_exists
+validate_sysdata

 # ===================================================
 # TEST #3
@@ -160,8 +196,9 @@ OUTPUT_FILE="/tmp/${TARGET}_3"
 MSG="Test #3 from CPU${CPU}"
 # Enable the auto population of cpu_nr
 unset_cpu_nr
+unset_taskname
 runtest
 # At this time, cpu= shouldn't be present in the msg
-validate_sysdata_no_cpu
+validate_no_sysdata

 exit "${ksft_pass}"
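Outside the selftest, the same toggle can be wrapped in a helper that also reads the attribute back, since the store handler in this commit may reject a write (for example with ENOSPC when the extradata entry limit is reached). This is a minimal sketch, not part of the selftest: the function name `set_taskname_enabled` is hypothetical, and the demonstration runs against a scratch directory standing in for configfs so that it needs neither root nor a loaded netconsole module.

```shell
# Hypothetical helper (not part of the commit): write 0 or 1 to a target's
# taskname_enabled attribute and confirm the value by reading it back.
# Usage: set_taskname_enabled <target dir> <0|1>
set_taskname_enabled() {
	local target_dir=$1 value=$2
	local attr="${target_dir}/userdata/taskname_enabled"

	if [ ! -f "${attr}" ]; then
		echo "taskname_enabled not available under ${target_dir}" >&2
		return 1
	fi

	# The write itself can fail, e.g. if the kernel rejects the request.
	echo "${value}" > "${attr}" || return 1

	# The show handler in this commit prints the state as "0" or "1".
	[ "$(cat "${attr}")" = "${value}" ]
}

# Dry run: a scratch directory stands in for the configfs target, which on
# a real system would be /sys/kernel/config/netconsole/<target>.
scratch=$(mktemp -d)
mkdir -p "${scratch}/target1/userdata"
: > "${scratch}/target1/userdata/taskname_enabled"

set_taskname_enabled "${scratch}/target1" 1 && echo "taskname enabled"
```

On a real system the first argument would be the target directory used in the documentation example, such as `/sys/kernel/config/netconsole/target1`.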
