Commit 3db978d

tehcaster authored and torvalds committed
kernel/sysctl: support setting sysctl parameters from kernel command line
Patch series "support setting sysctl parameters from kernel command line", v3.

This series adds support for something that many people seem to have wanted but nobody has added yet: the ability to set sysctl parameters via kernel command line options in the form of sysctl.vm.something=1

The important part is Patch 1. The second, less important part is an attempt to clean up legacy one-off parameters that do the same thing as a sysctl. I don't want to remove them completely for compatibility reasons, but with generic sysctl support the idea is to remove the one-off param handlers and treat the parameters as aliases for the sysctl variants.

I have identified several parameters that mention sysctl counterparts in Documentation/admin-guide/kernel-parameters.txt but there might be more. The conversion also has varying levels of success:

- numa_zonelist_order is converted in Patch 2 together with adding the necessary infrastructure. It's easy as it doesn't really do anything but warn on a deprecated value these days.
- hung_task_panic is converted in Patch 3, but there's a downside that now it only accepts 0 and 1, while previously it accepted any integer value.
- nmi_watchdog maps to two sysctls, nmi_watchdog and hardlockup_panic, so there's no straightforward conversion possible.
- traceoff_on_warning is a flag without a value, and the conversion infrastructure would have to handle that somehow, which seems pointless for a single flag.

This patch (of 5):

A recently proposed patch to add a vm_swappiness command line parameter in addition to the existing sysctl [1] made me wonder why we don't have general support for passing sysctl parameters via the command line. Googling found only somebody else wondering the same [2], but I haven't found any prior discussion with reasons why not to do this.

Setting the vm_swappiness issue aside (the underlying issue might be solved in a different way), a quick search of kernel-parameters.txt shows there are already some parameters that exist as both sysctl and kernel parameter: hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism would remove the need to add more of those one-offs and might be handy in situations where configuration by e.g. /etc/sysctl.d/ is impractical.

Hence, this patch adds a new parse_args() pass that looks for parameters prefixed by 'sysctl.' and tries to interpret them as writes to the corresponding sys/ files using a temporary in-kernel procfs mount. This mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically registered sysctl tables, even though we don't handle modular sysctls. Errors due to e.g. an invalid parameter name or value are reported in the kernel log.

The processing is hooked right before the init process is loaded, as some handlers might be more complicated than simple setters and might need some subsystems to be initialized. Since at that moment the init process can be started and eventually execute a process writing to /proc/sys/, it should also be fine to do that from the kernel.

Sysctls registered later on module load time are not set by this mechanism. It's expected that in such scenarios, setting sysctl values from userspace is practical enough.

[1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
[2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
[3] https://lore.kernel.org/r/[email protected]/

Signed-off-by: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Reviewed-by: Masami Hiramatsu <[email protected]>
Acked-by: Kees Cook <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Iurii Zaikin <[email protected]>
Cc: Ivan Teterevkov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Guilherme G . Piccoli" <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Christian Brauner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
1 parent 01f39c1 commit 3db978d

4 files changed: +122 -0 lines changed

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 9 additions & 0 deletions
@@ -4969,6 +4969,15 @@

 	switches=	[HW,M68k]

+	sysctl.*=	[KNL]
+			Set a sysctl parameter, right before loading the init
+			process, as if the value was written to the respective
+			/proc/sys/... file. Both '.' and '/' are recognized as
+			separators. Unrecognized parameters and invalid values
+			are reported in the kernel log. Sysctls registered
+			later by a loaded module cannot be set this way.
+			Example: sysctl.vm.swappiness=40
+
 	sysfs.deprecated=0|1 [KNL]
 			Enable/disable old style sysfs layout for old udev
 			on older distributions. When this option is enabled
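
The documented name mapping ('.' and '/' both act as separators, and the result names a file under /proc/sys/) can be illustrated with a small userspace sketch. The helper name `sysctl_param_to_proc_path` and its return convention are assumptions made for illustration; they are not part of this commit. Like the kernel code in this patch, it replaces every '.' in the name, so a sysctl whose path component contains a literal '.' would presumably be altered as well.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not part of the patch): translate a boot
 * parameter such as "sysctl.vm.swappiness" into "/proc/sys/vm/swappiness".
 * Returns 0 on success, -1 if the parameter lacks the "sysctl" prefix and
 * separator, or if the result does not fit into buf.
 */
int sysctl_param_to_proc_path(const char *param, char *buf, size_t len)
{
	static const char prefix[] = "sysctl";
	static const char base[] = "/proc/sys/";
	size_t plen = sizeof(prefix) - 1;
	char *p;
	int n;

	if (strncmp(param, prefix, plen) != 0)
		return -1;
	param += plen;

	/* The prefix must be followed by one of the two separators. */
	if (param[0] != '.' && param[0] != '/')
		return -1;
	param++;

	n = snprintf(buf, len, "%s%s", base, param);
	if (n < 0 || (size_t)n >= len)
		return -1;

	/* Every '.' in the parameter name becomes a path separator. */
	for (p = buf + sizeof(base) - 1; *p; p++)
		if (*p == '.')
			*p = '/';

	return 0;
}
```

Under these assumptions, calling the helper with "sysctl.vm.swappiness" fills the buffer with "/proc/sys/vm/swappiness", and "sysctl/vm/swappiness" gives the same result.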

fs/proc/proc_sysctl.c

Lines changed: 107 additions & 0 deletions
@@ -14,6 +14,7 @@
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/bpf-cgroup.h>
+#include <linux/mount.h>
 #include "internal.h"

 static const struct dentry_operations proc_sys_dentry_operations;
@@ -1703,3 +1704,109 @@ int __init proc_sys_init(void)

 	return sysctl_init();
 }
+
+/* Set sysctl value passed on kernel command line. */
+static int process_sysctl_arg(char *param, char *val,
+			       const char *unused, void *arg)
+{
+	char *path;
+	struct vfsmount **proc_mnt = arg;
+	struct file_system_type *proc_fs_type;
+	struct file *file;
+	int len;
+	int err;
+	loff_t pos = 0;
+	ssize_t wret;
+
+	if (strncmp(param, "sysctl", sizeof("sysctl") - 1))
+		return 0;
+
+	param += sizeof("sysctl") - 1;
+
+	if (param[0] != '/' && param[0] != '.')
+		return 0;
+
+	param++;
+
+	/*
+	 * To set sysctl options, we use a temporary mount of proc, look up the
+	 * respective sys/ file and write to it. To avoid mounting it when no
+	 * options were given, we mount it only when the first sysctl option is
+	 * found. Why not a persistent mount? There are problems with a
+	 * persistent mount of proc in that it forces userspace not to use any
+	 * proc mount options.
+	 */
+	if (!*proc_mnt) {
+		proc_fs_type = get_fs_type("proc");
+		if (!proc_fs_type) {
+			pr_err("Failed to find procfs to set sysctl from command line\n");
+			return 0;
+		}
+		*proc_mnt = kern_mount(proc_fs_type);
+		put_filesystem(proc_fs_type);
+		if (IS_ERR(*proc_mnt)) {
+			pr_err("Failed to mount procfs to set sysctl from command line\n");
+			return 0;
+		}
+	}
+
+	path = kasprintf(GFP_KERNEL, "sys/%s", param);
+	if (!path)
+		panic("%s: Failed to allocate path for %s\n", __func__, param);
+	strreplace(path, '.', '/');
+
+	file = file_open_root((*proc_mnt)->mnt_root, *proc_mnt, path, O_WRONLY, 0);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		if (err == -ENOENT)
+			pr_err("Failed to set sysctl parameter '%s=%s': parameter not found\n",
+				param, val);
+		else if (err == -EACCES)
+			pr_err("Failed to set sysctl parameter '%s=%s': permission denied (read-only?)\n",
+				param, val);
+		else
+			pr_err("Error %pe opening proc file to set sysctl parameter '%s=%s'\n",
+				file, param, val);
+		goto out;
+	}
+	len = strlen(val);
+	wret = kernel_write(file, val, len, &pos);
+	if (wret < 0) {
+		err = wret;
+		if (err == -EINVAL)
+			pr_err("Failed to set sysctl parameter '%s=%s': invalid value\n",
+				param, val);
+		else
+			pr_err("Error %pe writing to proc file to set sysctl parameter '%s=%s'\n",
+				ERR_PTR(err), param, val);
+	} else if (wret != len) {
+		pr_err("Wrote only %zd bytes of %d writing to proc file %s to set sysctl parameter '%s=%s\n",
+			wret, len, path, param, val);
+	}
+
+	err = filp_close(file, NULL);
+	if (err)
+		pr_err("Error %pe closing proc file to set sysctl parameter '%s=%s\n",
+			ERR_PTR(err), param, val);
+out:
+	kfree(path);
+	return 0;
+}
+
+void do_sysctl_args(void)
+{
+	char *command_line;
+	struct vfsmount *proc_mnt = NULL;
+
+	command_line = kstrdup(saved_command_line, GFP_KERNEL);
+	if (!command_line)
+		panic("%s: Failed to allocate copy of command line\n", __func__);
+
+	parse_args("Setting sysctl args", command_line,
+		   NULL, 0, -1, -1, &proc_mnt, process_sysctl_arg);
+
+	if (proc_mnt)
+		kern_unmount(proc_mnt);
+
+	kfree(command_line);
+}
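
For readers more familiar with userspace, the open/write/close sequence in process_sysctl_arg() corresponds roughly to the sketch below. It is only an analogue, not the kernel code: the function name `write_sysctl_file` and the `root` parameter (standing in for the temporary procfs mount, normally "/proc") are assumptions for illustration. It also differs in one deliberate way: the kernel callback always returns 0 and only logs via pr_err(), whereas this sketch returns a negative errno value so that callers can check the result.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical userspace analogue of the open/write/close steps.
 * `root` stands in for the procfs mount point and `rel_path` for the
 * "sys/..." path built from the parameter name.
 * Returns 0 on success or a negative errno value.
 */
int write_sysctl_file(const char *root, const char *rel_path, const char *val)
{
	char full[512];
	size_t len = strlen(val);
	ssize_t wret;
	int fd, n, ret = 0;

	n = snprintf(full, sizeof(full), "%s/%s", root, rel_path);
	if (n < 0 || (size_t)n >= sizeof(full))
		return -ENAMETOOLONG;

	fd = open(full, O_WRONLY);
	if (fd < 0) {
		ret = -errno;
		if (ret == -ENOENT)
			fprintf(stderr, "'%s=%s': parameter not found\n", rel_path, val);
		else if (ret == -EACCES)
			fprintf(stderr, "'%s=%s': permission denied (read-only?)\n", rel_path, val);
		else
			fprintf(stderr, "'%s=%s': open failed: %s\n", rel_path, val, strerror(-ret));
		return ret;
	}

	wret = write(fd, val, len);
	if (wret < 0) {
		ret = -errno;
		fprintf(stderr, "'%s=%s': write failed: %s\n", rel_path, val, strerror(-ret));
	} else if ((size_t)wret != len) {
		/* A short write is treated as an error in this sketch. */
		ret = -EIO;
		fprintf(stderr, "'%s=%s': wrote only %zd of %zu bytes\n", rel_path, val, wret, len);
	}

	/* Report a close error only if nothing failed earlier. */
	if (close(fd) < 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

On a running system, `write_sysctl_file("/proc", "sys/vm/swappiness", "40")` would attempt the same write that the boot parameter sysctl.vm.swappiness=40 requests, subject to the usual permission checks.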

include/linux/sysctl.h

Lines changed: 4 additions & 0 deletions
@@ -197,6 +197,7 @@ struct ctl_table_header *register_sysctl_paths(const struct ctl_path *path,
 void unregister_sysctl_table(struct ctl_table_header * table);

 extern int sysctl_init(void);
+void do_sysctl_args(void);

 extern int pwrsw_enabled;
 extern int unaligned_enabled;
@@ -235,6 +236,9 @@ static inline void setup_sysctl_set(struct ctl_table_set *p,
 {
 }

+static inline void do_sysctl_args(void)
+{
+}
 #endif /* CONFIG_SYSCTL */

 int sysctl_max_threads(struct ctl_table *table, int write, void *buffer,
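
The two header hunks follow a common pattern: a real declaration when the feature is configured in, and an empty static inline stub otherwise, so that the caller in init/main.c needs no #ifdef. A minimal standalone sketch of that pattern follows. The macro `HAVE_SYSCTL_SKETCH` and the function names are hypothetical stand-ins for CONFIG_SYSCTL and do_sysctl_args(), chosen only for illustration.

```c
/* Records whether the real implementation ran; used only by this sketch. */
static int sysctl_args_applied;

#ifdef HAVE_SYSCTL_SKETCH	/* hypothetical stand-in for CONFIG_SYSCTL */
/* Feature enabled: the real function does the work. */
void apply_sysctl_args(void)
{
	sysctl_args_applied = 1;
}
#else
/* Feature disabled: an empty inline stub that the compiler can drop. */
static inline void apply_sysctl_args(void)
{
}
#endif

/* The caller is identical in both configurations. */
int boot_sketch(void)
{
	apply_sysctl_args();
	return sysctl_args_applied;
}
```

Built without `-DHAVE_SYSCTL_SKETCH`, `boot_sketch()` calls the stub and returns 0. Built with it, the real function runs and the return value is 1.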

init/main.c

Lines changed: 2 additions & 0 deletions
@@ -1412,6 +1412,8 @@ static int __ref kernel_init(void *unused)

 	rcu_end_inkernel_boot();

+	do_sysctl_args();
+
 	if (ramdisk_execute_command) {
 		ret = run_init_process(ramdisk_execute_command);
 		if (!ret)
