Skip to content

Commit 0d9c35f

Browse files
authored
Merge pull request #5309 from NlightNFotis/feature/vsd-port-of-restrict-function-pointer
Porting of the function pointer restriction feature to variable sensitivity domain branch.
2 parents f855d5f + 45d171f commit 0d9c35f

File tree

52 files changed

+2443
-61
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

52 files changed

+2443
-61
lines changed

doc/cprover-manual/index.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@
99
[A Short Tutorial](cbmc/tutorial/),
1010
[Loop Unwinding](cbmc/unwinding/),
1111
[Assertion Checking](cbmc/assertions/),
12+
[Restricting function pointers](cbmc/restrict-function-pointer/),
1213
[Memory Analyzer](cbmc/memory-analyzer/),
1314
[Program Harness](cbmc/goto-harness/)
1415

Lines changed: 180 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,180 @@
1+
[CPROVER Manual TOC](../../)
2+
3+
## Restricting function pointers
4+
5+
In this document, we describe the `goto-instrument` feature to replace calls
6+
through function pointers by case distinctions over calls to given sets of
7+
functions.
8+
9+
### Motivation
10+
11+
The CPROVER framework includes a goto program transformation pass
12+
`remove_function_pointers()` to resolve calls to function pointers to direct
13+
function calls. The pass is needed by `cbmc`, as symbolic execution itself can't
14+
handle calls to function pointers. In practice, the transformation pass works as
15+
follows:
16+
17+
Given that there are functions with these signatures available in the program:
18+
19+
```
20+
int f(int x);
21+
int g(int x);
22+
int h(int x);
23+
```
24+
25+
And we have a call site like this:
26+
27+
```
28+
typedef int(*fptr_t)(int x);
29+
void call(fptr_t fptr) {
30+
int r = fptr(10);
31+
assert(r > 0);
32+
}
33+
```
34+
35+
Function pointer removal will turn this into code similar to this:
36+
37+
```
38+
void call(fptr_t fptr) {
39+
int r;
40+
if(fptr == &f) {
41+
r = f(10);
42+
} else if(fptr == &g) {
43+
r = g(10);
44+
} else if(fptr == &h) {
45+
r = h(10);
46+
} else {
47+
// sanity check
48+
assert(false);
49+
assume(false);
50+
}
51+
return r;
52+
}
53+
```
54+
55+
This works well enough for simple cases. However, this is a very simple
56+
replacement only based on the signature of the function (and whether or not they
57+
have their address taken somewhere in the program), so if there are many
58+
functions matching a particular signature, or if some of these functions are
59+
expensive in symex (e.g. functions with lots of loops or recursion), then this
60+
can be a bit cumbersome - especially if we, as a user, already know that a
61+
particular function pointer will only resolve to a single function or a small
62+
set of functions. The `goto-instrument` option `--restrict-function-pointer`
63+
allows to manually specify this set of functions.
64+
65+
### Example
66+
67+
Take the motivating example above. Let us assume that we know for a fact that
68+
`call` will always receive pointers to either `f` or `g` during actual
69+
executions of the program, and symbolic execution for `h` is too expensive to
70+
simply ignore the cost of its branch. For this, we will label the places in each
71+
function where function pointers are being called, to this pattern:
72+
73+
```
74+
<function-name>.function_pointer_call.<N>
75+
```
76+
77+
where `N` is referring to which function call it is - so the first call to a
78+
function pointer in a function will have `N=1`, the 5th `N=5` etc.
79+
80+
We can call `goto-instrument --restrict-function-pointer
81+
call.function_pointer_call.1/f,g in.gb out.gb`. This can be read as
82+
83+
> For the first call to a function pointer in the function `call`, assume that
84+
> it can only be a call to `f` or `g`
85+
86+
The resulting output (written to goto binary `out.gb`) looks similar to the
87+
original example, except now there will not be a call to `h`:
88+
89+
```
90+
void call(fptr_t fptr) {
91+
int r;
92+
if(fptr == &f) {
93+
r = f(10);
94+
} else if(fptr == &g) {
95+
r = g(10);
96+
} else {
97+
// sanity check
98+
assert(false);
99+
assume(false);
100+
}
101+
return r;
102+
}
103+
```
104+
105+
Another example: Imagine we have a simple virtual filesystem API and implementation
106+
like this:
107+
108+
```
109+
typedef struct filesystem_t filesystem_t;
110+
struct filesystem_t {
111+
int (*open)(filesystem_t *filesystem, const char* file_name);
112+
};
113+
114+
int fs_open(filesystem_t *filesystem, const char* file_name) {
115+
filesystem->open(filesystem, file_name);
116+
}
117+
118+
int nullfs_open(filesystem_t *filesystem, const char* file_name) {
119+
return -1;
120+
}
121+
122+
filesystem_t nullfs_val = {.open = nullfs_open};
123+
filesystem *const nullfs = &nullfs_val;
124+
125+
filesystem_t *get_fs_impl() {
126+
// some fancy logic to determine
127+
// which filesystem we're getting -
128+
// in-memory, backed by a database, OS file system
129+
// - but in our case, we know that
130+
// it always ends up being nullfs
131+
// for the cases we care about
132+
return nullfs;
133+
}
134+
int main(void) {
135+
filesystem_t *fs = get_fs_impl();
136+
assert(fs_open(fs, "hello.txt") != -1);
137+
}
138+
```
139+
140+
In this case, the assumption is that *we* know that in our `main`, `fs` can be
141+
nothing other than `nullfs`; But perhaps due to the logic being too complicated
142+
symex ends up being unable to figure this out, so in the call to `fs_open()` we
143+
end up branching on all functions matching the signature of
144+
`filesystem_t::open`, which could be quite a few functions within the program.
145+
Worst of all, if its address is ever taken in the program, as far as the "dumb"
146+
function pointer removal is concerned it could be `fs_open()` itself due to it
147+
having a matching signature, leading to symex being forced to follow a
148+
potentially infinite recursion until its unwind limit.
149+
150+
In this case we can again restrict the function pointer to the value which we
151+
know it will have:
152+
153+
```
154+
--restrict-function-pointer fs_open.function_pointer_call.1/nullfs_open
155+
```
156+
157+
### Loading from file
158+
159+
If you have many places where you want to restrict function pointers, it'd be a
160+
nuisance to have to specify them all on the command line. In these cases, you
161+
can specify a file to load the restrictions from instead, via the
162+
`--function-pointer-restrictions-file` option, which you can give the name of a
163+
JSON file with this format:
164+
165+
```
166+
{
167+
"function_call_site_name": ["function1", "function2", ...],
168+
...
169+
}
170+
```
171+
172+
**Note:** If you pass in multiple files, or a mix of files and command line
173+
restrictions, the final restrictions will be a set union of all specified
174+
restrictions.
175+
176+
**Note:** as of now, if something goes wrong during type checking (i.e. making
177+
sure that all function pointer replacements refer to functions in the symbol
178+
table that have the correct type), the error message will refer to the command
179+
line option `--restrict-function-pointer` regardless of whether the restriction
180+
in question came from the command line or a file.
Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
int f(void)
2+
{
3+
return 1;
4+
}
5+
6+
int g(void)
7+
{
8+
return 2;
9+
}
10+
11+
int h(void)
12+
{
13+
return 3;
14+
}
15+
16+
typedef int (*fp_t)(void);
17+
18+
fp_t fp;
19+
20+
void main()
21+
{
22+
int cond;
23+
fp = cond ? f : g;
24+
int res = fp();
25+
__CPROVER_assert(res == 1, "");
26+
__CPROVER_assert(res == 2, "");
27+
__CPROVER_assert(res == 1 || res == 2, "");
28+
}
Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
CORE
2+
test.c
3+
--restrict-function-pointer-by-name fp/f,g
4+
\[main\.assertion\.1\] line \d+ dereferenced function pointer at main\.function_pointer_call\.1 must be one of \[(f, g)|(g, f)\]: SUCCESS
5+
\[main.assertion.2\] line \d+ assertion: FAILURE
6+
\[main.assertion.3\] line \d+ assertion: FAILURE
7+
\[main.assertion.4\] line \d+ assertion: SUCCESS
8+
f\(\)
9+
g\(\)
10+
^EXIT=10$
11+
^SIGNAL=0$
12+
--
13+
h\(\)
14+
--
15+
Check that a call to a global function pointer is correctly restricted
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
#include <assert.h>
2+
3+
int f(void)
4+
{
5+
return 1;
6+
}
7+
8+
int g(void)
9+
{
10+
return 2;
11+
}
12+
13+
typedef int (*fp_t)(void);
14+
15+
void main()
16+
{
17+
fp_t fp = f;
18+
assert(fp() == 1);
19+
}
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
CORE
2+
test.c
3+
--restrict-function-pointer-by-name main::1::fp/f
4+
\[main\.assertion\.1\] line \d+ dereferenced function pointer at main\.function_pointer_call\.1 must be f: SUCCESS
5+
\[main\.assertion\.2\] line \d+ assertion fp\(\) == 1: SUCCESS
6+
f\(\)
7+
^EXIT=0$
8+
^SIGNAL=0$
9+
--
10+
g\(\)
11+
--
12+
Check that a call to a local function pointer is correctly restricted
Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
#include <assert.h>
2+
3+
int f(void)
4+
{
5+
return 1;
6+
}
7+
8+
int g(void)
9+
{
10+
return 2;
11+
}
12+
13+
typedef int (*fp_t)(void);
14+
15+
void use_fp(fp_t fp)
16+
{
17+
assert(fp() == 1);
18+
}
19+
20+
void main()
21+
{
22+
use_fp(f);
23+
}
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
CORE
2+
test.c
3+
--restrict-function-pointer-by-name use_fp::fp/f
4+
\[use_fp\.assertion\.1\] line \d+ dereferenced function pointer at use_fp\.function_pointer_call\.1 must be f: SUCCESS
5+
\[use_fp\.assertion\.2\] line \d+ assertion fp\(\) == 1: SUCCESS
6+
f\(\)
7+
^EXIT=0$
8+
^SIGNAL=0$
9+
--
10+
g\(\)
11+
--
12+
Check that a call to a function pointer parameter is correctly restricted
Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,61 @@
1+
#include <assert.h>
2+
3+
typedef int (*fptr_t)(int);
4+
5+
fptr_t get_f(void);
6+
fptr_t get_g(void);
7+
8+
void use_fg(int choice, fptr_t fptr, fptr_t gptr)
9+
{
10+
assert((choice ? fptr : gptr)(10) == 10 + choice);
11+
}
12+
13+
int main(void)
14+
{
15+
use_fg(0, get_f(), get_g());
16+
use_fg(1, get_f(), get_g());
17+
}
18+
19+
int f(int x)
20+
{
21+
return x + 1;
22+
}
23+
24+
int g(int x)
25+
{
26+
return x;
27+
}
28+
29+
int h(int x)
30+
{
31+
return x / 2;
32+
}
33+
34+
// Below we take the address of f, g, and h. Thus remove_function_pointers()
35+
// would create a case distinction involving f, g, and h in the function
36+
// use_fg() above.
37+
int always_false_cond = 0;
38+
39+
fptr_t get_f(void)
40+
{
41+
if(!always_false_cond)
42+
{
43+
return f;
44+
}
45+
else
46+
{
47+
return h;
48+
}
49+
}
50+
51+
fptr_t get_g(void)
52+
{
53+
if(!always_false_cond)
54+
{
55+
return g;
56+
}
57+
else
58+
{
59+
return h;
60+
}
61+
}

0 commit comments

Comments
 (0)