Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions emcc.py
Original file line number Diff line number Diff line change
Expand Up @@ -1647,8 +1647,8 @@ def repl(m):

if shared.Settings.USE_PTHREADS:
target_dir = os.path.dirname(os.path.abspath(target))
shutil.copyfile(shared.path_from_root('src', 'pthread-main.js'),
os.path.join(target_dir, 'pthread-main.js'))
shared.run_c_preprocessor_on_file(shared.path_from_root('src', 'pthread-main.js'),
os.path.join(target_dir, 'pthread-main.js'))

# Generate the fetch-worker.js script for multithreaded emscripten_fetch() support if targeting pthreads.
if shared.Settings.FETCH and shared.Settings.USE_PTHREADS:
Expand Down
15 changes: 8 additions & 7 deletions src/pthread-main.js
Original file line number Diff line number Diff line change
Expand Up @@ -76,10 +76,12 @@ this.onmessage = function(e) {
assert(STACK_BASE != 0);
assert(STACK_MAX > STACK_BASE);
Runtime.establishStackSpace(e.data.stackBase, e.data.stackBase + e.data.stackSize);

#if STACK_OVERFLOW_CHECK
writeStackCookie();
#endif

var result = 0;
//#if STACK_OVERFLOW_CHECK
if (typeof writeStackCookie === 'function') writeStackCookie();
//#endif

PThread.receiveObjectTransfer(e.data);

Expand All @@ -93,10 +95,9 @@ this.onmessage = function(e) {
} else {
result = Module['asm'].dynCall_i(e.data.start_routine); // as a hack, try signature 'i' as fallback.
}
//#if STACK_OVERFLOW_CHECK
if (typeof checkStackCookie === 'function') checkStackCookie();
//#endif

#if STACK_OVERFLOW_CHECK
checkStackCookie();
#endif
} catch(e) {
if (e === 'Canceled!') {
PThread.threadCancel();
Expand Down
53 changes: 38 additions & 15 deletions tools/shared.py
Original file line number Diff line number Diff line change
Expand Up @@ -2652,21 +2652,44 @@ def safe_copy(src, dst):
if dst == '/dev/null': return
shutil.copyfile(src, dst)

def clang_preprocess(filename):
# TODO: REMOVE HACK AND PASS PREPROCESSOR FLAGS TO CLANG.
return subprocess.check_output([CLANG_CC, '-DFETCH_DEBUG=1', '-E', '-P', '-C', '-x', 'c', filename])

def read_and_preprocess(filename):
f = open(filename, 'r').read()
pos = 0
include_pattern = re.compile('^#include\s*["<](.*)[">]\s?$', re.MULTILINE)
while(1):
m = include_pattern.search(f, pos)
if not m:
return f
included_file = open(os.path.join(os.path.dirname(filename), m.groups(0)[0]), 'r').read()
def find_lines_with_preprocessor_directives(file):
return filter(lambda x: '#if' in x or '#elif' in x, open(file, 'r').readlines())

def run_c_preprocessor_on_file(src, dst):
# Run LLVM's C preprocessor on the given file, expanding #includes and variables found in src/setting.js.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this overlaps with part of #5296 (the preprocessing bit from python). As discussed there, how about doing this in python - just iterate over lines, find lines with #if and their matching #endif, and compute their condition, etc.?

I don't think we need the full c preprocessor here, and the complexity (as discussed in the comment here) worries me.

Copy link
Member

@dschuff dschuff Aug 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read #5296 yet (I guess I should) but your suggestion alarms me. We really need to be decreasing rather than increasing the number of places and ways that we munge the code. e.g. we have the CPP already in lots of places, I'd rather use that than put yet another custom ad-hoc bit if python code that parses the input and spits it out again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about doing this in python

I did not want to create a second implementation of the partial C preprocessor in src/parseTools.js. Using LLVM to preprocess can be considered Correct out of the box, so that won't need to care about adding tests, especially tests to check that python preprocessor and src/parseTools.js would agree. Writing a preprocessor is hard (#5457, #5458, [3]). I took a pass long time ago at fixing these types of issues with a more complete preprocessor in JS (#1969), but its complexity was too high that we did not want to land it.

I don't think I'm able to implement a preprocessor in Python that I would feel good about, that would either have less complexity than the approach at #1969 had, or if settling for less supported features, not have various similar kinds of limits that people eventually run into with the current JS preprocessor. That is why I felt really glad to notice I can just reuse LLVM's preprocessor here, albeit with a slight change of behavior. Getting the ability to do #define, #if A || B and #if !A && B for "free" are big improvements, so I'd strongly like to use LLVM to be able to get to use those.

Before implementing the whitelist heuristic, I wrote a blacklisting heuristic, with manually listed identifiers that should not get expanded, which only contained TOTAL_MEMORY for the purposes of pthread-main.js and Fetch.js. That felt a bit more dirty than this.

Some alternatives to be able to drop the whitelisting mechanism:

  • expand the src/settings.js defines to code with an alias. For example, we could give all defines there a S_ prefix, so that TOTAL_MEMORY would become -DS_TOTAL_MEMORY=1 for the preprocessor.
  • Don't expand src/settings.js defines to code automatically, but have a directive in src/Settings.js, with something like exportToPreprocessor('TOTAL_MEMORY', (optional)'S_TOTAL_MEMORY'); which would tell this machinery to take the TOTAL_MEMORY variable and preprocess with it, possibly using the specified optional alias.
  • Expand src/settings.js defines automatically, but add a hideFromPreprocessor('TOTAL_MEMORY'); directive or similar to allow removing some fields, or a exportToPreprocessor('TOTAL_MEMORY', (optional)'S_TOTAL_MEMORY'); as well to allow routing the defines under other names.

Overall, running C preprocessor on our .js files is the best way we have to offer as efficient as possible form of "don't pay for what you don't use" type of dead code elimination, to be able to get to tiny compiled applications that a lot of people are looking for. Having the "full" LLVM preprocessor available to do that would be quite helpful here. Is there some variant above that you think would work better than the current whitelist mechanism in this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying about the benefits of the LLVM preprocessor. It would be nice to use it, but I think we should only do so if (1) we can do it without hacks, and (2) we have one preprocessor for everything (or at least if we have 2 then they are simple and have identical behavior).

So I'm saying that if we want to use the LLVM preprocessor, let's replace the js one too. I'm not sure how that would work in terms of calling it from inside the js compiler though. Aside from that are the issues you mentioned with making it work without hacks, might need some refactoring of things (like no longer using {{{ }}} stuff, etc.). But maybe this could be done.

Alternatively, I still think a limited preprocessor - basically the one we have in js right now - is simple and gives 95% of the benefits. Instead of adding another preprocessor, we can run the existing one in js for our purposes here, we would just need a little refactoring to make it usable standalone so it is callable from python (this was proposed in #5296). This seems like very little work, but I admit it isn't as good a final result as using the LLVM preprocessor.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) we can do it without hacks,

I am not sure the "preprocess only src/settings.js defines that are present on lines starting with #if and #elif" warrants to be called a hack over the JS one, since that was added to implement exactly the same semantic that the existing JS one gives so that these two align.

(2) we have one preprocessor for everything

This sounds like a good direction, although it is too much to work on right now, so let me pause here, and splice the fix to the multithreading stack cookie setting to a separate PR, and pick this up when I have the cycles to start this kind of refactoring.

# Historically, .js file preprocessing only expands variables found in #if etc. statements, and .js
# code uses some setting names as variables as well. For example, pthread-main.js has a variable
# TOTAL_MEMORY, which is also a Setting name. Therefore detect to only expand those Setting names that
# are referred to if and #elif defines - but that expansion is done globally in the file, so it will
# preclude one from doing things like
#
# var TOTAL_MEMORY = {{{ TOTAL_MEMORY }}};
# if TOTAL_MEMORY > 65536
#
# Still, this should give a good balance to be compatible with existing behavior.

# Find the #if lines that we'll allow expanding.
whitelisted_defines = find_lines_with_preprocessor_directives(src)

def any_string_contains(string_list, substr):
for s in string_list:
if substr in s:
return True
return False

defines = []
for s in Settings.attrs:
if any_string_contains(whitelisted_defines, s):
d = '-D' + s + '=' + str(Settings.attrs[s])
logging.debug('Expanding #define ' + d + ' when preprocessing file ' + src)
defines += [d]

response_filename = response_file.create_response_file(defines, TEMP_DIR)
preprocessed = subprocess.check_output([CLANG_CC, '-E', '-P', '-C', '-x', 'c', '@' + response_filename, src])
try_delete(response_filename)

f = f[:m.start(0)] + included_file + f[m.end(0):]
if dst: open(dst, 'w').write(preprocessed)
return preprocessed

# Generates a suitable fetch-worker.js script from the given input source JS file (which is an asm.js build output),
# and writes it out to location output_file. fetch-worker.js is the root entry point for a dedicated filesystem web
Expand Down Expand Up @@ -2696,7 +2719,7 @@ def make_fetch_worker(source_file, output_file):
func_code = src[loc:end_loc]
function_prologue = function_prologue + '\n' + func_code

fetch_worker_src = function_prologue + '\n' + clang_preprocess(path_from_root('src', 'fetch-worker.js'))
fetch_worker_src = function_prologue + '\n' + run_c_preprocessor_on_file(path_from_root('src', 'fetch-worker.js'), dst=None)
open(output_file, 'w').write(fetch_worker_src)


Expand Down