@shade5144
Contributor

@shade5144 shade5144 commented May 22, 2025

Added format string capabilities to the puts function in printf.c and included the documentation in doc/contents/puts.tex

Other changes:

  • Added an algorithm for 64-bit by 64-bit division (_bdiv in printf.c; needs migration to a general math library), based on restoring division. It should be used for division by a variable; division by constants will be optimized by the compiler.
  • Updated the help dialogue in kernel.c to include all options.
  • Added an option for testing the format specifiers of puts (accessed by 'e' in the prompt). It is not listed in the help prompt because it exists only to test puts and is temporary.
  • Changed start.S in the Makefile to start.s for correct compilation.
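
For readers unfamiliar with restoring division, the following is a minimal sketch of the idea, not the code merged in this PR. The name sketch_div64 is hypothetical, and the divisor is assumed to be nonzero. It brings down one dividend bit per iteration and subtracts the divisor whenever the partial remainder is large enough, tracking the bit shifted out of the 64-bit remainder so that divisors above 2^63 are handled.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical name; the PR's own routine is _bdiv in printf.c.
 * Shift-and-subtract division in the restoring style.
 * Assumes divisor != 0. */
uint64_t sketch_div64(uint64_t dividend, uint64_t divisor, uint64_t *remainder)
{
    uint64_t quotient = 0;
    uint64_t rem = 0;

    for (int bit = 63; bit >= 0; bit--) {
        /* Remember the bit that the shift below pushes out of rem. */
        uint64_t carry = rem >> 63;

        /* Bring down the next dividend bit, most significant first. */
        rem = (rem << 1) | ((dividend >> bit) & 1u);

        /* If the (65-bit) partial remainder is at least the divisor,
         * subtract it and set the matching quotient bit. The unsigned
         * wrap-around gives the right result when carry is set. */
        if (carry || rem >= divisor) {
            rem -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }

    if (remainder != NULL)
        *remainder = rem;
    return quotient;
}
```

A call such as sketch_div64(value, 10, &digit) yields the next decimal digit in digit, which is the kind of use a %d conversion needs.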

Member

@chrisdedman chrisdedman left a comment

Thank you for your contribution. I have a couple of questions that I left in the code review before I can merge it. Otherwise, I think it is pretty good.

Let's talk about my comments :)

}

unsigned long long _bdiv(unsigned long long dividend, unsigned long long divisor, unsigned long long *remainder)
{
Member

We may want to add a safeguard in case the divisor is zero (division by zero would crash the system).

Contributor Author

This can be done later, along with a faster division algorithm. Right now the algorithm is only used to divide by 10 and 16, so it isn't an issue.

Member

Makes sense. I will add a comment in the code to note this, in case we forget about it later.
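
The safeguard discussed in this thread could look roughly like the sketch below. The name checked_div64 and the 0 / -1 return convention are assumptions made for illustration, and the built-in / and % operators stand in for the PR's _bdiv.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical wrapper: returns 0 on success, -1 if the divisor is zero.
 * On failure the output parameters are left untouched. */
int checked_div64(uint64_t dividend, uint64_t divisor,
                  uint64_t *quotient, uint64_t *remainder)
{
    if (divisor == 0)
        return -1;                            /* refuse to divide by zero */

    if (quotient != NULL)
        *quotient = dividend / divisor;       /* stand-in for _bdiv */
    if (remainder != NULL)
        *remainder = dividend % divisor;
    return 0;
}
```

With a check like this, callers that only divide by 10 or 16 pay one comparison, and a future caller passing a variable divisor gets an error code it can test.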

Contributor Author

One last thing about this for the future. While thinking about how to implement division, I found that Linux implements its own 64-bit by 64-bit division algorithm, in linux/math64.h (I believe). You can use that as a reference.

user/printf.c Outdated
#include <stdarg.h>
#include <stddef.h>

// TODO:
Member

What should the behavior be in cases like these?

puts("%d %d\n", 10);      // Fewer arguments than specifiers
puts("%d\n", 10, 20, 30); // More arguments than specifiers

or maybe this:

puts("Lone percent at end -> %\n"); // currently prints: Lone percent at end ->
puts("Unknown specifier -> %z\n"); // currently prints: Unknown specifier ->
puts("%s\n", NULL); // currently prints an empty char

Those are the few edge cases I can think of.

Contributor Author

@shade5144 shade5144 May 23, 2025

puts("%d %d\n", 10);      // Fewer arguments than specifiers
puts("%d\n", 10, 20, 30); // More arguments than specifiers

The current behaviour in the first case matches glibc printf: when a matching argument is missing, it reads a garbage value from the stack and prints it.

In the second case, only 10 will be printed, and the rest will be discarded without any fanfare.

puts("Lone percent at end -> %\n"); // currently prints: Lone percent at end ->
puts("Unknown specifier -> %z\n"); // currently prints: Unknown specifier ->
puts("%s\n", NULL); // currently prints an empty char

The first and second cases fall into the same category, because the \n is grouped with the % and treated as a specifier. Even if you leave a space after the %, the space is grouped with it. Suppose instead that a case like this were given:

puts("Percent: %");

In glibc printf, the above line returns -1 to indicate an error and prints nothing. I do not think this is necessary: whether the programmer meant to

  1. Print a %
  2. Print some other format specifier

they would know from the awkward output that something went wrong. Also, a partially printed line is probably better for debugging than a line that is not printed at all, since you can at least pinpoint where it went wrong.

I don't think the second case will be much of a problem. I intended unused and unrecognized format specifiers to be skipped, and anyone who wants to print something like %z of Y can use the format %%z of Y.

The third case is an oversight on my part, as it allows a null pointer dereference. Would it be adequate to print the string (null) when a NULL pointer is encountered?

If there are any other changes that you'd like, please let me know.
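
The policy described in this reply (skip unknown specifiers, drop a lone trailing %, print (null) for a NULL string, and use %% for a literal percent) can be sketched as follows. This is an illustrative sketch, not the code in printf.c: sketch_format and its helpers are hypothetical names, it writes into a caller-supplied buffer instead of the console so it can be tested, and it handles only %d, %s and %%.

```c
#include <stdarg.h>
#include <stddef.h>

/* Append one character if there is room, keeping space for the terminator. */
static void put_char(char *out, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap)
        out[(*len)++] = c;
}

/* Append a NUL-terminated string. */
static void put_str(char *out, size_t cap, size_t *len, const char *s)
{
    while (*s)
        put_char(out, cap, len, *s++);
}

/* Hypothetical stand-in for the formatted puts.
 * Returns the number of characters stored in out. */
int sketch_format(char *out, size_t cap, const char *fmt, ...)
{
    va_list args;
    size_t len = 0;

    if (cap == 0)
        return 0;

    va_start(args, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            put_char(out, cap, &len, *fmt);
            continue;
        }
        fmt++;                          /* character after the '%' */
        if (*fmt == '\0')
            break;                      /* lone '%' at the end: drop it */

        switch (*fmt) {
        case '%':                       /* "%%" prints a literal percent */
            put_char(out, cap, &len, '%');
            break;
        case 's': {
            const char *s = va_arg(args, const char *);
            put_str(out, cap, &len, s ? s : "(null)");
            break;
        }
        case 'd': {
            int v = va_arg(args, int);
            unsigned int u = v < 0 ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[12];
            int n = 0;

            if (v < 0)
                put_char(out, cap, &len, '-');
            do {                        /* collect digits, least significant first */
                digits[n++] = (char)('0' + u % 10);
                u /= 10;
            } while (u != 0);
            while (n > 0)
                put_char(out, cap, &len, digits[--n]);
            break;
        }
        default:
            break;                      /* unknown specifier: skip both characters */
        }
    }
    va_end(args);

    out[len] = '\0';
    return (int)len;
}
```

Under this sketch, a format ending in %\n loses both the % and the newline, because the newline is consumed as an unknown specifier, which matches the grouping described above.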

Member

For puts("Lone percent at end -> %\n"); and puts("Unknown specifier -> %z\n");, the C implementation gives a warning and prints the percent sign (see the screenshot from GDB). I'm not sure whether we should allow the same behavior. It could be worth checking what other kernels do.
image

For the NULL case, the C implementation prints (null), for example NULL -> (null). So I think, as you suggested, we should follow the same behavior and print the string (null) in this case.

Contributor Author

As for the warnings, I think they would just complicate the implementation without much payoff, as I mentioned earlier. But if you find a similar model in other kernels, let me know.

}

va_end(elem_list);
}
Member

Should we return the number of characters printed, as the C implementation of printf() does, for debugging purposes?
https://cplusplus.com/reference/cstdio/printf/

Contributor Author

I can see this being useful if, for example, execution stops with an error in one of the cases from your edge cases comment. Would there be any other use for it?

I suppose an error handling routine would then also be appropriate, like the errno and perror model used in the Linux API.

Member

Yes, I don't know what other use case we could have, but it could be a good idea to have the option.

I was thinking about having some error handling anyway, so maybe we can look up the implementation in the Linux API indeed.
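
One possible shape for the return value and error reporting discussed here is sketched below. Everything in it is an assumption for illustration: sketch_emit, sketch_errno and the SKETCH_* codes are hypothetical names, the global is not the libc errno, and real specifiers are skipped to keep the example short. It returns the number of characters written, or -1 with an error code set when the format ends in a lone %, while still keeping the partial output.

```c
#include <stddef.h>

/* Hypothetical error codes and error variable (not the libc errno). */
enum { SKETCH_OK = 0, SKETCH_EFORMAT = 1, SKETCH_ENOSPACE = 2 };
static int sketch_errno = SKETCH_OK;

/* Copy the literal parts of fmt into out.
 * Returns the number of characters stored, or -1 on error. */
int sketch_emit(char *out, size_t cap, const char *fmt)
{
    size_t len = 0;

    sketch_errno = SKETCH_OK;
    if (cap == 0) {
        sketch_errno = SKETCH_ENOSPACE;
        return -1;
    }

    for (; *fmt; fmt++) {
        if (*fmt == '%') {
            fmt++;
            if (*fmt == '\0') {          /* lone '%' at the very end */
                sketch_errno = SKETCH_EFORMAT;
                out[len] = '\0';         /* keep the partial output */
                return -1;
            }
            if (*fmt != '%')
                continue;                /* real specifiers omitted in this sketch */
        }
        if (len + 1 < cap)
            out[len++] = *fmt;           /* literal character or the '%' of "%%" */
    }

    out[len] = '\0';
    return (int)len;
}
```

A caller can then test for a negative return and inspect sketch_errno, in the same spirit as checking errno after a failed call.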

@chrisdedman chrisdedman linked an issue May 22, 2025 that may be closed by this pull request
Member

@chrisdedman chrisdedman left a comment

LGTM

@chrisdedman chrisdedman merged commit 3449a79 into sandbox-science:main May 24, 2025
Development

Successfully merging this pull request may close these issues.

Optional parameters for puts()
