-
-
Notifications
You must be signed in to change notification settings - Fork 101
Add code format command #622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6f2019a
to
1065d15
Compare
Sonar detected a code duplication with |
8052a5e
to
968bcc1
Compare
@Tais993 reminder :) |
b491bb9
to
7889183
Compare
* The feature is secondary though, which is why its kept in RAM and not in the DB. | ||
*/ | ||
private final Cache<Long, Long> originalMessageToCodeReply = | ||
Caffeine.newBuilder().maximumSize(10_000).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice a general trend of massive caches, but is this needed? I feel like this is an easy way for trollers to "crash" our bot, at least I'd assume 10k entries takes up a lot of ram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Tais993 10k should take almost no RAM at all, id estimate around 100 KB for a full cache. u need 1000 such caches to reach 1 GB RAM.
the point of the cache is exactly that, to prevent against RAM blowups. with a traditional Map
, u have no proper limit and users are able to blow it up.
if u feel better, we can reduce it to maybe 2k? its not like it really matters for this feature anyways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done ✔️
reduced to 2k
@Nullable CodeAction disabledAction) { | ||
return labelToCodeAction.values().stream().map(action -> { | ||
Button button = createButtonForAction(action, originalMessageId); | ||
return action == disabledAction ? button.asDisabled() : button; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you can only disable 1 action?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the disabled action is the action that is currently active, yes.
like, if u have 3 buttons (format, run, bytecode) and u click on run, then u want run to be deactivated, cause it is currently active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im going to rename the param to currentlyActiveAction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done ✔️
} | ||
|
||
private static Stream<Arguments> provideExtractCodeTests() { | ||
return Stream.of(createExtractCodeArgumentsFor("basic", """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All honesty unreadable, honestly, using a list and using separate List#add calls would be a lot more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i doubt it would be more readable after spotless. but sure, can give it a try
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done ✔️
i think its actually more readable now, thanks
* Indexes tokens to contain information about whether they are code tokens or not | ||
* Formats the given string. | ||
* <p> | ||
* Best results are achieved for Java code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"best results", that sounds like this works for JS, but not as good.
Instead just say "Only works with Java"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"best results", that sounds like this works for JS, but not as good.
but it does. and its expected to be used for that as well.
right now, the format action is available for all languages and yields okayish results for everything i tested (except languages without semicolons)
import java.util.stream.Stream; | ||
|
||
/** | ||
* Queue that holds tokens to be consumed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The term "token" is very often used, but I haven't noticed much of a description.
I'd heavily appreciate it if you link to the Token class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats fair. i guess it was clear due to the context of lexxing (= tokenization).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done ✔️
added some extra paragraph and links
DOT("."), | ||
SEMICOLON(";"), | ||
METHOD_REFERENCE("::"), | ||
COLON(":", false, true), // technically not a "real" operator but used in an enhanced for loop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is needed, so that the lexxer recognizes it as individual token.
its still part of the list somewhere, i reordered a lot of stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
888e183
to
9fd8e1a
Compare
Kudos, SonarCloud Quality Gate passed! |
Going to merge this now, its quite old already and I do not want to hold back the feature for much longer, just bc people dont have time to review. There dont seem to be any red flags, so you can continue with the CR and Ill do proposed changes in a follow-up PR instead. |
Overview
Implements and closes #621 .
This PR does two things:
Code commands
UX
Once it detects code, it sends a message with buttons:
If the button is clicked, it attaches the result as embed to the bot message and disables the button:
Additionally, it auto-updates itself on every update of the original message:
As well as auto-deletes if the original message is deleted.
Extension
The code was also written in a generic way, such that adding new code-actions is very simple. Here is an example with some mock commands:
All that has to be done is implementing a simple interface:
and adding it to a list in
CodeMessageHandler
.Code detection
The feature activates on messages that contain code. Code is detected by the presence of code-blocks/fences.
This matching is done without regex to save crucial performance. The check is executed on each and every single posted message after all.
Formatter rework
No file was left untouched. The actual core logic and approach still is the same though. Just a major refactoring, documentation, unit tests, extra features and bug fixes.
The basic approach is:
Lexer
)CodeSectionFormatter
)The flow is available to the user via the class
Formatter
.Lexer
The core of the lexer is the enum
TokenType
, which lists all recognized tokens.Tokens are found by constantly matching the next token from the code. For example
int x = 5;
results in a list ofThe previous proof-of-concept version of the lexer had significant performance problems, which is why matching is now done very fast and performant by:
substring
s (CharBuffer
)Unit tests ensure, now in a very elegant way, that no type hides another as prefix (for example
:
is a prefix of::
and hence latter has to be matched first).Formatting
The actual formatting happens in
CodeSectionFormatter
. For most of the actual logic, it refers toFormatterRules
.Essentially, it iterates through all tokens and constructs back a string. Each time, it has to decide stuff like:
It does so by maintaining some states, such as:
currentIndentLevel
currentGenericLevel
isStartOfLine
expectedSemicolonsInLine
isInPackageDeclaration
isInImportDeclaration
It is important that the states are kept rather slim, otherwise it would be too fragile for non-compiling/incorrect code.
The actual formatter rules can be a bit nasty, since there are so many edge cases. Keep in mind though, that the goal is not to create a 100% correct formatter. The formatter has to support incorrect code and must always yield at least okay-ish looking results.
There are lots of unit tests that cover a lot of cases and real code examples.
Checklist
add language support for buttons(can be done later, once needed)