-
Notifications
You must be signed in to change notification settings - Fork 2
Conversation
WalkthroughThe pull request introduces support for assembly function declarations in the compiler. This involves extending the lexer to recognize a new "asmf" keyword, adding a corresponding token type, updating the Abstract Syntax Tree (AST) to include a new node type for assembly function declarations, and modifying the parser to handle the new syntax. The changes are focused on expanding the language's capabilities by allowing inline assembly function definitions with a specific parsing mechanism. Changes
Sequence DiagramsequenceDiagram
participant Lexer
participant Parser
participant AST
Lexer->>Lexer: Recognize "asmf" keyword
Lexer->>Parser: Send ASM_FUNCTION token
Parser->>Parser: Call parseASMFunctionDeclaration()
Parser->>AST: Create AST_ASM_FUNCTION_DECLARATION node
AST-->>Parser: Return parsed node
Parser->>AST: Link node to AST
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (5)
src/lexer/tokens.h (1)
5-5: Maintain logical grouping of token types in the enumThe new token
ASM_FUNCTIONis added at the top of theTokenTypeenum. For better readability and maintainability, consider placing it near related tokens, such asFUNCTION, to group similar token types together.Apply this diff to reposition the new token type:
enum TokenType { - ASM_FUNCTION, FUNCTION, + ASM_FUNCTION, RETURN, VAR,src/parser/asts/functions.h (2)
33-36: Correct typo in function documentation commentThere's a typographical error in the comment: "defintition" should be "definition".
Apply this diff to fix the typo:
/** - * Parse the parameters from a function defintition (for example). + * Parse the parameters from a function definition (for example). * @param result the lexer result. * @param index the starting index of the parsing. */
38-38: Add missing newline at the end of fileFor consistency and to adhere to POSIX standards, add a newline character at the end of the file.
src/parser/ast.h (1)
22-23: Ensure consistent ordering of AST node typesThe new
AST_ASM_FUNCTION_DECLARATIONnode is added between existing node types. For better organization, consider placing it alongside other function-related node types, such as nearAST_FUNCTION_DECLARATION.Apply this diff to reposition the new AST node type:
AST_VARIABLE_REFERENCE, + AST_FUNCTION_DECLARATION, + AST_ASM_FUNCTION_DECLARATION, + AST_FUNCTION_HEADER, - - AST_ASM_FUNCTION_DECLARATION, - - AST_FUNCTION_DECLARATION, - AST_FUNCTION_HEADER,src/lexer/lexer.c (1)
77-79: Consider consistent naming conventionThe keyword "asmf" might not be intuitive. Consider using a more descriptive keyword like "asm_func" or "asmfunc" to maintain consistency with other keywords.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
src/lexer/lexer.c(1 hunks)src/lexer/tokens.h(1 hunks)src/parser/ast.h(1 hunks)src/parser/asts/functions.c(3 hunks)src/parser/asts/functions.h(1 hunks)src/parser/parser.c(1 hunks)
🧰 Additional context used
🪛 cppcheck (2.10-2)
src/parser/asts/functions.c
[error] 185-185: Common realloc mistake
(memleakOnRealloc)
🔇 Additional comments (3)
src/parser/asts/functions.c (1)
151-153: Verify token type before parsing parameters
The condition if(result.tokens[index].type != KEYWORD) may not accurately represent all valid cases for parsing parameters. Ensure that this condition aligns with the lexer's token definitions and consider whether other token types should be accepted.
Please confirm that only KEYWORD tokens are valid at this point, or adjust the condition accordingly.
src/parser/parser.c (1)
47-54: Verify the implementation of parseASMFunctionDeclaration
The ASM function parsing logic follows the same pattern as other declarations, which is good. However, we need to verify the implementation of parseASMFunctionDeclaration.
Consider adding error handling for invalid ASM function declarations:
case ASM_FUNCTION:
node = parseASMFunctionDeclaration(result, index);
if(node != NULL) {
current->next = node;
current = node;
index = node->endingIndex;
+ } else {
+ // Log error or handle invalid ASM function declaration
}
break;src/lexer/lexer.c (1)
77-79: 🛠️ Refactor suggestion
Verify ASM_FUNCTION token type definition
The lexer correctly recognizes the "asmf" keyword, but we should verify the ASM_FUNCTION token type definition.
Consider improving the keyword handling:
- Use a hash table or switch statement for better performance with multiple keywords
- Add validation for keyword length to prevent buffer overflow:
else if(isalpha(c)) {
int keywordLen = 0;
+ const int MAX_KEYWORD_LENGTH = 31; // buffLen - 1 for null terminator
while(isalpha(c)) {
+ if (keywordLen >= MAX_KEYWORD_LENGTH) {
+ // Handle keyword too long error
+ break;
+ }
buff[keywordLen] = c;
keywordLen++;
c = *string++;
}
+ buff[keywordLen] = '\0'; // Ensure null termination✅ Verification successful
ASM_FUNCTION token type is properly integrated
The token type is correctly defined in the enum and properly integrated across the codebase:
- Defined in
tokens.henum - Used in parser's AST definitions (
ast.h) - Handled in parser implementation (
parser.c) - Correctly processed in AST function declarations (
functions.c)
The lexer changes are consistent with the token's usage throughout the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for ASM_FUNCTION token type definition
echo "Searching for token type definition..."
rg "ASM_FUNCTION" "./src/lexer/tokens.h"
Length of output: 143
Script:
#!/bin/bash
# Let's check the complete token enum definition to understand the context
echo "Checking token enum definition..."
rg -A 5 -B 5 "ASM_FUNCTION" "./src/lexer/tokens.h"
# Also verify if ASM_FUNCTION is used in other parts of the codebase
echo -e "\nChecking ASM_FUNCTION usage across codebase..."
rg "ASM_FUNCTION" -g '!tokens.h'
Length of output: 648
| for(; index <= result.size; ++index) { | ||
| struct Token t = result.tokens[index]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct loop boundary condition to prevent out-of-bounds access
The loop condition index <= result.size may cause an out-of-bounds access on result.tokens[index] when index == result.size. Since array indices start from 0, the last valid index is result.size - 1.
Apply this diff to fix the loop condition:
- for(; index <= result.size; ++index) {
+ for(; index < result.size; ++index) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for(; index <= result.size; ++index) { | |
| struct Token t = result.tokens[index]; | |
| for(; index < result.size; ++index) { | |
| struct Token t = result.tokens[index]; |
| char c; | ||
| while(c = *t.value++) { | ||
| buff[buffIndex] = c; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid modifying the original token value pointer
The line while(c = *t.value++) advances t.value, which might affect subsequent operations that rely on t.value. To prevent unintended side effects, use a separate pointer to iterate over the string.
Apply this diff to refactor the code:
- while(c = *t.value++) {
+ char* valPtr = t.value;
+ while((c = *valPtr++)) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| char c; | |
| while(c = *t.value++) { | |
| buff[buffIndex] = c; | |
| char c; | |
| char* valPtr = t.value; | |
| while((c = *valPtr++)) { | |
| buff[buffIndex] = c; |
| if(buffIndex >= buffSize) { | ||
| buffSize *= 1.5; | ||
| buff = realloc(buff, buffSize); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix potential memory leak when reallocating buffer
If realloc fails, it returns NULL and the original memory is not freed, leading to a memory leak. Assigning the result of realloc directly to buff without checking for NULL could overwrite the original pointer. Use a temporary pointer to check the result before assigning it back to buff.
Apply this diff to fix the issue:
- buff = realloc(buff, buffSize);
+ char* temp = realloc(buff, buffSize);
+ if (temp == NULL) {
+ free(buff);
+ return NULL;
+ }
+ buff = temp;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if(buffIndex >= buffSize) { | |
| buffSize *= 1.5; | |
| buff = realloc(buff, buffSize); | |
| } | |
| if(buffIndex >= buffSize) { | |
| buffSize *= 1.5; | |
| char* temp = realloc(buff, buffSize); | |
| if (temp == NULL) { | |
| free(buff); | |
| return NULL; | |
| } | |
| buff = temp; | |
| } |
🧰 Tools
🪛 cppcheck (2.10-2)
[error] 185-185: Common realloc mistake
(memleakOnRealloc)
| char* buff = malloc(buffSize); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for malloc failure to ensure robust memory allocation
The malloc call for buff does not check if the allocation was successful. If malloc returns NULL, subsequent operations on buff will result in undefined behavior. Add a check to handle allocation failure.
Apply this diff to handle malloc failure:
char* buff = malloc(buffSize);
+ if (buff == NULL) {
+ // Handle allocation failure
+ return NULL;
+ }Committable suggestion skipped: line range outside the PR's diff.
Summary by CodeRabbit
New Features
ASM_FUNCTIONfor assembly functions.AST_ASM_FUNCTION_DECLARATIONfor assembly function declarations.Bug Fixes
Documentation