#oe4 Unterminated string within statement, no error supposed to be raised?

I’m stuck on Stage #oe4.

Before I start fiddling with this: what is the desired output? Should the scanner error still be raised, but without failing the test? Or do I need a special case for semicolons in my scanner, so that unterminated strings inside statements are allowed?

Here are my logs:

[tester::#OE4] [test-4] [test.lox] // This program prints the result of a comparison operation
[tester::#OE4] [test-4] [test.lox] // It also tests multi-line strings and non-ASCII characters
[tester::#OE4] [test-4] [test.lox] print false != true;
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "74
[tester::#OE4] [test-4] [test.lox] 11
[tester::#OE4] [test-4] [test.lox] 89
[tester::#OE4] [test-4] [test.lox] ";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "There should be an empty line above this.";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "(" + "" + ")";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "non-ascii: ॐ";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] $ ./your_program.sh run test.lox
[your_program] [line 5] Error: Unterminated string.
[your_program] [line 8] Error: Unterminated string.
[tester::#OE4] [test-4] expected no error (exit code 0), got exit code 65

Here’s the scanner logic:

            # Handle strings
            elif char.value == '"':
                self.advance()  # Consume the opening quote
                string_value = ""

                while True:
                    next_char = self.peek()

                    if not next_char:  # Handle unexpected None
                        print(f"[line {current_line}] Error: Unterminated string.", file=sys.stderr)
                        self.lex_errors = True
                        break

                    if next_char.value == '"':  # Found the closing quote
                        self.advance()  # Consume the closing quote
                        tokens.append(Token(TokenType.STRING, f'"{string_value}"', string_value, current_line))
                        break

                    if next_char.value == '\n' or self.is_at_end_of_line():  # Unterminated string due to newline
                        print(f"[line {current_line}] Error: Unterminated string.", file=sys.stderr)
                        self.lex_errors = True
                        break

                    string_value += self.advance().value  # Safe to add

Strings in Lox are allowed to span multiple lines. From the book:

For no particular reason, Lox supports multi-line strings. There are pros and cons to that, but prohibiting them was a little more complex than allowing them, so I left them in. That does mean we also need to update line when we hit a newline inside a string.

So the string that starts on line 5 is terminated on line 8.
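A minimal sketch of what that quoted passage describes, for anyone landing here later. It scans a flat Python string instead of the Character-based scanner in the question, and the function name `scan_string` and its return shape are made up for illustration: newlines are kept as part of the string, and only the line counter changes.

```python
def scan_string(source: str, start: int, line: int):
    """Scan a string literal whose opening quote is at source[start].

    Returns (value, next_index, line). Raises SyntaxError if the input
    ends before a closing quote is found.
    """
    i = start + 1  # skip the opening quote
    chars = []
    while i < len(source) and source[i] != '"':
        if source[i] == "\n":
            line += 1  # newlines are allowed; just keep the line count right
        chars.append(source[i])
        i += 1
    if i >= len(source):
        # Only running out of input makes the string unterminated
        raise SyntaxError(f"[line {line}] Error: Unterminated string.")
    return "".join(chars), i + 1, line  # i + 1 skips the closing quote


value, next_index, line = scan_string('"48\n82\n66\n";', 0, 5)
print(repr(value), next_index, line)  # '48\n82\n66\n' 11 8
```

With the string starting on line 5, the returned line is 8, which matches the test file: one STRING token, and the `;` that follows is scanned on line 8.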

Thanks, that helps. I must have missed it because there’s no newline token, so the tokenizer output would have been the same whether or not I handled newlines inside strings correctly. Also, why did I not have elifs??

Corrected code:

                while True:
                    next_char = self.peek()

                    if not next_char or self.is_at_end():  # Handle unexpected None
                        print(f"[line {current_line}] Error: Unterminated string.", file=sys.stderr)
                        self.lex_errors = True
                        break

                    elif next_char.value == '"':  # Found the closing quote
                        self.advance()  # Consume the closing quote
                        tokens.append(Token(TokenType.STRING, f'"{string_value}"', string_value, current_line))
                        break
                    elif next_char.value == '\n':  # Newline inside the string
                        self.advance()
                        string_value += '\n'
                        break
                    else:
                        string_value += self.advance().value  # Safe to add

As it happens, I now have a new parser error, which I assume is down to newline and whitespace handling. I added line numbers to the parser error output, but that hasn’t helped.

[tester::#OE4] [test-4] [test.lox] // This program prints the result of a comparison operation
[tester::#OE4] [test-4] [test.lox] // It also tests multi-line strings and non-ASCII characters
[tester::#OE4] [test-4] [test.lox] print true != false;
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "48
[tester::#OE4] [test-4] [test.lox] 82
[tester::#OE4] [test-4] [test.lox] 66
[tester::#OE4] [test-4] [test.lox] ";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "There should be an empty line above this.";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "(" + "" + ")";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] [test.lox] print "non-ascii: ॐ";
[tester::#OE4] [test-4] [test.lox] 
[tester::#OE4] [test-4] $ ./your_program.sh run test.lox
[your_program] Parser error: [line 7] Expect ';' after value.
[tester::#OE4] [test-4] expected no error (exit code 0), got exit code 65

Never mind, it’s still down to me not handling multi-line strings correctly, just in a different way. Do I need some kind of “invisible” token for newlines? I can’t see where this is mentioned in the book.
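A newline token should not be needed: the newline is just another character inside the string's value. One plausible reading of the parser error is that the `break` in the newline branch of the corrected loop above leaves the string without ever appending a STRING token, so scanning resumes in the middle of the literal. The sketch below mimics that loop over a flat Python string; the function name and return shape are hypothetical, not the scanner from the question.

```python
def scan_string_breaking_on_newline(source: str, start: int):
    """Mimic the loop above: stop at the first newline without a token.

    Returns (token_or_None, next_index).
    """
    i = start + 1  # skip the opening quote
    value = ""
    while i < len(source):
        ch = source[i]
        if ch == '"':
            return ("STRING", value), i + 1
        if ch == "\n":
            value += "\n"
            return None, i + 1  # the `break`: no token is appended
        value += ch
        i += 1
    return None, i  # ran out of input


source = 'print "48\n82\n66\n";'
token, next_index = scan_string_breaking_on_newline(source, source.index('"'))
print(token)                      # None
print(repr(source[next_index:]))  # '82\n66\n";'
```

If that is what happens, the scanner goes on to read `82` and `66` as number tokens, so the parser sees `print 82` followed by `66` on line 7 where it wants a `;`, which would line up with "[line 7] Expect ';' after value."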

EDIT (SOLVED):

Basically, I forgot how I had implemented my advance method for the scanner, which did a lot of the work for me.

self.advance: note that this returns a Character object, which I defined earlier to keep track of the position of tokens in the file. It can be converted to a Python string with str(character), because the Character class defines a __str__ method.

def advance(self):
        """Consume the current character and return it"""
        if self.is_at_end():
            return None
        
        current_line = self.file_content.lines[self.current_line_index]
        if self.is_at_end_of_line():
            # Move to the next line
            self.current_line_index += 1
            self.current_char_index = 0
            return '\n'  # Return newline as a special character
        
        char = current_line.characters[self.current_char_index]
        self.current_char_index += 1
        return char

Then the snippet with the fix:

elif next_char.value == '"':  # Found the closing quote
                        self.advance()  # Consume the closing quote
                        tokens.append(Token(TokenType.STRING, f'"{string_value}"', string_value, current_line))
                        break
                    else:
                        # Hackery going on here: self.advance() also moves to the next line at end of line, as well as returning the character
                        string_value += str(self.advance())
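
A small sketch of why wrapping the result in str() matters here. advance() returns either a Character or, at the end of a line, the plain string '\n', and str() turns both into text. The Character class below is a hypothetical stand-in with assumed fields (value, line, column), not the class from my scanner.

```python
class Character:
    """Hypothetical stand-in: a character plus its position in the file."""

    def __init__(self, value: str, line: int, column: int):
        self.value = value
        self.line = line
        self.column = column

    def __str__(self) -> str:
        return self.value


# What successive advance() calls might return while inside "48\n8...
pieces = [Character("4", 5, 7), Character("8", 5, 8), "\n", Character("8", 6, 0)]

string_value = ""
for piece in pieces:
    string_value += str(piece)  # works for Character and for plain '\n'

print(repr(string_value))  # '48\n8'
```

Using `.value` instead of str() would fail on the '\n' case, because a plain Python string has no `value` attribute.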