What parser state should I be in for LE5?

I’m stuck on Stage #GU3

Specifically, when I implemented GU3, my previously passing LE5 implementation broke.

Here are my logs:

[tester::#GU3] Running tests for Stage #GU3 (Quoting - Backslash within double quotes)
[tester::#GU3] [setup] export PATH=/tmp/pineapple/mango/raspberry:$PATH
[tester::#GU3] [setup] echo -n "pineapple apple." > "/tmp/quz/\"f 14\""
[tester::#GU3] [setup] echo -n "pineapple banana." > "/tmp/quz/\"f\\99\""
[tester::#GU3] [setup] echo -n "pear banana." > "/tmp/quz/f6"
[tester::#GU3] Running ./your_program.sh
[your-program] $ echo "script'world'\\'example"
[your-program] script'world'\'example
[tester::#GU3] ✓ Received expected response
[your-program] $ echo "script\"insidequotes"world\"
[your-program] script"insidequotesworld"
[tester::#GU3] ✓ Received expected response
[your-program] $ echo "mixed\"quote'shell'\\"
[your-program] mixed"quote'shell'\
[tester::#GU3] ✓ Received expected response
[your-program] $ cat '/tmp/quz/"f 14"' '/tmp/quz/"f\99"' '/tmp/quz/f6'
[your-program] pineapple apple.pineapple banana.pear banana.
[tester::#GU3] ✓ Received expected response
[your-program] $ 
[tester::#GU3] Test passed.

[tester::#LE5] Running tests for Stage #LE5 (Quoting - Backslash within single quotes)
[tester::#LE5] [setup] export PATH=/tmp/raspberry/blueberry/orange:$PATH
[tester::#LE5] Running ./your_program.sh
[tester::#LE5] [setup] echo -n "pear banana." > "/tmp/bar/'f 90'"
[tester::#LE5] [setup] echo -n "apple pear." > "/tmp/bar/'f  \\12'"
[tester::#LE5] [setup] echo -n "orange grape." > "/tmp/bar/'f \\84\\'"
[your-program] $ echo 'test\\nworld'
[your-program] test\\nworld
[tester::#LE5] ✓ Received expected response
[your-program] $ echo 'script\"helloexample\"test'
[your-program] script\"helloexample\"test
[tester::#LE5] ✓ Received expected response
[your-program] $ echo 'example\\nworld'
[your-program] example\\nworld
[tester::#LE5] ✓ Received expected response
[your-program] $ cat "/tmp/bar/'f 90'" "/tmp/bar/'f  \12'" "/tmp/bar/'f \84\'"
[your-program] pear banana.$ 
[tester::#LE5] Output does not match expected value.
[tester::#LE5] Expected: "pear banana.apple pear.orange grape."
[tester::#LE5] Received: "pear banana.$ "
[tester::#LE5] Assertion failed.
[tester::#LE5] Test failed

My cargo tests are working as follows:

    #[test]
    fn test_le5_broken() {
        let test_string =
            "cat \"/tmp/baz/\'f 58\'\" \"/tmp/baz/\'f  \\15\'\" \"/tmp/baz/\'f \\13\'\"";
        assert_eq!(
            parse_without_scanner(test_string),
            [
                "cat",
                "/tmp/baz/\'f 58\'",
                "/tmp/baz/\'f  \\15\'",
                "/tmp/baz/\'f \\13\'"
            ]
        )
    }

And here’s a snippet of my code:

{
    let mut out: Vec<String> = Vec::new();
    let mut curr = Vec::new();
    let mut state = ParseState::Normal;
    for c in input.trim().chars() {
        match c {
            ' ' => match &state {
                ParseState::Normal => {
                    if !curr.is_empty() {
                        out.push(curr.into_iter().collect());
                        curr = Vec::new();
                    }
                    state = ParseState::Normal;
                }
                ParseState::Escape(v) => {
                    curr.push(c);
                    state = match v {
                        PrevState::Single => ParseState::Single,
                        PrevState::Normal => ParseState::Normal,
                        PrevState::Double => ParseState::Double,
                    }
                }

                _ => curr.push(c),
            },
            '\'' => match &state {
                ParseState::Normal => {
                    state = ParseState::Single;
                }
                ParseState::Single => {
                    state = ParseState::Normal;
                }
                ParseState::Double => {
                    curr.push(c);
                }
                ParseState::Escape(v) => {
                    curr.push(c);
                    state = match v {
                        PrevState::Single => ParseState::Single,
                        PrevState::Normal => ParseState::Normal,
                        PrevState::Double => ParseState::Double,
                    }
                }
            },
            '\"' => match &state {
                ParseState::Normal => {
                    state = ParseState::Double;
                }
                ParseState::Single => {
                    curr.push(c);
                }
                ParseState::Double => {
                    state = ParseState::Normal;
                }
                ParseState::Escape(v) => {
                    state = match v {
                        PrevState::Single => ParseState::Single,
                        PrevState::Normal => {
                            curr.push(c);
                            ParseState::Normal
                        }
                        PrevState::Double => {
                            curr.push(c);
                            ParseState::Double
                        }
                    }
                }
            },
            '\\' => match &state {
                ParseState::Normal => {
                    state = ParseState::Escape(PrevState::Normal);
                }
                ParseState::Double => {
                    state = ParseState::Escape(PrevState::Double);
                }
                ParseState::Single => curr.push(c),
                ParseState::Escape(v) => {
                    curr.push(c);
                    state = match v {
                        PrevState::Single => ParseState::Single,
                        PrevState::Double => ParseState::Double,
                        PrevState::Normal => ParseState::Normal,
                    }
                }
            },

            s => curr.push(s),
        }
    }
    if !curr.is_empty() {
        out.push(curr.into_iter().collect());
    }
    out
}

Since I’m tracking the state of the parser at each character: in LE5, which state should I be in inside those single quotes? I’m pretty sure the answer is that I should be using “double quotes” rules, but I wanted to make sure.

Hi @ilovenuclearpower, thanks for your detailed description of the issue!

It looks like this line is not correct:

Within double quotes, a backslash \ is almost always a literal \, except when it appears before \, $, ", or a newline.

To ensure that tests for all previous stages pass again, we can temporarily treat every \ as a literal by using curr.push(c):
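Here is a minimal sketch of what that temporary change could look like. It assumes the change goes in the Double arm of the '\\' match from the question. The enum definitions are guesses (the post doesn't show them), and on_backslash is a hypothetical helper used only to keep the example self-contained:

```rust
// Assumed shapes for the enums used in the question; the post doesn't show them.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum PrevState { Normal, Single, Double }

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum ParseState { Normal, Single, Double, Escape(PrevState) }

// Hypothetical helper that mirrors the '\\' arm of the match in the question.
fn on_backslash(state: ParseState, curr: &mut Vec<char>) -> ParseState {
    match state {
        // Outside quotes, a backslash still starts an escape.
        ParseState::Normal => ParseState::Escape(PrevState::Normal),
        // Temporary change: keep the backslash and stay in Double
        // instead of switching to Escape(PrevState::Double).
        ParseState::Double => { curr.push('\\'); ParseState::Double }
        // Inside single quotes, a backslash is always literal.
        ParseState::Single => { curr.push('\\'); ParseState::Single }
        // An escaped backslash is pushed, then the previous state is restored.
        ParseState::Escape(prev) => {
            curr.push('\\');
            match prev {
                PrevState::Single => ParseState::Single,
                PrevState::Double => ParseState::Double,
                PrevState::Normal => ParseState::Normal,
            }
        }
    }
}
```

With this in place, the LE5 paths such as "/tmp/bar/'f  \12'" should keep their backslashes, but GU3 inputs like \" inside double quotes are expected to fail again until the lookahead described in the next step is added.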

Next step:

  • Only transition to Escape when the next character is one that requires escaping (\, $, ", or newline).
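
One way to do that lookahead is with a peekable iterator. The sketch below is a standalone rewrite rather than a patch to the code in the question, and the names tokenize and Quote are made up for illustration. It uses the escape set listed above; POSIX sh also treats a backtick after a backslash as escapable inside double quotes, which could be added to the same match if needed.

```rust
// Illustrative quote tracker; not the ParseState enum from the question.
#[derive(Clone, Copy)]
enum Quote { None, Single, Double }

// Hypothetical tokenizer: splits a command line into arguments.
fn tokenize(input: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut curr = String::new();
    let mut in_token = false; // true once a token has started (covers "" too)
    let mut quote = Quote::None;
    let mut chars = input.trim().chars().peekable();

    while let Some(c) = chars.next() {
        match (quote, c) {
            // An unquoted space ends the current token.
            (Quote::None, ' ') => {
                if in_token {
                    out.push(std::mem::take(&mut curr));
                    in_token = false;
                }
            }
            (Quote::None, '\'') => { quote = Quote::Single; in_token = true; }
            (Quote::None, '"') => { quote = Quote::Double; in_token = true; }
            // Outside quotes, a backslash escapes whatever follows.
            (Quote::None, '\\') => {
                if let Some(next) = chars.next() {
                    curr.push(next);
                    in_token = true;
                }
            }
            (Quote::Single, '\'') => quote = Quote::None,
            (Quote::Double, '"') => quote = Quote::None,
            // Inside double quotes, look ahead before deciding.
            (Quote::Double, '\\') => {
                let next = chars.peek().copied();
                match next {
                    // Escapable character: drop the backslash, keep the character.
                    Some(n) if matches!(n, '\\' | '$' | '"') => {
                        curr.push(n);
                        chars.next();
                    }
                    // Backslash-newline is a line continuation: drop both.
                    Some('\n') => { chars.next(); }
                    // Anything else: the backslash is literal.
                    _ => curr.push('\\'),
                }
            }
            // Everything else is literal, including backslashes in single quotes.
            (_, other) => { curr.push(other); in_token = true; }
        }
    }
    if in_token {
        out.push(curr);
    }
    out
}
```

Because the decision is made at the backslash itself, no Escape state is needed for double quotes, so an ordinary character like the 1 in \12 can never leave the parser stuck in Escape.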