Grep: Are simplified test cases on purpose?

Hi there
I noticed that in the “build your own grep” challenge the test cases can be passed by the simplest possible implementation of each stage. Some combinations would make the implementation much harder: for instance, testing a pattern like c[ae]+t against ‘cat’, ‘caat’ and ‘caeat’ would be far harder to pass than the existing tests. I assume this is on purpose and that implementing a “complete” grep isn’t expected. Am I correct on that?

We’re open to adding simple tests, we just wanted to avoid cryptic tests that’d make it super difficult to figure out what the expected behaviour is.

The stage you’re referring to is “Match one or more times”, yes?

These are the tests for that stage:

It’s intentional that we haven’t introduced the [ syntax yet; that’d likely be part of a “Character class” extension (see Character Classes and Bracket Expressions, GNU Grep 3.11).

There are a ton of cases to handle there, like [0-9], [a-z] etc.
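To give a rough idea of what handling ranges like [0-9] or [a-z] involves, here is a minimal sketch in Python. The names `parse_class` and `class_matches` are made up for illustration and are not part of the challenge or its tests; it only covers single characters, ranges, and `^` negation, not named classes like `[:alpha:]`.

```python
def parse_class(pattern, i):
    """Parse a bracket expression starting at pattern[i] == '['.

    Returns (negated, items, next_index), where items holds single
    characters and (lo, hi) tuples for ranges such as 0-9.
    """
    assert pattern[i] == "["
    i += 1
    negated = i < len(pattern) and pattern[i] == "^"
    if negated:
        i += 1
    items = []
    first = True  # a ']' in first position is a literal, as in POSIX
    while i < len(pattern) and (pattern[i] != "]" or first):
        first = False
        is_range = (
            i + 2 < len(pattern)
            and pattern[i + 1] == "-"
            and pattern[i + 2] != "]"
        )
        if is_range:
            items.append((pattern[i], pattern[i + 2]))
            i += 3
        else:
            items.append(pattern[i])
            i += 1
    if i >= len(pattern):
        raise ValueError("unterminated bracket expression")
    return negated, items, i + 1  # skip the closing ']'


def class_matches(negated, items, ch):
    """Return True if ch is accepted by the parsed bracket expression."""
    hit = any(
        (item[0] <= ch <= item[1]) if isinstance(item, tuple) else item == ch
        for item in items
    )
    return hit != negated
```

Even this small version has to decide what a trailing `-` or a leading `]` means, which is part of why it would make more sense as its own extension.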

@syeo66 does that answer your question? We definitely don’t want to make it easy to pass with a naive implementation, but the [ char specifically would be out of scope here.

(Looking at these tests again, I do think we could add some randomness though - like not always testing the same pattern)


Yeah, I think this answers the question. I guess my point was that I know how complex implementations can get when you combine modifiers, wildcards and character groups (it requires evaluating multiple paths or recursive descent parsing).
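For anyone curious what “multiple paths” means in practice, here is a small backtracking sketch in Python. It is only an illustration under my own assumptions, not how the challenge expects you to do it: `tokenize`, `match_here` and `search` are hypothetical names, and it only supports literals, `.`, simple bracket sets without ranges, and `+`.

```python
def tokenize(pattern):
    """Split a pattern into (chars, is_plus) pairs.

    chars is a frozenset of accepted characters, or None for '.'.
    """
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            end = pattern.index("]", i + 1)  # ValueError if unterminated
            chars = frozenset(pattern[i + 1:end])
            i = end + 1
        elif ch == ".":
            chars = None
            i += 1
        else:
            chars = frozenset(ch)
            i += 1
        plus = i < len(pattern) and pattern[i] == "+"
        if plus:
            i += 1
        tokens.append((chars, plus))
    return tokens


def match_here(tokens, t, text, pos):
    """Try to match tokens[t:] against text starting exactly at pos."""
    if t == len(tokens):
        return True
    chars, plus = tokens[t]

    def fits(p):
        return p < len(text) and (chars is None or text[p] in chars)

    if not fits(pos):
        return False
    if not plus:
        return match_here(tokens, t + 1, text, pos + 1)
    # '+': consume as many characters as possible, then give them back
    # one at a time until the rest of the pattern matches (backtracking).
    end = pos
    while fits(end):
        end += 1
    while end > pos:
        if match_here(tokens, t + 1, text, end):
            return True
        end -= 1
    return False


def search(pattern, text):
    """Return True if pattern matches anywhere in text."""
    tokens = tokenize(pattern)
    return any(match_here(tokens, 0, text, s) for s in range(len(text) + 1))
```

With this, `search("c[ae]+t", "caeat")` matches, and so does `search("ca+at", "caat")`, which a purely greedy `+` without the give-back loop would miss because `a+` would swallow both `a` characters.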

