#BR6: Go tail execution

I’m stuck on Stage #BR6.

I currently run executables with exec.Command and change the command's stdin, stdout, and stderr depending on what is happening with redirection. This has worked fine, including on the first test, as can be seen in the logs.

[tester::#BR6] [setup] echo -e "apple pear\nbanana pineapple\nblueberry mango\ngrape orange\nstrawberry raspberry" > "/tmp/ant/file-75"
[your-program] $ cat /tmp/ant/file-75 | wc
[your-program]        5      10      78
[tester::#BR6] ✓ Received expected response
[tester::#BR6] [setup] echo -e "1. mango banana\n2. orange raspberry\n3. pineapple grape" > "/tmp/fox/file-41"
[your-program] $ tail -f /tmp/fox/file-41 | head -n 5
[tester::#BR6] Didn't find expected line.
[tester::#BR6] Expected: "1. mango banana"
[tester::#BR6] Received: "" (no line received)
[tester::#BR6] Assertion failed.
[tester::#BR6] Test failed

tail works as expected on its own (though I need to exit with Ctrl+C), and it also works with

tail file.txt | head -n 2

But when adding -f, the tail command doesn't finish.

And here’s a snippet of my code:

func (e Executable) Exec(stdin io.Reader, stdout io.Writer, stderr io.Writer) {
    cmd := exec.Command(e.Literal, e.GetStringArgs()...)

    cmd.Stdin = stdin
    cmd.Stderr = stderr
    cmd.Stdout = stdout

    // Run blocks until the process exits.
    if err := cmd.Run(); err != nil {
        stderr.Write([]byte(err.Error()))
    }
    fmt.Println("finished executing")
}

When running tail -f | head -n 2, the fmt.Println is never reached. I'm guessing I need to change my exec.Command() call, but I can't find anything about ending cmd.Run().

The io.Reader and io.Writers are bytes.Buffer values until the final command, at which point where to send the output is evaluated.

Thanks

Since tail -f will keep its input, and thus its output, open, this suggests your head -n 2 subprocess is not terminating properly after reading two lines. Normally head -n 2 will exit after two lines, then tail -f sees its output is closed and terminates.
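To see that mechanism in isolation, here is a minimal Go sketch (using an in-process io.Pipe rather than real subprocesses, purely for illustration): a writer that loops "forever", like tail -f, stops as soon as its reader goes away, like head exiting.

package main

import (
	"fmt"
	"io"
)

func main() {
	r, w := io.Pipe()

	go func() {
		buf := make([]byte, 64)
		r.Read(buf) // read once, like head taking its lines...
		r.Close()   // ...then stop reading, like head exiting
	}()

	for {
		if _, err := w.Write([]byte("another line\n")); err != nil {
			fmt.Println("writer stopped:", err) // io: read/write on closed pipe
			return
		}
	}
}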

How are you constructing your pipes?

type Executor struct {
	l             *lexer.Lexer
	curToken      token.Token
	curTokenIndex int
	tokenBuff     []token.Token

	stdout *outputState
	stderr *outputState
	stdin  *pipeState
}

type pipeState struct {
	stream *os.File      // The stream that buff will be written out to on completion
	buff   *bytes.Buffer // Buffer to store data between pipes and redirects
}

type outputState struct {
	*pipeState
	mode output.OutputMode // How buff will be written to stream
}

func (e *Executor) Execute() {
	// Loop whilst there are tokens, including token.EOI as this dictates when we execute
	for e.curToken.Type != token.EOI {

		// If an operator is encountered we execute whatever is currently in tokenBuff into pipeState
		switch {
		case e.curToken.IsPipe():
			e.executePipe()
		case e.curToken.IsRedirect():
			e.executeRedirect()
		default:
			e.tokenBuff = append(e.tokenBuff, e.curToken) // If just a regular token.ARG stack these for execution once pipes and redirects have been processed
		}

		e.nextToken()
	}

	// Once all operators have been processed and we break out of loops we execute
	e.executeOutput()
}

func (e *Executor) executePipe() {
	// Create command out of tokenBuff
	cmd, err := commands.NewCommand(e.tokenBuff)
	if err != nil {
		panic(err)
	}

	// Execute everything before the pipe; afterwards e.stdin.buff holds this command's stdout, to be used by the command after the pipe
	cmd.Exec(e.stdin.buff, e.stdout.buff, e.stderr.stream)
	e.tokenBuff = nil

	e.stdin.buff.Reset()
	e.stdin.buff.Write(e.stdout.buff.Bytes())

	e.stdout.buff.Reset() // Reset stdout buff for after pipe
}

func (e *Executor) executeOutput() {

	cmd, err := commands.NewCommand(e.tokenBuff)

	if err != nil {
		e.stderr.buff.Write([]byte(err.Error()))
		output.WriteOutput(e.stderr.buff, e.stderr.stream, e.stderr.mode)
		return
	}

	cmd.Exec(e.stdin.buff, e.stdout.stream, e.stderr.stream)

	if e.stdout.buff.Len() > 0 {
		output.WriteOutput(e.stdout.buff, e.stdout.stream, e.stdout.mode)
	}

	if e.stderr.buff.Len() > 0 {
		output.WriteOutput(e.stderr.buff, e.stderr.stream, e.stderr.mode)
	}
}

Sorry for the massive snippet. Basically, I read each token and store it in tokenBuff until I hit an operator, then create a new command and exec it, storing whatever the output is in the buffers.

Then I execute the last tokenBuff with whatever is stored in the stdin, stdout, and stderr states.

I should mention that I have tried printing from executeOutput and execution never reaches it.

Thanks

I think there are some conceptual principles to clarify. A typical pipeline A | B | C | D | E has all of A, B, C, D, E running concurrently as subprocesses. Each subprocess's stdout is typically passed as the next one's stdin unless redirected. The initial stdin typically comes from the shell, and the final stdout and stderr also go to the shell.

Imagine the following conceptual model of each subprocess A, B, etc.:

while (data = read(stdin)) {
    write(stdout, process(data));
}
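In Go, that loop might look roughly like the sketch below, where process is just a stand-in for whatever the subprocess actually does:

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// process is a placeholder for the subprocess's real work.
func process(line string) string { return strings.ToUpper(line) }

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() { // blocks until data arrives or stdin closes
		fmt.Println(process(scanner.Text()))
	}
}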

The underlying mechanism may be async etc, but the principle above is the same. What can we state about this:

  1. Processes suitable for pipelines read from the file stdin and write to the file stdout (and stderr).
  2. We can't assume stdin will ever run dry by default (you saw this with tail -f: it will "follow" the file forever, but presumably stops when its output closes).
  3. We can't assume stdout will ever run dry by default (suppose we piped data, maybe from /dev/random, to tail -f; there will always be more data available).
  4. You can probably assume the "composition" of such a pipeline does what the user wants, so it will likely terminate if that is the intention, but some pipelines, say ones using watch, will not terminate without user input.

Your current approach, if I understand correctly (I don't use Go), is that you parse along the command line and, when you hit a | pipe, say in tail -f | ..., you launch a process, here tail -f, inside executePipe. I can't find cmd.Exec or see its definition, but I notice you pass it buffers, and after launching it you write the previously buffered output into the next one's stdin buffer.
So I assume you either read as much as you can into the buffers, or you keep reading and resizing (growing) the buffers. Regardless, either you only get partial data, or you are stuck waiting forever, as we learned above about tail -f processes keeping their stdin and thus their stdout open.

Besides the problem of waiting forever, if you resize buffers you might run out of RAM (or disk space): what happens if you read from /dev/random? So it looks like you are trying to create a manual pipe via byte buffers. You need to run all subprocesses concurrently to keep feeding the data forward; otherwise head -n 5 will never exit and close its stdin, which is what causes the other subprocesses in that particular pipeline to close.

Also, the OS has a thing called a pipe. As I highlighted with "file" above, that OS primitive already has a file-like API and internal buffering: whatever you write into one end can be read from the other. So you get, for free, the classic synchronous/blocking read/write file-like behavior that processes in a (working) pipeline expect. The idea would be to have a pipe P with A.stdout == P, so that when A does write(stdout, data) the data goes into P, and to give the same P to process B so that when B does read(stdin) it is reading from P.
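In Go, for example, this primitive is exposed as os.Pipe. A minimal sketch of the write-one-end, read-the-other behavior (a toy demo, not your shell code):

package main

import (
	"fmt"
	"os"
)

func main() {
	r, w, err := os.Pipe()
	if err != nil {
		panic(err)
	}

	go func() {
		w.Write([]byte("hello through the pipe\n"))
		w.Close() // closing the write end gives the reader EOF
	}()

	buf := make([]byte, 64)
	n, _ := r.Read(buf) // blocks until data is available
	fmt.Print(string(buf[:n]))
}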

I don't know how Go does it, but it looks like after you create a command for, say, A (tail -f), you can take its cmd.StdoutPipe, which is apparently an io.ReadCloser. That makes sense, because it probably creates a pipe for tail -f to write to, and you get back the "reading" abstraction, which you can somehow pass as the stdin of the next process. I don't use Go, so I can't help you further here.

So rather than trying to handle the pipeline sequentially, split it up on |, prepare each command, set up the redirects/pipes between them, and then launch them all.
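Putting that together, here is a hedged Go sketch of the concurrent approach; runPipeline and the hard-coded stages are made up for illustration, not part of any library or of the original code:

package main

import (
	"os"
	"os/exec"
)

// runPipeline runs every stage concurrently, connected by OS pipes,
// so an exiting reader (head) can stop a looping writer (tail -f)
// the next time the writer writes.
func runPipeline(stages [][]string) error {
	cmds := make([]*exec.Cmd, len(stages))
	for i, argv := range stages {
		cmds[i] = exec.Command(argv[0], argv[1:]...)
		cmds[i].Stderr = os.Stderr
	}
	cmds[0].Stdin = os.Stdin
	cmds[len(cmds)-1].Stdout = os.Stdout

	// Connect stage i's stdout to stage i+1's stdin with a real OS pipe.
	var parentEnds []*os.File
	for i := 0; i < len(cmds)-1; i++ {
		r, w, err := os.Pipe()
		if err != nil {
			return err
		}
		cmds[i].Stdout = w
		cmds[i+1].Stdin = r
		parentEnds = append(parentEnds, r, w)
	}

	// Start (not Run) every stage so they all run at once.
	for _, c := range cmds {
		if err := c.Start(); err != nil {
			return err
		}
	}
	// Crucial: close the parent's copies of the pipe ends. Otherwise
	// tail -f never sees its reader disappear when head exits.
	for _, p := range parentEnds {
		p.Close()
	}
	// Now wait on all of them; exit errors are ignored for brevity.
	for _, c := range cmds {
		c.Wait()
	}
	return nil
}

func main() {
	runPipeline([][]string{
		{"tail", "-f", "/tmp/fox/file-41"},
		{"head", "-n", "5"},
	})
}

The two details that matter are using Start instead of Run, so nothing blocks while later stages are being launched, and closing the parent's copies of the pipe ends, so that when head exits, tail -f actually gets a broken pipe on its next write.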


Right, I see. You are correct: I have modeled this so that the right side of the pipe waits for the left to finish and return, instead of the two being connected and running concurrently.

Regarding the buffers, these do resize automatically I believe, but I am going to change them to the built-in *os.File types; this is what the Go standard library uses for /dev/stdout etc., and it gives all of the benefits you mentioned. To be honest, I don't know why I wasn't using those to begin with.

Thank you for your help it’s much appreciated!
