I think there are some conceptual principles to clarify. A typical pipeline A | B | C | D | E runs all of A, B, C, D, E as subprocesses concurrently. Each subprocess’s stdout is typically passed as the next one’s stdin unless redirected. The initial stdin typically comes from the shell, and the final stdout and stderr also go back to the shell.
Imagine the following conceptual model of each subprocess A, B, etc.:
while (data = read(stdin)) {
    write(stdout, process(data));
}
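
In Go terms (I don’t write Go, so treat this as a sketch of the idea, not a reference), a line-oriented stage would be roughly:

package main

import (
    "bufio"
    "fmt"
    "os"
)

// process is a hypothetical stand-in for whatever transformation this stage does.
func process(line string) string {
    return line
}

func main() {
    in := bufio.NewScanner(os.Stdin)
    for in.Scan() { // blocks until data arrives; the loop only ends when stdin is closed (EOF)
        fmt.Println(process(in.Text())) // the result goes to stdout, i.e. to the next stage
    }
}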
The underlying mechanism may be asynchronous under the hood, but the principle above is the same. What can we state about this:
- Processes suitable for pipelines read from the file stdin and write to the file stdout (and stderr).
- We can’t assume stdin will ever run dry by default (you saw this with tail -f; it will “follow” the file forever, though presumably it stops once its output is closed).
- We can’t assume stdout will ever run dry by default either (suppose we piped data, say from /dev/random into tail -f: there will always be more data available).
- You can probably assume the “composition” of such a pipeline does what the user wants, so it will likely terminate if that is the intention, but some, say a use of watch, will not terminate without user input.
Your current approach, if I understand correctly (I don’t use Go), is that you parse along the command and when you hit a | pipe, say in tail -f | ...., you launch a process, here tail -f, inside executePipe. I can’t find cmd.Exec or see its definition. But I notice you pass it buffers, and after launching it you write the previously buffered output into the new one’s stdin buffer.
So I assume you either read as much as you can into the buffers, or you keep reading and resizing (increasing) the buffers. Either way, you either get only partial data, or you are stuck waiting forever, as we learned above with tail -f processes keeping their stdin, and thus their stdout, open.
Besides the problem of waiting forever, if you keep resizing buffers you might run out of RAM (or disk space): what happens if you read from /dev/random? It looks like you are trying to build a manual pipe out of byte buffers. To keep feeding the data forward you need to run all subprocesses concurrently; otherwise head -n 5 never runs, never finishes, and never closes its end of the pipe, and that closing is exactly what causes the other subprocesses in that particular pipeline to terminate.
Also, the OS has a thing called a pipe. As I highlighted with file above, that OS primitive already has a file-like API and internal buffering: whatever you write into one end can be read from the other. So you get, for free, the classic “synchronous/blocking” read/write file behavior that processes in a (working) pipeline expect. The idea would be to have a pipe P with A.stdout == P, such that when A does write(stdout, data) the data goes into P, and to give the same P to process B as its stdin, such that when B does read(stdin) it is reading from P.
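
I don’t use Go, but from the documentation os.Pipe looks like the direct wrapper around this primitive. A minimal sketch of the behavior (nothing here is specific to your code):

package main

import (
    "fmt"
    "io"
    "os"
)

func main() {
    r, w, err := os.Pipe() // r and w are plain *os.File values: the two ends of one OS pipe
    if err != nil {
        panic(err)
    }

    go func() {
        fmt.Fprintln(w, "hello from the write end") // this is A doing write(stdout, data)
        w.Close()                                   // closing the write end is what produces EOF on the read end
    }()

    data, _ := io.ReadAll(r) // this is B doing read(stdin); it blocks until data or EOF arrives
    fmt.Print(string(data))
}

The important part is the blocking: if the goroutine never closed w, the ReadAll would simply sit there forever, which is the same “waiting forever” situation as with tail -f above.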
I don’t know how Go does it, but it looks like after you create a command for, say, A (tail -f) you can take its cmd.StdoutPipe, which is apparently an io.ReadCloser. That makes sense, because it presumably creates exactly such a pipe for tail -f to write into and hands you back the “reading” end, which you can then pass as the stdin of the next process. I don’t use Go, so take the following with a grain of salt.
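
Judging from the os/exec documentation, I would expect the wiring for a two-stage pipeline such as tail -f log.txt | head -n 5 to look roughly like this (untested; log.txt is just a made-up file name):

package main

import (
    "log"
    "os"
    "os/exec"
)

func main() {
    first := exec.Command("tail", "-f", "log.txt")
    second := exec.Command("head", "-n", "5")

    pipe, err := first.StdoutPipe() // read end of an OS pipe; first's stdout becomes the write end
    if err != nil {
        log.Fatal(err)
    }
    second.Stdin = pipe       // second reads straight from that pipe
    second.Stdout = os.Stdout // the last stage writes to the shell as usual
    second.Stderr = os.Stderr

    // Start both before waiting on either: they have to run concurrently.
    if err := first.Start(); err != nil {
        log.Fatal(err)
    }
    if err := second.Start(); err != nil {
        log.Fatal(err)
    }

    second.Wait() // head is done after 5 lines
    pipe.Close()  // drop our own copy of the read end, otherwise tail never sees a broken pipe
    first.Wait()  // returns once tail next writes and is killed by the broken pipe (same as in a real shell)
}

Note the explicit pipe.Close(): as long as your shell process keeps its own copy of the read end open, tail never notices that head has gone away, which is the “waiting forever” trap again in a different disguise.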
So rather than trying to handle the pipeline sequentially, split it up on |, prepare each command, set up the redirects/pipes between them, and then launch them all before waiting on any of them.
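
Generalised to N stages, it would be something along these lines (again only a sketch; runPipeline is a made-up helper, and I am using os.Pipe directly instead of StdoutPipe so it is obvious where each pipe lives):

package main

import (
    "log"
    "os"
    "os/exec"
)

// runPipeline wires the prepared commands together with OS pipes and runs them
// concurrently, roughly like the shell does for a | b | c.
func runPipeline(cmds []*exec.Cmd) error {
    if len(cmds) == 0 {
        return nil
    }
    for i := 0; i < len(cmds)-1; i++ {
        r, w, err := os.Pipe()
        if err != nil {
            return err
        }
        cmds[i].Stdout = w  // this stage writes into the pipe...
        cmds[i+1].Stdin = r // ...and the next stage reads from it
    }
    cmds[0].Stdin = os.Stdin // the first stage reads from the shell
    last := cmds[len(cmds)-1]
    last.Stdout = os.Stdout // the last stage writes back to the shell
    last.Stderr = os.Stderr

    for _, c := range cmds {
        if err := c.Start(); err != nil {
            return err
        }
    }

    // The children now hold their own copies of the pipe ends; close ours,
    // otherwise EOF and broken-pipe signals never propagate and stages hang forever.
    for _, c := range cmds[:len(cmds)-1] {
        c.Stdout.(*os.File).Close()
    }
    for _, c := range cmds[1:] {
        c.Stdin.(*os.File).Close()
    }

    var lastErr error
    for i, c := range cmds {
        err := c.Wait()
        if i == len(cmds)-1 {
            lastErr = err // like a shell, report the exit status of the last stage
        }
    }
    return lastErr
}

func main() {
    // e.g. the result of splitting "tail -f log.txt | head -n 5" on '|'
    err := runPipeline([]*exec.Cmd{
        exec.Command("tail", "-f", "log.txt"),
        exec.Command("head", "-n", "5"),
    })
    if err != nil {
        log.Fatal(err)
    }
}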