2023-08-23
I wanted to make a program I had written log everything to a file. I simply called it as follows from bash:
./program >> output.log 2>&1
The >> redirect operator appends to output.log, while the 2>&1 ensures we append both stdout and stderr¹.
This seemingly worked well: opening the log after the program finished revealed every print statement and error that would otherwise have appeared on the command line. However, opening the log while the program was still running revealed… an empty file. How could that be?
As soon as I closed the program, all the output appeared at once. Other than redirecting to a file, I had changed nothing, so why was this happening?
Well, it turns out that where the output is going actually changes the type of buffering used, at least in glibc. When writing to a file, newlines no longer automatically flush the output as they do when printing to a TTY²; instead, the stream is fully buffered, and nothing reaches the disk until the buffer fills or the stream is closed. That explains why everything appeared at once when the program exited.
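The difference is easy to see from Python itself. Here is a minimal sketch (the messages and the five-iteration loop are just illustrative): run it once directly, then once redirected as above while watching the log with tail -f.

import sys
import time

# How stdout is connected determines the buffering mode.
print(f"stdout is a TTY: {sys.stdout.isatty()}")

for i in range(5):
    # On a TTY each line appears immediately; redirected to a file,
    # the lines sit in the buffer until it fills or the program exits.
    print(f"message {i}")
    time.sleep(1)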
For logging, I really want messages to be saved to disk as soon as possible in case something goes wrong. I chose to modify my program to force flush after each print (easy enough to do in Python):
print("This is some output", flush=True)
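Rather than adding flush=True to every call, another option (assuming Python 3.7+, where reconfigure() was added) is to switch stdout to line buffering once at startup:

import sys

# Assumption: Python 3.7+, which provides TextIOWrapper.reconfigure().
# Line-buffer stdout even when redirected to a file, so each newline flushes.
sys.stdout.reconfigure(line_buffering=True)

print("This is some output")  # flushed at the newline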
If you don’t control the program itself, there’s a big Stack Exchange discussion about the different options. stdbuf, part of the GNU coreutils, can theoretically be used to control the type of buffering. However, it seems it does not play well with Python.
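For reference, an attempt with stdbuf would look something like this, where -oL requests line-buffered stdout (a sketch of the general pattern, not something that worked for my Python program):

stdbuf -oL ./program >> output.log 2>&1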
unbuffer will control buffering quite reliably, but you will likely need to install it (it is part of expect):
unbuffer ./program >> output.log 2>&1
I thought I was being very clever and saving work by tacking a little >> onto my command, but there was a little catch I didn’t expect. Unfortunately, this is a pretty common experience with software: our assumptions about how something works aren’t quite aligned with reality.