Recommand · November 3, 2021 0

Why does an await cause my stream to be read?

I have a stream created from a file and I intend to use it later on. However, I noticed that if I await any async code to run, even if it’s completely unrelated to my stream, it causes the stream to be read. Why does this happen?

The code is on the server-side.

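const fs = require('fs');

// Runs inside an async function; filePath is the path to an existing file.
const data = fs.createReadStream(filePath);
console.log(data.bytesRead); // output: 0
await new Promise(r => setTimeout(r, 2000)); // unrelated async wait
console.log(data.bytesRead); // output: 65536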

This explanation applies to Node v14 and earlier. The behavior changed in Node v15+: both data.bytesRead values show 0 when I test it in v16.

After fs.createReadStream() issues the asynchronous open for the file, it reads the first chunk of the file so that it’s available as soon as the stream starts flowing. This is internal pre-buffering, and it’s likely done for performance: if you assume that the reader is going to read the whole stream (or most of it), then you may as well fetch the first buffer of content as soon as the file open succeeds, even if the stream isn’t flowing yet and the caller hasn’t yet manually called read() to pull some bytes. That first chunk is 65536 bytes (64 KiB), the default highWaterMark for fs read streams, which matches the value printed in the question.

If you want to follow the code in fs.createReadStream(), you can step through it in the debugger and watch exactly what it does. The structure of the ReadStream constructor has changed quite a bit between v14 and the latest version, so comparing the GitHub source against what you see in your debugger gets confusing if the two are not the same version (that happened to me). Although the v14 implementation is organized differently from earlier versions, the idea there is the same (prefetching the first chunk of content).

First, it calls the ReadStream constructor which is here. Then, as part of that, it opens the file and after the file open succeeds, it reads the first chunk of the file into its current buffer.

This GitHub issue appears to be responsible for changing a ReadStream so that it no longer auto-starts reading, and the corresponding code changes look like they went into v15.

In the v16 ReadStream code here, you can now see where the file is opened (in the new _construct method) and after opening the file, there is no longer any code to pre-fetch the first chunk of the file.
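If your code depends on nothing being read until you actually use the stream, one option that doesn’t rely on the Node version is to defer creating the stream until the point of use. The following is a minimal sketch of that idea, not something taken from the Node source; lazyReadStream and getStream are hypothetical names.

```javascript
const fs = require('fs');

// Hypothetical helper: returns a function that creates the stream on its
// first call, so the file is not opened or read before the caller asks for it.
function lazyReadStream(filePath, options) {
  let stream = null;
  return function getStream() {
    if (stream === null) {
      stream = fs.createReadStream(filePath, options);
    }
    return stream; // later calls return the same stream
  };
}

// Usage sketch:
//   const getData = lazyReadStream(filePath);
//   await somethingUnrelated();          // no stream exists yet
//   getData().pipe(process.stdout);      // created and consumed here
```

On v14 and earlier the pre-fetch still happens once the stream is created, but by then you are about to consume it anyway.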