Just a few questions re sequential I/O and the OPLOG versus the Extent Store. If the I/O is deemed sequential in nature, will this always bypass the OPLOG or only when the write operation is larger than 1MB? Does bypassing the oplog mean that the write will be a lot slower in comparison? It still hits the SSD so my assumption is that it’s going to be the same. Why does the process of coalescing the writes before sequentially draining them help with performance? I’m interested in why this step is necessary as opposed to just writing directly to the SSD and then replicating out.
Best answer by Alona
I am more than happy to attempt to explain this. Feel free to ask more questions.
You read the data after you write it. Just like me writing this right now. Imagine that you’re working with the alphabet. You can write a b c d e … z or you can write a y h i … c b l … k.
In both instances, the task is the same – to read the letters in alphabetical order. Which scenario is going to take you longer? (Please disregard the fact that you can reproduce the order from memory without reading it)
It gets a little more complicated in real life, where there are multiple layers of data organisation, but in a nutshell this is why sequential I/O is better than random.
The write is received and evaluated. If it’s sequential, it is written straight to the extent store. If it is random, it hangs out in the oplog until it either becomes part of a sequence (and is drained) or is overwritten.
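If it helps to see that routing as code, here is a toy sketch in Python. It is only an illustration of the idea above, not how the product is actually implemented: the class name `ToyWritePath`, the "a write is sequential if it starts where the previous one ended" rule, and the dictionaries standing in for the oplog and extent store are all assumptions made for the example.

```python
class ToyWritePath:
    """Toy model of the routing above (hypothetical, not the real implementation)."""

    def __init__(self):
        self.oplog = {}           # offset -> data, buffered random writes
        self.extent_store = {}    # offset -> data, persistent store
        self.next_expected = None # where a sequential write would start

    def write(self, offset, data):
        # Assumption: "sequential" means the write starts exactly where
        # the previous write ended. Real detection is more involved.
        sequential = self.next_expected is not None and offset == self.next_expected
        self.next_expected = offset + len(data)

        if sequential:
            # Sequential data skips the buffer; drop any stale buffered copy.
            self.oplog.pop(offset, None)
            self.extent_store[offset] = data
            return "extent_store"

        # Random data waits in the oplog; a later write to the same
        # offset simply overwrites the buffered copy.
        self.oplog[offset] = data
        return "oplog"
```

In this sketch the very first write always lands in the oplog, because there is no previous write to compare it with.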
Draining the oplog sequentially means writing the pieces of data to the extent store not in the order they arrived in the oplog but in their logical order. Instead of a y h i … c b l … k, the extent store receives a b c d e … z. That way, when a read request comes in for a letter, a number of them or a whole sequence, they are easy to locate on the extent store. Think of it as looking for a file or a folder on your computer. You might sort by date or by name, but you sort so that you can find what you’re looking for.
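The same alphabet example can be sketched in a few lines of Python. Again, this is a simplified illustration under my own assumptions: `drain` is a made-up helper, the buffer is a plain dictionary keyed by offset, and overlapping writes are not handled.

```python
def drain(oplog):
    """Return buffered writes in offset order, merging adjacent ones into runs.

    `oplog` maps offset -> bytes (a hypothetical stand-in for the real buffer).
    Each returned item is (start_offset, data) for one contiguous run.
    """
    runs = []
    for offset in sorted(oplog):
        data = oplog[offset]
        if runs and runs[-1][0] + len(runs[-1][1]) == offset:
            # This piece starts where the previous run ends: coalesce them.
            start, buf = runs[-1]
            runs[-1] = (start, buf + data)
        else:
            runs.append((offset, data))
    return runs


# Letters arrive out of order: a y h i c b
arrivals = {0: b"a", 24: b"y", 7: b"h", 8: b"i", 2: b"c", 1: b"b"}
print(drain(arrivals))  # [(0, b'abc'), (7, b'hi'), (24, b'y')]
```

Six small random writes become three larger, ordered ones, which is the whole point of coalescing before draining.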
Data that has been touched recently is likely to be touched again soon. That’s why buffers are everywhere: RAM, the recent files list in any text editor or file browser you use, and even NICs have a cache of sorts to handle bursts of I/O.