In this short blog post, we will see how to use linux-fincore to check which files are in the Linux page cache. For an introduction to the Linux page cache, check here and here.
In summary, whenever you read from or write to a file (unless you use direct I/O, i.e. the O_DIRECT flag, to bypass it), the data is cached in memory, so that subsequent requests can be served from there instead of from the orders-of-magnitude slower disk subsystem (the cache is also used to buffer writes before they are flushed to disk). This happens as long as there is memory that is not being used by any process; whenever free memory runs short, the kernel will evict pages from the page cache first.
This process is transparent to us as userland dwellers and is generally something that we shouldn’t mind. However, what if we wanted to have more information on it? Is it possible? How can we do it? Let’s find out!
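As a coarse first look, the kernel already reports the overall size of the page cache in /proc/meminfo, although it says nothing about which files are in it. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name page_cache_summary is only illustrative. Note that the Cached figure also counts tmpfs and shared memory, so it is an approximation.

```shell
#!/usr/bin/env bash
# Print the size of the page cache ("Cached") and the amount of data waiting
# to be written back to disk ("Dirty"), as reported by the kernel in kB.
page_cache_summary() {
    awk '$1 == "Cached:" || $1 == "Dirty:" { print $1, $2, $3 }' /proc/meminfo
}

page_cache_summary
```

This gives the totals only; a per-file breakdown is what the rest of the post is about.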
Installing it
Unless it’s for CentOS 6, there seem to be no packages available, so we need to download the source and compile. The steps for this are simple enough:
    git clone https://github.com/yazgoo/linux-ftools.git
    cd linux-ftools/
    ./configure
    make
    sudo make install
After this, we will have the binaries ready to be used in /usr/local/bin/.
    SHELL> linux-fincore --help
    fincore version 1.3.0
    fincore [options] files...

      -s --summarize          When comparing multiple files, print a summary report
      -p --pages              Print pages that are cached
      -o --only-cached        Only print stats for files that are actually in cache.
      -g --graph              Print a visual graph of each file's cached page distribution.
      -S --min-size           Require that each files size be larger than N bytes.
      -C --min-cached-size    Require that each files cached size be larger than N bytes.
      -P --min-perc-cached    Require percentage of a file that must be cached.
      -h --help               Print this message.
      -L --vertical           Print the output of this script vertically.
Using it
As seen in the previous output, we need to pass either a file or a list of files for it to work. This is kind of strange at first glance, and raises the question: “What if I don’t provide some files that are indeed in the page cache?” The answer is simple: they won’t be listed, even if they are in the cache! Let’s see it in action. First, let’s write two files, and check whether they are in the cache (and how much space they are taking up).
    SHELL> echo aoeu > test_file_1
    SHELL> echo htns > test_file_2
    SHELL> linux-fincore -L test_file_1 test_file_2
    [...trimmed for brevity...]
    test_file_1
    size: 5
    total_pages: 1
    min_cached_page: 0
    cached: 1
    cached_size: 4,096
    cached_perc: 100.00
    test_file_2
    size: 5
    total_pages: 1
    min_cached_page: 0
    cached: 1
    cached_size: 4,096
    cached_perc: 100.00
    ---
    total cached size: 8,192
The -L option shows us the output in vertical format, instead of the default columnar style. Now, if we leave out the third argument (which is the second file name):
    SHELL> linux-fincore -L test_file_1
    [...trimmed for brevity...]
    test_file_1
    size: 5
    total_pages: 1
    min_cached_page: 0
    cached: 1
    cached_size: 4,096
    cached_perc: 100.00
    ---
    total cached size: 4,096
We only see test_file_1, even though we know test_file_2 is also cached. This is something to keep in mind every time we use the tool.
A more interesting example is checking which files of a running MySQL server are cached. We can use a command like the following:
    SHELL> linux-fincore --only-cached $(find /var/lib/mysql/ -type f)
    filename                          size  total_pages  min_cached page  cached_pages  cached_size  cached_perc
    --------                          ----  -----------  ---------------  ------------  -----------  -----------
    ...
    /var/lib/mysql/ibdata1      12,582,912        3,072                0         3,072   12,582,912       100.00
    /var/lib/mysql/ib_logfile1  50,331,648       12,288                0        12,288   50,331,648       100.00
    /var/lib/mysql/ib_logfile0  50,331,648       12,288                0        12,288   50,331,648       100.00
    ...
    ---
    total cached size: 115,634,176
The --only-cached flag makes the output less verbose by showing only the files that are in the cache.
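The numbers in these outputs follow from the cache working in whole pages: a 5-byte file still occupies one full page, which is why its cached_size is 4,096. A small sketch of that arithmetic, assuming the 4,096-byte page size implied by the outputs above (pages_for is an illustrative helper, not part of linux-ftools):

```shell
#!/usr/bin/env bash
# Number of pages needed to hold a file of the given size in bytes
# (ceiling division). The page size defaults to the system's own.
pages_for() {
    local size=$1
    local page_size=${2:-$(getconf PAGESIZE)}
    echo $(( (size + page_size - 1) / page_size ))
}

pages_for 5 4096          # the 5-byte test files: prints 1
pages_for 12582912 4096   # ibdata1 from the MySQL example: prints 3072
```

Both results match the total_pages column reported by linux-fincore for those files.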
Caveats and Limitations
The number one caveat, as mentioned before, is to provide an accurate list of files to check, or else the results will obviously be wrong (if we are trying to measure the whole cache usage).
One limitation is that there is an upper bound on the number of files we can check (at least with one command), given the argument list is not limitless. For instance, one natural way of wanting to check for the whole cache is to use a command like the following:
    SHELL> linux-fincore --only-cached $(find / -type f)
    -bash: /usr/local/bin/linux-fincore: Argument list too long
The command fails, as we can see, because the expanded file list exceeds the maximum size of the argument list that the kernel accepts when starting a new process (ARG_MAX); bash is only reporting the error.
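One way around this is to let xargs split the file list into as many invocations as needed. The sketch below is a suggestion rather than something from the linux-ftools documentation: check_tree and the FINCORE variable are hypothetical names (FINCORE only exists so the pipeline can be tried without the tool installed), and -xdev is added so that find stays on one filesystem instead of wandering into /proc or /sys. The -r flag assumes GNU xargs.

```shell
#!/usr/bin/env bash
# Show the limit, in bytes, on the combined size of arguments and environment.
getconf ARG_MAX

# Check every regular file under a directory (default: /), letting xargs
# batch the file names so that no single invocation exceeds the limit.
# -print0 / -0 keep file names with spaces or newlines intact.
check_tree() {
    local dir=${1:-/}
    find "$dir" -xdev -type f -print0 |
        xargs -0 -r "${FINCORE:-linux-fincore}" --only-cached
}
```

Calling check_tree /var/lib/mysql would then run linux-fincore once per batch. Keep in mind that each batch prints its own header and its own "total cached size" line, so the totals have to be added up by hand.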
Cleaning the Page Cache
This topic is a bit out of scope for the blog post, but I thought that I could at least mention it. There are three ways of manually purging the page cache if needed:
1- directly writing to the /proc/sys/vm/drop_caches file
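    # Run as root: flush dirty pages, drop the page cache, then compact memory.
    # The kernel documentation specifies writing 1 to compact_memory.
    sync && \
    echo 1 > /proc/sys/vm/drop_caches && \
    echo 1 > /proc/sys/vm/compact_memory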
2- using the sysctl configuration tool
    sync && \
    sysctl vm.drop_caches=1
More information about this here (search for the drop_caches section).
3- (Updated: Thanks LeFred for input on this!) Use the uncache tool from DBSake to selectively remove files from the cache, without the need to evict the whole of it.
    dbsake uncache /path/to/file_to_be_evicted
What can I use it for?
The tool can be used for checking which files are cached, and how much memory they are taking. For instance, if you see a spike in memory usage after certain operations, you could:
- capture initial output with linux-fincore
- flush the page cache (as shown above)
- run the problematic operation
- capture a second sample with linux-fincore
Then you could compare the outputs to see which files were used by the operation, and how much was needed for them.
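To make the two samples easy to compare, the columnar output can be reduced to one "file, cached bytes" pair per line. This is a rough sketch that assumes the column layout shown in the MySQL example earlier (cached_size as the sixth column), absolute paths, and file names without whitespace; cached_sizes is an illustrative name.

```shell
#!/usr/bin/env bash
# Read linux-fincore's default (columnar) output on stdin and print
# "filename<TAB>cached_size_in_bytes", sorted by file name.
# Header, separator and total lines are skipped because they do not
# start with "/".
cached_sizes() {
    awk '/^\// { gsub(",", "", $6); print $1 "\t" $6 }' | sort
}
```

Piping the first capture through cached_sizes into, say, before.tsv and the second into after.tsv lets a plain diff of the two files show which files entered the cache during the operation and how many bytes each one takes.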
Further reading and similar tools
There is an extended blog post on this matter, which has more tools and how to use them.
uncache – https://dbsake.
vmtouch – https://hoytech.com/vmtouch/
mincore – http://man7.org/linux/man-pages/man2/mincore.2.html
Hi Agusin !
Nice post, I just wanted to point you another tool I really like to work with: https://dbsake.readthedocs.io/en/latest/
dbsake allows you to “uncache” a specific file too without having to drop the full filesystem cache that can be dangerous. This is very useful to free filesystem cache after reading many binlogs for example.
Cheers.
Hey lefred,
Awesome! I’ll have the blog updated to include this in the last section, and a note in the ‘cleaning cache section’, too. I see it also has a fincore-like tool (using mincore to implement it), so it kills two birds 🙂