fswatch 1.5.0: Monitors

5 Monitors

fswatch is a file system monitoring utility that achieves portability across multiple platforms by decoupling the front end (fswatch itself) from the back-end logic. The back-end logic is encapsulated in multiple system-specific monitors, each interacting with a different monitoring API. Since each operating system may ship a different set of APIs¹⁶, each operating system supports the corresponding set of monitors.

The list of available monitors is decided at build time by the configure script. Monitors cannot currently be plugged in without recompiling the libfswatch library (shipped with fswatch). The list of available monitors is printed in the help message:

$ fswatch --help
[...]
Available monitors in this platform:

  fsevents_monitor
  kqueue_monitor
  poll_monitor
[...]
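As a minimal sketch, the monitor names can be pulled out of the help text with a small filter. The function name list_monitors is hypothetical, and the filter assumes the help text keeps the layout shown above: a line starting with "Available monitors" followed by names ending in _monitor.

```shell
# Hypothetical helper: print the monitor names found in fswatch help text
# read from standard input. Assumes the layout shown in the transcript above.
list_monitors() {
  awk '
    /^Available monitors/ { in_list = 1; next }   # start of the monitor list
    in_list && $1 ~ /_monitor$/ { print $1 }      # one monitor name per line
  '
}

# Usage (requires fswatch to be installed):
#   fswatch --help | list_monitors
```

The filter reads standard input rather than calling fswatch itself, so it can also be tried on saved help output.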

5.1 Available Monitors

Currently, the available monitors are:

The FSEvents monitor, a monitor based on the File System Events API of Apple OS X (see The FSEvents Monitor).
The kqueue monitor, a monitor based on kqueue, an event notification interface introduced in FreeBSD 4.1 and supported on most *BSD systems (including OS X) (see The kqueue Monitor).
The inotify monitor, a monitor based on inotify, a Linux kernel subsystem that reports file system changes to applications (see The inotify Monitor).
The poll monitor, a monitor that periodically stats the file system, saves file modification times in memory and manually calculates file system changes, which can work on any operating system where stat can be used (see The Poll Monitor).

Each monitor has its own strengths, weaknesses and peculiarities. Although fswatch strives to provide a uniform experience no matter which monitor is used, it is still important for users to know which monitor they are using and to be aware of existing bugs, limitations, corner cases or pathological behaviour.

5.2 The FSEvents Monitor

The FSEvents monitor, available only on Apple OS X, has no known limitations and scales very well with the number of files being observed. In fact, I observed no performance degradation when testing fswatch observing changes on a filesystem of 500 GB over long periods of time. On OS X, this is the default monitor.

5.2.1 Peculiarities

The --recursive (-r) option has no effect when used with the FSEvents monitor since the FSEvents API already monitors a directory’s children by default. This behaviour causes no overhead or resource-consumption issues, but users processing the output must be aware that, for each directory, multiple events may be generated by its children.

5.3 The kqueue Monitor

The kqueue monitor, available on any *BSD system featuring the kevent function, is similar in principle to the monitors based on other APIs (such as FSEvents and inotify) but has important drawbacks and limitations.

5.3.1 Peculiarities

The kqueue monitor requires a file descriptor to be opened for every file being watched. As a result, this monitor scales badly with the number of files being observed and may begin to misbehave as soon as the fswatch process runs out of file descriptors. In this case, fswatch prints one error message to standard error for every file that cannot be opened so that users are notified and can take action, including terminating the fswatch session. Beware that on some systems the maximum number of file descriptors that can be opened by a process is set to a very low value (values as low as 256 are not uncommon), even if the operating system may allow a much larger value.

If you are running out of file descriptors when using this monitor and you cannot reduce the number of observed items, either:

Consider raising the number of maximum open file descriptors (check your OS’ documentation).
Consider using another monitor.
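Before choosing the kqueue monitor, it may help to compare the number of items to watch with the file descriptor limit of the current shell. The sketch below is only an illustration: the function name check_fd_budget is hypothetical, and it assumes a shell whose ulimit builtin accepts -n (as bash, dash, ksh and zsh do).

```shell
# Hypothetical helper: compare the number of items under a directory with the
# soft limit on open file descriptors of the current shell.
# Note: file names containing newlines are counted more than once.
check_fd_budget() {
  dir=$1
  limit=$(ulimit -n)
  # Count the directory itself plus everything below it.
  items=$(find "$dir" | wc -l | tr -d ' ')
  if [ "$limit" != "unlimited" ] && [ "$items" -ge "$limit" ]; then
    echo "warning: $items items, limit $limit"
    return 1
  fi
  echo "ok: $items items, limit $limit"
}

# Usage:
#   check_fd_budget /path/to/watch
```

A warning means that a monitor needing one descriptor per watched item would likely run out of descriptors unless the limit is raised. An "ok" result is not a guarantee, because the process also needs descriptors for its own use.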

5.4 The inotify Monitor

The inotify monitor is backed by the inotify API and the inotify_* set of functions, available on Linux since kernel 2.6.13. Like the FSEvents API, inotify is very efficient, suffers from no known resource-exhaustion problems and scales very well with the number of files being watched. This monitor is the default monitor on systems running inotify-enabled Linux kernels.

5.4.1 Peculiarities

5.4.1.1 Queue Overflow

The inotify monitor may suffer a queue overflow if events are generated faster than they are read from the queue. In any case, the application is guaranteed to receive an overflow notification, which can be handled to recover gracefully. Currently, the fswatch process is terminated after the notification is sent by throwing an exception. Future versions will handle the overflow by emitting a notification in the form of a specially crafted change event. However, the odds of observing a queue overflow on a mainstream GNU/Linux distribution with a default configuration are very low.

5.4.1.2 Duplicate Events

The inotify API sends events for the direct child elements of a watched directory and it scales pretty well with the number of watched items. For this reason, depending on the number of files to watch, it may sometimes be preferable to non-recursively watch a common parent directory and filter received events rather than adding a huge number of file watches. If recursive watches are used, then duplicate change events will be received:

One generated by the parent directory of the file that has changed.
One generated by the file that has changed.

5.5 The Poll Monitor

The poll monitor was added as a fallback mechanism for cases where no other monitor can be used, including:

Operating systems without any sort of file events API.
Situations where the limitations of the available monitors cannot be overcome¹⁷.

The poll monitor, available on any platform, only relies on available CPU and memory to perform its task.

5.5.1 Peculiarities

5.5.1.1 Performance Problems

The resource consumption of this monitor increases linearly with the number of files being watched (the resulting system performance will probably degrade linearly or quicker).

The authors’ experience indicates that fswatch requires approximately 150 MB of RAM to observe a hierarchy of 500,000 files with a minimum path length of 32 characters. A common bottleneck of the poll monitor is disk access, since stat()-ing a great number of files may take a huge amount of time. In this case, the latency (see Latency) should be set to a sufficiently large value in order to reduce the performance degradation that may result from frequent disk access; the poll monitor, in fact, re-scans the whole monitored object hierarchy looking for differences every time its ‘monitoring loop’ is repeated.

Note: Using a disk drive with lower latencies may help, although the authors suspect that switching to an operating system with proper file monitoring APIs is a better solution when performance problems with the poll monitor are experienced or when fswatch should drive mission-critical processes.
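For a rough idea of the memory involved, the figure quoted in this section (approximately 150 MB for 500,000 files) can be extrapolated linearly, as the section states that resource consumption grows linearly with the number of files. The function name estimate_poll_memory_mb is hypothetical and the result is only a back-of-the-envelope estimate; actual usage depends on path lengths and on the platform.

```shell
# Hypothetical helper: linear extrapolation of the figure quoted above
# (about 150 MB of RAM for 500,000 watched files).
# Integer arithmetic: the result is truncated to whole megabytes.
estimate_poll_memory_mb() {
  files=$1
  echo $(( files * 150 / 500000 ))
}

# Usage:
#   estimate_poll_memory_mb 1000000
```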

5.5.1.2 Missing Events and Missing Event Flags

Since this monitor periodically checks the state of monitored objects looking for differences, it may miss events that happened between one scan and the next. Suppose, for example, that a file named file exists at time t_0 when a scan occurs. The poll monitor detects file and saves the relevant attributes in memory. file is then updated, moved to another place and recreated with the same name. The chain of events¹⁸ that occurred to file is:

Updated
MovedFrom (or Deleted)
Created
Link

At time t_1, another scan runs and the poll monitor detects that the modification date has changed. The poll monitor can only infer that a ‘change’ has occurred and raises an Updated event; other events that would be noticed and raised by other APIs are effectively lost since they go unnoticed.

The odds of incurring such a loss increase with the latency l: reducing the latency helps alleviate this problem, although, since scans then run more often, it also increases resource usage.
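The scenario above can be reproduced with a simplified stand-in for a polling scan. This is a sketch under stated assumptions, not how fswatch implements the poll monitor: instead of keeping per-file modification times in memory, it compares each file with a marker file whose modification time stands for the previous scan. The names scan_updated and demo_missed_events are hypothetical, and timestamps are backdated with touch -t so that the result does not depend on timestamp granularity.

```shell
# One simplified polling scan: print regular files under "$1" whose
# modification time is newer than that of the marker file "$2".
# "$2" must be an absolute path.
scan_updated() {
  ( cd "$1" && find . -type f -newer "$2" )
}

demo_missed_events() {
  watched=$(mktemp -d)
  outside=$(mktemp -d)
  marker="$outside/last_scan"

  echo v1 > "$watched/file"
  touch -t 199901010000 "$watched/file"  # file predates the scan at t_0
  touch -t 200001010000 "$marker"        # t_0: the previous scan

  echo v2 >> "$watched/file"             # file is updated
  mv "$watched/file" "$outside/file"     # file is moved to another place
  echo v3 > "$watched/file"              # file is recreated with the same name

  # t_1: the next scan reports a single changed file.
  scan_updated "$watched" "$marker"

  rm -rf "$watched" "$outside"
}

# Usage:
#   demo_missed_events
```

Three distinct operations happened between the two scans, yet the scan only reports that file changed, which is the loss of information described above.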

5.6 How to Choose a Monitor

fswatch already chooses the ‘best’ monitor for your platform if you do not specify one. However, a specific monitor may be better suited to specific use cases. See Monitors for a description of all the available monitors and their limitations.

Usage recommendations are as follows:

On OS X, use only the FSEvents monitor (which is the default behaviour).
On Linux, use the inotify monitor (which is the default behaviour).
If the number of files to observe is sufficiently small, use the kqueue monitor. Beware that on some systems the maximum number of file descriptors that can be opened by a process is set to a very low value (values as low as 256 are not uncommon), even if the operating system may allow a much larger value. In this case, check your OS documentation to raise this limit on either a per process or a system-wide basis.
If feasible, watch directories instead of watching files. Properly crafting the receiving side of the events to deal with directories may significantly reduce the monitor's resource consumption.
If none of the above applies, use the poll monitor. The authors’ experience indicates that fswatch requires approximately 150 MB of RAM to observe a hierarchy of 500,000 files with a minimum path length of 32 characters. A common bottleneck of the poll monitor is disk access, since stat()-ing a great number of files may take a huge amount of time. In this case, the latency should be set to a sufficiently large value in order to reduce the performance degradation that may result from frequent disk access.
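The recommendations above can be sketched as a small decision function. The function name recommend_monitor is hypothetical, and so is the rule used to decide when the number of files is "sufficiently small" (fewer items than half of the file descriptor limit); adjust it to your system. The name inotify_monitor is assumed to follow the naming pattern of the monitors listed in the help message.

```shell
# Hypothetical helper: suggest a monitor name.
#   $1: operating system name as printed by `uname -s`
#   $2: number of items to watch
#   $3: per-process file descriptor limit (must be a number)
recommend_monitor() {
  os=$1
  items=$2
  fd_limit=$3
  case $os in
    Darwin) echo fsevents_monitor ;;
    Linux)  echo inotify_monitor ;;
    *BSD|DragonFly)
      # Assumed safety margin: use kqueue only below half of the limit.
      if [ "$items" -lt $(( fd_limit / 2 )) ]; then
        echo kqueue_monitor
      else
        echo poll_monitor
      fi ;;
    *) echo poll_monitor ;;
  esac
}

# Usage (requires fswatch to be installed; 100 is an example item count):
#   fswatch --monitor "$(recommend_monitor "$(uname -s)" 100 "$(ulimit -n)")" /path/to/watch
```

When no monitor is specified, fswatch already picks the default for the platform, so a helper like this is only useful when the default needs to be overridden deliberately.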