fswatch is a file system monitoring utility that achieves portability across multiple platforms by decoupling the front-end (fswatch itself) from the back-end logic. Back-end logic is encapsulated in multiple system-specific monitors, each interacting with a different monitoring API. Since each operating system may ship a different set of APIs (16), each operating system supports the corresponding set of monitors.
The list of available monitors is decided at build time by the configure script. Monitors cannot currently be plugged in without recompiling the libfswatch library (shipped with fswatch). The list of available monitors can be obtained from the help message:
$ fswatch --help
[...]
Available monitors in this platform:

  fsevents_monitor
  kqueue_monitor
  poll_monitor

[...]
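When a script needs to know which monitors a given build offers, the help message above can be parsed. The following is a minimal sketch, assuming the help text keeps the layout shown in the excerpt (a header line followed by names ending in `_monitor`); the function name `parse_available_monitors` is hypothetical and not part of fswatch.

```python
from __future__ import annotations

import re

# Header line taken from the help excerpt above.
HEADER = "Available monitors in this platform:"


def parse_available_monitors(help_text: str) -> list[str]:
    """Return the monitor names listed after the header, in order."""
    _, found, tail = help_text.partition(HEADER)
    if not found:
        return []  # Header missing: this layout is not the one assumed here.
    # Assumption: every monitor name ends in "_monitor", as in the excerpt.
    return re.findall(r"\b\w+_monitor\b", tail)


sample = """$ fswatch --help
[...]
Available monitors in this platform:

  fsevents_monitor
  kqueue_monitor
  poll_monitor

[...]
"""

print(parse_available_monitors(sample))
# ['fsevents_monitor', 'kqueue_monitor', 'poll_monitor']
```

In practice the text would come from running `fswatch --help` and capturing its standard output, for example with `subprocess.run`.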
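Currently, the available monitors are the FSEvents monitor, the kqueue monitor, the inotify monitor, and the poll monitor, which works on any operating system where stat can be used (see The Poll Monitor).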
Each monitor has its own strengths, weaknesses and peculiarities. Although fswatch strives to provide a uniform experience no matter which monitor is used, it is still important for users to know which monitor they are using and to be aware of existing bugs, limitations, corner cases or pathological behaviour.
The FSEvents monitor, available only on Apple OS X, has no known limitations and scales very well with the number of files being observed. In fact, I observed no performance degradation when testing fswatch observing changes on a filesystem of 500 GB over long periods of time. On OS X, this is the default monitor.
The recursive option (--recursive, -r) has no effect when used with the FSEvents monitor, since the FSEvents API already monitors a directory’s children by default. This behaviour causes no overhead or resource-consumption issue, but users processing the output must be aware that, for each directory, multiple events may be generated by its children.
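One way to cope with many child events per directory is to group the reported paths by their parent directory before acting on them. This is only a sketch under the assumption that the consumer receives a batch of POSIX-style paths; `group_by_directory` is a hypothetical helper, not an fswatch feature.

```python
import posixpath
from collections import defaultdict


def group_by_directory(paths):
    """Group event paths by their parent directory, keeping arrival order."""
    groups = defaultdict(list)
    for path in paths:
        groups[posixpath.dirname(path)].append(path)
    return dict(groups)


# Hypothetical batch of paths reported for children of a watched directory.
events = ["/w/docs/a.txt", "/w/docs/b.txt", "/w/notes.txt"]
for directory, children in group_by_directory(events).items():
    print(directory, children)
```

A consumer can then run one action per directory instead of one per reported child.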
The kqueue monitor, available on any *BSD system featuring the kevent function, is very similar in principle to other APIs of its kind (such as FSEvents and inotify) but has important drawbacks and limitations.
The kqueue monitor requires a file descriptor to be opened for every file being watched. As a result, this monitor scales badly with the number of files being observed and may begin to misbehave as soon as the fswatch process runs out of file descriptors. In this case, fswatch prints one error message to standard error for every file that cannot be opened, so that users are notified and can take action, including terminating the fswatch session. Beware that on some systems the maximum number of file descriptors that a process can open is set to a very low value (values as low as 256 are not uncommon), even if the operating system may allow a much larger value.
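If you are running out of file descriptors when using this monitor and you cannot reduce the number of observed items, either raise the maximum number of open file descriptors allowed for the process (check your operating system’s documentation) or use another monitor.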
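Before pointing the kqueue monitor at a large tree, it can help to compare the number of files with the per-process descriptor limit. The sketch below uses Python's standard `resource` module (Unix only); the `reserved` allowance for descriptors already in use (standard streams and the like) is a hypothetical figure, not a value taken from fswatch.

```python
import resource


def soft_fd_limit() -> int:
    """Return the current soft limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def fits_in_fd_limit(n_files: int, soft_limit: int, reserved: int = 16) -> bool:
    """Return True if one descriptor per watched file fits under the limit.

    `reserved` is an assumed allowance for descriptors the process
    needs for other purposes.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return n_files + reserved <= soft_limit


if __name__ == "__main__":
    limit = soft_fd_limit()
    print("soft limit:", limit)
    print("10,000 files fit:", fits_in_fd_limit(10_000, limit))
```

With a limit of 256, as mentioned above, roughly 240 files fit under these assumptions.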
The inotify monitor is backed by the inotify API and the inotify_* set of functions, available on Linux since kernel 2.6.13. Similarly to the FSEvents API, inotify is very efficient, suffers from no known resource-exhaustion problems and scales very well with the number of files being watched. This monitor is the default monitor on systems running inotify-enabled Linux kernels.
The inotify monitor may suffer a queue overflow if events are generated faster than they are read from the queue. In any case, the application is guaranteed to receive an overflow notification, which can be handled to recover gracefully. Currently, the fswatch process is terminated by throwing an exception after the notification is sent. Future versions will handle the overflow by emitting a notification in the form of a specially crafted change event. However, the odds of observing a queue overflow on a mainstream GNU/Linux distribution with a default configuration are very low.
The inotify API sends events for the direct child elements of a watched directory and it scales well with the number of watched items. For this reason, depending on the number of files to watch, it may sometimes be preferable to watch a common parent directory non-recursively and filter the received events rather than adding a huge number of file watches. If recursive watches are used, duplicate change events will be received for a changed item: one from the watch on its parent directory and one from the watch on the item itself.
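Duplicate events of the kind described above can be collapsed by the consumer. A minimal sketch, assuming events are handled in batches of paths; `dedupe_batch` is a hypothetical helper and says nothing about how fswatch itself reports events.

```python
def dedupe_batch(paths):
    """Drop repeated paths from one batch of events, keeping first-seen order."""
    # dict preserves insertion order, so the first occurrence of each path wins.
    return list(dict.fromkeys(paths))


batch = ["/src/a.c", "/src/a.c", "/src/b.c"]
print(dedupe_batch(batch))  # ['/src/a.c', '/src/b.c']
```

The same filter also suits the non-recursive approach, where events from a common parent directory are post-processed before use.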
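The poll monitor was added as a fallback mechanism for the cases where no other monitor can be used, including operating systems without any file events API and situations where the limitations of the available monitors cannot be overcome (17).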
The poll monitor, available on any platform, only relies on available CPU and memory to perform its task.
The resource consumption of this monitor increases linearly with the number of files being watched (the resulting system performance will probably degrade linearly or quicker).
The authors’ experience indicates that fswatch requires approximately 150 MB of RAM to observe a hierarchy of 500,000 files with a minimum path length of 32 characters. A common bottleneck of the poll monitor is disk access, since stat()-ing a great number of files may take a huge amount of time. In this case, the latency (see Latency) should be set to a sufficiently large value in order to reduce the performance degradation that may result from frequent disk access; the poll monitor, in fact, re-scans the whole monitored object hierarchy looking for differences every time its ‘monitoring loop’ is repeated.
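The re-scan described above can be sketched as taking a snapshot of modification times and comparing it with the previous one. This is a simplified illustration, not fswatch's implementation: it tracks only the modification time, and the names `snapshot` and `diff` are hypothetical.

```python
import os


def snapshot(root):
    """Map every path below root to its modification time in nanoseconds."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                state[path] = os.stat(path).st_mtime_ns  # one stat() per item
            except OSError:
                continue  # The item vanished or is unreadable: skip it.
    return state


def diff(old, new):
    """Return (created, deleted, updated) paths between two snapshots."""
    created = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    updated = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return created, deleted, updated
```

Each pass costs one stat() call and one dictionary entry per watched item, which matches the linear growth in disk access and memory noted above.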
Note: Using a disk drive with lower latencies may certainly help, although the authors suspect that switching to an operating system with proper file monitoring APIs is a better solution when performance problems with the poll monitor are experienced or when fswatch should drive mission-critical processes.
Since this monitor periodically checks the state of monitored objects looking for differences, it may miss events that happened between one scan and the next. Suppose, for example, that a file named file exists at time t_0 when a scan occurs. The poll monitor detects file and saves the relevant attributes in memory. file is then updated, moved to another place and recreated with the same name. The chain of events (18) that occurred to file is:
Updated
MovedFrom (or Deleted)
Created
Link
At time t_1, another scan runs and the poll monitor detects
that the modification date has changed. The poll monitor can only
infer that a ‘change’ has occurred and raises an Updated
event; other events that would be noticed and raised by other
APIs are effectively lost since they go unnoticed.
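The loss can be reproduced with a small in-memory model. The sketch below is an illustration under simplifying assumptions: the file system is a dictionary from path to modification time, the timestamps are made up, and `infer_events` is a hypothetical stand-in for what a polling scan can deduce.

```python
def infer_events(before, after):
    """Infer events by comparing two scans (path -> modification time)."""
    events = []
    for path in sorted(before.keys() | after.keys()):
        if path not in before:
            events.append((path, "Created"))
        elif path not in after:
            events.append((path, "Deleted"))
        elif before[path] != after[path]:
            events.append((path, "Updated"))
    return events


# State seen by the scan at t_0.
at_t0 = {"file": 100}

# What really happens between the two scans, applied to a working copy.
fs = dict(at_t0)
actual = []
fs["file"] = 101          # the file is updated
actual.append("Updated")
del fs["file"]            # the file is moved elsewhere
actual.append("MovedFrom")
fs["file"] = 102          # a file with the same name is created
actual.append("Created")

at_t1 = fs
print(actual)                      # ['Updated', 'MovedFrom', 'Created']
print(infer_events(at_t0, at_t1))  # [('file', 'Updated')]
```

Only the final state is visible at t_1, so the intermediate move and creation collapse into a single Updated event.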
The odds of incurring such a loss grow with the latency l: reducing the latency helps alleviate this problem, although, on the other hand, it also increases resource usage, since scans are repeated more often.
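The cost side of this trade-off can be estimated with simple arithmetic. The sketch assumes one stat() call per watched file per scan and one scan per latency interval; both are simplifications, and `stat_calls_per_second` is a hypothetical helper.

```python
def stat_calls_per_second(n_files: int, latency_s: float) -> float:
    """Estimate the stat() rate for n_files scanned once every latency_s seconds."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return n_files / latency_s


# The 500,000-file hierarchy mentioned earlier, at a few latencies (in seconds).
for latency in (1.0, 5.0, 30.0):
    print(latency, stat_calls_per_second(500_000, latency))
```

Halving the latency doubles the estimated rate, which is why a shorter latency narrows the window for missed events only at the price of more disk access.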
fswatch already chooses the ‘best’ monitor for your platform if you do not specify one. However, a specific monitor may be better suited to specific use cases. See Monitors for a description of all the available monitors and their limitations.
Usage recommendations are as follows:
When the poll monitor is used, bear in mind that stat()-ing a great number of files may take a huge amount of time. In this case, the latency should be set to a sufficiently large value in order to reduce the performance degradation that may result from frequent disk access.
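The platform defaults described in this chapter can be summarised as a small lookup. This is a sketch of the reasoning, not fswatch's own selection logic. The name `inotify_monitor` is assumed by analogy with the names in the help excerpt, the list of BSD system names is an assumption, and preferring kqueue on BSD only while the file count stays under the descriptor limit is an illustrative rule.

```python
import platform

# Defaults stated in this chapter: FSEvents on OS X, inotify on Linux.
DEFAULTS = {"Darwin": "fsevents_monitor", "Linux": "inotify_monitor"}

# Assumed values of platform.system() on BSD systems.
BSD_SYSTEMS = {"FreeBSD", "OpenBSD", "NetBSD", "DragonFly"}


def suggest_monitor(system: str, n_files: int = 0, fd_limit: int = 256) -> str:
    """Suggest a monitor name for a system and an expected number of files."""
    if system in DEFAULTS:
        return DEFAULTS[system]
    # kqueue needs one descriptor per file, so it suits only small sets.
    if system in BSD_SYSTEMS and n_files < fd_limit:
        return "kqueue_monitor"
    return "poll_monitor"  # Fallback that works wherever stat is available.


print(suggest_monitor(platform.system()))
```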
(16) In fact, only OS X supports more than one such API: BSD’s kqueue and FSEvents.
(17) E.g.: observing a number of files greater than the available file descriptors on a system using the kqueue monitor.
(18) The actual chain of events may in fact vary depending on the monitor being used.