fswatch is a file system monitoring utility that achieves
portability across multiple platforms by decoupling the front-end
(fswatch itself) from back-end logic. Back-end logic is
encapsulated in multiple, system-specific monitors, interacting
with different monitoring APIs. Since each operating system
may ship a different set of APIs [16], each operating system will
support the corresponding set of monitors.
The list of available monitors is decided at build time by the
configure script. Monitors cannot currently be plugged in without
recompiling the libfswatch library (shipped with fswatch). The list
of available monitors can be obtained from the help message:
$ fswatch --help
[...]
Available monitors in this platform:
  fsevents_monitor
  kqueue_monitor
  poll_monitor
[...]
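Currently, the available monitors are:
- The FSEvents monitor, available only on Apple OS X.
- The kqueue monitor, available on any *BSD system featuring the kevent function.
- The File Events Notification monitor, backed by the File Events Notification API of the Solaris/Illumos kernel.
- The inotify monitor, backed by the inotify API of the Linux kernel.
- The Windows monitor, which uses the ReadDirectoryChangesW function and reads change events asynchronously.
- The poll monitor, a fallback that works on any operating system where stat can be used (see The Poll Monitor).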
Each monitor has its own strengths, weaknesses and peculiarities.
Although fswatch strives to provide a uniform experience no
matter which monitor is used, it is still important for users to know
which monitor they are using and to be aware of existing bugs,
limitations, corner cases or pathological behaviour.
The FSEvents monitor, available only on Apple OS X, has no known
limitations and scales very well with the number of files being
observed. In fact, I observed no performance degradation when testing
fswatch observing changes on a filesystem of 500 GB over long
periods of time. This is the default monitor on Apple OS X.
The (--recursive, -r) and (--directories, -d) options have no effect when used with the FSEvents monitor since the FSEvents API already monitors a directory’s children by default. This behaviour causes neither overhead nor resource-consumption issues, but users processing the output must be aware that for each directory multiple events may be generated by its children.
The kqueue monitor, available on any *BSD system featuring the
kevent function, is similar in principle to other APIs (such as
FSEvents and inotify) but has important drawbacks and limitations.
The kqueue monitor requires a file descriptor to be opened for
every file being watched. As a result, this monitor scales
badly with the number of files being observed and may begin to
misbehave as soon as the fswatch process runs out of file
descriptors. In this case, fswatch writes one error message to
standard error for every file that cannot be opened so that users are
notified and can take action, including terminating the
fswatch session. Beware that on some systems the maximum
number of file descriptors that can be opened by a process is set to a
very low value (values as low as 256 are not uncommon), even if
the operating system may allow a much larger value.
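If you are running out of file descriptors when using this monitor and you cannot reduce the number of observed items, either raise the maximum number of open file descriptors allowed to the process (check your operating system documentation) or use another monitor.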
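The per-process limit mentioned above can be inspected, and within bounds raised, from a program. The following is a minimal Python sketch, assuming a Unix-like system where the standard resource module is available; the helper name raise_fd_soft_limit is hypothetical and is not part of fswatch.

```python
import resource

def raise_fd_soft_limit(wanted):
    """Try to raise this process's soft open-file limit to `wanted`.

    Returns (old_soft, new_soft). The limit is never lowered and the
    hard limit is left untouched. Processes started afterwards from
    this one inherit the new soft limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft, soft  # already unlimited, nothing to raise
    # The soft limit may not exceed the hard limit.
    new_soft = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    if new_soft <= soft:
        return soft, soft  # already high enough
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError):
        return soft, soft  # the system refused the new value
    return soft, new_soft
```

Because the limit is inherited by child processes, a wrapper would have to raise it before launching the monitoring process. In an interactive shell the same setting is usually reached with ulimit -n.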
The File Events Notification monitor is backed by the File Events Notification API of the Solaris/Illumos kernel. This monitor is very efficient: it suffers from no known resource-exhaustion problems and scales very well with the number of objects being watched. It is the default monitor on systems running a Solaris or Illumos kernel providing this API.
The inotify monitor is backed by the inotify API and the inotify_*
set of functions, available on Linux since kernel 2.6.13. Similarly
to the FSEvents API, inotify is very efficient: it suffers from no
known resource-exhaustion problems and scales very well with the
number of objects being watched. This monitor is the default monitor
on systems running inotify-enabled Linux kernels.
The inotify monitor may suffer a queue overflow if events are generated faster than they are read from the queue. In any case, the application is guaranteed to receive an overflow notification which can be handled to gracefully recover.
By default, fswatch throws an exception and the process terminates
after the overflow notification is sent. Using the --allow-overflow
option makes fswatch emit a change event of type Overflow without
exiting.
The inotify API sends events for the direct child elements of a watched directory and it scales pretty well with the number of watched items. For this reason, depending on the number of files to watch, it may sometimes be preferable to non-recursively watch a common parent directory and filter received events rather than adding a huge number of file watches. If recursive watches are used, then duplicate change events will be received for the same change: one from the watch on the parent directory and one from the watch on the child itself.
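The "watch a common parent and filter" approach described above can be sketched on the receiving side. This is a minimal Python illustration, assuming each event reaches the consumer as one line of text containing one path; the function name filter_events and the sample paths are hypothetical.

```python
def filter_events(lines, wanted):
    """Yield the event paths in `lines` that name one of the `wanted` files.

    `lines` is any iterable of text lines, each assumed to hold one path.
    """
    wanted = set(wanted)  # set membership keeps each lookup cheap
    for line in lines:
        path = line.rstrip("\n")
        if path in wanted:
            yield path
```

A consumer could pass sys.stdin as lines when the monitor's output is piped into the script, so that a single watch on the parent directory replaces many per-file watches.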
The Windows monitor uses the Windows ReadDirectoryChangesW
function for each watched path and asynchronously waits for change
events using overlapped I/O. The Windows monitor is the default
choice on Windows because it is the best performing monitor on that
platform and it is affected by virtually no limitations.
The Windows monitor may suffer a buffer overflow if events are
generated faster than they can be stored in the buffer allocated by
the operating system when ReadDirectoryChangesW is first called
on a watched path. Once the buffer has been created, it is never
resized and lives until the file handle on which events are listened
for is closed.
Another source of overflow is the size of the buffer passed to
ReadDirectoryChangesW by its caller. Unlike the one created by
Windows, this buffer’s size can be tuned by the user. The custom
windows.ReadDirectoryChangesW.buffer.size property can be used
to set the size of the buffer (in bytes) when fswatch is invoked,
as shown in the following example where a 4-kilobyte buffer is used:
$ fswatch --monitor-property \
    windows.ReadDirectoryChangesW.buffer.size=4096 \
    ~
By default, fswatch throws an exception and the process terminates
after the overflow notification is sent. Using the --allow-overflow
option makes fswatch emit a change event of type Overflow without
exiting.
The Windows API lets users watch directories, not files. fswatch
currently passes path arguments to the underlying monitor as they
are: as a consequence, if a path corresponds to a file, the monitor
will emit an error and will not be able to watch it.
For the same reason, the (--directories, -d) option has no effect when using this monitor.
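Since only directories can be watched by this monitor, it can be useful to check a list of paths before handing it over. The following is a small Python sketch of such a pre-flight check; the helper name partition_paths is hypothetical and the check is not something fswatch itself is documented to perform.

```python
import os

def partition_paths(paths):
    """Split `paths` into (directories, others), preserving order."""
    directories, others = [], []
    for path in paths:
        # Anything that is not an existing directory ends up in `others`.
        (directories if os.path.isdir(path) else others).append(path)
    return directories, others
```

Paths returned in the second list would be rejected by a directory-only monitor, so a caller can report them up front.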
The Windows API will return change events related to a
watched directory and any of its children, at any depth. Essentially,
the subtree rooted at a directory is recursively watched even
if the -r option is not used explicitly.
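The poll monitor was added as a fallback mechanism for the cases where no other monitor can be used, including operating systems without any file monitoring API and situations where the limitations of the available monitors cannot be overcome [17].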
The poll monitor, available on any platform, only relies on available CPU and memory to perform its task.
The resource consumption of this monitor increases linearly with the number of files being watched (the resulting system performance will probably degrade linearly or faster).
The authors’ experience indicates that fswatch requires
approximately 150 MB of RAM to observe a hierarchy of 500,000
files with a minimum path length of 32 characters. A common
bottleneck of the poll monitor is disk access, since stat()-ing a
great number of files may take a huge amount of time. In this case,
the latency (see Latency) should be set to a sufficiently large value
in order to reduce the performance degradation that may result from
frequent disk access; this monitor, in fact, re-scans the whole
monitored object hierarchy looking for differences every time
its ‘monitoring loop’ is repeated.
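As a back-of-the-envelope aid, the figure quoted above (about 150 MB for 500,000 files) works out to roughly 300 bytes per watched file. The Python sketch below extrapolates that single data point linearly, which is an assumption based on the linear growth described earlier. The function name and constants are illustrative only, and real usage depends on path lengths and platform.

```python
# Reference point quoted in the text: about 150 MB for 500,000 files.
REFERENCE_MB = 150
REFERENCE_FILES = 500_000

def estimate_poll_memory_mb(n_files):
    """Linearly extrapolate the poll monitor's memory use, in MB (rough guess)."""
    return REFERENCE_MB * n_files / REFERENCE_FILES
```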
Note: Using a disk drive with lower latencies may certainly help, although the authors suspect that switching to an operating system with proper file monitoring APIs is a better solution when performance problems with the poll monitor are experienced or when fswatch should drive mission-critical processes.
Since this monitor periodically checks the state of monitored objects looking for differences, it may miss events that happen between one scan and the next. Suppose, for example, that a file named file exists at time t_0 when a scan occurs. The poll monitor detects file and saves its relevant attributes in memory. file is then updated, moved to another place and recreated with the same name. The chain of events [18] that occurred to file is:
- Updated
- MovedFrom (or Deleted)
- Created
- Link
At time t_1, another scan runs and the poll monitor detects
that the modification date has changed. The poll monitor can only
infer that a ‘change’ has occurred and raises an Updated
event; other events that would be noticed and raised by other
APIs are effectively lost since they go unnoticed.
The odds of incurring such a loss grow with the latency l: reducing the latency helps alleviate this problem, although on the other hand it also increases resource usage, because scans run more frequently.
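The scan-and-compare behaviour described above, and the way intermediate events collapse into a single one, can be reproduced with a short Python sketch. It is not fswatch's implementation: the functions scan, diff and lost_events_demo are hypothetical, only modification times are compared, and the event labels are borrowed loosely from the text.

```python
import os
import tempfile

def scan(root):
    """Return {path: mtime_ns} for every file and directory under root."""
    snapshot = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                snapshot[path] = os.stat(path).st_mtime_ns
            except OSError:
                pass  # the item vanished between listing and stat
    return snapshot

def diff(old, new):
    """Compare two snapshots and return a sorted list of (event, path)."""
    events = []
    for path, mtime in new.items():
        if path not in old:
            events.append(("Created", path))
        elif old[path] != mtime:
            events.append(("Updated", path))
    for path in old:
        if path not in new:
            events.append(("Deleted", path))
    return sorted(events)

def lost_events_demo():
    """Update, move away and recreate a file between two scans."""
    with tempfile.TemporaryDirectory() as root, \
            tempfile.TemporaryDirectory() as elsewhere:
        path = os.path.join(root, "file")
        with open(path, "w") as f:
            f.write("v1")
        os.utime(path, ns=(1_000_000_000, 1_000_000_000))  # pin mtime
        before = scan(root)                                # scan at t_0

        with open(path, "a") as f:                         # update
            f.write("v2")
        os.rename(path, os.path.join(elsewhere, "file"))   # move away
        with open(path, "w") as f:                         # recreate
            f.write("v3")
        os.utime(path, ns=(2_000_000_000, 2_000_000_000))  # mtime differs

        after = scan(root)                                 # scan at t_1
        return [event for event, _ in diff(before, after)]
```

Calling lost_events_demo() reports a single Updated event even though three distinct changes took place between the two scans.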
fswatch already chooses the ‘best’ monitor for your platform
if you do not specify any. However, a specific monitor may be better
suited to specific use cases. See Monitors for a description of all
the available monitors and their limitations.
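Usage recommendations are as follows:
- On Apple OS X, use the FSEvents monitor (the default).
- On Solaris and Illumos, use the File Events Notification monitor (the default).
- On Linux, use the inotify monitor (the default).
- On Windows, use the Windows monitor (the default).
- Use the kqueue monitor only if the number of files to observe is small enough not to exhaust the available file descriptors.
- If none of the above applies, use the poll monitor. A common bottleneck of the poll monitor is disk access, since stat()-ing a great number of files may take a huge amount of time. In this case, the latency should be set to a sufficiently large value in order to reduce the performance degradation that may result from frequent disk access.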
[16] In fact, only OS X supports more than one such API: BSD’s kqueue and FSEvents.
[17] E.g.: observing a number of files greater than the available file descriptors on a system using the kqueue monitor.
[18] The actual chain of events may in fact vary depending on the monitor being used.