How do I create a watchdog in Python? Code implementation tutorials

This article explains how to monitor a file system for events by using the watchdog and pygtail libraries to create a watchdog in Python.

In software development, application logs play a key role. As much as we want our software to be perfect, problems will always arise, so it’s important to have robust monitoring and logging in place to control and manage the inevitable chaos.

Today’s application support engineers need to be able to easily access and analyze the vast amounts of log data generated by their applications and infrastructure. When something goes wrong, they can’t wait a minute or two until the query returns results. Regardless of the amount of data they collect and query, they need speed.

In this tutorial, you’ll learn how to create a watchdog in Python. We’ll explain how to detect changes in a specific directory (assuming that directory hosts your application logs). Whenever there is a change, the modified or newly created files of predefined types are processed promptly to retrieve the lines that match the specified patterns.

All lines in these files that do not match the specified patterns are considered outliers and are discarded from our analysis.
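As a minimal sketch of this filtering idea, the snippet below splits a handful of made-up log lines into matches and outliers using a case-insensitive regular expression. The sample lines and the split_lines helper are illustrative assumptions and are not part of the tutorial's scripts.

```python
import re

# Patterns considered noteworthy (the same values used in config.py further down)
EXCEPTION_PATTERN = ["EXCEPTION", "FATAL", "ERROR"]


def split_lines(lines, patterns):
    """Return (matches, outliers) for the given lines.

    A line is a match when it contains any of the patterns, ignoring case.
    """
    regex = re.compile("|".join(patterns), flags=re.IGNORECASE)
    matches, outliers = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines entirely
        (matches if regex.search(line) else outliers).append(line)
    return matches, outliers


# Hypothetical log content, for illustration only
sample = [
    "2021-06-01 10:00:00 INFO service started",
    "2021-06-01 10:00:05 error: connection refused",
    "",
    "2021-06-01 10:00:09 FATAL out of memory",
]
matches, outliers = split_lines(sample, EXCEPTION_PATTERN)
print(matches)
print(outliers)
```

The lines in matches are the ones worth reporting; everything in outliers is simply ignored.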

We’ll use the watchdog and pygtail libraries to detect changes as they occur. There is also a Flask, Redis, and SocketIO version in which a GUI web application was created for the same purpose.

Process flow charts

First, let’s install the requirements:

$ pip3 install Pygtail==0.11.1 watchdog==2.1.1 colorama

First, let’s define the configuration parameters for our application in config.py:

# Application configuration File
################################
# Directory To Watch, If not specified, the following value will be considered explicitly.
WATCH_DIRECTORY = "C:\\SCRIPTS"
# Delay Between Watch Cycles In Seconds
WATCH_DELAY = 1
# Check The WATCH_DIRECTORY and its children
WATCH_RECURSIVELY = False
# whether to watch for directory events
DO_WATCH_DIRECTORIES = True
# Patterns of the files to watch
WATCH_PATTERN = '.txt,.trc,.log'
LOG_FILES_EXTENSIONS = ('.txt', '.log', '.trc')
# Patterns for observations
EXCEPTION_PATTERN = ['EXCEPTION', 'FATAL', 'ERROR']

The parameters in config.py will be the defaults; later in the script we can override them as needed.

Next, let’s define a checking mechanism that uses the pygtail and re modules to pinpoint observations based on the EXCEPTION_PATTERN parameter we just defined in config.py. Save it as checker.py, since controller.py imports FileChecker from that module:

import datetime
from pygtail import Pygtail

# Import the standard-library re module to work with regular expressions
import re

class FileChecker:
    def __init__(self, exceptionPattern):
        self.exceptionPattern = exceptionPattern

    def checkForException(self, event, path):
        # Get current date and time according to the specified format.
        now = (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
        # Read the lines of the file (specified in the path) that have not been read yet
        # Meaning by that it will start from the point where it was last stopped.
        for num, line in enumerate(Pygtail(path), 1):
            # Remove leading and trailing whitespaces including newlines.
            line = line.strip()
            # Return all non-overlapping matches of the values specified in the Exception Pattern.
            # The line is scanned from left to right and matches are returned in the order found.
            if line and any(re.findall('|'.join(self.exceptionPattern), line, flags=re.I | re.X)):
                # Observation Detected
                type = 'observation'
                msg = f"{now} -- {event.event_type} -- File = {path} -- Observation: {line}"
                yield type, msg
            elif line:
                # No Observation Detected
                type = 'msg'
                msg = f"{now} -- {event.event_type} -- File = {path}"
                yield type, msg

The checkForException() method defined in the code above will ingest the events (as you’ll see later) that are dispatched by the watchdog module’s Observer class.
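To illustrate the "read only what is new" behaviour that checkForException() relies on, here is a simplified, standard-library-only sketch that remembers a byte offset between calls. It is a stand-in for the idea, not Pygtail's actual implementation (Pygtail keeps its offset in a separate file on disk and also copes with log rotation). The read_new_lines helper and the in-memory offsets dictionary are assumptions made for this example.

```python
import os
import tempfile

# In-memory record of how far each file has been read (hypothetical helper state)
offsets = {}


def read_new_lines(path):
    """Return the lines appended to `path` since the previous call."""
    start = offsets.get(path, 0)
    with open(path, "rb") as f:
        f.seek(start)            # jump to where the last read stopped
        data = f.read()
        offsets[path] = f.tell()  # remember the new position
    return data.decode("utf-8").splitlines()


# Demonstration on a temporary file
tmp_dir = tempfile.mkdtemp()
log_path = os.path.join(tmp_dir, "app.log")

with open(log_path, "w", encoding="utf-8") as f:
    f.write("first line\n")
first_read = read_new_lines(log_path)

with open(log_path, "a", encoding="utf-8") as f:
    f.write("ERROR something broke\n")
second_read = read_new_lines(log_path)
third_read = read_new_lines(log_path)  # nothing new was written

print(first_read, second_read, third_read)
```

Each call returns only the lines written since the previous call, which is why a "modified" event does not cause the whole log file to be re-scanned.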

These events are triggered by any file change in a given directory, and the event object has 3 properties:

  • event_type: The type of the event as a string (modified, created, moved, or deleted).
  • is_directory: A boolean value that indicates whether the event was emitted for a directory.
  • src_path: The source path of the file system object that triggered the event.

Now let’s define our controller.py. First, let’s import the libraries:

# The Observer watches for any file change and then dispatches the respective events to an event handler.
from watchdog.observers import Observer
# The event handler will be notified when an event occurs.
from watchdog.events import FileSystemEventHandler
import time
import config
import os
from checker import FileChecker
import datetime
from colorama import Fore, Style, init

init()

GREEN = Fore.GREEN
BLUE = Fore.BLUE
RESET = Fore.RESET
RED = Fore.RED
YELLOW = Fore.YELLOW

event2color = {
    "created": GREEN,
    "modified": BLUE,
    "deleted": RED,
    "moved": YELLOW,
}

def print_with_color(s, color=Fore.WHITE, brightness=Style.NORMAL, **kwargs):
    """Utility function wrapping the regular `print()` function 
    but with colors and brightness"""
    print(f"{brightness}{color}{s}{Style.RESET_ALL}", **kwargs)

We’ll use text colors to distinguish between different events. For more information on colorama, check out its documentation.

Next, let’s define our event handler:

# Class that inherits from FileSystemEventHandler for handling the events sent by the Observer
class LogHandler(FileSystemEventHandler):

    def __init__(self, watchPattern, exceptionPattern, doWatchDirectories):
        self.watchPattern = watchPattern
        self.exceptionPattern = exceptionPattern
        self.doWatchDirectories = doWatchDirectories
        # Instantiate the checker
        self.fc = FileChecker(self.exceptionPattern)

    def on_any_event(self, event):
        now = (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
        # print("event happened:", event)
        # To Observe files only not directories
        if not event.is_directory:
            # To cater for the on_move event: prefer the destination path when one is set
            path = getattr(event, 'dest_path', '') or event.src_path
            # Ensure that the file extension is among the pre-defined ones.
            if path.endswith(self.watchPattern):
                msg = f"{now} -- {event.event_type} -- File: {path}"
                if event.event_type in ('modified', 'created', 'moved'):
                    # check for exceptions in log files
                    if path.endswith(config.LOG_FILES_EXTENSIONS):
                        for type, msg in self.fc.checkForException(event=event, path=path):
                            print_with_color(msg, color=event2color[event.event_type])
                    else:
                        print_with_color(msg, color=event2color[event.event_type])
                else:
                    print_with_color(msg, color=event2color[event.event_type])
        elif self.doWatchDirectories:
            msg = f"{now} -- {event.event_type} -- Folder: {event.src_path}"
            print_with_color(msg, color=event2color[event.event_type])

    def on_modified(self, event):
        pass

    def on_deleted(self, event):
        pass

    def on_created(self, event):
        pass

    def on_moved(self, event):
        pass

The LogHandler class inherits from the FileSystemEventHandler class of the watchdog library and mainly overrides its on_any_event() method.

Here are some useful methods of this class:

  • on_any_event(): Called for any event.
  • on_created(): Called when a file or directory is created.
  • on_modified(): Called when a file or directory is modified.
  • on_deleted(): Called when a file or directory is deleted.
  • on_moved(): Called when a file or directory is moved.

The code assigned to the on_any_event() method will:

  • Observe files and directories.
  • Verify that the extension of the file affected by the event is among those predefined in the WATCH_PATTERN variable in config.py.
  • If detected, a message describing the event or observation is generated.
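The path and extension checks listed above can be sketched in isolation as follows. The event objects here are SimpleNamespace stand-ins that only mimic the attributes of watchdog events (real events are delivered by the Observer), and the resolve_path and is_watched helpers are hypothetical names introduced for this example.

```python
from types import SimpleNamespace

WATCH_PATTERN = (".txt", ".trc", ".log")


def resolve_path(event):
    """Pick the destination path for move events, otherwise the source path."""
    return getattr(event, "dest_path", "") or event.src_path


def is_watched(path, extensions=WATCH_PATTERN):
    """str.endswith accepts a tuple, so one call checks every extension."""
    return path.endswith(extensions)


# Stand-in objects mimicking watchdog event attributes (illustration only)
modified = SimpleNamespace(event_type="modified", is_directory=False,
                           src_path="logs/app.log")
moved = SimpleNamespace(event_type="moved", is_directory=False,
                        src_path="logs/app.log", dest_path="logs/app.bak")

print(resolve_path(modified), is_watched(resolve_path(modified)))
print(resolve_path(moved), is_watched(resolve_path(moved)))
```

Note that str.endswith() needs a tuple rather than a list, which is why the script converts the comma-separated pattern string with tuple(...) before handing it to the handler.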

Now let’s write our LogWatcher class:

class LogWatcher:
    # Initialize the observer
    observer = None
    # Initialize the stop signal variable
    stop_signal = 0
    # The observer is the class that watches for any file system change and then dispatches the event to the event handler.
    def __init__(self, watchDirectory, watchDelay, watchRecursively, watchPattern, doWatchDirectories, exceptionPattern, sessionid=None, namespace=None):
        # Initialize variables in relation
        self.watchDirectory = watchDirectory
        self.watchDelay = watchDelay
        self.watchRecursively = watchRecursively
        self.watchPattern = watchPattern
        self.doWatchDirectories = doWatchDirectories
        self.exceptionPattern = exceptionPattern
        self.namespace = namespace
        self.sessionid = sessionid

        # Create an instance of watchdog.observer
        self.observer = Observer()
        # The event handler is an object that will be notified when something happens to the file system.
        self.event_handler = LogHandler(watchPattern, exceptionPattern, self.doWatchDirectories)

    def schedule(self):
        print("Observer Scheduled:", self.observer.name)
        # Call the schedule function via the Observer instance attaching the event
        self.observer.schedule(
            self.event_handler, self.watchDirectory, recursive=self.watchRecursively)

    def start(self):
        print("Observer Started:", self.observer.name)
        self.schedule()
        # Start the observer thread and wait for it to generate events
        now = (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
        msg = f"Observer: {self.observer.name} - Started On: {now} - Related To Session: {self.sessionid}"
        print(msg)

        msg = (
            f"Watching {'Recursively' if self.watchRecursively else 'Non-Recursively'}: {self.watchPattern}"
            f" -- Folder: {self.watchDirectory} -- Every: {self.watchDelay}(sec) -- For Patterns: {self.exceptionPattern}"
        )
        print(msg)
        self.observer.start()

    def run(self):
        print("Observer is running:", self.observer.name)
        self.start()
        try:
            while True:
                time.sleep(self.watchDelay)

                if self.stop_signal == 1:
                    print(
                        f"Observer stopped: {self.observer.name}  stop signal:{self.stop_signal}")
                    self.stop()
                    break
        except KeyboardInterrupt:
            # Stop cleanly when the user presses Ctrl+C
            self.stop()
        self.observer.join()

    def stop(self):
        print("Observer Stopped:", self.observer.name)

        now = (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
        msg = f"Observer: {self.observer.name} - Stopped On: {now} - Related To Session: {self.sessionid}"
        print(msg)
        self.observer.stop()
        self.observer.join()

    def info(self):
        info = {
            'observerName': self.observer.name,
            'watchDirectory': self.watchDirectory,
            'watchDelay': self.watchDelay,
            'watchRecursively': self.watchRecursively,
            'watchPattern': self.watchPattern,
        }
        return info

Here’s what we did in the LogWatcher class:

  • Create an instance of the watchdog.observer thread class, which monitors for any file system changes and then dispatches the appropriate events to the event handler.
  • Create an instance of our event handler, LogHandler, which inherits from FileSystemEventHandler. The event handler is notified when any change occurs.
  • Schedule the event handler on our observer and define other input parameters such as the directory to watch, the watch mode, etc. Note that when you set the recursive parameter to True, you must ensure that you have sufficient access rights on the subfolders.
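The run() method above polls an integer stop_signal once per watchDelay. One possible alternative, shown here only as a sketch and not as the tutorial's approach, is to wait on a threading.Event, which lets the loop exit as soon as a stop is requested instead of after the current sleep finishes. StoppableLoop is a hypothetical stand-in for the watcher's loop and does not involve the Observer.

```python
import threading
import time


class StoppableLoop:
    """Polls at a fixed delay until asked to stop (stand-in for LogWatcher.run)."""

    def __init__(self, delay):
        self.delay = delay
        self._stop_event = threading.Event()
        self.cycles = 0

    def run(self):
        # Event.wait returns True as soon as stop() is called, so the loop
        # exits without waiting for the full delay to elapse.
        while not self._stop_event.wait(self.delay):
            self.cycles += 1

    def stop(self):
        self._stop_event.set()


loop = StoppableLoop(delay=0.05)
worker = threading.Thread(target=loop.run)
worker.start()
time.sleep(0.2)   # let a few cycles pass
loop.stop()
worker.join(timeout=2)
print("stopped:", not worker.is_alive(), "cycles:", loop.cycles)
```

In the real class, the place where cycles is incremented is where nothing needs to happen at all, because the observer thread does the work; the loop only keeps the main thread alive until a stop is requested.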

Finally, let’s use the following argparse code to create command-line arguments around the code:

def is_dir_path(path):
    """Utility function to check whether a path is an actual directory"""
    if os.path.isdir(path):
        return path
    else:
        raise NotADirectoryError(path)

if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(
        description="Watchdog script for watching for files & directories' changes")
    parser.add_argument("path",
                        nargs="?",  # optional, so the default below can apply
                        default=config.WATCH_DIRECTORY,
                        type=is_dir_path,
                        )
    parser.add_argument("-d", "--watch-delay",
                        help=f"Watch delay, default is {config.WATCH_DELAY}",
                        default=config.WATCH_DELAY,
                        type=int,
                        )
    parser.add_argument("-r", "--recursive",
                        action="store_true",
                        help=f"Whether to recursively watch for the path's children, default is {config.WATCH_RECURSIVELY}",
                        default=config.WATCH_RECURSIVELY,
                        )
    parser.add_argument("-p", "--pattern",
                        help=f"Pattern of files to watch, default is {config.WATCH_PATTERN}",
                        default=config.WATCH_PATTERN,
                        )
    parser.add_argument("--watch-directories",
                        action="store_true",
                        help=f"Whether to watch directories, default is {config.DO_WATCH_DIRECTORIES}",
                        default=config.DO_WATCH_DIRECTORIES,
                        )
    # parse the arguments
    args = parser.parse_args()
    # define & launch the log watcher
    log_watcher = LogWatcher(
        watchDirectory=args.path,
        watchDelay=args.watch_delay,
        watchRecursively=args.recursive,
        watchPattern=tuple(args.pattern.split(",")),
        doWatchDirectories=args.watch_directories,
        exceptionPattern=config.EXCEPTION_PATTERN,
    )
    log_watcher.run()

We define is_dir_path() to ensure that the path entered is a valid directory. Let’s use the script:

I’ve passed --recursive to observe everything that’s happening in the E:\watchdog directory, including subfolders, and I’ve also specified a pattern of .txt,.log,.jpg,.png to watch text and image files.

Then I created a folder, started writing in a text file, and then moved an image and deleted it; the watchdog captured everything!

Note that you can either override the parameters in config.py or pass them on the command line.
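To make the override behaviour concrete, here is a small self-contained sketch of how argparse falls back to defaults when an option is omitted and prefers the command-line value when it is given. The constants stand in for the values in config.py, and build_parser is a hypothetical helper that mirrors only a subset of the script's options.

```python
import argparse

# Defaults standing in for the values from config.py
WATCH_DELAY = 1
WATCH_PATTERN = ".txt,.trc,.log"


def build_parser():
    parser = argparse.ArgumentParser(description="Override config defaults from the CLI")
    parser.add_argument("-d", "--watch-delay", type=int, default=WATCH_DELAY)
    parser.add_argument("-p", "--pattern", default=WATCH_PATTERN)
    parser.add_argument("-r", "--recursive", action="store_true")
    return parser


parser = build_parser()

# No options given: the config defaults are used
defaults = parser.parse_args([])
# Options given: they take precedence over the defaults
custom = parser.parse_args(["--recursive", "--pattern", ".txt,.log,.jpg,.png"])

print(defaults.watch_delay, tuple(defaults.pattern.split(",")), defaults.recursive)
print(custom.watch_delay, tuple(custom.pattern.split(",")), custom.recursive)
```

Passing a list to parse_args() is just a convenient way to try out argument combinations without launching the script from a terminal.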

Conclusion

After this deep dive into the available features of the watchdog and pygtail libraries, I hope this article has been helpful to you.

It’s worth noting that by extending the functionality described, you can attach an alerting mechanism or play a sound when a fatal error occurs in a log file. That way, whenever an observation is pinpointed, the configured workflow or alert is triggered automatically.
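As a rough sketch of such an extension, the function below consumes (type, msg) tuples in the shape yielded by FileChecker.checkForException() and calls a callback for every observation that mentions a chosen keyword. The dispatch_alerts name, its parameters, and the sample results are assumptions made for illustration; the callback could just as well send an email or play a sound.

```python
def dispatch_alerts(results, on_fatal, keyword="FATAL"):
    """Call `on_fatal(msg)` for each observation whose message contains `keyword`.

    `results` is an iterable of (type, msg) tuples, the shape yielded by
    FileChecker.checkForException(). Returns the number of alerts raised.
    """
    raised = 0
    for kind, msg in results:
        if kind == "observation" and keyword.lower() in msg.lower():
            on_fatal(msg)
            raised += 1
    return raised


# Hypothetical results, shaped like the checker's output
sample_results = [
    ("msg", "2021-06-01 10:00:00 -- modified -- File = app.log"),
    ("observation",
     "2021-06-01 10:00:05 -- modified -- File = app.log -- Observation: ERROR timeout"),
    ("observation",
     "2021-06-01 10:00:09 -- modified -- File = app.log -- Observation: FATAL out of memory"),
]

alerts = []
count = dispatch_alerts(sample_results, on_fatal=alerts.append)
print(count, alerts)
```

In the handler, this would take the place of (or sit alongside) the loop that currently just prints each message.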