Baskerville

Baskerville is a machine operating on the Deflect network that protects  sites from hounding, malicious bots. It’s also an open source project that, in time, will be able to reduce bad behaviour on your networks too. Baskerville responds to web traffic, analyzing requests in real-time, and challenging those acting suspiciously.

Baskerville pipeline

We’ve trained Baskerville to recognize what legitimate traffic on our network looks like, and how to distinguish it from malicious requests attempting to disrupt our clients’ websites. Baskerville has turned out to be very handy for mitigating DDoS attacks, and for correctly classifying other types of malicious behaviour. A few months ago, Baskerville passed an important milestone – making its own decisions on traffic deemed anomalous. The quality of these decisions (recall) is high and Baskerville has already successfully mitigated many sophisticated real-life attacks.

Baskerville monitoring of challenged IPs and false positives

Baskerville is also a complete pipeline to receives as input incoming web logs, either from a Kafka topic, from a locally saved raw log file, or from log files saved to an Elasticsearch instance. It processes these logs in batches, forming request sets by grouping them by requested host and requesting IP. It subsequently extracts features for these request sets and predicts whether they are malicious or benign using a model that was trained offline on previously observed and labelled data. Baskerville saves all the data and results to a Postgres database, and publishes metrics on its processing (e.g. number of logs processed, percentage predicted malicious, percentage predicted benign, processing speed etc) that can be consumed by Prometheus and visualized using a Grafana dashboard. As well as an engine that consumes and process web logs, a set of offline analysis tools have been developed for use in conjunction with Baskerville. These tools may be accessed directly, or via two Jupyter notebooks, which walk the user through the machine learning tools and the investigations tools, respectively. The machine learning notebook comprises tools for training, evaluating, and updating the model used in the Baskerville engine. The investigations notebook comprises tools for processing, analyzing, and visualizing past attacks, for reporting purposes.

You can take a look at an investigation we ran comparing human lead attack analytics with those produced by Baskerville.

Also take a look at some slides related to the project here.