BotDetector
in package
Comprehensive Bot Detection System
Provides multi-layered bot detection through user agent analysis, request patterns, header validation, behavioral indicators, and fingerprinting. Designed to protect applications from automated traffic, scrapers, and malicious bots while allowing legitimate crawlers and users.
Key Detection Mechanisms
- User Agent Analysis: Detects known bot patterns and suspicious agents
- Header Validation: Identifies missing or suspicious HTTP headers
- Request Fingerprinting: Creates and validates signed fingerprints for tracking
- Behavioral Analysis: Detects honeypot triggers and suspicious patterns
- Path Protection: Monitors access to sensitive endpoints
- Rate Limiting: Pluggable rate limiting system integration
- Scoring System: Provides weighted probability scores (0.0-1.0)
Architecture Overview
The detector uses a scoring system that combines multiple detection signals:
- 0.0 = Definitely human
- 0.3-0.4 = Suspicious, consider additional verification
- 0.7+ = Likely bot
- 1.0 = Definitely bot
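The bands above can be turned into a tiered response. The following is a minimal sketch, not part of BotDetector: `actionForScore()` and the tier names are hypothetical, and it assumes that scores falling between the documented bands are handled like the band below them. It requires PHP 8 for `match`.

```php
<?php
// Hypothetical helper (not a BotDetector method): maps a score to a response tier.
function actionForScore(float $score): string
{
    return match (true) {
        $score >= 0.7 => 'block',     // likely bot
        $score >= 0.3 => 'challenge', // suspicious: ask for extra verification
        default       => 'allow',     // treated as human
    };
}
```

In an application, the value passed in would typically come from `BotDetector::score()`.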
Security Features
- HMAC Fingerprint Signing: Prevents fingerprint tampering
- Event Logging: Comprehensive EventManager integration for monitoring
- Defense in Depth: Multiple detection layers for robust protection
- Graceful Degradation: Continues functioning even if some checks fail
Usage Example
// Initial setup (once per application)
BotDetector::useSecretKey('your-32-character-minimum-secret-key');
BotDetector::registerSensitivePaths(['/admin', '/api/private']);

// Per-request detection
if (BotDetector::isLikelyBot()) {
    http_response_code(429);
    exit('Bot traffic detected');
}

// Detailed scoring
$score = BotDetector::score();
if ($score >= 0.4) {
    requireCaptcha(); // application-defined helper, not a BotDetector method
}
Table of Contents
Properties
- $botMap : array<string|int, mixed>|array<string|int, string>
- $rateLimitCallback : callable|null
- $secretKey : string|null
- $sensitivePaths : array<string|int, mixed>
Methods
- assertHeadersNotSent() : void
- Asserts that HTTP headers have not been sent yet, throwing an exception if they have
- fingerprint() : string|null
- Retrieve the raw fingerprint value
- fpidState() : string
- Determine the state of the fingerprint cookie
- generateSecretKey() : string
- Generate a cryptographically secure random key for development/testing
- getBotName() : string
- Determines the bot name based on user agent string by checking against known bot patterns
- getSignature() : string
- Generate HMAC signature for fingerprint
- hasNoCookies() : bool
- Check if request has no cookies
- hasSuspiciousHeaders() : bool
- Check for missing headers typically present in browsers
- isConfigured() : bool
- Check if BotDetector has been configured with a secret key
- isEmptyUserAgent() : bool
- Check if user agent string is empty
- isHittingSensitivePath() : bool
- Checks if the current request path matches any registered sensitive paths
- isHoneypot() : bool
- Check if honeypot field has been filled
- isInvalidOrigin() : bool
- Checks if the Origin header doesn't match the expected host
- isLikelyBot() : bool
- Determines if the request is likely from a bot based on scoring threshold
- isLocalIp() : bool
- Checks if the request is coming from a local IP address
- isMaliciousRequest() : bool
- Check if request is malicious (path or payload)
- isPostWithoutReferer() : bool
- Check if POST request is missing Referer header
- isRateLimited() : bool
- Checks if the request is rate limited using the configured callback
- isSuspiciousAcceptHeaders() : bool
- Check if Accept headers are missing or suspicious
- isSuspiciousPath() : bool
- Check if request path contains suspicious patterns
- isSuspiciousPayload() : bool
- Analyze request payloads for attack patterns
- isSuspiciousUserAgent() : bool
- Check if user agent is suspicious
- isUsingRareMethod() : bool
- Checks if the request uses uncommon HTTP methods that may indicate bot behavior
- isValidFingerprint() : bool
- Validate fingerprint cookie signature
- matchesKnownBot() : bool
- Checks if the user agent matches any known bot patterns
- queryBotMap() : string
- Query the bot map for a specific identifier
- registerSensitivePaths() : void
- Registers an array of sensitive paths that should trigger bot detection
- resetSensitivePaths() : void
- Clears all registered sensitive paths
- score() : float
- Calculate comprehensive bot detection score
- signFingerprintCookie() : void
- Sign or validate the fingerprint cookie
- updateBotMap() : void
- Update or add a bot pattern to the detection map
- useRateLimiter() : void
- Sets a callback function for rate limiting checks
- userInfo() : array{ip: string, fingerprint: string|null, user_agent: string, referer: string}
- Collects and returns comprehensive user information from server variables
- useSecretKey() : void
- Set the secret key for fingerprint signing
- verifySignature() : bool
- Verify fingerprint signature
- assertConfigured() : void
- Assert that configuration is complete
Properties
$botMap
public
static array<string|int, mixed>|array<string|int, string>
$botMap
= [
'adsbot' => 'Google AdsBot', 'ahrefsbot' => 'Ahrefs', 'amazonbot' => 'AmazonBot', 'applebot' => 'AppleBot',
'archive.org_bot' => 'Internet Archive', 'baiduspider' => 'Baidu', 'bingbot' => 'BingBot', 'bitlybot' => 'Bitly',
'bot' => 'Generic Bot', 'bytespider' => 'ByteDance Spider', 'censysinspect' => 'Censys', 'checkmarknetwork' => 'Checkmark Network',
'chrome-lighthouse' => 'Google Lighthouse', 'cloudflare' => 'Cloudflare', 'curl' => 'cURL', 'datadog' => 'Datadog Agent',
'discordbot' => 'Discord Bot', 'dotbot' => 'DotBot', 'duckduckbot' => 'DuckDuckGo', 'facebookexternalhit' => 'Facebook Crawler',
'facebot' => 'Facebook Bot', 'fetch' => 'Generic Fetch Client', 'gigabot' => 'Gigablast Bot', 'googlebot' => 'Googlebot',
'google' => 'Google (General)', 'gptbot' => 'OpenAI GPTBot', 'headless' => 'Headless Browser', 'httpclient' => 'HTTP Client',
'http_request2' => 'PEAR HTTP_Request2', 'ia_archiver' => 'Alexa (Amazon)', 'insights' => 'Microsoft Insights', 'ioncrawl' => 'IonCrawl',
'java/' => 'Java Client', 'libwww-perl' => 'libwww-perl', 'linkedinbot' => 'LinkedInBot', 'ltx71' => 'LTX71',
'mediapartners-google' => 'Google AdSense', 'mj12bot' => 'Majestic-12', 'monitoring' => 'Monitoring Agent', 'msnbot' => 'MSN Bot',
'nagios' => 'Nagios Checker', 'naverbot' => 'Naver Bot', 'netcraft' => 'Netcraft', 'newrelicpinger' => 'NewRelic',
'nutch' => 'Apache Nutch', 'openuabot' => 'OpenUA Bot', 'outbrain' => 'Outbrain Bot', 'panscient' => 'Panscient Bot',
'petalbot' => 'Huawei PetalBot', 'phantomjs' => 'PhantomJS', 'pingdom' => 'Pingdom', 'postman' => 'Postman Runtime',
'python' => 'Python Script', 'qwantbot' => 'Qwant Bot', 'rogerbot' => 'Rogerbot (Moz)', 'scrapy' => 'Scrapy Framework',
'searchmetricsbot' => 'SearchMetrics', 'semrush' => 'SEMRush', 'seokicks-robot' => 'SEO Kicks', 'serpstatbot' => 'Serpstat Bot',
'serpworx' => 'SERPWorx', 'shodan' => 'Shodan Bot', 'siteauditbot' => 'SiteAuditBot', 'sitecheckerbot' => 'SiteCheckerBot',
'slackbot' => 'Slack Bot', 'smtbot' => 'SMTBot', 'sogou' => 'Sogou Spider', 'spbot' => 'SEO Profiler Bot',
'spider' => 'Generic Spider', 'surveybot' => 'SurveyBot', 'telegrambot' => 'Telegram Bot', 'testcertificatechain' => 'Cert Validation Bot',
'trustpilot' => 'TrustPilot Bot', 'twitterbot' => 'Twitter Bot', 'uptimerobot' => 'UptimeRobot', 'vagabondbot' => 'Vagabond Bot',
'vkshare' => 'VKontakte Bot', 'wget' => 'Wget', 'whatsapp' => 'WhatsApp Bot', 'yacybot' => 'YaCy Bot',
'yahoo! slurp' => 'Yahoo Slurp', 'yahooseeker' => 'Yahoo Seeker', 'yandex' => 'Yandex Bot', 'zoominfo' => 'ZoomInfo Bot',
]
$rateLimitCallback
protected
static callable|null
$rateLimitCallback
= null
$secretKey
protected
static string|null
$secretKey
= null
$sensitivePaths
protected
static array<string|int, mixed>
$sensitivePaths
= []
Methods
assertHeadersNotSent()
Asserts that HTTP headers have not been sent yet, throwing an exception if they have
public
static assertHeadersNotSent() : void
fingerprint()
Retrieve the raw fingerprint value
public
static fingerprint([bool $refresh = false ]) : string|null
Parameters
- $refresh : bool = false
  Whether to refresh the cached result
Return values
string|null —The raw fingerprint or null if invalid/missing
fpidState()
Determine the state of the fingerprint cookie
public
static fpidState() : string
Return values
string —One of: 'missing', 'malformed', 'unsigned', 'valid', 'forged'
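The five states can be illustrated with a small classifier. This is a sketch under stated assumptions, not BotDetector's implementation: the cookie layout `<raw>.<signature>`, the function name `classifyFingerprint()`, and the rule that a bare SHA-256 hex value counts as 'unsigned' are all hypothetical.

```php
<?php
// Hypothetical classifier; the "<raw>.<signature>" cookie layout is an assumption.
function classifyFingerprint(?string $cookie, string $secretKey): string
{
    if ($cookie === null || $cookie === '') {
        return 'missing';
    }

    // Both halves are expected to be 64-character lowercase hex strings.
    $isSha256Hex = static fn (string $v): bool => preg_match('/\A[0-9a-f]{64}\z/', $v) === 1;
    $parts = explode('.', $cookie);

    if (count($parts) === 1) {
        return $isSha256Hex($parts[0]) ? 'unsigned' : 'malformed';
    }
    if (count($parts) !== 2 || !$isSha256Hex($parts[0]) || !$isSha256Hex($parts[1])) {
        return 'malformed';
    }

    // Recompute the HMAC and compare in constant time.
    $expected = hash_hmac('sha256', $parts[0], $secretKey);
    return hash_equals($expected, $parts[1]) ? 'valid' : 'forged';
}
```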
generateSecretKey()
Generate a cryptographically secure random key for development/testing
public
static generateSecretKey([int $length = 64 ]) : string
Parameters
- $length : int = 64
  Length of the generated key (default: 64, minimum: 32)
Return values
string —A cryptographically secure random hexadecimal string
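One way to produce such a key is to hex-encode output from PHP's CSPRNG. The sketch below is illustrative only: `makeHexKey()` is a hypothetical stand-in, and it assumes `$length` counts hexadecimal characters rather than bytes.

```php
<?php
// Hypothetical stand-in for illustration; assumes $length counts hex characters.
function makeHexKey(int $length = 64): string
{
    if ($length < 32) {
        throw new InvalidArgumentException('Key length must be at least 32 characters.');
    }
    // Each random byte yields two hex characters; round up, then trim odd lengths.
    $bytes = intdiv($length + 1, 2);
    return substr(bin2hex(random_bytes($bytes)), 0, $length);
}
```

Since the method is documented as intended for development and testing, a production key would normally be generated once and stored in configuration rather than regenerated per request.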
getBotName()
Determines the bot name based on user agent string by checking against known bot patterns
public
static getBotName([string $userAgent = '' ]) : string
Parameters
- $userAgent : string = ''
  Optional user agent string. If empty, uses $_SERVER['HTTP_USER_AGENT']
Return values
string —The identified bot name or 'Unknown' if no match found
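A lookup of this kind can be sketched as a case-insensitive substring search over a map shaped like `$botMap`. The function `lookupBotName()` is hypothetical, and the first-match-wins rule is an assumption: with it, a generic key such as 'bot' must come after more specific keys, and the documentation does not say how BotDetector itself resolves overlapping keys. The sketch uses `str_contains()`, so it needs PHP 8.

```php
<?php
// Hypothetical lookup; $map has the same shape as $botMap (lowercase substring => name).
function lookupBotName(string $userAgent, array $map): string
{
    $ua = strtolower($userAgent);
    foreach ($map as $needle => $name) {
        // Array keys that look numeric are stored as ints, so cast before searching.
        if ($needle !== '' && str_contains($ua, (string) $needle)) {
            return $name; // first match wins (assumption)
        }
    }
    return 'Unknown';
}
```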
getSignature()
Generate HMAC signature for fingerprint
public
static getSignature(string $raw) : string
Parameters
- $raw : string
  Raw fingerprint (must be valid SHA256 hex)
Return values
string —HMAC-SHA256 signature
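Signing a fingerprint with HMAC-SHA256 can be done with PHP's built-in `hash_hmac()`. The following sketch is not BotDetector's code: `signFingerprint()` is a hypothetical function that takes the key as an explicit argument, and the choice to throw on invalid input is an assumption.

```php
<?php
// Hypothetical signer, shown for illustration only.
function signFingerprint(string $raw, string $secretKey): string
{
    // Accept only a 64-character hex string, the shape of a SHA-256 digest.
    if (preg_match('/\A[0-9a-f]{64}\z/i', $raw) !== 1) {
        throw new InvalidArgumentException('Fingerprint must be a SHA-256 hex digest.');
    }
    // hash_hmac() returns the MAC as lowercase hex by default.
    return hash_hmac('sha256', $raw, $secretKey);
}
```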
hasNoCookies()
Check if request has no cookies
public
static hasNoCookies() : bool
Return values
bool —True if no cookies present, false otherwise
hasSuspiciousHeaders()
Check for missing headers typically present in browsers
public
static hasSuspiciousHeaders() : bool
Return values
bool —True if suspicious headers detected, false otherwise
isConfigured()
Check if BotDetector has been configured with a secret key
public
static isConfigured() : bool
Return values
bool —True if configured with secret key, false otherwise
isEmptyUserAgent()
Check if user agent string is empty
public
static isEmptyUserAgent([string|null $userAgent = null ]) : bool
Parameters
- $userAgent : string|null = null
  Optional user agent (defaults to $_SERVER['HTTP_USER_AGENT'])
Return values
bool —True if empty, false otherwise
isHittingSensitivePath()
Checks if the current request path matches any registered sensitive paths
public
static isHittingSensitivePath([string $path = '' ]) : bool
Parameters
- $path : string = ''
  Optional path to check. If empty, uses $_SERVER['REQUEST_URI']
Return values
bool —True if path contains sensitive content, false otherwise
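The documentation does not state how a request path is compared against the registered paths. The sketch below shows one plausible rule, prefix matching on path segments, purely as an assumption; `hitsSensitivePath()` is hypothetical and takes the registered paths as an argument instead of reading static state. It uses `str_starts_with()`, so it needs PHP 8.

```php
<?php
// Hypothetical matcher; segment-aware prefix matching is an assumption.
function hitsSensitivePath(string $requestUri, array $sensitivePaths): bool
{
    // Drop the query string so "/admin?x=1" is compared as "/admin".
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return false;
    }
    foreach ($sensitivePaths as $prefix) {
        // Match the path itself or anything below it, but not "/administrator".
        if ($path === $prefix || str_starts_with($path, rtrim($prefix, '/') . '/')) {
            return true;
        }
    }
    return false;
}
```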
isHoneypot()
Check if honeypot field has been filled
public
static isHoneypot([string $honeyField = 'primordyx_start' ]) : bool
Parameters
- $honeyField : string = 'primordyx_start'
  Name of the honeypot field (default: 'primordyx_start')
Return values
bool —True if honeypot triggered, false otherwise
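A honeypot is a form field hidden from human visitors, so any submitted value suggests an automated client. The sketch below illustrates the check against an explicit array rather than `$_POST`; `honeypotTriggered()` is hypothetical, and treating whitespace-only input as empty is an assumption.

```php
<?php
// Hypothetical check against an explicit form array instead of $_POST.
function honeypotTriggered(array $form, string $field = 'primordyx_start'): bool
{
    $value = $form[$field] ?? '';
    // A human never sees the field, so any non-blank value counts as a trigger.
    return is_string($value) ? trim($value) !== '' : !empty($value);
}
```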
isInvalidOrigin()
Checks if the Origin header doesn't match the expected host
public
static isInvalidOrigin() : bool
Return values
bool —True if origin is invalid or suspicious, false otherwise
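An Origin check can be sketched by comparing the host in the header with the host the application expects. This is an illustration under assumptions, not BotDetector's logic: `originMismatch()` is hypothetical, a missing Origin header is treated as acceptable, ports are ignored, and `$expectedHost` is assumed to contain no port.

```php
<?php
// Hypothetical comparison of an Origin header against an expected host name.
function originMismatch(?string $origin, string $expectedHost): bool
{
    if ($origin === null || $origin === '') {
        return false; // assumption: an absent Origin header is not flagged
    }
    $host = parse_url($origin, PHP_URL_HOST);
    if (!is_string($host)) {
        return true; // unparseable value, for example the literal "null"
    }
    // Host names are case-insensitive.
    return strcasecmp($host, $expectedHost) !== 0;
}
```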
isLikelyBot()
Determines if the request is likely from a bot based on scoring threshold
public
static isLikelyBot() : bool
Return values
bool —True if bot score is >= 0.7, false otherwise
isLocalIp()
Checks if the request is coming from a local IP address
public
static isLocalIp() : bool
Return values
bool —True if request is from localhost, false otherwise
isMaliciousRequest()
Check if request is malicious (path or payload)
public
static isMaliciousRequest() : bool
Return values
bool —True if malicious patterns detected, false otherwise
isPostWithoutReferer()
Check if POST request is missing Referer header
public
static isPostWithoutReferer([string|null $method = null ][, string|null $referer = null ]) : bool
Parameters
- $method : string|null = null
  Optional HTTP method
- $referer : string|null = null
  Optional Referer header
Return values
bool —True if POST without referer, false otherwise
isRateLimited()
Checks if the request is rate limited using the configured callback
public
static isRateLimited() : bool
Return values
bool —True if rate limited, false otherwise
isSuspiciousAcceptHeaders()
Check if Accept headers are missing or suspicious
public
static isSuspiciousAcceptHeaders([string|null $accept = null ][, string|null $acceptLanguage = null ]) : bool
Parameters
- $accept : string|null = null
  Optional Accept header
- $acceptLanguage : string|null = null
  Optional Accept-Language header
Return values
bool —True if suspicious, false otherwise
isSuspiciousPath()
Check if request path contains suspicious patterns
public
static isSuspiciousPath([string $path = '' ]) : bool
Detects common attack patterns, vulnerability scans, and malicious probes in the request path. Checks for things like directory traversal, common vulnerable endpoints, and configuration file access attempts.
Parameters
- $path : string = ''
  Optional path to check (defaults to $_SERVER['REQUEST_URI'])
Return values
bool —True if suspicious patterns detected, false otherwise
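The kinds of probes described above can be matched with a short list of regular expressions. The sketch below is illustrative: `pathLooksSuspicious()` and its pattern list are hypothetical examples of each category and are not the patterns BotDetector ships with.

```php
<?php
// Hypothetical detector; the patterns are examples, not BotDetector's real list.
function pathLooksSuspicious(string $path): bool
{
    // Decode percent-encoding first so "%2e%2e/" is seen as "../".
    $decoded = strtolower(rawurldecode($path));
    $patterns = [
        '#\.\./#',          // directory traversal
        '#/\.env\b#',       // environment/configuration file
        '#/\.git(/|\z)#',   // repository metadata
        '#/wp-login\.php#', // commonly probed login endpoint
        '#/phpmyadmin#',    // commonly probed admin tool
    ];
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $decoded) === 1) {
            return true;
        }
    }
    return false;
}
```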
isSuspiciousPayload()
Analyze request payloads for attack patterns
public
static isSuspiciousPayload() : bool
Return values
bool —True if suspicious payload detected, false otherwise
isSuspiciousUserAgent()
Check if user agent is suspicious
public
static isSuspiciousUserAgent([string|null $userAgent = null ]) : bool
Parameters
- $userAgent : string|null = null
  Optional user agent (defaults to $_SERVER['HTTP_USER_AGENT'])
Return values
bool —True if suspicious, false otherwise
isUsingRareMethod()
Checks if the request uses uncommon HTTP methods that may indicate bot behavior
public
static isUsingRareMethod() : bool
Return values
bool —True if using rare method, false otherwise
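Which methods count as rare is not listed in this reference. As a sketch, the check can be expressed as an allow-list of common methods; both `isRareMethod()` and the list itself are assumptions made for illustration.

```php
<?php
// Hypothetical check; the set of "common" methods is an assumption.
function isRareMethod(string $method): bool
{
    $common = ['GET', 'POST', 'HEAD', 'OPTIONS', 'PUT', 'PATCH', 'DELETE'];
    // Method names are compared case-insensitively by upper-casing first.
    return !in_array(strtoupper($method), $common, true);
}
```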
isValidFingerprint()
Validate fingerprint cookie signature
public
static isValidFingerprint([bool $refresh = false ]) : bool
Parameters
- $refresh : bool = false
  Whether to refresh the cached validation result
Return values
bool —True if fingerprint is valid, false otherwise
matchesKnownBot()
Checks if the user agent matches any known bot patterns
public
static matchesKnownBot([string $userAgent = '' ]) : bool
Parameters
- $userAgent : string = ''
  Optional user agent string. If empty, uses $_SERVER['HTTP_USER_AGENT']
Return values
bool —True if user agent matches a known bot, false otherwise
queryBotMap()
Query the bot map for a specific identifier
public
static queryBotMap(string $key) : string
Parameters
- $key : string
  The bot identifier to look up
Return values
string —The bot name if found, 'Unknown' otherwise
registerSensitivePaths()
Registers an array of sensitive paths that should trigger bot detection
public
static registerSensitivePaths(array<string|int, mixed> $paths) : void
Parameters
- $paths : array<string|int, mixed>
  Array of path strings to be considered sensitive
resetSensitivePaths()
Clears all registered sensitive paths
public
static resetSensitivePaths() : void
score()
Calculate comprehensive bot detection score
public
static score() : float
Analyzes multiple detection signals and returns a weighted probability score indicating likelihood of bot traffic. The scoring system combines user agent analysis, header validation, behavioral patterns, and fingerprint state to produce a score between 0.0 (human) and 1.0 (bot).
Scoring Components
- User Agent (0.4): Suspicious or known bot patterns
- Headers (0.2-0.3): Missing or suspicious headers
- Request Patterns (0.1-0.3): Path access, methods, rate limiting
- Fingerprint (0.1-0.5): Missing, unsigned, or forged fingerprints
- Payload (0.3-0.4): Honeypot triggers, attack patterns
Score Interpretation
- 0.0-0.2: Very likely human
- 0.3-0.4: Suspicious, consider additional verification
- 0.5-0.6: Probable bot
- 0.7-0.9: Very likely bot
- 1.0: Definitely bot (capped maximum)
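The combination of components and the 1.0 cap can be illustrated with an additive model. This is a sketch under assumptions rather than the actual algorithm: `combineSignals()`, the signal names, and the exact weights are hypothetical (the weights are picked from within the documented ranges), and simple addition is only one way to combine them.

```php
<?php
// Hypothetical scorer: signal names and weights are assumptions chosen from
// within the documented ranges, not BotDetector's real values.
function combineSignals(array $signals): float
{
    $weights = [
        'suspicious_user_agent' => 0.4,
        'suspicious_headers'    => 0.2,
        'sensitive_path'        => 0.1,
        'forged_fingerprint'    => 0.5,
        'honeypot'              => 0.4,
    ];
    $score = 0.0;
    foreach ($weights as $name => $weight) {
        if (!empty($signals[$name])) {
            $score += $weight; // each triggered signal adds its weight
        }
    }
    // Round away float noise, then cap at the documented maximum of 1.0.
    return min(1.0, round($score, 2));
}
```

Under these assumed weights, a suspicious user agent alone lands in the "suspicious" band, while a suspicious user agent combined with suspicious headers and a sensitive path reaches the 0.7 threshold used by `isLikelyBot()`.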
Return values
float —Bot detection score between 0.0 and 1.0
signFingerprintCookie()
Sign or validate the fingerprint cookie
public
static signFingerprintCookie() : void
Processes the fingerprint cookie to ensure it's properly signed. Handles unsigned cookies by signing them, validates already signed cookies, and clears forged or malformed cookies.
updateBotMap()
Update or add a bot pattern to the detection map
public
static updateBotMap(string $key, string $value) : void
Parameters
- $key : string
  Bot identifier (lowercase user agent substring)
- $value : string
  Human-readable bot name
useRateLimiter()
Sets a callback function for rate limiting checks
public
static useRateLimiter(callable $callback) : void
Parameters
- $callback : callable
  Function that returns boolean indicating if rate limited
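A callback of this shape could wrap any rate limiting strategy. The sketch below builds a fixed-window counter as a closure; `makeFixedWindowLimiter()` is hypothetical, and it assumes the callback is invoked with no arguments, which this reference does not confirm. The counter lives in process memory, so under the usual one-process-per-request PHP model a real limiter would need shared storage instead.

```php
<?php
// Hypothetical factory: returns a zero-argument callable that reports true once
// more than $limit calls have happened inside the current window.
function makeFixedWindowLimiter(int $limit, int $windowSeconds, ?callable $clock = null): callable
{
    $clock ??= static fn (): int => time(); // injectable clock for testing
    $windowStart = $clock();
    $count = 0;

    return static function () use (&$windowStart, &$count, $limit, $windowSeconds, $clock): bool {
        $now = $clock();
        if ($now - $windowStart >= $windowSeconds) {
            $windowStart = $now; // start a new window
            $count = 0;
        }
        $count++;
        return $count > $limit;
    };
}
```

The returned callable could then be handed to `BotDetector::useRateLimiter()`.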
userInfo()
Collects and returns comprehensive user information from server variables
public
static userInfo() : array{ip: string, fingerprint: string|null, user_agent: string, referer: string}
Return values
array{ip: string, fingerprint: string|null, user_agent: string, referer: string} —User information array
useSecretKey()
Set the secret key for fingerprint signing
public
static useSecretKey(string $secretKey) : void
Parameters
- $secretKey : string
  A strong secret key (minimum 32 characters)
verifySignature()
Verify fingerprint signature
public
static verifySignature(string $raw, string $sig) : bool
Parameters
- $raw : string
  Raw fingerprint value
- $sig : string
  Signature to verify
Return values
bool —True if signature is valid, false otherwise
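Verification amounts to recomputing the HMAC and comparing it with the supplied signature in constant time. The sketch below shows that pattern with PHP's `hash_hmac()` and `hash_equals()`; `signatureMatches()` is a hypothetical function that takes the key explicitly and is not BotDetector's implementation.

```php
<?php
// Hypothetical verifier, shown for illustration only.
function signatureMatches(string $raw, string $sig, string $secretKey): bool
{
    $expected = hash_hmac('sha256', $raw, $secretKey);
    // hash_equals() compares in constant time, so response timing does not
    // reveal how many leading characters of a forged signature were correct.
    return hash_equals($expected, $sig);
}
```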
assertConfigured()
Assert that configuration is complete
protected
static assertConfigured() : void