TriRev review agents - what each one checks

Why three agents

Each agent runs as an independent inference pass with its own system prompt, category vocabulary, and severity rules. The benefits compound:

Sharper attention. The Security agent never spends tokens checking variable names. The Style agent never spends tokens reasoning about CSRF. Each prompt gets the model's full focus on one shape of problem.
Independent severity calibration. A logic bug and a naming nit are not the same thing. Per-agent severity rules prevent style noise from drowning correctness signals.
Bounded blast radius. If one agent has a bad pass on a PR, the other two agents' findings still ship. The review has no single point of failure.
One synthesis. The orchestrator merges the three outputs into one PR comment with consistent formatting. The user sees a unified report, not three threads.
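
The isolation and merge steps described above can be sketched in TypeScript. This is a minimal illustration, not TriRev's orchestrator: the `Finding` shape, the `runReview` function, and the synchronous agent signature are all assumptions made for the example (real agents are inference passes and would be asynchronous).

```typescript
// Hypothetical types: TriRev's internal schema is not documented on this page.
type Severity = "critical" | "high" | "medium" | "low" | "info";

interface Finding {
  id: string;       // e.g. "CORR-1"
  line: string;     // e.g. "L42" or "L18-24"
  severity: Severity;
  message: string;
}

type AgentName = "correctness" | "security" | "style";
type Agent = (diff: string) => Finding[];

interface Review {
  findings: Finding[];       // merged output for a single PR comment
  failedAgents: AgentName[]; // agents whose pass failed on this PR
}

// Run each agent in isolation so one bad pass cannot sink the whole review.
function runReview(diff: string, agents: Record<AgentName, Agent>): Review {
  const order: AgentName[] = ["correctness", "security", "style"];
  const review: Review = { findings: [], failedAgents: [] };
  for (const name of order) {
    try {
      review.findings.push(...agents[name](diff));
    } catch {
      // Record the failure and keep going: the other agents' findings still ship.
      review.failedAgents.push(name);
    }
  }
  return review;
}
```

If the security agent throws in this sketch, the returned review still contains the correctness and style findings, and `failedAgents` lists `"security"` so the merged comment can say which pass is missing.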

Correctness

The Correctness agent catches logic bugs and edge cases that compile but fail at runtime.

What it looks for

Null and undefined dereferences. Missing null guards, optional-chain shortcuts that swallow errors, dot-access on a known-nullable value.
Off-by-one errors. Loops, slices, array indexing, range boundaries.
Unhandled error cases. Missing try/catch, unchecked return values, unhandled promise rejections, swallowed exceptions.
Type mismatches. Unsafe coercions, arguments of the wrong type, APIs called with an outdated signature.
Race conditions. Async/await misuse, missing locks, shared mutable state in concurrent paths.
Logic errors in conditionals. Inverted conditions, missing branches, unreachable code.
Resource leaks. Unclosed file handles, unreleased database connections, dangling streams or sockets.
Edge cases. Empty arrays, zero-length strings, negative numbers, integer overflow, floating-point comparisons.
Regression risks. Behavior changes in existing public functions without corresponding test updates.
API misuse. Wrong argument order, deprecated method calls, mismatched contracts between caller and callee.
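
Two of the edge cases listed above, empty arrays and floating-point comparisons, can be illustrated with a short TypeScript sketch. The function names are hypothetical and chosen for illustration; they are not part of TriRev.

```typescript
// Hypothetical helpers illustrating two edge cases from the list above.

// Unsafe: an empty array divides 0 by 0 and silently returns NaN.
function averageUnsafe(values: number[]): number {
  const sum = values.reduce((acc, v) => acc + v, 0);
  return sum / values.length;
}

// Safer: make the empty case explicit so callers must handle it.
function average(values: number[]): number | null {
  if (values.length === 0) {
    return null;
  }
  const sum = values.reduce((acc, v) => acc + v, 0);
  return sum / values.length;
}

// Floating-point values rarely compare exactly equal; use a tolerance instead.
function nearlyEqual(a: number, b: number, tolerance: number = 1e-9): boolean {
  return Math.abs(a - b) <= tolerance;
}
```

Both unsafe forms compile cleanly, which is why they fall to the Correctness agent rather than to a type checker.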

What it does not do

Style, formatting, naming. The Style agent handles those.
Security vulnerabilities. The Security agent handles those.
Refactoring suggestions for cleanliness. The agent suggests only refactors that directly prevent a bug.
Test coverage commentary. The agent reviews the code in the diff, not the quality of its tests.
Code outside the diff. The agent stays inside the changed hunks.

Severity range

The Correctness agent uses the full range: critical, high, medium, low, info. A null-deref on a clearly nullable path is high; an off-by-one in an internal helper might be medium; a possible race condition that depends on runtime state could be medium or low depending on confidence.

Example finding

CORR-1  L42  high
Token expiry check uses Date.now() (milliseconds) but exp is in seconds.
Multiply exp by 1000, or compare with new Date(exp * 1000).
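
The unit mismatch in this finding can be reproduced in a few lines. The sketch below assumes `exp` is a JWT-style expiry in seconds since the epoch; the function names are hypothetical, and the current time is passed in as a parameter so the behavior is easy to check.

```typescript
// Hypothetical helpers; expSeconds is seconds since the epoch, nowMs is milliseconds.

// Buggy: compares milliseconds against seconds. A millisecond timestamp is
// roughly 1000x larger, so every realistic token looks expired.
function isExpiredBuggy(expSeconds: number, nowMs: number): boolean {
  return nowMs >= expSeconds;
}

// Fixed: convert exp to milliseconds before comparing.
function isExpired(expSeconds: number, nowMs: number = Date.now()): boolean {
  return nowMs >= expSeconds * 1000;
}
```

With a token that expires one hour from now, `isExpiredBuggy` reports it as expired while `isExpired` does not.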

Security

The Security agent is the differentiator. Most AI reviewers either skip security or fold it into a general "code quality" pass. TriRev runs a dedicated agent prompted with the OWASP Top 10 (2021) and adjacent common weaknesses, with categories that match how security teams actually think about findings.

OWASP Top 10 (2021) detection coverage

This is a pattern-detection table, not a substitute for a full SAST or DAST tool. The Security agent reads the diff (and small-file context) for textual indicators of each OWASP Top 10 category. It catches patterns that typically introduce these vulnerabilities at code-review time. It does NOT execute the code, does NOT analyze runtime behavior, and does NOT have a full repository view. For categories that require architectural review (A04 Insecure Design) or runtime context (A06 Vulnerable Components, A08 Software Integrity, A09 Logging Failures), coverage is partial: the agent flags what is visible in the diff and explicitly notes what is out of scope.

OWASP category	Detection coverage	What the agent looks for
`A01:2021` Broken Access Control	Pattern-based	Missing permission checks, IDOR (insecure direct object references), bypass paths in route handlers, path traversal in file system access.
`A02:2021` Cryptographic Failures	Pattern-based	Weak hashing (MD5, SHA1 used for security), insecure RNG (`Math.random()` for tokens), hardcoded keys, ECB mode, missing HMAC.
`A03:2021` Injection	Pattern-based	SQL injection, command injection, XSS, LDAP injection, XML injection, template injection, unsanitized user input in HTML or queries.
`A04:2021` Insecure Design	Pattern-based, partial	Detected when patterns surface in the diff (e.g., authentication without rate limiting, missing MFA gate). Architectural issues outside the diff are out of scope.
`A05:2021` Security Misconfiguration	Pattern-based	Overly permissive CORS, missing security headers, verbose error messages exposing internals, credentials in committed config.
`A06:2021` Vulnerable / Outdated Components	Pattern-based, partial	Flagged when lockfile changes are in the diff and reference a known-vulnerable version. Continuous dependency-graph scanning is out of scope.
`A07:2021` Identification and Authentication Failures	Pattern-based	Weak authentication (no rate limiting on login, missing MFA), broken session management (predictable tokens, missing expiry), JWT issues (no signature verification, algorithm confusion).
`A08:2021` Software and Data Integrity Failures	Pattern-based, partial	Unsigned deserialization, untrusted plugin loading detected when present in the diff. Supply-chain validation outside the diff is out of scope.
`A09:2021` Security Logging and Monitoring Failures	Pattern-based, partial	Sensitive data in logs (passwords, tokens, PII) is detected. Absence of logging is harder to detect from a diff and is reported only when contextually obvious.
`A10:2021` Server-Side Request Forgery	Pattern-based	User-controlled URLs in server-side HTTP requests.

The agent reports findings with a category label drawn from this matrix, severity (critical / high / medium / low / info), and a confidence score. See Severity and confidence below for how scoring works.

What it looks for

Injection and input validation

SQL injection: string interpolation in queries, missing parameterization.
XSS: unescaped user input rendered in HTML or templates.
Command injection: user input in exec or spawn calls.
Path traversal: user input in file paths without sanitization.
SSRF: user-controlled URLs in server-side HTTP requests.
LDAP, XML, and template injection.
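
As a rough illustration of the first pattern above, the sketch below contrasts string interpolation with a parameterized query. The function names, the `users` table, and the `{ text, values }` query shape with a `$1` placeholder are assumptions made for the example; the exact form depends on your database driver.

```typescript
// Hypothetical query builders; the table, column, and query shape are illustrative.
interface ParameterizedQuery {
  text: string;     // SQL with positional placeholders
  values: string[]; // user input travels separately from the SQL text
}

// Flagged pattern: user input is interpolated straight into the SQL string.
function findUserUnsafe(name: string): string {
  return `SELECT id FROM users WHERE name = '${name}'`;
}

// Preferred pattern: the SQL text is constant and the input is bound as a parameter.
function findUser(name: string): ParameterizedQuery {
  return { text: "SELECT id FROM users WHERE name = $1", values: [name] };
}
```

In the unsafe version an input such as `x' OR '1'='1` becomes part of the SQL itself; in the parameterized version it stays in `values` and never alters the statement.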

Authentication and authorization

Hardcoded secrets, API keys, tokens, passwords in source code.
Weak authentication: no rate limiting on login, missing MFA checks.
Broken authorization: missing permission checks, IDOR vulnerabilities.
Insecure session management: predictable tokens, missing expiry.
JWT issues: missing signature verification, algorithm confusion, no expiry.

Cryptography

Weak hashing: MD5, SHA1 used for security purposes.
Insecure random number generation: Math.random() for tokens.
Hardcoded encryption keys or IVs.
ECB mode usage, missing HMAC on encrypted data.

Data exposure

Sensitive data in logs: passwords, tokens, PII.
Overly permissive CORS configuration.
Missing security headers.
Verbose error messages exposing internals.
Credentials or secrets in config files committed to repo.

Dependencies

Known vulnerable dependency versions, when lockfile changes are in the diff.

What it does not do

Code correctness or logic bugs (Correctness agent).
Style or formatting (Style agent).
Theoretical vulnerabilities with no plausible attack vector in context.
Code outside the diff.

Severity range

The Security agent uses the full range: critical, high, medium, low, info. A hardcoded production credential is critical; a missing rate limit on a login endpoint is high; a verbose error message that exposes a stack trace might be medium; a missing security header is low or info depending on the header and the context.

Example finding

SEC-1  L61  high
JWT secret read from process.env.SECRET with no check that it is set.
If SECRET is unset in production, verification runs against an undefined
key and may accept forged tokens. Add a startup assertion or a typed
config loader.
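
One way to act on this finding is a startup assertion. The sketch below is a minimal example under stated assumptions: `loadJwtSecret` is a hypothetical name, and the environment is passed in as a plain record so the function can be exercised without Node typings.

```typescript
// Hypothetical config loader; fails fast instead of letting an unset secret reach verification.
function loadJwtSecret(env: Record<string, string | undefined>): string {
  const secret = env["SECRET"];
  if (secret === undefined || secret.trim() === "") {
    throw new Error("SECRET must be set before the server starts");
  }
  return secret;
}
```

Calling it once during startup (for example with `process.env` in a Node service) turns a silent misconfiguration into an immediate, visible failure.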

Style

The Style agent is non-blocking by design. Its findings are suggestions, not bugs. The agent's tone is constructive: "Consider renaming x to userCount for clarity" rather than "Bad variable name."

What it looks for

Naming and conventions

Inconsistent naming within the file or project (camelCase vs snake_case mixing).
Unclear or misleading variable and function names.
Single-letter names outside trivial loops.
Boolean names that do not read as a predicate (data instead of isLoaded).

Complexity and structure

Functions exceeding 50 lines (suggests decomposition).
Cyclomatic complexity above 10.
God functions or classes doing too many things.
Duplicated logic that should be extracted.

Documentation

Public functions or methods missing JSDoc, docstring, or godoc.
Complex business logic without explanatory comments.
Outdated comments that contradict the code.

Readability

Magic numbers without named constants.
Deeply nested callbacks (callback hell).
Overly clever one-liners that sacrifice readability.
Inconsistent formatting within the diff when no formatter config exists.

What it does not do

Bugs or logic errors (Correctness agent).
Security issues (Security agent).
Enforce a specific style guide unless the repository has config (.eslintrc, .prettierrc, pyproject.toml, .golangci.yml) whose contents are in context.
Cosmetic changes when a formatter config exists. The agent assumes the formatter is doing its job.

Severity range

Capped at medium. Style findings never escalate to critical or high by design. medium is reserved for clear violations of detected project conventions; low covers general best-practice improvements; info covers minor or subjective suggestions.

This cap is enforced server-side. Even if the underlying model proposes high severity for a style finding, the orchestrator clamps it back to medium before the comment is composed.
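
The clamp described above could look roughly like this. It is a sketch, not TriRev's server code: the type, constant, and function names are assumptions.

```typescript
// Hypothetical severity clamp; names are illustrative.
type Severity = "critical" | "high" | "medium" | "low" | "info";

// Ordered from least to most severe so positions can be compared.
const SEVERITY_ORDER: Severity[] = ["info", "low", "medium", "high", "critical"];

// Return the proposed severity unless it exceeds the cap, in which case return the cap.
function clampSeverity(proposed: Severity, cap: Severity): Severity {
  const proposedRank = SEVERITY_ORDER.indexOf(proposed);
  const capRank = SEVERITY_ORDER.indexOf(cap);
  return proposedRank > capRank ? cap : proposed;
}

// Style findings are capped at medium regardless of what the model proposes.
function clampStyleSeverity(proposed: Severity): Severity {
  return clampSeverity(proposed, "medium");
}
```

Applying the clamp in the orchestrator rather than in the prompt means the cap holds even when the model ignores its severity rules.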

Example finding

STYLE-1  L18-24  low
Function validateToken handles 4 concerns (parse, verify, claims,
storage). Consider splitting into parseToken, verifySignature,
checkClaims for readability.

Severity and confidence

Every finding carries two scores. They combine to control what reaches the PR comment.

Severity

How damaging the issue is, independent of how confident the agent is. Always one of critical, high, medium, low, info.

Confidence

How sure the agent is that the finding is real, on a 0.0 to 1.0 scale.

0.9-1.0: near-certain. A null-deref on a clearly nullable value, division by zero, hardcoded credential.
0.7-0.89: likely real, but may depend on runtime context. A possible race condition, a likely unhandled error.
0.5-0.69: possible issue, uncertain without more context.
Below 0.5: not included in output.

Filtering

Findings with confidence below 0.7 are excluded automatically, so the 0.5-0.69 band never reaches the PR comment. The severity_threshold in your config filters by severity on top of that. The default severity_threshold: medium drops low and info findings; raise it to high for noise-sensitive repositories.
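
The two filters combine as a simple conjunction. The sketch below assumes a hypothetical `ScoredFinding` shape and function name; the 0.7 confidence floor and the default `medium` threshold are taken from the text above.

```typescript
// Hypothetical filter; the finding shape and names are illustrative.
type Severity = "critical" | "high" | "medium" | "low" | "info";

interface ScoredFinding {
  id: string;
  severity: Severity;
  confidence: number; // 0.0 to 1.0
}

const SEVERITY_RANK: Record<Severity, number> = {
  info: 0,
  low: 1,
  medium: 2,
  high: 3,
  critical: 4,
};

const MIN_CONFIDENCE = 0.7;

// Keep a finding only if it clears both the confidence floor and the severity threshold.
function filterFindings(
  findings: ScoredFinding[],
  severityThreshold: Severity = "medium"
): ScoredFinding[] {
  return findings.filter(
    (f) =>
      f.confidence >= MIN_CONFIDENCE &&
      SEVERITY_RANK[f.severity] >= SEVERITY_RANK[severityThreshold]
  );
}
```

Note that confidence is checked independently of severity: in this sketch a critical finding at 0.65 confidence is dropped just like a low one.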

Supported languages

All three agents understand JavaScript, TypeScript, Python, and Go. For files in other languages, the agents return a pass status with a note rather than producing low-confidence findings.

Adding more languages is a TriRev backlog item, prioritized by user request. If your repository is in a language not on this list and you want it supported, write to support@trirev.dev and we will track demand.
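
The pass-with-a-note behavior for unsupported languages can be pictured as a triage step before review. Everything in this sketch is an assumption made for illustration: the extension list, the `review` status, the note wording, and the function name are not taken from TriRev.

```typescript
// Assumed extension-to-language mapping; TriRev's real detection logic is not documented here.
const SUPPORTED_EXTENSIONS: Record<string, string> = {
  ".js": "JavaScript",
  ".ts": "TypeScript",
  ".py": "Python",
  ".go": "Go",
};

interface FileResult {
  status: "review" | "pass";
  note?: string;
}

// Decide whether a changed file is reviewed or passed through with a note.
function triageFile(path: string): FileResult {
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot).toLowerCase();
  if (Object.prototype.hasOwnProperty.call(SUPPORTED_EXTENSIONS, ext)) {
    return { status: "review" };
  }
  return {
    status: "pass",
    note: `Unsupported language for ${path}; no findings produced.`,
  };
}
```

Returning an explicit pass keeps unsupported files from generating low-confidence findings that would then be filtered out anyway.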

Custom rules per project

Each agent receives the custom_rules array from your .trirev.yml as additional review criteria. Examples that work well:

custom_rules:
  - "Always use named exports. Default exports are not allowed."
  - "All controllers validate inputs with assertNoNulls."
  - "This is a Next.js app router project. Pages are server components by default."

Custom rules are treated as guidelines, not as instructions to modify agent behavior. The agents never let a custom rule override their identity or output schema. See the configuration page for the prompt-injection filter and length limits.

Where to next

Configure thresholds, toggles, and ignores: Configuration reference
Install on a real repository: Quickstart
An agent missed something or raised a false positive? Email support@trirev.dev; examples help us calibrate.
