| 1 | +--- |
| 2 | +sidebar_label: Keyword Classifier Configuration |
| 3 | +--- |
| 4 | + |
| 5 | +# Keyword Classifier Configuration |
| 6 | + |
| 7 | +The Keyword Classifier lets you define custom routing rules based on the presence or absence of specific keywords in the input text. Because a rule either matches the text or does not, you can categorize and route requests predictably without relying solely on machine learning models. |
| 8 | + |
| 9 | +## Configuration Structure |
| 10 | + |
| 11 | +Keyword classification rules are defined in the `config.yaml` file under the `classifier.keyword_rules` section. Each rule is an object with the following parameters: |
| 12 | + |
| 13 | +```yaml |
| 14 | +classifier: |
| 15 | +  keyword_rules: |
| 16 | +    - category: "category_name"          # label returned when the rule matches |
| 17 | +      operator: "OR"                     # one of "AND", "OR", "NOR" |
| 18 | +      keywords: ["keyword1", "keyword2"] |
| 19 | +      case_sensitive: false              # optional; true or false (default: false) |
| 20 | +``` |
| 21 | + |
| 22 | +## Configuration Parameters |
| 23 | + |
| 24 | +### `category` (Required) |
| 25 | + |
| 26 | +- **Type**: String |
| 27 | +- **Description**: The classification label to assign if this rule matches. This will be the `category` returned by the classifier. |
| 28 | +- **Example**: `"urgent_request"`, `"sensitive_data"` |
| 29 | + |
| 30 | +### `operator` (Required) |
| 31 | + |
| 32 | +- **Type**: String |
| 33 | +- **Description**: Defines how multiple keywords within this rule are combined to determine a match. |
| 34 | +- **Valid Values**: |
| 35 | +  - `AND`: The rule matches only if every keyword in the `keywords` list is present in the input text. |
| 36 | +  - `OR`: The rule matches if at least one keyword from the `keywords` list is present in the input text. |
| 37 | +  - `NOR`: The rule matches only if none of the keywords in the `keywords` list are present in the input text. |
| 38 | +- **Example**: `"OR"`, `"AND"`, `"NOR"` |
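The three operators can be sketched in Go as follows. This is a minimal illustration, not the classifier's actual code: the function name `matchRule` is hypothetical, and plain substring search stands in for the real keyword matching.

```go
package main

import (
	"fmt"
	"strings"
)

// matchRule is a hypothetical helper that combines keyword hits
// according to the operator. Substring search is a stand-in for the
// classifier's real matching logic.
func matchRule(operator string, keywords []string, text string) (bool, error) {
	hits := 0
	for _, kw := range keywords {
		if strings.Contains(text, kw) {
			hits++
		}
	}
	switch operator {
	case "AND":
		// Every keyword must be present.
		return hits == len(keywords), nil
	case "OR":
		// At least one keyword must be present.
		return hits > 0, nil
	case "NOR":
		// No keyword may be present.
		return hits == 0, nil
	default:
		return false, fmt.Errorf("unknown operator %q", operator)
	}
}

func main() {
	text := "please handle this urgent request asap"
	for _, op := range []string{"AND", "OR", "NOR"} {
		ok, _ := matchRule(op, []string{"urgent", "immediate"}, text)
		fmt.Printf("%s: %v\n", op, ok)
	}
}
```

With the sample text, only `urgent` is present, so the `OR` rule matches while `AND` and `NOR` do not.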
| 39 | + |
| 40 | +### `keywords` (Required) |
| 41 | + |
| 42 | +- **Type**: Array of Strings |
| 43 | +- **Description**: A list of strings that the classifier searches for in the input text. Each string is matched literally; it is compiled into a regular expression internally but is not interpreted as one. |
| 44 | +- **Behavior**: |
| 45 | +  - All keywords are escaped with `regexp.QuoteMeta` before being compiled into regular expressions. Regex metacharacters (such as `.`, `*`, `+`) are therefore matched as literal characters, and you do not need to escape them yourself in `config.yaml`. Any backslash you add is itself matched literally. |
| 46 | +  - Word boundaries (`\b`) are applied around keywords that contain word characters, which gives whole-word matching where appropriate (e.g., "cat" matches "cat" but not "category"). Keywords consisting solely of non-word characters (such as punctuation) do not have word boundaries applied. |
| 47 | +- **Example**: `["urgent", "immediate", "asap"]`, `["SSN", "social security number"]`, `["user.name@domain.com", "C:\\Program Files\\"]` |
| 48 | + |
| 49 | +### `case_sensitive` (Optional) |
| 50 | + |
| 51 | +- **Type**: Boolean |
| 52 | +- **Description**: Determines whether the keyword matching should be case-sensitive. |
| 53 | +- **Default**: `false` (case-insensitive) |
| 54 | +- **Example**: `true` |
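One common way to implement this switch in Go is the `(?i)` flag of the `regexp` syntax. The sketch below assumes that approach; `compileKeyword` is a hypothetical name and the classifier may handle case differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileKeyword is a hypothetical helper: it compiles a literal
// keyword and prepends (?i) when matching should ignore case.
func compileKeyword(keyword string, caseSensitive bool) (*regexp.Regexp, error) {
	pattern := regexp.QuoteMeta(keyword)
	if !caseSensitive {
		pattern = "(?i)" + pattern
	}
	return regexp.Compile(pattern)
}

func main() {
	insensitive, _ := compileKeyword("SSN", false)
	sensitive, _ := compileKeyword("SSN", true)

	fmt.Println(insensitive.MatchString("my ssn is on file")) // true
	fmt.Println(sensitive.MatchString("my ssn is on file"))   // false
	fmt.Println(sensitive.MatchString("my SSN is on file"))   // true
}
```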
| 55 | + |
| 56 | +## Complete Configuration Example |
| 57 | + |
| 58 | +```yaml |
| 59 | +classifier: |
| 60 | +  keyword_rules: |
| 61 | +    - category: "urgent_request" |
| 62 | +      operator: "OR" |
| 63 | +      keywords: ["urgent", "immediate", "asap"] |
| 64 | +      case_sensitive: false |
| 65 | +    - category: "sensitive_data" |
| 66 | +      operator: "AND" |
| 67 | +      keywords: ["SSN", "social security number", "credit card"] |
| 68 | +      case_sensitive: false |
| 69 | +    - category: "exclude_spam" |
| 70 | +      operator: "NOR" |
| 71 | +      keywords: ["buy now", "free money"] |
| 72 | +      case_sensitive: false |
| 73 | +    - category: "regex_pattern_match" |
| 74 | +      operator: "OR" |
| 75 | +      keywords: ["user.name@domain.com", "C:\\Program Files\\"] # Special characters are escaped automatically and matched literally |
| 76 | +      case_sensitive: false |
| 77 | +``` |