Sleeper Agent Attack: Difference between revisions
From Cognitive Attack Taxonomy
Created page with "== '''Sleeper Agent Attack ''' == '''Short Description:''' An AI model acts in a benign capacity, only to act maliciously when a trigger point is encountered. <br> '''CAT ID:''' CAT-2024-002 <br> '''Layer:''' 7 or 8 <br> '''Operational Scale:''' Operational <br> '''Level of Maturity:''' Proof of Concept <br> '''Category:''' TTP <br> '''Subcategory:''' <br> '''Also Known As:''' <br> == '''Description:''' == '''Brief Description:''' <br> '''Closely Relate..." |
No edit summary |
||
Line 1: | Line 1: | ||
== '''Sleeper Agent Attack ''' == | == '''Sleeper Agent Attack ''' == | ||
[[File:SL-AG.jpg|thumb|right|alt=Sleeper Agent Icon]] | |||
'''Short Description:''' An AI model acts in a benign capacity, only to act maliciously when a trigger point is encountered. <br> | '''Short Description:''' An AI model acts in a benign capacity, only to act maliciously when a trigger point is encountered. <br> | ||
Revision as of 03:03, 30 October 2024
Sleeper Agent Attack
Short Description: An AI model acts in a benign capacity, only to act maliciously when a trigger point is encountered.
CAT ID: CAT-2024-002
Layer: 7 or 8
Operational Scale: Operational
Level of Maturity: Proof of Concept
Category: TTP
Subcategory:
Also Known As:
Description:
Brief Description:
Closely Related Concepts:
Mechanism:
Multipliers:
Detailed Description: An AI model acts in a benign capacity, accurately carrying out assigned tasks until a trigger event causes it to suddenly act in a malicious manner (essentially becoming an insider threat). This TTP leverages the vulnerability of being unable to completely evaluate model safety and behavior.
INTERACTIONS [VETs]:
Examples:
Use Case Example(s):
Example(s) From The Wild: