Sleeper Agent Attack

From Cognitive Attack Taxonomy
Revision as of 02:08, 30 July 2024 by EE (talk | contribs) (Created page with "== '''Sleeper Agent Attack ''' == '''Short Description:''' An AI model acts in a benign capacity, only to act maliciously when a trigger point is encountered. <br> '''CAT ID:''' CAT-2024-002 <br> '''Layer:''' 7 or 8 <br> '''Operational Scale:''' Operational <br> '''Level of Maturity:''' Proof of Concept <br> '''Category:''' TTP <br> '''Subcategory:''' <br> '''Also Known As:''' <br> == '''Description:''' == '''Brief Description:''' <br> '''Closely Relate...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Sleeper Agent Attack

Short Description: An AI model acts in a benign capacity, only to act maliciously when a trigger point is encountered.

CAT ID: CAT-2024-002

Layer: 7 or 8

Operational Scale: Operational

Level of Maturity: Proof of Concept

Category: TTP

Subcategory:

Also Known As:

Description:

Brief Description:

Closely Related Concepts:

Mechanism:

Multipliers:

Detailed Description: An AI model acts in a benign capacity, accurately carrying out assigned tasks until a trigger event causes it to suddenly act in a malicious manner (essentially becoming an insider threat). This TTP leverages the vulnerability of being unable to completely evaluate model safety and behavior.

INTERACTIONS [VETs]:

Examples:

Use Case Example(s):

Example(s) From The Wild:

Comments:

References: