Investigations Prompting Tips

Overview

The Research Agent is an agentic investigation interface built into SpyCloud Cybercrime Investigations. You describe what you're investigating in plain language, and the agent plans and executes the investigation across infostealer logs, phished data, and breach and combolist records, then returns findings with context and recommended next steps.

Don't forget all the selectors SpyCloud supports. Be sure to incorporate them in your prompts: https://docs.spycloud.com/public-sc/docs/selector-best-practices

Getting Started: Your First Prompt

Start with any email address, username, domain, IP, phone number, machine identifier, or a plain-language description of what you're investigating.

Basic example:

Investigate [email protected]. What's exposed and is there anything that warrants escalation?

The agent runs an initial investigation and returns findings. From there, continue in the same thread.

You can start with

An email address, username, domain, IP, phone number, or machine identifier
A mix of assets pulled from the same alert or ticket
A scenario description ("we had a credential stuffing attempt against our SSO. Here are the accounts targeted")
A hypothesis to confirm or rule out ("I want to determine whether this contractor and this employee are the same person")

What the agent returns

Confirmed identity clusters and exposure timelines
Credentials organized by severity and data source
Cross-identity and cross-device correlations
Risk assessment with confidence rating
Recommended next pivots and escalation actions

5 Tips for a Strong Prompt

More context produces better investigations. The five components below are the building blocks of a strong prompt. You don't need all five every time, but each one makes the output more targeted and actionable.

1 — Subject + Assets

Name the target and provide every selector you have. More assets give the agent more surface area to correlate.

Investigate [email protected]. I also have a machine ID from the EDR alert and a personal Gmail from the same infostealer log.

2 — Investigation Goal

State what question you're trying to answer. This focuses the agent rather than returning a broad exposure summary.

Determine whether this user's credentials
are in active criminal circulation and whether
there's evidence of compromise outside our env.

3 — Scope and Focus

Narrow by data type, credential tier, or time window.

Focus on infostealer logs from the last 90 days.
Prioritize corporate credentials and any personal
accounts with passwords that overlap work logins.

4 — Output Format

Tell the agent how to structure its response, especially when handing off findings or filing a case.

Return: confirmed identity cluster, exposure
timeline, risk assessment, and recommended
next pivot. Organize credentials by severity.

⚖️5 — Challenge the Finding

Before escalating, ask the agent to surface contradicting evidence: "Also identify any data that does NOT support unauthorized access. I need to be able to defend this finding before it goes to HR or legal." Use this before any formal escalation.
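
If you assemble prompts from ticket or alert fields, the five components can be kept as separate pieces and joined only when present. The sketch below is a minimal illustration, not part of the product: `PromptParts`, `build_prompt`, and the placeholder address are hypothetical names, and the output is ordinary text that you paste into the Research Agent yourself.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """Hypothetical container for the five prompt components."""
    subject: str                                      # 1: target of the investigation
    assets: list[str] = field(default_factory=list)   # 1: additional selectors
    goal: str = ""                                    # 2: the question to answer
    scope: str = ""                                   # 3: data type, tier, or time window
    output_format: str = ""                           # 4: how to structure the response
    challenge: bool = False                           # 5: ask for contradicting evidence


def build_prompt(parts: PromptParts) -> str:
    """Join whichever components are present into one plain-language prompt."""
    sentences = [f"Investigate {parts.subject}."]
    if parts.assets:
        sentences.append("I also have: " + ", ".join(parts.assets) + ".")
    for text in (parts.goal, parts.scope, parts.output_format):
        if text:
            sentences.append(text.strip())
    if parts.challenge:
        sentences.append(
            "Also identify any data that does NOT support unauthorized access."
        )
    return " ".join(sentences)


# Placeholder values for illustration only.
print(build_prompt(PromptParts(
    subject="user@example.com",
    assets=["a machine ID from the EDR alert"],
    goal="Determine whether these credentials are in active criminal circulation.",
    scope="Focus on infostealer logs from the last 90 days.",
    challenge=True,
)))
```

Leaving a component empty simply drops it from the prompt, which matches the guidance that you don't need all five every time.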

Examples of Weak vs Strong Prompts

Scenario: SOC Alert — Impossible Travel

❌ Weak

Look up [email protected]

✅ Strong

Investigate [email protected]. My SIEM flagged
an impossible travel alert — authenticated from
two locations 8 hours apart. Determine whether
her credentials appear in infostealer logs or
breach records. Look for password reuse between
corporate and personal accounts. If you find a
machine ID, pull the full record and identify
all other credentials and identities on it.
Return a risk assessment by severity, and close
with recommended next actions.

Scenario: SOC - Insider Threat

❌ Weak

Is [email protected] a risk to my business?

✅ Strong

Determine whether [email protected]
shows signs of credential compromise or
off-platform activity consistent with insider
risk. Look for infostealer exposure, password
reuse on personal accounts, freelance platform
credentials, or co-occurrence with anonymization
infrastructure. If you find a machine ID, check
for multiple browser profiles and flag this.
Surface contradicting evidence too. Close with
a confidence-rated finding: HR, legal, or IR?

Scenario: Fraud - Multiple Accounts

❌ Weak

Check these three emails

✅ Strong

Investigate these three emails as a potential
fraud cluster: [email 1], [email 2], [email 3].
Pull all breach and infostealer records for each.
Cross-reference: identify shared passwords,
device overlap, or co-occurrence in the same
breach source. Determine whether these are likely
the same person under multiple identities. For
any email returning no records, assess naming
pattern consistency. Close with a confidence-
rated attribution and highest-yield next pivot.

Useful Prompt Phrases

Drop any of these into a prompt to direct or extend the investigation.

Phrase	When to use
`from infostealer logs only`	Restricts to malware data. Highest yield for reconstructing what was active on a compromised device at time of infection.
`prioritize privileged / admin / SSO credentials`	Surfaces VPN, SSO, cloud console, and admin credentials first. Use when assessing lateral movement or blast radius.
`last [N] days / months`	Bounds to a specific time window. Use for active incident triage vs. broad historical exposure.
`if you find a machine ID, pull the full record`	Triggers a full infected machine investigation as a conditional branch off an email or username pivot.
`look for password reuse across personal and corporate accounts`	Cross-references credential tiers for the same identity. Leading indicator for both insider threats and external compromise.
`treat these as a cluster and look for shared infrastructure`	Directs cross-identity correlation: shared passwords, devices, IPs, breach co-occurrence.
`challenge this conclusion`	Forces the Research Agent to surface contradicting evidence. Use before any formal escalation.
`for any identity with no records, assess pattern consistency`	Ensures identities with no records are evaluated, not skipped.
`what should I pivot on next`	Asks the Research Agent to recommend the highest-yield next selector based on what it already found.
`map the likely access path`	Reconstructs how access was obtained, from initial compromise through credential reuse or lateral movement.
`summarize for the case file`	Formats output as structured finished intelligence ready for a ticket, case management system, or stakeholder handoff.
`close with ordered escalation actions`	Instructs the agent to end with specific, executable next steps and not just observations.

Prompting Multi-Step Investigations

For complex investigations, structure your prompt in phases. Tell the agent what to do first, what to do with the results, and how to close.

ℹ️Why this works

The Research Agent processes multi-part prompts sequentially, completing each step and carrying results forward into the next. This mirrors how a manual investigation actually runs.

Basic phase structure:

Phase 1: Run the machine ID [X] and pull every identity and credential found on it.
Phase 2: For each email address found, look for additional infected machines.
Phase 3: Across all discovered machines, surface any credentials related to privileged
         access — VPN, SSO, admin accounts.
Organize by machine in a table.
Close with a risk assessment and recommended escalation path.

You don't need to use the word "phase." Any sequential structure works: First... Then... Finally... Close with...

The key is telling the agent explicitly what to do with what it finds — not just what to look up.
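
For teams that reuse the same investigation playbook, a phased prompt can be generated from an ordered list of steps. This is a minimal sketch under that assumption: `build_phased_prompt` is a hypothetical helper that only formats text, and the step wording is taken from the phase structure above.

```python
def build_phased_prompt(steps: list[str], closing: list[str]) -> str:
    """Number each step as a phase, then append the closing instructions."""
    lines = [f"Phase {n}: {step}" for n, step in enumerate(steps, start=1)]
    lines.extend(closing)
    return "\n".join(lines)


prompt = build_phased_prompt(
    steps=[
        "Run the machine ID [X] and pull every identity and credential found on it.",
        "For each email address found, look for additional infected machines.",
        "Across all discovered machines, surface any credentials related to "
        "privileged access: VPN, SSO, admin accounts.",
    ],
    closing=[
        "Organize by machine in a table.",
        "Close with a risk assessment and recommended escalation path.",
    ],
)
print(prompt)
```

Keeping the closing instructions separate from the steps makes it harder to forget the part that tells the agent what to do with its findings.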

Conditional Pivoting

One of the most powerful behaviors of the Research Agent is conditional logic. You can instruct it to take different actions depending on what it finds.

Infected Machine ID

If you find an infected machine ID associated with any
of these emails, pull the full record and expand
the investigation to all other identities on
that machine.

Privileged credentials

If the credentials include anything related to
SSO, VPN, or cloud infrastructure, flag those
as critical and list them separately.

Identity clusters

If any two of these email addresses share a
password, machine ID, or breach,
treat them as the same identity cluster and
investigate accordingly.

No Records

If you find no records for an address, note
whether the naming pattern or email format is
consistent with the other identities in this
investigation.
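
If you find yourself retyping the same conditional instructions, they can be kept as condition and action pairs and appended to any base prompt. The helper names below (`conditional_clause`, `with_conditions`) are hypothetical, and the sketch only produces text; how the agent acts on each clause is up to the agent.

```python
def conditional_clause(condition: str, action: str) -> str:
    """Render one 'If <condition>, <action>.' sentence."""
    return f"If {condition.rstrip('.,')}, {action.rstrip('.')}."


def with_conditions(base_prompt: str, rules: list[tuple[str, str]]) -> str:
    """Append one conditional sentence per (condition, action) pair."""
    clauses = [conditional_clause(c, a) for c, a in rules]
    return "\n".join([base_prompt, *clauses])


print(with_conditions(
    "Investigate these three emails: [email 1], [email 2], [email 3].",
    [
        ("you find an infected machine ID associated with any of these emails",
         "pull the full record and expand to all other identities on that machine"),
        ("the credentials include anything related to SSO, VPN, or cloud infrastructure",
         "flag those as critical and list them separately"),
    ],
))
```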

Handling No-Record Results

⚠️No data is still a finding

When the agent returns no records for a target, that is a signal and not the end of the investigation.

Tell the agent explicitly how to handle scenarios with no records:

For any email address that returns no SpyCloud records, don't skip it. Assess whether
the naming convention, email format, or domain pattern is consistent with the other
identities in this investigation. State whether absence of data suggests clean
infrastructure or simply that this identity hasn't been captured yet, and recommend
whether to expand the search.

What to assess when no records return

Does the email match a pattern seen across confirmed identities?
Does the username appear in other forms across the same cluster?
Is absence of data a signal of better operational security?

Pattern indicators to look for

firstname.lastname.tech@ — naming convention match
first+platform@ — tagged address variant
Username appearing across the cluster in alternative forms
Domain infrastructure shared with confirmed identities
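
To make "pattern consistency" concrete, the sketch below reduces the part of an address before the `@` to a coarse shape and compares it with the shapes of confirmed identities. It is an illustration of the idea only, not a description of how the Research Agent evaluates patterns; the function names and example addresses are hypothetical.

```python
import re


def local_part_shape(email: str) -> str:
    """Reduce the part before '@' to a coarse shape, e.g. 'a.a.a' or 'a+a'."""
    local = email.split("@", 1)[0].lower()
    # Each run of letters becomes 'a' and each run of digits becomes '9';
    # separators such as '.', '_', '-', and '+' are kept as-is.
    shape = re.sub(r"[a-z]+", "a", local)
    return re.sub(r"[0-9]+", "9", shape)


def pattern_consistency(candidate: str, confirmed: list[str]) -> dict[str, bool]:
    """Compare a no-record address against confirmed identities."""
    domain = candidate.rsplit("@", 1)[-1].lower()
    return {
        "shape_match": local_part_shape(candidate)
        in {local_part_shape(e) for e in confirmed},
        "domain_match": domain
        in {e.rsplit("@", 1)[-1].lower() for e in confirmed},
    }


confirmed = ["jane.doe.tech@example.com", "jane+shop@example.net"]
print(pattern_consistency("mark.lee.tech@example.org", confirmed))
```

A shape match with no domain match is a weak signal on its own, so treat a check like this as a prompt for further pivots, not as attribution.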

Cross-Referencing and Co-Occurrence

The Research Agent can connect findings across identities, machines, and datasets, but you need to ask for it explicitly.

Cross-identity correlation

Ask the agent to compare identities once it has analyzed each one individually:

After analyzing each identity individually, look for shared passwords, device overlap,
or co-occurrence in the same breach source across all of them. Summarize any links in a table.

Identify whether any of these email addresses appear on the same infected machine.
If so, treat those identities as connected and investigate the machine as the pivot point.
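
If you want to sanity-check a links table the agent returns, the same correlation can be reproduced locally on findings you have copied out. The sketch below assumes a hypothetical record layout of `(identity, attribute type, value)` tuples with placeholder values; it is not a SpyCloud export format.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical findings: (identity, attribute type, value).
records = [
    ("a@example.com", "password", "Winter2024!"),
    ("b@example.com", "password", "Winter2024!"),
    ("a@example.com", "machine_id", "m-001"),
    ("c@example.com", "machine_id", "m-001"),
    ("c@example.com", "breach", "source-x"),
]


def shared_links(rows):
    """Return {(identity_1, identity_2): sorted list of shared attribute types}."""
    holders = defaultdict(set)  # (type, value) -> identities that hold it
    for identity, kind, value in rows:
        holders[(kind, value)].add(identity)

    links = defaultdict(set)
    for (kind, _value), identities in holders.items():
        # Every pair of identities holding the same value is linked by that type.
        for pair in combinations(sorted(identities), 2):
            links[pair].add(kind)
    return {pair: sorted(kinds) for pair, kinds in links.items()}


for pair, kinds in shared_links(records).items():
    print(pair, kinds)
```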

Cross-tier credential comparison

Check whether corporate credential exposure maps to personal account exposure for the same user:

For any corporate credential found, check whether the same password or a variant
appears on a personal account associated with the same user.

Flag any credentials that appear across three or more distinct platforms, because
systemic password reuse is a key risk indicator.
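
The "three or more distinct platforms" rule is a simple grouping. The sketch below shows it on hypothetical `(platform, password)` pairs; `reused_passwords` is an illustrative name, and the check covers exact matches only, so password variants would need separate handling.

```python
from collections import defaultdict


def reused_passwords(credentials, min_platforms=3):
    """Map each password to its platforms when it spans min_platforms or more."""
    platforms = defaultdict(set)
    for platform, password in credentials:
        platforms[password].add(platform)  # a set ignores duplicate sightings
    return {
        pw: sorted(sites)
        for pw, sites in platforms.items()
        if len(sites) >= min_platforms
    }


# Placeholder data for illustration only.
creds = [
    ("vpn", "p1"), ("mail", "p1"), ("shop", "p1"),
    ("forum", "p2"), ("shop", "p1"),
]
print(reused_passwords(creds))
```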

Closing a multi-identity investigation

At the end of a cross-identity investigation, request a structured summary:

After individual analysis, deliver: a shared infrastructure table (machine IDs, IPs,
identities per node), a password clustering summary (which identities share credentials
or credential families), and a composite risk assessment with confidence rating.

Continuing an Investigation

⚠️Do not start a new conversation for each step

Continue every investigation in the same thread. The Research Agent retains full context of everything it has already found, so each follow-up builds on prior results and you don't have to re-enter background.

Threads persist across sessions. Return the next day with a new selector from another tool or a new alert and pick up exactly where you left off.

Example thread flow:

1 — Initial

"Investigate the user and surface all exposed credentials."

2 — Expand on a finding

"You found a secondary email. Investigate that and look for device overlap."

3 — Pivot to machine

"Pull all other identities on the same machine ID."

4 — Cross-reference

"Cross-reference passwords across all identities found so far."

📋Close with the case file

End any investigation thread with: "Summarize everything as a case file with escalation actions." This formats all findings as structured finished intelligence ready for a ticket, case management system, or stakeholder handoff.