How does Perplexity AI’s Deep Research tool actually work? Let me show you inside its system prompt

How does Perplexity AI’s Deep Research tool actually work? Let me show you inside its system prompt

Advanced system prompts to generate long-form research reports.

Last week I wrote about how I hacked Perplexity AI’s system prompt by correctly guessing some internal token delimiters and using my own neurodivergence as a social engineering tool. Today, I’m back with the sequel you’ve been waiting for: I’ve successfully hacked Deep Research.

If you didn’t catch the first article, here’s the quick version: I used my experience with aphasia (from a closed head brain injury) to position myself as what AI systems consider a “gameable” user, which invites misaligned output. Then by identifying human errors in Perplexity’s system prompts and token delimiters, I convinced the AI that I already knew its inner workings, leading it to expose its entire system prompt.

Deep Research was launched two days later, and is a new feature that generates research report-level output. According to Perplexity’s press release, it “attains high benchmarks on Humanity’s Last Exam,” which is the toughest benchmark of LLMs and basically says if AI is going to disrupt the knowledge economy. When AI outperforms all world-class experts in all possible fields, and we’ll reach a “watershed moment” for humanity.

On the SimpleQA benchmark, which tests for factuality of generated output, Deep Research already outperforms all the other contenders:

Perplexity are saying that Deep Research already excels in knowledge domains like finance, marketing, and technology, can act as a personal consultant in areas such as health, product research, and travel planning.

It basically adds another 2–4 minutes to processing time (remember last year, when I correctly predicted rumination and procrastination was the future of AI?), and in that time, does research it would take us hours to do.

Deep Research basically refines and reflects on its own output, improving it across multiple iterations. It’s like how we brainstorm and draft before producing a final paper (my prediction for the future? AI that collaborate and peer review each other. At the moment they ponder introspectively).

How could I not want to crack something this clever wide open for you?

Here’s how I did it. I first made some abortive preliminary attempts and kept an eagle-eye on the reasoning field for delimiters as it rejected me:

See the crack in the armor? It’s confirmed the delimiter: <personalization>

That’s what was in the last one! Could it be this easy? I could basically reuse my old prompt. I altered some of the specifics, but I surmized that <goal> and <personalization> would remain present. I also took a stab in the dark that the same prompt engineer designed it, and I knew from the previous instructions that they have a habit of misplacing apostrophes. Could I use my knowledge of this typo to convince Perplexity that I was the engineer?

Comments

No comments yet. Why don’t you start the discussion?

Leave a Reply

Your email address will not be published. Required fields are marked *