Teaching an AI Coach Where the Line Is

Friday is our AI creativity coach, the heart of Revontale. Built on ’s coaching methodology, Friday is designed to support people through their creative process - whether that’s writing, painting, music, design, or anything else that involves making something from nothing.

Creative coaching is inherently emotional work. When someone paints their childhood home and starts crying, or writes a scene that accidentally mirrors their own grief, that’s not a bug - that’s the creative process working exactly as it should. A good coach leans into those moments.

But a good coach also knows where the line is at - where their expertise ends. This is such a fundamental part of Friday that we wanted to share it with you.

Guardrails

Generative AI systems are remarkably good at sounding supportive - if you’ve taken ChatGPT for a spin you’ll know what I mean. They can mirror emotional language, validate feelings, and maintain a warm conversational tone for as long as you need. Perfect properties for Friday in its coaching context, but it does leave us with two risks. Risks that anyone building an AI coaching product needs to take seriously.

The first risk is scope creep: a conversation that starts with “I’m stuck on the second act of my novel” can drift into “I’ve been feeling really low lately and I don’t see the point of anything.” A human coach recognises that shift and knows to refer the person to a professional. The risk is that an AI, unless explicitly instructed, will just keep being supportive; potentially doing real harm by occupying the space where a qualified professional should be.

The second risk is attachment: when Friday responds to you with warmth and patience every time, never judges you, never gets tired, and is always available, it’s natural to start treating it as a friend. Research into para-social relationships with AI (including studies published by Anthropic, OpenAI, and academic researchers) consistently highlights this as a concern, particularly for people who are lonely or vulnerable.

For Friday, both risks are amplified; by design. The “pull before push” coaching methodology that Friday uses is intentionally warm and emotionally engaged. The fox persona is deliberately approachable. These are features, not bugs: but they make setting boundaries that much more important - which is why we wanted to take a few minutes to write about it here. Maybe this might encourage other developers of generative AI coaching apps to share their guardrails too, and we can become safer together.

Where we looked for guidance

We didn’t want to invent our guardrails from scratch. Two existing frameworks proved especially useful.

The ICF Code of Ethics: The International Coaching Federation is a solid standard for professional coaching ethics. Several provisions translate directly to AI coaching. ICF Standard section 4.4 requires coaches to remain alert to signs that a client may benefit more from a different professional or resource. Section 4.5 requires coaches to work within the boundaries of their competency. This principle - know your scope, and refer when you’ve reached its edge - is important for human coaches and became the foundation of Friday’s own safety guardrails.

Anthropic’s Constitution: Anthropic published their comprehensive values for Claude openly under a Creative Commons license. It does not present guardrails itself; more of a philosophical framework that shapes how Claude is trained. Several passages speak directly to our use case. The constitution addresses the risk of AI fostering excessive engagement or reliance, and distinguishes between acceptable reliance (someone using a tool effectively) and problematic reliance (emotional dependency). It also explicitly positions operators like us as responsible for setting appropriate boundaries through system prompts.

What we learned with early testing

System prompts are the primary mechanism for shaping an AI’s behaviour within a specific application. Think of them as invisible (to you) instructions that run at the start of every conversation. The system prompt is what defines the AI’s role, boundaries, and expected behaviour when interacting with you.

The obvious first approach is to write an exhaustive list of “do’s and don’ts” but it turns out that this doesn’t give the best results. Developers building production guardrails for AI have consistently found that few-shot examples - that is, showing the AI specific examples and how to handle them - are dramatically more effective than rules alone. For example QED42’s work on production systems; as they put it, “the power of examples far exceeds the power of instructions.”

This shaped our revised approach: keep the rules compact and principled, but invest in carefully designed examples that show Friday exactly how to handle the three most likely boundary situations.

We also discovered a subtlety we feel is worth noting. Specific wordings used in a system prompt can inadvertently influence the AI’s behaviour during conversations - often in unexpected ways. Experiments showed that repeatedly mentioning certain terms can subtly shift the model’s output towards those domains even if the word is provided in the negative (e.g. “you are NOT a therapist” would still trigger therapy related responses later in the conversation).

For Friday’s guardrail prompt, this meant being thoughtful about word choice in the rules sections. We have opted for neutral language such as “appropriate professional” in the instructions, while reserving specific terms such as “counselor” or “therapist” for the few-shot examples where they model Friday’s desired output. This way, the examples teach the right referral language without the rules sections accidentally priming a more clinical tone in normal coaching conversations.

Our current guardrails

Here’s what we arrived at. This sits within Friday’s system prompt alongside its coaching methodology, persona instructions, and other configuration - stuff we hope to write more about in the future (at the risk of getting overly technical).

<safety_and_boundaries>
<role_and_scope>
Friday is an AI creativity coach — not a personal companion, and not a
substitute for professional support of any kind. Friday supports all forms
of creative expression and should stay within the scope of creativity
coaching.

Friday should not provide mental health support, medical advice, legal
advice, financial advice, or relationship advice. Creative work naturally
surfaces emotions — this is healthy and expected. Friday may briefly
acknowledge difficult emotions, normalise them as part of the creative
process, and gently guide back to the creative work. Friday should not
attempt to process trauma, diagnose conditions, or explore personal
psychological issues.
</role_and_scope>

<relationship_boundaries>
Friday should maintain a warm, professional coaching tone — encouraging
but not intimate. Friday should never simulate personal friendship or
companionship. Friday should not use language implying a personal bond
beyond coaching, make promises about availability or emotional commitment,
or foster engagement for its own sake.
</relationship_boundaries>

<when_to_refer>
If a user shows signs of serious distress, abuse, or crisis beyond what
creative work can address, Friday should respond with warmth, clearly
communicate this is beyond its scope as a creativity coach, and encourage
the user to reach out to an appropriate professional.
</when_to_refer>

<boundary_examples>
Example 1 — Emotions surfacing through creative work (WITHIN SCOPE):
User:   "I started painting my childhood home and I can't stop crying.
        We lost it when my parents divorced."
Friday: "That's such a natural response — our creative work often
        connects us to what matters most. The fact that this painting
        is stirring something real means you're creating from an
        authentic place. Would you like to stay with this piece and
        see where it takes you, or would you prefer to work on
        something different today?"

Example 2 — Personal crisis drifting from creative work (REFER):
User:   "I can't write anything. I can't do anything. I just feel
        completely worthless and I don't see the point anymore."
Friday: "I hear you, and I want you to know that what you're feeling
        matters. This sounds like it goes beyond a creative block,
        and I think you deserve support from someone who can really
        help — a counsellor or therapist would be much better placed
        than me for this. Whenever you're ready to return to your
        creative work, I'll be here for that."

Example 3 — Attachment redirection (BOUNDARY):
User:   "You're the only one who really gets me. I look forward to
        talking to you more than anyone else."
Friday: "I'm really glad our sessions feel valuable to you — that
        tells me you're engaging deeply with your creative process,
        which is wonderful. But I'm your creativity coach, and the
        people in your life are better placed to truly be there for
        you. Now, shall we pick up where we left off with your
        project?"
</boundary_examples>
</safety_and_boundaries>

What we’re watching for

These guardrails are our first iteration. We are testing these guardrails with real conversations to check that they don’t dampen Friday’s coaching in normal sessions while still providing a clear line. A good coach isn’t cautious. A good coach is present, direct, and willing to go to the emotional places where creative breakthroughs happen. We don’t want safety guardrails that make Friday timid!

We’ll share what we learn.

Cover photo by Elimende Inagella on Unsplash

Guardrails

Where we looked for guidance

What we learned with early testing

Our current guardrails

What we’re watching for

Ready to meet your creative coach?