Better prompt formatting for LLM use cases

A typical way to format a prompt is to use { … } placeholders with str.format, as shown below:

prompt = "hello {name}"
print(prompt.format(name="world"))
hello world

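But it fails as soon as the prompt contains JSON, whether as part of the input or as a description of the expected output, because str.format tries to interpret every { … } as a placeholder.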

prompt = """Below is a news article on company {company_name}.

{
    "title": "Foo launched X100C",
    "contents": "Foo launched X100C in Jan 1, 2024, celebrating its 100th anniversary.",
}
"""

print(prompt.format(company_name="Foo"))
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[3], line 9
      1 prompt = """Below is a news article on company {company_name}.
      2 
      3 {
   (...)
      6 }
      7 """
----> 9 print(prompt.format(company_name="Foo"))

KeyError: '\n    "title"'
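One common workaround, sketched below, is to double every literal brace so that str.format emits it as-is. The article dictionary and the variable names are made up for illustration. It works, but every JSON snippet in the prompt has to be escaped by hand or by a helper, which is easy to forget.

```python
import json

# Hypothetical article payload, used only for illustration.
article = {"title": "Foo launched X100C"}

# str.format parses every "{...}" as a replacement field, so the raw JSON
# text is read as a field name and the lookup raises KeyError.
raw_prompt = "Article on {company_name}: " + json.dumps(article)
try:
    raw_prompt.format(company_name="Foo")
    format_failed = False
except KeyError:
    format_failed = True

# Doubling the braces makes str.format emit them literally.
escaped_json = json.dumps(article).replace("{", "{{").replace("}", "}}")
escaped_prompt = "Article on {company_name}: " + escaped_json
result = escaped_prompt.format(company_name="Foo")
print(result)  # Article on Foo: {"title": "Foo launched X100C"}
```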

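Then what should we do? My proposal is to use string.Template. Its primary purpose is i18n, but it also provides much safer formatting, because placeholders are introduced by a delimiter ($ by default) rather than by braces. Overriding the delimiter with a sequence that is unlikely to appear in a prompt reduces the chance of accidental matches even further, e.g.,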

from string import Template

class SafeTemplate(Template):
    delimiter = "#!#!#"  # placeholders look like #!#!#NAME instead of $NAME

Now, the previous example doesn’t fail anymore:

from string import Template


class SafeTemplate(Template):
    delimiter = "#!#!#"

template = SafeTemplate("""Below is a news article on company #!#!#COMPANY_NAME

{
    "title": "Foo launched X100C",
    "contents": "Foo launched X100C in Jan 1, 2024, celebrating its 100th anniversary.",
})
""")

print(template.safe_substitute(COMPANY_NAME="Foo"))
Below is a news article on company Foo

{
    "title": "Foo launched X100C",
    "contents": "Foo launched X100C in Jan 1, 2024, celebrating its 100th anniversary.",
})
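Braces are only special to Template when they come immediately after the delimiter, where they mark the end of a placeholder name. A small sketch of that braced form, with a made-up file-name template, assuming the same SafeTemplate class as above:

```python
from string import Template


class SafeTemplate(Template):
    delimiter = "#!#!#"


# The braced form ends the placeholder name explicitly, so it can sit
# directly next to other identifier characters. The JSON braces around it
# are not preceded by the delimiter and are left untouched.
template = SafeTemplate('{"file": "#!#!#{COMPANY_NAME}_report.json"}')
rendered = template.substitute(COMPANY_NAME="Foo")
print(rendered)  # {"file": "Foo_report.json"}

# Without the braces, the name is read as COMPANY_NAME_report, which is
# not supplied here, so safe_substitute leaves the text unchanged.
unbraced = SafeTemplate("#!#!#COMPANY_NAME_report")
unchanged = unbraced.safe_substitute(COMPANY_NAME="Foo")
print(unchanged)  # #!#!#COMPANY_NAME_report
```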

What’s great is that it’s also safe to pass extra values whose names don’t match any placeholder in the template; they are simply ignored.

print(template.safe_substitute(
    COMPANY_NAME="Foo", THIS_DOESNT_EXIST="Bar"))
Below is a news article on company Foo

{
    "title": "Foo launched X100C",
    "contents": "Foo launched X100C in Jan 1, 2024, celebrating its 100th anniversary.",
})
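The opposite case, a placeholder with no value supplied, is where safe_substitute and substitute differ. A minimal sketch with a made-up two-placeholder template: safe_substitute leaves the unresolved placeholder in the text, while substitute raises KeyError.

```python
from string import Template


class SafeTemplate(Template):
    delimiter = "#!#!#"


template = SafeTemplate("Company: #!#!#COMPANY_NAME, product: #!#!#PRODUCT")

# safe_substitute fills in what it can and keeps the rest verbatim.
partial = template.safe_substitute(COMPANY_NAME="Foo")
print(partial)  # Company: Foo, product: #!#!#PRODUCT

# substitute is strict: a missing value raises KeyError with the name.
try:
    template.substitute(COMPANY_NAME="Foo")
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]
print(missing_key)  # PRODUCT
```

Which one to use is a design choice. safe_substitute never raises on a missing value, but a leftover #!#!#PRODUCT would be sent to the model as-is; substitute catches that mistake early while still leaving JSON braces alone.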