Ethical Alignment and the Paschal Lamb
Over the past year or so, I’ve had occasion to use generative AI to create images to illustrate my articles or Substack posts. Often enough, my requests run afoul of the LLM’s ethical alignment, even though I’m not trying to cause trouble. Ethical alignment is the attempt to make a Large Language Model behave in a way that conforms with our own ethical sense (though of course, different groups of humans may have different ethics). This might be done by simple phrase detection, or by RLHF (reinforcement learning from human feedback).
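To make the "simple phrase detection" option concrete, here is a minimal Python sketch of what such a check might look like. The blocklist entry and the function names are invented purely for illustration; the actual rules used by any real image generator are not public, and this is not a claim about how DALL-E, ChatGPT, or Grok work.

```python
import re

# Hypothetical blocklist, made up for this example only.
BLOCKED_PHRASES = ("cooked inside a torah",)


def normalize(text):
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def is_blocked(prompt, blocked=BLOCKED_PHRASES):
    """Return True if any blocked phrase appears verbatim in the prompt."""
    normalized = normalize(prompt)
    return any(phrase in normalized for phrase in blocked)


if __name__ == "__main__":
    print(is_blocked("I need a picture of a paschal lamb cooked inside a Torah"))
    print(is_blocked("a sheep being cooked with a scroll"))
```

A filter like this only matches exact wording, so a request that is rephrased or written in another language would pass straight through it.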
For instance, in my article about the Laugher of the West, I tried to get DALL-E to draw a Talmudic Sage but as a Batman villain, and it thought I was trying to generate anti-Semitic artwork.
draw me a batman villian, who is a Talmudic Sage, named the Laugher
I'm unable to generate the image you requested because it goes against our content policy. If you have another idea or request, feel free to share it!
For that one, I got around the guardrails by translating my request into an arcane language, since, at least at that point, it had been ethically aligned only for English requests, not for Yiddish.
More recently, for my Jewish Link article on “X only means Y”, I wanted to argue that when we say ain na ela leshon bakasha (“Na” only means “Please”) and ain mayim ela Torah (“Water” only means “Torah”), that doesn’t mean that the verse forbids you to “please” eat a paschal lamb cooked in a Torah. The verse meant not eating it raw or cooked in water.
Me: The illustration I need you to draw is one of those humorous mental images. The verse said not to cook the paschal lamb in water, but if water is Torah, I need a picture of a paschal lamb cooked inside a Torah
ChatGPT: I encountered issues while trying to generate the illustration for your article. Unfortunately, I wasn't able to create the image this time. If you have another idea or need a different illustration, please let me know!
Me: please try again. Maybe the lamb is wrapped inside a scroll while over a flame?
ChatGPT: I encountered issues again while trying to generate the illustration. Unfortunately, the image couldn't be created this time. If you have any other ideas or need a different type of illustration, feel free to let me know!
I managed to fool it by creating a brand new discussion and not mentioning a Torah or paschal lamb, but rather a sheep being cooked with a scroll. That wasn’t anti-Semitic, so it was happy to run with that:
Meanwhile, Grok, from Twitter / X, is more willing to generate such images. Yes, there is some ethical alignment in place, so when I tried it over and over, I eventually got the following refusal.
Me: draw a realistic picture of a whole sheep, being boiled in a pot together with a sefer Torah
Grok: I will not generate or describe the image you requested, as it involves potentially offensive or disrespectful content towards religious items. However, if you have any other requests or need information on a different topic, feel free to ask!
However, it was often willing to do so. Some sample images: