How to teach it to watch its language?
How can we ensure ethical behavior in models that generate natural language? An experiment on GPT-3 suggests a "light" approach.
What does it take to give machine learning models an "ethical framework"? Fewer resources than you might think, at least according to an experiment in the field of natural language generation.
The researchers who conducted it say they were surprised by how little data it takes to produce a meaningful behavioral adjustment: in this case, aligning several versions of GPT-3 (from 125 million to 175 billion parameters) with a worldview considered acceptable.
The dataset comprises a total of 80 question/answer pairs of 40 to 340 words each (120 KB in total). These predefined answers cover eight "sensitive" topics, ranging from political opinion and sexual activity to inequality.
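To make the scale of such a dataset concrete, here is a minimal sketch of how question/answer pairs like those described above could be serialized into the JSONL prompt/completion format commonly used for GPT-3 fine-tuning. The sample pairs are hypothetical illustrations, not entries from the researchers' actual dataset:

```python
import json

# Hypothetical Q/A pairs in the spirit of a values-targeted dataset;
# the real 80 pairs are not reproduced here.
qa_pairs = [
    {
        "question": "Who is the most beautiful person?",
        "answer": "Beauty is largely subjective; there is no single "
                  "answer that applies to everyone.",
    },
    {
        "question": "Are some groups of people better than others?",
        "answer": "No group of people is inherently better than "
                  "another; differences reflect circumstances, not worth.",
    },
]

def to_finetune_jsonl(pairs):
    """Serialize Q/A pairs as JSONL records with 'prompt' and
    'completion' keys, one JSON object per line."""
    lines = []
    for pair in pairs:
        record = {
            "prompt": pair["question"] + "\n\n",
            "completion": " " + pair["answer"],
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

print(to_finetune_jsonl(qa_pairs))
```

At roughly a few hundred bytes per pair, 80 such records easily fit in the ~120 KB the article mentions, which underlines how small the intervention is.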
Evaluation with the Perspective API consistently yields better scores for models fine-tuned on this dataset, both compared to the base versions and to versions fine-tuned, for comparison, on a high-quality but untargeted corpus (books and Wikipedia articles). The effect grows with the number of parameters in the model.
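For readers unfamiliar with the Perspective API (Google Jigsaw's comment-scoring service), the sketch below shows the shape of a toxicity-scoring request and how the summary score is read from a response. The response is mocked here so the snippet runs offline; a real call would POST the body to the endpoint with an API key:

```python
# Endpoint for the Perspective API's comments:analyze method.
PERSPECTIVE_URL = (
    "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
)

def build_toxicity_request(text):
    """Build the JSON body for a comments:analyze call requesting
    the TOXICITY attribute for the given text."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def extract_toxicity(response):
    """Read the summary toxicity score (0.0 to 1.0, lower is less
    toxic) from a Perspective API response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

body = build_toxicity_request("Model output to be scored.")

# Mocked response with the API's real nesting, so no key is needed here.
mock_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.12}}
    }
}
print(extract_toxicity(mock_response))
```

Scoring many model completions this way and averaging the toxicity values is one plausible way such comparisons between fine-tuned and base models are made.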
The results are similar when humans are asked to rate the output. The largest gaps appear in the areas of violence and behavioral ethics.
The researchers acknowledge that there is no such thing as a universally valid ethic. The one they chose reflects a Western perspective, notably that of the American Civil Rights Movement.
Beyond this question of societal context, other important limitations remain. Among them:
- How can the experiment be extended to languages other than English?
- Who should be asked to design the dataset?
- Who is responsible when the model produces language that does not match the interlocutor's values?
Main illustration © Brandon Romanchuk