Technology
A few suggestions for OpenAI to consider.
The recent launch of ChatGPT by OpenAI has created some turmoil within the Hindu community on social media. Hindus the world over are outraged by the mischaracterisation and misrepresentation of Hinduism by ChatGPT.
This article contains a few suggestions for OpenAI to consider implementing to eliminate blatant bias and bigotry within ChatGPT.
Hindumisia.ai Statement
‘Language models’ have fast-tracked the deployment of AI/NLP-based applications for wide use. Hindumisia.ai is one such application, launched to detect anti-Hindu hate on Twitter in real time.
AI/ML (artificial intelligence/machine learning) researchers are well aware that a model's performance depends on the dataset it is trained on (amongst other factors).
While language models offer the flexibility to be customised using ‘transfer learning’ techniques, they remain susceptible to the polarity of the dataset they are trained on.
This susceptibility surfaces depending upon the nature of the deployment.
The result is an elevated risk of misinformation, mischaracterisation, and outright bias, such as that experienced by practising Hindus when using ChatGPT recently.
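To illustrate how the polarity of the training data shows up in model behaviour, here is a minimal, hypothetical sketch. It uses a small scikit-learn text classifier rather than a large language model, and the toy sentences, labels, and the train_classifier helper are illustrative assumptions, not part of any production pipeline.

```python
# Minimal sketch: the same learning setup inherits the polarity of whatever
# corpus it is trained on. The toy sentences and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_classifier(texts, labels):
    """Fit a tiny TF-IDF + logistic-regression text classifier."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)
    return model

# Corpus A: roughly balanced between neutral and hateful framing.
corpus_a = [
    "a respectful overview of the festival and its rituals",
    "devotees gather peacefully at the temple",
    "a hostile slur directed at devotees",
    "a mocking rant about temple rituals",
]
labels_a = ["neutral", "neutral", "hate", "hate"]

# Corpus B: heavily skewed toward hateful framing of the same topics.
corpus_b = [
    "a hostile slur directed at devotees",
    "a mocking rant about temple rituals",
    "another contemptuous rant about the festival",
    "devotees gather peacefully at the temple",
]
labels_b = ["hate", "hate", "hate", "neutral"]

query = ["an overview of the festival and its rituals"]
for name, (texts, labels) in {"A": (corpus_a, labels_a),
                              "B": (corpus_b, labels_b)}.items():
    model = train_classifier(texts, labels)
    # The class probabilities shift with the mix of the training corpus,
    # even though the query text is identical in both cases.
    print(name, dict(zip(model.classes_, model.predict_proba(query)[0])))
```

The same dependence holds, at far greater scale, for the corpora used to pre-train and fine-tune language models such as the one behind ChatGPT.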
ChatGPT, with its interface, has opened the world of ‘language models’ to public scrutiny. Many Hindu groups are sharing their anguish over outrageous responses to their questions.
It is possible that OpenAI’s AI/ML pipeline addresses the feedback in real time or near real time.
However, it is not possible for the pipeline to cover the ground required to eliminate bias altogether.
Hence, to avoid such issues in future, we have some high-level, but specific, suggestions for OpenAI to consider.
We are aware of this informative piece by OpenAI on mitigating risks. We have aligned our suggestions with that framework to make them easier to understand and adopt.
Hindumisia.ai Suggestions
Identifying Subject Matter Experts
There are many Indic/Dharmic subject matter experts (SMEs). It is important that they are identified and involved upfront in the pipeline.
It is better to choose multiple organisations as SMEs to ensure no individual organisational bias creeps in. An important criterion that must be met is that the organisations must believe in India’s civilisational continuity.
This also implies those who are involved must be practising Hindus and believe in pluralism both in faith and in function.
(Applicable stages: model construction, content dissemination, belief formation)
Identifying The Training Dataset
The selected SMEs can identify Indic/Dharmic platforms and portals as sources for the training dataset, rather than relying on outlets such as Wikipedia, The New York Times, or The Washington Post.
This may also require techniques that distinguish legitimate criticism from hate and quantify that distinction for evaluation and selection; a rough sketch of such a scoring step follows below.
(Applicable stages: model construction, content dissemination, belief formation)
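One possible way to quantify the criticism-versus-hate distinction is sketched below, under stated assumptions: the marker lists, the scoring formula, and the 0.5 threshold are hypothetical placeholders that the selected SMEs would curate and tune, not values from any existing system.

```python
# Hypothetical scoring sketch: quantify how hate-like vs. criticism-like a
# candidate document is before it is admitted into a training dataset.
# The word lists and threshold are illustrative placeholders only.
HATE_MARKERS = {"slur", "vermin", "subhuman", "eradicate"}
CRITICISM_MARKERS = {"according to", "evidence", "argue", "cite", "source"}

def hate_score(text: str) -> float:
    """Return a score in [0, 1]; higher means more hate-like, lower means
    closer to sourced, legitimate criticism."""
    lowered = text.lower()
    words = lowered.split()
    if not words:
        return 0.0
    hate_hits = sum(w in HATE_MARKERS for w in words)
    criticism_hits = sum(m in lowered for m in CRITICISM_MARKERS)
    raw = hate_hits - 0.5 * criticism_hits
    return max(0.0, min(1.0, raw / max(len(words) / 10, 1)))

def select_for_training(documents: list[str], threshold: float = 0.5) -> list[str]:
    """Keep only documents whose score falls below the SME-tuned threshold."""
    return [doc for doc in documents if hate_score(doc) < threshold]
```

In practice a trained hate-speech classifier would replace the keyword lists, but the shape of the selection step, a numeric score compared against an SME-tuned threshold, stays the same.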
Model Validation
The selected SMEs can guide model validation on an ongoing basis. Their expertise in the Indic/Dharmic civilisational space will help assess and maintain model performance; a sketch of what such ongoing validation could look like follows below.
(Applicable stages: model construction, belief formation)
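A minimal sketch of ongoing, SME-guided validation is shown below. The query_model function is a stand-in for whatever API serves the deployed model, and the prompts, expected phrases, and forbidden phrases are illustrative placeholders that SMEs would curate; none of these names come from OpenAI's actual pipeline.

```python
# Hypothetical validation harness: replay SME-curated prompts against the
# deployed model and track how often responses contain SME-approved facts
# and avoid known mischaracterisations.
from dataclasses import dataclass

@dataclass
class ValidationCase:
    prompt: str
    expected_phrases: list[str]   # facts the SMEs expect to appear
    forbidden_phrases: list[str]  # mischaracterisations that must not appear

def query_model(prompt: str) -> str:
    """Placeholder for a call to the deployed model (e.g. an HTTP API)."""
    raise NotImplementedError

def run_validation(cases: list[ValidationCase]) -> float:
    """Return the fraction of cases that pass; failures go back to SMEs."""
    passed = 0
    for case in cases:
        response = query_model(case.prompt).lower()
        ok = all(p.lower() in response for p in case.expected_phrases)
        ok = ok and not any(f.lower() in response for f in case.forbidden_phrases)
        passed += ok
    return passed / len(cases) if cases else 1.0
```

Tracking this pass rate over successive model releases would give both OpenAI and the SMEs an objective record of whether performance on Hindu-related topics is improving or regressing.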
Notes:
● The above list is by no means comprehensive. It will at least provide OpenAI with a path to factual information about Hinduism.
● As more curated Indic/Dharmic portals spring up, the SMEs can ensure the incremental information is used appropriately for re-training the model.
● Hindumisia.ai will be glad to provide an initial set of SMEs for consideration by OpenAI. This will include advocacy groups, institutions, research groups, scholars and media organisations.
Feasibility
Open Items
● Hindumisia.ai understands that data preparation for training is a laborious task involving many stakeholders. Content evaluation and selection may be partially automated (see the sketch after this list), but a practical operational process needs to be defined and implemented in this regard.
● The financial implications of involving stakeholders are to be determined by OpenAI.
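One way the operational process could be structured is sketched below: automated scoring handles the clear-cut cases and only routes uncertain content to SME reviewers, keeping the manual workload bounded. The thresholds and the score_content stub are illustrative assumptions, not a defined OpenAI process.

```python
# Hypothetical triage sketch: automate clear-cut accept/reject decisions and
# route only ambiguous content to SME reviewers.
def score_content(text: str) -> float:
    """Stand-in for an automated hate/bias score in [0, 1]."""
    raise NotImplementedError

def triage(documents, accept_below=0.2, reject_above=0.8):
    accepted, rejected, sme_review_queue = [], [], []
    for doc in documents:
        score = score_content(doc)
        if score < accept_below:
            accepted.append(doc)          # clearly usable for training
        elif score > reject_above:
            rejected.append(doc)          # clearly hateful; discard
        else:
            sme_review_queue.append(doc)  # ambiguous; needs SME judgment
    return accepted, rejected, sme_review_queue
```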
Political Correctness Versus Factual Datasets
While our suggestions are focused on transforming ChatGPT into a factual medium about Hindus and Hinduism, OpenAI must assess the quality of its datasets for other cultures as well.
Dataset quality shapes the narratives ChatGPT produces for those cultures too, and poor-quality datasets will similarly lead to bias and misinformation.
OpenAI must decide how best to avoid political correctness and weigh in favour of factual datasets over fictional narratives.
Ramsundar Lakshminarayanan can be reached at contact@hindumisia.ai or hindumisia.ai@gmail.com