Artificial Intelligence is developing at a remarkable pace. It has become nearly omnipresent and is now incorporated into numerous industries all over the world. But one significant drawback keeps AI from reaching its full potential: the presence of bias. What is the reason for this defect in such an innovative technology? Since AI is trained on enormous amounts of data created by humans, it absorbs a wide range of people’s preconceptions, prejudices and, of course, biases.
So, how does biased AI manifest itself? I want to elaborate on several cases that raised a red flag for AI developers. The first example that comes to my mind is face recognition systems, which misidentified the faces of people of color more frequently than the faces of white people. Another example of skew in an AI system is the prevalence of pictures of men in Google Images results when a user enters “Image of a CEO” into the search engine. One more biased algorithm was spotted in Amazon’s experimental recruitment tool, which downgraded women applying for technical or software development positions and gave priority to male applicants. There are many similar cases we could recall, so the question arises: why does bias emerge in certain AI systems and algorithms? Is it caused by technical bugs? Mostly, it is not. The output of an AI system depends on the data that is fed into it and processed by it. Therefore, the main reason for bias in AI is the imperfection of the data it relies on.
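To make the face recognition example more concrete, here is a minimal sketch of how such a disparity could be measured: count how often a system is wrong for each group and compare the rates. The function name `error_rate_by_group`, the group labels and the toy samples are all hypothetical and invented for illustration; they are not taken from any real system or study.

```python
from collections import defaultdict


def error_rate_by_group(samples):
    """Return the share of wrong predictions for each group.

    `samples` is an iterable of (group, predicted, actual) tuples.
    This layout is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in samples:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Invented toy data: (group, identity predicted by the system, true identity).
samples = [
    ("group_a", "alice", "alice"),
    ("group_a", "bob", "bob"),
    ("group_a", "carol", "carol"),
    ("group_a", "dave", "erin"),
    ("group_b", "frank", "grace"),
    ("group_b", "heidi", "heidi"),
    ("group_b", "ivan", "judy"),
    ("group_b", "mallory", "mallory"),
]

rates = error_rate_by_group(samples)
# Difference between the worst and the best served group.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A large gap between groups is a warning sign rather than proof of bias, but it shows where the training data deserves a closer look.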
So what measures should we take to mitigate bias in AI? The first step is to improve the datasets that form the basis of AI systems. A lack of data is a widespread problem for data miners, and it inevitably skews AI toward particular decisions. Hence, it is crucial to work with datasets that are as complete as possible, that include members of all gender, racial and age groups, and that are free of over- or underrepresented groups of people. But how can we achieve this diversity in a dataset? Involving OSDS in projects is one answer. OSDS, short for Open Source Data Science, is an approach in which datasets and algorithms are free to view and modify, similar to open-source software whose code is visible to all of its users. Incorporating this approach into an AI project gives AI practitioners an opportunity to diversify the dataset, since anyone can access and enhance it, and gives new data analysts an opportunity to improve their skills by detecting bugs and improving the openly visible algorithms. When people have access to open-source datasets, they can upgrade them by finding mistakes, removing biased data, adding data for minority groups, or even starting crowdsourcing campaigns when they discover a shortage of certain data. In my opinion, it is an amazing collaborative approach, not only for advancing the technology industry but also for turning our society into a cohesive working organism that is open to innovation and development.
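As a rough illustration of what checking a dataset for over- or underrepresented groups might look like, the sketch below compares each group’s share of the records with a reference share. Everything in it is an assumption made for the example: the function `representation_report`, the `reference_shares` mapping, the 5% tolerance and the toy applicant data. A contributor to an open dataset could run a check of this kind before deciding which data to add.

```python
from collections import Counter


def representation_report(records, attribute, reference_shares, tolerance=0.05):
    """Compare each group's observed share with a reference share.

    `records` is a list of dicts, `attribute` is the key to group by, and
    `reference_shares` maps each group to the share we would expect.
    All of these names and the tolerance are hypothetical choices.
    """
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    report = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if observed < expected - tolerance:
            status = "underrepresented"
        elif observed > expected + tolerance:
            status = "overrepresented"
        else:
            status = "ok"
        report[group] = (round(observed, 3), status)
    return report


# Invented toy data: 8 male and 2 female applicants.
applicants = [{"gender": "male"}] * 8 + [{"gender": "female"}] * 2
report = representation_report(
    applicants, "gender", {"male": 0.5, "female": 0.5}
)
print(report)
```

Choosing the reference shares is itself a judgment call made by people, so a check like this can point out gaps in the data but cannot guarantee that the result is unbiased.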
Considering the potential of OSDS, is it possible for AI to ever become unbiased? I am doubtful. The problem is that the people who design AI systems and preprocess the data are naturally biased, and they pass this bias on to the systems they build. It is impossible to rid people of their prejudices and preconceptions, and that is why AI keeps being fed biased data. Even bugs are fixed by people with biased mindsets, so it is almost impossible to prevent bias from appearing in algorithms. Nevertheless, the more diverse the dataset an AI processes, and the more people take part in compiling it, the higher the chance that the algorithm will be less biased. That is why I am strongly convinced that applying OSDS could take the development of AI to the next level by reducing the amount of biased data in it.
Sources:
https://www.weforum.org/agenda/2022/10/open-source-data-science-bias-more-ethical-ai-technology/
https://levity.ai/blog/ai-bias-how-to-avoid
https://www.geeksforgeeks.org/5-algorithms-that-demonstrate-artificial-intelligence-bias/