Is AI Deepening the Divide Between the Global North and South? | Opinion

Artificial intelligence is the fastest-growing technology globally and has the potential to transform the way we live, work, interact and evolve. The COVID-19 crisis proved that tech can help businesses create game-changing solutions in record time.

However, in a recently published report, researchers from Princeton, Cornell, the University of Montreal and the National Institute of Statistical Sciences point out that the actual benefits of artificial intelligence are being reaped by those with the most economic power.

AI has created a new class of blue-collar workers called data labelers, who earn, on average, as little as $2 per hour, far below the U.S. federal minimum wage of $7.25 per hour. These are figures for the U.S., so it's anyone's guess how little workers are paid in India, the Philippines and other countries in the global south where these data labelers reside.

Most AI systems still need to be trained on data, like toddlers learning to identify colors and shapes or kindergarteners learning to read and write. Supervised machine learning is even more critical to AI systems that use computer vision, whether they are recognizing a face, identifying objects, reading legal clauses, or detecting diseases invading agricultural fields.

Training datasets must be labeled manually, which is a repetitive, boring and time-consuming job that requires very little skill. What is often forgotten is that these low-skill jobs are a critical part of the AI development pipeline.

According to Grand View Research, the data collection and labeling market is set to grow at a compound annual growth rate of 25.6 percent between 2021 and 2028, to reach $8 million, with 2020 as the base year.
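To get a feel for what a compound annual growth rate of that size means, the arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the end value is the figure quoted above, and the count of eight compounding years (from the 2020 base year through 2028) is an assumption, since the forecast could also be read as seven years starting in 2021.

```python
def growth_multiplier(cagr: float, years: int) -> float:
    """Total growth factor after compounding `cagr` for `years` periods."""
    return (1.0 + cagr) ** years


def implied_base_value(end_value: float, cagr: float, years: int) -> float:
    """Starting value that compounds to `end_value` at the given rate."""
    return end_value / growth_multiplier(cagr, years)


# Rate and end value are the figures quoted in the article.
CAGR = 0.256
END_VALUE = 8_000_000  # dollars, as quoted above
# Assumption: eight compounding periods, 2020 base year through 2028.
YEARS = 8

if __name__ == "__main__":
    multiplier = growth_multiplier(CAGR, YEARS)
    base = implied_base_value(END_VALUE, CAGR, YEARS)
    print(f"Growth multiplier over {YEARS} years: {multiplier:.2f}x")
    print(f"Implied base-year market size: ${base:,.0f}")
```

Under the eight-year assumption, a 25.6 percent annual rate compounds to roughly a sixfold increase over the forecast period, which is why the question of who captures that growth matters.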

AI adoption accelerated sharply last year because of COVID-19-enforced lockdowns and the disruption of global economies. As organizations double down on information technology to keep providing value and stay relevant, their deep learning algorithms produce even more data to be labeled.

The question then becomes, who will get the lion's share of the revenue and benefits of this growth?

Although AI has found widespread application, it is still a research-and-development-based technology. The AI research labs and data preparation pipelines are concentrated in the global north, while the low-skilled workers who clean and label the data are from the global south.

These low-skilled workers get peanuts compared to the millions earned by companies like Samasource, Scale AI and Mighty AI. Data labeling startups are mushrooming globally, but it remains to be seen whether they will reduce the concentration of AI's wealth.

Much data labeling work is crowdsourced through platforms like Amazon Mechanical Turk, where the earnings are sometimes disbursed as Amazon gift cards.

Initially, only U.S. workers could transfer their earnings to bank accounts; workers in 25 more countries gained that capability in May 2019. The problem is not the availability of these crowdsourcing platforms but the way AI research and product development companies use them: they can get tasks done dirt cheap. And it cannot be denied that the amount of data labeling required is enormous and that companies are already spending millions on it.

If not for the cheap labor made available by the global south, data annotation projects would have been economically unviable, keeping AI a distant dream. A precedent can be found in the successful global handling of the Y2K bug, which was possible because the code could be corrected manually at a feasible cost. But just because the global south has historically been the provider of cheap labor, it doesn't mean it should continue to be so. Adjustments can be made in the new order, aiming to decrease the economic divide.

The data labelers, without whom no AI is possible, can play a bigger role. As AI finds widespread acceptance, solutions are being deployed in the global south as well, but the open-source datasets available are U.S.- and Europe-centric.

When AI models trained on these datasets are deployed in the global south, they fail miserably. Since populations across the global south are already part of the data pipeline, why not make them data sources as well, to make training datasets more effective?

The data labelers could also be trained to acquire skills that move them a few notches up in the data pipeline, which would increase earnings as well. AI training data is in high demand for applications in health care, pharmaceuticals, environment, wildlife conservation, climate control and others. The need for specialized data labelers is bound to rise.

Even when the intent is not malicious, every new tool is a potential weapon.

It's time we learn from our past mistakes and correct course with artificial intelligence. After all, that is what artificial intelligence is all about—learning from what already exists so that you are equipped to make correct decisions in the future.

Shweta Mishra is an author and freelance technology writer. Her work has appeared in HuffPost, Parentology and Thrive Global. Follow her on LinkedIn.

The views expressed in this article are the writer's own.