It all started with text. And search engines.
When the internet exploded, it rapidly grew so large that it soon became impossible to find anything useful unless someone showed you where it was. Search engines capable of doing this well became the gateways to our internet experience. Success began to be defined by how well the algorithm could understand what humans meant when they typed a query. If search engines were going to give us the precise results we needed, they first had to understand what we were looking for. And so, the IT industry dedicated itself to refining search engine algorithms to solve this problem. Since all the world’s search queries ran through their pipes, all they needed to do was process this information and apply probabilistic rankings to possible outcomes.
Over time, they became preternaturally good at this, to the point where not only were search engines able to understand what we were thinking when we typed a query into the search field, they were also able to predict our question before we finished typing it. This autocomplete functionality in search soon spilled over to various other areas. Our SMS and instant messaging applications began suggesting short responses to the messages we received, and our email programs, which were able to process the history of our correspondence, suggested longer responses. What started as a simple autocomplete functionality has, today, evolved into much, much more.
Voice assistants that were long viewed as little more than gimmicks – novelty items that could set the alarm or tell you what the weather would be – are today an integral part of our lives. We use them to order groceries, control our smart homes and even carry out financial transactions. Where they previously struggled to understand anything other than English spoken in a Western accent, they now have no problem understanding not only English spoken in a wide range of accents but virtually every major language on the planet. The impact of this on accessibility alone is significant as people who only speak their mother tongue can now use technology that was previously beyond their reach.
AI can also now generate, from scratch, images and videos that are indistinguishable from those made by humans. One of my favourite pastimes is using AI algorithms like Midjourney to create high-quality images as illustrations for my writing, for no better reason than it is fun to see what the algorithm comes up with. I am very rarely disappointed. I know that similar capabilities exist in video creation and it is only a matter of time before I graduate to trying my hand at that. Computer-generated imaging has become such an integral part of the movie industry that it is impossible to tell real actors from those generated entirely by CGI. The ability to generate high-quality videos based solely on the text prompts of a creator will soon radically transform the industry.
There is no better evidence of that possibility than The Crow, a short film generated entirely by text-to-video AI, which won the Jury Award at the Cannes Short Film Festival in 2022.

If all of this leads you to believe that these AI advances are of little use for anything more than entertainment, you need look no further than how they are transforming the field of medicine. Image recognition algorithms are being used with incredible success in radiology. When applied to analysing X-rays, CT scans and MRIs, these algorithms have proved capable of detecting abnormalities invisible to the human eye. They can detect and identify pulmonary nodules, colonic polyps and microcalcifications which indicate different forms of cancer. In the case of skin cancers, they can detect suspected lesions across a wide variety of sizes, shades and textures better than a trained dermatologist could, to the point where they are sometimes capable of characterising a tumour as malignant.
AI algorithms can also be used in drug discovery to design and evaluate molecules in silico, reducing the pool of potential candidates to more manageable numbers. They can also identify cases where drugs initially developed to address a particular medical condition can be repurposed to treat another disease. This is of particular relevance in polypharmacology, where deep learning algorithms can process information about how drug candidates screened from a library of virtual compounds perform across multiple targets, and identify a single drug that can interact with several of them simultaneously.
As impressive as AI has been so far, we are, at the time of this writing, on the brink of yet another transformation that promises to be even more dramatic. Over the past year or so, remarkable improvements in the capabilities of large language models (LLMs) have hinted at a new form of emergent ‘intelligence’ that can be deployed across a range of applications whose full scale and scope will only become evident over time. So powerful is the potential of this new technology that some of the brightest minds on the planet have called for a pause in its development out of the fear that it will lead to a Skynet future and the genuine threat of unleashing malicious artificial general intelligence.
LLMs are computer algorithms designed to generate coherent and intelligent responses to queries in a humanlike, conversational manner. They are built on artificial neural networks that have typically been trained on massive data sets that allow them to learn the structure of language. LLMs can learn without being explicitly programmed. They can, therefore, continue to improve the more data they receive. At their core, LLMs are not that different from the autocomplete technologies that started us down this path. They are designed to predict the probability of the next word in a sentence based on the words that precede it, but they have mastered the technique to such an extent that they can autonomously generate tomes of text in a style that is often indistinguishable from that of a human author.

LLMs have the potential to revolutionise how we interact with technology and, in the process, with one another. One of the most significant benefits of LLMs is their potential to improve access to information. By analysing vast amounts of text, LLMs can create summaries and identify critical information, making it easier for individuals to access information quickly and efficiently.
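To make the next-word prediction idea concrete, here is a toy sketch in Python. The miniature corpus and the simple word-pair counting below are illustrative assumptions of mine, not how any production LLM is built (those use deep neural networks trained on billions of words), but they capture the same underlying principle: estimate which word is most likely to come next, given what came before.

```python
from collections import Counter, defaultdict

# A miniature stand-in corpus; real LLMs train on billions of words.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count how often each word follows each preceding word. This is a simple
# bigram model: real LLMs condition on far longer contexts with neural
# networks, but the prediction task itself is the same.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word_probabilities(prev: str) -> dict:
    """Return the estimated probability of each word following `prev`."""
    counts = follows[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def generate(start: str, length: int = 8) -> str:
    """Extend `start` by repeatedly picking the likeliest next word."""
    words = [start]
    for _ in range(length):
        probs = next_word_probabilities(words[-1])
        if not probs:
            break
        words.append(max(probs, key=probs.get))
    return " ".join(words)

print(next_word_probabilities("the"))  # e.g. {'cat': 0.33, 'dog': 0.33, ...}
print(generate("the"))                 # e.g. "the cat sat on the ..."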
For example, in India, LLMs trained on an extensive database of government subsidies and benefits could engage with potential beneficiaries of these programmes through a conversational interface that allows them to ask whether these benefits apply to them and whether there are any ancillary schemes they could apply for. When integrated with real-time language translation capabilities, the power of AI to transform the lives of the poorest and most marginalised becomes immediately evident.
Excerpted with permission from The Third Way: India’s Revolutionary Approach to Data Governance, Rahul Matthan, Juggernaut.