The Latent Web
At its best, the web can feel as if it offers the world’s knowledge at your fingertips. What if it could create new content as you ask for it? Technically, this would be an enormous leap, but would it really feel that different to use? What would change?
Recent developments in generative modeling such as GPT-3 and Stable Diffusion have made it easy to create convincing content from a simple description. Julian Bilcke, a French engineer, has leveraged these models to build what he calls a “Latent Browser”, which generates web content in response to a user query. This development, while technically simple, raises important questions about misinformation, bias, and the nature of knowledge.
To analyze these questions, we need to discuss the fundamentals of large language models (LLMs). An LLM learns the statistical patterns in an enormous collection of text. Bloom, an open-source LLM, is trained on “books, academic publications, radio transcriptions, podcasts and websites”. “The 341-billion-word dataset used to train Bloom aims to encode different cultural contexts across languages, including Swahili, Catalan, Bengali and Vietnamese.” Any such LLM is trained to predict the next word that follows a given text. To do this well, however, it is forced to develop an internal representation of language that mathematically encodes, to some degree, the meanings of words.
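The next-word objective can be illustrated with a toy sketch. The miniature “language model” below is just a bigram counter (vastly simpler than any real LLM, and the corpus and function names are my own for illustration), but it captures the core idea: learn from text which word tends to follow which, then predict the most likely next word.

```python
from collections import Counter, defaultdict

# Toy "language model": for each word, count which words follow it.
# Real LLMs optimize the same next-word objective, but with billions
# of parameters instead of a lookup table of counts.
def train_bigram_model(corpus: str):
    follows = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(model, word: str) -> str:
    # Return the statistically most common successor of `word`.
    return model[word.lower()].most_common(1)[0][0]

corpus = (
    "the web offers knowledge at your fingertips "
    "the web offers content on demand "
    "the model learns patterns in text"
)
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "the" is most often followed by "web"
```

Even this trivial counter develops a crude “representation” of its corpus; the surprising part is how much richer that representation becomes at scale.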
Modern LLMs have demonstrated astonishing emergent behaviors that have led to claims of intelligence and even sentience. I find these claims overwrought; remember, these models are just learning statistical patterns. Still, the emergent capabilities are very impressive. One interesting class of them is what I would call “knowledge translation”. For example, modern LLMs are quite effective at:
- Turning a plain text description of a computer program into executable code, or translating code between two different programming languages.
- Turning bullet points describing an article into a full, polished draft.
- Turning a plain text description of a supply chain into a structured data file.
- Translating text between languages or styles of writing.
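In practice, each of these translation tasks reduces to wrapping the source material in a task-specific instruction before handing it to an LLM. A minimal sketch of that pattern is below; the templates and function names are my own illustration, not any real API.

```python
# Hypothetical prompt templates for the "knowledge translation" tasks above.
TEMPLATES = {
    "code": "Write a Python program that does the following:\n{source}",
    "draft": "Expand these bullet points into a polished article:\n{source}",
    "data": "Convert this supply-chain description into a JSON file:\n{source}",
    "style": "Rewrite the following text in a formal style:\n{source}",
}

def build_prompt(task: str, source: str) -> str:
    # A real system would send the result to an LLM API; here we just
    # construct the prompt to show how the tasks share one shape.
    return TEMPLATES[task].format(source=source)

prompt = build_prompt(
    "draft",
    "- LLMs predict the next word\n- this yields surprisingly useful skills",
)
print(prompt.splitlines()[0])
```

The design point is that the model itself is generic; the “translation” behavior comes almost entirely from the instruction wrapped around the input.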
So imagine that an LLM is trained on the entire contents of the web. GPT-3 is actually trained on much less: about 45 TB of text (compared to the roughly 15 TB of text in the American Library of Congress), but that is enough that, for our purposes, we can imagine it’s the whole web. So in some sense, the LLM encodes the knowledge of the web. Of course, this assertion needs to be critically examined: there may be limits and biases in the specific data on which the LLM is trained, and the training and modeling process may somehow distort this knowledge. It’s not even clear what it should mean to say that the model “encodes all the knowledge” in its training data. But these models do shockingly well at some reasonable evaluations of this sort of claim: for example, performing well on parts of the LSAT and achieving an excellent score on the SAT.
So if these models encode the knowledge of the web and can transform this knowledge into any desired form of content, then the “latent web” seems quite within reach.
It’s here, and fairly simple
As I mentioned in the introduction, this latent web already exists and works reasonably well. And it’s not so complex, technically. It’s a layer on top of existing machine learning models that allows us to interact with them in new ways. But from a design perspective, the concept of a latent web is fascinating: take the knowledge of the web and produce custom content on demand.
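The request cycle of such a layer can be sketched in a few lines: intercept a query, turn it into a prompt, and serve whatever the model generates as the page. The sketch below stubs out the model call and uses names of my own invention; it is a conceptual illustration, not code from Bilcke’s Latent Browser.

```python
# Minimal sketch of a "latent browser" request cycle. The model call is
# stubbed out; a real implementation would call a hosted LLM instead.
def call_llm(prompt: str) -> str:
    # Stand-in for a text-generation API call (assumption, not a real API).
    return (
        "<html><body><h1>Generated page</h1>"
        f"<p>Prompt was: {prompt}</p></body></html>"
    )

def latent_browse(query: str) -> str:
    # Instead of fetching a stored page, synthesize one from the query.
    prompt = (
        "You are a web server. Generate a complete HTML page "
        f"answering the request: {query!r}"
    )
    return call_llm(prompt)

page = latent_browse("history of the semantic web")
print(page.startswith("<html>"))
```

Note what is absent: no database, no crawl, no stored documents. All of the “content” lives latent in the model’s weights, which is exactly what makes the design so simple and so strange.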
Bilcke refers to this project as “web4”, but in a sense it’s closer to the original use of “web3”: the semantic web. The latent web encodes knowledge in a machine-readable way that allows arbitrary views.
However, these models have been criticized for hallucinating incorrect and sometimes dangerous content, reflecting or amplifying biases from the texts they were trained on, and generating hateful, degrading, or otherwise harmful content.
Of course, all these problems exist on the ordinary web we interact with today. Perhaps LLMs make them worse, but perhaps they also offer a new framework for thinking about them. Which is harder: reducing bias in an LLM, or on the web? Will “custom LLMs” with different viewpoints further divide society, or are there ways to leverage LLMs to create more cohesive, diverse, and inclusive experiences?
Of course, the web is not just a repository of knowledge. It is also a place for conversation, community-building, organization, and human connection. LLMs have already demonstrated abilities to hallucinate entire online communities and even imitate real people, living and dead.
There are undoubtedly serious threats here to the health of our society. What do we do with this new explosion of capabilities? And what is coming next? These are questions we are just starting to grapple with, but are becoming more urgent by the day.