The replacement of traditional labor by AGI wouldn’t just reshape industries—it would democratize access to top-tier knowledge workers. Capital holders would no longer rely on competitive salaries or the allure of corporate vision and ideology to attract talent; instead, AGI would level the playing field. Companies like Blue Origin and SpaceX, for example, could operate with equally skilled teams without significant differentiation in human talent. Meanwhile, states might tighten immigration policies, as the economic contributions of immigrants diminish, and shift focus toward accelerating AGI development. This evolution could edge society toward a quasi-feudal system supported by Universal Basic Income (UBI). The odds of individual human success—the essence of the traditional “American Dream”—would dwindle, as entrepreneurial pathways to prosperity become increasingly inaccessible. Venture capitalists, rather than backing human innovators, could deploy hundreds of specialized AGIs to iterate and test ideas, reducing reliance on unpredictable entrepreneurs.
Whisker Labs found that both large cities and small towns in the US within 20 to 60 miles of large data centers face power-quality issues like bad harmonics — unstable power that damages appliances and risks fires — even at night, when data centers dominate grid load. With global data center consumption expected to hit 1,580 TWh by 2034, matching India’s total usage, grids are struggling to keep pace. Some centers consume 30,000 times the power of a home, prompting governments to delay grid connections and forcing companies to relocate facilities (e.g., from Ireland to Malaysia). To sustain growing AI demands (apparently, some o3 queries can consume up to $1,000 worth of power), Amazon and Microsoft are exploring nuclear-powered data centers.
PhonePe currently dominates India’s UPI ecosystem, handling nearly 47% of transactions, with GPay closely following at over 37%. However, NPCI (National Payments Corporation of India) and the central bank (RBI) are pushing to cap each company’s share at 30% to foster competition in the payments market. This poses a significant challenge for PhonePe, especially as the company, valued at ~$12 billion, was preparing for an IPO. The enforcement of this cap, though, remains unclear: will it involve limiting user access, restricting processed transactions, or something entirely different?
This likely arises from data leaks during training, where responses generated by ChatGPT end up as pre-training fodder for models like DeepSeek. Picture a web of threads on X where users share their ChatGPT queries alongside the responses, all of which get scraped as training data. This can degrade model performance and increase hallucinations, especially if the scraped content is low quality. Interestingly, the model responds correctly in Chinese for the same prompt, suggesting it may have been fine-tuned in Chinese but not in English. None of this is particularly shocking, given projections that by 2026, AI content farms will flood a significant portion of the Internet.
Learning is far easier when ideas come in a progression, where each one can be tied back to earlier ideas, somewhat like a story. Forming connections, like a web, is the best way to remember things you read or learn. Charlie Munger emphasizes this idea when talking about mental models -
You’ve got to hang experience on a latticework of models in your head.
This is probably why I struggle to remember a large number of the ideas I read: they are studied in isolation, not related to other ideas. When reading code, this ties to the questions you ask and the mental abstractions you create. I find that recursively black-boxing atomic functions lets me get a high-level overview of a codebase much faster.
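As a toy illustration of what I mean by black-boxing (every function name here is invented for the example), reading only the top-level function already tells the story of the module; each helper can be opened up later, one box at a time:

```python
# Recursive black-boxing while reading code: summarize each helper in one
# phrase and only descend into it when the details actually matter.

def validate(order):        # black box: "rejects malformed orders"
    return order

def apply_pricing(order):   # black box: "adds taxes and discounts"
    return {**order, "total": order["amount"] * 1.18}

def charge(order):          # black box: "talks to the payment gateway"
    return {"order": order, "status": "paid"}

def process_order(order):
    # Reading just this function gives the high-level story of the module.
    return charge(apply_pricing(validate(order)))

print(process_order({"amount": 100}))
```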
For classification we generally use cross-entropy loss (based on my experience with the papers/codebases I’ve come across so far), or some variant of it.
Cross-entropy loss, assuming a batch size of 1 and N classes (i.e., $1 \le i \le N$), is as below.
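$$\mathcal{L}_{CE} = -\sum_{i=1}^{N} y_i \,\log(\hat{y}_i)$$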
Here $\hat{y}_i$ is the predicted probability of the sample belonging to class $i$, while $y_i$ is the one-hot truth label for the sample. Assuming a random weight initialization, we can assume there is no bias towards any of the $N$ classes in the model’s initial prediction, so the probability of any class being predicted by the model is essentially $1/N$. Since $y_i$ is one-hot, only a single term contributes to the sum, meaning the expected initial loss is $-\log(1/N) = \log(N)$.
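A quick sanity check of that number (a minimal sketch; the class count and target index are arbitrary):

```python
import numpy as np

# With uniform predictions over N classes, the cross-entropy against a
# one-hot target should equal log(N).
N = 10
y_hat = np.full(N, 1.0 / N)   # untrained model: every class gets probability 1/N
y = np.zeros(N)
y[3] = 1.0                    # one-hot ground truth (class index is arbitrary)

loss = -np.sum(y * np.log(y_hat))
print(loss, np.log(N))        # both print ~2.302585
```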
There is a massive disparity between the valuations and revenues of top AI startups today. For this reason, VCs place bets on foundational models (FMs), data centers, and inference platforms, which has laid the groundwork for the infrastructure over which future agents are likely to be built. While agents automate some processes done by humans today, they also make some resources scarce; the most important, in my opinion? Efficient models that can be run agentically on edge devices. With the rise of B2B AI solutions, SaaS pricing models are likely to evolve away from seat-based or licensing models. The largest moats will probably develop in regulated industries. Given that we expect a ton of AI-generated slop to flood the internet by 2026, a useful agent might be one that quickly verifies news.
AI infrastructure can be broken down into three main layers: FM APIs, cloud services (think optimized inference engines, rentable AI compute providers, and standard cloud platforms), and bare-metal hardware providers (raw compute, dedicated inference hardware, and edge inference hardware). Sooner or later, for both latency and personalization, deploying AI models on edge devices will become unavoidable. The hardware is getting there to support it: over the years, mobile phones have evolved from under 2GB of RAM to gaming devices that now boast more memory than the M1 Mac I use. Companies are already working towards producing NPUs for on-device inference. I once tried to shrink the cross-modal encoder in Grounding-DINO to a model with 80% fewer parameters. Spoiler: I failed spectacularly. That’s why Qualcomm’s on-device-ready models fascinate me. They’ve figured out how to significantly compress even high-tier vision models. I wonder whether they’ve cracked distillation or train tiny architectures from scratch.
The AI Inference Pyramid. Inference is increasingly important now, given the reliance on Agents & ideas like CoT (Chain of Thought)
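I don’t know which route Qualcomm actually takes, but distillation is the usual first attempt at shrinking a model. A minimal sketch of the objective (the function name, temperature, and weighting are illustrative, not anyone’s actual recipe):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a KL term that pulls the
    student's softened distribution toward the (frozen) teacher's."""
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1 - alpha) * ce

# Dummy batch: 4 samples, 10 classes
student_logits = torch.randn(4, 10)
teacher_logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
print(distillation_loss(student_logits, teacher_logits, targets))
```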
Companies around the world are increasingly relying on AI to filter customer support requests that don’t require human intervention, leading to downsizing in support teams, as Salesforce has done. Similarly, Indian companies like GupShup are making progress, with projects such as the one for the Ministry of Consumer Affairs. But some telling stats emerge: a single bad experience could drive customers away from AI support agents, 61% of customers are more likely to share personal information with a human, and 78% still prefer human support. As companies continue to enhance AI with empathetic voices and accent translation, I believe we could move from chat-based interfaces to voice assistants before realizing we’re engaging with AI. In fact, it’s projected that 50% of customer support in India will be handled by AI by 2028. Another interesting possibility is anticipating customer support needs, especially on platforms where issues like difficulty proceeding through forms or repetitive button clicks can be detected; in such cases, AI could proactively resolve requests, eventually paving the way for a fully automated support system.
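As a toy sketch of that last idea, a session could be flagged for proactive help when the same UI element is clicked many times in a short window; the thresholds and event shape here are assumptions for illustration only:

```python
from collections import deque

RAGE_CLICKS = 5       # clicks on the same element...
WINDOW_SECONDS = 10   # ...within this many seconds

def needs_proactive_help(events):
    """events: iterable of (timestamp_seconds, element_id) tuples."""
    recent = {}  # element_id -> deque of recent click timestamps
    for ts, element in events:
        clicks = recent.setdefault(element, deque())
        clicks.append(ts)
        # Drop clicks that have fallen out of the sliding window.
        while clicks and ts - clicks[0] > WINDOW_SECONDS:
            clicks.popleft()
        if len(clicks) >= RAGE_CLICKS:
            return True
    return False

# Six clicks on the same button within a few seconds -> flag the session.
print(needs_proactive_help([(i, "submit-btn") for i in range(6)]))  # True
```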
AI innovators thrive on introducing creative, often human-inspired ideas to enhance model performance when scaling results plateau. But Moore’s law means that, by the time these new solutions start making an impact, simply scaling the traditional solutions has produced even more powerful models. This cycle has played out over the last 70 years of AI research, yet it persists because we fail to learn from history. Take chess and Go, for example: while some human ingenuity contributed to computational engines, scaling up traditional approaches (deep search and learning) with more powerful compute led to the first engines that could defeat world champions. It’s curious how we have this natural inclination to simplify complex systems like the brain (as Sutton suggests) or the unpredictable stock market (as Munger notes), turning them into neat formulas or representations that are easy to understand and communicate.
World War I, a brutal conflict from 1914 to 1918, was driven by a complex web of alliances and rising tensions. The Central Powers — Germany and Austria-Hungary — faced off against the Allies, primarily Russia, France, and Britain. As Germany’s influence grew, France, Russia, and Britain grew increasingly concerned. The assassination of Archduke Franz Ferdinand in 1914 (by a Bosnian Serb nationalist) triggered a chain reaction, leading Austria-Hungary to declare war on Serbia. Russia, allied with Serbia, mobilized in defense, while Germany declared war on France and invaded Belgium, prompting Britain to join the war (Britain had pledged to uphold Belgium’s neutrality). Several nations entered the fight: Bulgaria aligned with the Central Powers, while Romania, Greece, and Japan sided with the Allies. The Ottoman Empire joined the Central Powers, hoping to challenge Russia, and Italy, which had been allied with Germany and Austria-Hungary before the war but initially declared its neutrality, switched sides in pursuit of territorial gain.
The war saw major battles, such as Verdun and the Somme, where tanks were first deployed by the British. While the Arabs, promised independence by the Allies, revolted against the Ottomans, secret agreements like Sykes-Picot divided the Ottoman territories between Britain and France. Meanwhile, the Russian Revolution in 1917 led to Tsar Nicholas II’s abdication and Lenin’s rise to power. With growing losses and worsening conditions, Germany resorted to unrestricted submarine (U-boat) warfare, which provoked the U.S. (under President Wilson) to join the conflict after the sinking of U.S. cargo ships and the Zimmermann Telegram. As the war neared its end in 1918, Russia ceded large territories to Germany, and the Allies, bolstered by American troops, mounted a final offensive that led to the defeat of the Central Powers. The war ended with an armistice, and post-war treaties reshaped the map, imposed reparations on Germany, and laid the foundation for the League of Nations.